* [PATCH 00/20] dlb2: introduce DLB 2.0 device driver
@ 2020-07-12 13:43 Gage Eads
  2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
                   ` (19 more replies)
  0 siblings, 20 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

This patch set introduces a new misc device driver for the Intel(r) Dynamic
Load Balancer 2.0 (Intel(r) DLB 2.0). The Intel DLB 2.0 is a PCIe device that
provides load-balanced, prioritized scheduling of core-to-core
communication [1].

The Intel DLB 2.0 consists of queues and arbiters that connect producer cores
and consumer cores. The device implements load-balanced queueing features
including:
- Lock-free multi-producer/multi-consumer operation.
- Multiple priority levels for varying traffic types.
- 'Direct' traffic (i.e. multi-producer/single-consumer).
- Simple unordered load-balanced distribution.
- Atomic lock-free load balancing across multiple consumers.
- Queue element reordering feature allowing ordered load-balanced
  distribution.

Intel DLB 2.0 can be used in an event-driven programming model, such as DPDK's
Event Device Library[2]. Such frameworks are commonly used in packet processing
pipelines that benefit from the framework's multi-core scalability, dynamic
load-balancing, and variety of packet distribution and synchronization schemes.

These distribution schemes include "parallel" (packets are load-balanced
across multiple cores and processed in parallel), "ordered" (similar to
"parallel" but packets are reordered into ingress order by the device), and
"atomic" (packet flows are scheduled to a single core at a time such that
locks are not required to access per-flow data, and dynamically migrated to
ensure load-balance).

The fundamental unit of communication through the device is a queue entry
(QE), which consists of 8B of data and 8B of metadata (destination queue,
priority, etc.). The data field can be any type that fits within 8B.
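
For illustration, a QE could be modeled as the following C struct (field
names and widths follow the documentation added in patch 1; the exact bit
ordering is an assumption of this sketch):

    #include <stdint.h>

    struct dlb2_qe_model {
            uint64_t data;          /* copied through untouched */
            uint16_t opaque;        /* copied through untouched */
            uint8_t  qid;           /* destination queue ID */
            uint8_t  sched : 2;     /* atomic/unordered/ordered/directed */
            uint8_t  priority : 3;  /* eight priority levels */
            uint8_t  msg_type : 3;  /* copied through untouched */
            uint16_t lock_id;       /* atomic flow ID */
            uint8_t  rsvd;
            uint8_t  cmd;           /* new/forward/complete/token/arm */
    };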

A core's interface to the device, a "port," consists of a memory-mappable
region through which the core enqueues a queue entry, and an in-memory
queue (the "consumer queue") to which the device schedules QEs. Each QE
is enqueued to a device-managed queue, and from there scheduled to a port.
Software specifies the "linking" of queues and ports; i.e. which ports the
device is allowed to schedule to for a given queue. The device uses a
credit scheme to prevent overflow of the on-device queue storage.

Applications can interface directly with the device by mapping the port's
memory and MMIO regions into the application's address space for enqueue
and dequeue operations, but call into the kernel driver for configuration
operations. An application can be either polling- or interrupt-driven;
Intel DLB 2.0 supports both modes of operation.

Device resources -- i.e. ports, queues, and credits -- are contained within
a scheduling domain. Scheduling domains are isolated from one another; a
port can only enqueue to and dequeue from queues within its scheduling
domain. A scheduling domain's resources are configured through a scheduling
domain file, which is acquired through an ioctl.

Intel DLB 2.0 supports SR-IOV and Scalable IOV, and allows for a flexible
division of its resources among the PF and its virtual devices. The virtual
devices are incapable of configuring the device directly; they use a hardware
mailbox to proxy configuration requests to the PF driver. This driver supports
both PF and virtual devices, as there is significant code re-use between the
two, with device-specific behavior handled through a callback interface.
Virtualization support will be added in a later patch set.

The dlb2 driver uses ioctls as its primary interface (it also makes limited
use of sysfs). The dlb2 device file supports a different ioctl interface
than the scheduling domain file; the dlb2 device file is used for
device-wide operations (including scheduling domain creation), and the
scheduling domain file supports operations on the scheduling domain's
resources (primarily resource configuration).

The dlb2 driver is partially dual-licensed (GPLv2 and BSD 3-clause). The
dual-licensed files implement a "hardware access library" that is largely
Linux independent (except, for example, kmalloc() and kfree() calls), so
that they can be ported to other environments (e.g. DPDK).

[1] https://builders.intel.com/docs/networkbuilders/SKU-343247-001US-queue-management-and-load-balancing-on-intel-architecture.pdf
[2] https://doc.dpdk.org/guides/prog_guide/eventdev.html

Note: This patchset underwent multiple internal reviews and revisions, and
those changes include:
- Many style corrections (multi-line comments, header order, etc.)
- Add device and software documentation (dlb2.rst)
- Use dyndbg for debug messages
- Refactor mmap and device file architecture
- Use ida library for ID management
- Fill data structure holes
- Add missing static keyword for various functions
- Various bug fixes

Gage Eads (20):
  dlb2: add skeleton for DLB 2.0 driver
  dlb2: initialize PF device
  dlb2: add resource and device initialization
  dlb2: add device ioctl layer and first 4 ioctls
  dlb2: add sched domain config and reset support
  dlb2: add ioctl to get sched domain fd
  dlb2: add runtime power-management support
  dlb2: add queue create and queue-depth-get ioctls
  dlb2: add ioctl to configure ports, query poll mode
  dlb2: add port mmap support
  dlb2: add start domain ioctl
  dlb2: add queue map and unmap ioctls
  dlb2: add port enable/disable ioctls
  dlb2: add CQ interrupt support
  dlb2: add domain alert support
  dlb2: add sequence-number management ioctls
  dlb2: add cos bandwidth get/set ioctls
  dlb2: add device FLR support
  dlb2: add basic PF sysfs interfaces
  dlb2: add ingress error handling

 Documentation/ABI/testing/sysfs-driver-dlb2 |  202 +
 Documentation/misc-devices/dlb2.rst         |  313 ++
 Documentation/misc-devices/index.rst        |    1 +
 MAINTAINERS                                 |    7 +
 drivers/misc/Kconfig                        |    1 +
 drivers/misc/Makefile                       |    1 +
 drivers/misc/dlb2/Kconfig                   |   11 +
 drivers/misc/dlb2/Makefile                  |   13 +
 drivers/misc/dlb2/dlb2_bitmap.h             |  286 ++
 drivers/misc/dlb2/dlb2_file.c               |  133 +
 drivers/misc/dlb2/dlb2_file.h               |   19 +
 drivers/misc/dlb2/dlb2_hw_types.h           |  357 ++
 drivers/misc/dlb2/dlb2_intr.c               |  140 +
 drivers/misc/dlb2/dlb2_intr.h               |   30 +
 drivers/misc/dlb2/dlb2_ioctl.c              | 1319 +++++
 drivers/misc/dlb2/dlb2_ioctl.h              |   19 +
 drivers/misc/dlb2/dlb2_main.c               | 1181 +++++
 drivers/misc/dlb2/dlb2_main.h               |  285 ++
 drivers/misc/dlb2/dlb2_pf_ops.c             | 1313 +++++
 drivers/misc/dlb2/dlb2_regs.h               | 3702 ++++++++++++++
 drivers/misc/dlb2/dlb2_resource.c           | 7119 +++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h           |  924 ++++
 include/linux/pci_ids.h                     |    2 +
 include/uapi/linux/dlb2_user.h              | 1158 +++++
 24 files changed, 18536 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-dlb2
 create mode 100644 Documentation/misc-devices/dlb2.rst
 create mode 100644 drivers/misc/dlb2/Kconfig
 create mode 100644 drivers/misc/dlb2/Makefile
 create mode 100644 drivers/misc/dlb2/dlb2_bitmap.h
 create mode 100644 drivers/misc/dlb2/dlb2_file.c
 create mode 100644 drivers/misc/dlb2/dlb2_file.h
 create mode 100644 drivers/misc/dlb2/dlb2_hw_types.h
 create mode 100644 drivers/misc/dlb2/dlb2_intr.c
 create mode 100644 drivers/misc/dlb2/dlb2_intr.h
 create mode 100644 drivers/misc/dlb2/dlb2_ioctl.c
 create mode 100644 drivers/misc/dlb2/dlb2_ioctl.h
 create mode 100644 drivers/misc/dlb2/dlb2_main.c
 create mode 100644 drivers/misc/dlb2/dlb2_main.h
 create mode 100644 drivers/misc/dlb2/dlb2_pf_ops.c
 create mode 100644 drivers/misc/dlb2/dlb2_regs.h
 create mode 100644 drivers/misc/dlb2/dlb2_resource.c
 create mode 100644 drivers/misc/dlb2/dlb2_resource.h
 create mode 100644 include/uapi/linux/dlb2_user.h


base-commit: 2d41d2ab85d414f6ea9eac2f853b1aa31ba0820f
-- 
2.13.6



* [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 15:56   ` Greg KH
  2020-07-12 15:58   ` Greg KH
  2020-07-12 13:43 ` [PATCH 02/20] dlb2: initialize PF device Gage Eads
                   ` (18 subsequent siblings)
  19 siblings, 2 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

This initial commit contains basic driver functionality (load, unload,
probe, and remove callbacks), file_operations stubs, and device
documentation.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 Documentation/misc-devices/dlb2.rst  | 313 +++++++++++++++++++++++++++++++++++
 Documentation/misc-devices/index.rst |   1 +
 MAINTAINERS                          |   7 +
 drivers/misc/Kconfig                 |   1 +
 drivers/misc/Makefile                |   1 +
 drivers/misc/dlb2/Kconfig            |  11 ++
 drivers/misc/dlb2/Makefile           |   8 +
 drivers/misc/dlb2/dlb2_hw_types.h    |  29 ++++
 drivers/misc/dlb2/dlb2_main.c        | 216 ++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h        |  37 +++++
 include/linux/pci_ids.h              |   2 +
 11 files changed, 626 insertions(+)
 create mode 100644 Documentation/misc-devices/dlb2.rst
 create mode 100644 drivers/misc/dlb2/Kconfig
 create mode 100644 drivers/misc/dlb2/Makefile
 create mode 100644 drivers/misc/dlb2/dlb2_hw_types.h
 create mode 100644 drivers/misc/dlb2/dlb2_main.c
 create mode 100644 drivers/misc/dlb2/dlb2_main.h

diff --git a/Documentation/misc-devices/dlb2.rst b/Documentation/misc-devices/dlb2.rst
new file mode 100644
index 000000000000..bbd623798b8b
--- /dev/null
+++ b/Documentation/misc-devices/dlb2.rst
@@ -0,0 +1,313 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+===========================================
+Intel(R) Dynamic Load Balancer 2.0 Overview
+===========================================
+
+:Author: Gage Eads
+
+Contents
+========
+
+- Introduction
+- Scheduling Types
+- Queue Entry
+- Port
+- Queue
+- Credits
+- Scheduling Domain
+- Consumer Queue Interrupts
+- Power Management
+- Virtualization
+- User Interface
+- Reset
+
+Introduction
+============
+
+The Intel(r) Dynamic Load Balancer 2.0 (Intel(r) DLB 2.0) is a PCIe device that
+provides load-balanced, prioritized scheduling of core-to-core communication.
+
+The Intel DLB 2.0 device consists of queues and arbiters that connect producer
+cores and consumer cores. The device implements load-balanced queueing features
+including:
+
+- Lock-free multi-producer/multi-consumer operation.
+- Multiple priority levels for varying traffic types.
+- 'Direct' traffic (i.e. multi-producer/single-consumer).
+- Simple unordered load-balanced distribution.
+- Atomic lock-free load balancing across multiple consumers.
+- Queue element reordering feature allowing ordered load-balanced distribution.
+
+Intel DLB 2.0 can be used in an event-driven programming model, such as DPDK's
+Event Device Library[2]. Such frameworks are commonly used in packet processing
+pipelines that benefit from the framework's multi-core scalability, dynamic
+load-balancing, and variety of packet distribution and synchronization schemes.
+
+Scheduling Types
+================
+
+Intel DLB 2.0 supports four types of scheduling of 'events' (using DPDK
+terminology), where an event can represent any type of data (e.g. a network
+packet). The first, ``directed``, is multi-producer/single-consumer style
+core-to-core communication. The remaining three are
+multi-producer/multi-consumer, and support load-balancing across the consumers.
+
+- ``Unordered``: events are load-balanced across consumers without any ordering
+                 guarantees.
+
+- ``Ordered``: events are load-balanced across consumers, and when the consumer
+               re-injects each event into the device it is re-ordered into the
+               original order. This scheduling type allows software to
+               parallelize ordered event processing without the synchronization
+               cost of re-ordering packets.
+
+- ``Atomic``: events are load-balanced across consumers, with the guarantee that
+              events from a particular 'flow' are only scheduled to a single
+              consumer at a time (but can migrate over time). This allows, for
+              example, packet processing applications to parallelize while
+              avoiding locks on per-flow data and maintaining ordering within a
+              flow.
+
+Intel DLB 2.0 provides two-level hierarchical priority scheduling, with eight
+priority levels at each level. Each consumer selects up to eight queues to
+receive events from, and assigns a priority to each of these 'connected'
+queues. To schedule
+an event to a consumer, the device selects the highest priority non-empty queue
+of the (up to) eight connected queues. Within that queue, the device selects
+the highest priority event available (selecting a lower priority event for
+starvation avoidance 1% of the time, by default).
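+
+The queue-selection step can be modeled in C as follows (a software sketch
+only; the device implements this in hardware, and the starvation-avoidance
+step is omitted)::
+
+    /*
+     * Software model of connected-queue selection: depth[i] is the
+     * number of events in connected queue i, and prio[i] its assigned
+     * priority. This sketch assumes a lower value means a higher
+     * priority. Returns the chosen queue's index, or -1 if all the
+     * connected queues are empty.
+     */
+    static int select_connected_queue(const int depth[8], const int prio[8])
+    {
+            int i, best = -1;
+
+            for (i = 0; i < 8; i++) {
+                    if (depth[i] == 0)
+                            continue;
+                    if (best < 0 || prio[i] < prio[best])
+                            best = i;
+            }
+
+            /* The device then picks the highest-priority event within it. */
+            return best;
+    }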
+
+The device also supports four load-balanced scheduler classes of service. Each
+class of service receives a (user-configurable) guaranteed percentage of the
+scheduler bandwidth, and any unreserved bandwidth is divided evenly among the
+four classes. For example, with reservations of 40%, 20%, 10%, and 0%, the
+remaining 30% is split evenly, so the four classes receive 47.5%, 27.5%,
+17.5%, and 7.5% of the scheduler bandwidth, respectively.
+
+Queue Entry
+===========
+
+Each event is contained in a queue entry (QE), the fundamental unit of
+communication through the device, which consists of 8B of data and 8B of
+metadata, as depicted below.
+
+QE structure format
+::
+
+    data     :64
+    opaque   :16
+    qid      :8
+    sched    :2
+    priority :3
+    msg_type :3
+    lock_id  :16
+    rsvd     :8
+    cmd      :8
+
+The ``data`` field can be any type that fits within 8B (pointer, integer,
+etc.); DLB 2.0 merely copies this field from producer to consumer. The
+``opaque`` and ``msg_type`` fields behave the same way.
+
+``qid`` is set by the producer to specify to which DLB 2.0 queue it wishes to
+enqueue this QE. The ID spaces for load-balanced and directed queues are both
+zero-based; the ``sched`` field is used to distinguish whether the queue is
+load-balanced or directed.
+
+``sched`` controls the scheduling type: atomic, unordered, ordered, or
+directed. The first three scheduling types are only valid for load-balanced
+queues, and the directed scheduling type is only valid for directed queues.
+
+``priority`` is the priority with which this QE should be scheduled.
+
+``lock_id``, used only by atomic scheduling, identifies the atomic flow to
+which the QE belongs. When sending a directed event, ``lock_id`` is simply
+copied like the ``data``, ``opaque``, and ``msg_type`` fields.
+
+``cmd`` specifies the operation, such as:
+
+- Enqueue a new QE
+- Forward a QE that was dequeued
+- Complete/terminate a QE that was dequeued
+- Return one or more consumer queue tokens
+- Arm the port's consumer queue interrupt
+
+Port
+====
+
+A core's interface to the DLB 2.0 is called a "port," and consists of an MMIO
+region through which the core enqueues a queue entry, and an in-memory queue
+(the "consumer queue") to which the device schedules QEs. A core enqueues a QE
+to a device queue, then the device schedules the event to a port. Software
+specifies the connection of queues and ports; i.e. for each queue, to which
+ports the device is allowed to schedule its events. The device uses a credit
+scheme to prevent overflow of the on-device queue storage.
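+
+The queue-to-port linking can be modeled as a simple table (a sketch only;
+the device holds this state internally, and software programs it through the
+driver's queue map/unmap ioctls)::
+
+    #include <stdbool.h>
+
+    #define NUM_LDB_QUEUES 32
+    #define NUM_LDB_PORTS  64
+
+    /*
+     * link[q][p] is true if the device may schedule events from
+     * load-balanced queue q to port p. Each port may link to at most
+     * eight queues.
+     */
+    static bool link[NUM_LDB_QUEUES][NUM_LDB_PORTS];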
+
+Applications interface directly with the device by mapping the port's memory
+and MMIO regions into the application's address space for enqueue and dequeue
+operations, but call into the kernel driver for configuration operations. An
+application can be either polling- or interrupt-driven; DLB 2.0 supports both
+modes of operation.
+
+Queue
+=====
+
+The device contains 32 load-balanced (i.e. capable of atomic, ordered, and
+unordered scheduling) queues and 64 directed queues. Each queue comprises 8
+internal queues, one per priority level. The internal queue that an event is
+enqueued to is selected by the event's priority field.
+
+A load-balanced queue is capable of scheduling its events to any combination of
+load-balanced ports, whereas each directed queue has one-to-one mapping with a
+directed port. There is no restriction on port or queue types when a port
+enqueues an event to a queue; that is, a load-balanced port can enqueue to a
+directed queue and vice versa.
+
+Credits
+=======
+
+The Intel DLB 2.0 uses a credit scheme to prevent overflow of the on-device
+queue storage, with separate credits for load-balanced and directed queues. A
+port spends one credit when it enqueues a QE, and one credit is replenished
+when a QE is scheduled to a consumer queue. Each scheduling domain has one pool
+of load-balanced credits and one pool of directed credits; software is
+responsible for managing the allocation and replenishment of these credits among
+the scheduling domain's ports.
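+
+The accounting can be modeled in C as follows (illustrative only; the real
+credit counts live in device registers and in per-application port state)::
+
+    /* One credit pool per (domain, queue type), managed by software. */
+    struct credit_pool {
+            unsigned int avail;
+    };
+
+    /* A port spends one credit per enqueued QE; -1 if none are left. */
+    static int credit_alloc(struct credit_pool *pool)
+    {
+            if (pool->avail == 0)
+                    return -1;      /* wait for replenishment */
+            pool->avail--;
+            return 0;
+    }
+
+    /* One credit is returned when a QE lands in a consumer queue. */
+    static void credit_replenish(struct credit_pool *pool)
+    {
+            pool->avail++;
+    }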
+
+Scheduling Domain
+=================
+
+Device resources -- including ports, queues, and credits -- are contained
+within a scheduling domain. Scheduling domains are isolated from one another; a
+port can only enqueue to and dequeue from queues within its scheduling domain.
+
+A scheduling domain's resources are configured through a domain file descriptor,
+which is acquired through an ioctl. This design means that any application with
+sufficient permissions to access the device file can request the fd of any
+scheduling domain within that device. When necessary to prevent independent dlb2
+applications from potentially accessing each other's scheduling domains, the
+user can create multiple virtual functions (each with its own device file) and
+restrict access via file permissions.
+
+Consumer Queue Interrupts
+=========================
+
+Each port has its own interrupt which fires, if armed, when the consumer queue
+depth becomes non-zero. Software arms an interrupt by enqueueing a special
+'interrupt arm' command to the device through the port's MMIO window.
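+
+In outline, an interrupt-driven consumer behaves as follows (a sketch;
+``enqueue_arm_command()`` and ``block_on_cq_interrupt()`` are hypothetical
+placeholders for the MMIO enqueue and the blocking mechanism added later in
+this series)::
+
+    while (running) {
+            if (cq_empty(port)) {
+                    /* Arm via a special QE through the MMIO window. */
+                    enqueue_arm_command(port);
+                    /* Sleep until the CQ depth becomes non-zero. */
+                    block_on_cq_interrupt(port);
+            }
+            process(dequeue(port));
+    }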
+
+Power Management
+================
+
+The kernel driver keeps the device in D3Hot when not in use. The driver
+transitions the device to D0 when the first device file is opened or a virtual
+function is created, and keeps it there until there are no open device files,
+memory mappings, or virtual functions.
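+
+One natural implementation uses the kernel's runtime-PM framework. A sketch
+of how the device-file open/release hooks might look (an illustration, not
+necessarily this driver's implementation)::
+
+    static int dlb2_open(struct inode *i, struct file *f)
+    {
+            struct dlb2_dev *dev = container_of(i->i_cdev,
+                                                struct dlb2_dev, cdev);
+            int ret;
+
+            /* The first opener brings the device from D3Hot to D0. */
+            ret = pm_runtime_get_sync(&dev->pdev->dev);
+
+            return ret < 0 ? ret : 0;
+    }
+
+    static int dlb2_close(struct inode *i, struct file *f)
+    {
+            struct dlb2_dev *dev = container_of(i->i_cdev,
+                                                struct dlb2_dev, cdev);
+
+            /* The last put allows the device to return to D3Hot. */
+            pm_runtime_put(&dev->pdev->dev);
+
+            return 0;
+    }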
+
+Virtualization
+==============
+
+The DLB 2.0 supports both SR-IOV and Scalable IOV, and can flexibly divide its
+resources among the physical function (PF) and its virtual devices. Virtual
+devices do not configure the device directly; they use a hardware mailbox to
+proxy configuration requests to the PF driver. Mailbox communication is
+initiated by the virtual device with a registration message that establishes
+the mailbox interface version.
+
+SR-IOV
+------
+
+Each SR-IOV virtual function (VF) has 32 interrupts: 1 for PF->VF mailbox
+messages and the remainder for CQ interrupts. If a VF user (e.g. a guest OS)
+needs more CQ interrupts, it must use more than one VF.
+
+To support this case, the driver introduces the notion of primary and auxiliary
+VFs. Each VF is considered either primary or auxiliary:
+
+- Primary: the VF is used as a regular DLB 2.0 device. A primary VF has zero
+           or more auxiliary VFs supporting it.
+- Auxiliary: the VF doesn't have any resources of its own, and serves only to
+             provide the primary VF with MSI vectors for its CQ interrupts.
+
+Each VF has an aux_vf_ids file in its sysfs directory, an R/W file that
+controls the primary VF's auxiliaries. When a VF is made auxiliary to another,
+its resources are relinquished to the PF device.
+
+When the VF driver registers its device with the PF driver, the PF driver tells
+the VF driver whether its device is primary or auxiliary and, if auxiliary, the
+ID of its primary VF. An auxiliary VF device will "claim" up to 31 of the
+primary VF's CQs, such that those CQs use the auxiliary VF's MSI vectors.
+
+When a primary VF has one or more auxiliary VFs, the entire VF group must be
+assigned to the same virtual machine. The PF driver will not allow the primary
+VF to configure its resources until all its auxiliary VFs have been registered
+by the guest OS's driver.
+
+Scalable IOV
+------------
+
+Scalable IOV is a virtualization solution that, compared to SR-IOV, enables
+highly-scalable, high-performance, and fine-grained sharing of I/O devices
+across isolated domains.
+
+In Scalable IOV, the smallest granularity of sharing a device is the
+Assignable Device Interface (ADI). Similar to SR-IOV's Virtual Function,
+Scalable IOV defines the Virtual Device (VDEV) as the abstraction at which a
+Scalable IOV device is exposed to guest software; a VDEV contains one or more
+ADIs.
+
+Kernel software is responsible for composing and managing VDEV instances in
+Scalable IOV. The device-specific software components are the Host (PF) Driver,
+the Guest (VDEV) Driver, and the Virtual Device Composition Module (VDCM). The
+VDCM is responsible for managing the software-based virtualization of (slow)
+control path operations, like the mailbox between Host and Guest drivers.
+
+For DLB 2.0, the ADI is the scheduling domain, which consists of load-balanced
+and directed queues, ports, and other resources. Each port, whether
+load-balanced or directed, consists of:
+
+- A producer port: a 4KB MMIO window, page-separated so it can be mapped
+  independently
+- A consumer queue: a memory-based ring to which the device writes queue entries
+- One CQ interrupt message
+
+DLB 2.0 supports up to 16 VDEVs per PF.
+
+For Scalable IOV guest-host communication, DLB 2.0 uses a software-based
+mailbox. This mailbox interface matches the SR-IOV hardware mailbox (i.e. PF2VF
+and VF2PF MMIO regions) except that it is backed by shared memory (allocated
+and managed by the VDCM). Similarly, the VF2PF interrupt trigger register
+instead causes a VM exit into the VDCM driver, and the PF2VF interrupt is
+replaced by a virtual interrupt injected into the guest through the VDCM.
+
+User Interface
+==============
+
+The dlb2 driver uses ioctls as its primary interface. It provides two types of
+files: the dlb2 device file and the scheduling domain file.
+
+The two types support different ioctl interfaces; the dlb2 device file is used
+for device-wide operations (including scheduling domain creation), and the
+scheduling domain device file supports operations on the scheduling domain's
+resources such as port and queue configuration.
+
+The driver also exports an mmap interface through port files, which are
+acquired through scheduling domain ioctls. This mmap interface is used to map
+a port's memory and MMIO window into the process's address space.
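+
+In outline, a typical application start-up sequence looks like this (the
+ioctl request names below are illustrative placeholders; the real
+definitions live in include/uapi/linux/dlb2_user.h)::
+
+    int devfd, domfd, portfd;
+    void *pp, *cq;
+
+    devfd = open("/dev/dlb0", O_RDWR);
+
+    /* Device-wide ioctls: create a domain, then get its fd. */
+    ioctl(devfd, DLB2_IOC_CREATE_SCHED_DOMAIN, &create_args);
+    domfd = ioctl(devfd, DLB2_IOC_GET_SCHED_DOMAIN_FD, &fd_args);
+
+    /* Domain ioctls: configure resources, then get a port fd. */
+    ioctl(domfd, DLB2_IOC_CREATE_LDB_QUEUE, &queue_args);
+    ioctl(domfd, DLB2_IOC_CREATE_LDB_PORT, &port_args);
+    portfd = ioctl(domfd, DLB2_IOC_GET_LDB_PORT_FD, &port_fd_args);
+
+    /* Map the producer port MMIO and the consumer queue memory. */
+    pp = mmap(NULL, pp_size, PROT_WRITE, MAP_SHARED, portfd, 0);
+    cq = mmap(NULL, cq_size, PROT_READ, MAP_SHARED, portfd, cq_off);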
+
+Reset
+=====
+
+The dlb2 driver supports reset at two levels: scheduling domain and device-wide
+(i.e. FLR).
+
+Scheduling domain reset occurs when an application stops using its domain,
+specifically when no more file references or memory mappings exist. At this
+time, the driver resets all the domain's resources (flushes its queues and
+ports) and puts them in their respective available-resource lists for later
+use.
+
+An FLR can occur while the device is in use by user-space software, so the
+driver uses its reset_prepare callback to ensure that no applications continue
+to use the device while the FLR executes. First, the driver blocks user-space
+from executing ioctls or opening a device file, and evicts any threads blocked
+on a CQ interrupt. The driver then notifies applications and virtual functions
+that an FLR is pending, and waits for them to clean up with a timeout (default
+of 5 seconds). If the timeout expires and the device is still in use by an
+application, the driver zaps its MMIO mappings. Virtual functions, whether in
+use or not, are reset as part of a PF FLR.
+
+While PF FLR is a hardware procedure, VF FLR is a software procedure. A VF FLR
+triggers an interrupt to the PF driver, which performs the actual reset: the
+scheduling domain reset operation is applied to each of the VF's scheduling
+domains.
diff --git a/Documentation/misc-devices/index.rst b/Documentation/misc-devices/index.rst
index 1ecc05fbe6f4..4160f7ef33ca 100644
--- a/Documentation/misc-devices/index.rst
+++ b/Documentation/misc-devices/index.rst
@@ -14,6 +14,7 @@ fit into other categories.
 .. toctree::
    :maxdepth: 2
 
+   dlb2
    eeprom
    ibmvmc
    ics932s401
diff --git a/MAINTAINERS b/MAINTAINERS
index 496fd4eafb68..aada3a3b2c8c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8622,6 +8622,13 @@ L:	linux-kernel@vger.kernel.org
 S:	Supported
 F:	arch/x86/include/asm/intel-family.h
 
+INTEL DYNAMIC LOAD BALANCER 2.0 DRIVER
+M:	Gage Eads <gage.eads@intel.com>
+S:	Maintained
+F:	Documentation/ABI/testing/sysfs-driver-dlb2
+F:	drivers/misc/dlb*
+F:	include/uapi/linux/dlb2_user.h
+
 INTEL DRM DRIVERS (excluding Poulsbo, Moorestown and derivative chipsets)
 M:	Jani Nikula <jani.nikula@linux.intel.com>
 M:	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index e1b1ba5e2b92..965f3457e3f7 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -472,4 +472,5 @@ source "drivers/misc/ocxl/Kconfig"
 source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/habanalabs/Kconfig"
 source "drivers/misc/uacce/Kconfig"
+source "drivers/misc/dlb2/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c7bd01ac6291..b6afc8edea2b 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -57,3 +57,4 @@ obj-$(CONFIG_PVPANIC)   	+= pvpanic.o
 obj-$(CONFIG_HABANA_AI)		+= habanalabs/
 obj-$(CONFIG_UACCE)		+= uacce/
 obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
+obj-$(CONFIG_INTEL_DLB2)	+= dlb2/
diff --git a/drivers/misc/dlb2/Kconfig b/drivers/misc/dlb2/Kconfig
new file mode 100644
index 000000000000..de40450f6f6a
--- /dev/null
+++ b/drivers/misc/dlb2/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config INTEL_DLB2
+       tristate "Intel(R) Dynamic Load Balancer 2.0 Driver"
+       depends on 64BIT && PCI && X86
+       help
+         This driver supports the Intel(R) Dynamic Load Balancer 2.0 (DLB 2.0)
+         device.
+
+         To compile this driver as a module, choose M here. The module
+         will be called dlb2.
diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
new file mode 100644
index 000000000000..90ae953d2a8f
--- /dev/null
+++ b/drivers/misc/dlb2/Makefile
@@ -0,0 +1,8 @@
+#
+# Makefile for the Intel(R) Dynamic Load Balancer 2.0 (dlb2.ko) driver
+#
+
+obj-$(CONFIG_INTEL_DLB2) := dlb2.o
+
+dlb2-objs :=  \
+  dlb2_main.o \
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
new file mode 100644
index 000000000000..558099db50b8
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_HW_TYPES_H
+#define __DLB2_HW_TYPES_H
+
+#define DLB2_MAX_NUM_VDEVS 16
+#define DLB2_MAX_NUM_DOMAINS 32
+#define DLB2_MAX_NUM_LDB_QUEUES 32 /* LDB == load-balanced */
+#define DLB2_MAX_NUM_DIR_QUEUES 64 /* DIR == directed */
+#define DLB2_MAX_NUM_LDB_PORTS 64
+#define DLB2_MAX_NUM_DIR_PORTS DLB2_MAX_NUM_DIR_QUEUES
+#define DLB2_MAX_NUM_LDB_CREDITS 8192
+#define DLB2_MAX_NUM_DIR_CREDITS 2048
+#define DLB2_MAX_NUM_HIST_LIST_ENTRIES 2048
+#define DLB2_MAX_NUM_AQED_ENTRIES 2048
+#define DLB2_MAX_NUM_QIDS_PER_LDB_CQ 8
+#define DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS 2
+#define DLB2_MAX_NUM_SEQUENCE_NUMBER_MODES 5
+#define DLB2_QID_PRIORITIES 8
+#define DLB2_NUM_ARB_WEIGHTS 8
+#define DLB2_MAX_WEIGHT 255
+#define DLB2_NUM_COS_DOMAINS 4
+#define DLB2_MAX_CQ_COMP_CHECK_LOOPS 409600
+#define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
+#define DLB2_HZ 800000000
+
+#endif /* __DLB2_HW_TYPES_H */
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
new file mode 100644
index 000000000000..fd953a1c4cb5
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -0,0 +1,216 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2018-2020 Intel Corporation */
+
+#include <linux/aer.h>
+#include <linux/cdev.h>
+#include <linux/delay.h>
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/uaccess.h>
+
+#include "dlb2_main.h"
+
+static const char
+dlb2_driver_copyright[] = "Copyright(c) 2018-2020 Intel Corporation";
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) Dynamic Load Balancer 2.0 Driver");
+
+/* The driver lock protects data structures used by multiple devices. */
+static DEFINE_MUTEX(dlb2_driver_lock);
+static struct list_head dlb2_dev_list = LIST_HEAD_INIT(dlb2_dev_list);
+
+static struct class *dlb2_class;
+static dev_t dlb2_dev_number_base;
+
+/*****************************/
+/****** Devfs callbacks ******/
+/*****************************/
+
+static int dlb2_open(struct inode *i, struct file *f)
+{
+	return 0;
+}
+
+static int dlb2_close(struct inode *i, struct file *f)
+{
+	return 0;
+}
+
+static const struct file_operations dlb2_fops = {
+	.owner   = THIS_MODULE,
+	.open    = dlb2_open,
+	.release = dlb2_close,
+};
+
+/**********************************/
+/****** PCI driver callbacks ******/
+/**********************************/
+
+static DEFINE_IDA(dlb2_ids);
+
+static int dlb2_alloc_id(void)
+{
+	return ida_alloc_max(&dlb2_ids, DLB2_MAX_NUM_DEVICES - 1, GFP_KERNEL);
+}
+
+static void dlb2_free_id(int id)
+{
+	ida_free(&dlb2_ids, id);
+}
+
+static int dlb2_probe(struct pci_dev *pdev,
+		      const struct pci_device_id *pdev_id)
+{
+	struct dlb2_dev *dlb2_dev;
+	int ret;
+
+	dev_dbg(&pdev->dev, "probe\n");
+
+	dlb2_dev = devm_kzalloc(&pdev->dev, sizeof(*dlb2_dev), GFP_KERNEL);
+	if (!dlb2_dev)
+		return -ENOMEM;
+
+	pci_set_drvdata(pdev, dlb2_dev);
+
+	dlb2_dev->pdev = pdev;
+
+	dlb2_dev->id = dlb2_alloc_id();
+	if (dlb2_dev->id < 0) {
+		dev_err(&pdev->dev, "probe: device ID allocation failed\n");
+
+		ret = dlb2_dev->id;
+		goto alloc_id_fail;
+	}
+
+	ret = pci_enable_device(pdev);
+	if (ret != 0) {
+		dev_err(&pdev->dev, "pci_enable_device() returned %d\n", ret);
+
+		goto pci_enable_device_fail;
+	}
+
+	ret = pci_request_regions(pdev, dlb2_driver_name);
+	if (ret != 0) {
+		dev_err(&pdev->dev,
+			"pci_request_regions(): returned %d\n", ret);
+
+		goto pci_request_regions_fail;
+	}
+
+	pci_set_master(pdev);
+
+	if (pci_enable_pcie_error_reporting(pdev))
+		dev_info(&pdev->dev, "[%s()] Failed to enable AER\n", __func__);
+
+	mutex_lock(&dlb2_driver_lock);
+	list_add(&dlb2_dev->list, &dlb2_dev_list);
+	mutex_unlock(&dlb2_driver_lock);
+
+	return 0;
+
+pci_request_regions_fail:
+	pci_disable_device(pdev);
+pci_enable_device_fail:
+	dlb2_free_id(dlb2_dev->id);
+alloc_id_fail:
+	devm_kfree(&pdev->dev, dlb2_dev);
+	return ret;
+}
+
+static void dlb2_remove(struct pci_dev *pdev)
+{
+	struct dlb2_dev *dlb2_dev;
+
+	/* Undo all the dlb2_probe() operations */
+	dev_dbg(&pdev->dev, "Cleaning up the DLB driver for removal\n");
+	dlb2_dev = pci_get_drvdata(pdev);
+
+	mutex_lock(&dlb2_driver_lock);
+	list_del(&dlb2_dev->list);
+	mutex_unlock(&dlb2_driver_lock);
+
+	pci_disable_pcie_error_reporting(pdev);
+
+	pci_release_regions(pdev);
+
+	pci_disable_device(pdev);
+
+	dlb2_free_id(dlb2_dev->id);
+
+	devm_kfree(&pdev->dev, dlb2_dev);
+}
+
+static struct pci_device_id dlb2_id_table[] = {
+	{ PCI_DEVICE_DATA(INTEL, DLB2_PF, NULL) },
+	{ 0 }
+};
+MODULE_DEVICE_TABLE(pci, dlb2_id_table);
+
+static struct pci_driver dlb2_pci_driver = {
+	.name		 = (char *)dlb2_driver_name,
+	.id_table	 = dlb2_id_table,
+	.probe		 = dlb2_probe,
+	.remove		 = dlb2_remove,
+};
+
+static int __init dlb2_init_module(void)
+{
+	int err;
+
+	pr_info("%s\n", dlb2_driver_name);
+	pr_info("%s\n", dlb2_driver_copyright);
+
+	dlb2_class = class_create(THIS_MODULE, dlb2_driver_name);
+
+	if (IS_ERR(dlb2_class)) {
+		pr_err("%s: class_create() returned %ld\n",
+		       dlb2_driver_name, PTR_ERR(dlb2_class));
+
+		return PTR_ERR(dlb2_class);
+	}
+
+	/* Allocate one minor number per device */
+	err = alloc_chrdev_region(&dlb2_dev_number_base,
+				  0,
+				  DLB2_MAX_NUM_DEVICES,
+				  dlb2_driver_name);
+
+	if (err < 0) {
+		pr_err("%s: alloc_chrdev_region() returned %d\n",
+		       dlb2_driver_name, err);
+
+		class_destroy(dlb2_class);
+		return err;
+	}
+
+	err = pci_register_driver(&dlb2_pci_driver);
+	if (err < 0) {
+		pr_err("%s: pci_register_driver() returned %d\n",
+		       dlb2_driver_name, err);
+
+		unregister_chrdev_region(dlb2_dev_number_base,
+					 DLB2_MAX_NUM_DEVICES);
+		class_destroy(dlb2_class);
+		return err;
+	}
+
+	return 0;
+}
+
+static void __exit dlb2_exit_module(void)
+{
+	pr_info("%s: exit\n", dlb2_driver_name);
+
+	pci_unregister_driver(&dlb2_pci_driver);
+
+	unregister_chrdev_region(dlb2_dev_number_base,
+				 DLB2_MAX_NUM_DEVICES);
+
+	class_destroy(dlb2_class);
+}
+
+module_init(dlb2_init_module);
+module_exit(dlb2_exit_module);
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
new file mode 100644
index 000000000000..cf3d37b0f0ae
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright(c) 2017-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_MAIN_H
+#define __DLB2_MAIN_H
+
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/ktime.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+
+#include "dlb2_hw_types.h"
+
+static const char dlb2_driver_name[] = KBUILD_MODNAME;
+
+/*
+ * The dlb2 driver uses a different minor number for each device file, of which
+ * there are:
+ * - 33 per device (PF or VF/VDEV): 1 for the device, 32 for scheduling domains
+ * - Up to 17 devices per PF: 1 PF and up to 16 VFs/VDEVs
+ * - Up to 16 PFs per system
+ */
+#define DLB2_MAX_NUM_PFS 16
+#define DLB2_NUM_FUNCS_PER_DEVICE (1 + DLB2_MAX_NUM_VDEVS)
+#define DLB2_MAX_NUM_DEVICES (DLB2_MAX_NUM_PFS * DLB2_NUM_FUNCS_PER_DEVICE)
+
+struct dlb2_dev {
+	struct pci_dev *pdev;
+	struct list_head list;
+	int id;
+};
+
+#endif /* __DLB2_MAIN_H */
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 0ad57693f392..7626ae8f33f6 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2807,6 +2807,8 @@
 #define PCI_DEVICE_ID_INTEL_ESB2_14	0x2698
 #define PCI_DEVICE_ID_INTEL_ESB2_17	0x269b
 #define PCI_DEVICE_ID_INTEL_ESB2_18	0x269e
+#define PCI_DEVICE_ID_INTEL_DLB2_PF	0x2710
+#define PCI_DEVICE_ID_INTEL_DLB2_VF	0x2711
 #define PCI_DEVICE_ID_INTEL_ICH7_0	0x27b8
 #define PCI_DEVICE_ID_INTEL_ICH7_1	0x27b9
 #define PCI_DEVICE_ID_INTEL_ICH7_30	0x27b0
-- 
2.13.6



* [PATCH 02/20] dlb2: initialize PF device
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
  2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 03/20] dlb2: add resource and device initialization Gage Eads
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

The driver detects the device type (PF/VF) at probe time, and assigns the
corresponding 'ops' callbacks accordingly. These callbacks include mapping
and unmapping the PCI BAR space, creating/destroying the device, and
adding/deleting a char device.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/Makefile        |   5 +-
 drivers/misc/dlb2/dlb2_hw_types.h |  28 ++++++++
 drivers/misc/dlb2/dlb2_main.c     |  49 ++++++++++++-
 drivers/misc/dlb2/dlb2_main.h     |  30 ++++++++
 drivers/misc/dlb2/dlb2_pf_ops.c   | 140 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 249 insertions(+), 3 deletions(-)
 create mode 100644 drivers/misc/dlb2/dlb2_pf_ops.c

diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
index 90ae953d2a8f..95e67e5bd8ff 100644
--- a/drivers/misc/dlb2/Makefile
+++ b/drivers/misc/dlb2/Makefile
@@ -4,5 +4,6 @@
 
 obj-$(CONFIG_INTEL_DLB2) := dlb2.o
 
-dlb2-objs :=  \
-  dlb2_main.o \
+dlb2-objs :=    \
+  dlb2_main.o   \
+  dlb2_pf_ops.o \
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index 558099db50b8..860d81bf3103 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -5,6 +5,25 @@
 #ifndef __DLB2_HW_TYPES_H
 #define __DLB2_HW_TYPES_H
 
+#include <linux/io.h>
+
+#define DLB2_PCI_REG_READ(addr)        ioread32(addr)
+#define DLB2_PCI_REG_WRITE(reg, value) iowrite32(value, reg)
+
+/* Read/write register 'reg' in the CSR BAR space */
+#define DLB2_CSR_REG_ADDR(a, reg) ((a)->csr_kva + (reg))
+#define DLB2_CSR_RD(hw, reg) \
+	DLB2_PCI_REG_READ(DLB2_CSR_REG_ADDR((hw), (reg)))
+#define DLB2_CSR_WR(hw, reg, value) \
+	DLB2_PCI_REG_WRITE(DLB2_CSR_REG_ADDR((hw), (reg)), (value))
+
+/* Read/write register 'reg' in the func BAR space */
+#define DLB2_FUNC_REG_ADDR(a, reg) ((a)->func_kva + (reg))
+#define DLB2_FUNC_RD(hw, reg) \
+	DLB2_PCI_REG_READ(DLB2_FUNC_REG_ADDR((hw), (reg)))
+#define DLB2_FUNC_WR(hw, reg, value) \
+	DLB2_PCI_REG_WRITE(DLB2_FUNC_REG_ADDR((hw), (reg)), (value))
+
 #define DLB2_MAX_NUM_VDEVS 16
 #define DLB2_MAX_NUM_DOMAINS 32
 #define DLB2_MAX_NUM_LDB_QUEUES 32 /* LDB == load-balanced */
@@ -26,4 +45,13 @@
 #define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
 #define DLB2_HZ 800000000
 
+struct dlb2_hw {
+	/* BAR 0 address */
+	void __iomem *csr_kva;
+	unsigned long csr_phys_addr;
+	/* BAR 2 address */
+	void __iomem *func_kva;
+	unsigned long func_phys_addr;
+};
+
 #endif /* __DLB2_HW_TYPES_H */
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index fd953a1c4cb5..d6fb46c05985 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -51,6 +51,18 @@ static const struct file_operations dlb2_fops = {
 /****** PCI driver callbacks ******/
 /**********************************/
 
+static void dlb2_assign_ops(struct dlb2_dev *dlb2_dev,
+			    const struct pci_device_id *pdev_id)
+{
+	dlb2_dev->type = pdev_id->driver_data;
+
+	switch (pdev_id->driver_data) {
+	case DLB2_PF:
+		dlb2_dev->ops = &dlb2_pf_ops;
+		break;
+	}
+}
+
 static DEFINE_IDA(dlb2_ids);
 
 static int dlb2_alloc_id(void)
@@ -75,6 +87,8 @@ static int dlb2_probe(struct pci_dev *pdev,
 	if (!dlb2_dev)
 		return -ENOMEM;
 
+	dlb2_assign_ops(dlb2_dev, pdev_id);
+
 	pci_set_drvdata(pdev, dlb2_dev);
 
 	dlb2_dev->pdev = pdev;
@@ -107,12 +121,39 @@ static int dlb2_probe(struct pci_dev *pdev,
 	if (pci_enable_pcie_error_reporting(pdev))
 		dev_info(&pdev->dev, "[%s()] Failed to enable AER\n", __func__);
 
+	ret = dlb2_dev->ops->map_pci_bar_space(dlb2_dev, pdev);
+	if (ret)
+		goto map_pci_bar_fail;
+
+	ret = dlb2_dev->ops->cdev_add(dlb2_dev,
+				      dlb2_dev_number_base,
+				      &dlb2_fops);
+	if (ret)
+		goto cdev_add_fail;
+
+	ret = dlb2_dev->ops->device_create(dlb2_dev, pdev, dlb2_class);
+	if (ret)
+		goto device_add_fail;
+
+	ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+	if (ret)
+		goto dma_set_mask_fail;
+
 	mutex_lock(&dlb2_driver_lock);
 	list_add(&dlb2_dev->list, &dlb2_dev_list);
 	mutex_unlock(&dlb2_driver_lock);
 
 	return 0;
 
+dma_set_mask_fail:
+	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
+device_add_fail:
+	dlb2_dev->ops->cdev_del(dlb2_dev);
+cdev_add_fail:
+	dlb2_dev->ops->unmap_pci_bar_space(dlb2_dev, pdev);
+map_pci_bar_fail:
+	pci_disable_pcie_error_reporting(pdev);
+	pci_release_regions(pdev);
 pci_request_regions_fail:
 	pci_disable_device(pdev);
 pci_enable_device_fail:
@@ -134,6 +175,12 @@ static void dlb2_remove(struct pci_dev *pdev)
 	list_del(&dlb2_dev->list);
 	mutex_unlock(&dlb2_driver_lock);
 
+	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
+
+	dlb2_dev->ops->cdev_del(dlb2_dev);
+
+	dlb2_dev->ops->unmap_pci_bar_space(dlb2_dev, pdev);
+
 	pci_disable_pcie_error_reporting(pdev);
 
 	pci_release_regions(pdev);
@@ -146,7 +193,7 @@ static void dlb2_remove(struct pci_dev *pdev)
 }
 
 static struct pci_device_id dlb2_id_table[] = {
-	{ PCI_DEVICE_DATA(INTEL, DLB2_PF, NULL) },
+	{ PCI_DEVICE_DATA(INTEL, DLB2_PF, DLB2_PF) },
 	{ 0 }
 };
 MODULE_DEVICE_TABLE(pci, dlb2_id_table);
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index cf3d37b0f0ae..5e35d1e7c251 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -28,10 +28,40 @@ static const char dlb2_driver_name[] = KBUILD_MODNAME;
 #define DLB2_NUM_FUNCS_PER_DEVICE (1 + DLB2_MAX_NUM_VDEVS)
 #define DLB2_MAX_NUM_DEVICES (DLB2_MAX_NUM_PFS * DLB2_NUM_FUNCS_PER_DEVICE)
 
+enum dlb2_device_type {
+	DLB2_PF,
+	DLB2_VF,
+};
+
+struct dlb2_dev;
+
+struct dlb2_device_ops {
+	int (*map_pci_bar_space)(struct dlb2_dev *dev, struct pci_dev *pdev);
+	void (*unmap_pci_bar_space)(struct dlb2_dev *dev,
+				    struct pci_dev *pdev);
+	int (*device_create)(struct dlb2_dev *dlb2_dev,
+			     struct pci_dev *pdev,
+			     struct class *dlb2_class);
+	void (*device_destroy)(struct dlb2_dev *dlb2_dev,
+			       struct class *dlb2_class);
+	int (*cdev_add)(struct dlb2_dev *dlb2_dev,
+			dev_t base,
+			const struct file_operations *fops);
+	void (*cdev_del)(struct dlb2_dev *dlb2_dev);
+};
+
+extern struct dlb2_device_ops dlb2_pf_ops;
+
 struct dlb2_dev {
 	struct pci_dev *pdev;
+	struct dlb2_hw hw;
+	struct cdev cdev;
+	struct dlb2_device_ops *ops;
 	struct list_head list;
+	struct device *dlb2_device;
+	enum dlb2_device_type type;
 	int id;
+	dev_t dev_number;
 };
 
 #endif /* __DLB2_MAIN_H */
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
new file mode 100644
index 000000000000..14a50a778675
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2017-2020 Intel Corporation */
+
+#include "dlb2_main.h"
+
+/********************************/
+/****** PCI BAR management ******/
+/********************************/
+
+static void
+dlb2_pf_unmap_pci_bar_space(struct dlb2_dev *dlb2_dev,
+			    struct pci_dev *pdev)
+{
+	pci_iounmap(pdev, dlb2_dev->hw.csr_kva);
+	pci_iounmap(pdev, dlb2_dev->hw.func_kva);
+}
+
+static int
+dlb2_pf_map_pci_bar_space(struct dlb2_dev *dlb2_dev,
+			  struct pci_dev *pdev)
+{
+	/* BAR 0: PF FUNC BAR space */
+	dlb2_dev->hw.func_kva = pci_iomap(pdev, 0, 0);
+	dlb2_dev->hw.func_phys_addr = pci_resource_start(pdev, 0);
+
+	if (!dlb2_dev->hw.func_kva) {
+		dev_err(&pdev->dev, "Cannot iomap BAR 0 (size %llu)\n",
+			pci_resource_len(pdev, 0));
+
+		return -EIO;
+	}
+
+	dev_dbg(&pdev->dev, "BAR 0 iomem base: %p\n",
+		dlb2_dev->hw.func_kva);
+	dev_dbg(&pdev->dev, "BAR 0 start: 0x%llx\n",
+		pci_resource_start(pdev, 0));
+	dev_dbg(&pdev->dev, "BAR 0 len: %llu\n",
+		pci_resource_len(pdev, 0));
+
+	/* BAR 2: PF CSR BAR space */
+	dlb2_dev->hw.csr_kva = pci_iomap(pdev, 2, 0);
+	dlb2_dev->hw.csr_phys_addr = pci_resource_start(pdev, 2);
+
+	if (!dlb2_dev->hw.csr_kva) {
+		dev_err(&pdev->dev, "Cannot iomap BAR 2 (size %llu)\n",
+			pci_resource_len(pdev, 2));
+
+		pci_iounmap(pdev, dlb2_dev->hw.func_kva);
+		return -EIO;
+	}
+
+	dev_dbg(&pdev->dev, "BAR 2 iomem base: %p\n",
+		dlb2_dev->hw.csr_kva);
+	dev_dbg(&pdev->dev, "BAR 2 start: 0x%llx\n",
+		pci_resource_start(pdev, 2));
+	dev_dbg(&pdev->dev, "BAR 2 len: %llu\n",
+		pci_resource_len(pdev, 2));
+
+	return 0;
+}
+
+/*******************************/
+/****** Driver management ******/
+/*******************************/
+
+static int
+dlb2_pf_cdev_add(struct dlb2_dev *dlb2_dev,
+		 dev_t base,
+		 const struct file_operations *fops)
+{
+	int ret;
+
+	dlb2_dev->dev_number = MKDEV(MAJOR(base), MINOR(base) + dlb2_dev->id);
+
+	cdev_init(&dlb2_dev->cdev, fops);
+
+	dlb2_dev->cdev.dev   = dlb2_dev->dev_number;
+	dlb2_dev->cdev.owner = THIS_MODULE;
+
+	ret = cdev_add(&dlb2_dev->cdev, dlb2_dev->cdev.dev, 1);
+	if (ret < 0)
+		dev_err(dlb2_dev->dlb2_device,
+			"%s: cdev_add() returned %d\n",
+			dlb2_driver_name, ret);
+
+	return ret;
+}
+
+static void
+dlb2_pf_cdev_del(struct dlb2_dev *dlb2_dev)
+{
+	cdev_del(&dlb2_dev->cdev);
+}
+
+static int
+dlb2_pf_device_create(struct dlb2_dev *dlb2_dev,
+		      struct pci_dev *pdev,
+		      struct class *dlb2_class)
+{
+	/*
+	 * Create a new device in order to create a /dev/dlb node. This device
+	 * is a child of the DLB PCI device.
+	 */
+	dlb2_dev->dlb2_device = device_create(dlb2_class,
+					      &pdev->dev,
+					      dlb2_dev->dev_number,
+					      dlb2_dev,
+					      "dlb%d",
+					      dlb2_dev->id);
+
+	if (IS_ERR(dlb2_dev->dlb2_device)) {
+		dev_err(&pdev->dev,
+			"%s: device_create() returned %ld\n",
+			dlb2_driver_name, PTR_ERR(dlb2_dev->dlb2_device));
+
+		return PTR_ERR(dlb2_dev->dlb2_device);
+	}
+
+	return 0;
+}
+
+static void
+dlb2_pf_device_destroy(struct dlb2_dev *dlb2_dev,
+		       struct class *dlb2_class)
+{
+	device_destroy(dlb2_class, dlb2_dev->dev_number);
+}
+
+/********************************/
+/****** DLB2 PF Device Ops ******/
+/********************************/
+
+struct dlb2_device_ops dlb2_pf_ops = {
+	.map_pci_bar_space = dlb2_pf_map_pci_bar_space,
+	.unmap_pci_bar_space = dlb2_pf_unmap_pci_bar_space,
+	.device_create = dlb2_pf_device_create,
+	.device_destroy = dlb2_pf_device_destroy,
+	.cdev_add = dlb2_pf_cdev_add,
+	.cdev_del = dlb2_pf_cdev_del,
+};
-- 
2.13.6



* [PATCH 03/20] dlb2: add resource and device initialization
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
  2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
  2020-07-12 13:43 ` [PATCH 02/20] dlb2: initialize PF device Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

This commit adds the hardware resource data structures and functions for
their initialization/teardown and for device power-on. In subsequent
commits, dlb2_resource.c will be expanded to hold the dlb2
resource-management and configuration logic (using the data structures
defined in dlb2_hw_types.h).

This commit also introduces dlb2_bitmap_* functions, a thin convenience
layer wrapping the Linux bitmap interfaces, used by the bitmaps in the dlb2
hardware types.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/Makefile        |   7 +-
 drivers/misc/dlb2/dlb2_bitmap.h   | 121 +++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_hw_types.h | 173 +++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.c     |  51 ++++++++++
 drivers/misc/dlb2/dlb2_main.h     |  11 +++
 drivers/misc/dlb2/dlb2_pf_ops.c   |  73 ++++++++++++++
 drivers/misc/dlb2/dlb2_regs.h     |  83 ++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.c | 197 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h |  46 +++++++++
 9 files changed, 759 insertions(+), 3 deletions(-)
 create mode 100644 drivers/misc/dlb2/dlb2_bitmap.h
 create mode 100644 drivers/misc/dlb2/dlb2_regs.h
 create mode 100644 drivers/misc/dlb2/dlb2_resource.c
 create mode 100644 drivers/misc/dlb2/dlb2_resource.h

diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
index 95e67e5bd8ff..4fdf7ffc555b 100644
--- a/drivers/misc/dlb2/Makefile
+++ b/drivers/misc/dlb2/Makefile
@@ -4,6 +4,7 @@
 
 obj-$(CONFIG_INTEL_DLB2) := dlb2.o
 
-dlb2-objs :=    \
-  dlb2_main.o   \
-  dlb2_pf_ops.o \
+dlb2-objs :=      \
+  dlb2_main.o     \
+  dlb2_pf_ops.o   \
+  dlb2_resource.o \
diff --git a/drivers/misc/dlb2/dlb2_bitmap.h b/drivers/misc/dlb2/dlb2_bitmap.h
new file mode 100644
index 000000000000..c5bb4ba84d5c
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_bitmap.h
@@ -0,0 +1,121 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright(c) 2017-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_OSDEP_BITMAP_H
+#define __DLB2_OSDEP_BITMAP_H
+
+#include <linux/bitmap.h>
+#include <linux/slab.h>
+
+#include "dlb2_main.h"
+
+/*************************/
+/*** Bitmap operations ***/
+/*************************/
+struct dlb2_bitmap {
+	unsigned long *map;
+	unsigned int len;
+};
+
+/**
+ * dlb2_bitmap_alloc() - alloc a bitmap data structure
+ * @bitmap: pointer to dlb2_bitmap structure pointer.
+ * @len: number of entries in the bitmap.
+ *
+ * This function allocates a bitmap and initializes it with length @len. All
+ * entries are initially zero.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or len is 0.
+ * ENOMEM - could not allocate memory for the bitmap data structure.
+ */
+static inline int dlb2_bitmap_alloc(struct dlb2_bitmap **bitmap,
+				    unsigned int len)
+{
+	struct dlb2_bitmap *bm;
+
+	if (!bitmap || len == 0)
+		return -EINVAL;
+
+	bm = kzalloc(sizeof(*bm), GFP_KERNEL);
+	if (!bm)
+		return -ENOMEM;
+
+	bm->map = kcalloc(BITS_TO_LONGS(len), sizeof(*bm->map), GFP_KERNEL);
+	if (!bm->map) {
+		kfree(bm);
+		return -ENOMEM;
+	}
+
+	bm->len = len;
+
+	*bitmap = bm;
+
+	return 0;
+}
+
+/**
+ * dlb2_bitmap_free() - free a previously allocated bitmap data structure
+ * @bitmap: pointer to dlb2_bitmap structure.
+ *
+ * This function frees a bitmap that was allocated with dlb2_bitmap_alloc().
+ */
+static inline void dlb2_bitmap_free(struct dlb2_bitmap *bitmap)
+{
+	if (!bitmap)
+		return;
+
+	kfree(bitmap->map);
+
+	kfree(bitmap);
+}
+
+/**
+ * dlb2_bitmap_fill() - fill a bitmap with all 1s
+ * @bitmap: pointer to dlb2_bitmap structure.
+ *
+ * This function sets all bitmap values to 1.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb2_bitmap_fill(struct dlb2_bitmap *bitmap)
+{
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	bitmap_fill(bitmap->map, bitmap->len);
+
+	return 0;
+}
+
+/**
+ * dlb2_bitmap_zero() - fill a bitmap with all 0s
+ * @bitmap: pointer to dlb2_bitmap structure.
+ *
+ * This function sets all bitmap values to 0.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb2_bitmap_zero(struct dlb2_bitmap *bitmap)
+{
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	bitmap_zero(bitmap->map, bitmap->len);
+
+	return 0;
+}
+
+#endif /*  __DLB2_OSDEP_BITMAP_H */
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index 860d81bf3103..c5c4e45d4ee5 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -6,6 +6,9 @@
 #define __DLB2_HW_TYPES_H
 
 #include <linux/io.h>
+#include <linux/types.h>
+
+#include "dlb2_bitmap.h"
 
 #define DLB2_PCI_REG_READ(addr)        ioread32(addr)
 #define DLB2_PCI_REG_WRITE(reg, value) iowrite32(value, reg)
@@ -45,6 +48,169 @@
 #define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
 #define DLB2_HZ 800000000
 
+struct dlb2_resource_id {
+	u32 phys_id;
+	u32 virt_id;
+	u8 vdev_owned;
+	u8 vdev_id;
+};
+
+struct dlb2_freelist {
+	u32 base;
+	u32 bound;
+	u32 offset;
+};
+
+static inline u32 dlb2_freelist_count(struct dlb2_freelist *list)
+{
+	return list->bound - list->base - list->offset;
+}
+
+struct dlb2_ldb_queue {
+	struct list_head domain_list;
+	struct list_head func_list;
+	struct dlb2_resource_id id;
+	struct dlb2_resource_id domain_id;
+	u32 num_qid_inflights;
+	u32 aqed_limit;
+	u32 sn_group; /* sn == sequence number */
+	u32 sn_slot;
+	u32 num_mappings;
+	u8 sn_cfg_valid;
+	u8 num_pending_additions;
+	u8 owned;
+	u8 configured;
+};
+
+/*
+ * Directed ports and queues are paired by nature, so the driver tracks them
+ * with a single data structure.
+ */
+struct dlb2_dir_pq_pair {
+	struct list_head domain_list;
+	struct list_head func_list;
+	struct dlb2_resource_id id;
+	struct dlb2_resource_id domain_id;
+	u32 ref_cnt;
+	u8 init_tkn_cnt;
+	u8 queue_configured;
+	u8 port_configured;
+	u8 owned;
+	u8 enabled;
+};
+
+enum dlb2_qid_map_state {
+	/* The slot doesn't contain a valid queue mapping */
+	DLB2_QUEUE_UNMAPPED,
+	/* The slot contains a valid queue mapping */
+	DLB2_QUEUE_MAPPED,
+	/* The driver is mapping a queue into this slot */
+	DLB2_QUEUE_MAP_IN_PROG,
+	/* The driver is unmapping a queue from this slot */
+	DLB2_QUEUE_UNMAP_IN_PROG,
+	/*
+	 * The driver is unmapping a queue from this slot, and once complete
+	 * will replace it with another mapping.
+	 */
+	DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP,
+};
+
+struct dlb2_ldb_port_qid_map {
+	enum dlb2_qid_map_state state;
+	u16 qid;
+	u16 pending_qid;
+	u8 priority;
+	u8 pending_priority;
+};
+
+struct dlb2_ldb_port {
+	struct list_head domain_list;
+	struct list_head func_list;
+	struct dlb2_resource_id id;
+	struct dlb2_resource_id domain_id;
+	/* The qid_map represents the hardware QID mapping state. */
+	struct dlb2_ldb_port_qid_map qid_map[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
+	u32 hist_list_entry_base;
+	u32 hist_list_entry_limit;
+	u32 ref_cnt;
+	u8 init_tkn_cnt;
+	u8 num_pending_removals;
+	u8 num_mappings;
+	u8 owned;
+	u8 enabled;
+	u8 configured;
+};
+
+struct dlb2_sn_group {
+	u32 mode;
+	u32 sequence_numbers_per_queue;
+	u32 slot_use_bitmap;
+	u32 id;
+};
+
+struct dlb2_hw_domain {
+	struct dlb2_function_resources *parent_func;
+	struct list_head func_list;
+	struct list_head used_ldb_queues;
+	struct list_head used_ldb_ports[DLB2_NUM_COS_DOMAINS];
+	struct list_head used_dir_pq_pairs;
+	struct list_head avail_ldb_queues;
+	struct list_head avail_ldb_ports[DLB2_NUM_COS_DOMAINS];
+	struct list_head avail_dir_pq_pairs;
+	u32 total_hist_list_entries;
+	u32 avail_hist_list_entries;
+	u32 hist_list_entry_base;
+	u32 hist_list_entry_offset;
+	u32 num_ldb_credits;
+	u32 num_dir_credits;
+	u32 num_avail_aqed_entries;
+	u32 num_used_aqed_entries;
+	struct dlb2_resource_id id;
+	int num_pending_removals;
+	int num_pending_additions;
+	u8 configured;
+	u8 started;
+};
+
+struct dlb2_function_resources {
+	struct list_head avail_domains;
+	struct list_head used_domains;
+	struct list_head avail_ldb_queues;
+	struct list_head avail_ldb_ports[DLB2_NUM_COS_DOMAINS];
+	struct list_head avail_dir_pq_pairs;
+	struct dlb2_bitmap *avail_hist_list_entries;
+	u32 num_avail_domains;
+	u32 num_avail_ldb_queues;
+	u32 num_avail_ldb_ports[DLB2_NUM_COS_DOMAINS];
+	u32 num_avail_dir_pq_pairs;
+	u32 num_avail_qed_entries;
+	u32 num_avail_dqed_entries;
+	u32 num_avail_aqed_entries;
+	u8 locked; /* (VDEV only) */
+};
+
+/*
+ * After initialization, each resource in dlb2_hw_resources is located in one
+ * of the following lists:
+ * -- The PF's available resources list. These are unconfigured resources owned
+ *	by the PF and not allocated to a dlb2 scheduling domain.
+ * -- A VDEV's available resources list. These are VDEV-owned unconfigured
+ *	resources not allocated to a dlb2 scheduling domain.
+ * -- A domain's available resources list. These are domain-owned unconfigured
+ *	resources.
+ * -- A domain's used resources list. These are domain-owned configured
+ *	resources.
+ *
+ * A resource moves to a new list when a VDEV or domain is created or destroyed,
+ * or when the resource is configured.
+ */
+struct dlb2_hw_resources {
+	struct dlb2_ldb_queue ldb_queues[DLB2_MAX_NUM_LDB_QUEUES];
+	struct dlb2_ldb_port ldb_ports[DLB2_MAX_NUM_LDB_PORTS];
+	struct dlb2_dir_pq_pair dir_pq_pairs[DLB2_MAX_NUM_DIR_PORTS];
+	struct dlb2_sn_group sn_groups[DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS];
+};
+
 struct dlb2_hw {
 	/* BAR 0 address */
 	void __iomem *csr_kva;
@@ -52,6 +218,13 @@ struct dlb2_hw {
 	/* BAR 2 address */
 	void __iomem *func_kva;
 	unsigned long func_phys_addr;
+
+	/* Resource tracking */
+	struct dlb2_hw_resources rsrcs;
+	struct dlb2_function_resources pf;
+	struct dlb2_function_resources vdev[DLB2_MAX_NUM_VDEVS];
+	struct dlb2_hw_domain domains[DLB2_MAX_NUM_DOMAINS];
+	u8 cos_reservation[DLB2_NUM_COS_DOMAINS];
 };
 
 #endif /* __DLB2_HW_TYPES_H */
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index d6fb46c05985..ae39dfea51e3 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -12,6 +12,7 @@
 #include <linux/uaccess.h>
 
 #include "dlb2_main.h"
+#include "dlb2_resource.h"
 
 static const char
 dlb2_driver_copyright[] = "Copyright(c) 2018-2020 Intel Corporation";
@@ -27,6 +28,23 @@ static struct list_head dlb2_dev_list = LIST_HEAD_INIT(dlb2_dev_list);
 static struct class *dlb2_class;
 static dev_t dlb2_dev_number_base;
 
+static int dlb2_reset_device(struct pci_dev *pdev)
+{
+	int ret;
+
+	ret = pci_save_state(pdev);
+	if (ret)
+		return ret;
+
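+	/* An FLR clears config space; it is saved above and restored below */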
+	ret = __pci_reset_function_locked(pdev);
+	if (ret)
+		return ret;
+
+	pci_restore_state(pdev);
+
+	return 0;
+}
+
 /*****************************/
 /****** Devfs callbacks ******/
 /*****************************/
@@ -139,12 +157,41 @@ static int dlb2_probe(struct pci_dev *pdev,
 	if (ret)
 		goto dma_set_mask_fail;
 
+	/*
+	 * PM enable must be done before any other MMIO accesses, and this
+	 * setting is persistent across device reset.
+	 */
+	dlb2_dev->ops->enable_pm(dlb2_dev);
+
+	ret = dlb2_dev->ops->wait_for_device_ready(dlb2_dev, pdev);
+	if (ret)
+		goto wait_for_device_ready_fail;
+
+	ret = dlb2_reset_device(pdev);
+	if (ret)
+		goto dlb2_reset_fail;
+
+	ret = dlb2_resource_init(&dlb2_dev->hw);
+	if (ret)
+		goto resource_init_fail;
+
+	ret = dlb2_dev->ops->init_driver_state(dlb2_dev);
+	if (ret)
+		goto init_driver_state_fail;
+
+	dlb2_dev->ops->init_hardware(dlb2_dev);
+
 	mutex_lock(&dlb2_driver_lock);
 	list_add(&dlb2_dev->list, &dlb2_dev_list);
 	mutex_unlock(&dlb2_driver_lock);
 
 	return 0;
 
+init_driver_state_fail:
+	dlb2_resource_free(&dlb2_dev->hw);
+resource_init_fail:
+dlb2_reset_fail:
+wait_for_device_ready_fail:
 dma_set_mask_fail:
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
 device_add_fail:
@@ -175,6 +222,10 @@ static void dlb2_remove(struct pci_dev *pdev)
 	list_del(&dlb2_dev->list);
 	mutex_unlock(&dlb2_driver_lock);
 
+	dlb2_dev->ops->free_driver_state(dlb2_dev);
+
+	dlb2_resource_free(&dlb2_dev->hw);
+
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
 
 	dlb2_dev->ops->cdev_del(dlb2_dev);
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 5e35d1e7c251..0bb646b82e5c 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -39,6 +39,8 @@ struct dlb2_device_ops {
 	int (*map_pci_bar_space)(struct dlb2_dev *dev, struct pci_dev *pdev);
 	void (*unmap_pci_bar_space)(struct dlb2_dev *dev,
 				    struct pci_dev *pdev);
+	int (*init_driver_state)(struct dlb2_dev *dev);
+	void (*free_driver_state)(struct dlb2_dev *dev);
 	int (*device_create)(struct dlb2_dev *dlb2_dev,
 			     struct pci_dev *pdev,
 			     struct class *dlb2_class);
@@ -48,6 +50,10 @@ struct dlb2_device_ops {
 			dev_t base,
 			const struct file_operations *fops);
 	void (*cdev_del)(struct dlb2_dev *dlb2_dev);
+	void (*enable_pm)(struct dlb2_dev *dev);
+	int (*wait_for_device_ready)(struct dlb2_dev *dev,
+				     struct pci_dev *pdev);
+	void (*init_hardware)(struct dlb2_dev *dev);
 };
 
 extern struct dlb2_device_ops dlb2_pf_ops;
@@ -59,6 +65,11 @@ struct dlb2_dev {
 	struct dlb2_device_ops *ops;
 	struct list_head list;
 	struct device *dlb2_device;
+	/*
+	 * The resource mutex serializes access to driver data structures and
+	 * hardware registers.
+	 */
+	struct mutex resource_mutex;
 	enum dlb2_device_type type;
 	int id;
 	dev_t dev_number;
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 14a50a778675..01664211d60a 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -1,7 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2017-2020 Intel Corporation */
 
+#include <linux/delay.h>
+
 #include "dlb2_main.h"
+#include "dlb2_regs.h"
+#include "dlb2_resource.h"
 
 /********************************/
 /****** PCI BAR management ******/
@@ -64,6 +68,19 @@ dlb2_pf_map_pci_bar_space(struct dlb2_dev *dlb2_dev,
 /*******************************/
 
 static int
+dlb2_pf_init_driver_state(struct dlb2_dev *dlb2_dev)
+{
+	mutex_init(&dlb2_dev->resource_mutex);
+
+	return 0;
+}
+
+static void
+dlb2_pf_free_driver_state(struct dlb2_dev *dlb2_dev)
+{
+}
+
+static int
 dlb2_pf_cdev_add(struct dlb2_dev *dlb2_dev,
 		 dev_t base,
 		 const struct file_operations *fops)
@@ -126,6 +143,57 @@ dlb2_pf_device_destroy(struct dlb2_dev *dlb2_dev,
 	device_destroy(dlb2_class, dlb2_dev->dev_number);
 }
 
+static void
+dlb2_pf_enable_pm(struct dlb2_dev *dlb2_dev)
+{
+	dlb2_clr_pmcsr_disable(&dlb2_dev->hw);
+}
+
+#define DLB2_READY_RETRY_LIMIT 1000
+static int
+dlb2_pf_wait_for_device_ready(struct dlb2_dev *dlb2_dev,
+			      struct pci_dev *pdev)
+{
+	u32 retries = DLB2_READY_RETRY_LIMIT;
+
+	/* Allow at least 1s for the device to become active after power-on */
+	do {
+		union dlb2_cfg_mstr_cfg_diagnostic_idle_status idle;
+		union dlb2_cfg_mstr_cfg_pm_status pm_st;
+		u32 addr;
+
+		addr = DLB2_CFG_MSTR_CFG_PM_STATUS;
+
+		pm_st.val = DLB2_CSR_RD(&dlb2_dev->hw, addr);
+
+		addr = DLB2_CFG_MSTR_CFG_DIAGNOSTIC_IDLE_STATUS;
+
+		idle.val = DLB2_CSR_RD(&dlb2_dev->hw, addr);
+
+		if (pm_st.field.pmsm == 1 && idle.field.dlb_func_idle == 1)
+			break;
+
+		dev_dbg(&pdev->dev, "[%s()] pm_st: 0x%x\n",
+			__func__, pm_st.val);
+		dev_dbg(&pdev->dev, "[%s()] idle: 0x%x\n",
+			__func__, idle.val);
+
+		usleep_range(1000, 2000);
+	} while (--retries);
+
+	if (!retries) {
+		dev_err(&pdev->dev, "Device idle test failed\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static void
+dlb2_pf_init_hardware(struct dlb2_dev *dlb2_dev)
+{
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -133,8 +201,13 @@ dlb2_pf_device_destroy(struct dlb2_dev *dlb2_dev,
 struct dlb2_device_ops dlb2_pf_ops = {
 	.map_pci_bar_space = dlb2_pf_map_pci_bar_space,
 	.unmap_pci_bar_space = dlb2_pf_unmap_pci_bar_space,
+	.init_driver_state = dlb2_pf_init_driver_state,
+	.free_driver_state = dlb2_pf_free_driver_state,
 	.device_create = dlb2_pf_device_create,
 	.device_destroy = dlb2_pf_device_destroy,
 	.cdev_add = dlb2_pf_cdev_add,
 	.cdev_del = dlb2_pf_cdev_del,
+	.enable_pm = dlb2_pf_enable_pm,
+	.wait_for_device_ready = dlb2_pf_wait_for_device_ready,
+	.init_hardware = dlb2_pf_init_hardware,
 };
diff --git a/drivers/misc/dlb2/dlb2_regs.h b/drivers/misc/dlb2/dlb2_regs.h
new file mode 100644
index 000000000000..77612c40ce0b
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_regs.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_REGS_H
+#define __DLB2_REGS_H
+
+#include <linux/types.h>
+
+#define DLB2_CFG_MSTR_CFG_DIAGNOSTIC_IDLE_STATUS 0xb4000004
+#define DLB2_CFG_MSTR_CFG_DIAGNOSTIC_IDLE_STATUS_RST 0x9d0fffff
+union dlb2_cfg_mstr_cfg_diagnostic_idle_status {
+	struct {
+		u32 chp_pipeidle : 1;
+		u32 rop_pipeidle : 1;
+		u32 lsp_pipeidle : 1;
+		u32 nalb_pipeidle : 1;
+		u32 ap_pipeidle : 1;
+		u32 dp_pipeidle : 1;
+		u32 qed_pipeidle : 1;
+		u32 dqed_pipeidle : 1;
+		u32 aqed_pipeidle : 1;
+		u32 sys_pipeidle : 1;
+		u32 chp_unit_idle : 1;
+		u32 rop_unit_idle : 1;
+		u32 lsp_unit_idle : 1;
+		u32 nalb_unit_idle : 1;
+		u32 ap_unit_idle : 1;
+		u32 dp_unit_idle : 1;
+		u32 qed_unit_idle : 1;
+		u32 dqed_unit_idle : 1;
+		u32 aqed_unit_idle : 1;
+		u32 sys_unit_idle : 1;
+		u32 rsvd1 : 4;
+		u32 mstr_cfg_ring_idle : 1;
+		u32 mstr_cfg_mstr_idle : 1;
+		u32 mstr_flr_clkreq_b : 1;
+		u32 mstr_proc_idle : 1;
+		u32 mstr_proc_idle_masked : 1;
+		u32 rsvd0 : 2;
+		u32 dlb_func_idle : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CFG_MSTR_CFG_PM_STATUS 0xb4000014
+#define DLB2_CFG_MSTR_CFG_PM_STATUS_RST 0x100403e
+union dlb2_cfg_mstr_cfg_pm_status {
+	struct {
+		u32 prochot : 1;
+		u32 pgcb_dlb_idle : 1;
+		u32 pgcb_dlb_pg_rdy_ack_b : 1;
+		u32 pmsm_pgcb_req_b : 1;
+		u32 pgbc_pmc_pg_req_b : 1;
+		u32 pmc_pgcb_pg_ack_b : 1;
+		u32 pmc_pgcb_fet_en_b : 1;
+		u32 pgcb_fet_en_b : 1;
+		u32 rsvz0 : 1;
+		u32 rsvz1 : 1;
+		u32 fuse_force_on : 1;
+		u32 fuse_proc_disable : 1;
+		u32 rsvz2 : 1;
+		u32 rsvz3 : 1;
+		u32 pm_fsm_d0tod3_ok : 1;
+		u32 pm_fsm_d3tod0_ok : 1;
+		u32 dlb_in_d3 : 1;
+		u32 rsvz4 : 7;
+		u32 pmsm : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE 0xb4000018
+#define DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE_RST 0x1
+union dlb2_cfg_mstr_cfg_pm_pmcsr_disable {
+	struct {
+		u32 disable : 1;
+		u32 rsvz0 : 31;
+	} field;
+	u32 val;
+};
+
+#endif /* __DLB2_REGS_H */
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
new file mode 100644
index 000000000000..036ab10d7a98
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+/* Copyright(c) 2016-2020 Intel Corporation */
+
+#include "dlb2_bitmap.h"
+#include "dlb2_hw_types.h"
+#include "dlb2_regs.h"
+#include "dlb2_resource.h"
+
+static void dlb2_init_fn_rsrc_lists(struct dlb2_function_resources *rsrc)
+{
+	int i;
+
+	INIT_LIST_HEAD(&rsrc->avail_domains);
+	INIT_LIST_HEAD(&rsrc->used_domains);
+	INIT_LIST_HEAD(&rsrc->avail_ldb_queues);
+	INIT_LIST_HEAD(&rsrc->avail_dir_pq_pairs);
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		INIT_LIST_HEAD(&rsrc->avail_ldb_ports[i]);
+}
+
+static void dlb2_init_domain_rsrc_lists(struct dlb2_hw_domain *domain)
+{
+	int i;
+
+	INIT_LIST_HEAD(&domain->used_ldb_queues);
+	INIT_LIST_HEAD(&domain->used_dir_pq_pairs);
+	INIT_LIST_HEAD(&domain->avail_ldb_queues);
+	INIT_LIST_HEAD(&domain->avail_dir_pq_pairs);
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		INIT_LIST_HEAD(&domain->used_ldb_ports[i]);
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		INIT_LIST_HEAD(&domain->avail_ldb_ports[i]);
+}
+
+void dlb2_resource_free(struct dlb2_hw *hw)
+{
+	int i;
+
+	if (hw->pf.avail_hist_list_entries)
+		dlb2_bitmap_free(hw->pf.avail_hist_list_entries);
+
+	for (i = 0; i < DLB2_MAX_NUM_VDEVS; i++) {
+		if (hw->vdev[i].avail_hist_list_entries)
+			dlb2_bitmap_free(hw->vdev[i].avail_hist_list_entries);
+	}
+}
+
+int dlb2_resource_init(struct dlb2_hw *hw)
+{
+	struct list_head *list;
+	unsigned int i;
+	int ret;
+
+	/*
+	 * For optimal load-balancing, ports that map to one or more QIDs in
+	 * common should not be in numerical sequence. This is application
+	 * dependent, but the driver interleaves port IDs as much as possible
+	 * to reduce the likelihood of this. This initial allocation maximizes
+	 * the average distance between an ID and its immediate neighbors (i.e.
+	 * the distance from 1 to 0 and to 2, the distance from 2 to 1 and to
+	 * 3, etc.).
+	 */
+	u8 init_ldb_port_allocation[DLB2_MAX_NUM_LDB_PORTS] = {
+		0,  7,  14,  5, 12,  3, 10,  1,  8, 15,  6, 13,  4, 11,  2,  9,
+		16, 23, 30, 21, 28, 19, 26, 17, 24, 31, 22, 29, 20, 27, 18, 25,
+		32, 39, 46, 37, 44, 35, 42, 33, 40, 47, 38, 45, 36, 43, 34, 41,
+		48, 55, 62, 53, 60, 51, 58, 49, 56, 63, 54, 61, 52, 59, 50, 57,
+	};
+
+	/* Zero-out resource tracking data structures */
+	memset(&hw->rsrcs, 0, sizeof(hw->rsrcs));
+	memset(&hw->pf, 0, sizeof(hw->pf));
+
+	dlb2_init_fn_rsrc_lists(&hw->pf);
+
+	for (i = 0; i < DLB2_MAX_NUM_VDEVS; i++) {
+		memset(&hw->vdev[i], 0, sizeof(hw->vdev[i]));
+		dlb2_init_fn_rsrc_lists(&hw->vdev[i]);
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		memset(&hw->domains[i], 0, sizeof(hw->domains[i]));
+		dlb2_init_domain_rsrc_lists(&hw->domains[i]);
+		hw->domains[i].parent_func = &hw->pf;
+	}
+
+	/* Give all resources to the PF driver */
+	hw->pf.num_avail_domains = DLB2_MAX_NUM_DOMAINS;
+	for (i = 0; i < hw->pf.num_avail_domains; i++) {
+		list = &hw->domains[i].func_list;
+
+		list_add(list, &hw->pf.avail_domains);
+	}
+
+	hw->pf.num_avail_ldb_queues = DLB2_MAX_NUM_LDB_QUEUES;
+	for (i = 0; i < hw->pf.num_avail_ldb_queues; i++) {
+		list = &hw->rsrcs.ldb_queues[i].func_list;
+
+		list_add(list, &hw->pf.avail_ldb_queues);
+	}
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		hw->pf.num_avail_ldb_ports[i] =
+			DLB2_MAX_NUM_LDB_PORTS / DLB2_NUM_COS_DOMAINS;
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
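+		/*
+		 * With 64 LDB ports and 4 classes-of-service, each class gets
+		 * 16 ports, so i >> 4 (i.e. i >> DLB2_NUM_COS_DOMAINS) picks
+		 * the class. This only works because the ports-per-class
+		 * count equals 2^DLB2_NUM_COS_DOMAINS.
+		 */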
+		int cos_id = i >> DLB2_NUM_COS_DOMAINS;
+		struct dlb2_ldb_port *port;
+
+		port = &hw->rsrcs.ldb_ports[init_ldb_port_allocation[i]];
+
+		list_add(&port->func_list, &hw->pf.avail_ldb_ports[cos_id]);
+	}
+
+	hw->pf.num_avail_dir_pq_pairs = DLB2_MAX_NUM_DIR_PORTS;
+	for (i = 0; i < hw->pf.num_avail_dir_pq_pairs; i++) {
+		list = &hw->rsrcs.dir_pq_pairs[i].func_list;
+
+		list_add(list, &hw->pf.avail_dir_pq_pairs);
+	}
+
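+	/*
+	 * QED entries back load-balanced credits, DQED entries back directed
+	 * credits, and AQED entries hold atomic-scheduling inflight state.
+	 */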
+	hw->pf.num_avail_qed_entries = DLB2_MAX_NUM_LDB_CREDITS;
+	hw->pf.num_avail_dqed_entries = DLB2_MAX_NUM_DIR_CREDITS;
+	hw->pf.num_avail_aqed_entries = DLB2_MAX_NUM_AQED_ENTRIES;
+
+	ret = dlb2_bitmap_alloc(&hw->pf.avail_hist_list_entries,
+				DLB2_MAX_NUM_HIST_LIST_ENTRIES);
+	if (ret)
+		goto unwind;
+
+	ret = dlb2_bitmap_fill(hw->pf.avail_hist_list_entries);
+	if (ret)
+		goto unwind;
+
+	for (i = 0; i < DLB2_MAX_NUM_VDEVS; i++) {
+		ret = dlb2_bitmap_alloc(&hw->vdev[i].avail_hist_list_entries,
+					DLB2_MAX_NUM_HIST_LIST_ENTRIES);
+		if (ret)
+			goto unwind;
+
+		ret = dlb2_bitmap_zero(hw->vdev[i].avail_hist_list_entries);
+		if (ret)
+			goto unwind;
+	}
+
+	/* Initialize the hardware resource IDs */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		hw->domains[i].id.phys_id = i;
+		hw->domains[i].id.vdev_owned = false;
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_QUEUES; i++) {
+		hw->rsrcs.ldb_queues[i].id.phys_id = i;
+		hw->rsrcs.ldb_queues[i].id.vdev_owned = false;
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
+		hw->rsrcs.ldb_ports[i].id.phys_id = i;
+		hw->rsrcs.ldb_ports[i].id.vdev_owned = false;
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++) {
+		hw->rsrcs.dir_pq_pairs[i].id.phys_id = i;
+		hw->rsrcs.dir_pq_pairs[i].id.vdev_owned = false;
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+		hw->rsrcs.sn_groups[i].id = i;
+		/* Default mode (0) is 64 sequence numbers per queue */
+		hw->rsrcs.sn_groups[i].mode = 0;
+		hw->rsrcs.sn_groups[i].sequence_numbers_per_queue = 64;
+		hw->rsrcs.sn_groups[i].slot_use_bitmap = 0;
+	}
+
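+	/* Start with an equal bandwidth reservation (25%) per class-of-service */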
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		hw->cos_reservation[i] = 100 / DLB2_NUM_COS_DOMAINS;
+
+	return 0;
+
+unwind:
+	dlb2_resource_free(hw);
+
+	return ret;
+}
+
+void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw)
+{
+	union dlb2_cfg_mstr_cfg_pm_pmcsr_disable r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE);
+
+	r0.field.disable = 0;
+
+	DLB2_CSR_WR(hw, DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE, r0.val);
+}
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
new file mode 100644
index 000000000000..73528943d36e
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_RESOURCE_H
+#define __DLB2_RESOURCE_H
+
+#include <linux/types.h>
+
+#include "dlb2_hw_types.h"
+
+/**
+ * dlb2_resource_init() - initialize the device
+ * @hw: pointer to struct dlb2_hw.
+ *
+ * This function initializes the device's software state (pointed to by the hw
+ * argument) and programs global scheduling QoS registers. This function should
+ * be called during driver initialization.
+ *
+ * The dlb2_hw struct must be unique per DLB 2.0 device and persist until the
+ * device is reset.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ */
+int dlb2_resource_init(struct dlb2_hw *hw);
+
+/**
+ * dlb2_resource_free() - free device state memory
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function frees software state pointed to by dlb2_hw. This function
+ * should be called when resetting the device or unloading the driver.
+ */
+void dlb2_resource_free(struct dlb2_hw *hw);
+
+/**
+ * dlb2_clr_pmcsr_disable() - power on bulk of DLB 2.0 logic
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * Clearing the PMCSR must be done at initialization to make the device fully
+ * operational.
+ */
+void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw);
+
+#endif /* __DLB2_RESOURCE_H */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (2 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 03/20] dlb2: add resource and device initialization Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 14:42   ` Randy Dunlap
                     ` (2 more replies)
  2020-07-12 13:43 ` [PATCH 05/20] dlb2: add sched domain config and reset support Gage Eads
                   ` (15 subsequent siblings)
  19 siblings, 3 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

This commit introduces the dlb2 device ioctl layer, and the first four
ioctls: query device version, driver version, and available resources; and
create a scheduling domain. This commit also introduces the user-space
interface file dlb2_user.h.

The PF hardware operation for scheduling domain creation will be added in a
subsequent commit.
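
As an illustration (not part of the patch), a minimal user-space
compatibility check against this interface might look as follows, assuming
fd is open on the dlb2 device node:

	struct dlb2_get_driver_version_args args = { 0 };
	struct dlb2_cmd_response resp = { 0 };

	args.response = (__u64)(uintptr_t)&resp;

	if (ioctl(fd, DLB2_IOC_GET_DRIVER_VERSION, &args) == 0 &&
	    !dlb2_version_incompatible(resp.id)) {
		/* ABI is compatible; the remaining ioctls are safe to use */
	}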

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/Makefile        |   1 +
 drivers/misc/dlb2/dlb2_bitmap.h   |  63 ++++++++++
 drivers/misc/dlb2/dlb2_ioctl.c    | 186 ++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_ioctl.h    |  14 +++
 drivers/misc/dlb2/dlb2_main.c     |  33 +++++-
 drivers/misc/dlb2/dlb2_main.h     |   7 ++
 drivers/misc/dlb2/dlb2_pf_ops.c   |  21 ++++
 drivers/misc/dlb2/dlb2_resource.c |  48 ++++++++
 drivers/misc/dlb2/dlb2_resource.h |  22 ++++
 include/uapi/linux/dlb2_user.h    | 236 ++++++++++++++++++++++++++++++++++++++
 10 files changed, 630 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/dlb2/dlb2_ioctl.c
 create mode 100644 drivers/misc/dlb2/dlb2_ioctl.h
 create mode 100644 include/uapi/linux/dlb2_user.h

diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
index 4fdf7ffc555b..18b5498b20e6 100644
--- a/drivers/misc/dlb2/Makefile
+++ b/drivers/misc/dlb2/Makefile
@@ -6,5 +6,6 @@ obj-$(CONFIG_INTEL_DLB2) := dlb2.o
 
 dlb2-objs :=      \
   dlb2_main.o     \
+  dlb2_ioctl.o    \
   dlb2_pf_ops.o   \
   dlb2_resource.o \
diff --git a/drivers/misc/dlb2/dlb2_bitmap.h b/drivers/misc/dlb2/dlb2_bitmap.h
index c5bb4ba84d5c..2d2d2927b0ec 100644
--- a/drivers/misc/dlb2/dlb2_bitmap.h
+++ b/drivers/misc/dlb2/dlb2_bitmap.h
@@ -118,4 +118,67 @@ static inline int dlb2_bitmap_zero(struct dlb2_bitmap *bitmap)
 	return 0;
 }
 
+/**
+ * dlb2_bitmap_count() - returns the number of set bits
+ * @bitmap: pointer to dlb2_bitmap structure.
+ *
+ * This function returns the number of set bits in the bitmap.
+ *
+ * Return:
+ * Returns the number of set bits upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb2_bitmap_count(struct dlb2_bitmap *bitmap)
+{
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	return bitmap_weight(bitmap->map, bitmap->len);
+}
+
+/**
+ * dlb2_bitmap_longest_set_range() - returns longest contiguous range of set
+ *				     bits
+ * @bitmap: pointer to dlb2_bitmap structure.
+ *
+ * Return:
+ * Returns the bitmap's longest contiguous range of set bits upon success,
+ * <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb2_bitmap_longest_set_range(struct dlb2_bitmap *bitmap)
+{
+	unsigned int bits_per_long;
+	unsigned int i, j;
+	int max_len, len;
+
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	if (dlb2_bitmap_count(bitmap) == 0)
+		return 0;
+
+	max_len = 0;
+	len = 0;
+	bits_per_long = sizeof(unsigned long) * BITS_PER_BYTE;
+
+	for (i = 0; i < BITS_TO_LONGS(bitmap->len); i++) {
+		for (j = 0; j < bits_per_long; j++) {
+			if ((i * bits_per_long + j) >= bitmap->len)
+				break;
+
+			len = (test_bit(j, &bitmap->map[i])) ? len + 1 : 0;
+
+			if (len > max_len)
+				max_len = len;
+		}
+	}
+
+	return max_len;
+}
+
 #endif /*  __DLB2_OSDEP_BITMAP_H */
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
new file mode 100644
index 000000000000..97a7220f0ea0
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2017-2020 Intel Corporation */
+
+#include <linux/uaccess.h>
+
+#include <uapi/linux/dlb2_user.h>
+
+#include "dlb2_ioctl.h"
+#include "dlb2_main.h"
+
+/* Verify the ioctl argument size and copy the argument into kernel memory */
+static int dlb2_copy_from_user(struct dlb2_dev *dev,
+			       unsigned long user_arg,
+			       u16 user_size,
+			       void *arg,
+			       size_t size)
+{
+	if (user_size != size) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid ioctl size\n", __func__);
+		return -EINVAL;
+	}
+
+	if (copy_from_user(arg, (void __user *)user_arg, size)) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid ioctl argument pointer\n", __func__);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int dlb2_copy_resp_to_user(struct dlb2_dev *dev,
+				  unsigned long user_resp,
+				  struct dlb2_cmd_response *resp)
+{
+	if (copy_to_user((void __user *)user_resp, resp, sizeof(*resp))) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid ioctl response pointer\n", __func__);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+/* [7:0]: device revision, [15:8]: device version */
+#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
+
+static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
+					 unsigned long user_arg,
+					 u16 size)
+{
+	struct dlb2_get_device_version_args arg;
+	struct dlb2_cmd_response response;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	response.status = 0;
+	response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return 0;
+}
+
+static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
+					  unsigned long user_arg,
+					  u16 size)
+{
+	struct dlb2_create_sched_domain_args arg;
+	struct dlb2_cmd_response response = {0};
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->create_sched_domain(&dev->hw, &arg, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_ioctl_get_num_resources(struct dlb2_dev *dev,
+					unsigned long user_arg,
+					u16 size)
+{
+	struct dlb2_get_num_resources_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->get_num_resources(&dev->hw, &arg);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)user_arg, &arg, sizeof(arg))) {
+		dev_err(dev->dlb2_device, "Invalid DLB resources pointer\n");
+		return -EFAULT;
+	}
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_ioctl_get_driver_version(struct dlb2_dev *dev,
+					 unsigned long user_arg,
+					 u16 size)
+{
+	struct dlb2_get_driver_version_args arg;
+	struct dlb2_cmd_response response;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	response.status = 0;
+	response.id = DLB2_VERSION;
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return 0;
+}
+
+typedef int (*dlb2_ioctl_callback_fn_t)(struct dlb2_dev *dev,
+					unsigned long arg,
+					u16 size);
+
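+/*
+ * The table is indexed by the ioctl's _IOC_NR, so entries must stay in the
+ * same order as enum dlb2_user_interface_commands in dlb2_user.h.
+ */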
+static dlb2_ioctl_callback_fn_t dlb2_ioctl_callback_fns[NUM_DLB2_CMD] = {
+	dlb2_ioctl_get_device_version,
+	dlb2_ioctl_create_sched_domain,
+	dlb2_ioctl_get_num_resources,
+	dlb2_ioctl_get_driver_version,
+};
+
+int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
+			  unsigned int cmd,
+			  unsigned long arg)
+{
+	u16 sz = _IOC_SIZE(cmd);
+
+	if (_IOC_NR(cmd) >= NUM_DLB2_CMD) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Unexpected DLB command %d\n",
+			__func__, _IOC_NR(cmd));
+		return -EINVAL;
+	}
+
+	return dlb2_ioctl_callback_fns[_IOC_NR(cmd)](dev, arg, sz);
+}
diff --git a/drivers/misc/dlb2/dlb2_ioctl.h b/drivers/misc/dlb2/dlb2_ioctl.h
new file mode 100644
index 000000000000..476548cdd33c
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_ioctl.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright(c) 2017-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_IOCTL_H
+#define __DLB2_IOCTL_H
+
+#include "dlb2_main.h"
+
+int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
+			  unsigned int cmd,
+			  unsigned long arg);
+
+#endif /* __DLB2_IOCTL_H */
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index ae39dfea51e3..e3e32c6f3d63 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -11,15 +11,25 @@
 #include <linux/pci.h>
 #include <linux/uaccess.h>
 
+#include "dlb2_ioctl.h"
 #include "dlb2_main.h"
 #include "dlb2_resource.h"
 
 static const char
 dlb2_driver_copyright[] = "Copyright(c) 2018-2020 Intel Corporation";
 
+#define TO_STR2(s) #s
+#define TO_STR(s) TO_STR2(s)
+
+#define DRV_VERSION \
+	TO_STR(DLB2_VERSION_MAJOR_NUMBER) "." \
+	TO_STR(DLB2_VERSION_MINOR_NUMBER) "." \
+	TO_STR(DLB2_VERSION_REVISION_NUMBER)
+
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Copyright(c) 2018-2020 Intel Corporation");
 MODULE_DESCRIPTION("Intel(R) Dynamic Load Balancer 2.0 Driver");
+MODULE_VERSION(DRV_VERSION);
 
 /* The driver lock protects data structures used by multiple devices. */
 static DEFINE_MUTEX(dlb2_driver_lock);
@@ -59,10 +69,28 @@ static int dlb2_close(struct inode *i, struct file *f)
 	return 0;
 }
 
+static long
+dlb2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct dlb2_dev *dev;
+
+	dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);
+
+	if (_IOC_TYPE(cmd) != DLB2_IOC_MAGIC) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Bad magic number!\n", __func__);
+		return -EINVAL;
+	}
+
+	return dlb2_ioctl_dispatcher(dev, cmd, arg);
+}
+
 static const struct file_operations dlb2_fops = {
 	.owner   = THIS_MODULE,
 	.open    = dlb2_open,
 	.release = dlb2_close,
+	.unlocked_ioctl = dlb2_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
 };
 
 /**********************************/
@@ -260,7 +288,10 @@ static int __init dlb2_init_module(void)
 {
 	int err;
 
-	pr_info("%s\n", dlb2_driver_name);
+	pr_info("%s - version %d.%d.%d\n", dlb2_driver_name,
+		DLB2_VERSION_MAJOR_NUMBER,
+		DLB2_VERSION_MINOR_NUMBER,
+		DLB2_VERSION_REVISION_NUMBER);
 	pr_info("%s\n", dlb2_driver_copyright);
 
 	dlb2_class = class_create(THIS_MODULE, dlb2_driver_name);
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 0bb646b82e5c..9211a6d999d1 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -13,6 +13,8 @@
 #include <linux/pci.h>
 #include <linux/types.h>
 
+#include <uapi/linux/dlb2_user.h>
+
 #include "dlb2_hw_types.h"
 
 static const char dlb2_driver_name[] = KBUILD_MODNAME;
@@ -53,6 +55,11 @@ struct dlb2_device_ops {
 	void (*enable_pm)(struct dlb2_dev *dev);
 	int (*wait_for_device_ready)(struct dlb2_dev *dev,
 				     struct pci_dev *pdev);
+	int (*create_sched_domain)(struct dlb2_hw *hw,
+				   struct dlb2_create_sched_domain_args *args,
+				   struct dlb2_cmd_response *resp);
+	int (*get_num_resources)(struct dlb2_hw *hw,
+				 struct dlb2_get_num_resources_args *args);
 	void (*init_hardware)(struct dlb2_dev *dev);
 };
 
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 01664211d60a..8142b302ba95 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -194,6 +194,25 @@ dlb2_pf_init_hardware(struct dlb2_dev *dlb2_dev)
 {
 }
 
+/*****************************/
+/****** IOCTL callbacks ******/
+/*****************************/
+
+static int
+dlb2_pf_create_sched_domain(struct dlb2_hw *hw,
+			    struct dlb2_create_sched_domain_args *args,
+			    struct dlb2_cmd_response *resp)
+{
+	return 0;
+}
+
+static int
+dlb2_pf_get_num_resources(struct dlb2_hw *hw,
+			  struct dlb2_get_num_resources_args *args)
+{
+	return dlb2_hw_get_num_resources(hw, args, false, 0);
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -209,5 +228,7 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.cdev_del = dlb2_pf_cdev_del,
 	.enable_pm = dlb2_pf_enable_pm,
 	.wait_for_device_ready = dlb2_pf_wait_for_device_ready,
+	.create_sched_domain = dlb2_pf_create_sched_domain,
+	.get_num_resources = dlb2_pf_get_num_resources,
 	.init_hardware = dlb2_pf_init_hardware,
 };
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 036ab10d7a98..44da2a3d3061 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -185,6 +185,54 @@ int dlb2_resource_init(struct dlb2_hw *hw)
 	return ret;
 }
 
+int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
+			      struct dlb2_get_num_resources_args *arg,
+			      bool vdev_req,
+			      unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_bitmap *map;
+	int i;
+
+	if (vdev_req && vdev_id >= DLB2_MAX_NUM_VDEVS)
+		return -EINVAL;
+
+	if (vdev_req)
+		rsrcs = &hw->vdev[vdev_id];
+	else
+		rsrcs = &hw->pf;
+
+	arg->num_sched_domains = rsrcs->num_avail_domains;
+
+	arg->num_ldb_queues = rsrcs->num_avail_ldb_queues;
+
+	arg->num_ldb_ports = 0;
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		arg->num_ldb_ports += rsrcs->num_avail_ldb_ports[i];
+
+	arg->num_cos0_ldb_ports = rsrcs->num_avail_ldb_ports[0];
+	arg->num_cos1_ldb_ports = rsrcs->num_avail_ldb_ports[1];
+	arg->num_cos2_ldb_ports = rsrcs->num_avail_ldb_ports[2];
+	arg->num_cos3_ldb_ports = rsrcs->num_avail_ldb_ports[3];
+
+	arg->num_dir_ports = rsrcs->num_avail_dir_pq_pairs;
+
+	arg->num_atomic_inflights = rsrcs->num_avail_aqed_entries;
+
+	map = rsrcs->avail_hist_list_entries;
+
+	arg->num_hist_list_entries = dlb2_bitmap_count(map);
+
+	arg->max_contiguous_hist_list_entries =
+		dlb2_bitmap_longest_set_range(map);
+
+	arg->num_ldb_credits = rsrcs->num_avail_qed_entries;
+
+	arg->num_dir_credits = rsrcs->num_avail_dqed_entries;
+
+	return 0;
+}
+
 void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw)
 {
 	union dlb2_cfg_mstr_cfg_pm_pmcsr_disable r0;
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 73528943d36e..33c0b0ea6025 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -35,6 +35,28 @@ int dlb2_resource_init(struct dlb2_hw *hw);
 void dlb2_resource_free(struct dlb2_hw *hw);
 
 /**
+ * dlb2_hw_get_num_resources() - query the PCI function's available resources
+ * @hw: dlb2_hw handle for a particular device.
+ * @arg: pointer to resource counts.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function returns the number of available resources for the PF or for a
+ * vdev.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, -EINVAL if vdev_request is true and vdev_id is
+ * invalid.
+ */
+int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
+			      struct dlb2_get_num_resources_args *arg,
+			      bool vdev_request,
+			      unsigned int vdev_id);
+
+/**
  * dlb2_clr_pmcsr_disable() - power on bulk of DLB 2.0 logic
  * @hw: dlb2_hw handle for a particular device.
  *
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
new file mode 100644
index 000000000000..4742efae0b38
--- /dev/null
+++ b/include/uapi/linux/dlb2_user.h
@@ -0,0 +1,236 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_USER_H
+#define __DLB2_USER_H
+
+#include <linux/types.h>
+
+struct dlb2_cmd_response {
+	__u32 status; /* Interpret using enum dlb2_error */
+	__u32 id;
+};
+
+/********************************/
+/* 'dlb2' device file commands */
+/********************************/
+
+#define DLB2_DEVICE_VERSION(x) (((x) >> 8) & 0xFF)
+#define DLB2_DEVICE_REVISION(x) ((x) & 0xFF)
+
+enum dlb2_revisions {
+	DLB2_REV_A0 = 0,
+};
+
+/*
+ * DLB2_CMD_GET_DEVICE_VERSION: Query the DLB device version.
+ *
+ *	This ioctl interface is the same in all driver versions and is always
+ *	the first ioctl.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id[7:0]: Device revision.
+ *	response.id[15:8]: Device version.
+ */
+
+struct dlb2_get_device_version_args {
+	/* Output parameters */
+	__u64 response;
+};
+
+#define DLB2_VERSION_MAJOR_NUMBER 1
+#define DLB2_VERSION_MINOR_NUMBER 0
+#define DLB2_VERSION_REVISION_NUMBER 0
+#define DLB2_VERSION (DLB2_VERSION_MAJOR_NUMBER << 24 | \
+		      DLB2_VERSION_MINOR_NUMBER << 16 | \
+		      DLB2_VERSION_REVISION_NUMBER)
+
+#define DLB2_VERSION_GET_MAJOR_NUMBER(x) (((x) >> 24) & 0xFF)
+#define DLB2_VERSION_GET_MINOR_NUMBER(x) (((x) >> 16) & 0xFF)
+#define DLB2_VERSION_GET_REVISION_NUMBER(x) ((x) & 0xFFFF)
+
+static inline __u8 dlb2_version_incompatible(__u32 version)
+{
+	__u8 inc;
+
+	inc = DLB2_VERSION_GET_MAJOR_NUMBER(version) !=
+		DLB2_VERSION_MAJOR_NUMBER;
+	inc |= (int)DLB2_VERSION_GET_MINOR_NUMBER(version) <
+		DLB2_VERSION_MINOR_NUMBER;
+
+	return inc;
+}
+
+/*
+ * DLB2_CMD_GET_DRIVER_VERSION: Query the DLB2 driver version. The major
+ *	number is changed when there is an ABI-breaking change, the minor
+ *	number is changed if the API is changed in a backwards-compatible way,
+ *	and the revision number is changed for fixes that don't affect the API.
+ *
+ *	If the kernel driver's API version major number and the header's
+ *	DLB2_VERSION_MAJOR_NUMBER differ, the two are incompatible, or if the
+ *	major numbers match but the kernel driver's minor number is less than
+ *	the header file's, they are incompatible. The dlb2_version_incompatible()
+ *	helper should be used to check for compatibility.
+ *
+ *	This ioctl interface is the same in all driver versions. Applications
+ *	should check the driver version before performing any other ioctl
+ *	operations.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Driver API version. Use the DLB2_VERSION_GET_MAJOR_NUMBER,
+ *		DLB2_VERSION_GET_MINOR_NUMBER, and
+ *		DLB2_VERSION_GET_REVISION_NUMBER macros to interpret the field.
+ */
+
+struct dlb2_get_driver_version_args {
+	/* Output parameters */
+	__u64 response;
+};
+
+/*
+ * DLB2_CMD_CREATE_SCHED_DOMAIN: Create a DLB 2.0 scheduling domain and reserve
+ *	its hardware resources. This command returns the newly created domain
+ *	ID and a file descriptor for accessing the domain. Additional file
+ *	descriptors can be opened using DLB2_CMD_GET_SCHED_DOMAIN_FD.
+ *
+ * Input parameters:
+ * - num_ldb_queues: Number of load-balanced queues.
+ * - num_ldb_ports: Number of load-balanced ports that can be allocated
+ *	from any class-of-service with available ports.
+ * - num_cos0_ldb_ports: Number of load-balanced ports from class-of-service 0.
+ * - num_cos1_ldb_ports: Number of load-balanced ports from class-of-service 1.
+ * - num_cos2_ldb_ports: Number of load-balanced ports from class-of-service 2.
+ * - num_cos3_ldb_ports: Number of load-balanced ports from class-of-service 3.
+ * - num_dir_ports: Number of directed ports. A directed port has one directed
+ *	queue, so no num_dir_queues argument is necessary.
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ *	storage for the domain. This storage is divided among the domain's
+ *	load-balanced queues that are configured for atomic scheduling.
+ * - num_hist_list_entries: Amount of history list storage. This is divided
+ *	among the domain's CQs.
+ * - num_ldb_credits: Amount of load-balanced QE storage (QED). QEs occupy this
+ *	space until they are scheduled to a load-balanced CQ. One credit
+ *	represents the storage for one QE.
+ * - num_dir_credits: Amount of directed QE storage (DQED). QEs occupy this
+ *	space until they are scheduled to a directed CQ. One credit represents
+ *	the storage for one QE.
+ * - cos_strict: If set, return an error if there are insufficient ports in
+ *	class-of-service N to satisfy the num_ldb_ports_cosN argument. If
+ *	unset, attempt to fulfill num_ldb_ports_cosN arguments from other
+ *	classes-of-service if class N does not contain enough free ports.
+ * - padding1: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: domain ID.
+ * - domain_fd: file descriptor for performing the domain's ioctl operations
+ * - padding0: Reserved for future use.
+ */
+struct dlb2_create_sched_domain_args {
+	/* Output parameters */
+	__u64 response;
+	__u32 domain_fd;
+	__u32 padding0;
+	/* Input parameters */
+	__u32 num_ldb_queues;
+	__u32 num_ldb_ports;
+	__u32 num_cos0_ldb_ports;
+	__u32 num_cos1_ldb_ports;
+	__u32 num_cos2_ldb_ports;
+	__u32 num_cos3_ldb_ports;
+	__u32 num_dir_ports;
+	__u32 num_atomic_inflights;
+	__u32 num_hist_list_entries;
+	__u32 num_ldb_credits;
+	__u32 num_dir_credits;
+	__u8 cos_strict;
+	__u8 padding1[3];
+};
+
+/*
+ * DLB2_CMD_GET_NUM_RESOURCES: Return the number of available resources
+ *	(queues, ports, etc.) that this device owns.
+ *
+ * Output parameters:
+ * - num_domains: Number of available scheduling domains.
+ * - num_ldb_queues: Number of available load-balanced queues.
+ * - num_ldb_ports: Total number of available load-balanced ports.
+ * - num_cos0_ldb_ports: Number of available load-balanced ports from
+ *	class-of-service 0.
+ * - num_cos1_ldb_ports: Number of available load-balanced ports from
+ *	class-of-service 1.
+ * - num_cos2_ldb_ports: Number of available load-balanced ports from
+ *	class-of-service 2.
+ * - num_cos3_ldb_ports: Number of available load-balanced ports from
+ *	class-of-service 3.
+ * - num_dir_ports: Number of available directed ports. There is one directed
+ *	queue for every directed port.
+ * - num_atomic_inflights: Amount of available temporary atomic QE storage.
+ * - num_hist_list_entries: Amount of history list storage.
+ * - max_contiguous_hist_list_entries: History list storage is allocated in
+ *	a contiguous chunk, and this return value is the longest available
+ *	contiguous range of history list entries.
+ * - num_ldb_credits: Amount of available load-balanced QE storage.
+ * - num_dir_credits: Amount of available directed QE storage.
+ */
+struct dlb2_get_num_resources_args {
+	/* Output parameters */
+	__u32 num_sched_domains;
+	__u32 num_ldb_queues;
+	__u32 num_ldb_ports;
+	__u32 num_cos0_ldb_ports;
+	__u32 num_cos1_ldb_ports;
+	__u32 num_cos2_ldb_ports;
+	__u32 num_cos3_ldb_ports;
+	__u32 num_dir_ports;
+	__u32 num_atomic_inflights;
+	__u32 num_hist_list_entries;
+	__u32 max_contiguous_hist_list_entries;
+	__u32 num_ldb_credits;
+	__u32 num_dir_credits;
+};
+
+enum dlb2_user_interface_commands {
+	DLB2_CMD_GET_DEVICE_VERSION,
+	DLB2_CMD_CREATE_SCHED_DOMAIN,
+	DLB2_CMD_GET_NUM_RESOURCES,
+	DLB2_CMD_GET_DRIVER_VERSION,
+
+	/* NUM_DLB2_CMD must be last */
+	NUM_DLB2_CMD,
+};
+
+/********************/
+/* dlb2 ioctl codes */
+/********************/
+
+#define DLB2_IOC_MAGIC  'h'
+
+#define DLB2_IOC_GET_DEVICE_VERSION				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_DEVICE_VERSION,		\
+		      struct dlb2_get_device_version_args)
+#define DLB2_IOC_CREATE_SCHED_DOMAIN				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_CREATE_SCHED_DOMAIN,		\
+		      struct dlb2_create_sched_domain_args)
+#define DLB2_IOC_GET_NUM_RESOURCES				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_NUM_RESOURCES,		\
+		      struct dlb2_get_num_resources_args)
+#define DLB2_IOC_GET_DRIVER_VERSION				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_DRIVER_VERSION,		\
+		      struct dlb2_get_driver_version_args)
+
+#endif /* __DLB2_USER_H */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/20] dlb2: add sched domain config and reset support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (3 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 06/20] dlb2: add ioctl to get sched domain fd Gage Eads
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

When a user requests to create a scheduling domain, the requested resources
are validated then reserved for the scheduling domain. This takes place in
dlb2_resource.c. Finally, the ioctl handler allocates an anonymous file
descriptor for the domain.

Once created, user-space can use its file descriptor for the scheduling
domain to configure the domain's resources (to be added in a subsequent
commit). User-space can also get additional file descriptors (for
multiprocess usage).
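
For illustration (not part of the patch), the expected user-space flow is
roughly the following; the device node name and resource counts are
assumptions, while the ioctl and field names come from dlb2_user.h:

	struct dlb2_create_sched_domain_args args = { 0 };
	struct dlb2_cmd_response resp = { 0 };
	int fd, domain_fd;

	fd = open("/dev/dlb0", O_RDWR);		/* node name is illustrative */

	args.response = (__u64)(uintptr_t)&resp;
	args.num_ldb_queues = 1;
	args.num_ldb_ports = 1;
	args.num_ldb_credits = 64;
	args.num_hist_list_entries = 1;

	if (ioctl(fd, DLB2_IOC_CREATE_SCHED_DOMAIN, &args) == 0)
		domain_fd = args.domain_fd;	/* resp.id is the domain ID */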

The driver maintains a reference count for each scheduling domain,
incrementing it each time user-space requests a file descriptor and
decrementing it in the file's release callback.

When the reference count transitions from 1->0, the driver automatically
resets the scheduling domain's resources and makes them available for use
by future applications. This ensures that applications that crash without
explicitly cleaning up do not orphan device resources.

Broadly speaking, the scheduling domain reset logic progressively disables
resources and flushes/drains them. The driver uses iosubmit_cmds512() to
flush load-balanced and directed ports. At the end of the domain reset
function, it resets hardware (register) state and software state.
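
As a rough illustration (not the exact driver code), the flush primitive
amounts to one 64B MOVDIR64B write of four 16B HCWs to the port's
memory-mapped window; pp_addr and the HCW field choice below are
assumptions:

	struct dlb2_hcw hcw_mem[8], *hcw;
	void __iomem *pp_addr;	/* the port's producer-port mapping */

	/* iosubmit_cmds512() issues 64B commands; align the source HCWs */
	hcw = (struct dlb2_hcw *)(((uintptr_t)&hcw_mem[4]) & ~0x3F);

	memset(hcw, 0, 4 * sizeof(*hcw));
	hcw->cq_token = 1;	/* illustrative: return a CQ token */

	iosubmit_cmds512(pp_addr, hcw, 1);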

This commit introduces device register definitions (dlb2_regs.h), used in
this and upcoming commits.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/dlb2_bitmap.h   |  102 ++
 drivers/misc/dlb2/dlb2_hw_types.h |   96 +
 drivers/misc/dlb2/dlb2_ioctl.c    |   42 +-
 drivers/misc/dlb2/dlb2_main.c     |   80 +
 drivers/misc/dlb2/dlb2_main.h     |   25 +
 drivers/misc/dlb2/dlb2_pf_ops.c   |    9 +-
 drivers/misc/dlb2/dlb2_regs.h     | 3619 +++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.c | 3140 +++++++++++++++++++++++++++++++-
 drivers/misc/dlb2/dlb2_resource.h |   74 +
 include/uapi/linux/dlb2_user.h    |  112 +-
 10 files changed, 7273 insertions(+), 26 deletions(-)

diff --git a/drivers/misc/dlb2/dlb2_bitmap.h b/drivers/misc/dlb2/dlb2_bitmap.h
index 2d2d2927b0ec..ecc31c8b8288 100644
--- a/drivers/misc/dlb2/dlb2_bitmap.h
+++ b/drivers/misc/dlb2/dlb2_bitmap.h
@@ -119,6 +119,108 @@ static inline int dlb2_bitmap_zero(struct dlb2_bitmap *bitmap)
 }
 
 /**
+ * dlb2_bitmap_set_range() - set a range of bitmap entries
+ * @bitmap: pointer to dlb2_bitmap structure.
+ * @bit: starting bit index.
+ * @len: length of the range.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or the range exceeds the bitmap
+ *	    length.
+ */
+static inline int dlb2_bitmap_set_range(struct dlb2_bitmap *bitmap,
+					unsigned int bit,
+					unsigned int len)
+{
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	if (bitmap->len <= bit)
+		return -EINVAL;
+
+	bitmap_set(bitmap->map, bit, len);
+
+	return 0;
+}
+
+/**
+ * dlb2_bitmap_clear_range() - clear a range of bitmap entries
+ * @bitmap: pointer to dlb2_bitmap structure.
+ * @bit: starting bit index.
+ * @len: length of the range.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or the range exceeds the bitmap
+ *	    length.
+ */
+static inline int dlb2_bitmap_clear_range(struct dlb2_bitmap *bitmap,
+					  unsigned int bit,
+					  unsigned int len)
+{
+	if (!bitmap || !bitmap->map)
+		return -EINVAL;
+
+	if (bitmap->len <= bit)
+		return -EINVAL;
+
+	bitmap_clear(bitmap->map, bit, len);
+
+	return 0;
+}
+
+/**
+ * dlb2_bitmap_find_set_bit_range() - find a range of set bits
+ * @bitmap: pointer to dlb2_bitmap structure.
+ * @len: length of the range.
+ *
+ * This function looks for a range of set bits of length @len.
+ *
+ * Return:
+ * Returns the base bit index upon success, < 0 otherwise.
+ *
+ * Errors:
+ * ENOENT - unable to find a length *len* range of set bits.
+ * EINVAL - bitmap is NULL or is uninitialized, or len is invalid.
+ */
+static inline int dlb2_bitmap_find_set_bit_range(struct dlb2_bitmap *bitmap,
+						 unsigned int len)
+{
+	struct dlb2_bitmap *complement_mask = NULL;
+	int ret;
+
+	if (!bitmap || !bitmap->map || len == 0)
+		return -EINVAL;
+
+	if (bitmap->len < len)
+		return -ENOENT;
+
+	ret = dlb2_bitmap_alloc(&complement_mask, bitmap->len);
+	if (ret)
+		return ret;
+
+	dlb2_bitmap_zero(complement_mask);
+
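+	/*
+	 * A run of set bits in @bitmap is a run of zero bits in its
+	 * complement, which bitmap_find_next_zero_area() can search directly.
+	 */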
+	bitmap_complement(complement_mask->map, bitmap->map, bitmap->len);
+
+	ret = bitmap_find_next_zero_area(complement_mask->map,
+					 complement_mask->len,
+					 0,
+					 len,
+					 0);
+
+	dlb2_bitmap_free(complement_mask);
+
+	/* No set bit range of length len? */
+	return (ret >= (int)bitmap->len) ? -ENOENT : ret;
+}
+
+/**
  * dlb2_bitmap_count() - returns the number of set bits
  * @bitmap: pointer to dlb2_bitmap structure.
  *
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index c5c4e45d4ee5..732482e364d6 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -48,6 +48,33 @@
 #define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
 #define DLB2_HZ 800000000
 
+/*
+ * Hardware-defined base addresses. Those prefixed 'DLB2_DRV' are only used by
+ * the PF driver.
+ */
+#define DLB2_DRV_LDB_PP_BASE 0x2300000
+#define DLB2_DRV_LDB_PP_STRIDE 0x1000
+#define DLB2_DRV_LDB_PP_BOUND \
+	(DLB2_DRV_LDB_PP_BASE + DLB2_DRV_LDB_PP_STRIDE * \
+	 DLB2_MAX_NUM_LDB_PORTS)
+#define DLB2_DRV_DIR_PP_BASE 0x2200000
+#define DLB2_DRV_DIR_PP_STRIDE 0x1000
+#define DLB2_DRV_DIR_PP_BOUND \
+	(DLB2_DRV_DIR_PP_BASE + DLB2_DRV_DIR_PP_STRIDE * \
+	 DLB2_MAX_NUM_DIR_PORTS)
+#define DLB2_LDB_PP_BASE 0x2100000
+#define DLB2_LDB_PP_STRIDE 0x1000
+#define DLB2_LDB_PP_BOUND \
+	(DLB2_LDB_PP_BASE + DLB2_LDB_PP_STRIDE * \
+	 DLB2_MAX_NUM_LDB_PORTS)
+#define DLB2_LDB_PP_OFFS(id) (DLB2_LDB_PP_BASE + (id) * DLB2_PP_SIZE)
+#define DLB2_DIR_PP_BASE 0x2000000
+#define DLB2_DIR_PP_STRIDE 0x1000
+#define DLB2_DIR_PP_BOUND \
+	(DLB2_DIR_PP_BASE + DLB2_DIR_PP_STRIDE * \
+	 DLB2_MAX_NUM_DIR_PORTS)
+#define DLB2_DIR_PP_OFFS(id) (DLB2_DIR_PP_BASE + (id) * DLB2_PP_SIZE)
+
 struct dlb2_resource_id {
 	u32 phys_id;
 	u32 virt_id;
@@ -66,6 +93,29 @@ static inline u32 dlb2_freelist_count(struct dlb2_freelist *list)
 	return list->bound - list->base - list->offset;
 }
 
+struct dlb2_hcw {
+	u64 data;
+	/* Word 3 */
+	u16 opaque;
+	u8 qid;
+	u8 sched_type:2;
+	u8 priority:3;
+	u8 msg_type:3;
+	/* Word 4 */
+	u16 lock_id;
+	u8 ts_flag:1;
+	u8 rsvd1:2;
+	u8 no_dec:1;
+	u8 cmp_id:4;
+	u8 cq_token:1;
+	u8 qe_comp:1;
+	u8 qe_frag:1;
+	u8 qe_valid:1;
+	u8 int_arm:1;
+	u8 error:1;
+	u8 rsvd:2;
+};
+
 struct dlb2_ldb_queue {
 	struct list_head domain_list;
 	struct list_head func_list;
@@ -148,6 +198,49 @@ struct dlb2_sn_group {
 	u32 id;
 };
 
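+/*
+ * A sequence number group provides 1024 sequence numbers, carved into slots
+ * by mode: mode 0 is 16 slots of 64 SNs each, up to mode 4's single slot of
+ * 1024 SNs. The masks below cover exactly the slots valid for each mode.
+ */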
+static inline bool dlb2_sn_group_full(struct dlb2_sn_group *group)
+{
+	u32 mask[] = {
+		0x0000ffff,  /* 64 SNs per queue */
+		0x000000ff,  /* 128 SNs per queue */
+		0x0000000f,  /* 256 SNs per queue */
+		0x00000003,  /* 512 SNs per queue */
+		0x00000001}; /* 1024 SNs per queue */
+
+	return group->slot_use_bitmap == mask[group->mode];
+}
+
+static inline int dlb2_sn_group_alloc_slot(struct dlb2_sn_group *group)
+{
+	u32 bound[] = {16, 8, 4, 2, 1};
+	u32 i;
+
+	for (i = 0; i < bound[group->mode]; i++) {
+		if (!(group->slot_use_bitmap & (1 << i))) {
+			group->slot_use_bitmap |= 1 << i;
+			return i;
+		}
+	}
+
+	return -1;
+}
+
+static inline void
+dlb2_sn_group_free_slot(struct dlb2_sn_group *group, int slot)
+{
+	group->slot_use_bitmap &= ~(1 << slot);
+}
+
+static inline int dlb2_sn_group_used_slots(struct dlb2_sn_group *group)
+{
+	int i, cnt = 0;
+
+	for (i = 0; i < 32; i++)
+		cnt += !!(group->slot_use_bitmap & (1 << i));
+
+	return cnt;
+}
+
 struct dlb2_hw_domain {
 	struct dlb2_function_resources *parent_func;
 	struct list_head func_list;
@@ -225,6 +318,9 @@ struct dlb2_hw {
 	struct dlb2_function_resources vdev[DLB2_MAX_NUM_VDEVS];
 	struct dlb2_hw_domain domains[DLB2_MAX_NUM_DOMAINS];
 	u8 cos_reservation[DLB2_NUM_COS_DOMAINS];
+
+	/* Virtualization */
+	int virt_mode;
 };
 
 #endif /* __DLB2_HW_TYPES_H */
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index 97a7220f0ea0..8ad3391b63d4 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2017-2020 Intel Corporation */
 
+#include <linux/anon_inodes.h>
 #include <linux/uaccess.h>
 
 #include <uapi/linux/dlb2_user.h>
@@ -78,7 +79,9 @@ static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
 {
 	struct dlb2_create_sched_domain_args arg;
 	struct dlb2_cmd_response response = {0};
-	int ret;
+	struct dlb2_domain *domain;
+	size_t offset;
+	int ret, fd;
 
 	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
 
@@ -93,8 +96,45 @@ static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (dev->domain_reset_failed) {
+		response.status = DLB2_ST_DOMAIN_RESET_FAILED;
+		ret = -EINVAL;
+		goto unlock;
+	}
+
 	ret = dev->ops->create_sched_domain(&dev->hw, &arg, &response);
+	if (ret)
+		goto unlock;
+
+	ret = dlb2_init_domain(dev, response.id);
+	if (ret) {
+		dev->ops->reset_domain(&dev->hw, response.id);
+		goto unlock;
+	}
+
+	domain = dev->sched_domains[response.id];
+
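+	/*
+	 * The anon fd's release callback (via dlb2_domain_fops) drops the
+	 * domain reference, so closing the last fd resets the domain.
+	 */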
+	fd = anon_inode_getfd("[dlb2domain]", &dlb2_domain_fops,
+			      domain, O_RDWR);
+
+	if (fd < 0) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Failed to get anon fd.\n", __func__);
+		kref_put(&domain->refcnt, dlb2_free_domain);
+		ret = fd;
+		goto unlock;
+	}
+
+	offset = offsetof(struct dlb2_create_sched_domain_args, domain_fd);
+
+	/*
+	 * There's no reason this should fail, since the copy was validated by
+	 * dlb2_copy_from_user() earlier in the function. Regardless, check for
+	 * an error (but skip the unwind code).
+	 */
+	if (copy_to_user((void __user *)user_arg + offset, &fd, sizeof(fd)))
+		return -EFAULT;
 
+unlock:
 	mutex_unlock(&dev->resource_mutex);
 
 	if (copy_to_user((void __user *)arg.response,
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index e3e32c6f3d63..eb716587e738 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -61,6 +61,14 @@ static int dlb2_reset_device(struct pci_dev *pdev)
 
 static int dlb2_open(struct inode *i, struct file *f)
 {
+	struct dlb2_dev *dev;
+
+	dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);
+
+	dev_dbg(dev->dlb2_device, "Opening DLB device file\n");
+
+	f->private_data = dev;
+
 	return 0;
 }
 
@@ -93,6 +101,78 @@ static const struct file_operations dlb2_fops = {
 	.compat_ioctl = compat_ptr_ioctl,
 };
 
+int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id)
+{
+	struct dlb2_domain *domain;
+
+	dev_dbg(dlb2_dev->dlb2_device, "Initializing domain %d\n", domain_id);
+
+	domain = devm_kzalloc(dlb2_dev->dlb2_device,
+			      sizeof(*domain),
+			      GFP_KERNEL);
+	if (!domain)
+		return -ENOMEM;
+
+	domain->id = domain_id;
+
+	kref_init(&domain->refcnt);
+	domain->dlb2_dev = dlb2_dev;
+
+	dlb2_dev->sched_domains[domain_id] = domain;
+
+	return 0;
+}
+
+static int __dlb2_free_domain(struct dlb2_dev *dev, struct dlb2_domain *domain)
+{
+	int ret = 0;
+
+	ret = dev->ops->reset_domain(&dev->hw, domain->id);
+	if (ret) {
+		dev->domain_reset_failed = true;
+		dev_err(dev->dlb2_device,
+			"Internal error: Domain reset failed. To recover, reset the device.\n");
+	}
+
+	dev->sched_domains[domain->id] = NULL;
+
+	devm_kfree(dev->dlb2_device, domain);
+
+	return ret;
+}
+
+void dlb2_free_domain(struct kref *kref)
+{
+	struct dlb2_domain *domain;
+
+	domain = container_of(kref, struct dlb2_domain, refcnt);
+
+	__dlb2_free_domain(domain->dlb2_dev, domain);
+}
+
+static int dlb2_domain_close(struct inode *i, struct file *f)
+{
+	struct dlb2_domain *domain = f->private_data;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+	int ret = 0;
+
+	mutex_lock(&dev->resource_mutex);
+
+	dev_dbg(dev->dlb2_device,
+		"Closing domain %d's device file\n", domain->id);
+
+	kref_put(&domain->refcnt, dlb2_free_domain);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	return ret;
+}
+
+const struct file_operations dlb2_domain_fops = {
+	.owner   = THIS_MODULE,
+	.release = dlb2_domain_close,
+};
+
 /**********************************/
 /****** PCI driver callbacks ******/
 /**********************************/
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 9211a6d999d1..81795754c070 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -60,10 +60,18 @@ struct dlb2_device_ops {
 				   struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
+	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
 	void (*init_hardware)(struct dlb2_dev *dev);
 };
 
 extern struct dlb2_device_ops dlb2_pf_ops;
+extern const struct file_operations dlb2_domain_fops;
+
+struct dlb2_domain {
+	struct dlb2_dev *dlb2_dev;
+	struct kref refcnt;
+	u8 id;
+};
 
 struct dlb2_dev {
 	struct pci_dev *pdev;
@@ -72,6 +80,7 @@ struct dlb2_dev {
 	struct dlb2_device_ops *ops;
 	struct list_head list;
 	struct device *dlb2_device;
+	struct dlb2_domain *sched_domains[DLB2_MAX_NUM_DOMAINS];
 	/*
 	 * The resource mutex serializes access to driver data structures and
 	 * hardware registers.
@@ -80,6 +89,22 @@ struct dlb2_dev {
 	enum dlb2_device_type type;
 	int id;
 	dev_t dev_number;
+	u8 domain_reset_failed;
 };
 
+int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id);
+void dlb2_free_domain(struct kref *kref);
+
+#define DLB2_HW_ERR(dlb2, ...) do {		       \
+	struct dlb2_dev *dev;			       \
+	dev = container_of(dlb2, struct dlb2_dev, hw); \
+	dev_err(dev->dlb2_device, __VA_ARGS__);        \
+} while (0)
+
+#define DLB2_HW_DBG(dlb2, ...) do {		       \
+	struct dlb2_dev *dev;			       \
+	dev = container_of(dlb2, struct dlb2_dev, hw); \
+	dev_dbg(dev->dlb2_device, __VA_ARGS__);	       \
+} while (0)
+
 #endif /* __DLB2_MAIN_H */
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 8142b302ba95..ae83a2feb3f9 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -203,7 +203,7 @@ dlb2_pf_create_sched_domain(struct dlb2_hw *hw,
 			    struct dlb2_create_sched_domain_args *args,
 			    struct dlb2_cmd_response *resp)
 {
-	return 0;
+	return dlb2_hw_create_sched_domain(hw, args, resp, false, 0);
 }
 
 static int
@@ -213,6 +213,12 @@ dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 	return dlb2_hw_get_num_resources(hw, args, false, 0);
 }
 
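+/*
+ * The trailing (false, 0) arguments mark this as a PF-initiated request
+ * rather than one proxied from a virtual device.
+ */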
+static int
+dlb2_pf_reset_domain(struct dlb2_hw *hw, u32 id)
+{
+	return dlb2_reset_domain(hw, id, false, 0);
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -230,5 +236,6 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.wait_for_device_ready = dlb2_pf_wait_for_device_ready,
 	.create_sched_domain = dlb2_pf_create_sched_domain,
 	.get_num_resources = dlb2_pf_get_num_resources,
+	.reset_domain = dlb2_pf_reset_domain,
 	.init_hardware = dlb2_pf_init_hardware,
 };
diff --git a/drivers/misc/dlb2/dlb2_regs.h b/drivers/misc/dlb2/dlb2_regs.h
index 77612c40ce0b..e5fd5acf6063 100644
--- a/drivers/misc/dlb2/dlb2_regs.h
+++ b/drivers/misc/dlb2/dlb2_regs.h
@@ -7,6 +7,3554 @@
 
 #include <linux/types.h>
 
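+/*
+ * Register map conventions: each CSR below is described by an address
+ * macro (parameterized by instance where the register is replicated),
+ * a *_RST macro giving its value at device reset, and a union that
+ * overlays named bit fields on the raw 32-bit value. A minimal usage
+ * sketch -- the csr_kva base pointer is illustrative, not defined by
+ * this patch:
+ *
+ *	union dlb2_sys_total_vas r;
+ *
+ *	r.val = ioread32(hw->csr_kva + DLB2_SYS_TOTAL_VAS);
+ *	num_vas = r.field.total_vas;
+ */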
+#define DLB2_FUNC_PF_VF2PF_MAILBOX_BYTES 256
+#define DLB2_FUNC_PF_VF2PF_MAILBOX(vf_id, x) \
+	(0x1000 + 0x4 * (x) + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_VF2PF_MAILBOX_RST 0x0
+union dlb2_func_pf_vf2pf_mailbox {
+	struct {
+		u32 msg : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_VF2PF_MAILBOX_ISR(vf_id) \
+	(0x1f00 + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_VF2PF_MAILBOX_ISR_RST 0x0
+union dlb2_func_pf_vf2pf_mailbox_isr {
+	struct {
+		u32 vf0_isr : 1;
+		u32 vf1_isr : 1;
+		u32 vf2_isr : 1;
+		u32 vf3_isr : 1;
+		u32 vf4_isr : 1;
+		u32 vf5_isr : 1;
+		u32 vf6_isr : 1;
+		u32 vf7_isr : 1;
+		u32 vf8_isr : 1;
+		u32 vf9_isr : 1;
+		u32 vf10_isr : 1;
+		u32 vf11_isr : 1;
+		u32 vf12_isr : 1;
+		u32 vf13_isr : 1;
+		u32 vf14_isr : 1;
+		u32 vf15_isr : 1;
+		u32 rsvd0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_VF2PF_FLR_ISR(vf_id) \
+	(0x1f04 + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_VF2PF_FLR_ISR_RST 0x0
+union dlb2_func_pf_vf2pf_flr_isr {
+	struct {
+		u32 vf0_isr : 1;
+		u32 vf1_isr : 1;
+		u32 vf2_isr : 1;
+		u32 vf3_isr : 1;
+		u32 vf4_isr : 1;
+		u32 vf5_isr : 1;
+		u32 vf6_isr : 1;
+		u32 vf7_isr : 1;
+		u32 vf8_isr : 1;
+		u32 vf9_isr : 1;
+		u32 vf10_isr : 1;
+		u32 vf11_isr : 1;
+		u32 vf12_isr : 1;
+		u32 vf13_isr : 1;
+		u32 vf14_isr : 1;
+		u32 vf15_isr : 1;
+		u32 rsvd0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_VF2PF_ISR_PEND(vf_id) \
+	(0x1f10 + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_VF2PF_ISR_PEND_RST 0x0
+union dlb2_func_pf_vf2pf_isr_pend {
+	struct {
+		u32 isr_pend : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_PF2VF_MAILBOX_BYTES 64
+#define DLB2_FUNC_PF_PF2VF_MAILBOX(vf_id, x) \
+	(0x2000 + 0x4 * (x) + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_PF2VF_MAILBOX_RST 0x0
+union dlb2_func_pf_pf2vf_mailbox {
+	struct {
+		u32 msg : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_PF2VF_MAILBOX_ISR(vf_id) \
+	(0x2f00 + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_PF2VF_MAILBOX_ISR_RST 0x0
+union dlb2_func_pf_pf2vf_mailbox_isr {
+	struct {
+		u32 vf0_isr : 1;
+		u32 vf1_isr : 1;
+		u32 vf2_isr : 1;
+		u32 vf3_isr : 1;
+		u32 vf4_isr : 1;
+		u32 vf5_isr : 1;
+		u32 vf6_isr : 1;
+		u32 vf7_isr : 1;
+		u32 vf8_isr : 1;
+		u32 vf9_isr : 1;
+		u32 vf10_isr : 1;
+		u32 vf11_isr : 1;
+		u32 vf12_isr : 1;
+		u32 vf13_isr : 1;
+		u32 vf14_isr : 1;
+		u32 vf15_isr : 1;
+		u32 rsvd0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_PF_VF_RESET_IN_PROGRESS(vf_id) \
+	(0x3000 + (vf_id) * 0x10000)
+#define DLB2_FUNC_PF_VF_RESET_IN_PROGRESS_RST 0xffff
+union dlb2_func_pf_vf_reset_in_progress {
+	struct {
+		u32 vf0_reset_in_progress : 1;
+		u32 vf1_reset_in_progress : 1;
+		u32 vf2_reset_in_progress : 1;
+		u32 vf3_reset_in_progress : 1;
+		u32 vf4_reset_in_progress : 1;
+		u32 vf5_reset_in_progress : 1;
+		u32 vf6_reset_in_progress : 1;
+		u32 vf7_reset_in_progress : 1;
+		u32 vf8_reset_in_progress : 1;
+		u32 vf9_reset_in_progress : 1;
+		u32 vf10_reset_in_progress : 1;
+		u32 vf11_reset_in_progress : 1;
+		u32 vf12_reset_in_progress : 1;
+		u32 vf13_reset_in_progress : 1;
+		u32 vf14_reset_in_progress : 1;
+		u32 vf15_reset_in_progress : 1;
+		u32 rsvd0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_MSIX_MEM_VECTOR_CTRL(x) \
+	(0x100000c + (x) * 0x10)
+#define DLB2_MSIX_MEM_VECTOR_CTRL_RST 0x1
+union dlb2_msix_mem_vector_ctrl {
+	struct {
+		u32 vec_mask : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
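+/*
+ * SMON (statistics monitor) register groups recur once per hardware
+ * unit below: two activity counters, compare and mask pairs, a timer
+ * and its maximum, and a two-part configuration.
+ */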
+#define DLB2_IOSF_SMON_COMP_MASK1(x) \
+	(0x8002024 + (x) * 0x40)
+#define DLB2_IOSF_SMON_COMP_MASK1_RST 0xffffffff
+union dlb2_iosf_smon_comp_mask1 {
+	struct {
+		u32 comp_mask1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_COMP_MASK0(x) \
+	(0x8002020 + (x) * 0x40)
+#define DLB2_IOSF_SMON_COMP_MASK0_RST 0xffffffff
+union dlb2_iosf_smon_comp_mask0 {
+	struct {
+		u32 comp_mask0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_MAX_TMR(x) \
+	(0x800201c + (x) * 0x40)
+#define DLB2_IOSF_SMON_MAX_TMR_RST 0x0
+union dlb2_iosf_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_TMR(x) \
+	(0x8002018 + (x) * 0x40)
+#define DLB2_IOSF_SMON_TMR_RST 0x0
+union dlb2_iosf_smon_tmr {
+	struct {
+		u32 timer_val : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_ACTIVITYCNTR1(x) \
+	(0x8002014 + (x) * 0x40)
+#define DLB2_IOSF_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_iosf_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_ACTIVITYCNTR0(x) \
+	(0x8002010 + (x) * 0x40)
+#define DLB2_IOSF_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_iosf_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_COMPARE1(x) \
+	(0x800200c + (x) * 0x40)
+#define DLB2_IOSF_SMON_COMPARE1_RST 0x0
+union dlb2_iosf_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_COMPARE0(x) \
+	(0x8002008 + (x) * 0x40)
+#define DLB2_IOSF_SMON_COMPARE0_RST 0x0
+union dlb2_iosf_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_CFG1(x) \
+	(0x8002004 + (x) * 0x40)
+#define DLB2_IOSF_SMON_CFG1_RST 0x0
+union dlb2_iosf_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvd : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_SMON_CFG0(x) \
+	(0x8002000 + (x) * 0x40)
+#define DLB2_IOSF_SMON_CFG0_RST 0x40000000
+union dlb2_iosf_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 rsvd2 : 3;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvd1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvd0 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_IOSF_FUNC_VF_BAR_DSBL(x) \
+	(0x20 + (x) * 0x4)
+#define DLB2_IOSF_FUNC_VF_BAR_DSBL_RST 0x0
+union dlb2_iosf_func_vf_bar_dsbl {
+	struct {
+		u32 func_vf_bar_dis : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_VAS 0x1000011c
+#define DLB2_SYS_TOTAL_VAS_RST 0x20
+union dlb2_sys_total_vas {
+	struct {
+		u32 total_vas : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_DIR_PORTS 0x10000118
+#define DLB2_SYS_TOTAL_DIR_PORTS_RST 0x40
+union dlb2_sys_total_dir_ports {
+	struct {
+		u32 total_dir_ports : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_LDB_PORTS 0x10000114
+#define DLB2_SYS_TOTAL_LDB_PORTS_RST 0x40
+union dlb2_sys_total_ldb_ports {
+	struct {
+		u32 total_ldb_ports : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_DIR_QID 0x10000110
+#define DLB2_SYS_TOTAL_DIR_QID_RST 0x40
+union dlb2_sys_total_dir_qid {
+	struct {
+		u32 total_dir_qid : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_LDB_QID 0x1000010c
+#define DLB2_SYS_TOTAL_LDB_QID_RST 0x20
+union dlb2_sys_total_ldb_qid {
+	struct {
+		u32 total_ldb_qid : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_DIR_CRDS 0x10000108
+#define DLB2_SYS_TOTAL_DIR_CRDS_RST 0x1000
+union dlb2_sys_total_dir_crds {
+	struct {
+		u32 total_dir_credits : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_TOTAL_LDB_CRDS 0x10000104
+#define DLB2_SYS_TOTAL_LDB_CRDS_RST 0x2000
+union dlb2_sys_total_ldb_crds {
+	struct {
+		u32 total_ldb_credits : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_PF_SYND2 0x10000508
+#define DLB2_SYS_ALARM_PF_SYND2_RST 0x0
+union dlb2_sys_alarm_pf_synd2 {
+	struct {
+		u32 lock_id : 16;
+		u32 meas : 1;
+		u32 debug : 7;
+		u32 cq_pop : 1;
+		u32 qe_uhl : 1;
+		u32 qe_orsp : 1;
+		u32 qe_valid : 1;
+		u32 cq_int_rearm : 1;
+		u32 dsi_error : 1;
+		u32 rsvd0 : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_PF_SYND1 0x10000504
+#define DLB2_SYS_ALARM_PF_SYND1_RST 0x0
+union dlb2_sys_alarm_pf_synd1 {
+	struct {
+		u32 dsi : 16;
+		u32 qid : 8;
+		u32 qtype : 2;
+		u32 qpri : 3;
+		u32 msg_type : 3;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_PF_SYND0 0x10000500
+#define DLB2_SYS_ALARM_PF_SYND0_RST 0x0
+union dlb2_sys_alarm_pf_synd0 {
+	struct {
+		u32 syndrome : 8;
+		u32 rtype : 2;
+		u32 rsvd0 : 3;
+		u32 is_ldb : 1;
+		u32 cls : 2;
+		u32 aid : 6;
+		u32 unit : 4;
+		u32 source : 4;
+		u32 more : 1;
+		u32 valid : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_LDB_VPP_V(x) \
+	(0x10000f00 + (x) * 0x1000)
+#define DLB2_SYS_VF_LDB_VPP_V_RST 0x0
+union dlb2_sys_vf_ldb_vpp_v {
+	struct {
+		u32 vpp_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_LDB_VPP2PP(x) \
+	(0x10000f04 + (x) * 0x1000)
+#define DLB2_SYS_VF_LDB_VPP2PP_RST 0x0
+union dlb2_sys_vf_ldb_vpp2pp {
+	struct {
+		u32 pp : 6;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_DIR_VPP_V(x) \
+	(0x10000f08 + (x) * 0x1000)
+#define DLB2_SYS_VF_DIR_VPP_V_RST 0x0
+union dlb2_sys_vf_dir_vpp_v {
+	struct {
+		u32 vpp_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_DIR_VPP2PP(x) \
+	(0x10000f0c + (x) * 0x1000)
+#define DLB2_SYS_VF_DIR_VPP2PP_RST 0x0
+union dlb2_sys_vf_dir_vpp2pp {
+	struct {
+		u32 pp : 6;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_LDB_VQID_V(x) \
+	(0x10000f10 + (x) * 0x1000)
+#define DLB2_SYS_VF_LDB_VQID_V_RST 0x0
+union dlb2_sys_vf_ldb_vqid_v {
+	struct {
+		u32 vqid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_LDB_VQID2QID(x) \
+	(0x10000f14 + (x) * 0x1000)
+#define DLB2_SYS_VF_LDB_VQID2QID_RST 0x0
+union dlb2_sys_vf_ldb_vqid2qid {
+	struct {
+		u32 qid : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_QID2VQID(x) \
+	(0x10000f18 + (x) * 0x1000)
+#define DLB2_SYS_LDB_QID2VQID_RST 0x0
+union dlb2_sys_ldb_qid2vqid {
+	struct {
+		u32 vqid : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_DIR_VQID_V(x) \
+	(0x10000f1c + (x) * 0x1000)
+#define DLB2_SYS_VF_DIR_VQID_V_RST 0x0
+union dlb2_sys_vf_dir_vqid_v {
+	struct {
+		u32 vqid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_VF_DIR_VQID2QID(x) \
+	(0x10000f20 + (x) * 0x1000)
+#define DLB2_SYS_VF_DIR_VQID2QID_RST 0x0
+union dlb2_sys_vf_dir_vqid2qid {
+	struct {
+		u32 qid : 6;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_VASQID_V(x) \
+	(0x10000f24 + (x) * 0x1000)
+#define DLB2_SYS_LDB_VASQID_V_RST 0x0
+union dlb2_sys_ldb_vasqid_v {
+	struct {
+		u32 vasqid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_VASQID_V(x) \
+	(0x10000f28 + (x) * 0x1000)
+#define DLB2_SYS_DIR_VASQID_V_RST 0x0
+union dlb2_sys_dir_vasqid_v {
+	struct {
+		u32 vasqid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_VF_SYND2(x) \
+	(0x10000f48 + (x) * 0x1000)
+#define DLB2_SYS_ALARM_VF_SYND2_RST 0x0
+union dlb2_sys_alarm_vf_synd2 {
+	struct {
+		u32 lock_id : 16;
+		u32 debug : 8;
+		u32 cq_pop : 1;
+		u32 qe_uhl : 1;
+		u32 qe_orsp : 1;
+		u32 qe_valid : 1;
+		u32 isz : 1;
+		u32 dsi_error : 1;
+		u32 dlbrsvd : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_VF_SYND1(x) \
+	(0x10000f44 + (x) * 0x1000)
+#define DLB2_SYS_ALARM_VF_SYND1_RST 0x0
+union dlb2_sys_alarm_vf_synd1 {
+	struct {
+		u32 dsi : 16;
+		u32 qid : 8;
+		u32 qtype : 2;
+		u32 qpri : 3;
+		u32 msg_type : 3;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_VF_SYND0(x) \
+	(0x10000f40 + (x) * 0x1000)
+#define DLB2_SYS_ALARM_VF_SYND0_RST 0x0
+union dlb2_sys_alarm_vf_synd0 {
+	struct {
+		u32 syndrome : 8;
+		u32 rtype : 2;
+		u32 vf_synd0_parity : 1;
+		u32 vf_synd1_parity : 1;
+		u32 vf_synd2_parity : 1;
+		u32 is_ldb : 1;
+		u32 cls : 2;
+		u32 aid : 6;
+		u32 unit : 4;
+		u32 source : 4;
+		u32 more : 1;
+		u32 valid : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_QID_CFG_V(x) \
+	(0x10000f58 + (x) * 0x1000)
+#define DLB2_SYS_LDB_QID_CFG_V_RST 0x0
+union dlb2_sys_ldb_qid_cfg_v {
+	struct {
+		u32 sn_cfg_v : 1;
+		u32 fid_cfg_v : 1;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_QID_ITS(x) \
+	(0x10000f54 + (x) * 0x1000)
+#define DLB2_SYS_LDB_QID_ITS_RST 0x0
+union dlb2_sys_ldb_qid_its {
+	struct {
+		u32 qid_its : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_QID_V(x) \
+	(0x10000f50 + (x) * 0x1000)
+#define DLB2_SYS_LDB_QID_V_RST 0x0
+union dlb2_sys_ldb_qid_v {
+	struct {
+		u32 qid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_QID_ITS(x) \
+	(0x10000f64 + (x) * 0x1000)
+#define DLB2_SYS_DIR_QID_ITS_RST 0x0
+union dlb2_sys_dir_qid_its {
+	struct {
+		u32 qid_its : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_QID_V(x) \
+	(0x10000f60 + (x) * 0x1000)
+#define DLB2_SYS_DIR_QID_V_RST 0x0
+union dlb2_sys_dir_qid_v {
+	struct {
+		u32 qid_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_AI_DATA(x) \
+	(0x10000fa8 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_AI_DATA_RST 0x0
+union dlb2_sys_ldb_cq_ai_data {
+	struct {
+		u32 cq_ai_data : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_AI_ADDR(x) \
+	(0x10000fa4 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_AI_ADDR_RST 0x0
+union dlb2_sys_ldb_cq_ai_addr {
+	struct {
+		u32 rsvd1 : 2;
+		u32 cq_ai_addr : 18;
+		u32 rsvd0 : 12;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_PASID(x) \
+	(0x10000fa0 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_PASID_RST 0x0
+union dlb2_sys_ldb_cq_pasid {
+	struct {
+		u32 pasid : 20;
+		u32 exe_req : 1;
+		u32 priv_req : 1;
+		u32 fmt2 : 1;
+		u32 rsvd0 : 9;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_AT(x) \
+	(0x10000f9c + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_AT_RST 0x0
+union dlb2_sys_ldb_cq_at {
+	struct {
+		u32 cq_at : 2;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_ISR(x) \
+	(0x10000f98 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_ISR_RST 0x0
+/* CQ Interrupt Modes */
+#define DLB2_CQ_ISR_MODE_DIS  0
+#define DLB2_CQ_ISR_MODE_MSI  1
+#define DLB2_CQ_ISR_MODE_MSIX 2
+#define DLB2_CQ_ISR_MODE_ADI  3
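+/* en_code below holds one of the DLB2_CQ_ISR_MODE_* encodings. */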
+union dlb2_sys_ldb_cq_isr {
+	struct {
+		u32 vector : 6;
+		u32 vf : 4;
+		u32 en_code : 2;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ2VF_PF_RO(x) \
+	(0x10000f94 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ2VF_PF_RO_RST 0x0
+union dlb2_sys_ldb_cq2vf_pf_ro {
+	struct {
+		u32 vf : 4;
+		u32 is_pf : 1;
+		u32 ro : 1;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_PP_V(x) \
+	(0x10000f90 + (x) * 0x1000)
+#define DLB2_SYS_LDB_PP_V_RST 0x0
+union dlb2_sys_ldb_pp_v {
+	struct {
+		u32 pp_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_PP2VDEV(x) \
+	(0x10000f8c + (x) * 0x1000)
+#define DLB2_SYS_LDB_PP2VDEV_RST 0x0
+union dlb2_sys_ldb_pp2vdev {
+	struct {
+		u32 vdev : 4;
+		u32 rsvd0 : 28;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_PP2VAS(x) \
+	(0x10000f88 + (x) * 0x1000)
+#define DLB2_SYS_LDB_PP2VAS_RST 0x0
+union dlb2_sys_ldb_pp2vas {
+	struct {
+		u32 vas : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_ADDR_U(x) \
+	(0x10000f84 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_ADDR_U_RST 0x0
+union dlb2_sys_ldb_cq_addr_u {
+	struct {
+		u32 addr_u : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_ADDR_L(x) \
+	(0x10000f80 + (x) * 0x1000)
+#define DLB2_SYS_LDB_CQ_ADDR_L_RST 0x0
+union dlb2_sys_ldb_cq_addr_l {
+	struct {
+		u32 rsvd0 : 6;
+		u32 addr_l : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_FMT(x) \
+	(0x10000fec + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_FMT_RST 0x0
+union dlb2_sys_dir_cq_fmt {
+	struct {
+		u32 keep_pf_ppid : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_AI_DATA(x) \
+	(0x10000fe8 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_AI_DATA_RST 0x0
+union dlb2_sys_dir_cq_ai_data {
+	struct {
+		u32 cq_ai_data : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_AI_ADDR(x) \
+	(0x10000fe4 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_AI_ADDR_RST 0x0
+union dlb2_sys_dir_cq_ai_addr {
+	struct {
+		u32 rsvd1 : 2;
+		u32 cq_ai_addr : 18;
+		u32 rsvd0 : 12;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_PASID(x) \
+	(0x10000fe0 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_PASID_RST 0x0
+union dlb2_sys_dir_cq_pasid {
+	struct {
+		u32 pasid : 20;
+		u32 exe_req : 1;
+		u32 priv_req : 1;
+		u32 fmt2 : 1;
+		u32 rsvd0 : 9;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_AT(x) \
+	(0x10000fdc + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_AT_RST 0x0
+union dlb2_sys_dir_cq_at {
+	struct {
+		u32 cq_at : 2;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_ISR(x) \
+	(0x10000fd8 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_ISR_RST 0x0
+union dlb2_sys_dir_cq_isr {
+	struct {
+		u32 vector : 6;
+		u32 vf : 4;
+		u32 en_code : 2;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ2VF_PF_RO(x) \
+	(0x10000fd4 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ2VF_PF_RO_RST 0x0
+union dlb2_sys_dir_cq2vf_pf_ro {
+	struct {
+		u32 vf : 4;
+		u32 is_pf : 1;
+		u32 ro : 1;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_PP_V(x) \
+	(0x10000fd0 + (x) * 0x1000)
+#define DLB2_SYS_DIR_PP_V_RST 0x0
+union dlb2_sys_dir_pp_v {
+	struct {
+		u32 pp_v : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_PP2VDEV(x) \
+	(0x10000fcc + (x) * 0x1000)
+#define DLB2_SYS_DIR_PP2VDEV_RST 0x0
+union dlb2_sys_dir_pp2vdev {
+	struct {
+		u32 vdev : 4;
+		u32 rsvd0 : 28;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_PP2VAS(x) \
+	(0x10000fc8 + (x) * 0x1000)
+#define DLB2_SYS_DIR_PP2VAS_RST 0x0
+union dlb2_sys_dir_pp2vas {
+	struct {
+		u32 vas : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_ADDR_U(x) \
+	(0x10000fc4 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_ADDR_U_RST 0x0
+union dlb2_sys_dir_cq_addr_u {
+	struct {
+		u32 addr_u : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_ADDR_L(x) \
+	(0x10000fc0 + (x) * 0x1000)
+#define DLB2_SYS_DIR_CQ_ADDR_L_RST 0x0
+union dlb2_sys_dir_cq_addr_l {
+	struct {
+		u32 rsvd0 : 6;
+		u32 addr_l : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_COMP_MASK1 0x10003024
+#define DLB2_SYS_PM_SMON_COMP_MASK1_RST 0xffffffff
+union dlb2_sys_pm_smon_comp_mask1 {
+	struct {
+		u32 comp_mask1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_COMP_MASK0 0x10003020
+#define DLB2_SYS_PM_SMON_COMP_MASK0_RST 0xffffffff
+union dlb2_sys_pm_smon_comp_mask0 {
+	struct {
+		u32 comp_mask0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_MAX_TMR 0x1000301c
+#define DLB2_SYS_PM_SMON_MAX_TMR_RST 0x0
+union dlb2_sys_pm_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_TMR 0x10003018
+#define DLB2_SYS_PM_SMON_TMR_RST 0x0
+union dlb2_sys_pm_smon_tmr {
+	struct {
+		u32 timer_val : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_ACTIVITYCNTR1 0x10003014
+#define DLB2_SYS_PM_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_sys_pm_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_ACTIVITYCNTR0 0x10003010
+#define DLB2_SYS_PM_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_sys_pm_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_COMPARE1 0x1000300c
+#define DLB2_SYS_PM_SMON_COMPARE1_RST 0x0
+union dlb2_sys_pm_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_COMPARE0 0x10003008
+#define DLB2_SYS_PM_SMON_COMPARE0_RST 0x0
+union dlb2_sys_pm_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_CFG1 0x10003004
+#define DLB2_SYS_PM_SMON_CFG1_RST 0x0
+union dlb2_sys_pm_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvd : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_PM_SMON_CFG0 0x10003000
+#define DLB2_SYS_PM_SMON_CFG0_RST 0x40000000
+union dlb2_sys_pm_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 rsvd2 : 3;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvd1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvd0 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_COMP_MASK1(x) \
+	(0x18002024 + (x) * 0x40)
+#define DLB2_SYS_SMON_COMP_MASK1_RST 0xffffffff
+union dlb2_sys_smon_comp_mask1 {
+	struct {
+		u32 comp_mask1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_COMP_MASK0(x) \
+	(0x18002020 + (x) * 0x40)
+#define DLB2_SYS_SMON_COMP_MASK0_RST 0xffffffff
+union dlb2_sys_smon_comp_mask0 {
+	struct {
+		u32 comp_mask0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_MAX_TMR(x) \
+	(0x1800201c + (x) * 0x40)
+#define DLB2_SYS_SMON_MAX_TMR_RST 0x0
+union dlb2_sys_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_TMR(x) \
+	(0x18002018 + (x) * 0x40)
+#define DLB2_SYS_SMON_TMR_RST 0x0
+union dlb2_sys_smon_tmr {
+	struct {
+		u32 timer_val : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_ACTIVITYCNTR1(x) \
+	(0x18002014 + (x) * 0x40)
+#define DLB2_SYS_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_sys_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_ACTIVITYCNTR0(x) \
+	(0x18002010 + (x) * 0x40)
+#define DLB2_SYS_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_sys_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_COMPARE1(x) \
+	(0x1800200c + (x) * 0x40)
+#define DLB2_SYS_SMON_COMPARE1_RST 0x0
+union dlb2_sys_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_COMPARE0(x) \
+	(0x18002008 + (x) * 0x40)
+#define DLB2_SYS_SMON_COMPARE0_RST 0x0
+union dlb2_sys_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_CFG1(x) \
+	(0x18002004 + (x) * 0x40)
+#define DLB2_SYS_SMON_CFG1_RST 0x0
+union dlb2_sys_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvd : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_SMON_CFG0(x) \
+	(0x18002000 + (x) * 0x40)
+#define DLB2_SYS_SMON_CFG0_RST 0x40000000
+union dlb2_sys_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 rsvd2 : 3;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvd1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvd0 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_INGRESS_ALARM_ENBL 0x10000300
+#define DLB2_SYS_INGRESS_ALARM_ENBL_RST 0x0
+union dlb2_sys_ingress_alarm_enbl {
+	struct {
+		u32 illegal_hcw : 1;
+		u32 illegal_pp : 1;
+		u32 illegal_pasid : 1;
+		u32 illegal_qid : 1;
+		u32 disabled_qid : 1;
+		u32 illegal_ldb_qid_cfg : 1;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_MSIX_ACK 0x10000400
+#define DLB2_SYS_MSIX_ACK_RST 0x0
+union dlb2_sys_msix_ack {
+	struct {
+		u32 msix_0_ack : 1;
+		u32 msix_1_ack : 1;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_MSIX_PASSTHRU 0x10000404
+#define DLB2_SYS_MSIX_PASSTHRU_RST 0x0
+union dlb2_sys_msix_passthru {
+	struct {
+		u32 msix_0_passthru : 1;
+		u32 msix_1_passthru : 1;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_MSIX_MODE 0x10000408
+#define DLB2_SYS_MSIX_MODE_RST 0x0
+/* MSI-X Modes */
+#define DLB2_MSIX_MODE_PACKED     0
+#define DLB2_MSIX_MODE_COMPRESSED 1
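+/* mode below holds one of the DLB2_MSIX_MODE_* encodings. */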
+union dlb2_sys_msix_mode {
+	struct {
+		u32 mode : 1;
+		u32 poll_mode : 1;
+		u32 poll_mask : 1;
+		u32 poll_lock : 1;
+		u32 rsvd0 : 28;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_31_0_OCC_INT_STS 0x10000440
+#define DLB2_SYS_DIR_CQ_31_0_OCC_INT_STS_RST 0x0
+union dlb2_sys_dir_cq_31_0_occ_int_sts {
+	struct {
+		u32 cq_0_occ_int : 1;
+		u32 cq_1_occ_int : 1;
+		u32 cq_2_occ_int : 1;
+		u32 cq_3_occ_int : 1;
+		u32 cq_4_occ_int : 1;
+		u32 cq_5_occ_int : 1;
+		u32 cq_6_occ_int : 1;
+		u32 cq_7_occ_int : 1;
+		u32 cq_8_occ_int : 1;
+		u32 cq_9_occ_int : 1;
+		u32 cq_10_occ_int : 1;
+		u32 cq_11_occ_int : 1;
+		u32 cq_12_occ_int : 1;
+		u32 cq_13_occ_int : 1;
+		u32 cq_14_occ_int : 1;
+		u32 cq_15_occ_int : 1;
+		u32 cq_16_occ_int : 1;
+		u32 cq_17_occ_int : 1;
+		u32 cq_18_occ_int : 1;
+		u32 cq_19_occ_int : 1;
+		u32 cq_20_occ_int : 1;
+		u32 cq_21_occ_int : 1;
+		u32 cq_22_occ_int : 1;
+		u32 cq_23_occ_int : 1;
+		u32 cq_24_occ_int : 1;
+		u32 cq_25_occ_int : 1;
+		u32 cq_26_occ_int : 1;
+		u32 cq_27_occ_int : 1;
+		u32 cq_28_occ_int : 1;
+		u32 cq_29_occ_int : 1;
+		u32 cq_30_occ_int : 1;
+		u32 cq_31_occ_int : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_63_32_OCC_INT_STS 0x10000444
+#define DLB2_SYS_DIR_CQ_63_32_OCC_INT_STS_RST 0x0
+union dlb2_sys_dir_cq_63_32_occ_int_sts {
+	struct {
+		u32 cq_32_occ_int : 1;
+		u32 cq_33_occ_int : 1;
+		u32 cq_34_occ_int : 1;
+		u32 cq_35_occ_int : 1;
+		u32 cq_36_occ_int : 1;
+		u32 cq_37_occ_int : 1;
+		u32 cq_38_occ_int : 1;
+		u32 cq_39_occ_int : 1;
+		u32 cq_40_occ_int : 1;
+		u32 cq_41_occ_int : 1;
+		u32 cq_42_occ_int : 1;
+		u32 cq_43_occ_int : 1;
+		u32 cq_44_occ_int : 1;
+		u32 cq_45_occ_int : 1;
+		u32 cq_46_occ_int : 1;
+		u32 cq_47_occ_int : 1;
+		u32 cq_48_occ_int : 1;
+		u32 cq_49_occ_int : 1;
+		u32 cq_50_occ_int : 1;
+		u32 cq_51_occ_int : 1;
+		u32 cq_52_occ_int : 1;
+		u32 cq_53_occ_int : 1;
+		u32 cq_54_occ_int : 1;
+		u32 cq_55_occ_int : 1;
+		u32 cq_56_occ_int : 1;
+		u32 cq_57_occ_int : 1;
+		u32 cq_58_occ_int : 1;
+		u32 cq_59_occ_int : 1;
+		u32 cq_60_occ_int : 1;
+		u32 cq_61_occ_int : 1;
+		u32 cq_62_occ_int : 1;
+		u32 cq_63_occ_int : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_31_0_OCC_INT_STS 0x10000460
+#define DLB2_SYS_LDB_CQ_31_0_OCC_INT_STS_RST 0x0
+union dlb2_sys_ldb_cq_31_0_occ_int_sts {
+	struct {
+		u32 cq_0_occ_int : 1;
+		u32 cq_1_occ_int : 1;
+		u32 cq_2_occ_int : 1;
+		u32 cq_3_occ_int : 1;
+		u32 cq_4_occ_int : 1;
+		u32 cq_5_occ_int : 1;
+		u32 cq_6_occ_int : 1;
+		u32 cq_7_occ_int : 1;
+		u32 cq_8_occ_int : 1;
+		u32 cq_9_occ_int : 1;
+		u32 cq_10_occ_int : 1;
+		u32 cq_11_occ_int : 1;
+		u32 cq_12_occ_int : 1;
+		u32 cq_13_occ_int : 1;
+		u32 cq_14_occ_int : 1;
+		u32 cq_15_occ_int : 1;
+		u32 cq_16_occ_int : 1;
+		u32 cq_17_occ_int : 1;
+		u32 cq_18_occ_int : 1;
+		u32 cq_19_occ_int : 1;
+		u32 cq_20_occ_int : 1;
+		u32 cq_21_occ_int : 1;
+		u32 cq_22_occ_int : 1;
+		u32 cq_23_occ_int : 1;
+		u32 cq_24_occ_int : 1;
+		u32 cq_25_occ_int : 1;
+		u32 cq_26_occ_int : 1;
+		u32 cq_27_occ_int : 1;
+		u32 cq_28_occ_int : 1;
+		u32 cq_29_occ_int : 1;
+		u32 cq_30_occ_int : 1;
+		u32 cq_31_occ_int : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_LDB_CQ_63_32_OCC_INT_STS 0x10000464
+#define DLB2_SYS_LDB_CQ_63_32_OCC_INT_STS_RST 0x0
+union dlb2_sys_ldb_cq_63_32_occ_int_sts {
+	struct {
+		u32 cq_32_occ_int : 1;
+		u32 cq_33_occ_int : 1;
+		u32 cq_34_occ_int : 1;
+		u32 cq_35_occ_int : 1;
+		u32 cq_36_occ_int : 1;
+		u32 cq_37_occ_int : 1;
+		u32 cq_38_occ_int : 1;
+		u32 cq_39_occ_int : 1;
+		u32 cq_40_occ_int : 1;
+		u32 cq_41_occ_int : 1;
+		u32 cq_42_occ_int : 1;
+		u32 cq_43_occ_int : 1;
+		u32 cq_44_occ_int : 1;
+		u32 cq_45_occ_int : 1;
+		u32 cq_46_occ_int : 1;
+		u32 cq_47_occ_int : 1;
+		u32 cq_48_occ_int : 1;
+		u32 cq_49_occ_int : 1;
+		u32 cq_50_occ_int : 1;
+		u32 cq_51_occ_int : 1;
+		u32 cq_52_occ_int : 1;
+		u32 cq_53_occ_int : 1;
+		u32 cq_54_occ_int : 1;
+		u32 cq_55_occ_int : 1;
+		u32 cq_56_occ_int : 1;
+		u32 cq_57_occ_int : 1;
+		u32 cq_58_occ_int : 1;
+		u32 cq_59_occ_int : 1;
+		u32 cq_60_occ_int : 1;
+		u32 cq_61_occ_int : 1;
+		u32 cq_62_occ_int : 1;
+		u32 cq_63_occ_int : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_DIR_CQ_OPT_CLR 0x100004c0
+#define DLB2_SYS_DIR_CQ_OPT_CLR_RST 0x0
+union dlb2_sys_dir_cq_opt_clr {
+	struct {
+		u32 cq : 6;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_SYS_ALARM_HW_SYND 0x1000050c
+#define DLB2_SYS_ALARM_HW_SYND_RST 0x0
+union dlb2_sys_alarm_hw_synd {
+	struct {
+		u32 syndrome : 8;
+		u32 rtype : 2;
+		u32 alarm : 1;
+		u32 cwd : 1;
+		u32 vf_pf_mb : 1;
+		u32 rsvd0 : 1;
+		u32 cls : 2;
+		u32 aid : 6;
+		u32 unit : 4;
+		u32 source : 4;
+		u32 more : 1;
+		u32 valid : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_QID_FID_LIM(x) \
+	(0x20000000 + (x) * 0x1000)
+#define DLB2_AQED_PIPE_QID_FID_LIM_RST 0x7ff
+union dlb2_aqed_pipe_qid_fid_lim {
+	struct {
+		u32 qid_fid_limit : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_QID_HID_WIDTH(x) \
+	(0x20080000 + (x) * 0x1000)
+#define DLB2_AQED_PIPE_QID_HID_WIDTH_RST 0x0
+union dlb2_aqed_pipe_qid_hid_width {
+	struct {
+		u32 compress_code : 3;
+		u32 rsvd0 : 29;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATM_0 0x24000004
+#define DLB2_AQED_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATM_0_RST 0xfefcfaf8
+union dlb2_aqed_pipe_cfg_arb_weights_tqpri_atm_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_ACTIVITYCNTR0 0x2c00004c
+#define DLB2_AQED_PIPE_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_aqed_pipe_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_ACTIVITYCNTR1 0x2c000050
+#define DLB2_AQED_PIPE_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_aqed_pipe_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_COMPARE0 0x2c000054
+#define DLB2_AQED_PIPE_SMON_COMPARE0_RST 0x0
+union dlb2_aqed_pipe_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_COMPARE1 0x2c000058
+#define DLB2_AQED_PIPE_SMON_COMPARE1_RST 0x0
+union dlb2_aqed_pipe_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_CFG0 0x2c00005c
+#define DLB2_AQED_PIPE_SMON_CFG0_RST 0x40000000
+union dlb2_aqed_pipe_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_CFG1 0x2c000060
+#define DLB2_AQED_PIPE_SMON_CFG1_RST 0x0
+union dlb2_aqed_pipe_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_MAX_TMR 0x2c000064
+#define DLB2_AQED_PIPE_SMON_MAX_TMR_RST 0x0
+union dlb2_aqed_pipe_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_AQED_PIPE_SMON_TMR 0x2c000068
+#define DLB2_AQED_PIPE_SMON_TMR_RST 0x0
+union dlb2_aqed_pipe_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_QID2CQIDIX_00(x) \
+	(0x30080000 + (x) * 0x1000)
+#define DLB2_ATM_QID2CQIDIX_00_RST 0x0
+#define DLB2_ATM_QID2CQIDIX(x, y) \
+	(DLB2_ATM_QID2CQIDIX_00(x) + 0x80000 * (y))
+#define DLB2_ATM_QID2CQIDIX_NUM 16
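+/*
+ * The QID2CQIDIX registers form a 2-D array: x selects the queue and y
+ * (0 to DLB2_ATM_QID2CQIDIX_NUM - 1) selects a four-entry slice
+ * (cq_p0-cq_p3) of that queue's CQ index table.
+ */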
+union dlb2_atm_qid2cqidix_00 {
+	struct {
+		u32 cq_p0 : 8;
+		u32 cq_p1 : 8;
+		u32 cq_p2 : 8;
+		u32 cq_p3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_CFG_ARB_WEIGHTS_RDY_BIN 0x34000004
+#define DLB2_ATM_CFG_ARB_WEIGHTS_RDY_BIN_RST 0xfffefdfc
+union dlb2_atm_cfg_arb_weights_rdy_bin {
+	struct {
+		u32 bin0 : 8;
+		u32 bin1 : 8;
+		u32 bin2 : 8;
+		u32 bin3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_CFG_ARB_WEIGHTS_SCHED_BIN 0x34000008
+#define DLB2_ATM_CFG_ARB_WEIGHTS_SCHED_BIN_RST 0xfffefdfc
+union dlb2_atm_cfg_arb_weights_sched_bin {
+	struct {
+		u32 bin0 : 8;
+		u32 bin1 : 8;
+		u32 bin2 : 8;
+		u32 bin3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_ACTIVITYCNTR0 0x3c000050
+#define DLB2_ATM_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_atm_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_ACTIVITYCNTR1 0x3c000054
+#define DLB2_ATM_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_atm_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_COMPARE0 0x3c000058
+#define DLB2_ATM_SMON_COMPARE0_RST 0x0
+union dlb2_atm_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_COMPARE1 0x3c00005c
+#define DLB2_ATM_SMON_COMPARE1_RST 0x0
+union dlb2_atm_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_CFG0 0x3c000060
+#define DLB2_ATM_SMON_CFG0_RST 0x40000000
+union dlb2_atm_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_CFG1 0x3c000064
+#define DLB2_ATM_SMON_CFG1_RST 0x0
+union dlb2_atm_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_MAX_TMR 0x3c000068
+#define DLB2_ATM_SMON_MAX_TMR_RST 0x0
+union dlb2_atm_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_ATM_SMON_TMR 0x3c00006c
+#define DLB2_ATM_SMON_TMR_RST 0x0
+union dlb2_atm_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_VAS_CRD(x) \
+	(0x40000000 + (x) * 0x1000)
+#define DLB2_CHP_CFG_DIR_VAS_CRD_RST 0x0
+union dlb2_chp_cfg_dir_vas_crd {
+	struct {
+		u32 count : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_VAS_CRD(x) \
+	(0x40080000 + (x) * 0x1000)
+#define DLB2_CHP_CFG_LDB_VAS_CRD_RST 0x0
+union dlb2_chp_cfg_ldb_vas_crd {
+	struct {
+		u32 count : 15;
+		u32 rsvd0 : 17;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_ORD_QID_SN(x) \
+	(0x40100000 + (x) * 0x1000)
+#define DLB2_CHP_ORD_QID_SN_RST 0x0
+union dlb2_chp_ord_qid_sn {
+	struct {
+		u32 sn : 10;
+		u32 rsvd0 : 22;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_ORD_QID_SN_MAP(x) \
+	(0x40180000 + (x) * 0x1000)
+#define DLB2_CHP_ORD_QID_SN_MAP_RST 0x0
+union dlb2_chp_ord_qid_sn_map {
+	struct {
+		u32 mode : 3;
+		u32 slot : 4;
+		u32 rsvz0 : 1;
+		u32 grp : 1;
+		u32 rsvz1 : 1;
+		u32 rsvd0 : 22;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SN_CHK_ENBL(x) \
+	(0x40200000 + (x) * 0x1000)
+#define DLB2_CHP_SN_CHK_ENBL_RST 0x0
+union dlb2_chp_sn_chk_enbl {
+	struct {
+		u32 en : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_DEPTH(x) \
+	(0x40280000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_DEPTH_RST 0x0
+union dlb2_chp_dir_cq_depth {
+	struct {
+		u32 depth : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_INT_DEPTH_THRSH(x) \
+	(0x40300000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_INT_DEPTH_THRSH_RST 0x0
+union dlb2_chp_dir_cq_int_depth_thrsh {
+	struct {
+		u32 depth_threshold : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_INT_ENB(x) \
+	(0x40380000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_INT_ENB_RST 0x0
+union dlb2_chp_dir_cq_int_enb {
+	struct {
+		u32 en_tim : 1;
+		u32 en_depth : 1;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_TMR_THRSH(x) \
+	(0x40480000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_TMR_THRSH_RST 0x1
+union dlb2_chp_dir_cq_tmr_thrsh {
+	struct {
+		u32 thrsh_0 : 1;
+		u32 thrsh_13_1 : 13;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_TKN_DEPTH_SEL(x) \
+	(0x40500000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_TKN_DEPTH_SEL_RST 0x0
+union dlb2_chp_dir_cq_tkn_depth_sel {
+	struct {
+		u32 token_depth_select : 4;
+		u32 rsvd0 : 28;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_WD_ENB(x) \
+	(0x40580000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_WD_ENB_RST 0x0
+union dlb2_chp_dir_cq_wd_enb {
+	struct {
+		u32 wd_enable : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_WPTR(x) \
+	(0x40600000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ_WPTR_RST 0x0
+union dlb2_chp_dir_cq_wptr {
+	struct {
+		u32 write_pointer : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ2VAS(x) \
+	(0x40680000 + (x) * 0x1000)
+#define DLB2_CHP_DIR_CQ2VAS_RST 0x0
+union dlb2_chp_dir_cq2vas {
+	struct {
+		u32 cq2vas : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_HIST_LIST_BASE(x) \
+	(0x40700000 + (x) * 0x1000)
+#define DLB2_CHP_HIST_LIST_BASE_RST 0x0
+union dlb2_chp_hist_list_base {
+	struct {
+		u32 base : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_HIST_LIST_LIM(x) \
+	(0x40780000 + (x) * 0x1000)
+#define DLB2_CHP_HIST_LIST_LIM_RST 0x0
+union dlb2_chp_hist_list_lim {
+	struct {
+		u32 limit : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_HIST_LIST_POP_PTR(x) \
+	(0x40800000 + (x) * 0x1000)
+#define DLB2_CHP_HIST_LIST_POP_PTR_RST 0x0
+union dlb2_chp_hist_list_pop_ptr {
+	struct {
+		u32 pop_ptr : 13;
+		u32 generation : 1;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_HIST_LIST_PUSH_PTR(x) \
+	(0x40880000 + (x) * 0x1000)
+#define DLB2_CHP_HIST_LIST_PUSH_PTR_RST 0x0
+union dlb2_chp_hist_list_push_ptr {
+	struct {
+		u32 push_ptr : 13;
+		u32 generation : 1;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_DEPTH(x) \
+	(0x40900000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_DEPTH_RST 0x0
+union dlb2_chp_ldb_cq_depth {
+	struct {
+		u32 depth : 11;
+		u32 rsvd0 : 21;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_INT_DEPTH_THRSH(x) \
+	(0x40980000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_INT_DEPTH_THRSH_RST 0x0
+union dlb2_chp_ldb_cq_int_depth_thrsh {
+	struct {
+		u32 depth_threshold : 11;
+		u32 rsvd0 : 21;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_INT_ENB(x) \
+	(0x40a00000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_INT_ENB_RST 0x0
+union dlb2_chp_ldb_cq_int_enb {
+	struct {
+		u32 en_tim : 1;
+		u32 en_depth : 1;
+		u32 rsvd0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_TMR_THRSH(x) \
+	(0x40b00000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_TMR_THRSH_RST 0x1
+union dlb2_chp_ldb_cq_tmr_thrsh {
+	struct {
+		u32 thrsh_0 : 1;
+		u32 thrsh_13_1 : 13;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_TKN_DEPTH_SEL(x) \
+	(0x40b80000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_TKN_DEPTH_SEL_RST 0x0
+union dlb2_chp_ldb_cq_tkn_depth_sel {
+	struct {
+		u32 token_depth_select : 4;
+		u32 rsvd0 : 28;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_WD_ENB(x) \
+	(0x40c00000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_WD_ENB_RST 0x0
+union dlb2_chp_ldb_cq_wd_enb {
+	struct {
+		u32 wd_enable : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_WPTR(x) \
+	(0x40c80000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ_WPTR_RST 0x0
+union dlb2_chp_ldb_cq_wptr {
+	struct {
+		u32 write_pointer : 11;
+		u32 rsvd0 : 21;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ2VAS(x) \
+	(0x40d00000 + (x) * 0x1000)
+#define DLB2_CHP_LDB_CQ2VAS_RST 0x0
+union dlb2_chp_ldb_cq2vas {
+	struct {
+		u32 cq2vas : 5;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_CHP_CSR_CTRL 0x44000008
+#define DLB2_CHP_CFG_CHP_CSR_CTRL_RST 0x180002
+union dlb2_chp_cfg_chp_csr_ctrl {
+	struct {
+		u32 int_cor_alarm_dis : 1;
+		u32 int_cor_synd_dis : 1;
+		u32 int_uncr_alarm_dis : 1;
+		u32 int_unc_synd_dis : 1;
+		u32 int_inf0_alarm_dis : 1;
+		u32 int_inf0_synd_dis : 1;
+		u32 int_inf1_alarm_dis : 1;
+		u32 int_inf1_synd_dis : 1;
+		u32 int_inf2_alarm_dis : 1;
+		u32 int_inf2_synd_dis : 1;
+		u32 int_inf3_alarm_dis : 1;
+		u32 int_inf3_synd_dis : 1;
+		u32 int_inf4_alarm_dis : 1;
+		u32 int_inf4_synd_dis : 1;
+		u32 int_inf5_alarm_dis : 1;
+		u32 int_inf5_synd_dis : 1;
+		u32 dlb_cor_alarm_enable : 1;
+		u32 cfg_64bytes_qe_ldb_cq_mode : 1;
+		u32 cfg_64bytes_qe_dir_cq_mode : 1;
+		u32 pad_write_ldb : 1;
+		u32 pad_write_dir : 1;
+		u32 pad_first_write_ldb : 1;
+		u32 pad_first_write_dir : 1;
+		u32 rsvz0 : 9;
+	} field;
+	u32 val;
+};
+
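+/*
+ * One armed bit per CQ: the ARMED0/ARMED1 pair covers CQs 0-31 and
+ * 32-63, respectively.
+ */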
+#define DLB2_CHP_DIR_CQ_INTR_ARMED0 0x4400005c
+#define DLB2_CHP_DIR_CQ_INTR_ARMED0_RST 0x0
+union dlb2_chp_dir_cq_intr_armed0 {
+	struct {
+		u32 armed : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_DIR_CQ_INTR_ARMED1 0x44000060
+#define DLB2_CHP_DIR_CQ_INTR_ARMED1_RST 0x0
+union dlb2_chp_dir_cq_intr_armed1 {
+	struct {
+		u32 armed : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_CQ_TIMER_CTL 0x44000084
+#define DLB2_CHP_CFG_DIR_CQ_TIMER_CTL_RST 0x0
+union dlb2_chp_cfg_dir_cq_timer_ctl {
+	struct {
+		u32 sample_interval : 8;
+		u32 enb : 1;
+		u32 rsvz0 : 23;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WDTO_0 0x44000088
+#define DLB2_CHP_CFG_DIR_WDTO_0_RST 0x0
+union dlb2_chp_cfg_dir_wdto_0 {
+	struct {
+		u32 wdto : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WDTO_1 0x4400008c
+#define DLB2_CHP_CFG_DIR_WDTO_1_RST 0x0
+union dlb2_chp_cfg_dir_wdto_1 {
+	struct {
+		u32 wdto : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WD_DISABLE0 0x44000098
+#define DLB2_CHP_CFG_DIR_WD_DISABLE0_RST 0xffffffff
+union dlb2_chp_cfg_dir_wd_disable0 {
+	struct {
+		u32 wd_disable : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WD_DISABLE1 0x4400009c
+#define DLB2_CHP_CFG_DIR_WD_DISABLE1_RST 0xffffffff
+union dlb2_chp_cfg_dir_wd_disable1 {
+	struct {
+		u32 wd_disable : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WD_ENB_INTERVAL 0x440000a0
+#define DLB2_CHP_CFG_DIR_WD_ENB_INTERVAL_RST 0x0
+union dlb2_chp_cfg_dir_wd_enb_interval {
+	struct {
+		u32 sample_interval : 28;
+		u32 enb : 1;
+		u32 rsvz0 : 3;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_DIR_WD_THRESHOLD 0x440000ac
+#define DLB2_CHP_CFG_DIR_WD_THRESHOLD_RST 0x0
+union dlb2_chp_cfg_dir_wd_threshold {
+	struct {
+		u32 wd_threshold : 8;
+		u32 rsvz0 : 24;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_INTR_ARMED0 0x440000b0
+#define DLB2_CHP_LDB_CQ_INTR_ARMED0_RST 0x0
+union dlb2_chp_ldb_cq_intr_armed0 {
+	struct {
+		u32 armed : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_LDB_CQ_INTR_ARMED1 0x440000b4
+#define DLB2_CHP_LDB_CQ_INTR_ARMED1_RST 0x0
+union dlb2_chp_ldb_cq_intr_armed1 {
+	struct {
+		u32 armed : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_CQ_TIMER_CTL 0x440000d8
+#define DLB2_CHP_CFG_LDB_CQ_TIMER_CTL_RST 0x0
+union dlb2_chp_cfg_ldb_cq_timer_ctl {
+	struct {
+		u32 sample_interval : 8;
+		u32 enb : 1;
+		u32 rsvz0 : 23;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WDTO_0 0x440000dc
+#define DLB2_CHP_CFG_LDB_WDTO_0_RST 0x0
+union dlb2_chp_cfg_ldb_wdto_0 {
+	struct {
+		u32 wdto : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WDTO_1 0x440000e0
+#define DLB2_CHP_CFG_LDB_WDTO_1_RST 0x0
+union dlb2_chp_cfg_ldb_wdto_1 {
+	struct {
+		u32 wdto : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WD_DISABLE0 0x440000ec
+#define DLB2_CHP_CFG_LDB_WD_DISABLE0_RST 0xffffffff
+union dlb2_chp_cfg_ldb_wd_disable0 {
+	struct {
+		u32 wd_disable : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WD_DISABLE1 0x440000f0
+#define DLB2_CHP_CFG_LDB_WD_DISABLE1_RST 0xffffffff
+union dlb2_chp_cfg_ldb_wd_disable1 {
+	struct {
+		u32 wd_disable : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WD_ENB_INTERVAL 0x440000f4
+#define DLB2_CHP_CFG_LDB_WD_ENB_INTERVAL_RST 0x0
+union dlb2_chp_cfg_ldb_wd_enb_interval {
+	struct {
+		u32 sample_interval : 28;
+		u32 enb : 1;
+		u32 rsvz0 : 3;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CFG_LDB_WD_THRESHOLD 0x44000100
+#define DLB2_CHP_CFG_LDB_WD_THRESHOLD_RST 0x0
+union dlb2_chp_cfg_ldb_wd_threshold {
+	struct {
+		u32 wd_threshold : 8;
+		u32 rsvz0 : 24;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_COMPARE0 0x4c000000
+#define DLB2_CHP_SMON_COMPARE0_RST 0x0
+union dlb2_chp_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_COMPARE1 0x4c000004
+#define DLB2_CHP_SMON_COMPARE1_RST 0x0
+union dlb2_chp_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_CFG0 0x4c000008
+#define DLB2_CHP_SMON_CFG0_RST 0x40000000
+union dlb2_chp_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_CFG1 0x4c00000c
+#define DLB2_CHP_SMON_CFG1_RST 0x0
+union dlb2_chp_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_ACTIVITYCNTR0 0x4c000010
+#define DLB2_CHP_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_chp_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_ACTIVITYCNTR1 0x4c000014
+#define DLB2_CHP_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_chp_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_MAX_TMR 0x4c000018
+#define DLB2_CHP_SMON_MAX_TMR_RST 0x0
+union dlb2_chp_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_SMON_TMR 0x4c00001c
+#define DLB2_CHP_SMON_TMR_RST 0x0
+union dlb2_chp_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CHP_CTRL_DIAG_02 0x4c000028
+#define DLB2_CHP_CTRL_DIAG_02_RST 0x1555
+union dlb2_chp_ctrl_diag_02 {
+	struct {
+		u32 egress_credit_status_empty : 1;
+		u32 egress_credit_status_afull : 1;
+		u32 chp_outbound_hcw_pipe_credit_status_empty : 1;
+		u32 chp_outbound_hcw_pipe_credit_status_afull : 1;
+		u32 chp_lsp_ap_cmp_pipe_credit_status_empty : 1;
+		u32 chp_lsp_ap_cmp_pipe_credit_status_afull : 1;
+		u32 chp_lsp_tok_pipe_credit_status_empty : 1;
+		u32 chp_lsp_tok_pipe_credit_status_afull : 1;
+		u32 chp_rop_pipe_credit_status_empty : 1;
+		u32 chp_rop_pipe_credit_status_afull : 1;
+		u32 qed_to_cq_pipe_credit_status_empty : 1;
+		u32 qed_to_cq_pipe_credit_status_afull : 1;
+		u32 egress_lsp_token_credit_status_empty : 1;
+		u32 egress_lsp_token_credit_status_afull : 1;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_DIR_0 0x54000000
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_DIR_0_RST 0xfefcfaf8
+union dlb2_dp_cfg_arb_weights_tqpri_dir_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_DIR_1 0x54000004
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_DIR_1_RST 0x0
+union dlb2_dp_cfg_arb_weights_tqpri_dir_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_REPLAY_0 0x54000008
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_REPLAY_0_RST 0xfefcfaf8
+union dlb2_dp_cfg_arb_weights_tqpri_replay_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_REPLAY_1 0x5400000c
+#define DLB2_DP_CFG_ARB_WEIGHTS_TQPRI_REPLAY_1_RST 0x0
+union dlb2_dp_cfg_arb_weights_tqpri_replay_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_DIR_CSR_CTRL 0x54000010
+#define DLB2_DP_DIR_CSR_CTRL_RST 0x0
+union dlb2_dp_dir_csr_ctrl {
+	struct {
+		u32 int_cor_alarm_dis : 1;
+		u32 int_cor_synd_dis : 1;
+		u32 int_uncr_alarm_dis : 1;
+		u32 int_unc_synd_dis : 1;
+		u32 int_inf0_alarm_dis : 1;
+		u32 int_inf0_synd_dis : 1;
+		u32 int_inf1_alarm_dis : 1;
+		u32 int_inf1_synd_dis : 1;
+		u32 int_inf2_alarm_dis : 1;
+		u32 int_inf2_synd_dis : 1;
+		u32 int_inf3_alarm_dis : 1;
+		u32 int_inf3_synd_dis : 1;
+		u32 int_inf4_alarm_dis : 1;
+		u32 int_inf4_synd_dis : 1;
+		u32 int_inf5_alarm_dis : 1;
+		u32 int_inf5_synd_dis : 1;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_ACTIVITYCNTR0 0x5c000058
+#define DLB2_DP_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_dp_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_ACTIVITYCNTR1 0x5c00005c
+#define DLB2_DP_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_dp_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_COMPARE0 0x5c000060
+#define DLB2_DP_SMON_COMPARE0_RST 0x0
+union dlb2_dp_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_COMPARE1 0x5c000064
+#define DLB2_DP_SMON_COMPARE1_RST 0x0
+union dlb2_dp_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_CFG0 0x5c000068
+#define DLB2_DP_SMON_CFG0_RST 0x40000000
+union dlb2_dp_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_CFG1 0x5c00006c
+#define DLB2_DP_SMON_CFG1_RST 0x0
+union dlb2_dp_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_MAX_TMR 0x5c000070
+#define DLB2_DP_SMON_MAX_TMR_RST 0x0
+union dlb2_dp_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DP_SMON_TMR 0x5c000074
+#define DLB2_DP_SMON_TMR_RST 0x0
+union dlb2_dp_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_ACTIVITYCNTR0 0x6c000024
+#define DLB2_DQED_PIPE_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_dqed_pipe_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_ACTIVITYCNTR1 0x6c000028
+#define DLB2_DQED_PIPE_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_dqed_pipe_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_COMPARE0 0x6c00002c
+#define DLB2_DQED_PIPE_SMON_COMPARE0_RST 0x0
+union dlb2_dqed_pipe_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_COMPARE1 0x6c000030
+#define DLB2_DQED_PIPE_SMON_COMPARE1_RST 0x0
+union dlb2_dqed_pipe_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_CFG0 0x6c000034
+#define DLB2_DQED_PIPE_SMON_CFG0_RST 0x40000000
+union dlb2_dqed_pipe_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_CFG1 0x6c000038
+#define DLB2_DQED_PIPE_SMON_CFG1_RST 0x0
+union dlb2_dqed_pipe_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_MAX_TMR 0x6c00003c
+#define DLB2_DQED_PIPE_SMON_MAX_TMR_RST 0x0
+union dlb2_dqed_pipe_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_DQED_PIPE_SMON_TMR 0x6c000040
+#define DLB2_DQED_PIPE_SMON_TMR_RST 0x0
+union dlb2_dqed_pipe_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_ACTIVITYCNTR0 0x7c000024
+#define DLB2_QED_PIPE_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_qed_pipe_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_ACTIVITYCNTR1 0x7c000028
+#define DLB2_QED_PIPE_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_qed_pipe_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_COMPARE0 0x7c00002c
+#define DLB2_QED_PIPE_SMON_COMPARE0_RST 0x0
+union dlb2_qed_pipe_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_COMPARE1 0x7c000030
+#define DLB2_QED_PIPE_SMON_COMPARE1_RST 0x0
+union dlb2_qed_pipe_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_CFG0 0x7c000034
+#define DLB2_QED_PIPE_SMON_CFG0_RST 0x40000000
+union dlb2_qed_pipe_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_CFG1 0x7c000038
+#define DLB2_QED_PIPE_SMON_CFG1_RST 0x0
+union dlb2_qed_pipe_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_MAX_TMR 0x7c00003c
+#define DLB2_QED_PIPE_SMON_MAX_TMR_RST 0x0
+union dlb2_qed_pipe_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_QED_PIPE_SMON_TMR 0x7c000040
+#define DLB2_QED_PIPE_SMON_TMR_RST 0x0
+union dlb2_qed_pipe_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATQ_0 0x84000000
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATQ_0_RST 0xfefcfaf8
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_atq_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATQ_1 0x84000004
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_ATQ_1_RST 0x0
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_atq_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_NALB_0 0x84000008
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_NALB_0_RST 0xfefcfaf8
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_nalb_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_NALB_1 0x8400000c
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_NALB_1_RST 0x0
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_nalb_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_REPLAY_0 0x84000010
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_REPLAY_0_RST 0xfefcfaf8
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_replay_0 {
+	struct {
+		u32 pri0 : 8;
+		u32 pri1 : 8;
+		u32 pri2 : 8;
+		u32 pri3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_REPLAY_1 0x84000014
+#define DLB2_NALB_PIPE_CFG_ARB_WEIGHTS_TQPRI_REPLAY_1_RST 0x0
+union dlb2_nalb_pipe_cfg_arb_weights_tqpri_replay_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_ACTIVITYCNTR0 0x8c000064
+#define DLB2_NALB_PIPE_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_nalb_pipe_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_ACTIVITYCNTR1 0x8c000068
+#define DLB2_NALB_PIPE_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_nalb_pipe_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_COMPARE0 0x8c00006c
+#define DLB2_NALB_PIPE_SMON_COMPARE0_RST 0x0
+union dlb2_nalb_pipe_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_COMPARE1 0x8c000070
+#define DLB2_NALB_PIPE_SMON_COMPARE1_RST 0x0
+union dlb2_nalb_pipe_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_CFG0 0x8c000074
+#define DLB2_NALB_PIPE_SMON_CFG0_RST 0x40000000
+union dlb2_nalb_pipe_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_CFG1 0x8c000078
+#define DLB2_NALB_PIPE_SMON_CFG1_RST 0x0
+union dlb2_nalb_pipe_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_MAX_TMR 0x8c00007c
+#define DLB2_NALB_PIPE_SMON_MAX_TMR_RST 0x0
+union dlb2_nalb_pipe_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_NALB_PIPE_SMON_TMR 0x8c000080
+#define DLB2_NALB_PIPE_SMON_TMR_RST 0x0
+union dlb2_nalb_pipe_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_GRP_0_SLT_SHFT(x) \
+	(0x96000000 + (x) * 0x4)
+#define DLB2_RO_PIPE_GRP_0_SLT_SHFT_RST 0x0
+union dlb2_ro_pipe_grp_0_slt_shft {
+	struct {
+		u32 change : 10;
+		u32 rsvd0 : 22;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_GRP_1_SLT_SHFT(x) \
+	(0x96010000 + (x) * 0x4)
+#define DLB2_RO_PIPE_GRP_1_SLT_SHFT_RST 0x0
+union dlb2_ro_pipe_grp_1_slt_shft {
+	struct {
+		u32 change : 10;
+		u32 rsvd0 : 22;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_GRP_SN_MODE 0x94000000
+#define DLB2_RO_PIPE_GRP_SN_MODE_RST 0x0
+union dlb2_ro_pipe_grp_sn_mode {
+	struct {
+		u32 sn_mode_0 : 3;
+		u32 rszv0 : 5;
+		u32 sn_mode_1 : 3;
+		u32 rszv1 : 21;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_CFG_CTRL_GENERAL_0 0x9c000000
+#define DLB2_RO_PIPE_CFG_CTRL_GENERAL_0_RST 0x0
+union dlb2_ro_pipe_cfg_ctrl_general_0 {
+	struct {
+		u32 unit_single_step_mode : 1;
+		u32 rr_en : 1;
+		u32 rszv0 : 30;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_ACTIVITYCNTR0 0x9c000030
+#define DLB2_RO_PIPE_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_ro_pipe_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_ACTIVITYCNTR1 0x9c000034
+#define DLB2_RO_PIPE_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_ro_pipe_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_COMPARE0 0x9c000038
+#define DLB2_RO_PIPE_SMON_COMPARE0_RST 0x0
+union dlb2_ro_pipe_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_COMPARE1 0x9c00003c
+#define DLB2_RO_PIPE_SMON_COMPARE1_RST 0x0
+union dlb2_ro_pipe_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_CFG0 0x9c000040
+#define DLB2_RO_PIPE_SMON_CFG0_RST 0x40000000
+union dlb2_ro_pipe_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_CFG1 0x9c000044
+#define DLB2_RO_PIPE_SMON_CFG1_RST 0x0
+union dlb2_ro_pipe_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_MAX_TMR 0x9c000048
+#define DLB2_RO_PIPE_SMON_MAX_TMR_RST 0x0
+union dlb2_ro_pipe_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_RO_PIPE_SMON_TMR 0x9c00004c
+#define DLB2_RO_PIPE_SMON_TMR_RST 0x0
+union dlb2_ro_pipe_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ2PRIOV(x) \
+	(0xa0000000 + (x) * 0x1000)
+#define DLB2_LSP_CQ2PRIOV_RST 0x0
+union dlb2_lsp_cq2priov {
+	struct {
+		u32 prio : 24;
+		u32 v : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ2QID0(x) \
+	(0xa0080000 + (x) * 0x1000)
+#define DLB2_LSP_CQ2QID0_RST 0x0
+union dlb2_lsp_cq2qid0 {
+	struct {
+		u32 qid_p0 : 7;
+		u32 rsvd3 : 1;
+		u32 qid_p1 : 7;
+		u32 rsvd2 : 1;
+		u32 qid_p2 : 7;
+		u32 rsvd1 : 1;
+		u32 qid_p3 : 7;
+		u32 rsvd0 : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ2QID1(x) \
+	(0xa0100000 + (x) * 0x1000)
+#define DLB2_LSP_CQ2QID1_RST 0x0
+union dlb2_lsp_cq2qid1 {
+	struct {
+		u32 qid_p4 : 7;
+		u32 rsvd3 : 1;
+		u32 qid_p5 : 7;
+		u32 rsvd2 : 1;
+		u32 qid_p6 : 7;
+		u32 rsvd1 : 1;
+		u32 qid_p7 : 7;
+		u32 rsvd0 : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_DIR_DSBL(x) \
+	(0xa0180000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_DIR_DSBL_RST 0x1
+union dlb2_lsp_cq_dir_dsbl {
+	struct {
+		u32 disabled : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_DIR_TKN_CNT(x) \
+	(0xa0200000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_DIR_TKN_CNT_RST 0x0
+union dlb2_lsp_cq_dir_tkn_cnt {
+	struct {
+		u32 count : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(x) \
+	(0xa0280000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI_RST 0x0
+union dlb2_lsp_cq_dir_tkn_depth_sel_dsi {
+	struct {
+		u32 token_depth_select : 4;
+		u32 disable_wb_opt : 1;
+		u32 ignore_depth : 1;
+		u32 rsvd0 : 26;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_DIR_TOT_SCH_CNTL(x) \
+	(0xa0300000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_DIR_TOT_SCH_CNTL_RST 0x0
+union dlb2_lsp_cq_dir_tot_sch_cntl {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_DIR_TOT_SCH_CNTH(x) \
+	(0xa0380000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_DIR_TOT_SCH_CNTH_RST 0x0
+union dlb2_lsp_cq_dir_tot_sch_cnth {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_DSBL(x) \
+	(0xa0400000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_DSBL_RST 0x1
+union dlb2_lsp_cq_ldb_dsbl {
+	struct {
+		u32 disabled : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_INFL_CNT(x) \
+	(0xa0480000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_CNT_RST 0x0
+union dlb2_lsp_cq_ldb_infl_cnt {
+	struct {
+		u32 count : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_INFL_LIM(x) \
+	(0xa0500000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_LIM_RST 0x0
+union dlb2_lsp_cq_ldb_infl_lim {
+	struct {
+		u32 limit : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_TKN_CNT(x) \
+	(0xa0580000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_TKN_CNT_RST 0x0
+union dlb2_lsp_cq_ldb_tkn_cnt {
+	struct {
+		u32 token_count : 11;
+		u32 rsvd0 : 21;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_TKN_DEPTH_SEL(x) \
+	(0xa0600000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_TKN_DEPTH_SEL_RST 0x0
+union dlb2_lsp_cq_ldb_tkn_depth_sel {
+	struct {
+		u32 token_depth_select : 4;
+		u32 ignore_depth : 1;
+		u32 rsvd0 : 27;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_TOT_SCH_CNTL(x) \
+	(0xa0680000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_TOT_SCH_CNTL_RST 0x0
+union dlb2_lsp_cq_ldb_tot_sch_cntl {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CQ_LDB_TOT_SCH_CNTH(x) \
+	(0xa0700000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_TOT_SCH_CNTH_RST 0x0
+union dlb2_lsp_cq_ldb_tot_sch_cnth {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_DIR_MAX_DEPTH(x) \
+	(0xa0780000 + (x) * 0x1000)
+#define DLB2_LSP_QID_DIR_MAX_DEPTH_RST 0x0
+union dlb2_lsp_qid_dir_max_depth {
+	struct {
+		u32 depth : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_DIR_TOT_ENQ_CNTL(x) \
+	(0xa0800000 + (x) * 0x1000)
+#define DLB2_LSP_QID_DIR_TOT_ENQ_CNTL_RST 0x0
+union dlb2_lsp_qid_dir_tot_enq_cntl {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_DIR_TOT_ENQ_CNTH(x) \
+	(0xa0880000 + (x) * 0x1000)
+#define DLB2_LSP_QID_DIR_TOT_ENQ_CNTH_RST 0x0
+union dlb2_lsp_qid_dir_tot_enq_cnth {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_DIR_ENQUEUE_CNT(x) \
+	(0xa0900000 + (x) * 0x1000)
+#define DLB2_LSP_QID_DIR_ENQUEUE_CNT_RST 0x0
+union dlb2_lsp_qid_dir_enqueue_cnt {
+	struct {
+		u32 count : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_DIR_DEPTH_THRSH(x) \
+	(0xa0980000 + (x) * 0x1000)
+#define DLB2_LSP_QID_DIR_DEPTH_THRSH_RST 0x0
+union dlb2_lsp_qid_dir_depth_thrsh {
+	struct {
+		u32 thresh : 13;
+		u32 rsvd0 : 19;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_AQED_ACTIVE_CNT(x) \
+	(0xa0a00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_AQED_ACTIVE_CNT_RST 0x0
+union dlb2_lsp_qid_aqed_active_cnt {
+	struct {
+		u32 count : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_AQED_ACTIVE_LIM(x) \
+	(0xa0a80000 + (x) * 0x1000)
+#define DLB2_LSP_QID_AQED_ACTIVE_LIM_RST 0x0
+union dlb2_lsp_qid_aqed_active_lim {
+	struct {
+		u32 limit : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_ATM_TOT_ENQ_CNTL(x) \
+	(0xa0b00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_ATM_TOT_ENQ_CNTL_RST 0x0
+union dlb2_lsp_qid_atm_tot_enq_cntl {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_ATM_TOT_ENQ_CNTH(x) \
+	(0xa0b80000 + (x) * 0x1000)
+#define DLB2_LSP_QID_ATM_TOT_ENQ_CNTH_RST 0x0
+union dlb2_lsp_qid_atm_tot_enq_cnth {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_ATQ_ENQUEUE_CNT(x) \
+	(0xa0c00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_ATQ_ENQUEUE_CNT_RST 0x0
+union dlb2_lsp_qid_atq_enqueue_cnt {
+	struct {
+		u32 count : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_LDB_ENQUEUE_CNT(x) \
+	(0xa0c80000 + (x) * 0x1000)
+#define DLB2_LSP_QID_LDB_ENQUEUE_CNT_RST 0x0
+union dlb2_lsp_qid_ldb_enqueue_cnt {
+	struct {
+		u32 count : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_LDB_INFL_CNT(x) \
+	(0xa0d00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_LDB_INFL_CNT_RST 0x0
+union dlb2_lsp_qid_ldb_infl_cnt {
+	struct {
+		u32 count : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_LDB_INFL_LIM(x) \
+	(0xa0d80000 + (x) * 0x1000)
+#define DLB2_LSP_QID_LDB_INFL_LIM_RST 0x0
+union dlb2_lsp_qid_ldb_infl_lim {
+	struct {
+		u32 limit : 12;
+		u32 rsvd0 : 20;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID2CQIDIX_00(x) \
+	(0xa0e00000 + (x) * 0x1000)
+#define DLB2_LSP_QID2CQIDIX_00_RST 0x0
+#define DLB2_LSP_QID2CQIDIX(x, y) \
+	(DLB2_LSP_QID2CQIDIX_00(x) + 0x80000 * (y))
+#define DLB2_LSP_QID2CQIDIX_NUM 16
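+/*
+ * Each QID2CQIDIX register covers four CQs (y selects CQs 4y through 4y+3);
+ * cq_p0-cq_p3 are per-CQ bitmaps of the qid_map slot indices at which QID x
+ * is mapped to that CQ.
+ */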
+union dlb2_lsp_qid2cqidix_00 {
+	struct {
+		u32 cq_p0 : 8;
+		u32 cq_p1 : 8;
+		u32 cq_p2 : 8;
+		u32 cq_p3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID2CQIDIX2_00(x) \
+	(0xa1600000 + (x) * 0x1000)
+#define DLB2_LSP_QID2CQIDIX2_00_RST 0x0
+#define DLB2_LSP_QID2CQIDIX2(x, y) \
+	(DLB2_LSP_QID2CQIDIX2_00(x) + 0x80000 * (y))
+#define DLB2_LSP_QID2CQIDIX2_NUM 16
+union dlb2_lsp_qid2cqidix2_00 {
+	struct {
+		u32 cq_p0 : 8;
+		u32 cq_p1 : 8;
+		u32 cq_p2 : 8;
+		u32 cq_p3 : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_LDB_REPLAY_CNT(x) \
+	(0xa1e00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_LDB_REPLAY_CNT_RST 0x0
+union dlb2_lsp_qid_ldb_replay_cnt {
+	struct {
+		u32 count : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_NALDB_MAX_DEPTH(x) \
+	(0xa1f00000 + (x) * 0x1000)
+#define DLB2_LSP_QID_NALDB_MAX_DEPTH_RST 0x0
+union dlb2_lsp_qid_naldb_max_depth {
+	struct {
+		u32 depth : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_NALDB_TOT_ENQ_CNTL(x) \
+	(0xa1f80000 + (x) * 0x1000)
+#define DLB2_LSP_QID_NALDB_TOT_ENQ_CNTL_RST 0x0
+union dlb2_lsp_qid_naldb_tot_enq_cntl {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_NALDB_TOT_ENQ_CNTH(x) \
+	(0xa2000000 + (x) * 0x1000)
+#define DLB2_LSP_QID_NALDB_TOT_ENQ_CNTH_RST 0x0
+union dlb2_lsp_qid_naldb_tot_enq_cnth {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_ATM_DEPTH_THRSH(x) \
+	(0xa2080000 + (x) * 0x1000)
+#define DLB2_LSP_QID_ATM_DEPTH_THRSH_RST 0x0
+union dlb2_lsp_qid_atm_depth_thrsh {
+	struct {
+		u32 thresh : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_NALDB_DEPTH_THRSH(x) \
+	(0xa2100000 + (x) * 0x1000)
+#define DLB2_LSP_QID_NALDB_DEPTH_THRSH_RST 0x0
+union dlb2_lsp_qid_naldb_depth_thrsh {
+	struct {
+		u32 thresh : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_QID_ATM_ACTIVE(x) \
+	(0xa2180000 + (x) * 0x1000)
+#define DLB2_LSP_QID_ATM_ACTIVE_RST 0x0
+union dlb2_lsp_qid_atm_active {
+	struct {
+		u32 count : 14;
+		u32 rsvd0 : 18;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_0 0xa4000008
+#define DLB2_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_0_RST 0x0
+union dlb2_lsp_cfg_arb_weight_atm_nalb_qid_0 {
+	struct {
+		u32 pri0_weight : 8;
+		u32 pri1_weight : 8;
+		u32 pri2_weight : 8;
+		u32 pri3_weight : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_1 0xa400000c
+#define DLB2_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_1_RST 0x0
+union dlb2_lsp_cfg_arb_weight_atm_nalb_qid_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_ARB_WEIGHT_LDB_QID_0 0xa4000014
+#define DLB2_LSP_CFG_ARB_WEIGHT_LDB_QID_0_RST 0x0
+union dlb2_lsp_cfg_arb_weight_ldb_qid_0 {
+	struct {
+		u32 pri0_weight : 8;
+		u32 pri1_weight : 8;
+		u32 pri2_weight : 8;
+		u32 pri3_weight : 8;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_ARB_WEIGHT_LDB_QID_1 0xa4000018
+#define DLB2_LSP_CFG_ARB_WEIGHT_LDB_QID_1_RST 0x0
+union dlb2_lsp_cfg_arb_weight_ldb_qid_1 {
+	struct {
+		u32 rsvz0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_LDB_SCHED_CTRL 0xa400002c
+#define DLB2_LSP_LDB_SCHED_CTRL_RST 0x0
+union dlb2_lsp_ldb_sched_ctrl {
+	struct {
+		u32 cq : 8;
+		u32 qidix : 3;
+		u32 value : 1;
+		u32 nalb_haswork_v : 1;
+		u32 rlist_haswork_v : 1;
+		u32 slist_haswork_v : 1;
+		u32 inflight_ok_v : 1;
+		u32 aqed_nfull_v : 1;
+		u32 rsvz0 : 15;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_DIR_SCH_CNT_L 0xa4000034
+#define DLB2_LSP_DIR_SCH_CNT_L_RST 0x0
+union dlb2_lsp_dir_sch_cnt_l {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_DIR_SCH_CNT_H 0xa4000038
+#define DLB2_LSP_DIR_SCH_CNT_H_RST 0x0
+union dlb2_lsp_dir_sch_cnt_h {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_LDB_SCH_CNT_L 0xa400003c
+#define DLB2_LSP_LDB_SCH_CNT_L_RST 0x0
+union dlb2_lsp_ldb_sch_cnt_l {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_LDB_SCH_CNT_H 0xa4000040
+#define DLB2_LSP_LDB_SCH_CNT_H_RST 0x0
+union dlb2_lsp_ldb_sch_cnt_h {
+	struct {
+		u32 count : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_SHDW_CTRL 0xa4000070
+#define DLB2_LSP_CFG_SHDW_CTRL_RST 0x0
+union dlb2_lsp_cfg_shdw_ctrl {
+	struct {
+		u32 transfer : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_SHDW_RANGE_COS(x) \
+	(0xa4000074 + (x) * 4)
+#define DLB2_LSP_CFG_SHDW_RANGE_COS_RST 0x40
+union dlb2_lsp_cfg_shdw_range_cos {
+	struct {
+		u32 bw_range : 9;
+		u32 rsvz0 : 22;
+		u32 no_extra_credit : 1;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_CFG_CTRL_GENERAL_0 0xac000000
+#define DLB2_LSP_CFG_CTRL_GENERAL_0_RST 0x0
+union dlb2_lsp_cfg_ctrl_general_0 {
+	struct {
+		u32 disab_atq_empty_arb : 1;
+		u32 inc_tok_unit_idle : 1;
+		u32 disab_rlist_pri : 1;
+		u32 inc_cmp_unit_idle : 1;
+		u32 rsvz0 : 2;
+		u32 dir_single_op : 1;
+		u32 dir_half_bw : 1;
+		u32 dir_single_out : 1;
+		u32 dir_disab_multi : 1;
+		u32 atq_single_op : 1;
+		u32 atq_half_bw : 1;
+		u32 atq_single_out : 1;
+		u32 atq_disab_multi : 1;
+		u32 dirrpl_single_op : 1;
+		u32 dirrpl_half_bw : 1;
+		u32 dirrpl_single_out : 1;
+		u32 lbrpl_single_op : 1;
+		u32 lbrpl_half_bw : 1;
+		u32 lbrpl_single_out : 1;
+		u32 ldb_single_op : 1;
+		u32 ldb_half_bw : 1;
+		u32 ldb_disab_multi : 1;
+		u32 atm_single_sch : 1;
+		u32 atm_single_cmp : 1;
+		u32 ldb_ce_tog_arb : 1;
+		u32 rsvz1 : 1;
+		u32 smon0_valid_sel : 2;
+		u32 smon0_value_sel : 1;
+		u32 smon0_compare_sel : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_COMPARE0 0xac000048
+#define DLB2_LSP_SMON_COMPARE0_RST 0x0
+union dlb2_lsp_smon_compare0 {
+	struct {
+		u32 compare0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_COMPARE1 0xac00004c
+#define DLB2_LSP_SMON_COMPARE1_RST 0x0
+union dlb2_lsp_smon_compare1 {
+	struct {
+		u32 compare1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_CFG0 0xac000050
+#define DLB2_LSP_SMON_CFG0_RST 0x40000000
+union dlb2_lsp_smon_cfg0 {
+	struct {
+		u32 smon_enable : 1;
+		u32 smon_0trigger_enable : 1;
+		u32 rsvz0 : 2;
+		u32 smon0_function : 3;
+		u32 smon0_function_compare : 1;
+		u32 smon1_function : 3;
+		u32 smon1_function_compare : 1;
+		u32 smon_mode : 4;
+		u32 stopcounterovfl : 1;
+		u32 intcounterovfl : 1;
+		u32 statcounter0ovfl : 1;
+		u32 statcounter1ovfl : 1;
+		u32 stoptimerovfl : 1;
+		u32 inttimerovfl : 1;
+		u32 stattimerovfl : 1;
+		u32 rsvz1 : 1;
+		u32 timer_prescale : 5;
+		u32 rsvz2 : 1;
+		u32 version : 2;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_CFG1 0xac000054
+#define DLB2_LSP_SMON_CFG1_RST 0x0
+union dlb2_lsp_smon_cfg1 {
+	struct {
+		u32 mode0 : 8;
+		u32 mode1 : 8;
+		u32 rsvz0 : 16;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_ACTIVITYCNTR0 0xac000058
+#define DLB2_LSP_SMON_ACTIVITYCNTR0_RST 0x0
+union dlb2_lsp_smon_activitycntr0 {
+	struct {
+		u32 counter0 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_ACTIVITYCNTR1 0xac00005c
+#define DLB2_LSP_SMON_ACTIVITYCNTR1_RST 0x0
+union dlb2_lsp_smon_activitycntr1 {
+	struct {
+		u32 counter1 : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_MAX_TMR 0xac000060
+#define DLB2_LSP_SMON_MAX_TMR_RST 0x0
+union dlb2_lsp_smon_max_tmr {
+	struct {
+		u32 maxvalue : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_LSP_SMON_TMR 0xac000064
+#define DLB2_LSP_SMON_TMR_RST 0x0
+union dlb2_lsp_smon_tmr {
+	struct {
+		u32 timer : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_CFG_MSTR_DIAG_RESET_STS 0xb4000000
+#define DLB2_CFG_MSTR_DIAG_RESET_STS_RST 0x80000bff
+union dlb2_cfg_mstr_diag_reset_sts {
+	struct {
+		u32 chp_pf_reset_done : 1;
+		u32 rop_pf_reset_done : 1;
+		u32 lsp_pf_reset_done : 1;
+		u32 nalb_pf_reset_done : 1;
+		u32 ap_pf_reset_done : 1;
+		u32 dp_pf_reset_done : 1;
+		u32 qed_pf_reset_done : 1;
+		u32 dqed_pf_reset_done : 1;
+		u32 aqed_pf_reset_done : 1;
+		u32 sys_pf_reset_done : 1;
+		u32 pf_reset_active : 1;
+		u32 flrsm_state : 7;
+		u32 rsvd0 : 13;
+		u32 dlb_proc_reset_done : 1;
+	} field;
+	u32 val;
+};
+
 #define DLB2_CFG_MSTR_CFG_DIAGNOSTIC_IDLE_STATUS 0xb4000004
 #define DLB2_CFG_MSTR_CFG_DIAGNOSTIC_IDLE_STATUS_RST 0x9d0fffff
 union dlb2_cfg_mstr_cfg_diagnostic_idle_status {
@@ -80,4 +3628,75 @@ union dlb2_cfg_mstr_cfg_pm_pmcsr_disable {
 	u32 val;
 };
 
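+/*
+ * The VF2PF mailbox is a 256-byte message buffer exposed as an array of
+ * 4-byte registers indexed by x.
+ */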
+#define DLB2_FUNC_VF_VF2PF_MAILBOX_BYTES 256
+#define DLB2_FUNC_VF_VF2PF_MAILBOX(x) \
+	(0x1000 + (x) * 0x4)
+#define DLB2_FUNC_VF_VF2PF_MAILBOX_RST 0x0
+union dlb2_func_vf_vf2pf_mailbox {
+	struct {
+		u32 msg : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_VF2PF_MAILBOX_ISR 0x1f00
+#define DLB2_FUNC_VF_VF2PF_MAILBOX_ISR_RST 0x0
+#define DLB2_FUNC_VF_SIOV_VF2PF_MAILBOX_ISR_TRIGGER 0x8000
+union dlb2_func_vf_vf2pf_mailbox_isr {
+	struct {
+		u32 isr : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_PF2VF_MAILBOX_BYTES 64
+#define DLB2_FUNC_VF_PF2VF_MAILBOX(x) \
+	(0x2000 + (x) * 0x4)
+#define DLB2_FUNC_VF_PF2VF_MAILBOX_RST 0x0
+union dlb2_func_vf_pf2vf_mailbox {
+	struct {
+		u32 msg : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_PF2VF_MAILBOX_ISR 0x2f00
+#define DLB2_FUNC_VF_PF2VF_MAILBOX_ISR_RST 0x0
+union dlb2_func_vf_pf2vf_mailbox_isr {
+	struct {
+		u32 pf_isr : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_VF_MSI_ISR_PEND 0x2f10
+#define DLB2_FUNC_VF_VF_MSI_ISR_PEND_RST 0x0
+union dlb2_func_vf_vf_msi_isr_pend {
+	struct {
+		u32 isr_pend : 32;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_VF_RESET_IN_PROGRESS 0x3000
+#define DLB2_FUNC_VF_VF_RESET_IN_PROGRESS_RST 0x1
+union dlb2_func_vf_vf_reset_in_progress {
+	struct {
+		u32 reset_in_progress : 1;
+		u32 rsvd0 : 31;
+	} field;
+	u32 val;
+};
+
+#define DLB2_FUNC_VF_VF_MSI_ISR 0x4000
+#define DLB2_FUNC_VF_VF_MSI_ISR_RST 0x0
+union dlb2_func_vf_vf_msi_isr {
+	struct {
+		u32 vf_msi_isr : 32;
+	} field;
+	u32 val;
+};
+
 #endif /* __DLB2_REGS_H */
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 44da2a3d3061..d3773c0c5dd1 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -1,11 +1,40 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
 /* Copyright(c) 2016-2020 Intel Corporation */
 
+#include <linux/frame.h>
+
 #include "dlb2_bitmap.h"
 #include "dlb2_hw_types.h"
+#include "dlb2_main.h"
 #include "dlb2_regs.h"
 #include "dlb2_resource.h"
 
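+/*
+ * Each resource is linked onto a per-function (PF or vdev) list via its
+ * func_list member while unassigned, and onto a per-domain list via its
+ * domain_list member once attached to a scheduling domain. These helpers
+ * wrap the corresponding list accessors.
+ */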
+#define DLB2_DOM_LIST_HEAD(head, type) \
+	list_first_entry_or_null(&(head), type, domain_list)
+
+#define DLB2_FUNC_LIST_HEAD(head, type) \
+	list_first_entry_or_null(&(head), type, func_list)
+
+#define DLB2_DOM_LIST_FOR(head, ptr) \
+	list_for_each_entry(ptr, &(head), domain_list)
+
+#define DLB2_FUNC_LIST_FOR(head, ptr) \
+	list_for_each_entry(ptr, &(head), func_list)
+
+#define DLB2_DOM_LIST_FOR_SAFE(head, ptr, ptr_tmp) \
+	list_for_each_entry_safe(ptr, ptr_tmp, &(head), domain_list)
+
+/*
+ * The PF driver cannot assume that a register write will affect subsequent HCW
+ * writes. To ensure a write completes, the driver must read back a CSR. This
+ * function need only be called for configuration that can occur after the
+ * domain has started; prior to starting, applications can't send HCWs.
+ */
+static inline void dlb2_flush_csr(struct dlb2_hw *hw)
+{
+	DLB2_CSR_RD(hw, DLB2_SYS_TOTAL_VAS);
+}
+
 static void dlb2_init_fn_rsrc_lists(struct dlb2_function_resources *rsrc)
 {
 	int i;
@@ -185,6 +214,3109 @@ int dlb2_resource_init(struct dlb2_hw *hw)
 	return ret;
 }
 
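+/*
+ * Look up a scheduling domain by ID: the physical ID for PF requests, or the
+ * requesting vdev's virtual ID for vdev requests.
+ */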
+static struct dlb2_hw_domain *dlb2_get_domain_from_id(struct dlb2_hw *hw,
+						      u32 id,
+						      bool vdev_req,
+						      unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_hw_domain *domain;
+
+	if (id >= DLB2_MAX_NUM_DOMAINS)
+		return NULL;
+
+	if (!vdev_req)
+		return &hw->domains[id];
+
+	rsrcs = &hw->vdev[vdev_id];
+
+	DLB2_FUNC_LIST_FOR(rsrcs->used_domains, domain)
+		if (domain->id.virt_id == id)
+			return domain;
+
+	return NULL;
+}
+
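+/*
+ * Look up a load-balanced queue by ID; vdev requests are resolved by virtual
+ * ID, searching both the vdev's used and available queues.
+ */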
+static struct dlb2_ldb_queue *
+dlb2_get_ldb_queue_from_id(struct dlb2_hw *hw,
+			   u32 id,
+			   bool vdev_req,
+			   unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+
+	if (id >= DLB2_MAX_NUM_LDB_QUEUES)
+		return NULL;
+
+	if (!vdev_req)
+		return &hw->rsrcs.ldb_queues[id];
+
+	rsrcs = &hw->vdev[vdev_id];
+
+	DLB2_FUNC_LIST_FOR(rsrcs->used_domains, domain) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue)
+			if (queue->id.virt_id == id)
+				return queue;
+	}
+
+	DLB2_FUNC_LIST_FOR(rsrcs->avail_ldb_queues, queue)
+		if (queue->id.virt_id == id)
+			return queue;
+
+	return NULL;
+}
+
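+/*
+ * Move num_queues load-balanced queues from the function's available list
+ * onto the domain's, marking each queue as owned by the domain.
+ */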
+static int dlb2_attach_ldb_queues(struct dlb2_hw *hw,
+				  struct dlb2_function_resources *rsrcs,
+				  struct dlb2_hw_domain *domain,
+				  u32 num_queues,
+				  struct dlb2_cmd_response *resp)
+{
+	unsigned int i;
+
+	if (rsrcs->num_avail_ldb_queues < num_queues) {
+		resp->status = DLB2_ST_LDB_QUEUES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	for (i = 0; i < num_queues; i++) {
+		struct dlb2_ldb_queue *queue;
+
+		queue = DLB2_FUNC_LIST_HEAD(rsrcs->avail_ldb_queues,
+					    typeof(*queue));
+		if (!queue) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: domain validation failed\n",
+				    __func__);
+			return -EFAULT;
+		}
+
+		list_del(&queue->func_list);
+
+		queue->domain_id = domain->id;
+		queue->owned = true;
+
+		list_add(&queue->domain_list, &domain->avail_ldb_queues);
+	}
+
+	rsrcs->num_avail_ldb_queues -= num_queues;
+
+	return 0;
+}
+
+static struct dlb2_ldb_port *
+dlb2_get_next_ldb_port(struct dlb2_hw *hw,
+		       struct dlb2_function_resources *rsrcs,
+		       u32 domain_id,
+		       u32 cos_id)
+{
+	struct dlb2_ldb_port *port;
+
+	/*
+	 * To reduce the odds of consecutive load-balanced ports mapping to the
+	 * same queue(s), the driver attempts to allocate ports whose neighbors
+	 * are owned by a different domain.
+	 */
+	DLB2_FUNC_LIST_FOR(rsrcs->avail_ldb_ports[cos_id], port) {
+		u32 next, prev;
+		u32 phys_id;
+
+		phys_id = port->id.phys_id;
+		next = phys_id + 1;
+		prev = phys_id - 1;
+
+		if (phys_id == DLB2_MAX_NUM_LDB_PORTS - 1)
+			next = 0;
+		if (phys_id == 0)
+			prev = DLB2_MAX_NUM_LDB_PORTS - 1;
+
+		if (!hw->rsrcs.ldb_ports[next].owned ||
+		    hw->rsrcs.ldb_ports[next].domain_id.phys_id == domain_id)
+			continue;
+
+		if (!hw->rsrcs.ldb_ports[prev].owned ||
+		    hw->rsrcs.ldb_ports[prev].domain_id.phys_id == domain_id)
+			continue;
+
+		return port;
+	}
+
+	/*
+	 * Failing that, the driver looks for a port with one neighbor owned by
+	 * a different domain and the other unallocated.
+	 */
+	DLB2_FUNC_LIST_FOR(rsrcs->avail_ldb_ports[cos_id], port) {
+		u32 next, prev;
+		u32 phys_id;
+
+		phys_id = port->id.phys_id;
+		next = phys_id + 1;
+		prev = phys_id - 1;
+
+		if (phys_id == DLB2_MAX_NUM_LDB_PORTS - 1)
+			next = 0;
+		if (phys_id == 0)
+			prev = DLB2_MAX_NUM_LDB_PORTS - 1;
+
+		if (!hw->rsrcs.ldb_ports[prev].owned &&
+		    hw->rsrcs.ldb_ports[next].owned &&
+		    hw->rsrcs.ldb_ports[next].domain_id.phys_id != domain_id)
+			return port;
+
+		if (!hw->rsrcs.ldb_ports[next].owned &&
+		    hw->rsrcs.ldb_ports[prev].owned &&
+		    hw->rsrcs.ldb_ports[prev].domain_id.phys_id != domain_id)
+			return port;
+	}
+
+	/*
+	 * Failing that, the driver looks for a port with both neighbors
+	 * unallocated.
+	 */
+	DLB2_FUNC_LIST_FOR(rsrcs->avail_ldb_ports[cos_id], port) {
+		u32 next, prev;
+		u32 phys_id;
+
+		phys_id = port->id.phys_id;
+		next = phys_id + 1;
+		prev = phys_id - 1;
+
+		if (phys_id == DLB2_MAX_NUM_LDB_PORTS - 1)
+			next = 0;
+		if (phys_id == 0)
+			prev = DLB2_MAX_NUM_LDB_PORTS - 1;
+
+		if (!hw->rsrcs.ldb_ports[prev].owned &&
+		    !hw->rsrcs.ldb_ports[next].owned)
+			return port;
+	}
+
+	/* If all else fails, the driver returns the next available port. */
+	return DLB2_FUNC_LIST_HEAD(rsrcs->avail_ldb_ports[cos_id],
+				   typeof(*port));
+}
+
+static int __dlb2_attach_ldb_ports(struct dlb2_hw *hw,
+				   struct dlb2_function_resources *rsrcs,
+				   struct dlb2_hw_domain *domain,
+				   u32 num_ports,
+				   u32 cos_id,
+				   struct dlb2_cmd_response *resp)
+{
+	unsigned int i;
+
+	if (rsrcs->num_avail_ldb_ports[cos_id] < num_ports) {
+		resp->status = DLB2_ST_LDB_PORTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	for (i = 0; i < num_ports; i++) {
+		struct dlb2_ldb_port *port;
+
+		port = dlb2_get_next_ldb_port(hw, rsrcs,
+					      domain->id.phys_id, cos_id);
+		if (!port) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: domain validation failed\n",
+				    __func__);
+			return -EFAULT;
+		}
+
+		list_del(&port->func_list);
+
+		port->domain_id = domain->id;
+		port->owned = true;
+
+		list_add(&port->domain_list,
+			 &domain->avail_ldb_ports[cos_id]);
+	}
+
+	rsrcs->num_avail_ldb_ports[cos_id] -= num_ports;
+
+	return 0;
+}
+
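+/*
+ * Allocate the domain's load-balanced ports. With cos_strict, each per-CoS
+ * request must be satisfied from that class-of-service alone; otherwise the
+ * driver may fall back to the other classes. Ports requested without a CoS
+ * preference are taken from whichever class has one available.
+ */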
+static int dlb2_attach_ldb_ports(struct dlb2_hw *hw,
+				 struct dlb2_function_resources *rsrcs,
+				 struct dlb2_hw_domain *domain,
+				 struct dlb2_create_sched_domain_args *args,
+				 struct dlb2_cmd_response *resp)
+{
+	unsigned int i;
+	int ret, j;
+
+	if (args->cos_strict) {
+		for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+			u32 num = args->num_cos_ldb_ports[i];
+
+			/* Allocate ports from specific classes-of-service */
+			ret = __dlb2_attach_ldb_ports(hw,
+						      rsrcs,
+						      domain,
+						      num,
+						      i,
+						      resp);
+			if (ret)
+				return ret;
+		}
+	} else {
+		unsigned int k;
+		u32 cos_id;
+
+		/*
+		 * Attempt to allocate from the requested class-of-service, but
+		 * fall back to the other classes if that fails.
+		 */
+		for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+			for (j = 0; j < args->num_cos_ldb_ports[i]; j++) {
+				for (k = 0; k < DLB2_NUM_COS_DOMAINS; k++) {
+					cos_id = (i + k) % DLB2_NUM_COS_DOMAINS;
+
+					ret = __dlb2_attach_ldb_ports(hw,
+								      rsrcs,
+								      domain,
+								      1,
+								      cos_id,
+								      resp);
+					if (ret == 0)
+						break;
+				}
+
+				if (ret < 0)
+					return ret;
+			}
+		}
+	}
+
+	/* Allocate num_ldb_ports from any class-of-service */
+	for (i = 0; i < args->num_ldb_ports; i++) {
+		for (j = 0; j < DLB2_NUM_COS_DOMAINS; j++) {
+			ret = __dlb2_attach_ldb_ports(hw,
+						      rsrcs,
+						      domain,
+						      1,
+						      j,
+						      resp);
+			if (ret == 0)
+				break;
+		}
+
+		if (ret < 0)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int dlb2_attach_dir_ports(struct dlb2_hw *hw,
+				 struct dlb2_function_resources *rsrcs,
+				 struct dlb2_hw_domain *domain,
+				 u32 num_ports,
+				 struct dlb2_cmd_response *resp)
+{
+	unsigned int i;
+
+	if (rsrcs->num_avail_dir_pq_pairs < num_ports) {
+		resp->status = DLB2_ST_DIR_PORTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	for (i = 0; i < num_ports; i++) {
+		struct dlb2_dir_pq_pair *port;
+
+		port = DLB2_FUNC_LIST_HEAD(rsrcs->avail_dir_pq_pairs,
+					   typeof(*port));
+		if (!port) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: domain validation failed\n",
+				    __func__);
+			return -EFAULT;
+		}
+
+		list_del(&port->func_list);
+
+		port->domain_id = domain->id;
+		port->owned = true;
+
+		list_add(&port->domain_list, &domain->avail_dir_pq_pairs);
+	}
+
+	rsrcs->num_avail_dir_pq_pairs -= num_ports;
+
+	return 0;
+}
+
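+/*
+ * The three helpers below reserve credits and atomic-inflight entries from
+ * the function's free pools for the domain's exclusive use.
+ */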
+static int dlb2_attach_ldb_credits(struct dlb2_function_resources *rsrcs,
+				   struct dlb2_hw_domain *domain,
+				   u32 num_credits,
+				   struct dlb2_cmd_response *resp)
+{
+	if (rsrcs->num_avail_qed_entries < num_credits) {
+		resp->status = DLB2_ST_LDB_CREDITS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	rsrcs->num_avail_qed_entries -= num_credits;
+	domain->num_ldb_credits += num_credits;
+	return 0;
+}
+
+static int dlb2_attach_dir_credits(struct dlb2_function_resources *rsrcs,
+				   struct dlb2_hw_domain *domain,
+				   u32 num_credits,
+				   struct dlb2_cmd_response *resp)
+{
+	if (rsrcs->num_avail_dqed_entries < num_credits) {
+		resp->status = DLB2_ST_DIR_CREDITS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	rsrcs->num_avail_dqed_entries -= num_credits;
+	domain->num_dir_credits += num_credits;
+	return 0;
+}
+
+static int dlb2_attach_atomic_inflights(struct dlb2_function_resources *rsrcs,
+					struct dlb2_hw_domain *domain,
+					u32 num_atomic_inflights,
+					struct dlb2_cmd_response *resp)
+{
+	if (rsrcs->num_avail_aqed_entries < num_atomic_inflights) {
+		resp->status = DLB2_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	rsrcs->num_avail_aqed_entries -= num_atomic_inflights;
+	domain->num_avail_aqed_entries += num_atomic_inflights;
+	return 0;
+}
+
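+/*
+ * Reserve a contiguous range of history list entries for the domain,
+ * recording the range's base so the entries can be addressed relative to it.
+ */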
+static int
+dlb2_attach_domain_hist_list_entries(struct dlb2_function_resources *rsrcs,
+				     struct dlb2_hw_domain *domain,
+				     u32 num_hist_list_entries,
+				     struct dlb2_cmd_response *resp)
+{
+	struct dlb2_bitmap *bitmap;
+	int base;
+
+	if (num_hist_list_entries) {
+		bitmap = rsrcs->avail_hist_list_entries;
+
+		base = dlb2_bitmap_find_set_bit_range(bitmap,
+						      num_hist_list_entries);
+		if (base < 0)
+			goto error;
+
+		domain->total_hist_list_entries = num_hist_list_entries;
+		domain->avail_hist_list_entries = num_hist_list_entries;
+		domain->hist_list_entry_base = base;
+		domain->hist_list_entry_offset = 0;
+
+		dlb2_bitmap_clear_range(bitmap, base, num_hist_list_entries);
+	}
+	return 0;
+
+error:
+	resp->status = DLB2_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+	return -EINVAL;
+}
+
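+/*
+ * Check that the function (PF or vdev) has enough of each resource type
+ * available to satisfy the domain creation request before any allocation is
+ * attempted.
+ */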
+static int
+dlb2_verify_create_sched_dom_args(struct dlb2_function_resources *rsrcs,
+				  struct dlb2_create_sched_domain_args *args,
+				  struct dlb2_cmd_response *resp)
+{
+	u32 num_avail_ldb_ports, req_ldb_ports;
+	struct dlb2_bitmap *avail_hl_entries;
+	unsigned int max_contig_hl_range;
+	int i;
+
+	avail_hl_entries = rsrcs->avail_hist_list_entries;
+
+	max_contig_hl_range = dlb2_bitmap_longest_set_range(avail_hl_entries);
+
+	num_avail_ldb_ports = 0;
+	req_ldb_ports = 0;
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		num_avail_ldb_ports += rsrcs->num_avail_ldb_ports[i];
+
+		req_ldb_ports += args->num_cos_ldb_ports[i];
+	}
+
+	req_ldb_ports += args->num_ldb_ports;
+
+	if (rsrcs->num_avail_domains < 1) {
+		resp->status = DLB2_ST_DOMAIN_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (rsrcs->num_avail_ldb_queues < args->num_ldb_queues) {
+		resp->status = DLB2_ST_LDB_QUEUES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (req_ldb_ports > num_avail_ldb_ports) {
+		resp->status = DLB2_ST_LDB_PORTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	for (i = 0; args->cos_strict && i < DLB2_NUM_COS_DOMAINS; i++) {
+		if (args->num_cos_ldb_ports[i] >
+		    rsrcs->num_avail_ldb_ports[i]) {
+			resp->status = DLB2_ST_LDB_PORTS_UNAVAILABLE;
+			return -EINVAL;
+		}
+	}
+
+	if (args->num_ldb_queues > 0 && req_ldb_ports == 0) {
+		resp->status = DLB2_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES;
+		return -EINVAL;
+	}
+
+	if (rsrcs->num_avail_dir_pq_pairs < args->num_dir_ports) {
+		resp->status = DLB2_ST_DIR_PORTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (rsrcs->num_avail_qed_entries < args->num_ldb_credits) {
+		resp->status = DLB2_ST_LDB_CREDITS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (rsrcs->num_avail_dqed_entries < args->num_dir_credits) {
+		resp->status = DLB2_ST_DIR_CREDITS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (rsrcs->num_avail_aqed_entries < args->num_atomic_inflights) {
+		resp->status = DLB2_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (max_contig_hl_range < args->num_hist_list_entries) {
+		resp->status = DLB2_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (resp->status)
+		return -EINVAL;
+
+	return 0;
+}
+
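+/*
+ * dlb2_port_find_slot{,_queue}() scan the port's qid_map for the first slot
+ * in the given state (the _queue variant also matches the queue's physical
+ * ID), write its index to *slot, and return whether one was found.
+ */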
+static bool dlb2_port_find_slot(struct dlb2_ldb_port *port,
+				enum dlb2_qid_map_state state,
+				int *slot)
+{
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+		if (port->qid_map[i].state == state)
+			break;
+	}
+
+	*slot = i;
+
+	return (i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ);
+}
+
+static bool dlb2_port_find_slot_queue(struct dlb2_ldb_port *port,
+				      enum dlb2_qid_map_state state,
+				      struct dlb2_ldb_queue *queue,
+				      int *slot)
+{
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+		if (port->qid_map[i].state == state &&
+		    port->qid_map[i].qid == queue->id.phys_id)
+			break;
+	}
+
+	*slot = i;
+
+	return (i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ);
+}
+
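+/*
+ * Validate a slot's state change and update the per-queue, per-port, and
+ * per-domain mapping counters accordingly, then record the new state in the
+ * port's qid_map.
+ */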
+static int dlb2_port_slot_state_transition(struct dlb2_hw *hw,
+					   struct dlb2_ldb_port *port,
+					   struct dlb2_ldb_queue *queue,
+					   int slot,
+					   enum dlb2_qid_map_state new_state)
+{
+	enum dlb2_qid_map_state curr_state = port->qid_map[slot].state;
+	struct dlb2_hw_domain *domain;
+	int domain_id;
+
+	domain_id = port->domain_id.phys_id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, false, 0);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: unable to find domain %d\n",
+			    __func__, domain_id);
+		return -EINVAL;
+	}
+
+	switch (curr_state) {
+	case DLB2_QUEUE_UNMAPPED:
+		switch (new_state) {
+		case DLB2_QUEUE_MAPPED:
+			queue->num_mappings++;
+			port->num_mappings++;
+			break;
+		case DLB2_QUEUE_MAP_IN_PROG:
+			queue->num_pending_additions++;
+			domain->num_pending_additions++;
+			break;
+		default:
+			goto error;
+		}
+		break;
+	case DLB2_QUEUE_MAPPED:
+		switch (new_state) {
+		case DLB2_QUEUE_UNMAPPED:
+			queue->num_mappings--;
+			port->num_mappings--;
+			break;
+		case DLB2_QUEUE_UNMAP_IN_PROG:
+			port->num_pending_removals++;
+			domain->num_pending_removals++;
+			break;
+		case DLB2_QUEUE_MAPPED:
+			/* Priority change, nothing to update */
+			break;
+		default:
+			goto error;
+		}
+		break;
+	case DLB2_QUEUE_MAP_IN_PROG:
+		switch (new_state) {
+		case DLB2_QUEUE_UNMAPPED:
+			queue->num_pending_additions--;
+			domain->num_pending_additions--;
+			break;
+		case DLB2_QUEUE_MAPPED:
+			queue->num_mappings++;
+			port->num_mappings++;
+			queue->num_pending_additions--;
+			domain->num_pending_additions--;
+			break;
+		default:
+			goto error;
+		}
+		break;
+	case DLB2_QUEUE_UNMAP_IN_PROG:
+		switch (new_state) {
+		case DLB2_QUEUE_UNMAPPED:
+			port->num_pending_removals--;
+			domain->num_pending_removals--;
+			queue->num_mappings--;
+			port->num_mappings--;
+			break;
+		case DLB2_QUEUE_MAPPED:
+			port->num_pending_removals--;
+			domain->num_pending_removals--;
+			break;
+		case DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP:
+			/* Nothing to update */
+			break;
+		default:
+			goto error;
+		}
+		break;
+	case DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP:
+		switch (new_state) {
+		case DLB2_QUEUE_UNMAP_IN_PROG:
+			/* Nothing to update */
+			break;
+		case DLB2_QUEUE_UNMAPPED:
+			/*
+			 * An UNMAP_IN_PROG_PENDING_MAP slot briefly
+			 * becomes UNMAPPED before it transitions to
+			 * MAP_IN_PROG.
+			 */
+			queue->num_mappings--;
+			port->num_mappings--;
+			port->num_pending_removals--;
+			domain->num_pending_removals--;
+			break;
+		default:
+			goto error;
+		}
+		break;
+	default:
+		goto error;
+	}
+
+	port->qid_map[slot].state = new_state;
+
+	DLB2_HW_DBG(hw,
+		    "[%s()] queue %d -> port %d state transition (%d -> %d)\n",
+		    __func__, queue->id.phys_id, port->id.phys_id,
+		    curr_state, new_state);
+	return 0;
+
+error:
+	DLB2_HW_ERR(hw,
+		    "[%s()] Internal error: invalid queue %d -> port %d state transition (%d -> %d)\n",
+		    __func__, queue->id.phys_id, port->id.phys_id,
+		    curr_state, new_state);
+	return -EFAULT;
+}
+
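+/* Program the domain's LDB and DIR credit counts in the per-VAS registers */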
+static void dlb2_configure_domain_credits(struct dlb2_hw *hw,
+					  struct dlb2_hw_domain *domain)
+{
+	union dlb2_chp_cfg_ldb_vas_crd r0 = { {0} };
+	union dlb2_chp_cfg_dir_vas_crd r1 = { {0} };
+
+	r0.field.count = domain->num_ldb_credits;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_CFG_LDB_VAS_CRD(domain->id.phys_id), r0.val);
+
+	r1.field.count = domain->num_dir_credits;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_CFG_DIR_VAS_CRD(domain->id.phys_id), r1.val);
+}
+
+static int
+dlb2_domain_attach_resources(struct dlb2_hw *hw,
+			     struct dlb2_function_resources *rsrcs,
+			     struct dlb2_hw_domain *domain,
+			     struct dlb2_create_sched_domain_args *args,
+			     struct dlb2_cmd_response *resp)
+{
+	int ret;
+
+	ret = dlb2_attach_ldb_queues(hw,
+				     rsrcs,
+				     domain,
+				     args->num_ldb_queues,
+				     resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_ldb_ports(hw,
+				    rsrcs,
+				    domain,
+				    args,
+				    resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_dir_ports(hw,
+				    rsrcs,
+				    domain,
+				    args->num_dir_ports,
+				    resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_ldb_credits(rsrcs,
+				      domain,
+				      args->num_ldb_credits,
+				      resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_dir_credits(rsrcs,
+				      domain,
+				      args->num_dir_credits,
+				      resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_domain_hist_list_entries(rsrcs,
+						   domain,
+						   args->num_hist_list_entries,
+						   resp);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_attach_atomic_inflights(rsrcs,
+					   domain,
+					   args->num_atomic_inflights,
+					   resp);
+	if (ret < 0)
+		return ret;
+
+	dlb2_configure_domain_credits(hw, domain);
+
+	domain->configured = true;
+
+	domain->started = false;
+
+	rsrcs->num_avail_domains--;
+
+	return 0;
+}
+
+static void dlb2_ldb_port_cq_enable(struct dlb2_hw *hw,
+				    struct dlb2_ldb_port *port)
+{
+	union dlb2_lsp_cq_ldb_dsbl reg;
+
+	/*
+	 * Don't re-enable the port's CQ if a removal is pending. The caller
+	 * should instead mark the port as enabled in software (if it isn't
+	 * already); the CQ will be re-enabled when the removal completes.
+	 */
+	if (port->num_pending_removals)
+		return;
+
+	reg.field.disabled = 0;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_DSBL(port->id.phys_id), reg.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_ldb_port_cq_disable(struct dlb2_hw *hw,
+				     struct dlb2_ldb_port *port)
+{
+	union dlb2_lsp_cq_ldb_dsbl reg;
+
+	reg.field.disabled = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_DSBL(port->id.phys_id), reg.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_dir_port_cq_enable(struct dlb2_hw *hw,
+				    struct dlb2_dir_pq_pair *port)
+{
+	union dlb2_lsp_cq_dir_dsbl reg;
+
+	reg.field.disabled = 0;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ_DIR_DSBL(port->id.phys_id), reg.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_dir_port_cq_disable(struct dlb2_hw *hw,
+				     struct dlb2_dir_pq_pair *port)
+{
+	union dlb2_lsp_cq_dir_dsbl reg;
+
+	reg.field.disabled = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ_DIR_DSBL(port->id.phys_id), reg.val);
+
+	dlb2_flush_csr(hw);
+}
+
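+/*
+ * Map a queue to one of the port's qid_map slots by programming the slot's
+ * priority and valid bits (CQ2PRIOV), its QID (CQ2QID0/1), and the
+ * QID-to-CQ-index bitmaps, then record the mapping in software.
+ */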
+static int dlb2_ldb_port_map_qid_static(struct dlb2_hw *hw,
+					struct dlb2_ldb_port *p,
+					struct dlb2_ldb_queue *q,
+					u8 priority)
+{
+	union dlb2_lsp_cq2priov r0;
+	union dlb2_lsp_cq2qid0 r1;
+	union dlb2_atm_qid2cqidix_00 r2;
+	union dlb2_lsp_qid2cqidix_00 r3;
+	union dlb2_lsp_qid2cqidix2_00 r4;
+	enum dlb2_qid_map_state state;
+	int i;
+
+	/* Look for a pending or already mapped slot, else an unused slot */
+	if (!dlb2_port_find_slot_queue(p, DLB2_QUEUE_MAP_IN_PROG, q, &i) &&
+	    !dlb2_port_find_slot_queue(p, DLB2_QUEUE_MAPPED, q, &i) &&
+	    !dlb2_port_find_slot(p, DLB2_QUEUE_UNMAPPED, &i)) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: CQ has no available QID mapping slots\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port slot tracking failed\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/* Read-modify-write the priority and valid bit register */
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ2PRIOV(p->id.phys_id));
+
+	r0.field.v |= 1 << i;
+	r0.field.prio |= (priority & 0x7) << i * 3;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(p->id.phys_id), r0.val);
+
+	/* Read-modify-write the QID map register */
+	if (i < 4)
+		r1.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ2QID0(p->id.phys_id));
+	else
+		r1.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ2QID1(p->id.phys_id));
+
+	if (i == 0 || i == 4)
+		r1.field.qid_p0 = q->id.phys_id;
+	if (i == 1 || i == 5)
+		r1.field.qid_p1 = q->id.phys_id;
+	if (i == 2 || i == 6)
+		r1.field.qid_p2 = q->id.phys_id;
+	if (i == 3 || i == 7)
+		r1.field.qid_p3 = q->id.phys_id;
+
+	if (i < 4)
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ2QID0(p->id.phys_id), r1.val);
+	else
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ2QID1(p->id.phys_id), r1.val);
+
+	r2.val = DLB2_CSR_RD(hw,
+			     DLB2_ATM_QID2CQIDIX(q->id.phys_id,
+						 p->id.phys_id / 4));
+
+	r3.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID2CQIDIX(q->id.phys_id,
+						 p->id.phys_id / 4));
+
+	r4.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID2CQIDIX2(q->id.phys_id,
+						  p->id.phys_id / 4));
+
+	switch (p->id.phys_id % 4) {
+	case 0:
+		r2.field.cq_p0 |= 1 << i;
+		r3.field.cq_p0 |= 1 << i;
+		r4.field.cq_p0 |= 1 << i;
+		break;
+
+	case 1:
+		r2.field.cq_p1 |= 1 << i;
+		r3.field.cq_p1 |= 1 << i;
+		r4.field.cq_p1 |= 1 << i;
+		break;
+
+	case 2:
+		r2.field.cq_p2 |= 1 << i;
+		r3.field.cq_p2 |= 1 << i;
+		r4.field.cq_p2 |= 1 << i;
+		break;
+
+	case 3:
+		r2.field.cq_p3 |= 1 << i;
+		r3.field.cq_p3 |= 1 << i;
+		r4.field.cq_p3 |= 1 << i;
+		break;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_ATM_QID2CQIDIX(q->id.phys_id, p->id.phys_id / 4),
+		    r2.val);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID2CQIDIX(q->id.phys_id, p->id.phys_id / 4),
+		    r3.val);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID2CQIDIX2(q->id.phys_id, p->id.phys_id / 4),
+		    r4.val);
+
+	dlb2_flush_csr(hw);
+
+	p->qid_map[i].qid = q->id.phys_id;
+	p->qid_map[i].priority = priority;
+
+	state = DLB2_QUEUE_MAPPED;
+
+	return dlb2_port_slot_state_transition(hw, p, q, i, state);
+}
+
+static int dlb2_ldb_port_set_has_work_bits(struct dlb2_hw *hw,
+					   struct dlb2_ldb_port *port,
+					   struct dlb2_ldb_queue *queue,
+					   int slot)
+{
+	union dlb2_lsp_qid_aqed_active_cnt r0;
+	union dlb2_lsp_qid_ldb_enqueue_cnt r1;
+	union dlb2_lsp_ldb_sched_ctrl r2 = { {0} };
+
+	/* Set the atomic scheduling haswork bit */
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_AQED_ACTIVE_CNT(queue->id.phys_id));
+
+	r2.field.cq = port->id.phys_id;
+	r2.field.qidix = slot;
+	r2.field.value = 1;
+	r2.field.rlist_haswork_v = r0.field.count > 0;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r2.val);
+
+	/* Set the non-atomic scheduling haswork bit */
+	r1.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_LDB_ENQUEUE_CNT(queue->id.phys_id));
+
+	memset(&r2, 0, sizeof(r2));
+
+	r2.field.cq = port->id.phys_id;
+	r2.field.qidix = slot;
+	r2.field.value = 1;
+	r2.field.nalb_haswork_v = (r1.field.count > 0);
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r2.val);
+
+	dlb2_flush_csr(hw);
+
+	return 0;
+}
+
+static void dlb2_ldb_port_clear_has_work_bits(struct dlb2_hw *hw,
+					      struct dlb2_ldb_port *port,
+					      u8 slot)
+{
+	union dlb2_lsp_ldb_sched_ctrl r2 = { {0} };
+
+	r2.field.cq = port->id.phys_id;
+	r2.field.qidix = slot;
+	r2.field.value = 0;
+	r2.field.rlist_haswork_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r2.val);
+
+	memset(&r2, 0, sizeof(r2));
+
+	r2.field.cq = port->id.phys_id;
+	r2.field.qidix = slot;
+	r2.field.value = 0;
+	r2.field.nalb_haswork_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r2.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_ldb_port_clear_queue_if_status(struct dlb2_hw *hw,
+						struct dlb2_ldb_port *port,
+						int slot)
+{
+	union dlb2_lsp_ldb_sched_ctrl r0 = { {0} };
+
+	r0.field.cq = port->id.phys_id;
+	r0.field.qidix = slot;
+	r0.field.value = 0;
+	r0.field.inflight_ok_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r0.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_ldb_port_set_queue_if_status(struct dlb2_hw *hw,
+					      struct dlb2_ldb_port *port,
+					      int slot)
+{
+	union dlb2_lsp_ldb_sched_ctrl r0 = { {0} };
+
+	r0.field.cq = port->id.phys_id;
+	r0.field.qidix = slot;
+	r0.field.value = 1;
+	r0.field.inflight_ok_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_LDB_SCHED_CTRL, r0.val);
+
+	dlb2_flush_csr(hw);
+}
+
+static void dlb2_ldb_queue_set_inflight_limit(struct dlb2_hw *hw,
+					      struct dlb2_ldb_queue *queue)
+{
+	union dlb2_lsp_qid_ldb_infl_lim r0 = { {0} };
+
+	r0.field.limit = queue->num_qid_inflights;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), r0.val);
+}
+
+static void dlb2_ldb_queue_clear_inflight_limit(struct dlb2_hw *hw,
+						struct dlb2_ldb_queue *queue)
+{
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID_LDB_INFL_LIM(queue->id.phys_id),
+		    DLB2_LSP_QID_LDB_INFL_LIM_RST);
+}
+
+/*
+ * dlb2_ldb_queue_{enable, disable}_mapped_cqs() don't operate exactly as
+ * their function names imply: they only touch CQs that are both mapped to
+ * the queue and enabled in software (port->enabled). They should only be
+ * called by the dynamic CQ mapping code.
+ */
+static void dlb2_ldb_queue_disable_mapped_cqs(struct dlb2_hw *hw,
+					      struct dlb2_hw_domain *domain,
+					      struct dlb2_ldb_queue *queue)
+{
+	struct dlb2_ldb_port *port;
+	int slot, i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			enum dlb2_qid_map_state state = DLB2_QUEUE_MAPPED;
+
+			if (!dlb2_port_find_slot_queue(port, state,
+						       queue, &slot))
+				continue;
+
+			if (port->enabled)
+				dlb2_ldb_port_cq_disable(hw, port);
+		}
+	}
+}
+
+static void dlb2_ldb_queue_enable_mapped_cqs(struct dlb2_hw *hw,
+					     struct dlb2_hw_domain *domain,
+					     struct dlb2_ldb_queue *queue)
+{
+	struct dlb2_ldb_port *port;
+	int slot, i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			enum dlb2_qid_map_state state = DLB2_QUEUE_MAPPED;
+
+			if (!dlb2_port_find_slot_queue(port, state,
+						       queue, &slot))
+				continue;
+
+			if (port->enabled)
+				dlb2_ldb_port_cq_enable(hw, port);
+		}
+	}
+}
+
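+/*
+ * Second half of a dynamic map: once the QID's inflight count has drained
+ * to zero, statically map the slot, restore the queue's inflight limit and
+ * IF_status, and re-enable the CQs disabled while the map was in progress.
+ */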
+static int dlb2_ldb_port_finish_map_qid_dynamic(struct dlb2_hw *hw,
+						struct dlb2_hw_domain *domain,
+						struct dlb2_ldb_port *port,
+						struct dlb2_ldb_queue *queue)
+{
+	union dlb2_lsp_qid_ldb_infl_cnt r0;
+	enum dlb2_qid_map_state state;
+	int slot, ret, i;
+	u8 prio;
+
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+	if (r0.field.count) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: non-zero QID inflight count\n",
+			    __func__);
+		return -EINVAL;
+	}
+
+	/*
+	 * Statically map the queue to the port and set the corresponding
+	 * has_work bits.
+	 */
+	state = DLB2_QUEUE_MAP_IN_PROG;
+	if (!dlb2_port_find_slot_queue(port, state, queue, &slot))
+		return -EINVAL;
+
+	if (slot >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port slot tracking failed\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	prio = port->qid_map[slot].priority;
+
+	/*
+	 * Update the CQ2QID, CQ2PRIOV, and QID2CQIDX registers, and
+	 * the port's qid_map state.
+	 */
+	ret = dlb2_ldb_port_map_qid_static(hw, port, queue, prio);
+	if (ret)
+		return ret;
+
+	ret = dlb2_ldb_port_set_has_work_bits(hw, port, queue, slot);
+	if (ret)
+		return ret;
+
+	/*
+	 * Ensure IF_status(cq,qid) is 0 before enabling the port to
+	 * prevent spurious schedules from causing the queue's inflight
+	 * count to increase.
+	 */
+	dlb2_ldb_port_clear_queue_if_status(hw, port, slot);
+
+	/* Reset the queue's inflight status */
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			state = DLB2_QUEUE_MAPPED;
+			if (!dlb2_port_find_slot_queue(port, state,
+						       queue, &slot))
+				continue;
+
+			dlb2_ldb_port_set_queue_if_status(hw, port, slot);
+		}
+	}
+
+	dlb2_ldb_queue_set_inflight_limit(hw, queue);
+
+	/* Re-enable CQs mapped to this queue */
+	dlb2_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+	/* If this queue has other mappings pending, clear its inflight limit */
+	if (queue->num_pending_additions > 0)
+		dlb2_ldb_queue_clear_inflight_limit(hw, queue);
+
+	return 0;
+}
+
+/**
+ * dlb2_ldb_port_map_qid_dynamic() - perform a "dynamic" QID->CQ mapping
+ * @hw: dlb2_hw handle for a particular device.
+ * @port: load-balanced port
+ * @queue: load-balanced queue
+ * @priority: queue servicing priority
+ *
+ * Returns 0 if the queue was mapped, 1 if the mapping is scheduled to occur
+ * at a later point, and <0 if an error occurred.
+ */
+static int dlb2_ldb_port_map_qid_dynamic(struct dlb2_hw *hw,
+					 struct dlb2_ldb_port *port,
+					 struct dlb2_ldb_queue *queue,
+					 u8 priority)
+{
+	union dlb2_lsp_qid_ldb_infl_cnt r0 = { {0} };
+	enum dlb2_qid_map_state state;
+	struct dlb2_hw_domain *domain;
+	int domain_id, slot, ret;
+
+	domain_id = port->domain_id.phys_id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, false, 0);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: unable to find domain %d\n",
+			    __func__, port->domain_id.phys_id);
+		return -EINVAL;
+	}
+
+	/*
+	 * Set the QID inflight limit to 0 to prevent further scheduling of the
+	 * queue.
+	 */
+	DLB2_CSR_WR(hw, DLB2_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), 0);
+
+	if (!dlb2_port_find_slot(port, DLB2_QUEUE_UNMAPPED, &slot)) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: no available unmapped slots\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	if (slot >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port slot tracking failed\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	port->qid_map[slot].qid = queue->id.phys_id;
+	port->qid_map[slot].priority = priority;
+
+	state = DLB2_QUEUE_MAP_IN_PROG;
+	ret = dlb2_port_slot_state_transition(hw, port, queue, slot, state);
+	if (ret)
+		return ret;
+
+	/*
+	 * Disable the affected CQ, and the CQs already mapped to the QID,
+	 * before reading the QID's inflight count a second time. There is an
+	 * unlikely race in which the QID may schedule one more QE after we
+	 * read an inflight count of 0, and disabling the CQs guarantees that
+	 * the race will not occur after a re-read of the inflight count
+	 * register.
+	 */
+	if (port->enabled)
+		dlb2_ldb_port_cq_disable(hw, port);
+
+	dlb2_ldb_queue_disable_mapped_cqs(hw, domain, queue);
+
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+	if (r0.field.count) {
+		if (port->enabled)
+			dlb2_ldb_port_cq_enable(hw, port);
+
+		dlb2_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+		return 1;
+	}
+
+	return dlb2_ldb_port_finish_map_qid_dynamic(hw, domain, port, queue);
+}
+
+static int dlb2_ldb_port_map_qid(struct dlb2_hw *hw,
+				 struct dlb2_hw_domain *domain,
+				 struct dlb2_ldb_port *port,
+				 struct dlb2_ldb_queue *queue,
+				 u8 prio)
+{
+	if (domain->started)
+		return dlb2_ldb_port_map_qid_dynamic(hw, port, queue, prio);
+	else
+		return dlb2_ldb_port_map_qid_static(hw, port, queue, prio);
+}
+
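+/*
+ * Remove a queue-to-port mapping: clear the slot's valid bit and the CQ's
+ * bits in the QID-to-CQ-index bitmaps, then transition the slot to the
+ * unmapped state.
+ */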
+static int dlb2_ldb_port_unmap_qid(struct dlb2_hw *hw,
+				   struct dlb2_ldb_port *port,
+				   struct dlb2_ldb_queue *queue)
+{
+	enum dlb2_qid_map_state mapped, in_progress, pending_map, unmapped;
+	union dlb2_lsp_cq2priov r0;
+	union dlb2_atm_qid2cqidix_00 r1;
+	union dlb2_lsp_qid2cqidix_00 r2;
+	union dlb2_lsp_qid2cqidix2_00 r3;
+	u32 queue_id;
+	u32 port_id;
+	int i;
+
+	/* Find the queue's slot */
+	mapped = DLB2_QUEUE_MAPPED;
+	in_progress = DLB2_QUEUE_UNMAP_IN_PROG;
+	pending_map = DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP;
+
+	if (!dlb2_port_find_slot_queue(port, mapped, queue, &i) &&
+	    !dlb2_port_find_slot_queue(port, in_progress, queue, &i) &&
+	    !dlb2_port_find_slot_queue(port, pending_map, queue, &i)) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: QID %d isn't mapped\n",
+			    __func__, __LINE__, queue->id.phys_id);
+		return -EFAULT;
+	}
+
+	if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port slot tracking failed\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	port_id = port->id.phys_id;
+	queue_id = queue->id.phys_id;
+
+	/* Read-modify-write the priority and valid bit register */
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ2PRIOV(port_id));
+
+	r0.field.v &= ~(1 << i);
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(port_id), r0.val);
+
+	r1.val = DLB2_CSR_RD(hw,
+			     DLB2_ATM_QID2CQIDIX(queue_id, port_id / 4));
+
+	r2.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID2CQIDIX(queue_id, port_id / 4));
+
+	r3.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID2CQIDIX2(queue_id, port_id / 4));
+
+	switch (port_id % 4) {
+	case 0:
+		r1.field.cq_p0 &= ~(1 << i);
+		r2.field.cq_p0 &= ~(1 << i);
+		r3.field.cq_p0 &= ~(1 << i);
+		break;
+
+	case 1:
+		r1.field.cq_p1 &= ~(1 << i);
+		r2.field.cq_p1 &= ~(1 << i);
+		r3.field.cq_p1 &= ~(1 << i);
+		break;
+
+	case 2:
+		r1.field.cq_p2 &= ~(1 << i);
+		r2.field.cq_p2 &= ~(1 << i);
+		r3.field.cq_p2 &= ~(1 << i);
+		break;
+
+	case 3:
+		r1.field.cq_p3 &= ~(1 << i);
+		r2.field.cq_p3 &= ~(1 << i);
+		r3.field.cq_p3 &= ~(1 << i);
+		break;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_ATM_QID2CQIDIX(queue_id, port_id / 4),
+		    r1.val);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID2CQIDIX(queue_id, port_id / 4),
+		    r2.val);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID2CQIDIX2(queue_id, port_id / 4),
+		    r3.val);
+
+	dlb2_flush_csr(hw);
+
+	unmapped = DLB2_QUEUE_UNMAPPED;
+
+	return dlb2_port_slot_state_transition(hw, port, queue, i, unmapped);
+}
+
+static void
+dlb2_log_create_sched_domain_args(struct dlb2_hw *hw,
+				  struct dlb2_create_sched_domain_args *args,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 create sched domain arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tNumber of LDB queues:          %d\n",
+		    args->num_ldb_queues);
+	DLB2_HW_DBG(hw, "\tNumber of LDB ports (any CoS): %d\n",
+		    args->num_ldb_ports);
+	DLB2_HW_DBG(hw, "\tNumber of LDB ports (CoS 0):   %d\n",
+		    args->num_cos_ldb_ports[0]);
+	DLB2_HW_DBG(hw, "\tNumber of LDB ports (CoS 1):   %d\n",
+		    args->num_cos_ldb_ports[1]);
+	DLB2_HW_DBG(hw, "\tNumber of LDB ports (CoS 2):   %d\n",
+		    args->num_cos_ldb_ports[2]);
+	DLB2_HW_DBG(hw, "\tNumber of LDB ports (CoS 3):   %d\n",
+		    args->num_cos_ldb_ports[3]);
+	DLB2_HW_DBG(hw, "\tStrict CoS allocation:         %d\n",
+		    args->cos_strict);
+	DLB2_HW_DBG(hw, "\tNumber of DIR ports:           %d\n",
+		    args->num_dir_ports);
+	DLB2_HW_DBG(hw, "\tNumber of ATM inflights:       %d\n",
+		    args->num_atomic_inflights);
+	DLB2_HW_DBG(hw, "\tNumber of hist list entries:   %d\n",
+		    args->num_hist_list_entries);
+	DLB2_HW_DBG(hw, "\tNumber of LDB credits:         %d\n",
+		    args->num_ldb_credits);
+	DLB2_HW_DBG(hw, "\tNumber of DIR credits:         %d\n",
+		    args->num_dir_credits);
+}
+
+/**
+ * dlb2_hw_create_sched_domain() - Allocate and initialize a DLB scheduling
+ *	domain and its resources.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb2_hw_create_sched_domain(struct dlb2_hw *hw,
+				struct dlb2_create_sched_domain_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_req,
+				unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_hw_domain *domain;
+	int ret;
+
+	rsrcs = (vdev_req) ? &hw->vdev[vdev_id] : &hw->pf;
+
+	dlb2_log_create_sched_domain_args(hw, args, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_create_sched_dom_args(rsrcs, args, resp);
+	if (ret)
+		return ret;
+
+	domain = DLB2_FUNC_LIST_HEAD(rsrcs->avail_domains, typeof(*domain));
+
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no available domains\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	if (domain->configured) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: avail_domains contains configured domains.\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	dlb2_init_domain_rsrc_lists(domain);
+
+	ret = dlb2_domain_attach_resources(hw, rsrcs, domain, args, resp);
+	if (ret < 0) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: failed to attach the domain's resources.\n",
+			    __func__);
+
+		return ret;
+	}
+
+	list_move(&domain->func_list, &rsrcs->used_domains);
+
+	resp->id = (vdev_req) ? domain->id.virt_id : domain->id.phys_id;
+	resp->status = 0;
+
+	return 0;
+}
+
+static void
+dlb2_domain_finish_unmap_port_slot(struct dlb2_hw *hw,
+				   struct dlb2_hw_domain *domain,
+				   struct dlb2_ldb_port *port,
+				   int slot)
+{
+	enum dlb2_qid_map_state state;
+	struct dlb2_ldb_queue *queue;
+
+	queue = &hw->rsrcs.ldb_queues[port->qid_map[slot].qid];
+
+	state = port->qid_map[slot].state;
+
+	/* Update the QID2CQIDX and CQ2QID vectors */
+	dlb2_ldb_port_unmap_qid(hw, port, queue);
+
+	/*
+	 * Ensure the QID will not be serviced by this {CQ, slot} by clearing
+	 * the has_work bits
+	 */
+	dlb2_ldb_port_clear_has_work_bits(hw, port, slot);
+
+	/* Reset the {CQ, slot} to its default state */
+	dlb2_ldb_port_set_queue_if_status(hw, port, slot);
+
+	/* Re-enable the CQ if it wasn't manually disabled by the user */
+	if (port->enabled)
+		dlb2_ldb_port_cq_enable(hw, port);
+
+	/*
+	 * If there is a mapping that is pending this slot's removal, perform
+	 * the mapping now.
+	 */
+	if (state == DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP) {
+		struct dlb2_ldb_port_qid_map *map;
+		struct dlb2_ldb_queue *map_queue;
+		u8 prio;
+
+		map = &port->qid_map[slot];
+
+		map->qid = map->pending_qid;
+		map->priority = map->pending_priority;
+
+		map_queue = &hw->rsrcs.ldb_queues[map->qid];
+		prio = map->priority;
+
+		dlb2_ldb_port_map_qid(hw, domain, port, map_queue, prio);
+	}
+}
+
+static bool dlb2_domain_finish_unmap_port(struct dlb2_hw *hw,
+					  struct dlb2_hw_domain *domain,
+					  struct dlb2_ldb_port *port)
+{
+	union dlb2_lsp_cq_ldb_infl_cnt r0;
+	int i;
+
+	if (port->num_pending_removals == 0)
+		return false;
+
+	/*
+	 * The unmap requires all the CQ's outstanding inflights to be
+	 * completed.
+	 */
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ_LDB_INFL_CNT(port->id.phys_id));
+	if (r0.field.count > 0)
+		return false;
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+		struct dlb2_ldb_port_qid_map *map;
+
+		map = &port->qid_map[i];
+
+		if (map->state != DLB2_QUEUE_UNMAP_IN_PROG &&
+		    map->state != DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP)
+			continue;
+
+		dlb2_domain_finish_unmap_port_slot(hw, domain, port, i);
+	}
+
+	return true;
+}
+
+static unsigned int
+dlb2_domain_finish_unmap_qid_procedures(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (!domain->configured || domain->num_pending_removals == 0)
+		return 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			dlb2_domain_finish_unmap_port(hw, domain, port);
+	}
+
+	return domain->num_pending_removals;
+}
+
+static void dlb2_domain_finish_map_port(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain,
+					struct dlb2_ldb_port *port)
+{
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+		union dlb2_lsp_qid_ldb_infl_cnt r0;
+		struct dlb2_ldb_queue *queue;
+		int qid;
+
+		if (port->qid_map[i].state != DLB2_QUEUE_MAP_IN_PROG)
+			continue;
+
+		qid = port->qid_map[i].qid;
+
+		queue = dlb2_get_ldb_queue_from_id(hw, qid, false, 0);
+
+		if (!queue) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: unable to find queue %d\n",
+				    __func__, qid);
+			continue;
+		}
+
+		r0.val = DLB2_CSR_RD(hw, DLB2_LSP_QID_LDB_INFL_CNT(qid));
+
+		if (r0.field.count)
+			continue;
+
+		/*
+		 * Disable the affected CQ, and the CQs already mapped to the
+		 * QID, before reading the QID's inflight count a second time.
+		 * There is an unlikely race in which the QID may schedule one
+		 * more QE after we read an inflight count of 0, and disabling
+		 * the CQs guarantees that the race will not occur after a
+		 * re-read of the inflight count register.
+		 */
+		if (port->enabled)
+			dlb2_ldb_port_cq_disable(hw, port);
+
+		dlb2_ldb_queue_disable_mapped_cqs(hw, domain, queue);
+
+		r0.val = DLB2_CSR_RD(hw, DLB2_LSP_QID_LDB_INFL_CNT(qid));
+
+		if (r0.field.count) {
+			if (port->enabled)
+				dlb2_ldb_port_cq_enable(hw, port);
+
+			dlb2_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+			continue;
+		}
+
+		dlb2_ldb_port_finish_map_qid_dynamic(hw, domain, port, queue);
+	}
+}
+
+static unsigned int
+dlb2_domain_finish_map_qid_procedures(struct dlb2_hw *hw,
+				      struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (!domain->configured || domain->num_pending_additions == 0)
+		return 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			dlb2_domain_finish_map_port(hw, domain, port);
+	}
+
+	return domain->num_pending_additions;
+}
+
+static u32 dlb2_ldb_cq_inflight_count(struct dlb2_hw *hw,
+				      struct dlb2_ldb_port *port)
+{
+	union dlb2_lsp_cq_ldb_infl_cnt r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ_LDB_INFL_CNT(port->id.phys_id));
+
+	return r0.field.count;
+}
+
+static u32 dlb2_ldb_cq_token_count(struct dlb2_hw *hw,
+				   struct dlb2_ldb_port *port)
+{
+	union dlb2_lsp_cq_ldb_tkn_cnt r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ_LDB_TKN_CNT(port->id.phys_id));
+
+	/*
+	 * Subtract the initial token count, which the driver seeds in order
+	 * to support CQ depths of less than 8.
+	 */
+	return r0.field.token_count - port->init_tkn_cnt;
+}
+
+static void __iomem *dlb2_producer_port_addr(struct dlb2_hw *hw,
+					     u8 port_id,
+					     bool is_ldb)
+{
+	struct dlb2_dev *dlb2_dev;
+	unsigned long size;
+	uintptr_t address;
+
+	dlb2_dev = container_of(hw, struct dlb2_dev, hw);
+
+	address = (uintptr_t)dlb2_dev->hw.func_kva;
+
+	if (is_ldb) {
+		size = DLB2_LDB_PP_STRIDE;
+		address += DLB2_DRV_LDB_PP_BASE + size * port_id;
+	} else {
+		size = DLB2_DIR_PP_STRIDE;
+		address += DLB2_DRV_DIR_PP_BASE + size * port_id;
+	}
+
+	return (void __iomem *)address;
+}
+
+static void dlb2_fence_hcw(void __iomem *addr)
+{
+	/*
+	 * To ensure outstanding HCWs reach the device before subsequent device
+	 * accesses, fence them.
+	 */
+	mb();
+}
+
+static int dlb2_drain_ldb_cq(struct dlb2_hw *hw, struct dlb2_ldb_port *port)
+{
+	u32 infl_cnt, tkn_cnt;
+	unsigned int i;
+
+	infl_cnt = dlb2_ldb_cq_inflight_count(hw, port);
+	tkn_cnt = dlb2_ldb_cq_token_count(hw, port);
+
+	if (infl_cnt || tkn_cnt) {
+		struct dlb2_hcw hcw_mem[8], *hcw;
+		void __iomem *pp_addr;
+
+		pp_addr = dlb2_producer_port_addr(hw, port->id.phys_id, true);
+
+		/* Point hcw to a 64B-aligned location */
+		hcw = (struct dlb2_hcw *)((uintptr_t)&hcw_mem[4] & ~0x3F);
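+		/*
+		 * This rounding cannot underrun hcw_mem: assuming 16B HCWs
+		 * (matching the 8B data + 8B metadata QE format), the array
+		 * spans 128B, so the 64B-aligned address at or below
+		 * &hcw_mem[4] leaves at least four HCWs of valid storage.
+		 */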
+
+		/*
+		 * Program the first HCW for a completion and token return and
+		 * the other HCWs as NOOPS
+		 */
+
+		memset(hcw, 0, 4 * sizeof(*hcw));
+		hcw->qe_comp = (infl_cnt > 0);
+		hcw->cq_token = (tkn_cnt > 0);
+		hcw->lock_id = tkn_cnt - 1;
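+		/*
+		 * For a batch token return, lock_id carries the token count
+		 * minus one (as in dlb2_drain_dir_cq()); the field is
+		 * ignored when cq_token is 0.
+		 */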
+
+		/* Return tokens in the first HCW */
+		iosubmit_cmds512(pp_addr, hcw, 1);
+
+		hcw->cq_token = 0;
+
+		/* Issue remaining completions (if any) */
+		for (i = 1; i < infl_cnt; i++)
+			iosubmit_cmds512(pp_addr, hcw, 1);
+
+		dlb2_fence_hcw(pp_addr);
+	}
+
+	return 0;
+}
+
+static int dlb2_domain_wait_for_ldb_cqs_to_empty(struct dlb2_hw *hw,
+						 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			int j;
+
+			for (j = 0; j < DLB2_MAX_CQ_COMP_CHECK_LOOPS; j++) {
+				if (dlb2_ldb_cq_inflight_count(hw, port) == 0)
+					break;
+			}
+
+			if (j == DLB2_MAX_CQ_COMP_CHECK_LOOPS) {
+				DLB2_HW_ERR(hw,
+					    "[%s()] Internal error: failed to flush load-balanced port %d's completions.\n",
+					    __func__, port->id.phys_id);
+				return -EFAULT;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static int dlb2_domain_reset_software_state(struct dlb2_hw *hw,
+					    struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *tmp_dir_port;
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_ldb_queue *tmp_ldb_queue;
+	struct dlb2_ldb_port *tmp_ldb_port;
+	struct dlb2_dir_pq_pair *dir_port;
+	struct dlb2_ldb_queue *ldb_queue;
+	struct dlb2_ldb_port *ldb_port;
+	struct list_head *list;
+	int ret, i;
+
+	rsrcs = domain->parent_func;
+
+	/* Move the domain's ldb queues to the function's avail list */
+	list = &domain->used_ldb_queues;
+	DLB2_DOM_LIST_FOR_SAFE(*list, ldb_queue, tmp_ldb_queue) {
+		if (ldb_queue->sn_cfg_valid) {
+			struct dlb2_sn_group *grp;
+
+			grp = &hw->rsrcs.sn_groups[ldb_queue->sn_group];
+
+			dlb2_sn_group_free_slot(grp, ldb_queue->sn_slot);
+			ldb_queue->sn_cfg_valid = false;
+		}
+
+		ldb_queue->owned = false;
+		ldb_queue->num_mappings = 0;
+		ldb_queue->num_pending_additions = 0;
+
+		list_del(&ldb_queue->domain_list);
+		list_add(&ldb_queue->func_list, &rsrcs->avail_ldb_queues);
+		rsrcs->num_avail_ldb_queues++;
+	}
+
+	list = &domain->avail_ldb_queues;
+	DLB2_DOM_LIST_FOR_SAFE(*list, ldb_queue, tmp_ldb_queue) {
+		ldb_queue->owned = false;
+
+		list_del(&ldb_queue->domain_list);
+		list_add(&ldb_queue->func_list, &rsrcs->avail_ldb_queues);
+		rsrcs->num_avail_ldb_queues++;
+	}
+
+	/* Move the domain's ldb ports to the function's avail list */
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		list = &domain->used_ldb_ports[i];
+		DLB2_DOM_LIST_FOR_SAFE(*list, ldb_port, tmp_ldb_port) {
+			int j;
+
+			ldb_port->owned = false;
+			ldb_port->configured = false;
+			ldb_port->num_pending_removals = 0;
+			ldb_port->num_mappings = 0;
+			ldb_port->init_tkn_cnt = 0;
+			for (j = 0; j < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; j++)
+				ldb_port->qid_map[j].state =
+					DLB2_QUEUE_UNMAPPED;
+
+			list_del(&ldb_port->domain_list);
+			list_add(&ldb_port->func_list,
+				 &rsrcs->avail_ldb_ports[i]);
+			rsrcs->num_avail_ldb_ports[i]++;
+		}
+
+		list = &domain->avail_ldb_ports[i];
+		DLB2_DOM_LIST_FOR_SAFE(*list, ldb_port, tmp_ldb_port) {
+			ldb_port->owned = false;
+
+			list_del(&ldb_port->domain_list);
+			list_add(&ldb_port->func_list,
+				 &rsrcs->avail_ldb_ports[i]);
+			rsrcs->num_avail_ldb_ports[i]++;
+		}
+	}
+
+	/* Move the domain's dir ports to the function's avail list */
+	list = &domain->used_dir_pq_pairs;
+	DLB2_DOM_LIST_FOR_SAFE(*list, dir_port, tmp_dir_port) {
+		dir_port->owned = false;
+		dir_port->port_configured = false;
+		dir_port->init_tkn_cnt = 0;
+
+		list_del(&dir_port->domain_list);
+
+		list_add(&dir_port->func_list, &rsrcs->avail_dir_pq_pairs);
+		rsrcs->num_avail_dir_pq_pairs++;
+	}
+
+	list = &domain->avail_dir_pq_pairs;
+	DLB2_DOM_LIST_FOR_SAFE(*list, dir_port, tmp_dir_port) {
+		dir_port->owned = false;
+
+		list_del(&dir_port->domain_list);
+
+		list_add(&dir_port->func_list, &rsrcs->avail_dir_pq_pairs);
+		rsrcs->num_avail_dir_pq_pairs++;
+	}
+
+	/* Return hist list entries to the function */
+	ret = dlb2_bitmap_set_range(rsrcs->avail_hist_list_entries,
+				    domain->hist_list_entry_base,
+				    domain->total_hist_list_entries);
+	if (ret) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: domain hist list base doesn't match the function's bitmap.\n",
+			    __func__);
+		return ret;
+	}
+
+	domain->total_hist_list_entries = 0;
+	domain->avail_hist_list_entries = 0;
+	domain->hist_list_entry_base = 0;
+	domain->hist_list_entry_offset = 0;
+
+	rsrcs->num_avail_qed_entries += domain->num_ldb_credits;
+	domain->num_ldb_credits = 0;
+
+	rsrcs->num_avail_dqed_entries += domain->num_dir_credits;
+	domain->num_dir_credits = 0;
+
+	rsrcs->num_avail_aqed_entries += domain->num_avail_aqed_entries;
+	rsrcs->num_avail_aqed_entries += domain->num_used_aqed_entries;
+	domain->num_avail_aqed_entries = 0;
+	domain->num_used_aqed_entries = 0;
+
+	domain->num_pending_removals = 0;
+	domain->num_pending_additions = 0;
+	domain->configured = false;
+	domain->started = false;
+
+	/*
+	 * Move the domain out of the used_domains list and back to the
+	 * function's avail_domains list.
+	 */
+	list_move(&domain->func_list, &rsrcs->avail_domains);
+	rsrcs->num_avail_domains++;
+
+	return 0;
+}
+
+static u32 dlb2_dir_queue_depth(struct dlb2_hw *hw,
+				struct dlb2_dir_pq_pair *queue)
+{
+	union dlb2_lsp_qid_dir_enqueue_cnt r0;
+
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_DIR_ENQUEUE_CNT(queue->id.phys_id));
+
+	return r0.field.count;
+}
+
+static bool dlb2_dir_queue_is_empty(struct dlb2_hw *hw,
+				    struct dlb2_dir_pq_pair *queue)
+{
+	return dlb2_dir_queue_depth(hw, queue) == 0;
+}
+
+static u32 dlb2_ldb_queue_depth(struct dlb2_hw *hw,
+				struct dlb2_ldb_queue *queue)
+{
+	union dlb2_lsp_qid_aqed_active_cnt r0;
+	union dlb2_lsp_qid_atm_active r1;
+	union dlb2_lsp_qid_ldb_enqueue_cnt r2;
+
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_AQED_ACTIVE_CNT(queue->id.phys_id));
+	r1.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_ATM_ACTIVE(queue->id.phys_id));
+
+	r2.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_LDB_ENQUEUE_CNT(queue->id.phys_id));
+
+	return r0.field.count + r1.field.count + r2.field.count;
+}
+
+static bool dlb2_ldb_queue_is_empty(struct dlb2_hw *hw,
+				    struct dlb2_ldb_queue *queue)
+{
+	return dlb2_ldb_queue_depth(hw, queue) == 0;
+}
+
+static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
+						   struct dlb2_ldb_port *port)
+{
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_PP2VAS(port->id.phys_id),
+		    DLB2_SYS_LDB_PP2VAS_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ2VAS(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ2VAS_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_PP2VDEV(port->id.phys_id),
+		    DLB2_SYS_LDB_PP2VDEV_RST);
+
+	if (port->id.vdev_owned) {
+		unsigned int offs;
+		u32 virt_id;
+
+		/*
+		 * DLB uses producer port address bits 17:12 to determine the
+		 * producer port ID. In Scalable IOV mode, PP accesses come
+		 * through the PF MMIO window for the physical producer port,
+		 * so for translation purposes the virtual and physical port
+		 * IDs are equal.
+		 */
+		if (hw->virt_mode == DLB2_VIRT_SRIOV)
+			virt_id = port->id.virt_id;
+		else
+			virt_id = port->id.phys_id;
+
+		offs = port->id.vdev_id * DLB2_MAX_NUM_LDB_PORTS + virt_id;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_VF_LDB_VPP2PP(offs),
+			    DLB2_SYS_VF_LDB_VPP2PP_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_VF_LDB_VPP_V(offs),
+			    DLB2_SYS_VF_LDB_VPP_V_RST);
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_PP_V(port->id.phys_id),
+		    DLB2_SYS_LDB_PP_V_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_DSBL(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_DSBL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_DEPTH(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_DEPTH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_INFL_LIM(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_INFL_LIM_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_HIST_LIST_LIM(port->id.phys_id),
+		    DLB2_CHP_HIST_LIST_LIM_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_HIST_LIST_BASE(port->id.phys_id),
+		    DLB2_CHP_HIST_LIST_BASE_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_HIST_LIST_POP_PTR(port->id.phys_id),
+		    DLB2_CHP_HIST_LIST_POP_PTR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_HIST_LIST_PUSH_PTR(port->id.phys_id),
+		    DLB2_CHP_HIST_LIST_PUSH_PTR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_INT_DEPTH_THRSH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_TMR_THRSH(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_TMR_THRSH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_INT_ENB(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_INT_ENB_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_ISR(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ_ISR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_TKN_DEPTH_SEL(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_TKN_DEPTH_SEL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_TKN_DEPTH_SEL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_WPTR(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_WPTR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_TKN_CNT(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_TKN_CNT_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_ADDR_L(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ_ADDR_L_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_ADDR_U(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ_ADDR_U_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_AT(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ_AT_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_PASID(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ_PASID_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ2VF_PF_RO(port->id.phys_id),
+		    DLB2_SYS_LDB_CQ2VF_PF_RO_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_TOT_SCH_CNTL(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_TOT_SCH_CNTL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_TOT_SCH_CNTH(port->id.phys_id),
+		    DLB2_LSP_CQ_LDB_TOT_SCH_CNTH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ2QID0(port->id.phys_id),
+		    DLB2_LSP_CQ2QID0_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ2QID1(port->id.phys_id),
+		    DLB2_LSP_CQ2QID1_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ2PRIOV(port->id.phys_id),
+		    DLB2_LSP_CQ2PRIOV_RST);
+}
+
+static void dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
+						 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			__dlb2_domain_reset_ldb_port_registers(hw, port);
+	}
+}
+
+static void
+__dlb2_domain_reset_dir_port_registers(struct dlb2_hw *hw,
+				       struct dlb2_dir_pq_pair *port)
+{
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ2VAS(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ2VAS_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_DSBL(port->id.phys_id),
+		    DLB2_LSP_CQ_DIR_DSBL_RST);
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_OPT_CLR, port->id.phys_id);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_DEPTH(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_DEPTH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_INT_DEPTH_THRSH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_TMR_THRSH(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_TMR_THRSH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_INT_ENB(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_INT_ENB_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_ISR(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_ISR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(port->id.phys_id),
+		    DLB2_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_TKN_DEPTH_SEL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_WPTR(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_WPTR_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_TKN_CNT(port->id.phys_id),
+		    DLB2_LSP_CQ_DIR_TKN_CNT_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_ADDR_L(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_ADDR_L_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_ADDR_U(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_ADDR_U_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_AT(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_AT_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_PASID(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_PASID_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_FMT(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ_FMT_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ2VF_PF_RO(port->id.phys_id),
+		    DLB2_SYS_DIR_CQ2VF_PF_RO_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_TOT_SCH_CNTL(port->id.phys_id),
+		    DLB2_LSP_CQ_DIR_TOT_SCH_CNTL_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_TOT_SCH_CNTH(port->id.phys_id),
+		    DLB2_LSP_CQ_DIR_TOT_SCH_CNTH_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_PP2VAS(port->id.phys_id),
+		    DLB2_SYS_DIR_PP2VAS_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_PP2VDEV(port->id.phys_id),
+		    DLB2_SYS_DIR_PP2VDEV_RST);
+
+	if (port->id.vdev_owned) {
+		unsigned int offs;
+		u32 virt_id;
+
+		/*
+		 * DLB uses producer port address bits 17:12 to determine the
+		 * producer port ID. In Scalable IOV mode, PP accesses come
+		 * through the PF MMIO window for the physical producer port,
+		 * so for translation purposes the virtual and physical port
+		 * IDs are equal.
+		 */
+		if (hw->virt_mode == DLB2_VIRT_SRIOV)
+			virt_id = port->id.virt_id;
+		else
+			virt_id = port->id.phys_id;
+
+		offs = port->id.vdev_id * DLB2_MAX_NUM_DIR_PORTS + virt_id;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_VF_DIR_VPP2PP(offs),
+			    DLB2_SYS_VF_DIR_VPP2PP_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_VF_DIR_VPP_V(offs),
+			    DLB2_SYS_VF_DIR_VPP_V_RST);
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_PP_V(port->id.phys_id),
+		    DLB2_SYS_DIR_PP_V_RST);
+}
+
+static void dlb2_domain_reset_dir_port_registers(struct dlb2_hw *hw,
+						 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *port;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port)
+		__dlb2_domain_reset_dir_port_registers(hw, port);
+}
+
+static void dlb2_domain_reset_ldb_queue_registers(struct dlb2_hw *hw,
+						  struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_queue *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue) {
+		unsigned int queue_id = queue->id.phys_id;
+		int i;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_NALDB_TOT_ENQ_CNTL(queue_id),
+			    DLB2_LSP_QID_NALDB_TOT_ENQ_CNTL_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_NALDB_TOT_ENQ_CNTH(queue_id),
+			    DLB2_LSP_QID_NALDB_TOT_ENQ_CNTH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_ATM_TOT_ENQ_CNTL(queue_id),
+			    DLB2_LSP_QID_ATM_TOT_ENQ_CNTL_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_ATM_TOT_ENQ_CNTH(queue_id),
+			    DLB2_LSP_QID_ATM_TOT_ENQ_CNTH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_NALDB_MAX_DEPTH(queue_id),
+			    DLB2_LSP_QID_NALDB_MAX_DEPTH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_LDB_INFL_LIM(queue_id),
+			    DLB2_LSP_QID_LDB_INFL_LIM_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_AQED_ACTIVE_LIM(queue_id),
+			    DLB2_LSP_QID_AQED_ACTIVE_LIM_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_ATM_DEPTH_THRSH(queue_id),
+			    DLB2_LSP_QID_ATM_DEPTH_THRSH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_NALDB_DEPTH_THRSH(queue_id),
+			    DLB2_LSP_QID_NALDB_DEPTH_THRSH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_QID_ITS(queue_id),
+			    DLB2_SYS_LDB_QID_ITS_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_CHP_ORD_QID_SN(queue_id),
+			    DLB2_CHP_ORD_QID_SN_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_CHP_ORD_QID_SN_MAP(queue_id),
+			    DLB2_CHP_ORD_QID_SN_MAP_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_QID_V(queue_id),
+			    DLB2_SYS_LDB_QID_V_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_QID_CFG_V(queue_id),
+			    DLB2_SYS_LDB_QID_CFG_V_RST);
+
+		if (queue->sn_cfg_valid) {
+			u32 offs[2];
+
+			offs[0] = DLB2_RO_PIPE_GRP_0_SLT_SHFT(queue->sn_slot);
+			offs[1] = DLB2_RO_PIPE_GRP_1_SLT_SHFT(queue->sn_slot);
+
+			DLB2_CSR_WR(hw,
+				    offs[queue->sn_group],
+				    DLB2_RO_PIPE_GRP_0_SLT_SHFT_RST);
+		}
+
+		for (i = 0; i < DLB2_LSP_QID2CQIDIX_NUM; i++) {
+			DLB2_CSR_WR(hw,
+				    DLB2_LSP_QID2CQIDIX(queue_id, i),
+				    DLB2_LSP_QID2CQIDIX_00_RST);
+
+			DLB2_CSR_WR(hw,
+				    DLB2_LSP_QID2CQIDIX2(queue_id, i),
+				    DLB2_LSP_QID2CQIDIX2_00_RST);
+
+			DLB2_CSR_WR(hw,
+				    DLB2_ATM_QID2CQIDIX(queue_id, i),
+				    DLB2_ATM_QID2CQIDIX_00_RST);
+		}
+	}
+}
+
+static void dlb2_domain_reset_dir_queue_registers(struct dlb2_hw *hw,
+						  struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, queue) {
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_DIR_MAX_DEPTH(queue->id.phys_id),
+			    DLB2_LSP_QID_DIR_MAX_DEPTH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_DIR_TOT_ENQ_CNTL(queue->id.phys_id),
+			    DLB2_LSP_QID_DIR_TOT_ENQ_CNTL_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_DIR_TOT_ENQ_CNTH(queue->id.phys_id),
+			    DLB2_LSP_QID_DIR_TOT_ENQ_CNTH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_QID_DIR_DEPTH_THRSH(queue->id.phys_id),
+			    DLB2_LSP_QID_DIR_DEPTH_THRSH_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_QID_ITS(queue->id.phys_id),
+			    DLB2_SYS_DIR_QID_ITS_RST);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_QID_V(queue->id.phys_id),
+			    DLB2_SYS_DIR_QID_V_RST);
+	}
+}
+
+static u32 dlb2_dir_cq_token_count(struct dlb2_hw *hw,
+				   struct dlb2_dir_pq_pair *port)
+{
+	union dlb2_lsp_cq_dir_tkn_cnt r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ_DIR_TKN_CNT(port->id.phys_id));
+
+	/*
+	 * Subtract the initial token count, which the driver seeds in order
+	 * to support CQ depths of less than 8.
+	 */
+	return r0.field.count - port->init_tkn_cnt;
+}
+
+static int dlb2_domain_verify_reset_success(struct dlb2_hw *hw,
+					    struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *dir_port;
+	struct dlb2_ldb_port *ldb_port;
+	struct dlb2_ldb_queue *queue;
+	int i;
+
+	/*
+	 * Confirm that all the domain's queues' inflight counts and AQED
+	 * active counts are 0.
+	 */
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue) {
+		if (!dlb2_ldb_queue_is_empty(hw, queue)) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: failed to empty ldb queue %d\n",
+				    __func__, queue->id.phys_id);
+			return -EFAULT;
+		}
+	}
+
+	/* Confirm that all the domain's CQs' inflight and token counts are 0. */
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], ldb_port) {
+			if (dlb2_ldb_cq_inflight_count(hw, ldb_port) ||
+			    dlb2_ldb_cq_token_count(hw, ldb_port)) {
+				DLB2_HW_ERR(hw,
+					    "[%s()] Internal error: failed to empty ldb port %d\n",
+					    __func__, ldb_port->id.phys_id);
+				return -EFAULT;
+			}
+		}
+	}
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_port) {
+		if (!dlb2_dir_queue_is_empty(hw, dir_port)) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: failed to empty dir queue %d\n",
+				    __func__, dir_port->id.phys_id);
+			return -EFAULT;
+		}
+
+		if (dlb2_dir_cq_token_count(hw, dir_port)) {
+			DLB2_HW_ERR(hw,
+				    "[%s()] Internal error: failed to empty dir port %d\n",
+				    __func__, dir_port->id.phys_id);
+			return -EFAULT;
+		}
+	}
+
+	return 0;
+}
+
+static void dlb2_domain_reset_registers(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	dlb2_domain_reset_ldb_port_registers(hw, domain);
+
+	dlb2_domain_reset_dir_port_registers(hw, domain);
+
+	dlb2_domain_reset_ldb_queue_registers(hw, domain);
+
+	dlb2_domain_reset_dir_queue_registers(hw, domain);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_CFG_LDB_VAS_CRD(domain->id.phys_id),
+		    DLB2_CHP_CFG_LDB_VAS_CRD_RST);
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_CFG_DIR_VAS_CRD(domain->id.phys_id),
+		    DLB2_CHP_CFG_DIR_VAS_CRD_RST);
+}
+
+static int dlb2_domain_drain_ldb_cqs(struct dlb2_hw *hw,
+				     struct dlb2_hw_domain *domain,
+				     bool toggle_port)
+{
+	struct dlb2_ldb_port *port;
+	int ret, i;
+
+	/* If the domain hasn't been started, there's no traffic to drain */
+	if (!domain->started)
+		return 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			if (toggle_port)
+				dlb2_ldb_port_cq_disable(hw, port);
+
+			ret = dlb2_drain_ldb_cq(hw, port);
+			if (ret < 0)
+				return ret;
+
+			if (toggle_port)
+				dlb2_ldb_port_cq_enable(hw, port);
+		}
+	}
+
+	return 0;
+}
+
+static bool dlb2_domain_mapped_queues_empty(struct dlb2_hw *hw,
+					    struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_queue *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue) {
+		if (queue->num_mappings == 0)
+			continue;
+
+		if (!dlb2_ldb_queue_is_empty(hw, queue))
+			return false;
+	}
+
+	return true;
+}
+
+static int dlb2_domain_drain_mapped_queues(struct dlb2_hw *hw,
+					   struct dlb2_hw_domain *domain)
+{
+	int i, ret;
+
+	/* If the domain hasn't been started, there's no traffic to drain */
+	if (!domain->started)
+		return 0;
+
+	if (domain->num_pending_removals > 0) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: failed to unmap domain queues\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	for (i = 0; i < DLB2_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+		ret = dlb2_domain_drain_ldb_cqs(hw, domain, true);
+		if (ret < 0)
+			return ret;
+
+		if (dlb2_domain_mapped_queues_empty(hw, domain))
+			break;
+	}
+
+	if (i == DLB2_MAX_QID_EMPTY_CHECK_LOOPS) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: failed to empty queues\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	/*
+	 * Drain the CQs one more time: for the queues to have gone empty, the
+	 * device must have scheduled their remaining QEs to the CQs.
+	 */
+	ret = dlb2_domain_drain_ldb_cqs(hw, domain, true);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int dlb2_domain_drain_unmapped_queue(struct dlb2_hw *hw,
+					    struct dlb2_hw_domain *domain,
+					    struct dlb2_ldb_queue *queue)
+{
+	struct dlb2_ldb_port *port;
+	int ret, i;
+
+	/* If a domain has LDB queues, it must have LDB ports */
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		if (!list_empty(&domain->used_ldb_ports[i]))
+			break;
+	}
+
+	if (i == DLB2_NUM_COS_DOMAINS) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: No configured LDB ports\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	port = DLB2_DOM_LIST_HEAD(domain->used_ldb_ports[i], typeof(*port));
+
+	/* If necessary, free up a QID slot in this CQ */
+	if (port->num_mappings == DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		struct dlb2_ldb_queue *mapped_queue;
+
+		mapped_queue = &hw->rsrcs.ldb_queues[port->qid_map[0].qid];
+
+		ret = dlb2_ldb_port_unmap_qid(hw, port, mapped_queue);
+		if (ret)
+			return ret;
+	}
+
+	ret = dlb2_ldb_port_map_qid_dynamic(hw, port, queue, 0);
+	if (ret)
+		return ret;
+
+	return dlb2_domain_drain_mapped_queues(hw, domain);
+}
+
+static int dlb2_domain_drain_unmapped_queues(struct dlb2_hw *hw,
+					     struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_queue *queue;
+	int ret;
+
+	/* If the domain hasn't been started, there's no traffic to drain */
+	if (!domain->started)
+		return 0;
+
+	/*
+	 * Pre-condition: the unattached queue must not have any outstanding
+	 * completions. This is ensured by calling dlb2_domain_drain_ldb_cqs()
+	 * prior to this in dlb2_domain_drain_mapped_queues().
+	 */
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue) {
+		if (queue->num_mappings != 0 ||
+		    dlb2_ldb_queue_is_empty(hw, queue))
+			continue;
+
+		ret = dlb2_domain_drain_unmapped_queue(hw, domain, queue);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int dlb2_drain_dir_cq(struct dlb2_hw *hw,
+			     struct dlb2_dir_pq_pair *port)
+{
+	unsigned int port_id = port->id.phys_id;
+	u32 cnt;
+
+	/* Return any outstanding tokens */
+	cnt = dlb2_dir_cq_token_count(hw, port);
+
+	if (cnt != 0) {
+		struct dlb2_hcw hcw_mem[8], *hcw;
+		void __iomem *pp_addr;
+
+		pp_addr = dlb2_producer_port_addr(hw, port_id, false);
+
+		/* Point hcw to a 64B-aligned location */
+		hcw = (struct dlb2_hcw *)((uintptr_t)&hcw_mem[4] & ~0x3F);
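+		/* See the alignment note in dlb2_drain_ldb_cq() */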
+
+		/*
+		 * Program the first HCW for a batch token return and
+		 * the rest as NOOPS
+		 */
+		memset(hcw, 0, 4 * sizeof(*hcw));
+		hcw->cq_token = 1;
+		hcw->lock_id = cnt - 1;
+
+		iosubmit_cmds512(pp_addr, hcw, 1);
+
+		dlb2_fence_hcw(pp_addr);
+	}
+
+	return 0;
+}
+
+static int dlb2_domain_drain_dir_cqs(struct dlb2_hw *hw,
+				     struct dlb2_hw_domain *domain,
+				     bool toggle_port)
+{
+	struct dlb2_dir_pq_pair *port;
+	int ret;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port) {
+		/*
+		 * Can't drain a port if it's not configured, and there's
+		 * nothing to drain if its queue is unconfigured.
+		 */
+		if (!port->port_configured || !port->queue_configured)
+			continue;
+
+		if (toggle_port)
+			dlb2_dir_port_cq_disable(hw, port);
+
+		ret = dlb2_drain_dir_cq(hw, port);
+		if (ret < 0)
+			return ret;
+
+		if (toggle_port)
+			dlb2_dir_port_cq_enable(hw, port);
+	}
+
+	return 0;
+}
+
+static bool dlb2_domain_dir_queues_empty(struct dlb2_hw *hw,
+					 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, queue) {
+		if (!dlb2_dir_queue_is_empty(hw, queue))
+			return false;
+	}
+
+	return true;
+}
+
+static int dlb2_domain_drain_dir_queues(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	int i, ret;
+
+	/* If the domain hasn't been started, there's no traffic to drain */
+	if (!domain->started)
+		return 0;
+
+	for (i = 0; i < DLB2_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+		ret = dlb2_domain_drain_dir_cqs(hw, domain, true);
+		if (ret < 0)
+			return ret;
+
+		if (dlb2_domain_dir_queues_empty(hw, domain))
+			break;
+	}
+
+	if (i == DLB2_MAX_QID_EMPTY_CHECK_LOOPS) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: failed to empty queues\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	/*
+	 * Drain the CQs one more time: for the queues to have gone empty, the
+	 * device must have scheduled their remaining QEs to the CQs.
+	 */
+	ret = dlb2_domain_drain_dir_cqs(hw, domain, true);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+dlb2_domain_disable_dir_producer_ports(struct dlb2_hw *hw,
+				       struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *port;
+	union dlb2_sys_dir_pp_v r1;
+
+	r1.field.pp_v = 0;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port)
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_PP_V(port->id.phys_id),
+			    r1.val);
+}
+
+static void
+dlb2_domain_disable_ldb_producer_ports(struct dlb2_hw *hw,
+				       struct dlb2_hw_domain *domain)
+{
+	union dlb2_sys_ldb_pp_v r1;
+	struct dlb2_ldb_port *port;
+	int i;
+
+	r1.field.pp_v = 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_LDB_PP_V(port->id.phys_id),
+				    r1.val);
+	}
+}
+
+static void dlb2_domain_disable_dir_vpps(struct dlb2_hw *hw,
+					 struct dlb2_hw_domain *domain,
+					 unsigned int vdev_id)
+{
+	union dlb2_sys_vf_dir_vpp_v r1;
+	struct dlb2_dir_pq_pair *port;
+
+	r1.field.vpp_v = 0;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port) {
+		unsigned int offs;
+		u32 virt_id;
+
+		if (hw->virt_mode == DLB2_VIRT_SRIOV)
+			virt_id = port->id.virt_id;
+		else
+			virt_id = port->id.phys_id;
+
+		offs = vdev_id * DLB2_MAX_NUM_DIR_PORTS + virt_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_DIR_VPP_V(offs), r1.val);
+	}
+}
+
+static void dlb2_domain_disable_ldb_vpps(struct dlb2_hw *hw,
+					 struct dlb2_hw_domain *domain,
+					 unsigned int vdev_id)
+{
+	union dlb2_sys_vf_ldb_vpp_v r1;
+	struct dlb2_ldb_port *port;
+	int i;
+
+	r1.field.vpp_v = 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			unsigned int offs;
+			u32 virt_id;
+
+			if (hw->virt_mode == DLB2_VIRT_SRIOV)
+				virt_id = port->id.virt_id;
+			else
+				virt_id = port->id.phys_id;
+
+			offs = vdev_id * DLB2_MAX_NUM_LDB_PORTS + virt_id;
+
+			DLB2_CSR_WR(hw, DLB2_SYS_VF_LDB_VPP_V(offs), r1.val);
+		}
+	}
+}
+
+static void dlb2_domain_disable_ldb_seq_checks(struct dlb2_hw *hw,
+					       struct dlb2_hw_domain *domain)
+{
+	union dlb2_chp_sn_chk_enbl r1;
+	struct dlb2_ldb_port *port;
+	int i;
+
+	r1.field.en = 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			DLB2_CSR_WR(hw,
+				    DLB2_CHP_SN_CHK_ENBL(port->id.phys_id),
+				    r1.val);
+	}
+}
+
+static void
+dlb2_domain_disable_ldb_port_interrupts(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	union dlb2_chp_ldb_cq_int_enb r0 = { {0} };
+	union dlb2_chp_ldb_cq_wd_enb r1 = { {0} };
+	struct dlb2_ldb_port *port;
+	int i;
+
+	r0.field.en_tim = 0;
+	r0.field.en_depth = 0;
+
+	r1.field.wd_enable = 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			DLB2_CSR_WR(hw,
+				    DLB2_CHP_LDB_CQ_INT_ENB(port->id.phys_id),
+				    r0.val);
+
+			DLB2_CSR_WR(hw,
+				    DLB2_CHP_LDB_CQ_WD_ENB(port->id.phys_id),
+				    r1.val);
+		}
+	}
+}
+
+static void
+dlb2_domain_disable_dir_port_interrupts(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	union dlb2_chp_dir_cq_int_enb r0 = { {0} };
+	union dlb2_chp_dir_cq_wd_enb r1 = { {0} };
+	struct dlb2_dir_pq_pair *port;
+
+	r0.field.en_tim = 0;
+	r0.field.en_depth = 0;
+
+	r1.field.wd_enable = 0;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port) {
+		DLB2_CSR_WR(hw,
+			    DLB2_CHP_DIR_CQ_INT_ENB(port->id.phys_id),
+			    r0.val);
+
+		DLB2_CSR_WR(hw,
+			    DLB2_CHP_DIR_CQ_WD_ENB(port->id.phys_id),
+			    r1.val);
+	}
+}
+
+static void
+dlb2_domain_disable_ldb_queue_write_perms(struct dlb2_hw *hw,
+					  struct dlb2_hw_domain *domain)
+{
+	int domain_offset = domain->id.phys_id * DLB2_MAX_NUM_LDB_QUEUES;
+	struct dlb2_ldb_queue *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue) {
+		union dlb2_sys_ldb_vasqid_v r0 = { {0} };
+		union dlb2_sys_ldb_qid2vqid r1 = { {0} };
+		union dlb2_sys_vf_ldb_vqid_v r2 = { {0} };
+		union dlb2_sys_vf_ldb_vqid2qid r3 = { {0} };
+		int idx;
+
+		idx = domain_offset + queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_LDB_VASQID_V(idx), r0.val);
+
+		if (queue->id.vdev_owned) {
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_LDB_QID2VQID(queue->id.phys_id),
+				    r1.val);
+
+			idx = queue->id.vdev_id * DLB2_MAX_NUM_LDB_QUEUES +
+				queue->id.virt_id;
+
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_VF_LDB_VQID_V(idx),
+				    r2.val);
+
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_VF_LDB_VQID2QID(idx),
+				    r3.val);
+		}
+	}
+}
+
+static void
+dlb2_domain_disable_dir_queue_write_perms(struct dlb2_hw *hw,
+					  struct dlb2_hw_domain *domain)
+{
+	int domain_offset = domain->id.phys_id * DLB2_MAX_NUM_DIR_PORTS;
+	struct dlb2_dir_pq_pair *queue;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, queue) {
+		union dlb2_sys_dir_vasqid_v r0 = { {0} };
+		union dlb2_sys_vf_dir_vqid_v r1 = { {0} };
+		union dlb2_sys_vf_dir_vqid2qid r2 = { {0} };
+		int idx;
+
+		idx = domain_offset + queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_DIR_VASQID_V(idx), r0.val);
+
+		if (queue->id.vdev_owned) {
+			idx = queue->id.vdev_id * DLB2_MAX_NUM_DIR_PORTS +
+				queue->id.virt_id;
+
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_VF_DIR_VQID_V(idx),
+				    r1.val);
+
+			DLB2_CSR_WR(hw,
+				    DLB2_SYS_VF_DIR_VQID2QID(idx),
+				    r2.val);
+		}
+	}
+}
+
+static void dlb2_domain_disable_dir_cqs(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *port;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port) {
+		port->enabled = false;
+
+		dlb2_dir_port_cq_disable(hw, port);
+	}
+}
+
+static void dlb2_domain_disable_ldb_cqs(struct dlb2_hw *hw,
+					struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			port->enabled = false;
+
+			dlb2_ldb_port_cq_disable(hw, port);
+		}
+	}
+}
+
+static void dlb2_domain_enable_ldb_cqs(struct dlb2_hw *hw,
+				       struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port) {
+			port->enabled = true;
+
+			dlb2_ldb_port_cq_enable(hw, port);
+		}
+	}
+}
+
+static void dlb2_log_reset_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 reset domain:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n", domain_id);
+}
+
+/**
+ * dlb2_reset_domain() - Reset a DLB scheduling domain and its associated
+ *	hardware resources.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Note: User software *must* stop sending to this domain's producer ports
+ * before invoking this function, otherwise undefined behavior will result.
+ *
+ * Return: returns < 0 on error, 0 otherwise.
+ */
+int dlb2_reset_domain(struct dlb2_hw *hw,
+		      u32 domain_id,
+		      bool vdev_req,
+		      unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	int ret;
+
+	dlb2_log_reset_domain(hw, domain_id, vdev_req, vdev_id);
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain || !domain->configured)
+		return -EINVAL;
+
+	/* Disable VPPs */
+	if (vdev_req) {
+		dlb2_domain_disable_dir_vpps(hw, domain, vdev_id);
+
+		dlb2_domain_disable_ldb_vpps(hw, domain, vdev_id);
+	}
+
+	/* Disable CQ interrupts */
+	dlb2_domain_disable_dir_port_interrupts(hw, domain);
+
+	dlb2_domain_disable_ldb_port_interrupts(hw, domain);
+
+	/*
+	 * For each queue owned by this domain, disable its write permissions to
+	 * cause any traffic sent to it to be dropped. Well-behaved software
+	 * should not be sending QEs at this point.
+	 */
+	dlb2_domain_disable_dir_queue_write_perms(hw, domain);
+
+	dlb2_domain_disable_ldb_queue_write_perms(hw, domain);
+
+	/* Turn off completion tracking on all the domain's PPs. */
+	dlb2_domain_disable_ldb_seq_checks(hw, domain);
+
+	/*
+	 * Disable the LDB CQs and drain them in order to complete the map and
+	 * unmap procedures, which require zero CQ inflights and zero QID
+	 * inflights respectively.
+	 */
+	dlb2_domain_disable_ldb_cqs(hw, domain);
+
+	ret = dlb2_domain_drain_ldb_cqs(hw, domain, false);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_domain_wait_for_ldb_cqs_to_empty(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_domain_finish_unmap_qid_procedures(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_domain_finish_map_qid_procedures(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	/* Re-enable the CQs in order to drain the mapped queues. */
+	dlb2_domain_enable_ldb_cqs(hw, domain);
+
+	ret = dlb2_domain_drain_mapped_queues(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	ret = dlb2_domain_drain_unmapped_queues(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	/* Done draining LDB QEs, so disable the CQs. */
+	dlb2_domain_disable_ldb_cqs(hw, domain);
+
+	ret = dlb2_domain_drain_dir_queues(hw, domain);
+	if (ret < 0)
+		return ret;
+
+	/* Done draining DIR QEs, so disable the CQs. */
+	dlb2_domain_disable_dir_cqs(hw, domain);
+
+	/* Disable PPs */
+	dlb2_domain_disable_dir_producer_ports(hw, domain);
+
+	dlb2_domain_disable_ldb_producer_ports(hw, domain);
+
+	ret = dlb2_domain_verify_reset_success(hw, domain);
+	if (ret)
+		return ret;
+
+	/* Reset the QID and port state. */
+	dlb2_domain_reset_registers(hw, domain);
+
+	/* Hardware reset complete. Reset the domain's software state */
+	ret = dlb2_domain_reset_software_state(hw, domain);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
 int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
 			      struct dlb2_get_num_resources_args *arg,
 			      bool vdev_req,
@@ -210,10 +3342,10 @@ int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
 	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
 		arg->num_ldb_ports += rsrcs->num_avail_ldb_ports[i];
 
-	arg->num_cos0_ldb_ports = rsrcs->num_avail_ldb_ports[0];
-	arg->num_cos1_ldb_ports = rsrcs->num_avail_ldb_ports[1];
-	arg->num_cos2_ldb_ports = rsrcs->num_avail_ldb_ports[2];
-	arg->num_cos3_ldb_ports = rsrcs->num_avail_ldb_ports[3];
+	arg->num_cos_ldb_ports[0] = rsrcs->num_avail_ldb_ports[0];
+	arg->num_cos_ldb_ports[1] = rsrcs->num_avail_ldb_ports[1];
+	arg->num_cos_ldb_ports[2] = rsrcs->num_avail_ldb_ports[2];
+	arg->num_cos_ldb_ports[3] = rsrcs->num_avail_ldb_ports[3];
 
 	arg->num_dir_ports = rsrcs->num_avail_dir_pq_pairs;
 
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 33c0b0ea6025..212039c4554a 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -7,6 +7,8 @@
 
 #include <linux/types.h>
 
+#include <uapi/linux/dlb2_user.h>
+
 #include "dlb2_hw_types.h"
 
 /**
@@ -35,6 +37,69 @@ int dlb2_resource_init(struct dlb2_hw *hw);
 void dlb2_resource_free(struct dlb2_hw *hw);
 
 /**
+ * dlb2_hw_create_sched_domain() - create a scheduling domain
+ * @hw: dlb2_hw handle for a particular device.
+ * @args: scheduling domain creation arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function creates a scheduling domain containing the resources specified
+ * in args. The individual resources (queues, ports, credits) can be configured
+ * after creating a scheduling domain.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the domain ID.
+ *
+ * resp->id contains a virtual ID if vdev_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, or the requested domain name
+ *	    is already in use.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_create_sched_domain(struct dlb2_hw *hw,
+				struct dlb2_create_sched_domain_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_request,
+				unsigned int vdev_id);
+
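+/*
+ * Illustrative sketch (not part of the API contract): a minimal request for
+ * a domain with one load-balanced queue and one load-balanced port from any
+ * class-of-service might look like:
+ *
+ *	struct dlb2_create_sched_domain_args args = {0};
+ *	struct dlb2_cmd_response resp = {0};
+ *
+ *	args.num_ldb_queues = 1;
+ *	args.num_ldb_ports = 1;
+ *	args.num_ldb_credits = 32;
+ *	args.num_hist_list_entries = 32;
+ *
+ *	ret = dlb2_hw_create_sched_domain(hw, &args, &resp, false, 0);
+ *
+ * The credit and history list counts here are placeholders; suitable values
+ * depend on the ports and queues the application goes on to configure.
+ */
+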
+/**
+ * dlb2_reset_domain() - reset a scheduling domain
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function resets and frees a DLB 2.0 scheduling domain and its associated
+ * resources.
+ *
+ * Pre-condition: the driver must ensure software has stopped sending QEs
+ * through this domain's producer ports before invoking this function, or
+ * undefined behavior will result.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID, or the domain is not configured.
+ * EFAULT - Internal error. (Possibly caused if software has not met the
+ *	    pre-condition.)
+ * ETIMEDOUT - Hardware component didn't reset in the expected time.
+ */
+int dlb2_reset_domain(struct dlb2_hw *hw,
+		      u32 domain_id,
+		      bool vdev_request,
+		      unsigned int vdev_id);
+
+/**
  * dlb2_hw_get_num_resources() - query the PCI function's available resources
  * @hw: dlb2_hw handle for a particular device.
  * @arg: pointer to resource counts.
@@ -65,4 +130,13 @@ int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
  */
 void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw);
 
+enum dlb2_virt_mode {
+	DLB2_VIRT_NONE,
+	DLB2_VIRT_SRIOV,
+	DLB2_VIRT_SIOV,
+
+	/* NUM_DLB2_VIRT_MODES must be last */
+	NUM_DLB2_VIRT_MODES,
+};
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 4742efae0b38..37b0a7b98a86 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -7,6 +7,92 @@
 
 #include <linux/types.h>
 
+enum dlb2_error {
+	DLB2_ST_SUCCESS = 0,
+	DLB2_ST_NAME_EXISTS,
+	DLB2_ST_DOMAIN_UNAVAILABLE,
+	DLB2_ST_LDB_PORTS_UNAVAILABLE,
+	DLB2_ST_DIR_PORTS_UNAVAILABLE,
+	DLB2_ST_LDB_QUEUES_UNAVAILABLE,
+	DLB2_ST_LDB_CREDITS_UNAVAILABLE,
+	DLB2_ST_DIR_CREDITS_UNAVAILABLE,
+	DLB2_ST_SEQUENCE_NUMBERS_UNAVAILABLE,
+	DLB2_ST_INVALID_DOMAIN_ID,
+	DLB2_ST_INVALID_QID_INFLIGHT_ALLOCATION,
+	DLB2_ST_ATOMIC_INFLIGHTS_UNAVAILABLE,
+	DLB2_ST_HIST_LIST_ENTRIES_UNAVAILABLE,
+	DLB2_ST_INVALID_LDB_QUEUE_ID,
+	DLB2_ST_INVALID_CQ_DEPTH,
+	DLB2_ST_INVALID_CQ_VIRT_ADDR,
+	DLB2_ST_INVALID_PORT_ID,
+	DLB2_ST_INVALID_QID,
+	DLB2_ST_INVALID_PRIORITY,
+	DLB2_ST_NO_QID_SLOTS_AVAILABLE,
+	DLB2_ST_INVALID_DIR_QUEUE_ID,
+	DLB2_ST_DIR_QUEUES_UNAVAILABLE,
+	DLB2_ST_DOMAIN_NOT_CONFIGURED,
+	DLB2_ST_INTERNAL_ERROR,
+	DLB2_ST_DOMAIN_IN_USE,
+	DLB2_ST_DOMAIN_NOT_FOUND,
+	DLB2_ST_QUEUE_NOT_FOUND,
+	DLB2_ST_DOMAIN_STARTED,
+	DLB2_ST_DOMAIN_NOT_STARTED,
+	DLB2_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES,
+	DLB2_ST_DOMAIN_RESET_FAILED,
+	DLB2_ST_MBOX_ERROR,
+	DLB2_ST_INVALID_HIST_LIST_DEPTH,
+	DLB2_ST_NO_MEMORY,
+	DLB2_ST_INVALID_LOCK_ID_COMP_LEVEL,
+	DLB2_ST_INVALID_COS_ID,
+	DLB2_ST_INVALID_SMON_ID,
+	DLB2_ST_INVALID_SMON_MODE,
+	DLB2_ST_INVALID_SMON_COMP_MODE,
+	DLB2_ST_INVALID_SMON_CAP_MODE,
+};
+
+static const char dlb2_error_strings[][128] = {
+	"DLB2_ST_SUCCESS",
+	"DLB2_ST_NAME_EXISTS",
+	"DLB2_ST_DOMAIN_UNAVAILABLE",
+	"DLB2_ST_LDB_PORTS_UNAVAILABLE",
+	"DLB2_ST_DIR_PORTS_UNAVAILABLE",
+	"DLB2_ST_LDB_QUEUES_UNAVAILABLE",
+	"DLB2_ST_LDB_CREDITS_UNAVAILABLE",
+	"DLB2_ST_DIR_CREDITS_UNAVAILABLE",
+	"DLB2_ST_SEQUENCE_NUMBERS_UNAVAILABLE",
+	"DLB2_ST_INVALID_DOMAIN_ID",
+	"DLB2_ST_INVALID_QID_INFLIGHT_ALLOCATION",
+	"DLB2_ST_ATOMIC_INFLIGHTS_UNAVAILABLE",
+	"DLB2_ST_HIST_LIST_ENTRIES_UNAVAILABLE",
+	"DLB2_ST_INVALID_LDB_QUEUE_ID",
+	"DLB2_ST_INVALID_CQ_DEPTH",
+	"DLB2_ST_INVALID_CQ_VIRT_ADDR",
+	"DLB2_ST_INVALID_PORT_ID",
+	"DLB2_ST_INVALID_QID",
+	"DLB2_ST_INVALID_PRIORITY",
+	"DLB2_ST_NO_QID_SLOTS_AVAILABLE",
+	"DLB2_ST_INVALID_DIR_QUEUE_ID",
+	"DLB2_ST_DIR_QUEUES_UNAVAILABLE",
+	"DLB2_ST_DOMAIN_NOT_CONFIGURED",
+	"DLB2_ST_INTERNAL_ERROR",
+	"DLB2_ST_DOMAIN_IN_USE",
+	"DLB2_ST_DOMAIN_NOT_FOUND",
+	"DLB2_ST_QUEUE_NOT_FOUND",
+	"DLB2_ST_DOMAIN_STARTED",
+	"DLB2_ST_DOMAIN_NOT_STARTED",
+	"DLB2_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES",
+	"DLB2_ST_DOMAIN_RESET_FAILED",
+	"DLB2_ST_MBOX_ERROR",
+	"DLB2_ST_INVALID_HIST_LIST_DEPTH",
+	"DLB2_ST_NO_MEMORY",
+	"DLB2_ST_INVALID_LOCK_ID_COMP_LEVEL",
+	"DLB2_ST_INVALID_COS_ID",
+	"DLB2_ST_INVALID_SMON_ID",
+	"DLB2_ST_INVALID_SMON_MODE",
+	"DLB2_ST_INVALID_SMON_COMP_MODE",
+	"DLB2_ST_INVALID_SMON_CAP_MODE",
+};
+
 struct dlb2_cmd_response {
 	__u32 status; /* Interpret using enum dlb2_error */
 	__u32 id;
@@ -105,10 +191,8 @@ struct dlb2_get_driver_version_args {
  * - num_ldb_queues: Number of load-balanced queues.
  * - num_ldb_ports: Number of load-balanced ports that can be allocated from
  *	from any class-of-service with available ports.
- * - num_cos0_ldb_ports: Number of load-balanced ports from class-of-service 0.
- * - num_cos1_ldb_ports: Number of load-balanced ports from class-of-service 1.
- * - num_cos2_ldb_ports: Number of load-balanced ports from class-of-service 2.
- * - num_cos3_ldb_ports: Number of load-balanced ports from class-of-service 3.
+ * - num_cos_ldb_ports[4]: Number of load-balanced ports from
+ *	classes-of-service 0-3.
  * - num_dir_ports: Number of directed ports. A directed port has one directed
  *	queue, so no num_dir_queues argument is necessary.
  * - num_atomic_inflights: This specifies the amount of temporary atomic QE
@@ -144,10 +228,7 @@ struct dlb2_create_sched_domain_args {
 	/* Input parameters */
 	__u32 num_ldb_queues;
 	__u32 num_ldb_ports;
-	__u32 num_cos0_ldb_ports;
-	__u32 num_cos1_ldb_ports;
-	__u32 num_cos2_ldb_ports;
-	__u32 num_cos3_ldb_ports;
+	__u32 num_cos_ldb_ports[4];
 	__u32 num_dir_ports;
 	__u32 num_atomic_inflights;
 	__u32 num_hist_list_entries;
@@ -165,14 +246,8 @@ struct dlb2_create_sched_domain_args {
  * - num_domains: Number of available scheduling domains.
  * - num_ldb_queues: Number of available load-balanced queues.
  * - num_ldb_ports: Total number of available load-balanced ports.
- * - num_cos0_ldb_ports: Number of available load-balanced ports from
- *	class-of-service 0.
- * - num_cos1_ldb_ports: Number of available load-balanced ports from
- *	class-of-service 1.
- * - num_cos2_ldb_ports: Number of available load-balanced ports from
- *	class-of-service 2.
- * - num_cos3_ldb_ports: Number of available load-balanced ports from
- *	class-of-service 3.
+ * - num_cos_ldb_ports[4]: Number of available load-balanced ports from
+ *	classes-of-service 0-3.
  * - num_dir_ports: Number of available directed ports. There is one directed
  *	queue for every directed port.
  * - num_atomic_inflights: Amount of available temporary atomic QE storage.
@@ -188,10 +263,7 @@ struct dlb2_get_num_resources_args {
 	__u32 num_sched_domains;
 	__u32 num_ldb_queues;
 	__u32 num_ldb_ports;
-	__u32 num_cos0_ldb_ports;
-	__u32 num_cos1_ldb_ports;
-	__u32 num_cos2_ldb_ports;
-	__u32 num_cos3_ldb_ports;
+	__u32 num_cos_ldb_ports[4];
 	__u32 num_dir_ports;
 	__u32 num_atomic_inflights;
 	__u32 num_hist_list_entries;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/20] dlb2: add ioctl to get sched domain fd
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (4 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 05/20] dlb2: add sched domain config and reset support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 07/20] dlb2: add runtime power-management support Gage Eads
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

To create the scheduling domain file and install it in the current
process's file table, the driver uses anon_inode_getfd(). The calling
process can then use this fd to issue ioctls for configuration of the
scheduling domain's resources (to be added in an upcoming commit).

This design means that any application with sufficient permissions to
access the /dev/dlb<N> file can request the fd of any scheduling domain --
potentially one that was created by another process. One way to ensure that
DLB applications cannot access each other's scheduling domains is to use
virtualization; that is, create multiple virtual functions -- and thus
multiple /dev/dlb<N> nodes -- each with unique file permissions.
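
For illustration, a minimal user-space sketch of this flow (assuming the
uapi definitions added in this patch; open() of /dev/dlb<N> and error
handling are omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

static int dlb2_get_domain_fd(int dev_fd, __u32 domain_id)
{
	struct dlb2_cmd_response response = {0};
	struct dlb2_get_sched_domain_fd_args args = {
		.response = (__u64)(uintptr_t)&response,
		.domain_id = domain_id,
	};

	if (ioctl(dev_fd, DLB2_IOC_GET_SCHED_DOMAIN_FD, &args) < 0)
		return -1;

	/* On success, response.id holds the anonymous domain fd. */
	return response.id;
}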

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c | 68 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/dlb2_user.h | 30 +++++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index 8ad3391b63d4..eef9b824b276 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -147,6 +147,73 @@ static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
 	return ret;
 }
 
+static int dlb2_ioctl_get_sched_domain_fd(struct dlb2_dev *dev,
+					  unsigned long user_arg,
+					  u16 size)
+{
+	struct dlb2_get_sched_domain_fd_args arg;
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_domain *domain;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	if (arg.domain_id >= DLB2_MAX_NUM_DOMAINS) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid domain id %u\n",
+			__func__, arg.domain_id);
+		response.status = DLB2_ST_INVALID_DOMAIN_ID;
+		ret = -EINVAL;
+		goto copy;
+	}
+
+	mutex_lock(&dev->resource_mutex);
+
+	domain = dev->sched_domains[arg.domain_id];
+
+	if (!domain) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain %u not configured\n",
+			__func__, arg.domain_id);
+		response.status = DLB2_ST_DOMAIN_UNAVAILABLE;
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	ret = anon_inode_getfd("[dlb2domain]", &dlb2_domain_fops,
+			       domain, O_RDWR);
+
+	response.id = ret;
+
+	if (ret >= 0) {
+		kref_get(&domain->refcnt);
+
+		ret = 0;
+	}
+
+unlock:
+	mutex_unlock(&dev->resource_mutex);
+
+copy:
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 static int dlb2_ioctl_get_num_resources(struct dlb2_dev *dev,
 					unsigned long user_arg,
 					u16 size)
@@ -205,6 +272,7 @@ typedef int (*dlb2_ioctl_callback_fn_t)(struct dlb2_dev *dev,
 static dlb2_ioctl_callback_fn_t dlb2_ioctl_callback_fns[NUM_DLB2_CMD] = {
 	dlb2_ioctl_get_device_version,
 	dlb2_ioctl_create_sched_domain,
+	dlb2_ioctl_get_sched_domain_fd,
 	dlb2_ioctl_get_num_resources,
 	dlb2_ioctl_get_driver_version,
 };
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 37b0a7b98a86..95e0d2672abf 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -239,6 +239,31 @@ struct dlb2_create_sched_domain_args {
 };
 
 /*
+ * DLB2_CMD_GET_SCHED_DOMAIN_FD: Get an anonymous scheduling domain fd.
+ *
+ *	The domain must have been previously created with the ioctl
+ *	DLB2_CMD_CREATE_SCHED_DOMAIN. The domain is reset when all
+ *	its open files and memory mappings are closed.
+ *
+ * Input parameters:
+ * - domain_id: Domain ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: domain fd.
+ */
+struct dlb2_get_sched_domain_fd_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 domain_id;
+	__u32 padding0;
+};
+
+/*
  * DLB2_CMD_GET_NUM_RESOURCES: Return the number of available resources
  *	(queues, ports, etc.) that this device owns.
  *
@@ -275,6 +300,7 @@ struct dlb2_get_num_resources_args {
 enum dlb2_user_interface_commands {
 	DLB2_CMD_GET_DEVICE_VERSION,
 	DLB2_CMD_CREATE_SCHED_DOMAIN,
+	DLB2_CMD_GET_SCHED_DOMAIN_FD,
 	DLB2_CMD_GET_NUM_RESOURCES,
 	DLB2_CMD_GET_DRIVER_VERSION,
 
@@ -296,6 +322,10 @@ enum dlb2_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_CREATE_SCHED_DOMAIN,		\
 		      struct dlb2_create_sched_domain_args)
+#define DLB2_IOC_GET_SCHED_DOMAIN_FD				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_SCHED_DOMAIN_FD,		\
+		      struct dlb2_get_sched_domain_fd_args)
 #define DLB2_IOC_GET_NUM_RESOURCES				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_NUM_RESOURCES,		\
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/20] dlb2: add runtime power-management support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (5 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 06/20] dlb2: add ioctl to get sched domain fd Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 08/20] dlb2: add queue create and queue-depth-get ioctls Gage Eads
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

The driver's power-management policy is to put the device in D0 when in use
(when there are any open device files or memory mappings, or there are any
virtual devices), and to leave it in D3hot otherwise. The driver also
implements runtime suspend/resume callbacks; when the device resumes, the
driver resets the hardware to a known-good state.
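
As a sketch of the pattern (not verbatim driver code), every user of the
device is bracketed by a runtime PM reference:

#include <linux/pm_runtime.h>

/* Taken when a device file is opened or a scheduling domain is created. */
static void dlb2_pm_get_sketch(struct pci_dev *pdev)
{
	/* Bump the usage count and wake the device (D3hot -> D0). */
	pm_runtime_get_sync(&pdev->dev);
}

/* Dropped on the matching close/destroy path. */
static void dlb2_pm_put_sketch(struct pci_dev *pdev)
{
	/* Drop the usage count; at zero the device may enter D3hot. */
	pm_runtime_put_sync_suspend(&pdev->dev);
}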

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c  |  2 ++
 drivers/misc/dlb2/dlb2_main.c   | 79 +++++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h   |  2 ++
 drivers/misc/dlb2/dlb2_pf_ops.c | 30 ++++++++++++++++
 4 files changed, 113 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index eef9b824b276..b36e255e8d35 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -197,6 +197,8 @@ static int dlb2_ioctl_get_sched_domain_fd(struct dlb2_dev *dev,
 	if (ret >= 0) {
 		kref_get(&domain->refcnt);
 
+		dev->ops->inc_pm_refcnt(dev->pdev, true);
+
 		ret = 0;
 	}
 
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index eb716587e738..8ace8e1edbcb 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -69,11 +69,21 @@ static int dlb2_open(struct inode *i, struct file *f)
 
 	f->private_data = dev;
 
+	dev->ops->inc_pm_refcnt(dev->pdev, true);
+
 	return 0;
 }
 
 static int dlb2_close(struct inode *i, struct file *f)
 {
+	struct dlb2_dev *dev;
+
+	dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);
+
+	dev_dbg(dev->dlb2_device, "Closing DLB device file\n");
+
+	dev->ops->dec_pm_refcnt(dev->pdev);
+
 	return 0;
 }
 
@@ -120,6 +130,8 @@ int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id)
 
 	dlb2_dev->sched_domains[domain_id] = domain;
 
+	dlb2_dev->ops->inc_pm_refcnt(dlb2_dev->pdev, true);
+
 	return 0;
 }
 
@@ -293,6 +305,15 @@ static int dlb2_probe(struct pci_dev *pdev,
 	list_add(&dlb2_dev->list, &dlb2_dev_list);
 	mutex_unlock(&dlb2_driver_lock);
 
+	/*
+	 * The driver puts the device to sleep (D3hot) while there are no
+	 * scheduling domains to service. The usage counter of a PCI device at
+	 * probe time is 2, so decrement it twice here. (The PCI layer has
+	 * already called pm_runtime_enable().)
+	 */
+	dlb2_dev->ops->dec_pm_refcnt(pdev);
+	dlb2_dev->ops->dec_pm_refcnt(pdev);
+
 	return 0;
 
 init_driver_state_fail:
@@ -330,6 +351,10 @@ static void dlb2_remove(struct pci_dev *pdev)
 	list_del(&dlb2_dev->list);
 	mutex_unlock(&dlb2_driver_lock);
 
+	/* Undo the PM operations in dlb2_probe(). */
+	dlb2_dev->ops->inc_pm_refcnt(pdev, false);
+	dlb2_dev->ops->inc_pm_refcnt(pdev, false);
+
 	dlb2_dev->ops->free_driver_state(dlb2_dev);
 
 	dlb2_resource_free(&dlb2_dev->hw);
@@ -351,17 +376,71 @@ static void dlb2_remove(struct pci_dev *pdev)
 	devm_kfree(&pdev->dev, dlb2_dev);
 }
 
+#ifdef CONFIG_PM
+static void dlb2_reset_hardware_state(struct dlb2_dev *dev)
+{
+	dlb2_reset_device(dev->pdev);
+
+	/* Reinitialize any other hardware state */
+	dev->ops->init_hardware(dev);
+}
+
+static int dlb2_runtime_suspend(struct device *dev)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct dlb2_dev *dlb2_dev = pci_get_drvdata(pdev);
+
+	dev_dbg(dlb2_dev->dlb2_device, "Suspending device operation\n");
+
+	/* Return and let the PCI subsystem put the device in D3hot. */
+
+	return 0;
+}
+
+static int dlb2_runtime_resume(struct device *dev)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct dlb2_dev *dlb2_dev = pci_get_drvdata(pdev);
+	int ret;
+
+	/*
+	 * The PCI subsystem put the device in D0, but the device may not have
+	 * completed powering up. Wait until the device is ready before
+	 * proceeding.
+	 */
+	ret = dlb2_dev->ops->wait_for_device_ready(dlb2_dev, pdev);
+	if (ret)
+		return ret;
+
+	dev_dbg(dlb2_dev->dlb2_device, "Resuming device operation\n");
+
+	/* Now reinitialize the device state. */
+	dlb2_reset_hardware_state(dlb2_dev);
+
+	return 0;
+}
+#endif
+
 static struct pci_device_id dlb2_id_table[] = {
 	{ PCI_DEVICE_DATA(INTEL, DLB2_PF, DLB2_PF) },
 	{ 0 }
 };
 MODULE_DEVICE_TABLE(pci, dlb2_id_table);
 
+#ifdef CONFIG_PM
+static const struct dev_pm_ops dlb2_pm_ops = {
+	SET_RUNTIME_PM_OPS(dlb2_runtime_suspend, dlb2_runtime_resume, NULL)
+};
+#endif
+
 static struct pci_driver dlb2_pci_driver = {
 	.name		 = (char *)dlb2_driver_name,
 	.id_table	 = dlb2_id_table,
 	.probe		 = dlb2_probe,
 	.remove		 = dlb2_remove,
+#ifdef CONFIG_PM
+	.driver.pm	 = &dlb2_pm_ops,
+#endif
 };
 
 static int __init dlb2_init_module(void)
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 81795754c070..76380f0ca51b 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -41,6 +41,8 @@ struct dlb2_device_ops {
 	int (*map_pci_bar_space)(struct dlb2_dev *dev, struct pci_dev *pdev);
 	void (*unmap_pci_bar_space)(struct dlb2_dev *dev,
 				    struct pci_dev *pdev);
+	void (*inc_pm_refcnt)(struct pci_dev *pdev, bool resume);
+	void (*dec_pm_refcnt)(struct pci_dev *pdev);
 	int (*init_driver_state)(struct dlb2_dev *dev);
 	void (*free_driver_state)(struct dlb2_dev *dev);
 	int (*device_create)(struct dlb2_dev *dlb2_dev,
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index ae83a2feb3f9..e4de46eccf87 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -2,11 +2,39 @@
 /* Copyright(c) 2017-2020 Intel Corporation */
 
 #include <linux/delay.h>
+#include <linux/pm_runtime.h>
 
 #include "dlb2_main.h"
 #include "dlb2_regs.h"
 #include "dlb2_resource.h"
 
+/***********************************/
+/****** Runtime PM management ******/
+/***********************************/
+
+static void
+dlb2_pf_pm_inc_refcnt(struct pci_dev *pdev, bool resume)
+{
+	if (resume)
+		/*
+		 * Increment the device's usage count and immediately wake it
+		 * if it was suspended.
+		 */
+		pm_runtime_get_sync(&pdev->dev);
+	else
+		pm_runtime_get_noresume(&pdev->dev);
+}
+
+static void
+dlb2_pf_pm_dec_refcnt(struct pci_dev *pdev)
+{
+	/*
+	 * Decrement the device's usage count and suspend it if the
+	 * count reaches zero.
+	 */
+	pm_runtime_put_sync_suspend(&pdev->dev);
+}
+
 /********************************/
 /****** PCI BAR management ******/
 /********************************/
@@ -226,6 +254,8 @@ dlb2_pf_reset_domain(struct dlb2_hw *hw, u32 id)
 struct dlb2_device_ops dlb2_pf_ops = {
 	.map_pci_bar_space = dlb2_pf_map_pci_bar_space,
 	.unmap_pci_bar_space = dlb2_pf_unmap_pci_bar_space,
+	.inc_pm_refcnt = dlb2_pf_pm_inc_refcnt,
+	.dec_pm_refcnt = dlb2_pf_pm_dec_refcnt,
 	.init_driver_state = dlb2_pf_init_driver_state,
 	.free_driver_state = dlb2_pf_free_driver_state,
 	.device_create = dlb2_pf_device_create,
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/20] dlb2: add queue create and queue-depth-get ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (6 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 07/20] dlb2: add runtime power-management support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode Gage Eads
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

When a CPU enqueues a queue entry (QE) to DLB 2.0, the QE is sent to a
DLB 2.0 queue. These queues hold QEs that have not yet been scheduled to a
destination port. A queue's depth is the number of QEs currently residing
in that queue.

Each queue supports multiple priority levels. While a directed queue has a
1:1 mapping with a directed port, a load-balanced queue can be configured
with the set of load-balanced ports to which software wants the queue's
QEs scheduled.
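
Once created, a queue's depth can be sampled through the scheduling domain
fd; a minimal user-space sketch (assuming the uapi additions in this patch,
with error handling omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

static int dlb2_ldb_queue_depth(int domain_fd, __u32 queue_id)
{
	struct dlb2_cmd_response response = {0};
	struct dlb2_get_ldb_queue_depth_args args = {
		.response = (__u64)(uintptr_t)&response,
		.queue_id = queue_id,
	};

	if (ioctl(domain_fd, DLB2_IOC_GET_LDB_QUEUE_DEPTH, &args) < 0)
		return -1;

	return response.id; /* current number of QEs in the queue */
}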

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    |  82 +++++
 drivers/misc/dlb2/dlb2_ioctl.h    |   5 +
 drivers/misc/dlb2/dlb2_main.c     |  17 +
 drivers/misc/dlb2/dlb2_main.h     |  16 +
 drivers/misc/dlb2/dlb2_pf_ops.c   |  40 +++
 drivers/misc/dlb2/dlb2_resource.c | 732 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h | 125 +++++++
 include/uapi/linux/dlb2_user.h    | 142 ++++++++
 8 files changed, 1159 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index b36e255e8d35..b1aa0c7bf011 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -44,6 +44,88 @@ static int dlb2_copy_resp_to_user(struct dlb2_dev *dev,
 	return 0;
 }
 
+/*
+ * The DLB domain ioctl callback template minimizes replication of boilerplate
+ * code to copy arguments, acquire and release the resource lock, and execute
+ * the command.  The arguments and response structure name should have the
+ * format dlb2_<lower_name>_args.
+ */
+#define DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(lower_name)			   \
+static int dlb2_domain_ioctl_##lower_name(struct dlb2_dev *dev,		   \
+					  struct dlb2_domain *domain,	   \
+					  unsigned long user_arg,	   \
+					  u16 size)			   \
+{									   \
+	struct dlb2_##lower_name##_args arg;				   \
+	struct dlb2_cmd_response response = {0};			   \
+	int ret;							   \
+									   \
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);		   \
+									   \
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg)); \
+	if (ret)							   \
+		return ret;						   \
+									   \
+	/* Copy zeroes to verify the user-provided response pointer */	   \
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);	   \
+	if (ret)							   \
+		return ret;						   \
+									   \
+	mutex_lock(&dev->resource_mutex);				   \
+									   \
+	ret = dev->ops->lower_name(&dev->hw,				   \
+				   domain->id,				   \
+				   &arg,				   \
+				   &response);				   \
+									   \
+	mutex_unlock(&dev->resource_mutex);				   \
+									   \
+	if (copy_to_user((void __user *)arg.response,			   \
+			 &response,					   \
+			 sizeof(response)))				   \
+		return -EFAULT;						   \
+									   \
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);		   \
+									   \
+	return ret;							   \
+}
+
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_ldb_queue)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_dir_queue)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_ldb_queue_depth)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_dir_queue_depth)
+
+typedef int (*dlb2_domain_ioctl_callback_fn_t)(struct dlb2_dev *dev,
+					       struct dlb2_domain *domain,
+					       unsigned long arg,
+					       u16 size);
+
+static dlb2_domain_ioctl_callback_fn_t
+dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
+	dlb2_domain_ioctl_create_ldb_queue,
+	dlb2_domain_ioctl_create_dir_queue,
+	dlb2_domain_ioctl_get_ldb_queue_depth,
+	dlb2_domain_ioctl_get_dir_queue_depth,
+};
+
+int dlb2_domain_ioctl_dispatcher(struct dlb2_dev *dev,
+				 struct dlb2_domain *domain,
+				 unsigned int cmd,
+				 unsigned long arg)
+{
+	int cmd_nr = _IOC_NR(cmd);
+	u16 sz = _IOC_SIZE(cmd);
+
+	if (cmd_nr >= NUM_DLB2_DOMAIN_CMD) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Unexpected DLB DOMAIN command %d\n",
+			__func__, _IOC_NR(cmd));
+		return -EINVAL;
+	}
+
+	return dlb2_domain_ioctl_callback_fns[cmd_nr](dev, domain, arg, sz);
+}
+
 /* [7:0]: device revision, [15:8]: device version */
 #define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
 
diff --git a/drivers/misc/dlb2/dlb2_ioctl.h b/drivers/misc/dlb2/dlb2_ioctl.h
index 476548cdd33c..0e5fa0dd630a 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.h
+++ b/drivers/misc/dlb2/dlb2_ioctl.h
@@ -11,4 +11,9 @@ int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
 			  unsigned int cmd,
 			  unsigned long arg);
 
+int dlb2_domain_ioctl_dispatcher(struct dlb2_dev *dev,
+				 struct dlb2_domain *domain,
+				 unsigned int cmd,
+				 unsigned long arg);
+
 #endif /* __DLB2_IOCTL_H */
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index 8ace8e1edbcb..a19bc542f637 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -180,9 +180,26 @@ static int dlb2_domain_close(struct inode *i, struct file *f)
 	return ret;
 }
 
+static long
+dlb2_domain_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct dlb2_domain *domain = f->private_data;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+
+	if (_IOC_TYPE(cmd) != DLB2_IOC_MAGIC) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Bad magic number!\n", __func__);
+		return -EINVAL;
+	}
+
+	return dlb2_domain_ioctl_dispatcher(dev, domain, cmd, arg);
+}
+
 const struct file_operations dlb2_domain_fops = {
 	.owner   = THIS_MODULE,
 	.release = dlb2_domain_close,
+	.unlocked_ioctl = dlb2_domain_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
 };
 
 /**********************************/
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 76380f0ca51b..f89888fa781a 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -60,9 +60,25 @@ struct dlb2_device_ops {
 	int (*create_sched_domain)(struct dlb2_hw *hw,
 				   struct dlb2_create_sched_domain_args *args,
 				   struct dlb2_cmd_response *resp);
+	int (*create_ldb_queue)(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_create_ldb_queue_args *args,
+				struct dlb2_cmd_response *resp);
+	int (*create_dir_queue)(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_create_dir_queue_args *args,
+				struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
+	int (*get_ldb_queue_depth)(struct dlb2_hw *hw,
+				   u32 domain_id,
+				   struct dlb2_get_ldb_queue_depth_args *args,
+				   struct dlb2_cmd_response *resp);
+	int (*get_dir_queue_depth)(struct dlb2_hw *hw,
+				   u32 domain_id,
+				   struct dlb2_get_dir_queue_depth_args *args,
+				   struct dlb2_cmd_response *resp);
 	void (*init_hardware)(struct dlb2_dev *dev);
 };
 
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index e4de46eccf87..ab36d253f396 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -235,6 +235,24 @@ dlb2_pf_create_sched_domain(struct dlb2_hw *hw,
 }
 
 static int
+dlb2_pf_create_ldb_queue(struct dlb2_hw *hw,
+			 u32 id,
+			 struct dlb2_create_ldb_queue_args *args,
+			 struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_create_ldb_queue(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_create_dir_queue(struct dlb2_hw *hw,
+			 u32 id,
+			 struct dlb2_create_dir_queue_args *args,
+			 struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_create_dir_queue(hw, id, args, resp, false, 0);
+}
+
+static int
 dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 			  struct dlb2_get_num_resources_args *args)
 {
@@ -247,6 +265,24 @@ dlb2_pf_reset_domain(struct dlb2_hw *hw, u32 id)
 	return dlb2_reset_domain(hw, id, false, 0);
 }
 
+static int
+dlb2_pf_get_ldb_queue_depth(struct dlb2_hw *hw,
+			    u32 id,
+			    struct dlb2_get_ldb_queue_depth_args *args,
+			    struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_get_ldb_queue_depth(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_get_dir_queue_depth(struct dlb2_hw *hw,
+			    u32 id,
+			    struct dlb2_get_dir_queue_depth_args *args,
+			    struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_get_dir_queue_depth(hw, id, args, resp, false, 0);
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -265,7 +301,11 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.enable_pm = dlb2_pf_enable_pm,
 	.wait_for_device_ready = dlb2_pf_wait_for_device_ready,
 	.create_sched_domain = dlb2_pf_create_sched_domain,
+	.create_ldb_queue = dlb2_pf_create_ldb_queue,
+	.create_dir_queue = dlb2_pf_create_dir_queue,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
+	.get_ldb_queue_depth = dlb2_pf_get_ldb_queue_depth,
+	.get_dir_queue_depth = dlb2_pf_get_dir_queue_depth,
 	.init_hardware = dlb2_pf_init_hardware,
 };
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index d3773c0c5dd1..512b458b14e6 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -237,6 +237,24 @@ static struct dlb2_hw_domain *dlb2_get_domain_from_id(struct dlb2_hw *hw,
 	return NULL;
 }
 
+static struct dlb2_dir_pq_pair *
+dlb2_get_domain_used_dir_pq(u32 id,
+			    bool vdev_req,
+			    struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *port;
+
+	if (id >= DLB2_MAX_NUM_DIR_PORTS)
+		return NULL;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port)
+		if ((!vdev_req && port->id.phys_id == id) ||
+		    (vdev_req && port->id.virt_id == id))
+			return port;
+
+	return NULL;
+}
+
 static struct dlb2_ldb_queue *
 dlb2_get_ldb_queue_from_id(struct dlb2_hw *hw,
 			   u32 id,
@@ -268,6 +286,24 @@ dlb2_get_ldb_queue_from_id(struct dlb2_hw *hw,
 	return NULL;
 }
 
+static struct dlb2_ldb_queue *
+dlb2_get_domain_ldb_queue(u32 id,
+			  bool vdev_req,
+			  struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_queue *queue;
+
+	if (id >= DLB2_MAX_NUM_LDB_QUEUES)
+		return NULL;
+
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, queue)
+		if ((!vdev_req && queue->id.phys_id == id) ||
+		    (vdev_req && queue->id.virt_id == id))
+			return queue;
+
+	return NULL;
+}
+
 static int dlb2_attach_ldb_queues(struct dlb2_hw *hw,
 				  struct dlb2_function_resources *rsrcs,
 				  struct dlb2_hw_domain *domain,
@@ -707,6 +743,352 @@ dlb2_verify_create_sched_dom_args(struct dlb2_function_resources *rsrcs,
 	return 0;
 }
 
+static int
+dlb2_verify_create_ldb_queue_args(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  struct dlb2_create_ldb_queue_args *args,
+				  struct dlb2_cmd_response *resp,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	int i;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	if (domain->started) {
+		resp->status = DLB2_ST_DOMAIN_STARTED;
+		return -EINVAL;
+	}
+
+	if (list_empty(&domain->avail_ldb_queues)) {
+		resp->status = DLB2_ST_LDB_QUEUES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (args->num_sequence_numbers) {
+		for (i = 0; i < DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+			struct dlb2_sn_group *group = &hw->rsrcs.sn_groups[i];
+
+			if (group->sequence_numbers_per_queue ==
+			    args->num_sequence_numbers &&
+			    !dlb2_sn_group_full(group))
+				break;
+		}
+
+		if (i == DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS) {
+			resp->status = DLB2_ST_SEQUENCE_NUMBERS_UNAVAILABLE;
+			return -EINVAL;
+		}
+	}
+
+	if (args->num_qid_inflights > 4096) {
+		resp->status = DLB2_ST_INVALID_QID_INFLIGHT_ALLOCATION;
+		return -EINVAL;
+	}
+
+	/* Inflights must be <= number of sequence numbers if ordered */
+	if (args->num_sequence_numbers != 0 &&
+	    args->num_qid_inflights > args->num_sequence_numbers) {
+		resp->status = DLB2_ST_INVALID_QID_INFLIGHT_ALLOCATION;
+		return -EINVAL;
+	}
+
+	if (domain->num_avail_aqed_entries < args->num_atomic_inflights) {
+		resp->status = DLB2_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	if (args->num_atomic_inflights &&
+	    args->lock_id_comp_level != 0 &&
+	    args->lock_id_comp_level != 64 &&
+	    args->lock_id_comp_level != 128 &&
+	    args->lock_id_comp_level != 256 &&
+	    args->lock_id_comp_level != 512 &&
+	    args->lock_id_comp_level != 1024 &&
+	    args->lock_id_comp_level != 2048 &&
+	    args->lock_id_comp_level != 4096 &&
+	    args->lock_id_comp_level != 65536) {
+		resp->status = DLB2_ST_INVALID_LOCK_ID_COMP_LEVEL;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+dlb2_verify_create_dir_queue_args(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  struct dlb2_create_dir_queue_args *args,
+				  struct dlb2_cmd_response *resp,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	if (domain->started) {
+		resp->status = DLB2_ST_DOMAIN_STARTED;
+		return -EINVAL;
+	}
+
+	/*
+	 * If the user claims the port is already configured, verify that it
+	 * exists, belongs to this domain, and has in fact been configured.
+	 */
+	if (args->port_id != -1) {
+		struct dlb2_dir_pq_pair *port;
+
+		port = dlb2_get_domain_used_dir_pq(args->port_id,
+						   vdev_req,
+						   domain);
+
+		if (!port || port->domain_id.phys_id != domain->id.phys_id ||
+		    !port->port_configured) {
+			resp->status = DLB2_ST_INVALID_PORT_ID;
+			return -EINVAL;
+		}
+	}
+
+	/*
+	 * If the queue's port is not configured, validate that a free
+	 * port-queue pair is available.
+	 */
+	if (args->port_id == -1 && list_empty(&domain->avail_dir_pq_pairs)) {
+		resp->status = DLB2_ST_DIR_QUEUES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void dlb2_configure_ldb_queue(struct dlb2_hw *hw,
+				     struct dlb2_hw_domain *domain,
+				     struct dlb2_ldb_queue *queue,
+				     struct dlb2_create_ldb_queue_args *args,
+				     bool vdev_req,
+				     unsigned int vdev_id)
+{
+	union dlb2_sys_vf_ldb_vqid_v r0 = { {0} };
+	union dlb2_sys_vf_ldb_vqid2qid r1 = { {0} };
+	union dlb2_sys_ldb_qid2vqid r2 = { {0} };
+	union dlb2_sys_ldb_vasqid_v r3 = { {0} };
+	union dlb2_lsp_qid_ldb_infl_lim r4 = { {0} };
+	union dlb2_lsp_qid_aqed_active_lim r5 = { {0} };
+	union dlb2_aqed_pipe_qid_hid_width r6 = { {0} };
+	union dlb2_sys_ldb_qid_its r7 = { {0} };
+	union dlb2_lsp_qid_atm_depth_thrsh r8 = { {0} };
+	union dlb2_lsp_qid_naldb_depth_thrsh r9 = { {0} };
+	union dlb2_aqed_pipe_qid_fid_lim r10 = { {0} };
+	union dlb2_chp_ord_qid_sn_map r11 = { {0} };
+	union dlb2_sys_ldb_qid_cfg_v r12 = { {0} };
+	union dlb2_sys_ldb_qid_v r13 = { {0} };
+
+	struct dlb2_sn_group *sn_group;
+	unsigned int offs;
+
+	/* QID write permissions are turned on when the domain is started */
+	r3.field.vasqid_v = 0;
+
+	offs = domain->id.phys_id * DLB2_MAX_NUM_LDB_QUEUES +
+		queue->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_VASQID_V(offs), r3.val);
+
+	/*
+	 * Unordered QIDs get 4K inflights, ordered get as many as the number
+	 * of sequence numbers.
+	 */
+	r4.field.limit = args->num_qid_inflights;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), r4.val);
+
+	r5.field.limit = queue->aqed_limit;
+
+	if (r5.field.limit > DLB2_MAX_NUM_AQED_ENTRIES)
+		r5.field.limit = DLB2_MAX_NUM_AQED_ENTRIES;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID_AQED_ACTIVE_LIM(queue->id.phys_id),
+		    r5.val);
+
+	switch (args->lock_id_comp_level) {
+	case 64:
+		r6.field.compress_code = 1;
+		break;
+	case 128:
+		r6.field.compress_code = 2;
+		break;
+	case 256:
+		r6.field.compress_code = 3;
+		break;
+	case 512:
+		r6.field.compress_code = 4;
+		break;
+	case 1024:
+		r6.field.compress_code = 5;
+		break;
+	case 2048:
+		r6.field.compress_code = 6;
+		break;
+	case 4096:
+		r6.field.compress_code = 7;
+		break;
+	case 0:
+	case 65536:
+		r6.field.compress_code = 0;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_AQED_PIPE_QID_HID_WIDTH(queue->id.phys_id),
+		    r6.val);
+
+	/* Don't timestamp QEs that pass through this queue */
+	r7.field.qid_its = 0;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_QID_ITS(queue->id.phys_id),
+		    r7.val);
+
+	r8.field.thresh = args->depth_threshold;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID_ATM_DEPTH_THRSH(queue->id.phys_id),
+		    r8.val);
+
+	r9.field.thresh = args->depth_threshold;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID_NALDB_DEPTH_THRSH(queue->id.phys_id),
+		    r9.val);
+
+	/*
+	 * This register limits the number of inflight flows a queue can have
+	 * at one time.  It has an upper bound of 2048, but can be
+	 * over-subscribed. 512 is chosen so that a single queue doesn't use
+	 * the entire atomic storage, but can use a substantial portion if
+	 * needed.
+	 */
+	r10.field.qid_fid_limit = 512;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_AQED_PIPE_QID_FID_LIM(queue->id.phys_id),
+		    r10.val);
+
+	/* Configure SNs */
+	sn_group = &hw->rsrcs.sn_groups[queue->sn_group];
+	r11.field.mode = sn_group->mode;
+	r11.field.slot = queue->sn_slot;
+	r11.field.grp  = sn_group->id;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_ORD_QID_SN_MAP(queue->id.phys_id), r11.val);
+
+	r12.field.sn_cfg_v = (args->num_sequence_numbers != 0);
+	r12.field.fid_cfg_v = (args->num_atomic_inflights != 0);
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_QID_CFG_V(queue->id.phys_id), r12.val);
+
+	if (vdev_req) {
+		offs = vdev_id * DLB2_MAX_NUM_LDB_QUEUES + queue->id.virt_id;
+
+		r0.field.vqid_v = 1;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_LDB_VQID_V(offs), r0.val);
+
+		r1.field.qid = queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_LDB_VQID2QID(offs), r1.val);
+
+		r2.field.vqid = queue->id.virt_id;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_QID2VQID(queue->id.phys_id),
+			    r2.val);
+	}
+
+	r13.field.qid_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_QID_V(queue->id.phys_id), r13.val);
+}
+
+static void dlb2_configure_dir_queue(struct dlb2_hw *hw,
+				     struct dlb2_hw_domain *domain,
+				     struct dlb2_dir_pq_pair *queue,
+				     struct dlb2_create_dir_queue_args *args,
+				     bool vdev_req,
+				     unsigned int vdev_id)
+{
+	union dlb2_sys_dir_vasqid_v r0 = { {0} };
+	union dlb2_sys_dir_qid_its r1 = { {0} };
+	union dlb2_lsp_qid_dir_depth_thrsh r2 = { {0} };
+	union dlb2_sys_dir_qid_v r5 = { {0} };
+
+	unsigned int offs;
+
+	/* QID write permissions are turned on when the domain is started */
+	r0.field.vasqid_v = 0;
+
+	offs = domain->id.phys_id * DLB2_MAX_NUM_DIR_QUEUES +
+		queue->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_VASQID_V(offs), r0.val);
+
+	/* Don't timestamp QEs that pass through this queue */
+	r1.field.qid_its = 0;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_QID_ITS(queue->id.phys_id),
+		    r1.val);
+
+	r2.field.thresh = args->depth_threshold;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_QID_DIR_DEPTH_THRSH(queue->id.phys_id),
+		    r2.val);
+
+	if (vdev_req) {
+		union dlb2_sys_vf_dir_vqid_v r3 = { {0} };
+		union dlb2_sys_vf_dir_vqid2qid r4 = { {0} };
+
+		offs = vdev_id * DLB2_MAX_NUM_DIR_QUEUES + queue->id.virt_id;
+
+		r3.field.vqid_v = 1;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_DIR_VQID_V(offs), r3.val);
+
+		r4.field.qid = queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_DIR_VQID2QID(offs), r4.val);
+	}
+
+	r5.field.qid_v = 1;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_QID_V(queue->id.phys_id), r5.val);
+
+	queue->queue_configured = true;
+}
+
 static bool dlb2_port_find_slot(struct dlb2_ldb_port *port,
 				enum dlb2_qid_map_state state,
 				int *slot)
@@ -955,6 +1337,68 @@ dlb2_domain_attach_resources(struct dlb2_hw *hw,
 	return 0;
 }
 
+static int
+dlb2_ldb_queue_attach_to_sn_group(struct dlb2_hw *hw,
+				  struct dlb2_ldb_queue *queue,
+				  struct dlb2_create_ldb_queue_args *args)
+{
+	int slot = -1;
+	int i;
+
+	queue->sn_cfg_valid = false;
+
+	if (args->num_sequence_numbers == 0)
+		return 0;
+
+	for (i = 0; i < DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+		struct dlb2_sn_group *group = &hw->rsrcs.sn_groups[i];
+
+		if (group->sequence_numbers_per_queue ==
+		    args->num_sequence_numbers &&
+		    !dlb2_sn_group_full(group)) {
+			slot = dlb2_sn_group_alloc_slot(group);
+			if (slot >= 0)
+				break;
+		}
+	}
+
+	if (slot == -1) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no sequence number slots available\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	queue->sn_cfg_valid = true;
+	queue->sn_group = i;
+	queue->sn_slot = slot;
+	return 0;
+}
+
+static int
+dlb2_ldb_queue_attach_resources(struct dlb2_hw *hw,
+				struct dlb2_hw_domain *domain,
+				struct dlb2_ldb_queue *queue,
+				struct dlb2_create_ldb_queue_args *args)
+{
+	int ret;
+
+	ret = dlb2_ldb_queue_attach_to_sn_group(hw, queue, args);
+	if (ret)
+		return ret;
+
+	/* Attach QID inflights */
+	queue->num_qid_inflights = args->num_qid_inflights;
+
+	/* Attach atomic inflights */
+	queue->aqed_limit = args->num_atomic_inflights;
+
+	domain->num_avail_aqed_entries -= args->num_atomic_inflights;
+	domain->num_used_aqed_entries += args->num_atomic_inflights;
+
+	return 0;
+}
+
 static void dlb2_ldb_port_cq_enable(struct dlb2_hw *hw,
 				    struct dlb2_ldb_port *port)
 {
@@ -1670,6 +2114,203 @@ int dlb2_hw_create_sched_domain(struct dlb2_hw *hw,
 }
 
 static void
+dlb2_log_create_ldb_queue_args(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_create_ldb_queue_args *args,
+			       bool vdev_req,
+			       unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 create load-balanced queue arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID:                  %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tNumber of sequence numbers: %d\n",
+		    args->num_sequence_numbers);
+	DLB2_HW_DBG(hw, "\tNumber of QID inflights:    %d\n",
+		    args->num_qid_inflights);
+	DLB2_HW_DBG(hw, "\tNumber of ATM inflights:    %d\n",
+		    args->num_atomic_inflights);
+}
+
+/**
+ * dlb2_hw_create_ldb_queue() - Allocate and initialize a DLB LDB queue.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb2_hw_create_ldb_queue(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_create_ldb_queue_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_req,
+			     unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+	int ret;
+
+	dlb2_log_create_ldb_queue_args(hw, domain_id, args, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_create_ldb_queue_args(hw,
+						domain_id,
+						args,
+						resp,
+						vdev_req,
+						vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	queue = DLB2_DOM_LIST_HEAD(domain->avail_ldb_queues, typeof(*queue));
+
+	if (!queue) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no available ldb queues\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	ret = dlb2_ldb_queue_attach_resources(hw, domain, queue, args);
+	if (ret < 0) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: failed to attach the ldb queue resources\n",
+			    __func__, __LINE__);
+		return ret;
+	}
+
+	dlb2_configure_ldb_queue(hw, domain, queue, args, vdev_req, vdev_id);
+
+	queue->num_mappings = 0;
+
+	queue->configured = true;
+
+	/*
+	 * Configuration succeeded, so move the resource from the 'avail' to
+	 * the 'used' list.
+	 */
+	list_del(&queue->domain_list);
+
+	list_add(&queue->domain_list, &domain->used_ldb_queues);
+
+	resp->status = 0;
+	resp->id = (vdev_req) ? queue->id.virt_id : queue->id.phys_id;
+
+	return 0;
+}
+
+static void
+dlb2_log_create_dir_queue_args(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_create_dir_queue_args *args,
+			       bool vdev_req,
+			       unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 create directed queue arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n", domain_id);
+	DLB2_HW_DBG(hw, "\tPort ID:   %d\n", args->port_id);
+}
+
+/**
+ * dlb2_hw_create_dir_queue() - Allocate and initialize a DLB DIR queue.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb2_hw_create_dir_queue(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_create_dir_queue_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_req,
+			     unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *queue;
+	struct dlb2_hw_domain *domain;
+	int ret;
+
+	dlb2_log_create_dir_queue_args(hw, domain_id, args, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_create_dir_queue_args(hw,
+						domain_id,
+						args,
+						resp,
+						vdev_req,
+						vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	if (args->port_id != -1)
+		queue = dlb2_get_domain_used_dir_pq(args->port_id,
+						    vdev_req,
+						    domain);
+	else
+		queue = DLB2_DOM_LIST_HEAD(domain->avail_dir_pq_pairs,
+					   typeof(*queue));
+
+	if (!queue) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no available dir queues\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	dlb2_configure_dir_queue(hw, domain, queue, args, vdev_req, vdev_id);
+
+	/*
+	 * Configuration succeeded, so move the resource from the 'avail' to
+	 * the 'used' list (if it's not already there).
+	 */
+	if (args->port_id == -1) {
+		list_del(&queue->domain_list);
+
+		list_add(&queue->domain_list, &domain->used_dir_pq_pairs);
+	}
+
+	resp->status = 0;
+
+	resp->id = (vdev_req) ? queue->id.virt_id : queue->id.phys_id;
+
+	return 0;
+}
+
+static void
 dlb2_domain_finish_unmap_port_slot(struct dlb2_hw *hw,
 				   struct dlb2_hw_domain *domain,
 				   struct dlb2_ldb_port *port,
@@ -2132,6 +2773,54 @@ static bool dlb2_dir_queue_is_empty(struct dlb2_hw *hw,
 	return dlb2_dir_queue_depth(hw, queue) == 0;
 }
 
+static void dlb2_log_get_dir_queue_depth(struct dlb2_hw *hw,
+					 u32 domain_id,
+					 u32 queue_id,
+					 bool vdev_req,
+					 unsigned int vf_id)
+{
+	DLB2_HW_DBG(hw, "DLB get directed queue depth:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from VF %d)\n", vf_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n", domain_id);
+	DLB2_HW_DBG(hw, "\tQueue ID: %d\n", queue_id);
+}
+
+int dlb2_hw_get_dir_queue_depth(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_get_dir_queue_depth_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_req,
+				unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *queue;
+	struct dlb2_hw_domain *domain;
+	int id;
+
+	id = domain_id;
+
+	dlb2_log_get_dir_queue_depth(hw, domain_id, args->queue_id,
+				     vdev_req, vdev_id);
+
+	domain = dlb2_get_domain_from_id(hw, id, vdev_req, vdev_id);
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	id = args->queue_id;
+
+	queue = dlb2_get_domain_used_dir_pq(id, vdev_req, domain);
+	if (!queue) {
+		resp->status = DLB2_ST_INVALID_QID;
+		return -EINVAL;
+	}
+
+	resp->id = dlb2_dir_queue_depth(hw, queue);
+
+	return 0;
+}
+
 static u32 dlb2_ldb_queue_depth(struct dlb2_hw *hw,
 				struct dlb2_ldb_queue *queue)
 {
@@ -2156,6 +2845,49 @@ static bool dlb2_ldb_queue_is_empty(struct dlb2_hw *hw,
 	return dlb2_ldb_queue_depth(hw, queue) == 0;
 }
 
+static void dlb2_log_get_ldb_queue_depth(struct dlb2_hw *hw,
+					 u32 domain_id,
+					 u32 queue_id,
+					 bool vdev_req,
+					 unsigned int vf_id)
+{
+	DLB2_HW_DBG(hw, "DLB get load-balanced queue depth:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from VF %d)\n", vf_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n", domain_id);
+	DLB2_HW_DBG(hw, "\tQueue ID: %d\n", queue_id);
+}
+
+int dlb2_hw_get_ldb_queue_depth(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_get_ldb_queue_depth_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_req,
+				unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+
+	dlb2_log_get_ldb_queue_depth(hw, domain_id, args->queue_id,
+				     vdev_req, vdev_id);
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	queue = dlb2_get_domain_ldb_queue(args->queue_id, vdev_req, domain);
+	if (!queue) {
+		resp->status = DLB2_ST_INVALID_QID;
+		return -EINVAL;
+	}
+
+	resp->id = dlb2_ldb_queue_depth(hw, queue);
+
+	return 0;
+}
+
 static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
 						   struct dlb2_ldb_port *port)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 212039c4554a..476230ffe4b6 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -70,6 +70,73 @@ int dlb2_hw_create_sched_domain(struct dlb2_hw *hw,
 				unsigned int vdev_id);
 
 /**
+ * dlb2_hw_create_ldb_queue() - create a load-balanced queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue creation arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function creates a load-balanced queue.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * resp->id contains a virtual ID if vdev_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ *	    the domain has already been started, or the requested queue name is
+ *	    already in use.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_create_ldb_queue(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_create_ldb_queue_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_request,
+			     unsigned int vdev_id);
+
+/**
+ * dlb2_hw_create_dir_queue() - create a directed queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue creation arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function creates a directed queue.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * resp->id contains a virtual ID if vdev_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ *	    or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_create_dir_queue(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_create_dir_queue_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_request,
+			     unsigned int vdev_id);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
@@ -130,6 +197,64 @@ int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
  */
 void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw);
 
+/**
+ * dlb2_hw_get_ldb_queue_depth() - returns the depth of a load-balanced queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue depth args
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function returns the depth of a load-balanced queue.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the depth.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID or queue ID.
+ */
+int dlb2_hw_get_ldb_queue_depth(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_get_ldb_queue_depth_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_request,
+				unsigned int vdev_id);
+
+/**
+ * dlb2_hw_get_dir_queue_depth() - returns the depth of a directed queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue depth args
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function returns the depth of a directed queue.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the depth.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID or queue ID.
+ */
+int dlb2_hw_get_dir_queue_depth(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_get_dir_queue_depth_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_request,
+				unsigned int vdev_id);
+
 enum dlb2_virt_mode {
 	DLB2_VIRT_NONE,
 	DLB2_VIRT_SRIOV,
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 95e0d2672abf..04c469e09529 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -308,6 +308,132 @@ enum dlb2_user_interface_commands {
 	NUM_DLB2_CMD,
 };
 
+/*********************************/
+/* 'domain' device file commands */
+/*********************************/
+
+/*
+ * DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE: Configure a load-balanced queue.
+ * Input parameters:
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ *	storage for this queue. If zero, the queue will not support atomic
+ *	scheduling.
+ * - num_sequence_numbers: This specifies the number of sequence numbers used
+ *	by this queue. If zero, the queue will not support ordered scheduling.
+ *	If non-zero, the queue will not support unordered scheduling.
+ * - num_qid_inflights: The maximum number of QEs that can be inflight
+ *	(scheduled to a CQ but not completed) at any time. If
+ *	num_sequence_numbers is non-zero, num_qid_inflights must be set equal
+ *	to num_sequence_numbers.
+ * - lock_id_comp_level: Lock ID compression level. Specifies the number of
+ *	unique lock IDs the queue should compress down to. Valid compression
+ *	levels: 0, 64, 128, 256, 512, 1k, 2k, 4k, 64k. If lock_id_comp_level is
+ *	0, the queue won't compress its lock IDs.
+ * - depth_threshold: DLB sets two bits in the received QE to indicate the
+ *	depth of the queue relative to the threshold before scheduling the
+ *	QE to a CQ:
+ *	- 2'b11: depth > threshold
+ *	- 2'b10: threshold >= depth > 0.75 * threshold
+ *	- 2'b01: 0.75 * threshold >= depth > 0.5 * threshold
+ *	- 2'b00: depth <= 0.5 * threshold
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Queue ID.
+ */
+struct dlb2_create_ldb_queue_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 num_sequence_numbers;
+	__u32 num_qid_inflights;
+	__u32 num_atomic_inflights;
+	__u32 lock_id_comp_level;
+	__u32 depth_threshold;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE: Configure a directed queue.
+ * Input parameters:
+ * - port_id: Port ID. If the corresponding directed port is already created,
+ *	specify its ID here. Else this argument must be 0xFFFFFFFF to indicate
+ *	that the queue is being created before the port.
+ * - depth_threshold: DLB sets two bits in the received QE to indicate the
+ *	depth of the queue relative to the threshold before scheduling the
+ *	QE to a CQ:
+ *	- 2'b11: depth > threshold
+ *	- 2'b10: threshold >= depth > 0.75 * threshold
+ *	- 2'b01: 0.75 * threshold >= depth > 0.5 * threshold
+ *	- 2'b00: depth <= 0.5 * threshold
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Queue ID.
+ */
+struct dlb2_create_dir_queue_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__s32 port_id;
+	__u32 depth_threshold;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
+ * Input parameters:
+ * - queue_id: The load-balanced queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: queue depth.
+ */
+struct dlb2_get_ldb_queue_depth_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 queue_id;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_DIR_QUEUE_DEPTH: Get a directed queue's depth.
+ * Input parameters:
+ * - queue_id: The directed queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: queue depth.
+ */
+struct dlb2_get_dir_queue_depth_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 queue_id;
+	__u32 padding0;
+};
+
+enum dlb2_domain_user_interface_commands {
+	DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,
+	DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
+	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
+	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
+
+	/* NUM_DLB2_DOMAIN_CMD must be last */
+	NUM_DLB2_DOMAIN_CMD,
+};
+
 /********************/
 /* dlb2 ioctl codes */
 /********************/
@@ -334,5 +460,21 @@ enum dlb2_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_DRIVER_VERSION,		\
 		      struct dlb2_get_driver_version_args)
+#define DLB2_IOC_CREATE_LDB_QUEUE				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,		\
+		      struct dlb2_create_ldb_queue_args)
+#define DLB2_IOC_CREATE_DIR_QUEUE				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,		\
+		      struct dlb2_create_dir_queue_args)
+#define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
+		      struct dlb2_get_ldb_queue_depth_args)
+#define DLB2_IOC_GET_DIR_QUEUE_DEPTH				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,	\
+		      struct dlb2_get_dir_queue_depth_args)
 
 #endif /* __DLB2_USER_H */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (7 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 08/20] dlb2: add queue create and queue-depth-get ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 15:34   ` Arnd Bergmann
  2020-07-12 13:43 ` [PATCH 10/20] dlb2: add port mmap support Gage Eads
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

The port is a core's interface to the DLB: it consists of an MMIO page
(the "producer port" (PP)) through which the core enqueues a queue entry,
and an in-memory queue (the "consumer queue" (CQ)) to which the device
schedules QEs. The driver allocates DMA memory for each port's CQ and
frees this memory during domain reset or driver removal. A subsequent
commit will add the mmap interface that lets an application directly
access the PP and CQ regions.

The device supports two CQ entry formats ("standard" and "sparse"); the
format in use is dubbed the "poll mode". This device-wide mode is selected
by the driver. To determine the mode at run time, the driver provides an
ioctl with which user-space software can query the configured mode. In this
way, the policy of which mode to use is decoupled from user-space software.
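
The per-port CQ allocation follows the usual coherent-DMA pattern; a
distilled sketch of what the port-creation ioctls below do (assuming
DLB2_CQ_SIZE as defined by this series):

#include <linux/dma-mapping.h>
#include <linux/pci.h>

static void *dlb2_alloc_cq_sketch(struct pci_dev *pdev, dma_addr_t *dma)
{
	/* One DMA-coherent CQ per port, freed on domain reset or removal. */
	return dma_alloc_coherent(&pdev->dev, DLB2_CQ_SIZE, dma, GFP_KERNEL);
}

static void dlb2_free_cq_sketch(struct pci_dev *pdev, void *cq_base,
				dma_addr_t dma)
{
	dma_free_coherent(&pdev->dev, DLB2_CQ_SIZE, cq_base, dma);
}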

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/dlb2_hw_types.h |   1 +
 drivers/misc/dlb2/dlb2_ioctl.c    | 194 ++++++++
 drivers/misc/dlb2/dlb2_main.c     |  73 +++
 drivers/misc/dlb2/dlb2_main.h     |  23 +
 drivers/misc/dlb2/dlb2_pf_ops.c   |  46 ++
 drivers/misc/dlb2/dlb2_resource.c | 922 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h |  89 ++++
 include/uapi/linux/dlb2_user.h    | 102 +++++
 8 files changed, 1450 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index 732482e364d6..8eed1e0f20a3 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -321,6 +321,7 @@ struct dlb2_hw {
 
 	/* Virtualization */
 	int virt_mode;
+	unsigned int pasid[DLB2_MAX_NUM_VDEVS];
 };
 
 #endif /* __DLB2_HW_TYPES_H */
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index b1aa0c7bf011..e9303b7df8e2 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -95,6 +95,166 @@ DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_dir_queue)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_ldb_queue_depth)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_dir_queue_depth)
 
+/*
+ * Port creation ioctls don't use the callback template macro because they have
+ * a number of OS-dependent memory operations.
+ */
+static int dlb2_domain_ioctl_create_ldb_port(struct dlb2_dev *dev,
+					     struct dlb2_domain *domain,
+					     unsigned long user_arg,
+					     u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_create_ldb_port_args arg;
+	dma_addr_t cq_dma_base = 0;
+	void *cq_base = NULL;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	cq_base = dma_alloc_coherent(&dev->pdev->dev,
+				     DLB2_CQ_SIZE,
+				     &cq_dma_base,
+				     GFP_KERNEL);
+	if (!cq_base) {
+		response.status = DLB2_ST_NO_MEMORY;
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	ret = dev->ops->create_ldb_port(&dev->hw,
+					domain->id,
+					&arg,
+					(uintptr_t)cq_dma_base,
+					&response);
+	if (ret)
+		goto unlock;
+
+	/* Fill out the per-port data structure */
+	dev->ldb_port[response.id].id = response.id;
+	dev->ldb_port[response.id].is_ldb = true;
+	dev->ldb_port[response.id].domain = domain;
+	dev->ldb_port[response.id].cq_base = cq_base;
+	dev->ldb_port[response.id].cq_dma_base = cq_dma_base;
+	dev->ldb_port[response.id].valid = true;
+
+unlock:
+	if (ret) {
+		dev_err(dev->dlb2_device, "[%s()]: Error %s\n",
+			__func__, dlb2_error_strings[response.status]);
+
+		if (cq_dma_base)
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  cq_base,
+					  cq_dma_base);
+	} else {
+		dev_dbg(dev->dlb2_device, "CQ PA: 0x%llx\n",
+			virt_to_phys(cq_base));
+		dev_dbg(dev->dlb2_device, "CQ IOVA: 0x%llx\n", cq_dma_base);
+	}
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_create_dir_port(struct dlb2_dev *dev,
+					     struct dlb2_domain *domain,
+					     unsigned long user_arg,
+					     u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_create_dir_port_args arg;
+	dma_addr_t cq_dma_base = 0;
+	void *cq_base = NULL;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	cq_base = dma_alloc_coherent(&dev->pdev->dev,
+				     DLB2_CQ_SIZE,
+				     &cq_dma_base,
+				     GFP_KERNEL);
+	if (!cq_base) {
+		response.status = DLB2_ST_NO_MEMORY;
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	ret = dev->ops->create_dir_port(&dev->hw,
+					domain->id,
+					&arg,
+					(uintptr_t)cq_dma_base,
+					&response);
+	if (ret)
+		goto unlock;
+
+	/* Fill out the per-port data structure */
+	dev->dir_port[response.id].id = response.id;
+	dev->dir_port[response.id].is_ldb = false;
+	dev->dir_port[response.id].domain = domain;
+	dev->dir_port[response.id].cq_base = cq_base;
+	dev->dir_port[response.id].cq_dma_base = cq_dma_base;
+	dev->dir_port[response.id].valid = true;
+
+unlock:
+	if (ret) {
+		dev_err(dev->dlb2_device, "[%s()]: Error %s\n",
+			__func__, dlb2_error_strings[response.status]);
+
+		if (cq_dma_base)
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  cq_base,
+					  cq_dma_base);
+	} else {
+		dev_dbg(dev->dlb2_device, "CQ PA: 0x%llx\n",
+			virt_to_phys(cq_base));
+		dev_dbg(dev->dlb2_device, "CQ IOVA: 0x%llx\n", cq_dma_base);
+	}
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 typedef int (*dlb2_domain_ioctl_callback_fn_t)(struct dlb2_dev *dev,
 					       struct dlb2_domain *domain,
 					       unsigned long arg,
@@ -104,6 +264,8 @@ static dlb2_domain_ioctl_callback_fn_t
 dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_create_ldb_queue,
 	dlb2_domain_ioctl_create_dir_queue,
+	dlb2_domain_ioctl_create_ldb_port,
+	dlb2_domain_ioctl_create_dir_port,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
 };
@@ -349,6 +511,37 @@ static int dlb2_ioctl_get_driver_version(struct dlb2_dev *dev,
 	return 0;
 }
 
+static int dlb2_ioctl_query_cq_poll_mode(struct dlb2_dev *dev,
+					 unsigned long user_arg,
+					 u16 size)
+{
+	struct dlb2_query_cq_poll_mode_args arg;
+	struct dlb2_cmd_response response = {0};
+	int ret;
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->query_cq_poll_mode(dev, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	return ret;
+}
+
 typedef int (*dlb2_ioctl_callback_fn_t)(struct dlb2_dev *dev,
 					unsigned long arg,
 					u16 size);
@@ -359,6 +552,7 @@ static dlb2_ioctl_callback_fn_t dlb2_ioctl_callback_fns[NUM_DLB2_CMD] = {
 	dlb2_ioctl_get_sched_domain_fd,
 	dlb2_ioctl_get_num_resources,
 	dlb2_ioctl_get_driver_version,
+	dlb2_ioctl_query_cq_poll_mode,
 };
 
 int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index a19bc542f637..ad0b1a9fb768 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -135,11 +135,82 @@ int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id)
 	return 0;
 }
 
+static void dlb2_release_domain_memory(struct dlb2_dev *dev, u32 domain_id)
+{
+	struct dlb2_port *port;
+	int i;
+
+	dev_dbg(dev->dlb2_device,
+		"Releasing memory allocated for domain %d\n", domain_id);
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
+		port = &dev->ldb_port[i];
+
+		if (port->valid && port->domain->id == domain_id) {
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  port->cq_base,
+					  port->cq_dma_base);
+
+			port->valid = false;
+		}
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++) {
+		port = &dev->dir_port[i];
+
+		if (port->valid && port->domain->id == domain_id) {
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  port->cq_base,
+					  port->cq_dma_base);
+
+			port->valid = false;
+		}
+	}
+}
+
+static void dlb2_release_device_memory(struct dlb2_dev *dev)
+{
+	struct dlb2_port *port;
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
+		port = &dev->ldb_port[i];
+
+		if (port->valid) {
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  port->cq_base,
+					  port->cq_dma_base);
+
+			port->valid = false;
+		}
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++) {
+		port = &dev->dir_port[i];
+
+		if (port->valid) {
+			dma_free_coherent(&dev->pdev->dev,
+					  DLB2_CQ_SIZE,
+					  port->cq_base,
+					  port->cq_dma_base);
+
+			port->valid = false;
+		}
+	}
+}
+
 static int __dlb2_free_domain(struct dlb2_dev *dev, struct dlb2_domain *domain)
 {
 	int ret = 0;
 
 	ret = dev->ops->reset_domain(&dev->hw, domain->id);
+
+	/* Free the CQ memory allocated for the domain's ports */
+	dlb2_release_domain_memory(dev, domain->id);
+
 	if (ret) {
 		dev->domain_reset_failed = true;
 		dev_err(dev->dlb2_device,
@@ -376,6 +447,8 @@ static void dlb2_remove(struct pci_dev *pdev)
 
 	dlb2_resource_free(&dlb2_dev->hw);
 
+	dlb2_release_device_memory(dlb2_dev);
+
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
 
 	dlb2_dev->ops->cdev_del(dlb2_dev);
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index f89888fa781a..fd6381b537a2 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -68,6 +68,16 @@ struct dlb2_device_ops {
 				u32 domain_id,
 				struct dlb2_create_dir_queue_args *args,
 				struct dlb2_cmd_response *resp);
+	int (*create_ldb_port)(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_create_ldb_port_args *args,
+			       uintptr_t cq_dma_base,
+			       struct dlb2_cmd_response *resp);
+	int (*create_dir_port)(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_create_dir_port_args *args,
+			       uintptr_t cq_dma_base,
+			       struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
@@ -80,11 +90,22 @@ struct dlb2_device_ops {
 				   struct dlb2_get_dir_queue_depth_args *args,
 				   struct dlb2_cmd_response *resp);
 	void (*init_hardware)(struct dlb2_dev *dev);
+	int (*query_cq_poll_mode)(struct dlb2_dev *dev,
+				  struct dlb2_cmd_response *user_resp);
 };
 
 extern struct dlb2_device_ops dlb2_pf_ops;
 extern const struct file_operations dlb2_domain_fops;
 
+struct dlb2_port {
+	void *cq_base;
+	dma_addr_t cq_dma_base;
+	struct dlb2_domain *domain;
+	int id;
+	u8 is_ldb;
+	u8 valid;
+};
+
 struct dlb2_domain {
 	struct dlb2_dev *dlb2_dev;
 	struct kref refcnt;
@@ -99,6 +120,8 @@ struct dlb2_dev {
 	struct list_head list;
 	struct device *dlb2_device;
 	struct dlb2_domain *sched_domains[DLB2_MAX_NUM_DOMAINS];
+	struct dlb2_port ldb_port[DLB2_MAX_NUM_LDB_PORTS];
+	struct dlb2_port dir_port[DLB2_MAX_NUM_DIR_PORTS];
 	/*
 	 * The resource mutex serializes access to driver data structures and
 	 * hardware registers.
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index ab36d253f396..f60ef7daca54 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -217,9 +217,16 @@ dlb2_pf_wait_for_device_ready(struct dlb2_dev *dlb2_dev,
 	return 0;
 }
 
+static bool dlb2_sparse_cq_enabled = true;
+
 static void
 dlb2_pf_init_hardware(struct dlb2_dev *dlb2_dev)
 {
+	if (dlb2_sparse_cq_enabled) {
+		dlb2_hw_enable_sparse_ldb_cq_mode(&dlb2_dev->hw);
+
+		dlb2_hw_enable_sparse_dir_cq_mode(&dlb2_dev->hw);
+	}
 }
 
 /*****************************/
@@ -253,6 +260,28 @@ dlb2_pf_create_dir_queue(struct dlb2_hw *hw,
 }
 
 static int
+dlb2_pf_create_ldb_port(struct dlb2_hw *hw,
+			u32 id,
+			struct dlb2_create_ldb_port_args *args,
+			uintptr_t cq_dma_base,
+			struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_create_ldb_port(hw, id, args, cq_dma_base,
+				       resp, false, 0);
+}
+
+static int
+dlb2_pf_create_dir_port(struct dlb2_hw *hw,
+			u32 id,
+			struct dlb2_create_dir_port_args *args,
+			uintptr_t cq_dma_base,
+			struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_create_dir_port(hw, id, args, cq_dma_base,
+				       resp, false, 0);
+}
+
+static int
 dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 			  struct dlb2_get_num_resources_args *args)
 {
@@ -283,6 +312,20 @@ dlb2_pf_get_dir_queue_depth(struct dlb2_hw *hw,
 	return dlb2_hw_get_dir_queue_depth(hw, id, args, resp, false, 0);
 }
 
+static int
+dlb2_pf_query_cq_poll_mode(struct dlb2_dev *dlb2_dev,
+			   struct dlb2_cmd_response *user_resp)
+{
+	user_resp->status = 0;
+
+	if (dlb2_sparse_cq_enabled)
+		user_resp->id = DLB2_CQ_POLL_MODE_SPARSE;
+	else
+		user_resp->id = DLB2_CQ_POLL_MODE_STD;
+
+	return 0;
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -303,9 +346,12 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.create_sched_domain = dlb2_pf_create_sched_domain,
 	.create_ldb_queue = dlb2_pf_create_ldb_queue,
 	.create_dir_queue = dlb2_pf_create_dir_queue,
+	.create_ldb_port = dlb2_pf_create_ldb_port,
+	.create_dir_port = dlb2_pf_create_dir_port,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
 	.get_ldb_queue_depth = dlb2_pf_get_ldb_queue_depth,
 	.get_dir_queue_depth = dlb2_pf_get_dir_queue_depth,
 	.init_hardware = dlb2_pf_init_hardware,
+	.query_cq_poll_mode = dlb2_pf_query_cq_poll_mode,
 };
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 512b458b14e6..2b3d10975f18 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -1089,6 +1089,171 @@ static void dlb2_configure_dir_queue(struct dlb2_hw *hw,
 	queue->queue_configured = true;
 }
 
+static int
+dlb2_verify_create_ldb_port_args(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 uintptr_t cq_dma_base,
+				 struct dlb2_create_ldb_port_args *args,
+				 struct dlb2_cmd_response *resp,
+				 bool vdev_req,
+				 unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	int i;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	if (domain->started) {
+		resp->status = DLB2_ST_DOMAIN_STARTED;
+		return -EINVAL;
+	}
+
+	if (args->cos_id >= DLB2_NUM_COS_DOMAINS) {
+		resp->status = DLB2_ST_INVALID_COS_ID;
+		return -EINVAL;
+	}
+
+	if (args->cos_strict) {
+		if (list_empty(&domain->avail_ldb_ports[args->cos_id])) {
+			resp->status = DLB2_ST_LDB_PORTS_UNAVAILABLE;
+			return -EINVAL;
+		}
+	} else {
+		for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+			if (!list_empty(&domain->avail_ldb_ports[i]))
+				break;
+		}
+
+		if (i == DLB2_NUM_COS_DOMAINS) {
+			resp->status = DLB2_ST_LDB_PORTS_UNAVAILABLE;
+			return -EINVAL;
+		}
+	}
+
+	/* Check cache-line alignment */
+	if ((cq_dma_base & 0x3F) != 0) {
+		resp->status = DLB2_ST_INVALID_CQ_VIRT_ADDR;
+		return -EINVAL;
+	}
+
+	/* The CQ depth must be a power-of-2 between 1 and 1024 */
+	if (args->cq_depth == 0 || args->cq_depth > 1024 ||
+	    (args->cq_depth & (args->cq_depth - 1)) != 0) {
+		resp->status = DLB2_ST_INVALID_CQ_DEPTH;
+		return -EINVAL;
+	}
+
+	/* The history list size must be >= 1 */
+	if (!args->cq_history_list_size) {
+		resp->status = DLB2_ST_INVALID_HIST_LIST_DEPTH;
+		return -EINVAL;
+	}
+
+	if (args->cq_history_list_size > domain->avail_hist_list_entries) {
+		resp->status = DLB2_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+dlb2_verify_create_dir_port_args(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 uintptr_t cq_dma_base,
+				 struct dlb2_create_dir_port_args *args,
+				 struct dlb2_cmd_response *resp,
+				 bool vdev_req,
+				 unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	if (domain->started) {
+		resp->status = DLB2_ST_DOMAIN_STARTED;
+		return -EINVAL;
+	}
+
+	/*
+	 * If the user provides a queue ID (i.e. the directed queue was
+	 * created first), validate that the ID is in range, the queue
+	 * belongs to this domain, and the queue is configured.
+	 */
+	if (args->queue_id != -1) {
+		struct dlb2_dir_pq_pair *queue;
+
+		queue = dlb2_get_domain_used_dir_pq(args->queue_id,
+						    vdev_req,
+						    domain);
+
+		if (!queue || queue->domain_id.phys_id != domain->id.phys_id ||
+		    !queue->queue_configured) {
+			resp->status = DLB2_ST_INVALID_DIR_QUEUE_ID;
+			return -EINVAL;
+		}
+	}
+
+	/*
+	 * If the port's queue is not configured, validate that a free
+	 * port-queue pair is available.
+	 */
+	if (args->queue_id == -1 && list_empty(&domain->avail_dir_pq_pairs)) {
+		resp->status = DLB2_ST_DIR_PORTS_UNAVAILABLE;
+		return -EINVAL;
+	}
+
+	/* Check cache-line alignment */
+	if ((cq_dma_base & 0x3F) != 0) {
+		resp->status = DLB2_ST_INVALID_CQ_VIRT_ADDR;
+		return -EINVAL;
+	}
+
+	/* The CQ depth must be a power-of-2 between 1 and 1024 */
+	if (args->cq_depth == 0 || args->cq_depth > 1024 ||
+	    (args->cq_depth & (args->cq_depth - 1)) != 0) {
+		resp->status = DLB2_ST_INVALID_CQ_DEPTH;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static bool dlb2_port_find_slot(struct dlb2_ldb_port *port,
 				enum dlb2_qid_map_state state,
 				int *slot)
@@ -1455,6 +1620,495 @@ static void dlb2_dir_port_cq_disable(struct dlb2_hw *hw,
 	dlb2_flush_csr(hw);
 }
 
+static void dlb2_ldb_port_configure_pp(struct dlb2_hw *hw,
+				       struct dlb2_hw_domain *domain,
+				       struct dlb2_ldb_port *port,
+				       bool vdev_req,
+				       unsigned int vdev_id)
+{
+	union dlb2_sys_ldb_pp2vas r0 = { {0} };
+	union dlb2_sys_ldb_pp_v r4 = { {0} };
+
+	r0.field.vas = domain->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_PP2VAS(port->id.phys_id), r0.val);
+
+	if (vdev_req) {
+		union dlb2_sys_vf_ldb_vpp2pp r1 = { {0} };
+		union dlb2_sys_ldb_pp2vdev r2 = { {0} };
+		union dlb2_sys_vf_ldb_vpp_v r3 = { {0} };
+		unsigned int offs;
+		u32 virt_id;
+
+		/*
+		 * DLB uses producer port address bits 17:12 to determine the
+		 * producer port ID. In Scalable IOV mode, PP accesses come
+		 * through the PF MMIO window for the physical producer port,
+		 * so for translation purposes the virtual and physical port
+		 * IDs are equal.
+		 */
+		if (hw->virt_mode == DLB2_VIRT_SRIOV)
+			virt_id = port->id.virt_id;
+		else
+			virt_id = port->id.phys_id;
+
+		r1.field.pp = port->id.phys_id;
+
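+		/*
+		 * The VF producer port registers are arrayed by vdev:
+		 * vdev_id selects the block, virt_id the entry within it.
+		 */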
+		offs = vdev_id * DLB2_MAX_NUM_LDB_PORTS + virt_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_LDB_VPP2PP(offs), r1.val);
+
+		r2.field.vdev = vdev_id;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_PP2VDEV(port->id.phys_id),
+			    r2.val);
+
+		r3.field.vpp_v = 1;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_LDB_VPP_V(offs), r3.val);
+	}
+
+	r4.field.pp_v = 1;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_PP_V(port->id.phys_id),
+		    r4.val);
+}
+
+static int dlb2_ldb_port_configure_cq(struct dlb2_hw *hw,
+				      struct dlb2_hw_domain *domain,
+				      struct dlb2_ldb_port *port,
+				      uintptr_t cq_dma_base,
+				      struct dlb2_create_ldb_port_args *args,
+				      bool vdev_req,
+				      unsigned int vdev_id)
+{
+	union dlb2_sys_ldb_cq_addr_l r0 = { {0} };
+	union dlb2_sys_ldb_cq_addr_u r1 = { {0} };
+	union dlb2_sys_ldb_cq2vf_pf_ro r2 = { {0} };
+	union dlb2_chp_ldb_cq_tkn_depth_sel r3 = { {0} };
+	union dlb2_lsp_cq_ldb_tkn_depth_sel r4 = { {0} };
+	union dlb2_chp_hist_list_lim r5 = { {0} };
+	union dlb2_chp_hist_list_base r6 = { {0} };
+	union dlb2_lsp_cq_ldb_infl_lim r7 = { {0} };
+	union dlb2_chp_hist_list_push_ptr r8 = { {0} };
+	union dlb2_chp_hist_list_pop_ptr r9 = { {0} };
+	union dlb2_sys_ldb_cq_at r10 = { {0} };
+	union dlb2_sys_ldb_cq_pasid r11 = { {0} };
+	union dlb2_chp_ldb_cq2vas r12 = { {0} };
+	union dlb2_lsp_cq2priov r13 = { {0} };
+
+	/* The CQ address is 64B-aligned, and the DLB only wants bits [63:6] */
+	r0.field.addr_l = cq_dma_base >> 6;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_CQ_ADDR_L(port->id.phys_id), r0.val);
+
+	r1.field.addr_u = cq_dma_base >> 32;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_CQ_ADDR_U(port->id.phys_id), r1.val);
+
+	/*
+	 * 'ro' == relaxed ordering. This setting allows DLB2 to write
+	 * cache lines out-of-order (but QEs within a cache line are always
+	 * updated in-order).
+	 */
+	r2.field.vf = vdev_id;
+	r2.field.is_pf = !vdev_req && (hw->virt_mode != DLB2_VIRT_SIOV);
+	r2.field.ro = 1;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_CQ2VF_PF_RO(port->id.phys_id), r2.val);
+
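+	/*
+	 * token_depth_select encodes the CQ depth as log2(cq_depth) - 2;
+	 * depths below 8 reuse the 8-entry encoding and are compensated
+	 * for by the nonzero initial token count programmed below.
+	 */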
+	if (args->cq_depth <= 8) {
+		r3.field.token_depth_select = 1;
+	} else if (args->cq_depth == 16) {
+		r3.field.token_depth_select = 2;
+	} else if (args->cq_depth == 32) {
+		r3.field.token_depth_select = 3;
+	} else if (args->cq_depth == 64) {
+		r3.field.token_depth_select = 4;
+	} else if (args->cq_depth == 128) {
+		r3.field.token_depth_select = 5;
+	} else if (args->cq_depth == 256) {
+		r3.field.token_depth_select = 6;
+	} else if (args->cq_depth == 512) {
+		r3.field.token_depth_select = 7;
+	} else if (args->cq_depth == 1024) {
+		r3.field.token_depth_select = 8;
+	} else {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: invalid CQ depth\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+		    r3.val);
+
+	/*
+	 * To support CQs with depth less than 8, program the token count
+	 * register with a non-zero initial value. Operations such as domain
+	 * reset must take this initial value into account when quiescing the
+	 * CQ.
+	 */
+	port->init_tkn_cnt = 0;
+
+	if (args->cq_depth < 8) {
+		union dlb2_lsp_cq_ldb_tkn_cnt r14 = { {0} };
+
+		port->init_tkn_cnt = 8 - args->cq_depth;
+
+		r14.field.token_count = port->init_tkn_cnt;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_TKN_CNT(port->id.phys_id),
+			    r14.val);
+	} else {
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_TKN_CNT(port->id.phys_id),
+			    DLB2_LSP_CQ_LDB_TKN_CNT_RST);
+	}
+
+	r4.field.token_depth_select = r3.field.token_depth_select;
+	r4.field.ignore_depth = 0;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_LDB_TKN_DEPTH_SEL(port->id.phys_id),
+		    r4.val);
+
+	/* Reset the CQ write pointer */
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_WPTR(port->id.phys_id),
+		    DLB2_CHP_LDB_CQ_WPTR_RST);
+
+	r5.field.limit = port->hist_list_entry_limit - 1;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_HIST_LIST_LIM(port->id.phys_id), r5.val);
+
+	r6.field.base = port->hist_list_entry_base;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_HIST_LIST_BASE(port->id.phys_id), r6.val);
+
+	/*
+	 * The inflight limit sets a cap on the number of QEs for which this CQ
+	 * can owe completions at one time.
+	 */
+	r7.field.limit = args->cq_history_list_size;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_LIM(port->id.phys_id), r7.val);
+
+	r8.field.push_ptr = r6.field.base;
+	r8.field.generation = 0;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_HIST_LIST_PUSH_PTR(port->id.phys_id),
+		    r8.val);
+
+	r9.field.pop_ptr = r6.field.base;
+	r9.field.generation = 0;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_HIST_LIST_POP_PTR(port->id.phys_id), r9.val);
+
+	/*
+	 * Address translation (AT) settings: 0: untranslated, 2: translated
+	 * (see ATS spec regarding Address Type field for more details)
+	 */
+	r10.field.cq_at = 0;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_CQ_AT(port->id.phys_id), r10.val);
+
+	if (vdev_req && hw->virt_mode == DLB2_VIRT_SIOV) {
+		r11.field.pasid = hw->pasid[vdev_id];
+		r11.field.fmt2 = 1;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_LDB_CQ_PASID(port->id.phys_id),
+		    r11.val);
+
+	r12.field.cq2vas = domain->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_LDB_CQ2VAS(port->id.phys_id), r12.val);
+
+	/* Disable the port's QID mappings */
+	r13.field.v = 0;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(port->id.phys_id), r13.val);
+
+	return 0;
+}
+
+static int dlb2_configure_ldb_port(struct dlb2_hw *hw,
+				   struct dlb2_hw_domain *domain,
+				   struct dlb2_ldb_port *port,
+				   uintptr_t cq_dma_base,
+				   struct dlb2_create_ldb_port_args *args,
+				   bool vdev_req,
+				   unsigned int vdev_id)
+{
+	int ret, i;
+
+	port->hist_list_entry_base = domain->hist_list_entry_base +
+				     domain->hist_list_entry_offset;
+	port->hist_list_entry_limit = port->hist_list_entry_base +
+				      args->cq_history_list_size;
+
+	domain->hist_list_entry_offset += args->cq_history_list_size;
+	domain->avail_hist_list_entries -= args->cq_history_list_size;
+
+	ret = dlb2_ldb_port_configure_cq(hw,
+					 domain,
+					 port,
+					 cq_dma_base,
+					 args,
+					 vdev_req,
+					 vdev_id);
+	if (ret < 0)
+		return ret;
+
+	dlb2_ldb_port_configure_pp(hw,
+				   domain,
+				   port,
+				   vdev_req,
+				   vdev_id);
+
+	dlb2_ldb_port_cq_enable(hw, port);
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++)
+		port->qid_map[i].state = DLB2_QUEUE_UNMAPPED;
+	port->num_mappings = 0;
+
+	port->enabled = true;
+
+	port->configured = true;
+
+	return 0;
+}
+
+static void dlb2_dir_port_configure_pp(struct dlb2_hw *hw,
+				       struct dlb2_hw_domain *domain,
+				       struct dlb2_dir_pq_pair *port,
+				       bool vdev_req,
+				       unsigned int vdev_id)
+{
+	union dlb2_sys_dir_pp2vas r0 = { {0} };
+	union dlb2_sys_dir_pp_v r4 = { {0} };
+
+	r0.field.vas = domain->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_PP2VAS(port->id.phys_id), r0.val);
+
+	if (vdev_req) {
+		union dlb2_sys_vf_dir_vpp2pp r1 = { {0} };
+		union dlb2_sys_dir_pp2vdev r2 = { {0} };
+		union dlb2_sys_vf_dir_vpp_v r3 = { {0} };
+		unsigned int offs;
+		u32 virt_id;
+
+		/*
+		 * DLB uses producer port address bits 17:12 to determine the
+		 * producer port ID. In Scalable IOV mode, PP accesses come
+		 * through the PF MMIO window for the physical producer port,
+		 * so for translation purposes the virtual and physical port
+		 * IDs are equal.
+		 */
+		if (hw->virt_mode == DLB2_VIRT_SRIOV)
+			virt_id = port->id.virt_id;
+		else
+			virt_id = port->id.phys_id;
+
+		r1.field.pp = port->id.phys_id;
+
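+		/*
+		 * The VF producer port registers are arrayed by vdev:
+		 * vdev_id selects the block, virt_id the entry within it.
+		 */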
+		offs = vdev_id * DLB2_MAX_NUM_DIR_PORTS + virt_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_DIR_VPP2PP(offs), r1.val);
+
+		r2.field.vdev = vdev_id;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_PP2VDEV(port->id.phys_id),
+			    r2.val);
+
+		r3.field.vpp_v = 1;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_VF_DIR_VPP_V(offs), r3.val);
+	}
+
+	r4.field.pp_v = 1;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_PP_V(port->id.phys_id),
+		    r4.val);
+}
+
+static int dlb2_dir_port_configure_cq(struct dlb2_hw *hw,
+				      struct dlb2_hw_domain *domain,
+				      struct dlb2_dir_pq_pair *port,
+				      uintptr_t cq_dma_base,
+				      struct dlb2_create_dir_port_args *args,
+				      bool vdev_req,
+				      unsigned int vdev_id)
+{
+	union dlb2_sys_dir_cq_addr_l r0 = { {0} };
+	union dlb2_sys_dir_cq_addr_u r1 = { {0} };
+	union dlb2_sys_dir_cq2vf_pf_ro r2 = { {0} };
+	union dlb2_chp_dir_cq_tkn_depth_sel r3 = { {0} };
+	union dlb2_lsp_cq_dir_tkn_depth_sel_dsi r4 = { {0} };
+	union dlb2_sys_dir_cq_fmt r9 = { {0} };
+	union dlb2_sys_dir_cq_at r10 = { {0} };
+	union dlb2_sys_dir_cq_pasid r11 = { {0} };
+	union dlb2_chp_dir_cq2vas r12 = { {0} };
+
+	/* The CQ address is 64B-aligned, and the DLB only wants bits [63:6] */
+	r0.field.addr_l = cq_dma_base >> 6;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_ADDR_L(port->id.phys_id), r0.val);
+
+	r1.field.addr_u = cq_dma_base >> 32;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_ADDR_U(port->id.phys_id), r1.val);
+
+	/*
+	 * 'ro' == relaxed ordering. This setting allows DLB2 to write
+	 * cache lines out-of-order (but QEs within a cache line are always
+	 * updated in-order).
+	 */
+	r2.field.vf = vdev_id;
+	r2.field.is_pf = !vdev_req && (hw->virt_mode != DLB2_VIRT_SIOV);
+	r2.field.ro = 1;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ2VF_PF_RO(port->id.phys_id), r2.val);
+
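+	/*
+	 * token_depth_select encodes the CQ depth as log2(cq_depth) - 2;
+	 * depths below 8 reuse the 8-entry encoding and are compensated
+	 * for by the nonzero initial token count programmed below.
+	 */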
+	if (args->cq_depth <= 8) {
+		r3.field.token_depth_select = 1;
+	} else if (args->cq_depth == 16) {
+		r3.field.token_depth_select = 2;
+	} else if (args->cq_depth == 32) {
+		r3.field.token_depth_select = 3;
+	} else if (args->cq_depth == 64) {
+		r3.field.token_depth_select = 4;
+	} else if (args->cq_depth == 128) {
+		r3.field.token_depth_select = 5;
+	} else if (args->cq_depth == 256) {
+		r3.field.token_depth_select = 6;
+	} else if (args->cq_depth == 512) {
+		r3.field.token_depth_select = 7;
+	} else if (args->cq_depth == 1024) {
+		r3.field.token_depth_select = 8;
+	} else {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: invalid CQ depth\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+		    r3.val);
+
+	/*
+	 * To support CQs with depth less than 8, program the token count
+	 * register with a non-zero initial value. Operations such as domain
+	 * reset must take this initial value into account when quiescing the
+	 * CQ.
+	 */
+	port->init_tkn_cnt = 0;
+
+	if (args->cq_depth < 8) {
+		union dlb2_lsp_cq_dir_tkn_cnt r13 = { {0} };
+
+		port->init_tkn_cnt = 8 - args->cq_depth;
+
+		r13.field.count = port->init_tkn_cnt;
+
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_DIR_TKN_CNT(port->id.phys_id),
+			    r13.val);
+	} else {
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_DIR_TKN_CNT(port->id.phys_id),
+			    DLB2_LSP_CQ_DIR_TKN_CNT_RST);
+	}
+
+	r4.field.token_depth_select = r3.field.token_depth_select;
+	r4.field.disable_wb_opt = 0;
+	r4.field.ignore_depth = 0;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(port->id.phys_id),
+		    r4.val);
+
+	/* Reset the CQ write pointer */
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_WPTR(port->id.phys_id),
+		    DLB2_CHP_DIR_CQ_WPTR_RST);
+
+	/* Virtualize the PPID */
+	r9.field.keep_pf_ppid = 0;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_FMT(port->id.phys_id), r9.val);
+
+	/*
+	 * Address translation (AT) settings: 0: untranslated, 2: translated
+	 * (see ATS spec regarding Address Type field for more details)
+	 */
+	r10.field.cq_at = 0;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_AT(port->id.phys_id), r10.val);
+
+	if (vdev_req && hw->virt_mode == DLB2_VIRT_SIOV) {
+		r11.field.pasid = hw->pasid[vdev_id];
+		r11.field.fmt2 = 1;
+	}
+
+	DLB2_CSR_WR(hw,
+		    DLB2_SYS_DIR_CQ_PASID(port->id.phys_id),
+		    r11.val);
+
+	r12.field.cq2vas = domain->id.phys_id;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_DIR_CQ2VAS(port->id.phys_id), r12.val);
+
+	return 0;
+}
+
+static int dlb2_configure_dir_port(struct dlb2_hw *hw,
+				   struct dlb2_hw_domain *domain,
+				   struct dlb2_dir_pq_pair *port,
+				   uintptr_t cq_dma_base,
+				   struct dlb2_create_dir_port_args *args,
+				   bool vdev_req,
+				   unsigned int vdev_id)
+{
+	int ret;
+
+	ret = dlb2_dir_port_configure_cq(hw,
+					 domain,
+					 port,
+					 cq_dma_base,
+					 args,
+					 vdev_req,
+					 vdev_id);
+
+	if (ret < 0)
+		return ret;
+
+	dlb2_dir_port_configure_pp(hw,
+				   domain,
+				   port,
+				   vdev_req,
+				   vdev_id);
+
+	dlb2_dir_port_cq_enable(hw, port);
+
+	port->enabled = true;
+
+	port->port_configured = true;
+
+	return 0;
+}
+
 static int dlb2_ldb_port_map_qid_static(struct dlb2_hw *hw,
 					struct dlb2_ldb_port *p,
 					struct dlb2_ldb_queue *q,
@@ -2311,6 +2965,252 @@ int dlb2_hw_create_dir_queue(struct dlb2_hw *hw,
 }
 
 static void
+dlb2_log_create_ldb_port_args(struct dlb2_hw *hw,
+			      u32 domain_id,
+			      uintptr_t cq_dma_base,
+			      struct dlb2_create_ldb_port_args *args,
+			      bool vdev_req,
+			      unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 create load-balanced port arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID:                 %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tCQ depth:                  %d\n",
+		    args->cq_depth);
+	DLB2_HW_DBG(hw, "\tCQ hist list size:         %d\n",
+		    args->cq_history_list_size);
+	DLB2_HW_DBG(hw, "\tCQ base address:           0x%lx\n",
+		    cq_dma_base);
+	DLB2_HW_DBG(hw, "\tCoS ID:                    %u\n", args->cos_id);
+	DLB2_HW_DBG(hw, "\tStrict CoS allocation:     %u\n",
+		    args->cos_strict);
+}
+
+/**
+ * dlb2_hw_create_ldb_port() - Allocate and initialize a load-balanced port and
+ *	its resources.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @args: User-provided arguments.
+ * @cq_dma_base: Base DMA address for consumer queue memory
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb2_hw_create_ldb_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_create_ldb_port_args *args,
+			    uintptr_t cq_dma_base,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_req,
+			    unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int ret, cos_id, i;
+
+	dlb2_log_create_ldb_port_args(hw,
+				      domain_id,
+				      cq_dma_base,
+				      args,
+				      vdev_req,
+				      vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_create_ldb_port_args(hw,
+					       domain_id,
+					       cq_dma_base,
+					       args,
+					       resp,
+					       vdev_req,
+					       vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	if (args->cos_strict) {
+		cos_id = args->cos_id;
+
+		port = DLB2_DOM_LIST_HEAD(domain->avail_ldb_ports[cos_id],
+					  typeof(*port));
+	} else {
+		int idx;
+
+		for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+			idx = (args->cos_id + i) % DLB2_NUM_COS_DOMAINS;
+
+			port = DLB2_DOM_LIST_HEAD(domain->avail_ldb_ports[idx],
+						  typeof(*port));
+			if (port)
+				break;
+		}
+
+		cos_id = idx;
+	}
+
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no available ldb ports\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	if (port->configured) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: avail_ldb_ports contains configured ports.\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	ret = dlb2_configure_ldb_port(hw,
+				      domain,
+				      port,
+				      cq_dma_base,
+				      args,
+				      vdev_req,
+				      vdev_id);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Configuration succeeded, so move the resource from the 'avail' to
+	 * the 'used' list.
+	 */
+	list_del(&port->domain_list);
+
+	list_add(&port->domain_list, &domain->used_ldb_ports[cos_id]);
+
+	resp->status = 0;
+	resp->id = (vdev_req) ? port->id.virt_id : port->id.phys_id;
+
+	return 0;
+}
+
+static void
+dlb2_log_create_dir_port_args(struct dlb2_hw *hw,
+			      u32 domain_id,
+			      uintptr_t cq_dma_base,
+			      struct dlb2_create_dir_port_args *args,
+			      bool vdev_req,
+			      unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 create directed port arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID:                 %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tCQ depth:                  %d\n",
+		    args->cq_depth);
+	DLB2_HW_DBG(hw, "\tCQ base address:           0x%lx\n",
+		    cq_dma_base);
+}
+
+/**
+ * dlb2_hw_create_dir_port() - Allocate and initialize a DLB directed port
+ *	and queue. The port and its paired queue share the same ID.
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @args: User-provided arguments.
+ * @cq_dma_base: Base DMA address for consumer queue memory
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb2_hw_create_dir_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_create_dir_port_args *args,
+			    uintptr_t cq_dma_base,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_req,
+			    unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *port;
+	struct dlb2_hw_domain *domain;
+	int ret;
+
+	dlb2_log_create_dir_port_args(hw,
+				      domain_id,
+				      cq_dma_base,
+				      args,
+				      vdev_req,
+				      vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_create_dir_port_args(hw,
+					       domain_id,
+					       cq_dma_base,
+					       args,
+					       resp,
+					       vdev_req,
+					       vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (args->queue_id != -1)
+		port = dlb2_get_domain_used_dir_pq(args->queue_id,
+						   vdev_req,
+						   domain);
+	else
+		port = DLB2_DOM_LIST_HEAD(domain->avail_dir_pq_pairs,
+					  typeof(*port));
+
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: no available dir ports\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	ret = dlb2_configure_dir_port(hw,
+				      domain,
+				      port,
+				      cq_dma_base,
+				      args,
+				      vdev_req,
+				      vdev_id);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Configuration succeeded, so move the resource from the 'avail' to
+	 * the 'used' list (if it's not already there).
+	 */
+	if (args->queue_id == -1) {
+		list_del(&port->domain_list);
+
+		list_add(&port->domain_list, &domain->used_dir_pq_pairs);
+	}
+
+	resp->status = 0;
+	resp->id = (vdev_req) ? port->id.virt_id : port->id.phys_id;
+
+	return 0;
+}
+
+static void
 dlb2_domain_finish_unmap_port_slot(struct dlb2_hw *hw,
 				   struct dlb2_hw_domain *domain,
 				   struct dlb2_ldb_port *port,
@@ -4107,3 +5007,25 @@ void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw)
 
 	DLB2_CSR_WR(hw, DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE, r0.val);
 }
+
+void dlb2_hw_enable_sparse_ldb_cq_mode(struct dlb2_hw *hw)
+{
+	union dlb2_chp_cfg_chp_csr_ctrl r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_CHP_CFG_CHP_CSR_CTRL);
+
+	r0.field.cfg_64bytes_qe_ldb_cq_mode = 1;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_CFG_CHP_CSR_CTRL, r0.val);
+}
+
+void dlb2_hw_enable_sparse_dir_cq_mode(struct dlb2_hw *hw)
+{
+	union dlb2_chp_cfg_chp_csr_ctrl r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_CHP_CFG_CHP_CSR_CTRL);
+
+	r0.field.cfg_64bytes_qe_dir_cq_mode = 1;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_CFG_CHP_CSR_CTRL, r0.val);
+}
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 476230ffe4b6..b030722d2d6a 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -137,6 +137,78 @@ int dlb2_hw_create_dir_queue(struct dlb2_hw *hw,
 			     unsigned int vdev_id);
 
 /**
+ * dlb2_hw_create_dir_port() - create a directed port
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port creation arguments.
+ * @cq_dma_base: base address of the CQ memory. This can be a PA or an IOVA.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function creates a directed port.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the port ID.
+ *
+ * resp->id contains a virtual ID if vdev_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, a credit setting is invalid, a
+ *	    pointer address is not properly aligned, the domain is not
+ *	    configured, or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_create_dir_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_create_dir_port_args *args,
+			    uintptr_t cq_dma_base,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_request,
+			    unsigned int vdev_id);
+
+/**
+ * dlb2_hw_create_ldb_port() - create a load-balanced port
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port creation arguments.
+ * @cq_dma_base: base address of the CQ memory. This can be a PA or an IOVA.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function creates a load-balanced port.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the port ID.
+ *
+ * resp->id contains a virtual ID if vdev_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, a credit setting is invalid, a
+ *	    pointer address is not properly aligned, the domain is not
+ *	    configured, or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_create_ldb_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_create_ldb_port_args *args,
+			    uintptr_t cq_dma_base,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_request,
+			    unsigned int vdev_id);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
@@ -264,4 +336,21 @@ enum dlb2_virt_mode {
 	NUM_DLB2_VIRT_MODES,
 };
 
+/**
+ * dlb2_hw_enable_sparse_ldb_cq_mode() - enable sparse mode for load-balanced
+ *	ports.
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function must be called prior to configuring scheduling domains.
+ */
+void dlb2_hw_enable_sparse_ldb_cq_mode(struct dlb2_hw *hw);
+
+/**
+ * dlb2_hw_enable_sparse_dir_cq_mode() - enable sparse mode for directed ports.
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function must be called prior to configuring scheduling domains.
+ */
+void dlb2_hw_enable_sparse_dir_cq_mode(struct dlb2_hw *hw);
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 04c469e09529..75d07fb44f0c 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -297,12 +297,35 @@ struct dlb2_get_num_resources_args {
 	__u32 num_dir_credits;
 };
 
+enum dlb2_cq_poll_modes {
+	DLB2_CQ_POLL_MODE_STD,
+	DLB2_CQ_POLL_MODE_SPARSE,
+
+	/* NUM_DLB2_CQ_POLL_MODE must be last */
+	NUM_DLB2_CQ_POLL_MODE,
+};
+
+/*
+ * DLB2_CMD_QUERY_CQ_POLL_MODE: Query the CQ poll mode setting
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: CQ poll mode (see enum dlb2_cq_poll_modes).
+ */
+struct dlb2_query_cq_poll_mode_args {
+	/* Output parameters */
+	__u64 response;
+};
+
 enum dlb2_user_interface_commands {
 	DLB2_CMD_GET_DEVICE_VERSION,
 	DLB2_CMD_CREATE_SCHED_DOMAIN,
 	DLB2_CMD_GET_SCHED_DOMAIN_FD,
 	DLB2_CMD_GET_NUM_RESOURCES,
 	DLB2_CMD_GET_DRIVER_VERSION,
+	DLB2_CMD_QUERY_CQ_POLL_MODE,
 
 	/* NUM_DLB2_CMD must be last */
 	NUM_DLB2_CMD,
@@ -385,6 +408,69 @@ struct dlb2_create_dir_queue_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_CREATE_LDB_PORT: Configure a load-balanced port.
+ * Input parameters:
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 1 and
+ *	1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ *	the CQ interrupt won't fire until there are N or more outstanding CQ
+ *	tokens.
+ * - cq_history_list_size: Number of history list entries. This must be
+ *	greater than or equal to cq_depth.
+ * - cos_id: class-of-service to allocate this port from. Must be between 0 and
+ *	3, inclusive.
+ * - cos_strict: If set, return an error if there are no available ports in the
+ *	requested class-of-service. Else, allocate the port from a different
+ *	class-of-service if the requested class has no available ports.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: port ID.
+ */
+struct dlb2_create_ldb_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u16 cq_depth;
+	__u16 cq_depth_threshold;
+	__u16 cq_history_list_size;
+	__u8 cos_id;
+	__u8 cos_strict;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_CREATE_DIR_PORT: Configure a directed port.
+ * Input parameters:
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 1 and
+ *	1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ *	the CQ interrupt won't fire until there are N or more outstanding CQ
+ *	tokens.
+ * - queue_id: Queue ID. If the corresponding directed queue is already
+ *	created, specify its ID here. Else this argument must be -1 to
+ *	indicate that the port is being created before the queue.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Port ID.
+ */
+struct dlb2_create_dir_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u16 cq_depth;
+	__u16 cq_depth_threshold;
+	__s32 queue_id;
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -427,6 +513,8 @@ struct dlb2_get_dir_queue_depth_args {
 enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,
 	DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
+	DLB2_DOMAIN_CMD_CREATE_LDB_PORT,
+	DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
 
@@ -434,6 +522,8 @@ enum dlb2_domain_user_interface_commands {
 	NUM_DLB2_DOMAIN_CMD,
 };
 
+/* Size of the DMA region allocated per CQ (max depth 1024 * 64B entries) */
+#define DLB2_CQ_SIZE 65536
+
 /********************/
 /* dlb2 ioctl codes */
 /********************/
@@ -460,6 +550,10 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_DRIVER_VERSION,		\
 		      struct dlb2_get_driver_version_args)
+#define DLB2_IOC_QUERY_CQ_POLL_MODE				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_QUERY_CQ_POLL_MODE,		\
+		      struct dlb2_query_cq_poll_mode_args)
 #define DLB2_IOC_CREATE_LDB_QUEUE				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,		\
@@ -468,6 +562,14 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,		\
 		      struct dlb2_create_dir_queue_args)
+#define DLB2_IOC_CREATE_LDB_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_CREATE_LDB_PORT,		\
+		      struct dlb2_create_ldb_port_args)
+#define DLB2_IOC_CREATE_DIR_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_CREATE_DIR_PORT,		\
+		      struct dlb2_create_dir_port_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/20] dlb2: add port mmap support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (8 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 11/20] dlb2: add start domain ioctl Gage Eads
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

Once a port is created, the application can mmap the corresponding DMA
memory and MMIO into user-space. This allows user-space applications to
perform (performance-sensitive) enqueue and dequeue operations
independently of the kernel driver.

The mmap callback is only available through special port files: a producer
port (PP) file and a consumer queue (CQ) file. User-space gets an fd for
these files by calling one of the new DLB2_DOMAIN_CMD_GET_{LDB,DIR}_PORT_{PP,CQ}_FD
ioctls and passing in a port ID. If the ioctl succeeds, the returned fd can
be used to mmap that port's PP or CQ.
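
For example, mapping a load-balanced port's CQ might look roughly like the
sketch below (error handling elided). The ioctl macro name and the mmap
offset/length are assumptions for illustration; struct
dlb2_get_port_fd_args carries the response pointer and port ID as used in
this patch:

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <linux/dlb2_user.h>

  /* Returns a pointer to the mapped CQ, or NULL on failure. */
  static void *map_ldb_cq(int domain_fd, uint32_t port_id)
  {
      struct dlb2_cmd_response resp = {0};
      struct dlb2_get_port_fd_args args = {
          .response = (uintptr_t)&resp,
          .port_id = port_id,
      };
      void *cq;

      if (ioctl(domain_fd, DLB2_IOC_GET_LDB_PORT_CQ_FD, &args))
          return NULL;

      /* On success, resp.id holds the new CQ file descriptor */
      cq = mmap(NULL, DLB2_CQ_SIZE, PROT_READ, MAP_SHARED,
                (int)resp.id, 0);

      return cq == MAP_FAILED ? NULL : cq;
  }

A PP file would presumably be mapped the same way, with write permission,
since the PP is the enqueue interface.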

Device reset requires first unmapping all user-space mappings, to prevent
applications from interfering with the reset operation. To this end, the
driver uses a single inode -- allocated when the first PP/CQ file is
created, and freed when the last such file is closed -- and attaches all
port files to this common inode, as done elsewhere in Linux (e.g. cxl,
dax).

Allocating this inode requires creating a pseudo-filesystem. The driver
initializes this FS when the inode is allocated, and frees the FS after the
inode is freed.

The driver doesn't use anon_inode_getfd() for these port mmap files because
the anon inode layer uses a single inode that is shared with other kernel
components -- calling unmap_mapping_range() on that shared inode would
likely break the kernel.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 drivers/misc/dlb2/Makefile        |   1 +
 drivers/misc/dlb2/dlb2_file.c     | 133 +++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_file.h     |  19 ++++
 drivers/misc/dlb2/dlb2_ioctl.c    | 203 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.c     | 109 ++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h     |  13 +++
 drivers/misc/dlb2/dlb2_pf_ops.c   |  22 +++++
 drivers/misc/dlb2/dlb2_resource.c |  99 +++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h |  50 ++++++++++
 include/uapi/linux/dlb2_user.h    |  60 +++++++++++
 10 files changed, 709 insertions(+)
 create mode 100644 drivers/misc/dlb2/dlb2_file.c
 create mode 100644 drivers/misc/dlb2/dlb2_file.h

diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
index 18b5498b20e6..12361461dcff 100644
--- a/drivers/misc/dlb2/Makefile
+++ b/drivers/misc/dlb2/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_INTEL_DLB2) := dlb2.o
 
 dlb2-objs :=      \
   dlb2_main.o     \
+  dlb2_file.o     \
   dlb2_ioctl.o    \
   dlb2_pf_ops.o   \
   dlb2_resource.o \
diff --git a/drivers/misc/dlb2/dlb2_file.c b/drivers/misc/dlb2/dlb2_file.c
new file mode 100644
index 000000000000..8e73231336d7
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_file.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2020 Intel Corporation */
+
+#include <linux/anon_inodes.h>
+#include <linux/file.h>
+#include <linux/module.h>
+#include <linux/mount.h>
+#include <linux/pseudo_fs.h>
+
+#include "dlb2_file.h"
+#include "dlb2_main.h"
+
+/*
+ * dlb2 tracks its memory mappings so it can revoke them when an FLR is
+ * requested and user-space cannot be allowed to access the device. To achieve
+ * that, the driver creates a single inode through which all driver-created
+ * files can share a struct address_space, and unmaps the inode's address space
+ * during the reset preparation phase. Since the anon inode layer shares its
+ * inode with multiple kernel components, we cannot use that here.
+ *
+ * Doing so requires a custom pseudo-filesystem to allocate the inode. The FS
+ * and the inode are allocated on demand when a file is created, and both are
+ * freed when the last such file is closed.
+ *
+ * This is inspired by other drivers (cxl, dax, mem) and the anon inode layer.
+ */
+static int dlb2_fs_cnt;
+static struct vfsmount *dlb2_vfs_mount;
+
+#define DLB2FS_MAGIC 0x444C4232
+static int dlb2_init_fs_context(struct fs_context *fc)
+{
+	return init_pseudo(fc, DLB2FS_MAGIC) ? 0 : -ENOMEM;
+}
+
+static struct file_system_type dlb2_fs_type = {
+	.name	 = "dlb2",
+	.owner	 = THIS_MODULE,
+	.init_fs_context = dlb2_init_fs_context,
+	.kill_sb = kill_anon_super,
+};
+
+static struct inode *dlb2_alloc_inode(struct dlb2_dev *dev)
+{
+	struct inode *inode;
+	int ret;
+
+	/* Increment the pseudo-FS's refcnt and (if not already) mount it. */
+	ret = simple_pin_fs(&dlb2_fs_type, &dlb2_vfs_mount, &dlb2_fs_cnt);
+	if (ret < 0) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Cannot mount pseudo filesystem: %d\n",
+			__func__, ret);
+		return ERR_PTR(ret);
+	}
+
+	if (dlb2_fs_cnt > 1) {
+		/*
+		 * Return the previously allocated inode. In this case, it is
+		 * guaranteed to hold at least one reference, so ihold() is
+		 * safe to call.
+		 */
+		ihold(dev->inode);
+		return dev->inode;
+	}
+
+	inode = alloc_anon_inode(dlb2_vfs_mount->mnt_sb);
+	if (IS_ERR(inode)) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Cannot allocate inode: %ld\n",
+			__func__, PTR_ERR(inode));
+		simple_release_fs(&dlb2_vfs_mount, &dlb2_fs_cnt);
+		return inode;
+	}
+
+	dev->inode = inode;
+
+	return inode;
+}
+
+/*
+ * Decrement the inode reference count and release the FS. Intended for
+ * unwinding dlb2_alloc_inode(). Must hold the resource mutex while calling.
+ */
+static void dlb2_free_inode(struct inode *inode)
+{
+	iput(inode);
+	simple_release_fs(&dlb2_vfs_mount, &dlb2_fs_cnt);
+}
+
+/*
+ * Release the FS. Intended for use in a file_operations release callback,
+ * which decrements the inode reference count separately. Must hold the
+ * resource mutex while calling.
+ */
+void dlb2_release_fs(struct dlb2_dev *dev)
+{
+	simple_release_fs(&dlb2_vfs_mount, &dlb2_fs_cnt);
+
+	/* When the fs refcnt reaches zero, the inode has been freed */
+	if (dlb2_fs_cnt == 0)
+		dev->inode = NULL;
+}
+
+/*
+ * Allocate a file with the requested flags, file operations, and name that
+ * uses the device's shared inode. Must hold the resource mutex while calling.
+ *
+ * Caller must separately allocate an fd and install the file in that fd.
+ */
+struct file *dlb2_getfile(struct dlb2_dev *dev,
+			  int flags,
+			  const struct file_operations *fops,
+			  const char *name)
+{
+	struct inode *inode;
+	struct file *f;
+
+	if (!try_module_get(THIS_MODULE))
+		return ERR_PTR(-ENOENT);
+
+	inode = dlb2_alloc_inode(dev);
+	if (IS_ERR(inode)) {
+		module_put(THIS_MODULE);
+		return ERR_CAST(inode);
+	}
+
+	f = alloc_file_pseudo(inode, dlb2_vfs_mount, name, flags, fops);
+	if (IS_ERR(f)) {
+		dlb2_free_inode(inode);
+		module_put(THIS_MODULE);
+	}
+
+	return f;
+}
diff --git a/drivers/misc/dlb2/dlb2_file.h b/drivers/misc/dlb2/dlb2_file.h
new file mode 100644
index 000000000000..20a3b04eb00e
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_file.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef __DLB2_FILE_H
+#define __DLB2_FILE_H
+
+#include <linux/file.h>
+
+#include "dlb2_main.h"
+
+void dlb2_release_fs(struct dlb2_dev *dev);
+
+struct file *dlb2_getfile(struct dlb2_dev *dev,
+			  int flags,
+			  const struct file_operations *fops,
+			  const char *name);
+
+#endif /* __DLB2_FILE_H */
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index e9303b7df8e2..b4d40de9d0dc 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -6,6 +6,7 @@
 
 #include <uapi/linux/dlb2_user.h>
 
+#include "dlb2_file.h"
 #include "dlb2_ioctl.h"
 #include "dlb2_main.h"
 
@@ -255,6 +256,204 @@ static int dlb2_domain_ioctl_create_dir_port(struct dlb2_dev *dev,
 	return ret;
 }
 
+static int dlb2_create_port_fd(struct dlb2_dev *dev,
+			       struct dlb2_domain *domain,
+			       const char *prefix,
+			       u32 id,
+			       const struct file_operations *fops,
+			       int *fd,
+			       struct file **f)
+{
+	char *name;
+	int ret;
+
+	ret = get_unused_fd_flags(O_RDWR);
+	if (ret < 0)
+		return ret;
+
+	*fd = ret;
+
+	name = kasprintf(GFP_KERNEL, "%s:%d", prefix, id);
+	if (!name) {
+		put_unused_fd(*fd);
+		return -ENOMEM;
+	}
+
+	*f = dlb2_getfile(dev, O_RDWR, fops, name);
+
+	kfree(name);
+
+	if (IS_ERR(*f)) {
+		put_unused_fd(*fd);
+		return PTR_ERR(*f);
+	}
+
+	return 0;
+}
+
+static int dlb2_domain_get_port_fd(struct dlb2_dev *dev,
+				   struct dlb2_domain *domain,
+				   unsigned long user_arg,
+				   u16 size,
+				   const char *name,
+				   const struct file_operations *fops,
+				   bool is_ldb)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_get_port_fd_args arg;
+	struct file *file = NULL;
+	struct dlb2_port *port;
+	int ret, fd;
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	if (is_ldb &&
+	    dev->ops->ldb_port_owned_by_domain(&dev->hw,
+					       domain->id,
+					       arg.port_id) != 1) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid port id %u\n",
+			__func__, arg.port_id);
+		response.status = DLB2_ST_INVALID_PORT_ID;
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	if (!is_ldb &&
+	    dev->ops->dir_port_owned_by_domain(&dev->hw,
+					       domain->id,
+					       arg.port_id) != 1) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid port id %u\n",
+			__func__, arg.port_id);
+		response.status = DLB2_ST_INVALID_PORT_ID;
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	port = (is_ldb) ? &dev->ldb_port[arg.port_id] :
+			  &dev->dir_port[arg.port_id];
+
+	if (!port->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Port %u is not configured\n",
+			__func__, arg.port_id);
+		response.status = DLB2_ST_INVALID_PORT_ID;
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	ret = dlb2_create_port_fd(dev, domain, name, arg.port_id,
+				  fops, &fd, &file);
+	if (ret < 0)
+		goto unlock;
+
+	file->private_data = port;
+
+	/*
+	 * Take the domain and pm references now; they are dropped in the
+	 * fd's close callback, so an fput() on the error path below releases
+	 * them again.
+	 */
+	kref_get(&domain->refcnt);
+
+	dev->ops->inc_pm_refcnt(dev->pdev, true);
+
+	response.id = fd;
+	ret = 0;
+
+unlock:
+	mutex_unlock(&dev->resource_mutex);
+
+	/* This copy was verified earlier and should not fail */
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response))) {
+		if (ret == 0) {
+			put_unused_fd(fd);
+			fput(file);
+		}
+		return -EFAULT;
+	}
+
+	/* Save fd_install() until after the last point of failure. */
+	if (ret == 0)
+		fd_install(fd, file);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_get_ldb_port_pp_fd(struct dlb2_dev *dev,
+						struct dlb2_domain *domain,
+						unsigned long user_arg,
+						u16 size)
+{
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_domain_get_port_fd(dev, domain, user_arg, size,
+				      "dlb2_ldb_pp:", &dlb2_pp_fops, true);
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_get_ldb_port_cq_fd(struct dlb2_dev *dev,
+						struct dlb2_domain *domain,
+						unsigned long user_arg,
+						u16 size)
+{
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_domain_get_port_fd(dev, domain, user_arg, size,
+				      "dlb2_ldb_cq:", &dlb2_cq_fops, true);
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_get_dir_port_pp_fd(struct dlb2_dev *dev,
+						struct dlb2_domain *domain,
+						unsigned long user_arg,
+						u16 size)
+{
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_domain_get_port_fd(dev, domain, user_arg, size,
+				      "dlb2_dir_pp:", &dlb2_pp_fops, false);
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_get_dir_port_cq_fd(struct dlb2_dev *dev,
+						struct dlb2_domain *domain,
+						unsigned long user_arg,
+						u16 size)
+{
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_domain_get_port_fd(dev, domain, user_arg, size,
+				      "dlb2_dir_cq:", &dlb2_cq_fops, false);
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 typedef int (*dlb2_domain_ioctl_callback_fn_t)(struct dlb2_dev *dev,
 					       struct dlb2_domain *domain,
 					       unsigned long arg,
@@ -268,6 +467,10 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_create_dir_port,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
+	dlb2_domain_ioctl_get_ldb_port_pp_fd,
+	dlb2_domain_ioctl_get_ldb_port_cq_fd,
+	dlb2_domain_ioctl_get_dir_port_pp_fd,
+	dlb2_domain_ioctl_get_dir_port_cq_fd,
 };
 
 int dlb2_domain_ioctl_dispatcher(struct dlb2_dev *dev,
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index ad0b1a9fb768..63ea5b6b58c8 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -11,6 +11,7 @@
 #include <linux/pci.h>
 #include <linux/uaccess.h>
 
+#include "dlb2_file.h"
 #include "dlb2_ioctl.h"
 #include "dlb2_main.h"
 #include "dlb2_resource.h"
@@ -273,6 +274,114 @@ const struct file_operations dlb2_domain_fops = {
 	.compat_ioctl = compat_ptr_ioctl,
 };
 
+static int dlb2_pp_mmap(struct file *f, struct vm_area_struct *vma)
+{
+	struct dlb2_port *port = f->private_data;
+	struct dlb2_domain *domain = port->domain;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+	unsigned long pgoff;
+	pgprot_t pgprot;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "[%s()] %s port %d\n",
+		__func__, port->is_ldb ? "LDB" : "DIR", port->id);
+
+	mutex_lock(&dev->resource_mutex);
+
+	if ((vma->vm_end - vma->vm_start) != DLB2_PP_SIZE) {
+		ret = -EINVAL;
+		goto end;
+	}
+
+	pgprot = pgprot_noncached(vma->vm_page_prot);
+
+	pgoff = dev->hw.func_phys_addr;
+
+	if (port->is_ldb)
+		pgoff += DLB2_LDB_PP_OFFS(port->id);
+	else
+		pgoff += DLB2_DIR_PP_OFFS(port->id);
+
+	ret = io_remap_pfn_range(vma,
+				 vma->vm_start,
+				 pgoff >> PAGE_SHIFT,
+				 vma->vm_end - vma->vm_start,
+				 pgprot);
+
+end:
+	mutex_unlock(&dev->resource_mutex);
+
+	return ret;
+}
+
+static int dlb2_cq_mmap(struct file *f, struct vm_area_struct *vma)
+{
+	struct dlb2_port *port = f->private_data;
+	struct dlb2_domain *domain = port->domain;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+	struct page *page;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "[%s()] %s port %d\n",
+		__func__, port->is_ldb ? "LDB" : "DIR", port->id);
+
+	mutex_lock(&dev->resource_mutex);
+
+	if ((vma->vm_end - vma->vm_start) != DLB2_CQ_SIZE) {
+		ret = -EINVAL;
+		goto end;
+	}
+
+	page = virt_to_page(port->cq_base);
+
+	ret = remap_pfn_range(vma,
+			      vma->vm_start,
+			      page_to_pfn(page),
+			      vma->vm_end - vma->vm_start,
+			      vma->vm_page_prot);
+
+end:
+	mutex_unlock(&dev->resource_mutex);
+
+	return ret;
+}
+
+static int dlb2_port_close(struct inode *i, struct file *f)
+{
+	struct dlb2_port *port = f->private_data;
+	struct dlb2_domain *domain = port->domain;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+	int ret = 0;
+
+	mutex_lock(&dev->resource_mutex);
+
+	dev_dbg(dev->dlb2_device,
+		"Closing domain %d's port file\n", domain->id);
+
+	kref_put(&domain->refcnt, dlb2_free_domain);
+
+	dev->ops->dec_pm_refcnt(dev->pdev);
+
+	/* Decrement the refcnt of the pseudo-FS used to allocate the inode */
+	dlb2_release_fs(dev);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	return ret;
+}
+
+const struct file_operations dlb2_pp_fops = {
+	.owner   = THIS_MODULE,
+	.release = dlb2_port_close,
+	.mmap    = dlb2_pp_mmap,
+};
+
+const struct file_operations dlb2_cq_fops = {
+	.owner   = THIS_MODULE,
+	.release = dlb2_port_close,
+	.mmap    = dlb2_cq_mmap,
+};
+
 /**********************************/
 /****** PCI driver callbacks ******/
 /**********************************/
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index fd6381b537a2..537f849b0597 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -81,6 +81,12 @@ struct dlb2_device_ops {
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
+	int (*ldb_port_owned_by_domain)(struct dlb2_hw *hw,
+					u32 domain_id,
+					u32 port_id);
+	int (*dir_port_owned_by_domain)(struct dlb2_hw *hw,
+					u32 domain_id,
+					u32 port_id);
 	int (*get_ldb_queue_depth)(struct dlb2_hw *hw,
 				   u32 domain_id,
 				   struct dlb2_get_ldb_queue_depth_args *args,
@@ -96,6 +102,8 @@ struct dlb2_device_ops {
 
 extern struct dlb2_device_ops dlb2_pf_ops;
 extern const struct file_operations dlb2_domain_fops;
+extern const struct file_operations dlb2_pp_fops;
+extern const struct file_operations dlb2_cq_fops;
 
 struct dlb2_port {
 	void *cq_base;
@@ -123,6 +131,11 @@ struct dlb2_dev {
 	struct dlb2_port ldb_port[DLB2_MAX_NUM_LDB_PORTS];
 	struct dlb2_port dir_port[DLB2_MAX_NUM_DIR_PORTS];
 	/*
+	 * Anonymous inode used to share an address_space for all domain
+	 * device file mappings.
+	 */
+	struct inode *inode;
+	/*
 	 * The resource mutex serializes access to driver data structures and
 	 * hardware registers.
 	 */
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index f60ef7daca54..c3044d603263 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -326,6 +326,26 @@ dlb2_pf_query_cq_poll_mode(struct dlb2_dev *dlb2_dev,
 	return 0;
 }
 
+/**************************************/
+/****** Resource query callbacks ******/
+/**************************************/
+
+static int
+dlb2_pf_ldb_port_owned_by_domain(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 u32 port_id)
+{
+	return dlb2_ldb_port_owned_by_domain(hw, domain_id, port_id, false, 0);
+}
+
+static int
+dlb2_pf_dir_port_owned_by_domain(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 u32 port_id)
+{
+	return dlb2_dir_port_owned_by_domain(hw, domain_id, port_id, false, 0);
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -350,6 +370,8 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.create_dir_port = dlb2_pf_create_dir_port,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
+	.ldb_port_owned_by_domain = dlb2_pf_ldb_port_owned_by_domain,
+	.dir_port_owned_by_domain = dlb2_pf_dir_port_owned_by_domain,
 	.get_ldb_queue_depth = dlb2_pf_get_ldb_queue_depth,
 	.get_dir_queue_depth = dlb2_pf_get_dir_queue_depth,
 	.init_hardware = dlb2_pf_init_hardware,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 2b3d10975f18..1de4ef9ae405 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -237,6 +237,32 @@ static struct dlb2_hw_domain *dlb2_get_domain_from_id(struct dlb2_hw *hw,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *
+dlb2_get_domain_ldb_port(u32 id,
+			 bool vdev_req,
+			 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port)
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+	}
+
+	return NULL;
+}
+
 static struct dlb2_dir_pq_pair *
 dlb2_get_domain_used_dir_pq(u32 id,
 			    bool vdev_req,
@@ -255,6 +281,29 @@ dlb2_get_domain_used_dir_pq(u32 id,
 	return NULL;
 }
 
+static struct dlb2_dir_pq_pair *
+dlb2_get_domain_dir_pq(u32 id,
+		       bool vdev_req,
+		       struct dlb2_hw_domain *domain)
+{
+	struct dlb2_dir_pq_pair *port;
+
+	if (id >= DLB2_MAX_NUM_DIR_PORTS)
+		return NULL;
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port)
+		if ((!vdev_req && port->id.phys_id == id) ||
+		    (vdev_req && port->id.virt_id == id))
+			return port;
+
+	DLB2_DOM_LIST_FOR(domain->avail_dir_pq_pairs, port)
+		if ((!vdev_req && port->id.phys_id == id) ||
+		    (vdev_req && port->id.virt_id == id))
+			return port;
+
+	return NULL;
+}
+
 static struct dlb2_ldb_queue *
 dlb2_get_ldb_queue_from_id(struct dlb2_hw *hw,
 			   u32 id,
@@ -4949,6 +4998,56 @@ int dlb2_reset_domain(struct dlb2_hw *hw,
 	return 0;
 }
 
+int dlb2_ldb_port_owned_by_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  u32 port_id,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+
+	if (vdev_req && vdev_id >= DLB2_MAX_NUM_VDEVS)
+		return -EINVAL;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain || !domain->configured)
+		return -EINVAL;
+
+	port = dlb2_get_domain_ldb_port(port_id, vdev_req, domain);
+
+	if (!port)
+		return -EINVAL;
+
+	return port->domain_id.phys_id == domain->id.phys_id;
+}
+
+int dlb2_dir_port_owned_by_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  u32 port_id,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *port;
+	struct dlb2_hw_domain *domain;
+
+	if (vdev_req && vdev_id >= DLB2_MAX_NUM_VDEVS)
+		return -EINVAL;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain || !domain->configured)
+		return -EINVAL;
+
+	port = dlb2_get_domain_dir_pq(port_id, vdev_req, domain);
+
+	if (!port)
+		return -EINVAL;
+
+	return port->domain_id.phys_id == domain->id.phys_id;
+}
+
 int dlb2_hw_get_num_resources(struct dlb2_hw *hw,
 			      struct dlb2_get_num_resources_args *arg,
 			      bool vdev_req,
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index b030722d2d6a..47b0d6f785fb 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -239,6 +239,56 @@ int dlb2_reset_domain(struct dlb2_hw *hw,
 		      unsigned int vdev_id);
 
 /**
+ * dlb2_ldb_port_owned_by_domain() - query whether a port is owned by a domain
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @port_id: port ID.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function returns whether a load-balanced port is owned by a specified
+ * domain.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 if false, 1 if true, <0 otherwise.
+ *
+ * EINVAL - Invalid domain or port ID, or the domain is not configured.
+ */
+int dlb2_ldb_port_owned_by_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  u32 port_id,
+				  bool vdev_request,
+				  unsigned int vdev_id);
+
+/**
+ * dlb2_dir_port_owned_by_domain() - query whether a port is owned by a domain
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @port_id: port ID.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function returns whether a directed port is owned by a specified
+ * domain.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 if false, 1 if true, <0 otherwise.
+ *
+ * EINVAL - Invalid domain or port ID, or the domain is not configured.
+ */
+int dlb2_dir_port_owned_by_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  u32 port_id,
+				  bool vdev_request,
+				  unsigned int vdev_id);
+
+/**
  * dlb2_hw_get_num_resources() - query the PCI function's available resources
  * @hw: dlb2_hw handle for a particular device.
  * @arg: pointer to resource counts.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 75d07fb44f0c..b0aba1ba5e3f 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -510,6 +510,41 @@ struct dlb2_get_dir_queue_depth_args {
 	__u32 padding0;
 };
 
+/*
+ * DLB2_CMD_GET_LDB_PORT_PP_FD: Get file descriptor to mmap a load-balanced
+ *	port's producer port (PP).
+ * DLB2_CMD_GET_LDB_PORT_CQ_FD: Get file descriptor to mmap a load-balanced
+ *	port's consumer queue (CQ).
+ *
+ *	The load-balanced port must have been previously created with the ioctl
+ *	DLB2_CMD_CREATE_LDB_PORT. The fd is used to mmap the PP/CQ region.
+ *
+ * DLB2_CMD_GET_DIR_PORT_PP_FD: Get file descriptor to mmap a directed port's
+ *	producer port (PP).
+ * DLB2_CMD_GET_DIR_PORT_CQ_FD: Get file descriptor to mmap a directed port's
+ *	consumer queue (CQ).
+ *
+ *	The directed port must have been previously created with the ioctl
+ *	DLB2_CMD_CREATE_DIR_PORT. The fd is used to mmap the PP/CQ region.
+ *
+ * Input parameters:
+ * - port_id: port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: fd.
+ */
+struct dlb2_get_port_fd_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
+
 enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,
 	DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
@@ -517,12 +552,21 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
+	DLB2_DOMAIN_CMD_GET_LDB_PORT_PP_FD,
+	DLB2_DOMAIN_CMD_GET_LDB_PORT_CQ_FD,
+	DLB2_DOMAIN_CMD_GET_DIR_PORT_PP_FD,
+	DLB2_DOMAIN_CMD_GET_DIR_PORT_CQ_FD,
 
 	/* NUM_DLB2_DOMAIN_CMD must be last */
 	NUM_DLB2_DOMAIN_CMD,
 };
 
+/*
+ * Mapping sizes for memory mapping the consumer queue (CQ) memory space, and
+ * producer port (PP) MMIO space.
+ */
 #define DLB2_CQ_SIZE 65536
+#define DLB2_PP_SIZE 4096
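+
+/*
+ * Illustrative usage sketch (not part of the ABI; the fd handling and mmap
+ * flags below are examples only): a port fd returned in response.id by one
+ * of the DLB2_IOC_GET_*_PORT_*_FD ioctls is mapped with the matching size
+ * above and offset 0, e.g. for a load-balanced producer port:
+ *
+ *	struct dlb2_cmd_response response = {0};
+ *	struct dlb2_get_port_fd_args args = {0};
+ *	void *pp;
+ *
+ *	args.response = (__u64)(uintptr_t)&response;
+ *	args.port_id = port_id;
+ *
+ *	if (ioctl(domain_fd, DLB2_IOC_GET_LDB_PORT_PP_FD, &args) == 0)
+ *		pp = mmap(NULL, DLB2_PP_SIZE, PROT_WRITE, MAP_SHARED,
+ *			  (int)response.id, 0);
+ */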
 
 /********************/
 /* dlb2 ioctl codes */
@@ -578,5 +622,21 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,	\
 		      struct dlb2_get_dir_queue_depth_args)
+#define DLB2_IOC_GET_LDB_PORT_PP_FD				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_LDB_PORT_PP_FD,	\
+		      struct dlb2_get_port_fd_args)
+#define DLB2_IOC_GET_LDB_PORT_CQ_FD				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_LDB_PORT_CQ_FD,	\
+		      struct dlb2_get_port_fd_args)
+#define DLB2_IOC_GET_DIR_PORT_PP_FD				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_DIR_PORT_PP_FD,	\
+		      struct dlb2_get_port_fd_args)
+#define DLB2_IOC_GET_DIR_PORT_CQ_FD				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_GET_DIR_PORT_CQ_FD,	\
+		      struct dlb2_get_port_fd_args)
 
 #endif /* __DLB2_USER_H */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/20] dlb2: add start domain ioctl
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (9 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 10/20] dlb2: add port mmap support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 12/20] dlb2: add queue map and unmap ioctls Gage Eads
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

Once a scheduling domain and its resources have been configured, the start
domain ioctl is called to enable its ports to begin enqueueing to the
device. Once started, the domain's resources cannot be configured again
until after the domain is reset.
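
A minimal user-space sketch (illustrative only; it assumes a scheduling
domain fd acquired as in the earlier patches of this series):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/dlb2_user.h>

	static int start_domain(int domain_fd)
	{
		struct dlb2_cmd_response response = {0};
		struct dlb2_start_domain_args args = {0};

		args.response = (__u64)(uintptr_t)&response;

		/* On failure, response.status holds a detailed error code */
		return ioctl(domain_fd, DLB2_IOC_START_DOMAIN, &args);
	}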

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    |   2 +
 drivers/misc/dlb2/dlb2_main.h     |   4 ++
 drivers/misc/dlb2/dlb2_pf_ops.c   |  10 ++++
 drivers/misc/dlb2/dlb2_resource.c | 120 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h |  30 ++++++++++
 include/uapi/linux/dlb2_user.h    |  24 ++++++++
 6 files changed, 190 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index b4d40de9d0dc..6369798fc53e 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -93,6 +93,7 @@ static int dlb2_domain_ioctl_##lower_name(struct dlb2_dev *dev,		   \
 
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_ldb_queue)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_dir_queue)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(start_domain)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_ldb_queue_depth)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_dir_queue_depth)
 
@@ -465,6 +466,7 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_create_dir_queue,
 	dlb2_domain_ioctl_create_ldb_port,
 	dlb2_domain_ioctl_create_dir_port,
+	dlb2_domain_ioctl_start_domain,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
 	dlb2_domain_ioctl_get_ldb_port_pp_fd,
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 537f849b0597..dabe798f7da9 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -78,6 +78,10 @@ struct dlb2_device_ops {
 			       struct dlb2_create_dir_port_args *args,
 			       uintptr_t cq_dma_base,
 			       struct dlb2_cmd_response *resp);
+	int (*start_domain)(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_start_domain_args *args,
+			    struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index c3044d603263..fb2f914a99a4 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -282,6 +282,15 @@ dlb2_pf_create_dir_port(struct dlb2_hw *hw,
 }
 
 static int
+dlb2_pf_start_domain(struct dlb2_hw *hw,
+		     u32 id,
+		     struct dlb2_start_domain_args *args,
+		     struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_start_domain(hw, id, args, resp, false, 0);
+}
+
+static int
 dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 			  struct dlb2_get_num_resources_args *args)
 {
@@ -368,6 +377,7 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.create_dir_queue = dlb2_pf_create_dir_queue,
 	.create_ldb_port = dlb2_pf_create_ldb_port,
 	.create_dir_port = dlb2_pf_create_dir_port,
+	.start_domain = dlb2_pf_start_domain,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
 	.ldb_port_owned_by_domain = dlb2_pf_ldb_port_owned_by_domain,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 1de4ef9ae405..6e98b68972f5 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -1303,6 +1303,34 @@ dlb2_verify_create_dir_port_args(struct dlb2_hw *hw,
 	return 0;
 }
 
+static int dlb2_verify_start_domain_args(struct dlb2_hw *hw,
+					 u32 domain_id,
+					 struct dlb2_cmd_response *resp,
+					 bool vdev_req,
+					 unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	if (domain->started) {
+		resp->status = DLB2_ST_DOMAIN_STARTED;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static bool dlb2_port_find_slot(struct dlb2_ldb_port *port,
 				enum dlb2_qid_map_state state,
 				int *slot)
@@ -3259,6 +3287,98 @@ int dlb2_hw_create_dir_port(struct dlb2_hw *hw,
 	return 0;
 }
 
+static void dlb2_log_start_domain(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 start domain arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n", domain_id);
+}
+
+/**
+ * dlb2_hw_start_domain() - Lock the domain configuration
+ * @hw:	Contains the current state of the DLB2 hardware.
+ * @domain_id: Domain ID
+ * @arg: User-provided arguments (unused, here for ioctl callback template).
+ * @resp: Response to user.
+ * @vdev_req: Request came from a virtual device.
+ * @vdev_id: If vdev_req is true, this contains the virtual device's ID.
+ *
+ * Return: 0 upon success, < 0 on error. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int
+dlb2_hw_start_domain(struct dlb2_hw *hw,
+		     u32 domain_id,
+		     __attribute__((unused)) struct dlb2_start_domain_args *arg,
+		     struct dlb2_cmd_response *resp,
+		     bool vdev_req,
+		     unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *dir_queue;
+	struct dlb2_ldb_queue *ldb_queue;
+	struct dlb2_hw_domain *domain;
+	int ret;
+
+	dlb2_log_start_domain(hw, domain_id, vdev_req, vdev_id);
+
+	ret = dlb2_verify_start_domain_args(hw,
+					    domain_id,
+					    resp,
+					    vdev_req,
+					    vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/*
+	 * Enable load-balanced and directed queue write permissions for the
+	 * queues this domain owns. Without this, the DLB2 will drop all
+	 * incoming traffic to those queues.
+	 */
+	DLB2_DOM_LIST_FOR(domain->used_ldb_queues, ldb_queue) {
+		union dlb2_sys_ldb_vasqid_v r0 = { {0} };
+		unsigned int offs;
+
+		r0.field.vasqid_v = 1;
+
+		offs = domain->id.phys_id * DLB2_MAX_NUM_LDB_QUEUES +
+			ldb_queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_LDB_VASQID_V(offs), r0.val);
+	}
+
+	DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_queue) {
+		union dlb2_sys_dir_vasqid_v r0 = { {0} };
+		unsigned int offs;
+
+		r0.field.vasqid_v = 1;
+
+		offs = domain->id.phys_id * DLB2_MAX_NUM_DIR_PORTS +
+			dir_queue->id.phys_id;
+
+		DLB2_CSR_WR(hw, DLB2_SYS_DIR_VASQID_V(offs), r0.val);
+	}
+
+	dlb2_flush_csr(hw);
+
+	domain->started = true;
+
+	resp->status = 0;
+
+	return 0;
+}
+
 static void
 dlb2_domain_finish_unmap_port_slot(struct dlb2_hw *hw,
 				   struct dlb2_hw_domain *domain,
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 47b0d6f785fb..cd865c756c6d 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -209,6 +209,36 @@ int dlb2_hw_create_ldb_port(struct dlb2_hw *hw,
 			    unsigned int vdev_id);
 
 /**
+ * dlb2_hw_start_domain() - start a scheduling domain
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: start domain arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function starts a scheduling domain, which allows applications to send
+ * traffic through it. Once a domain is started, its resources can no longer be
+ * configured (besides QID remapping and port enable/disable).
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - the domain is not configured, or the domain is already started.
+ */
+int dlb2_hw_start_domain(struct dlb2_hw *hw,
+			 u32 domain_id,
+			 struct dlb2_start_domain_args *args,
+			 struct dlb2_cmd_response *resp,
+			 bool vdev_request,
+			 unsigned int vdev_id);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index b0aba1ba5e3f..d409586466b1 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -471,6 +471,25 @@ struct dlb2_create_dir_port_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_START_DOMAIN: Mark the end of the domain configuration. This
+ *	must be called before passing QEs into the device, and no configuration
+ *	ioctls can be issued once the domain has started. Sending QEs into the
+ *	device before calling this ioctl will result in undefined behavior.
+ * Input parameters:
+ * - (None)
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_start_domain_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -550,6 +569,7 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
 	DLB2_DOMAIN_CMD_CREATE_LDB_PORT,
 	DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
+	DLB2_DOMAIN_CMD_START_DOMAIN,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_LDB_PORT_PP_FD,
@@ -614,6 +634,10 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_CREATE_DIR_PORT,		\
 		      struct dlb2_create_dir_port_args)
+#define DLB2_IOC_START_DOMAIN					\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_START_DOMAIN,		\
+		      struct dlb2_start_domain_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/20] dlb2: add queue map and unmap ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (10 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 11/20] dlb2: add start domain ioctl Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 13/20] dlb2: add port enable/disable ioctls Gage Eads
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

Load-balanced queues can be "mapped" to any number of load-balanced ports.
Once mapped, the port becomes a candidate to which the device can schedule
that queue's entries. If a port is unmapped from a queue, it is no
longer a candidate for scheduling from that queue.

The map and unmap process differs depending on whether the domain is
started, because these operations update hardware structures that the
device dynamically modifies while it is in use. If the domain is not
started, its ports and queues are guaranteed not to be in active use.

If the domain is started, the map process must (temporarily) disable the
queue and wait for it to quiesce. Similarly, the unmap process must
(temporarily) disable the port and wait for it to quiesce. 'Quiesce' here
means that user software processes any in-flight QEs.

It's possible that the thread requesting the map/unmap is the same one
responsible for the processing that would quiesce the queue/port, so the
driver may have to complete the operation asynchronously. To that end, the
driver uses a workqueue that periodically checks whether any outstanding
operations can be completed. This workqueue function is scheduled only
when at least one map/unmap operation is outstanding.

To support this asynchronous operation while also providing a reasonably
usable user-interface, the driver maintains two views of the queue map
state:
- The hardware view: the actual queue map state in the device
- The user/software view: the queue map state as though the operations were
  synchronous.

While a map/unmap operation is inflight, these two views are out-of-sync.
When the user requests a new map/unmap operation, the driver verifies the
request against the software view (so errors are synchronous from the
user's perspective), then adds the request to the queue of in-progress
operations. When possible -- for example if the user requests to map a
queue and then immediately requests to unmap it -- the driver will coalesce
or cancel outstanding operations.

This commit also adds a pending-port-unmaps ioctl, which user-space can
call to query how many outstanding unmap operations exist for a given
port. A sketch of the resulting user-space flow is shown below.
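
The following user-space sketch is illustrative only: the DLB2_IOC_MAP_QID,
DLB2_IOC_UNMAP_QID, and DLB2_IOC_PENDING_PORT_UNMAPS ioctl names and the
per-command response pointers are assumed to follow this series' uapi
pattern; the argument struct names and their port_id/qid/priority fields
match the driver code in this patch.

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/dlb2_user.h>

	static int map_then_unmap(int domain_fd, __u32 port_id, __u32 qid)
	{
		struct dlb2_cmd_response response = {0};
		struct dlb2_map_qid_args map = {0};
		struct dlb2_unmap_qid_args unmap = {0};
		struct dlb2_pending_port_unmaps_args pending = {0};

		map.response = (__u64)(uintptr_t)&response;
		map.port_id = port_id;
		map.qid = qid;
		map.priority = 0;

		/* Errors are reported synchronously against the software view */
		if (ioctl(domain_fd, DLB2_IOC_MAP_QID, &map))
			return -1;

		unmap.response = (__u64)(uintptr_t)&response;
		unmap.port_id = port_id;
		unmap.qid = qid;

		if (ioctl(domain_fd, DLB2_IOC_UNMAP_QID, &unmap))
			return -1;

		/*
		 * Poll until the deferred unmap completes; the application
		 * must keep processing the port's CQ meanwhile so in-flight
		 * QEs can drain.
		 */
		do {
			pending.response = (__u64)(uintptr_t)&response;
			pending.port_id = port_id;

			if (ioctl(domain_fd, DLB2_IOC_PENDING_PORT_UNMAPS,
				  &pending))
				return -1;
		} while (response.id != 0);

		return 0;
	}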

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    |   6 +
 drivers/misc/dlb2/dlb2_main.h     |  14 +
 drivers/misc/dlb2/dlb2_pf_ops.c   |  30 ++
 drivers/misc/dlb2/dlb2_resource.c | 789 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h | 111 ++++++
 include/uapi/linux/dlb2_user.h    |  82 ++++
 6 files changed, 1032 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index 6369798fc53e..65d7ab82161c 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -94,8 +94,11 @@ static int dlb2_domain_ioctl_##lower_name(struct dlb2_dev *dev,		   \
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_ldb_queue)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(create_dir_queue)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(start_domain)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(map_qid)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(unmap_qid)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_ldb_queue_depth)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_dir_queue_depth)
+DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(pending_port_unmaps)
 
 /*
  * Port creation ioctls don't use the callback template macro because they have
@@ -467,8 +470,11 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_create_ldb_port,
 	dlb2_domain_ioctl_create_dir_port,
 	dlb2_domain_ioctl_start_domain,
+	dlb2_domain_ioctl_map_qid,
+	dlb2_domain_ioctl_unmap_qid,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
+	dlb2_domain_ioctl_pending_port_unmaps,
 	dlb2_domain_ioctl_get_ldb_port_pp_fd,
 	dlb2_domain_ioctl_get_ldb_port_cq_fd,
 	dlb2_domain_ioctl_get_dir_port_pp_fd,
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index dabe798f7da9..bca24e3a4f84 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -82,6 +82,18 @@ struct dlb2_device_ops {
 			    u32 domain_id,
 			    struct dlb2_start_domain_args *args,
 			    struct dlb2_cmd_response *resp);
+	int (*map_qid)(struct dlb2_hw *hw,
+		       u32 domain_id,
+		       struct dlb2_map_qid_args *args,
+		       struct dlb2_cmd_response *resp);
+	int (*unmap_qid)(struct dlb2_hw *hw,
+			 u32 domain_id,
+			 struct dlb2_unmap_qid_args *args,
+			 struct dlb2_cmd_response *resp);
+	int (*pending_port_unmaps)(struct dlb2_hw *hw,
+				   u32 domain_id,
+				   struct dlb2_pending_port_unmaps_args *args,
+				   struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
@@ -144,10 +156,12 @@ struct dlb2_dev {
 	 * hardware registers.
 	 */
 	struct mutex resource_mutex;
+	struct work_struct work;
 	enum dlb2_device_type type;
 	int id;
 	dev_t dev_number;
 	u8 domain_reset_failed;
+	u8 worker_launched;
 };
 
 int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id);
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index fb2f914a99a4..51562978ee1a 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -291,6 +291,33 @@ dlb2_pf_start_domain(struct dlb2_hw *hw,
 }
 
 static int
+dlb2_pf_map_qid(struct dlb2_hw *hw,
+		u32 id,
+		struct dlb2_map_qid_args *args,
+		struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_map_qid(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_unmap_qid(struct dlb2_hw *hw,
+		  u32 id,
+		  struct dlb2_unmap_qid_args *args,
+		  struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_unmap_qid(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_pending_port_unmaps(struct dlb2_hw *hw,
+			    u32 id,
+			    struct dlb2_pending_port_unmaps_args *args,
+			    struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_pending_port_unmaps(hw, id, args, resp, false, 0);
+}
+
+static int
 dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 			  struct dlb2_get_num_resources_args *args)
 {
@@ -378,6 +405,9 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.create_ldb_port = dlb2_pf_create_ldb_port,
 	.create_dir_port = dlb2_pf_create_dir_port,
 	.start_domain = dlb2_pf_start_domain,
+	.map_qid = dlb2_pf_map_qid,
+	.unmap_qid = dlb2_pf_unmap_qid,
+	.pending_port_unmaps = dlb2_pf_pending_port_unmaps,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
 	.ldb_port_owned_by_domain = dlb2_pf_ldb_port_owned_by_domain,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 6e98b68972f5..bc1c16ebadde 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -238,6 +238,32 @@ static struct dlb2_hw_domain *dlb2_get_domain_from_id(struct dlb2_hw *hw,
 }
 
 static struct dlb2_ldb_port *
+dlb2_get_domain_used_ldb_port(u32 id,
+			      bool vdev_req,
+			      struct dlb2_hw_domain *domain)
+{
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port)
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+	}
+
+	return NULL;
+}
+
+static struct dlb2_ldb_port *
 dlb2_get_domain_ldb_port(u32 id,
 			 bool vdev_req,
 			 struct dlb2_hw_domain *domain)
@@ -1331,6 +1357,64 @@ static int dlb2_verify_start_domain_args(struct dlb2_hw *hw,
 	return 0;
 }
 
+static int dlb2_verify_map_qid_args(struct dlb2_hw *hw,
+				    u32 domain_id,
+				    struct dlb2_map_qid_args *args,
+				    struct dlb2_cmd_response *resp,
+				    bool vdev_req,
+				    unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	struct dlb2_ldb_queue *queue;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+
+	if (!port || !port->configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	if (args->priority >= DLB2_QID_PRIORITIES) {
+		resp->status = DLB2_ST_INVALID_PRIORITY;
+		return -EINVAL;
+	}
+
+	queue = dlb2_get_domain_ldb_queue(args->qid, vdev_req, domain);
+
+	if (!queue || !queue->configured) {
+		resp->status = DLB2_ST_INVALID_QID;
+		return -EINVAL;
+	}
+
+	if (queue->domain_id.phys_id != domain->id.phys_id) {
+		resp->status = DLB2_ST_INVALID_QID;
+		return -EINVAL;
+	}
+
+	if (port->domain_id.phys_id != domain->id.phys_id) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static bool dlb2_port_find_slot(struct dlb2_ldb_port *port,
 				enum dlb2_qid_map_state state,
 				int *slot)
@@ -1365,6 +1449,26 @@ static bool dlb2_port_find_slot_queue(struct dlb2_ldb_port *port,
 	return (i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ);
 }
 
+static bool
+dlb2_port_find_slot_with_pending_map_queue(struct dlb2_ldb_port *port,
+					   struct dlb2_ldb_queue *queue,
+					   int *slot)
+{
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+		struct dlb2_ldb_port_qid_map *map = &port->qid_map[i];
+
+		if (map->state == DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP &&
+		    map->pending_qid == queue->id.phys_id)
+			break;
+	}
+
+	*slot = i;
+
+	return (i < DLB2_MAX_NUM_QIDS_PER_LDB_CQ);
+}
+
 static int dlb2_port_slot_state_transition(struct dlb2_hw *hw,
 					   struct dlb2_ldb_port *port,
 					   struct dlb2_ldb_queue *queue,
@@ -1492,6 +1596,117 @@ static int dlb2_port_slot_state_transition(struct dlb2_hw *hw,
 	return -EFAULT;
 }
 
+static int dlb2_verify_map_qid_slot_available(struct dlb2_ldb_port *port,
+					      struct dlb2_ldb_queue *queue,
+					      struct dlb2_cmd_response *resp)
+{
+	enum dlb2_qid_map_state state;
+	int i;
+
+	/* Unused slot available? */
+	if (port->num_mappings < DLB2_MAX_NUM_QIDS_PER_LDB_CQ)
+		return 0;
+
+	/*
+	 * If the queue is already mapped (from the application's perspective),
+	 * this is simply a priority update.
+	 */
+	state = DLB2_QUEUE_MAPPED;
+	if (dlb2_port_find_slot_queue(port, state, queue, &i))
+		return 0;
+
+	state = DLB2_QUEUE_MAP_IN_PROG;
+	if (dlb2_port_find_slot_queue(port, state, queue, &i))
+		return 0;
+
+	if (dlb2_port_find_slot_with_pending_map_queue(port, queue, &i))
+		return 0;
+
+	/*
+	 * If the slot contains an unmap in progress, it's considered
+	 * available.
+	 */
+	state = DLB2_QUEUE_UNMAP_IN_PROG;
+	if (dlb2_port_find_slot(port, state, &i))
+		return 0;
+
+	state = DLB2_QUEUE_UNMAPPED;
+	if (dlb2_port_find_slot(port, state, &i))
+		return 0;
+
+	resp->status = DLB2_ST_NO_QID_SLOTS_AVAILABLE;
+	return -EINVAL;
+}
+
+static int dlb2_verify_unmap_qid_args(struct dlb2_hw *hw,
+				      u32 domain_id,
+				      struct dlb2_unmap_qid_args *args,
+				      struct dlb2_cmd_response *resp,
+				      bool vdev_req,
+				      unsigned int vdev_id)
+{
+	enum dlb2_qid_map_state state;
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+	struct dlb2_ldb_port *port;
+	int slot;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+
+	if (!port || !port->configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	if (port->domain_id.phys_id != domain->id.phys_id) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	queue = dlb2_get_domain_ldb_queue(args->qid, vdev_req, domain);
+
+	if (!queue || !queue->configured) {
+		DLB2_HW_ERR(hw, "[%s()] Can't unmap unconfigured queue %d\n",
+			    __func__, args->qid);
+		resp->status = DLB2_ST_INVALID_QID;
+		return -EINVAL;
+	}
+
+	/*
+	 * Verify that the port has the queue mapped. From the application's
+	 * perspective a queue is mapped if it is actually mapped, the map is
+	 * in progress, or the map is blocked pending an unmap.
+	 */
+	state = DLB2_QUEUE_MAPPED;
+	if (dlb2_port_find_slot_queue(port, state, queue, &slot))
+		return 0;
+
+	state = DLB2_QUEUE_MAP_IN_PROG;
+	if (dlb2_port_find_slot_queue(port, state, queue, &slot))
+		return 0;
+
+	if (dlb2_port_find_slot_with_pending_map_queue(port, queue, &slot))
+		return 0;
+
+	resp->status = DLB2_ST_INVALID_QID;
+	return -EINVAL;
+}
+
 static void dlb2_configure_domain_credits(struct dlb2_hw *hw,
 					  struct dlb2_hw_domain *domain)
 {
@@ -2304,6 +2519,26 @@ static int dlb2_ldb_port_map_qid_static(struct dlb2_hw *hw,
 	return dlb2_port_slot_state_transition(hw, p, q, i, state);
 }
 
+static void dlb2_ldb_port_change_qid_priority(struct dlb2_hw *hw,
+					      struct dlb2_ldb_port *port,
+					      int slot,
+					      struct dlb2_map_qid_args *args)
+{
+	union dlb2_lsp_cq2priov r0;
+
+	/* Read-modify-write the priority and valid bit register */
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CQ2PRIOV(port->id.phys_id));
+
+	r0.field.v |= 1 << slot;
+	/* Clear the slot's previous priority before OR-ing in the new one */
+	r0.field.prio &= ~(0x7 << slot * 3);
+	r0.field.prio |= (args->priority & 0x7) << slot * 3;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(port->id.phys_id), r0.val);
+
+	dlb2_flush_csr(hw);
+
+	port->qid_map[slot].priority = args->priority;
+}
+
 static int dlb2_ldb_port_set_has_work_bits(struct dlb2_hw *hw,
 					   struct dlb2_ldb_port *port,
 					   struct dlb2_ldb_queue *queue,
@@ -2543,6 +2778,62 @@ static int dlb2_ldb_port_finish_map_qid_dynamic(struct dlb2_hw *hw,
 	return 0;
 }
 
+static unsigned int dlb2_finish_unmap_qid_procedures(struct dlb2_hw *hw);
+static unsigned int dlb2_finish_map_qid_procedures(struct dlb2_hw *hw);
+
+/*
+ * The workqueue callback runs until it completes all outstanding QID->CQ
+ * map and unmap requests. To prevent deadlock, this function gives other
+ * threads a chance to grab the resource mutex and configure hardware.
+ */
+static void dlb2_complete_queue_map_unmap(struct work_struct *work)
+{
+	struct dlb2_dev *dlb2_dev;
+	int ret;
+
+	dlb2_dev = container_of(work, struct dlb2_dev, work);
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	ret = dlb2_finish_unmap_qid_procedures(&dlb2_dev->hw);
+	ret += dlb2_finish_map_qid_procedures(&dlb2_dev->hw);
+
+	if (ret != 0)
+		/*
+		 * Relinquish the CPU so the application can process its CQs,
+		 * so this function doesn't deadlock.
+		 */
+		schedule_work(&dlb2_dev->work);
+	else
+		dlb2_dev->worker_launched = false;
+
+	mutex_unlock(&dlb2_dev->resource_mutex);
+}
+
+/**
+ * dlb2_schedule_work() - schedule deferred map and unmap processing
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function schedules a work item that re-runs until all pending
+ * map and unmap procedures are complete.
+ */
+static void dlb2_schedule_work(struct dlb2_hw *hw)
+{
+	struct dlb2_dev *dlb2_dev;
+
+	dlb2_dev = container_of(hw, struct dlb2_dev, hw);
+
+	/* Nothing to do if the worker is already running */
+	if (dlb2_dev->worker_launched)
+		return;
+
+	INIT_WORK(&dlb2_dev->work, dlb2_complete_queue_map_unmap);
+
+	schedule_work(&dlb2_dev->work);
+
+	dlb2_dev->worker_launched = true;
+}
+
 /**
  * dlb2_ldb_port_map_qid_dynamic() - perform a "dynamic" QID->CQ mapping
  * @hw: dlb2_hw handle for a particular device.
@@ -2600,6 +2891,20 @@ static int dlb2_ldb_port_map_qid_dynamic(struct dlb2_hw *hw,
 	if (ret)
 		return ret;
 
+	r0.val = DLB2_CSR_RD(hw,
+			     DLB2_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+	if (r0.field.count) {
+		/*
+		 * The queue is owed completions so it's not safe to map it
+		 * yet. Schedule a kernel thread to complete the mapping later,
+		 * once software has completed all the queue's inflight events.
+		 */
+		dlb2_schedule_work(hw);
+
+		return 1;
+	}
+
 	/*
 	 * Disable the affected CQ, and the CQs already mapped to the QID,
 	 * before reading the QID's inflight count a second time. There is an
@@ -2622,6 +2927,13 @@ static int dlb2_ldb_port_map_qid_dynamic(struct dlb2_hw *hw,
 
 		dlb2_ldb_queue_enable_mapped_cqs(hw, domain, queue);
 
+		/*
+		 * The queue is owed completions so it's not safe to map it
+		 * yet. Schedule a kernel thread to complete the mapping later,
+		 * once software has completed all the queue's inflight events.
+		 */
+		dlb2_schedule_work(hw);
+
 		return 1;
 	}
 
@@ -3480,6 +3792,20 @@ dlb2_domain_finish_unmap_qid_procedures(struct dlb2_hw *hw,
 	return domain->num_pending_removals;
 }
 
+static unsigned int dlb2_finish_unmap_qid_procedures(struct dlb2_hw *hw)
+{
+	int i, num = 0;
+
+	/* Finish queue unmap jobs for any domain that needs it */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		struct dlb2_hw_domain *domain = &hw->domains[i];
+
+		num += dlb2_domain_finish_unmap_qid_procedures(hw, domain);
+	}
+
+	return num;
+}
+
 static void dlb2_domain_finish_map_port(struct dlb2_hw *hw,
 					struct dlb2_hw_domain *domain,
 					struct dlb2_ldb_port *port)
@@ -3556,6 +3882,427 @@ dlb2_domain_finish_map_qid_procedures(struct dlb2_hw *hw,
 	return domain->num_pending_additions;
 }
 
+static unsigned int dlb2_finish_map_qid_procedures(struct dlb2_hw *hw)
+{
+	int i, num = 0;
+
+	/* Finish queue map jobs for any domain that needs it */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		struct dlb2_hw_domain *domain = &hw->domains[i];
+
+		num += dlb2_domain_finish_map_qid_procedures(hw, domain);
+	}
+
+	return num;
+}
+
+static void dlb2_log_map_qid(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_map_qid_args *args,
+			     bool vdev_req,
+			     unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 map QID arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tPort ID:   %d\n",
+		    args->port_id);
+	DLB2_HW_DBG(hw, "\tQueue ID:  %d\n",
+		    args->qid);
+	DLB2_HW_DBG(hw, "\tPriority:  %d\n",
+		    args->priority);
+}
+
+int dlb2_hw_map_qid(struct dlb2_hw *hw,
+		    u32 domain_id,
+		    struct dlb2_map_qid_args *args,
+		    struct dlb2_cmd_response *resp,
+		    bool vdev_req,
+		    unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+	enum dlb2_qid_map_state st;
+	struct dlb2_ldb_port *port;
+	int ret, i, id;
+	u8 prio;
+
+	dlb2_log_map_qid(hw, domain_id, args, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_map_qid_args(hw,
+				       domain_id,
+				       args,
+				       resp,
+				       vdev_req,
+				       vdev_id);
+	if (ret)
+		return ret;
+
+	prio = args->priority;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	queue = dlb2_get_domain_ldb_queue(args->qid, vdev_req, domain);
+	if (!queue) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: queue not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/*
+	 * If there are any outstanding detach operations for this port,
+	 * attempt to complete them. This may be necessary to free up a QID
+	 * slot for this requested mapping.
+	 */
+	if (port->num_pending_removals)
+		dlb2_domain_finish_unmap_port(hw, domain, port);
+
+	ret = dlb2_verify_map_qid_slot_available(port, queue, resp);
+	if (ret)
+		return ret;
+
+	/* Hardware requires disabling the CQ before mapping QIDs. */
+	if (port->enabled)
+		dlb2_ldb_port_cq_disable(hw, port);
+
+	/*
+	 * If this is only a priority change, don't perform the full QID->CQ
+	 * mapping procedure
+	 */
+	st = DLB2_QUEUE_MAPPED;
+	if (dlb2_port_find_slot_queue(port, st, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		if (prio != port->qid_map[i].priority) {
+			dlb2_ldb_port_change_qid_priority(hw, port, i, args);
+			DLB2_HW_DBG(hw, "DLB2 map: priority change\n");
+		}
+
+		st = DLB2_QUEUE_MAPPED;
+		ret = dlb2_port_slot_state_transition(hw, port, queue, i, st);
+		if (ret)
+			return ret;
+
+		goto map_qid_done;
+	}
+
+	st = DLB2_QUEUE_UNMAP_IN_PROG;
+	if (dlb2_port_find_slot_queue(port, st, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		if (prio != port->qid_map[i].priority) {
+			dlb2_ldb_port_change_qid_priority(hw, port, i, args);
+			DLB2_HW_DBG(hw, "DLB2 map: priority change\n");
+		}
+
+		st = DLB2_QUEUE_MAPPED;
+		ret = dlb2_port_slot_state_transition(hw, port, queue, i, st);
+		if (ret)
+			return ret;
+
+		goto map_qid_done;
+	}
+
+	/*
+	 * If this is a priority change on an in-progress mapping, don't
+	 * perform the full QID->CQ mapping procedure.
+	 */
+	st = DLB2_QUEUE_MAP_IN_PROG;
+	if (dlb2_port_find_slot_queue(port, st, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		port->qid_map[i].priority = prio;
+
+		DLB2_HW_DBG(hw, "DLB2 map: priority change only\n");
+
+		goto map_qid_done;
+	}
+
+	/*
+	 * If this is a priority change on a pending mapping, update the
+	 * pending priority
+	 */
+	if (dlb2_port_find_slot_with_pending_map_queue(port, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		port->qid_map[i].pending_priority = prio;
+
+		DLB2_HW_DBG(hw, "DLB2 map: priority change only\n");
+
+		goto map_qid_done;
+	}
+
+	/*
+	 * If all the CQ's slots are in use, then there's an unmap in progress
+	 * (guaranteed by dlb2_verify_map_qid_slot_available()), so add this
+	 * mapping to pending_map and return. When the removal is completed for
+	 * the slot's current occupant, this mapping will be performed.
+	 */
+	if (!dlb2_port_find_slot(port, DLB2_QUEUE_UNMAPPED, &i)) {
+		if (dlb2_port_find_slot(port, DLB2_QUEUE_UNMAP_IN_PROG, &i)) {
+			enum dlb2_qid_map_state st;
+
+			if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+				DLB2_HW_ERR(hw,
+					    "[%s():%d] Internal error: port slot tracking failed\n",
+					    __func__, __LINE__);
+				return -EFAULT;
+			}
+
+			port->qid_map[i].pending_qid = queue->id.phys_id;
+			port->qid_map[i].pending_priority = prio;
+
+			st = DLB2_QUEUE_UNMAP_IN_PROG_PENDING_MAP;
+
+			ret = dlb2_port_slot_state_transition(hw, port, queue,
+							      i, st);
+			if (ret)
+				return ret;
+
+			DLB2_HW_DBG(hw, "DLB2 map: map pending removal\n");
+
+			goto map_qid_done;
+		}
+	}
+
+	/*
+	 * If the domain has started, a special "dynamic" CQ->queue mapping
+	 * procedure is required in order to safely update the CQ<->QID tables.
+	 * The "static" procedure cannot be used when traffic is flowing,
+	 * because the CQ<->QID tables cannot be updated atomically and the
+	 * scheduler won't see the new mapping unless the queue's if_status
+	 * changes, which isn't guaranteed.
+	 */
+	ret = dlb2_ldb_port_map_qid(hw, domain, port, queue, prio);
+
+	/* If ret is less than zero, it's due to an internal error */
+	if (ret < 0)
+		return ret;
+
+map_qid_done:
+	if (port->enabled)
+		dlb2_ldb_port_cq_enable(hw, port);
+
+	resp->status = 0;
+
+	return 0;
+}
+
+static void dlb2_log_unmap_qid(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_unmap_qid_args *args,
+			       bool vdev_req,
+			       unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 unmap QID arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tPort ID:   %d\n",
+		    args->port_id);
+	DLB2_HW_DBG(hw, "\tQueue ID:  %d\n",
+		    args->qid);
+	if (args->qid < DLB2_MAX_NUM_LDB_QUEUES)
+		DLB2_HW_DBG(hw, "\tQueue's num mappings:  %d\n",
+			    hw->rsrcs.ldb_queues[args->qid].num_mappings);
+}
+
+int dlb2_hw_unmap_qid(struct dlb2_hw *hw,
+		      u32 domain_id,
+		      struct dlb2_unmap_qid_args *args,
+		      struct dlb2_cmd_response *resp,
+		      bool vdev_req,
+		      unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_queue *queue;
+	enum dlb2_qid_map_state st;
+	struct dlb2_ldb_port *port;
+	bool unmap_complete;
+	int i, ret, id;
+
+	dlb2_log_unmap_qid(hw, domain_id, args, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_unmap_qid_args(hw,
+					 domain_id,
+					 args,
+					 resp,
+					 vdev_req,
+					 vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	queue = dlb2_get_domain_ldb_queue(args->qid, vdev_req, domain);
+	if (!queue) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: queue not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/*
+	 * If the queue hasn't been mapped yet, we need to update the slot's
+	 * state and re-enable the queue's inflights.
+	 */
+	st = DLB2_QUEUE_MAP_IN_PROG;
+	if (dlb2_port_find_slot_queue(port, st, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		/*
+		 * Since the in-progress map was aborted, re-enable the QID's
+		 * inflights.
+		 */
+		if (queue->num_pending_additions == 0)
+			dlb2_ldb_queue_set_inflight_limit(hw, queue);
+
+		st = DLB2_QUEUE_UNMAPPED;
+		ret = dlb2_port_slot_state_transition(hw, port, queue, i, st);
+		if (ret)
+			return ret;
+
+		goto unmap_qid_done;
+	}
+
+	/*
+	 * If the queue mapping is on hold pending an unmap, we simply need to
+	 * update the slot's state.
+	 */
+	if (dlb2_port_find_slot_with_pending_map_queue(port, queue, &i)) {
+		if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+			DLB2_HW_ERR(hw,
+				    "[%s():%d] Internal error: port slot tracking failed\n",
+				    __func__, __LINE__);
+			return -EFAULT;
+		}
+
+		st = DLB2_QUEUE_UNMAP_IN_PROG;
+		ret = dlb2_port_slot_state_transition(hw, port, queue, i, st);
+		if (ret)
+			return ret;
+
+		goto unmap_qid_done;
+	}
+
+	st = DLB2_QUEUE_MAPPED;
+	if (!dlb2_port_find_slot_queue(port, st, queue, &i)) {
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: no available CQ slots\n",
+			    __func__);
+		return -EFAULT;
+	}
+
+	if (i >= DLB2_MAX_NUM_QIDS_PER_LDB_CQ) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port slot tracking failed\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/*
+	 * QID->CQ mapping removal is an asynchronous procedure. It requires
+	 * stopping the DLB2 from scheduling this CQ, draining all inflights
+	 * from the CQ, then unmapping the queue from the CQ. This function
+	 * simply marks the port as needing the queue unmapped, and (if
+	 * necessary) starts the unmapping worker thread.
+	 */
+	dlb2_ldb_port_cq_disable(hw, port);
+
+	st = DLB2_QUEUE_UNMAP_IN_PROG;
+	ret = dlb2_port_slot_state_transition(hw, port, queue, i, st);
+	if (ret)
+		return ret;
+
+	/*
+	 * Attempt to finish the unmapping now, in case the port has no
+	 * outstanding inflights. If that's not the case, this will fail and
+	 * the unmapping will be completed at a later time.
+	 */
+	unmap_complete = dlb2_domain_finish_unmap_port(hw, domain, port);
+
+	/*
+	 * If the unmapping couldn't complete immediately, launch the worker
+	 * thread (if it isn't already launched) to finish it later.
+	 */
+	if (!unmap_complete)
+		dlb2_schedule_work(hw);
+
+unmap_qid_done:
+	resp->status = 0;
+
+	return 0;
+}
+
 static u32 dlb2_ldb_cq_inflight_count(struct dlb2_hw *hw,
 				      struct dlb2_ldb_port *port)
 {
@@ -3890,6 +4637,48 @@ int dlb2_hw_get_dir_queue_depth(struct dlb2_hw *hw,
 	return 0;
 }
 
+static void
+dlb2_log_pending_port_unmaps_args(struct dlb2_hw *hw,
+				  struct dlb2_pending_port_unmaps_args *args,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB unmaps in progress arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from VF %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tPort ID: %d\n", args->port_id);
+}
+
+int dlb2_hw_pending_port_unmaps(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_pending_port_unmaps_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_req,
+				unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+
+	dlb2_log_pending_port_unmaps_args(hw, args, vdev_req, vdev_id);
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	port = dlb2_get_domain_used_ldb_port(args->port_id, vdev_req, domain);
+	if (!port || !port->configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	resp->id = port->num_pending_removals;
+
+	return 0;
+}
+
 static u32 dlb2_ldb_queue_depth(struct dlb2_hw *hw,
 				struct dlb2_ldb_queue *queue)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index cd865c756c6d..80bebeee7bb6 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -239,6 +239,92 @@ int dlb2_hw_start_domain(struct dlb2_hw *hw,
 			 unsigned int vdev_id);
 
 /**
+ * dlb2_hw_map_qid() - map a load-balanced queue to a load-balanced port
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: map QID arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to schedule QEs from the specified queue
+ * to the specified port. Each load-balanced port can be mapped to up to 8
+ * queues; each load-balanced queue can potentially map to all the
+ * load-balanced ports.
+ *
+ * A successful return does not necessarily mean the mapping was configured. If
+ * this function is unable to immediately map the queue to the port, it will
+ * add the requested operation to a per-port list of pending map/unmap
+ * operations, and (if it's not already running) launch a kernel thread that
+ * periodically attempts to process all pending operations. In a sense, this is
+ * an asynchronous function.
+ *
+ * This asynchronicity creates two views of the state of hardware: the actual
+ * hardware state and the requested state (as if every request completed
+ * immediately). If there are any pending map/unmap operations, the requested
+ * state will differ from the actual state. All validation is performed with
+ * respect to the pending state; for instance, if there are 8 pending map
+ * operations for port X, a request for a 9th will fail because a load-balanced
+ * port can only map up to 8 queues.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, invalid port or queue ID, or
+ *	    the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_map_qid(struct dlb2_hw *hw,
+		    u32 domain_id,
+		    struct dlb2_map_qid_args *args,
+		    struct dlb2_cmd_response *resp,
+		    bool vdev_request,
+		    unsigned int vdev_id);
+
+/**
+ * dlb2_hw_unmap_qid() - Unmap a load-balanced queue from a load-balanced port
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: unmap QID arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs from the specified
+ * queue to the specified port.
+ *
+ * A successful return does not necessarily mean the mapping was removed. If
+ * this function is unable to immediately unmap the queue from the port, it
+ * will add the requested operation to a per-port list of pending map/unmap
+ * operations, and (if it's not already running) launch a kernel thread that
+ * periodically attempts to process all pending operations. See
+ * dlb2_hw_map_qid() for more details.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, invalid port or queue ID, or
+ *	    the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_unmap_qid(struct dlb2_hw *hw,
+		      u32 domain_id,
+		      struct dlb2_unmap_qid_args *args,
+		      struct dlb2_cmd_response *resp,
+		      bool vdev_request,
+		      unsigned int vdev_id);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
@@ -417,6 +503,31 @@ enum dlb2_virt_mode {
 };
 
 /**
+ * dlb2_hw_pending_port_unmaps() - returns the number of unmap operations in
+ *	progress.
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: pending port unmaps arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the number of unmaps in progress.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID or port ID.
+ */
+int dlb2_hw_pending_port_unmaps(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_pending_port_unmaps_args *args,
+				struct dlb2_cmd_response *resp,
+				bool vdev_request,
+				unsigned int vdev_id);
+
+/**
  * dlb2_hw_enable_sparse_ldb_cq_mode() - enable sparse mode for load-balanced
  *	ports.
  * @hw: dlb2_hw handle for a particular device.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index d409586466b1..bc692eb73bc8 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -490,6 +490,49 @@ struct dlb2_start_domain_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_MAP_QID: Map a load-balanced queue to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ * - priority: Queue->port service priority.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_map_qid_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 qid;
+	__u32 priority;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_UNMAP_QID: Unmap a load-balanced queue from a load-balanced
+ *	port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_unmap_qid_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 qid;
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -530,6 +573,30 @@ struct dlb2_get_dir_queue_depth_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS: Get number of queue unmap operations in
+ *	progress for a load-balanced port.
+ *
+ *	Note: This is a snapshot; the number of unmap operations in progress
+ *	is subject to change at any time.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: number of unmaps in progress.
+ */
+struct dlb2_pending_port_unmaps_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
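+
+/*
+ * A minimal sketch of the asynchronous unmap flow from user space (assumes
+ * domain_fd is an open scheduling domain file descriptor and port_id/qid
+ * identify an existing queue->port mapping):
+ *
+ *	struct dlb2_cmd_response resp = {0};
+ *	struct dlb2_unmap_qid_args unmap = {0};
+ *	struct dlb2_pending_port_unmaps_args pending = {0};
+ *
+ *	unmap.response = (__u64)(uintptr_t)&resp;
+ *	unmap.port_id = port_id;
+ *	unmap.qid = qid;
+ *	if (ioctl(domain_fd, DLB2_IOC_UNMAP_QID, &unmap))
+ *		return -1;
+ *
+ *	pending.response = (__u64)(uintptr_t)&resp;
+ *	pending.port_id = port_id;
+ *	do {
+ *		if (ioctl(domain_fd, DLB2_IOC_PENDING_PORT_UNMAPS, &pending))
+ *			return -1;
+ *	} while (resp.id > 0);
+ *
+ * The unmap ioctl returns as soon as the request is accepted; the loop then
+ * polls the snapshot returned in resp.id until the device has completed all
+ * pending unmaps for the port.
+ */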
+
+/*
  * DLB2_CMD_GET_LDB_PORT_PP_FD: Get file descriptor to mmap a load-balanced
  *	port's producer port (PP).
  * DLB2_CMD_GET_LDB_PORT_CQ_FD: Get file descriptor to mmap a load-balanced
@@ -570,8 +637,11 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_CREATE_LDB_PORT,
 	DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
 	DLB2_DOMAIN_CMD_START_DOMAIN,
+	DLB2_DOMAIN_CMD_MAP_QID,
+	DLB2_DOMAIN_CMD_UNMAP_QID,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
+	DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS,
 	DLB2_DOMAIN_CMD_GET_LDB_PORT_PP_FD,
 	DLB2_DOMAIN_CMD_GET_LDB_PORT_CQ_FD,
 	DLB2_DOMAIN_CMD_GET_DIR_PORT_PP_FD,
@@ -638,6 +708,14 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_START_DOMAIN,		\
 		      struct dlb2_start_domain_args)
+#define DLB2_IOC_MAP_QID					\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_MAP_QID,			\
+		      struct dlb2_map_qid_args)
+#define DLB2_IOC_UNMAP_QID					\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_UNMAP_QID,		\
+		      struct dlb2_unmap_qid_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
@@ -646,6 +724,10 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,	\
 		      struct dlb2_get_dir_queue_depth_args)
+#define DLB2_IOC_PENDING_PORT_UNMAPS				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS,	\
+		      struct dlb2_pending_port_unmaps_args)
 #define DLB2_IOC_GET_LDB_PORT_PP_FD				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_PORT_PP_FD,	\
-- 
2.13.6



* [PATCH 13/20] dlb2: add port enable/disable ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (11 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 12/20] dlb2: add queue map and unmap ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 14/20] dlb2: add CQ interrupt support Gage Eads
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

These ioctls can be used to dynamically enable or disable scheduling to a
port. (By default, ports start with scheduling enabled.) Doing so allows
software to, for example, quickly add cores to or remove cores from a
worker pool, as in the sketch below.
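
A minimal user-space sketch (illustrative only; it assumes domain_fd is an
open scheduling domain file descriptor and that struct dlb2_cmd_response is
the response structure introduced earlier in the series):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/dlb2_user.h>

	/* Remove a load-balanced port's core from the worker pool. */
	static int park_worker(int domain_fd, uint32_t port_id)
	{
		struct dlb2_cmd_response resp = {0};
		struct dlb2_disable_ldb_port_args args = {0};

		args.response = (uintptr_t)&resp;
		args.port_id = port_id;

		/* On failure, resp.status holds a detailed error code. */
		return ioctl(domain_fd, DLB2_IOC_DISABLE_LDB_PORT, &args);
	}

	/* Re-enable scheduling when the core rejoins the pool. */
	static int unpark_worker(int domain_fd, uint32_t port_id)
	{
		struct dlb2_cmd_response resp = {0};
		struct dlb2_enable_ldb_port_args args = {0};

		args.response = (uintptr_t)&resp;
		args.port_id = port_id;

		return ioctl(domain_fd, DLB2_IOC_ENABLE_LDB_PORT, &args);
	}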

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    | 152 +++++++++++++++
 drivers/misc/dlb2/dlb2_main.h     |  16 ++
 drivers/misc/dlb2/dlb2_pf_ops.c   |  40 ++++
 drivers/misc/dlb2/dlb2_resource.c | 382 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h | 120 ++++++++++++
 include/uapi/linux/dlb2_user.h    |  96 ++++++++++
 6 files changed, 806 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index 65d7ab82161c..ff1037ffee3f 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -101,6 +101,154 @@ DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(get_dir_queue_depth)
 DLB2_DOMAIN_IOCTL_CALLBACK_TEMPLATE(pending_port_unmaps)
 
 /*
+ * Port enable/disable ioctls don't use the callback template macro because
+ * they have additional CQ interrupt management logic.
+ */
+static int dlb2_domain_ioctl_enable_ldb_port(struct dlb2_dev *dev,
+					     struct dlb2_domain *domain,
+					     unsigned long user_arg,
+					     u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_enable_ldb_port_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->enable_ldb_port(&dev->hw, domain->id, &arg, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_enable_dir_port(struct dlb2_dev *dev,
+					     struct dlb2_domain *domain,
+					     unsigned long user_arg,
+					     u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_enable_dir_port_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->enable_dir_port(&dev->hw, domain->id, &arg, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_disable_ldb_port(struct dlb2_dev *dev,
+					      struct dlb2_domain *domain,
+					      unsigned long user_arg,
+					      u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_disable_ldb_port_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->disable_ldb_port(&dev->hw, domain->id, &arg, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_domain_ioctl_disable_dir_port(struct dlb2_dev *dev,
+					      struct dlb2_domain *domain,
+					      unsigned long user_arg,
+					      u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_disable_dir_port_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->disable_dir_port(&dev->hw, domain->id, &arg, &response);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+/*
  * Port creation ioctls don't use the callback template macro because they have
  * a number of OS-dependent memory operations.
  */
@@ -472,6 +620,10 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_start_domain,
 	dlb2_domain_ioctl_map_qid,
 	dlb2_domain_ioctl_unmap_qid,
+	dlb2_domain_ioctl_enable_ldb_port,
+	dlb2_domain_ioctl_enable_dir_port,
+	dlb2_domain_ioctl_disable_ldb_port,
+	dlb2_domain_ioctl_disable_dir_port,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
 	dlb2_domain_ioctl_pending_port_unmaps,
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index bca24e3a4f84..edd5d9703ebb 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -94,6 +94,22 @@ struct dlb2_device_ops {
 				   u32 domain_id,
 				   struct dlb2_pending_port_unmaps_args *args,
 				   struct dlb2_cmd_response *resp);
+	int (*enable_ldb_port)(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_enable_ldb_port_args *args,
+			       struct dlb2_cmd_response *resp);
+	int (*disable_ldb_port)(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_disable_ldb_port_args *args,
+				struct dlb2_cmd_response *resp);
+	int (*enable_dir_port)(struct dlb2_hw *hw,
+			       u32 domain_id,
+			       struct dlb2_enable_dir_port_args *args,
+			       struct dlb2_cmd_response *resp);
+	int (*disable_dir_port)(struct dlb2_hw *hw,
+				u32 domain_id,
+				struct dlb2_disable_dir_port_args *args,
+				struct dlb2_cmd_response *resp);
 	int (*get_num_resources)(struct dlb2_hw *hw,
 				 struct dlb2_get_num_resources_args *args);
 	int (*reset_domain)(struct dlb2_hw *hw, u32 domain_id);
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 51562978ee1a..b5b1b14bcb7e 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -318,6 +318,42 @@ dlb2_pf_pending_port_unmaps(struct dlb2_hw *hw,
 }
 
 static int
+dlb2_pf_enable_ldb_port(struct dlb2_hw *hw,
+			u32 id,
+			struct dlb2_enable_ldb_port_args *args,
+			struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_enable_ldb_port(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_disable_ldb_port(struct dlb2_hw *hw,
+			 u32 id,
+			 struct dlb2_disable_ldb_port_args *args,
+			 struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_disable_ldb_port(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_enable_dir_port(struct dlb2_hw *hw,
+			u32 id,
+			struct dlb2_enable_dir_port_args *args,
+			struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_enable_dir_port(hw, id, args, resp, false, 0);
+}
+
+static int
+dlb2_pf_disable_dir_port(struct dlb2_hw *hw,
+			 u32 id,
+			 struct dlb2_disable_dir_port_args *args,
+			 struct dlb2_cmd_response *resp)
+{
+	return dlb2_hw_disable_dir_port(hw, id, args, resp, false, 0);
+}
+
+static int
 dlb2_pf_get_num_resources(struct dlb2_hw *hw,
 			  struct dlb2_get_num_resources_args *args)
 {
@@ -408,6 +444,10 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.map_qid = dlb2_pf_map_qid,
 	.unmap_qid = dlb2_pf_unmap_qid,
 	.pending_port_unmaps = dlb2_pf_pending_port_unmaps,
+	.enable_ldb_port = dlb2_pf_enable_ldb_port,
+	.enable_dir_port = dlb2_pf_enable_dir_port,
+	.disable_ldb_port = dlb2_pf_disable_ldb_port,
+	.disable_dir_port = dlb2_pf_disable_dir_port,
 	.get_num_resources = dlb2_pf_get_num_resources,
 	.reset_domain = dlb2_pf_reset_domain,
 	.ldb_port_owned_by_domain = dlb2_pf_ldb_port_owned_by_domain,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index bc1c16ebadde..a7e28088458e 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -1707,6 +1707,150 @@ static int dlb2_verify_unmap_qid_args(struct dlb2_hw *hw,
 	return -EINVAL;
 }
 
+static int
+dlb2_verify_enable_ldb_port_args(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 struct dlb2_enable_ldb_port_args *args,
+				 struct dlb2_cmd_response *resp,
+				 bool vdev_req,
+				 unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+
+	if (!port || !port->configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+dlb2_verify_enable_dir_port_args(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 struct dlb2_enable_dir_port_args *args,
+				 struct dlb2_cmd_response *resp,
+				 bool vdev_req,
+				 unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_dir_pq_pair *port;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_dir_pq(id, vdev_req, domain);
+
+	if (!port || !port->port_configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+dlb2_verify_disable_ldb_port_args(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  struct dlb2_disable_ldb_port_args *args,
+				  struct dlb2_cmd_response *resp,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+
+	if (!port || !port->configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+dlb2_verify_disable_dir_port_args(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  struct dlb2_disable_dir_port_args *args,
+				  struct dlb2_cmd_response *resp,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_dir_pq_pair *port;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	if (!domain) {
+		resp->status = DLB2_ST_INVALID_DOMAIN_ID;
+		return -EINVAL;
+	}
+
+	if (!domain->configured) {
+		resp->status = DLB2_ST_DOMAIN_NOT_CONFIGURED;
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_dir_pq(id, vdev_req, domain);
+
+	if (!port || !port->port_configured) {
+		resp->status = DLB2_ST_INVALID_PORT_ID;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static void dlb2_configure_domain_credits(struct dlb2_hw *hw,
 					  struct dlb2_hw_domain *domain)
 {
@@ -4303,6 +4447,244 @@ int dlb2_hw_unmap_qid(struct dlb2_hw *hw,
 	return 0;
 }
 
+static void dlb2_log_enable_port(struct dlb2_hw *hw,
+				 u32 domain_id,
+				 u32 port_id,
+				 bool vdev_req,
+				 unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 enable port arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tPort ID:   %d\n",
+		    port_id);
+}
+
+int dlb2_hw_enable_ldb_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_enable_ldb_port_args *args,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_req,
+			    unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int ret, id;
+
+	dlb2_log_enable_port(hw, domain_id, args->port_id, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_enable_ldb_port_args(hw,
+					       domain_id,
+					       args,
+					       resp,
+					       vdev_req,
+					       vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+
+	/* Enable the port's CQ if it isn't already enabled. */
+	if (!port->enabled) {
+		dlb2_ldb_port_cq_enable(hw, port);
+		port->enabled = true;
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
+
+static void dlb2_log_disable_port(struct dlb2_hw *hw,
+				  u32 domain_id,
+				  u32 port_id,
+				  bool vdev_req,
+				  unsigned int vdev_id)
+{
+	DLB2_HW_DBG(hw, "DLB2 disable port arguments:\n");
+	if (vdev_req)
+		DLB2_HW_DBG(hw, "(Request from vdev %d)\n", vdev_id);
+	DLB2_HW_DBG(hw, "\tDomain ID: %d\n",
+		    domain_id);
+	DLB2_HW_DBG(hw, "\tPort ID:   %d\n",
+		    port_id);
+}
+
+int dlb2_hw_disable_ldb_port(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_disable_ldb_port_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_req,
+			     unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int ret, id;
+
+	dlb2_log_disable_port(hw, domain_id, args->port_id, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_disable_ldb_port_args(hw,
+						domain_id,
+						args,
+						resp,
+						vdev_req,
+						vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/* Disable the port's CQ if it isn't already disabled. */
+	if (port->enabled) {
+		dlb2_ldb_port_cq_disable(hw, port);
+		port->enabled = false;
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
+
+int dlb2_hw_enable_dir_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_enable_dir_port_args *args,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_req,
+			    unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *port;
+	struct dlb2_hw_domain *domain;
+	int ret, id;
+
+	dlb2_log_enable_port(hw, domain_id, args->port_id, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_enable_dir_port_args(hw,
+					       domain_id,
+					       args,
+					       resp,
+					       vdev_req,
+					       vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_dir_pq(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/* Enable the port's CQ if it isn't already enabled. */
+	if (!port->enabled) {
+		dlb2_dir_port_cq_enable(hw, port);
+		port->enabled = true;
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
+
+int dlb2_hw_disable_dir_port(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_disable_dir_port_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_req,
+			     unsigned int vdev_id)
+{
+	struct dlb2_dir_pq_pair *port;
+	struct dlb2_hw_domain *domain;
+	int ret, id;
+
+	dlb2_log_disable_port(hw, domain_id, args->port_id, vdev_req, vdev_id);
+
+	/*
+	 * Verify that hardware resources are available before attempting to
+	 * satisfy the request. This simplifies the error unwinding code.
+	 */
+	ret = dlb2_verify_disable_dir_port_args(hw,
+						domain_id,
+						args,
+						resp,
+						vdev_req,
+						vdev_id);
+	if (ret)
+		return ret;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_used_dir_pq(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EFAULT;
+	}
+
+	/* Disable the port's CQ if it isn't already disabled. */
+	if (port->enabled) {
+		dlb2_dir_port_cq_disable(hw, port);
+		port->enabled = false;
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
+
 static u32 dlb2_ldb_cq_inflight_count(struct dlb2_hw *hw,
 				      struct dlb2_ldb_port *port)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 80bebeee7bb6..bc02d01e2a5e 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -325,6 +325,126 @@ int dlb2_hw_unmap_qid(struct dlb2_hw *hw,
 		      unsigned int vdev_id);
 
 /**
+ * dlb2_hw_enable_ldb_port() - enable a load-balanced port for scheduling
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port enable arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to schedule QEs to a load-balanced port.
+ * Ports are enabled by default.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_enable_ldb_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_enable_ldb_port_args *args,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_request,
+			    unsigned int vdev_id);
+
+/**
+ * dlb2_hw_disable_ldb_port() - disable a load-balanced port for scheduling
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port disable arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs to a load-balanced
+ * port. Ports are enabled by default.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_disable_ldb_port(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_disable_ldb_port_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_request,
+			     unsigned int vdev_id);
+
+/**
+ * dlb2_hw_enable_dir_port() - enable a directed port for scheduling
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port enable arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to schedule QEs to a directed port.
+ * Ports are enabled by default.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_enable_dir_port(struct dlb2_hw *hw,
+			    u32 domain_id,
+			    struct dlb2_enable_dir_port_args *args,
+			    struct dlb2_cmd_response *resp,
+			    bool vdev_request,
+			    unsigned int vdev_id);
+
+/**
+ * dlb2_hw_disable_dir_port() - disable a directed port for scheduling
+ * @hw: dlb2_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port disable arguments.
+ * @resp: response structure.
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs to a directed port.
+ * Ports are enabled by default.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb2_hw_disable_dir_port(struct dlb2_hw *hw,
+			     u32 domain_id,
+			     struct dlb2_disable_dir_port_args *args,
+			     struct dlb2_cmd_response *resp,
+			     bool vdev_request,
+			     unsigned int vdev_id);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index bc692eb73bc8..763ae7eb7a23 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -533,6 +533,82 @@ struct dlb2_unmap_qid_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_ENABLE_LDB_PORT: Enable scheduling to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_enable_ldb_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_ENABLE_DIR_PORT: Enable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_enable_dir_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_DISABLE_LDB_PORT: Disable scheduling to a load-balanced
+ *	port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_disable_ldb_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_DOMAIN_CMD_DISABLE_DIR_PORT: Disable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_disable_dir_port_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u32 padding0;
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -639,6 +715,10 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_START_DOMAIN,
 	DLB2_DOMAIN_CMD_MAP_QID,
 	DLB2_DOMAIN_CMD_UNMAP_QID,
+	DLB2_DOMAIN_CMD_ENABLE_LDB_PORT,
+	DLB2_DOMAIN_CMD_ENABLE_DIR_PORT,
+	DLB2_DOMAIN_CMD_DISABLE_LDB_PORT,
+	DLB2_DOMAIN_CMD_DISABLE_DIR_PORT,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS,
@@ -716,6 +796,22 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_UNMAP_QID,		\
 		      struct dlb2_unmap_qid_args)
+#define DLB2_IOC_ENABLE_LDB_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_ENABLE_LDB_PORT,		\
+		      struct dlb2_enable_ldb_port_args)
+#define DLB2_IOC_ENABLE_DIR_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_ENABLE_DIR_PORT,		\
+		      struct dlb2_enable_dir_port_args)
+#define DLB2_IOC_DISABLE_LDB_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_DISABLE_LDB_PORT,		\
+		      struct dlb2_disable_ldb_port_args)
+#define DLB2_IOC_DISABLE_DIR_PORT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_DISABLE_DIR_PORT,		\
+		      struct dlb2_disable_dir_port_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
-- 
2.13.6



* [PATCH 14/20] dlb2: add CQ interrupt support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (12 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 13/20] dlb2: add port enable/disable ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 15/20] dlb2: add domain alert support Gage Eads
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

DLB 2.0 supports interrupt-driven applications with per-CQ interrupts.
User-space software arms a CQ interrupt by enqueueing a special command to
the device through the port's MMIO window; the interrupt fires when the
armed CQ becomes non-empty.

All CQ interrupts share a single MSI-X interrupt vector, and the ISR reads
bitmap registers to determine which CQs' interrupts fired. For each such
CQ, the driver wakes the wait queue on which user-space threads may be
blocked waiting for the interrupt. User-space software blocks on a CQ's
wait queue by calling the block-on-CQ-interrupt ioctl.

A CQ's interrupt is enabled when its port is configured, and interrupts are
enabled/disabled when a port is enabled/disabled. If a port is disabled and
a thread is blocked on the wait queue, the thread is woken and returned to
user-space.
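
A minimal user-space sketch of the blocking flow (illustrative only: it
assumes domain_fd is an open scheduling domain file descriptor, cq_base
points at the next entry of the port's mapped consumer queue, and the
DLB2_IOC_BLOCK_ON_CQ_INTERRUPT macro and argument field names follow the
ioctl handler added in this patch):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/dlb2_user.h>

	/* Sleep until LDB port 0's CQ becomes non-empty. */
	static int wait_for_cq(int domain_fd, void *cq_base, uint8_t cq_gen)
	{
		struct dlb2_cmd_response resp = {0};
		struct dlb2_block_on_cq_interrupt_args args = {0};

		args.response = (uintptr_t)&resp;
		args.port_id = 0;
		args.is_ldb = 1;
		args.cq_va = (uintptr_t)cq_base; /* next CQ entry to be written */
		args.cq_gen = cq_gen;            /* current generation bit */
		args.arm = 1;                    /* arm before blocking */

		/* Returns immediately if the CQ is already non-empty. */
		return ioctl(domain_fd, DLB2_IOC_BLOCK_ON_CQ_INTERRUPT, &args);
	}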

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/Makefile        |   1 +
 drivers/misc/dlb2/dlb2_hw_types.h |  13 ++
 drivers/misc/dlb2/dlb2_intr.c     | 133 +++++++++++++++++
 drivers/misc/dlb2/dlb2_intr.h     |  29 ++++
 drivers/misc/dlb2/dlb2_ioctl.c    |  80 ++++++++++
 drivers/misc/dlb2/dlb2_main.c     |  20 ++-
 drivers/misc/dlb2/dlb2_main.h     |  43 ++++++
 drivers/misc/dlb2/dlb2_pf_ops.c   | 219 +++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.c | 306 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h | 128 ++++++++++++++++
 include/uapi/linux/dlb2_user.h    |  38 +++++
 11 files changed, 1009 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/dlb2/dlb2_intr.c
 create mode 100644 drivers/misc/dlb2/dlb2_intr.h

diff --git a/drivers/misc/dlb2/Makefile b/drivers/misc/dlb2/Makefile
index 12361461dcff..64ec27489b73 100644
--- a/drivers/misc/dlb2/Makefile
+++ b/drivers/misc/dlb2/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_INTEL_DLB2) := dlb2.o
 
 dlb2-objs :=      \
   dlb2_main.o     \
+  dlb2_intr.o     \
   dlb2_file.o     \
   dlb2_ioctl.o    \
   dlb2_pf_ops.o   \
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index 8eed1e0f20a3..bcd7820b5f7f 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -48,6 +48,19 @@
 #define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
 #define DLB2_HZ 800000000
 
+/* Interrupt related macros */
+#define DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS 1
+#define DLB2_PF_NUM_CQ_INTERRUPT_VECTORS	 64
+#define DLB2_PF_TOTAL_NUM_INTERRUPT_VECTORS \
+	(DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS + \
+	 DLB2_PF_NUM_CQ_INTERRUPT_VECTORS)
+#define DLB2_PF_NUM_COMPRESSED_MODE_VECTORS \
+	(DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS + 1)
+#define DLB2_PF_NUM_PACKED_MODE_VECTORS \
+	DLB2_PF_TOTAL_NUM_INTERRUPT_VECTORS
+#define DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID \
+	DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS
+
 /*
  * Hardware-defined base addresses. Those prefixed 'DLB2_DRV' are only used by
  * the PF driver.
diff --git a/drivers/misc/dlb2/dlb2_intr.c b/drivers/misc/dlb2/dlb2_intr.c
new file mode 100644
index 000000000000..88c936ab7c69
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_intr.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2017-2020 Intel Corporation */
+
+#include <linux/interrupt.h>
+#include <linux/uaccess.h>
+
+#include "dlb2_intr.h"
+#include "dlb2_main.h"
+#include "dlb2_resource.h"
+
+void dlb2_wake_thread(struct dlb2_dev *dev,
+		      struct dlb2_cq_intr *intr,
+		      enum dlb2_wake_reason reason)
+{
+	switch (reason) {
+	case WAKE_CQ_INTR:
+		WRITE_ONCE(intr->wake, true);
+		break;
+	case WAKE_PORT_DISABLED:
+		WRITE_ONCE(intr->disabled, true);
+		break;
+	default:
+		break;
+	}
+
+	wake_up_interruptible(&intr->wq_head);
+}
+
+static inline bool wake_condition(struct dlb2_cq_intr *intr,
+				  struct dlb2_dev *dev,
+				  struct dlb2_domain *domain)
+{
+	return (READ_ONCE(intr->wake) || READ_ONCE(intr->disabled));
+}
+
+struct dlb2_dequeue_qe {
+	u8 rsvd0[15];
+	u8 cq_gen:1;
+	u8 rsvd1:7;
+} __packed;
+
+/**
+ * dlb2_cq_empty() - determine whether a CQ is empty
+ * @dev: struct dlb2_dev pointer.
+ * @user_cq_va: User VA pointing to next CQ entry.
+ * @cq_gen: Current CQ generation bit.
+ *
+ * Return:
+ * Returns 1 if empty, 0 if non-empty, or < 0 if an error occurs.
+ */
+static int dlb2_cq_empty(struct dlb2_dev *dev, u64 user_cq_va, u8 cq_gen)
+{
+	struct dlb2_dequeue_qe qe;
+
+	if (copy_from_user(&qe, (void __user *)user_cq_va, sizeof(qe))) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Invalid cq_va pointer\n", __func__);
+		return -EFAULT;
+	}
+
+	return qe.cq_gen != cq_gen;
+}
+
+int dlb2_block_on_cq_interrupt(struct dlb2_dev *dev,
+			       struct dlb2_domain *dom,
+			       int port_id,
+			       bool is_ldb,
+			       u64 cq_va,
+			       u8 cq_gen,
+			       bool arm)
+{
+	struct dlb2_cq_intr *intr;
+	int ret = 0;
+
+	if (is_ldb && port_id >= DLB2_MAX_NUM_LDB_PORTS)
+		return -EINVAL;
+	if (!is_ldb && port_id >= DLB2_MAX_NUM_DIR_PORTS)
+		return -EINVAL;
+
+	if (is_ldb)
+		intr = &dev->intr.ldb_cq_intr[port_id];
+	else
+		intr = &dev->intr.dir_cq_intr[port_id];
+
+	if (!intr->configured || intr->domain_id != dom->id)
+		return -EINVAL;
+
+	/*
+	 * This function requires that only one thread process the CQ at a time.
+	 * Otherwise, the wake condition could become false in the time between
+	 * the ISR calling wake_up_interruptible() and the thread checking its
+	 * wake condition.
+	 */
+	mutex_lock(&intr->mutex);
+
+	/* Return early if the port's interrupt is disabled */
+	if (READ_ONCE(intr->disabled)) {
+		mutex_unlock(&intr->mutex);
+		return -EACCES;
+	}
+
+	dev_dbg(dev->dlb2_device,
+		"Thread is blocking on %s port %d's interrupt\n",
+		(is_ldb) ? "LDB" : "DIR", port_id);
+
+	/* Don't block if the CQ is non-empty */
+	ret = dlb2_cq_empty(dev, cq_va, cq_gen);
+	if (ret != 1)
+		goto error;
+
+	if (arm) {
+		ret = dev->ops->arm_cq_interrupt(dev, dom->id, port_id, is_ldb);
+		if (ret)
+			goto error;
+	}
+
+	ret = wait_event_interruptible(intr->wq_head,
+				       wake_condition(intr, dev, dom));
+
+	if (ret == 0 && READ_ONCE(intr->disabled))
+		ret = -EACCES;
+
+	WRITE_ONCE(intr->wake, false);
+
+	dev_dbg(dev->dlb2_device,
+		"Thread is unblocked from %s port %d's interrupt\n",
+		(is_ldb) ? "LDB" : "DIR", port_id);
+
+error:
+	mutex_unlock(&intr->mutex);
+
+	return ret;
+}
diff --git a/drivers/misc/dlb2/dlb2_intr.h b/drivers/misc/dlb2/dlb2_intr.h
new file mode 100644
index 000000000000..613179795d8f
--- /dev/null
+++ b/drivers/misc/dlb2/dlb2_intr.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright(c) 2017-2020 Intel Corporation
+ */
+
+#ifndef __DLB2_INTR_H
+#define __DLB2_INTR_H
+
+#include <linux/pci.h>
+
+#include "dlb2_main.h"
+
+int dlb2_block_on_cq_interrupt(struct dlb2_dev *dev,
+			       struct dlb2_domain *domain,
+			       int port_id,
+			       bool is_ldb,
+			       u64 cq_va,
+			       u8 cq_gen,
+			       bool arm);
+
+enum dlb2_wake_reason {
+	WAKE_CQ_INTR,
+	WAKE_PORT_DISABLED,
+};
+
+void dlb2_wake_thread(struct dlb2_dev *dev,
+		      struct dlb2_cq_intr *intr,
+		      enum dlb2_wake_reason reason);
+
+#endif /* __DLB2_INTR_H */
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index ff1037ffee3f..a4222265da79 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -7,6 +7,7 @@
 #include <uapi/linux/dlb2_user.h>
 
 #include "dlb2_file.h"
+#include "dlb2_intr.h"
 #include "dlb2_ioctl.h"
 #include "dlb2_main.h"
 
@@ -128,6 +129,10 @@ static int dlb2_domain_ioctl_enable_ldb_port(struct dlb2_dev *dev,
 
 	ret = dev->ops->enable_ldb_port(&dev->hw, domain->id, &arg, &response);
 
+	/* Allow threads to block on this port's CQ interrupt */
+	if (!ret)
+		WRITE_ONCE(dev->intr.ldb_cq_intr[arg.port_id].disabled, false);
+
 	mutex_unlock(&dev->resource_mutex);
 
 	if (copy_to_user((void __user *)arg.response,
@@ -164,6 +169,10 @@ static int dlb2_domain_ioctl_enable_dir_port(struct dlb2_dev *dev,
 
 	ret = dev->ops->enable_dir_port(&dev->hw, domain->id, &arg, &response);
 
+	/* Allow threads to block on this port's CQ interrupt */
+	if (!ret)
+		WRITE_ONCE(dev->intr.dir_cq_intr[arg.port_id].disabled, false);
+
 	mutex_unlock(&dev->resource_mutex);
 
 	if (copy_to_user((void __user *)arg.response,
@@ -200,6 +209,15 @@ static int dlb2_domain_ioctl_disable_ldb_port(struct dlb2_dev *dev,
 
 	ret = dev->ops->disable_ldb_port(&dev->hw, domain->id, &arg, &response);
 
+	/*
+	 * Wake threads blocked on this port's CQ interrupt, and prevent
+	 * subsequent attempts to block on it.
+	 */
+	if (!ret)
+		dlb2_wake_thread(dev,
+				 &dev->intr.ldb_cq_intr[arg.port_id],
+				 WAKE_PORT_DISABLED);
+
 	mutex_unlock(&dev->resource_mutex);
 
 	if (copy_to_user((void __user *)arg.response,
@@ -236,6 +254,15 @@ static int dlb2_domain_ioctl_disable_dir_port(struct dlb2_dev *dev,
 
 	ret = dev->ops->disable_dir_port(&dev->hw, domain->id, &arg, &response);
 
+	/*
+	 * Wake threads blocked on this port's CQ interrupt, and prevent
+	 * subsequent attempts to block on it.
+	 */
+	if (!ret)
+		dlb2_wake_thread(dev,
+				 &dev->intr.dir_cq_intr[arg.port_id],
+				 WAKE_PORT_DISABLED);
+
 	mutex_unlock(&dev->resource_mutex);
 
 	if (copy_to_user((void __user *)arg.response,
@@ -294,6 +321,13 @@ static int dlb2_domain_ioctl_create_ldb_port(struct dlb2_dev *dev,
 	if (ret)
 		goto unlock;
 
+	ret = dev->ops->enable_ldb_cq_interrupts(dev,
+						 domain->id,
+						 response.id,
+						 arg.cq_depth_threshold);
+	if (ret)
+		goto unlock; /* Internal error, don't unwind port creation */
+
 	/* Fill out the per-port data structure */
 	dev->ldb_port[response.id].id = response.id;
 	dev->ldb_port[response.id].is_ldb = true;
@@ -372,6 +406,13 @@ static int dlb2_domain_ioctl_create_dir_port(struct dlb2_dev *dev,
 	if (ret)
 		goto unlock;
 
+	ret = dev->ops->enable_dir_cq_interrupts(dev,
+						 domain->id,
+						 response.id,
+						 arg.cq_depth_threshold);
+	if (ret)
+		goto unlock; /* Internal error, don't unwind port creation */
+
 	/* Fill out the per-port data structure */
 	dev->dir_port[response.id].id = response.id;
 	dev->dir_port[response.id].is_ldb = false;
@@ -408,6 +449,44 @@ static int dlb2_domain_ioctl_create_dir_port(struct dlb2_dev *dev,
 	return ret;
 }
 
+static int dlb2_domain_ioctl_block_on_cq_interrupt(struct dlb2_dev *dev,
+						   struct dlb2_domain *domain,
+						   unsigned long user_arg,
+						   u16 size)
+{
+	struct dlb2_block_on_cq_interrupt_args arg;
+	struct dlb2_cmd_response response = {0};
+	int ret = 0;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	ret = dlb2_block_on_cq_interrupt(dev,
+					 domain,
+					 arg.port_id,
+					 arg.is_ldb,
+					 arg.cq_va,
+					 arg.cq_gen,
+					 arg.arm);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 static int dlb2_create_port_fd(struct dlb2_dev *dev,
 			       struct dlb2_domain *domain,
 			       const char *prefix,
@@ -624,6 +703,7 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_enable_dir_port,
 	dlb2_domain_ioctl_disable_ldb_port,
 	dlb2_domain_ioctl_disable_dir_port,
+	dlb2_domain_ioctl_block_on_cq_interrupt,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
 	dlb2_domain_ioctl_pending_port_unmaps,
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index 63ea5b6b58c8..33b60a8d46fe 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -205,13 +205,20 @@ static void dlb2_release_device_memory(struct dlb2_dev *dev)
 
 static int __dlb2_free_domain(struct dlb2_dev *dev, struct dlb2_domain *domain)
 {
-	int ret = 0;
+	int i, ret = 0;
 
 	ret = dev->ops->reset_domain(&dev->hw, domain->id);
 
 	/* Unpin and free all memory pages associated with the domain */
 	dlb2_release_domain_memory(dev, domain->id);
 
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++)
+		if (dev->intr.ldb_cq_intr[i].domain_id == domain->id)
+			dev->intr.ldb_cq_intr[i].configured = false;
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++)
+		if (dev->intr.dir_cq_intr[i].domain_id == domain->id)
+			dev->intr.dir_cq_intr[i].configured = false;
+
 	if (ret) {
 		dev->domain_reset_failed = true;
 		dev_err(dev->dlb2_device,
@@ -488,6 +495,10 @@ static int dlb2_probe(struct pci_dev *pdev,
 	if (ret)
 		goto dlb2_reset_fail;
 
+	ret = dlb2_dev->ops->init_interrupts(dlb2_dev, pdev);
+	if (ret)
+		goto init_interrupts_fail;
+
 	ret = dlb2_resource_init(&dlb2_dev->hw);
 	if (ret)
 		goto resource_init_fail;
@@ -516,6 +527,8 @@ static int dlb2_probe(struct pci_dev *pdev,
 init_driver_state_fail:
 	dlb2_resource_free(&dlb2_dev->hw);
 resource_init_fail:
+	dlb2_dev->ops->free_interrupts(dlb2_dev, pdev);
+init_interrupts_fail:
 dlb2_reset_fail:
 wait_for_device_ready_fail:
 dma_set_mask_fail:
@@ -556,6 +569,8 @@ static void dlb2_remove(struct pci_dev *pdev)
 
 	dlb2_resource_free(&dlb2_dev->hw);
 
+	dlb2_dev->ops->free_interrupts(dlb2_dev, pdev);
+
 	dlb2_release_device_memory(dlb2_dev);
 
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
@@ -580,6 +595,9 @@ static void dlb2_reset_hardware_state(struct dlb2_dev *dev)
 {
 	dlb2_reset_device(dev->pdev);
 
+	/* Reinitialize interrupt configuration */
+	dev->ops->reinit_interrupts(dev);
+
 	/* Reinitialize any other hardware state */
 	dev->ops->init_hardware(dev);
 }
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index edd5d9703ebb..ffbba2c606ba 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -54,6 +54,21 @@ struct dlb2_device_ops {
 			dev_t base,
 			const struct file_operations *fops);
 	void (*cdev_del)(struct dlb2_dev *dlb2_dev);
+	int (*init_interrupts)(struct dlb2_dev *dev, struct pci_dev *pdev);
+	int (*enable_ldb_cq_interrupts)(struct dlb2_dev *dev,
+					int domain_id,
+					int port_id,
+					u16 thresh);
+	int (*enable_dir_cq_interrupts)(struct dlb2_dev *dev,
+					int domain_id,
+					int port_id,
+					u16 thresh);
+	int (*arm_cq_interrupt)(struct dlb2_dev *dev,
+				int domain_id,
+				int port_id,
+				bool is_ldb);
+	void (*reinit_interrupts)(struct dlb2_dev *dev);
+	void (*free_interrupts)(struct dlb2_dev *dev, struct pci_dev *pdev);
 	void (*enable_pm)(struct dlb2_dev *dev);
 	int (*wait_for_device_ready)(struct dlb2_dev *dev,
 				     struct pci_dev *pdev);
@@ -152,6 +167,33 @@ struct dlb2_domain {
 	u8 id;
 };
 
+struct dlb2_cq_intr {
+	wait_queue_head_t wq_head;
+	/*
+	 * The CQ interrupt mutex guarantees one thread is blocking on a CQ's
+	 * interrupt at a time.
+	 */
+	struct mutex mutex;
+	u8 wake;
+	u8 configured;
+	u8 domain_id;
+	/*
+	 * disabled is true if the port is disabled. In that
+	 * case, the driver doesn't allow applications to block on the
+	 * port's interrupt.
+	 */
+	u8 disabled;
+} ____cacheline_aligned;
+
+struct dlb2_intr {
+	struct dlb2_cq_intr ldb_cq_intr[DLB2_MAX_NUM_LDB_PORTS];
+	struct dlb2_cq_intr dir_cq_intr[DLB2_MAX_NUM_DIR_PORTS];
+	u8 isr_registered[DLB2_PF_NUM_CQ_INTERRUPT_VECTORS];
+	int num_vectors;
+	int base_vector;
+	int mode;
+};
+
 struct dlb2_dev {
 	struct pci_dev *pdev;
 	struct dlb2_hw hw;
@@ -167,6 +209,7 @@ struct dlb2_dev {
 	 * device file mappings.
 	 */
 	struct inode *inode;
+	struct dlb2_intr intr;
 	/*
 	 * The resource mutex serializes access to driver data structures and
 	 * hardware registers.
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index b5b1b14bcb7e..5b212624517a 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -4,6 +4,7 @@
 #include <linux/delay.h>
 #include <linux/pm_runtime.h>
 
+#include "dlb2_intr.h"
 #include "dlb2_main.h"
 #include "dlb2_regs.h"
 #include "dlb2_resource.h"
@@ -91,6 +92,218 @@ dlb2_pf_map_pci_bar_space(struct dlb2_dev *dlb2_dev,
 	return 0;
 }
 
+/**********************************/
+/****** Interrupt management ******/
+/**********************************/
+
+static irqreturn_t
+dlb2_compressed_cq_intr_handler(int irq, void *hdlr_ptr)
+{
+	struct dlb2_dev *dev = (struct dlb2_dev *)hdlr_ptr;
+	u32 ldb_cq_interrupts[DLB2_MAX_NUM_LDB_PORTS / 32];
+	u32 dir_cq_interrupts[DLB2_MAX_NUM_DIR_PORTS / 32];
+	int i;
+
+	dev_dbg(dev->dlb2_device, "Entered ISR\n");
+
+	dlb2_read_compressed_cq_intr_status(&dev->hw,
+					    ldb_cq_interrupts,
+					    dir_cq_interrupts);
+
+	dlb2_ack_compressed_cq_intr(&dev->hw,
+				    ldb_cq_interrupts,
+				    dir_cq_interrupts);
+
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
+		if (!(ldb_cq_interrupts[i / 32] & (1 << (i % 32))))
+			continue;
+
+		dev_dbg(dev->dlb2_device, "[%s()] Waking LDB port %d\n",
+			__func__, i);
+
+		dlb2_wake_thread(dev, &dev->intr.ldb_cq_intr[i], WAKE_CQ_INTR);
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++) {
+		if (!(dir_cq_interrupts[i / 32] & (1 << (i % 32))))
+			continue;
+
+		dev_dbg(dev->dlb2_device, "[%s()] Waking DIR port %d\n",
+			__func__, i);
+
+		dlb2_wake_thread(dev, &dev->intr.dir_cq_intr[i], WAKE_CQ_INTR);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static int
+dlb2_init_compressed_mode_interrupts(struct dlb2_dev *dev,
+				     struct pci_dev *pdev)
+{
+	int ret, irq;
+
+	irq = pci_irq_vector(pdev, DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID);
+
+	ret = devm_request_irq(&pdev->dev,
+			       irq,
+			       dlb2_compressed_cq_intr_handler,
+			       0,
+			       "dlb2_compressed_cq",
+			       dev);
+	if (ret)
+		return ret;
+
+	dev->intr.isr_registered[DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID] = true;
+
+	dev->intr.mode = DLB2_MSIX_MODE_COMPRESSED;
+
+	dlb2_set_msix_mode(&dev->hw, DLB2_MSIX_MODE_COMPRESSED);
+
+	return 0;
+}
+
+static void
+dlb2_pf_free_interrupts(struct dlb2_dev *dev,
+			struct pci_dev *pdev)
+{
+	int i;
+
+	for (i = 0; i < dev->intr.num_vectors; i++) {
+		if (dev->intr.isr_registered[i])
+			devm_free_irq(&pdev->dev, pci_irq_vector(pdev, i), dev);
+	}
+
+	pci_free_irq_vectors(pdev);
+}
+
+static int
+dlb2_pf_init_interrupts(struct dlb2_dev *dev, struct pci_dev *pdev)
+{
+	int ret, i;
+
+	/*
+	 * DLB supports two modes for CQ interrupts:
+	 * - "compressed mode": all CQ interrupts are packed into a single
+	 *	vector. The ISR reads six interrupt status registers to
+	 *	determine the source(s).
+	 * - "packed mode" (unused): the hardware supports up to 64 vectors.
+	 */
+
+	ret = pci_alloc_irq_vectors(pdev,
+				    DLB2_PF_NUM_COMPRESSED_MODE_VECTORS,
+				    DLB2_PF_NUM_COMPRESSED_MODE_VECTORS,
+				    PCI_IRQ_MSIX);
+	if (ret < 0)
+		return ret;
+
+	dev->intr.num_vectors = ret;
+	dev->intr.base_vector = pci_irq_vector(pdev, 0);
+
+	ret = dlb2_init_compressed_mode_interrupts(dev, pdev);
+	if (ret) {
+		dlb2_pf_free_interrupts(dev, pdev);
+		return ret;
+	}
+
+	/*
+	 * Initialize per-CQ interrupt structures, such as wait queues that
+	 * threads will wait on until the CQ's interrupt fires.
+	 */
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++) {
+		init_waitqueue_head(&dev->intr.ldb_cq_intr[i].wq_head);
+		mutex_init(&dev->intr.ldb_cq_intr[i].mutex);
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++) {
+		init_waitqueue_head(&dev->intr.dir_cq_intr[i].wq_head);
+		mutex_init(&dev->intr.dir_cq_intr[i].mutex);
+	}
+
+	return 0;
+}
+
+/*
+ * If the device is reset during use, its interrupt registers need to be
+ * reinitialized.
+ */
+static void
+dlb2_pf_reinit_interrupts(struct dlb2_dev *dev)
+{
+	dlb2_set_msix_mode(&dev->hw, DLB2_MSIX_MODE_COMPRESSED);
+}
+
+static int
+dlb2_pf_enable_ldb_cq_interrupts(struct dlb2_dev *dev,
+				 int domain_id,
+				 int id,
+				 u16 thresh)
+{
+	int mode, vec;
+
+	if (dev->intr.mode == DLB2_MSIX_MODE_COMPRESSED) {
+		mode = DLB2_CQ_ISR_MODE_MSIX;
+		vec = 0;
+	} else {
+		mode = DLB2_CQ_ISR_MODE_MSIX;
+		vec = id % 64;
+	}
+
+	dev->intr.ldb_cq_intr[id].disabled = false;
+	dev->intr.ldb_cq_intr[id].configured = true;
+	dev->intr.ldb_cq_intr[id].domain_id = domain_id;
+
+	return dlb2_configure_ldb_cq_interrupt(&dev->hw, id, vec,
+					       mode, 0, 0, thresh);
+}
+
+static int
+dlb2_pf_enable_dir_cq_interrupts(struct dlb2_dev *dev,
+				 int domain_id,
+				 int id,
+				 u16 thresh)
+{
+	int mode, vec;
+
+	if (dev->intr.mode == DLB2_MSIX_MODE_COMPRESSED) {
+		mode = DLB2_CQ_ISR_MODE_MSIX;
+		vec = 0;
+	} else {
+		mode = DLB2_CQ_ISR_MODE_MSIX;
+		vec = id % 64;
+	}
+
+	dev->intr.dir_cq_intr[id].disabled = false;
+	dev->intr.dir_cq_intr[id].configured = true;
+	dev->intr.dir_cq_intr[id].domain_id = domain_id;
+
+	return dlb2_configure_dir_cq_interrupt(&dev->hw, id, vec,
+					       mode, 0, 0, thresh);
+}
+
+static int
+dlb2_pf_arm_cq_interrupt(struct dlb2_dev *dev,
+			 int domain_id,
+			 int port_id,
+			 bool is_ldb)
+{
+	int ret;
+
+	if (is_ldb)
+		ret = dev->ops->ldb_port_owned_by_domain(&dev->hw,
+							 domain_id,
+							 port_id);
+	else
+		ret = dev->ops->dir_port_owned_by_domain(&dev->hw,
+							 domain_id,
+							 port_id);
+
+	if (ret != 1)
+		return -EINVAL;
+
+	return dlb2_arm_cq_interrupt(&dev->hw, port_id, is_ldb, false, 0);
+}
+
 /*******************************/
 /****** Driver management ******/
 /*******************************/
@@ -433,6 +646,12 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.device_destroy = dlb2_pf_device_destroy,
 	.cdev_add = dlb2_pf_cdev_add,
 	.cdev_del = dlb2_pf_cdev_del,
+	.init_interrupts = dlb2_pf_init_interrupts,
+	.enable_ldb_cq_interrupts = dlb2_pf_enable_ldb_cq_interrupts,
+	.enable_dir_cq_interrupts = dlb2_pf_enable_dir_cq_interrupts,
+	.arm_cq_interrupt = dlb2_pf_arm_cq_interrupt,
+	.reinit_interrupts = dlb2_pf_reinit_interrupts,
+	.free_interrupts = dlb2_pf_free_interrupts,
 	.enable_pm = dlb2_pf_enable_pm,
 	.wait_for_device_ready = dlb2_pf_wait_for_device_ready,
 	.create_sched_domain = dlb2_pf_create_sched_domain,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index a7e28088458e..436629cf02a2 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -237,6 +237,41 @@ static struct dlb2_hw_domain *dlb2_get_domain_from_id(struct dlb2_hw *hw,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *dlb2_get_ldb_port_from_id(struct dlb2_hw *hw,
+						       u32 id,
+						       bool vdev_req,
+						       unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	rsrcs = (vdev_req) ? &hw->vdev[vdev_id] : &hw->pf;
+
+	if (!vdev_req)
+		return &hw->rsrcs.ldb_ports[id];
+
+	DLB2_FUNC_LIST_FOR(rsrcs->used_domains, domain) {
+		for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+			DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port)
+				if (port->id.virt_id == id)
+					return port;
+		}
+	}
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_FUNC_LIST_FOR(rsrcs->avail_ldb_ports[i], port)
+			if (port->id.virt_id == id)
+				return port;
+	}
+
+	return NULL;
+}
+
 static struct dlb2_ldb_port *
 dlb2_get_domain_used_ldb_port(u32 id,
 			      bool vdev_req,
@@ -289,6 +324,36 @@ dlb2_get_domain_ldb_port(u32 id,
 	return NULL;
 }
 
+static struct dlb2_dir_pq_pair *dlb2_get_dir_pq_from_id(struct dlb2_hw *hw,
+							u32 id,
+							bool vdev_req,
+							unsigned int vdev_id)
+{
+	struct dlb2_function_resources *rsrcs;
+	struct dlb2_dir_pq_pair *port;
+	struct dlb2_hw_domain *domain;
+
+	if (id >= DLB2_MAX_NUM_DIR_PORTS)
+		return NULL;
+
+	rsrcs = (vdev_req) ? &hw->vdev[vdev_id] : &hw->pf;
+
+	if (!vdev_req)
+		return &hw->rsrcs.dir_pq_pairs[id];
+
+	DLB2_FUNC_LIST_FOR(rsrcs->used_domains, domain) {
+		DLB2_DOM_LIST_FOR(domain->used_dir_pq_pairs, port)
+			if (port->id.virt_id == id)
+				return port;
+	}
+
+	DLB2_FUNC_LIST_FOR(rsrcs->avail_dir_pq_pairs, port)
+		if (port->id.virt_id == id)
+			return port;
+
+	return NULL;
+}
+
 static struct dlb2_dir_pq_pair *
 dlb2_get_domain_used_dir_pq(u32 id,
 			    bool vdev_req,
@@ -4685,6 +4750,247 @@ int dlb2_hw_disable_dir_port(struct dlb2_hw *hw,
 	return 0;
 }
 
+void dlb2_set_msix_mode(struct dlb2_hw *hw, int mode)
+{
+	union dlb2_sys_msix_mode r0 = { {0} };
+
+	r0.field.mode = mode;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_MSIX_MODE, r0.val);
+}
+
+int dlb2_configure_ldb_cq_interrupt(struct dlb2_hw *hw,
+				    int port_id,
+				    int vector,
+				    int mode,
+				    unsigned int vf,
+				    unsigned int owner_vf,
+				    u16 threshold)
+{
+	union dlb2_chp_ldb_cq_int_depth_thrsh r0 = { {0} };
+	union dlb2_chp_ldb_cq_int_enb r1 = { {0} };
+	union dlb2_sys_ldb_cq_isr r2 = { {0} };
+	struct dlb2_ldb_port *port;
+	bool vdev_req;
+
+	vdev_req = (mode == DLB2_CQ_ISR_MODE_MSI ||
+		    mode == DLB2_CQ_ISR_MODE_ADI);
+
+	port = dlb2_get_ldb_port_from_id(hw, port_id, vdev_req, vf);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s()]: Internal error: failed to enable LDB CQ int\n\tport_id: %u, vdev_req: %u, vdev: %u\n",
+			    __func__, port_id, vdev_req, vf);
+		return -EINVAL;
+	}
+
+	/* Trigger the interrupt when threshold or more QEs arrive in the CQ */
+	r0.field.depth_threshold = threshold - 1;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_LDB_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+		    r0.val);
+
+	r1.field.en_depth = 1;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_LDB_CQ_INT_ENB(port->id.phys_id), r1.val);
+
+	r2.field.vector = vector;
+	r2.field.vf = owner_vf;
+	r2.field.en_code = mode;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_LDB_CQ_ISR(port->id.phys_id), r2.val);
+
+	return 0;
+}
+
+int dlb2_configure_dir_cq_interrupt(struct dlb2_hw *hw,
+				    int port_id,
+				    int vector,
+				    int mode,
+				    unsigned int vf,
+				    unsigned int owner_vf,
+				    u16 threshold)
+{
+	union dlb2_chp_dir_cq_int_depth_thrsh r0 = { {0} };
+	union dlb2_chp_dir_cq_int_enb r1 = { {0} };
+	union dlb2_sys_dir_cq_isr r2 = { {0} };
+	struct dlb2_dir_pq_pair *port;
+	bool vdev_req;
+
+	vdev_req = (mode == DLB2_CQ_ISR_MODE_MSI ||
+		    mode == DLB2_CQ_ISR_MODE_ADI);
+
+	port = dlb2_get_dir_pq_from_id(hw, port_id, vdev_req, vf);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s()]: Internal error: failed to enable DIR CQ int\n\tport_id: %u, vdev_req: %u, vdev: %u\n",
+			    __func__, port_id, vdev_req, vf);
+		return -EINVAL;
+	}
+
+	/* Trigger the interrupt when threshold or more QEs arrive in the CQ */
+	r0.field.depth_threshold = threshold - 1;
+
+	DLB2_CSR_WR(hw,
+		    DLB2_CHP_DIR_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+		    r0.val);
+
+	r1.field.en_depth = 1;
+
+	DLB2_CSR_WR(hw, DLB2_CHP_DIR_CQ_INT_ENB(port->id.phys_id), r1.val);
+
+	r2.field.vector = vector;
+	r2.field.vf = owner_vf;
+	r2.field.en_code = mode;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_DIR_CQ_ISR(port->id.phys_id), r2.val);
+
+	return 0;
+}
+
+int dlb2_arm_cq_interrupt(struct dlb2_hw *hw,
+			  int port_id,
+			  bool is_ldb,
+			  bool vdev_req,
+			  unsigned int vdev_id)
+{
+	u32 val;
+	u32 reg;
+
+	if (vdev_req && is_ldb) {
+		struct dlb2_ldb_port *ldb_port;
+
+		ldb_port = dlb2_get_ldb_port_from_id(hw, port_id,
+						     true, vdev_id);
+
+		if (!ldb_port || !ldb_port->configured)
+			return -EINVAL;
+
+		port_id = ldb_port->id.phys_id;
+	} else if (vdev_req && !is_ldb) {
+		struct dlb2_dir_pq_pair *dir_port;
+
+		dir_port = dlb2_get_dir_pq_from_id(hw, port_id, true, vdev_id);
+
+		if (!dir_port || !dir_port->port_configured)
+			return -EINVAL;
+
+		port_id = dir_port->id.phys_id;
+	}
+
+	val = 1 << (port_id % 32);
+
+	if (is_ldb && port_id < 32)
+		reg = DLB2_CHP_LDB_CQ_INTR_ARMED0;
+	else if (is_ldb && port_id < 64)
+		reg = DLB2_CHP_LDB_CQ_INTR_ARMED1;
+	else if (!is_ldb && port_id < 32)
+		reg = DLB2_CHP_DIR_CQ_INTR_ARMED0;
+	else
+		reg = DLB2_CHP_DIR_CQ_INTR_ARMED1;
+
+	DLB2_CSR_WR(hw, reg, val);
+
+	dlb2_flush_csr(hw);
+
+	return 0;
+}
+
+void dlb2_read_compressed_cq_intr_status(struct dlb2_hw *hw,
+					 u32 *ldb_interrupts,
+					 u32 *dir_interrupts)
+{
+	/* Read every CQ's interrupt status */
+
+	ldb_interrupts[0] = DLB2_CSR_RD(hw,
+					DLB2_SYS_LDB_CQ_31_0_OCC_INT_STS);
+	ldb_interrupts[1] = DLB2_CSR_RD(hw,
+					DLB2_SYS_LDB_CQ_63_32_OCC_INT_STS);
+
+	dir_interrupts[0] = DLB2_CSR_RD(hw,
+					DLB2_SYS_DIR_CQ_31_0_OCC_INT_STS);
+	dir_interrupts[1] = DLB2_CSR_RD(hw,
+					DLB2_SYS_DIR_CQ_63_32_OCC_INT_STS);
+}
+
+static void dlb2_ack_msix_interrupt(struct dlb2_hw *hw, int vector)
+{
+	union dlb2_sys_msix_ack r0 = { {0} };
+
+	switch (vector) {
+	case 0:
+		r0.field.msix_0_ack = 1;
+		break;
+	case 1:
+		r0.field.msix_1_ack = 1;
+		/*
+		 * CSSY-1650
+		 * workaround h/w bug for lost MSI-X interrupts
+		 *
+		 * The recommended workaround for acknowledging
+		 * vector 1 interrupts is :
+		 *   1: set   MSI-X mask
+		 *   2: set   MSIX_PASSTHROUGH
+		 *   3: clear MSIX_ACK
+		 *   4: clear MSIX_PASSTHROUGH
+		 *   5: clear MSI-X mask
+		 *
+		 * The MSIX-ACK (step 3) is cleared for all vectors
+		 * below. We handle steps 1 & 2 for vector 1 here.
+		 *
+		 * The bitfields for MSIX_ACK and MSIX_PASSTHRU are
+		 * defined the same, so we just use the MSIX_ACK
+		 * value when writing to PASSTHRU.
+		 */
+
+		/* set MSI-X mask and passthrough for vector 1 */
+		DLB2_FUNC_WR(hw, DLB2_MSIX_MEM_VECTOR_CTRL(1), 1);
+		DLB2_CSR_WR(hw, DLB2_SYS_MSIX_PASSTHRU, r0.val);
+		break;
+	}
+
+	/* clear MSIX_ACK (write one to clear) */
+	DLB2_CSR_WR(hw, DLB2_SYS_MSIX_ACK, r0.val);
+
+	if (vector == 1) {
+		/*
+		 * Finish up steps 4 & 5 of the workaround:
+		 * clear passthrough and mask.
+		 */
+		DLB2_CSR_WR(hw, DLB2_SYS_MSIX_PASSTHRU, 0);
+		DLB2_FUNC_WR(hw, DLB2_MSIX_MEM_VECTOR_CTRL(1), 0);
+	}
+
+	dlb2_flush_csr(hw);
+}
+
+void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
+				 u32 *ldb_interrupts,
+				 u32 *dir_interrupts)
+{
+	/* Write back the status regs to ack the interrupts */
+	if (ldb_interrupts[0])
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_CQ_31_0_OCC_INT_STS,
+			    ldb_interrupts[0]);
+	if (ldb_interrupts[1])
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_LDB_CQ_63_32_OCC_INT_STS,
+			    ldb_interrupts[1]);
+
+	if (dir_interrupts[0])
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_CQ_31_0_OCC_INT_STS,
+			    dir_interrupts[0]);
+	if (dir_interrupts[1])
+		DLB2_CSR_WR(hw,
+			    DLB2_SYS_DIR_CQ_63_32_OCC_INT_STS,
+			    dir_interrupts[1]);
+
+	dlb2_ack_msix_interrupt(hw, DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID);
+}
+
 static u32 dlb2_ldb_cq_inflight_count(struct dlb2_hw *hw,
 				      struct dlb2_ldb_port *port)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index bc02d01e2a5e..a5921fb3273d 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -445,6 +445,134 @@ int dlb2_hw_disable_dir_port(struct dlb2_hw *hw,
 			     unsigned int vdev_id);
 
 /**
+ * dlb2_configure_ldb_cq_interrupt() - configure load-balanced CQ for
+ *					interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ * @port_id: load-balanced port ID.
+ * @vector: interrupt vector ID. Should be 0 for MSI or compressed MSI-X mode,
+ *	    else a value less than 64.
+ * @mode: interrupt type (DLB2_CQ_ISR_MODE_MSI or DLB2_CQ_ISR_MODE_MSIX)
+ * @vf: If the port is VF-owned, the VF's ID. This is used for translating the
+ *	virtual port ID to a physical port ID. Ignored if mode is not MSI.
+ * @owner_vf: the VF to route the interrupt to. Ignored if mode is not MSI.
+ * @threshold: the minimum CQ depth at which the interrupt can fire. Must be
+ *	greater than 0.
+ *
+ * This function configures the DLB registers for load-balanced CQ's
+ * interrupts. This doesn't enable the CQ's interrupt; that can be done with
+ * dlb2_arm_cq_interrupt() or through an interrupt arm QE.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid.
+ */
+int dlb2_configure_ldb_cq_interrupt(struct dlb2_hw *hw,
+				    int port_id,
+				    int vector,
+				    int mode,
+				    unsigned int vf,
+				    unsigned int owner_vf,
+				    u16 threshold);
+
+/**
+ * dlb2_configure_dir_cq_interrupt() - configure directed CQ for interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ * @port_id: directed port ID.
+ * @vector: interrupt vector ID. Should be 0 for MSI or compressed MSI-X mode,
+ *	    else a value less than 64.
+ * @mode: interrupt type (DLB2_CQ_ISR_MODE_MSI or DLB2_CQ_ISR_MODE_MSIX)
+ * @vf: If the port is VF-owned, the VF's ID. This is used for translating the
+ *	virtual port ID to a physical port ID. Ignored if mode is not MSI.
+ * @owner_vf: the VF to route the interrupt to. Ignored if mode is not MSI.
+ * @threshold: the minimum CQ depth at which the interrupt can fire. Must be
+ *	greater than 0.
+ *
+ * This function configures the DLB registers for directed CQ's interrupts.
+ * This doesn't enable the CQ's interrupt; that can be done with
+ * dlb2_arm_cq_interrupt() or through an interrupt arm QE.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid.
+ */
+int dlb2_configure_dir_cq_interrupt(struct dlb2_hw *hw,
+				    int port_id,
+				    int vector,
+				    int mode,
+				    unsigned int vf,
+				    unsigned int owner_vf,
+				    u16 threshold);
+
+/**
+ * dlb2_set_msix_mode() - set the device's MSI-X mode
+ * @hw: dlb2_hw handle for a particular device.
+ * @mode: MSI-X mode (DLB2_MSIX_MODE_PACKED or DLB2_MSIX_MODE_COMPRESSED)
+ *
+ * This function configures the hardware to use either packed or compressed
+ * mode. This function should not be called if using MSI interrupts.
+ */
+void dlb2_set_msix_mode(struct dlb2_hw *hw, int mode);
+
+/**
+ * dlb2_arm_cq_interrupt() - arm a CQ's interrupt
+ * @hw: dlb2_hw handle for a particular device.
+ * @port_id: port ID
+ * @is_ldb: true for load-balanced port, false for a directed port
+ * @vdev_request: indicates whether this request came from a vdev.
+ * @vdev_id: If vdev_request is true, this contains the vdev's ID.
+ *
+ * This function arms the CQ's interrupt. The CQ must be configured prior to
+ * calling this function.
+ *
+ * The function does no parameter validation; that is the caller's
+ * responsibility.
+ *
+ * A vdev can be either an SR-IOV virtual function or a Scalable IOV virtual
+ * device.
+ *
+ * Return: returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - Invalid port ID.
+ */
+int dlb2_arm_cq_interrupt(struct dlb2_hw *hw,
+			  int port_id,
+			  bool is_ldb,
+			  bool vdev_request,
+			  unsigned int vdev_id);
+
+/**
+ * dlb2_read_compressed_cq_intr_status() - read compressed CQ interrupt status
+ * @hw: dlb2_hw handle for a particular device.
+ * @ldb_interrupts: 2-entry array of u32 bitmaps
+ * @dir_interrupts: 2-entry array of u32 bitmaps
+ *
+ * This function can be called from a compressed CQ interrupt handler to
+ * determine which CQ interrupts have fired. The caller should take the
+ * appropriate action (such as waking threads blocked on a CQ's interrupt),
+ * then ack the interrupts with dlb2_ack_compressed_cq_intr().
+ */
+void dlb2_read_compressed_cq_intr_status(struct dlb2_hw *hw,
+					 u32 *ldb_interrupts,
+					 u32 *dir_interrupts);
+
+/**
+ * dlb2_ack_compressed_cq_intr() - ack compressed CQ interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ * @ldb_interrupts: 2-entry array of u32 bitmaps
+ * @dir_interrupts: 2-entry array of u32 bitmaps
+ *
+ * This function ACKs compressed CQ interrupts. Its arguments should be the
+ * same ones passed to dlb2_read_compressed_cq_intr_status().
+ */
+void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
+				 u32 *ldb_interrupts,
+				 u32 *dir_interrupts);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 763ae7eb7a23..73bba5892295 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -609,6 +609,39 @@ struct dlb2_disable_dir_port_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT: Block on a CQ interrupt until a QE
+ *	arrives for the specified port. If a QE is already present, the ioctl
+ *	will immediately return.
+ *
+ *	Note: Only one thread can block on a CQ's interrupt at a time. Doing
+ *	otherwise can result in hung threads.
+ *
+ * Input parameters:
+ * - port_id: Port ID.
+ * - is_ldb: True if the port is load-balanced, false otherwise.
+ * - arm: Tell the driver to arm the interrupt.
+ * - cq_gen: Current CQ generation bit.
+ * - padding0: Reserved for future use.
+ * - cq_va: VA of the CQ entry where the next QE will be placed.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_block_on_cq_interrupt_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 port_id;
+	__u8 is_ldb;
+	__u8 arm;
+	__u8 cq_gen;
+	__u8 padding0;
+	__u64 cq_va;
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -719,6 +752,7 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_ENABLE_DIR_PORT,
 	DLB2_DOMAIN_CMD_DISABLE_LDB_PORT,
 	DLB2_DOMAIN_CMD_DISABLE_DIR_PORT,
+	DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS,
@@ -812,6 +846,10 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_DISABLE_DIR_PORT,		\
 		      struct dlb2_disable_dir_port_args)
+#define DLB2_IOC_BLOCK_ON_CQ_INTERRUPT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,	\
+		      struct dlb2_block_on_cq_interrupt_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
-- 
2.13.6
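
The new DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT ioctl above is easiest to
see from user space. The following is an illustrative sketch only, not
part of the patch: domain_fd, cq_gen, and cq_va are hypothetical names
for state the application already tracks, and struct dlb2_cmd_response
is defined earlier in this series.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

/*
 * Illustrative sketch only: block until a QE arrives on a load-balanced
 * port. domain_fd is the scheduling domain fd, cq_gen is the CQ's
 * current generation bit, and cq_va is the VA of the next CQ entry.
 */
static int wait_for_qe(int domain_fd, int port_id, uint8_t cq_gen,
		       uint64_t cq_va)
{
	struct dlb2_cmd_response response = {0};
	struct dlb2_block_on_cq_interrupt_args args = {0};

	args.response = (uintptr_t)&response;
	args.port_id = port_id;
	args.is_ldb = 1;
	args.arm = 1;		/* ask the driver to arm the interrupt */
	args.cq_gen = cq_gen;	/* current CQ generation bit */
	args.cq_va = cq_va;	/* VA of the next CQ entry */

	/* Returns immediately if a QE is already present in the CQ */
	return ioctl(domain_fd, DLB2_IOC_BLOCK_ON_CQ_INTERRUPT, &args);
}

A dequeue loop would typically call a helper like this only after
polling the CQ has found no valid QE for some time, since the ioctl
returns immediately when a QE is already present.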



* [PATCH 15/20] dlb2: add domain alert support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (13 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 14/20] dlb2: add CQ interrupt support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 16/20] dlb2: add sequence-number management ioctls Gage Eads
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

Domain alerts are a mechanism for the driver to asynchronously notify
user-space applications of device reset or hardware alarms (both to be
added in later commits). This mechanism also allows the application to
enqueue an alert to its domain, as a form of (limited) IPC in a
multi-process scenario.

An application can read its domain alerts through the domain device file's
read callback. Applications are expected to spawn a thread that performs
blocking reads and rarely (if ever) wakes and returns to user-space.
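
For illustration, one such reader thread might look like the sketch
below (not part of the patch). It uses only the struct and alert IDs
added in this commit; domain_fd is a hypothetical scheduling domain
file descriptor.

#include <pthread.h>
#include <unistd.h>
#include <linux/dlb2_user.h>

/* Illustrative sketch only: a dedicated domain alert-reader thread. */
static void *alert_reader(void *arg)
{
	int domain_fd = *(int *)arg;
	struct dlb2_domain_alert alert;

	/* Reads must be exactly sizeof(alert) (16B) or fail with -EINVAL */
	while (read(domain_fd, &alert, sizeof(alert)) == sizeof(alert)) {
		if (alert.alert_id == DLB2_DOMAIN_ALERT_DEVICE_RESET)
			break;	/* the application must tear down and exit */
		/* ... handle the other alert IDs ... */
	}

	return NULL;
}

The thread would typically be started with pthread_create() when the
domain is set up and joined at teardown.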

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c |  21 +++++++
 drivers/misc/dlb2/dlb2_main.c  | 134 ++++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h  |  16 +++++
 include/uapi/linux/dlb2_user.h | 135 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 306 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index a4222265da79..dbbf39d0b019 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -487,6 +487,26 @@ static int dlb2_domain_ioctl_block_on_cq_interrupt(struct dlb2_dev *dev,
 	return ret;
 }
 
+static int dlb2_domain_ioctl_enqueue_domain_alert(struct dlb2_dev *dev,
+						  struct dlb2_domain *domain,
+						  unsigned long user_arg,
+						  u16 size)
+{
+	struct dlb2_enqueue_domain_alert_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	return dlb2_write_domain_alert(dev,
+				       domain,
+				       DLB2_DOMAIN_ALERT_USER,
+				       arg.aux_alert_data);
+}
+
 static int dlb2_create_port_fd(struct dlb2_dev *dev,
 			       struct dlb2_domain *domain,
 			       const char *prefix,
@@ -704,6 +724,7 @@ dlb2_domain_ioctl_callback_fns[NUM_DLB2_DOMAIN_CMD] = {
 	dlb2_domain_ioctl_disable_ldb_port,
 	dlb2_domain_ioctl_disable_dir_port,
 	dlb2_domain_ioctl_block_on_cq_interrupt,
+	dlb2_domain_ioctl_enqueue_domain_alert,
 	dlb2_domain_ioctl_get_ldb_queue_depth,
 	dlb2_domain_ioctl_get_dir_queue_depth,
 	dlb2_domain_ioctl_pending_port_unmaps,
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index 33b60a8d46fe..ef964cb044f8 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -129,6 +129,9 @@ int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id)
 	kref_init(&domain->refcnt);
 	domain->dlb2_dev = dlb2_dev;
 
+	spin_lock_init(&domain->alert_lock);
+	init_waitqueue_head(&domain->wq_head);
+
 	dlb2_dev->sched_domains[domain_id] = domain;
 
 	dlb2_dev->ops->inc_pm_refcnt(dlb2_dev->pdev, true);
@@ -259,6 +262,136 @@ static int dlb2_domain_close(struct inode *i, struct file *f)
 	return ret;
 }
 
+int dlb2_write_domain_alert(struct dlb2_dev *dev,
+			    struct dlb2_domain *domain,
+			    u64 alert_id,
+			    u64 aux_alert_data)
+{
+	struct dlb2_domain_alert alert;
+	int idx;
+
+	if (!domain) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	/* Grab the alert lock to access the read and write indexes */
+	spin_lock(&domain->alert_lock);
+
+	/*
+	 * If there's no space for this notification, return. The u8 cast
+	 * keeps the index difference in modulo-256 space once the write
+	 * index wraps past the read index.
+	 */
+	if ((u8)(domain->alert_wr_idx - domain->alert_rd_idx) ==
+	    (DLB2_DOMAIN_ALERT_RING_SIZE - 1)) {
+		spin_unlock(&domain->alert_lock);
+		return 0;
+	}
+
+	alert.alert_id = alert_id;
+	alert.aux_alert_data = aux_alert_data;
+
+	idx = domain->alert_wr_idx % DLB2_DOMAIN_ALERT_RING_SIZE;
+
+	domain->alerts[idx] = alert;
+
+	domain->alert_wr_idx++;
+
+	spin_unlock(&domain->alert_lock);
+
+	/* Wake any blocked readers */
+	wake_up_interruptible(&domain->wq_head);
+
+	return 0;
+}
+
+static bool dlb2_alerts_avail(struct dlb2_domain *domain)
+{
+	bool ret;
+
+	spin_lock(&domain->alert_lock);
+
+	ret = domain->alert_rd_idx != domain->alert_wr_idx;
+
+	spin_unlock(&domain->alert_lock);
+
+	return ret;
+}
+
+static int dlb2_read_domain_alert(struct dlb2_dev *dev,
+				  struct dlb2_domain *domain,
+				  struct dlb2_domain_alert *alert,
+				  bool nonblock)
+{
+	int idx;
+
+	/* Grab the alert lock to access the read and write indexes */
+	spin_lock(&domain->alert_lock);
+
+	while (domain->alert_rd_idx == domain->alert_wr_idx) {
+		/*
+		 * Release the alert lock before putting the thread on the wait
+		 * queue.
+		 */
+		spin_unlock(&domain->alert_lock);
+
+		if (nonblock)
+			return -EWOULDBLOCK;
+
+		dev_dbg(dev->dlb2_device,
+			"Thread %d is blocking waiting for an alert in domain %d\n",
+			current->pid, domain->id);
+
+		if (wait_event_interruptible(domain->wq_head,
+					     dlb2_alerts_avail(domain)))
+			return -ERESTARTSYS;
+
+		spin_lock(&domain->alert_lock);
+	}
+
+	/* The alert indexes are not equal, so there is an alert available. */
+	idx = domain->alert_rd_idx % DLB2_DOMAIN_ALERT_RING_SIZE;
+
+	memcpy(alert, &domain->alerts[idx], sizeof(*alert));
+
+	domain->alert_rd_idx++;
+
+	spin_unlock(&domain->alert_lock);
+
+	return 0;
+}
+
+static ssize_t dlb2_domain_read(struct file *f,
+				char __user *buf,
+				size_t len,
+				loff_t *offset)
+{
+	struct dlb2_domain *domain = f->private_data;
+	struct dlb2_dev *dev = domain->dlb2_dev;
+	struct dlb2_domain_alert alert;
+	int ret;
+
+	if (len != sizeof(alert))
+		return -EINVAL;
+
+	/* See dlb2_user.h for details on domain alert notifications */
+
+	ret = dlb2_read_domain_alert(dev,
+				     domain,
+				     &alert,
+				     f->f_flags & O_NONBLOCK);
+	if (ret)
+		return ret;
+
+	if (copy_to_user(buf, &alert, sizeof(alert)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device,
+		"Thread %d received alert 0x%llx, with aux data 0x%llx\n",
+		current->pid, ((u64 *)&alert)[0], ((u64 *)&alert)[1]);
+
+	return sizeof(alert);
+}
+
 static long
 dlb2_domain_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
@@ -277,6 +410,7 @@ dlb2_domain_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
 const struct file_operations dlb2_domain_fops = {
 	.owner   = THIS_MODULE,
 	.release = dlb2_domain_close,
+	.read    = dlb2_domain_read,
 	.unlocked_ioctl = dlb2_domain_ioctl,
 	.compat_ioctl = compat_ptr_ioctl,
 };
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index ffbba2c606ba..6e387b394c84 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -11,6 +11,7 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/pci.h>
+#include <linux/spinlock.h>
 #include <linux/types.h>
 
 #include <uapi/linux/dlb2_user.h>
@@ -161,9 +162,20 @@ struct dlb2_port {
 	u8 valid;
 };
 
+#define DLB2_DOMAIN_ALERT_RING_SIZE 256
+
 struct dlb2_domain {
 	struct dlb2_dev *dlb2_dev;
+	struct dlb2_domain_alert alerts[DLB2_DOMAIN_ALERT_RING_SIZE];
+	wait_queue_head_t wq_head;
+	/*
+	 * The alert lock protects access to the alert ring and its read and
+	 * write indexes.
+	 */
+	spinlock_t alert_lock;
 	struct kref refcnt;
+	u8 alert_rd_idx;
+	u8 alert_wr_idx;
 	u8 id;
 };
 
@@ -225,6 +237,10 @@ struct dlb2_dev {
 
 int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id);
 void dlb2_free_domain(struct kref *kref);
+int dlb2_write_domain_alert(struct dlb2_dev *dev,
+			    struct dlb2_domain *domain,
+			    u64 alert_id,
+			    u64 aux_alert_data);
 
 #define DLB2_HW_ERR(dlb2, ...) do {		       \
 	struct dlb2_dev *dev;			       \
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index 73bba5892295..cbc4b5b37687 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -331,6 +331,117 @@ enum dlb2_user_interface_commands {
 	NUM_DLB2_CMD,
 };
 
+/*******************************/
+/* 'domain' device file alerts */
+/*******************************/
+
+/*
+ * Scheduling domain device files can be read to receive domain-specific
+ * notifications, for alerts such as hardware errors or device reset.
+ *
+ * Each alert is encoded in a 16B message. The first 8B contains the alert ID,
+ * and the second 8B is optional and contains additional information.
+ * Applications should cast read data to a struct dlb2_domain_alert, and
+ * interpret the struct's alert_id according to dlb2_domain_alert_id. The read
+ * length must be 16B, or the function will return -EINVAL.
+ *
+ * Reads are destructive, and in the case of multiple file descriptors for the
+ * same domain device file, an alert will be read by only one of the file
+ * descriptors.
+ *
+ * The driver stores alerts in a fixed-size alert ring until they are read. If
+ * the alert ring fills completely, subsequent alerts will be dropped. It is
+ * recommended that DLB2 applications dedicate a thread to perform blocking
+ * reads on the device file.
+ */
+enum dlb2_domain_alert_id {
+	/*
+	 * Software issued an illegal enqueue for a port in this domain. An
+	 * illegal enqueue could be:
+	 * - Illegal (excess) completion
+	 * - Illegal fragment
+	 * - Insufficient credits
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_PP_ILLEGAL_ENQ,
+	/*
+	 * Software issued excess CQ token pops for a port in this domain.
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS,
+	/*
+	 * An enqueue contained either an invalid command encoding or a REL,
+	 * REL_T, RLS, FWD, FWD_T, FRAG, or FRAG_T from a directed port.
+	 *
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_ILLEGAL_HCW,
+	/*
+	 * An enqueue targeted an invalid QID (the QID must be valid and
+	 * less than 128).
+	 *
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_ILLEGAL_QID,
+	/*
+	 * An enqueue went to a disabled QID.
+	 *
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_DISABLED_QID,
+	/*
+	 * The device containing this domain was reset. All applications using
+	 * the device need to exit for the driver to complete the reset
+	 * procedure.
+	 *
+	 * aux_alert_data doesn't contain any information for this alert.
+	 */
+	DLB2_DOMAIN_ALERT_DEVICE_RESET,
+	/*
+	 * User-space has enqueued an alert.
+	 *
+	 * aux_alert_data contains user-provided data.
+	 */
+	DLB2_DOMAIN_ALERT_USER,
+	/*
+	 * The watchdog timer fired for the specified port. This occurs if its
+	 * CQ was not serviced for an extended period, likely indicating a
+	 * hung thread.
+	 * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+	 * contains a flag indicating whether the port is load-balanced (1) or
+	 * directed (0).
+	 */
+	DLB2_DOMAIN_ALERT_CQ_WATCHDOG_TIMEOUT,
+
+	/* Number of DLB2 domain alerts */
+	NUM_DLB2_DOMAIN_ALERTS
+};
+
+static const char dlb2_domain_alert_strings[][128] = {
+	"DLB2_DOMAIN_ALERT_PP_ILLEGAL_ENQ",
+	"DLB2_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS",
+	"DLB2_DOMAIN_ALERT_ILLEGAL_HCW",
+	"DLB2_DOMAIN_ALERT_ILLEGAL_QID",
+	"DLB2_DOMAIN_ALERT_DISABLED_QID",
+	"DLB2_DOMAIN_ALERT_DEVICE_RESET",
+	"DLB2_DOMAIN_ALERT_USER",
+	"DLB2_DOMAIN_ALERT_CQ_WATCHDOG_TIMEOUT",
+};
+
+struct dlb2_domain_alert {
+	__u64 alert_id;
+	__u64 aux_alert_data;
+};
+
 /*********************************/
 /* 'domain' device file commands */
 /*********************************/
@@ -642,6 +753,25 @@ struct dlb2_block_on_cq_interrupt_args {
 };
 
 /*
+ * DLB2_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT: Enqueue a domain alert that will be
+ *	read by one reader thread.
+ *
+ * Input parameters:
+ * - aux_alert_data: user-defined auxiliary data.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_enqueue_domain_alert_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u64 aux_alert_data;
+};
+
+/*
  * DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
  * Input parameters:
  * - queue_id: The load-balanced queue ID.
@@ -753,6 +883,7 @@ enum dlb2_domain_user_interface_commands {
 	DLB2_DOMAIN_CMD_DISABLE_LDB_PORT,
 	DLB2_DOMAIN_CMD_DISABLE_DIR_PORT,
 	DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,
+	DLB2_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT,
 	DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
 	DLB2_DOMAIN_CMD_PENDING_PORT_UNMAPS,
@@ -850,6 +981,10 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,	\
 		      struct dlb2_block_on_cq_interrupt_args)
+#define DLB2_IOC_ENQUEUE_DOMAIN_ALERT				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT,	\
+		      struct dlb2_enqueue_domain_alert_args)
 #define DLB2_IOC_GET_LDB_QUEUE_DEPTH				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,	\
-- 
2.13.6
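
A side note on the alert ring arithmetic in dlb2_write_domain_alert():
the read and write indexes are u8 and the ring holds exactly 256
entries, so index increments wrap for free and the occupancy is the
8-bit difference of the two indexes. A standalone model of that
arithmetic, for illustration only:

#include <assert.h>
#include <stdint.h>

/*
 * Model of the alert ring occupancy with 8-bit indexes and a
 * 256-entry ring; not driver code.
 */
static unsigned int ring_occupancy(uint8_t wr, uint8_t rd)
{
	/*
	 * The cast matters: integer promotion would make wr - rd
	 * negative once wr wraps past 255 while rd has not.
	 */
	return (uint8_t)(wr - rd);
}

int main(void)
{
	assert(ring_occupancy(6, 6) == 0);	/* empty */
	assert(ring_occupancy(5, 6) == 255);	/* full: wr has wrapped */
	assert(ring_occupancy(4, 250) == 10);	/* wrapped, 10 occupied */
	return 0;
}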



* [PATCH 16/20] dlb2: add sequence-number management ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (14 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 15/20] dlb2: add domain alert support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 17/20] dlb2: add cos bandwidth get/set ioctls Gage Eads
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

In order for a load-balanced DLB 2.0 queue to support ordered scheduling,
it must be configured with an allocation of sequence numbers (SNs) -- a
hardware resource used to re-order QEs.

The device evenly partitions its SNs across two groups. A queue is
allocated SNs by taking a 'slot' in one of these two groups, and the
number of slots is variable. A group can support as many as 16 slots (64
SNs per slot) and as few as 1 slot (1024 SNs per slot).

This commit adds ioctls to get and set a sequence number group's slot
configuration, as well as query the group's slot occupancy.
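
For illustration, a user-space sketch of these ioctls follows (not part
of the patch). dev_fd is a hypothetical file descriptor for the dlb2
device file, and struct dlb2_cmd_response is defined earlier in this
series. The sketch checks group 0's occupancy and, if the group is
unused, reconfigures it for 256 SNs per queue (1024 / 256 = 4 slots).

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

/* Illustrative sketch only: resize SN group 0 if it is unoccupied. */
static int resize_sn_group0(int dev_fd)
{
	struct dlb2_cmd_response resp = {0};
	struct dlb2_get_sn_occupancy_args occ = {
		.response = (uintptr_t)&resp,
		.group = 0,
	};
	struct dlb2_set_sn_allocation_args set = {
		.response = (uintptr_t)&resp,
		.group = 0,
		.num = 256,
	};

	if (ioctl(dev_fd, DLB2_IOC_GET_SN_OCCUPANCY, &occ) != 0)
		return -1;

	if (resp.id != 0)	/* occupied: the allocation is locked */
		return -1;

	return ioctl(dev_fd, DLB2_IOC_SET_SN_ALLOCATION, &set);
}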

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    | 116 ++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h     |   3 +
 drivers/misc/dlb2/dlb2_pf_ops.c   |  21 +++++++
 drivers/misc/dlb2/dlb2_resource.c |  67 ++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h |  47 +++++++++++++++
 include/uapi/linux/dlb2_user.h    |  87 ++++++++++++++++++++++++++++
 6 files changed, 341 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index dbbf39d0b019..8701643fc1f1 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -975,6 +975,119 @@ static int dlb2_ioctl_get_driver_version(struct dlb2_dev *dev,
 	return 0;
 }
 
+static int dlb2_ioctl_set_sn_allocation(struct dlb2_dev *dev,
+					unsigned long user_arg,
+					u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_set_sn_allocation_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->set_sn_allocation(&dev->hw, arg.group, arg.num);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_ioctl_get_sn_allocation(struct dlb2_dev *dev,
+					unsigned long user_arg,
+					u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_get_sn_allocation_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->get_sn_allocation(&dev->hw, arg.group);
+
+	response.id = ret;
+
+	ret = (ret > 0) ? 0 : ret;
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_ioctl_get_sn_occupancy(struct dlb2_dev *dev,
+				       unsigned long user_arg,
+				       u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_get_sn_occupancy_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->get_sn_occupancy(&dev->hw, arg.group);
+
+	response.id = ret;
+
+	ret = (ret > 0) ? 0 : ret;
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 static int dlb2_ioctl_query_cq_poll_mode(struct dlb2_dev *dev,
 					 unsigned long user_arg,
 					 u16 size)
@@ -1016,6 +1129,9 @@ static dlb2_ioctl_callback_fn_t dlb2_ioctl_callback_fns[NUM_DLB2_CMD] = {
 	dlb2_ioctl_get_sched_domain_fd,
 	dlb2_ioctl_get_num_resources,
 	dlb2_ioctl_get_driver_version,
+	dlb2_ioctl_set_sn_allocation,
+	dlb2_ioctl_get_sn_allocation,
+	dlb2_ioctl_get_sn_occupancy,
 	dlb2_ioctl_query_cq_poll_mode,
 };
 
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 6e387b394c84..aa73dad1ce77 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -135,6 +135,9 @@ struct dlb2_device_ops {
 	int (*dir_port_owned_by_domain)(struct dlb2_hw *hw,
 					u32 domain_id,
 					u32 port_id);
+	int (*get_sn_allocation)(struct dlb2_hw *hw, u32 group_id);
+	int (*set_sn_allocation)(struct dlb2_hw *hw, u32 group_id, u32 num);
+	int (*get_sn_occupancy)(struct dlb2_hw *hw, u32 group_id);
 	int (*get_ldb_queue_depth)(struct dlb2_hw *hw,
 				   u32 domain_id,
 				   struct dlb2_get_ldb_queue_depth_args *args,
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 5b212624517a..8bdeb86a1e87 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -631,6 +631,24 @@ dlb2_pf_dir_port_owned_by_domain(struct dlb2_hw *hw,
 	return dlb2_dir_port_owned_by_domain(hw, domain_id, port_id, false, 0);
 }
 
+static int
+dlb2_pf_get_sn_allocation(struct dlb2_hw *hw, u32 group_id)
+{
+	return dlb2_get_group_sequence_numbers(hw, group_id);
+}
+
+static int
+dlb2_pf_set_sn_allocation(struct dlb2_hw *hw, u32 group_id, u32 num)
+{
+	return dlb2_set_group_sequence_numbers(hw, group_id, num);
+}
+
+static int
+dlb2_pf_get_sn_occupancy(struct dlb2_hw *hw, u32 group_id)
+{
+	return dlb2_get_group_sequence_number_occupancy(hw, group_id);
+}
+
 /********************************/
 /****** DLB2 PF Device Ops ******/
 /********************************/
@@ -671,6 +689,9 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.reset_domain = dlb2_pf_reset_domain,
 	.ldb_port_owned_by_domain = dlb2_pf_ldb_port_owned_by_domain,
 	.dir_port_owned_by_domain = dlb2_pf_dir_port_owned_by_domain,
+	.get_sn_allocation = dlb2_pf_get_sn_allocation,
+	.set_sn_allocation = dlb2_pf_set_sn_allocation,
+	.get_sn_occupancy = dlb2_pf_get_sn_occupancy,
 	.get_ldb_queue_depth = dlb2_pf_get_ldb_queue_depth,
 	.get_dir_queue_depth = dlb2_pf_get_dir_queue_depth,
 	.init_hardware = dlb2_pf_init_hardware,
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 436629cf02a2..f686b3045a2b 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -4991,6 +4991,73 @@ void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
 	dlb2_ack_msix_interrupt(hw, DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID);
 }
 
+int dlb2_get_group_sequence_numbers(struct dlb2_hw *hw, unsigned int group_id)
+{
+	if (group_id >= DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+		return -EINVAL;
+
+	return hw->rsrcs.sn_groups[group_id].sequence_numbers_per_queue;
+}
+
+int dlb2_get_group_sequence_number_occupancy(struct dlb2_hw *hw,
+					     unsigned int group_id)
+{
+	if (group_id >= DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+		return -EINVAL;
+
+	return dlb2_sn_group_used_slots(&hw->rsrcs.sn_groups[group_id]);
+}
+
+static void dlb2_log_set_group_sequence_numbers(struct dlb2_hw *hw,
+						unsigned int group_id,
+						unsigned long val)
+{
+	DLB2_HW_DBG(hw, "DLB2 set group sequence numbers:\n");
+	DLB2_HW_DBG(hw, "\tGroup ID: %u\n", group_id);
+	DLB2_HW_DBG(hw, "\tValue:    %lu\n", val);
+}
+
+int dlb2_set_group_sequence_numbers(struct dlb2_hw *hw,
+				    unsigned int group_id,
+				    unsigned long val)
+{
+	u32 valid_allocations[] = {64, 128, 256, 512, 1024};
+	union dlb2_ro_pipe_grp_sn_mode r0 = { {0} };
+	struct dlb2_sn_group *group;
+	int mode;
+
+	if (group_id >= DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+		return -EINVAL;
+
+	group = &hw->rsrcs.sn_groups[group_id];
+
+	/*
+	 * Once the first load-balanced queue using an SN group is configured,
+	 * the group cannot be changed.
+	 */
+	if (group->slot_use_bitmap != 0)
+		return -EPERM;
+
+	for (mode = 0; mode < DLB2_MAX_NUM_SEQUENCE_NUMBER_MODES; mode++)
+		if (val == valid_allocations[mode])
+			break;
+
+	if (mode == DLB2_MAX_NUM_SEQUENCE_NUMBER_MODES)
+		return -EINVAL;
+
+	group->mode = mode;
+	group->sequence_numbers_per_queue = val;
+
+	r0.field.sn_mode_0 = hw->rsrcs.sn_groups[0].mode;
+	r0.field.sn_mode_1 = hw->rsrcs.sn_groups[1].mode;
+
+	DLB2_CSR_WR(hw, DLB2_RO_PIPE_GRP_SN_MODE, r0.val);
+
+	dlb2_log_set_group_sequence_numbers(hw, group_id, val);
+
+	return 0;
+}
+
 static u32 dlb2_ldb_cq_inflight_count(struct dlb2_hw *hw,
 				      struct dlb2_ldb_port *port)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index a5921fb3273d..246d4e303ecf 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -573,6 +573,53 @@ void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
 				 u32 *dir_interrupts);
 
 /**
+ * dlb2_get_group_sequence_numbers() - return a group's number of SNs per queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ *
+ * This function returns the configured number of sequence numbers per queue
+ * for the specified group.
+ *
+ * Return:
+ * Returns -EINVAL if group_id is invalid, else the group's SNs per queue.
+ */
+int dlb2_get_group_sequence_numbers(struct dlb2_hw *hw,
+				    unsigned int group_id);
+
+/**
+ * dlb2_get_group_sequence_number_occupancy() - return a group's in-use slots
+ * @hw: dlb2_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ *
+ * This function returns the group's number of in-use slots (i.e. load-balanced
+ * queues using the specified group).
+ *
+ * Return:
+ * Returns -EINVAL if group_id is invalid, else the group's number of in-use
+ * slots.
+ */
+int dlb2_get_group_sequence_number_occupancy(struct dlb2_hw *hw,
+					     unsigned int group_id);
+
+/**
+ * dlb2_set_group_sequence_numbers() - assign a group's number of SNs per queue
+ * @hw: dlb2_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ * @val: requested number of sequence numbers per queue.
+ *
+ * This function configures the group's number of sequence numbers per queue.
+ * val can be a power of two between 64 and 1024, inclusive. This setting can
+ * be configured until the first ordered load-balanced queue is configured, at
+ * which point the configuration is locked.
+ *
+ * Return:
+ * Returns 0 upon success; -EINVAL if group_id or val is invalid, -EPERM if an
+ * ordered queue is configured.
+ */
+int dlb2_set_group_sequence_numbers(struct dlb2_hw *hw,
+				    unsigned int group_id,
+				    unsigned long val);
+
+/**
  * dlb2_reset_domain() - reset a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @domain_id: domain ID.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index cbc4b5b37687..d86e5a748f16 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -297,6 +297,78 @@ struct dlb2_get_num_resources_args {
 	__u32 num_dir_credits;
 };
 
+/*
+ * DLB2_CMD_SET_SN_ALLOCATION: Configure a sequence number group (PF only)
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - num: Number of sequence numbers per queue.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_set_sn_allocation_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 group;
+	__u32 num;
+};
+
+/*
+ * DLB2_CMD_GET_SN_ALLOCATION: Get a sequence number group's configuration
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Specified group's number of sequence numbers per queue.
+ */
+struct dlb2_get_sn_allocation_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 group;
+	__u32 padding0;
+};
+
+/*
+ * DLB2_CMD_GET_SN_OCCUPANCY: Get a sequence number group's occupancy
+ *
+ * Each sequence number group has one or more slots, depending on its
+ * configuration. I.e.:
+ * - If configured for 1024 sequence numbers per queue, the group has 1 slot
+ * - If configured for 512 sequence numbers per queue, the group has 2 slots
+ *   ...
+ * - If configured for 64 sequence numbers per queue, the group has 16 slots
+ *
+ * This ioctl returns the group's number of in-use slots. If its occupancy is
+ * 0, the group's sequence number allocation can be reconfigured.
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Specified group's number of used slots.
+ */
+struct dlb2_get_sn_occupancy_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 group;
+	__u32 padding0;
+};
+
 enum dlb2_cq_poll_modes {
 	DLB2_CQ_POLL_MODE_STD,
 	DLB2_CQ_POLL_MODE_SPARSE,
@@ -325,6 +397,9 @@ enum dlb2_user_interface_commands {
 	DLB2_CMD_GET_SCHED_DOMAIN_FD,
 	DLB2_CMD_GET_NUM_RESOURCES,
 	DLB2_CMD_GET_DRIVER_VERSION,
+	DLB2_CMD_SET_SN_ALLOCATION,
+	DLB2_CMD_GET_SN_ALLOCATION,
+	DLB2_CMD_GET_SN_OCCUPANCY,
 	DLB2_CMD_QUERY_CQ_POLL_MODE,
 
 	/* NUM_DLB2_CMD must be last */
@@ -929,6 +1004,18 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_DRIVER_VERSION,		\
 		      struct dlb2_get_driver_version_args)
+#define DLB2_IOC_SET_SN_ALLOCATION				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_SET_SN_ALLOCATION,		\
+		      struct dlb2_set_sn_allocation_args)
+#define DLB2_IOC_GET_SN_ALLOCATION				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_SN_ALLOCATION,		\
+		      struct dlb2_get_sn_allocation_args)
+#define DLB2_IOC_GET_SN_OCCUPANCY				\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_SN_OCCUPANCY,		\
+		      struct dlb2_get_sn_occupancy_args)
 #define DLB2_IOC_QUERY_CQ_POLL_MODE				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_QUERY_CQ_POLL_MODE,		\
-- 
2.13.6



* [PATCH 17/20] dlb2: add cos bandwidth get/set ioctls
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (15 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 16/20] dlb2: add sequence-number management ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 18/20] dlb2: add device FLR support Gage Eads
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

The DLB 2.0 supports four load-balanced port classes of service (CoS). Each
CoS receives a guaranteed percentage of the load-balanced scheduler's
bandwidth, and any unreserved bandwidth is divided among the four CoS.

These two ioctls allow applications to query CoS allocations and adjust
them as needed. These allocations are controlled by the PF driver; virtual
devices, when support for them is added, will not be able to adjust them.
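
For illustration, a user-space sketch of the two ioctls follows (not
part of the patch). dev_fd is a hypothetical dlb2 device file
descriptor, struct dlb2_cmd_response is defined earlier in this series,
struct dlb2_get_cos_bw_args is assumed to mirror the set variant, and
the DLB2_IOC_SET_COS_BW/DLB2_IOC_GET_COS_BW macro names are assumed
from the series' naming pattern.

#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

/*
 * Illustrative sketch only: reserve 40% of the load-balanced
 * scheduler's bandwidth for CoS 0, then read the setting back. The
 * driver rejects a setting that pushes the total reservation over 100%.
 */
static void set_and_check_cos0(int dev_fd)
{
	struct dlb2_cmd_response resp = {0};
	struct dlb2_set_cos_bw_args set = {
		.response = (uintptr_t)&resp,
		.cos_id = 0,
		.bandwidth = 40, /* percent; hw stores 40 * 256 / 100 = 102/256ths */
	};
	struct dlb2_get_cos_bw_args get = {
		.response = (uintptr_t)&resp,
		.cos_id = 0,
	};

	if (ioctl(dev_fd, DLB2_IOC_SET_COS_BW, &set) != 0)
		perror("DLB2_IOC_SET_COS_BW");

	if (ioctl(dev_fd, DLB2_IOC_GET_COS_BW, &get) == 0)
		printf("cos 0 reservation: %u%%\n", resp.id);
}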

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/misc/dlb2/dlb2_ioctl.c    | 76 +++++++++++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_main.h     |  2 ++
 drivers/misc/dlb2/dlb2_pf_ops.c   | 14 ++++++++
 drivers/misc/dlb2/dlb2_resource.c | 59 ++++++++++++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.h | 28 +++++++++++++++
 include/uapi/linux/dlb2_user.h    | 54 ++++++++++++++++++++++++++++
 6 files changed, 233 insertions(+)

diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index 8701643fc1f1..ce80cdd71af8 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -1049,6 +1049,80 @@ static int dlb2_ioctl_get_sn_allocation(struct dlb2_dev *dev,
 	return ret;
 }
 
+static int dlb2_ioctl_set_cos_bw(struct dlb2_dev *dev,
+				 unsigned long user_arg,
+				 u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_set_cos_bw_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->set_cos_bw(&dev->hw, arg.cos_id, arg.bandwidth);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
+static int dlb2_ioctl_get_cos_bw(struct dlb2_dev *dev,
+				 unsigned long user_arg,
+				 u16 size)
+{
+	struct dlb2_cmd_response response = {0};
+	struct dlb2_get_cos_bw_args arg;
+	int ret;
+
+	dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
+	if (ret)
+		return ret;
+
+	/* Copy zeroes to verify the user-provided response pointer */
+	ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dev->resource_mutex);
+
+	ret = dev->ops->get_cos_bw(&dev->hw, arg.cos_id);
+
+	response.id = ret;
+
+	ret = (ret > 0) ? 0 : ret;
+
+	mutex_unlock(&dev->resource_mutex);
+
+	if (copy_to_user((void __user *)arg.response,
+			 &response,
+			 sizeof(response)))
+		return -EFAULT;
+
+	dev_dbg(dev->dlb2_device, "Exiting %s()\n", __func__);
+
+	return ret;
+}
+
 static int dlb2_ioctl_get_sn_occupancy(struct dlb2_dev *dev,
 				       unsigned long user_arg,
 				       u16 size)
@@ -1131,6 +1205,8 @@ static dlb2_ioctl_callback_fn_t dlb2_ioctl_callback_fns[NUM_DLB2_CMD] = {
 	dlb2_ioctl_get_driver_version,
 	dlb2_ioctl_set_sn_allocation,
 	dlb2_ioctl_get_sn_allocation,
+	dlb2_ioctl_set_cos_bw,
+	dlb2_ioctl_get_cos_bw,
 	dlb2_ioctl_get_sn_occupancy,
 	dlb2_ioctl_query_cq_poll_mode,
 };
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index aa73dad1ce77..4b7a430ffd86 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -146,6 +146,8 @@ struct dlb2_device_ops {
 				   u32 domain_id,
 				   struct dlb2_get_dir_queue_depth_args *args,
 				   struct dlb2_cmd_response *resp);
+	int (*set_cos_bw)(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth);
+	int (*get_cos_bw)(struct dlb2_hw *hw, u32 cos_id);
 	void (*init_hardware)(struct dlb2_dev *dev);
 	int (*query_cq_poll_mode)(struct dlb2_dev *dev,
 				  struct dlb2_cmd_response *user_resp);
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 8bdeb86a1e87..124e097ff979 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -644,6 +644,18 @@ dlb2_pf_set_sn_allocation(struct dlb2_hw *hw, u32 group_id, u32 num)
 }
 
 static int
+dlb2_pf_set_cos_bw(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
+{
+	return dlb2_hw_set_cos_bandwidth(hw, cos_id, bandwidth);
+}
+
+static int
+dlb2_pf_get_cos_bw(struct dlb2_hw *hw, u32 cos_id)
+{
+	return dlb2_hw_get_cos_bandwidth(hw, cos_id);
+}
+
+static int
 dlb2_pf_get_sn_occupancy(struct dlb2_hw *hw, u32 group_id)
 {
 	return dlb2_get_group_sequence_number_occupancy(hw, group_id);
@@ -694,6 +706,8 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.get_sn_occupancy = dlb2_pf_get_sn_occupancy,
 	.get_ldb_queue_depth = dlb2_pf_get_ldb_queue_depth,
 	.get_dir_queue_depth = dlb2_pf_get_dir_queue_depth,
+	.set_cos_bw = dlb2_pf_set_cos_bw,
+	.get_cos_bw = dlb2_pf_get_cos_bw,
 	.init_hardware = dlb2_pf_init_hardware,
 	.query_cq_poll_mode = dlb2_pf_query_cq_poll_mode,
 };
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index f686b3045a2b..2c04c5164223 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -6771,6 +6771,65 @@ void dlb2_clr_pmcsr_disable(struct dlb2_hw *hw)
 	DLB2_CSR_WR(hw, DLB2_CFG_MSTR_CFG_PM_PMCSR_DISABLE, r0.val);
 }
 
+int dlb2_hw_get_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id)
+{
+	if (cos_id >= DLB2_NUM_COS_DOMAINS)
+		return -EINVAL;
+
+	return hw->cos_reservation[cos_id];
+}
+
+static void dlb2_log_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bw)
+{
+	DLB2_HW_DBG(hw, "DLB2 set port CoS bandwidth:\n");
+	DLB2_HW_DBG(hw, "\tCoS ID:    %u\n", cos_id);
+	DLB2_HW_DBG(hw, "\tBandwidth: %u\n", bw);
+}
+
+int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
+{
+	union dlb2_lsp_cfg_shdw_range_cos r0 = { {0} };
+	union dlb2_lsp_cfg_shdw_ctrl r1 = { {0} };
+	unsigned int i;
+	u8 total;
+
+	if (cos_id >= DLB2_NUM_COS_DOMAINS)
+		return -EINVAL;
+
+	if (bandwidth > 100)
+		return -EINVAL;
+
+	total = 0;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++)
+		total += (i == cos_id) ? bandwidth : hw->cos_reservation[i];
+
+	if (total > 100)
+		return -EINVAL;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_LSP_CFG_SHDW_RANGE_COS(cos_id));
+
+	/*
+	 * Normalize the bandwidth to a value in the range 0-255. Integer
+	 * division may leave unreserved scheduling slots; these will be
+	 * divided among the 4 classes of service.
+	 */
+	r0.field.bw_range = (bandwidth * 256) / 100;
+
+	DLB2_CSR_WR(hw, DLB2_LSP_CFG_SHDW_RANGE_COS(cos_id), r0.val);
+
+	r1.field.transfer = 1;
+
+	/* Atomically transfer the newly configured service weight */
+	DLB2_CSR_WR(hw, DLB2_LSP_CFG_SHDW_CTRL, r1.val);
+
+	dlb2_log_set_cos_bandwidth(hw, cos_id, bandwidth);
+
+	hw->cos_reservation[cos_id] = bandwidth;
+
+	return 0;
+}
+
 void dlb2_hw_enable_sparse_ldb_cq_mode(struct dlb2_hw *hw)
 {
 	union dlb2_chp_cfg_chp_csr_ctrl r0;
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 246d4e303ecf..430175efcd71 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -823,6 +823,34 @@ int dlb2_hw_pending_port_unmaps(struct dlb2_hw *hw,
 				unsigned int vf_id);
 
 /**
+ * dlb2_hw_get_cos_bandwidth() - returns the percent of bandwidth allocated
+ *	to a port class-of-service.
+ * @hw: dlb2_hw handle for a particular device.
+ * @cos_id: class-of-service ID.
+ *
+ * Return:
+ * Returns -EINVAL if cos_id is invalid, else the class' bandwidth allocation.
+ */
+int dlb2_hw_get_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id);
+
+/**
+ * dlb2_hw_set_cos_bandwidth() - set a bandwidth allocation percentage for a
+ *	port class-of-service.
+ * @hw: dlb2_hw handle for a particular device.
+ * @cos_id: class-of-service ID.
+ * @bandwidth: class-of-service bandwidth.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - Invalid cos ID, bandwidth is greater than 100, or bandwidth would
+ *	    cause the total bandwidth across all classes of service to exceed
+ *	    100%.
+ */
+int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth);
+
+/**
  * dlb2_hw_enable_sparse_ldb_cq_mode() - enable sparse mode for load-balanced
  *	ports.
  * @hw: dlb2_hw handle for a particular device.
diff --git a/include/uapi/linux/dlb2_user.h b/include/uapi/linux/dlb2_user.h
index d86e5a748f16..b6c55174b1c0 100644
--- a/include/uapi/linux/dlb2_user.h
+++ b/include/uapi/linux/dlb2_user.h
@@ -339,6 +339,50 @@ struct dlb2_get_sn_allocation_args {
 };
 
 /*
+ * DLB2_CMD_SET_COS_BW: Set a bandwidth allocation percentage for a
+ *	load-balanced port class-of-service (PF only).
+ *
+ * Input parameters:
+ * - cos_id: class-of-service ID, between 0 and 3 (inclusive).
+ * - bandwidth: class-of-service bandwidth percentage. Total bandwidth
+ *		percentages across all 4 classes cannot exceed 100%.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ */
+struct dlb2_set_cos_bw_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 cos_id;
+	__u32 bandwidth;
+};
+
+/*
+ * DLB2_CMD_GET_COS_BW: Get the bandwidth allocation percentage for a
+ *	load-balanced port class-of-service.
+ *
+ * Input parameters:
+ * - cos_id: class-of-service ID, between 0 and 3 (inclusive).
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb2_cmd_response.
+ *	response.status: Detailed error code. In certain cases, such as if the
+ *		response pointer is invalid, the driver won't set status.
+ *	response.id: Specified class's bandwidth percentage.
+ */
+struct dlb2_get_cos_bw_args {
+	/* Output parameters */
+	__u64 response;
+	/* Input parameters */
+	__u32 cos_id;
+	__u32 padding0;
+};
+
+/*
  * DLB2_CMD_GET_SN_OCCUPANCY: Get a sequence number group's occupancy
  *
  * Each sequence number group has one or more slots, depending on its
@@ -399,6 +443,8 @@ enum dlb2_user_interface_commands {
 	DLB2_CMD_GET_DRIVER_VERSION,
 	DLB2_CMD_SET_SN_ALLOCATION,
 	DLB2_CMD_GET_SN_ALLOCATION,
+	DLB2_CMD_SET_COS_BW,
+	DLB2_CMD_GET_COS_BW,
 	DLB2_CMD_GET_SN_OCCUPANCY,
 	DLB2_CMD_QUERY_CQ_POLL_MODE,
 
@@ -1012,6 +1058,14 @@ enum dlb2_domain_user_interface_commands {
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_SN_ALLOCATION,		\
 		      struct dlb2_get_sn_allocation_args)
+#define DLB2_IOC_SET_COS_BW					\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_SET_COS_BW,			\
+		      struct dlb2_set_cos_bw_args)
+#define DLB2_IOC_GET_COS_BW					\
+		_IOWR(DLB2_IOC_MAGIC,				\
+		      DLB2_CMD_GET_COS_BW,			\
+		      struct dlb2_get_cos_bw_args)
 #define DLB2_IOC_GET_SN_OCCUPANCY				\
 		_IOWR(DLB2_IOC_MAGIC,				\
 		      DLB2_CMD_GET_SN_OCCUPANCY,		\
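
For illustration only (not part of the patch), user-space could exercise the
new DLB2_IOC_SET_COS_BW command roughly as follows. This is a minimal sketch:
the /dev/dlb0 path is an assumption, error handling is abbreviated, and the
structures and ioctl macro come from the dlb2_user.h additions above.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/dlb2_user.h>

int main(void)
{
	struct dlb2_cmd_response resp = {0};
	struct dlb2_set_cos_bw_args args = {
		.response = (uintptr_t)&resp,
		.cos_id = 0,
		.bandwidth = 40, /* reserve 40% of bandwidth for CoS 0 */
	};
	int fd = open("/dev/dlb0", O_RDWR); /* assumed PF device node */

	if (fd < 0)
		return 1;

	/* PF-only command: virtual devices proxy configuration via the PF */
	if (ioctl(fd, DLB2_IOC_SET_COS_BW, &args) < 0)
		fprintf(stderr, "SET_COS_BW failed, status %u\n", resp.status);

	return 0;
}
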
-- 
2.13.6



* [PATCH 18/20] dlb2: add device FLR support
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (16 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 17/20] dlb2: add cos bandwidth get/set ioctls Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 19/20] dlb2: add basic PF sysfs interfaces Gage Eads
  2020-07-12 13:43 ` [PATCH 20/20] dlb2: add ingress error handling Gage Eads
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

A device FLR can be triggered while applications are actively using the
device, which poses two problems:
- If applications continue to enqueue to the hardware they will cause
  hardware errors, because the FLR will have reset the scheduling domains,
  ports, and queues.
- When the applications end, they must not trigger the driver's domain
  reset logic, which would fail because the device's registers will have
  been reset by the FLR.

The driver needs to stop applications from using the device before the FLR
begins, and detect when these applications exit (so they don't trigger the
domain reset that occurs when their last file reference (or memory mapping)
closes). The driver must also disallow applications from configuring the
device while the reset is in progress.

To avoid these problems, the driver handles unexpected resets as follows:
1. Set the reset_active flag. This flag blocks new device files from being
   opened and is used as a wakeup condition in the driver's wait queues.
2. If this is a PF FLR and there are active VFs, send them a pre-reset
   notification, so they can stop any VF applications. (Added in a later
   commit.)
3. Disable all device files (set the per-file valid flag to false, which
   prevents the file from being used after FLR completes) and wake any
   threads blocked on a wait queue.
4. If the device is not in use -- i.e. no open device files or memory
   mappings, and no VFs in use (PF FLR only) -- the FLR can begin.
5. Else, the driver waits (up to a user-specified timeout, default 5s) for
   software to stop using the driver and the device. If the timeout
   elapses, the driver zaps any remaining MMIO mappings so the
   extant applications can no longer use the device.
6. Do not unlock the resource mutex at the end of the reset prepare
   function, to prevent device access during the reset.

The reset timeout is exposed as a module parameter.

After the FLR:
1. Clear the per-domain pointers (the memory is freed elsewhere).
2. Release any remaining allocated port or CQ memory, now that it's
   guaranteed the device is unconfigured and won't write to memory.
3. Reset software and hardware state.
4. Notify VFs that the FLR is complete. (Added in a later commit.)
5. Set reset_active to false.
6. Unlock the resource mutex.
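
For context, the sketch below (illustrative only, not part of the patch) shows
how a well-behaved application might consume the reset notification. It
assumes the struct dlb2_domain_alert layout and DLB2_DOMAIN_ALERT_DEVICE_RESET
value from dlb2_user.h; the actual cleanup steps are application-specific.

#include <unistd.h>
#include <linux/dlb2_user.h>

/* Returns 1 if the domain must be torn down because of a device reset */
static int check_domain_alert(int domain_fd)
{
	struct dlb2_domain_alert alert;

	/* read() blocks until the driver posts an alert for this domain */
	if (read(domain_fd, &alert, sizeof(alert)) != sizeof(alert))
		return -1;

	if (alert.alert_id == DLB2_DOMAIN_ALERT_DEVICE_RESET) {
		/*
		 * Unmap producer ports and CQs and close all domain file
		 * descriptors so the FLR can proceed promptly.
		 */
		return 1;
	}

	return 0;
}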

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 Documentation/ABI/testing/sysfs-driver-dlb2 |  22 ++
 drivers/misc/dlb2/dlb2_intr.c               |  13 +-
 drivers/misc/dlb2/dlb2_intr.h               |   1 +
 drivers/misc/dlb2/dlb2_ioctl.c              |  91 ++++++++
 drivers/misc/dlb2/dlb2_main.c               | 327 +++++++++++++++++++++++++++-
 drivers/misc/dlb2/dlb2_main.h               |   4 +
 drivers/misc/dlb2/dlb2_resource.c           |  17 ++
 drivers/misc/dlb2/dlb2_resource.h           |   9 +
 8 files changed, 475 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-dlb2

diff --git a/Documentation/ABI/testing/sysfs-driver-dlb2 b/Documentation/ABI/testing/sysfs-driver-dlb2
new file mode 100644
index 000000000000..38d0d3d92670
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-dlb2
@@ -0,0 +1,22 @@
+What:		/sys/bus/pci/drivers/dlb2/module/parameters/reset_timeout_s
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:	Interface for setting the driver's reset timeout.
+		When a device reset (FLR) is issued, the driver waits for
+		user-space to stop using the device before allowing the FLR to
+		proceed, with a timeout. The device is considered in use if
+		there are any open domain device file descriptors or memory
+		mapped producer ports. (For PF device resets, this includes all
+		VF-owned domains and producer ports.)
+
+		The amount of time the driver waits for userspace to stop using
+		the device is controlled by the module parameter
+		reset_timeout_s, which is in units of seconds and defaults to
+		5. If reset_timeout_s seconds elapse and any user is still
+		using the device, the driver zaps those processes' memory
+		mappings and marks their device file descriptors as invalid.
+		This is necessary because user processes that do not relinquish
+		their device mappings can interfere with processes that use the
+		device after the reset completes. To ensure that user processes
+		have enough time to clean up, reset_timeout_s can be increased.
diff --git a/drivers/misc/dlb2/dlb2_intr.c b/drivers/misc/dlb2/dlb2_intr.c
index 88c936ab7c69..bcda045c6660 100644
--- a/drivers/misc/dlb2/dlb2_intr.c
+++ b/drivers/misc/dlb2/dlb2_intr.c
@@ -30,7 +30,10 @@ static inline bool wake_condition(struct dlb2_cq_intr *intr,
 				  struct dlb2_dev *dev,
 				  struct dlb2_domain *domain)
 {
-	return (READ_ONCE(intr->wake) || READ_ONCE(intr->disabled));
+	return (READ_ONCE(intr->wake) ||
+		READ_ONCE(dev->reset_active) ||
+		!READ_ONCE(domain->valid) ||
+		READ_ONCE(intr->disabled));
 }
 
 struct dlb2_dequeue_qe {
@@ -117,8 +120,12 @@ int dlb2_block_on_cq_interrupt(struct dlb2_dev *dev,
 	ret = wait_event_interruptible(intr->wq_head,
 				       wake_condition(intr, dev, dom));
 
-	if (ret == 0 && READ_ONCE(intr->disabled))
-		ret = -EACCES;
+	if (ret == 0) {
+		if (READ_ONCE(dev->reset_active) || !READ_ONCE(dom->valid))
+			ret = -EINTR;
+		else if (READ_ONCE(intr->disabled))
+			ret = -EACCES;
+	}
 
 	WRITE_ONCE(intr->wake, false);
 
diff --git a/drivers/misc/dlb2/dlb2_intr.h b/drivers/misc/dlb2/dlb2_intr.h
index 613179795d8f..250c391d729e 100644
--- a/drivers/misc/dlb2/dlb2_intr.h
+++ b/drivers/misc/dlb2/dlb2_intr.h
@@ -20,6 +20,7 @@ int dlb2_block_on_cq_interrupt(struct dlb2_dev *dev,
 enum dlb2_wake_reason {
 	WAKE_CQ_INTR,
 	WAKE_PORT_DISABLED,
+	WAKE_DEV_RESET
 };
 
 void dlb2_wake_thread(struct dlb2_dev *dev,
diff --git a/drivers/misc/dlb2/dlb2_ioctl.c b/drivers/misc/dlb2/dlb2_ioctl.c
index ce80cdd71af8..d0e87118bdc7 100644
--- a/drivers/misc/dlb2/dlb2_ioctl.c
+++ b/drivers/misc/dlb2/dlb2_ioctl.c
@@ -75,6 +75,14 @@ static int dlb2_domain_ioctl_##lower_name(struct dlb2_dev *dev,		   \
 									   \
 	mutex_lock(&dev->resource_mutex);				   \
 									   \
+	if (!domain->valid) {						   \
+		dev_err(dev->dlb2_device,				   \
+			"[%s()] Domain invalid (device reset)\n",	   \
+			__func__);					   \
+		mutex_unlock(&dev->resource_mutex);			   \
+		return -EINVAL;						   \
+	}								   \
+									   \
 	ret = dev->ops->lower_name(&dev->hw,				   \
 				   domain->id,				   \
 				   &arg,				   \
@@ -127,6 +135,14 @@ static int dlb2_domain_ioctl_enable_ldb_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	ret = dev->ops->enable_ldb_port(&dev->hw, domain->id, &arg, &response);
 
 	/* Allow threads to block on this port's CQ interrupt */
@@ -167,6 +183,14 @@ static int dlb2_domain_ioctl_enable_dir_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	ret = dev->ops->enable_dir_port(&dev->hw, domain->id, &arg, &response);
 
 	/* Allow threads to block on this port's CQ interrupt */
@@ -207,6 +231,14 @@ static int dlb2_domain_ioctl_disable_ldb_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	ret = dev->ops->disable_ldb_port(&dev->hw, domain->id, &arg, &response);
 
 	/*
@@ -252,6 +284,14 @@ static int dlb2_domain_ioctl_disable_dir_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	ret = dev->ops->disable_dir_port(&dev->hw, domain->id, &arg, &response);
 
 	/*
@@ -303,6 +343,14 @@ static int dlb2_domain_ioctl_create_ldb_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	cq_base = dma_alloc_coherent(&dev->pdev->dev,
 				     DLB2_CQ_SIZE,
 				     &cq_dma_base,
@@ -388,6 +436,14 @@ static int dlb2_domain_ioctl_create_dir_port(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	cq_base = dma_alloc_coherent(&dev->pdev->dev,
 				     DLB2_CQ_SIZE,
 				     &cq_dma_base,
@@ -469,6 +525,17 @@ static int dlb2_domain_ioctl_block_on_cq_interrupt(struct dlb2_dev *dev,
 	if (ret)
 		return ret;
 
+	/*
+	 * Note: dlb2_block_on_cq_interrupt() checks domain->valid again when
+	 * it puts the thread on the waitqueue
+	 */
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		return -EINVAL;
+	}
+
 	ret = dlb2_block_on_cq_interrupt(dev,
 					 domain,
 					 arg.port_id,
@@ -567,6 +634,14 @@ static int dlb2_domain_get_port_fd(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		dev_err(dev->dlb2_device,
+			"[%s()] Domain invalid (device reset)\n",
+			__func__);
+		mutex_unlock(&dev->resource_mutex);
+		return -EINVAL;
+	}
+
 	if ((is_ldb &&
 	     dev->ops->ldb_port_owned_by_domain(&dev->hw,
 						domain->id,
@@ -804,6 +879,14 @@ static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (dev->reset_active) {
+		dev_err(dev->dlb2_device,
+			"[%s()] The DLB is being reset; applications cannot use it during this time.\n",
+			__func__);
+		ret = -EINVAL;
+		goto unlock;
+	}
+
 	if (dev->domain_reset_failed) {
 		response.status = DLB2_ST_DOMAIN_RESET_FAILED;
 		ret = -EINVAL;
@@ -886,6 +969,14 @@ static int dlb2_ioctl_get_sched_domain_fd(struct dlb2_dev *dev,
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (dev->reset_active) {
+		dev_err(dev->dlb2_device,
+			"[%s()] The DLB is being reset; applications cannot use it during this time.\n",
+			__func__);
+		ret = -EINVAL;
+		goto unlock;
+	}
+
 	domain = dev->sched_domains[arg.domain_id];
 
 	if (!domain) {
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index ef964cb044f8..ade71ed2e66b 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -12,6 +12,7 @@
 #include <linux/uaccess.h>
 
 #include "dlb2_file.h"
+#include "dlb2_intr.h"
 #include "dlb2_ioctl.h"
 #include "dlb2_main.h"
 #include "dlb2_resource.h"
@@ -32,6 +33,11 @@ MODULE_AUTHOR("Copyright(c) 2018-2020 Intel Corporation");
 MODULE_DESCRIPTION("Intel(R) Dynamic Load Balancer 2.0 Driver");
 MODULE_VERSION(DRV_VERSION);
 
+static unsigned int dlb2_reset_timeout_s = DLB2_DEFAULT_RESET_TIMEOUT_S;
+module_param_named(reset_timeout_s, dlb2_reset_timeout_s, uint, 0644);
+MODULE_PARM_DESC(reset_timeout_s,
+		 "Wait time (in seconds) for applications to shut down after a reset is requested, before the driver zaps their VMAs");
+
 /* The driver lock protects data structures that are used by multiple devices. */
 static DEFINE_MUTEX(dlb2_driver_lock);
 static struct list_head dlb2_dev_list = LIST_HEAD_INIT(dlb2_dev_list);
@@ -68,6 +74,14 @@ static int dlb2_open(struct inode *i, struct file *f)
 
 	dev_dbg(dev->dlb2_device, "Opening DLB device file\n");
 
+	/* See dlb2_reset_prepare() for more details */
+	if (dev->reset_active) {
+		dev_err(dev->dlb2_device,
+			"[%s()] The DLB is being reset; applications cannot use it during this time.\n",
+			__func__);
+		return -EINVAL;
+	}
+
 	f->private_data = dev;
 
 	dev->ops->inc_pm_refcnt(dev->pdev, true);
@@ -126,6 +140,7 @@ int dlb2_init_domain(struct dlb2_dev *dlb2_dev, u32 domain_id)
 
 	domain->id = domain_id;
 
+	domain->valid = true;
 	kref_init(&domain->refcnt);
 	domain->dlb2_dev = dlb2_dev;
 
@@ -210,6 +225,22 @@ static int __dlb2_free_domain(struct dlb2_dev *dev, struct dlb2_domain *domain)
 {
 	int i, ret = 0;
 
+	/*
+	 * Check if the domain was reset and its memory released during FLR
+	 * handling.
+	 */
+	if (!domain->valid) {
+		/*
+		 * Before clearing the sched_domains[] pointer, confirm the
+		 * slot isn't in use by a newer (valid) domain.
+		 */
+		if (dev->sched_domains[domain->id] == domain)
+			dev->sched_domains[domain->id] = NULL;
+
+		devm_kfree(dev->dlb2_device, domain);
+		return 0;
+	}
+
 	ret = dev->ops->reset_domain(&dev->hw, domain->id);
 
 	/* Unpin and free all memory pages associated with the domain */
@@ -270,7 +301,7 @@ int dlb2_write_domain_alert(struct dlb2_dev *dev,
 	struct dlb2_domain_alert alert;
 	int idx;
 
-	if (!domain) {
+	if (!domain || !domain->valid) {
 		dev_err(dev->dlb2_device,
 			"[%s()] Domain invalid (device reset)\n",
 			__func__);
@@ -342,9 +373,16 @@ static int dlb2_read_domain_alert(struct dlb2_dev *dev,
 			current->pid, domain->id);
 
 		if (wait_event_interruptible(domain->wq_head,
-					     dlb2_alerts_avail(domain)))
+					     dlb2_alerts_avail(domain) ||
+					     !READ_ONCE(domain->valid)))
 			return -ERESTARTSYS;
 
+		/* See dlb2_reset_prepare() for more details */
+		if (!READ_ONCE(domain->valid)) {
+			alert->alert_id = DLB2_DOMAIN_ALERT_DEVICE_RESET;
+			return 0;
+		}
+
 		spin_lock(&domain->alert_lock);
 	}
 
@@ -373,6 +411,11 @@ static ssize_t dlb2_domain_read(struct file *f,
 	if (len != sizeof(alert))
 		return -EINVAL;
 
+	if (!domain->valid) {
+		alert.alert_id = DLB2_DOMAIN_ALERT_DEVICE_RESET;
+		goto copy;
+	}
+
 	/* See dlb2_user.h for details on domain alert notifications */
 
 	ret = dlb2_read_domain_alert(dev,
@@ -382,6 +425,7 @@ static ssize_t dlb2_domain_read(struct file *f,
 	if (ret)
 		return ret;
 
+copy:
 	if (copy_to_user(buf, &alert, sizeof(alert)))
 		return -EFAULT;
 
@@ -429,6 +473,11 @@ static int dlb2_pp_mmap(struct file *f, struct vm_area_struct *vma)
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		ret = -EINVAL;
+		goto end;
+	}
+
 	if ((vma->vm_end - vma->vm_start) != DLB2_PP_SIZE) {
 		ret = -EINVAL;
 		goto end;
@@ -468,6 +517,11 @@ static int dlb2_cq_mmap(struct file *f, struct vm_area_struct *vma)
 
 	mutex_lock(&dev->resource_mutex);
 
+	if (!domain->valid) {
+		ret = -EINVAL;
+		goto end;
+	}
+
 	if ((vma->vm_end - vma->vm_start) != DLB2_CQ_SIZE) {
 		ret = -EINVAL;
 		goto end;
@@ -724,10 +778,10 @@ static void dlb2_remove(struct pci_dev *pdev)
 	devm_kfree(&pdev->dev, dlb2_dev);
 }
 
-#ifdef CONFIG_PM
-static void dlb2_reset_hardware_state(struct dlb2_dev *dev)
+static void dlb2_reset_hardware_state(struct dlb2_dev *dev, bool issue_flr)
 {
-	dlb2_reset_device(dev->pdev);
+	if (issue_flr)
+		dlb2_reset_device(dev->pdev);
 
 	/* Reinitialize interrupt configuration */
 	dev->ops->reinit_interrupts(dev);
@@ -736,6 +790,7 @@ static void dlb2_reset_hardware_state(struct dlb2_dev *dev)
 	dev->ops->init_hardware(dev);
 }
 
+#ifdef CONFIG_PM
 static int dlb2_runtime_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
@@ -766,7 +821,7 @@ static int dlb2_runtime_resume(struct device *dev)
 	dev_dbg(dlb2_dev->dlb2_device, "Resuming device operation\n");
 
 	/* Now reinitialize the device state. */
-	dlb2_reset_hardware_state(dlb2_dev);
+	dlb2_reset_hardware_state(dlb2_dev, true);
 
 	return 0;
 }
@@ -778,6 +833,265 @@ static struct pci_device_id dlb2_id_table[] = {
 };
 MODULE_DEVICE_TABLE(pci, dlb2_id_table);
 
+static unsigned int dlb2_total_device_file_refcnt(struct dlb2_dev *dev)
+{
+	unsigned int cnt = 0;
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++)
+		if (dev->sched_domains[i])
+			cnt += kref_read(&dev->sched_domains[i]->refcnt);
+
+	return cnt;
+}
+
+static bool dlb2_in_use(struct dlb2_dev *dev)
+{
+	return dlb2_total_device_file_refcnt(dev) != 0;
+}
+
+static void dlb2_wait_for_idle(struct dlb2_dev *dlb2_dev)
+{
+	int i;
+
+	for (i = 0; i < dlb2_reset_timeout_s * 10; i++) {
+		bool idle;
+
+		mutex_lock(&dlb2_dev->resource_mutex);
+
+		/*
+		 * Check for any application threads in the driver, extant
+		 * mmaps, or open device files.
+		 */
+		idle = !dlb2_in_use(dlb2_dev);
+
+		mutex_unlock(&dlb2_dev->resource_mutex);
+
+		if (idle)
+			return;
+
+		cond_resched();
+		msleep(100);
+	}
+
+	dev_err(dlb2_dev->dlb2_device,
+		"PF driver timed out waiting for applications to idle\n");
+}
+
+static void dlb2_unmap_all_mappings(struct dlb2_dev *dlb2_dev)
+{
+	dev_dbg(dlb2_dev->dlb2_device, "Zapping memory mappings\n");
+
+	if (dlb2_dev->inode)
+		unmap_mapping_range(dlb2_dev->inode->i_mapping, 0, 0, 1);
+}
+
+static void dlb2_disable_domain_files(struct dlb2_dev *dlb2_dev)
+{
+	int i;
+
+	/*
+	 * Set all domain->valid flags to false to prevent existing device
+	 * files from being used to enter the device driver.
+	 */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		if (dlb2_dev->sched_domains[i])
+			dlb2_dev->sched_domains[i]->valid = false;
+	}
+}
+
+static void dlb2_wake_threads(struct dlb2_dev *dlb2_dev)
+{
+	int i;
+
+	/*
+	 * Wake any blocked device file readers. These threads will return the
+	 * DLB2_DOMAIN_ALERT_DEVICE_RESET alert, and well-behaved applications
+	 * will close their fds and unmap DLB memory as a result.
+	 */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++) {
+		if (!dlb2_dev->sched_domains[i])
+			continue;
+
+		wake_up_interruptible(&dlb2_dev->sched_domains[i]->wq_head);
+	}
+
+	/* Wake threads blocked on a CQ interrupt */
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++)
+		dlb2_wake_thread(dlb2_dev,
+				 &dlb2_dev->intr.ldb_cq_intr[i],
+				 WAKE_DEV_RESET);
+
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++)
+		dlb2_wake_thread(dlb2_dev,
+				 &dlb2_dev->intr.dir_cq_intr[i],
+				 WAKE_DEV_RESET);
+}
+
+static void dlb2_stop_users(struct dlb2_dev *dlb2_dev)
+{
+	/*
+	 * Disable existing domain files to prevent applications from entering
+	 * the device driver through file operations. (New files can't be
+	 * opened while the resource mutex is held.)
+	 */
+	dlb2_disable_domain_files(dlb2_dev);
+
+	/* Wake any threads blocked in the kernel */
+	dlb2_wake_threads(dlb2_dev);
+}
+
+static void dlb2_reset_prepare(struct pci_dev *pdev)
+{
+	/*
+	 * Unexpected FLR. Applications may be actively using the device at
+	 * the same time, which poses two problems:
+	 * - If applications continue to enqueue to the hardware they will
+	 *   cause hardware errors, because the FLR will have reset the
+	 *   scheduling domains, ports, and queues.
+	 * - When the applications end, they must not trigger the driver's
+	 *   domain reset code. The domain reset procedure would fail because
+	 *   the device's registers will have been reset by the FLR.
+	 *
+	 * To avoid these problems, the driver handles unexpected resets as
+	 * follows:
+	 * 1. Set the reset_active flag. This flag blocks new device files
+	 *    from being opened and is used as a wakeup condition in the
+	 *    driver's wait queues.
+	 * 2. If this is a PF FLR and there are active VFs, send them a
+	 *    pre-reset notification, so they can stop any VF applications.
+	 * 3. Disable all device files (set the per-file valid flag to false,
+	 *    which prevents the file from being used after FLR completes) and
+	 *    wake any threads on a wait queue.
+	 * 4. If the DLB is not in use -- i.e. no open device files or memory
+	 *    mappings, and no VFs in use (PF FLR only) -- the FLR can begin.
+	 * 5. Else, the driver waits (up to a user-specified timeout, default
+	 *    5s) for software to stop using the driver and the device. If the
+	 *    timeout elapses, the driver zaps any remaining MMIO mappings.
+	 *
+	 * After the FLR:
+	 * 1. Clear the per-domain pointers (the memory is freed in either
+	 *    dlb2_close or dlb2_stop_users).
+	 * 2. Release any remaining allocated port or CQ memory, now that it's
+	 *    guaranteed the device is unconfigured and won't write to memory.
+	 * 3. Reset software and hardware state
+	 * 4. Notify VFs that the FLR is complete.
+	 * 5. Set reset_active to false.
+	 */
+
+	struct dlb2_dev *dlb2_dev;
+
+	dlb2_dev = pci_get_drvdata(pdev);
+
+	dev_dbg(dlb2_dev->dlb2_device, "DLB driver reset prepare\n");
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	/* Block any new device files from being opened */
+	dlb2_dev->reset_active = true;
+
+	/*
+	 * If the device has 1+ VFs, even if they're not in use, it will not be
+	 * suspended. To avoid having to handle two cases (reset while device
+	 * suspended and reset while device active), increment the device's PM
+	 * refcnt here, to guarantee that the device is in D0 for the duration
+	 * of the reset.
+	 */
+	dlb2_dev->ops->inc_pm_refcnt(dlb2_dev->pdev, true);
+
+	/*
+	 * Stop existing applications from continuing to use the device by
+	 * blocking kernel driver interfaces and waking any threads on wait
+	 * queues, but don't zap VMA entries yet. If this is a PF FLR, notify
+	 * any VFs of the impending FLR so they can stop their users as well.
+	 */
+	dlb2_stop_users(dlb2_dev);
+
+	/* If no software is using the device, there's nothing to clean up. */
+	if (!dlb2_in_use(dlb2_dev))
+		return;
+
+	dev_dbg(dlb2_dev->dlb2_device, "Waiting for users to stop\n");
+
+	/*
+	 * Release the resource mutex so threads can complete their work and
+	 * exit the driver
+	 */
+	mutex_unlock(&dlb2_dev->resource_mutex);
+
+	/*
+	 * Wait until the device is idle or dlb2_reset_timeout_s seconds
+	 * elapse. If the timeout occurs, zap any remaining VMA entries to
+	 * guarantee applications can't reach the device.
+	 */
+	dlb2_wait_for_idle(dlb2_dev);
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	if (!dlb2_in_use(dlb2_dev))
+		return;
+
+	dlb2_unmap_all_mappings(dlb2_dev);
+
+	/*
+	 * Don't release resource_mutex until after the FLR occurs. This
+	 * prevents applications from accessing the device during reset.
+	 */
+}
+
+static void dlb2_reset_done(struct pci_dev *pdev)
+{
+	struct dlb2_dev *dlb2_dev;
+	int i;
+
+	dlb2_dev = pci_get_drvdata(pdev);
+
+	/*
+	 * Clear all domain pointers, to be filled in by post-FLR applications
+	 * using the device driver.
+	 *
+	 * Note that domain memory isn't leaked -- it is either freed during
+	 * dlb2_stop_users() or in the file close callback.
+	 */
+	for (i = 0; i < DLB2_MAX_NUM_DOMAINS; i++)
+		dlb2_dev->sched_domains[i] = NULL;
+
+	/*
+	 * Free allocated CQ memory. These are no longer accessible to
+	 * user-space: either the applications closed, or their mappings were
+	 * zapped in dlb2_reset_prepare().
+	 */
+	dlb2_release_device_memory(dlb2_dev);
+
+	/* Reset interrupt state */
+	for (i = 0; i < DLB2_MAX_NUM_LDB_PORTS; i++)
+		dlb2_dev->intr.ldb_cq_intr[i].configured = false;
+	for (i = 0; i < DLB2_MAX_NUM_DIR_PORTS; i++)
+		dlb2_dev->intr.dir_cq_intr[i].configured = false;
+
+	/* Reset resource allocation state */
+	dlb2_resource_reset(&dlb2_dev->hw);
+
+	/* Reset the hardware state, but don't issue an additional FLR */
+	dlb2_reset_hardware_state(dlb2_dev, false);
+
+	dev_dbg(dlb2_dev->dlb2_device, "DLB driver reset done\n");
+
+	dlb2_dev->domain_reset_failed = false;
+
+	dlb2_dev->reset_active = false;
+
+	/* Undo the PM refcnt increment in dlb2_reset_prepare(). */
+	dlb2_dev->ops->dec_pm_refcnt(dlb2_dev->pdev);
+
+	mutex_unlock(&dlb2_dev->resource_mutex);
+}
+
+static const struct pci_error_handlers dlb2_err_handler = {
+	.reset_prepare = dlb2_reset_prepare,
+	.reset_done    = dlb2_reset_done,
+};
+
 #ifdef CONFIG_PM
 static const struct dev_pm_ops dlb2_pm_ops = {
 	SET_RUNTIME_PM_OPS(dlb2_runtime_suspend, dlb2_runtime_resume, NULL)
@@ -792,6 +1106,7 @@ static struct pci_driver dlb2_pci_driver = {
 #ifdef CONFIG_PM
 	.driver.pm	 = &dlb2_pm_ops,
 #endif
+	.err_handler     = &dlb2_err_handler,
 };
 
 static int __init dlb2_init_module(void)
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 4b7a430ffd86..e804639af0f8 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -31,6 +31,8 @@ static const char dlb2_driver_name[] = KBUILD_MODNAME;
 #define DLB2_NUM_FUNCS_PER_DEVICE (1 + DLB2_MAX_NUM_VDEVS)
 #define DLB2_MAX_NUM_DEVICES (DLB2_MAX_NUM_PFS * DLB2_NUM_FUNCS_PER_DEVICE)
 
+#define DLB2_DEFAULT_RESET_TIMEOUT_S 5
+
 enum dlb2_device_type {
 	DLB2_PF,
 	DLB2_VF,
@@ -182,6 +184,7 @@ struct dlb2_domain {
 	u8 alert_rd_idx;
 	u8 alert_wr_idx;
 	u8 id;
+	u8 valid;
 };
 
 struct dlb2_cq_intr {
@@ -237,6 +240,7 @@ struct dlb2_dev {
 	int id;
 	dev_t dev_number;
 	u8 domain_reset_failed;
+	u8 reset_active;
 	u8 worker_launched;
 };
 
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 2c04c5164223..023409022f39 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -24,6 +24,9 @@
 #define DLB2_DOM_LIST_FOR_SAFE(head, ptr, ptr_tmp) \
 	list_for_each_entry_safe(ptr, ptr_tmp, &(head), domain_list)
 
+#define DLB2_FUNC_LIST_FOR_SAFE(head, ptr, ptr_tmp) \
+	list_for_each_entry_safe(ptr, ptr_tmp, &(head), func_list)
+
 /*
  * The PF driver cannot assume that a register write will affect subsequent HCW
  * writes. To ensure a write completes, the driver must read back a CSR. This
@@ -5327,6 +5330,20 @@ static int dlb2_domain_reset_software_state(struct dlb2_hw *hw,
 	return 0;
 }
 
+void dlb2_resource_reset(struct dlb2_hw *hw)
+{
+	struct dlb2_hw_domain *domain, *next __attribute__((unused));
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_VDEVS; i++) {
+		DLB2_FUNC_LIST_FOR_SAFE(hw->vdev[i].used_domains, domain, next)
+			dlb2_domain_reset_software_state(hw, domain);
+	}
+
+	DLB2_FUNC_LIST_FOR_SAFE(hw->pf.used_domains, domain, next)
+		dlb2_domain_reset_software_state(hw, domain);
+}
+
 static u32 dlb2_dir_queue_depth(struct dlb2_hw *hw,
 				struct dlb2_dir_pq_pair *queue)
 {
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 430175efcd71..7570e02d37e5 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -37,6 +37,15 @@ int dlb2_resource_init(struct dlb2_hw *hw);
 void dlb2_resource_free(struct dlb2_hw *hw);
 
 /**
+ * dlb2_resource_reset() - reset in-use resources to their initial state
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function resets in-use resources, and makes them available for use.
+ * All resources go back to their owning function, whether a PF or a VF.
+ */
+void dlb2_resource_reset(struct dlb2_hw *hw);
+
+/**
  * dlb2_hw_create_sched_domain() - create a scheduling domain
  * @hw: dlb2_hw handle for a particular device.
  * @args: scheduling domain creation arguments.
-- 
2.13.6



* [PATCH 19/20] dlb2: add basic PF sysfs interfaces
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (17 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 18/20] dlb2: add device FLR support Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  2020-07-12 13:43 ` [PATCH 20/20] dlb2: add ingress error handling Gage Eads
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

These interfaces include files for reading the total and available device
resources, getting and setting sequence number group allocations, and
reading the device ID.
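
As a usage illustration (not part of the patch), these attributes read like
any other sysfs file; the PCI BDF below is a placeholder:

#include <stdio.h>

int main(void)
{
	const char *path =
		"/sys/bus/pci/devices/0000:6d:00.0/avail_resources/num_ldb_ports";
	FILE *f = fopen(path, "r");
	int avail;

	if (!f)
		return 1;

	if (fscanf(f, "%d", &avail) != 1) {
		fclose(f);
		return 1;
	}

	printf("available load-balanced ports: %d\n", avail);
	fclose(f);
	return 0;
}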

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 Documentation/ABI/testing/sysfs-driver-dlb2 | 165 +++++++++++
 drivers/misc/dlb2/dlb2_main.c               |  11 +
 drivers/misc/dlb2/dlb2_main.h               |   3 +
 drivers/misc/dlb2/dlb2_pf_ops.c             | 423 ++++++++++++++++++++++++++++
 4 files changed, 602 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-dlb2 b/Documentation/ABI/testing/sysfs-driver-dlb2
index 38d0d3d92670..c5cb1cbb70f4 100644
--- a/Documentation/ABI/testing/sysfs-driver-dlb2
+++ b/Documentation/ABI/testing/sysfs-driver-dlb2
@@ -20,3 +20,168 @@ Description:	Interface for setting the driver's reset timeout.
 		their device mappings can interfere with processes that use the
 		device after the reset completes. To ensure that user processes
 		have enough time to clean up, reset_timeout_s can be increased.
+
+What:		/sys/bus/pci/devices/.../sequence_numbers/group<N>_sns_per_queue
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:	Interface for configuring dlb2 load-balanced sequence numbers.
+
+		The DLB 2.0 has a fixed number of sequence numbers used for
+		ordered scheduling. They are divided among two sequence number
+		groups. A group can be configured to contain one queue with
+		1,024 sequence numbers, or two queues with 512 sequence numbers
+		each, and so on, down to 16 queues with 64 sequence numbers
+		each.
+
+		When a load-balanced queue is configured with non-zero sequence
+		numbers, the driver looks for a group configured for that
+		number of sequence numbers that has an available slot. If no
+		such group is found, the queue cannot be configured.
+
+		Once the first ordered queue is configured, the sequence number
+		configurations are locked. The driver returns an error on writes
+		to locked sequence number configurations. When all ordered
+		queues are unconfigured, the sequence number configurations can
+		be changed again.
+
+		This file is only accessible for physical function DLB 2.0
+		devices.
+
+What:		/sys/bus/pci/devices/.../total_resources/num_atomic_inflights
+What:		/sys/bus/pci/devices/.../total_resources/num_dir_credits
+What:		/sys/bus/pci/devices/.../total_resources/num_dir_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_hist_list_entries
+What:		/sys/bus/pci/devices/.../total_resources/num_ldb_credits
+What:		/sys/bus/pci/devices/.../total_resources/num_ldb_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_cos0_ldb_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_cos1_ldb_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_cos2_ldb_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_cos3_ldb_ports
+What:		/sys/bus/pci/devices/.../total_resources/num_ldb_queues
+What:		/sys/bus/pci/devices/.../total_resources/num_sched_domains
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:
+		The total_resources subdirectory contains read-only files that
+		indicate the total number of resources in the device.
+
+		num_atomic_inflights:  Total number of atomic inflights in the
+				       device. Atomic inflights refers to the
+				       on-device storage used by the atomic
+				       scheduler.
+
+		num_dir_credits:       Total number of directed credits in the
+				       device.
+
+		num_dir_ports:	       Total number of directed ports (and
+				       queues) in the device.
+
+		num_hist_list_entries: Total number of history list entries in
+				       the device.
+
+		num_ldb_credits:       Total number of load-balanced credits in
+				       the device.
+
+		num_ldb_ports:	       Total number of load-balanced ports in
+				       the device.
+
+		num_cos<M>_ldb_ports:  Total number of load-balanced ports
+				       belonging to class-of-service M in the
+				       device.
+
+		num_ldb_queues:	       Total number of load-balanced queues in
+				       the device.
+
+		num_sched_domains:     Total number of scheduling domains in the
+				       device.
+
+What:		/sys/bus/pci/devices/.../avail_resources/num_atomic_inflights
+What:		/sys/bus/pci/devices/.../avail_resources/num_dir_credits
+What:		/sys/bus/pci/devices/.../avail_resources/num_dir_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_hist_list_entries
+What:		/sys/bus/pci/devices/.../avail_resources/num_ldb_credits
+What:		/sys/bus/pci/devices/.../avail_resources/num_ldb_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_cos0_ldb_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_cos1_ldb_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_cos2_ldb_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_cos3_ldb_ports
+What:		/sys/bus/pci/devices/.../avail_resources/num_ldb_queues
+What:		/sys/bus/pci/devices/.../avail_resources/num_sched_domains
+What:		/sys/bus/pci/devices/.../avail_resources/max_ctg_hl_entries
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:
+		The avail_resources subdirectory contains read-only files that
+		indicate the available number of resources in the device.
+		"Available" here means resources that are not currently in use
+		by an application or, in the case of a physical function
+		device, assigned to a virtual function.
+
+		num_atomic_inflights:  Available number of atomic inflights in
+				       the device.
+
+		num_dir_ports:	       Available number of directed ports (and
+				       queues) in the device.
+
+		num_hist_list_entries: Available number of history list entries
+				       in the device.
+
+		num_ldb_credits:       Available number of load-balanced credits
+				       in the device.
+
+		num_ldb_ports:	       Available number of load-balanced ports
+				       in the device.
+
+		num_cos<M>_ldb_ports:  Available number of load-balanced ports
+				       belonging to class-of-service M in the
+				       device.
+
+		num_ldb_queues:	       Available number of load-balanced queues
+				       in the device.
+
+		num_sched_domains:     Available number of scheduling domains in
+				       the device.
+
+		max_ctg_hl_entries:    Maximum contiguous history list entries
+				       available in the device.
+
+				       Each scheduling domain is created with
+				       an allocation of history list entries,
+				       and each domain's allocation of entries
+				       must be contiguous.
+
+What:		/sys/bus/pci/devices/.../cos_bw/cos<N>_bw_percent
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:	Interface for configuring dlb2 class-of-service bandwidths.
+
+		The DLB 2.0 supports four load-balanced port classes of service
+		(CoS). Each CoS receives a guaranteed percentage of the
+		load-balanced scheduler's bandwidth, and any unreserved
+		bandwidth is divided among the four CoS.
+
+		By default, each CoS is guaranteed 25% of the bandwidth.
+		Writing these files allows the user to change a CoS's reserved
+		bandwidth, subject to the constraint that the total reserved
+		bandwidth across all CoS cannot exceed 100%.
+
+		Bandwidth reservations can be modified during run-time.
+
+		These files are only accessible for physical function DLB 2.0
+		devices.
+
+What:		/sys/bus/pci/devices/.../dev_id
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:	Device ID used in /dev, i.e. /dev/dlb<device ID>
+
+		Each DLB 2.0 PF and VF device is granted a unique ID by the
+		kernel driver, and this ID is used to construct the device's
+		/dev directory: /dev/dlb<device ID>. This sysfs file can be read
+		to determine a device's ID, which allows the user to map a
+		device file to a PCI BDF.
diff --git a/drivers/misc/dlb2/dlb2_main.c b/drivers/misc/dlb2/dlb2_main.c
index ade71ed2e66b..f3996e7cb256 100644
--- a/drivers/misc/dlb2/dlb2_main.c
+++ b/drivers/misc/dlb2/dlb2_main.c
@@ -669,6 +669,10 @@ static int dlb2_probe(struct pci_dev *pdev,
 	if (ret)
 		goto dma_set_mask_fail;
 
+	ret = dlb2_dev->ops->sysfs_create(dlb2_dev);
+	if (ret)
+		goto sysfs_create_fail;
+
 	/*
 	 * PM enable must be done before any other MMIO accesses, and this
 	 * setting is persistent across device reset.
@@ -719,6 +723,8 @@ static int dlb2_probe(struct pci_dev *pdev,
 init_interrupts_fail:
 dlb2_reset_fail:
 wait_for_device_ready_fail:
+	dlb2_dev->ops->sysfs_destroy(dlb2_dev);
+sysfs_create_fail:
 dma_set_mask_fail:
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
 device_add_fail:
@@ -761,6 +767,8 @@ static void dlb2_remove(struct pci_dev *pdev)
 
 	dlb2_release_device_memory(dlb2_dev);
 
+	dlb2_dev->ops->sysfs_destroy(dlb2_dev);
+
 	dlb2_dev->ops->device_destroy(dlb2_dev, dlb2_class);
 
 	dlb2_dev->ops->cdev_del(dlb2_dev);
@@ -786,6 +794,9 @@ static void dlb2_reset_hardware_state(struct dlb2_dev *dev, bool issue_flr)
 	/* Reinitialize interrupt configuration */
 	dev->ops->reinit_interrupts(dev);
 
+	/* Reset configuration done through the sysfs */
+	dev->ops->sysfs_reapply(dev);
+
 	/* Reinitialize any other hardware state */
 	dev->ops->init_hardware(dev);
 }
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index e804639af0f8..05f6a9c976bf 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -57,6 +57,9 @@ struct dlb2_device_ops {
 			dev_t base,
 			const struct file_operations *fops);
 	void (*cdev_del)(struct dlb2_dev *dlb2_dev);
+	int (*sysfs_create)(struct dlb2_dev *dlb2_dev);
+	void (*sysfs_destroy)(struct dlb2_dev *dlb2_dev);
+	void (*sysfs_reapply)(struct dlb2_dev *dev);
 	int (*init_interrupts)(struct dlb2_dev *dev, struct pci_dev *pdev);
 	int (*enable_ldb_cq_interrupts)(struct dlb2_dev *dev,
 					int domain_id,
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index 124e097ff979..f17be00ebc81 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -443,6 +443,426 @@ dlb2_pf_init_hardware(struct dlb2_dev *dlb2_dev)
 }
 
 /*****************************/
+/****** Sysfs callbacks ******/
+/*****************************/
+
+#define DLB2_TOTAL_SYSFS_SHOW(name, macro)		\
+static ssize_t total_##name##_show(			\
+	struct device *dev,				\
+	struct device_attribute *attr,			\
+	char *buf)					\
+{							\
+	int val = DLB2_MAX_NUM_##macro;			\
+							\
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val);	\
+}
+
+DLB2_TOTAL_SYSFS_SHOW(num_sched_domains, DOMAINS)
+DLB2_TOTAL_SYSFS_SHOW(num_ldb_queues, LDB_QUEUES)
+DLB2_TOTAL_SYSFS_SHOW(num_ldb_ports, LDB_PORTS)
+DLB2_TOTAL_SYSFS_SHOW(num_cos0_ldb_ports, LDB_PORTS / DLB2_NUM_COS_DOMAINS)
+DLB2_TOTAL_SYSFS_SHOW(num_cos1_ldb_ports, LDB_PORTS / DLB2_NUM_COS_DOMAINS)
+DLB2_TOTAL_SYSFS_SHOW(num_cos2_ldb_ports, LDB_PORTS / DLB2_NUM_COS_DOMAINS)
+DLB2_TOTAL_SYSFS_SHOW(num_cos3_ldb_ports, LDB_PORTS / DLB2_NUM_COS_DOMAINS)
+DLB2_TOTAL_SYSFS_SHOW(num_dir_ports, DIR_PORTS)
+DLB2_TOTAL_SYSFS_SHOW(num_ldb_credits, LDB_CREDITS)
+DLB2_TOTAL_SYSFS_SHOW(num_dir_credits, DIR_CREDITS)
+DLB2_TOTAL_SYSFS_SHOW(num_atomic_inflights, AQED_ENTRIES)
+DLB2_TOTAL_SYSFS_SHOW(num_hist_list_entries, HIST_LIST_ENTRIES)
+
+#define DLB2_AVAIL_SYSFS_SHOW(name)			     \
+static ssize_t avail_##name##_show(			     \
+	struct device *dev,				     \
+	struct device_attribute *attr,			     \
+	char *buf)					     \
+{							     \
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);    \
+	struct dlb2_get_num_resources_args arg;		     \
+	struct dlb2_hw *hw = &dlb2_dev->hw;		     \
+	int val;					     \
+							     \
+	mutex_lock(&dlb2_dev->resource_mutex);		     \
+							     \
+	val = dlb2_hw_get_num_resources(hw, &arg, false, 0); \
+							     \
+	mutex_unlock(&dlb2_dev->resource_mutex);	     \
+							     \
+	if (val)					     \
+		return -EINVAL;				     \
+							     \
+	val = arg.name;					     \
+							     \
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val);	     \
+}
+
+#define DLB2_AVAIL_SYSFS_SHOW_COS(name, idx)		     \
+static ssize_t avail_##name##_show(			     \
+	struct device *dev,				     \
+	struct device_attribute *attr,			     \
+	char *buf)					     \
+{							     \
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);    \
+	struct dlb2_get_num_resources_args arg;		     \
+	struct dlb2_hw *hw = &dlb2_dev->hw;		     \
+	int val;					     \
+							     \
+	mutex_lock(&dlb2_dev->resource_mutex);		     \
+							     \
+	val = dlb2_hw_get_num_resources(hw, &arg, false, 0); \
+							     \
+	mutex_unlock(&dlb2_dev->resource_mutex);	     \
+							     \
+	if (val)					     \
+		return -EINVAL;				     \
+							     \
+	val = arg.num_cos_ldb_ports[idx];		     \
+							     \
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val);	     \
+}
+
+DLB2_AVAIL_SYSFS_SHOW(num_sched_domains)
+DLB2_AVAIL_SYSFS_SHOW(num_ldb_queues)
+DLB2_AVAIL_SYSFS_SHOW(num_ldb_ports)
+DLB2_AVAIL_SYSFS_SHOW_COS(num_cos0_ldb_ports, 0)
+DLB2_AVAIL_SYSFS_SHOW_COS(num_cos1_ldb_ports, 1)
+DLB2_AVAIL_SYSFS_SHOW_COS(num_cos2_ldb_ports, 2)
+DLB2_AVAIL_SYSFS_SHOW_COS(num_cos3_ldb_ports, 3)
+DLB2_AVAIL_SYSFS_SHOW(num_dir_ports)
+DLB2_AVAIL_SYSFS_SHOW(num_ldb_credits)
+DLB2_AVAIL_SYSFS_SHOW(num_dir_credits)
+DLB2_AVAIL_SYSFS_SHOW(num_atomic_inflights)
+DLB2_AVAIL_SYSFS_SHOW(num_hist_list_entries)
+
+static ssize_t max_ctg_hl_entries_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);
+	struct dlb2_get_num_resources_args arg;
+	struct dlb2_hw *hw = &dlb2_dev->hw;
+	int val;
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	val = dlb2_hw_get_num_resources(hw, &arg, false, 0);
+
+	mutex_unlock(&dlb2_dev->resource_mutex);
+
+	if (val)
+		return -EINVAL;
+
+	val = arg.max_contiguous_hist_list_entries;
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+/*
+ * Device attribute name doesn't match the show function name, so we define our
+ * own DEVICE_ATTR macro.
+ */
+#define DLB2_DEVICE_ATTR_RO(_prefix, _name) \
+struct device_attribute dev_attr_##_prefix##_##_name = {\
+	.attr = { .name = __stringify(_name), .mode = 0444 },\
+	.show = _prefix##_##_name##_show,\
+}
+
+static DLB2_DEVICE_ATTR_RO(total, num_sched_domains);
+static DLB2_DEVICE_ATTR_RO(total, num_ldb_queues);
+static DLB2_DEVICE_ATTR_RO(total, num_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_cos0_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_cos1_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_cos2_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_cos3_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_dir_ports);
+static DLB2_DEVICE_ATTR_RO(total, num_ldb_credits);
+static DLB2_DEVICE_ATTR_RO(total, num_dir_credits);
+static DLB2_DEVICE_ATTR_RO(total, num_atomic_inflights);
+static DLB2_DEVICE_ATTR_RO(total, num_hist_list_entries);
+
+static struct attribute *dlb2_total_attrs[] = {
+	&dev_attr_total_num_sched_domains.attr,
+	&dev_attr_total_num_ldb_queues.attr,
+	&dev_attr_total_num_ldb_ports.attr,
+	&dev_attr_total_num_cos0_ldb_ports.attr,
+	&dev_attr_total_num_cos1_ldb_ports.attr,
+	&dev_attr_total_num_cos2_ldb_ports.attr,
+	&dev_attr_total_num_cos3_ldb_ports.attr,
+	&dev_attr_total_num_dir_ports.attr,
+	&dev_attr_total_num_ldb_credits.attr,
+	&dev_attr_total_num_dir_credits.attr,
+	&dev_attr_total_num_atomic_inflights.attr,
+	&dev_attr_total_num_hist_list_entries.attr,
+	NULL
+};
+
+static const struct attribute_group dlb2_total_attr_group = {
+	.attrs = dlb2_total_attrs,
+	.name = "total_resources",
+};
+
+static DLB2_DEVICE_ATTR_RO(avail, num_sched_domains);
+static DLB2_DEVICE_ATTR_RO(avail, num_ldb_queues);
+static DLB2_DEVICE_ATTR_RO(avail, num_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_cos0_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_cos1_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_cos2_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_cos3_ldb_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_dir_ports);
+static DLB2_DEVICE_ATTR_RO(avail, num_ldb_credits);
+static DLB2_DEVICE_ATTR_RO(avail, num_dir_credits);
+static DLB2_DEVICE_ATTR_RO(avail, num_atomic_inflights);
+static DLB2_DEVICE_ATTR_RO(avail, num_hist_list_entries);
+static DEVICE_ATTR_RO(max_ctg_hl_entries);
+
+static struct attribute *dlb2_avail_attrs[] = {
+	&dev_attr_avail_num_sched_domains.attr,
+	&dev_attr_avail_num_ldb_queues.attr,
+	&dev_attr_avail_num_ldb_ports.attr,
+	&dev_attr_avail_num_cos0_ldb_ports.attr,
+	&dev_attr_avail_num_cos1_ldb_ports.attr,
+	&dev_attr_avail_num_cos2_ldb_ports.attr,
+	&dev_attr_avail_num_cos3_ldb_ports.attr,
+	&dev_attr_avail_num_dir_ports.attr,
+	&dev_attr_avail_num_ldb_credits.attr,
+	&dev_attr_avail_num_dir_credits.attr,
+	&dev_attr_avail_num_atomic_inflights.attr,
+	&dev_attr_avail_num_hist_list_entries.attr,
+	&dev_attr_max_ctg_hl_entries.attr,
+	NULL
+};
+
+static const struct attribute_group dlb2_avail_attr_group = {
+	.attrs = dlb2_avail_attrs,
+	.name = "avail_resources",
+};
+
+#define DLB2_GROUP_SNS_PER_QUEUE_SHOW(id)	       \
+static ssize_t group##id##_sns_per_queue_show(	       \
+	struct device *dev,			       \
+	struct device_attribute *attr,		       \
+	char *buf)				       \
+{						       \
+	struct dlb2_dev *dlb2_dev =		       \
+		dev_get_drvdata(dev);		       \
+	struct dlb2_hw *hw = &dlb2_dev->hw;	       \
+	int val;				       \
+						       \
+	mutex_lock(&dlb2_dev->resource_mutex);	       \
+						       \
+	val = dlb2_get_group_sequence_numbers(hw, id); \
+						       \
+	mutex_unlock(&dlb2_dev->resource_mutex);       \
+						       \
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val); \
+}
+
+DLB2_GROUP_SNS_PER_QUEUE_SHOW(0)
+DLB2_GROUP_SNS_PER_QUEUE_SHOW(1)
+
+#define DLB2_GROUP_SNS_PER_QUEUE_STORE(id)		    \
+static ssize_t group##id##_sns_per_queue_store(		    \
+	struct device *dev,				    \
+	struct device_attribute *attr,			    \
+	const char *buf,				    \
+	size_t count)					    \
+{							    \
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);   \
+	struct dlb2_hw *hw = &dlb2_dev->hw;		    \
+	unsigned long val;				    \
+	int err;					    \
+							    \
+	err = kstrtoul(buf, 0, &val);			    \
+	if (err)					    \
+		return err;				    \
+							    \
+	mutex_lock(&dlb2_dev->resource_mutex);		    \
+							    \
+	err = dlb2_set_group_sequence_numbers(hw, id, val); \
+							    \
+	mutex_unlock(&dlb2_dev->resource_mutex);	    \
+							    \
+	return err ? err : count;			    \
+}
+
+DLB2_GROUP_SNS_PER_QUEUE_STORE(0)
+DLB2_GROUP_SNS_PER_QUEUE_STORE(1)
+
+/* RW sysfs files in the sequence_numbers/ subdirectory */
+static DEVICE_ATTR_RW(group0_sns_per_queue);
+static DEVICE_ATTR_RW(group1_sns_per_queue);
+
+static struct attribute *dlb2_sequence_number_attrs[] = {
+	&dev_attr_group0_sns_per_queue.attr,
+	&dev_attr_group1_sns_per_queue.attr,
+	NULL
+};
+
+static const struct attribute_group dlb2_sequence_number_attr_group = {
+	.attrs = dlb2_sequence_number_attrs,
+	.name = "sequence_numbers"
+};
+
+#define DLB2_COS_BW_PERCENT_SHOW(id)		       \
+static ssize_t cos##id##_bw_percent_show(	       \
+	struct device *dev,			       \
+	struct device_attribute *attr,		       \
+	char *buf)				       \
+{						       \
+	struct dlb2_dev *dlb2_dev =		       \
+		dev_get_drvdata(dev);		       \
+	struct dlb2_hw *hw = &dlb2_dev->hw;	       \
+	int val;				       \
+						       \
+	mutex_lock(&dlb2_dev->resource_mutex);	       \
+						       \
+	val = dlb2_hw_get_cos_bandwidth(hw, id);       \
+						       \
+	mutex_unlock(&dlb2_dev->resource_mutex);       \
+						       \
+	return scnprintf(buf, PAGE_SIZE, "%d\n", val); \
+}
+
+DLB2_COS_BW_PERCENT_SHOW(0)
+DLB2_COS_BW_PERCENT_SHOW(1)
+DLB2_COS_BW_PERCENT_SHOW(2)
+DLB2_COS_BW_PERCENT_SHOW(3)
+
+#define DLB2_COS_BW_PERCENT_STORE(id)		      \
+static ssize_t cos##id##_bw_percent_store(	      \
+	struct device *dev,			      \
+	struct device_attribute *attr,		      \
+	const char *buf,			      \
+	size_t count)				      \
+{						      \
+	struct dlb2_dev *dlb2_dev =		      \
+		dev_get_drvdata(dev);		      \
+	struct dlb2_hw *hw = &dlb2_dev->hw;	      \
+	unsigned long val;			      \
+	int err;				      \
+						      \
+	err = kstrtoul(buf, 0, &val);		      \
+	if (err)				      \
+		return err;			      \
+						      \
+	mutex_lock(&dlb2_dev->resource_mutex);	      \
+						      \
+	err = dlb2_hw_set_cos_bandwidth(hw, id, val); \
+						      \
+	mutex_unlock(&dlb2_dev->resource_mutex);      \
+						      \
+	return err ? err : count;		      \
+}
+
+DLB2_COS_BW_PERCENT_STORE(0)
+DLB2_COS_BW_PERCENT_STORE(1)
+DLB2_COS_BW_PERCENT_STORE(2)
+DLB2_COS_BW_PERCENT_STORE(3)
+
+/* RW sysfs files in the cos_bw/ subdirectory */
+static DEVICE_ATTR_RW(cos0_bw_percent);
+static DEVICE_ATTR_RW(cos1_bw_percent);
+static DEVICE_ATTR_RW(cos2_bw_percent);
+static DEVICE_ATTR_RW(cos3_bw_percent);
+
+static struct attribute *dlb2_cos_bw_percent_attrs[] = {
+	&dev_attr_cos0_bw_percent.attr,
+	&dev_attr_cos1_bw_percent.attr,
+	&dev_attr_cos2_bw_percent.attr,
+	&dev_attr_cos3_bw_percent.attr,
+	NULL
+};
+
+static const struct attribute_group dlb2_cos_bw_percent_attr_group = {
+	.attrs = dlb2_cos_bw_percent_attrs,
+	.name = "cos_bw"
+};
+
+static ssize_t dev_id_show(struct device *dev,
+			   struct device_attribute *attr,
+			   char *buf)
+{
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);
+
+	return scnprintf(buf, PAGE_SIZE, "%d\n", dlb2_dev->id);
+}
+
+static DEVICE_ATTR_RO(dev_id);
+
+static int
+dlb2_pf_sysfs_create(struct dlb2_dev *dlb2_dev)
+{
+	struct kobject *kobj;
+	int ret;
+
+	kobj = &dlb2_dev->pdev->dev.kobj;
+
+	ret = sysfs_create_file(kobj, &dev_attr_dev_id.attr);
+	if (ret)
+		goto dlb2_dev_id_attr_group_fail;
+
+	ret = sysfs_create_group(kobj, &dlb2_total_attr_group);
+	if (ret)
+		goto dlb2_total_attr_group_fail;
+
+	ret = sysfs_create_group(kobj, &dlb2_avail_attr_group);
+	if (ret)
+		goto dlb2_avail_attr_group_fail;
+
+	ret = sysfs_create_group(kobj, &dlb2_sequence_number_attr_group);
+	if (ret)
+		goto dlb2_sn_attr_group_fail;
+
+	ret = sysfs_create_group(kobj, &dlb2_cos_bw_percent_attr_group);
+	if (ret)
+		goto dlb2_cos_bw_percent_attr_group_fail;
+
+	return 0;
+
+dlb2_cos_bw_percent_attr_group_fail:
+	sysfs_remove_group(kobj, &dlb2_sequence_number_attr_group);
+dlb2_sn_attr_group_fail:
+	sysfs_remove_group(kobj, &dlb2_avail_attr_group);
+dlb2_avail_attr_group_fail:
+	sysfs_remove_group(kobj, &dlb2_total_attr_group);
+dlb2_total_attr_group_fail:
+	sysfs_remove_file(kobj, &dev_attr_dev_id.attr);
+dlb2_dev_id_attr_group_fail:
+	return ret;
+}
+
+static void
+dlb2_pf_sysfs_destroy(struct dlb2_dev *dlb2_dev)
+{
+	struct kobject *kobj;
+
+	kobj = &dlb2_dev->pdev->dev.kobj;
+
+	sysfs_remove_group(kobj, &dlb2_cos_bw_percent_attr_group);
+	sysfs_remove_group(kobj, &dlb2_sequence_number_attr_group);
+	sysfs_remove_group(kobj, &dlb2_avail_attr_group);
+	sysfs_remove_group(kobj, &dlb2_total_attr_group);
+	sysfs_remove_file(kobj, &dev_attr_dev_id.attr);
+}
+
+static void
+dlb2_pf_sysfs_reapply_configuration(struct dlb2_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+		int num_sns = dlb2_get_group_sequence_numbers(&dev->hw, i);
+
+		dlb2_set_group_sequence_numbers(&dev->hw, i, num_sns);
+	}
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		int bw = dlb2_hw_get_cos_bandwidth(&dev->hw, i);
+
+		dlb2_hw_set_cos_bandwidth(&dev->hw, i, bw);
+	}
+}
+
+/*****************************/
 /****** IOCTL callbacks ******/
 /*****************************/
 
@@ -676,6 +1096,9 @@ struct dlb2_device_ops dlb2_pf_ops = {
 	.device_destroy = dlb2_pf_device_destroy,
 	.cdev_add = dlb2_pf_cdev_add,
 	.cdev_del = dlb2_pf_cdev_del,
+	.sysfs_create = dlb2_pf_sysfs_create,
+	.sysfs_destroy = dlb2_pf_sysfs_destroy,
+	.sysfs_reapply = dlb2_pf_sysfs_reapply_configuration,
 	.init_interrupts = dlb2_pf_init_interrupts,
 	.enable_ldb_cq_interrupts = dlb2_pf_enable_ldb_cq_interrupts,
 	.enable_dir_cq_interrupts = dlb2_pf_enable_dir_cq_interrupts,
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 20/20] dlb2: add ingress error handling
  2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
                   ` (18 preceding siblings ...)
  2020-07-12 13:43 ` [PATCH 19/20] dlb2: add basic PF sysfs interfaces Gage Eads
@ 2020-07-12 13:43 ` Gage Eads
  19 siblings, 0 replies; 46+ messages in thread
From: Gage Eads @ 2020-07-12 13:43 UTC (permalink / raw)
  To: linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

When DLB 2.0 detects that an application enqueued an illegal QE, it fires
an ingress error interrupt. The driver processes these errors and notifies
user-space through a domain alert.

A malicious application or virtual machine could attempt a
denial-of-service attack by enqueueing illegal QEs at a high rate and tying
up the PF driver, so the driver tracks the rate of ingress error alarms. If
the rate exceeds a threshold of 1000 alarms per second, the driver simply
disables the interrupt. The interrupt is re-enabled when the device is
reset or the driver is reloaded, or by a user through a sysfs interface.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
---
 Documentation/ABI/testing/sysfs-driver-dlb2 |  15 ++
 drivers/misc/dlb2/dlb2_hw_types.h           |  17 ++
 drivers/misc/dlb2/dlb2_main.h               |  16 ++
 drivers/misc/dlb2/dlb2_pf_ops.c             | 177 ++++++++++++++++++++
 drivers/misc/dlb2/dlb2_resource.c           | 251 +++++++++++++++++++++++++++-
 drivers/misc/dlb2/dlb2_resource.h           |  45 +++++
 6 files changed, 520 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-dlb2 b/Documentation/ABI/testing/sysfs-driver-dlb2
index c5cb1cbb70f4..3598ce9caa03 100644
--- a/Documentation/ABI/testing/sysfs-driver-dlb2
+++ b/Documentation/ABI/testing/sysfs-driver-dlb2
@@ -185,3 +185,18 @@ Description:	Device ID used in /dev, i.e. /dev/dlb<device ID>
 		/dev directory: /dev/dlb<device ID>. This sysfs file can be read
 		to determine a device's ID, which allows the user to map a
 		device file to a PCI BDF.
+
+What:		/sys/bus/pci/devices/.../ingress_err_en
+Date:		June 22, 2020
+KernelVersion:	5.9
+Contact:	gage.eads@intel.com
+Description:	Interface for re-enabling ingress error interrupts
+
+		If the PF driver detects an overload of ingress error
+		interrupts, it will -- to prevent a possible denial-of-service
+		attack -- disable them.
+
+		If the ingress error interrupts are disabled, they will be
+		re-enabled when the device is reset or the driver is reloaded.
+		This interface can be used to re-enable the interrupts as well,
+		and can be read to determine whether they are enabled.
diff --git a/drivers/misc/dlb2/dlb2_hw_types.h b/drivers/misc/dlb2/dlb2_hw_types.h
index bcd7820b5f7f..e27f43e4c7b2 100644
--- a/drivers/misc/dlb2/dlb2_hw_types.h
+++ b/drivers/misc/dlb2/dlb2_hw_types.h
@@ -61,6 +61,23 @@
 #define DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID \
 	DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS
 
+/* DLB non-CQ interrupts (alarm, mailbox, WDT) */
+#define DLB2_INT_NON_CQ 0
+
+#define DLB2_ALARM_HW_SOURCE_SYS 0
+#define DLB2_ALARM_HW_SOURCE_DLB 1
+
+#define DLB2_ALARM_HW_UNIT_CHP 4
+
+#define DLB2_ALARM_SYS_AID_ILLEGAL_QID 3
+#define DLB2_ALARM_SYS_AID_DISABLED_QID 4
+#define DLB2_ALARM_SYS_AID_ILLEGAL_HCW 5
+#define DLB2_ALARM_HW_CHP_AID_ILLEGAL_ENQ 1
+#define DLB2_ALARM_HW_CHP_AID_EXCESS_TOKEN_POPS 2
+
+#define DLB2_VF_NUM_NON_CQ_INTERRUPT_VECTORS 1
+#define DLB2_VF_NUM_CQ_INTERRUPT_VECTORS 31
+
 /*
  * Hardware-defined base addresses. Those prefixed 'DLB2_DRV' are only used by
  * the PF driver.
diff --git a/drivers/misc/dlb2/dlb2_main.h b/drivers/misc/dlb2/dlb2_main.h
index 05f6a9c976bf..ada5d96c1241 100644
--- a/drivers/misc/dlb2/dlb2_main.h
+++ b/drivers/misc/dlb2/dlb2_main.h
@@ -217,6 +217,21 @@ struct dlb2_intr {
 	int mode;
 };
 
+/*
+ * ISR overload is defined as more than DLB2_ISR_OVERLOAD_THRESH interrupts
+ * (of a particular type) occurring in a 1s period. If overload is detected,
+ * the driver blocks that interrupt (exact mechanism depending on the
+ * interrupt) from overloading the PF driver.
+ */
+#define DLB2_ISR_OVERLOAD_THRESH 1000
+#define DLB2_ISR_OVERLOAD_PERIOD_S 1
+
+struct dlb2_alarm {
+	ktime_t ts;
+	unsigned int enabled;
+	u32 count;
+};
+
 struct dlb2_dev {
 	struct pci_dev *pdev;
 	struct dlb2_hw hw;
@@ -239,6 +254,7 @@ struct dlb2_dev {
 	 */
 	struct mutex resource_mutex;
 	struct work_struct work;
+	struct dlb2_alarm ingress_err;
 	enum dlb2_device_type type;
 	int id;
 	dev_t dev_number;
diff --git a/drivers/misc/dlb2/dlb2_pf_ops.c b/drivers/misc/dlb2/dlb2_pf_ops.c
index f17be00ebc81..6b3042489e5d 100644
--- a/drivers/misc/dlb2/dlb2_pf_ops.c
+++ b/drivers/misc/dlb2/dlb2_pf_ops.c
@@ -96,6 +96,109 @@ dlb2_pf_map_pci_bar_space(struct dlb2_dev *dlb2_dev,
 /****** Interrupt management ******/
 /**********************************/
 
+static void dlb2_detect_ingress_err_overload(struct dlb2_dev *dev)
+{
+	s64 delta_us;
+
+	if (dev->ingress_err.count == 0)
+		dev->ingress_err.ts = ktime_get();
+
+	dev->ingress_err.count++;
+
+	/* Don't check for overload until OVERLOAD_THRESH ISRs have run */
+	if (dev->ingress_err.count < DLB2_ISR_OVERLOAD_THRESH)
+		return;
+
+	delta_us = ktime_us_delta(ktime_get(), dev->ingress_err.ts);
+
+	/* Reset stats for next measurement period */
+	dev->ingress_err.count = 0;
+	dev->ingress_err.ts = ktime_get();
+
+	/* Check for overload during this measurement period */
+	if (delta_us > DLB2_ISR_OVERLOAD_PERIOD_S * USEC_PER_SEC)
+		return;
+
+	/*
+	 * Alarm interrupt overload: disable software-generated alarms,
+	 * so only hardware problems (e.g. ECC errors) interrupt the PF.
+	 */
+	dlb2_disable_ingress_error_alarms(&dev->hw);
+
+	dev->ingress_err.enabled = false;
+
+	dev_err(dev->dlb2_device,
+		"[%s()] Overloaded detected: disabling ingress error interrupts",
+		__func__);
+}
+
+/*
+ * The alarm handler logs the alarm syndrome and, for user-caused errors,
+ * reports the alarm to user-space through the per-domain device file interface.
+ *
+ * This function runs as a bottom-half handler because it can call printk
+ * and/or acquire a mutex. These alarms don't need to be handled immediately --
+ * they represent a serious, unexpected error (either in hardware or software)
+ * that can't be recovered without restarting the application or resetting the
+ * device. The VF->PF operations are also non-trivial and require running in a
+ * bottom-half handler.
+ */
+static irqreturn_t
+dlb2_service_intr_handler(int irq, void *hdlr_ptr)
+{
+	struct dlb2_dev *dev = (struct dlb2_dev *)hdlr_ptr;
+	union dlb2_sys_alarm_hw_synd r0;
+
+	dev_dbg(dev->dlb2_device, "DLB service interrupt fired\n");
+
+	mutex_lock(&dev->resource_mutex);
+
+	r0.val = DLB2_CSR_RD(&dev->hw, DLB2_SYS_ALARM_HW_SYND);
+
+	/*
+	 * Clear the MSI-X ack bit before processing the VF->PF or watchdog
+	 * timer interrupts. This order is necessary so that if an interrupt
+	 * event arrives after reading the corresponding bit vector, the event
+	 * won't be lost.
+	 */
+	dlb2_ack_msix_interrupt(&dev->hw, DLB2_INT_NON_CQ);
+
+	if (r0.field.alarm & r0.field.valid)
+		dlb2_process_alarm_interrupt(&dev->hw);
+
+	if (dlb2_process_ingress_error_interrupt(&dev->hw))
+		dlb2_detect_ingress_err_overload(dev);
+
+	mutex_unlock(&dev->resource_mutex);
+
+	return IRQ_HANDLED;
+}
+
+static int
+dlb2_init_alarm_interrupts(struct dlb2_dev *dev,
+			   struct pci_dev *pdev)
+{
+	int i, ret;
+
+	for (i = 0; i < DLB2_PF_NUM_NON_CQ_INTERRUPT_VECTORS; i++) {
+		ret = devm_request_threaded_irq(&pdev->dev,
+						pci_irq_vector(pdev, i),
+						NULL,
+						dlb2_service_intr_handler,
+						IRQF_ONESHOT,
+						"dlb2_alarm",
+						dev);
+		if (ret)
+			return ret;
+
+		dev->intr.isr_registered[i] = true;
+	}
+
+	dlb2_enable_ingress_error_alarms(&dev->hw);
+
+	return 0;
+}
+
 static irqreturn_t
 dlb2_compressed_cq_intr_handler(int irq, void *hdlr_ptr)
 {
@@ -200,6 +303,12 @@ dlb2_pf_init_interrupts(struct dlb2_dev *dev, struct pci_dev *pdev)
 	dev->intr.num_vectors = ret;
 	dev->intr.base_vector = pci_irq_vector(pdev, 0);
 
+	ret = dlb2_init_alarm_interrupts(dev, pdev);
+	if (ret) {
+		dlb2_pf_free_interrupts(dev, pdev);
+		return ret;
+	}
+
 	ret = dlb2_init_compressed_mode_interrupts(dev, pdev);
 	if (ret) {
 		dlb2_pf_free_interrupts(dev, pdev);
@@ -230,6 +339,16 @@ dlb2_pf_init_interrupts(struct dlb2_dev *dev, struct pci_dev *pdev)
 static void
 dlb2_pf_reinit_interrupts(struct dlb2_dev *dev)
 {
+	/* Re-enable alarms after device reset */
+	dlb2_enable_ingress_error_alarms(&dev->hw);
+
+	if (!dev->ingress_err.enabled)
+		dev_err(dev->dlb2_device,
+			"[%s()] Re-enabling ingress error interrupts",
+			__func__);
+
+	dev->ingress_err.enabled = true;
+
 	dlb2_set_msix_mode(&dev->hw, DLB2_MSIX_MODE_COMPRESSED);
 }
 
@@ -311,6 +430,9 @@ dlb2_pf_arm_cq_interrupt(struct dlb2_dev *dev,
 static int
 dlb2_pf_init_driver_state(struct dlb2_dev *dlb2_dev)
 {
+	dlb2_dev->ingress_err.count = 0;
+	dlb2_dev->ingress_err.enabled = true;
+
 	mutex_init(&dlb2_dev->resource_mutex);
 
 	return 0;
@@ -788,6 +910,54 @@ static ssize_t dev_id_show(struct device *dev,
 
 static DEVICE_ATTR_RO(dev_id);
 
+static ssize_t ingress_err_en_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);
+	ssize_t ret;
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	ret = scnprintf(buf, PAGE_SIZE, "%d\n", dlb2_dev->ingress_err.enabled);
+
+	mutex_unlock(&dlb2_dev->resource_mutex);
+
+	return ret;
+}
+
+static ssize_t ingress_err_en_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf,
+				    size_t count)
+{
+	struct dlb2_dev *dlb2_dev = dev_get_drvdata(dev);
+	unsigned long num;
+	ssize_t ret;
+
+	ret = kstrtoul(buf, 0, &num);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dlb2_dev->resource_mutex);
+
+	if (!dlb2_dev->ingress_err.enabled && num) {
+		dlb2_enable_ingress_error_alarms(&dlb2_dev->hw);
+
+		dev_err(dlb2_dev->dlb2_device,
+			"[%s()] Re-enabling ingress error interrupts",
+			__func__);
+
+		dlb2_dev->ingress_err.enabled = true;
+	}
+
+	mutex_unlock(&dlb2_dev->resource_mutex);
+
+	return (ret == 0) ? count : ret;
+}
+
+static DEVICE_ATTR_RW(ingress_err_en);
+
 static int
 dlb2_pf_sysfs_create(struct dlb2_dev *dlb2_dev)
 {
@@ -796,6 +966,10 @@ dlb2_pf_sysfs_create(struct dlb2_dev *dlb2_dev)
 
 	kobj = &dlb2_dev->pdev->dev.kobj;
 
+	ret = sysfs_create_file(kobj, &dev_attr_ingress_err_en.attr);
+	if (ret)
+		goto dlb2_ingress_err_en_attr_group_fail;
+
 	ret = sysfs_create_file(kobj, &dev_attr_dev_id.attr);
 	if (ret)
 		goto dlb2_dev_id_attr_group_fail;
@@ -827,6 +1001,8 @@ dlb2_pf_sysfs_create(struct dlb2_dev *dlb2_dev)
 dlb2_total_attr_group_fail:
 	sysfs_remove_file(kobj, &dev_attr_dev_id.attr);
 dlb2_dev_id_attr_group_fail:
+	sysfs_remove_file(kobj, &dev_attr_ingress_err_en.attr);
+dlb2_ingress_err_en_attr_group_fail:
 	return ret;
 }
 
@@ -842,6 +1018,7 @@ dlb2_pf_sysfs_destroy(struct dlb2_dev *dlb2_dev)
 	sysfs_remove_group(kobj, &dlb2_avail_attr_group);
 	sysfs_remove_group(kobj, &dlb2_total_attr_group);
 	sysfs_remove_file(kobj, &dev_attr_dev_id.attr);
+	sysfs_remove_file(kobj, &dev_attr_ingress_err_en.attr);
 }
 
 static void
diff --git a/drivers/misc/dlb2/dlb2_resource.c b/drivers/misc/dlb2/dlb2_resource.c
index 023409022f39..c1ad34357b0a 100644
--- a/drivers/misc/dlb2/dlb2_resource.c
+++ b/drivers/misc/dlb2/dlb2_resource.c
@@ -4917,7 +4917,7 @@ void dlb2_read_compressed_cq_intr_status(struct dlb2_hw *hw,
 					DLB2_SYS_DIR_CQ_63_32_OCC_INT_STS);
 }
 
-static void dlb2_ack_msix_interrupt(struct dlb2_hw *hw, int vector)
+void dlb2_ack_msix_interrupt(struct dlb2_hw *hw, int vector)
 {
 	union dlb2_sys_msix_ack r0 = { {0} };
 
@@ -4994,6 +4994,255 @@ void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
 	dlb2_ack_msix_interrupt(hw, DLB2_PF_COMPRESSED_MODE_CQ_VECTOR_ID);
 }
 
+void dlb2_enable_ingress_error_alarms(struct dlb2_hw *hw)
+{
+	union dlb2_sys_ingress_alarm_enbl r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_SYS_INGRESS_ALARM_ENBL);
+
+	r0.field.illegal_hcw = 1;
+	r0.field.illegal_pp = 1;
+	r0.field.illegal_pasid = 1;
+	r0.field.illegal_qid = 1;
+	r0.field.disabled_qid = 1;
+	r0.field.illegal_ldb_qid_cfg = 1;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_INGRESS_ALARM_ENBL, r0.val);
+}
+
+void dlb2_disable_ingress_error_alarms(struct dlb2_hw *hw)
+{
+	union dlb2_sys_ingress_alarm_enbl r0;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_SYS_INGRESS_ALARM_ENBL);
+
+	r0.field.illegal_hcw = 0;
+	r0.field.illegal_pp = 0;
+	r0.field.illegal_pasid = 0;
+	r0.field.illegal_qid = 0;
+	r0.field.disabled_qid = 0;
+	r0.field.illegal_ldb_qid_cfg = 0;
+
+	DLB2_CSR_WR(hw, DLB2_SYS_INGRESS_ALARM_ENBL, r0.val);
+}
+
+static void dlb2_log_alarm_syndrome(struct dlb2_hw *hw,
+				    const char *str,
+				    union dlb2_sys_alarm_hw_synd r0)
+{
+	DLB2_HW_ERR(hw, "%s:\n", str);
+	DLB2_HW_ERR(hw, "\tsyndrome: 0x%x\n", r0.field.syndrome);
+	DLB2_HW_ERR(hw, "\trtype:    0x%x\n", r0.field.rtype);
+	DLB2_HW_ERR(hw, "\talarm:    0x%x\n", r0.field.alarm);
+	DLB2_HW_ERR(hw, "\tcwd:      0x%x\n", r0.field.cwd);
+	DLB2_HW_ERR(hw, "\tvf_pf_mb: 0x%x\n", r0.field.vf_pf_mb);
+	DLB2_HW_ERR(hw, "\tcls:      0x%x\n", r0.field.cls);
+	DLB2_HW_ERR(hw, "\taid:      0x%x\n", r0.field.aid);
+	DLB2_HW_ERR(hw, "\tunit:     0x%x\n", r0.field.unit);
+	DLB2_HW_ERR(hw, "\tsource:   0x%x\n", r0.field.source);
+	DLB2_HW_ERR(hw, "\tmore:     0x%x\n", r0.field.more);
+	DLB2_HW_ERR(hw, "\tvalid:    0x%x\n", r0.field.valid);
+}
+
+/* Note: this array's contents must match dlb2_alert_id() */
+static const char dlb2_alert_strings[NUM_DLB2_DOMAIN_ALERTS][128] = {
+	[DLB2_DOMAIN_ALERT_PP_ILLEGAL_ENQ] = "Illegal enqueue",
+	[DLB2_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS] = "Excess token pops",
+	[DLB2_DOMAIN_ALERT_ILLEGAL_HCW] = "Illegal HCW",
+	[DLB2_DOMAIN_ALERT_ILLEGAL_QID] = "Illegal QID",
+	[DLB2_DOMAIN_ALERT_DISABLED_QID] = "Disabled QID",
+};
+
+static void dlb2_log_pf_vf_syndrome(struct dlb2_hw *hw,
+				    const char *str,
+				    union dlb2_sys_alarm_pf_synd0 r0,
+				    union dlb2_sys_alarm_pf_synd1 r1,
+				    union dlb2_sys_alarm_pf_synd2 r2,
+				    u32 alert_id)
+{
+	DLB2_HW_ERR(hw, "%s:\n", str);
+	if (alert_id < NUM_DLB2_DOMAIN_ALERTS)
+		DLB2_HW_ERR(hw, "Alert: %s\n", dlb2_alert_strings[alert_id]);
+	DLB2_HW_ERR(hw, "\tsyndrome:     0x%x\n", r0.field.syndrome);
+	DLB2_HW_ERR(hw, "\trtype:        0x%x\n", r0.field.rtype);
+	DLB2_HW_ERR(hw, "\tis_ldb:       0x%x\n", r0.field.is_ldb);
+	DLB2_HW_ERR(hw, "\tcls:          0x%x\n", r0.field.cls);
+	DLB2_HW_ERR(hw, "\taid:          0x%x\n", r0.field.aid);
+	DLB2_HW_ERR(hw, "\tunit:         0x%x\n", r0.field.unit);
+	DLB2_HW_ERR(hw, "\tsource:       0x%x\n", r0.field.source);
+	DLB2_HW_ERR(hw, "\tmore:         0x%x\n", r0.field.more);
+	DLB2_HW_ERR(hw, "\tvalid:        0x%x\n", r0.field.valid);
+	DLB2_HW_ERR(hw, "\tdsi:          0x%x\n", r1.field.dsi);
+	DLB2_HW_ERR(hw, "\tqid:          0x%x\n", r1.field.qid);
+	DLB2_HW_ERR(hw, "\tqtype:        0x%x\n", r1.field.qtype);
+	DLB2_HW_ERR(hw, "\tqpri:         0x%x\n", r1.field.qpri);
+	DLB2_HW_ERR(hw, "\tmsg_type:     0x%x\n", r1.field.msg_type);
+	DLB2_HW_ERR(hw, "\tlock_id:      0x%x\n", r2.field.lock_id);
+	DLB2_HW_ERR(hw, "\tmeas:         0x%x\n", r2.field.meas);
+	DLB2_HW_ERR(hw, "\tdebug:        0x%x\n", r2.field.debug);
+	DLB2_HW_ERR(hw, "\tcq_pop:       0x%x\n", r2.field.cq_pop);
+	DLB2_HW_ERR(hw, "\tqe_uhl:       0x%x\n", r2.field.qe_uhl);
+	DLB2_HW_ERR(hw, "\tqe_orsp:      0x%x\n", r2.field.qe_orsp);
+	DLB2_HW_ERR(hw, "\tqe_valid:     0x%x\n", r2.field.qe_valid);
+	DLB2_HW_ERR(hw, "\tcq_int_rearm: 0x%x\n", r2.field.cq_int_rearm);
+	DLB2_HW_ERR(hw, "\tdsi_error:    0x%x\n", r2.field.dsi_error);
+}
+
+static void dlb2_clear_syndrome_register(struct dlb2_hw *hw, u32 offset)
+{
+	union dlb2_sys_alarm_hw_synd r0 = { {0} };
+
+	r0.field.valid = 1;
+	r0.field.more = 1;
+
+	DLB2_CSR_WR(hw, offset, r0.val);
+}
+
+void dlb2_process_alarm_interrupt(struct dlb2_hw *hw)
+{
+	union dlb2_sys_alarm_hw_synd r0;
+
+	DLB2_HW_DBG(hw, "Processing alarm interrupt\n");
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_HW_SYND);
+
+	dlb2_log_alarm_syndrome(hw, "HW alarm syndrome", r0);
+
+	dlb2_clear_syndrome_register(hw, DLB2_SYS_ALARM_HW_SYND);
+}
+
+static void dlb2_process_ingress_error(struct dlb2_hw *hw,
+				       union dlb2_sys_alarm_pf_synd0 r0,
+				       u32 alert_id,
+				       bool vf_error,
+				       unsigned int vf_id)
+{
+	struct dlb2_dev *dev;
+	u32 domain_id;
+	bool is_ldb;
+	u8 port_id;
+	int ret;
+
+	port_id = r0.field.syndrome & 0x7F;
+	if (r0.field.source == DLB2_ALARM_HW_SOURCE_SYS)
+		is_ldb = r0.field.is_ldb;
+	else
+		is_ldb = (r0.field.syndrome & 0x80) != 0;
+
+	/* Get the domain ID and, if it's a VF domain, the virtual port ID */
+	if (is_ldb) {
+		struct dlb2_ldb_port *port;
+
+		port = dlb2_get_ldb_port_from_id(hw, port_id, vf_error, vf_id);
+		if (!port) {
+			DLB2_HW_ERR(hw,
+				    "[%s()]: Internal error: unable to find LDB port\n\tport: %u, vf_error: %u, vf_id: %u\n",
+				    __func__, port_id, vf_error, vf_id);
+			return;
+		}
+
+		domain_id = port->domain_id.phys_id;
+	} else {
+		struct dlb2_dir_pq_pair *port;
+
+		port = dlb2_get_dir_pq_from_id(hw, port_id, vf_error, vf_id);
+		if (!port) {
+			DLB2_HW_ERR(hw,
+				    "[%s()]: Internal error: unable to find DIR port\n\tport: %u, vf_error: %u, vf_id: %u\n",
+				    __func__, port_id, vf_error, vf_id);
+			return;
+		}
+
+		domain_id = port->domain_id.phys_id;
+	}
+
+	dev = container_of(hw, struct dlb2_dev, hw);
+
+	ret = dlb2_write_domain_alert(dev,
+				      dev->sched_domains[domain_id],
+				      alert_id,
+				      (is_ldb << 8) | port_id);
+	if (ret)
+		DLB2_HW_ERR(hw,
+			    "[%s()] Internal error: failed to notify\n",
+			    __func__);
+}
+
+static u32 dlb2_alert_id(union dlb2_sys_alarm_pf_synd0 r0)
+{
+	if (r0.field.unit == DLB2_ALARM_HW_UNIT_CHP &&
+	    r0.field.aid == DLB2_ALARM_HW_CHP_AID_ILLEGAL_ENQ)
+		return DLB2_DOMAIN_ALERT_PP_ILLEGAL_ENQ;
+	else if (r0.field.unit == DLB2_ALARM_HW_UNIT_CHP &&
+		 r0.field.aid == DLB2_ALARM_HW_CHP_AID_EXCESS_TOKEN_POPS)
+		return DLB2_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS;
+	else if (r0.field.source == DLB2_ALARM_HW_SOURCE_SYS &&
+		 r0.field.aid == DLB2_ALARM_SYS_AID_ILLEGAL_HCW)
+		return DLB2_DOMAIN_ALERT_ILLEGAL_HCW;
+	else if (r0.field.source == DLB2_ALARM_HW_SOURCE_SYS &&
+		 r0.field.aid == DLB2_ALARM_SYS_AID_ILLEGAL_QID)
+		return DLB2_DOMAIN_ALERT_ILLEGAL_QID;
+	else if (r0.field.source == DLB2_ALARM_HW_SOURCE_SYS &&
+		 r0.field.aid == DLB2_ALARM_SYS_AID_DISABLED_QID)
+		return DLB2_DOMAIN_ALERT_DISABLED_QID;
+	else
+		return NUM_DLB2_DOMAIN_ALERTS;
+}
+
+bool dlb2_process_ingress_error_interrupt(struct dlb2_hw *hw)
+{
+	union dlb2_sys_alarm_pf_synd0 r0;
+	union dlb2_sys_alarm_pf_synd1 r1;
+	union dlb2_sys_alarm_pf_synd2 r2;
+	u32 alert_id;
+	bool valid;
+	int i;
+
+	r0.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_PF_SYND0);
+
+	valid = r0.field.valid;
+
+	if (r0.field.valid) {
+		r1.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_PF_SYND1);
+		r2.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_PF_SYND2);
+
+		alert_id = dlb2_alert_id(r0);
+
+		dlb2_log_pf_vf_syndrome(hw,
+					"PF Ingress error alarm",
+					r0, r1, r2, alert_id);
+
+		dlb2_clear_syndrome_register(hw, DLB2_SYS_ALARM_PF_SYND0);
+
+		dlb2_process_ingress_error(hw, r0, alert_id, false, 0);
+	}
+
+	for (i = 0; i < DLB2_MAX_NUM_VDEVS; i++) {
+		r0.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_VF_SYND0(i));
+
+		valid |= r0.field.valid;
+
+		if (!r0.field.valid)
+			continue;
+
+		r1.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_VF_SYND1(i));
+		r2.val = DLB2_CSR_RD(hw, DLB2_SYS_ALARM_VF_SYND2(i));
+
+		alert_id = dlb2_alert_id(r0);
+
+		dlb2_log_pf_vf_syndrome(hw,
+					"VF Ingress error alarm",
+					r0, r1, r2, alert_id);
+
+		dlb2_clear_syndrome_register(hw,
+					     DLB2_SYS_ALARM_VF_SYND0(i));
+
+		dlb2_process_ingress_error(hw, r0, alert_id, true, i);
+	}
+
+	return valid;
+}
+
 int dlb2_get_group_sequence_numbers(struct dlb2_hw *hw, unsigned int group_id)
 {
 	if (group_id >= DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
diff --git a/drivers/misc/dlb2/dlb2_resource.h b/drivers/misc/dlb2/dlb2_resource.h
index 7570e02d37e5..c194778d88bc 100644
--- a/drivers/misc/dlb2/dlb2_resource.h
+++ b/drivers/misc/dlb2/dlb2_resource.h
@@ -517,6 +517,18 @@ int dlb2_configure_dir_cq_interrupt(struct dlb2_hw *hw,
 				    u16 threshold);
 
 /**
+ * dlb2_enable_ingress_error_alarms() - enable ingress error alarm interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ */
+void dlb2_enable_ingress_error_alarms(struct dlb2_hw *hw);
+
+/**
+ * dlb2_disable_ingress_error_alarms() - disable ingress error alarm interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ */
+void dlb2_disable_ingress_error_alarms(struct dlb2_hw *hw);
+
+/**
  * dlb2_set_msix_mode() - enable certain hardware alarm interrupts
  * @hw: dlb2_hw handle for a particular device.
  * @mode: MSI-X mode (DLB2_MSIX_MODE_PACKED or DLB2_MSIX_MODE_COMPRESSED)
@@ -527,6 +539,16 @@ int dlb2_configure_dir_cq_interrupt(struct dlb2_hw *hw,
 void dlb2_set_msix_mode(struct dlb2_hw *hw, int mode);
 
 /**
+ * dlb2_ack_msix_interrupt() - Ack an MSI-X interrupt
+ * @hw: dlb2_hw handle for a particular device.
+ * @vector: interrupt vector.
+ *
+ * Note: Only needed for PF service interrupts (vector 0). CQ interrupts are
+ * acked in dlb2_ack_compressed_cq_intr().
+ */
+void dlb2_ack_msix_interrupt(struct dlb2_hw *hw, int vector);
+
+/**
  * dlb2_arm_cq_interrupt() - arm a CQ's interrupt
  * @hw: dlb2_hw handle for a particular device.
  * @port_id: port ID
@@ -582,6 +604,29 @@ void dlb2_ack_compressed_cq_intr(struct dlb2_hw *hw,
 				 u32 *dir_interrupts);
 
 /**
+ * dlb2_process_alarm_interrupt() - process an alarm interrupt
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function reads the alarm syndrome, logs it, and acks the interrupt.
+ * This function should be called from the alarm interrupt handler when
+ * interrupt vector DLB2_INT_ALARM fires.
+ */
+void dlb2_process_alarm_interrupt(struct dlb2_hw *hw);
+
+/**
+ * dlb2_process_ingress_error_interrupt() - process ingress error interrupts
+ * @hw: dlb2_hw handle for a particular device.
+ *
+ * This function reads the alarm syndrome, logs it, notifies user-space, and
+ * acks the interrupt. This function should be called from the alarm interrupt
+ * handler when interrupt vector DLB2_INT_INGRESS_ERROR fires.
+ *
+ * Return:
+ * Returns true if an ingress error interrupt occurred, false otherwise
+ */
+bool dlb2_process_ingress_error_interrupt(struct dlb2_hw *hw);
+
+/**
  * dlb2_get_group_sequence_numbers() - return a group's number of SNs per queue
  * @hw: dlb2_hw handle for a particular device.
  * @group_id: sequence number group ID.
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
@ 2020-07-12 14:42   ` Randy Dunlap
  2020-07-17 18:20     ` Eads, Gage
  2020-07-12 14:53   ` Randy Dunlap
  2020-07-12 15:28   ` Arnd Bergmann
  2 siblings, 1 reply; 46+ messages in thread
From: Randy Dunlap @ 2020-07-12 14:42 UTC (permalink / raw)
  To: Gage Eads, linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

On 7/12/20 6:43 AM, Gage Eads wrote:
> +/********************/
> +/* dlb2 ioctl codes */
> +/********************/
> +
> +#define DLB2_IOC_MAGIC  'h'

Hi,
This magic value should be documented in
Documentation/userspace-api/ioctl/ioctl-number.rst.

thanks.
-- 
~Randy


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
  2020-07-12 14:42   ` Randy Dunlap
@ 2020-07-12 14:53   ` Randy Dunlap
  2020-07-17 18:20     ` Eads, Gage
  2020-07-12 15:28   ` Arnd Bergmann
  2 siblings, 1 reply; 46+ messages in thread
From: Randy Dunlap @ 2020-07-12 14:53 UTC (permalink / raw)
  To: Gage Eads, linux-kernel, arnd, gregkh; +Cc: magnus.karlsson, bjorn.topel

On 7/12/20 6:43 AM, Gage Eads wrote:
> +int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
> +			  unsigned int cmd,
> +			  unsigned long arg)
> +{
> +	u16 sz = _IOC_SIZE(cmd);
> +
> +	if (_IOC_NR(cmd) >= NUM_DLB2_CMD) {

Does this bounds check need to use array_index_nospec()
from <linux/nospec.h> ?

> +		dev_err(dev->dlb2_device,
> +			"[%s()] Unexpected DLB command %d\n",
> +			__func__, _IOC_NR(cmd));
> +		return -1;
> +	}
> +
> +	return dlb2_ioctl_callback_fns[_IOC_NR(cmd)](dev, arg, sz);
> +}

I don't know if it needs to or not. I just want to make sure
that you or someone has thought about it.
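
For illustration, a minimal sketch of how array_index_nospec() is
typically applied to this kind of table lookup (reusing the callback
table and NUM_DLB2_CMD from the patch):

#include <linux/nospec.h>

int dlb2_ioctl_dispatcher(struct dlb2_dev *dev, unsigned int cmd,
			  unsigned long arg)
{
	unsigned int nr = _IOC_NR(cmd);

	if (nr >= NUM_DLB2_CMD)
		return -EINVAL;

	/* Clamp nr so a mispredicted branch cannot index past the table */
	nr = array_index_nospec(nr, NUM_DLB2_CMD);

	return dlb2_ioctl_callback_fns[nr](dev, arg, _IOC_SIZE(cmd));
}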

-- 
~Randy


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
  2020-07-12 14:42   ` Randy Dunlap
  2020-07-12 14:53   ` Randy Dunlap
@ 2020-07-12 15:28   ` Arnd Bergmann
  2020-07-17 18:19     ` Eads, Gage
  2020-08-04 22:20     ` Eads, Gage
  2 siblings, 2 replies; 46+ messages in thread
From: Arnd Bergmann @ 2020-07-12 15:28 UTC (permalink / raw)
  To: Gage Eads; +Cc: linux-kernel, gregkh, magnus.karlsson, Björn Töpel

On Sun, Jul 12, 2020 at 3:46 PM Gage Eads <gage.eads@intel.com> wrote:
>
> This commit introduces the dlb2 device ioctl layer, and the first four
> ioctls: query device version, driver version, and available resources; and
> create a scheduling domain. This commit also introduces the user-space
> interface file dlb2_user.h.
>
> The PF hardware operation for scheduling domain creation will be added in a
> subsequent commit.
>
> Signed-off-by: Gage Eads <gage.eads@intel.com>
> Reviewed-by: Magnus Karlsson <magnus.karlsson@intel.com>

I'm looking only at the ioctl interface here, as that is usually the
hardest part to get right, and you can't easily change it after it gets
merged.

The definitions all look portable and will work fine in compat mode
(which is often a problem), but you have added a bit of complexity
compared to how it is commonly done, so it's better to simplify it
and write it more like other drivers do. More on that below.

> +/* Verify the ioctl argument size and copy the argument into kernel memory */
> +static int dlb2_copy_from_user(struct dlb2_dev *dev,
> +                              unsigned long user_arg,
> +                              u16 user_size,
> +                              void *arg,
> +                              size_t size)
> +{
> +       if (user_size != size) {
> +               dev_err(dev->dlb2_device,
> +                       "[%s()] Invalid ioctl size\n", __func__);
> +               return -EINVAL;
> +       }
> +
> +       if (copy_from_user(arg, (void __user *)user_arg, size)) {
> +               dev_err(dev->dlb2_device,
> +                       "[%s()] Invalid ioctl argument pointer\n", __func__);
> +               return -EFAULT;
> +       }
> +
> +       return 0;
> +}

You should avoid error messages that are triggered based on user input,
as they can cause a denial-of-service when the console is spammed that
way.

A plain copy_from_user() in place of this function should be fine.

> +/* [7:0]: device revision, [15:8]: device version */
> +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> +
> +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> +                                        unsigned long user_arg,
> +                                        u16 size)
> +{
> +       struct dlb2_get_device_version_args arg;
> +       struct dlb2_cmd_response response;
> +       int ret;
> +
> +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> +
> +       response.status = 0;
> +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> +
> +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> +       if (ret)
> +               return ret;
> +
> +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);

Better avoid any indirect pointers. As you always return a constant
here, I think the entire ioctl command can be removed until you
actually need it. If you have an ioctl command that needs both
input and output, use _IOWR() to define it and put all arguments
into the same structure.
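
A hypothetical combined layout could look like this (the field names
and the ioctl number are illustrative, not the patch's actual layout):

struct dlb2_create_sched_domain_args {
	/* output: written by the driver on success */
	__u32 status;
	__u32 domain_id;
	/* input: resources requested by the caller */
	__u32 num_ldb_queues;
	__u32 num_ldb_ports;
};

#define DLB2_IOC_CREATE_SCHED_DOMAIN \
	_IOWR(DLB2_IOC_MAGIC, 1, struct dlb2_create_sched_domain_args)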

> +static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
> +                                         unsigned long user_arg,
> +                                         u16 size)
> +{
> +       struct dlb2_create_sched_domain_args arg;
> +       struct dlb2_cmd_response response = {0};
> +       int ret;
> +
> +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);

I assume you have finished debugging this version and can remove
these debug comments again. If you find new bugs, you can add them
temporarily, but nothing is gained by these here. You can use ftrace
to see when functions are called.

> +
> +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> +       if (ret)
> +               return ret;
> +
> +       /* Copy zeroes to verify the user-provided response pointer */
> +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> +       if (ret)
> +               return ret;

There is no need to verify the response pointer. If it fails later,
it's the user's fault, and they get to deal with it.

> +static int dlb2_ioctl_get_driver_version(struct dlb2_dev *dev,
> +                                        unsigned long user_arg,
> +                                        u16 size)
> +{
> +       struct dlb2_get_driver_version_args arg;
> +       struct dlb2_cmd_response response;
> +       int ret;
> +
> +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> +
> +       response.status = 0;
> +       response.id = DLB2_VERSION;

Just remove the driver version command; trying to have explicit
interface versions creates more problems than it solves.

> +int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
> +                         unsigned int cmd,
> +                         unsigned long arg)
> +{
> +       u16 sz = _IOC_SIZE(cmd);
> +
> +       if (_IOC_NR(cmd) >= NUM_DLB2_CMD) {
> +               dev_err(dev->dlb2_device,
> +                       "[%s()] Unexpected DLB command %d\n",
> +                       __func__, _IOC_NR(cmd));
> +               return -1;
> +       }
> +
> +       return dlb2_ioctl_callback_fns[_IOC_NR(cmd)](dev, arg, sz);
> +}

This is usually written with a switch/case statement, so doing the same here
tends to make it easier to understand.
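
Something along these lines (a sketch; the DLB2_IOC_* values and
per-command handlers taking the argument pointer directly are assumed):

static long dlb2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
	struct dlb2_dev *dev;

	dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);

	switch (cmd) {
	case DLB2_IOC_GET_NUM_RESOURCES:
		return dlb2_ioctl_get_num_resources(dev, arg);
	case DLB2_IOC_CREATE_SCHED_DOMAIN:
		return dlb2_ioctl_create_sched_domain(dev, arg);
	default:
		return -ENOTTY;
	}
}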

> +static long
> +dlb2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
> +{
> +       struct dlb2_dev *dev;
> +
> +       dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);
> +
> +       if (_IOC_TYPE(cmd) != DLB2_IOC_MAGIC) {
> +               dev_err(dev->dlb2_device,
> +                       "[%s()] Bad magic number!\n", __func__);
> +               return -EINVAL;
> +       }
> +
> +       return dlb2_ioctl_dispatcher(dev, cmd, arg);
> +}

This function can also be removed then, just call the dispatcher
directly.
>         int err;
>
> -       pr_info("%s\n", dlb2_driver_name);
> +       pr_info("%s - version %d.%d.%d\n", dlb2_driver_name,
> +               DLB2_VERSION_MAJOR_NUMBER,
> +               DLB2_VERSION_MINOR_NUMBER,
> +               DLB2_VERSION_REVISION_NUMBER);
>         pr_info("%s\n", dlb2_driver_copyright);

Just remove the pr_info completely.

      Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode
  2020-07-12 13:43 ` [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode Gage Eads
@ 2020-07-12 15:34   ` Arnd Bergmann
  2020-07-17 18:19     ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: Arnd Bergmann @ 2020-07-12 15:34 UTC (permalink / raw)
  To: Gage Eads; +Cc: linux-kernel, gregkh, magnus.karlsson, Björn Töpel

On Sun, Jul 12, 2020 at 3:46 PM Gage Eads <gage.eads@intel.com> wrote:
>  enum dlb2_user_interface_commands {
>         DLB2_CMD_GET_DEVICE_VERSION,
>         DLB2_CMD_CREATE_SCHED_DOMAIN,
>         DLB2_CMD_GET_SCHED_DOMAIN_FD,
>         DLB2_CMD_GET_NUM_RESOURCES,
>         DLB2_CMD_GET_DRIVER_VERSION,
> +       DLB2_CMD_QUERY_CQ_POLL_MODE,
>
>         /* NUM_DLB2_CMD must be last */
>         NUM_DLB2_CMD,

> @@ -427,6 +513,8 @@ struct dlb2_get_dir_queue_depth_args {
>  enum dlb2_domain_user_interface_commands {
>         DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,
>         DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
> +       DLB2_DOMAIN_CMD_CREATE_LDB_PORT,
> +       DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
>         DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
>         DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,


You cannot add new commands in the middle without changing the ABI.

Maybe use individual #define lines in place of the enum to make sure these
remain constants, or add a numeric value for each one when they are
originally introduced.
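
For instance, fixing the values when the commands are first introduced
keeps a later insertion from silently renumbering the ABI (a sketch,
values illustrative):

enum dlb2_user_interface_commands {
	DLB2_CMD_GET_DEVICE_VERSION = 0,
	DLB2_CMD_CREATE_SCHED_DOMAIN = 1,
	DLB2_CMD_GET_SCHED_DOMAIN_FD = 2,
	DLB2_CMD_GET_NUM_RESOURCES = 3,
	DLB2_CMD_GET_DRIVER_VERSION = 4,
	/* new commands are only ever appended */
	DLB2_CMD_QUERY_CQ_POLL_MODE = 5,

	/* NUM_DLB2_CMD must be last */
	NUM_DLB2_CMD,
};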

(yes, I realize this is the initial contribution of the new driver, but it still
seems wrong to have it change in the middle of the series).

      Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
@ 2020-07-12 15:56   ` Greg KH
  2020-07-17 18:19     ` Eads, Gage
  2020-07-12 15:58   ` Greg KH
  1 sibling, 1 reply; 46+ messages in thread
From: Greg KH @ 2020-07-12 15:56 UTC (permalink / raw)
  To: Gage Eads; +Cc: linux-kernel, arnd, magnus.karlsson, bjorn.topel

On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> +config INTEL_DLB2
> +       tristate "Intel(R) Dynamic Load Balancer 2.0 Driver"
> +       depends on 64BIT && PCI && X86

Why just that platform?  What about CONFIG_COMPILE_TEST for everything else?

> +       help
> +         This driver supports the Intel(R) Dynamic Load Balancer 2.0 (DLB 2.0)
> +         device.

Are you sure you need the (R) in Kconfig texts everywhere?

And a bit more info here would be nice, as no one knows if they have
this or not, right?

> --- /dev/null
> +++ b/drivers/misc/dlb2/dlb2_hw_types.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)

Why dual licensed?  I thought that Intel told me they were not going to
do that anymore for any kernel code going forward as it was just such a
pain and never actually helped anything.  Has that changed?


> + * Copyright(c) 2016-2020 Intel Corporation
> + */
> +
> +#ifndef __DLB2_HW_TYPES_H
> +#define __DLB2_HW_TYPES_H
> +
> +#define DLB2_MAX_NUM_VDEVS 16
> +#define DLB2_MAX_NUM_DOMAINS 32
> +#define DLB2_MAX_NUM_LDB_QUEUES 32 /* LDB == load-balanced */
> +#define DLB2_MAX_NUM_DIR_QUEUES 64 /* DIR == directed */
> +#define DLB2_MAX_NUM_LDB_PORTS 64
> +#define DLB2_MAX_NUM_DIR_PORTS DLB2_MAX_NUM_DIR_QUEUES
> +#define DLB2_MAX_NUM_LDB_CREDITS 8192
> +#define DLB2_MAX_NUM_DIR_CREDITS 2048
> +#define DLB2_MAX_NUM_HIST_LIST_ENTRIES 2048
> +#define DLB2_MAX_NUM_AQED_ENTRIES 2048
> +#define DLB2_MAX_NUM_QIDS_PER_LDB_CQ 8
> +#define DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS 2
> +#define DLB2_MAX_NUM_SEQUENCE_NUMBER_MODES 5
> +#define DLB2_QID_PRIORITIES 8
> +#define DLB2_NUM_ARB_WEIGHTS 8
> +#define DLB2_MAX_WEIGHT 255
> +#define DLB2_NUM_COS_DOMAINS 4
> +#define DLB2_MAX_CQ_COMP_CHECK_LOOPS 409600
> +#define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
> +#define DLB2_HZ 800000000

No tabs?  How easy is that to read?  :(

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
  2020-07-12 15:56   ` Greg KH
@ 2020-07-12 15:58   ` Greg KH
  2020-07-17 18:18     ` Eads, Gage
  1 sibling, 1 reply; 46+ messages in thread
From: Greg KH @ 2020-07-12 15:58 UTC (permalink / raw)
  To: Gage Eads; +Cc: linux-kernel, arnd, magnus.karlsson, bjorn.topel

On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> +static int dlb2_probe(struct pci_dev *pdev,
> +		      const struct pci_device_id *pdev_id)
> +{
> +	struct dlb2_dev *dlb2_dev;
> +	int ret;
> +
> +	dev_dbg(&pdev->dev, "probe\n");

ftrace is your friend.  Remove all of your debugging code now, you don't
need it anymore, especially for stuff like this where you didn't even
need it in the first place :(

Same for everywhere else in all of these patches.  I'll stop reviewing
now, someone at Intel should have caught basic stuff like this before
now, sad...

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-12 15:58   ` Greg KH
@ 2020-07-17 18:18     ` Eads, Gage
  2020-07-18  6:46       ` Greg KH
  0 siblings, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:18 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Sunday, July 12, 2020 10:58 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> 
> On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > +static int dlb2_probe(struct pci_dev *pdev,
> > +		      const struct pci_device_id *pdev_id) {
> > +	struct dlb2_dev *dlb2_dev;
> > +	int ret;
> > +
> > +	dev_dbg(&pdev->dev, "probe\n");
> 
> ftrace is your friend.  Remove all of your debugging code now, you don't need
> it anymore, especially for stuff like this where you didn't even need it in the
> first place :(

I'll remove this and other similar dev_dbg() calls. This was an oversight on my part.

I have other instances that a kprobe can't easily replace, such as printing structure contents, that are useful for tracing the usage of the driver. It looks like other misc drivers use dev_dbg() similarly -- do you consider this an acceptable use of a debug print?

Thanks,
Gage

> 
> Same for everywhere else in all of these patches.  I'll stop reviewing now,
> someone at Intel should have caught basic stuff like this before now, sad...
> 
> greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-12 15:56   ` Greg KH
@ 2020-07-17 18:19     ` Eads, Gage
  2020-07-18  6:46       ` Greg KH
  0 siblings, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:19 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Sunday, July 12, 2020 10:57 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> 
> On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > +config INTEL_DLB2
> > +       tristate "Intel(R) Dynamic Load Balancer 2.0 Driver"
> > +       depends on 64BIT && PCI && X86
> 
> Why just that platform?  What about CONFIG_TEST for everything else?

This device will only appear on an x86 platform. CONFIG_COMPILE_TEST won't work, since the driver uses the x86-only function iosubmit_cmds512().

> 
> > +       help
> > +         This driver supports the Intel(R) Dynamic Load Balancer 2.0 (DLB 2.0)
> > +         device.
> 
> Are you sure you need the (R) in Kconfig texts everywhere?

The second is probably overkill. Just the first one is required.

> 
> And a bit more info here would be nice, as no one knows if they have this or
> not, right?

Intel hasn't yet announced more information that I can include here. For now, "lspci -d 8086:2710" will tell the user if this device is present.

> 
> > --- /dev/null
> > +++ b/drivers/misc/dlb2/dlb2_hw_types.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
> 
> Why dual licensed?  I thought that Intel told me they were not going to do
> that anymore for any kernel code going forward as it was just such a pain and
> never actually helped anything.  Has that changed?
> 

The driver is mostly GPLv2-only, but a subset constitutes a "hardware access library" that is almost completely OS-independent. "almost" because it has calls to non-GPL symbols like kmalloc() and kfree(). This dual-licensed portion can be ported to other environments that need the more permissive BSD license.

For the broader policy question, Intel's open source team will get back to you on this.

> 
> > + * Copyright(c) 2016-2020 Intel Corporation
> > + */
> > +
> > +#ifndef __DLB2_HW_TYPES_H
> > +#define __DLB2_HW_TYPES_H
> > +
> > +#define DLB2_MAX_NUM_VDEVS 16
> > +#define DLB2_MAX_NUM_DOMAINS 32
> > +#define DLB2_MAX_NUM_LDB_QUEUES 32 /* LDB == load-balanced */
> > +#define DLB2_MAX_NUM_DIR_QUEUES 64 /* DIR == directed */
> > +#define DLB2_MAX_NUM_LDB_PORTS 64
> > +#define DLB2_MAX_NUM_DIR_PORTS DLB2_MAX_NUM_DIR_QUEUES
> > +#define DLB2_MAX_NUM_LDB_CREDITS 8192
> > +#define DLB2_MAX_NUM_DIR_CREDITS 2048
> > +#define DLB2_MAX_NUM_HIST_LIST_ENTRIES 2048
> > +#define DLB2_MAX_NUM_AQED_ENTRIES 2048
> > +#define DLB2_MAX_NUM_QIDS_PER_LDB_CQ 8
> > +#define DLB2_MAX_NUM_SEQUENCE_NUMBER_GROUPS 2
> > +#define DLB2_MAX_NUM_SEQUENCE_NUMBER_MODES 5
> > +#define DLB2_QID_PRIORITIES 8
> > +#define DLB2_NUM_ARB_WEIGHTS 8
> > +#define DLB2_MAX_WEIGHT 255
> > +#define DLB2_NUM_COS_DOMAINS 4
> > +#define DLB2_MAX_CQ_COMP_CHECK_LOOPS 409600
> > +#define DLB2_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
> > +#define DLB2_HZ 800000000
> 
> No tabs?  How easy is that to read?  :(

I'll improve this (and a few other instances in later patches) in V2.

Thanks,
Gage

> 
> greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode
  2020-07-12 15:34   ` Arnd Bergmann
@ 2020-07-17 18:19     ` Eads, Gage
  0 siblings, 0 replies; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:19 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, gregkh, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Arnd Bergmann <arnd@arndb.de>
> Sent: Sunday, July 12, 2020 10:34 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; gregkh <gregkh@linuxfoundation.org>;
> Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode
> 
> On Sun, Jul 12, 2020 at 3:46 PM Gage Eads <gage.eads@intel.com> wrote:
> >  enum dlb2_user_interface_commands {
> >         DLB2_CMD_GET_DEVICE_VERSION,
> >         DLB2_CMD_CREATE_SCHED_DOMAIN,
> >         DLB2_CMD_GET_SCHED_DOMAIN_FD,
> >         DLB2_CMD_GET_NUM_RESOURCES,
> >         DLB2_CMD_GET_DRIVER_VERSION,
> > +       DLB2_CMD_QUERY_CQ_POLL_MODE,
> >
> >         /* NUM_DLB2_CMD must be last */
> >         NUM_DLB2_CMD,
> 
> > @@ -427,6 +513,8 @@ struct dlb2_get_dir_queue_depth_args {  enum
> > dlb2_domain_user_interface_commands {
> >         DLB2_DOMAIN_CMD_CREATE_LDB_QUEUE,
> >         DLB2_DOMAIN_CMD_CREATE_DIR_QUEUE,
> > +       DLB2_DOMAIN_CMD_CREATE_LDB_PORT,
> > +       DLB2_DOMAIN_CMD_CREATE_DIR_PORT,
> >         DLB2_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
> >         DLB2_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
> 
> 
> You cannot add new commands in the middle without changing the ABI.
> 
> Maybe use individual #define lines in place of the enum to make sure these
> remain constants, or add a numeric value for each one when they are
> originally introduced.
> 
> (yes, I realize this is the initial contribution of the new driver, but it still seems
> wrong to have it change in the middle of the series).

Ok, no problem -- I was assuming this patchset would be considered an atomic unit. I'll correct this in V2.

Thanks,
Gage

> 
>       Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 15:28   ` Arnd Bergmann
@ 2020-07-17 18:19     ` Eads, Gage
  2020-07-17 18:56       ` Arnd Bergmann
  2020-08-04 22:20     ` Eads, Gage
  1 sibling, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:19 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, gregkh, Karlsson, Magnus, Topel, Bjorn

 
> > +/* Verify the ioctl argument size and copy the argument into kernel memory */
> > +static int dlb2_copy_from_user(struct dlb2_dev *dev,
> > +                              unsigned long user_arg,
> > +                              u16 user_size,
> > +                              void *arg,
> > +                              size_t size)
> > +{
> > +       if (user_size != size) {
> > +               dev_err(dev->dlb2_device,
> > +                       "[%s()] Invalid ioctl size\n", __func__);
> > +               return -EINVAL;
> > +       }
> > +
> > +       if (copy_from_user(arg, (void __user *)user_arg, size)) {
> > +               dev_err(dev->dlb2_device,
> > +                       "[%s()] Invalid ioctl argument pointer\n", __func__);
> > +               return -EFAULT;
> > +       }
> > +
> > +       return 0;
> > +}
> 
> You should avoid error messages that are triggered based on user input,
> as they can cause a denial-of-service when the console is spammed that way.
> 

Makes sense, will fix.

> A plain copy_from_user() in place of this function should be fine.

This function also validates the user size arg to prevent buffer overflow; centralizing it here avoids the case where a programmer accidentally forgets the check in an ioctl handler (and reduces code duplication). If it's alright with you, I'll keep the function but drop the dev_err() prints.
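
i.e. roughly the same helper with the prints dropped (sketch):

static int dlb2_copy_from_user(struct dlb2_dev *dev, unsigned long user_arg,
			       u16 user_size, void *arg, size_t size)
{
	if (user_size != size)
		return -EINVAL;

	if (copy_from_user(arg, (void __user *)user_arg, size))
		return -EFAULT;

	return 0;
}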

> 
> > > +/* [7:0]: device revision, [15:8]: device version */
> > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > +
> > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > +                                        unsigned long user_arg,
> > > +                                        u16 size)
> > > +{
> > +       struct dlb2_get_device_version_args arg;
> > +       struct dlb2_cmd_response response;
> > +       int ret;
> > +
> > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > +
> > +       response.status = 0;
> > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > +
> > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> 
> Better avoid any indirect pointers. As you always return a constant here, I
> think the entire ioctl command can be removed until you actually need it. If
> you have an ioctl command that needs both input and output, use _IOWR()
> to define it and put all arguments into the same structure.

Ok, I'll merge the response structure into the ioctl structure (here and elsewhere).

Say I add this command later: without driver versioning, how would user-space know in advance whether the command is supported? It could attempt the command and interpret -ENOTTY as "unsupported", but that strikes me as an inelegant way to reverse-engineer the version.

> 
> > > +static int dlb2_ioctl_create_sched_domain(struct dlb2_dev *dev,
> > > +                                         unsigned long user_arg,
> > > +                                         u16 size)
> > > +{
> > +       struct dlb2_create_sched_domain_args arg;
> > +       struct dlb2_cmd_response response = {0};
> > +       int ret;
> > +
> > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> 
> I assume you have finished debugging this version and can remove these
> debug comments again. If you find new bugs, you can add them temporarily,
> but nothing is gained by these here. You can use ftrace to see when functions
> are called.

You're right, and this was mentioned in another comment as well. I'll remove these unnecessary dev_dbg() calls.

> 
> > +
> > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > +       if (ret)
> > +               return ret;
> > +
> > +       /* Copy zeroes to verify the user-provided response pointer */
> > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > +       if (ret)
> > +               return ret;
> 
> There is no need to verify the response pointer. If it fails later, it's the user's
> fault, and they get to deal with it.
> 
> > > +static int dlb2_ioctl_get_driver_version(struct dlb2_dev *dev,
> > > +                                        unsigned long user_arg,
> > > +                                        u16 size)
> > > +{
> > +       struct dlb2_get_driver_version_args arg;
> > +       struct dlb2_cmd_response response;
> > +       int ret;
> > +
> > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > +
> > +       response.status = 0;
> > +       response.id = DLB2_VERSION;
> 
> Just remove the driver version command, trying to have explicit interface
> versions creates more problems than it solves.

(See question above)

> 
> > +int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
> > +                         unsigned int cmd,
> > +                         unsigned long arg) {
> > +       u16 sz = _IOC_SIZE(cmd);
> > +
> > +       if (_IOC_NR(cmd) >= NUM_DLB2_CMD) {
> > +               dev_err(dev->dlb2_device,
> > +                       "[%s()] Unexpected DLB command %d\n",
> > +                       __func__, _IOC_NR(cmd));
> > +               return -1;
> > +       }
> > +
> > > +       return dlb2_ioctl_callback_fns[_IOC_NR(cmd)](dev, arg, sz);
> > > +}
> 
> This is usually written with a switch/case statement, so doing the same here
> tends to make it easier to understand.
> 

Will do. And as Randy noticed, this needed array_index_nospec() -- easier to avoid that entirely with a switch statement.

> > +static long
> > +dlb2_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
> > +{
> > +       struct dlb2_dev *dev;
> > +
> > +       dev = container_of(f->f_inode->i_cdev, struct dlb2_dev, cdev);
> > +
> > +       if (_IOC_TYPE(cmd) != DLB2_IOC_MAGIC) {
> > +               dev_err(dev->dlb2_device,
> > +                       "[%s()] Bad magic number!\n", __func__);
> > +               return -EINVAL;
> > +       }
> > +
> > +       return dlb2_ioctl_dispatcher(dev, cmd, arg);
> > +}
> 
> This function can also be removed then, just call the dispatcher directly.
> >         int err;
> >
> > -       pr_info("%s\n", dlb2_driver_name);
> > +       pr_info("%s - version %d.%d.%d\n", dlb2_driver_name,
> > +               DLB2_VERSION_MAJOR_NUMBER,
> > +               DLB2_VERSION_MINOR_NUMBER,
> > +               DLB2_VERSION_REVISION_NUMBER);
> >         pr_info("%s\n", dlb2_driver_copyright);
> 
> Just remove the pr_info completely.

Can you elaborate? Printing the driver name/copyright/etc. seems to be a common pattern in upstream drivers.

Thanks,
Gage

> 
>       Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 14:53   ` Randy Dunlap
@ 2020-07-17 18:20     ` Eads, Gage
  0 siblings, 0 replies; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:20 UTC (permalink / raw)
  To: Randy Dunlap, linux-kernel, arnd, gregkh; +Cc: Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Randy Dunlap <rdunlap@infradead.org>
> Sent: Sunday, July 12, 2020 9:54 AM
> To: Eads, Gage <gage.eads@intel.com>; linux-kernel@vger.kernel.org;
> arnd@arndb.de; gregkh@linuxfoundation.org
> Cc: Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> 
> On 7/12/20 6:43 AM, Gage Eads wrote:
> > +int dlb2_ioctl_dispatcher(struct dlb2_dev *dev,
> > +			  unsigned int cmd,
> > +			  unsigned long arg)
> > +{
> > +	u16 sz = _IOC_SIZE(cmd);
> > +
> > +	if (_IOC_NR(cmd) >= NUM_DLB2_CMD) {
> 
> Does this bounds check need to use array_index_nospec() from
> <linux/nospec.h> ?
> 
> > +		dev_err(dev->dlb2_device,
> > +			"[%s()] Unexpected DLB command %d\n",
> > +			__func__, _IOC_NR(cmd));
> > +		return -1;
> > +	}
> > +
> > +	return dlb2_ioctl_callback_fns[_IOC_NR(cmd)](dev, arg, sz);
> > +}
> 
> I don't know if it needs to or not. I just want to make sure that you or
> someone has thought about it.

Thanks for catching this -- it does. Per Arnd's suggestion, I'm going to convert this to a switch statement and avoid the index altogether.

Thanks,
Gage

> 
> --
> ~Randy

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 14:42   ` Randy Dunlap
@ 2020-07-17 18:20     ` Eads, Gage
  0 siblings, 0 replies; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 18:20 UTC (permalink / raw)
  To: Randy Dunlap, linux-kernel, arnd, gregkh; +Cc: Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Randy Dunlap <rdunlap@infradead.org>
> Sent: Sunday, July 12, 2020 9:42 AM
> To: Eads, Gage <gage.eads@intel.com>; linux-kernel@vger.kernel.org;
> arnd@arndb.de; gregkh@linuxfoundation.org
> Cc: Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> 
> On 7/12/20 6:43 AM, Gage Eads wrote:
> > +/********************/
> > +/* dlb2 ioctl codes */
> > +/********************/
> > +
> > +#define DLB2_IOC_MAGIC  'h'
> 
> Hi,
> This magic value should be documented in Documentation/userspace-
> api/ioctl/ioctl-number.rst.
> 
> thanks.
> --
> ~Randy

Will fix in v2.

Thanks,
Gage

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-17 18:19     ` Eads, Gage
@ 2020-07-17 18:56       ` Arnd Bergmann
  2020-07-17 20:05         ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: Arnd Bergmann @ 2020-07-17 18:56 UTC (permalink / raw)
  To: Eads, Gage; +Cc: linux-kernel, gregkh, Karlsson, Magnus, Topel, Bjorn

On Fri, Jul 17, 2020 at 8:19 PM Eads, Gage <gage.eads@intel.com> wrote:

> > A plain copy_from_user() in place of this function should be fine.
>
> This function also validates the user size arg to prevent buffer overflow; centralizing it here avoids the case where a programmer accidentally forgets the check in an ioctl handler (and reduces code duplication). If it's alright with you, I'll keep the function but drop the dev_err() prints.

Once you use a 'switch(cmd)' statement in the top ioctl handler, the
data structure size will be fixed, so there is no way the argument
size can go wrong.
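
e.g. each case then copies a fixed-size structure (sketch; the handler
signature is assumed):

	case DLB2_IOC_CREATE_SCHED_DOMAIN: {
		struct dlb2_create_sched_domain_args args;

		if (copy_from_user(&args, (void __user *)arg, sizeof(args)))
			return -EFAULT;

		return dlb2_ioctl_create_sched_domain(dev, &args);
	}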

> >
> > > +/* [7:0]: device revision, [15:8]: device version */ #define
> > > +DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > +
> > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > +                                        unsigned long user_arg,
> > > +                                        u16 size) {
> > > +       struct dlb2_get_device_version_args arg;
> > > +       struct dlb2_cmd_response response;
> > > +       int ret;
> > > +
> > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > +
> > > +       response.status = 0;
> > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > +
> > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> >
> > Better avoid any indirect pointers. As you always return a constant here, I
> > think the entire ioctl command can be removed until you actually need it. If
> > you have an ioctl command that needs both input and output, use _IOWR()
> > to define it and put all arguments into the same structure.
>
> Ok, I'll merge the response structure into the ioctl structure (here and elsewhere).
>
> Say I add this command later: without driver versioning, how would
> user-space know in advance whether the command is supported?
> It could attempt the command and interpret -ENOTTY as "unsupported",
> but that strikes me as an inelegant way to reverse-engineer the version.

There is not really a driver "version" once the driver is upstream; the
concept doesn't really make sense when arbitrary patches can get backported
from the latest kernel into whatever the user is running.

The ENOTTY check is indeed the normal way that user space deals
with interfaces that may have been added later. What you normally want
is to keep using the original interfaces anyway, unless you absolutely
need a later revision for a certain feature, and in that case the user space
program will fail no matter what.
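
In user space that check typically ends up as a small one-time probe,
roughly like this (the DLB2_IOC_SOME_NEW_CMD command and its argument
struct are hypothetical):

#include <errno.h>
#include <sys/ioctl.h>

/* 1 if the kernel supports the command, 0 if it predates it, -1 on error */
static int dlb2_cmd_supported(int fd)
{
	struct dlb2_some_new_cmd_args args = { 0 };

	if (ioctl(fd, DLB2_IOC_SOME_NEW_CMD, &args) == 0)
		return 1;

	return errno == ENOTTY ? 0 : -1;
}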

> > This function can also be removed then, just call the dispatcher directly.
> > >         int err;
> > >
> > > -       pr_info("%s\n", dlb2_driver_name);
> > > +       pr_info("%s - version %d.%d.%d\n", dlb2_driver_name,
> > > +               DLB2_VERSION_MAJOR_NUMBER,
> > > +               DLB2_VERSION_MINOR_NUMBER,
> > > +               DLB2_VERSION_REVISION_NUMBER);
> > >         pr_info("%s\n", dlb2_driver_copyright);
> >
> > Just remove the pr_info completely.
>
> Can you elaborate? Printing the driver name/copyright/etc. seems to be a common pattern in upstream drivers.

Most drivers don't do it, and it's generally not recommended. You can
print a message when something goes wrong, but most users don't
care about that stuff and it clutters up the kernel log if each driver
prints a line or two.

     Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-17 18:56       ` Arnd Bergmann
@ 2020-07-17 20:05         ` Eads, Gage
  2020-07-18  6:48           ` gregkh
  0 siblings, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-07-17 20:05 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, gregkh, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Arnd Bergmann <arnd@arndb.de>
> Sent: Friday, July 17, 2020 1:57 PM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; gregkh <gregkh@linuxfoundation.org>;
> Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> 
> On Fri, Jul 17, 2020 at 8:19 PM Eads, Gage <gage.eads@intel.com> wrote:
> 
> > > A plain copy_from_user() in place of this function should be fine.
> >
> > This function also validates the user size arg to prevent buffer overflow;
> > centralizing it here avoids the case where a programmer accidentally forgets
> > the check in an ioctl handler (and reduces code duplication). If it's alright
> > with you, I'll keep the function but drop the dev_err() prints.
> 
> Once you use a 'switch(cmd)' statement in the top ioctl handler, the data
> structure size will be fixed, so there is no way the argument size can go wrong.
> 

Ah, understood. Will fix in v2.

> > >
> > > > +/* [7:0]: device revision, [15:8]: device version */
> > > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > > +
> > > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > > +                                        unsigned long user_arg,
> > > > +                                        u16 size)
> > > > +{
> > > > +       struct dlb2_get_device_version_args arg;
> > > > +       struct dlb2_cmd_response response;
> > > > +       int ret;
> > > > +
> > > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > > +
> > > > +       response.status = 0;
> > > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > > +
> > > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > >
> > > Better avoid any indirect pointers. As you always return a constant
> > > here, I think the entire ioctl command can be removed until you
> > > actually need it. If you have an ioctl command that needs both input
> > > and output, use _IOWR() to define it and put all arguments into the same
> > > structure.
> >
> > Ok, I'll merge the response structure into the ioctl structure (here and
> > elsewhere).
> >
> > Say I add this command later: without driver versioning, how would
> > user-space know in advance whether the command is supported?
> > It could attempt the command and interpret -ENOTTY as "unsupported",
> > but that strikes me as an inelegant way to reverse-engineer the version.
> 
> There is not really a driver "version" once the driver is upstream, the concept
> doesn't really make sense here when arbitrary patches can get backported
> from the latest kernel into whatever the user is running.
> 

"Driver interface version" is the better term for what I'm trying to accomplish here. Any backports would have to be done in such a way that the interface version is honored, but if that can't be reasonably expected...then I agree, versioning is unworkable.

> The ENOTTY check is indeed the normal way that user space deals with
> interfaces that may have been added later. What you normally want is to
> keep using the original interfaces anyway, unless you absolutely need a later
> revision for a certain feature, and in that case the user space program will fail
> no matter what.
> 
> > > This function can also be removed then, just call the dispatcher directly.
> > > >         int err;
> > > >
> > > > -       pr_info("%s\n", dlb2_driver_name);
> > > > +       pr_info("%s - version %d.%d.%d\n", dlb2_driver_name,
> > > > +               DLB2_VERSION_MAJOR_NUMBER,
> > > > +               DLB2_VERSION_MINOR_NUMBER,
> > > > +               DLB2_VERSION_REVISION_NUMBER);
> > > >         pr_info("%s\n", dlb2_driver_copyright);
> > >
> > > Just remove the pr_info completely.
> >
> > Can you elaborate? Printing the driver name/copyright/etc. seems to be a
> > common pattern in upstream drivers.
> 
> Most drivers don't do it, and it's generally not recommended. You can print a
> message when something goes wrong, but most users don't care about that
> stuff and it clutters up the kernel log if each driver prints a line or two.
> 

Fair enough. I'll remove it.

Thanks,
Gage

>      Arnd

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-17 18:19     ` Eads, Gage
@ 2020-07-18  6:46       ` Greg KH
  0 siblings, 0 replies; 46+ messages in thread
From: Greg KH @ 2020-07-18  6:46 UTC (permalink / raw)
  To: Eads, Gage; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn

On Fri, Jul 17, 2020 at 06:19:14PM +0000, Eads, Gage wrote:
> 
> 
> > -----Original Message-----
> > From: Greg KH <gregkh@linuxfoundation.org>
> > Sent: Sunday, July 12, 2020 10:57 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > 
> > On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > > +config INTEL_DLB2
> > > +       tristate "Intel(R) Dynamic Load Balancer 2.0 Driver"
> > > +       depends on 64BIT && PCI && X86
> > 
> > Why just that platform?  What about COMPILE_TEST for everything else?
> 
> This device will only appear on an x86 platform. CONFIG_COMPILE_TEST won't work, since the driver uses the x86-only function iosubmit_cmds512().

Please wrap your lines correctly...

Anyway, there is no config option for that function that you can trigger
off of?

> > > +       help
> > > +         This driver supports the Intel(R) Dynamic Load Balancer 2.0 (DLB 2.0)
> > > +         device.
> > 
> > Are you sure you need the (R) in Kconfig texts everywhere?
> 
> The second is probably overkill. Just the first one is required.

Really?  I would just drop it.  Unless you get a signed-off-by from a
lawyer saying it is required :)

> > And a bit more info here would be nice, as no one knows if they have this or
> > not, right?
> 
> Intel hasn't yet announced more information that I can include here. For now, "lspci -d 8086:2710" will tell the user if this device is present.

That's fine, but we can't take a one-sentence help text; that means
nothing.

> > > --- /dev/null
> > > +++ b/drivers/misc/dlb2/dlb2_hw_types.h
> > > @@ -0,0 +1,29 @@
> > > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
> > 
> > Why dual licensed?  I thought that Intel told me they were not going to do
> > that anymore for any kernel code going forward as it was just such a pain and
> > never actually helped anything.  Has that changed?
> > 
> 
> The driver is mostly GPLv2-only, but a subset constitutes a "hardware access library" that is almost completely OS-independent. "almost" because it has calls to non-GPL symbols like kmalloc() and kfree(). This dual-licensed portion can be ported to other environments that need the more permissive BSD license.

Then put that "OS independent" part in a separate file, with a separate
license.  You all know how to do this properly, don't mix this stuff up.

But even then, I would drop such a library as that's not going to make a
good Linux driver, we do not like, or need, such things in the kernel.

> For the broader policy question, Intel's open source team will get back to you on this.

Wonderful, when will that happen?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-17 18:18     ` Eads, Gage
@ 2020-07-18  6:46       ` Greg KH
  2020-07-20 19:02         ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: Greg KH @ 2020-07-18  6:46 UTC (permalink / raw)
  To: Eads, Gage; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn

On Fri, Jul 17, 2020 at 06:18:46PM +0000, Eads, Gage wrote:
> 
> 
> > -----Original Message-----
> > From: Greg KH <gregkh@linuxfoundation.org>
> > Sent: Sunday, July 12, 2020 10:58 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > 
> > On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > > +static int dlb2_probe(struct pci_dev *pdev,
> > > +		      const struct pci_device_id *pdev_id)
> > > +{
> > > +	struct dlb2_dev *dlb2_dev;
> > > +	int ret;
> > > +
> > > +	dev_dbg(&pdev->dev, "probe\n");
> > 
> > ftrace is your friend.  Remove all of your debugging code now, you don't need
> > it anymore, especially for stuff like this where you didn't even need it in the
> > first place :(
> 
> I'll remove this and other similar dev_dbg() calls. This was an oversight on my part.
> 
> I have other instances that a kprobe can't easily replace, such as printing structure contents, that are useful for tracing the usage of the driver. It looks like other misc drivers use dev_dbg() similarly -- do you consider this an acceptable use of a debug print?

Why can't a kernel tracepoint print a structure?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-17 20:05         ` Eads, Gage
@ 2020-07-18  6:48           ` gregkh
  0 siblings, 0 replies; 46+ messages in thread
From: gregkh @ 2020-07-18  6:48 UTC (permalink / raw)
  To: Eads, Gage; +Cc: Arnd Bergmann, linux-kernel, Karlsson, Magnus, Topel, Bjorn

On Fri, Jul 17, 2020 at 08:05:08PM +0000, Eads, Gage wrote:
> 
> 
> > -----Original Message-----
> > From: Arnd Bergmann <arnd@arndb.de>
> > Sent: Friday, July 17, 2020 1:57 PM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: linux-kernel@vger.kernel.org; gregkh <gregkh@linuxfoundation.org>;
> > Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> > <bjorn.topel@intel.com>
> > Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> > 
> > On Fri, Jul 17, 2020 at 8:19 PM Eads, Gage <gage.eads@intel.com> wrote:
> > 
> > > > A plain copy_from_user() in place of this function should be fine.
> > >
> > > This function also validates the user size arg to prevent buffer overflow;
> > centralizing it here avoids the case where a programmer accidentally forgets
> > the check in an ioctl handler (and reduces code duplication). If it's alright with
> > you, I'll keep the function but drop the dev_err() prints.
> > 
> > Once you use a 'switch(cmd)' statement in the top ioctl handler, the data
> > structure size will be fixed, so there is no way the argument size can go wrong.
> > 
> 
> Ah, understood. Will fix in v2.
> 
> > > >
> > > > > +/* [7:0]: device revision, [15:8]: device version */
> > > > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > > > +
> > > > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > > > +                                        unsigned long user_arg,
> > > > > +                                        u16 size)
> > > > > +{
> > > > > +       struct dlb2_get_device_version_args arg;
> > > > > +       struct dlb2_cmd_response response;
> > > > > +       int ret;
> > > > > +
> > > > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > > > +
> > > > > +       response.status = 0;
> > > > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > > > +
> > > > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > > >
> > > > Better avoid any indirect pointers. As you always return a constant
> > > > here, I think the entire ioctl command can be removed until you
> > > > actually need it. If you have an ioctl command that needs both input
> > > > and output, use _IOWR() to define it and put all arguments into the same
> > structure.
> > >
> > > Ok, I'll merge the response structure into the ioctl structure (here and
> > elsewhere).
> > >
> > > Say I add this command later: without driver versioning, how would
> > > user-space know in advance whether the command is supported?
> > > It could attempt the command and interpret -ENOTTY as "unsupported",
> > > but that strikes me as an inelegant way to reverse-engineer the version.
> > 
> > There is not really a driver "version" once the driver is upstream, the concept
> > doesn't really make sense here when arbitrary patches can get backported
> > from the latest kernel into whatever the user is running.
> > 
> 
> "Driver interface version" is the better term for what I'm trying to accomplish here. Any backports would have to be done in such a way that the interface version is honored, but if that can't be reasonably expected...then I agree, versioning is unworkable.

There is no such thing as a "driver interface version", sorry, that is
not going to be workable at all.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-18  6:46       ` Greg KH
@ 2020-07-20 19:02         ` Eads, Gage
  2020-07-20 19:19           ` Greg KH
  0 siblings, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-07-20 19:02 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Saturday, July 18, 2020 1:47 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> 
> On Fri, Jul 17, 2020 at 06:18:46PM +0000, Eads, Gage wrote:
> >
> >
> > > -----Original Message-----
> > > From: Greg KH <gregkh@linuxfoundation.org>
> > > Sent: Sunday, July 12, 2020 10:58 AM
> > > To: Eads, Gage <gage.eads@intel.com>
> > > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > >
> > > On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > > > +static int dlb2_probe(struct pci_dev *pdev,
> > > > +		      const struct pci_device_id *pdev_id)
> > > > +{
> > > > +	struct dlb2_dev *dlb2_dev;
> > > > +	int ret;
> > > > +
> > > > +	dev_dbg(&pdev->dev, "probe\n");
> > >
> > > ftrace is your friend.  Remove all of your debugging code now, you don't
> > > need it anymore, especially for stuff like this where you didn't even
> > > need it in the first place :(
> >
> > I'll remove this and other similar dev_dbg() calls. This was an oversight
> > on my part.
> >
> > I have other instances that a kprobe can't easily replace, such as printing
> > structure contents, that are useful for tracing the usage of the driver. It
> > looks like other misc drivers use dev_dbg() similarly -- do you consider
> > this an acceptable use of a debug print?
> 
> Why can't a kernel tracepoint print a structure?

I meant the command-line installed kprobes[1], but instrumenting the driver is
certainly an option. We don't require the much lower overhead of a tracepoint,
so I didn't choose one. This driver handles the (performance-insensitive)
device configuration, while the fast-path operations take place in user-space.

Another reason is that the "hardware access library" files use only non-GPL
external symbols, and some tracepoint functions are exported GPL. Though it's
probably feasible to lift that tracing code up into a (GPLv2-only) caller
function.
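
A rough sketch of what that lift could look like (the event name and field
are made up; TRACE_INCLUDE_PATH setup is omitted):

/* dlb2_trace.h -- GPL-2.0-only, never included from the library files */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM dlb2

#if !defined(_DLB2_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
#define _DLB2_TRACE_H

#include <linux/tracepoint.h>

TRACE_EVENT(dlb2_domain_create,
	TP_PROTO(int domain_id),
	TP_ARGS(domain_id),
	TP_STRUCT__entry(__field(int, domain_id)),
	TP_fast_assign(__entry->domain_id = domain_id;),
	TP_printk("domain=%d", __entry->domain_id)
);

#endif /* _DLB2_TRACE_H */
#include <trace/define_trace.h>

One GPL-2.0-only .c file would define CREATE_TRACE_POINTS before including
the header, and the trace_dlb2_domain_create() calls would be made from the
GPLv2-only callers rather than from the library files.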

But if tracepoints are the preferred method and/or you think the driver would
benefit, I'll make the change.

Thanks,
Gage

[1] https://www.kernel.org/doc/html/latest/trace/kprobetrace.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-20 19:02         ` Eads, Gage
@ 2020-07-20 19:19           ` Greg KH
  2020-07-24 21:00             ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: Greg KH @ 2020-07-20 19:19 UTC (permalink / raw)
  To: Eads, Gage; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn

On Mon, Jul 20, 2020 at 07:02:05PM +0000, Eads, Gage wrote:
> 
> 
> > -----Original Message-----
> > From: Greg KH <gregkh@linuxfoundation.org>
> > Sent: Saturday, July 18, 2020 1:47 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > 
> > On Fri, Jul 17, 2020 at 06:18:46PM +0000, Eads, Gage wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Greg KH <gregkh@linuxfoundation.org>
> > > > Sent: Sunday, July 12, 2020 10:58 AM
> > > > To: Eads, Gage <gage.eads@intel.com>
> > > > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > > > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > > > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > > >
> > > > On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > > > > +static int dlb2_probe(struct pci_dev *pdev,
> > > > > +		      const struct pci_device_id *pdev_id)
> > > > > +{
> > > > > +	struct dlb2_dev *dlb2_dev;
> > > > > +	int ret;
> > > > > +
> > > > > +	dev_dbg(&pdev->dev, "probe\n");
> > > >
> > > > ftrace is your friend.  Remove all of your debugging code now, you don't
> > need
> > > > it anymore, especially for stuff like this where you didn't even need it in
> > the
> > > > first place :(
> > >
> > > I'll remove this and other similar dev_dbg() calls. This was an
> > > oversight on my part.
> > >
> > > I have other instances that a kprobe can't easily replace, such as
> > > printing structure contents, that are useful for tracing the usage of the
> > > driver. It looks like other misc drivers use dev_dbg() similarly -- do
> > > you consider this an acceptable use of a debug print?
> > 
> > Why can't a kernel tracepoint print a structure?
> 
> I meant the command-line installed kprobes[1], but instrumenting the driver is
> certainly an option. We don't require the much lower overhead of a tracepoint,
> so I didn't choose one. This driver handles the (performance-insensitive)
> device configuration, while the fast-path operations take place in user-space.
> 
> Another reason is that the "hardware access library" files use only non-GPL
> external symbols, and some tracepoint functions are exported GPL. Though it's
> probably feasible to lift that tracing code up into a (GPLv2-only) caller
> function.

Stop going through crazy gyrations for something that your own legal
team has told you not to do anymore in the first place.

No "hardware access library" files please, that's not how Linux drivers
are written.

you all know better...

> But if tracepoints are the preferred method and/or you think the driver would
> benefit, I'll make the change.

I don't think you need any of that stuff, now that the code works
properly, right?

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
  2020-07-20 19:19           ` Greg KH
@ 2020-07-24 21:00             ` Eads, Gage
  0 siblings, 0 replies; 46+ messages in thread
From: Eads, Gage @ 2020-07-24 21:00 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, arnd, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: Greg KH <gregkh@linuxfoundation.org>
> Sent: Monday, July 20, 2020 2:19 PM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> 
> On Mon, Jul 20, 2020 at 07:02:05PM +0000, Eads, Gage wrote:
> >
> >
> > > -----Original Message-----
> > > From: Greg KH <gregkh@linuxfoundation.org>
> > > Sent: Saturday, July 18, 2020 1:47 AM
> > > To: Eads, Gage <gage.eads@intel.com>
> > > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > >
> > > On Fri, Jul 17, 2020 at 06:18:46PM +0000, Eads, Gage wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Greg KH <gregkh@linuxfoundation.org>
> > > > > Sent: Sunday, July 12, 2020 10:58 AM
> > > > > To: Eads, Gage <gage.eads@intel.com>
> > > > > Cc: linux-kernel@vger.kernel.org; arnd@arndb.de; Karlsson, Magnus
> > > > > <magnus.karlsson@intel.com>; Topel, Bjorn <bjorn.topel@intel.com>
> > > > > Subject: Re: [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver
> > > > >
> > > > > On Sun, Jul 12, 2020 at 08:43:12AM -0500, Gage Eads wrote:
> > > > > > +static int dlb2_probe(struct pci_dev *pdev,
> > > > > > +		      const struct pci_device_id *pdev_id)
> > > > > > +{
> > > > > > +	struct dlb2_dev *dlb2_dev;
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	dev_dbg(&pdev->dev, "probe\n");
> > > > >
> > > > > ftrace is your friend.  Remove all of your debugging code now, you
> > > > > don't need it anymore, especially for stuff like this where you
> > > > > didn't even need it in the first place :(
> > > >
> > > > I'll remove this and other similar dev_dbg() calls. This was an
> > > > oversight on my part.
> > > >
> > > > I have other instances that a kprobe can't easily replace, such as
> > > > printing structure contents, that are useful for tracing the usage of
> > > > the driver. It looks like other misc drivers use dev_dbg() similarly --
> > > > do you consider this an acceptable use of a debug print?
> > >
> > > Why can't a kernel tracepoint print a structure?
> >
> > I meant the command-line installed kprobes[1], but instrumenting the
> > driver is certainly an option. We don't require the much lower overhead
> > of a tracepoint, so I didn't choose one. This driver handles the
> > (performance-insensitive) device configuration, while the fast-path
> > operations take place in user-space.
> >
> > Another reason is that the "hardware access library" files use only
> > non-GPL external symbols, and some tracepoint functions are exported GPL.
> > Though it's probably feasible to lift that tracing code up into a
> > (GPLv2-only) caller function.
> 
> Stop going through crazy gyrations for something that your own legal
> team has told you not to do anymore in the first place.
> 
> No "hardware access library" files please, that's not how Linux drivers
> are written.
> 
> you all know better...
> 
> > But if tracepoints are the preferred method and/or you think the driver
> would
> > benefit, I'll make the change.
> 
> I don't think you need any of that stuff, now that the code works
> properly, right?

There are no known issues, correct. The logging (whether it's
dev_dbg/tracepoints/etc.) would be for user-space developers -- visibility into
the driver could help them debug issues in their own code.

It's hardly a critical feature; I'm happy to change or remove it if necessary.
But it could be helpful, isn't a maintenance burden or performance hindrance,
and (AFAICT) shouldn't pose any security risks.

Thanks,
Gage

> 
> greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-07-12 15:28   ` Arnd Bergmann
  2020-07-17 18:19     ` Eads, Gage
@ 2020-08-04 22:20     ` Eads, Gage
  2020-08-05  6:46       ` gregkh
  1 sibling, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-08-04 22:20 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, gregkh, Karlsson, Magnus, Topel, Bjorn

> > +/* [7:0]: device revision, [15:8]: device version */
> > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > +
> > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > +                                        unsigned long user_arg,
> > +                                        u16 size)
> > +{
> > +       struct dlb2_get_device_version_args arg;
> > +       struct dlb2_cmd_response response;
> > +       int ret;
> > +
> > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > +
> > +       response.status = 0;
> > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > +
> > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> 
> Better avoid any indirect pointers. As you always return a constant
> here, I think the entire ioctl command can be removed until you
> actually need it. If you have an ioctl command that needs both
> input and output, use _IOWR() to define it and put all arguments
> into the same structure.

I should've caught this in my earlier response, sorry. The device version
command is intentionally the first in the user interface enum. My
goal is for all device versions (e.g. DLB 1.0 in the future) to be accessible
through a /dev/dlb%d node. To allow this, all drivers would support the same
device-version command as command 0, then the subsequent commands can be
tailored to that particular device. User-space would query the version first
to determine which set of ioctl commands it needs to use.

So even though the response is constant (for now), it must occupy command 0 for
this design to work.
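
In user-space terms, the flow I have in mind is roughly the following
sketch (it assumes a merged _IOWR() version struct as discussed earlier in
the thread; the DLB 1.0 branch is hypothetical):

#include <err.h>
#include <fcntl.h>
#include <sys/ioctl.h>

static int dlb2_open_and_dispatch(void)
{
	struct dlb2_get_device_version_args args = { 0 };
	int fd = open("/dev/dlb0", O_RDWR);

	if (fd < 0)
		err(1, "open");

	/* command 0 is common to every device version */
	if (ioctl(fd, DLB2_IOC_GET_DEVICE_VERSION, &args) < 0)
		err(1, "get device version");

	switch (args.version >> 8) { /* [15:8] per DLB2_SET_DEVICE_VERSION() */
	case 2:
		/* use the DLB 2.0 ioctl set */
		break;
	case 1:
		/* use a (future) DLB 1.0 ioctl set */
		break;
	default:
		errx(1, "unknown device version");
	}

	return fd;
}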

Thanks,
Gage

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-08-04 22:20     ` Eads, Gage
@ 2020-08-05  6:46       ` gregkh
  2020-08-05 15:07         ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: gregkh @ 2020-08-05  6:46 UTC (permalink / raw)
  To: Eads, Gage; +Cc: Arnd Bergmann, linux-kernel, Karlsson, Magnus, Topel, Bjorn

On Tue, Aug 04, 2020 at 10:20:47PM +0000, Eads, Gage wrote:
> > > +/* [7:0]: device revision, [15:8]: device version */
> > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > +
> > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > +                                        unsigned long user_arg,
> > > +                                        u16 size)
> > > +{
> > > +       struct dlb2_get_device_version_args arg;
> > > +       struct dlb2_cmd_response response;
> > > +       int ret;
> > > +
> > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > +
> > > +       response.status = 0;
> > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > +
> > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > 
> > Better avoid any indirect pointers. As you always return a constant
> > here, I think the entire ioctl command can be removed until you
> > actually need it. If you have an ioctl command that needs both
> > input and output, use _IOWR() to define it and put all arguments
> > into the same structure.
> 
> I should've caught this in my earlier response, sorry. The device version
> command is intentionally the first in the user interface enum. My
> goal is for all device versions (e.g. DLB 1.0 in the future) to be accessible
> through a /dev/dlb%d node. To allow this, all drivers would support the same
> device-version command as command 0, then the subsequent commands can be
> tailored to that particular device. User-space would query the version first
> to determine which set of ioctl commands it needs to use.
> 
> So even though the response is constant (for now), it must occupy command 0 for
> this design to work.

"versions" for ioctls just do not work, please don't go down that path,
they should not be needed.  See the many different discussions about
this topic on lkml for other subsystem submissions if you are curious.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-08-05  6:46       ` gregkh
@ 2020-08-05 15:07         ` Eads, Gage
  2020-08-05 15:17           ` gregkh
  0 siblings, 1 reply; 46+ messages in thread
From: Eads, Gage @ 2020-08-05 15:07 UTC (permalink / raw)
  To: gregkh; +Cc: Arnd Bergmann, linux-kernel, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: gregkh <gregkh@linuxfoundation.org>
> Sent: Wednesday, August 5, 2020 1:46 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>; linux-kernel@vger.kernel.org;
> Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> 
> On Tue, Aug 04, 2020 at 10:20:47PM +0000, Eads, Gage wrote:
> > > > +/* [7:0]: device revision, [15:8]: device version */
> > > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > > +
> > > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > > +                                        unsigned long user_arg,
> > > > +                                        u16 size)
> > > > +{
> > > > +       struct dlb2_get_device_version_args arg;
> > > > +       struct dlb2_cmd_response response;
> > > > +       int ret;
> > > > +
> > > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > > +
> > > > +       response.status = 0;
> > > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > > +
> > > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > >
> > > Better avoid any indirect pointers. As you always return a constant
> > > here, I think the entire ioctl command can be removed until you
> > > actually need it. If you have an ioctl command that needs both
> > > input and output, use _IOWR() to define it and put all arguments
> > > into the same structure.
> >
> > I should've caught this in my earlier response, sorry. The device version
> > command is intentionally the first in the user interface enum. My
> > goal is for all device versions (e.g. DLB 1.0 in the future) to be accessible
> > through a /dev/dlb%d node. To allow this, all drivers would support the
> > same device-version command as command 0, then the subsequent commands
> > can be tailored to that particular device. User-space would query the
> > version first to determine which set of ioctl commands it needs to use.
> >
> > So even though the response is constant (for now), it must occupy
> > command 0 for this design to work.
> 
> "versions" for ioctls just do not work, please don't go down that path,
> they should not be needed.  See the many different discussions about
> this topic on lkml for other subsystem submissions if you are curious.
> 

This approach is based on VFIO's modular ioctl design, which has a different
API for Type1 vs. SPAPR IOMMUs. Similarly a DLB driver could have a different
API for each device version (but each API would be fixed, not versioned). I
didn't see any concerns on lkml over VFIO when it was originally submitted -- though
that was 8 years ago, perhaps the community's feelings have changed since then.
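
For reference, the VFIO probe sequence looks roughly like this (a fragment;
see include/uapi/linux/vfio.h for the real definitions):

#include <fcntl.h>
#include <linux/vfio.h>
#include <sys/ioctl.h>

static int vfio_probe(void)
{
	int container = open("/dev/vfio/vfio", O_RDWR);

	if (container < 0)
		return -1;

	/* the API itself is fixed; extensions are probed, not versioned */
	if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
		return -1;

	if (ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU))
		return container; /* use the Type1 ioctl set */

	return -1;
}

The device-version command would play the role that VFIO_CHECK_EXTENSION
plays here.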

Thanks,
Gage

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-08-05 15:07         ` Eads, Gage
@ 2020-08-05 15:17           ` gregkh
  2020-08-05 15:39             ` Eads, Gage
  0 siblings, 1 reply; 46+ messages in thread
From: gregkh @ 2020-08-05 15:17 UTC (permalink / raw)
  To: Eads, Gage; +Cc: Arnd Bergmann, linux-kernel, Karlsson, Magnus, Topel, Bjorn

On Wed, Aug 05, 2020 at 03:07:50PM +0000, Eads, Gage wrote:
> 
> 
> > -----Original Message-----
> > From: gregkh <gregkh@linuxfoundation.org>
> > Sent: Wednesday, August 5, 2020 1:46 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>; linux-kernel@vger.kernel.org;
> > Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> > <bjorn.topel@intel.com>
> > Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> > 
> > On Tue, Aug 04, 2020 at 10:20:47PM +0000, Eads, Gage wrote:
> > > > > +/* [7:0]: device revision, [15:8]: device version */
> > > > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > > > +
> > > > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > > > +                                        unsigned long user_arg,
> > > > > +                                        u16 size)
> > > > > +{
> > > > > +       struct dlb2_get_device_version_args arg;
> > > > > +       struct dlb2_cmd_response response;
> > > > > +       int ret;
> > > > > +
> > > > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > > > +
> > > > > +       response.status = 0;
> > > > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > > > +
> > > > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > > >
> > > > Better avoid any indirect pointers. As you always return a constant
> > > > here, I think the entire ioctl command can be removed until you
> > > > actually need it. If you have an ioctl command that needs both
> > > > input and output, use _IOWR() to define it and put all arguments
> > > > into the same structure.
> > >
> > > I should've caught this in my earlier response, sorry. The device version
> > > command is intentionally the first in the user interface enum. My
> > > goal is for all device versions (e.g. DLB 1.0 in the future) to be accessible
> > > through a /dev/dlb%d node. To allow this, all drivers would support
> > > the same device-version command as command 0, then the subsequent
> > > commands can be tailored to that particular device. User-space would
> > > query the version first to determine which set of ioctl commands it
> > > needs to use.
> > >
> > > So even though the response is constant (for now), it must occupy
> > > command 0 for this design to work.
> > 
> > "versions" for ioctls just do not work, please don't go down that path,
> > they should not be needed.  See the many different discussions about
> > this topic on lkml for other subsystem submissions if you are curious.
> > 
> 
> This approach is based on VFIO's modular ioctl design, which has a different
> API for Type1 vs. SPAPR IOMMUs. Similarly a DLB driver could have a different
> API for each device version (but each API would be fixed, not versioned). I
> didn't see any concerns on lkml over VFIO when it was originally submitted -- though
> that was 8 years ago, perhaps the community's feelings have changed since then.

Fixed apis for device types is usually the better way to go.  See the
review comments on the nitro_enclaves driver submission a few weeks ago
for the full details.

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
  2020-08-05 15:17           ` gregkh
@ 2020-08-05 15:39             ` Eads, Gage
  0 siblings, 0 replies; 46+ messages in thread
From: Eads, Gage @ 2020-08-05 15:39 UTC (permalink / raw)
  To: gregkh; +Cc: Arnd Bergmann, linux-kernel, Karlsson, Magnus, Topel, Bjorn



> -----Original Message-----
> From: gregkh <gregkh@linuxfoundation.org>
> Sent: Wednesday, August 5, 2020 10:18 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>; linux-kernel@vger.kernel.org;
> Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> <bjorn.topel@intel.com>
> Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> 
> On Wed, Aug 05, 2020 at 03:07:50PM +0000, Eads, Gage wrote:
> >
> >
> > > -----Original Message-----
> > > From: gregkh <gregkh@linuxfoundation.org>
> > > Sent: Wednesday, August 5, 2020 1:46 AM
> > > To: Eads, Gage <gage.eads@intel.com>
> > > Cc: Arnd Bergmann <arnd@arndb.de>; linux-kernel@vger.kernel.org;
> > > Karlsson, Magnus <magnus.karlsson@intel.com>; Topel, Bjorn
> > > <bjorn.topel@intel.com>
> > > Subject: Re: [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls
> > >
> > > On Tue, Aug 04, 2020 at 10:20:47PM +0000, Eads, Gage wrote:
> > > > > > +/* [7:0]: device revision, [15:8]: device version */
> > > > > > +#define DLB2_SET_DEVICE_VERSION(ver, rev) (((ver) << 8) | (rev))
> > > > > > +
> > > > > > +static int dlb2_ioctl_get_device_version(struct dlb2_dev *dev,
> > > > > > +                                        unsigned long user_arg,
> > > > > > +                                        u16 size)
> > > > > > +{
> > > > > > +       struct dlb2_get_device_version_args arg;
> > > > > > +       struct dlb2_cmd_response response;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       dev_dbg(dev->dlb2_device, "Entering %s()\n", __func__);
> > > > > > +
> > > > > > +       response.status = 0;
> > > > > > +       response.id = DLB2_SET_DEVICE_VERSION(2, DLB2_REV_A0);
> > > > > > +
> > > > > > +       ret = dlb2_copy_from_user(dev, user_arg, size, &arg, sizeof(arg));
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       ret = dlb2_copy_resp_to_user(dev, arg.response, &response);
> > > > >
> > > > > Better avoid any indirect pointers. As you always return a constant
> > > > > here, I think the entire ioctl command can be removed until you
> > > > > actually need it. If you have an ioctl command that needs both
> > > > > input and output, use _IOWR() to define it and put all arguments
> > > > > into the same structure.
> > > >
> > > > I should've caught this in my earlier response, sorry. The device
> > > > version command is intentionally the first in the user interface
> > > > enum. My goal is for all device versions (e.g. DLB 1.0 in the future)
> > > > to be accessible through a /dev/dlb%d node. To allow this, all
> > > > drivers would support the same device-version command as command 0,
> > > > then the subsequent commands can be tailored to that particular
> > > > device. User-space would query the version first to determine which
> > > > set of ioctl commands it needs to use.
> > > >
> > > > So even though the response is constant (for now), it must occupy
> > > > command 0 for this design to work.
> > >
> > > "versions" for ioctls just do not work, please don't go down that path,
> > > they should not be needed.  See the many different discussions about
> > > this topic on lkml for other subsystem submissions if you are curious.
> > >
> >
> > This approach is based on VFIO's modular ioctl design, which has a
> > different API for Type1 vs. SPAPR IOMMUs. Similarly a DLB driver could
> > have a different API for each device version (but each API would be fixed,
> > not versioned). I didn't see any concerns on lkml over VFIO when it was
> > originally submitted -- though that was 8 years ago, perhaps the
> > community's feelings have changed since then.
> 
> Fixed apis for device types is usually the better way to go.  See the
> review comments on the nitro_enclaves driver submission a few weeks ago
> for the full details.

Thanks for the pointer -- I think we're on the same page regarding fixed APIs.
I'll clarify the documentation for this ioctl in v2 so it's clear that its
purpose is to support fixed APIs for different hardware versions/types.

Thanks,
Gage

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2020-08-05 20:15 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-12 13:43 [PATCH 00/20] dlb2: introduce DLB 2.0 device driver Gage Eads
2020-07-12 13:43 ` [PATCH 01/20] dlb2: add skeleton for DLB 2.0 driver Gage Eads
2020-07-12 15:56   ` Greg KH
2020-07-17 18:19     ` Eads, Gage
2020-07-18  6:46       ` Greg KH
2020-07-12 15:58   ` Greg KH
2020-07-17 18:18     ` Eads, Gage
2020-07-18  6:46       ` Greg KH
2020-07-20 19:02         ` Eads, Gage
2020-07-20 19:19           ` Greg KH
2020-07-24 21:00             ` Eads, Gage
2020-07-12 13:43 ` [PATCH 02/20] dlb2: initialize PF device Gage Eads
2020-07-12 13:43 ` [PATCH 03/20] dlb2: add resource and device initialization Gage Eads
2020-07-12 13:43 ` [PATCH 04/20] dlb2: add device ioctl layer and first 4 ioctls Gage Eads
2020-07-12 14:42   ` Randy Dunlap
2020-07-17 18:20     ` Eads, Gage
2020-07-12 14:53   ` Randy Dunlap
2020-07-17 18:20     ` Eads, Gage
2020-07-12 15:28   ` Arnd Bergmann
2020-07-17 18:19     ` Eads, Gage
2020-07-17 18:56       ` Arnd Bergmann
2020-07-17 20:05         ` Eads, Gage
2020-07-18  6:48           ` gregkh
2020-08-04 22:20     ` Eads, Gage
2020-08-05  6:46       ` gregkh
2020-08-05 15:07         ` Eads, Gage
2020-08-05 15:17           ` gregkh
2020-08-05 15:39             ` Eads, Gage
2020-07-12 13:43 ` [PATCH 05/20] dlb2: add sched domain config and reset support Gage Eads
2020-07-12 13:43 ` [PATCH 06/20] dlb2: add ioctl to get sched domain fd Gage Eads
2020-07-12 13:43 ` [PATCH 07/20] dlb2: add runtime power-management support Gage Eads
2020-07-12 13:43 ` [PATCH 08/20] dlb2: add queue create and queue-depth-get ioctls Gage Eads
2020-07-12 13:43 ` [PATCH 09/20] dlb2: add ioctl to configure ports, query poll mode Gage Eads
2020-07-12 15:34   ` Arnd Bergmann
2020-07-17 18:19     ` Eads, Gage
2020-07-12 13:43 ` [PATCH 10/20] dlb2: add port mmap support Gage Eads
2020-07-12 13:43 ` [PATCH 11/20] dlb2: add start domain ioctl Gage Eads
2020-07-12 13:43 ` [PATCH 12/20] dlb2: add queue map and unmap ioctls Gage Eads
2020-07-12 13:43 ` [PATCH 13/20] dlb2: add port enable/disable ioctls Gage Eads
2020-07-12 13:43 ` [PATCH 14/20] dlb2: add CQ interrupt support Gage Eads
2020-07-12 13:43 ` [PATCH 15/20] dlb2: add domain alert support Gage Eads
2020-07-12 13:43 ` [PATCH 16/20] dlb2: add sequence-number management ioctls Gage Eads
2020-07-12 13:43 ` [PATCH 17/20] dlb2: add cos bandwidth get/set ioctls Gage Eads
2020-07-12 13:43 ` [PATCH 18/20] dlb2: add device FLR support Gage Eads
2020-07-12 13:43 ` [PATCH 19/20] dlb2: add basic PF sysfs interfaces Gage Eads
2020-07-12 13:43 ` [PATCH 20/20] dlb2: add ingress error handling Gage Eads
