All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] IB/hfi1: Clean up cdevs, convert write to ioctl, and destage driver
@ 2016-05-19 12:25 Dennis Dalessandro
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:25 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

The purpose of this patch series is to get the hfi1 driver out of staging and
into the drivers/infiniband mainline. We have made significant changes beyond
what was originally requested in the TODO list. The biggest areas of contention
have simply been removed: snoop/capture, eprom, ui and the write()/writev()
problem has been solved. We will be working with the community to get back the
removed functionality, particularly the snoop/capture.

This patch series combines the previously posted series which converts write()
to ioctl() [1] and the previously posted series to remove cdevs [2]. These have
been merged into a single series to have a more logical order and aid in review.
For instance there is no change to convert eprom to ioctl then remove it. It is
just removed.

Other changes in this series which were not included in [1] and [2] include:

* Dropped sysfs version file
* Added a new ioctl to get the sw version
* Remove ioctl access OK helper function and rely on get_user and friends
* Adds kref object accounting for opening/closing the cdev to prevent parent
  structure from being removed.

This patch series should apply on Doug's k.o/for-4.7 branch.

Patches can also be viewed in my repo at:
https://github.com/ddalessa/kernel/tree/for-4.7

For history on the two patch sets please see:
[1] http://marc.info/?l=linux-rdma&m=146307351401595&w=2
[2] http://marc.info/?l=linux-rdma&m=146307407201814&w=2

---

Dennis Dalessandro (10):
      IB/hfi1: Remove multiple device cdev
      IB/hfi1: Remove UI char device
      IB/hfi1: Remove EPROM functionality from data device
      IB/hfi1: Remove snoop/diag interface
      IB/hfi1: Remove unused user command
      IB/hfi1: Add ioctl() interface for user commands
      IB/hfi1: Remove write(), use ioctl() for user cmds
      IB/hfi1: Add trace message in user IOCTL handling
      IB/hfi1: Do not free hfi1 cdev parent structure early
      IB/hfi1: Move driver out of staging


 MAINTAINERS                                 |   13 
 drivers/infiniband/Kconfig                  |    2 
 drivers/infiniband/hw/Makefile              |    1 
 drivers/infiniband/hw/hfi1/Kconfig          |    0 
 drivers/infiniband/hw/hfi1/Makefile         |    2 
 drivers/infiniband/hw/hfi1/affinity.c       |    0 
 drivers/infiniband/hw/hfi1/affinity.h       |    0 
 drivers/infiniband/hw/hfi1/aspm.h           |    0 
 drivers/infiniband/hw/hfi1/chip.c           |    0 
 drivers/infiniband/hw/hfi1/chip.h           |    0 
 drivers/infiniband/hw/hfi1/chip_registers.h |    0 
 drivers/infiniband/hw/hfi1/common.h         |    5 
 drivers/infiniband/hw/hfi1/debugfs.c        |    0 
 drivers/infiniband/hw/hfi1/debugfs.h        |    0 
 drivers/infiniband/hw/hfi1/device.c         |    4 
 drivers/infiniband/hw/hfi1/device.h         |    3 
 drivers/infiniband/hw/hfi1/dma.c            |    0 
 drivers/infiniband/hw/hfi1/driver.c         |    0 
 drivers/infiniband/hw/hfi1/efivar.c         |    0 
 drivers/infiniband/hw/hfi1/efivar.h         |    0 
 drivers/infiniband/hw/hfi1/eprom.c          |   84 +
 drivers/infiniband/hw/hfi1/eprom.h          |    0 
 drivers/infiniband/hw/hfi1/file_ops.c       |  549 ++------
 drivers/infiniband/hw/hfi1/firmware.c       |    0 
 drivers/infiniband/hw/hfi1/hfi.h            |    1 
 drivers/infiniband/hw/hfi1/init.c           |   14 
 drivers/infiniband/hw/hfi1/intr.c           |    0 
 drivers/infiniband/hw/hfi1/iowait.h         |    0 
 drivers/infiniband/hw/hfi1/mad.c            |    0 
 drivers/infiniband/hw/hfi1/mad.h            |    0 
 drivers/infiniband/hw/hfi1/mmu_rb.c         |    0 
 drivers/infiniband/hw/hfi1/mmu_rb.h         |    0 
 drivers/infiniband/hw/hfi1/opa_compat.h     |    0 
 drivers/infiniband/hw/hfi1/pcie.c           |    0 
 drivers/infiniband/hw/hfi1/pio.c            |    0 
 drivers/infiniband/hw/hfi1/pio.h            |    0 
 drivers/infiniband/hw/hfi1/pio_copy.c       |    0 
 drivers/infiniband/hw/hfi1/platform.c       |    0 
 drivers/infiniband/hw/hfi1/platform.h       |    0 
 drivers/infiniband/hw/hfi1/qp.c             |    0 
 drivers/infiniband/hw/hfi1/qp.h             |    0 
 drivers/infiniband/hw/hfi1/qsfp.c           |    0 
 drivers/infiniband/hw/hfi1/qsfp.h           |    0 
 drivers/infiniband/hw/hfi1/rc.c             |    0 
 drivers/infiniband/hw/hfi1/ruc.c            |    0 
 drivers/infiniband/hw/hfi1/sdma.c           |    0 
 drivers/infiniband/hw/hfi1/sdma.h           |    0 
 drivers/infiniband/hw/hfi1/sdma_txreq.h     |    0 
 drivers/infiniband/hw/hfi1/sysfs.c          |    0 
 drivers/infiniband/hw/hfi1/trace.c          |    1 
 drivers/infiniband/hw/hfi1/trace.h          |    1 
 drivers/infiniband/hw/hfi1/twsi.c           |    0 
 drivers/infiniband/hw/hfi1/twsi.h           |    0 
 drivers/infiniband/hw/hfi1/uc.c             |    0 
 drivers/infiniband/hw/hfi1/ud.c             |    0 
 drivers/infiniband/hw/hfi1/user_exp_rcv.c   |    0 
 drivers/infiniband/hw/hfi1/user_exp_rcv.h   |    0 
 drivers/infiniband/hw/hfi1/user_pages.c     |    0 
 drivers/infiniband/hw/hfi1/user_sdma.c      |    0 
 drivers/infiniband/hw/hfi1/user_sdma.h      |    0 
 drivers/infiniband/hw/hfi1/verbs.c          |    0 
 drivers/infiniband/hw/hfi1/verbs.h          |    0 
 drivers/infiniband/hw/hfi1/verbs_txreq.c    |    0 
 drivers/infiniband/hw/hfi1/verbs_txreq.h    |    0 
 drivers/staging/rdma/Kconfig                |    2 
 drivers/staging/rdma/Makefile               |    1 
 drivers/staging/rdma/hfi1/TODO              |    6 
 drivers/staging/rdma/hfi1/diag.c            | 1925 ---------------------------
 drivers/staging/rdma/hfi1/eprom.c           |  471 -------
 include/uapi/rdma/hfi/hfi1_user.h           |   75 +
 70 files changed, 267 insertions(+), 2893 deletions(-)
 rename drivers/{staging/rdma/hfi1/Kconfig => infiniband/hw/hfi1/Kconfig} (100%)
 rename drivers/{staging/rdma/hfi1/Makefile => infiniband/hw/hfi1/Makefile} (88%)
 rename drivers/{staging/rdma/hfi1/affinity.c => infiniband/hw/hfi1/affinity.c} (100%)
 rename drivers/{staging/rdma/hfi1/affinity.h => infiniband/hw/hfi1/affinity.h} (100%)
 rename drivers/{staging/rdma/hfi1/aspm.h => infiniband/hw/hfi1/aspm.h} (100%)
 rename drivers/{staging/rdma/hfi1/chip.c => infiniband/hw/hfi1/chip.c} (100%)
 rename drivers/{staging/rdma/hfi1/chip.h => infiniband/hw/hfi1/chip.h} (100%)
 rename drivers/{staging/rdma/hfi1/chip_registers.h => infiniband/hw/hfi1/chip_registers.h} (100%)
 rename drivers/{staging/rdma/hfi1/common.h => infiniband/hw/hfi1/common.h} (98%)
 rename drivers/{staging/rdma/hfi1/debugfs.c => infiniband/hw/hfi1/debugfs.c} (100%)
 rename drivers/{staging/rdma/hfi1/debugfs.h => infiniband/hw/hfi1/debugfs.h} (100%)
 rename drivers/{staging/rdma/hfi1/device.c => infiniband/hw/hfi1/device.c} (98%)
 rename drivers/{staging/rdma/hfi1/device.h => infiniband/hw/hfi1/device.h} (97%)
 rename drivers/{staging/rdma/hfi1/dma.c => infiniband/hw/hfi1/dma.c} (100%)
 rename drivers/{staging/rdma/hfi1/driver.c => infiniband/hw/hfi1/driver.c} (100%)
 rename drivers/{staging/rdma/hfi1/efivar.c => infiniband/hw/hfi1/efivar.c} (100%)
 rename drivers/{staging/rdma/hfi1/efivar.h => infiniband/hw/hfi1/efivar.h} (100%)
 copy drivers/{staging/rdma/hfi1/user_sdma.h => infiniband/hw/hfi1/eprom.c} (61%)
 rename drivers/{staging/rdma/hfi1/eprom.h => infiniband/hw/hfi1/eprom.h} (100%)
 rename drivers/{staging/rdma/hfi1/file_ops.c => infiniband/hw/hfi1/file_ops.c} (78%)
 rename drivers/{staging/rdma/hfi1/firmware.c => infiniband/hw/hfi1/firmware.c} (100%)
 rename drivers/{staging/rdma/hfi1/hfi.h => infiniband/hw/hfi1/hfi.h} (99%)
 rename drivers/{staging/rdma/hfi1/init.c => infiniband/hw/hfi1/init.c} (99%)
 rename drivers/{staging/rdma/hfi1/intr.c => infiniband/hw/hfi1/intr.c} (100%)
 rename drivers/{staging/rdma/hfi1/iowait.h => infiniband/hw/hfi1/iowait.h} (100%)
 rename drivers/{staging/rdma/hfi1/mad.c => infiniband/hw/hfi1/mad.c} (100%)
 rename drivers/{staging/rdma/hfi1/mad.h => infiniband/hw/hfi1/mad.h} (100%)
 rename drivers/{staging/rdma/hfi1/mmu_rb.c => infiniband/hw/hfi1/mmu_rb.c} (100%)
 rename drivers/{staging/rdma/hfi1/mmu_rb.h => infiniband/hw/hfi1/mmu_rb.h} (100%)
 rename drivers/{staging/rdma/hfi1/opa_compat.h => infiniband/hw/hfi1/opa_compat.h} (100%)
 rename drivers/{staging/rdma/hfi1/pcie.c => infiniband/hw/hfi1/pcie.c} (100%)
 rename drivers/{staging/rdma/hfi1/pio.c => infiniband/hw/hfi1/pio.c} (100%)
 rename drivers/{staging/rdma/hfi1/pio.h => infiniband/hw/hfi1/pio.h} (100%)
 rename drivers/{staging/rdma/hfi1/pio_copy.c => infiniband/hw/hfi1/pio_copy.c} (100%)
 rename drivers/{staging/rdma/hfi1/platform.c => infiniband/hw/hfi1/platform.c} (100%)
 rename drivers/{staging/rdma/hfi1/platform.h => infiniband/hw/hfi1/platform.h} (100%)
 rename drivers/{staging/rdma/hfi1/qp.c => infiniband/hw/hfi1/qp.c} (100%)
 rename drivers/{staging/rdma/hfi1/qp.h => infiniband/hw/hfi1/qp.h} (100%)
 rename drivers/{staging/rdma/hfi1/qsfp.c => infiniband/hw/hfi1/qsfp.c} (100%)
 rename drivers/{staging/rdma/hfi1/qsfp.h => infiniband/hw/hfi1/qsfp.h} (100%)
 rename drivers/{staging/rdma/hfi1/rc.c => infiniband/hw/hfi1/rc.c} (100%)
 rename drivers/{staging/rdma/hfi1/ruc.c => infiniband/hw/hfi1/ruc.c} (100%)
 rename drivers/{staging/rdma/hfi1/sdma.c => infiniband/hw/hfi1/sdma.c} (100%)
 rename drivers/{staging/rdma/hfi1/sdma.h => infiniband/hw/hfi1/sdma.h} (100%)
 rename drivers/{staging/rdma/hfi1/sdma_txreq.h => infiniband/hw/hfi1/sdma_txreq.h} (100%)
 rename drivers/{staging/rdma/hfi1/sysfs.c => infiniband/hw/hfi1/sysfs.c} (100%)
 rename drivers/{staging/rdma/hfi1/trace.c => infiniband/hw/hfi1/trace.c} (99%)
 rename drivers/{staging/rdma/hfi1/trace.h => infiniband/hw/hfi1/trace.h} (99%)
 rename drivers/{staging/rdma/hfi1/twsi.c => infiniband/hw/hfi1/twsi.c} (100%)
 rename drivers/{staging/rdma/hfi1/twsi.h => infiniband/hw/hfi1/twsi.h} (100%)
 rename drivers/{staging/rdma/hfi1/uc.c => infiniband/hw/hfi1/uc.c} (100%)
 rename drivers/{staging/rdma/hfi1/ud.c => infiniband/hw/hfi1/ud.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_exp_rcv.c => infiniband/hw/hfi1/user_exp_rcv.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_exp_rcv.h => infiniband/hw/hfi1/user_exp_rcv.h} (100%)
 rename drivers/{staging/rdma/hfi1/user_pages.c => infiniband/hw/hfi1/user_pages.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_sdma.c => infiniband/hw/hfi1/user_sdma.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_sdma.h => infiniband/hw/hfi1/user_sdma.h} (100%)
 rename drivers/{staging/rdma/hfi1/verbs.c => infiniband/hw/hfi1/verbs.c} (100%)
 rename drivers/{staging/rdma/hfi1/verbs.h => infiniband/hw/hfi1/verbs.h} (100%)
 rename drivers/{staging/rdma/hfi1/verbs_txreq.c => infiniband/hw/hfi1/verbs_txreq.c} (100%)
 rename drivers/{staging/rdma/hfi1/verbs_txreq.h => infiniband/hw/hfi1/verbs_txreq.h} (100%)
 delete mode 100644 drivers/staging/rdma/hfi1/TODO
 delete mode 100644 drivers/staging/rdma/hfi1/diag.c
 delete mode 100644 drivers/staging/rdma/hfi1/eprom.c

-- 
-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/10] IB/hfi1: Remove multiple device cdev
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2016-05-19 12:25   ` Dennis Dalessandro
  2016-05-19 12:25   ` [PATCH 02/10] IB/hfi1: Remove UI char device Dennis Dalessandro
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:25 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: Mike Marciniszyn, Dean Luick, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Mitko Haralanov, jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Ira Weiny

hfi1 current exports a cdev that can be used to target all of the hfi's
in the system. However there is a problem with this approach in
that the devices could be on different subnets. This is a problem that
user space can figure out and explicitly tell the driver on which device
to create a context.

Remove the multi-purpose cdev leaving a dedicated cdev for each port.
Also remove the striping capability that is dependent upon the user
choosing the multi-purpose cdev. It is now up to user space to determine
how to stripe contexts.

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/file_ops.c |  102 ++++++----------------------------
 include/uapi/rdma/hfi/hfi1_user.h    |   19 +-----
 2 files changed, 21 insertions(+), 100 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index c1c5bf8..518cd89 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -86,8 +86,7 @@ static int get_ctxt_info(struct file *, void __user *, __u32);
 static int get_base_info(struct file *, void __user *, __u32);
 static int setup_ctxt(struct file *);
 static int setup_subctxt(struct hfi1_ctxtdata *);
-static int get_user_context(struct file *, struct hfi1_user_info *,
-			    int, unsigned);
+static int get_user_context(struct file *, struct hfi1_user_info *, int);
 static int find_shared_ctxt(struct file *, const struct hfi1_user_info *);
 static int allocate_ctxt(struct file *, struct hfi1_devdata *,
 			 struct hfi1_user_info *);
@@ -836,7 +835,7 @@ static u64 kvirt_to_phys(void *addr)
 static int assign_ctxt(struct file *fp, struct hfi1_user_info *uinfo)
 {
 	int i_minor, ret = 0;
-	unsigned swmajor, swminor, alg = HFI1_ALG_ACROSS;
+	unsigned int swmajor, swminor;
 
 	swmajor = uinfo->userversion >> 16;
 	if (swmajor != HFI1_USER_SWMAJOR) {
@@ -846,9 +845,6 @@ static int assign_ctxt(struct file *fp, struct hfi1_user_info *uinfo)
 
 	swminor = uinfo->userversion & 0xffff;
 
-	if (uinfo->hfi1_alg < HFI1_ALG_COUNT)
-		alg = uinfo->hfi1_alg;
-
 	mutex_lock(&hfi1_mutex);
 	/* First, lets check if we need to setup a shared context? */
 	if (uinfo->subctxt_cnt) {
@@ -868,7 +864,7 @@ static int assign_ctxt(struct file *fp, struct hfi1_user_info *uinfo)
 	 */
 	if (!ret) {
 		i_minor = iminor(file_inode(fp)) - HFI1_USER_MINOR_BASE;
-		ret = get_user_context(fp, uinfo, i_minor - 1, alg);
+		ret = get_user_context(fp, uinfo, i_minor);
 	}
 done_unlock:
 	mutex_unlock(&hfi1_mutex);
@@ -876,71 +872,26 @@ done:
 	return ret;
 }
 
-/* return true if the device available for general use */
-static int usable_device(struct hfi1_devdata *dd)
-{
-	struct hfi1_pportdata *ppd = dd->pport;
-
-	return driver_lstate(ppd) == IB_PORT_ACTIVE;
-}
-
 static int get_user_context(struct file *fp, struct hfi1_user_info *uinfo,
-			    int devno, unsigned alg)
+			    int devno)
 {
 	struct hfi1_devdata *dd = NULL;
-	int ret = 0, devmax, npresent, nup, dev;
+	int devmax, npresent, nup;
 
 	devmax = hfi1_count_units(&npresent, &nup);
-	if (!npresent) {
-		ret = -ENXIO;
-		goto done;
-	}
-	if (!nup) {
-		ret = -ENETDOWN;
-		goto done;
-	}
-	if (devno >= 0) {
-		dd = hfi1_lookup(devno);
-		if (!dd)
-			ret = -ENODEV;
-		else if (!dd->freectxts)
-			ret = -EBUSY;
-	} else {
-		struct hfi1_devdata *pdd;
-
-		if (alg == HFI1_ALG_ACROSS) {
-			unsigned free = 0U;
-
-			for (dev = 0; dev < devmax; dev++) {
-				pdd = hfi1_lookup(dev);
-				if (!pdd)
-					continue;
-				if (!usable_device(pdd))
-					continue;
-				if (pdd->freectxts &&
-				    pdd->freectxts > free) {
-					dd = pdd;
-					free = pdd->freectxts;
-				}
-			}
-		} else {
-			for (dev = 0; dev < devmax; dev++) {
-				pdd = hfi1_lookup(dev);
-				if (!pdd)
-					continue;
-				if (!usable_device(pdd))
-					continue;
-				if (pdd->freectxts) {
-					dd = pdd;
-					break;
-				}
-			}
-		}
-		if (!dd)
-			ret = -EBUSY;
-	}
-done:
-	return ret ? ret : allocate_ctxt(fp, dd, uinfo);
+	if (!npresent)
+		return -ENXIO;
+
+	if (!nup)
+		return -ENETDOWN;
+
+	dd = hfi1_lookup(devno);
+	if (!dd)
+		return -ENODEV;
+	else if (!dd->freectxts)
+		return -EBUSY;
+
+	return allocate_ctxt(fp, dd, uinfo);
 }
 
 static int find_shared_ctxt(struct file *fp,
@@ -1698,15 +1649,8 @@ static const struct file_operations ui_file_ops = {
 #define UI_OFFSET 192	/* device minor offset for UI devices */
 static int create_ui = 1;
 
-static struct cdev wildcard_cdev;
-static struct device *wildcard_device;
-
-static atomic_t user_count = ATOMIC_INIT(0);
-
 static void user_remove(struct hfi1_devdata *dd)
 {
-	if (atomic_dec_return(&user_count) == 0)
-		hfi1_cdev_cleanup(&wildcard_cdev, &wildcard_device);
 
 	hfi1_cdev_cleanup(&dd->user_cdev, &dd->user_device);
 	hfi1_cdev_cleanup(&dd->ui_cdev, &dd->ui_device);
@@ -1717,16 +1661,8 @@ static int user_add(struct hfi1_devdata *dd)
 	char name[10];
 	int ret;
 
-	if (atomic_inc_return(&user_count) == 1) {
-		ret = hfi1_cdev_init(0, class_name(), &hfi1_file_ops,
-				     &wildcard_cdev, &wildcard_device,
-				     true);
-		if (ret)
-			goto done;
-	}
-
 	snprintf(name, sizeof(name), "%s_%d", class_name(), dd->unit);
-	ret = hfi1_cdev_init(dd->unit + 1, name, &hfi1_file_ops,
+	ret = hfi1_cdev_init(dd->unit, name, &hfi1_file_ops,
 			     &dd->user_cdev, &dd->user_device,
 			     true);
 	if (ret)
diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h
index a533cec..0955899 100644
--- a/include/uapi/rdma/hfi/hfi1_user.h
+++ b/include/uapi/rdma/hfi/hfi1_user.h
@@ -75,7 +75,7 @@
  * may not be implemented; the user code must deal with this if it
  * cares, or it must abort after initialization reports the difference.
  */
-#define HFI1_USER_SWMINOR 0
+#define HFI1_USER_SWMINOR 1
 
 /*
  * Set of HW and driver capability/feature bits.
@@ -107,19 +107,6 @@
 #define HFI1_RCVHDR_ENTSIZE_16   (1UL << 1)
 #define HFI1_RCVDHR_ENTSIZE_32   (1UL << 2)
 
-/*
- * If the unit is specified via open, HFI choice is fixed.  If port is
- * specified, it's also fixed.  Otherwise we try to spread contexts
- * across ports and HFIs, using different algorithms.  WITHIN is
- * the old default, prior to this mechanism.
- */
-#define HFI1_ALG_ACROSS 0 /* round robin contexts across HFIs, then
-			  * ports; this is the default */
-#define HFI1_ALG_WITHIN 1 /* use all contexts on an HFI (round robin
-			  * active ports within), then next HFI */
-#define HFI1_ALG_COUNT  2 /* number of algorithm choices */
-
-
 /* User commands. */
 #define HFI1_CMD_ASSIGN_CTXT     1	/* allocate HFI and context */
 #define HFI1_CMD_CTXT_INFO       2	/* find out what resources we got */
@@ -199,9 +186,7 @@ struct hfi1_user_info {
 	 * Should be set to HFI1_USER_SWVERSION.
 	 */
 	__u32 userversion;
-	__u16 pad;
-	/* HFI selection algorithm, if unit has not selected */
-	__u16 hfi1_alg;
+	__u32 pad;
 	/*
 	 * If two or more processes wish to share a context, each process
 	 * must set the subcontext_cnt and subcontext_id to the same

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/10] IB/hfi1: Remove UI char device
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2016-05-19 12:25   ` [PATCH 01/10] IB/hfi1: Remove multiple device cdev Dennis Dalessandro
@ 2016-05-19 12:25   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 03/10] IB/hfi1: Remove EPROM functionality from data device Dennis Dalessandro
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:25 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Dean Luick, Mike Marciniszyn

Remove UI char device which exposes direct access to registers for user
space. This was put in to aid in debugging the hardware. We are looking
into alternatives means of providing the same functionality. This
removes another char device from HFI1's footprint.

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/file_ops.c |  168 ----------------------------------
 1 files changed, 1 insertions(+), 167 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index 518cd89..eb66bef 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -1497,163 +1497,10 @@ done:
 	return ret;
 }
 
-static int ui_open(struct inode *inode, struct file *filp)
-{
-	struct hfi1_devdata *dd;
-
-	dd = container_of(inode->i_cdev, struct hfi1_devdata, ui_cdev);
-	filp->private_data = dd; /* for other methods */
-	return 0;
-}
-
-static int ui_release(struct inode *inode, struct file *filp)
-{
-	/* nothing to do */
-	return 0;
-}
-
-static loff_t ui_lseek(struct file *filp, loff_t offset, int whence)
-{
-	struct hfi1_devdata *dd = filp->private_data;
-
-	return fixed_size_llseek(filp, offset, whence,
-		(dd->kregend - dd->kregbase) + DC8051_DATA_MEM_SIZE);
-}
-
-/* NOTE: assumes unsigned long is 8 bytes */
-static ssize_t ui_read(struct file *filp, char __user *buf, size_t count,
-		       loff_t *f_pos)
-{
-	struct hfi1_devdata *dd = filp->private_data;
-	void __iomem *base = dd->kregbase;
-	unsigned long total, csr_off,
-		barlen = (dd->kregend - dd->kregbase);
-	u64 data;
-
-	/* only read 8 byte quantities */
-	if ((count % 8) != 0)
-		return -EINVAL;
-	/* offset must be 8-byte aligned */
-	if ((*f_pos % 8) != 0)
-		return -EINVAL;
-	/* destination buffer must be 8-byte aligned */
-	if ((unsigned long)buf % 8 != 0)
-		return -EINVAL;
-	/* must be in range */
-	if (*f_pos + count > (barlen + DC8051_DATA_MEM_SIZE))
-		return -EINVAL;
-	/* only set the base if we are not starting past the BAR */
-	if (*f_pos < barlen)
-		base += *f_pos;
-	csr_off = *f_pos;
-	for (total = 0; total < count; total += 8, csr_off += 8) {
-		/* accessing LCB CSRs requires more checks */
-		if (is_lcb_offset(csr_off)) {
-			if (read_lcb_csr(dd, csr_off, (u64 *)&data))
-				break; /* failed */
-		}
-		/*
-		 * Cannot read ASIC GPIO/QSFP* clear and force CSRs without a
-		 * false parity error.  Avoid the whole issue by not reading
-		 * them.  These registers are defined as having a read value
-		 * of 0.
-		 */
-		else if (csr_off == ASIC_GPIO_CLEAR ||
-			 csr_off == ASIC_GPIO_FORCE ||
-			 csr_off == ASIC_QSFP1_CLEAR ||
-			 csr_off == ASIC_QSFP1_FORCE ||
-			 csr_off == ASIC_QSFP2_CLEAR ||
-			 csr_off == ASIC_QSFP2_FORCE)
-			data = 0;
-		else if (csr_off >= barlen) {
-			/*
-			 * read_8051_data can read more than just 8 bytes at
-			 * a time. However, folding this into the loop and
-			 * handling the reads in 8 byte increments allows us
-			 * to smoothly transition from chip memory to 8051
-			 * memory.
-			 */
-			if (read_8051_data(dd,
-					   (u32)(csr_off - barlen),
-					   sizeof(data), &data))
-				break; /* failed */
-		} else
-			data = readq(base + total);
-		if (put_user(data, (unsigned long __user *)(buf + total)))
-			break;
-	}
-	*f_pos += total;
-	return total;
-}
-
-/* NOTE: assumes unsigned long is 8 bytes */
-static ssize_t ui_write(struct file *filp, const char __user *buf,
-			size_t count, loff_t *f_pos)
-{
-	struct hfi1_devdata *dd = filp->private_data;
-	void __iomem *base;
-	unsigned long total, data, csr_off;
-	int in_lcb;
-
-	/* only write 8 byte quantities */
-	if ((count % 8) != 0)
-		return -EINVAL;
-	/* offset must be 8-byte aligned */
-	if ((*f_pos % 8) != 0)
-		return -EINVAL;
-	/* source buffer must be 8-byte aligned */
-	if ((unsigned long)buf % 8 != 0)
-		return -EINVAL;
-	/* must be in range */
-	if (*f_pos + count > dd->kregend - dd->kregbase)
-		return -EINVAL;
-
-	base = (void __iomem *)dd->kregbase + *f_pos;
-	csr_off = *f_pos;
-	in_lcb = 0;
-	for (total = 0; total < count; total += 8, csr_off += 8) {
-		if (get_user(data, (unsigned long __user *)(buf + total)))
-			break;
-		/* accessing LCB CSRs requires a special procedure */
-		if (is_lcb_offset(csr_off)) {
-			if (!in_lcb) {
-				int ret = acquire_lcb_access(dd, 1);
-
-				if (ret)
-					break;
-				in_lcb = 1;
-			}
-		} else {
-			if (in_lcb) {
-				release_lcb_access(dd, 1);
-				in_lcb = 0;
-			}
-		}
-		writeq(data, base + total);
-	}
-	if (in_lcb)
-		release_lcb_access(dd, 1);
-	*f_pos += total;
-	return total;
-}
-
-static const struct file_operations ui_file_ops = {
-	.owner = THIS_MODULE,
-	.llseek = ui_lseek,
-	.read = ui_read,
-	.write = ui_write,
-	.open = ui_open,
-	.release = ui_release,
-};
-
-#define UI_OFFSET 192	/* device minor offset for UI devices */
-static int create_ui = 1;
-
 static void user_remove(struct hfi1_devdata *dd)
 {
 
 	hfi1_cdev_cleanup(&dd->user_cdev, &dd->user_device);
-	hfi1_cdev_cleanup(&dd->ui_cdev, &dd->ui_device);
 }
 
 static int user_add(struct hfi1_devdata *dd)
@@ -1666,21 +1513,8 @@ static int user_add(struct hfi1_devdata *dd)
 			     &dd->user_cdev, &dd->user_device,
 			     true);
 	if (ret)
-		goto done;
+		user_remove(dd);
 
-	if (create_ui) {
-		snprintf(name, sizeof(name),
-			 "%s_ui%d", class_name(), dd->unit);
-		ret = hfi1_cdev_init(dd->unit + UI_OFFSET, name, &ui_file_ops,
-				     &dd->ui_cdev, &dd->ui_device,
-				     false);
-		if (ret)
-			goto done;
-	}
-
-	return 0;
-done:
-	user_remove(dd);
 	return ret;
 }
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/10] IB/hfi1: Remove EPROM functionality from data device
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2016-05-19 12:25   ` [PATCH 01/10] IB/hfi1: Remove multiple device cdev Dennis Dalessandro
  2016-05-19 12:25   ` [PATCH 02/10] IB/hfi1: Remove UI char device Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 04/10] IB/hfi1: Remove snoop/diag interface Dennis Dalessandro
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mike Marciniszyn, Dean Luick

Remove EPROM handling from the cdev which is used for user application
data traffic.

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/eprom.c    |  369 ----------------------------------
 drivers/staging/rdma/hfi1/file_ops.c |   23 --
 include/uapi/rdma/hfi/hfi1_user.h    |    7 -
 3 files changed, 0 insertions(+), 399 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/eprom.c b/drivers/staging/rdma/hfi1/eprom.c
index bd87715..36b7794 100644
--- a/drivers/staging/rdma/hfi1/eprom.c
+++ b/drivers/staging/rdma/hfi1/eprom.c
@@ -49,387 +49,18 @@
 #include "common.h"
 #include "eprom.h"
 
-/*
- * The EPROM is logically divided into three partitions:
- *	partition 0: the first 128K, visible from PCI ROM BAR
- *	partition 1: 4K config file (sector size)
- *	partition 2: the rest
- */
-#define P0_SIZE (128 * 1024)
-#define P1_SIZE   (4 * 1024)
-#define P1_START P0_SIZE
-#define P2_START (P0_SIZE + P1_SIZE)
-
-/* erase sizes supported by the controller */
-#define SIZE_4KB (4 * 1024)
-#define MASK_4KB (SIZE_4KB - 1)
-
-#define SIZE_32KB (32 * 1024)
-#define MASK_32KB (SIZE_32KB - 1)
-
-#define SIZE_64KB (64 * 1024)
-#define MASK_64KB (SIZE_64KB - 1)
-
-/* controller page size, in bytes */
-#define EP_PAGE_SIZE 256
-#define EEP_PAGE_MASK (EP_PAGE_SIZE - 1)
-
-/* controller commands */
 #define CMD_SHIFT 24
-#define CMD_NOP			    (0)
-#define CMD_PAGE_PROGRAM(addr)	    ((0x02 << CMD_SHIFT) | addr)
-#define CMD_READ_DATA(addr)	    ((0x03 << CMD_SHIFT) | addr)
-#define CMD_READ_SR1		    ((0x05 << CMD_SHIFT))
-#define CMD_WRITE_ENABLE	    ((0x06 << CMD_SHIFT))
-#define CMD_SECTOR_ERASE_4KB(addr)  ((0x20 << CMD_SHIFT) | addr)
-#define CMD_SECTOR_ERASE_32KB(addr) ((0x52 << CMD_SHIFT) | addr)
-#define CMD_CHIP_ERASE		    ((0x60 << CMD_SHIFT))
-#define CMD_READ_MANUF_DEV_ID	    ((0x90 << CMD_SHIFT))
 #define CMD_RELEASE_POWERDOWN_NOID  ((0xab << CMD_SHIFT))
-#define CMD_SECTOR_ERASE_64KB(addr) ((0xd8 << CMD_SHIFT) | addr)
 
 /* controller interface speeds */
 #define EP_SPEED_FULL 0x2	/* full speed */
 
-/* controller status register 1 bits */
-#define SR1_BUSY 0x1ull		/* the BUSY bit in SR1 */
-
-/* sleep length while waiting for controller */
-#define WAIT_SLEEP_US 100	/* must be larger than 5 (see usage) */
-#define COUNT_DELAY_SEC(n) ((n) * (1000000 / WAIT_SLEEP_US))
-
-/* GPIO pins */
-#define EPROM_WP_N BIT_ULL(14)	/* EPROM write line */
-
 /*
  * How long to wait for the EPROM to become available, in ms.
  * The spec 32 Mb EPROM takes around 40s to erase then write.
  * Double it for safety.
  */
 #define EPROM_TIMEOUT 80000 /* ms */
-
-/*
- * Turn on external enable line that allows writing on the flash.
- */
-static void write_enable(struct hfi1_devdata *dd)
-{
-	/* raise signal */
-	write_csr(dd, ASIC_GPIO_OUT, read_csr(dd, ASIC_GPIO_OUT) | EPROM_WP_N);
-	/* raise enable */
-	write_csr(dd, ASIC_GPIO_OE, read_csr(dd, ASIC_GPIO_OE) | EPROM_WP_N);
-}
-
-/*
- * Turn off external enable line that allows writing on the flash.
- */
-static void write_disable(struct hfi1_devdata *dd)
-{
-	/* lower signal */
-	write_csr(dd, ASIC_GPIO_OUT, read_csr(dd, ASIC_GPIO_OUT) & ~EPROM_WP_N);
-	/* lower enable */
-	write_csr(dd, ASIC_GPIO_OE, read_csr(dd, ASIC_GPIO_OE) & ~EPROM_WP_N);
-}
-
-/*
- * Wait for the device to become not busy.  Must be called after all
- * write or erase operations.
- */
-static int wait_for_not_busy(struct hfi1_devdata *dd)
-{
-	unsigned long count = 0;
-	u64 reg;
-	int ret = 0;
-
-	/* starts page mode */
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_READ_SR1);
-	while (1) {
-		udelay(WAIT_SLEEP_US);
-		usleep_range(WAIT_SLEEP_US - 5, WAIT_SLEEP_US + 5);
-		count++;
-		reg = read_csr(dd, ASIC_EEP_DATA);
-		if ((reg & SR1_BUSY) == 0)
-			break;
-		/* 200s is the largest time for a 128Mb device */
-		if (count > COUNT_DELAY_SEC(200)) {
-			dd_dev_err(dd, "waited too long for SPI FLASH busy to clear - failing\n");
-			ret = -ETIMEDOUT;
-			break; /* break, not goto - must stop page mode */
-		}
-	}
-
-	/* stop page mode with a NOP */
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_NOP);
-
-	return ret;
-}
-
-/*
- * Read the device ID from the SPI controller.
- */
-static u32 read_device_id(struct hfi1_devdata *dd)
-{
-	/* read the Manufacture Device ID */
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_READ_MANUF_DEV_ID);
-	return (u32)read_csr(dd, ASIC_EEP_DATA);
-}
-
-/*
- * Erase the whole flash.
- */
-static int erase_chip(struct hfi1_devdata *dd)
-{
-	int ret;
-
-	write_enable(dd);
-
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_WRITE_ENABLE);
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_CHIP_ERASE);
-	ret = wait_for_not_busy(dd);
-
-	write_disable(dd);
-
-	return ret;
-}
-
-/*
- * Erase a range.
- */
-static int erase_range(struct hfi1_devdata *dd, u32 start, u32 len)
-{
-	u32 end = start + len;
-	int ret = 0;
-
-	if (end < start)
-		return -EINVAL;
-
-	/* check the end points for the minimum erase */
-	if ((start & MASK_4KB) || (end & MASK_4KB)) {
-		dd_dev_err(dd,
-			   "%s: non-aligned range (0x%x,0x%x) for a 4KB erase\n",
-			   __func__, start, end);
-		return -EINVAL;
-	}
-
-	write_enable(dd);
-
-	while (start < end) {
-		write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_WRITE_ENABLE);
-		/* check in order of largest to smallest */
-		if (((start & MASK_64KB) == 0) && (start + SIZE_64KB <= end)) {
-			write_csr(dd, ASIC_EEP_ADDR_CMD,
-				  CMD_SECTOR_ERASE_64KB(start));
-			start += SIZE_64KB;
-		} else if (((start & MASK_32KB) == 0) &&
-			   (start + SIZE_32KB <= end)) {
-			write_csr(dd, ASIC_EEP_ADDR_CMD,
-				  CMD_SECTOR_ERASE_32KB(start));
-			start += SIZE_32KB;
-		} else {	/* 4KB will work */
-			write_csr(dd, ASIC_EEP_ADDR_CMD,
-				  CMD_SECTOR_ERASE_4KB(start));
-			start += SIZE_4KB;
-		}
-		ret = wait_for_not_busy(dd);
-		if (ret)
-			goto done;
-	}
-
-done:
-	write_disable(dd);
-
-	return ret;
-}
-
-/*
- * Read a 256 byte (64 dword) EPROM page.
- * All callers have verified the offset is at a page boundary.
- */
-static void read_page(struct hfi1_devdata *dd, u32 offset, u32 *result)
-{
-	int i;
-
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_READ_DATA(offset));
-	for (i = 0; i < EP_PAGE_SIZE / sizeof(u32); i++)
-		result[i] = (u32)read_csr(dd, ASIC_EEP_DATA);
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_NOP); /* close open page */
-}
-
-/*
- * Read length bytes starting at offset.  Copy to user address addr.
- */
-static int read_length(struct hfi1_devdata *dd, u32 start, u32 len, u64 addr)
-{
-	u32 offset;
-	u32 buffer[EP_PAGE_SIZE / sizeof(u32)];
-	int ret = 0;
-
-	/* reject anything not on an EPROM page boundary */
-	if ((start & EEP_PAGE_MASK) || (len & EEP_PAGE_MASK))
-		return -EINVAL;
-
-	for (offset = 0; offset < len; offset += EP_PAGE_SIZE) {
-		read_page(dd, start + offset, buffer);
-		if (copy_to_user((void __user *)(addr + offset),
-				 buffer, EP_PAGE_SIZE)) {
-			ret = -EFAULT;
-			goto done;
-		}
-	}
-
-done:
-	return ret;
-}
-
-/*
- * Write a 256 byte (64 dword) EPROM page.
- * All callers have verified the offset is at a page boundary.
- */
-static int write_page(struct hfi1_devdata *dd, u32 offset, u32 *data)
-{
-	int i;
-
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_WRITE_ENABLE);
-	write_csr(dd, ASIC_EEP_DATA, data[0]);
-	write_csr(dd, ASIC_EEP_ADDR_CMD, CMD_PAGE_PROGRAM(offset));
-	for (i = 1; i < EP_PAGE_SIZE / sizeof(u32); i++)
-		write_csr(dd, ASIC_EEP_DATA, data[i]);
-	/* will close the open page */
-	return wait_for_not_busy(dd);
-}
-
-/*
- * Write length bytes starting at offset.  Read from user address addr.
- */
-static int write_length(struct hfi1_devdata *dd, u32 start, u32 len, u64 addr)
-{
-	u32 offset;
-	u32 buffer[EP_PAGE_SIZE / sizeof(u32)];
-	int ret = 0;
-
-	/* reject anything not on an EPROM page boundary */
-	if ((start & EEP_PAGE_MASK) || (len & EEP_PAGE_MASK))
-		return -EINVAL;
-
-	write_enable(dd);
-
-	for (offset = 0; offset < len; offset += EP_PAGE_SIZE) {
-		if (copy_from_user(buffer, (void __user *)(addr + offset),
-				   EP_PAGE_SIZE)) {
-			ret = -EFAULT;
-			goto done;
-		}
-		ret = write_page(dd, start + offset, buffer);
-		if (ret)
-			goto done;
-	}
-
-done:
-	write_disable(dd);
-	return ret;
-}
-
-/* convert an range composite to a length, in bytes */
-static inline u32 extract_rlen(u32 composite)
-{
-	return (composite & 0xffff) * EP_PAGE_SIZE;
-}
-
-/* convert an range composite to a start, in bytes */
-static inline u32 extract_rstart(u32 composite)
-{
-	return (composite >> 16) * EP_PAGE_SIZE;
-}
-
-/*
- * Perform the given operation on the EPROM.  Called from user space.  The
- * user credentials have already been checked.
- *
- * Return 0 on success, -ERRNO on error
- */
-int handle_eprom_command(struct file *fp, const struct hfi1_cmd *cmd)
-{
-	struct hfi1_devdata *dd;
-	u32 dev_id;
-	u32 rlen;	/* range length */
-	u32 rstart;	/* range start */
-	int i_minor;
-	int ret = 0;
-
-	/*
-	 * Map the device file to device data using the relative minor.
-	 * The device file minor number is the unit number + 1.  0 is
-	 * the generic device file - reject it.
-	 */
-	i_minor = iminor(file_inode(fp)) - HFI1_USER_MINOR_BASE;
-	if (i_minor <= 0)
-		return -EINVAL;
-	dd = hfi1_lookup(i_minor - 1);
-	if (!dd) {
-		pr_err("%s: cannot find unit %d!\n", __func__, i_minor);
-		return -EINVAL;
-	}
-
-	/* some devices do not have an EPROM */
-	if (!dd->eprom_available)
-		return -EOPNOTSUPP;
-
-	ret = acquire_chip_resource(dd, CR_EPROM, EPROM_TIMEOUT);
-	if (ret) {
-		dd_dev_err(dd, "%s: unable to acquire EPROM resource\n",
-			   __func__);
-		goto done_asic;
-	}
-
-	dd_dev_info(dd, "%s: cmd: type %d, len 0x%x, addr 0x%016llx\n",
-		    __func__, cmd->type, cmd->len, cmd->addr);
-
-	switch (cmd->type) {
-	case HFI1_CMD_EP_INFO:
-		if (cmd->len != sizeof(u32)) {
-			ret = -ERANGE;
-			break;
-		}
-		dev_id = read_device_id(dd);
-		/* addr points to a u32 user buffer */
-		if (copy_to_user((void __user *)cmd->addr, &dev_id,
-				 sizeof(u32)))
-			ret = -EFAULT;
-		break;
-
-	case HFI1_CMD_EP_ERASE_CHIP:
-		ret = erase_chip(dd);
-		break;
-
-	case HFI1_CMD_EP_ERASE_RANGE:
-		rlen = extract_rlen(cmd->len);
-		rstart = extract_rstart(cmd->len);
-		ret = erase_range(dd, rstart, rlen);
-		break;
-
-	case HFI1_CMD_EP_READ_RANGE:
-		rlen = extract_rlen(cmd->len);
-		rstart = extract_rstart(cmd->len);
-		ret = read_length(dd, rstart, rlen, cmd->addr);
-		break;
-
-	case HFI1_CMD_EP_WRITE_RANGE:
-		rlen = extract_rlen(cmd->len);
-		rstart = extract_rstart(cmd->len);
-		ret = write_length(dd, rstart, rlen, cmd->addr);
-		break;
-
-	default:
-		dd_dev_err(dd, "%s: unexpected command %d\n",
-			   __func__, cmd->type);
-		ret = -EINVAL;
-		break;
-	}
-
-	release_chip_resource(dd, CR_EPROM);
-done_asic:
-	return ret;
-}
-
 /*
  * Initialize the EPROM handler.
  */
diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index eb66bef..2eddd33 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -189,7 +189,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 	void *dest = NULL;
 	__u64 user_val = 0;
 	int uctxt_required = 1;
-	int must_be_root = 0;
 
 	/* FIXME: This interface cannot continue out of staging */
 	if (WARN_ON_ONCE(!ib_safe_file_access(fp)))
@@ -234,15 +233,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 		copy = 0;
 		user_val = cmd.addr;
 		break;
-	case HFI1_CMD_EP_INFO:
-	case HFI1_CMD_EP_ERASE_CHIP:
-	case HFI1_CMD_EP_ERASE_RANGE:
-	case HFI1_CMD_EP_READ_RANGE:
-	case HFI1_CMD_EP_WRITE_RANGE:
-		uctxt_required = 0;	/* assigned user context not required */
-		must_be_root = 1;	/* validate user */
-		copy = 0;
-		break;
 	default:
 		ret = -EINVAL;
 		goto bail;
@@ -265,12 +255,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 		goto bail;
 	}
 
-	/* only root can do these operations */
-	if (must_be_root && !capable(CAP_SYS_ADMIN)) {
-		ret = -EPERM;
-		goto bail;
-	}
-
 	switch (cmd.type) {
 	case HFI1_CMD_ASSIGN_CTXT:
 		ret = assign_ctxt(fp, &uinfo);
@@ -409,13 +393,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 			sc_return_credits(sc);
 		break;
 	}
-	case HFI1_CMD_EP_INFO:
-	case HFI1_CMD_EP_ERASE_CHIP:
-	case HFI1_CMD_EP_ERASE_RANGE:
-	case HFI1_CMD_EP_READ_RANGE:
-	case HFI1_CMD_EP_WRITE_RANGE:
-		ret = handle_eprom_command(fp, &cmd);
-		break;
 	}
 
 	if (ret >= 0)
diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h
index 0955899..3e3680d 100644
--- a/include/uapi/rdma/hfi/hfi1_user.h
+++ b/include/uapi/rdma/hfi/hfi1_user.h
@@ -122,13 +122,6 @@
 #define HFI1_CMD_SET_PKEY        11     /* set context's pkey */
 #define HFI1_CMD_CTXT_RESET      12     /* reset context's HW send context */
 #define HFI1_CMD_TID_INVAL_READ  13     /* read TID cache invalidations */
-/* separate EPROM commands from normal PSM commands */
-#define HFI1_CMD_EP_INFO         64      /* read EPROM device ID */
-#define HFI1_CMD_EP_ERASE_CHIP   65      /* erase whole EPROM */
-/* range 66-74 no longer used */
-#define HFI1_CMD_EP_ERASE_RANGE  75      /* erase EPROM range */
-#define HFI1_CMD_EP_READ_RANGE   76      /* read EPROM range */
-#define HFI1_CMD_EP_WRITE_RANGE  77      /* write EPROM range */
 
 #define _HFI1_EVENT_FROZEN_BIT         0
 #define _HFI1_EVENT_LINKDOWN_BIT       1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/10] IB/hfi1: Remove snoop/diag interface
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 03/10] IB/hfi1: Remove EPROM functionality from data device Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 05/10] IB/hfi1: Remove unused user command Dennis Dalessandro
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mike Marciniszyn, Dean Luick

The snoop/diag interface is better served by an implementation which is
more general and usable by other drivers perhaps. Go ahead and remove
the code now and get rid of the char dev. We can put the feature back
when we have a more agreeable solution.

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/Makefile   |    2 
 drivers/staging/rdma/hfi1/diag.c     | 1925 ----------------------------------
 drivers/staging/rdma/hfi1/file_ops.c |    9 
 3 files changed, 2 insertions(+), 1934 deletions(-)
 delete mode 100644 drivers/staging/rdma/hfi1/diag.c

diff --git a/drivers/staging/rdma/hfi1/Makefile b/drivers/staging/rdma/hfi1/Makefile
index 8dc5938..9b5382c 100644
--- a/drivers/staging/rdma/hfi1/Makefile
+++ b/drivers/staging/rdma/hfi1/Makefile
@@ -7,7 +7,7 @@
 #
 obj-$(CONFIG_INFINIBAND_HFI1) += hfi1.o
 
-hfi1-y := affinity.o chip.o device.o diag.o driver.o efivar.o \
+hfi1-y := affinity.o chip.o device.o driver.o efivar.o \
 	eprom.o file_ops.o firmware.o \
 	init.o intr.o mad.o mmu_rb.o pcie.o pio.o pio_copy.o platform.o \
 	qp.o qsfp.o rc.o ruc.o sdma.o sysfs.o trace.o twsi.o \
diff --git a/drivers/staging/rdma/hfi1/diag.c b/drivers/staging/rdma/hfi1/diag.c
deleted file mode 100644
index 3a679e4..0000000
--- a/drivers/staging/rdma/hfi1/diag.c
+++ /dev/null
@@ -1,1925 +0,0 @@
-/*
- * Copyright(c) 2015, 2016 Intel Corporation.
- *
- * This file is provided under a dual BSD/GPLv2 license.  When using or
- * redistributing this file, you may do so under either license.
- *
- * GPL LICENSE SUMMARY
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of version 2 of the GNU General Public License as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * BSD LICENSE
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- *  - Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- *  - Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in
- *    the documentation and/or other materials provided with the
- *    distribution.
- *  - Neither the name of Intel Corporation nor the names of its
- *    contributors may be used to endorse or promote products derived
- *    from this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- */
-
-/*
- * This file contains support for diagnostic functions.  It is accessed by
- * opening the hfi1_diag device, normally minor number 129.  Diagnostic use
- * of the chip may render the chip or board unusable until the driver
- * is unloaded, or in some cases, until the system is rebooted.
- *
- * Accesses to the chip through this interface are not similar to going
- * through the /sys/bus/pci resource mmap interface.
- */
-
-#include <linux/io.h>
-#include <linux/pci.h>
-#include <linux/poll.h>
-#include <linux/vmalloc.h>
-#include <linux/export.h>
-#include <linux/fs.h>
-#include <linux/uaccess.h>
-#include <linux/module.h>
-#include <rdma/ib_smi.h>
-#include "hfi.h"
-#include "device.h"
-#include "common.h"
-#include "verbs_txreq.h"
-#include "trace.h"
-
-#undef pr_fmt
-#define pr_fmt(fmt) DRIVER_NAME ": " fmt
-#define snoop_dbg(fmt, ...) \
-	hfi1_cdbg(SNOOP, fmt, ##__VA_ARGS__)
-
-/* Snoop option mask */
-#define SNOOP_DROP_SEND		BIT(0)
-#define SNOOP_USE_METADATA	BIT(1)
-#define SNOOP_SET_VL0TOVL15     BIT(2)
-
-static u8 snoop_flags;
-
-/*
- * Extract packet length from LRH header.
- * This is in Dwords so multiply by 4 to get size in bytes
- */
-#define HFI1_GET_PKT_LEN(x)      (((be16_to_cpu((x)->lrh[2]) & 0xFFF)) << 2)
-
-enum hfi1_filter_status {
-	HFI1_FILTER_HIT,
-	HFI1_FILTER_ERR,
-	HFI1_FILTER_MISS
-};
-
-/* snoop processing functions */
-rhf_rcv_function_ptr snoop_rhf_rcv_functions[8] = {
-	[RHF_RCV_TYPE_EXPECTED] = snoop_recv_handler,
-	[RHF_RCV_TYPE_EAGER]    = snoop_recv_handler,
-	[RHF_RCV_TYPE_IB]       = snoop_recv_handler,
-	[RHF_RCV_TYPE_ERROR]    = snoop_recv_handler,
-	[RHF_RCV_TYPE_BYPASS]   = snoop_recv_handler,
-	[RHF_RCV_TYPE_INVALID5] = process_receive_invalid,
-	[RHF_RCV_TYPE_INVALID6] = process_receive_invalid,
-	[RHF_RCV_TYPE_INVALID7] = process_receive_invalid
-};
-
-/* Snoop packet structure */
-struct snoop_packet {
-	struct list_head list;
-	u32 total_len;
-	u8 data[];
-};
-
-/* Do not make these an enum or it will blow up the capture_md */
-#define PKT_DIR_EGRESS 0x0
-#define PKT_DIR_INGRESS 0x1
-
-/* Packet capture metadata returned to the user with the packet. */
-struct capture_md {
-	u8 port;
-	u8 dir;
-	u8 reserved[6];
-	union {
-		u64 pbc;
-		u64 rhf;
-	} u;
-};
-
-static atomic_t diagpkt_count = ATOMIC_INIT(0);
-static struct cdev diagpkt_cdev;
-static struct device *diagpkt_device;
-
-static ssize_t diagpkt_write(struct file *fp, const char __user *data,
-			     size_t count, loff_t *off);
-
-static const struct file_operations diagpkt_file_ops = {
-	.owner = THIS_MODULE,
-	.write = diagpkt_write,
-	.llseek = noop_llseek,
-};
-
-/*
- * This is used for communication with user space for snoop extended IOCTLs
- */
-struct hfi1_link_info {
-	__be64 node_guid;
-	u8 port_mode;
-	u8 port_state;
-	u16 link_speed_active;
-	u16 link_width_active;
-	u16 vl15_init;
-	u8 port_number;
-	/*
-	 * Add padding to make this a full IB SMP payload. Note: changing the
-	 * size of this structure will make the IOCTLs created with _IOWR
-	 * change.
-	 * Be sure to run tests on all IOCTLs when making changes to this
-	 * structure.
-	 */
-	u8 res[47];
-};
-
-/*
- * This starts our ioctl sequence numbers *way* off from the ones
- * defined in ib_core.
- */
-#define SNOOP_CAPTURE_VERSION 0x1
-
-#define IB_IOCTL_MAGIC          0x1b /* See Documentation/ioctl-number.txt */
-#define HFI1_SNOOP_IOC_MAGIC IB_IOCTL_MAGIC
-#define HFI1_SNOOP_IOC_BASE_SEQ 0x80
-
-#define HFI1_SNOOP_IOCGETLINKSTATE \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ)
-#define HFI1_SNOOP_IOCSETLINKSTATE \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 1)
-#define HFI1_SNOOP_IOCCLEARQUEUE \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 2)
-#define HFI1_SNOOP_IOCCLEARFILTER \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 3)
-#define HFI1_SNOOP_IOCSETFILTER \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 4)
-#define HFI1_SNOOP_IOCGETVERSION \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 5)
-#define HFI1_SNOOP_IOCSET_OPTS \
-	_IO(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 6)
-
-/*
- * These offsets +6/+7 could change, but these are already known and used
- * IOCTL numbers so don't change them without a good reason.
- */
-#define HFI1_SNOOP_IOCGETLINKSTATE_EXTRA \
-	_IOWR(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 6, \
-		struct hfi1_link_info)
-#define HFI1_SNOOP_IOCSETLINKSTATE_EXTRA \
-	_IOWR(HFI1_SNOOP_IOC_MAGIC, HFI1_SNOOP_IOC_BASE_SEQ + 7, \
-		struct hfi1_link_info)
-
-static int hfi1_snoop_open(struct inode *in, struct file *fp);
-static ssize_t hfi1_snoop_read(struct file *fp, char __user *data,
-			       size_t pkt_len, loff_t *off);
-static ssize_t hfi1_snoop_write(struct file *fp, const char __user *data,
-				size_t count, loff_t *off);
-static long hfi1_ioctl(struct file *fp, unsigned int cmd, unsigned long arg);
-static unsigned int hfi1_snoop_poll(struct file *fp,
-				    struct poll_table_struct *wait);
-static int hfi1_snoop_release(struct inode *in, struct file *fp);
-
-struct hfi1_packet_filter_command {
-	int opcode;
-	int length;
-	void *value_ptr;
-};
-
-/* Can't re-use PKT_DIR_*GRESS here because 0 means no packets for this */
-#define HFI1_SNOOP_INGRESS 0x1
-#define HFI1_SNOOP_EGRESS  0x2
-
-enum hfi1_packet_filter_opcodes {
-	FILTER_BY_LID,
-	FILTER_BY_DLID,
-	FILTER_BY_MAD_MGMT_CLASS,
-	FILTER_BY_QP_NUMBER,
-	FILTER_BY_PKT_TYPE,
-	FILTER_BY_SERVICE_LEVEL,
-	FILTER_BY_PKEY,
-	FILTER_BY_DIRECTION,
-};
-
-static const struct file_operations snoop_file_ops = {
-	.owner = THIS_MODULE,
-	.open = hfi1_snoop_open,
-	.read = hfi1_snoop_read,
-	.unlocked_ioctl = hfi1_ioctl,
-	.poll = hfi1_snoop_poll,
-	.write = hfi1_snoop_write,
-	.release = hfi1_snoop_release
-};
-
-struct hfi1_filter_array {
-	int (*filter)(void *, void *, void *);
-};
-
-static int hfi1_filter_lid(void *ibhdr, void *packet_data, void *value);
-static int hfi1_filter_dlid(void *ibhdr, void *packet_data, void *value);
-static int hfi1_filter_mad_mgmt_class(void *ibhdr, void *packet_data,
-				      void *value);
-static int hfi1_filter_qp_number(void *ibhdr, void *packet_data, void *value);
-static int hfi1_filter_ibpacket_type(void *ibhdr, void *packet_data,
-				     void *value);
-static int hfi1_filter_ib_service_level(void *ibhdr, void *packet_data,
-					void *value);
-static int hfi1_filter_ib_pkey(void *ibhdr, void *packet_data, void *value);
-static int hfi1_filter_direction(void *ibhdr, void *packet_data, void *value);
-
-static const struct hfi1_filter_array hfi1_filters[] = {
-	{ hfi1_filter_lid },
-	{ hfi1_filter_dlid },
-	{ hfi1_filter_mad_mgmt_class },
-	{ hfi1_filter_qp_number },
-	{ hfi1_filter_ibpacket_type },
-	{ hfi1_filter_ib_service_level },
-	{ hfi1_filter_ib_pkey },
-	{ hfi1_filter_direction },
-};
-
-#define HFI1_MAX_FILTERS	ARRAY_SIZE(hfi1_filters)
-#define HFI1_DIAG_MINOR_BASE	129
-
-static int hfi1_snoop_add(struct hfi1_devdata *dd, const char *name);
-
-int hfi1_diag_add(struct hfi1_devdata *dd)
-{
-	char name[16];
-	int ret = 0;
-
-	snprintf(name, sizeof(name), "%s_diagpkt%d", class_name(),
-		 dd->unit);
-	/*
-	 * Do this for each device as opposed to the normal diagpkt
-	 * interface which is one per host
-	 */
-	ret = hfi1_snoop_add(dd, name);
-	if (ret)
-		dd_dev_err(dd, "Unable to init snoop/capture device");
-
-	snprintf(name, sizeof(name), "%s_diagpkt", class_name());
-	if (atomic_inc_return(&diagpkt_count) == 1) {
-		ret = hfi1_cdev_init(HFI1_DIAGPKT_MINOR, name,
-				     &diagpkt_file_ops, &diagpkt_cdev,
-				     &diagpkt_device, false);
-	}
-
-	return ret;
-}
-
-/* this must be called w/ dd->snoop_in_lock held */
-static void drain_snoop_list(struct list_head *queue)
-{
-	struct list_head *pos, *q;
-	struct snoop_packet *packet;
-
-	list_for_each_safe(pos, q, queue) {
-		packet = list_entry(pos, struct snoop_packet, list);
-		list_del(pos);
-		kfree(packet);
-	}
-}
-
-static void hfi1_snoop_remove(struct hfi1_devdata *dd)
-{
-	unsigned long flags = 0;
-
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-	drain_snoop_list(&dd->hfi1_snoop.queue);
-	hfi1_cdev_cleanup(&dd->hfi1_snoop.cdev, &dd->hfi1_snoop.class_dev);
-	spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-}
-
-void hfi1_diag_remove(struct hfi1_devdata *dd)
-{
-	hfi1_snoop_remove(dd);
-	if (atomic_dec_and_test(&diagpkt_count))
-		hfi1_cdev_cleanup(&diagpkt_cdev, &diagpkt_device);
-	hfi1_cdev_cleanup(&dd->diag_cdev, &dd->diag_device);
-}
-
-/*
- * Allocated structure shared between the credit return mechanism and
- * diagpkt_send().
- */
-struct diagpkt_wait {
-	struct completion credits_returned;
-	int code;
-	atomic_t count;
-};
-
-/*
- * When each side is finished with the structure, they call this.
- * The last user frees the structure.
- */
-static void put_diagpkt_wait(struct diagpkt_wait *wait)
-{
-	if (atomic_dec_and_test(&wait->count))
-		kfree(wait);
-}
-
-/*
- * Callback from the credit return code.  Set the complete, which
- * will let diapkt_send() continue.
- */
-static void diagpkt_complete(void *arg, int code)
-{
-	struct diagpkt_wait *wait = (struct diagpkt_wait *)arg;
-
-	wait->code = code;
-	complete(&wait->credits_returned);
-	put_diagpkt_wait(wait);	/* finished with the structure */
-}
-
-/**
- * diagpkt_send - send a packet
- * @dp: diag packet descriptor
- */
-static ssize_t diagpkt_send(struct diag_pkt *dp)
-{
-	struct hfi1_devdata *dd;
-	struct send_context *sc;
-	struct pio_buf *pbuf;
-	u32 *tmpbuf = NULL;
-	ssize_t ret = 0;
-	u32 pkt_len, total_len;
-	pio_release_cb credit_cb = NULL;
-	void *credit_arg = NULL;
-	struct diagpkt_wait *wait = NULL;
-	int trycount = 0;
-
-	dd = hfi1_lookup(dp->unit);
-	if (!dd || !(dd->flags & HFI1_PRESENT) || !dd->kregbase) {
-		ret = -ENODEV;
-		goto bail;
-	}
-	if (!(dd->flags & HFI1_INITTED)) {
-		/* no hardware, freeze, etc. */
-		ret = -ENODEV;
-		goto bail;
-	}
-
-	if (dp->version != _DIAG_PKT_VERS) {
-		dd_dev_err(dd, "Invalid version %u for diagpkt_write\n",
-			   dp->version);
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	/* send count must be an exact number of dwords */
-	if (dp->len & 3) {
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	/* there is only port 1 */
-	if (dp->port != 1) {
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	/* need a valid context */
-	if (dp->sw_index >= dd->num_send_contexts) {
-		ret = -EINVAL;
-		goto bail;
-	}
-	/* can only use kernel contexts */
-	if (dd->send_contexts[dp->sw_index].type != SC_KERNEL &&
-	    dd->send_contexts[dp->sw_index].type != SC_VL15) {
-		ret = -EINVAL;
-		goto bail;
-	}
-	/* must be allocated */
-	sc = dd->send_contexts[dp->sw_index].sc;
-	if (!sc) {
-		ret = -EINVAL;
-		goto bail;
-	}
-	/* must be enabled */
-	if (!(sc->flags & SCF_ENABLED)) {
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	/* allocate a buffer and copy the data in */
-	tmpbuf = vmalloc(dp->len);
-	if (!tmpbuf) {
-		ret = -ENOMEM;
-		goto bail;
-	}
-
-	if (copy_from_user(tmpbuf,
-			   (const void __user *)(unsigned long)dp->data,
-			   dp->len)) {
-		ret = -EFAULT;
-		goto bail;
-	}
-
-	/*
-	 * pkt_len is how much data we have to write, includes header and data.
-	 * total_len is length of the packet in Dwords plus the PBC should not
-	 * include the CRC.
-	 */
-	pkt_len = dp->len >> 2;
-	total_len = pkt_len + 2; /* PBC + packet */
-
-	/* if 0, fill in a default */
-	if (dp->pbc == 0) {
-		struct hfi1_pportdata *ppd = dd->pport;
-
-		hfi1_cdbg(PKT, "Generating PBC");
-		dp->pbc = create_pbc(ppd, 0, 0, 0, total_len);
-	} else {
-		hfi1_cdbg(PKT, "Using passed in PBC");
-	}
-
-	hfi1_cdbg(PKT, "Egress PBC content is 0x%llx", dp->pbc);
-
-	/*
-	 * The caller wants to wait until the packet is sent and to
-	 * check for errors.  The best we can do is wait until
-	 * the buffer credits are returned and check if any packet
-	 * error has occurred.  If there are any late errors, this
-	 * could miss it.  If there are other senders who generate
-	 * an error, this may find it.  However, in general, it
-	 * should catch most.
-	 */
-	if (dp->flags & F_DIAGPKT_WAIT) {
-		/* always force a credit return */
-		dp->pbc |= PBC_CREDIT_RETURN;
-		/* turn on credit return interrupts */
-		sc_add_credit_return_intr(sc);
-		wait = kmalloc(sizeof(*wait), GFP_KERNEL);
-		if (!wait) {
-			ret = -ENOMEM;
-			goto bail;
-		}
-		init_completion(&wait->credits_returned);
-		atomic_set(&wait->count, 2);
-		wait->code = PRC_OK;
-
-		credit_cb = diagpkt_complete;
-		credit_arg = wait;
-	}
-
-retry:
-	pbuf = sc_buffer_alloc(sc, total_len, credit_cb, credit_arg);
-	if (!pbuf) {
-		if (trycount == 0) {
-			/* force a credit return and try again */
-			sc_return_credits(sc);
-			trycount = 1;
-			goto retry;
-		}
-		/*
-		 * No send buffer means no credit callback.  Undo
-		 * the wait set-up that was done above.  We free wait
-		 * because the callback will never be called.
-		 */
-		if (dp->flags & F_DIAGPKT_WAIT) {
-			sc_del_credit_return_intr(sc);
-			kfree(wait);
-			wait = NULL;
-		}
-		ret = -ENOSPC;
-		goto bail;
-	}
-
-	pio_copy(dd, pbuf, dp->pbc, tmpbuf, pkt_len);
-	/* no flush needed as the HW knows the packet size */
-
-	ret = sizeof(*dp);
-
-	if (dp->flags & F_DIAGPKT_WAIT) {
-		/* wait for credit return */
-		ret = wait_for_completion_interruptible(
-						&wait->credits_returned);
-		/*
-		 * If the wait returns an error, the wait was interrupted,
-		 * e.g. with a ^C in the user program.  The callback is
-		 * still pending.  This is OK as the wait structure is
-		 * kmalloc'ed and the structure will free itself when
-		 * all users are done with it.
-		 *
-		 * A context disable occurs on a send context restart, so
-		 * include that in the list of errors below to check for.
-		 * NOTE: PRC_FILL_ERR is at best informational and cannot
-		 * be depended on.
-		 */
-		if (!ret && (((wait->code & PRC_STATUS_ERR) ||
-			      (wait->code & PRC_FILL_ERR) ||
-			      (wait->code & PRC_SC_DISABLE))))
-			ret = -EIO;
-
-		put_diagpkt_wait(wait);	/* finished with the structure */
-		sc_del_credit_return_intr(sc);
-	}
-
-bail:
-	vfree(tmpbuf);
-	return ret;
-}
-
-static ssize_t diagpkt_write(struct file *fp, const char __user *data,
-			     size_t count, loff_t *off)
-{
-	struct hfi1_devdata *dd;
-	struct send_context *sc;
-	u8 vl;
-
-	struct diag_pkt dp;
-
-	if (count != sizeof(dp))
-		return -EINVAL;
-
-	if (copy_from_user(&dp, data, sizeof(dp)))
-		return -EFAULT;
-
-	/*
-	* The Send Context is derived from the PbcVL value
-	* if PBC is populated
-	*/
-	if (dp.pbc) {
-		dd = hfi1_lookup(dp.unit);
-		if (!dd)
-			return -ENODEV;
-		vl = (dp.pbc >> PBC_VL_SHIFT) & PBC_VL_MASK;
-		sc = dd->vld[vl].sc;
-		if (sc) {
-			dp.sw_index = sc->sw_index;
-			hfi1_cdbg(
-			       PKT,
-			       "Packet sent over VL %d via Send Context %u(%u)",
-			       vl, sc->sw_index, sc->hw_context);
-		}
-	}
-
-	return diagpkt_send(&dp);
-}
-
-static int hfi1_snoop_add(struct hfi1_devdata *dd, const char *name)
-{
-	int ret = 0;
-
-	dd->hfi1_snoop.mode_flag = 0;
-	spin_lock_init(&dd->hfi1_snoop.snoop_lock);
-	INIT_LIST_HEAD(&dd->hfi1_snoop.queue);
-	init_waitqueue_head(&dd->hfi1_snoop.waitq);
-
-	ret = hfi1_cdev_init(HFI1_SNOOP_CAPTURE_BASE + dd->unit, name,
-			     &snoop_file_ops,
-			     &dd->hfi1_snoop.cdev, &dd->hfi1_snoop.class_dev,
-			     false);
-
-	if (ret) {
-		dd_dev_err(dd, "Couldn't create %s device: %d", name, ret);
-		hfi1_cdev_cleanup(&dd->hfi1_snoop.cdev,
-				  &dd->hfi1_snoop.class_dev);
-	}
-
-	return ret;
-}
-
-static struct hfi1_devdata *hfi1_dd_from_sc_inode(struct inode *in)
-{
-	int unit = iminor(in) - HFI1_SNOOP_CAPTURE_BASE;
-	struct hfi1_devdata *dd;
-
-	dd = hfi1_lookup(unit);
-	return dd;
-}
-
-/* clear or restore send context integrity checks */
-static void adjust_integrity_checks(struct hfi1_devdata *dd)
-{
-	struct send_context *sc;
-	unsigned long sc_flags;
-	int i;
-
-	spin_lock_irqsave(&dd->sc_lock, sc_flags);
-	for (i = 0; i < dd->num_send_contexts; i++) {
-		int enable;
-
-		sc = dd->send_contexts[i].sc;
-
-		if (!sc)
-			continue;	/* not allocated */
-
-		enable = likely(!HFI1_CAP_IS_KSET(NO_INTEGRITY)) &&
-			 dd->hfi1_snoop.mode_flag != HFI1_PORT_SNOOP_MODE;
-
-		set_pio_integrity(sc);
-
-		if (enable) /* take HFI_CAP_* flags into account */
-			hfi1_init_ctxt(sc);
-	}
-	spin_unlock_irqrestore(&dd->sc_lock, sc_flags);
-}
-
-static int hfi1_snoop_open(struct inode *in, struct file *fp)
-{
-	int ret;
-	int mode_flag = 0;
-	unsigned long flags = 0;
-	struct hfi1_devdata *dd;
-	struct list_head *queue;
-
-	mutex_lock(&hfi1_mutex);
-
-	dd = hfi1_dd_from_sc_inode(in);
-	if (!dd) {
-		ret = -ENODEV;
-		goto bail;
-	}
-
-	/*
-	 * File mode determines snoop or capture. Some existing user
-	 * applications expect the capture device to be able to be opened RDWR
-	 * because they expect a dedicated capture device. For this reason we
-	 * support a module param to force capture mode even if the file open
-	 * mode matches snoop.
-	 */
-	if ((fp->f_flags & O_ACCMODE) == O_RDONLY) {
-		snoop_dbg("Capture Enabled");
-		mode_flag = HFI1_PORT_CAPTURE_MODE;
-	} else if ((fp->f_flags & O_ACCMODE) == O_RDWR) {
-		snoop_dbg("Snoop Enabled");
-		mode_flag = HFI1_PORT_SNOOP_MODE;
-	} else {
-		snoop_dbg("Invalid");
-		ret =  -EINVAL;
-		goto bail;
-	}
-	queue = &dd->hfi1_snoop.queue;
-
-	/*
-	 * We are not supporting snoop and capture at the same time.
-	 */
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-	if (dd->hfi1_snoop.mode_flag) {
-		ret = -EBUSY;
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-		goto bail;
-	}
-
-	dd->hfi1_snoop.mode_flag = mode_flag;
-	drain_snoop_list(queue);
-
-	dd->hfi1_snoop.filter_callback = NULL;
-	dd->hfi1_snoop.filter_value = NULL;
-
-	/*
-	 * Send side packet integrity checks are not helpful when snooping so
-	 * disable and re-enable when we stop snooping.
-	 */
-	if (mode_flag == HFI1_PORT_SNOOP_MODE) {
-		/* clear after snoop mode is on */
-		adjust_integrity_checks(dd); /* clear */
-
-		/*
-		 * We also do not want to be doing the DLID LMC check for
-		 * ingressed packets.
-		 */
-		dd->hfi1_snoop.dcc_cfg = read_csr(dd, DCC_CFG_PORT_CONFIG1);
-		write_csr(dd, DCC_CFG_PORT_CONFIG1,
-			  (dd->hfi1_snoop.dcc_cfg >> 32) << 32);
-	}
-
-	/*
-	 * As soon as we set these function pointers the recv and send handlers
-	 * are active. This is a race condition so we must make sure to drain
-	 * the queue and init filter values above. Technically we should add
-	 * locking here but all that will happen is on recv a packet will get
-	 * allocated and get stuck on the snoop_lock before getting added to the
-	 * queue. Same goes for send.
-	 */
-	dd->rhf_rcv_function_map = snoop_rhf_rcv_functions;
-	dd->process_pio_send = snoop_send_pio_handler;
-	dd->process_dma_send = snoop_send_pio_handler;
-	dd->pio_inline_send = snoop_inline_pio_send;
-
-	spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-	ret = 0;
-
-bail:
-	mutex_unlock(&hfi1_mutex);
-
-	return ret;
-}
-
-static int hfi1_snoop_release(struct inode *in, struct file *fp)
-{
-	unsigned long flags = 0;
-	struct hfi1_devdata *dd;
-	int mode_flag;
-
-	dd = hfi1_dd_from_sc_inode(in);
-	if (!dd)
-		return -ENODEV;
-
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-
-	/* clear the snoop mode before re-adjusting send context CSRs */
-	mode_flag = dd->hfi1_snoop.mode_flag;
-	dd->hfi1_snoop.mode_flag = 0;
-
-	/*
-	 * Drain the queue and clear the filters we are done with it. Don't
-	 * forget to restore the packet integrity checks
-	 */
-	drain_snoop_list(&dd->hfi1_snoop.queue);
-	if (mode_flag == HFI1_PORT_SNOOP_MODE) {
-		/* restore after snoop mode is clear */
-		adjust_integrity_checks(dd); /* restore */
-
-		/*
-		 * Also should probably reset the DCC_CONFIG1 register for DLID
-		 * checking on incoming packets again. Use the value saved when
-		 * opening the snoop device.
-		 */
-		write_csr(dd, DCC_CFG_PORT_CONFIG1, dd->hfi1_snoop.dcc_cfg);
-	}
-
-	dd->hfi1_snoop.filter_callback = NULL;
-	kfree(dd->hfi1_snoop.filter_value);
-	dd->hfi1_snoop.filter_value = NULL;
-
-	/*
-	 * User is done snooping and capturing, return control to the normal
-	 * handler. Re-enable SDMA handling.
-	 */
-	dd->rhf_rcv_function_map = dd->normal_rhf_rcv_functions;
-	dd->process_pio_send = hfi1_verbs_send_pio;
-	dd->process_dma_send = hfi1_verbs_send_dma;
-	dd->pio_inline_send = pio_copy;
-
-	spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-
-	snoop_dbg("snoop/capture device released");
-
-	return 0;
-}
-
-static unsigned int hfi1_snoop_poll(struct file *fp,
-				    struct poll_table_struct *wait)
-{
-	int ret = 0;
-	unsigned long flags = 0;
-
-	struct hfi1_devdata *dd;
-
-	dd = hfi1_dd_from_sc_inode(fp->f_inode);
-	if (!dd)
-		return -ENODEV;
-
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-
-	poll_wait(fp, &dd->hfi1_snoop.waitq, wait);
-	if (!list_empty(&dd->hfi1_snoop.queue))
-		ret |= POLLIN | POLLRDNORM;
-
-	spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-	return ret;
-}
-
-static ssize_t hfi1_snoop_write(struct file *fp, const char __user *data,
-				size_t count, loff_t *off)
-{
-	struct diag_pkt dpkt;
-	struct hfi1_devdata *dd;
-	size_t ret;
-	u8 byte_two, sl, sc5, sc4, vl, byte_one;
-	struct send_context *sc;
-	u32 len;
-	u64 pbc;
-	struct hfi1_ibport *ibp;
-	struct hfi1_pportdata *ppd;
-
-	dd = hfi1_dd_from_sc_inode(fp->f_inode);
-	if (!dd)
-		return -ENODEV;
-
-	ppd = dd->pport;
-	snoop_dbg("received %lu bytes from user", count);
-
-	memset(&dpkt, 0, sizeof(struct diag_pkt));
-	dpkt.version = _DIAG_PKT_VERS;
-	dpkt.unit = dd->unit;
-	dpkt.port = 1;
-
-	if (likely(!(snoop_flags & SNOOP_USE_METADATA))) {
-		/*
-		* We need to generate the PBC and not let diagpkt_send do it,
-		* to do this we need the VL and the length in dwords.
-		* The VL can be determined by using the SL and looking up the
-		* SC. Then the SC can be converted into VL. The exception to
-		* this is those packets which are from an SMI queue pair.
-		* Since we can't detect anything about the QP here we have to
-		* rely on the SC. If its 0xF then we assume its SMI and
-		* do not look at the SL.
-		*/
-		if (copy_from_user(&byte_one, data, 1))
-			return -EINVAL;
-
-		if (copy_from_user(&byte_two, data + 1, 1))
-			return -EINVAL;
-
-		sc4 = (byte_one >> 4) & 0xf;
-		if (sc4 == 0xF) {
-			snoop_dbg("Detected VL15 packet ignoring SL in packet");
-			vl = sc4;
-		} else {
-			sl = (byte_two >> 4) & 0xf;
-			ibp = to_iport(&dd->verbs_dev.rdi.ibdev, 1);
-			sc5 = ibp->sl_to_sc[sl];
-			vl = sc_to_vlt(dd, sc5);
-			if (vl != sc4) {
-				snoop_dbg("VL %d does not match SC %d of packet",
-					  vl, sc4);
-				return -EINVAL;
-			}
-		}
-
-		sc = dd->vld[vl].sc; /* Look up the context based on VL */
-		if (sc) {
-			dpkt.sw_index = sc->sw_index;
-			snoop_dbg("Sending on context %u(%u)", sc->sw_index,
-				  sc->hw_context);
-		} else {
-			snoop_dbg("Could not find context for vl %d", vl);
-			return -EINVAL;
-		}
-
-		len = (count >> 2) + 2; /* Add in PBC */
-		pbc = create_pbc(ppd, 0, 0, vl, len);
-	} else {
-		if (copy_from_user(&pbc, data, sizeof(pbc)))
-			return -EINVAL;
-		vl = (pbc >> PBC_VL_SHIFT) & PBC_VL_MASK;
-		sc = dd->vld[vl].sc; /* Look up the context based on VL */
-		if (sc) {
-			dpkt.sw_index = sc->sw_index;
-		} else {
-			snoop_dbg("Could not find context for vl %d", vl);
-			return -EINVAL;
-		}
-		data += sizeof(pbc);
-		count -= sizeof(pbc);
-	}
-	dpkt.len = count;
-	dpkt.data = (unsigned long)data;
-
-	snoop_dbg("PBC: vl=0x%llx Length=0x%llx",
-		  (pbc >> 12) & 0xf,
-		  (pbc & 0xfff));
-
-	dpkt.pbc = pbc;
-	ret = diagpkt_send(&dpkt);
-	/*
-	 * diagpkt_send only returns number of bytes in the diagpkt so patch
-	 * that up here before returning.
-	 */
-	if (ret == sizeof(dpkt))
-		return count;
-
-	return ret;
-}
-
-static ssize_t hfi1_snoop_read(struct file *fp, char __user *data,
-			       size_t pkt_len, loff_t *off)
-{
-	ssize_t ret = 0;
-	unsigned long flags = 0;
-	struct snoop_packet *packet = NULL;
-	struct hfi1_devdata *dd;
-
-	dd = hfi1_dd_from_sc_inode(fp->f_inode);
-	if (!dd)
-		return -ENODEV;
-
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-
-	while (list_empty(&dd->hfi1_snoop.queue)) {
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-
-		if (fp->f_flags & O_NONBLOCK)
-			return -EAGAIN;
-
-		if (wait_event_interruptible(
-				dd->hfi1_snoop.waitq,
-				!list_empty(&dd->hfi1_snoop.queue)))
-			return -EINTR;
-
-		spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-	}
-
-	if (!list_empty(&dd->hfi1_snoop.queue)) {
-		packet = list_entry(dd->hfi1_snoop.queue.next,
-				    struct snoop_packet, list);
-		list_del(&packet->list);
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-		if (pkt_len >= packet->total_len) {
-			if (copy_to_user(data, packet->data,
-					 packet->total_len))
-				ret = -EFAULT;
-			else
-				ret = packet->total_len;
-		} else {
-			ret = -EINVAL;
-		}
-
-		kfree(packet);
-	} else {
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-	}
-
-	return ret;
-}
-
-/**
- * hfi1_assign_snoop_link_credits -- Set up credits for VL15 and others
- * @ppd : ptr to hfi1 port data
- * @value : options from user space
- *
- * Assumes the rest of the CM credit registers are zero from a
- * previous global or credit reset.
- * Leave shared count at zero for both global and all vls.
- * In snoop mode ideally we don't use shared credits
- * Reserve 8.5k for VL15
- * If total credits less than 8.5kbytes return error.
- * Divide the rest of the credits across VL0 to VL7 and if
- * each of these levels has less than 34 credits (at least 2048 + 128 bytes)
- * return with an error.
- * The credit registers will be reset to zero on link negotiation or link up
- * so this function should be activated from user space only if the port has
- * gone past link negotiation and link up.
- *
- * Return -- 0 if successful else error condition
- *
- */
-static long hfi1_assign_snoop_link_credits(struct hfi1_pportdata *ppd,
-					   int value)
-{
-#define  OPA_MIN_PER_VL_CREDITS  34  /* 2048 + 128 bytes */
-	struct buffer_control t;
-	int i;
-	struct hfi1_devdata *dd = ppd->dd;
-	u16  total_credits = (value >> 16) & 0xffff;
-	u16  vl15_credits = dd->vl15_init / 2;
-	u16  per_vl_credits;
-	__be16 be_per_vl_credits;
-
-	if (ppd->host_link_state & HLS_DOWN)
-		goto err_exit;
-	if (total_credits  <  vl15_credits)
-		goto err_exit;
-
-	per_vl_credits = (total_credits - vl15_credits) / TXE_NUM_DATA_VL;
-
-	if (per_vl_credits < OPA_MIN_PER_VL_CREDITS)
-		goto err_exit;
-
-	memset(&t, 0, sizeof(t));
-	be_per_vl_credits = cpu_to_be16(per_vl_credits);
-
-	for (i = 0; i < TXE_NUM_DATA_VL; i++)
-		t.vl[i].dedicated = be_per_vl_credits;
-
-	t.vl[15].dedicated  = cpu_to_be16(vl15_credits);
-	return set_buffer_control(ppd, &t);
-
-err_exit:
-	snoop_dbg("port_state = 0x%x, total_credits = %d, vl15_credits = %d",
-		  ppd->host_link_state, total_credits, vl15_credits);
-
-	return -EINVAL;
-}
-
-static long hfi1_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
-{
-	struct hfi1_devdata *dd;
-	void *filter_value = NULL;
-	long ret = 0;
-	int value = 0;
-	u8 phys_state = 0;
-	u8 link_state = 0;
-	u16 dev_state = 0;
-	unsigned long flags = 0;
-	unsigned long *argp = NULL;
-	struct hfi1_packet_filter_command filter_cmd = {0};
-	int mode_flag = 0;
-	struct hfi1_pportdata *ppd = NULL;
-	unsigned int index;
-	struct hfi1_link_info link_info;
-	int read_cmd, write_cmd, read_ok, write_ok;
-
-	dd = hfi1_dd_from_sc_inode(fp->f_inode);
-	if (!dd)
-		return -ENODEV;
-
-	mode_flag = dd->hfi1_snoop.mode_flag;
-	read_cmd = _IOC_DIR(cmd) & _IOC_READ;
-	write_cmd = _IOC_DIR(cmd) & _IOC_WRITE;
-	write_ok = access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd));
-	read_ok = access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd));
-
-	if ((read_cmd && !write_ok) || (write_cmd && !read_ok))
-		return -EFAULT;
-
-	if (!capable(CAP_SYS_ADMIN))
-		return -EPERM;
-
-	if ((mode_flag & HFI1_PORT_CAPTURE_MODE) &&
-	    (cmd != HFI1_SNOOP_IOCCLEARQUEUE) &&
-	    (cmd != HFI1_SNOOP_IOCCLEARFILTER) &&
-	    (cmd != HFI1_SNOOP_IOCSETFILTER))
-		/* Capture devices are allowed only 3 operations
-		 * 1.Clear capture queue
-		 * 2.Clear capture filter
-		 * 3.Set capture filter
-		 * Other are invalid.
-		 */
-		return -EINVAL;
-
-	switch (cmd) {
-	case HFI1_SNOOP_IOCSETLINKSTATE_EXTRA:
-		memset(&link_info, 0, sizeof(link_info));
-
-		if (copy_from_user(&link_info,
-				   (struct hfi1_link_info __user *)arg,
-				   sizeof(link_info)))
-			return -EFAULT;
-
-		value = link_info.port_state;
-		index = link_info.port_number;
-		if (index > dd->num_pports - 1)
-			return -EINVAL;
-
-		ppd = &dd->pport[index];
-		if (!ppd)
-			return -EINVAL;
-
-		/* What we want to transition to */
-		phys_state = (value >> 4) & 0xF;
-		link_state = value & 0xF;
-		snoop_dbg("Setting link state 0x%x", value);
-
-		switch (link_state) {
-		case IB_PORT_NOP:
-			if (phys_state == 0)
-				break;
-				/* fall through */
-		case IB_PORT_DOWN:
-			switch (phys_state) {
-			case 0:
-				dev_state = HLS_DN_DOWNDEF;
-				break;
-			case 2:
-				dev_state = HLS_DN_POLL;
-				break;
-			case 3:
-				dev_state = HLS_DN_DISABLE;
-				break;
-			default:
-				return -EINVAL;
-			}
-			ret = set_link_state(ppd, dev_state);
-			break;
-		case IB_PORT_ARMED:
-			ret = set_link_state(ppd, HLS_UP_ARMED);
-			if (!ret)
-				send_idle_sma(dd, SMA_IDLE_ARM);
-			break;
-		case IB_PORT_ACTIVE:
-			ret = set_link_state(ppd, HLS_UP_ACTIVE);
-			if (!ret)
-				send_idle_sma(dd, SMA_IDLE_ACTIVE);
-			break;
-		default:
-			return -EINVAL;
-		}
-
-		if (ret)
-			break;
-		/* fall through */
-	case HFI1_SNOOP_IOCGETLINKSTATE:
-	case HFI1_SNOOP_IOCGETLINKSTATE_EXTRA:
-		if (cmd == HFI1_SNOOP_IOCGETLINKSTATE_EXTRA) {
-			memset(&link_info, 0, sizeof(link_info));
-			if (copy_from_user(&link_info,
-					   (struct hfi1_link_info __user *)arg,
-					   sizeof(link_info)))
-				return -EFAULT;
-			index = link_info.port_number;
-		} else {
-			ret = __get_user(index, (int __user *)arg);
-			if (ret !=  0)
-				break;
-		}
-
-		if (index > dd->num_pports - 1)
-			return -EINVAL;
-
-		ppd = &dd->pport[index];
-		if (!ppd)
-			return -EINVAL;
-
-		value = hfi1_ibphys_portstate(ppd);
-		value <<= 4;
-		value |= driver_lstate(ppd);
-
-		snoop_dbg("Link port | Link State: %d", value);
-
-		if ((cmd == HFI1_SNOOP_IOCGETLINKSTATE_EXTRA) ||
-		    (cmd == HFI1_SNOOP_IOCSETLINKSTATE_EXTRA)) {
-			link_info.port_state = value;
-			link_info.node_guid = cpu_to_be64(ppd->guid);
-			link_info.link_speed_active =
-						ppd->link_speed_active;
-			link_info.link_width_active =
-						ppd->link_width_active;
-			if (copy_to_user((struct hfi1_link_info __user *)arg,
-					 &link_info, sizeof(link_info)))
-				return -EFAULT;
-		} else {
-			ret = __put_user(value, (int __user *)arg);
-		}
-		break;
-
-	case HFI1_SNOOP_IOCCLEARQUEUE:
-		snoop_dbg("Clearing snoop queue");
-		spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-		drain_snoop_list(&dd->hfi1_snoop.queue);
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-		break;
-
-	case HFI1_SNOOP_IOCCLEARFILTER:
-		snoop_dbg("Clearing filter");
-		spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-		if (dd->hfi1_snoop.filter_callback) {
-			/* Drain packets first */
-			drain_snoop_list(&dd->hfi1_snoop.queue);
-			dd->hfi1_snoop.filter_callback = NULL;
-		}
-		kfree(dd->hfi1_snoop.filter_value);
-		dd->hfi1_snoop.filter_value = NULL;
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-		break;
-
-	case HFI1_SNOOP_IOCSETFILTER:
-		snoop_dbg("Setting filter");
-		/* just copy command structure */
-		argp = (unsigned long *)arg;
-		if (copy_from_user(&filter_cmd, (void __user *)argp,
-				   sizeof(filter_cmd)))
-			return -EFAULT;
-
-		if (filter_cmd.opcode >= HFI1_MAX_FILTERS) {
-			pr_alert("Invalid opcode in request\n");
-			return -EINVAL;
-		}
-
-		snoop_dbg("Opcode %d Len %d Ptr %p",
-			  filter_cmd.opcode, filter_cmd.length,
-			  filter_cmd.value_ptr);
-
-		filter_value = kcalloc(filter_cmd.length, sizeof(u8),
-				       GFP_KERNEL);
-		if (!filter_value)
-			return -ENOMEM;
-
-		/* copy remaining data from userspace */
-		if (copy_from_user((u8 *)filter_value,
-				   (void __user *)filter_cmd.value_ptr,
-				   filter_cmd.length)) {
-			kfree(filter_value);
-			return -EFAULT;
-		}
-		/* Drain packets first */
-		spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-		drain_snoop_list(&dd->hfi1_snoop.queue);
-		dd->hfi1_snoop.filter_callback =
-			hfi1_filters[filter_cmd.opcode].filter;
-		/* just in case we see back to back sets */
-		kfree(dd->hfi1_snoop.filter_value);
-		dd->hfi1_snoop.filter_value = filter_value;
-		spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-		break;
-	case HFI1_SNOOP_IOCGETVERSION:
-		value = SNOOP_CAPTURE_VERSION;
-		snoop_dbg("Getting version: %d", value);
-		ret = __put_user(value, (int __user *)arg);
-		break;
-	case HFI1_SNOOP_IOCSET_OPTS:
-		snoop_flags = 0;
-		ret = __get_user(value, (int __user *)arg);
-		if (ret != 0)
-			break;
-
-		snoop_dbg("Setting snoop option %d", value);
-		if (value & SNOOP_DROP_SEND)
-			snoop_flags |= SNOOP_DROP_SEND;
-		if (value & SNOOP_USE_METADATA)
-			snoop_flags |= SNOOP_USE_METADATA;
-		if (value & (SNOOP_SET_VL0TOVL15)) {
-			ppd = &dd->pport[0];  /* first port will do */
-			ret = hfi1_assign_snoop_link_credits(ppd, value);
-		}
-		break;
-	default:
-		return -ENOTTY;
-	}
-
-	return ret;
-}
-
-static void snoop_list_add_tail(struct snoop_packet *packet,
-				struct hfi1_devdata *dd)
-{
-	unsigned long flags = 0;
-
-	spin_lock_irqsave(&dd->hfi1_snoop.snoop_lock, flags);
-	if (likely((dd->hfi1_snoop.mode_flag & HFI1_PORT_SNOOP_MODE) ||
-		   (dd->hfi1_snoop.mode_flag & HFI1_PORT_CAPTURE_MODE))) {
-		list_add_tail(&packet->list, &dd->hfi1_snoop.queue);
-		snoop_dbg("Added packet to list");
-	}
-
-	/*
-	 * Technically we can could have closed the snoop device while waiting
-	 * on the above lock and it is gone now. The snoop mode_flag will
-	 * prevent us from adding the packet to the queue though.
-	 */
-
-	spin_unlock_irqrestore(&dd->hfi1_snoop.snoop_lock, flags);
-	wake_up_interruptible(&dd->hfi1_snoop.waitq);
-}
-
-static inline int hfi1_filter_check(void *val, const char *msg)
-{
-	if (!val) {
-		snoop_dbg("Error invalid %s value for filter", msg);
-		return HFI1_FILTER_ERR;
-	}
-	return 0;
-}
-
-static int hfi1_filter_lid(void *ibhdr, void *packet_data, void *value)
-{
-	struct hfi1_ib_header *hdr;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	if (*((u16 *)value) == be16_to_cpu(hdr->lrh[3])) /* matches slid */
-		return HFI1_FILTER_HIT; /* matched */
-
-	return HFI1_FILTER_MISS; /* Not matched */
-}
-
-static int hfi1_filter_dlid(void *ibhdr, void *packet_data, void *value)
-{
-	struct hfi1_ib_header *hdr;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	if (*((u16 *)value) == be16_to_cpu(hdr->lrh[1]))
-		return HFI1_FILTER_HIT;
-
-	return HFI1_FILTER_MISS;
-}
-
-/* Not valid for outgoing packets, send handler passes null for data*/
-static int hfi1_filter_mad_mgmt_class(void *ibhdr, void *packet_data,
-				      void *value)
-{
-	struct hfi1_ib_header *hdr;
-	struct hfi1_other_headers *ohdr = NULL;
-	struct ib_smp *smp = NULL;
-	u32 qpn = 0;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(packet_data, "packet_data");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	/* Check for GRH */
-	if ((be16_to_cpu(hdr->lrh[0]) & 3) == HFI1_LRH_BTH)
-		ohdr = &hdr->u.oth; /* LRH + BTH + DETH */
-	else
-		ohdr = &hdr->u.l.oth; /* LRH + GRH + BTH + DETH */
-
-	qpn = be32_to_cpu(ohdr->bth[1]) & 0x00FFFFFF;
-	if (qpn <= 1) {
-		smp = (struct ib_smp *)packet_data;
-		if (*((u8 *)value) == smp->mgmt_class)
-			return HFI1_FILTER_HIT;
-		else
-			return HFI1_FILTER_MISS;
-	}
-	return HFI1_FILTER_ERR;
-}
-
-static int hfi1_filter_qp_number(void *ibhdr, void *packet_data, void *value)
-{
-	struct hfi1_ib_header *hdr;
-	struct hfi1_other_headers *ohdr = NULL;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	/* Check for GRH */
-	if ((be16_to_cpu(hdr->lrh[0]) & 3) == HFI1_LRH_BTH)
-		ohdr = &hdr->u.oth; /* LRH + BTH + DETH */
-	else
-		ohdr = &hdr->u.l.oth; /* LRH + GRH + BTH + DETH */
-	if (*((u32 *)value) == (be32_to_cpu(ohdr->bth[1]) & 0x00FFFFFF))
-		return HFI1_FILTER_HIT;
-
-	return HFI1_FILTER_MISS;
-}
-
-static int hfi1_filter_ibpacket_type(void *ibhdr, void *packet_data,
-				     void *value)
-{
-	u32 lnh = 0;
-	u8 opcode = 0;
-	struct hfi1_ib_header *hdr;
-	struct hfi1_other_headers *ohdr = NULL;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	lnh = (be16_to_cpu(hdr->lrh[0]) & 3);
-
-	if (lnh == HFI1_LRH_BTH)
-		ohdr = &hdr->u.oth;
-	else if (lnh == HFI1_LRH_GRH)
-		ohdr = &hdr->u.l.oth;
-	else
-		return HFI1_FILTER_ERR;
-
-	opcode = be32_to_cpu(ohdr->bth[0]) >> 24;
-
-	if (*((u8 *)value) == ((opcode >> 5) & 0x7))
-		return HFI1_FILTER_HIT;
-
-	return HFI1_FILTER_MISS;
-}
-
-static int hfi1_filter_ib_service_level(void *ibhdr, void *packet_data,
-					void *value)
-{
-	struct hfi1_ib_header *hdr;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	if ((*((u8 *)value)) == ((be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF))
-		return HFI1_FILTER_HIT;
-
-	return HFI1_FILTER_MISS;
-}
-
-static int hfi1_filter_ib_pkey(void *ibhdr, void *packet_data, void *value)
-{
-	u32 lnh = 0;
-	struct hfi1_ib_header *hdr;
-	struct hfi1_other_headers *ohdr = NULL;
-	int ret;
-
-	ret = hfi1_filter_check(ibhdr, "header");
-	if (ret)
-		return ret;
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	hdr = (struct hfi1_ib_header *)ibhdr;
-
-	lnh = (be16_to_cpu(hdr->lrh[0]) & 3);
-	if (lnh == HFI1_LRH_BTH)
-		ohdr = &hdr->u.oth;
-	else if (lnh == HFI1_LRH_GRH)
-		ohdr = &hdr->u.l.oth;
-	else
-		return HFI1_FILTER_ERR;
-
-	/* P_key is 16-bit entity, however top most bit indicates
-	 * type of membership. 0 for limited and 1 for Full.
-	 * Limited members cannot accept information from other
-	 * Limited members, but communication is allowed between
-	 * every other combination of membership.
-	 * Hence we'll omit comparing top-most bit while filtering
-	 */
-
-	if ((*(u16 *)value & 0x7FFF) ==
-		((be32_to_cpu(ohdr->bth[0])) & 0x7FFF))
-		return HFI1_FILTER_HIT;
-
-	return HFI1_FILTER_MISS;
-}
-
-/*
- * If packet_data is NULL then this is coming from one of the send functions.
- * Thus we know if its an ingressed or egressed packet.
- */
-static int hfi1_filter_direction(void *ibhdr, void *packet_data, void *value)
-{
-	u8 user_dir = *(u8 *)value;
-	int ret;
-
-	ret = hfi1_filter_check(value, "user");
-	if (ret)
-		return ret;
-
-	if (packet_data) {
-		/* Incoming packet */
-		if (user_dir & HFI1_SNOOP_INGRESS)
-			return HFI1_FILTER_HIT;
-	} else {
-		/* Outgoing packet */
-		if (user_dir & HFI1_SNOOP_EGRESS)
-			return HFI1_FILTER_HIT;
-	}
-
-	return HFI1_FILTER_MISS;
-}
-
-/*
- * Allocate a snoop packet. The structure that is stored in the ring buffer, not
- * to be confused with an hfi packet type.
- */
-static struct snoop_packet *allocate_snoop_packet(u32 hdr_len,
-						  u32 data_len,
-						  u32 md_len)
-{
-	struct snoop_packet *packet;
-
-	packet = kzalloc(sizeof(*packet) + hdr_len + data_len
-			 + md_len,
-			 GFP_ATOMIC | __GFP_NOWARN);
-	if (likely(packet))
-		INIT_LIST_HEAD(&packet->list);
-
-	return packet;
-}
-
-/*
- * Instead of having snoop and capture code intermixed with the recv functions,
- * both the interrupt handler and hfi1_ib_rcv() we are going to hijack the call
- * and land in here for snoop/capture but if not enabled the call will go
- * through as before. This gives us a single point to constrain all of the snoop
- * snoop recv logic. There is nothing special that needs to happen for bypass
- * packets. This routine should not try to look into the packet. It just copied
- * it. There is no guarantee for filters when it comes to bypass packets as
- * there is no specific support. Bottom line is this routine does now even know
- * what a bypass packet is.
- */
-int snoop_recv_handler(struct hfi1_packet *packet)
-{
-	struct hfi1_pportdata *ppd = packet->rcd->ppd;
-	struct hfi1_ib_header *hdr = packet->hdr;
-	int header_size = packet->hlen;
-	void *data = packet->ebuf;
-	u32 tlen = packet->tlen;
-	struct snoop_packet *s_packet = NULL;
-	int ret;
-	int snoop_mode = 0;
-	u32 md_len = 0;
-	struct capture_md md;
-
-	snoop_dbg("PACKET IN: hdr size %d tlen %d data %p", header_size, tlen,
-		  data);
-
-	trace_snoop_capture(ppd->dd, header_size, hdr, tlen - header_size,
-			    data);
-
-	if (!ppd->dd->hfi1_snoop.filter_callback) {
-		snoop_dbg("filter not set");
-		ret = HFI1_FILTER_HIT;
-	} else {
-		ret = ppd->dd->hfi1_snoop.filter_callback(hdr, data,
-					ppd->dd->hfi1_snoop.filter_value);
-	}
-
-	switch (ret) {
-	case HFI1_FILTER_ERR:
-		snoop_dbg("Error in filter call");
-		break;
-	case HFI1_FILTER_MISS:
-		snoop_dbg("Filter Miss");
-		break;
-	case HFI1_FILTER_HIT:
-
-		if (ppd->dd->hfi1_snoop.mode_flag & HFI1_PORT_SNOOP_MODE)
-			snoop_mode = 1;
-		if ((snoop_mode == 0) ||
-		    unlikely(snoop_flags & SNOOP_USE_METADATA))
-			md_len = sizeof(struct capture_md);
-
-		s_packet = allocate_snoop_packet(header_size,
-						 tlen - header_size,
-						 md_len);
-
-		if (unlikely(!s_packet)) {
-			dd_dev_warn_ratelimited(ppd->dd, "Unable to allocate snoop/capture packet\n");
-			break;
-		}
-
-		if (md_len > 0) {
-			memset(&md, 0, sizeof(struct capture_md));
-			md.port = 1;
-			md.dir = PKT_DIR_INGRESS;
-			md.u.rhf = packet->rhf;
-			memcpy(s_packet->data, &md, md_len);
-		}
-
-		/* We should always have a header */
-		if (hdr) {
-			memcpy(s_packet->data + md_len, hdr, header_size);
-		} else {
-			dd_dev_err(ppd->dd, "Unable to copy header to snoop/capture packet\n");
-			kfree(s_packet);
-			break;
-		}
-
-		/*
-		 * Packets with no data are possible. If there is no data needed
-		 * to take care of the last 4 bytes which are normally included
-		 * with data buffers and are included in tlen.  Since we kzalloc
-		 * the buffer we do not need to set any values but if we decide
-		 * not to use kzalloc we should zero them.
-		 */
-		if (data)
-			memcpy(s_packet->data + header_size + md_len, data,
-			       tlen - header_size);
-
-		s_packet->total_len = tlen + md_len;
-		snoop_list_add_tail(s_packet, ppd->dd);
-
-		/*
-		 * If we are snooping the packet not capturing then throw away
-		 * after adding to the list.
-		 */
-		snoop_dbg("Capturing packet");
-		if (ppd->dd->hfi1_snoop.mode_flag & HFI1_PORT_SNOOP_MODE) {
-			snoop_dbg("Throwing packet away");
-			/*
-			 * If we are dropping the packet we still may need to
-			 * handle the case where error flags are set, this is
-			 * normally done by the type specific handler but that
-			 * won't be called in this case.
-			 */
-			if (unlikely(rhf_err_flags(packet->rhf)))
-				handle_eflags(packet);
-
-			/* throw the packet on the floor */
-			return RHF_RCV_CONTINUE;
-		}
-		break;
-	default:
-		break;
-	}
-
-	/*
-	 * We do not care what type of packet came in here - just pass it off
-	 * to the normal handler.
-	 */
-	return ppd->dd->normal_rhf_rcv_functions[rhf_rcv_type(packet->rhf)]
-			(packet);
-}
-
-/*
- * Handle snooping and capturing packets when sdma is being used.
- */
-int snoop_send_dma_handler(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
-			   u64 pbc)
-{
-	pr_alert("Snooping/Capture of Send DMA Packets Is Not Supported!\n");
-	snoop_dbg("Unsupported Operation");
-	return hfi1_verbs_send_dma(qp, ps, 0);
-}
-
-/*
- * Handle snooping and capturing packets when pio is being used. Does not handle
- * bypass packets. The only way to send a bypass packet currently is to use the
- * diagpkt interface. When that interface is enable snoop/capture is not.
- */
-int snoop_send_pio_handler(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
-			   u64 pbc)
-{
-	u32 hdrwords = qp->s_hdrwords;
-	struct rvt_sge_state *ss = qp->s_cur_sge;
-	u32 len = qp->s_cur_size;
-	u32 dwords = (len + 3) >> 2;
-	u32 plen = hdrwords + dwords + 2; /* includes pbc */
-	struct hfi1_pportdata *ppd = ps->ppd;
-	struct snoop_packet *s_packet = NULL;
-	u32 *hdr = (u32 *)&ps->s_txreq->phdr.hdr;
-	u32 length = 0;
-	struct rvt_sge_state temp_ss;
-	void *data = NULL;
-	void *data_start = NULL;
-	int ret;
-	int snoop_mode = 0;
-	int md_len = 0;
-	struct capture_md md;
-	u32 vl;
-	u32 hdr_len = hdrwords << 2;
-	u32 tlen = HFI1_GET_PKT_LEN(&ps->s_txreq->phdr.hdr);
-
-	md.u.pbc = 0;
-
-	snoop_dbg("PACKET OUT: hdrword %u len %u plen %u dwords %u tlen %u",
-		  hdrwords, len, plen, dwords, tlen);
-	if (ppd->dd->hfi1_snoop.mode_flag & HFI1_PORT_SNOOP_MODE)
-		snoop_mode = 1;
-	if ((snoop_mode == 0) ||
-	    unlikely(snoop_flags & SNOOP_USE_METADATA))
-		md_len = sizeof(struct capture_md);
-
-	/* not using ss->total_len as arg 2 b/c that does not count CRC */
-	s_packet = allocate_snoop_packet(hdr_len, tlen - hdr_len, md_len);
-
-	if (unlikely(!s_packet)) {
-		dd_dev_warn_ratelimited(ppd->dd, "Unable to allocate snoop/capture packet\n");
-		goto out;
-	}
-
-	s_packet->total_len = tlen + md_len;
-
-	if (md_len > 0) {
-		memset(&md, 0, sizeof(struct capture_md));
-		md.port = 1;
-		md.dir = PKT_DIR_EGRESS;
-		if (likely(pbc == 0)) {
-			vl = be16_to_cpu(ps->s_txreq->phdr.hdr.lrh[0]) >> 12;
-			md.u.pbc = create_pbc(ppd, 0, qp->s_srate, vl, plen);
-		} else {
-			md.u.pbc = 0;
-		}
-		memcpy(s_packet->data, &md, md_len);
-	} else {
-		md.u.pbc = pbc;
-	}
-
-	/* Copy header */
-	if (likely(hdr)) {
-		memcpy(s_packet->data + md_len, hdr, hdr_len);
-	} else {
-		dd_dev_err(ppd->dd,
-			   "Unable to copy header to snoop/capture packet\n");
-		kfree(s_packet);
-		goto out;
-	}
-
-	if (ss) {
-		data = s_packet->data + hdr_len + md_len;
-		data_start = data;
-
-		/*
-		 * Copy SGE State
-		 * The update_sge() function below will not modify the
-		 * individual SGEs in the array. It will make a copy each time
-		 * and operate on that. So we only need to copy this instance
-		 * and it won't impact PIO.
-		 */
-		temp_ss = *ss;
-		length = len;
-
-		snoop_dbg("Need to copy %d bytes", length);
-		while (length) {
-			void *addr = temp_ss.sge.vaddr;
-			u32 slen = temp_ss.sge.length;
-
-			if (slen > length) {
-				slen = length;
-				snoop_dbg("slen %d > len %d", slen, length);
-			}
-			snoop_dbg("copy %d to %p", slen, addr);
-			memcpy(data, addr, slen);
-			update_sge(&temp_ss, slen);
-			length -= slen;
-			data += slen;
-			snoop_dbg("data is now %p bytes left %d", data, length);
-		}
-		snoop_dbg("Completed SGE copy");
-	}
-
-	/*
-	 * Why do the filter check down here? Because the event tracing has its
-	 * own filtering and we need to have the walked the SGE list.
-	 */
-	if (!ppd->dd->hfi1_snoop.filter_callback) {
-		snoop_dbg("filter not set\n");
-		ret = HFI1_FILTER_HIT;
-	} else {
-		ret = ppd->dd->hfi1_snoop.filter_callback(
-					&ps->s_txreq->phdr.hdr,
-					NULL,
-					ppd->dd->hfi1_snoop.filter_value);
-	}
-
-	switch (ret) {
-	case HFI1_FILTER_ERR:
-		snoop_dbg("Error in filter call");
-		/* fall through */
-	case HFI1_FILTER_MISS:
-		snoop_dbg("Filter Miss");
-		kfree(s_packet);
-		break;
-	case HFI1_FILTER_HIT:
-		snoop_dbg("Capturing packet");
-		snoop_list_add_tail(s_packet, ppd->dd);
-
-		if (unlikely((snoop_flags & SNOOP_DROP_SEND) &&
-			     (ppd->dd->hfi1_snoop.mode_flag &
-			      HFI1_PORT_SNOOP_MODE))) {
-			unsigned long flags;
-
-			snoop_dbg("Dropping packet");
-			if (qp->s_wqe) {
-				spin_lock_irqsave(&qp->s_lock, flags);
-				hfi1_send_complete(
-					qp,
-					qp->s_wqe,
-					IB_WC_SUCCESS);
-				spin_unlock_irqrestore(&qp->s_lock, flags);
-			} else if (qp->ibqp.qp_type == IB_QPT_RC) {
-				spin_lock_irqsave(&qp->s_lock, flags);
-				hfi1_rc_send_complete(qp,
-						      &ps->s_txreq->phdr.hdr);
-				spin_unlock_irqrestore(&qp->s_lock, flags);
-			}
-
-			/*
-			 * If snoop is dropping the packet we need to put the
-			 * txreq back because no one else will.
-			 */
-			hfi1_put_txreq(ps->s_txreq);
-			return 0;
-		}
-		break;
-	default:
-		kfree(s_packet);
-		break;
-	}
-out:
-	return hfi1_verbs_send_pio(qp, ps, md.u.pbc);
-}
-
-/*
- * Callers of this must pass a hfi1_ib_header type for the from ptr. Currently
- * this can be used anywhere, but the intention is for inline ACKs for RC and
- * CCA packets. We don't restrict this usage though.
- */
-void snoop_inline_pio_send(struct hfi1_devdata *dd, struct pio_buf *pbuf,
-			   u64 pbc, const void *from, size_t count)
-{
-	int snoop_mode = 0;
-	int md_len = 0;
-	struct capture_md md;
-	struct snoop_packet *s_packet = NULL;
-
-	/*
-	 * count is in dwords so we need to convert to bytes.
-	 * We also need to account for CRC which would be tacked on by hardware.
-	 */
-	int packet_len = (count << 2) + 4;
-	int ret;
-
-	snoop_dbg("ACK OUT: len %d", packet_len);
-
-	if (!dd->hfi1_snoop.filter_callback) {
-		snoop_dbg("filter not set");
-		ret = HFI1_FILTER_HIT;
-	} else {
-		ret = dd->hfi1_snoop.filter_callback(
-				(struct hfi1_ib_header *)from,
-				NULL,
-				dd->hfi1_snoop.filter_value);
-	}
-
-	switch (ret) {
-	case HFI1_FILTER_ERR:
-		snoop_dbg("Error in filter call");
-		/* fall through */
-	case HFI1_FILTER_MISS:
-		snoop_dbg("Filter Miss");
-		break;
-	case HFI1_FILTER_HIT:
-		snoop_dbg("Capturing packet");
-		if (dd->hfi1_snoop.mode_flag & HFI1_PORT_SNOOP_MODE)
-			snoop_mode = 1;
-		if ((snoop_mode == 0) ||
-		    unlikely(snoop_flags & SNOOP_USE_METADATA))
-			md_len = sizeof(struct capture_md);
-
-		s_packet = allocate_snoop_packet(packet_len, 0, md_len);
-
-		if (unlikely(!s_packet)) {
-			dd_dev_warn_ratelimited(dd, "Unable to allocate snoop/capture packet\n");
-			goto inline_pio_out;
-		}
-
-		s_packet->total_len = packet_len + md_len;
-
-		/* Fill in the metadata for the packet */
-		if (md_len > 0) {
-			memset(&md, 0, sizeof(struct capture_md));
-			md.port = 1;
-			md.dir = PKT_DIR_EGRESS;
-			md.u.pbc = pbc;
-			memcpy(s_packet->data, &md, md_len);
-		}
-
-		/* Add the packet data which is a single buffer */
-		memcpy(s_packet->data + md_len, from, packet_len);
-
-		snoop_list_add_tail(s_packet, dd);
-
-		if (unlikely((snoop_flags & SNOOP_DROP_SEND) && snoop_mode)) {
-			snoop_dbg("Dropping packet");
-			return;
-		}
-		break;
-	default:
-		break;
-	}
-
-inline_pio_out:
-	pio_copy(dd, pbuf, pbc, from, count);
-}
diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index 2eddd33..4287c80 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -1500,13 +1500,7 @@ static int user_add(struct hfi1_devdata *dd)
  */
 int hfi1_device_create(struct hfi1_devdata *dd)
 {
-	int r, ret;
-
-	r = user_add(dd);
-	ret = hfi1_diag_add(dd);
-	if (r && !ret)
-		ret = r;
-	return ret;
+	return user_add(dd);
 }
 
 /*
@@ -1516,5 +1510,4 @@ int hfi1_device_create(struct hfi1_devdata *dd)
 void hfi1_device_remove(struct hfi1_devdata *dd)
 {
 	user_remove(dd);
-	hfi1_diag_remove(dd);
 }

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/10] IB/hfi1: Remove unused user command
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 04/10] IB/hfi1: Remove snoop/diag interface Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands Dennis Dalessandro
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Christoph Hellwig

The HFI1_CMD_SDMA_STATUS_UPD command was never implemented it has no
reason to live in the driver. Remove it.

Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/file_ops.c |    3 ---
 include/uapi/rdma/hfi/hfi1_user.h    |    1 -
 2 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index 4287c80..322e55f 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -213,7 +213,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 		copy = sizeof(uinfo);
 		dest = &uinfo;
 		break;
-	case HFI1_CMD_SDMA_STATUS_UPD:
 	case HFI1_CMD_CREDIT_UPD:
 		copy = 0;
 		break;
@@ -273,8 +272,6 @@ static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 		ret = get_base_info(fp, (void __user *)(unsigned long)
 				    user_val, cmd.len);
 		break;
-	case HFI1_CMD_SDMA_STATUS_UPD:
-		break;
 	case HFI1_CMD_CREDIT_UPD:
 		if (uctxt && uctxt->sc)
 			sc_return_credits(uctxt->sc);
diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h
index 3e3680d..aa48fbe 100644
--- a/include/uapi/rdma/hfi/hfi1_user.h
+++ b/include/uapi/rdma/hfi/hfi1_user.h
@@ -114,7 +114,6 @@
 #define HFI1_CMD_TID_UPDATE      4	/* update expected TID entries */
 #define HFI1_CMD_TID_FREE        5	/* free expected TID entries */
 #define HFI1_CMD_CREDIT_UPD      6	/* force an update of PIO credit */
-#define HFI1_CMD_SDMA_STATUS_UPD 7      /* force update of SDMA status ring */
 
 #define HFI1_CMD_RECV_CTRL       8	/* control receipt of packets */
 #define HFI1_CMD_POLL_TYPE       9	/* set the kind of polling we want */

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 05/10] IB/hfi1: Remove unused user command Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
       [not found]     ` <20160519122622.22041.41686.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2016-05-19 12:26   ` [PATCH 07/10] IB/hfi1: Remove write(), use ioctl() for user cmds Dennis Dalessandro
                     ` (3 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

IOCTL is more suited to what user space commands need to do than the
write() interface. Add IOCTL definitions for all existing write commands
and the handling for those. The write() interface will be removed in a
follow on patch.

Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/common.h   |    5 +
 drivers/staging/rdma/hfi1/file_ops.c |  204 ++++++++++++++++++++++++++++++++++
 include/uapi/rdma/hfi/hfi1_user.h    |   40 +++++++
 3 files changed, 248 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/common.h b/drivers/staging/rdma/hfi1/common.h
index e9b6bb3..fcc9c21 100644
--- a/drivers/staging/rdma/hfi1/common.h
+++ b/drivers/staging/rdma/hfi1/common.h
@@ -178,7 +178,8 @@
 		     HFI1_CAP_PKEY_CHECK |		\
 		     HFI1_CAP_NO_INTEGRITY)
 
-#define HFI1_USER_SWVERSION ((HFI1_USER_SWMAJOR << 16) | HFI1_USER_SWMINOR)
+#define HFI1_USER_SWVERSION ((HFI1_USER_SWMAJOR << HFI1_SWMAJOR_SHIFT) | \
+			     HFI1_USER_SWMINOR)
 
 #ifndef HFI1_KERN_TYPE
 #define HFI1_KERN_TYPE 0
@@ -349,6 +350,8 @@ struct hfi1_message_header {
 #define HFI1_BECN_MASK 1
 #define HFI1_BECN_SMASK BIT(HFI1_BECN_SHIFT)
 
+#define HFI1_PSM_IOC_BASE_SEQ 0x0
+
 static inline __u64 rhf_to_cpu(const __le32 *rbuf)
 {
 	return __le64_to_cpu(*((__le64 *)rbuf));
diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index 322e55f..a338238 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -96,6 +96,8 @@ static int user_event_ack(struct hfi1_ctxtdata *, int, unsigned long);
 static int set_ctxt_pkey(struct hfi1_ctxtdata *, unsigned, u16);
 static int manage_rcvq(struct hfi1_ctxtdata *, unsigned, int);
 static int vma_fault(struct vm_area_struct *, struct vm_fault *);
+static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
+			    unsigned long arg);
 
 static const struct file_operations hfi1_file_ops = {
 	.owner = THIS_MODULE,
@@ -103,6 +105,7 @@ static const struct file_operations hfi1_file_ops = {
 	.write_iter = hfi1_write_iter,
 	.open = hfi1_file_open,
 	.release = hfi1_file_close,
+	.unlocked_ioctl = hfi1_file_ioctl,
 	.poll = hfi1_poll,
 	.mmap = hfi1_file_mmap,
 	.llseek = noop_llseek,
@@ -175,6 +178,207 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
 	return fp->private_data ? 0 : -ENOMEM;
 }
 
+static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
+			    unsigned long arg)
+{
+	struct hfi1_filedata *fd = fp->private_data;
+	struct hfi1_ctxtdata *uctxt = fd->uctxt;
+	struct hfi1_user_info uinfo;
+	struct hfi1_tid_info tinfo;
+	int ret = 0;
+	unsigned long addr;
+	int uval = 0;
+	unsigned long ul_uval = 0;
+	u16 uval16 = 0;
+
+	if (cmd != HFI1_IOCTL_ASSIGN_CTXT &&
+	    cmd != HFI1_IOCTL_GET_VERS &&
+	    !uctxt)
+		return -EINVAL;
+
+	switch (cmd) {
+	case HFI1_IOCTL_ASSIGN_CTXT:
+		if (copy_from_user(&uinfo,
+				   (struct hfi1_user_info __user *)arg,
+				   sizeof(uinfo)))
+			return -EFAULT;
+
+		ret = assign_ctxt(fp, &uinfo);
+		if (ret < 0)
+			return ret;
+		setup_ctxt(fp);
+		if (ret)
+			return ret;
+		ret = user_init(fp);
+		break;
+	case HFI1_IOCTL_CTXT_INFO:
+		ret = get_ctxt_info(fp, (void __user *)(unsigned long)arg,
+				    sizeof(struct hfi1_ctxt_info));
+		break;
+	case HFI1_IOCTL_USER_INFO:
+		ret = get_base_info(fp, (void __user *)(unsigned long)arg,
+				    sizeof(struct hfi1_base_info));
+		break;
+	case HFI1_IOCTL_CREDIT_UPD:
+		if (uctxt && uctxt->sc)
+			sc_return_credits(uctxt->sc);
+		break;
+
+	case HFI1_IOCTL_TID_UPDATE:
+		if (copy_from_user(&tinfo,
+				   (struct hfi11_tid_info __user *)arg,
+				   sizeof(tinfo)))
+			return -EFAULT;
+
+		ret = hfi1_user_exp_rcv_setup(fp, &tinfo);
+		if (!ret) {
+			/*
+			 * Copy the number of tidlist entries we used
+			 * and the length of the buffer we registered.
+			 * These fields are adjacent in the structure so
+			 * we can copy them at the same time.
+			 */
+			addr = arg + offsetof(struct hfi1_tid_info, tidcnt);
+			if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
+					 sizeof(tinfo.tidcnt) +
+					 sizeof(tinfo.length)))
+				ret = -EFAULT;
+		}
+		break;
+
+	case HFI1_IOCTL_TID_FREE:
+		if (copy_from_user(&tinfo,
+				   (struct hfi11_tid_info __user *)arg,
+				   sizeof(tinfo)))
+			return -EFAULT;
+
+		ret = hfi1_user_exp_rcv_clear(fp, &tinfo);
+		if (ret)
+			break;
+		addr = arg + offsetof(struct hfi1_tid_info, tidcnt);
+		if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
+				 sizeof(tinfo.tidcnt)))
+			ret = -EFAULT;
+		break;
+
+	case HFI1_IOCTL_TID_INVAL_READ:
+		if (copy_from_user(&tinfo,
+				   (struct hfi11_tid_info __user *)arg,
+				   sizeof(tinfo)))
+			return -EFAULT;
+
+		ret = hfi1_user_exp_rcv_invalid(fp, &tinfo);
+		if (ret)
+			break;
+		addr = arg + offsetof(struct hfi1_tid_info, tidcnt);
+		if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
+				 sizeof(tinfo.tidcnt)))
+			ret = -EFAULT;
+		break;
+
+	case HFI1_IOCTL_RECV_CTRL:
+		ret = get_user(uval, (int __user *)arg);
+		if (ret != 0)
+			return -EFAULT;
+		ret = manage_rcvq(uctxt, fd->subctxt, uval);
+		break;
+
+	case HFI1_IOCTL_POLL_TYPE:
+		ret = get_user(uval, (int __user *)arg);
+		if (ret != 0)
+			return -EFAULT;
+		uctxt->poll_type = (typeof(uctxt->poll_type))uval;
+		break;
+
+	case HFI1_IOCTL_ACK_EVENT:
+		ret = get_user(ul_uval, (unsigned long __user *)arg);
+		if (ret != 0)
+			return -EFAULT;
+		ret = user_event_ack(uctxt, fd->subctxt, ul_uval);
+		break;
+
+	case HFI1_IOCTL_SET_PKEY:
+		ret = get_user(uval16, (u16 __user *)arg);
+		if (ret != 0)
+			return -EFAULT;
+		if (HFI1_CAP_IS_USET(PKEY_CHECK))
+			ret = set_ctxt_pkey(uctxt, fd->subctxt, uval16);
+		else
+			return -EPERM;
+		break;
+
+	case HFI1_IOCTL_CTXT_RESET: {
+		struct send_context *sc;
+		struct hfi1_devdata *dd;
+
+		if (!uctxt || !uctxt->dd || !uctxt->sc)
+			return -EINVAL;
+
+		/*
+		 * There is no protection here. User level has to
+		 * guarantee that no one will be writing to the send
+		 * context while it is being re-initialized.
+		 * If user level breaks that guarantee, it will break
+		 * it's own context and no one else's.
+		 */
+		dd = uctxt->dd;
+		sc = uctxt->sc;
+		/*
+		 * Wait until the interrupt handler has marked the
+		 * context as halted or frozen. Report error if we time
+		 * out.
+		 */
+		wait_event_interruptible_timeout(
+			sc->halt_wait, (sc->flags & SCF_HALTED),
+			msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT));
+		if (!(sc->flags & SCF_HALTED))
+			return -ENOLCK;
+
+		/*
+		 * If the send context was halted due to a Freeze,
+		 * wait until the device has been "unfrozen" before
+		 * resetting the context.
+		 */
+		if (sc->flags & SCF_FROZEN) {
+			wait_event_interruptible_timeout(
+				dd->event_queue,
+				!(ACCESS_ONCE(dd->flags) & HFI1_FROZEN),
+				msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT));
+			if (dd->flags & HFI1_FROZEN)
+				return -ENOLCK;
+
+			if (dd->flags & HFI1_FORCED_FREEZE)
+				/*
+				 * Don't allow context reset if we are into
+				 * forced freeze
+				 */
+				return -ENODEV;
+
+			sc_disable(sc);
+			ret = sc_enable(sc);
+			hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_ENB,
+				     uctxt->ctxt);
+		} else {
+			ret = sc_restart(sc);
+		}
+		if (!ret)
+			sc_return_credits(sc);
+		break;
+	}
+
+	case HFI1_IOCTL_GET_VERS:
+		uval = HFI1_USER_SWVERSION;
+		if (put_user(uval, (int __user *)arg))
+			return -EFAULT;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	return ret;
+}
+
 static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
 			       size_t count, loff_t *offset)
 {
diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h
index aa48fbe..9fd8725 100644
--- a/include/uapi/rdma/hfi/hfi1_user.h
+++ b/include/uapi/rdma/hfi/hfi1_user.h
@@ -78,6 +78,11 @@
 #define HFI1_USER_SWMINOR 1
 
 /*
+ * We will encode the major/minor inside a single 32bit version number.
+ */
+#define HFI1_SWMAJOR_SHIFT 16
+
+/*
  * Set of HW and driver capability/feature bits.
  * These bit values are used to configure enabled/disabled HW and
  * driver features. The same set of bits are communicated to user
@@ -121,6 +126,41 @@
 #define HFI1_CMD_SET_PKEY        11     /* set context's pkey */
 #define HFI1_CMD_CTXT_RESET      12     /* reset context's HW send context */
 #define HFI1_CMD_TID_INVAL_READ  13     /* read TID cache invalidations */
+#define HFI1_CMD_GET_VERS	 14	/* get the version of the user cdev */
+
+/*
+ * User IOCTLs can not go above 128 if they do then see common.h and change the
+ * base for the snoop ioctl
+ */
+#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl/ioctl-number.txt */
+
+struct hfi1_cmd;
+#define HFI1_IOCTL_ASSIGN_CTXT \
+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_ASSIGN_CTXT, struct hfi1_user_info)
+#define HFI1_IOCTL_CTXT_INFO \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_INFO, struct hfi1_ctxt_info)
+#define HFI1_IOCTL_USER_INFO \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_USER_INFO, struct hfi1_base_info)
+#define HFI1_IOCTL_TID_UPDATE \
+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_UPDATE, struct hfi1_tid_info)
+#define HFI1_IOCTL_TID_FREE \
+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_FREE, struct hfi1_tid_info)
+#define HFI1_IOCTL_CREDIT_UPD \
+	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CREDIT_UPD)
+#define HFI1_IOCTL_RECV_CTRL \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int)
+#define HFI1_IOCTL_POLL_TYPE \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_POLL_TYPE, int)
+#define HFI1_IOCTL_ACK_EVENT \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_ACK_EVENT, unsigned long)
+#define HFI1_IOCTL_SET_PKEY \
+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_SET_PKEY, __u16)
+#define HFI1_IOCTL_CTXT_RESET \
+	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_RESET)
+#define HFI1_IOCTL_TID_INVAL_READ \
+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_INVAL_READ, struct hfi1_tid_info)
+#define HFI1_IOCTL_GET_VERS \
+	_IOR(IB_IOCTL_MAGIC, HFI1_CMD_GET_VERS, int)
 
 #define _HFI1_EVENT_FROZEN_BIT         0
 #define _HFI1_EVENT_LINKDOWN_BIT       1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/10] IB/hfi1: Remove write(), use ioctl() for user cmds
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 08/10] IB/hfi1: Add trace message in user IOCTL handling Dennis Dalessandro
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov

Remove the write() handler for user space commands now that ioctl
handling is available. User apps will need to change to use ioctl from
this point forward.

Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/file_ops.c |  226 ----------------------------------
 include/uapi/rdma/hfi/hfi1_user.h    |    8 -
 2 files changed, 1 insertions(+), 233 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index a338238..c46eada 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -72,8 +72,6 @@
  */
 static int hfi1_file_open(struct inode *, struct file *);
 static int hfi1_file_close(struct inode *, struct file *);
-static ssize_t hfi1_file_write(struct file *, const char __user *,
-			       size_t, loff_t *);
 static ssize_t hfi1_write_iter(struct kiocb *, struct iov_iter *);
 static unsigned int hfi1_poll(struct file *, struct poll_table_struct *);
 static int hfi1_file_mmap(struct file *, struct vm_area_struct *);
@@ -101,7 +99,6 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
 
 static const struct file_operations hfi1_file_ops = {
 	.owner = THIS_MODULE,
-	.write = hfi1_file_write,
 	.write_iter = hfi1_write_iter,
 	.open = hfi1_file_open,
 	.release = hfi1_file_close,
@@ -379,229 +376,6 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
 	return ret;
 }
 
-static ssize_t hfi1_file_write(struct file *fp, const char __user *data,
-			       size_t count, loff_t *offset)
-{
-	const struct hfi1_cmd __user *ucmd;
-	struct hfi1_filedata *fd = fp->private_data;
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
-	struct hfi1_cmd cmd;
-	struct hfi1_user_info uinfo;
-	struct hfi1_tid_info tinfo;
-	unsigned long addr;
-	ssize_t consumed = 0, copy = 0, ret = 0;
-	void *dest = NULL;
-	__u64 user_val = 0;
-	int uctxt_required = 1;
-
-	/* FIXME: This interface cannot continue out of staging */
-	if (WARN_ON_ONCE(!ib_safe_file_access(fp)))
-		return -EACCES;
-
-	if (count < sizeof(cmd)) {
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	ucmd = (const struct hfi1_cmd __user *)data;
-	if (copy_from_user(&cmd, ucmd, sizeof(cmd))) {
-		ret = -EFAULT;
-		goto bail;
-	}
-
-	consumed = sizeof(cmd);
-
-	switch (cmd.type) {
-	case HFI1_CMD_ASSIGN_CTXT:
-		uctxt_required = 0;	/* assigned user context not required */
-		copy = sizeof(uinfo);
-		dest = &uinfo;
-		break;
-	case HFI1_CMD_CREDIT_UPD:
-		copy = 0;
-		break;
-	case HFI1_CMD_TID_UPDATE:
-	case HFI1_CMD_TID_FREE:
-	case HFI1_CMD_TID_INVAL_READ:
-		copy = sizeof(tinfo);
-		dest = &tinfo;
-		break;
-	case HFI1_CMD_USER_INFO:
-	case HFI1_CMD_RECV_CTRL:
-	case HFI1_CMD_POLL_TYPE:
-	case HFI1_CMD_ACK_EVENT:
-	case HFI1_CMD_CTXT_INFO:
-	case HFI1_CMD_SET_PKEY:
-	case HFI1_CMD_CTXT_RESET:
-		copy = 0;
-		user_val = cmd.addr;
-		break;
-	default:
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	/* If the command comes with user data, copy it. */
-	if (copy) {
-		if (copy_from_user(dest, (void __user *)cmd.addr, copy)) {
-			ret = -EFAULT;
-			goto bail;
-		}
-		consumed += copy;
-	}
-
-	/*
-	 * Make sure there is a uctxt when needed.
-	 */
-	if (uctxt_required && !uctxt) {
-		ret = -EINVAL;
-		goto bail;
-	}
-
-	switch (cmd.type) {
-	case HFI1_CMD_ASSIGN_CTXT:
-		ret = assign_ctxt(fp, &uinfo);
-		if (ret < 0)
-			goto bail;
-		ret = setup_ctxt(fp);
-		if (ret)
-			goto bail;
-		ret = user_init(fp);
-		break;
-	case HFI1_CMD_CTXT_INFO:
-		ret = get_ctxt_info(fp, (void __user *)(unsigned long)
-				    user_val, cmd.len);
-		break;
-	case HFI1_CMD_USER_INFO:
-		ret = get_base_info(fp, (void __user *)(unsigned long)
-				    user_val, cmd.len);
-		break;
-	case HFI1_CMD_CREDIT_UPD:
-		if (uctxt && uctxt->sc)
-			sc_return_credits(uctxt->sc);
-		break;
-	case HFI1_CMD_TID_UPDATE:
-		ret = hfi1_user_exp_rcv_setup(fp, &tinfo);
-		if (!ret) {
-			/*
-			 * Copy the number of tidlist entries we used
-			 * and the length of the buffer we registered.
-			 * These fields are adjacent in the structure so
-			 * we can copy them at the same time.
-			 */
-			addr = (unsigned long)cmd.addr +
-				offsetof(struct hfi1_tid_info, tidcnt);
-			if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
-					 sizeof(tinfo.tidcnt) +
-					 sizeof(tinfo.length)))
-				ret = -EFAULT;
-		}
-		break;
-	case HFI1_CMD_TID_INVAL_READ:
-		ret = hfi1_user_exp_rcv_invalid(fp, &tinfo);
-		if (ret)
-			break;
-		addr = (unsigned long)cmd.addr +
-			offsetof(struct hfi1_tid_info, tidcnt);
-		if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
-				 sizeof(tinfo.tidcnt)))
-			ret = -EFAULT;
-		break;
-	case HFI1_CMD_TID_FREE:
-		ret = hfi1_user_exp_rcv_clear(fp, &tinfo);
-		if (ret)
-			break;
-		addr = (unsigned long)cmd.addr +
-			offsetof(struct hfi1_tid_info, tidcnt);
-		if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
-				 sizeof(tinfo.tidcnt)))
-			ret = -EFAULT;
-		break;
-	case HFI1_CMD_RECV_CTRL:
-		ret = manage_rcvq(uctxt, fd->subctxt, (int)user_val);
-		break;
-	case HFI1_CMD_POLL_TYPE:
-		uctxt->poll_type = (typeof(uctxt->poll_type))user_val;
-		break;
-	case HFI1_CMD_ACK_EVENT:
-		ret = user_event_ack(uctxt, fd->subctxt, user_val);
-		break;
-	case HFI1_CMD_SET_PKEY:
-		if (HFI1_CAP_IS_USET(PKEY_CHECK))
-			ret = set_ctxt_pkey(uctxt, fd->subctxt, user_val);
-		else
-			ret = -EPERM;
-		break;
-	case HFI1_CMD_CTXT_RESET: {
-		struct send_context *sc;
-		struct hfi1_devdata *dd;
-
-		if (!uctxt || !uctxt->dd || !uctxt->sc) {
-			ret = -EINVAL;
-			break;
-		}
-		/*
-		 * There is no protection here. User level has to
-		 * guarantee that no one will be writing to the send
-		 * context while it is being re-initialized.
-		 * If user level breaks that guarantee, it will break
-		 * it's own context and no one else's.
-		 */
-		dd = uctxt->dd;
-		sc = uctxt->sc;
-		/*
-		 * Wait until the interrupt handler has marked the
-		 * context as halted or frozen. Report error if we time
-		 * out.
-		 */
-		wait_event_interruptible_timeout(
-			sc->halt_wait, (sc->flags & SCF_HALTED),
-			msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT));
-		if (!(sc->flags & SCF_HALTED)) {
-			ret = -ENOLCK;
-			break;
-		}
-		/*
-		 * If the send context was halted due to a Freeze,
-		 * wait until the device has been "unfrozen" before
-		 * resetting the context.
-		 */
-		if (sc->flags & SCF_FROZEN) {
-			wait_event_interruptible_timeout(
-				dd->event_queue,
-				!(ACCESS_ONCE(dd->flags) & HFI1_FROZEN),
-				msecs_to_jiffies(SEND_CTXT_HALT_TIMEOUT));
-			if (dd->flags & HFI1_FROZEN) {
-				ret = -ENOLCK;
-				break;
-			}
-			if (dd->flags & HFI1_FORCED_FREEZE) {
-				/*
-				 * Don't allow context reset if we are into
-				 * forced freeze
-				 */
-				ret = -ENODEV;
-				break;
-			}
-			sc_disable(sc);
-			ret = sc_enable(sc);
-			hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_ENB,
-				     uctxt->ctxt);
-		} else {
-			ret = sc_restart(sc);
-		}
-		if (!ret)
-			sc_return_credits(sc);
-		break;
-	}
-	}
-
-	if (ret >= 0)
-		ret = consumed;
-bail:
-	return ret;
-}
-
 static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
 {
 	struct hfi1_filedata *fd = kiocb->ki_filp->private_data;
diff --git a/include/uapi/rdma/hfi/hfi1_user.h b/include/uapi/rdma/hfi/hfi1_user.h
index 9fd8725..5727c09 100644
--- a/include/uapi/rdma/hfi/hfi1_user.h
+++ b/include/uapi/rdma/hfi/hfi1_user.h
@@ -66,7 +66,7 @@
  * The major version changes when data structures change in an incompatible
  * way. The driver must be the same for initialization to succeed.
  */
-#define HFI1_USER_SWMAJOR 5
+#define HFI1_USER_SWMAJOR 6
 
 /*
  * Minor version differences are always compatible
@@ -260,12 +260,6 @@ struct hfi1_tid_info {
 	__u32 length;
 };
 
-struct hfi1_cmd {
-	__u32 type;        /* command type */
-	__u32 len;         /* length of struct pointed to by add */
-	__u64 addr;        /* pointer to user structure */
-};
-
 enum hfi1_sdma_comp_state {
 	FREE = 0,
 	QUEUED,

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/10] IB/hfi1: Add trace message in user IOCTL handling
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 07/10] IB/hfi1: Remove write(), use ioctl() for user cmds Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early Dennis Dalessandro
  2016-05-19 12:26   ` [PATCH 10/10] IB/hfi1: Move driver out of staging Dennis Dalessandro
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Ira Weiny

Add a trace message to HFI1s user IOCTL handling. This allows debugging
of which IOCTLs are being handled by the driver.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/file_ops.c |    1 +
 drivers/staging/rdma/hfi1/trace.c    |    1 +
 drivers/staging/rdma/hfi1/trace.h    |    1 +
 3 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index c46eada..1139170 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -188,6 +188,7 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
 	unsigned long ul_uval = 0;
 	u16 uval16 = 0;
 
+	hfi1_cdbg(IOCTL, "IOCTL recv: 0x%x", cmd);
 	if (cmd != HFI1_IOCTL_ASSIGN_CTXT &&
 	    cmd != HFI1_IOCTL_GET_VERS &&
 	    !uctxt)
diff --git a/drivers/staging/rdma/hfi1/trace.c b/drivers/staging/rdma/hfi1/trace.c
index 8b62fef..caddb2a 100644
--- a/drivers/staging/rdma/hfi1/trace.c
+++ b/drivers/staging/rdma/hfi1/trace.c
@@ -233,3 +233,4 @@ __hfi1_trace_fn(FIRMWARE);
 __hfi1_trace_fn(RCVCTRL);
 __hfi1_trace_fn(TID);
 __hfi1_trace_fn(MMU);
+__hfi1_trace_fn(IOCTL);
diff --git a/drivers/staging/rdma/hfi1/trace.h b/drivers/staging/rdma/hfi1/trace.h
index 42bcfc3..a6c1adf 100644
--- a/drivers/staging/rdma/hfi1/trace.h
+++ b/drivers/staging/rdma/hfi1/trace.h
@@ -1341,6 +1341,7 @@ __hfi1_trace_def(FIRMWARE);
 __hfi1_trace_def(RCVCTRL);
 __hfi1_trace_def(TID);
 __hfi1_trace_def(MMU);
+__hfi1_trace_def(IOCTL);
 
 #define hfi1_cdbg(which, fmt, ...) \
 	__hfi1_trace_##which(__func__, fmt, ##__VA_ARGS__)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (7 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 08/10] IB/hfi1: Add trace message in user IOCTL handling Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
       [not found]     ` <20160519122642.22041.66203.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
  2016-05-19 12:26   ` [PATCH 10/10] IB/hfi1: Move driver out of staging Dennis Dalessandro
  9 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny

The deletion of a cdev is not a fence for holding off references to the
structure. The driver attempts to delete the cdev and then proceeds to
free the parent structure, the hfi1_devdata, or dd. This can potentially
lead to a kernel panic in situations where a user has an FD for the cdev
open, and the pci device gets removed. If the user then closes the FD
there will be a NULL dereference when trying to do put on the cdev's
kobject.

Fix this by pointing the cdev's kobject.parent at a new kobject embedded
in its parent structure. Also take a reference when the device is opened
and put it back when it is closed.

Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/staging/rdma/hfi1/device.c   |    4 +++-
 drivers/staging/rdma/hfi1/device.h   |    3 ++-
 drivers/staging/rdma/hfi1/file_ops.c |   15 ++++++++++++---
 drivers/staging/rdma/hfi1/hfi.h      |    1 +
 drivers/staging/rdma/hfi1/init.c     |   14 +++++++++++++-
 5 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/device.c b/drivers/staging/rdma/hfi1/device.c
index 6ee800f..bf64b5a 100644
--- a/drivers/staging/rdma/hfi1/device.c
+++ b/drivers/staging/rdma/hfi1/device.c
@@ -60,7 +60,8 @@ static dev_t hfi1_dev;
 int hfi1_cdev_init(int minor, const char *name,
 		   const struct file_operations *fops,
 		   struct cdev *cdev, struct device **devp,
-		   bool user_accessible)
+		   bool user_accessible,
+		   struct kobject *parent)
 {
 	const dev_t dev = MKDEV(MAJOR(hfi1_dev), minor);
 	struct device *device = NULL;
@@ -68,6 +69,7 @@ int hfi1_cdev_init(int minor, const char *name,
 
 	cdev_init(cdev, fops);
 	cdev->owner = THIS_MODULE;
+	cdev->kobj.parent = parent;
 	kobject_set_name(&cdev->kobj, name);
 
 	ret = cdev_add(cdev, dev, 1);
diff --git a/drivers/staging/rdma/hfi1/device.h b/drivers/staging/rdma/hfi1/device.h
index 5bb3e83..c3ec19c 100644
--- a/drivers/staging/rdma/hfi1/device.h
+++ b/drivers/staging/rdma/hfi1/device.h
@@ -50,7 +50,8 @@
 int hfi1_cdev_init(int minor, const char *name,
 		   const struct file_operations *fops,
 		   struct cdev *cdev, struct device **devp,
-		   bool user_accessible);
+		   bool user_accessible,
+		   struct kobject *parent);
 void hfi1_cdev_cleanup(struct cdev *cdev, struct device **devp);
 const char *class_name(void);
 int __init dev_init(void);
diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/staging/rdma/hfi1/file_ops.c
index 1139170..7a5b0e6 100644
--- a/drivers/staging/rdma/hfi1/file_ops.c
+++ b/drivers/staging/rdma/hfi1/file_ops.c
@@ -168,6 +168,13 @@ static inline int is_valid_mmap(u64 token)
 
 static int hfi1_file_open(struct inode *inode, struct file *fp)
 {
+	struct hfi1_devdata *dd = container_of(inode->i_cdev,
+					       struct hfi1_devdata,
+					       user_cdev);
+
+	/* Just take a ref now. Not all opens result in a context assign */
+	kobject_get(&dd->kobj);
+
 	/* The real work is performed later in assign_ctxt() */
 	fp->private_data = kzalloc(sizeof(struct hfi1_filedata), GFP_KERNEL);
 	if (fp->private_data) /* no cpu affinity by default */
@@ -690,7 +697,9 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 {
 	struct hfi1_filedata *fdata = fp->private_data;
 	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
-	struct hfi1_devdata *dd;
+	struct hfi1_devdata *dd = container_of(inode->i_cdev,
+					       struct hfi1_devdata,
+					       user_cdev);
 	unsigned long flags, *ev;
 
 	fp->private_data = NULL;
@@ -699,7 +708,6 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 		goto done;
 
 	hfi1_cdbg(PROC, "freeing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
-	dd = uctxt->dd;
 	mutex_lock(&hfi1_mutex);
 
 	flush_wc();
@@ -765,6 +773,7 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 	mutex_unlock(&hfi1_mutex);
 	hfi1_free_ctxtdata(dd, uctxt);
 done:
+	kobject_put(&dd->kobj);
 	kfree(fdata);
 	return 0;
 }
@@ -1464,7 +1473,7 @@ static int user_add(struct hfi1_devdata *dd)
 	snprintf(name, sizeof(name), "%s_%d", class_name(), dd->unit);
 	ret = hfi1_cdev_init(dd->unit, name, &hfi1_file_ops,
 			     &dd->user_cdev, &dd->user_device,
-			     true);
+			     true, &dd->kobj);
 	if (ret)
 		user_remove(dd);
 
diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 337bd2d..4417a0f 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -1169,6 +1169,7 @@ struct hfi1_devdata {
 	atomic_t aspm_disabled_cnt;
 
 	struct hfi1_affinity *affinity;
+	struct kobject kobj;
 };
 
 /* 8051 firmware version helper */
diff --git a/drivers/staging/rdma/hfi1/init.c b/drivers/staging/rdma/hfi1/init.c
index 9e71abf..61bf7ff 100644
--- a/drivers/staging/rdma/hfi1/init.c
+++ b/drivers/staging/rdma/hfi1/init.c
@@ -989,8 +989,10 @@ static void release_asic_data(struct hfi1_devdata *dd)
 	dd->asic_data = NULL;
 }
 
-void hfi1_free_devdata(struct hfi1_devdata *dd)
+static void __hfi1_free_devdata(struct kobject *kobj)
 {
+	struct hfi1_devdata *dd =
+		container_of(kobj, struct hfi1_devdata, kobj);
 	unsigned long flags;
 
 	spin_lock_irqsave(&hfi1_devs_lock, flags);
@@ -1007,6 +1009,15 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
 	rvt_dealloc_device(&dd->verbs_dev.rdi);
 }
 
+static struct kobj_type hfi1_devdata_type = {
+	.release = __hfi1_free_devdata,
+};
+
+void hfi1_free_devdata(struct hfi1_devdata *dd)
+{
+	kobject_put(&dd->kobj);
+}
+
 /*
  * Allocate our primary per-unit data structure.  Must be done via verbs
  * allocator, because the verbs cleanup process both does cleanup and
@@ -1102,6 +1113,7 @@ struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev, size_t extra)
 			&pdev->dev,
 			"Could not alloc cpulist info, cpu affinity might be wrong\n");
 	}
+	kobject_init(&dd->kobj, &hfi1_devdata_type);
 	return dd;
 
 bail:

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/10] IB/hfi1: Move driver out of staging
       [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-05-19 12:26   ` [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early Dennis Dalessandro
@ 2016-05-19 12:26   ` Dennis Dalessandro
  9 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-19 12:26 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Jubin John

The TODO list for the hfi1 driver was completed during 4.6. In addition
other objections raised (which are far beyond what was in the TODO list)
have been addressed as well. It is now time to remove the driver from
staging and into the drivers/infiniband sub-tree.

Reviewed-by: Jubin John <jubin.john-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 MAINTAINERS                                 |   13 +++++++------
 drivers/infiniband/Kconfig                  |    2 ++
 drivers/infiniband/hw/Makefile              |    1 +
 drivers/infiniband/hw/hfi1/Kconfig          |    0 
 drivers/infiniband/hw/hfi1/Makefile         |    0 
 drivers/infiniband/hw/hfi1/affinity.c       |    0 
 drivers/infiniband/hw/hfi1/affinity.h       |    0 
 drivers/infiniband/hw/hfi1/aspm.h           |    0 
 drivers/infiniband/hw/hfi1/chip.c           |    0 
 drivers/infiniband/hw/hfi1/chip.h           |    0 
 drivers/infiniband/hw/hfi1/chip_registers.h |    0 
 drivers/infiniband/hw/hfi1/common.h         |    0 
 drivers/infiniband/hw/hfi1/debugfs.c        |    0 
 drivers/infiniband/hw/hfi1/debugfs.h        |    0 
 drivers/infiniband/hw/hfi1/device.c         |    0 
 drivers/infiniband/hw/hfi1/device.h         |    0 
 drivers/infiniband/hw/hfi1/dma.c            |    0 
 drivers/infiniband/hw/hfi1/driver.c         |    0 
 drivers/infiniband/hw/hfi1/efivar.c         |    0 
 drivers/infiniband/hw/hfi1/efivar.h         |    0 
 drivers/infiniband/hw/hfi1/eprom.c          |    0 
 drivers/infiniband/hw/hfi1/eprom.h          |    0 
 drivers/infiniband/hw/hfi1/file_ops.c       |    0 
 drivers/infiniband/hw/hfi1/firmware.c       |    0 
 drivers/infiniband/hw/hfi1/hfi.h            |    0 
 drivers/infiniband/hw/hfi1/init.c           |    0 
 drivers/infiniband/hw/hfi1/intr.c           |    0 
 drivers/infiniband/hw/hfi1/iowait.h         |    0 
 drivers/infiniband/hw/hfi1/mad.c            |    0 
 drivers/infiniband/hw/hfi1/mad.h            |    0 
 drivers/infiniband/hw/hfi1/mmu_rb.c         |    0 
 drivers/infiniband/hw/hfi1/mmu_rb.h         |    0 
 drivers/infiniband/hw/hfi1/opa_compat.h     |    0 
 drivers/infiniband/hw/hfi1/pcie.c           |    0 
 drivers/infiniband/hw/hfi1/pio.c            |    0 
 drivers/infiniband/hw/hfi1/pio.h            |    0 
 drivers/infiniband/hw/hfi1/pio_copy.c       |    0 
 drivers/infiniband/hw/hfi1/platform.c       |    0 
 drivers/infiniband/hw/hfi1/platform.h       |    0 
 drivers/infiniband/hw/hfi1/qp.c             |    0 
 drivers/infiniband/hw/hfi1/qp.h             |    0 
 drivers/infiniband/hw/hfi1/qsfp.c           |    0 
 drivers/infiniband/hw/hfi1/qsfp.h           |    0 
 drivers/infiniband/hw/hfi1/rc.c             |    0 
 drivers/infiniband/hw/hfi1/ruc.c            |    0 
 drivers/infiniband/hw/hfi1/sdma.c           |    0 
 drivers/infiniband/hw/hfi1/sdma.h           |    0 
 drivers/infiniband/hw/hfi1/sdma_txreq.h     |    0 
 drivers/infiniband/hw/hfi1/sysfs.c          |    0 
 drivers/infiniband/hw/hfi1/trace.c          |    0 
 drivers/infiniband/hw/hfi1/trace.h          |    0 
 drivers/infiniband/hw/hfi1/twsi.c           |    0 
 drivers/infiniband/hw/hfi1/twsi.h           |    0 
 drivers/infiniband/hw/hfi1/uc.c             |    0 
 drivers/infiniband/hw/hfi1/ud.c             |    0 
 drivers/infiniband/hw/hfi1/user_exp_rcv.c   |    0 
 drivers/infiniband/hw/hfi1/user_exp_rcv.h   |    0 
 drivers/infiniband/hw/hfi1/user_pages.c     |    0 
 drivers/infiniband/hw/hfi1/user_sdma.c      |    0 
 drivers/infiniband/hw/hfi1/user_sdma.h      |    0 
 drivers/infiniband/hw/hfi1/verbs.c          |    0 
 drivers/infiniband/hw/hfi1/verbs.h          |    0 
 drivers/infiniband/hw/hfi1/verbs_txreq.c    |    0 
 drivers/infiniband/hw/hfi1/verbs_txreq.h    |    0 
 drivers/staging/rdma/Kconfig                |    2 --
 drivers/staging/rdma/Makefile               |    1 -
 drivers/staging/rdma/hfi1/TODO              |    6 ------
 67 files changed, 10 insertions(+), 15 deletions(-)
 rename drivers/{staging/rdma/hfi1/Kconfig => infiniband/hw/hfi1/Kconfig} (100%)
 rename drivers/{staging/rdma/hfi1/Makefile => infiniband/hw/hfi1/Makefile} (100%)
 rename drivers/{staging/rdma/hfi1/affinity.c => infiniband/hw/hfi1/affinity.c} (100%)
 rename drivers/{staging/rdma/hfi1/affinity.h => infiniband/hw/hfi1/affinity.h} (100%)
 rename drivers/{staging/rdma/hfi1/aspm.h => infiniband/hw/hfi1/aspm.h} (100%)
 rename drivers/{staging/rdma/hfi1/chip.c => infiniband/hw/hfi1/chip.c} (100%)
 rename drivers/{staging/rdma/hfi1/chip.h => infiniband/hw/hfi1/chip.h} (100%)
 rename drivers/{staging/rdma/hfi1/chip_registers.h => infiniband/hw/hfi1/chip_registers.h} (100%)
 rename drivers/{staging/rdma/hfi1/common.h => infiniband/hw/hfi1/common.h} (100%)
 rename drivers/{staging/rdma/hfi1/debugfs.c => infiniband/hw/hfi1/debugfs.c} (100%)
 rename drivers/{staging/rdma/hfi1/debugfs.h => infiniband/hw/hfi1/debugfs.h} (100%)
 rename drivers/{staging/rdma/hfi1/device.c => infiniband/hw/hfi1/device.c} (100%)
 rename drivers/{staging/rdma/hfi1/device.h => infiniband/hw/hfi1/device.h} (100%)
 rename drivers/{staging/rdma/hfi1/dma.c => infiniband/hw/hfi1/dma.c} (100%)
 rename drivers/{staging/rdma/hfi1/driver.c => infiniband/hw/hfi1/driver.c} (100%)
 rename drivers/{staging/rdma/hfi1/efivar.c => infiniband/hw/hfi1/efivar.c} (100%)
 rename drivers/{staging/rdma/hfi1/efivar.h => infiniband/hw/hfi1/efivar.h} (100%)
 rename drivers/{staging/rdma/hfi1/eprom.c => infiniband/hw/hfi1/eprom.c} (100%)
 rename drivers/{staging/rdma/hfi1/eprom.h => infiniband/hw/hfi1/eprom.h} (100%)
 rename drivers/{staging/rdma/hfi1/file_ops.c => infiniband/hw/hfi1/file_ops.c} (100%)
 rename drivers/{staging/rdma/hfi1/firmware.c => infiniband/hw/hfi1/firmware.c} (100%)
 rename drivers/{staging/rdma/hfi1/hfi.h => infiniband/hw/hfi1/hfi.h} (100%)
 rename drivers/{staging/rdma/hfi1/init.c => infiniband/hw/hfi1/init.c} (100%)
 rename drivers/{staging/rdma/hfi1/intr.c => infiniband/hw/hfi1/intr.c} (100%)
 rename drivers/{staging/rdma/hfi1/iowait.h => infiniband/hw/hfi1/iowait.h} (100%)
 rename drivers/{staging/rdma/hfi1/mad.c => infiniband/hw/hfi1/mad.c} (100%)
 rename drivers/{staging/rdma/hfi1/mad.h => infiniband/hw/hfi1/mad.h} (100%)
 rename drivers/{staging/rdma/hfi1/mmu_rb.c => infiniband/hw/hfi1/mmu_rb.c} (100%)
 rename drivers/{staging/rdma/hfi1/mmu_rb.h => infiniband/hw/hfi1/mmu_rb.h} (100%)
 rename drivers/{staging/rdma/hfi1/opa_compat.h => infiniband/hw/hfi1/opa_compat.h} (100%)
 rename drivers/{staging/rdma/hfi1/pcie.c => infiniband/hw/hfi1/pcie.c} (100%)
 rename drivers/{staging/rdma/hfi1/pio.c => infiniband/hw/hfi1/pio.c} (100%)
 rename drivers/{staging/rdma/hfi1/pio.h => infiniband/hw/hfi1/pio.h} (100%)
 rename drivers/{staging/rdma/hfi1/pio_copy.c => infiniband/hw/hfi1/pio_copy.c} (100%)
 rename drivers/{staging/rdma/hfi1/platform.c => infiniband/hw/hfi1/platform.c} (100%)
 rename drivers/{staging/rdma/hfi1/platform.h => infiniband/hw/hfi1/platform.h} (100%)
 rename drivers/{staging/rdma/hfi1/qp.c => infiniband/hw/hfi1/qp.c} (100%)
 rename drivers/{staging/rdma/hfi1/qp.h => infiniband/hw/hfi1/qp.h} (100%)
 rename drivers/{staging/rdma/hfi1/qsfp.c => infiniband/hw/hfi1/qsfp.c} (100%)
 rename drivers/{staging/rdma/hfi1/qsfp.h => infiniband/hw/hfi1/qsfp.h} (100%)
 rename drivers/{staging/rdma/hfi1/rc.c => infiniband/hw/hfi1/rc.c} (100%)
 rename drivers/{staging/rdma/hfi1/ruc.c => infiniband/hw/hfi1/ruc.c} (100%)
 rename drivers/{staging/rdma/hfi1/sdma.c => infiniband/hw/hfi1/sdma.c} (100%)
 rename drivers/{staging/rdma/hfi1/sdma.h => infiniband/hw/hfi1/sdma.h} (100%)
 rename drivers/{staging/rdma/hfi1/sdma_txreq.h => infiniband/hw/hfi1/sdma_txreq.h} (100%)
 rename drivers/{staging/rdma/hfi1/sysfs.c => infiniband/hw/hfi1/sysfs.c} (100%)
 rename drivers/{staging/rdma/hfi1/trace.c => infiniband/hw/hfi1/trace.c} (100%)
 rename drivers/{staging/rdma/hfi1/trace.h => infiniband/hw/hfi1/trace.h} (100%)
 rename drivers/{staging/rdma/hfi1/twsi.c => infiniband/hw/hfi1/twsi.c} (100%)
 rename drivers/{staging/rdma/hfi1/twsi.h => infiniband/hw/hfi1/twsi.h} (100%)
 rename drivers/{staging/rdma/hfi1/uc.c => infiniband/hw/hfi1/uc.c} (100%)
 rename drivers/{staging/rdma/hfi1/ud.c => infiniband/hw/hfi1/ud.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_exp_rcv.c => infiniband/hw/hfi1/user_exp_rcv.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_exp_rcv.h => infiniband/hw/hfi1/user_exp_rcv.h} (100%)
 rename drivers/{staging/rdma/hfi1/user_pages.c => infiniband/hw/hfi1/user_pages.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_sdma.c => infiniband/hw/hfi1/user_sdma.c} (100%)
 rename drivers/{staging/rdma/hfi1/user_sdma.h => infiniband/hw/hfi1/user_sdma.h} (100%)
 rename drivers/{staging/rdma/hfi1/verbs.c => infiniband/hw/hfi1/verbs.c} (100%)
 rename drivers/{staging/rdma/hfi1/verbs.h => infiniband/hw/hfi1/verbs.h} (100%)
 rename drivers/{staging/rdma/hfi1/verbs_txreq.c => infiniband/hw/hfi1/verbs_txreq.c} (100%)
 rename drivers/{staging/rdma/hfi1/verbs_txreq.h => infiniband/hw/hfi1/verbs_txreq.h} (100%)
 delete mode 100644 drivers/staging/rdma/hfi1/TODO

diff --git a/MAINTAINERS b/MAINTAINERS
index c802594..9823456 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5086,6 +5086,13 @@ F:	drivers/block/cciss*
 F:	include/linux/cciss_ioctl.h
 F:	include/uapi/linux/cciss_ioctl.h
 
+HFI1 DRIVER
+M:	Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+M:	Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
+L:	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
+S:	Supported
+F:	drivers/infiniband/hw/hfi1
+
 HFS FILESYSTEM
 L:	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
 S:	Orphan
@@ -10661,12 +10668,6 @@ M:	Arnaud Patard <arnaud.patard-dQbF7i+pzddAfugRpC6u6w@public.gmane.org>
 S:	Odd Fixes
 F:	drivers/staging/xgifb/
 
-HFI1 DRIVER
-M:	Mike Marciniszyn <infinipath-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
-L:	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
-S:	Supported
-F:	drivers/staging/rdma/hfi1
-
 STARFIRE/DURALAN NETWORK DRIVER
 M:	Ion Badulescu <ionut-Rv4seVrz/scdnm+yROfE0A@public.gmane.org>
 S:	Odd Fixes
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 6425c0e..2137adf 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -85,4 +85,6 @@ source "drivers/infiniband/ulp/isert/Kconfig"
 
 source "drivers/infiniband/sw/rdmavt/Kconfig"
 
+source "drivers/infiniband/hw/hfi1/Kconfig"
+
 endif # INFINIBAND
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index c7ad0a4..c0c7cf8 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_MLX5_INFINIBAND)		+= mlx5/
 obj-$(CONFIG_INFINIBAND_NES)		+= nes/
 obj-$(CONFIG_INFINIBAND_OCRDMA)		+= ocrdma/
 obj-$(CONFIG_INFINIBAND_USNIC)		+= usnic/
+obj-$(CONFIG_INFINIBAND_HFI1)		+= hfi1/
diff --git a/drivers/staging/rdma/hfi1/Kconfig b/drivers/infiniband/hw/hfi1/Kconfig
similarity index 100%
rename from drivers/staging/rdma/hfi1/Kconfig
rename to drivers/infiniband/hw/hfi1/Kconfig
diff --git a/drivers/staging/rdma/hfi1/Makefile b/drivers/infiniband/hw/hfi1/Makefile
similarity index 100%
rename from drivers/staging/rdma/hfi1/Makefile
rename to drivers/infiniband/hw/hfi1/Makefile
diff --git a/drivers/staging/rdma/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/affinity.c
rename to drivers/infiniband/hw/hfi1/affinity.c
diff --git a/drivers/staging/rdma/hfi1/affinity.h b/drivers/infiniband/hw/hfi1/affinity.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/affinity.h
rename to drivers/infiniband/hw/hfi1/affinity.h
diff --git a/drivers/staging/rdma/hfi1/aspm.h b/drivers/infiniband/hw/hfi1/aspm.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/aspm.h
rename to drivers/infiniband/hw/hfi1/aspm.h
diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/chip.c
rename to drivers/infiniband/hw/hfi1/chip.c
diff --git a/drivers/staging/rdma/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/chip.h
rename to drivers/infiniband/hw/hfi1/chip.h
diff --git a/drivers/staging/rdma/hfi1/chip_registers.h b/drivers/infiniband/hw/hfi1/chip_registers.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/chip_registers.h
rename to drivers/infiniband/hw/hfi1/chip_registers.h
diff --git a/drivers/staging/rdma/hfi1/common.h b/drivers/infiniband/hw/hfi1/common.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/common.h
rename to drivers/infiniband/hw/hfi1/common.h
diff --git a/drivers/staging/rdma/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/debugfs.c
rename to drivers/infiniband/hw/hfi1/debugfs.c
diff --git a/drivers/staging/rdma/hfi1/debugfs.h b/drivers/infiniband/hw/hfi1/debugfs.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/debugfs.h
rename to drivers/infiniband/hw/hfi1/debugfs.h
diff --git a/drivers/staging/rdma/hfi1/device.c b/drivers/infiniband/hw/hfi1/device.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/device.c
rename to drivers/infiniband/hw/hfi1/device.c
diff --git a/drivers/staging/rdma/hfi1/device.h b/drivers/infiniband/hw/hfi1/device.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/device.h
rename to drivers/infiniband/hw/hfi1/device.h
diff --git a/drivers/staging/rdma/hfi1/dma.c b/drivers/infiniband/hw/hfi1/dma.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/dma.c
rename to drivers/infiniband/hw/hfi1/dma.c
diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/driver.c
rename to drivers/infiniband/hw/hfi1/driver.c
diff --git a/drivers/staging/rdma/hfi1/efivar.c b/drivers/infiniband/hw/hfi1/efivar.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/efivar.c
rename to drivers/infiniband/hw/hfi1/efivar.c
diff --git a/drivers/staging/rdma/hfi1/efivar.h b/drivers/infiniband/hw/hfi1/efivar.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/efivar.h
rename to drivers/infiniband/hw/hfi1/efivar.h
diff --git a/drivers/staging/rdma/hfi1/eprom.c b/drivers/infiniband/hw/hfi1/eprom.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/eprom.c
rename to drivers/infiniband/hw/hfi1/eprom.c
diff --git a/drivers/staging/rdma/hfi1/eprom.h b/drivers/infiniband/hw/hfi1/eprom.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/eprom.h
rename to drivers/infiniband/hw/hfi1/eprom.h
diff --git a/drivers/staging/rdma/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/file_ops.c
rename to drivers/infiniband/hw/hfi1/file_ops.c
diff --git a/drivers/staging/rdma/hfi1/firmware.c b/drivers/infiniband/hw/hfi1/firmware.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/firmware.c
rename to drivers/infiniband/hw/hfi1/firmware.c
diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/hfi.h
rename to drivers/infiniband/hw/hfi1/hfi.h
diff --git a/drivers/staging/rdma/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/init.c
rename to drivers/infiniband/hw/hfi1/init.c
diff --git a/drivers/staging/rdma/hfi1/intr.c b/drivers/infiniband/hw/hfi1/intr.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/intr.c
rename to drivers/infiniband/hw/hfi1/intr.c
diff --git a/drivers/staging/rdma/hfi1/iowait.h b/drivers/infiniband/hw/hfi1/iowait.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/iowait.h
rename to drivers/infiniband/hw/hfi1/iowait.h
diff --git a/drivers/staging/rdma/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/mad.c
rename to drivers/infiniband/hw/hfi1/mad.c
diff --git a/drivers/staging/rdma/hfi1/mad.h b/drivers/infiniband/hw/hfi1/mad.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/mad.h
rename to drivers/infiniband/hw/hfi1/mad.h
diff --git a/drivers/staging/rdma/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/mmu_rb.c
rename to drivers/infiniband/hw/hfi1/mmu_rb.c
diff --git a/drivers/staging/rdma/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/mmu_rb.h
rename to drivers/infiniband/hw/hfi1/mmu_rb.h
diff --git a/drivers/staging/rdma/hfi1/opa_compat.h b/drivers/infiniband/hw/hfi1/opa_compat.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/opa_compat.h
rename to drivers/infiniband/hw/hfi1/opa_compat.h
diff --git a/drivers/staging/rdma/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/pcie.c
rename to drivers/infiniband/hw/hfi1/pcie.c
diff --git a/drivers/staging/rdma/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/pio.c
rename to drivers/infiniband/hw/hfi1/pio.c
diff --git a/drivers/staging/rdma/hfi1/pio.h b/drivers/infiniband/hw/hfi1/pio.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/pio.h
rename to drivers/infiniband/hw/hfi1/pio.h
diff --git a/drivers/staging/rdma/hfi1/pio_copy.c b/drivers/infiniband/hw/hfi1/pio_copy.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/pio_copy.c
rename to drivers/infiniband/hw/hfi1/pio_copy.c
diff --git a/drivers/staging/rdma/hfi1/platform.c b/drivers/infiniband/hw/hfi1/platform.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/platform.c
rename to drivers/infiniband/hw/hfi1/platform.c
diff --git a/drivers/staging/rdma/hfi1/platform.h b/drivers/infiniband/hw/hfi1/platform.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/platform.h
rename to drivers/infiniband/hw/hfi1/platform.h
diff --git a/drivers/staging/rdma/hfi1/qp.c b/drivers/infiniband/hw/hfi1/qp.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/qp.c
rename to drivers/infiniband/hw/hfi1/qp.c
diff --git a/drivers/staging/rdma/hfi1/qp.h b/drivers/infiniband/hw/hfi1/qp.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/qp.h
rename to drivers/infiniband/hw/hfi1/qp.h
diff --git a/drivers/staging/rdma/hfi1/qsfp.c b/drivers/infiniband/hw/hfi1/qsfp.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/qsfp.c
rename to drivers/infiniband/hw/hfi1/qsfp.c
diff --git a/drivers/staging/rdma/hfi1/qsfp.h b/drivers/infiniband/hw/hfi1/qsfp.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/qsfp.h
rename to drivers/infiniband/hw/hfi1/qsfp.h
diff --git a/drivers/staging/rdma/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/rc.c
rename to drivers/infiniband/hw/hfi1/rc.c
diff --git a/drivers/staging/rdma/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/ruc.c
rename to drivers/infiniband/hw/hfi1/ruc.c
diff --git a/drivers/staging/rdma/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/sdma.c
rename to drivers/infiniband/hw/hfi1/sdma.c
diff --git a/drivers/staging/rdma/hfi1/sdma.h b/drivers/infiniband/hw/hfi1/sdma.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/sdma.h
rename to drivers/infiniband/hw/hfi1/sdma.h
diff --git a/drivers/staging/rdma/hfi1/sdma_txreq.h b/drivers/infiniband/hw/hfi1/sdma_txreq.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/sdma_txreq.h
rename to drivers/infiniband/hw/hfi1/sdma_txreq.h
diff --git a/drivers/staging/rdma/hfi1/sysfs.c b/drivers/infiniband/hw/hfi1/sysfs.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/sysfs.c
rename to drivers/infiniband/hw/hfi1/sysfs.c
diff --git a/drivers/staging/rdma/hfi1/trace.c b/drivers/infiniband/hw/hfi1/trace.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/trace.c
rename to drivers/infiniband/hw/hfi1/trace.c
diff --git a/drivers/staging/rdma/hfi1/trace.h b/drivers/infiniband/hw/hfi1/trace.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/trace.h
rename to drivers/infiniband/hw/hfi1/trace.h
diff --git a/drivers/staging/rdma/hfi1/twsi.c b/drivers/infiniband/hw/hfi1/twsi.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/twsi.c
rename to drivers/infiniband/hw/hfi1/twsi.c
diff --git a/drivers/staging/rdma/hfi1/twsi.h b/drivers/infiniband/hw/hfi1/twsi.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/twsi.h
rename to drivers/infiniband/hw/hfi1/twsi.h
diff --git a/drivers/staging/rdma/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/uc.c
rename to drivers/infiniband/hw/hfi1/uc.c
diff --git a/drivers/staging/rdma/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/ud.c
rename to drivers/infiniband/hw/hfi1/ud.c
diff --git a/drivers/staging/rdma/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/user_exp_rcv.c
rename to drivers/infiniband/hw/hfi1/user_exp_rcv.c
diff --git a/drivers/staging/rdma/hfi1/user_exp_rcv.h b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/user_exp_rcv.h
rename to drivers/infiniband/hw/hfi1/user_exp_rcv.h
diff --git a/drivers/staging/rdma/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/user_pages.c
rename to drivers/infiniband/hw/hfi1/user_pages.c
diff --git a/drivers/staging/rdma/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/user_sdma.c
rename to drivers/infiniband/hw/hfi1/user_sdma.c
diff --git a/drivers/staging/rdma/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/user_sdma.h
rename to drivers/infiniband/hw/hfi1/user_sdma.h
diff --git a/drivers/staging/rdma/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/verbs.c
rename to drivers/infiniband/hw/hfi1/verbs.c
diff --git a/drivers/staging/rdma/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/verbs.h
rename to drivers/infiniband/hw/hfi1/verbs.h
diff --git a/drivers/staging/rdma/hfi1/verbs_txreq.c b/drivers/infiniband/hw/hfi1/verbs_txreq.c
similarity index 100%
rename from drivers/staging/rdma/hfi1/verbs_txreq.c
rename to drivers/infiniband/hw/hfi1/verbs_txreq.c
diff --git a/drivers/staging/rdma/hfi1/verbs_txreq.h b/drivers/infiniband/hw/hfi1/verbs_txreq.h
similarity index 100%
rename from drivers/staging/rdma/hfi1/verbs_txreq.h
rename to drivers/infiniband/hw/hfi1/verbs_txreq.h
diff --git a/drivers/staging/rdma/Kconfig b/drivers/staging/rdma/Kconfig
index f1f3eca..2c5b018 100644
--- a/drivers/staging/rdma/Kconfig
+++ b/drivers/staging/rdma/Kconfig
@@ -22,6 +22,4 @@ menuconfig STAGING_RDMA
 # Please keep entries in alphabetic order
 if STAGING_RDMA
 
-source "drivers/staging/rdma/hfi1/Kconfig"
-
 endif
diff --git a/drivers/staging/rdma/Makefile b/drivers/staging/rdma/Makefile
index 8c7fc1d..b5e94f1 100644
--- a/drivers/staging/rdma/Makefile
+++ b/drivers/staging/rdma/Makefile
@@ -1,2 +1 @@
 # Entries for RDMA_STAGING tree
-obj-$(CONFIG_INFINIBAND_HFI1)	+= hfi1/
diff --git a/drivers/staging/rdma/hfi1/TODO b/drivers/staging/rdma/hfi1/TODO
deleted file mode 100644
index 4c6f1d7..0000000
--- a/drivers/staging/rdma/hfi1/TODO
+++ /dev/null
@@ -1,6 +0,0 @@
-July, 2015
-
-- Remove unneeded file entries in sysfs
-- Remove software processing of IB protocol and place in library for use
-  by qib, ipath (if still present), hfi1, and eventually soft-roce
-- Replace incorrect uAPI

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]     ` <20160519122642.22041.66203.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2016-05-19 18:31       ` Jason Gunthorpe
       [not found]         ` <20160519183100.GC26130-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-19 18:31 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Thu, May 19, 2016 at 05:26:44AM -0700, Dennis Dalessandro wrote:
>  static int hfi1_file_open(struct inode *inode, struct file *fp)
>  {
> +	struct hfi1_devdata *dd = container_of(inode->i_cdev,
> +					       struct hfi1_devdata,
> +					       user_cdev);
> +
> +	/* Just take a ref now. Not all opens result in a context assign */
> +	kobject_get(&dd->kobj);

Eh?

If dd->kobj is passed as the parent to hfi1_cdev_init then cdev core
will do this for you, just confusing to duplicate it here??

> @@ -1007,6 +1009,15 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
>  	rvt_dealloc_device(&dd->verbs_dev.rdi);

Hurm. How are you synchronizing pci device removal with ioctls on the
cdev?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]         ` <20160519183100.GC26130-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-05-20 15:57           ` Dennis Dalessandro
  2016-05-24 14:17           ` Dennis Dalessandro
  1 sibling, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-20 15:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Thu, May 19, 2016 at 12:31:00PM -0600, Jason Gunthorpe wrote:
>On Thu, May 19, 2016 at 05:26:44AM -0700, Dennis Dalessandro wrote:
>>  static int hfi1_file_open(struct inode *inode, struct file *fp)
>>  {
>> +	struct hfi1_devdata *dd = container_of(inode->i_cdev,
>> +					       struct hfi1_devdata,
>> +					       user_cdev);
>> +
>> +	/* Just take a ref now. Not all opens result in a context assign */
>> +	kobject_get(&dd->kobj);
>
>Eh?
>
>If dd->kobj is passed as the parent to hfi1_cdev_init then cdev core
>will do this for you, just confusing to duplicate it here??

I see your point. Will go ahead and remove it in the next version.

>> @@ -1007,6 +1009,15 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
>>  	rvt_dealloc_device(&dd->verbs_dev.rdi);
>
>Hurm. How are you synchronizing pci device removal with ioctls on the
>cdev?

This patch addresses the parent structure disappearing out from under the 
cdev. We are looking into if further work is needed around the pci device 
disappearing. I don't think there is anything specific to the ioctl changes 
though. In other words this is basically a port of write() to ioctl() so the 
same issues would be in both. Let me get back to you on this.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]     ` <20160519122622.22041.41686.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2016-05-21 12:34       ` Leon Romanovsky
       [not found]         ` <20160521123404.GB25500-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Leon Romanovsky @ 2016-05-21 12:34 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

[-- Attachment #1: Type: text/plain, Size: 2493 bytes --]

On Thu, May 19, 2016 at 05:26:24AM -0700, Dennis Dalessandro wrote:
> IOCTL is more suited to what user space commands need to do than the
> write() interface. Add IOCTL definitions for all existing write commands
> and the handling for those. The write() interface will be removed in a
> follow on patch.
> 
> Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/staging/rdma/hfi1/common.h   |    5 +
>  drivers/staging/rdma/hfi1/file_ops.c |  204 ++++++++++++++++++++++++++++++++++
>  include/uapi/rdma/hfi/hfi1_user.h    |   40 +++++++
>  3 files changed, 248 insertions(+), 1 deletions(-)
> 

<...>

> +#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl/ioctl-number.txt */
> +
> +struct hfi1_cmd;
> +#define HFI1_IOCTL_ASSIGN_CTXT \
> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_ASSIGN_CTXT, struct hfi1_user_info)
> +#define HFI1_IOCTL_CTXT_INFO \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_INFO, struct hfi1_ctxt_info)
> +#define HFI1_IOCTL_USER_INFO \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_USER_INFO, struct hfi1_base_info)
> +#define HFI1_IOCTL_TID_UPDATE \
> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_UPDATE, struct hfi1_tid_info)
> +#define HFI1_IOCTL_TID_FREE \
> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_FREE, struct hfi1_tid_info)
> +#define HFI1_IOCTL_CREDIT_UPD \
> +	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CREDIT_UPD)
> +#define HFI1_IOCTL_RECV_CTRL \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int)
> +#define HFI1_IOCTL_POLL_TYPE \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_POLL_TYPE, int)
> +#define HFI1_IOCTL_ACK_EVENT \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_ACK_EVENT, unsigned long)
> +#define HFI1_IOCTL_SET_PKEY \
> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_SET_PKEY, __u16)
> +#define HFI1_IOCTL_CTXT_RESET \
> +	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_RESET)
> +#define HFI1_IOCTL_TID_INVAL_READ \
> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_INVAL_READ, struct hfi1_tid_info)
> +#define HFI1_IOCTL_GET_VERS \
> +	_IOR(IB_IOCTL_MAGIC, HFI1_CMD_GET_VERS, int)
>  

I think the overall consensus over participants in OFVWG call was to use
one IOCTL to enter into device specific handler which will do all
necessary parsing and not spamming common IOCTL interface.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]         ` <20160521123404.GB25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-21 16:23           ` Dennis Dalessandro
       [not found]             ` <20160521162301.GA16770-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-21 16:23 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

On Sat, May 21, 2016 at 03:34:04PM +0300, Leon Romanovsky wrote:
>On Thu, May 19, 2016 at 05:26:24AM -0700, Dennis Dalessandro wrote:
>> IOCTL is more suited to what user space commands need to do than the
>> write() interface. Add IOCTL definitions for all existing write commands
>> and the handling for those. The write() interface will be removed in a
>> follow on patch.
>> 
>> Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>> ---
>>  drivers/staging/rdma/hfi1/common.h   |    5 +
>>  drivers/staging/rdma/hfi1/file_ops.c |  204 ++++++++++++++++++++++++++++++++++
>>  include/uapi/rdma/hfi/hfi1_user.h    |   40 +++++++
>>  3 files changed, 248 insertions(+), 1 deletions(-)
>> 
>
><...>
>
>> +#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl/ioctl-number.txt */
>> +
>> +struct hfi1_cmd;
>> +#define HFI1_IOCTL_ASSIGN_CTXT \
>> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_ASSIGN_CTXT, struct hfi1_user_info)
>> +#define HFI1_IOCTL_CTXT_INFO \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_INFO, struct hfi1_ctxt_info)
>> +#define HFI1_IOCTL_USER_INFO \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_USER_INFO, struct hfi1_base_info)
>> +#define HFI1_IOCTL_TID_UPDATE \
>> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_UPDATE, struct hfi1_tid_info)
>> +#define HFI1_IOCTL_TID_FREE \
>> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_FREE, struct hfi1_tid_info)
>> +#define HFI1_IOCTL_CREDIT_UPD \
>> +	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CREDIT_UPD)
>> +#define HFI1_IOCTL_RECV_CTRL \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int)
>> +#define HFI1_IOCTL_POLL_TYPE \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_POLL_TYPE, int)
>> +#define HFI1_IOCTL_ACK_EVENT \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_ACK_EVENT, unsigned long)
>> +#define HFI1_IOCTL_SET_PKEY \
>> +	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_SET_PKEY, __u16)
>> +#define HFI1_IOCTL_CTXT_RESET \
>> +	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_RESET)
>> +#define HFI1_IOCTL_TID_INVAL_READ \
>> +	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_INVAL_READ, struct hfi1_tid_info)
>> +#define HFI1_IOCTL_GET_VERS \
>> +	_IOR(IB_IOCTL_MAGIC, HFI1_CMD_GET_VERS, int)
>>  
>
>I think the overall consensus over participants in OFVWG call was to use
>one IOCTL to enter into device specific handler which will do all
>necessary parsing and not spamming common IOCTL interface.

That was for the verbs working group and the verbs 2.0 uAPI. This is for 
psm. We have some commands which are write or read only, and some which are  
read and write. It seems natural to classify their ioctl as such.

In the interim, while the OFVWG is solidifying its one API to rule them all, 
this solves our current and very specific problem of treating 
write()/writev() differently.  

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]             ` <20160521162301.GA16770-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-22 12:01               ` Leon Romanovsky
       [not found]                 ` <20160522120129.GC25500-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Leon Romanovsky @ 2016-05-22 12:01 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

[-- Attachment #1: Type: text/plain, Size: 4615 bytes --]

On Sat, May 21, 2016 at 12:23:02PM -0400, Dennis Dalessandro wrote:
> On Sat, May 21, 2016 at 03:34:04PM +0300, Leon Romanovsky wrote:
> >On Thu, May 19, 2016 at 05:26:24AM -0700, Dennis Dalessandro wrote:
> >>IOCTL is more suited to what user space commands need to do than the
> >>write() interface. Add IOCTL definitions for all existing write commands
> >>and the handling for those. The write() interface will be removed in a
> >>follow on patch.
> >>
> >>Reviewed-by: Mitko Haralanov <mitko.haralanov-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >>Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >>Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >>Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >>---
> >> drivers/staging/rdma/hfi1/common.h   |    5 +
> >> drivers/staging/rdma/hfi1/file_ops.c |  204 ++++++++++++++++++++++++++++++++++
> >> include/uapi/rdma/hfi/hfi1_user.h    |   40 +++++++
> >> 3 files changed, 248 insertions(+), 1 deletions(-)
> >>
> >
> ><...>
> >
> >>+#define IB_IOCTL_MAGIC 0x1b /* See Documentation/ioctl/ioctl-number.txt */
> >>+
> >>+struct hfi1_cmd;
> >>+#define HFI1_IOCTL_ASSIGN_CTXT \
> >>+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_ASSIGN_CTXT, struct hfi1_user_info)
> >>+#define HFI1_IOCTL_CTXT_INFO \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_INFO, struct hfi1_ctxt_info)
> >>+#define HFI1_IOCTL_USER_INFO \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_USER_INFO, struct hfi1_base_info)
> >>+#define HFI1_IOCTL_TID_UPDATE \
> >>+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_UPDATE, struct hfi1_tid_info)
> >>+#define HFI1_IOCTL_TID_FREE \
> >>+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_FREE, struct hfi1_tid_info)
> >>+#define HFI1_IOCTL_CREDIT_UPD \
> >>+	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CREDIT_UPD)
> >>+#define HFI1_IOCTL_RECV_CTRL \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_RECV_CTRL, int)
> >>+#define HFI1_IOCTL_POLL_TYPE \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_POLL_TYPE, int)
> >>+#define HFI1_IOCTL_ACK_EVENT \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_ACK_EVENT, unsigned long)
> >>+#define HFI1_IOCTL_SET_PKEY \
> >>+	_IOW(IB_IOCTL_MAGIC, HFI1_CMD_SET_PKEY, __u16)
> >>+#define HFI1_IOCTL_CTXT_RESET \
> >>+	_IO(IB_IOCTL_MAGIC, HFI1_CMD_CTXT_RESET)
> >>+#define HFI1_IOCTL_TID_INVAL_READ \
> >>+	_IOWR(IB_IOCTL_MAGIC, HFI1_CMD_TID_INVAL_READ, struct hfi1_tid_info)
> >>+#define HFI1_IOCTL_GET_VERS \
> >>+	_IOR(IB_IOCTL_MAGIC, HFI1_CMD_GET_VERS, int)
> >
> >I think the overall consensus over participants in OFVWG call was to use
> >one IOCTL to enter into device specific handler which will do all
> >necessary parsing and not spamming common IOCTL interface.
> 
> That was for the verbs working group and the verbs 2.0 uAPI. This is for
> psm.

I'm glad that you are supporting my point.
It is vendor specific implementation for vendor specific driver and not
for whole IB core, so there is no need to pollute general IB ioctls.

> In the interim, while the OFVWG is solidifying its one API to rule them all,
> this solves our current and very specific problem of treating
> write()/writev() differently.

OFVWG is working on improving current verbs interface, for proprietary things like this,
there is agreed API which will look similar to this:

1051 static long ib_uverbs_ioctl(struct file *filp, unsigned int cmd,
1052                                 unsigned long arg)
1053 {
1054         struct ib_uverbs_ioctl_hdr __user *user_hdr =
1055                 (struct ib_uverbs_ioctl_hdr __user *)arg;
1056         struct ib_uverbs_ioctl_hdr hdr;
1057
1058         struct ib_uverbs_file *file = filp->private_data;
1059         struct ib_device *ib_dev;
1060         int srcu_key;
1061         long ret = 0;
1062
1063         if (WARN_ON_ONCE(!ib_safe_file_access(filp)))
1064                 return -EACCES;
1065
1066         if (cmd == IB_USER_DIRECT_IOCTL_COMMAND) {
1067                 srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
1068                 ib_dev = srcu_dereference(file->device->ib_dev,
1069						&file->device->disassociate_srcu);
1070                 if (!ib_dev) {
1071			     srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
1072                         return -EIO;
1073                 }
1074
1075                 if (ib_dev->direct_fn)
1076                         ret = ib_dev->direct_fn(filp, arg);
1077
1078                 srcu_read_unlock(&file->device->disassociate_srcu,srcu_key);
1079                 return ret;

> 
> -Denny

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                 ` <20160522120129.GC25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-22 14:03                   ` Dennis Dalessandro
       [not found]                     ` <20160522140351.GA10696-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-22 14:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
>> >I think the overall consensus over participants in OFVWG call was to use
>> >one IOCTL to enter into device specific handler which will do all
>> >necessary parsing and not spamming common IOCTL interface.
>> 
>> That was for the verbs working group and the verbs 2.0 uAPI. This is for
>> psm.
>
>I'm glad that you are supporting my point.
>It is vendor specific implementation for vendor specific driver and not
>for whole IB core, so there is no need to pollute general IB ioctls.

It is making use of and applying a proper classification.  Is there a 
technical concern with this other than that's not how verbs may end up doing 
it? 

I'm not completely opposed to the single ioctl, I just don't necessarily see 
that as better in this case but am willing to listen to a technical 
justification for why it's incorrect.

>> In the interim, while the OFVWG is solidifying its one API to rule them 
>> all,
>> this solves our current and very specific problem of treating
>> write()/writev() differently.
>
>OFVWG is working on improving current verbs interface, for proprietary things like this,
>there is agreed API which will look similar to this:

As far as I know there has been no patches posted or real on-list review of 
this API as of yet.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                     ` <20160522140351.GA10696-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-22 17:57                       ` Leon Romanovsky
       [not found]                         ` <20160522175715.GD25500-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Leon Romanovsky @ 2016-05-22 17:57 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

[-- Attachment #1: Type: text/plain, Size: 1740 bytes --]

On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
> On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
> >>>I think the overall consensus over participants in OFVWG call was to use
> >>>one IOCTL to enter into device specific handler which will do all
> >>>necessary parsing and not spamming common IOCTL interface.
> >>
> >>That was for the verbs working group and the verbs 2.0 uAPI. This is for
> >>psm.
> >
> >I'm glad that you are supporting my point.
> >It is vendor specific implementation for vendor specific driver and not
> >for whole IB core, so there is no need to pollute general IB ioctls.
> 
> It is making use of and applying a proper classification.  Is there a
> technical concern with this other than that's not how verbs may end up doing
> it?
> 
> I'm not completely opposed to the single ioctl, I just don't necessarily see
> that as better in this case but am willing to listen to a technical
> justification for why it's incorrect.

it will simplify internal and external development by removing the
tensions over ioctls numbers. Do you plan to take the block of ioctls
for future expansion? Do you plan to mix hfi's ioctls with verbs's ioctls
based on acceptance of new code?

> 
> >>In the interim, while the OFVWG is solidifying its one API to rule them
> >>all,
> >>this solves our current and very specific problem of treating
> >>write()/writev() differently.
> >
> >OFVWG is working on improving current verbs interface, for proprietary things like this,
> >there is agreed API which will look similar to this:
> 
> As far as I know there has been no patches posted or real on-list review of
> this API as of yet.

Right.

> 
> -Denny

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                         ` <20160522175715.GD25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-23 12:22                           ` Dennis Dalessandro
       [not found]                             ` <20160523122207.GA16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-23 12:22 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

On Sun, May 22, 2016 at 08:57:15PM +0300, Leon Romanovsky wrote:
>On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
>> On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
>> >>>I think the overall consensus over participants in OFVWG call was to use
>> >>>one IOCTL to enter into device specific handler which will do all
>> >>>necessary parsing and not spamming common IOCTL interface.
>> >>
>> >>That was for the verbs working group and the verbs 2.0 uAPI. This is for
>> >>psm.
>> >
>> >I'm glad that you are supporting my point.
>> >It is vendor specific implementation for vendor specific driver and not
>> >for whole IB core, so there is no need to pollute general IB ioctls.
>> 
>> It is making use of and applying a proper classification.  Is there a
>> technical concern with this other than that's not how verbs may end up doing
>> it?
>> 
>> I'm not completely opposed to the single ioctl, I just don't necessarily see
>> that as better in this case but am willing to listen to a technical
>> justification for why it's incorrect.
>
>it will simplify internal and external development by removing the
>tensions over ioctls numbers. Do you plan to take the block of ioctls
>for future expansion? Do you plan to mix hfi's ioctls with verbs's ioctls
>based on acceptance of new code?

I'm still not sure what you are getting at here. Can you explain what you 
mean by tensions over ioctl numbers?  I guess I don't understand why the 
hfi1_x device's use of icotl numbers has any bearing at all on the 
ibcore/verbs ioctl(s). 

If and when new code is accepted and hfi1 converges its API to go through a 
common character device, then hfi1 would surely change to match whatever is 
there whether that's a single ioctl with a command type embedded or 
something that has not even yet been proposed.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                             ` <20160523122207.GA16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-23 13:03                               ` Leon Romanovsky
       [not found]                                 ` <20160523130312.GG25500-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Leon Romanovsky @ 2016-05-23 13:03 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]

On Mon, May 23, 2016 at 08:22:08AM -0400, Dennis Dalessandro wrote:
> On Sun, May 22, 2016 at 08:57:15PM +0300, Leon Romanovsky wrote:
> >On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
> >>On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
> >>>>>I think the overall consensus over participants in OFVWG call was to use
> >>>>>one IOCTL to enter into device specific handler which will do all
> >>>>>necessary parsing and not spamming common IOCTL interface.
> >>>>
> >>>>That was for the verbs working group and the verbs 2.0 uAPI. This is for
> >>>>psm.
> >>>
> >>>I'm glad that you are supporting my point.
> >>>It is vendor specific implementation for vendor specific driver and not
> >>>for whole IB core, so there is no need to pollute general IB ioctls.
> >>
> >>It is making use of and applying a proper classification.  Is there a
> >>technical concern with this other than that's not how verbs may end up doing
> >>it?
> >>
> >>I'm not completely opposed to the single ioctl, I just don't necessarily see
> >>that as better in this case but am willing to listen to a technical
> >>justification for why it's incorrect.
> >
> >it will simplify internal and external development by removing the
> >tensions over ioctls numbers. Do you plan to take the block of ioctls
> >for future expansion? Do you plan to mix hfi's ioctls with verbs's ioctls
> >based on acceptance of new code?
> 
> I'm still not sure what you are getting at here. Can you explain what you
> mean by tensions over ioctl numbers?  I guess I don't understand why the
> hfi1_x device's use of icotl numbers has any bearing at all on the
> ibcore/verbs ioctl(s).
> 
> If and when new code is accepted and hfi1 converges its API to go through a
> common character device, then hfi1 would surely change to match whatever is
> there whether that's a single ioctl with a command type embedded or
> something that has not even yet been proposed.

Denny,

It is easy for everyone to converge hfi1 API from day one, so if and
when new code is posted, the hfi1 changes will be summarized by one
line change.

Thanks.

> 
> -Denny

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                 ` <20160523130312.GG25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-23 14:10                                   ` Dennis Dalessandro
       [not found]                                     ` <20160523141049.GE16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
       [not found]                                     ` <b9dc4ac8-cfa2-d66e-7e36-f28116f23e59@redhat.com>
  0 siblings, 2 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-23 14:10 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	Mitko Haralanov, Ira Weiny, Mike Marciniszyn

On Mon, May 23, 2016 at 04:03:12PM +0300, Leon Romanovsky wrote:
>On Mon, May 23, 2016 at 08:22:08AM -0400, Dennis Dalessandro wrote:
>> On Sun, May 22, 2016 at 08:57:15PM +0300, Leon Romanovsky wrote:
>> >On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
>> >>On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
>> >>>>>I think the overall consensus over participants in OFVWG call was to use
>> >>>>>one IOCTL to enter into device specific handler which will do all
>> >>>>>necessary parsing and not spamming common IOCTL interface.
>> >>>>
>> >>>>That was for the verbs working group and the verbs 2.0 uAPI. This is for
>> >>>>psm.
>> >>>
>> >>>I'm glad that you are supporting my point.
>> >>>It is vendor specific implementation for vendor specific driver and not
>> >>>for whole IB core, so there is no need to pollute general IB ioctls.
>> >>
>> >>It is making use of and applying a proper classification.  Is there a
>> >>technical concern with this other than that's not how verbs may end up doing
>> >>it?
>> >>
>> >>I'm not completely opposed to the single ioctl, I just don't necessarily see
>> >>that as better in this case but am willing to listen to a technical
>> >>justification for why it's incorrect.
>> >
>> >it will simplify internal and external development by removing the
>> >tensions over ioctls numbers. Do you plan to take the block of ioctls
>> >for future expansion? Do you plan to mix hfi's ioctls with verbs's ioctls
>> >based on acceptance of new code?
>> 
>> I'm still not sure what you are getting at here. Can you explain what you
>> mean by tensions over ioctl numbers?  I guess I don't understand why the
>> hfi1_x device's use of icotl numbers has any bearing at all on the
>> ibcore/verbs ioctl(s).
>> 
>> If and when new code is accepted and hfi1 converges its API to go through a
>> common character device, then hfi1 would surely change to match whatever is
>> there whether that's a single ioctl with a command type embedded or
>> something that has not even yet been proposed.
>
>Denny,
>
>It is easy for everyone to converge hfi1 API from day one, so if and
>when new code is posted, the hfi1 changes will be summarized by one
>line change.

Let's put the future API issue, and the specifics of this patch aside for 
just a minute.  I'd like to understand the rationale for wanting a single 
ioctl over specific ioctls in the general sense. I know that's what folks 
seem to prefer from the calls, but perhaps we can get that down in writing 
here on the list.

I see an advantage for the specific ioctls because we can classify them 
based on permission. When running things like strace you can decode the 
ioctl number and see what access it is making. It also makes it easy to have 
a gist of what is going on based on the ioctl call itself.

On the other hand, having a bunch of ioctls may make it harder on user 
space. A well known ioctl may make life easier. I'm assuming there are other 
compelling reasons?

-Denny


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]         ` <20160519183100.GC26130-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-05-20 15:57           ` Dennis Dalessandro
@ 2016-05-24 14:17           ` Dennis Dalessandro
       [not found]             ` <20160524141756.GA17438-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  1 sibling, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-24 14:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Thu, May 19, 2016 at 12:31:00PM -0600, Jason Gunthorpe wrote:
>If dd->kobj is passed as the parent to hfi1_cdev_init then cdev core
>will do this for you, just confusing to duplicate it here??
>
>> @@ -1007,6 +1009,15 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
>>  	rvt_dealloc_device(&dd->verbs_dev.rdi);
>
>Hurm. How are you synchronizing pci device removal with ioctls on the
>cdev?

We have a system in place to tear down the various resources, unregister 
from the core, disable interrupts, disable send/rcv contexts, etc. 

Due to the nature of our hardware user space has direct access to the 
device. This means there is always going to be a race between the card going 
away and user space trying to access something that isn't there.

We could add a flag that says to start failing ioctls, we could forcibly 
close the file descriptor on the user, and I'm sure there are other things 
we could do. However there is always the race condition when the user has 
direct access.

The situations which we have to worry about are someone physically removing 
the card, or using admin priv to unbind it from pci, things of that nature.  
All of which are not normal use cases.

This patch handles a specific issue. The parent data structure of the cdev 
going away. So if something is hanging onto the cdev we won't panic when it 
tries to close. For instance a user application sending the get_version 
ioctl after the device has gone away but before closing its FD. 

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                     ` <20160523141049.GE16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-24 17:14                                       ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-24 17:14 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny,
	Mike Marciniszyn

On Mon, May 23, 2016 at 10:10:50AM -0400, Dennis Dalessandro wrote:

> ioctl over specific ioctls in the general sense. I know that's what folks
> seem to prefer from the calls, but perhaps we can get that down in writing
> here on the list.

The ioctl space is quite limited, based on # alone there is something
like 256 of them available within 0x1b. This is not enough for all the
calls that uverbs needs, so they must be multiplexed. Once we start
multiplexing we may as well multiplex everything.

It is desired in the kernel, in general, that ioctls are globally
unique. If hfi1 uses an ioctl number it becomes unavailable to other
parts of the kernel, including verbs. You also need to check you are
not re-using any of these ioctl numbers between other ib things like
umad/etc.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]             ` <20160524141756.GA17438-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-24 17:20               ` Jason Gunthorpe
       [not found]                 ` <20160524172054.GC8037-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-24 17:20 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Tue, May 24, 2016 at 10:17:57AM -0400, Dennis Dalessandro wrote:

> Due to the nature of our hardware user space has direct access to the
> device. This means there is always going to be a race between the card going
> away and user space trying to access something that isn't there.

You have to fix this. mlx did and uses a similar direct sharing
scheme. IIRC for hot-removal they swapped out the mmapped PCI bar with
0's or something.

Alternatively, somehow block device removal until it is safe, all
mmaps are closed and all fds are closed.

> The situations which we have to worry about are someone physically removing
> the card, or using admin priv to unbind it from pci, things of that nature.
> All of which are not normal use cases.

You need to go through this process for PCI error recovery, IIRC, and
there was a patch series lately to make the core support device
hot-removal for exactly this reason.

hfi1 does not need to support hot removal, but it must support safe
removal by blocking remove until it is safe. This is the problem with
doing all your own cdev infrastructure, you have to also duplicate all
this stuff from the core code as well.

> This patch handles a specific issue. The parent data structure of the cdev
> going away. So if something is hanging onto the cdev we won't panic when it
> tries to close. For instance a user application sending the get_version
> ioctl after the device has gone away but before closing its FD.

Yes, but there are clearly more problems.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                         ` <20160524175409.GI25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-24 19:17                                           ` Doug Ledford
       [not found]                                             ` <a4477030-c966-636e-e395-96857454c7de-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Doug Ledford @ 2016-05-24 19:17 UTC (permalink / raw)
  To: leon-DgEjT+Ai2ygdnm+yROfE0A
  Cc: Dennis Dalessandro, Mitko Haralanov, Ira Weiny, Mike Marciniszyn,
	linux-rdma

[-- Attachment #1: Type: text/plain, Size: 4523 bytes --]

On 05/24/2016 01:54 PM, Leon Romanovsky wrote:
> On Tue, May 24, 2016 at 12:13:56PM -0400, Doug Ledford wrote:
>> On 05/23/2016 10:10 AM, Dennis Dalessandro wrote:
>>> On Mon, May 23, 2016 at 04:03:12PM +0300, Leon Romanovsky wrote:
>>>> On Mon, May 23, 2016 at 08:22:08AM -0400, Dennis Dalessandro wrote:
>>>>> On Sun, May 22, 2016 at 08:57:15PM +0300, Leon Romanovsky wrote:
>>>>>> On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
>>>>>>> On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
>>>>>>>>>> I think the overall consensus over participants in OFVWG call
>>>>> was to use
>>>>>>>>>> one IOCTL to enter into device specific handler which will do all
>>>>>>>>>> necessary parsing and not spamming common IOCTL interface.
>>>>>>>>>
>>>>>>>>> That was for the verbs working group and the verbs 2.0 uAPI. This
>>>>> is for
>>>>>>>>> psm.
>>>>>>>>
>>>>>>>> I'm glad that you are supporting my point.
>>>>>>>> It is vendor specific implementation for vendor specific driver
>>>>> and not
>>>>>>>> for whole IB core, so there is no need to pollute general IB ioctls.
>>>>>>>
>>>>>>> It is making use of and applying a proper classification.  Is there a
>>>>>>> technical concern with this other than that's not how verbs may end
>>>>> up doing
>>>>>>> it?
>>>>>>>
>>>>>>> I'm not completely opposed to the single ioctl, I just don't
>>>>> necessarily see
>>>>>>> that as better in this case but am willing to listen to a technical
>>>>>>> justification for why it's incorrect.
>>>>>>
>>>>>> it will simplify internal and external development by removing the
>>>>>> tensions over ioctls numbers. Do you plan to take the block of ioctls
>>>>>> for future expansion? Do you plan to mix hfi's ioctls with verbs's
>>>>> ioctls
>>>>>> based on acceptance of new code?
>>>>>
>>>>> I'm still not sure what you are getting at here. Can you explain what
>>>>> you
>>>>> mean by tensions over ioctl numbers?  I guess I don't understand why the
>>>>> hfi1_x device's use of icotl numbers has any bearing at all on the
>>>>> ibcore/verbs ioctl(s).
>>>>>
>>>>> If and when new code is accepted and hfi1 converges its API to go
>>>>> through a
>>>>> common character device, then hfi1 would surely change to match
>>>>> whatever is
>>>>> there whether that's a single ioctl with a command type embedded or
>>>>> something that has not even yet been proposed.
>>>>
>>>> Denny,
>>>>
>>>> It is easy for everyone to converge hfi1 API from day one, so if and
>>>> when new code is posted, the hfi1 changes will be summarized by one
>>>> line change.
>>>
>>> Let's put the future API issue, and the specifics of this patch aside
>>> for just a minute.  I'd like to understand the rationale for wanting a
>>> single ioctl over specific ioctls in the general sense. I know that's
>>> what folks seem to prefer from the calls, but perhaps we can get that
>>> down in writing here on the list.
>>>
>>> I see an advantage for the specific ioctls because we can classify them
>>> based on permission. When running things like strace you can decode the
>>> ioctl number and see what access it is making. It also makes it easy to
>>> have a gist of what is going on based on the ioctl call itself.
>>
>> Personally, if there is no shortage of ioctls (and there shouldn't be in
>> this case because this is ioctls on the psm cdev, not on the uverbs
>> device file), then the separate ioctls have their benefits as Dennis
>> points out.  And seeing as how they (Intel) maintain the psm library
>> that uses this interface, if they want their library using different
>> ioctls and their driver using different ioctls versus one mega ioctl
>> with embedded commands, I'm inclined to let them decide how they want it
>> to be.
> 
> Except one thing that their device should integrate into already
> available char device and don't create new one in IB space.

They have always had their own device.  Until the verbs 2.0 API is moved
forward, I expect them to continue to do so.  That they used the
InfiniBand ioctl number means we might need to make sure that the verbs
2.0 API ioctl numbers and the ones they used don't clash, but given that
we have an assigned range of 256 ioctls and this patchset uses up only
13 (and 13 that could probably be shared with qib), I don't see this as
a starvation of ioctl space issue.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]                 ` <20160524172054.GC8037-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-05-24 19:39                   ` Dennis Dalessandro
       [not found]                     ` <20160524193955.GA17130-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-24 19:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Tue, May 24, 2016 at 11:20:54AM -0600, Jason Gunthorpe wrote:
>On Tue, May 24, 2016 at 10:17:57AM -0400, Dennis Dalessandro wrote:
>
>> Due to the nature of our hardware user space has direct access to the
>> device. This means there is always going to be a race between the card going
>> away and user space trying to access something that isn't there.
>
>You have to fix this. mlx did and uses a similar direct sharing
>scheme. IIRC for hot-removal they swapped out the mmapped PCI bar with
>0's or something.
>
>Alternatively, somehow block device removal until it is safe, all
>mmaps are closed and all fds are closed.
>
>> The situations which we have to worry about are someone physically removing
>> the card, or using admin priv to unbind it from pci, things of that nature.
>> All of which are not normal use cases.
>
>You need to go through this process for PCI error recovery, IIRC, and
>there was a patch series lately to make the core support device
>hot-removal for exactly this reason.
>
>hfi1 does not need to support hot removal, but it must support safe
>removal by blocking remove until it is safe. This is the problem with
>doing all your own cdev infrastructure, you have to also duplicate all
>this stuff from the core code as well.

Agreed, it is a drawback. For now we'll continue improving what we have. I'm 
intrigued by the idea of holding onto the PCI bar and looking into that 
more. I see that as a follow on patch though.

>> This patch handles a specific issue. The parent data structure of the cdev
>> going away. So if something is hanging onto the cdev we won't panic when it
>> tries to close. For instance a user application sending the get_version
>> ioctl after the device has gone away but before closing its FD.
>
>Yes, but there are clearly more problems.

We may need other fixes I'll give you that, but is there a reason not to 
apply this particular patch?

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                             ` <a4477030-c966-636e-e395-96857454c7de-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-05-24 20:13                                               ` Leon Romanovsky
       [not found]                                                 ` <20160524201317.GK25500-2ukJVAZIZ/Y@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Leon Romanovsky @ 2016-05-24 20:13 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Dennis Dalessandro, Mitko Haralanov, Ira Weiny, Mike Marciniszyn,
	linux-rdma

[-- Attachment #1: Type: text/plain, Size: 5145 bytes --]

On Tue, May 24, 2016 at 03:17:00PM -0400, Doug Ledford wrote:
> On 05/24/2016 01:54 PM, Leon Romanovsky wrote:
> > On Tue, May 24, 2016 at 12:13:56PM -0400, Doug Ledford wrote:
> >> On 05/23/2016 10:10 AM, Dennis Dalessandro wrote:
> >>> On Mon, May 23, 2016 at 04:03:12PM +0300, Leon Romanovsky wrote:
> >>>> On Mon, May 23, 2016 at 08:22:08AM -0400, Dennis Dalessandro wrote:
> >>>>> On Sun, May 22, 2016 at 08:57:15PM +0300, Leon Romanovsky wrote:
> >>>>>> On Sun, May 22, 2016 at 10:03:52AM -0400, Dennis Dalessandro wrote:
> >>>>>>> On Sun, May 22, 2016 at 03:01:29PM +0300, Leon Romanovsky wrote:
> >>>>>>>>>> I think the overall consensus over participants in OFVWG call
> >>>>> was to use
> >>>>>>>>>> one IOCTL to enter into device specific handler which will do all
> >>>>>>>>>> necessary parsing and not spamming common IOCTL interface.
> >>>>>>>>>
> >>>>>>>>> That was for the verbs working group and the verbs 2.0 uAPI. This
> >>>>> is for
> >>>>>>>>> psm.
> >>>>>>>>
> >>>>>>>> I'm glad that you are supporting my point.
> >>>>>>>> It is vendor specific implementation for vendor specific driver
> >>>>> and not
> >>>>>>>> for whole IB core, so there is no need to pollute general IB ioctls.
> >>>>>>>
> >>>>>>> It is making use of and applying a proper classification.  Is there a
> >>>>>>> technical concern with this other than that's not how verbs may end
> >>>>> up doing
> >>>>>>> it?
> >>>>>>>
> >>>>>>> I'm not completely opposed to the single ioctl, I just don't
> >>>>> necessarily see
> >>>>>>> that as better in this case but am willing to listen to a technical
> >>>>>>> justification for why it's incorrect.
> >>>>>>
> >>>>>> it will simplify internal and external development by removing the
> >>>>>> tensions over ioctls numbers. Do you plan to take the block of ioctls
> >>>>>> for future expansion? Do you plan to mix hfi's ioctls with verbs's
> >>>>> ioctls
> >>>>>> based on acceptance of new code?
> >>>>>
> >>>>> I'm still not sure what you are getting at here. Can you explain what
> >>>>> you
> >>>>> mean by tensions over ioctl numbers?  I guess I don't understand why the
> >>>>> hfi1_x device's use of icotl numbers has any bearing at all on the
> >>>>> ibcore/verbs ioctl(s).
> >>>>>
> >>>>> If and when new code is accepted and hfi1 converges its API to go
> >>>>> through a
> >>>>> common character device, then hfi1 would surely change to match
> >>>>> whatever is
> >>>>> there whether that's a single ioctl with a command type embedded or
> >>>>> something that has not even yet been proposed.
> >>>>
> >>>> Denny,
> >>>>
> >>>> It is easy for everyone to converge hfi1 API from day one, so if and
> >>>> when new code is posted, the hfi1 changes will be summarized by one
> >>>> line change.
> >>>
> >>> Let's put the future API issue, and the specifics of this patch aside
> >>> for just a minute.  I'd like to understand the rationale for wanting a
> >>> single ioctl over specific ioctls in the general sense. I know that's
> >>> what folks seem to prefer from the calls, but perhaps we can get that
> >>> down in writing here on the list.
> >>>
> >>> I see an advantage for the specific ioctls because we can classify them
> >>> based on permission. When running things like strace you can decode the
> >>> ioctl number and see what access it is making. It also makes it easy to
> >>> have a gist of what is going on based on the ioctl call itself.
> >>
> >> Personally, if there is no shortage of ioctls (and there shouldn't be in
> >> this case because this is ioctls on the psm cdev, not on the uverbs
> >> device file), then the separate ioctls have their benefits as Dennis
> >> points out.  And seeing as how they (Intel) maintain the psm library
> >> that uses this interface, if they want their library using different
> >> ioctls and their driver using different ioctls versus one mega ioctl
> >> with embedded commands, I'm inclined to let them decide how they want it
> >> to be.
> > 
> > Except one thing that their device should integrate into already
> > available char device and don't create new one in IB space.
> 
> They have always had their own device.  Until the verbs 2.0 API is moved
> forward, I expect them to continue to do so.  That they used the
> InfiniBand ioctl number means we might need to make sure that the verbs
> 2.0 API ioctl numbers and the ones they used don't clash, but given that
> we have an assigned range of 256 ioctls and this patchset uses up only
> 13 (and 13 that could probably be shared with qib), I don't see this as
> a starvation of ioctl space issue.

Let's assume that you are planning to provide block of 20 ioctls per
device. Right now, there are 10 (or 8 if we count driver families) drivers in
drivers/infiniband/hw/ + 1 is coming (HSI) + uverbs (approximately 40)
=> 8 * 20 + 20 + 40 = 220 ioctls => we already in shortage without room
to expansion.

Why do we need to put ourselves in such situation?

> 
> 
> -- 
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>               GPG KeyID: 0E572FDD
> 
> 



[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                 ` <20160524201317.GK25500-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-05-24 20:29                                                   ` Hefty, Sean
       [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB050188-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Hefty, Sean @ 2016-05-24 20:29 UTC (permalink / raw)
  To: leon-DgEjT+Ai2ygdnm+yROfE0A, Doug Ledford
  Cc: Dalessandro, Dennis, Haralanov, Mitko, Weiny, Ira, Marciniszyn,
	Mike, linux-rdma

> Let's assume that you are planning to provide block of 20 ioctls per
> device. Right now, there are 10 (or 8 if we count driver families) drivers in
> drivers/infiniband/hw/ + 1 is coming (HSI) + uverbs (approximately 40)
> => 8 * 20 + 20 + 40 = 220 ioctls => we already in shortage without room
> to expansion.

Ioctl commands are naturally scoped per open file.  Unless the opened file supports ioctls directed to every piece of hardware, I don't see why there's an obsession with command space.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB050188-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-05-24 20:54                                                       ` Jason Gunthorpe
       [not found]                                                         ` <20160524205425.GA7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-24 20:54 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: leon-DgEjT+Ai2ygdnm+yROfE0A, Doug Ledford, Dalessandro, Dennis,
	Haralanov, Mitko, Weiny, Ira, Marciniszyn, Mike, linux-rdma

On Tue, May 24, 2016 at 08:29:41PM +0000, Hefty, Sean wrote:
> > Let's assume that you are planning to provide block of 20 ioctls per
> > device. Right now, there are 10 (or 8 if we count driver families) drivers in
> > drivers/infiniband/hw/ + 1 is coming (HSI) + uverbs (approximately 40)
> > => 8 * 20 + 20 + 40 = 220 ioctls => we already in shortage without room
> > to expansion.
> 
> Ioctl commands are naturally scoped per open file.  Unless the
> opened file supports ioctls directed to every piece of hardware, I
> don't see why there's an obsession with command space.

The kernel standard is to have ioctl numbers be globally unique, even
though technically you are right and they are scoped to a file.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]                     ` <20160524193955.GA17130-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2016-05-24 21:51                       ` Jason Gunthorpe
       [not found]                         ` <20160524215105.GD7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-24 21:51 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Tue, May 24, 2016 at 03:39:56PM -0400, Dennis Dalessandro wrote:

> >Yes, but there are clearly more problems.
> 
> We may need other fixes I'll give you that, but is there a reason not to
> apply this particular patch?

Nope, none at all, but my questions are all on the same topic - that
the cdev removal vs device removal is not robust.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                         ` <20160524205425.GA7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-05-24 22:08                                                           ` Hefty, Sean
       [not found]                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB05027E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2016-05-25 17:56                                                           ` Doug Ledford
  1 sibling, 1 reply; 35+ messages in thread
From: Hefty, Sean @ 2016-05-24 22:08 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: leon-DgEjT+Ai2ygdnm+yROfE0A, Doug Ledford, Dalessandro, Dennis,
	Haralanov, Mitko, Weiny, Ira, Marciniszyn, Mike, linux-rdma

> > Ioctl commands are naturally scoped per open file.  Unless the
> > opened file supports ioctls directed to every piece of hardware, I
> > don't see why there's an obsession with command space.
> 
> The kernel standard is to have ioctl numbers be globally unique, even
> though technically you are right and they are scoped to a file.

I thought the reason behind that was so that an app that sent an ioctl to the wrong file would get an error, rather than unspecified behavior.  If so, collapsing all ioctls to a single command doesn't really solve the issue.  It's kind of like using a write interface instead of an ioctl interface to avoid using ioctls.  :)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB05027E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2016-05-24 22:15                                                               ` Jason Gunthorpe
  0 siblings, 0 replies; 35+ messages in thread
From: Jason Gunthorpe @ 2016-05-24 22:15 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: leon-DgEjT+Ai2ygdnm+yROfE0A, Doug Ledford, Dalessandro, Dennis,
	Haralanov, Mitko, Weiny, Ira, Marciniszyn, Mike, linux-rdma

On Tue, May 24, 2016 at 10:08:10PM +0000, Hefty, Sean wrote:
> > > Ioctl commands are naturally scoped per open file.  Unless the
> > > opened file supports ioctls directed to every piece of hardware, I
> > > don't see why there's an obsession with command space.
> > 
> > The kernel standard is to have ioctl numbers be globally unique, even
> > though technically you are right and they are scoped to a file.
> 
> I thought the reason behind that was so that an app that sent an
> ioctl to the wrong file would get an error, rather than unspecified
> behavior.

Yes, that is an oft quoted reason, but it also allows strace to decode
all the ioctls.

> If so, collapsing all ioctls to a single command doesn't
> really solve the issue.  It's kind of like using a write interface
> instead of an ioctl interface to avoid using ioctls.  :)

The idea isn't to collapse, but to expand the decode to include some
data from the header. Instead of just dispatching on the ioctl, you
dispatch on the (ioctl,domain,command) tuple. In your patches domain
is encoded in the header, in Leon's domain & command are encoded in
the header. Doesn't really matter.

At the end of the day the decode is globally unique and does not
require scoping to the open'd file to understand.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                         ` <20160524205425.GA7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2016-05-24 22:08                                                           ` Hefty, Sean
@ 2016-05-25 17:56                                                           ` Doug Ledford
       [not found]                                                             ` <d4913637-7167-8491-88ea-fa65d1e0c22d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 35+ messages in thread
From: Doug Ledford @ 2016-05-25 17:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Hefty Sean
  Cc: leon-DgEjT+Ai2ygdnm+yROfE0A, Dalessandro Dennis, Haralanov Mitko,
	Weiny Ira, Marciniszyn Mike, linux-rdma

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

On 05/24/2016 04:54 PM, Jason Gunthorpe wrote:
> On Tue, May 24, 2016 at 08:29:41PM +0000, Hefty, Sean wrote:
>>> Let's assume that you are planning to provide block of 20 ioctls per
>>> device. Right now, there are 10 (or 8 if we count driver families) drivers in
>>> drivers/infiniband/hw/ + 1 is coming (HSI) + uverbs (approximately 40)
>>> => 8 * 20 + 20 + 40 = 220 ioctls => we already in shortage without room
>>> to expansion.
>>
>> Ioctl commands are naturally scoped per open file.  Unless the
>> opened file supports ioctls directed to every piece of hardware, I
>> don't see why there's an obsession with command space.
> 
> The kernel standard is to have ioctl numbers be globally unique, even
> though technically you are right and they are scoped to a file.

A not very well followed standard...


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early
       [not found]                         ` <20160524215105.GD7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-05-25 18:45                           ` Dennis Dalessandro
  0 siblings, 0 replies; 35+ messages in thread
From: Dennis Dalessandro @ 2016-05-25 18:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mitko Haralanov, Ira Weiny

On Tue, May 24, 2016 at 03:51:05PM -0600, Jason Gunthorpe wrote:
>On Tue, May 24, 2016 at 03:39:56PM -0400, Dennis Dalessandro wrote:
>
>> >Yes, but there are clearly more problems.
>> 
>> We may need other fixes I'll give you that, but is there a reason not to
>> apply this particular patch?
>
>Nope, none at all, but my questions are all on the same topic - that
>the cdev removal vs device removal is not robust.

Yep, point taken. Looking into further patches.

-Denny
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands
       [not found]                                                             ` <d4913637-7167-8491-88ea-fa65d1e0c22d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-05-26 16:08                                                               ` Doug Ledford
  0 siblings, 0 replies; 35+ messages in thread
From: Doug Ledford @ 2016-05-26 16:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Hefty Sean
  Cc: leon-DgEjT+Ai2ygdnm+yROfE0A, Dalessandro Dennis, Haralanov Mitko,
	Weiny Ira, Marciniszyn Mike, linux-rdma

[-- Attachment #1: Type: text/plain, Size: 1921 bytes --]

On 05/25/2016 01:56 PM, Doug Ledford wrote:
> On 05/24/2016 04:54 PM, Jason Gunthorpe wrote:
>> On Tue, May 24, 2016 at 08:29:41PM +0000, Hefty, Sean wrote:
>>>> Let's assume that you are planning to provide block of 20 ioctls per
>>>> device. Right now, there are 10 (or 8 if we count driver families) drivers in
>>>> drivers/infiniband/hw/ + 1 is coming (HSI) + uverbs (approximately 40)
>>>> => 8 * 20 + 20 + 40 = 220 ioctls => we already in shortage without room
>>>> to expansion.
>>>
>>> Ioctl commands are naturally scoped per open file.  Unless the
>>> opened file supports ioctls directed to every piece of hardware, I
>>> don't see why there's an obsession with command space.
>>
>> The kernel standard is to have ioctl numbers be globally unique, even
>> though technically you are right and they are scoped to a file.
> 
> A not very well followed standard...
> 
> 

I took this patch, but I moved all of your ioctl numbers to the end of
the range by doing this to your declarations:

/*
 * Make the ioctls occupy the last 0xf0-0xff portion of the IB range
 */
#define __NUM(cmd) (HFI1_CMD_##cmd + 0xe0)

struct hfi1_cmd;
#define HFI1_IOCTL_ASSIGN_CTXT \
        _IOWR(IB_IOCTL_MAGIC, __NUM(ASSIGN_CTXT), struct hfi1_user_info)
...

The intent here is that strace or other tools should be able to decode
the IB ioctls.  Since we are starting at the low end, it would be a
*long* time before we would ever get up here (if we even do).  Between
then and now, these will go away to be replaced by the vendor channel
support or something similar from verbs 2.0.  So I don't really care if
strace ever decodes these until we get official ioctls for the IB stack
at this location, and by then it doesn't need to know these ever existed
as they will be long gone.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2016-05-26 16:08 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-19 12:25 [PATCH 00/10] IB/hfi1: Clean up cdevs, convert write to ioctl, and destage driver Dennis Dalessandro
     [not found] ` <20160519122318.22041.58871.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-05-19 12:25   ` [PATCH 01/10] IB/hfi1: Remove multiple device cdev Dennis Dalessandro
2016-05-19 12:25   ` [PATCH 02/10] IB/hfi1: Remove UI char device Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 03/10] IB/hfi1: Remove EPROM functionality from data device Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 04/10] IB/hfi1: Remove snoop/diag interface Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 05/10] IB/hfi1: Remove unused user command Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 06/10] IB/hfi1: Add ioctl() interface for user commands Dennis Dalessandro
     [not found]     ` <20160519122622.22041.41686.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-05-21 12:34       ` Leon Romanovsky
     [not found]         ` <20160521123404.GB25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-21 16:23           ` Dennis Dalessandro
     [not found]             ` <20160521162301.GA16770-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-22 12:01               ` Leon Romanovsky
     [not found]                 ` <20160522120129.GC25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-22 14:03                   ` Dennis Dalessandro
     [not found]                     ` <20160522140351.GA10696-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-22 17:57                       ` Leon Romanovsky
     [not found]                         ` <20160522175715.GD25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-23 12:22                           ` Dennis Dalessandro
     [not found]                             ` <20160523122207.GA16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-23 13:03                               ` Leon Romanovsky
     [not found]                                 ` <20160523130312.GG25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-23 14:10                                   ` Dennis Dalessandro
     [not found]                                     ` <20160523141049.GE16764-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-24 17:14                                       ` Jason Gunthorpe
     [not found]                                     ` <b9dc4ac8-cfa2-d66e-7e36-f28116f23e59@redhat.com>
     [not found]                                       ` <20160524175409.GI25500@leon.nu>
     [not found]                                         ` <20160524175409.GI25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-24 19:17                                           ` Doug Ledford
     [not found]                                             ` <a4477030-c966-636e-e395-96857454c7de-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-24 20:13                                               ` Leon Romanovsky
     [not found]                                                 ` <20160524201317.GK25500-2ukJVAZIZ/Y@public.gmane.org>
2016-05-24 20:29                                                   ` Hefty, Sean
     [not found]                                                     ` <1828884A29C6694DAF28B7E6B8A82373AB050188-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-05-24 20:54                                                       ` Jason Gunthorpe
     [not found]                                                         ` <20160524205425.GA7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-24 22:08                                                           ` Hefty, Sean
     [not found]                                                             ` <1828884A29C6694DAF28B7E6B8A82373AB05027E-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-05-24 22:15                                                               ` Jason Gunthorpe
2016-05-25 17:56                                                           ` Doug Ledford
     [not found]                                                             ` <d4913637-7167-8491-88ea-fa65d1e0c22d-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-26 16:08                                                               ` Doug Ledford
2016-05-19 12:26   ` [PATCH 07/10] IB/hfi1: Remove write(), use ioctl() for user cmds Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 08/10] IB/hfi1: Add trace message in user IOCTL handling Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 09/10] IB/hfi1: Do not free hfi1 cdev parent structure early Dennis Dalessandro
     [not found]     ` <20160519122642.22041.66203.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-05-19 18:31       ` Jason Gunthorpe
     [not found]         ` <20160519183100.GC26130-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-20 15:57           ` Dennis Dalessandro
2016-05-24 14:17           ` Dennis Dalessandro
     [not found]             ` <20160524141756.GA17438-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-24 17:20               ` Jason Gunthorpe
     [not found]                 ` <20160524172054.GC8037-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-24 19:39                   ` Dennis Dalessandro
     [not found]                     ` <20160524193955.GA17130-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2016-05-24 21:51                       ` Jason Gunthorpe
     [not found]                         ` <20160524215105.GD7950-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-25 18:45                           ` Dennis Dalessandro
2016-05-19 12:26   ` [PATCH 10/10] IB/hfi1: Move driver out of staging Dennis Dalessandro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.