linux-kernel.vger.kernel.org archive mirror
* Re: [EXT] Re: [PATCH V5 0/2] Change vring space from nomal memory to dma coherent memory
  @ 2020-10-29  2:03 99%                 ` Dutt, Sudeep
From: Dutt, Sudeep @ 2020-10-29  2:03 UTC (permalink / raw)
  To: sherry.sun, gregkh, fugang.duan
  Cc: linux-imx, linux-kernel, hch, kishon, lorenzo.pieralisi, Dutt,
	Sudeep, Dixit, Ashutosh, linux-pci, arnd, vincent.whitchurch

On Thu, 2020-10-29 at 01:51 +0000, Sherry Sun wrote:
> Hi Greg,
> 
> > Subject: Re: [EXT] Re: [PATCH V5 0/2] Change vring space from nomal
> > memory to dma coherent memory
> > 
> > On Wed, Oct 28, 2020 at 03:11:15PM +0000, Andy Duan wrote:
> > > From: Greg KH <gregkh@linuxfoundation.org> Sent: Wednesday,
> > > October 28, 2020 7:14 PM
> > > > On Wed, Oct 28, 2020 at 10:17:39AM +0000, Andy Duan wrote:
> > > > > From: Greg KH <gregkh@linuxfoundation.org> Sent: Wednesday,
> > > > > October 28, 2020 3:07 PM
> > > > > > On Wed, Oct 28, 2020 at 06:05:28AM +0000, Sherry Sun wrote:
> > > > > > > Hi Greg,
> > > > > > > 
> > > > > > > > Subject: Re: [PATCH V5 0/2] Change vring space from nomal
> > > > > > > > memory to dma coherent memory
> > > > > > > > 
> > > > > > > > On Wed, Oct 28, 2020 at 10:03:03AM +0800, Sherry Sun wrote:
> > > > > > > > > Changes in V5:
> > > > > > > > > 1. Reorganize the vop_mmap function code in patch 1, which
> > > > > > > > > is done by Christoph.
> > > > > > > > > 2. Completely remove the unnecessary code related to
> > > > > > > > > reassign the used ring for card in patch 2.
> > > > > > > > > 
> > > > > > > > > The original vop driver only supports dma coherent device,
> > > > > > > > > as it allocates and maps vring by _get_free_pages and
> > > > > > > > > dma_map_single, but not use dma_sync_single_for_cpu/device
> > > > > > > > > to sync the updates of device_page/vring between EP and
> > > > > > > > > RC, which will cause memory synchronization problem for
> > > > > > > > > device don't support hardware dma coherent.
> > > > > > > > > 
> > > > > > > > > And allocate vrings use dma_alloc_coherent is a common way
> > > > > > > > > in kernel, as the memory interacted between two systems
> > > > > > > > > should use consistent memory to avoid caching effects. So
> > > > > > > > > here add noncoherent platform support for vop driver.
> > > > > > > > > Also add some related dma changes to make sure noncoherent
> > > > > > > > > platform works well.
> > > > > > > > > 
> > > > > > > > > Sherry Sun (2):
> > > > > > > > >   misc: vop: change the way of allocating vrings and device page
> > > > > > > > >   misc: vop: do not allocate and reassign the used ring
> > > > > > > > > 
> > > > > > > > >  drivers/misc/mic/bus/vop_bus.h     |   2 +
> > > > > > > > >  drivers/misc/mic/host/mic_boot.c   |   9 ++
> > > > > > > > >  drivers/misc/mic/host/mic_main.c   |  43 ++------
> > > > > > > > >  drivers/misc/mic/vop/vop_debugfs.c |   4 -
> > > > > > > > >  drivers/misc/mic/vop/vop_main.c    |  70 +-----------
> > > > > > > > >  drivers/misc/mic/vop/vop_vringh.c  | 166 ++++++++++-------------------
> > > > > > > > >  include/uapi/linux/mic_common.h    |   9 +-
> > > > > > > > >  7 files changed, 85 insertions(+), 218 deletions(-)
> > > > > > > > 
> > > > > > > > Have you all seen:
> > > > > > > > 
> > > > > > > > https://lore.kernel.org/r/8c1443136563de34699d2c084df478181c205db4.1603854416.git.sudeep.dutt@intel.com
> > > > > > > > 
> > > > > > > > Looks like this code is asking to just be deleted, is
> > > > > > > > that ok with you?
> > > > > > > 
> > > > > > > Yes, I saw that patch. I'm ok with it.
> > > > > > 
> > > > > > Great, can you please provide a "Reviewed-by:" or "Acked-by:"
> > > > > > for it?
> > > > > > 
> > > > > > thanks,
> > > > > > 
> > > > > > greg k-h
> > > > > 
> > > > > Sherry took much effort on the features support on i.MX series
> > > > > like i.MX8QM/i.MX8QXP/i.MX8MM.
> > > > > 
> > > > > Now it is a pity to delete the vop code.
> > > > > 
> > > > > One question,
> > > > > can we resubmit vop code by clean up, now only for i.MX series as
> > > > > Dutt's suggestion ?

Resubmitting the VOP code with cleanups tailored for i.MX makes sense
to me.

> > > > > Or we have to drop the design and switch to select other
> > > > > solutions ?
> > > 
> > > Okay, we plan to switch to NTB solution.
> > 
> > What is a "NTB solution" exactly?
> 
> The driver located at drivers/ntb/, it also can setup a
> point-to-point PCI-E bus connecting between two systems.
> But we haven't got a deep look of this driver yet, so we are not sure
> whether it can replace the vop framework.
> 
> > 
> > > 
> > > > If this whole subsystem is being deleted because it is not used and
> > > > never shipped, yes, please use a different solution.
> > > > 
> > > > I don't understand why you were trying to piggy-back on this
> > > > codebase if the hardware was totally different, for some reason I
> > > > thought this was the same hardware.  What exactly is this?
> > > 
> > > Not the whole codebase, just the vop framework.
> > 
> > That didn't answer the question at all, what are you all trying to
> > do here, with what hardware, that the VOP code seemed like a good fit?
> 
> Vop is a common framework which is independent of the Intel MIC
> hardware.
> We planned to reuse vop framework on two arm64 architecture devices,
> to setup the connection between two systems based on virtio over
> PCIE.

Yes, we wanted Virtio Over PCIe (VOP) to be independent of the hardware
as much as possible. It did end up under the mic/ driver subsystem
though so it would be good to attempt placing it in a generic folder
which is not tied to a specific hardware layer this time around.

Regards,
Sudeep Dutt

* Re: [PATCH char-misc-next 1/1] misc: mic: remove the MIC drivers
  @ 2020-10-28 17:22 99%   ` Dutt, Sudeep
From: Dutt, Sudeep @ 2020-10-28 17:22 UTC (permalink / raw)
  To: gregkh
  Cc: corbet, linux-kernel, sherry.sun, dmaengine, Dutt, Sudeep, Rao,
	Nikhil, Dixit, Ashutosh, vkoul, arnd, linux-doc

On Wed, 2020-10-28 at 06:54 +0100, Greg Kroah-Hartman wrote:
> On Tue, Oct 27, 2020 at 08:14:15PM -0700, Sudeep Dutt wrote:
> > This patch removes the MIC drivers from the kernel tree
> > since the corresponding devices have been discontinued.
> 
> Does "discontinued" mean "never shipped a device so no one has access to
> this hardware anymore", or does it mean "we stopped shipping devices and
> there are customers with this?"

Hi Greg,

We are not aware of any customers of the upstreamed MIC drivers. The 
drivers were upstreamed primarily to lay a foundation for enabling the
next generation MIC devices which did not ship.

> > Removing the dma and char-misc changes in one patch and
> > merging via the char-misc tree is best to avoid any
> > potential build breakage.
> > 
> > Cc: Nikhil Rao <nikhil.rao@intel.com>
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> 
> I like deleting code, can this go into 5.10-final?

Yes, we would prefer this goes into v5.10. I am hoping you can carry
the Ack from Vinod and the Reviewed-by from Sherry but I can resend the
patch with those updates in the commit message if required. I did
verify that this patch passes allmodconfig and allyesconfig builds with
your latest char-misc-next tree.

Many thanks for your help with this subsystem Greg.

Best Regards,
Sudeep Dutt

* Re: [PATCH 5/6] dma: mic_x100_dma: no need to check return value of debugfs_create functions
  @ 2019-06-12 15:44 99%   ` Sudeep Dutt
From: Sudeep Dutt @ 2019-06-12 15:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sudeep Dutt, dan.j.williams, vkoul, Ashutosh Dixit, dmaengine,
	linux-kernel

On Wed, 2019-06-12 at 14:25 +0200, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 

Thanks Greg.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vinod Koul <vkoul@kernel.org>
> Cc: dmaengine@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  drivers/dma/mic_x100_dma.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma/mic_x100_dma.c b/drivers/dma/mic_x100_dma.c
> index 6a91e28d537d..584e09661507 100644
> --- a/drivers/dma/mic_x100_dma.c
> +++ b/drivers/dma/mic_x100_dma.c
> @@ -728,10 +728,8 @@ static int mic_dma_driver_probe(struct mbus_device *mbdev)
>  	if (mic_dma_dbg) {
>  		mic_dma_dev->dbg_dir = debugfs_create_dir(dev_name(&mbdev->dev),
>  							  mic_dma_dbg);
> -		if (mic_dma_dev->dbg_dir)
> -			debugfs_create_file("mic_dma_reg", 0444,
> -					    mic_dma_dev->dbg_dir, mic_dma_dev,
> -					    &mic_dma_reg_fops);
> +		debugfs_create_file("mic_dma_reg", 0444, mic_dma_dev->dbg_dir,
> +				    mic_dma_dev, &mic_dma_reg_fops);
>  	}
>  	return 0;
>  }



* Re: [PATCH] mic: no need to check return value of debugfs_create functions
  @ 2019-06-12 15:41 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2019-06-12 15:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sudeep Dutt, Ashutosh Dixit, Arnd Bergmann, linux-kernel

On Tue, 2019-06-11 at 20:43 +0200, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 

Thanks Greg.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  drivers/misc/mic/card/mic_debugfs.c  | 18 ++----------------
>  drivers/misc/mic/cosm/cosm_debugfs.c |  4 ----
>  drivers/misc/mic/host/mic_debugfs.c  |  4 ----
>  drivers/misc/mic/scif/scif_debugfs.c |  5 -----
>  drivers/misc/mic/vop/vop_debugfs.c   |  4 ----
>  5 files changed, 2 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/misc/mic/card/mic_debugfs.c b/drivers/misc/mic/card/mic_debugfs.c
> index 7a4140874888..fa2b5fefb791 100644
> --- a/drivers/misc/mic/card/mic_debugfs.c
> +++ b/drivers/misc/mic/card/mic_debugfs.c
> @@ -63,25 +63,13 @@ DEFINE_SHOW_ATTRIBUTE(mic_intr);
>   */
>  void __init mic_create_card_debug_dir(struct mic_driver *mdrv)
>  {
> -	struct dentry *d;
> -
>  	if (!mic_dbg)
>  		return;
>  
>  	mdrv->dbg_dir = debugfs_create_dir(mdrv->name, mic_dbg);
> -	if (!mdrv->dbg_dir) {
> -		dev_err(mdrv->dev, "Cant create dbg_dir %s\n", mdrv->name);
> -		return;
> -	}
> -
> -	d = debugfs_create_file("intr_test", 0444, mdrv->dbg_dir,
> -		mdrv, &mic_intr_fops);
>  
> -	if (!d) {
> -		dev_err(mdrv->dev,
> -			"Cant create dbg intr_test %s\n", mdrv->name);
> -		return;
> -	}
> +	debugfs_create_file("intr_test", 0444, mdrv->dbg_dir, mdrv,
> +			    &mic_intr_fops);
>  }
>  
>  /**
> @@ -101,8 +89,6 @@ void mic_delete_card_debug_dir(struct mic_driver *mdrv)
>  void __init mic_init_card_debugfs(void)
>  {
>  	mic_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
> -	if (!mic_dbg)
> -		pr_err("can't create debugfs dir\n");
>  }
>  
>  /**
> diff --git a/drivers/misc/mic/cosm/cosm_debugfs.c b/drivers/misc/mic/cosm/cosm_debugfs.c
> index 71c216d0504d..340ea7171411 100644
> --- a/drivers/misc/mic/cosm/cosm_debugfs.c
> +++ b/drivers/misc/mic/cosm/cosm_debugfs.c
> @@ -105,8 +105,6 @@ void cosm_create_debug_dir(struct cosm_device *cdev)
>  
>  	scnprintf(name, sizeof(name), "mic%d", cdev->index);
>  	cdev->dbg_dir = debugfs_create_dir(name, cosm_dbg);
> -	if (!cdev->dbg_dir)
> -		return;
>  
>  	debugfs_create_file("log_buf", 0444, cdev->dbg_dir, cdev,
>  			    &log_buf_fops);
> @@ -125,8 +123,6 @@ void cosm_delete_debug_dir(struct cosm_device *cdev)
>  void cosm_init_debugfs(void)
>  {
>  	cosm_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
> -	if (!cosm_dbg)
> -		pr_err("can't create debugfs dir\n");
>  }
>  
>  void cosm_exit_debugfs(void)
> diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
> index c6e3c764699f..370f98c7b752 100644
> --- a/drivers/misc/mic/host/mic_debugfs.c
> +++ b/drivers/misc/mic/host/mic_debugfs.c
> @@ -125,8 +125,6 @@ void mic_create_debug_dir(struct mic_device *mdev)
>  
>  	scnprintf(name, sizeof(name), "mic%d", mdev->id);
>  	mdev->dbg_dir = debugfs_create_dir(name, mic_dbg);
> -	if (!mdev->dbg_dir)
> -		return;
>  
>  	debugfs_create_file("smpt", 0444, mdev->dbg_dir, mdev,
>  			    &mic_smpt_fops);
> @@ -155,8 +153,6 @@ void mic_delete_debug_dir(struct mic_device *mdev)
>  void __init mic_init_debugfs(void)
>  {
>  	mic_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
> -	if (!mic_dbg)
> -		pr_err("can't create debugfs dir\n");
>  }
>  
>  /**
> diff --git a/drivers/misc/mic/scif/scif_debugfs.c b/drivers/misc/mic/scif/scif_debugfs.c
> index a6820480105a..8fe38e7ca6e6 100644
> --- a/drivers/misc/mic/scif/scif_debugfs.c
> +++ b/drivers/misc/mic/scif/scif_debugfs.c
> @@ -103,11 +103,6 @@ DEFINE_SHOW_ATTRIBUTE(scif_rma);
>  void __init scif_init_debugfs(void)
>  {
>  	scif_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
> -	if (!scif_dbg) {
> -		dev_err(scif_info.mdev.this_device,
> -			"can't create debugfs dir scif\n");
> -		return;
> -	}
>  
>  	debugfs_create_file("scif_dev", 0444, scif_dbg, NULL, &scif_dev_fops);
>  	debugfs_create_file("scif_rma", 0444, scif_dbg, NULL, &scif_rma_fops);
> diff --git a/drivers/misc/mic/vop/vop_debugfs.c b/drivers/misc/mic/vop/vop_debugfs.c
> index 2ccef52aca23..d4551d522188 100644
> --- a/drivers/misc/mic/vop/vop_debugfs.c
> +++ b/drivers/misc/mic/vop/vop_debugfs.c
> @@ -186,10 +186,6 @@ void vop_init_debugfs(struct vop_info *vi)
>  
>  	snprintf(name, sizeof(name), "%s%d", KBUILD_MODNAME, vi->vpdev->dnode);
>  	vi->dbg = debugfs_create_dir(name, NULL);
> -	if (!vi->dbg) {
> -		pr_err("can't create debugfs dir vop\n");
> -		return;
> -	}
>  	debugfs_create_file("dp", 0444, vi->dbg, vi, &vop_dp_fops);
>  	debugfs_create_file("vdev_info", 0444, vi->dbg, vi, &vop_vdev_info_fops);
>  }



* Re: scif_insert_vma()
  @ 2019-03-18  5:47 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2019-03-18  5:47 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: Sudeep Dutt, ashutosh.dixit, linux-kernel

On Mon, 2019-03-11 at 08:45 +0200, Jarkko Sakkinen wrote:
> Hi
> 
> Just wondering what will happen if kzalloc() fails in scif_mmap.c. How
> it is recovered? I don't see anything in the VMA callbacks taking care
> of this.

Hi Jarkko,

scif_insert_vma(..) is called from scif_mmap(..) and scif_vma_open(..).
scif_mmap(..) checks for allocation failures but scif_vma_open(..) does
not on purpose.

The vm_operations_struct open(..)/close(..) callbacks do not allow
returning errors. The driver will take a reference to the VMA private
data structure irrespective of whether the allocation during the
open(..) callback succeeds or fails. The close(..) callback cleans up
the data structures from the mmap(..) or open(..) callbacks if any.
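
A rough userspace sketch of the pattern described above, in case it helps.
This is not the SCIF code: struct mapping_priv, struct vma_sketch and the
*_sketch() functions are made-up names, and malloc() stands in for kzalloc().
Since open cannot return an error, it takes its reference either way and
simply goes without the bookkeeping node when the allocation fails; close
releases only what actually exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the VMA private data shared by mappings. */
struct mapping_priv {
	int refcount;	/* references held by open mappings */
	int tracked;	/* mappings that own a bookkeeping node */
};

/* Hypothetical stand-in for a VMA. */
struct vma_sketch {
	struct mapping_priv *priv;
	void *node;	/* bookkeeping node, NULL if allocation failed */
};

/* open-style callback: returns void, so a failed allocation is tolerated. */
void vma_open_sketch(struct vma_sketch *v, bool simulate_alloc_failure)
{
	v->node = simulate_alloc_failure ? NULL : malloc(16);
	if (v->node)
		v->priv->tracked++;
	v->priv->refcount++;	/* taken whether or not the node exists */
}

/* close-style callback: cleans up the node only if open created one. */
void vma_close_sketch(struct vma_sketch *v)
{
	if (v->node) {
		free(v->node);
		v->node = NULL;
		v->priv->tracked--;
	}
	v->priv->refcount--;
}

/* Returns 0 when an open/close pair leaves all counters balanced. */
int open_close_balance(bool simulate_alloc_failure)
{
	struct mapping_priv priv = { 0, 0 };
	struct vma_sketch v = { &priv, NULL };

	vma_open_sketch(&v, simulate_alloc_failure);
	assert(priv.refcount == 1);
	vma_close_sketch(&v);
	return priv.refcount + priv.tracked;
}
```

Both open_close_balance(false) and open_close_balance(true) return 0 here,
which is the point: the failure case leaks nothing and needs no error return.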

Thanks,
Sudeep Dutt


* Re: [PATCH] mic: vop: Fix broken virtqueues
  @ 2019-01-30 16:29 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2019-01-30 16:29 UTC (permalink / raw)
  To: Vincent Whitchurch
  Cc: Sudeep Dutt, gregkh, arnd, ashutosh.dixit, linux-kernel,
	virtualization, tiwei.bie, luto, Vincent Whitchurch

On Tue, 2019-01-29 at 11:22 +0100, Vincent Whitchurch wrote:
> VOP is broken in mainline since commit 1ce9e6055fa0a9043 ("virtio_ring:
> introduce packed ring support"); attempting to use the virtqueues leads
> to various kernel crashes.  I'm testing it with my not-yet-merged
> loopback patches, but even the in-tree MIC hardware cannot work.
> 
> The problem is not in the referenced commit per se, but is due to the
> following hack in vop_find_vq() which depends on the layout of private
> structures in other source files, which that commit happened to change:
> 
>   /*
>    * To reassign the used ring here we are directly accessing
>    * struct vring_virtqueue which is a private data structure
>    * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
>    * vring_new_virtqueue() would ensure that
>    *  (&vq->vring == (struct vring *) (&vq->vq + 1));
>    */
>   vr = (struct vring *)(vq + 1);
>   vr->used = used;
> 
> Fix vop by using __vring_new_virtqueue() to create the needed vring
> layout from the start, instead of attempting to patch in the used ring
> later.  __vring_new_virtqueue() was added way back in commit
> 2a2d1382fe9dcc ("virtio: Add improved queue allocation API") in order to
> address mic's usecase, according to the commit message.
> 

Thank you for fixing this up Vincent.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
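
For anyone reading along, a small sketch of why the old "(vq + 1)" trick was
fragile. The structs below are hypothetical stand-ins, not the real virtio
definitions; the only assumption is that a later revision places an extra
field between the public part and the ring. The static_assert plays the role
of the BUILD_BUG_ON() that the removed comment asked for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All types here are hypothetical stand-ins, not the virtio structures. */
struct public_queue {
	void *priv;	/* the part other drivers are meant to see */
};

struct ring {
	void *used;
};

/* Layout the hack silently assumes: ring directly follows the public part. */
struct private_queue_v1 {
	struct public_queue vq;
	struct ring ring;
};

/* A later revision adds a field in between; outside users are not told. */
struct private_queue_v2 {
	struct public_queue vq;
	bool packed;
	struct ring ring;
};

/* Compile-time guard for the v1 assumption. */
static_assert(offsetof(struct private_queue_v1, ring) ==
	      sizeof(struct public_queue),
	      "ring must directly follow the public queue");

/* "(struct ring *)(vq + 1)" points at byte offset sizeof(*vq). */
int hack_hits_ring_v1(void)
{
	return offsetof(struct private_queue_v1, ring) ==
	       sizeof(struct public_queue);
}

int hack_hits_ring_v2(void)
{
	return offsetof(struct private_queue_v2, ring) ==
	       sizeof(struct public_queue);
}
```

With the v2 layout the computed address no longer lands on the ring, and
nothing flags it unless a guard like the static_assert is in place. Building
the ring up front and passing it in avoids the dependency altogether.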

> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---
>  drivers/misc/mic/vop/vop_main.c | 60 +++++++++++++++++++--------------
>  1 file changed, 34 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c
> index d2b9782eee87..fef45bf6d519 100644
> --- a/drivers/misc/mic/vop/vop_main.c
> +++ b/drivers/misc/mic/vop/vop_main.c
> @@ -284,6 +284,26 @@ static void vop_del_vqs(struct virtio_device *dev)
>  		vop_del_vq(vq, idx++);
>  }
>  
> +static struct virtqueue *vop_new_virtqueue(unsigned int index,
> +				      unsigned int num,
> +				      struct virtio_device *vdev,
> +				      bool context,
> +				      void *pages,
> +				      bool (*notify)(struct virtqueue *vq),
> +				      void (*callback)(struct virtqueue *vq),
> +				      const char *name,
> +				      void *used)
> +{
> +	bool weak_barriers = false;
> +	struct vring vring;
> +
> +	vring_init(&vring, num, pages, MIC_VIRTIO_RING_ALIGN);
> +	vring.used = used;
> +
> +	return __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
> +				     notify, callback, name);
> +}
> +
>  /*
>   * This routine will assign vring's allocated in host/io memory. Code in
>   * virtio_ring.c however continues to access this io memory as if it were local
> @@ -303,7 +323,6 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
>  	struct _mic_vring_info __iomem *info;
>  	void *used;
>  	int vr_size, _vr_size, err, magic;
> -	struct vring *vr;
>  	u8 type = ioread8(&vdev->desc->type);
>  
>  	if (index >= ioread8(&vdev->desc->num_vq))
> @@ -322,17 +341,7 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
>  		return ERR_PTR(-ENOMEM);
>  	vdev->vr[index] = va;
>  	memset_io(va, 0x0, _vr_size);
> -	vq = vring_new_virtqueue(
> -				index,
> -				le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN,
> -				dev,
> -				false,
> -				ctx,
> -				(void __force *)va, vop_notify, callback, name);
> -	if (!vq) {
> -		err = -ENOMEM;
> -		goto unmap;
> -	}
> +
>  	info = va + _vr_size;
>  	magic = ioread32(&info->magic);
>  
> @@ -341,7 +350,6 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
>  		goto unmap;
>  	}
>  
> -	/* Allocate and reassign used ring now */
>  	vdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
>  					     sizeof(struct vring_used_elem) *
>  					     le16_to_cpu(config.num));
> @@ -351,8 +359,17 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
>  		err = -ENOMEM;
>  		dev_err(_vop_dev(vdev), "%s %d err %d\n",
>  			__func__, __LINE__, err);
> -		goto del_vq;
> +		goto unmap;
> +	}
> +
> +	vq = vop_new_virtqueue(index, le16_to_cpu(config.num), dev, ctx,
> +			       (void __force *)va, vop_notify, callback,
> +			       name, used);
> +	if (!vq) {
> +		err = -ENOMEM;
> +		goto free_used;
>  	}
> +
>  	vdev->used[index] = dma_map_single(&vpdev->dev, used,
>  					    vdev->used_size[index],
>  					    DMA_BIDIRECTIONAL);
> @@ -360,26 +377,17 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
>  		err = -ENOMEM;
>  		dev_err(_vop_dev(vdev), "%s %d err %d\n",
>  			__func__, __LINE__, err);
> -		goto free_used;
> +		goto del_vq;
>  	}
>  	writeq(vdev->used[index], &vqconfig->used_address);
> -	/*
> -	 * To reassign the used ring here we are directly accessing
> -	 * struct vring_virtqueue which is a private data structure
> -	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
> -	 * vring_new_virtqueue() would ensure that
> -	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
> -	 */
> -	vr = (struct vring *)(vq + 1);
> -	vr->used = used;
>  
>  	vq->priv = vdev;
>  	return vq;
> +del_vq:
> +	vring_del_virtqueue(vq);
>  free_used:
>  	free_pages((unsigned long)used,
>  		   get_order(vdev->used_size[index]));
> -del_vq:
> -	vring_del_virtqueue(vq);
>  unmap:
>  	vpdev->hw_ops->unmap(vpdev, vdev->vr[index]);
>  	return ERR_PTR(err);



* Re: [PATCH v2] misc: mic: fix a DMA pool free failure
  @ 2019-01-03  1:50 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2019-01-03  1:50 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Sudeep Dutt, Kangjie Lu, Ashutosh Dixit, Arnd Bergmann,
	Greg Kroah-Hartman, open list

On Tue, 2018-12-04 at 09:16 -0600, Wenwen Wang wrote:
> In _scif_prog_signal(), a DMA pool is allocated if the MIC Coprocessor is
> not X100, i.e., the boolean variable 'x100' is false. This DMA pool will be
> freed eventually through the callback function scif_prog_signal_cb() with
> the parameter of 'status', which actually points to the start of DMA pool.
> Specifically, in scif_prog_signal_cb(), the 'ep' field and the
> 'src_dma_addr' field of 'status' are used to free the DMA pool by invoking
> dma_pool_free(). Given that 'status' points to the start address of the DMA
> pool, both 'status->ep' and 'status->src_dma_addr' are in the DMA pool. And
> so, the device has the permission to access them. Even worse, a malicious
> device can modify them. As a result, dma_pool_free() will not succeed.
> 
> To avoid the above issue, this patch introduces a new data structure, i.e.,
> scif_cb_arg, to store the arguments required by the call back function. A
> variable 'cb_arg' is allocated in _scif_prog_signal() to pass the
> arguments. 'cb_arg' will be freed after dma_pool_free() in
> scif_prog_signal_cb().
> 

Thanks for incorporating the previous feedback Wenwen.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

> Signed-off-by: Wenwen Wang <wang6495@umn.edu>
> ---
>  drivers/misc/mic/scif/scif_fence.c | 22 +++++++++++++++++-----
>  drivers/misc/mic/scif/scif_rma.h   | 13 +++++++++++++
>  2 files changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_fence.c b/drivers/misc/mic/scif/scif_fence.c
> index 7bb929f..2e7ce6a 100644
> --- a/drivers/misc/mic/scif/scif_fence.c
> +++ b/drivers/misc/mic/scif/scif_fence.c
> @@ -195,10 +195,11 @@ static inline void *scif_get_local_va(off_t off, struct scif_window *window)
>  
>  static void scif_prog_signal_cb(void *arg)
>  {
> -	struct scif_status *status = arg;
> +	struct scif_cb_arg *cb_arg = arg;
>  
> -	dma_pool_free(status->ep->remote_dev->signal_pool, status,
> -		      status->src_dma_addr);
> +	dma_pool_free(cb_arg->ep->remote_dev->signal_pool, cb_arg->status,
> +		      cb_arg->src_dma_addr);
> +	kfree(cb_arg);
>  }
>  
>  static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
> @@ -209,6 +210,7 @@ static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
>  	bool x100 = !is_dma_copy_aligned(chan->device, 1, 1, 1);
>  	struct dma_async_tx_descriptor *tx;
>  	struct scif_status *status = NULL;
> +	struct scif_cb_arg *cb_arg = NULL;
>  	dma_addr_t src;
>  	dma_cookie_t cookie;
>  	int err;
> @@ -257,8 +259,16 @@ static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
>  		goto dma_fail;
>  	}
>  	if (!x100) {
> +		cb_arg = kmalloc(sizeof(*cb_arg), GFP_KERNEL);
> +		if (!cb_arg) {
> +			err = -ENOMEM;
> +			goto dma_fail;
> +		}
> +		cb_arg->src_dma_addr = src;
> +		cb_arg->status = status;
> +		cb_arg->ep = ep;
>  		tx->callback = scif_prog_signal_cb;
> -		tx->callback_param = status;
> +		tx->callback_param = cb_arg;
>  	}
>  	cookie = tx->tx_submit(tx);
>  	if (dma_submit_error(cookie)) {
> @@ -270,9 +280,11 @@ static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
>  	dma_async_issue_pending(chan);
>  	return 0;
>  dma_fail:
> -	if (!x100)
> +	if (!x100) {
>  		dma_pool_free(ep->remote_dev->signal_pool, status,
>  			      src - offsetof(struct scif_status, val));
> +		kfree(cb_arg);
> +	}
>  alloc_fail:
>  	return err;
>  }
> diff --git a/drivers/misc/mic/scif/scif_rma.h b/drivers/misc/mic/scif/scif_rma.h
> index fa67222..84af303 100644
> --- a/drivers/misc/mic/scif/scif_rma.h
> +++ b/drivers/misc/mic/scif/scif_rma.h
> @@ -206,6 +206,19 @@ struct scif_status {
>  };
>  
>  /*
> + * struct scif_cb_arg - Stores the argument of the callback func
> + *
> + * @src_dma_addr: Source buffer DMA address
> + * @status: DMA status
> + * @ep: SCIF endpoint
> + */
> +struct scif_cb_arg {
> +	dma_addr_t src_dma_addr;
> +	struct scif_status *status;
> +	struct scif_endpt *ep;
> +};
> +
> +/*
>   * struct scif_window - Registration Window for Self and Remote
>   *
>   * @nr_pages: Number of pages which is defined as a s64 instead of an int



* Re: [PATCH] misc: mic: fix a DMA pool free failure
  @ 2018-11-05  2:23 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2018-11-05  2:23 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Sudeep Dutt, Kangjie Lu, Ashutosh Dixit, Arnd Bergmann,
	Greg Kroah-Hartman, open list

On Thu, 2018-10-18 at 14:46 -0500, Wenwen Wang wrote:
> In _scif_prog_signal(), a DMA pool is allocated if the MIC Coprocessor is
> not X100, i.e., the boolean variable 'x100' is false. This DMA pool will be
> freed eventually through the callback function scif_prog_signal_cb() with
> the parameter of 'status', which actually points to the start of DMA pool.
> Specifically, in scif_prog_signal_cb(), the 'ep' field and the
> 'src_dma_addr' field of 'status' are used to free the DMA pool by invoking
> dma_pool_free(). Given that 'status' points to the start address of the DMA
> pool, both 'status->ep' and 'status->src_dma_addr' are in the DMA pool. And
> so, the device has the permission to access them. Even worse, a malicious
> device can modify them. As a result, dma_pool_free() will not succeed.
> 
> To avoid the above issue, this patch introduces a new data structure, i.e.,
> scif_cb_arg, to store the arguments required by the call back function. A
> variable 'cb_arg' is allocated in _scif_prog_signal() to pass the
> arguments. 'cb_arg' will be freed after dma_pool_free() in
> scif_prog_signal_cb().
> 
> Signed-off-by: Wenwen Wang <wang6495@umn.edu>
> ---
>  drivers/misc/mic/scif/scif_fence.c | 17 +++++++++++++----
>  drivers/misc/mic/scif/scif_rma.h   | 14 ++++++++++++++
>  2 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_fence.c b/drivers/misc/mic/scif/scif_fence.c
> index cac3bcc..30f7d9b 100644
> --- a/drivers/misc/mic/scif/scif_fence.c
> +++ b/drivers/misc/mic/scif/scif_fence.c
> @@ -195,10 +195,11 @@ static inline void *scif_get_local_va(off_t off, struct scif_window *window)
>  
>  static void scif_prog_signal_cb(void *arg)
>  {
> -	struct scif_status *status = arg;
> +	struct scif_cb_arg *cb_arg = arg;
>  
> -	dma_pool_free(status->ep->remote_dev->signal_pool, status,
> -		      status->src_dma_addr);
> +	dma_pool_free(cb_arg->ep->remote_dev->signal_pool, cb_arg->status,
> +		      cb_arg->src_dma_addr);
> +	kfree(cb_arg);
>  }
>  
>  static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
> @@ -209,6 +210,7 @@ static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
>  	bool x100 = !is_dma_copy_aligned(chan->device, 1, 1, 1);
>  	struct dma_async_tx_descriptor *tx;
>  	struct scif_status *status = NULL;
> +	struct scif_cb_arg *cb_arg = NULL;
>  	dma_addr_t src;
>  	dma_cookie_t cookie;
>  	int err;
> @@ -257,8 +259,15 @@ static int _scif_prog_signal(scif_epd_t epd, dma_addr_t dst, u64 val)
>  		goto dma_fail;
>  	}
>  	if (!x100) {
> +		err = -ENOMEM;

Should err be set to -ENOMEM only if the cb_arg allocation fails?

> +		cb_arg = kmalloc(sizeof(*cb_arg), GFP_KERNEL);
> +		if (!cb_arg)
> +			goto dma_fail;
> +		cb_arg->src_dma_addr = src;
> +		cb_arg->status = status;
> +		cb_arg->ep = ep;
>  		tx->callback = scif_prog_signal_cb;
> -		tx->callback_param = status;
> +		tx->callback_param = cb_arg;
>  	}

cb_arg should be freed if there is a dma_submit_error(..) below in the
dma_fail path.
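
To illustrate both comments, here is a minimal userspace sketch of the error
path I have in mind. Everything in it is hypothetical (prog_signal_sketch(),
tracked_alloc(), the live_allocs counter, and -EIO as the submit error are
assumptions for illustration, not the driver code): err is assigned only at
the point of failure, and the shared dma_fail label releases cb_arg as well.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Counts outstanding allocations so leaks are visible. */
int live_allocs;

/* Hypothetical allocator that can be told to fail. */
void *tracked_alloc(bool fail)
{
	void *p = fail ? NULL : malloc(32);

	if (p)
		live_allocs++;
	return p;
}

void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Sketch of the submit path with a single cleanup label. */
int prog_signal_sketch(bool fail_cb_alloc, bool fail_submit)
{
	void *status, *cb_arg = NULL;
	int err;

	status = tracked_alloc(false);
	if (!status)
		return -ENOMEM;

	cb_arg = tracked_alloc(fail_cb_alloc);
	if (!cb_arg) {
		err = -ENOMEM;	/* set only when this allocation fails */
		goto dma_fail;
	}

	if (fail_submit) {
		err = -EIO;	/* assumed error code for a failed submit */
		goto dma_fail;
	}

	/* Success: the completion callback would free both objects. */
	tracked_free(status);
	tracked_free(cb_arg);
	return 0;

dma_fail:
	tracked_free(status);
	tracked_free(cb_arg);	/* safe when cb_arg is still NULL */
	return err;
}
```

Every path through the sketch leaves live_allocs at zero, including the
submit failure that happens after cb_arg has been allocated.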

Wenwen, can you please fix these up and resend the patch?

Thanks,
Sudeep Dutt


* Re: rdma-core doesn't install driver.h, broke libibscif
  @ 2018-10-18 15:10 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2018-10-18 15:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sudeep Dutt, Woodruff, Robert J, james harvey, linux-rdma,
	Nikhil Rao, Ashutosh Dixit, Greg Kroah-Hartman, linux-kernel

On Wed, 2018-10-17 at 01:54 -0700, Christoph Hellwig wrote:
> On Mon, Oct 15, 2018 at 05:53:44PM +0000, Woodruff, Robert J wrote:
> > James Harvey wrote,
> > 
> > >Short: Is libibscif dead, and should OS repositories remove it?
> > 
> > Libibscif is for an old product, Intel's KNC, that is no longer sold and the S/W is frozen. The open source libibscif is no longer maintained.
> > It has already been removed from the latest community OFED distribution. The individual package on the OFA downloads is for archive purposes only.
> 
> Does this mean we can kill the kernel scif code as well? 

The MIC COSM drivers are using the kernel SCIF code. These drivers have
been upstream since 2015 and we are not sure who is using them. Our
preference will be to keep these drivers in the kernel for now.

Thanks,
Sudeep Dutt


* Re: [PATCH] misc: mic: SCIF Fix scif_get_new_port() error handling
  @ 2018-08-08 23:13 99% ` Sudeep Dutt
From: Sudeep Dutt @ 2018-08-08 23:13 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sudeep Dutt, Dixit, Ashutosh, Arnd Bergmann, Greg Kroah-Hartman,
	Al Viro, Gustavo A. R. Silva, linux-kernel, kernel-janitors

On Thu, 2018-08-02 at 01:42 -0700, Dan Carpenter wrote:
> There are only 2 callers of scif_get_new_port() and both appear to get
> the error handling wrong.  Both treat zero returns as error, but it
> actually returns negative error codes and >= 0 on success.
> 
> Fixes: e9089f43c9a7 ("misc: mic: SCIF open close bind and listen APIs")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> I have not tested this patch.  Please review carefully.

Thanks for the patch, Dan.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
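
As a rough illustration of why the zero check was wrong, here is a userspace sketch. It assumes, as the description above states, that the allocator returns a negative error code on failure and a value >= 0 on success; `get_new_port_model()` and both `bind_*_model()` helpers are hypothetical names, not the SCIF functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical allocator: -ENOSPC when exhausted, else a port >= 0. */
static int get_new_port_model(int next_free, int limit)
{
	if (next_free >= limit)
		return -ENOSPC;
	return next_free;
}

/*
 * Old pattern: store the result straight into a u16 and test for zero.
 * A negative errno truncates to a non-zero u16, so the failure is
 * mistaken for a valid port.
 */
static int bind_old_model(int next_free, int limit, uint16_t *pn)
{
	*pn = (uint16_t)get_new_port_model(next_free, limit);
	if (!*pn)
		return -ENOSPC;
	return 0;
}

/*
 * Pattern from the patch: keep the result in an int, test for < 0,
 * and only then narrow it to the port type.
 */
static int bind_new_model(int next_free, int limit, uint16_t *pn)
{
	int ret = get_new_port_model(next_free, limit);

	if (ret < 0)
		return ret;
	*pn = (uint16_t)ret;
	return 0;
}

/* Small self-check of the two behaviours on an exhausted allocator. */
void port_self_check(void)
{
	uint16_t pn = 0;

	assert(bind_old_model(4, 4, &pn) == 0);		/* error missed */
	assert(bind_new_model(4, 4, &pn) == -ENOSPC);	/* error reported */
}
```

In the old form the failure only shows up later, as a bogus port number; in the new form the negative return value is propagated before anything is stored.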

> 
> diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
> index 463f06d0b4ef..8dd0ccedeb94 100644
> --- a/drivers/misc/mic/scif/scif_api.c
> +++ b/drivers/misc/mic/scif/scif_api.c
> @@ -371,11 +371,10 @@ int scif_bind(scif_epd_t epd, u16 pn)
>  			goto scif_bind_exit;
>  		}
>  	} else {
> -		pn = scif_get_new_port();
> -		if (!pn) {
> -			ret = -ENOSPC;
> +		ret = scif_get_new_port();
> +		if (ret < 0)
>  			goto scif_bind_exit;
> -		}
> +		pn = ret;
>  	}
>  
>  	ep->state = SCIFEP_BOUND;
> @@ -649,13 +648,12 @@ int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block)
>  			err = -EISCONN;
>  		break;
>  	case SCIFEP_UNBOUND:
> -		ep->port.port = scif_get_new_port();
> -		if (!ep->port.port) {
> -			err = -ENOSPC;
> -		} else {
> -			ep->port.node = scif_info.nodeid;
> -			ep->conn_async_state = ASYNC_CONN_IDLE;
> -		}
> +		err = scif_get_new_port();
> +		if (err < 0)
> +			break;
> +		ep->port.port = err;
> +		ep->port.node = scif_info.nodeid;
> +		ep->conn_async_state = ASYNC_CONN_IDLE;
>  		/* Fall through */
>  	case SCIFEP_BOUND:
>  		/*




* Re: ERROR: "vring_del_virtqueue" [drivers/misc/mic/vop/vop.ko] undefined!
  @ 2016-04-30 18:56 99%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-04-30 18:56 UTC (permalink / raw)
  To: Randy Dunlap, Greg Kroah-Hartman
  Cc: Sudeep Dutt, kbuild test robot, kbuild-all, linux-kernel, Ashutosh Dixit

On Sat, 2016-04-30 at 09:49 -0700, Randy Dunlap wrote:
> On 04/18/16 08:05, kbuild test robot wrote:
> > Hi Sudeep,
> > 
> > FYI, the error/warning still remains.
> 
> and still happens in linux-next-20160428...
> 

The patch to fix this build failure was posted at
https://lkml.org/lkml/2016/4/5/7
Greg, can you please apply the patch?

Thanks,
Sudeep Dutt

> 
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > head:   c3b46c73264b03000d1e18b22f5caf63332547c9
> > commit: 8810df37762746657cfe84014a8f30758e8f366a misc: mic: Enable VOP debugfs and driver build
> > date:   10 weeks ago
> > config: x86_64-randconfig-s2-04182146 (attached as .config)
> > reproduce:
> >         git checkout 8810df37762746657cfe84014a8f30758e8f366a
> >         # save the attached .config to linux build tree
> >         make ARCH=x86_64 
> > 
> > All errors (new ones prefixed by >>):
> > 
> >>> ERROR: "vring_del_virtqueue" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "register_virtio_device" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "unregister_virtio_device" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "vring_new_virtqueue" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "virtio_config_changed" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "vring_interrupt" [drivers/misc/mic/vop/vop.ko] undefined!
> >>> ERROR: "vring_transport_features" [drivers/misc/mic/vop/vop.ko] undefined!
> > 
> > ---
> > 0-DAY kernel test infrastructure                Open Source Technology Center
> > https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> > 
> 
> 


* Re: [PATCH char-misc-linus] misc: mic: Fix randconfig build error
  @ 2016-04-05  5:00 99%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-04-05  5:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sudeep Dutt, linux-kernel, Ashutosh Dixit, Fengguang Wu

On Mon, 2016-04-04 at 21:41 -0700, Greg Kroah-Hartman wrote:
> On Mon, Apr 04, 2016 at 09:32:30PM -0700, Sudeep Dutt wrote:
> > Fixes randconfig build error reported at
> > https://lkml.org/lkml/2016/4/3/135 by ensuring that
> > the VOP driver selects VIRTIO.
> > 
> > Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > ---
> >  drivers/misc/mic/Kconfig | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
> > index 2e4f3ba..89e5917 100644
> > --- a/drivers/misc/mic/Kconfig
> > +++ b/drivers/misc/mic/Kconfig
> > @@ -132,6 +132,7 @@ config VOP
> >  	tristate "VOP Driver"
> >  	depends on 64BIT && PCI && X86 && VOP_BUS
> >  	select VHOST_RING
> > +	select VIRTIO
> 
> Shouldn't it depend on this instead?

Hi Greg,

The documentation for "config VIRTIO" states that "This option is
selected by any driver which implements the virtio bus". I verified that
this patch fixes the build for the randconfig which was failing earlier.

Thanks,
Sudeep Dutt


* [PATCH char-misc-linus] misc: mic: Fix randconfig build error
@ 2016-04-05  4:32 99% Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2016-04-05  4:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Ashutosh Dixit, Fengguang Wu, Sudeep Dutt

Fixes randconfig build error reported at
https://lkml.org/lkml/2016/4/3/135 by ensuring that
the VOP driver selects VIRTIO.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 2e4f3ba..89e5917 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -132,6 +132,7 @@ config VOP
 	tristate "VOP Driver"
 	depends on 64BIT && PCI && X86 && VOP_BUS
 	select VHOST_RING
+	select VIRTIO
 	help
 	  This enables VOP (Virtio over PCIe) Driver support for the Intel
 	  Many Integrated Core (MIC) family of PCIe form factor coprocessor
-- 
1.8.2.1


* [PATCH char-misc-testing v2 1/8] misc: mic: Remove MIC X100 host virtio functionality
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
@ 2016-02-08 23:48 29% ` Sudeep Dutt
  2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch deletes the virtio functionality from the MIC X100 host
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/host/Makefile      |   2 -
 drivers/misc/mic/host/mic_device.h  |   6 -
 drivers/misc/mic/host/mic_fops.h    |  32 --
 drivers/misc/mic/host/mic_virtio.h  | 155 -------
 drivers/misc/mic/host/mic_boot.c    |   2 -
 drivers/misc/mic/host/mic_debugfs.c | 190 ---------
 drivers/misc/mic/host/mic_fops.c    | 222 ----------
 drivers/misc/mic/host/mic_main.c    |  48 +--
 drivers/misc/mic/host/mic_virtio.c  | 811 ------------------------------------
 9 files changed, 4 insertions(+), 1464 deletions(-)
 delete mode 100644 drivers/misc/mic/host/mic_fops.h
 delete mode 100644 drivers/misc/mic/host/mic_virtio.h
 delete mode 100644 drivers/misc/mic/host/mic_fops.c
 delete mode 100644 drivers/misc/mic/host/mic_virtio.c

diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
index 004d3db..f3b5023 100644
--- a/drivers/misc/mic/host/Makefile
+++ b/drivers/misc/mic/host/Makefile
@@ -9,5 +9,3 @@ mic_host-objs += mic_smpt.o
 mic_host-objs += mic_intr.o
 mic_host-objs += mic_boot.o
 mic_host-objs += mic_debugfs.o
-mic_host-objs += mic_fops.o
-mic_host-objs += mic_virtio.o
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 461184a..8460de1 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -64,9 +64,6 @@ extern struct cosm_hw_ops cosm_hw_ops;
  * @bootaddr: MIC boot address.
  * @dp: virtio device page
  * @dp_dma_addr: virtio device page DMA address.
- * @name: name for the misc char device
- * @miscdev: registered misc char device
- * @vdev_list: list of virtio devices.
  * @dma_mbdev: MIC BUS DMA device.
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
@@ -91,9 +88,6 @@ struct mic_device {
 	u32 bootaddr;
 	void *dp;
 	dma_addr_t dp_dma_addr;
-	char name[16];
-	struct miscdevice miscdev;
-	struct list_head vdev_list;
 	struct mbus_device *dma_mbdev;
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
diff --git a/drivers/misc/mic/host/mic_fops.h b/drivers/misc/mic/host/mic_fops.h
deleted file mode 100644
index dc3893d..0000000
--- a/drivers/misc/mic/host/mic_fops.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#ifndef _MIC_FOPS_H_
-#define _MIC_FOPS_H_
-
-int mic_open(struct inode *inode, struct file *filp);
-int mic_release(struct inode *inode, struct file *filp);
-ssize_t mic_read(struct file *filp, char __user *buf,
-			size_t count, loff_t *pos);
-long mic_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
-int mic_mmap(struct file *f, struct vm_area_struct *vma);
-unsigned int mic_poll(struct file *f, poll_table *wait);
-
-#endif
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
deleted file mode 100644
index a80631f..0000000
--- a/drivers/misc/mic/host/mic_virtio.h
+++ /dev/null
@@ -1,155 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#ifndef MIC_VIRTIO_H
-#define MIC_VIRTIO_H
-
-#include <linux/virtio_config.h>
-#include <linux/mic_ioctl.h>
-
-/*
- * Note on endianness.
- * 1. Host can be both BE or LE
- * 2. Guest/card is LE. Host uses le_to_cpu to access desc/avail
- *    rings and ioreadXX/iowriteXX to access used ring.
- * 3. Device page exposed by host to guest contains LE values. Guest
- *    accesses these using ioreadXX/iowriteXX etc. This way in general we
- *    obey the virtio spec according to which guest works with native
- *    endianness and host is aware of guest endianness and does all
- *    required endianness conversion.
- * 4. Data provided from user space to guest (in ADD_DEVICE and
- *    CONFIG_CHANGE ioctl's) is not interpreted by the driver and should be
- *    in guest endianness.
- */
-
-/**
- * struct mic_vringh - Virtio ring host information.
- *
- * @vring: The MIC vring used for setting up user space mappings.
- * @vrh: The host VRINGH used for accessing the card vrings.
- * @riov: The VRINGH read kernel IOV.
- * @wiov: The VRINGH write kernel IOV.
- * @vr_mutex: Mutex for synchronizing access to the VRING.
- * @buf: Temporary kernel buffer used to copy in/out data
- * from/to the card via DMA.
- * @buf_da: dma address of buf.
- * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
- * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
- */
-struct mic_vringh {
-	struct mic_vring vring;
-	struct vringh vrh;
-	struct vringh_kiov riov;
-	struct vringh_kiov wiov;
-	struct mutex vr_mutex;
-	void *buf;
-	dma_addr_t buf_da;
-	struct mic_vdev *mvdev;
-	u16 head;
-};
-
-/**
- * struct mic_vdev - Host information for a card Virtio device.
- *
- * @virtio_id - Virtio device id.
- * @waitq - Waitqueue to allow ring3 apps to poll.
- * @mdev - Back pointer to host MIC device.
- * @poll_wake - Used for waking up threads blocked in poll.
- * @out_bytes - Debug stats for number of bytes copied from host to card.
- * @in_bytes - Debug stats for number of bytes copied from card to host.
- * @out_bytes_dma - Debug stats for number of bytes copied from host to card
- * using DMA.
- * @in_bytes_dma - Debug stats for number of bytes copied from card to host
- * using DMA.
- * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
- * the transfer length did not have the required DMA alignment.
- * @tx_dst_unaligned - Debug stats for number of bytes copied where the
- * destination address on the card did not have the required DMA alignment.
- * @mvr - Store per VRING data structures.
- * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
- * @dd - Virtio device descriptor.
- * @dc - Virtio device control fields.
- * @list - List of Virtio devices.
- * @virtio_db - The doorbell used by the card to interrupt the host.
- * @virtio_cookie - The cookie returned while requesting interrupts.
- */
-struct mic_vdev {
-	int virtio_id;
-	wait_queue_head_t waitq;
-	struct mic_device *mdev;
-	int poll_wake;
-	unsigned long out_bytes;
-	unsigned long in_bytes;
-	unsigned long out_bytes_dma;
-	unsigned long in_bytes_dma;
-	unsigned long tx_len_unaligned;
-	unsigned long tx_dst_unaligned;
-	struct mic_vringh mvr[MIC_MAX_VRINGS];
-	struct work_struct virtio_bh_work;
-	struct mic_device_desc *dd;
-	struct mic_device_ctrl *dc;
-	struct list_head list;
-	int virtio_db;
-	struct mic_irq *virtio_cookie;
-};
-
-void mic_virtio_uninit(struct mic_device *mdev);
-int mic_virtio_add_device(struct mic_vdev *mvdev,
-			void __user *argp);
-void mic_virtio_del_device(struct mic_vdev *mvdev);
-int mic_virtio_config_change(struct mic_vdev *mvdev,
-			void __user *argp);
-int mic_virtio_copy_desc(struct mic_vdev *mvdev,
-	struct mic_copy_desc *request);
-void mic_virtio_reset_devices(struct mic_device *mdev);
-void mic_bh_handler(struct work_struct *work);
-
-/* Helper API to obtain the MIC PCIe device */
-static inline struct device *mic_dev(struct mic_vdev *mvdev)
-{
-	return &mvdev->mdev->pdev->dev;
-}
-
-/* Helper API to check if a virtio device is initialized */
-static inline int mic_vdev_inited(struct mic_vdev *mvdev)
-{
-	/* Device has not been created yet */
-	if (!mvdev->dd || !mvdev->dd->type) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-
-	/* Device has been removed/deleted */
-	if (mvdev->dd->type == -1) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -ENODEV);
-		return -ENODEV;
-	}
-
-	return 0;
-}
-
-/* Helper API to check if a virtio device is running */
-static inline bool mic_vdevup(struct mic_vdev *mvdev)
-{
-	return !!mvdev->dd->status;
-}
-#endif
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index 7845564..3df305f 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -28,7 +28,6 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
-#include "mic_virtio.h"
 
 static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
 {
@@ -423,7 +422,6 @@ static void _mic_stop(struct cosm_device *cdev, bool force)
 	 * will be the first to be registered and the last to be
 	 * unregistered.
 	 */
-	mic_virtio_reset_devices(mdev);
 	scif_unregister_device(mdev->scdev);
 	mic_free_dma_chans(mdev);
 	mbus_unregister_device(mdev->dma_mbdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 1058160..0a9daba 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -26,7 +26,6 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
-#include "mic_virtio.h"
 
 /* Debugfs parent dir */
 static struct dentry *mic_dbg;
@@ -100,190 +99,6 @@ static const struct file_operations post_code_ops = {
 	.release = mic_post_code_debug_release
 };
 
-static int mic_dp_show(struct seq_file *s, void *pos)
-{
-	struct mic_device *mdev = s->private;
-	struct mic_device_desc *d;
-	struct mic_device_ctrl *dc;
-	struct mic_vqconfig *vqconfig;
-	__u32 *features;
-	__u8 *config;
-	struct mic_bootparam *bootparam = mdev->dp;
-	int i, j;
-
-	seq_printf(s, "Bootparam: magic 0x%x\n",
-		   bootparam->magic);
-	seq_printf(s, "Bootparam: h2c_config_db %d\n",
-		   bootparam->h2c_config_db);
-	seq_printf(s, "Bootparam: node_id %d\n",
-		   bootparam->node_id);
-	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
-		   bootparam->c2h_scif_db);
-	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
-		   bootparam->h2c_scif_db);
-	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
-		   bootparam->scif_host_dma_addr);
-	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
-		   bootparam->scif_card_dma_addr);
-
-
-	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
-	     i += mic_total_desc_size(d)) {
-		d = mdev->dp + i;
-		dc = (void *)d + mic_aligned_desc_size(d);
-
-		/* end of list */
-		if (d->type == 0)
-			break;
-
-		if (d->type == -1)
-			continue;
-
-		seq_printf(s, "Type %d ", d->type);
-		seq_printf(s, "Num VQ %d ", d->num_vq);
-		seq_printf(s, "Feature Len %d\n", d->feature_len);
-		seq_printf(s, "Config Len %d ", d->config_len);
-		seq_printf(s, "Shutdown Status %d\n", d->status);
-
-		for (j = 0; j < d->num_vq; j++) {
-			vqconfig = mic_vq_config(d) + j;
-			seq_printf(s, "vqconfig[%d]: ", j);
-			seq_printf(s, "address 0x%llx ", vqconfig->address);
-			seq_printf(s, "num %d ", vqconfig->num);
-			seq_printf(s, "used address 0x%llx\n",
-				   vqconfig->used_address);
-		}
-
-		features = (__u32 *)mic_vq_features(d);
-		seq_printf(s, "Features: Host 0x%x ", features[0]);
-		seq_printf(s, "Guest 0x%x\n", features[1]);
-
-		config = mic_vq_configspace(d);
-		for (j = 0; j < d->config_len; j++)
-			seq_printf(s, "config[%d]=%d\n", j, config[j]);
-
-		seq_puts(s, "Device control:\n");
-		seq_printf(s, "Config Change %d ", dc->config_change);
-		seq_printf(s, "Vdev reset %d\n", dc->vdev_reset);
-		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
-		seq_printf(s, "Host ack %d\n", dc->host_ack);
-		seq_printf(s, "Used address updated %d ",
-			   dc->used_address_updated);
-		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
-		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
-		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
-	}
-
-	return 0;
-}
-
-static int mic_dp_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, mic_dp_show, inode->i_private);
-}
-
-static int mic_dp_debug_release(struct inode *inode, struct file *file)
-{
-	return single_release(inode, file);
-}
-
-static const struct file_operations dp_ops = {
-	.owner   = THIS_MODULE,
-	.open    = mic_dp_debug_open,
-	.read    = seq_read,
-	.llseek  = seq_lseek,
-	.release = mic_dp_debug_release
-};
-
-static int mic_vdev_info_show(struct seq_file *s, void *unused)
-{
-	struct mic_device *mdev = s->private;
-	struct list_head *pos, *tmp;
-	struct mic_vdev *mvdev;
-	int i, j;
-
-	mutex_lock(&mdev->mic_mutex);
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		mvdev = list_entry(pos, struct mic_vdev, list);
-		seq_printf(s, "VDEV type %d state %s in %ld out %ld\n",
-			   mvdev->virtio_id,
-			   mic_vdevup(mvdev) ? "UP" : "DOWN",
-			   mvdev->in_bytes,
-			   mvdev->out_bytes);
-		for (i = 0; i < MIC_MAX_VRINGS; i++) {
-			struct vring_desc *desc;
-			struct vring_avail *avail;
-			struct vring_used *used;
-			struct mic_vringh *mvr = &mvdev->mvr[i];
-			struct vringh *vrh = &mvr->vrh;
-			int num = vrh->vring.num;
-			if (!num)
-				continue;
-			desc = vrh->vring.desc;
-			seq_printf(s, "vring i %d avail_idx %d",
-				   i, mvr->vring.info->avail_idx & (num - 1));
-			seq_printf(s, " vring i %d avail_idx %d\n",
-				   i, mvr->vring.info->avail_idx);
-			seq_printf(s, "vrh i %d weak_barriers %d",
-				   i, vrh->weak_barriers);
-			seq_printf(s, " last_avail_idx %d last_used_idx %d",
-				   vrh->last_avail_idx, vrh->last_used_idx);
-			seq_printf(s, " completed %d\n", vrh->completed);
-			for (j = 0; j < num; j++) {
-				seq_printf(s, "desc[%d] addr 0x%llx len %d",
-					   j, desc->addr, desc->len);
-				seq_printf(s, " flags 0x%x next %d\n",
-					   desc->flags, desc->next);
-				desc++;
-			}
-			avail = vrh->vring.avail;
-			seq_printf(s, "avail flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, avail->flags),
-				   vringh16_to_cpu(vrh, avail->idx) & (num - 1));
-			seq_printf(s, "avail flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, avail->flags),
-				   vringh16_to_cpu(vrh, avail->idx));
-			for (j = 0; j < num; j++)
-				seq_printf(s, "avail ring[%d] %d\n",
-					   j, avail->ring[j]);
-			used = vrh->vring.used;
-			seq_printf(s, "used flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, used->flags),
-				   vringh16_to_cpu(vrh, used->idx) & (num - 1));
-			seq_printf(s, "used flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, used->flags),
-				   vringh16_to_cpu(vrh, used->idx));
-			for (j = 0; j < num; j++)
-				seq_printf(s, "used ring[%d] id %d len %d\n",
-					   j, vringh32_to_cpu(vrh,
-							      used->ring[j].id),
-					   vringh32_to_cpu(vrh,
-							   used->ring[j].len));
-		}
-	}
-	mutex_unlock(&mdev->mic_mutex);
-
-	return 0;
-}
-
-static int mic_vdev_info_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, mic_vdev_info_show, inode->i_private);
-}
-
-static int mic_vdev_info_debug_release(struct inode *inode, struct file *file)
-{
-	return single_release(inode, file);
-}
-
-static const struct file_operations vdev_info_ops = {
-	.owner   = THIS_MODULE,
-	.open    = mic_vdev_info_debug_open,
-	.read    = seq_read,
-	.llseek  = seq_lseek,
-	.release = mic_vdev_info_debug_release
-};
-
 static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 {
 	struct mic_device *mdev  = s->private;
@@ -367,11 +182,6 @@ void mic_create_debug_dir(struct mic_device *mdev)
 	debugfs_create_file("post_code", 0444, mdev->dbg_dir, mdev,
 			    &post_code_ops);
 
-	debugfs_create_file("dp", 0444, mdev->dbg_dir, mdev, &dp_ops);
-
-	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir, mdev,
-			    &vdev_info_ops);
-
 	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir, mdev,
 			    &msi_irq_info_ops);
 }
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
deleted file mode 100644
index 8cc1d90..0000000
--- a/drivers/misc/mic/host/mic_fops.c
+++ /dev/null
@@ -1,222 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#include <linux/poll.h>
-#include <linux/pci.h>
-
-#include <linux/mic_common.h>
-#include "../common/mic_dev.h"
-#include "mic_device.h"
-#include "mic_fops.h"
-#include "mic_virtio.h"
-
-int mic_open(struct inode *inode, struct file *f)
-{
-	struct mic_vdev *mvdev;
-	struct mic_device *mdev = container_of(f->private_data,
-		struct mic_device, miscdev);
-
-	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
-	if (!mvdev)
-		return -ENOMEM;
-
-	init_waitqueue_head(&mvdev->waitq);
-	INIT_LIST_HEAD(&mvdev->list);
-	mvdev->mdev = mdev;
-	mvdev->virtio_id = -1;
-
-	f->private_data = mvdev;
-	return 0;
-}
-
-int mic_release(struct inode *inode, struct file *f)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-
-	if (-1 != mvdev->virtio_id)
-		mic_virtio_del_device(mvdev);
-	f->private_data = NULL;
-	kfree(mvdev);
-	return 0;
-}
-
-long mic_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	void __user *argp = (void __user *)arg;
-	int ret;
-
-	switch (cmd) {
-	case MIC_VIRTIO_ADD_DEVICE:
-	{
-		ret = mic_virtio_add_device(mvdev, argp);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		break;
-	}
-	case MIC_VIRTIO_COPY_DESC:
-	{
-		struct mic_copy_desc copy;
-
-		ret = mic_vdev_inited(mvdev);
-		if (ret)
-			return ret;
-
-		if (copy_from_user(&copy, argp, sizeof(copy)))
-			return -EFAULT;
-
-		dev_dbg(mic_dev(mvdev),
-			"%s %d === iovcnt 0x%x vr_idx 0x%x update_used %d\n",
-			__func__, __LINE__, copy.iovcnt, copy.vr_idx,
-			copy.update_used);
-
-		ret = mic_virtio_copy_desc(mvdev, &copy);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		if (copy_to_user(
-			&((struct mic_copy_desc __user *)argp)->out_len,
-			&copy.out_len, sizeof(copy.out_len))) {
-			dev_err(mic_dev(mvdev), "%s %d errno ret %d\n",
-				__func__, __LINE__, -EFAULT);
-			return -EFAULT;
-		}
-		break;
-	}
-	case MIC_VIRTIO_CONFIG_CHANGE:
-	{
-		ret = mic_vdev_inited(mvdev);
-		if (ret)
-			return ret;
-
-		ret = mic_virtio_config_change(mvdev, argp);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		break;
-	}
-	default:
-		return -ENOIOCTLCMD;
-	};
-	return 0;
-}
-
-/*
- * We return POLLIN | POLLOUT from poll when new buffers are enqueued, and
- * not when previously enqueued buffers may be available. This means that
- * in the card->host (TX) path, when userspace is unblocked by poll it
- * must drain all available descriptors or it can stall.
- */
-unsigned int mic_poll(struct file *f, poll_table *wait)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	int mask = 0;
-
-	poll_wait(f, &mvdev->waitq, wait);
-
-	if (mic_vdev_inited(mvdev)) {
-		mask = POLLERR;
-	} else if (mvdev->poll_wake) {
-		mvdev->poll_wake = 0;
-		mask = POLLIN | POLLOUT;
-	}
-
-	return mask;
-}
-
-static inline int
-mic_query_offset(struct mic_vdev *mvdev, unsigned long offset,
-		 unsigned long *size, unsigned long *pa)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	unsigned long start = MIC_DP_SIZE;
-	int i;
-
-	/*
-	 * MMAP interface is as follows:
-	 * offset				region
-	 * 0x0					virtio device_page
-	 * 0x1000				first vring
-	 * 0x1000 + size of 1st vring		second vring
-	 * ....
-	 */
-	if (!offset) {
-		*pa = virt_to_phys(mdev->dp);
-		*size = MIC_DP_SIZE;
-		return 0;
-	}
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-		if (offset == start) {
-			*pa = virt_to_phys(mvr->vring.va);
-			*size = mvr->vring.len;
-			return 0;
-		}
-		start += mvr->vring.len;
-	}
-	return -1;
-}
-
-/*
- * Maps the device page and virtio rings to user space for readonly access.
- */
-int
-mic_mmap(struct file *f, struct vm_area_struct *vma)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
-	unsigned long pa, size = vma->vm_end - vma->vm_start, size_rem = size;
-	int i, err;
-
-	err = mic_vdev_inited(mvdev);
-	if (err)
-		return err;
-
-	if (vma->vm_flags & VM_WRITE)
-		return -EACCES;
-
-	while (size_rem) {
-		i = mic_query_offset(mvdev, offset, &size, &pa);
-		if (i < 0)
-			return -EINVAL;
-		err = remap_pfn_range(vma, vma->vm_start + offset,
-			pa >> PAGE_SHIFT, size, vma->vm_page_prot);
-		if (err)
-			return err;
-		dev_dbg(mic_dev(mvdev),
-			"%s %d type %d size 0x%lx off 0x%lx pa 0x%lx vma 0x%lx\n",
-			__func__, __LINE__, mvdev->virtio_id, size, offset,
-			pa, vma->vm_start + offset);
-		size_rem -= size;
-		offset += size;
-	}
-	return 0;
-}
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 153894e..400def2 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -27,8 +27,6 @@
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
-#include "mic_fops.h"
-#include "mic_virtio.h"
 
 static const char mic_driver_name[] = "mic";
 
@@ -57,17 +55,6 @@ MODULE_DEVICE_TABLE(pci, mic_pci_tbl);
 
 /* ID allocator for MIC devices */
 static struct ida g_mic_ida;
-/* Base device node number for MIC devices */
-static dev_t g_mic_devno;
-
-static const struct file_operations mic_fops = {
-	.open = mic_open,
-	.release = mic_release,
-	.unlocked_ioctl = mic_ioctl,
-	.poll = mic_poll,
-	.mmap = mic_mmap,
-	.owner = THIS_MODULE,
-};
 
 /* Initialize the device page */
 static int mic_dp_init(struct mic_device *mdev)
@@ -169,7 +156,6 @@ mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
 	mic_ops_init(mdev);
 	mutex_init(&mdev->mic_mutex);
 	mdev->irq_info.next_avail_src = 0;
-	INIT_LIST_HEAD(&mdev->vdev_list);
 }
 
 /**
@@ -259,30 +245,15 @@ static int mic_probe(struct pci_dev *pdev,
 		goto smpt_uninit;
 	}
 	mic_bootparam_init(mdev);
-
 	mic_create_debug_dir(mdev);
 
-	mdev->miscdev.minor = MISC_DYNAMIC_MINOR;
-	snprintf(mdev->name, sizeof(mdev->name), "mic%d", mdev->id);
-	mdev->miscdev.name = mdev->name;
-	mdev->miscdev.fops = &mic_fops;
-	mdev->miscdev.parent = &mdev->pdev->dev;
-	rc = misc_register(&mdev->miscdev);
-	if (rc) {
-		dev_err(&pdev->dev, "misc_register err id %d rc %d\n",
-			mdev->id, rc);
-		goto cleanup_debug_dir;
-	}
-
 	mdev->cosm_dev = cosm_register_device(&mdev->pdev->dev, &cosm_hw_ops);
 	if (IS_ERR(mdev->cosm_dev)) {
 		rc = PTR_ERR(mdev->cosm_dev);
 		dev_err(&pdev->dev, "cosm_add_device failed rc %d\n", rc);
-		goto misc_dereg;
+		goto cleanup_debug_dir;
 	}
 	return 0;
-misc_dereg:
-	misc_deregister(&mdev->miscdev);
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
 	mic_dp_uninit(mdev);
@@ -323,7 +294,6 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	cosm_unregister_device(mdev->cosm_dev);
-	misc_deregister(&mdev->miscdev);
 	mic_delete_debug_dir(mdev);
 	mic_dp_uninit(mdev);
 	mic_smpt_uninit(mdev);
@@ -347,26 +317,17 @@ static int __init mic_init(void)
 {
 	int ret;
 
-	ret = alloc_chrdev_region(&g_mic_devno, 0,
-				  MIC_MAX_NUM_DEVS, mic_driver_name);
-	if (ret) {
-		pr_err("alloc_chrdev_region failed ret %d\n", ret);
-		goto error;
-	}
-
 	mic_init_debugfs();
 	ida_init(&g_mic_ida);
 	ret = pci_register_driver(&mic_driver);
 	if (ret) {
 		pr_err("pci_register_driver failed ret %d\n", ret);
-		goto cleanup_chrdev;
+		goto cleanup_debugfs;
 	}
-	return ret;
-cleanup_chrdev:
+	return 0;
+cleanup_debugfs:
 	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
-	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
-error:
 	return ret;
 }
 
@@ -375,7 +336,6 @@ static void __exit mic_exit(void)
 	pci_unregister_driver(&mic_driver);
 	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
-	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
 }
 
 module_init(mic_init);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
deleted file mode 100644
index 58b107a..0000000
--- a/drivers/misc/mic/host/mic_virtio.c
+++ /dev/null
@@ -1,811 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#include <linux/pci.h>
-#include <linux/sched.h>
-#include <linux/uaccess.h>
-#include <linux/dmaengine.h>
-#include <linux/mic_common.h>
-#include "../common/mic_dev.h"
-#include "mic_device.h"
-#include "mic_smpt.h"
-#include "mic_virtio.h"
-
-/*
- * Size of the internal buffer used during DMA's as an intermediate buffer
- * for copy to/from user.
- */
-#define MIC_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
-
-static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
-			dma_addr_t src, size_t len)
-{
-	int err = 0;
-	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch[0];
-
-	if (!mic_ch) {
-		err = -EBUSY;
-		goto error;
-	}
-
-	tx = mic_ch->device->device_prep_dma_memcpy(mic_ch, dst, src, len,
-						    DMA_PREP_FENCE);
-	if (!tx) {
-		err = -ENOMEM;
-		goto error;
-	} else {
-		dma_cookie_t cookie = tx->tx_submit(tx);
-
-		err = dma_submit_error(cookie);
-		if (err)
-			goto error;
-		err = dma_sync_wait(mic_ch, cookie);
-	}
-error:
-	if (err)
-		dev_err(&mdev->pdev->dev, "%s %d err %d\n",
-			__func__, __LINE__, err);
-	return err;
-}
-
-/*
- * Initiates the copies across the PCIe bus from card memory to a user
- * space buffer. When transfers are done using DMA, source/destination
- * addresses and transfer length must follow the alignment requirements of
- * the MIC DMA engine.
- */
-static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
-				   size_t len, u64 daddr, size_t dlen,
-				   int vr_idx)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	void __iomem *dbuf = mdev->aper.va + daddr;
-	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
-	size_t dma_offset;
-	size_t partlen;
-	int err;
-
-	dma_offset = daddr - round_down(daddr, dma_alignment);
-	daddr -= dma_offset;
-	len += dma_offset;
-
-	while (len) {
-		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
-
-		err = mic_sync_dma(mdev, mvr->buf_da, daddr,
-				   ALIGN(partlen, dma_alignment));
-		if (err)
-			goto err;
-
-		if (copy_to_user(ubuf, mvr->buf + dma_offset,
-				 partlen - dma_offset)) {
-			err = -EFAULT;
-			goto err;
-		}
-		daddr += partlen;
-		ubuf += partlen;
-		dbuf += partlen;
-		mvdev->in_bytes_dma += partlen;
-		mvdev->in_bytes += partlen;
-		len -= partlen;
-		dma_offset = 0;
-	}
-	return 0;
-err:
-	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
-	return err;
-}
-
-/*
- * Initiates copies across the PCIe bus from a user space buffer to card
- * memory. When transfers are done using DMA, source/destination addresses
- * and transfer length must follow the alignment requirements of the MIC
- * DMA engine.
- */
-static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
-				     size_t len, u64 daddr, size_t dlen,
-				     int vr_idx)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	void __iomem *dbuf = mdev->aper.va + daddr;
-	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
-	size_t partlen;
-	int err;
-
-	if (daddr & (dma_alignment - 1)) {
-		mvdev->tx_dst_unaligned += len;
-		goto memcpy;
-	} else if (ALIGN(len, dma_alignment) > dlen) {
-		mvdev->tx_len_unaligned += len;
-		goto memcpy;
-	}
-
-	while (len) {
-		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
-
-		if (copy_from_user(mvr->buf, ubuf, partlen)) {
-			err = -EFAULT;
-			goto err;
-		}
-		err = mic_sync_dma(mdev, daddr, mvr->buf_da,
-				   ALIGN(partlen, dma_alignment));
-		if (err)
-			goto err;
-		daddr += partlen;
-		ubuf += partlen;
-		dbuf += partlen;
-		mvdev->out_bytes_dma += partlen;
-		mvdev->out_bytes += partlen;
-		len -= partlen;
-	}
-memcpy:
-	/*
-	 * We are copying to IO below and should ideally use something
-	 * like copy_from_user_toio(..) if it existed.
-	 */
-	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
-		err = -EFAULT;
-		goto err;
-	}
-	mvdev->out_bytes += len;
-	return 0;
-err:
-	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
-	return err;
-}
-
-#define MIC_VRINGH_READ true
-
-/* The function to call to notify the card about added buffers */
-static void mic_notify(struct vringh *vrh)
-{
-	struct mic_vringh *mvrh = container_of(vrh, struct mic_vringh, vrh);
-	struct mic_vdev *mvdev = mvrh->mvdev;
-	s8 db = mvdev->dc->h2c_vdev_db;
-
-	if (db != -1)
-		mvdev->mdev->ops->send_intr(mvdev->mdev, db);
-}
-
-/* Determine the total number of bytes consumed in a VRINGH KIOV */
-static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
-{
-	int i;
-	u32 total = iov->consumed;
-
-	for (i = 0; i < iov->i; i++)
-		total += iov->iov[i].iov_len;
-	return total;
-}
-
-/*
- * Traverse the VRINGH KIOV and issue the APIs to trigger the copies.
- * This API is heavily based on the vringh_iov_xfer(..) implementation
- * in vringh.c. The reason we cannot reuse vringh_iov_pull_kern(..)
- * and vringh_iov_push_kern(..) directly is because there is no
- * way to override the VRINGH xfer(..) routines as of v3.10.
- */
-static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
-			void __user *ubuf, size_t len, bool read, int vr_idx,
-			size_t *out_len)
-{
-	int ret = 0;
-	size_t partlen, tot_len = 0;
-
-	while (len && iov->i < iov->used) {
-		partlen = min(iov->iov[iov->i].iov_len, len);
-		if (read)
-			ret = mic_virtio_copy_to_user(mvdev, ubuf, partlen,
-						(u64)iov->iov[iov->i].iov_base,
-						iov->iov[iov->i].iov_len,
-						vr_idx);
-		else
-			ret = mic_virtio_copy_from_user(mvdev, ubuf, partlen,
-						(u64)iov->iov[iov->i].iov_base,
-						iov->iov[iov->i].iov_len,
-						vr_idx);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= partlen;
-		ubuf += partlen;
-		tot_len += partlen;
-		iov->consumed += partlen;
-		iov->iov[iov->i].iov_len -= partlen;
-		iov->iov[iov->i].iov_base += partlen;
-		if (!iov->iov[iov->i].iov_len) {
-			/* Fix up old iov element then increment. */
-			iov->iov[iov->i].iov_len = iov->consumed;
-			iov->iov[iov->i].iov_base -= iov->consumed;
-
-			iov->consumed = 0;
-			iov->i++;
-		}
-	}
-	*out_len = tot_len;
-	return ret;
-}
-
-/*
- * Use the standard VRINGH infrastructure in the kernel to fetch new
- * descriptors, initiate the copies and update the used ring.
- */
-static int _mic_virtio_copy(struct mic_vdev *mvdev,
-	struct mic_copy_desc *copy)
-{
-	int ret = 0;
-	u32 iovcnt = copy->iovcnt;
-	struct iovec iov;
-	struct iovec __user *u_iov = copy->iov;
-	void __user *ubuf = NULL;
-	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
-	struct vringh_kiov *riov = &mvr->riov;
-	struct vringh_kiov *wiov = &mvr->wiov;
-	struct vringh *vrh = &mvr->vrh;
-	u16 *head = &mvr->head;
-	struct mic_vring *vr = &mvr->vring;
-	size_t len = 0, out_len;
-
-	copy->out_len = 0;
-	/* Fetch a new IOVEC if all previous elements have been processed */
-	if (riov->i == riov->used && wiov->i == wiov->used) {
-		ret = vringh_getdesc_kern(vrh, riov, wiov,
-				head, GFP_KERNEL);
-		/* Check if there are available descriptors */
-		if (ret <= 0)
-			return ret;
-	}
-	while (iovcnt) {
-		if (!len) {
-			/* Copy over a new iovec from user space. */
-			ret = copy_from_user(&iov, u_iov, sizeof(*u_iov));
-			if (ret) {
-				ret = -EINVAL;
-				dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
-				break;
-			}
-			len = iov.iov_len;
-			ubuf = iov.iov_base;
-		}
-		/* Issue all the read descriptors first */
-		ret = mic_vringh_copy(mvdev, riov, ubuf, len, MIC_VRINGH_READ,
-				      copy->vr_idx, &out_len);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= out_len;
-		ubuf += out_len;
-		copy->out_len += out_len;
-		/* Issue the write descriptors next */
-		ret = mic_vringh_copy(mvdev, wiov, ubuf, len, !MIC_VRINGH_READ,
-				      copy->vr_idx, &out_len);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= out_len;
-		ubuf += out_len;
-		copy->out_len += out_len;
-		if (!len) {
-			/* One user space iovec is now completed */
-			iovcnt--;
-			u_iov++;
-		}
-		/* Exit loop if all elements in KIOVs have been processed. */
-		if (riov->i == riov->used && wiov->i == wiov->used)
-			break;
-	}
-	/*
-	 * Update the used ring if a descriptor was available and some data was
-	 * copied in/out and the user asked for a used ring update.
-	 */
-	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
-		u32 total = 0;
-
-		/* Determine the total data consumed */
-		total += mic_vringh_iov_consumed(riov);
-		total += mic_vringh_iov_consumed(wiov);
-		vringh_complete_kern(vrh, *head, total);
-		*head = USHRT_MAX;
-		if (vringh_need_notify_kern(vrh) > 0)
-			vringh_notify(vrh);
-		vringh_kiov_cleanup(riov);
-		vringh_kiov_cleanup(wiov);
-		/* Update avail idx for user space */
-		vr->info->avail_idx = vrh->last_avail_idx;
-	}
-	return ret;
-}
-
-static inline int mic_verify_copy_args(struct mic_vdev *mvdev,
-		struct mic_copy_desc *copy)
-{
-	if (copy->vr_idx >= mvdev->dd->num_vq) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-	return 0;
-}
-
-/* Copy a specified number of virtio descriptors in a chain */
-int mic_virtio_copy_desc(struct mic_vdev *mvdev,
-		struct mic_copy_desc *copy)
-{
-	int err;
-	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
-
-	err = mic_verify_copy_args(mvdev, copy);
-	if (err)
-		return err;
-
-	mutex_lock(&mvr->vr_mutex);
-	if (!mic_vdevup(mvdev)) {
-		err = -ENODEV;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-		goto err;
-	}
-	err = _mic_virtio_copy(mvdev, copy);
-	if (err) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-	}
-err:
-	mutex_unlock(&mvr->vr_mutex);
-	return err;
-}
-
-static void mic_virtio_init_post(struct mic_vdev *mvdev)
-{
-	struct mic_vqconfig *vqconfig = mic_vq_config(mvdev->dd);
-	int i;
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		if (!le64_to_cpu(vqconfig[i].used_address)) {
-			dev_warn(mic_dev(mvdev), "used_address zero??\n");
-			continue;
-		}
-		mvdev->mvr[i].vrh.vring.used =
-			(void __force *)mvdev->mdev->aper.va +
-			le64_to_cpu(vqconfig[i].used_address);
-	}
-
-	mvdev->dc->used_address_updated = 0;
-
-	dev_dbg(mic_dev(mvdev), "%s: device type %d LINKUP\n",
-		__func__, mvdev->virtio_id);
-}
-
-static inline void mic_virtio_device_reset(struct mic_vdev *mvdev)
-{
-	int i;
-
-	dev_dbg(mic_dev(mvdev), "%s: status %d device type %d RESET\n",
-		__func__, mvdev->dd->status, mvdev->virtio_id);
-
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		/*
-		 * Avoid lockdep false positive. The + 1 is for the mic
-		 * mutex which is held in the reset devices code path.
-		 */
-		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
-
-	/* 0 status means "reset" */
-	mvdev->dd->status = 0;
-	mvdev->dc->vdev_reset = 0;
-	mvdev->dc->host_ack = 1;
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct vringh *vrh = &mvdev->mvr[i].vrh;
-		mvdev->mvr[i].vring.info->avail_idx = 0;
-		vrh->completed = 0;
-		vrh->last_avail_idx = 0;
-		vrh->last_used_idx = 0;
-	}
-
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_unlock(&mvdev->mvr[i].vr_mutex);
-}
-
-void mic_virtio_reset_devices(struct mic_device *mdev)
-{
-	struct list_head *pos, *tmp;
-	struct mic_vdev *mvdev;
-
-	dev_dbg(&mdev->pdev->dev, "%s\n",  __func__);
-
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		mvdev = list_entry(pos, struct mic_vdev, list);
-		mic_virtio_device_reset(mvdev);
-		mvdev->poll_wake = 1;
-		wake_up(&mvdev->waitq);
-	}
-}
-
-void mic_bh_handler(struct work_struct *work)
-{
-	struct mic_vdev *mvdev = container_of(work, struct mic_vdev,
-			virtio_bh_work);
-
-	if (mvdev->dc->used_address_updated)
-		mic_virtio_init_post(mvdev);
-
-	if (mvdev->dc->vdev_reset)
-		mic_virtio_device_reset(mvdev);
-
-	mvdev->poll_wake = 1;
-	wake_up(&mvdev->waitq);
-}
-
-static irqreturn_t mic_virtio_intr_handler(int irq, void *data)
-{
-	struct mic_vdev *mvdev = data;
-	struct mic_device *mdev = mvdev->mdev;
-
-	mdev->ops->intr_workarounds(mdev);
-	schedule_work(&mvdev->virtio_bh_work);
-	return IRQ_HANDLED;
-}
-
-int mic_virtio_config_change(struct mic_vdev *mvdev,
-			void __user *argp)
-{
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
-	int ret = 0, retry, i;
-	struct mic_bootparam *bootparam = mvdev->mdev->dp;
-	s8 db = bootparam->h2c_config_db;
-
-	mutex_lock(&mvdev->mdev->mic_mutex);
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
-
-	if (db == -1 || mvdev->dd->type == -1) {
-		ret = -EIO;
-		goto exit;
-	}
-
-	if (copy_from_user(mic_vq_configspace(mvdev->dd),
-			   argp, mvdev->dd->config_len)) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EFAULT);
-		ret = -EFAULT;
-		goto exit;
-	}
-	mvdev->dc->config_change = MIC_VIRTIO_PARAM_CONFIG_CHANGED;
-	mvdev->mdev->ops->send_intr(mvdev->mdev, db);
-
-	for (retry = 100; retry--;) {
-		ret = wait_event_timeout(wake,
-			mvdev->dc->guest_ack, msecs_to_jiffies(100));
-		if (ret)
-			break;
-	}
-
-	dev_dbg(mic_dev(mvdev),
-		"%s %d retry: %d\n", __func__, __LINE__, retry);
-	mvdev->dc->config_change = 0;
-	mvdev->dc->guest_ack = 0;
-exit:
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_unlock(&mvdev->mvr[i].vr_mutex);
-	mutex_unlock(&mvdev->mdev->mic_mutex);
-	return ret;
-}
-
-static int mic_copy_dp_entry(struct mic_vdev *mvdev,
-					void __user *argp,
-					__u8 *type,
-					struct mic_device_desc **devpage)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	struct mic_device_desc dd, *dd_config, *devp;
-	struct mic_vqconfig *vqconfig;
-	int ret = 0, i;
-	bool slot_found = false;
-
-	if (copy_from_user(&dd, argp, sizeof(dd))) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EFAULT);
-		return -EFAULT;
-	}
-
-	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
-	    dd.num_vq > MIC_MAX_VRINGS) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-
-	dd_config = kmalloc(mic_desc_size(&dd), GFP_KERNEL);
-	if (dd_config == NULL) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -ENOMEM);
-		return -ENOMEM;
-	}
-	if (copy_from_user(dd_config, argp, mic_desc_size(&dd))) {
-		ret = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, ret);
-		goto exit;
-	}
-
-	vqconfig = mic_vq_config(dd_config);
-	for (i = 0; i < dd.num_vq; i++) {
-		if (le16_to_cpu(vqconfig[i].num) > MIC_MAX_VRING_ENTRIES) {
-			ret =  -EINVAL;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto exit;
-		}
-	}
-
-	/* Find the first free device page entry */
-	for (i = sizeof(struct mic_bootparam);
-		i < MIC_DP_SIZE - mic_total_desc_size(dd_config);
-		i += mic_total_desc_size(devp)) {
-		devp = mdev->dp + i;
-		if (devp->type == 0 || devp->type == -1) {
-			slot_found = true;
-			break;
-		}
-	}
-	if (!slot_found) {
-		ret =  -EINVAL;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, ret);
-		goto exit;
-	}
-	/*
-	 * Save off the type before doing the memcpy. Type will be set in the
-	 * end after completing all initialization for the new device.
-	 */
-	*type = dd_config->type;
-	dd_config->type = 0;
-	memcpy(devp, dd_config, mic_desc_size(dd_config));
-
-	*devpage = devp;
-exit:
-	kfree(dd_config);
-	return ret;
-}
-
-static void mic_init_device_ctrl(struct mic_vdev *mvdev,
-				struct mic_device_desc *devpage)
-{
-	struct mic_device_ctrl *dc;
-
-	dc = (void *)devpage + mic_aligned_desc_size(devpage);
-
-	dc->config_change = 0;
-	dc->guest_ack = 0;
-	dc->vdev_reset = 0;
-	dc->host_ack = 0;
-	dc->used_address_updated = 0;
-	dc->c2h_vdev_db = -1;
-	dc->h2c_vdev_db = -1;
-	mvdev->dc = dc;
-}
-
-int mic_virtio_add_device(struct mic_vdev *mvdev,
-			void __user *argp)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	struct mic_device_desc *dd = NULL;
-	struct mic_vqconfig *vqconfig;
-	int vr_size, i, j, ret;
-	u8 type = 0;
-	s8 db;
-	char irqname[10];
-	struct mic_bootparam *bootparam = mdev->dp;
-	u16 num;
-	dma_addr_t vr_addr;
-
-	mutex_lock(&mdev->mic_mutex);
-
-	ret = mic_copy_dp_entry(mvdev, argp, &type, &dd);
-	if (ret) {
-		mutex_unlock(&mdev->mic_mutex);
-		return ret;
-	}
-
-	mic_init_device_ctrl(mvdev, dd);
-
-	mvdev->dd = dd;
-	mvdev->virtio_id = type;
-	vqconfig = mic_vq_config(dd);
-	INIT_WORK(&mvdev->virtio_bh_work, mic_bh_handler);
-
-	for (i = 0; i < dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-		struct mic_vring *vr = &mvdev->mvr[i].vring;
-		num = le16_to_cpu(vqconfig[i].num);
-		mutex_init(&mvr->vr_mutex);
-		vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) +
-			sizeof(struct _mic_vring_info));
-		vr->va = (void *)
-			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-					 get_order(vr_size));
-		if (!vr->va) {
-			ret = -ENOMEM;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vr->len = vr_size;
-		vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN);
-		vr->info->magic = cpu_to_le32(MIC_MAGIC + mvdev->virtio_id + i);
-		vr_addr = mic_map_single(mdev, vr->va, vr_size);
-		if (mic_map_error(vr_addr)) {
-			free_pages((unsigned long)vr->va, get_order(vr_size));
-			ret = -ENOMEM;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vqconfig[i].address = cpu_to_le64(vr_addr);
-
-		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
-		ret = vringh_init_kern(&mvr->vrh,
-			*(u32 *)mic_vq_features(mvdev->dd), num, false,
-			vr->vr.desc, vr->vr.avail, vr->vr.used);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vringh_kiov_init(&mvr->riov, NULL, 0);
-		vringh_kiov_init(&mvr->wiov, NULL, 0);
-		mvr->head = USHRT_MAX;
-		mvr->mvdev = mvdev;
-		mvr->vrh.notify = mic_notify;
-		dev_dbg(&mdev->pdev->dev,
-			"%s %d index %d va %p info %p vr_size 0x%x\n",
-			__func__, __LINE__, i, vr->va, vr->info, vr_size);
-		mvr->buf = (void *)__get_free_pages(GFP_KERNEL,
-					get_order(MIC_INT_DMA_BUF_SIZE));
-		mvr->buf_da = mic_map_single(mvdev->mdev, mvr->buf,
-					  MIC_INT_DMA_BUF_SIZE);
-	}
-
-	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
-		 mvdev->virtio_id);
-	mvdev->virtio_db = mic_next_db(mdev);
-	mvdev->virtio_cookie = mic_request_threaded_irq(mdev,
-					       mic_virtio_intr_handler,
-					       NULL, irqname, mvdev,
-					       mvdev->virtio_db, MIC_INTR_DB);
-	if (IS_ERR(mvdev->virtio_cookie)) {
-		ret = PTR_ERR(mvdev->virtio_cookie);
-		dev_dbg(&mdev->pdev->dev, "request irq failed\n");
-		goto err;
-	}
-
-	mvdev->dc->c2h_vdev_db = mvdev->virtio_db;
-
-	list_add_tail(&mvdev->list, &mdev->vdev_list);
-	/*
-	 * Order the type update with previous stores. This write barrier
-	 * is paired with the corresponding read barrier before the uncached
-	 * system memory read of the type, on the card while scanning the
-	 * device page.
-	 */
-	smp_wmb();
-	dd->type = type;
-
-	dev_dbg(&mdev->pdev->dev, "Added virtio device id %d\n", dd->type);
-
-	db = bootparam->h2c_config_db;
-	if (db != -1)
-		mdev->ops->send_intr(mdev, db);
-	mutex_unlock(&mdev->mic_mutex);
-	return 0;
-err:
-	vqconfig = mic_vq_config(dd);
-	for (j = 0; j < i; j++) {
-		struct mic_vringh *mvr = &mvdev->mvr[j];
-		mic_unmap_single(mdev, le64_to_cpu(vqconfig[j].address),
-				 mvr->vring.len);
-		free_pages((unsigned long)mvr->vring.va,
-			   get_order(mvr->vring.len));
-	}
-	mutex_unlock(&mdev->mic_mutex);
-	return ret;
-}
-
-void mic_virtio_del_device(struct mic_vdev *mvdev)
-{
-	struct list_head *pos, *tmp;
-	struct mic_vdev *tmp_mvdev;
-	struct mic_device *mdev = mvdev->mdev;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
-	int i, ret, retry;
-	struct mic_vqconfig *vqconfig;
-	struct mic_bootparam *bootparam = mdev->dp;
-	s8 db;
-
-	mutex_lock(&mdev->mic_mutex);
-	db = bootparam->h2c_config_db;
-	if (db == -1)
-		goto skip_hot_remove;
-	dev_dbg(&mdev->pdev->dev,
-		"Requesting hot remove id %d\n", mvdev->virtio_id);
-	mvdev->dc->config_change = MIC_VIRTIO_PARAM_DEV_REMOVE;
-	mdev->ops->send_intr(mdev, db);
-	for (retry = 100; retry--;) {
-		ret = wait_event_timeout(wake,
-			mvdev->dc->guest_ack, msecs_to_jiffies(100));
-		if (ret)
-			break;
-	}
-	dev_dbg(&mdev->pdev->dev,
-		"Device id %d config_change %d guest_ack %d retry %d\n",
-		mvdev->virtio_id, mvdev->dc->config_change,
-		mvdev->dc->guest_ack, retry);
-	mvdev->dc->config_change = 0;
-	mvdev->dc->guest_ack = 0;
-skip_hot_remove:
-	mic_free_irq(mdev, mvdev->virtio_cookie, mvdev);
-	flush_work(&mvdev->virtio_bh_work);
-	vqconfig = mic_vq_config(mvdev->dd);
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-
-		mic_unmap_single(mvdev->mdev, mvr->buf_da,
-				 MIC_INT_DMA_BUF_SIZE);
-		free_pages((unsigned long)mvr->buf,
-			   get_order(MIC_INT_DMA_BUF_SIZE));
-		vringh_kiov_cleanup(&mvr->riov);
-		vringh_kiov_cleanup(&mvr->wiov);
-		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
-				 mvr->vring.len);
-		free_pages((unsigned long)mvr->vring.va,
-			   get_order(mvr->vring.len));
-	}
-
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		tmp_mvdev = list_entry(pos, struct mic_vdev, list);
-		if (tmp_mvdev == mvdev) {
-			list_del(pos);
-			dev_dbg(&mdev->pdev->dev,
-				"Removing virtio device id %d\n",
-				mvdev->virtio_id);
-			break;
-		}
-	}
-	/*
-	 * Order the type update with previous stores. This write barrier
-	 * is paired with the corresponding read barrier before the uncached
-	 * system memory read of the type, on the card while scanning the
-	 * device page.
-	 */
-	smp_wmb();
-	mvdev->dd->type = -1;
-	mutex_unlock(&mdev->mic_mutex);
-}
-- 
1.8.2.1
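
Editorial note on the alignment handling in the removed
mic_virtio_copy_to_user() above: before issuing the DMA, it rounds the
card address down to the DMA engine's alignment, grows the length by
the bytes it skipped, and rounds the transfer length up. The sketch
below reproduces only that arithmetic for a single chunk, as plain
userspace C. The struct and function names are hypothetical and are not
part of the MIC driver; the 64 KiB bounce-buffer loop and the DMA
submission itself are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of the MIC driver. */
struct dma_window {
	uint64_t base;   /* requested address rounded down to the alignment */
	size_t offset;   /* bytes between base and the requested address */
	size_t xfer_len; /* offset + len, rounded up to the alignment */
};

/*
 * Compute the aligned window covering [daddr, daddr + len).
 * align is assumed to be a power of two, as with 1 << copy_align.
 */
static struct dma_window dma_window_for(uint64_t daddr, size_t len,
					size_t align)
{
	struct dma_window w;

	assert(align && (align & (align - 1)) == 0);
	w.base = daddr & ~(uint64_t)(align - 1);
	w.offset = (size_t)(daddr - w.base);
	w.xfer_len = (w.offset + len + align - 1) & ~(align - 1);
	return w;
}
```

With this window, the DMA would move xfer_len bytes starting at base
into the bounce buffer, and the caller would then copy len bytes to
user space starting at offset within that buffer.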


* [PATCH char-misc-testing v2 2/8] misc: mic: Remove MIC X100 card virtio functionality
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
  2016-02-08 23:48 29% ` [PATCH char-misc-testing v2 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
@ 2016-02-08 23:48 49% ` Sudeep Dutt
  2016-02-08 23:48 61% ` [PATCH char-misc-testing v2 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch deletes the virtio functionality from the MIC X100 card
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware-independent logic in a new Virtio over PCIe
(VOP) driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/card/Makefile     |   1 -
 drivers/misc/mic/card/mic_virtio.h |  76 -----
 drivers/misc/mic/card/mic_device.c |   9 +-
 drivers/misc/mic/card/mic_virtio.c | 634 -------------------------------------
 4 files changed, 1 insertion(+), 719 deletions(-)
 delete mode 100644 drivers/misc/mic/card/mic_virtio.h
 delete mode 100644 drivers/misc/mic/card/mic_virtio.c

diff --git a/drivers/misc/mic/card/Makefile b/drivers/misc/mic/card/Makefile
index 69d58be..6e9675e 100644
--- a/drivers/misc/mic/card/Makefile
+++ b/drivers/misc/mic/card/Makefile
@@ -8,4 +8,3 @@ obj-$(CONFIG_INTEL_MIC_CARD) += mic_card.o
 mic_card-y += mic_x100.o
 mic_card-y += mic_device.o
 mic_card-y += mic_debugfs.o
-mic_card-y += mic_virtio.o
diff --git a/drivers/misc/mic/card/mic_virtio.h b/drivers/misc/mic/card/mic_virtio.h
deleted file mode 100644
index d0407ba..0000000
--- a/drivers/misc/mic/card/mic_virtio.h
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Disclaimer: The codes contained in these modules may be specific to
- * the Intel Software Development Platform codenamed: Knights Ferry, and
- * the Intel product codenamed: Knights Corner, and are not backward
- * compatible with other Intel products. Additionally, Intel will NOT
- * support the codes or instruction set in future products.
- *
- * Intel MIC Card driver.
- *
- */
-#ifndef __MIC_CARD_VIRTIO_H
-#define __MIC_CARD_VIRTIO_H
-
-#include <linux/mic_common.h>
-#include "mic_device.h"
-
-/*
- * 64 bit I/O access
- */
-#ifndef ioread64
-#define ioread64 readq
-#endif
-#ifndef iowrite64
-#define iowrite64 writeq
-#endif
-
-static inline unsigned mic_desc_size(struct mic_device_desc __iomem *desc)
-{
-	return sizeof(*desc)
-		+ ioread8(&desc->num_vq) * sizeof(struct mic_vqconfig)
-		+ ioread8(&desc->feature_len) * 2
-		+ ioread8(&desc->config_len);
-}
-
-static inline struct mic_vqconfig __iomem *
-mic_vq_config(struct mic_device_desc __iomem *desc)
-{
-	return (struct mic_vqconfig __iomem *)(desc + 1);
-}
-
-static inline __u8 __iomem *
-mic_vq_features(struct mic_device_desc __iomem *desc)
-{
-	return (__u8 __iomem *)(mic_vq_config(desc) + ioread8(&desc->num_vq));
-}
-
-static inline __u8 __iomem *
-mic_vq_configspace(struct mic_device_desc __iomem *desc)
-{
-	return mic_vq_features(desc) + ioread8(&desc->feature_len) * 2;
-}
-static inline unsigned mic_total_desc_size(struct mic_device_desc __iomem *desc)
-{
-	return mic_aligned_desc_size(desc) + sizeof(struct mic_device_ctrl);
-}
-
-int mic_devices_init(struct mic_driver *mdrv);
-void mic_devices_uninit(struct mic_driver *mdrv);
-
-#endif
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index d0edaf7..ff03c63 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -34,7 +34,6 @@
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
 #include "mic_device.h"
-#include "mic_virtio.h"
 
 static struct mic_driver *g_drv;
 
@@ -309,9 +308,6 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 		rc = -ENODEV;
 		goto irq_uninit;
 	}
-	rc = mic_devices_init(mdrv);
-	if (rc)
-		goto dma_free;
 	bootparam = mdrv->dp;
 	node_id = ioread8(&bootparam->node_id);
 	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
@@ -321,13 +317,11 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 					   mdrv->num_dma_ch, true);
 	if (IS_ERR(mdrv->scdev)) {
 		rc = PTR_ERR(mdrv->scdev);
-		goto device_uninit;
+		goto dma_free;
 	}
 	mic_create_card_debug_dir(mdrv);
 done:
 	return rc;
-device_uninit:
-	mic_devices_uninit(mdrv);
 dma_free:
 	mic_free_dma_chans(mdrv);
 irq_uninit:
@@ -348,7 +342,6 @@ void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
 	scif_unregister_device(mdrv->scdev);
-	mic_devices_uninit(mdrv);
 	mic_free_dma_chans(mdrv);
 	mic_uninit_irq();
 	mic_dp_uninit();
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
deleted file mode 100644
index 17764b2..0000000
--- a/drivers/misc/mic/card/mic_virtio.c
+++ /dev/null
@@ -1,634 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Disclaimer: The codes contained in these modules may be specific to
- * the Intel Software Development Platform codenamed: Knights Ferry, and
- * the Intel product codenamed: Knights Corner, and are not backward
- * compatible with other Intel products. Additionally, Intel will NOT
- * support the codes or instruction set in future products.
- *
- * Adapted from:
- *
- * virtio for kvm on s390
- *
- * Copyright IBM Corp. 2008
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License (version 2 only)
- * as published by the Free Software Foundation.
- *
- *    Author(s): Christian Borntraeger <borntraeger@de.ibm.com>
- *
- * Intel MIC Card driver.
- *
- */
-#include <linux/delay.h>
-#include <linux/slab.h>
-#include <linux/virtio_config.h>
-
-#include "../common/mic_dev.h"
-#include "mic_virtio.h"
-
-#define VIRTIO_SUBCODE_64 0x0D00
-
-#define MIC_MAX_VRINGS                4
-struct mic_vdev {
-	struct virtio_device vdev;
-	struct mic_device_desc __iomem *desc;
-	struct mic_device_ctrl __iomem *dc;
-	struct mic_device *mdev;
-	void __iomem *vr[MIC_MAX_VRINGS];
-	int used_size[MIC_MAX_VRINGS];
-	struct completion reset_done;
-	struct mic_irq *virtio_cookie;
-	int c2h_vdev_db;
-};
-
-static struct mic_irq *virtio_config_cookie;
-#define to_micvdev(vd) container_of(vd, struct mic_vdev, vdev)
-
-/* Helper API to obtain the parent of the virtio device */
-static inline struct device *mic_dev(struct mic_vdev *mvdev)
-{
-	return mvdev->vdev.dev.parent;
-}
-
-/* This gets the device's feature bits. */
-static u64 mic_get_features(struct virtio_device *vdev)
-{
-	unsigned int i, bits;
-	u32 features = 0;
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-	u8 __iomem *in_features = mic_vq_features(desc);
-	int feature_len = ioread8(&desc->feature_len);
-
-	bits = min_t(unsigned, feature_len, sizeof(features)) * 8;
-	for (i = 0; i < bits; i++)
-		if (ioread8(&in_features[i / 8]) & (BIT(i % 8)))
-			features |= BIT(i);
-
-	return features;
-}
-
-static int mic_finalize_features(struct virtio_device *vdev)
-{
-	unsigned int i, bits;
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-	u8 feature_len = ioread8(&desc->feature_len);
-	/* Second half of bitmap is features we accept. */
-	u8 __iomem *out_features =
-		mic_vq_features(desc) + feature_len;
-
-	/* Give virtio_ring a chance to accept features. */
-	vring_transport_features(vdev);
-
-	/* Make sure we don't have any features > 32 bits! */
-	BUG_ON((u32)vdev->features != vdev->features);
-
-	memset_io(out_features, 0, feature_len);
-	bits = min_t(unsigned, feature_len,
-		sizeof(vdev->features)) * 8;
-	for (i = 0; i < bits; i++) {
-		if (__virtio_test_bit(vdev, i))
-			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
-				 &out_features[i / 8]);
-	}
-
-	return 0;
-}
-
-/*
- * Reading and writing elements in config space
- */
-static void mic_get(struct virtio_device *vdev, unsigned int offset,
-		   void *buf, unsigned len)
-{
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-
-	if (offset + len > ioread8(&desc->config_len))
-		return;
-	memcpy_fromio(buf, mic_vq_configspace(desc) + offset, len);
-}
-
-static void mic_set(struct virtio_device *vdev, unsigned int offset,
-		   const void *buf, unsigned len)
-{
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-
-	if (offset + len > ioread8(&desc->config_len))
-		return;
-	memcpy_toio(mic_vq_configspace(desc) + offset, buf, len);
-}
-
-/*
- * The operations to get and set the status word just access the status
- * field of the device descriptor. set_status also interrupts the host
- * to tell about status changes.
- */
-static u8 mic_get_status(struct virtio_device *vdev)
-{
-	return ioread8(&to_micvdev(vdev)->desc->status);
-}
-
-static void mic_set_status(struct virtio_device *vdev, u8 status)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	if (!status)
-		return;
-	iowrite8(status, &mvdev->desc->status);
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-}
-
-/* Inform host on a virtio device reset and wait for ack from host */
-static void mic_reset_inform_host(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_device_ctrl __iomem *dc = mvdev->dc;
-	int retry;
-
-	iowrite8(0, &dc->host_ack);
-	iowrite8(1, &dc->vdev_reset);
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-
-	/* Wait till host completes all card accesses and acks the reset */
-	for (retry = 100; retry--;) {
-		if (ioread8(&dc->host_ack))
-			break;
-		msleep(100);
-	};
-
-	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, retry);
-
-	/* Reset status to 0 in case we timed out */
-	iowrite8(0, &mvdev->desc->status);
-}
-
-static void mic_reset(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-
-	dev_dbg(mic_dev(mvdev), "%s: virtio id %d\n",
-		__func__, vdev->id.device);
-
-	mic_reset_inform_host(vdev);
-	complete_all(&mvdev->reset_done);
-}
-
-/*
- * The virtio_ring code calls this API when it wants to notify the Host.
- */
-static bool mic_notify(struct virtqueue *vq)
-{
-	struct mic_vdev *mvdev = vq->priv;
-
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-	return true;
-}
-
-static void mic_del_vq(struct virtqueue *vq, int n)
-{
-	struct mic_vdev *mvdev = to_micvdev(vq->vdev);
-	struct vring *vr = (struct vring *)(vq + 1);
-
-	free_pages((unsigned long) vr->used, get_order(mvdev->used_size[n]));
-	vring_del_virtqueue(vq);
-	mic_card_unmap(mvdev->mdev, mvdev->vr[n]);
-	mvdev->vr[n] = NULL;
-}
-
-static void mic_del_vqs(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct virtqueue *vq, *n;
-	int idx = 0;
-
-	dev_dbg(mic_dev(mvdev), "%s\n", __func__);
-
-	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
-		mic_del_vq(vq, idx++);
-}
-
-/*
- * This routine will assign vring's allocated in host/io memory. Code in
- * virtio_ring.c however continues to access this io memory as if it were local
- * memory without io accessors.
- */
-static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
-				     unsigned index,
-				     void (*callback)(struct virtqueue *vq),
-				     const char *name)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_vqconfig __iomem *vqconfig;
-	struct mic_vqconfig config;
-	struct virtqueue *vq;
-	void __iomem *va;
-	struct _mic_vring_info __iomem *info;
-	void *used;
-	int vr_size, _vr_size, err, magic;
-	struct vring *vr;
-	u8 type = ioread8(&mvdev->desc->type);
-
-	if (index >= ioread8(&mvdev->desc->num_vq))
-		return ERR_PTR(-ENOENT);
-
-	if (!name)
-		return ERR_PTR(-ENOENT);
-
-	/* First assign the vring's allocated in host memory */
-	vqconfig = mic_vq_config(mvdev->desc) + index;
-	memcpy_fromio(&config, vqconfig, sizeof(config));
-	_vr_size = vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN);
-	vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info));
-	va = mic_card_map(mvdev->mdev, le64_to_cpu(config.address), vr_size);
-	if (!va)
-		return ERR_PTR(-ENOMEM);
-	mvdev->vr[index] = va;
-	memset_io(va, 0x0, _vr_size);
-	vq = vring_new_virtqueue(index, le16_to_cpu(config.num),
-				 MIC_VIRTIO_RING_ALIGN, vdev, false,
-				 (void __force *)va, mic_notify, callback,
-				 name);
-	if (!vq) {
-		err = -ENOMEM;
-		goto unmap;
-	}
-	info = va + _vr_size;
-	magic = ioread32(&info->magic);
-
-	if (WARN(magic != MIC_MAGIC + type + index, "magic mismatch")) {
-		err = -EIO;
-		goto unmap;
-	}
-
-	/* Allocate and reassign used ring now */
-	mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
-					     sizeof(struct vring_used_elem) *
-					     le16_to_cpu(config.num));
-	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-					get_order(mvdev->used_size[index]));
-	if (!used) {
-		err = -ENOMEM;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-		goto del_vq;
-	}
-	iowrite64(virt_to_phys(used), &vqconfig->used_address);
-
-	/*
-	 * To reassign the used ring here we are directly accessing
-	 * struct vring_virtqueue which is a private data structure
-	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
-	 * vring_new_virtqueue() would ensure that
-	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
-	 */
-	vr = (struct vring *)(vq + 1);
-	vr->used = used;
-
-	vq->priv = mvdev;
-	return vq;
-del_vq:
-	vring_del_virtqueue(vq);
-unmap:
-	mic_card_unmap(mvdev->mdev, mvdev->vr[index]);
-	return ERR_PTR(err);
-}
-
-static int mic_find_vqs(struct virtio_device *vdev, unsigned nvqs,
-			struct virtqueue *vqs[],
-			vq_callback_t *callbacks[],
-			const char * const names[])
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_device_ctrl __iomem *dc = mvdev->dc;
-	int i, err, retry;
-
-	/* We must have this many virtqueues. */
-	if (nvqs > ioread8(&mvdev->desc->num_vq))
-		return -ENOENT;
-
-	for (i = 0; i < nvqs; ++i) {
-		dev_dbg(mic_dev(mvdev), "%s: %d: %s\n",
-			__func__, i, names[i]);
-		vqs[i] = mic_find_vq(vdev, i, callbacks[i], names[i]);
-		if (IS_ERR(vqs[i])) {
-			err = PTR_ERR(vqs[i]);
-			goto error;
-		}
-	}
-
-	iowrite8(1, &dc->used_address_updated);
-	/*
-	 * Send an interrupt to the host to inform it that used
-	 * rings have been re-assigned.
-	 */
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-	for (retry = 100; --retry;) {
-		if (!ioread8(&dc->used_address_updated))
-			break;
-		msleep(100);
-	};
-
-	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, retry);
-	if (!retry) {
-		err = -ENODEV;
-		goto error;
-	}
-
-	return 0;
-error:
-	mic_del_vqs(vdev);
-	return err;
-}
-
-/*
- * The config ops structure as defined by virtio config
- */
-static struct virtio_config_ops mic_vq_config_ops = {
-	.get_features = mic_get_features,
-	.finalize_features = mic_finalize_features,
-	.get = mic_get,
-	.set = mic_set,
-	.get_status = mic_get_status,
-	.set_status = mic_set_status,
-	.reset = mic_reset,
-	.find_vqs = mic_find_vqs,
-	.del_vqs = mic_del_vqs,
-};
-
-static irqreturn_t
-mic_virtio_intr_handler(int irq, void *data)
-{
-	struct mic_vdev *mvdev = data;
-	struct virtqueue *vq;
-
-	mic_ack_interrupt(mvdev->mdev);
-	list_for_each_entry(vq, &mvdev->vdev.vqs, list)
-		vring_interrupt(0, vq);
-
-	return IRQ_HANDLED;
-}
-
-static void mic_virtio_release_dev(struct device *_d)
-{
-	/*
-	 * No need for a release method similar to virtio PCI.
-	 * Provide an empty one to avoid getting a warning from core.
-	 */
-}
-
-/*
- * adds a new device and register it with virtio
- * appropriate drivers are loaded by the device model
- */
-static int mic_add_device(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_vdev *mvdev;
-	int ret;
-	int virtio_db;
-	u8 type = ioread8(&d->type);
-
-	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
-	if (!mvdev) {
-		dev_err(mdrv->dev, "Cannot allocate mic dev %u type %u\n",
-			offset, type);
-		return -ENOMEM;
-	}
-
-	mvdev->mdev = &mdrv->mdev;
-	mvdev->vdev.dev.parent = mdrv->dev;
-	mvdev->vdev.dev.release = mic_virtio_release_dev;
-	mvdev->vdev.id.device = type;
-	mvdev->vdev.config = &mic_vq_config_ops;
-	mvdev->desc = d;
-	mvdev->dc = (void __iomem *)d + mic_aligned_desc_size(d);
-	init_completion(&mvdev->reset_done);
-
-	virtio_db = mic_next_card_db();
-	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
-			NULL, "virtio intr", mvdev, virtio_db);
-	if (IS_ERR(mvdev->virtio_cookie)) {
-		ret = PTR_ERR(mvdev->virtio_cookie);
-		goto kfree;
-	}
-	iowrite8((u8)virtio_db, &mvdev->dc->h2c_vdev_db);
-	mvdev->c2h_vdev_db = ioread8(&mvdev->dc->c2h_vdev_db);
-
-	ret = register_virtio_device(&mvdev->vdev);
-	if (ret) {
-		dev_err(mic_dev(mvdev),
-			"Failed to register mic device %u type %u\n",
-			offset, type);
-		goto free_irq;
-	}
-	iowrite64((u64)mvdev, &mvdev->dc->vdev);
-	dev_dbg(mic_dev(mvdev), "%s: registered mic device %u type %u mvdev %p\n",
-		__func__, offset, type, mvdev);
-
-	return 0;
-
-free_irq:
-	mic_free_card_irq(mvdev->virtio_cookie, mvdev);
-kfree:
-	kfree(mvdev);
-	return ret;
-}
-
-/*
- * match for a mic device with a specific desc pointer
- */
-static int mic_match_desc(struct device *dev, void *data)
-{
-	struct virtio_device *vdev = dev_to_virtio(dev);
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-
-	return mvdev->desc == (void __iomem *)data;
-}
-
-static void mic_handle_config_change(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_device_ctrl __iomem *dc
-		= (void __iomem *)d + mic_aligned_desc_size(d);
-	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
-
-	if (ioread8(&dc->config_change) != MIC_VIRTIO_PARAM_CONFIG_CHANGED)
-		return;
-
-	dev_dbg(mdrv->dev, "%s %d\n", __func__, __LINE__);
-	virtio_config_changed(&mvdev->vdev);
-	iowrite8(1, &dc->guest_ack);
-}
-
-/*
- * removes a virtio device if a hot remove event has been
- * requested by the host.
- */
-static int mic_remove_device(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_device_ctrl __iomem *dc
-		= (void __iomem *)d + mic_aligned_desc_size(d);
-	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
-	u8 status;
-	int ret = -1;
-
-	if (ioread8(&dc->config_change) == MIC_VIRTIO_PARAM_DEV_REMOVE) {
-		dev_dbg(mdrv->dev,
-			"%s %d config_change %d type %d mvdev %p\n",
-			__func__, __LINE__,
-			ioread8(&dc->config_change), ioread8(&d->type), mvdev);
-
-		status = ioread8(&d->status);
-		reinit_completion(&mvdev->reset_done);
-		unregister_virtio_device(&mvdev->vdev);
-		mic_free_card_irq(mvdev->virtio_cookie, mvdev);
-		if (status & VIRTIO_CONFIG_S_DRIVER_OK)
-			wait_for_completion(&mvdev->reset_done);
-		kfree(mvdev);
-		iowrite8(1, &dc->guest_ack);
-		dev_dbg(mdrv->dev, "%s %d guest_ack %d\n",
-			__func__, __LINE__, ioread8(&dc->guest_ack));
-		ret = 0;
-	}
-
-	return ret;
-}
-
-#define REMOVE_DEVICES true
-
-static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
-{
-	s8 type;
-	unsigned int i;
-	struct mic_device_desc __iomem *d;
-	struct mic_device_ctrl __iomem *dc;
-	struct device *dev;
-	int ret;
-
-	for (i = sizeof(struct mic_bootparam); i < MIC_DP_SIZE;
-		i += mic_total_desc_size(d)) {
-		d = mdrv->dp + i;
-		dc = (void __iomem *)d + mic_aligned_desc_size(d);
-		/*
-		 * This read barrier is paired with the corresponding write
-		 * barrier on the host which is inserted before adding or
-		 * removing a virtio device descriptor, by updating the type.
-		 */
-		rmb();
-		type = ioread8(&d->type);
-
-		/* end of list */
-		if (type == 0)
-			break;
-
-		if (type == -1)
-			continue;
-
-		/* device already exists */
-		dev = device_find_child(mdrv->dev, (void __force *)d,
-					mic_match_desc);
-		if (dev) {
-			if (remove)
-				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
-					 &dc->config_change);
-			put_device(dev);
-			mic_handle_config_change(d, i, mdrv);
-			ret = mic_remove_device(d, i, mdrv);
-			if (!ret && !remove)
-				iowrite8(-1, &d->type);
-			if (remove) {
-				iowrite8(0, &dc->config_change);
-				iowrite8(0, &dc->guest_ack);
-			}
-			continue;
-		}
-
-		/* new device */
-		dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n",
-			__func__, __LINE__, d);
-		if (!remove)
-			mic_add_device(d, i, mdrv);
-	}
-}
-
-/*
- * mic_hotplug_device tries to find changes in the device page.
- */
-static void mic_hotplug_devices(struct work_struct *work)
-{
-	struct mic_driver *mdrv = container_of(work,
-		struct mic_driver, hotplug_work);
-
-	mic_scan_devices(mdrv, !REMOVE_DEVICES);
-}
-
-/*
- * Interrupt handler for hot plug/config changes etc.
- */
-static irqreturn_t
-mic_extint_handler(int irq, void *data)
-{
-	struct mic_driver *mdrv = (struct mic_driver *)data;
-
-	dev_dbg(mdrv->dev, "%s %d hotplug work\n",
-		__func__, __LINE__);
-	mic_ack_interrupt(&mdrv->mdev);
-	schedule_work(&mdrv->hotplug_work);
-	return IRQ_HANDLED;
-}
-
-/*
- * Init function for virtio
- */
-int mic_devices_init(struct mic_driver *mdrv)
-{
-	int rc;
-	struct mic_bootparam __iomem *bootparam;
-	int config_db;
-
-	INIT_WORK(&mdrv->hotplug_work, mic_hotplug_devices);
-	mic_scan_devices(mdrv, !REMOVE_DEVICES);
-
-	config_db = mic_next_card_db();
-	virtio_config_cookie = mic_request_card_irq(mic_extint_handler, NULL,
-						    "virtio_config_intr", mdrv,
-						    config_db);
-	if (IS_ERR(virtio_config_cookie)) {
-		rc = PTR_ERR(virtio_config_cookie);
-		goto exit;
-	}
-
-	bootparam = mdrv->dp;
-	iowrite8(config_db, &bootparam->h2c_config_db);
-	return 0;
-exit:
-	return rc;
-}
-
-/*
- * Uninit function for virtio
- */
-void mic_devices_uninit(struct mic_driver *mdrv)
-{
-	struct mic_bootparam __iomem *bootparam = mdrv->dp;
-	iowrite8(-1, &bootparam->h2c_config_db);
-	mic_free_card_irq(virtio_config_cookie, mdrv);
-	flush_work(&mdrv->hotplug_work);
-	mic_scan_devices(mdrv, REMOVE_DEVICES);
-}
-- 
1.8.2.1
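
The feature-bit loop in the removed mic_get_features() above can be sketched in userspace as follows. This is only an illustration, not the driver code: pack_features() is a hypothetical name, a plain byte array stands in for the __iomem descriptor that the driver reads with ioread8(), and the 32-bit limit mirrors the u32 accumulator in the original.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the loop in mic_get_features():
 * collect at most 32 feature bits from a byte array, where bit i lives
 * in byte i / 8 at position i % 8.
 */
static uint32_t pack_features(const uint8_t *in_features, size_t feature_len)
{
	uint32_t features = 0;
	/* Same clamp as min_t(unsigned, feature_len, sizeof(features)). */
	size_t nbytes = feature_len < sizeof(features) ?
			feature_len : sizeof(features);
	size_t i;

	for (i = 0; i < nbytes * 8; i++)
		if (in_features[i / 8] & (1u << (i % 8)))
			features |= UINT32_C(1) << i;

	return features;
}
```

Bytes beyond the fourth are ignored, which matches the BUG_ON() in mic_finalize_features() that rejects features above 32 bits.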


* [PATCH char-misc-testing v2 4/8] misc: mic: Add data structures for the VOP driver
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (2 preceding siblings ...)
  2016-02-08 23:48 61% ` [PATCH char-misc-testing v2 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
@ 2016-02-08 23:48 75% ` Sudeep Dutt
  2016-02-08 23:48 39% ` [PATCH char-misc-testing v2 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch adds VOP driver data structures used in subsequent
patches. These data structures are refactored from similar data
structures used in the virtio parts of previous MIC host and card
drivers.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/vop/vop_main.h | 170 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_main.h

diff --git a/drivers/misc/mic/vop/vop_main.h b/drivers/misc/mic/vop/vop_main.h
new file mode 100644
index 0000000..ba47ec7
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_main.h
@@ -0,0 +1,170 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#ifndef _VOP_MAIN_H_
+#define _VOP_MAIN_H_
+
+#include <linux/vringh.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio.h>
+#include <linux/miscdevice.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+#include "../bus/vop_bus.h"
+
+/*
+ * Note on endianness.
+ * 1. Host can be either BE or LE
+ * 2. Guest/card is LE. Host uses le_to_cpu to access desc/avail
+ *    rings and ioreadXX/iowriteXX to access used ring.
+ * 3. Device page exposed by host to guest contains LE values. Guest
+ *    accesses these using ioreadXX/iowriteXX etc. This way in general we
+ *    obey the virtio spec according to which guest works with native
+ *    endianness and host is aware of guest endianness and does all
+ *    required endianness conversion.
+ * 4. Data provided from user space to guest (in ADD_DEVICE and
+ *    CONFIG_CHANGE ioctl's) is not interpreted by the driver and should be
+ *    in guest endianness.
+ */
+
+/**
+ * struct vop_info - Allocated per invocation of VOP probe
+ *
+ * @vpdev: VOP device
+ * @hotplug_work: Handle virtio device creation, deletion and configuration
+ * @cookie: Cookie received upon requesting a virtio configuration interrupt
+ * @h2c_config_db: The doorbell used by the peer to indicate a config change
+ * @vdev_list: List of "active" virtio devices injected in the peer node
+ * @vop_mutex: Synchronize access to the device page as well as serialize
+ *             creation/deletion of virtio devices on the peer node
+ * @dp: Peer device page information
+ * @dbg: Debugfs entry
+ * @dma_ch: The DMA channel used by this transport for data transfers.
+ * @name: Name for this transport used in misc device creation.
+ * @miscdev: The misc device registered.
+ */
+struct vop_info {
+	struct vop_device *vpdev;
+	struct work_struct hotplug_work;
+	struct mic_irq *cookie;
+	int h2c_config_db;
+	struct list_head vdev_list;
+	struct mutex vop_mutex;
+	void __iomem *dp;
+	struct dentry *dbg;
+	struct dma_chan *dma_ch;
+	char name[16];
+	struct miscdevice miscdev;
+};
+
+/**
+ * struct vop_vringh - Virtio ring host information.
+ *
+ * @vring: The VOP vring used for setting up user space mappings.
+ * @vrh: The host VRINGH used for accessing the card vrings.
+ * @riov: The VRINGH read kernel IOV.
+ * @wiov: The VRINGH write kernel IOV.
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
+ * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @buf: Temporary kernel buffer used to copy in/out data
+ * from/to the card via DMA.
+ * @buf_da: dma address of buf.
+ * @vdev: Back pointer to VOP virtio device for vringh_notify(..).
+ */
+struct vop_vringh {
+	struct mic_vring vring;
+	struct vringh vrh;
+	struct vringh_kiov riov;
+	struct vringh_kiov wiov;
+	u16 head;
+	struct mutex vr_mutex;
+	void *buf;
+	dma_addr_t buf_da;
+	struct vop_vdev *vdev;
+};
+
+/**
+ * struct vop_vdev - Host information for a card Virtio device.
+ *
+ * @virtio_id: Virtio device id.
+ * @waitq: Waitqueue to allow ring3 apps to poll.
+ * @vpdev: pointer to VOP bus device.
+ * @poll_wake: Used for waking up threads blocked in poll.
+ * @out_bytes: Debug stats for number of bytes copied from host to card.
+ * @in_bytes: Debug stats for number of bytes copied from card to host.
+ * @out_bytes_dma: Debug stats for number of bytes copied from host to card
+ * using DMA.
+ * @in_bytes_dma: Debug stats for number of bytes copied from card to host
+ * using DMA.
+ * @tx_len_unaligned: Debug stats for number of bytes copied to the card where
+ * the transfer length did not have the required DMA alignment.
+ * @tx_dst_unaligned: Debug stats for number of bytes copied where the
+ * destination address on the card did not have the required DMA alignment.
+ * @vvr: Store per VRING data structures.
+ * @virtio_bh_work: Work struct used to schedule virtio bottom half handling.
+ * @dd: Virtio device descriptor.
+ * @dc: Virtio device control fields.
+ * @list: List of Virtio devices.
+ * @virtio_db: The doorbell used by the card to interrupt the host.
+ * @virtio_cookie: The cookie returned while requesting interrupts.
+ * @vi: Transport information.
+ * @vdev_mutex: Mutex synchronizing virtio device injection,
+ *              removal and data transfers.
+ * @destroy: Track if a virtio device is being destroyed.
+ * @deleted: The virtio device has been deleted.
+ */
+struct vop_vdev {
+	int virtio_id;
+	wait_queue_head_t waitq;
+	struct vop_device *vpdev;
+	int poll_wake;
+	unsigned long out_bytes;
+	unsigned long in_bytes;
+	unsigned long out_bytes_dma;
+	unsigned long in_bytes_dma;
+	unsigned long tx_len_unaligned;
+	unsigned long tx_dst_unaligned;
+	unsigned long rx_dst_unaligned;
+	struct vop_vringh vvr[MIC_MAX_VRINGS];
+	struct work_struct virtio_bh_work;
+	struct mic_device_desc *dd;
+	struct mic_device_ctrl *dc;
+	struct list_head list;
+	int virtio_db;
+	struct mic_irq *virtio_cookie;
+	struct vop_info *vi;
+	struct mutex vdev_mutex;
+	struct completion destroy;
+	bool deleted;
+};
+
+/* Helper API to check if a virtio device is running */
+static inline bool vop_vdevup(struct vop_vdev *vdev)
+{
+	return !!vdev->dd->status;
+}
+
+void vop_init_debugfs(struct vop_info *vi);
+void vop_exit_debugfs(struct vop_info *vi);
+int vop_host_init(struct vop_info *vi);
+void vop_host_uninit(struct vop_info *vi);
+#endif
-- 
1.8.2.1
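
The endianness note in vop_main.h above says that the device page holds little-endian values while the host may be either BE or LE. A minimal userspace sketch of what that implies for a 16-bit field such as the vring size is shown below. The helper names le16_load() and le16_store() are hypothetical and are not part of the patch; in the kernel, le16_to_cpu() and cpu_to_le16() play this role.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode a little-endian 16-bit field byte by byte,
 * so the result is the same on big-endian and little-endian hosts.
 */
static uint16_t le16_load(const uint8_t *p)
{
	return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Hypothetical helper: encode a 16-bit value as little-endian bytes. */
static void le16_store(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}
```

Because the conversion works on individual bytes, the host never depends on its own byte order when it reads or writes fields shared with the card.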


* [PATCH char-misc-testing v2 5/8] misc: mic: Enable VOP host side functionality
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (3 preceding siblings ...)
  2016-02-08 23:48 75% ` [PATCH char-misc-testing v2 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
@ 2016-02-08 23:48 39% ` Sudeep Dutt
  2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 6/8] misc: mic: Enable VOP card " Sudeep Dutt
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch moves virtio functionality from the MIC host driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit f69bcbf3b4c4 ("Intel MIC Host
Driver Changes for Virtio Devices.") in
drivers/misc/mic/host/mic_virtio.c. Apart from being moved into a
separate driver, the functionality is essentially unchanged. See the
above-mentioned commit for a description of this functionality.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/vop/vop_vringh.c | 1164 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1164 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_vringh.c

diff --git a/drivers/misc/mic/vop/vop_vringh.c b/drivers/misc/mic/vop/vop_vringh.c
new file mode 100644
index 0000000..c1dd000
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_vringh.c
@@ -0,0 +1,1164 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/sched.h>
+#include <linux/poll.h>
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+#include <linux/mic_ioctl.h>
+#include "vop_main.h"
+
+/* Helper API to obtain the VOP PCIe device */
+static inline struct device *vop_dev(struct vop_vdev *vdev)
+{
+	return vdev->vpdev->dev.parent;
+}
+
+/* Helper API to check if a virtio device is initialized */
+static inline int vop_vdev_inited(struct vop_vdev *vdev)
+{
+	if (!vdev)
+		return -EINVAL;
+	/* Device has not been created yet */
+	if (!vdev->dd || !vdev->dd->type) {
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, -EINVAL);
+		return -EINVAL;
+	}
+	/* Device has been removed/deleted */
+	if (vdev->dd->type == -1) {
+		dev_dbg(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, -ENODEV);
+		return -ENODEV;
+	}
+	return 0;
+}
+
+static void _vop_notify(struct vringh *vrh)
+{
+	struct vop_vringh *vvrh = container_of(vrh, struct vop_vringh, vrh);
+	struct vop_vdev *vdev = vvrh->vdev;
+	struct vop_device *vpdev = vdev->vpdev;
+	s8 db = vdev->dc->h2c_vdev_db;
+
+	if (db != -1)
+		vpdev->hw_ops->send_intr(vpdev, db);
+}
+
+static void vop_virtio_init_post(struct vop_vdev *vdev)
+{
+	struct mic_vqconfig *vqconfig = mic_vq_config(vdev->dd);
+	struct vop_device *vpdev = vdev->vpdev;
+	int i, used_size;
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		used_size = PAGE_ALIGN(sizeof(u16) * 3 +
+				sizeof(struct vring_used_elem) *
+				le16_to_cpu(vqconfig[i].num));
+		if (!le64_to_cpu(vqconfig[i].used_address)) {
+			dev_warn(vop_dev(vdev), "used_address zero??\n");
+			continue;
+		}
+		vdev->vvr[i].vrh.vring.used =
+			(void __force *)vpdev->hw_ops->ioremap(
+			vpdev,
+			le64_to_cpu(vqconfig[i].used_address),
+			used_size);
+	}
+
+	vdev->dc->used_address_updated = 0;
+
+	dev_info(vop_dev(vdev), "%s: device type %d LINKUP\n",
+		 __func__, vdev->virtio_id);
+}
+
+static inline void vop_virtio_device_reset(struct vop_vdev *vdev)
+{
+	int i;
+
+	dev_dbg(vop_dev(vdev), "%s: status %d device type %d RESET\n",
+		__func__, vdev->dd->status, vdev->virtio_id);
+
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		/*
+		 * Avoid lockdep false positive. The + 1 is for the vop
+		 * mutex which is held in the reset devices code path.
+		 */
+		mutex_lock_nested(&vdev->vvr[i].vr_mutex, i + 1);
+
+	/* 0 status means "reset" */
+	vdev->dd->status = 0;
+	vdev->dc->vdev_reset = 0;
+	vdev->dc->host_ack = 1;
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vringh *vrh = &vdev->vvr[i].vrh;
+
+		vdev->vvr[i].vring.info->avail_idx = 0;
+		vrh->completed = 0;
+		vrh->last_avail_idx = 0;
+		vrh->last_used_idx = 0;
+	}
+
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_unlock(&vdev->vvr[i].vr_mutex);
+}
+
+static void vop_virtio_reset_devices(struct vop_info *vi)
+{
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		vop_virtio_device_reset(vdev);
+		vdev->poll_wake = 1;
+		wake_up(&vdev->waitq);
+	}
+}
+
+static void vop_bh_handler(struct work_struct *work)
+{
+	struct vop_vdev *vdev = container_of(work, struct vop_vdev,
+			virtio_bh_work);
+
+	if (vdev->dc->used_address_updated)
+		vop_virtio_init_post(vdev);
+
+	if (vdev->dc->vdev_reset)
+		vop_virtio_device_reset(vdev);
+
+	vdev->poll_wake = 1;
+	wake_up(&vdev->waitq);
+}
+
+static irqreturn_t _vop_virtio_intr_handler(int irq, void *data)
+{
+	struct vop_vdev *vdev = data;
+	struct vop_device *vpdev = vdev->vpdev;
+
+	vpdev->hw_ops->ack_interrupt(vpdev, vdev->virtio_db);
+	schedule_work(&vdev->virtio_bh_work);
+	return IRQ_HANDLED;
+}
+
+static int vop_virtio_config_change(struct vop_vdev *vdev, void *argp)
+{
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+	int ret = 0, retry, i;
+	struct vop_device *vpdev = vdev->vpdev;
+	struct vop_info *vi = dev_get_drvdata(&vpdev->dev);
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	s8 db = bootparam->h2c_config_db;
+
+	mutex_lock(&vi->vop_mutex);
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_lock_nested(&vdev->vvr[i].vr_mutex, i + 1);
+
+	if (db == -1 || vdev->dd->type == -1) {
+		ret = -EIO;
+		goto exit;
+	}
+
+	memcpy(mic_vq_configspace(vdev->dd), argp, vdev->dd->config_len);
+	vdev->dc->config_change = MIC_VIRTIO_PARAM_CONFIG_CHANGED;
+	vpdev->hw_ops->send_intr(vpdev, db);
+
+	for (retry = 100; retry--;) {
+		ret = wait_event_timeout(wake, vdev->dc->guest_ack,
+					 msecs_to_jiffies(100));
+		if (ret)
+			break;
+	}
+
+	dev_dbg(vop_dev(vdev),
+		"%s %d retry: %d\n", __func__, __LINE__, retry);
+	vdev->dc->config_change = 0;
+	vdev->dc->guest_ack = 0;
+exit:
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_unlock(&vdev->vvr[i].vr_mutex);
+	mutex_unlock(&vi->vop_mutex);
+	return ret;
+}
+
+static int vop_copy_dp_entry(struct vop_vdev *vdev,
+			     struct mic_device_desc *argp, __u8 *type,
+			     struct mic_device_desc **devpage)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_device_desc *devp;
+	struct mic_vqconfig *vqconfig;
+	int ret = 0, i;
+	bool slot_found = false;
+
+	vqconfig = mic_vq_config(argp);
+	for (i = 0; i < argp->num_vq; i++) {
+		if (le16_to_cpu(vqconfig[i].num) > MIC_MAX_VRING_ENTRIES) {
+			ret =  -EINVAL;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto exit;
+		}
+	}
+
+	/* Find the first free device page entry */
+	for (i = sizeof(struct mic_bootparam);
+		i < MIC_DP_SIZE - mic_total_desc_size(argp);
+		i += mic_total_desc_size(devp)) {
+		devp = vpdev->hw_ops->get_dp(vpdev) + i;
+		if (devp->type == 0 || devp->type == -1) {
+			slot_found = true;
+			break;
+		}
+	}
+	if (!slot_found) {
+		ret =  -EINVAL;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		goto exit;
+	}
+	/*
+	 * Save off the type before doing the memcpy. Type will be set in the
+	 * end after completing all initialization for the new device.
+	 */
+	*type = argp->type;
+	argp->type = 0;
+	memcpy(devp, argp, mic_desc_size(argp));
+
+	*devpage = devp;
+exit:
+	return ret;
+}
+
+static void vop_init_device_ctrl(struct vop_vdev *vdev,
+				 struct mic_device_desc *devpage)
+{
+	struct mic_device_ctrl *dc;
+
+	dc = (void *)devpage + mic_aligned_desc_size(devpage);
+
+	dc->config_change = 0;
+	dc->guest_ack = 0;
+	dc->vdev_reset = 0;
+	dc->host_ack = 0;
+	dc->used_address_updated = 0;
+	dc->c2h_vdev_db = -1;
+	dc->h2c_vdev_db = -1;
+	vdev->dc = dc;
+}
+
+static int vop_virtio_add_device(struct vop_vdev *vdev,
+				 struct mic_device_desc *argp)
+{
+	struct vop_info *vi = vdev->vi;
+	struct vop_device *vpdev = vi->vpdev;
+	struct mic_device_desc *dd = NULL;
+	struct mic_vqconfig *vqconfig;
+	int vr_size, i, j, ret;
+	u8 type = 0;
+	s8 db = -1;
+	char irqname[16];
+	struct mic_bootparam *bootparam;
+	u16 num;
+	dma_addr_t vr_addr;
+
+	bootparam = vpdev->hw_ops->get_dp(vpdev);
+	init_waitqueue_head(&vdev->waitq);
+	INIT_LIST_HEAD(&vdev->list);
+	vdev->vpdev = vpdev;
+
+	ret = vop_copy_dp_entry(vdev, argp, &type, &dd);
+	if (ret) {
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		kfree(vdev);
+		return ret;
+	}
+
+	vop_init_device_ctrl(vdev, dd);
+
+	vdev->dd = dd;
+	vdev->virtio_id = type;
+	vqconfig = mic_vq_config(dd);
+	INIT_WORK(&vdev->virtio_bh_work, vop_bh_handler);
+
+	for (i = 0; i < dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+		struct mic_vring *vr = &vdev->vvr[i].vring;
+
+		num = le16_to_cpu(vqconfig[i].num);
+		mutex_init(&vvr->vr_mutex);
+		vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) +
+			sizeof(struct _mic_vring_info));
+		vr->va = (void *)
+			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					 get_order(vr_size));
+		if (!vr->va) {
+			ret = -ENOMEM;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vr->len = vr_size;
+		vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN);
+		vr->info->magic = cpu_to_le32(MIC_MAGIC + vdev->virtio_id + i);
+		vr_addr = dma_map_single(&vpdev->dev, vr->va, vr_size,
+					 DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&vpdev->dev, vr_addr)) {
+			free_pages((unsigned long)vr->va, get_order(vr_size));
+			ret = -ENOMEM;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vqconfig[i].address = cpu_to_le64(vr_addr);
+
+		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
+		ret = vringh_init_kern(&vvr->vrh,
+				       *(u32 *)mic_vq_features(vdev->dd),
+				       num, false, vr->vr.desc, vr->vr.avail,
+				       vr->vr.used);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vringh_kiov_init(&vvr->riov, NULL, 0);
+		vringh_kiov_init(&vvr->wiov, NULL, 0);
+		vvr->head = USHRT_MAX;
+		vvr->vdev = vdev;
+		vvr->vrh.notify = _vop_notify;
+		dev_dbg(&vpdev->dev,
+			"%s %d index %d va %p info %p vr_size 0x%x\n",
+			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+		vvr->buf = (void *)__get_free_pages(GFP_KERNEL,
+					get_order(VOP_INT_DMA_BUF_SIZE));
+		vvr->buf_da = dma_map_single(&vpdev->dev,
+					  vvr->buf, VOP_INT_DMA_BUF_SIZE,
+					  DMA_BIDIRECTIONAL);
+	}
+
+	snprintf(irqname, sizeof(irqname), "vop%dvirtio%d", vpdev->index,
+		 vdev->virtio_id);
+	vdev->virtio_db = vpdev->hw_ops->next_db(vpdev);
+	vdev->virtio_cookie = vpdev->hw_ops->request_irq(vpdev,
+			_vop_virtio_intr_handler, irqname, vdev,
+			vdev->virtio_db);
+	if (IS_ERR(vdev->virtio_cookie)) {
+		ret = PTR_ERR(vdev->virtio_cookie);
+		dev_dbg(&vpdev->dev, "request irq failed\n");
+		goto err;
+	}
+
+	vdev->dc->c2h_vdev_db = vdev->virtio_db;
+
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	dd->type = type;
+	argp->type = type;
+
+	if (bootparam) {
+		db = bootparam->h2c_config_db;
+		if (db != -1)
+			vpdev->hw_ops->send_intr(vpdev, db);
+	}
+	dev_dbg(&vpdev->dev, "Added virtio id %d db %d\n", dd->type, db);
+	return 0;
+err:
+	vqconfig = mic_vq_config(dd);
+	for (j = 0; j < i; j++) {
+		struct vop_vringh *vvr = &vdev->vvr[j];
+
+		dma_unmap_single(&vpdev->dev, le64_to_cpu(vqconfig[j].address),
+				 vvr->vring.len, DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->vring.va,
+			   get_order(vvr->vring.len));
+	}
+	return ret;
+}
+
+static void vop_dev_remove(struct vop_info *pvi, struct mic_device_ctrl *devp,
+			   struct vop_device *vpdev)
+{
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	s8 db;
+	int ret, retry;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+
+	devp->config_change = MIC_VIRTIO_PARAM_DEV_REMOVE;
+	db = bootparam->h2c_config_db;
+	if (db != -1)
+		vpdev->hw_ops->send_intr(vpdev, db);
+	else
+		goto done;
+	for (retry = 15; retry--;) {
+		ret = wait_event_timeout(wake, devp->guest_ack,
+					 msecs_to_jiffies(1000));
+		if (ret)
+			break;
+	}
+done:
+	devp->config_change = 0;
+	devp->guest_ack = 0;
+}
+
+static void vop_virtio_del_device(struct vop_vdev *vdev)
+{
+	struct vop_info *vi = vdev->vi;
+	struct vop_device *vpdev = vdev->vpdev;
+	int i;
+	struct mic_vqconfig *vqconfig;
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+
+	if (!bootparam)
+		goto skip_hot_remove;
+	vop_dev_remove(vi, vdev->dc, vpdev);
+skip_hot_remove:
+	vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+	flush_work(&vdev->virtio_bh_work);
+	vqconfig = mic_vq_config(vdev->dd);
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+
+		dma_unmap_single(&vpdev->dev,
+				 vvr->buf_da, VOP_INT_DMA_BUF_SIZE,
+				 DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->buf,
+			   get_order(VOP_INT_DMA_BUF_SIZE));
+		vringh_kiov_cleanup(&vvr->riov);
+		vringh_kiov_cleanup(&vvr->wiov);
+		dma_unmap_single(&vpdev->dev, le64_to_cpu(vqconfig[i].address),
+				 vvr->vring.len, DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->vring.va,
+			   get_order(vvr->vring.len));
+	}
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	vdev->dd->type = -1;
+}
+
+/*
+ * vop_sync_dma - Wrapper for synchronous DMAs.
+ *
+ * @vdev - The VOP virtio device instance whose DMA channel is used
+ * for the transfer.
+ * @dst - destination DMA address.
+ * @src - source DMA address.
+ * @len - size of the transfer.
+ *
+ * Return 0 on success, a negative errno or a non-zero DMA status on failure.
+ */
+static int vop_sync_dma(struct vop_vdev *vdev, dma_addr_t dst, dma_addr_t src,
+			size_t len)
+{
+	int err = 0;
+	struct dma_device *ddev;
+	struct dma_async_tx_descriptor *tx;
+	struct vop_info *vi = dev_get_drvdata(&vdev->vpdev->dev);
+	struct dma_chan *vop_ch = vi->dma_ch;
+
+	if (!vop_ch) {
+		err = -EBUSY;
+		goto error;
+	}
+	ddev = vop_ch->device;
+	tx = ddev->device_prep_dma_memcpy(vop_ch, dst, src, len,
+		DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	} else {
+		dma_cookie_t cookie;
+
+		cookie = tx->tx_submit(tx);
+		if (dma_submit_error(cookie)) {
+			err = -ENOMEM;
+			goto error;
+		}
+		dma_async_issue_pending(vop_ch);
+		err = dma_sync_wait(vop_ch, cookie);
+	}
+error:
+	if (err)
+		dev_err(&vi->vpdev->dev, "%s %d err %d\n",
+			__func__, __LINE__, err);
+	return err;
+}
+
+#define VOP_USE_DMA true
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to a user
+ * space buffer. When transfers are done using DMA, source/destination
+ * addresses and transfer length must follow the alignment requirements of
+ * the MIC DMA engine.
+ */
+static int vop_virtio_copy_to_user(struct vop_vdev *vdev, void __user *ubuf,
+				   size_t len, u64 daddr, size_t dlen,
+				   int vr_idx)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	void __iomem *dbuf = vpdev->hw_ops->ioremap(vpdev, daddr, len);
+	struct vop_vringh *vvr = &vdev->vvr[vr_idx];
+	struct vop_info *vi = dev_get_drvdata(&vpdev->dev);
+	size_t dma_alignment = 1 << vi->dma_ch->device->copy_align;
+	bool x200 = is_dma_copy_aligned(vi->dma_ch->device, 1, 1, 1);
+	size_t dma_offset, partlen;
+	int err;
+
+	if (!VOP_USE_DMA) {
+		if (copy_to_user(ubuf, (void __force *)dbuf, len)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		vdev->in_bytes += len;
+		err = 0;
+		goto err;
+	}
+
+	dma_offset = daddr - round_down(daddr, dma_alignment);
+	daddr -= dma_offset;
+	len += dma_offset;
+	/*
+	 * X100 uses DMA addresses as seen by the card so adding
+	 * the aperture base is not required for DMA. However x200
+	 * requires DMA addresses to be an offset into the bar so
+	 * add the aperture base for x200.
+	 */
+	if (x200)
+		daddr += vpdev->aper->pa;
+	while (len) {
+		partlen = min_t(size_t, len, VOP_INT_DMA_BUF_SIZE);
+		err = vop_sync_dma(vdev, vvr->buf_da, daddr,
+				   ALIGN(partlen, dma_alignment));
+		if (err) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		if (copy_to_user(ubuf, vvr->buf + dma_offset,
+				 partlen - dma_offset)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		vdev->in_bytes_dma += partlen;
+		vdev->in_bytes += partlen;
+		len -= partlen;
+		dma_offset = 0;
+	}
+	err = 0;
+err:
+	vpdev->hw_ops->iounmap(vpdev, dbuf);
+	dev_dbg(vop_dev(vdev),
+		"%s: ubuf %p dbuf %p len 0x%lx vr_idx 0x%x\n",
+		__func__, ubuf, dbuf, len, vr_idx);
+	return err;
+}
+
+/*
+ * Initiates copies across the PCIe bus from a user space buffer to card
+ * memory. When transfers are done using DMA, source/destination addresses
+ * and transfer length must follow the alignment requirements of the MIC
+ * DMA engine.
+ */
+static int vop_virtio_copy_from_user(struct vop_vdev *vdev, void __user *ubuf,
+				     size_t len, u64 daddr, size_t dlen,
+				     int vr_idx)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	void __iomem *dbuf = vpdev->hw_ops->ioremap(vpdev, daddr, len);
+	struct vop_vringh *vvr = &vdev->vvr[vr_idx];
+	struct vop_info *vi = dev_get_drvdata(&vdev->vpdev->dev);
+	size_t dma_alignment = 1 << vi->dma_ch->device->copy_align;
+	bool x200 = is_dma_copy_aligned(vi->dma_ch->device, 1, 1, 1);
+	size_t partlen;
+	bool dma = VOP_USE_DMA;
+	int err = 0;
+
+	if (daddr & (dma_alignment - 1)) {
+		vdev->tx_dst_unaligned += len;
+		dma = false;
+	} else if (ALIGN(len, dma_alignment) > dlen) {
+		vdev->tx_len_unaligned += len;
+		dma = false;
+	}
+
+	if (!dma)
+		goto memcpy;
+
+	/*
+	 * X100 uses DMA addresses as seen by the card so adding
+	 * the aperture base is not required for DMA. However x200
+	 * requires DMA addresses to be an offset into the bar so
+	 * add the aperture base for x200.
+	 */
+	if (x200)
+		daddr += vpdev->aper->pa;
+	while (len) {
+		partlen = min_t(size_t, len, VOP_INT_DMA_BUF_SIZE);
+
+		if (copy_from_user(vvr->buf, ubuf, partlen)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		err = vop_sync_dma(vdev, daddr, vvr->buf_da,
+				   ALIGN(partlen, dma_alignment));
+		if (err) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		vdev->out_bytes_dma += partlen;
+		vdev->out_bytes += partlen;
+		len -= partlen;
+	}
+memcpy:
+	/*
+	 * We are copying to IO below and should ideally use something
+	 * like copy_from_user_toio(..) if it existed.
+	 */
+	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
+		err = -EFAULT;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	vdev->out_bytes += len;
+	err = 0;
+err:
+	vpdev->hw_ops->iounmap(vpdev, dbuf);
+	dev_dbg(vop_dev(vdev),
+		"%s: ubuf %p dbuf %p len 0x%lx vr_idx 0x%x\n",
+		__func__, ubuf, dbuf, len, vr_idx);
+	return err;
+}
+
+#define MIC_VRINGH_READ true
+
+/* Determine the total number of bytes consumed in a VRINGH KIOV */
+static inline u32 vop_vringh_iov_consumed(struct vringh_kiov *iov)
+{
+	int i;
+	u32 total = iov->consumed;
+
+	for (i = 0; i < iov->i; i++)
+		total += iov->iov[i].iov_len;
+	return total;
+}
+
+/*
+ * Traverse the VRINGH KIOV and issue the APIs to trigger the copies.
+ * This API is heavily based on the vringh_iov_xfer(..) implementation
+ * in vringh.c. The reason we cannot reuse vringh_iov_pull_kern(..)
+ * and vringh_iov_push_kern(..) directly is because there is no
+ * way to override the VRINGH xfer(..) routines as of v3.10.
+ */
+static int vop_vringh_copy(struct vop_vdev *vdev, struct vringh_kiov *iov,
+			   void __user *ubuf, size_t len, bool read, int vr_idx,
+			   size_t *out_len)
+{
+	int ret = 0;
+	size_t partlen, tot_len = 0;
+
+	while (len && iov->i < iov->used) {
+		struct kvec *kiov = &iov->iov[iov->i];
+
+		partlen = min(kiov->iov_len, len);
+		if (read)
+			ret = vop_virtio_copy_to_user(vdev, ubuf, partlen,
+						      (u64)kiov->iov_base,
+						      kiov->iov_len,
+						      vr_idx);
+		else
+			ret = vop_virtio_copy_from_user(vdev, ubuf, partlen,
+							(u64)kiov->iov_base,
+							kiov->iov_len,
+							vr_idx);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= partlen;
+		ubuf += partlen;
+		tot_len += partlen;
+		iov->consumed += partlen;
+		kiov->iov_len -= partlen;
+		kiov->iov_base += partlen;
+		if (!kiov->iov_len) {
+			/* Fix up old iov element then increment. */
+			kiov->iov_len = iov->consumed;
+			kiov->iov_base -= iov->consumed;
+
+			iov->consumed = 0;
+			iov->i++;
+		}
+	}
+	*out_len = tot_len;
+	return ret;
+}
+
+/*
+ * Use the standard VRINGH infrastructure in the kernel to fetch new
+ * descriptors, initiate the copies and update the used ring.
+ */
+static int _vop_virtio_copy(struct vop_vdev *vdev, struct mic_copy_desc *copy)
+{
+	int ret = 0;
+	u32 iovcnt = copy->iovcnt;
+	struct iovec iov;
+	struct iovec __user *u_iov = copy->iov;
+	void __user *ubuf = NULL;
+	struct vop_vringh *vvr = &vdev->vvr[copy->vr_idx];
+	struct vringh_kiov *riov = &vvr->riov;
+	struct vringh_kiov *wiov = &vvr->wiov;
+	struct vringh *vrh = &vvr->vrh;
+	u16 *head = &vvr->head;
+	struct mic_vring *vr = &vvr->vring;
+	size_t len = 0, out_len;
+
+	copy->out_len = 0;
+	/* Fetch a new IOVEC if all previous elements have been processed */
+	if (riov->i == riov->used && wiov->i == wiov->used) {
+		ret = vringh_getdesc_kern(vrh, riov, wiov,
+					  head, GFP_KERNEL);
+		/* Check if there are available descriptors */
+		if (ret <= 0)
+			return ret;
+	}
+	while (iovcnt) {
+		if (!len) {
+			/* Copy over a new iovec from user space. */
+			ret = copy_from_user(&iov, u_iov, sizeof(*u_iov));
+			if (ret) {
+				ret = -EINVAL;
+				dev_err(vop_dev(vdev), "%s %d err %d\n",
+					__func__, __LINE__, ret);
+				break;
+			}
+			len = iov.iov_len;
+			ubuf = iov.iov_base;
+		}
+		/* Issue all the read descriptors first */
+		ret = vop_vringh_copy(vdev, riov, ubuf, len,
+				      MIC_VRINGH_READ, copy->vr_idx, &out_len);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		/* Issue the write descriptors next */
+		ret = vop_vringh_copy(vdev, wiov, ubuf, len,
+				      !MIC_VRINGH_READ, copy->vr_idx, &out_len);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		if (!len) {
+			/* One user space iovec is now completed */
+			iovcnt--;
+			u_iov++;
+		}
+		/* Exit loop if all elements in KIOVs have been processed. */
+		if (riov->i == riov->used && wiov->i == wiov->used)
+			break;
+	}
+	/*
+	 * Update the used ring if a descriptor was available and some data was
+	 * copied in/out and the user asked for a used ring update.
+	 */
+	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
+		u32 total = 0;
+
+		/* Determine the total data consumed */
+		total += vop_vringh_iov_consumed(riov);
+		total += vop_vringh_iov_consumed(wiov);
+		vringh_complete_kern(vrh, *head, total);
+		*head = USHRT_MAX;
+		if (vringh_need_notify_kern(vrh) > 0)
+			vringh_notify(vrh);
+		vringh_kiov_cleanup(riov);
+		vringh_kiov_cleanup(wiov);
+		/* Update avail idx for user space */
+		vr->info->avail_idx = vrh->last_avail_idx;
+	}
+	return ret;
+}
+
+static inline int vop_verify_copy_args(struct vop_vdev *vdev,
+				       struct mic_copy_desc *copy)
+{
+	if (!vdev || copy->vr_idx >= vdev->dd->num_vq)
+		return -EINVAL;
+	return 0;
+}
+
+/* Copy a specified number of virtio descriptors in a chain */
+static int vop_virtio_copy_desc(struct vop_vdev *vdev,
+				struct mic_copy_desc *copy)
+{
+	int err;
+	struct vop_vringh *vvr = &vdev->vvr[copy->vr_idx];
+
+	err = vop_verify_copy_args(vdev, copy);
+	if (err)
+		return err;
+
+	mutex_lock(&vvr->vr_mutex);
+	if (!vop_vdevup(vdev)) {
+		err = -ENODEV;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	err = _vop_virtio_copy(vdev, copy);
+	if (err) {
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+	}
+err:
+	mutex_unlock(&vvr->vr_mutex);
+	return err;
+}
+
+static int vop_open(struct inode *inode, struct file *f)
+{
+	struct vop_vdev *vdev;
+	struct vop_info *vi = container_of(f->private_data,
+		struct vop_info, miscdev);
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return -ENOMEM;
+	vdev->vi = vi;
+	mutex_init(&vdev->vdev_mutex);
+	f->private_data = vdev;
+	init_completion(&vdev->destroy);
+	complete(&vdev->destroy);
+	return 0;
+}
+
+static int vop_release(struct inode *inode, struct file *f)
+{
+	struct vop_vdev *vdev = f->private_data, *vdev_tmp;
+	struct vop_info *vi = vdev->vi;
+	struct list_head *pos, *tmp;
+	bool found = false;
+
+	mutex_lock(&vdev->vdev_mutex);
+	if (vdev->deleted)
+		goto unlock;
+	mutex_lock(&vi->vop_mutex);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev_tmp = list_entry(pos, struct vop_vdev, list);
+		if (vdev == vdev_tmp) {
+			vop_virtio_del_device(vdev);
+			list_del(pos);
+			found = true;
+			break;
+		}
+	}
+	mutex_unlock(&vi->vop_mutex);
+unlock:
+	mutex_unlock(&vdev->vdev_mutex);
+	if (!found)
+		wait_for_completion(&vdev->destroy);
+	f->private_data = NULL;
+	kfree(vdev);
+	return 0;
+}
+
+static long vop_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct vop_vdev *vdev = f->private_data;
+	struct vop_info *vi = vdev->vi;
+	void __user *argp = (void __user *)arg;
+	int ret;
+
+	switch (cmd) {
+	case MIC_VIRTIO_ADD_DEVICE:
+	{
+		struct mic_device_desc dd, *dd_config;
+
+		if (copy_from_user(&dd, argp, sizeof(dd)))
+			return -EFAULT;
+
+		if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
+		    dd.num_vq > MIC_MAX_VRINGS)
+			return -EINVAL;
+
+		dd_config = kzalloc(mic_desc_size(&dd), GFP_KERNEL);
+		if (!dd_config)
+			return -ENOMEM;
+		if (copy_from_user(dd_config, argp, mic_desc_size(&dd))) {
+			ret = -EFAULT;
+			goto free_ret;
+		}
+		mutex_lock(&vdev->vdev_mutex);
+		mutex_lock(&vi->vop_mutex);
+		ret = vop_virtio_add_device(vdev, dd_config);
+		if (ret)
+			goto unlock_ret;
+		list_add_tail(&vdev->list, &vi->vdev_list);
+unlock_ret:
+		mutex_unlock(&vi->vop_mutex);
+		mutex_unlock(&vdev->vdev_mutex);
+free_ret:
+		kfree(dd_config);
+		return ret;
+	}
+	case MIC_VIRTIO_COPY_DESC:
+	{
+		struct mic_copy_desc copy;
+
+		mutex_lock(&vdev->vdev_mutex);
+		ret = vop_vdev_inited(vdev);
+		if (ret)
+			goto _unlock_ret;
+
+		if (copy_from_user(&copy, argp, sizeof(copy))) {
+			ret = -EFAULT;
+			goto _unlock_ret;
+		}
+
+		ret = vop_virtio_copy_desc(vdev, &copy);
+		if (ret < 0)
+			goto _unlock_ret;
+		if (copy_to_user(
+			&((struct mic_copy_desc __user *)argp)->out_len,
+			&copy.out_len, sizeof(copy.out_len)))
+			ret = -EFAULT;
+_unlock_ret:
+		mutex_unlock(&vdev->vdev_mutex);
+		return ret;
+	}
+	case MIC_VIRTIO_CONFIG_CHANGE:
+	{
+		void *buf;
+
+		mutex_lock(&vdev->vdev_mutex);
+		ret = vop_vdev_inited(vdev);
+		if (ret)
+			goto __unlock_ret;
+		buf = kzalloc(vdev->dd->config_len, GFP_KERNEL);
+		if (!buf) {
+			ret = -ENOMEM;
+			goto __unlock_ret;
+		}
+		if (copy_from_user(buf, argp, vdev->dd->config_len)) {
+			ret = -EFAULT;
+			goto done;
+		}
+		ret = vop_virtio_config_change(vdev, buf);
+done:
+		kfree(buf);
+__unlock_ret:
+		mutex_unlock(&vdev->vdev_mutex);
+		return ret;
+	}
+	default:
+		return -ENOIOCTLCMD;
+	}
+	return 0;
+}
+
+/*
+ * We return POLLIN | POLLOUT from poll when new buffers are enqueued, and
+ * not when previously enqueued buffers may be available. This means that
+ * in the card->host (TX) path, when userspace is unblocked by poll it
+ * must drain all available descriptors or it can stall.
+ */
+static unsigned int vop_poll(struct file *f, poll_table *wait)
+{
+	struct vop_vdev *vdev = f->private_data;
+	int mask = 0;
+
+	mutex_lock(&vdev->vdev_mutex);
+	if (vop_vdev_inited(vdev)) {
+		mask = POLLERR;
+		goto done;
+	}
+	poll_wait(f, &vdev->waitq, wait);
+	if (vop_vdev_inited(vdev)) {
+		mask = POLLERR;
+	} else if (vdev->poll_wake) {
+		vdev->poll_wake = 0;
+		mask = POLLIN | POLLOUT;
+	}
+done:
+	mutex_unlock(&vdev->vdev_mutex);
+	return mask;
+}
+
+static inline int
+vop_query_offset(struct vop_vdev *vdev, unsigned long offset,
+		 unsigned long *size, unsigned long *pa)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	unsigned long start = MIC_DP_SIZE;
+	int i;
+
+	/*
+	 * MMAP interface is as follows:
+	 * offset				region
+	 * 0x0					virtio device_page
+	 * 0x1000				first vring
+	 * 0x1000 + size of 1st vring		second vring
+	 * ....
+	 */
+	if (!offset) {
+		*pa = virt_to_phys(vpdev->hw_ops->get_dp(vpdev));
+		*size = MIC_DP_SIZE;
+		return 0;
+	}
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+
+		if (offset == start) {
+			*pa = virt_to_phys(vvr->vring.va);
+			*size = vvr->vring.len;
+			return 0;
+		}
+		start += vvr->vring.len;
+	}
+	return -1;
+}
+
+/*
+ * Maps the device page and virtio rings to user space for read-only access.
+ */
+static int vop_mmap(struct file *f, struct vm_area_struct *vma)
+{
+	struct vop_vdev *vdev = f->private_data;
+	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+	unsigned long pa, size = vma->vm_end - vma->vm_start, size_rem = size;
+	int i, err;
+
+	err = vop_vdev_inited(vdev);
+	if (err)
+		goto ret;
+	if (vma->vm_flags & VM_WRITE) {
+		err = -EACCES;
+		goto ret;
+	}
+	while (size_rem) {
+		i = vop_query_offset(vdev, offset, &size, &pa);
+		if (i < 0) {
+			err = -EINVAL;
+			goto ret;
+		}
+		err = remap_pfn_range(vma, vma->vm_start + offset,
+				      pa >> PAGE_SHIFT, size,
+				      vma->vm_page_prot);
+		if (err)
+			goto ret;
+		size_rem -= size;
+		offset += size;
+	}
+ret:
+	return err;
+}
+
+static const struct file_operations vop_fops = {
+	.open = vop_open,
+	.release = vop_release,
+	.unlocked_ioctl = vop_ioctl,
+	.poll = vop_poll,
+	.mmap = vop_mmap,
+	.owner = THIS_MODULE,
+};
+
+int vop_host_init(struct vop_info *vi)
+{
+	int rc;
+	struct miscdevice *mdev;
+	struct vop_device *vpdev = vi->vpdev;
+
+	INIT_LIST_HEAD(&vi->vdev_list);
+	vi->dma_ch = vpdev->dma_ch;
+	mdev = &vi->miscdev;
+	mdev->minor = MISC_DYNAMIC_MINOR;
+	snprintf(vi->name, sizeof(vi->name), "vop_virtio%d", vpdev->index);
+	mdev->name = vi->name;
+	mdev->fops = &vop_fops;
+	mdev->parent = &vpdev->dev;
+
+	rc = misc_register(mdev);
+	if (rc)
+		dev_err(&vpdev->dev, "%s failed rc %d\n", __func__, rc);
+	return rc;
+}
+
+void vop_host_uninit(struct vop_info *vi)
+{
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+
+	mutex_lock(&vi->vop_mutex);
+	vop_virtio_reset_devices(vi);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		list_del(pos);
+		reinit_completion(&vdev->destroy);
+		mutex_unlock(&vi->vop_mutex);
+		mutex_lock(&vdev->vdev_mutex);
+		vop_virtio_del_device(vdev);
+		vdev->deleted = true;
+		mutex_unlock(&vdev->vdev_mutex);
+		complete(&vdev->destroy);
+		mutex_lock(&vi->vop_mutex);
+	}
+	mutex_unlock(&vi->vop_mutex);
+	misc_deregister(&vi->miscdev);
+}
-- 
1.8.2.1
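
The MMAP interface comment in vop_query_offset() above describes the layout
user space sees: the device page at offset 0x0, the first vring at 0x1000,
and each further vring directly after the previous one. The following is only
a user-space sketch of that lookup, not driver code. DP_SIZE stands in for
MIC_DP_SIZE and is assumed to be 0x1000 as in that comment; struct region and
query_offset() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for MIC_DP_SIZE; assumed to be 0x1000 per the layout comment. */
#define DP_SIZE 0x1000UL

/* Hypothetical result type; not a structure from the driver. */
struct region {
	int index;          /* -1 for the device page, else the vring index */
	unsigned long size; /* length of the region in bytes */
};

/*
 * Resolve an mmap offset to a region. As in vop_query_offset(), only an
 * offset that lands exactly on the start of a region matches.
 * Returns 0 on success and -1 if the offset matches no region start.
 */
static int query_offset(const unsigned long *vring_len, size_t num_vq,
			unsigned long offset, struct region *out)
{
	unsigned long start = DP_SIZE;
	size_t i;

	assert(out != NULL);
	assert(vring_len != NULL || num_vq == 0);

	if (offset == 0) {
		out->index = -1;
		out->size = DP_SIZE;
		return 0;
	}
	for (i = 0; i < num_vq; i++) {
		if (offset == start) {
			out->index = (int)i;
			out->size = vring_len[i];
			return 0;
		}
		/* The next vring begins right after this one. */
		start += vring_len[i];
	}
	return -1;
}
```

With two rings of 0x3000 and 0x2000 bytes, offset 0x1000 resolves to ring 0
and offset 0x4000 to ring 1, while an offset in the middle of a ring, such as
0x2000, is rejected. vop_mmap() turns that rejection into -EINVAL.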

* [PATCH char-misc-testing v2 7/8] misc: mic: Enable VOP debugfs and driver build
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (5 preceding siblings ...)
  2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 6/8] misc: mic: Enable VOP card " Sudeep Dutt
@ 2016-02-08 23:48 67% ` Sudeep Dutt
  2016-02-08 23:48 58% ` [PATCH char-misc-testing v2 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch moves the virtio specific debugfs hooks previously in
mic_debugfs.c in the MIC host driver into the VOP driver. The
Kconfig/Makefile is also updated to allow building the VOP driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig           |  20 ++++
 drivers/misc/mic/Makefile          |   1 +
 drivers/misc/mic/vop/Makefile      |   9 ++
 drivers/misc/mic/vop/vop_debugfs.c | 232 +++++++++++++++++++++++++++++++++++++
 4 files changed, 262 insertions(+)
 create mode 100644 drivers/misc/mic/vop/Makefile
 create mode 100644 drivers/misc/mic/vop/vop_debugfs.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 840f7ef..b03bb17 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -124,3 +124,23 @@ config MIC_COSM
 	  More information about the Intel MIC family as well as the Linux
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "VOP Driver"
+
+config VOP
+	tristate "VOP Driver"
+	depends on 64BIT && PCI && X86 && VOP_BUS
+	select VHOST_RING
+	help
+	  This enables VOP (Virtio over PCIe) Driver support for the Intel
+	  Many Integrated Core (MIC) family of PCIe form factor coprocessor
+	  devices. The VOP driver allows virtio drivers, e.g. net, console
+	  and block drivers, on the card to connect to user space virtio
+	  devices on the host.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e288a11..f2b1323 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -8,3 +8,4 @@ obj-y += bus/
 obj-$(CONFIG_SCIF) += scif/
 obj-$(CONFIG_MIC_COSM) += cosm/
 obj-$(CONFIG_MIC_COSM) += cosm_client/
+obj-$(CONFIG_VOP) += vop/
diff --git a/drivers/misc/mic/vop/Makefile b/drivers/misc/mic/vop/Makefile
new file mode 100644
index 0000000..78819c8
--- /dev/null
+++ b/drivers/misc/mic/vop/Makefile
@@ -0,0 +1,9 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2016, Intel Corporation.
+#
+obj-$(CONFIG_VOP) := vop.o
+
+vop-objs += vop_main.o
+vop-objs += vop_debugfs.o
+vop-objs += vop_vringh.o
diff --git a/drivers/misc/mic/vop/vop_debugfs.c b/drivers/misc/mic/vop/vop_debugfs.c
new file mode 100644
index 0000000..ab43884
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_debugfs.c
@@ -0,0 +1,232 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "vop_main.h"
+
+static int vop_dp_show(struct seq_file *s, void *pos)
+{
+	struct mic_device_desc *d;
+	struct mic_device_ctrl *dc;
+	struct mic_vqconfig *vqconfig;
+	__u32 *features;
+	__u8 *config;
+	struct vop_info *vi = s->private;
+	struct vop_device *vpdev = vi->vpdev;
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	int j, k;
+
+	seq_printf(s, "Bootparam: magic 0x%x\n",
+		   bootparam->magic);
+	seq_printf(s, "Bootparam: h2c_config_db %d\n",
+		   bootparam->h2c_config_db);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
+	for (j = sizeof(*bootparam);
+		j < MIC_DP_SIZE; j += mic_total_desc_size(d)) {
+		d = (void *)bootparam + j;
+		dc = (void *)d + mic_aligned_desc_size(d);
+
+		/* end of list */
+		if (d->type == 0)
+			break;
+
+		if (d->type == -1)
+			continue;
+
+		seq_printf(s, "Type %d ", d->type);
+		seq_printf(s, "Num VQ %d ", d->num_vq);
+		seq_printf(s, "Feature Len %d\n", d->feature_len);
+		seq_printf(s, "Config Len %d ", d->config_len);
+		seq_printf(s, "Shutdown Status %d\n", d->status);
+
+		for (k = 0; k < d->num_vq; k++) {
+			vqconfig = mic_vq_config(d) + k;
+			seq_printf(s, "vqconfig[%d]: ", k);
+			seq_printf(s, "address 0x%llx ",
+				   vqconfig->address);
+			seq_printf(s, "num %d ", vqconfig->num);
+			seq_printf(s, "used address 0x%llx\n",
+				   vqconfig->used_address);
+		}
+
+		features = (__u32 *)mic_vq_features(d);
+		seq_printf(s, "Features: Host 0x%x ", features[0]);
+		seq_printf(s, "Guest 0x%x\n", features[1]);
+
+		config = mic_vq_configspace(d);
+		for (k = 0; k < d->config_len; k++)
+			seq_printf(s, "config[%d]=%d\n", k, config[k]);
+
+		seq_puts(s, "Device control:\n");
+		seq_printf(s, "Config Change %d ", dc->config_change);
+		seq_printf(s, "Vdev reset %d\n", dc->vdev_reset);
+		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
+		seq_printf(s, "Host ack %d\n", dc->host_ack);
+		seq_printf(s, "Used address updated %d ",
+			   dc->used_address_updated);
+		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
+		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
+		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
+	}
+	schedule_work(&vi->hotplug_work);
+	return 0;
+}
+
+static int vop_dp_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, vop_dp_show, inode->i_private);
+}
+
+static int vop_dp_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations dp_ops = {
+	.owner   = THIS_MODULE,
+	.open    = vop_dp_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = vop_dp_debug_release
+};
+
+static int vop_vdev_info_show(struct seq_file *s, void *unused)
+{
+	struct vop_info *vi = s->private;
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+	int i, j;
+
+	mutex_lock(&vi->vop_mutex);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		seq_printf(s, "VDEV type %d state %s in %ld out %ld in_dma %ld out_dma %ld\n",
+			   vdev->virtio_id,
+			   vop_vdevup(vdev) ? "UP" : "DOWN",
+			   vdev->in_bytes,
+			   vdev->out_bytes,
+			   vdev->in_bytes_dma,
+			   vdev->out_bytes_dma);
+		for (i = 0; i < MIC_MAX_VRINGS; i++) {
+			struct vring_desc *desc;
+			struct vring_avail *avail;
+			struct vring_used *used;
+			struct vop_vringh *vvr = &vdev->vvr[i];
+			struct vringh *vrh = &vvr->vrh;
+			int num = vrh->vring.num;
+
+			if (!num)
+				continue;
+			desc = vrh->vring.desc;
+			seq_printf(s, "vring i %d avail_idx %d",
+				   i, vvr->vring.info->avail_idx & (num - 1));
+			seq_printf(s, " vring i %d avail_idx %d\n",
+				   i, vvr->vring.info->avail_idx);
+			seq_printf(s, "vrh i %d weak_barriers %d",
+				   i, vrh->weak_barriers);
+			seq_printf(s, " last_avail_idx %d last_used_idx %d",
+				   vrh->last_avail_idx, vrh->last_used_idx);
+			seq_printf(s, " completed %d\n", vrh->completed);
+			for (j = 0; j < num; j++) {
+				seq_printf(s, "desc[%d] addr 0x%llx len %d",
+					   j, desc->addr, desc->len);
+				seq_printf(s, " flags 0x%x next %d\n",
+					   desc->flags, desc->next);
+				desc++;
+			}
+			avail = vrh->vring.avail;
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, avail->flags),
+				   vringh16_to_cpu(vrh,
+						   avail->idx) & (num - 1));
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, avail->flags),
+				   vringh16_to_cpu(vrh, avail->idx));
+			for (j = 0; j < num; j++)
+				seq_printf(s, "avail ring[%d] %d\n",
+					   j, avail->ring[j]);
+			used = vrh->vring.used;
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, used->flags),
+				   vringh16_to_cpu(vrh, used->idx) & (num - 1));
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, used->flags),
+				   vringh16_to_cpu(vrh, used->idx));
+			for (j = 0; j < num; j++)
+				seq_printf(s, "used ring[%d] id %d len %d\n",
+					   j, vringh32_to_cpu(vrh,
+							      used->ring[j].id),
+					   vringh32_to_cpu(vrh,
+							   used->ring[j].len));
+		}
+	}
+	mutex_unlock(&vi->vop_mutex);
+
+	return 0;
+}
+
+static int vop_vdev_info_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, vop_vdev_info_show, inode->i_private);
+}
+
+static int vop_vdev_info_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations vdev_info_ops = {
+	.owner   = THIS_MODULE,
+	.open    = vop_vdev_info_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = vop_vdev_info_debug_release
+};
+
+void vop_init_debugfs(struct vop_info *vi)
+{
+	char name[16];
+
+	snprintf(name, sizeof(name), "%s%d", KBUILD_MODNAME, vi->vpdev->dnode);
+	vi->dbg = debugfs_create_dir(name, NULL);
+	if (!vi->dbg) {
+		pr_err("can't create debugfs dir vop\n");
+		return;
+	}
+	debugfs_create_file("dp", 0444, vi->dbg, vi, &dp_ops);
+	debugfs_create_file("vdev_info", 0444, vi->dbg, vi, &vdev_info_ops);
+}
+
+void vop_exit_debugfs(struct vop_info *vi)
+{
+	debugfs_remove_recursive(vi->dbg);
+}
-- 
1.8.2.1

* [PATCH char-misc-testing v2 6/8] misc: mic: Enable VOP card side functionality
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (4 preceding siblings ...)
  2016-02-08 23:48 39% ` [PATCH char-misc-testing v2 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
@ 2016-02-08 23:48 49% ` Sudeep Dutt
  2016-02-08 23:48 67% ` [PATCH char-misc-testing v2 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
  2016-02-08 23:48 58% ` [PATCH char-misc-testing v2 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

This patch moves virtio functionality from the MIC card driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit 2141c7c5ee67 ("Intel MIC Card
Driver Changes for Virtio Devices.") in
drivers/misc/mic/card/mic_virtio.c. Apart from being moved into a
separate driver, the functionality is essentially unchanged. See the
above-mentioned commit for a description of this functionality.

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
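For illustration only (not part of the patch): the device page walk in
_vop_scan_devices() advances by the size of each entry, which is the
descriptor header, the per-virtqueue configs, two feature bitmaps and
the config space, rounded up to 8 bytes, followed by the control block.
A minimal user-space sketch of that arithmetic follows. The demo_*
names and the sizes in struct demo_layout are hypothetical stand-ins
for the real MIC structures, and the sketch assumes __mic_align()
rounds up to a multiple of its second argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the sizeof() values of the real structs. */
struct demo_layout {
	size_t desc_hdr;	/* sizeof(struct mic_device_desc) */
	size_t vqconfig;	/* sizeof(struct mic_vqconfig) */
	size_t ctrl;		/* sizeof(struct mic_device_ctrl) */
};

/* Round x up to a multiple of a; a must be a power of two. */
static size_t demo_align(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Unaligned size of one descriptor, mirroring _vop_desc_size(). */
static size_t demo_desc_size(const struct demo_layout *l, uint8_t num_vq,
			     uint8_t feature_len, uint8_t config_len)
{
	return l->desc_hdr
		+ (size_t)num_vq * l->vqconfig
		+ (size_t)feature_len * 2	/* offered + accepted bitmaps */
		+ config_len;
}

/* Stride to the next entry, mirroring _vop_total_desc_size(). */
static size_t demo_total_desc_size(const struct demo_layout *l, uint8_t num_vq,
				   uint8_t feature_len, uint8_t config_len)
{
	return demo_align(demo_desc_size(l, num_vq, feature_len, config_len), 8)
		+ l->ctrl;
}
```

With a hypothetical layout of an 8 byte header, 24 byte vqconfig and
24 byte control block, a device with 2 virtqueues, a 4 byte feature
bitmap and 6 bytes of config space has a 70 byte descriptor, which is
padded to 72 bytes, so the next entry starts 96 bytes later.
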
 drivers/misc/mic/vop/vop_main.c | 755 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 755 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_main.c

diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c
new file mode 100644
index 0000000..1a2b67f
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_main.c
@@ -0,0 +1,755 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Adapted from:
+ *
+ * virtio for kvm on s390
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License (version 2 only)
+ * as published by the Free Software Foundation.
+ *
+ *    Author(s): Christian Borntraeger <borntraeger@de.ibm.com>
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/dma-mapping.h>
+
+#include "vop_main.h"
+
+#define VOP_MAX_VRINGS 4
+
+/*
+ * _vop_vdev - Allocated per virtio device instance injected by the peer.
+ *
+ * @vdev: Virtio device
+ * @desc: Virtio device page descriptor
+ * @dc: Virtio device control
+ * @vpdev: VOP device which is the parent for this virtio device
+ * @vr: Buffer for accessing the VRING
+ * @used: Buffer for used
+ * @used_size: Size of the used buffer
+ * @reset_done: Track whether VOP reset is complete
+ * @virtio_cookie: Cookie returned upon requesting an interrupt
+ * @c2h_vdev_db: The doorbell used by the guest to interrupt the host
+ * @h2c_vdev_db: The doorbell used by the host to interrupt the guest
+ * @dnode: The destination node
+ */
+struct _vop_vdev {
+	struct virtio_device vdev;
+	struct mic_device_desc __iomem *desc;
+	struct mic_device_ctrl __iomem *dc;
+	struct vop_device *vpdev;
+	void __iomem *vr[VOP_MAX_VRINGS];
+	dma_addr_t used[VOP_MAX_VRINGS];
+	int used_size[VOP_MAX_VRINGS];
+	struct completion reset_done;
+	struct mic_irq *virtio_cookie;
+	int c2h_vdev_db;
+	int h2c_vdev_db;
+	int dnode;
+};
+
+#define to_vopvdev(vd) container_of(vd, struct _vop_vdev, vdev)
+
+#define _vop_aligned_desc_size(d) __mic_align(_vop_desc_size(d), 8)
+
+/* Helper API to obtain the parent of the virtio device */
+static inline struct device *_vop_dev(struct _vop_vdev *vdev)
+{
+	return vdev->vdev.dev.parent;
+}
+
+static inline unsigned _vop_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return sizeof(*desc)
+		+ ioread8(&desc->num_vq) * sizeof(struct mic_vqconfig)
+		+ ioread8(&desc->feature_len) * 2
+		+ ioread8(&desc->config_len);
+}
+
+static inline struct mic_vqconfig __iomem *
+_vop_vq_config(struct mic_device_desc __iomem *desc)
+{
+	return (struct mic_vqconfig __iomem *)(desc + 1);
+}
+
+static inline u8 __iomem *
+_vop_vq_features(struct mic_device_desc __iomem *desc)
+{
+	return (u8 __iomem *)(_vop_vq_config(desc) + ioread8(&desc->num_vq));
+}
+
+static inline u8 __iomem *
+_vop_vq_configspace(struct mic_device_desc __iomem *desc)
+{
+	return _vop_vq_features(desc) + ioread8(&desc->feature_len) * 2;
+}
+
+static inline unsigned
+_vop_total_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return _vop_aligned_desc_size(desc) + sizeof(struct mic_device_ctrl);
+}
+
+/* This gets the device's feature bits. */
+static u64 vop_get_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	u64 features = 0;
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+	u8 __iomem *in_features = _vop_vq_features(desc);
+	int feature_len = ioread8(&desc->feature_len);
+
+	bits = min_t(unsigned, feature_len, sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++)
+		if (ioread8(&in_features[i / 8]) & (BIT(i % 8)))
+			features |= BIT_ULL(i);
+
+	return features;
+}
+
+static int vop_finalize_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+	u8 feature_len = ioread8(&desc->feature_len);
+	/* Second half of bitmap is features we accept. */
+	u8 __iomem *out_features =
+		_vop_vq_features(desc) + feature_len;
+
+	/* Give virtio_ring a chance to accept features. */
+	vring_transport_features(vdev);
+
+	memset_io(out_features, 0, feature_len);
+	bits = min_t(unsigned, feature_len,
+		     sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++) {
+		if (__virtio_test_bit(vdev, i))
+			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
+				 &out_features[i / 8]);
+	}
+	return 0;
+}
+
+/*
+ * Reading and writing elements in config space
+ */
+static void vop_get(struct virtio_device *vdev, unsigned int offset,
+		    void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_fromio(buf, _vop_vq_configspace(desc) + offset, len);
+}
+
+static void vop_set(struct virtio_device *vdev, unsigned int offset,
+		    const void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_toio(_vop_vq_configspace(desc) + offset, buf, len);
+}
+
+/*
+ * The operations to get and set the status word just access the status
+ * field of the device descriptor. set_status also interrupts the host
+ * to notify it of status changes.
+ */
+static u8 vop_get_status(struct virtio_device *vdev)
+{
+	return ioread8(&to_vopvdev(vdev)->desc->status);
+}
+
+static void vop_set_status(struct virtio_device *dev, u8 status)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+
+	if (!status)
+		return;
+	iowrite8(status, &vdev->desc->status);
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+}
+
+/* Inform host on a virtio device reset and wait for ack from host */
+static void vop_reset_inform_host(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct mic_device_ctrl __iomem *dc = vdev->dc;
+	struct vop_device *vpdev = vdev->vpdev;
+	int retry;
+
+	iowrite8(0, &dc->host_ack);
+	iowrite8(1, &dc->vdev_reset);
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+
+	/* Wait till host completes all card accesses and acks the reset */
+	for (retry = 100; retry--;) {
+		if (ioread8(&dc->host_ack))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(_vop_dev(vdev), "%s: retry: %d\n", __func__, retry);
+
+	/* Reset status to 0 in case we timed out */
+	iowrite8(0, &vdev->desc->status);
+}
+
+static void vop_reset(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+
+	dev_dbg(_vop_dev(vdev), "%s: virtio id %d\n",
+		__func__, dev->id.device);
+
+	vop_reset_inform_host(dev);
+	complete_all(&vdev->reset_done);
+}
+
+/*
+ * The virtio_ring code calls this API when it wants to notify the Host.
+ */
+static bool vop_notify(struct virtqueue *vq)
+{
+	struct _vop_vdev *vdev = vq->priv;
+	struct vop_device *vpdev = vdev->vpdev;
+
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+	return true;
+}
+
+static void vop_del_vq(struct virtqueue *vq, int n)
+{
+	struct _vop_vdev *vdev = to_vopvdev(vq->vdev);
+	struct vring *vr = (struct vring *)(vq + 1);
+	struct vop_device *vpdev = vdev->vpdev;
+
+	dma_unmap_single(&vpdev->dev, vdev->used[n],
+			 vdev->used_size[n], DMA_BIDIRECTIONAL);
+	free_pages((unsigned long)vr->used, get_order(vdev->used_size[n]));
+	vring_del_virtqueue(vq);
+	vpdev->hw_ops->iounmap(vpdev, vdev->vr[n]);
+	vdev->vr[n] = NULL;
+}
+
+static void vop_del_vqs(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct virtqueue *vq, *n;
+	int idx = 0;
+
+	dev_dbg(_vop_dev(vdev), "%s\n", __func__);
+
+	list_for_each_entry_safe(vq, n, &dev->vqs, list)
+		vop_del_vq(vq, idx++);
+}
+
+/*
+ * This routine will assign vrings allocated in host/io memory. Code in
+ * virtio_ring.c however continues to access this io memory as if it were local
+ * memory without io accessors.
+ */
+static struct virtqueue *vop_find_vq(struct virtio_device *dev,
+				     unsigned index,
+				     void (*callback)(struct virtqueue *vq),
+				     const char *name)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_vqconfig __iomem *vqconfig;
+	struct mic_vqconfig config;
+	struct virtqueue *vq;
+	void __iomem *va;
+	struct _mic_vring_info __iomem *info;
+	void *used;
+	int vr_size, _vr_size, err, magic;
+	struct vring *vr;
+	u8 type = ioread8(&vdev->desc->type);
+
+	if (index >= ioread8(&vdev->desc->num_vq))
+		return ERR_PTR(-ENOENT);
+
+	if (!name)
+		return ERR_PTR(-ENOENT);
+
+	/* First assign the vring's allocated in host memory */
+	vqconfig = _vop_vq_config(vdev->desc) + index;
+	memcpy_fromio(&config, vqconfig, sizeof(config));
+	_vr_size = vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN);
+	vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info));
+	va = vpdev->hw_ops->ioremap(vpdev, le64_to_cpu(config.address),
+			vr_size);
+	if (!va)
+		return ERR_PTR(-ENOMEM);
+	vdev->vr[index] = va;
+	memset_io(va, 0x0, _vr_size);
+	vq = vring_new_virtqueue(
+				index,
+				le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN,
+				dev,
+				false,
+				(void __force *)va, vop_notify, callback, name);
+	if (!vq) {
+		err = -ENOMEM;
+		goto unmap;
+	}
+	info = va + _vr_size;
+	magic = ioread32(&info->magic);
+
+	if (WARN(magic != MIC_MAGIC + type + index, "magic mismatch")) {
+		err = -EIO;
+		goto unmap;
+	}
+
+	/* Allocate and reassign used ring now */
+	vdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
+					     sizeof(struct vring_used_elem) *
+					     le16_to_cpu(config.num));
+	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(vdev->used_size[index]));
+	if (!used) {
+		err = -ENOMEM;
+		dev_err(_vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto del_vq;
+	}
+	vdev->used[index] = dma_map_single(&vpdev->dev, used,
+					    vdev->used_size[index],
+					    DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(&vpdev->dev, vdev->used[index])) {
+		err = -ENOMEM;
+		dev_err(_vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto free_used;
+	}
+	writeq(vdev->used[index], &vqconfig->used_address);
+	/*
+	 * To reassign the used ring here we are directly accessing
+	 * struct vring_virtqueue which is a private data structure
+	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
+	 * vring_new_virtqueue() would ensure that
+	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
+	 */
+	vr = (struct vring *)(vq + 1);
+	vr->used = used;
+
+	vq->priv = vdev;
+	return vq;
+free_used:
+	free_pages((unsigned long)used,
+		   get_order(vdev->used_size[index]));
+del_vq:
+	vring_del_virtqueue(vq);
+unmap:
+	vpdev->hw_ops->iounmap(vpdev, vdev->vr[index]);
+	return ERR_PTR(err);
+}
+
+static int vop_find_vqs(struct virtio_device *dev, unsigned nvqs,
+			struct virtqueue *vqs[],
+			vq_callback_t *callbacks[],
+			const char * const names[])
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_device_ctrl __iomem *dc = vdev->dc;
+	int i, err, retry;
+
+	/* We must have this many virtqueues. */
+	if (nvqs > ioread8(&vdev->desc->num_vq))
+		return -ENOENT;
+
+	for (i = 0; i < nvqs; ++i) {
+		dev_dbg(_vop_dev(vdev), "%s: %d: %s\n",
+			__func__, i, names[i]);
+		vqs[i] = vop_find_vq(dev, i, callbacks[i], names[i]);
+		if (IS_ERR(vqs[i])) {
+			err = PTR_ERR(vqs[i]);
+			goto error;
+		}
+	}
+
+	iowrite8(1, &dc->used_address_updated);
+	/*
+	 * Send an interrupt to the host to inform it that used
+	 * rings have been re-assigned.
+	 */
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+	for (retry = 100; --retry;) {
+		if (!ioread8(&dc->used_address_updated))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(_vop_dev(vdev), "%s: retry: %d\n", __func__, retry);
+	if (!retry) {
+		err = -ENODEV;
+		goto error;
+	}
+
+	return 0;
+error:
+	vop_del_vqs(dev);
+	return err;
+}
+
+/*
+ * The config ops structure as defined by virtio config
+ */
+static struct virtio_config_ops vop_vq_config_ops = {
+	.get_features = vop_get_features,
+	.finalize_features = vop_finalize_features,
+	.get = vop_get,
+	.set = vop_set,
+	.get_status = vop_get_status,
+	.set_status = vop_set_status,
+	.reset = vop_reset,
+	.find_vqs = vop_find_vqs,
+	.del_vqs = vop_del_vqs,
+};
+
+static irqreturn_t vop_virtio_intr_handler(int irq, void *data)
+{
+	struct _vop_vdev *vdev = data;
+	struct vop_device *vpdev = vdev->vpdev;
+	struct virtqueue *vq;
+
+	vpdev->hw_ops->ack_interrupt(vpdev, vdev->h2c_vdev_db);
+	list_for_each_entry(vq, &vdev->vdev.vqs, list)
+		vring_interrupt(0, vq);
+
+	return IRQ_HANDLED;
+}
+
+static void vop_virtio_release_dev(struct device *_d)
+{
+	/*
+	 * No need for a release method similar to virtio PCI.
+	 * Provide an empty one to avoid getting a warning from core.
+	 */
+}
+
+/*
+ * adds a new device and registers it with virtio;
+ * appropriate drivers are loaded by the device model
+ */
+static int _vop_add_device(struct mic_device_desc __iomem *d,
+			   unsigned int offset, struct vop_device *vpdev,
+			   int dnode)
+{
+	struct _vop_vdev *vdev;
+	int ret;
+	u8 type = ioread8(&d->type);
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return -ENOMEM;
+
+	vdev->vpdev = vpdev;
+	vdev->vdev.dev.parent = &vpdev->dev;
+	vdev->vdev.dev.release = vop_virtio_release_dev;
+	vdev->vdev.id.device = type;
+	vdev->vdev.config = &vop_vq_config_ops;
+	vdev->desc = d;
+	vdev->dc = (void __iomem *)d + _vop_aligned_desc_size(d);
+	vdev->dnode = dnode;
+	vdev->vdev.priv = (void *)(u64)dnode;
+	init_completion(&vdev->reset_done);
+
+	vdev->h2c_vdev_db = vpdev->hw_ops->next_db(vpdev);
+	vdev->virtio_cookie = vpdev->hw_ops->request_irq(vpdev,
+			vop_virtio_intr_handler, "virtio intr",
+			vdev, vdev->h2c_vdev_db);
+	if (IS_ERR(vdev->virtio_cookie)) {
+		ret = PTR_ERR(vdev->virtio_cookie);
+		goto kfree;
+	}
+	iowrite8((u8)vdev->h2c_vdev_db, &vdev->dc->h2c_vdev_db);
+	vdev->c2h_vdev_db = ioread8(&vdev->dc->c2h_vdev_db);
+
+	ret = register_virtio_device(&vdev->vdev);
+	if (ret) {
+		dev_err(_vop_dev(vdev),
+			"Failed to register vop device %u type %u\n",
+			offset, type);
+		goto free_irq;
+	}
+	writeq((u64)vdev, &vdev->dc->vdev);
+	dev_dbg(_vop_dev(vdev), "%s: registered vop device %u type %u vdev %p\n",
+		__func__, offset, type, vdev);
+
+	return 0;
+
+free_irq:
+	vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+kfree:
+	kfree(vdev);
+	return ret;
+}
+
+/*
+ * match for a vop device with a specific desc pointer
+ */
+static int vop_match_desc(struct device *dev, void *data)
+{
+	struct virtio_device *_dev = dev_to_virtio(dev);
+	struct _vop_vdev *vdev = to_vopvdev(_dev);
+
+	return vdev->desc == (void __iomem *)data;
+}
+
+static void _vop_handle_config_change(struct mic_device_desc __iomem *d,
+				      unsigned int offset,
+				      struct vop_device *vpdev)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + _vop_aligned_desc_size(d);
+	struct _vop_vdev *vdev = (struct _vop_vdev *)readq(&dc->vdev);
+
+	if (ioread8(&dc->config_change) != MIC_VIRTIO_PARAM_CONFIG_CHANGED)
+		return;
+
+	dev_dbg(&vpdev->dev, "%s %d\n", __func__, __LINE__);
+	virtio_config_changed(&vdev->vdev);
+	iowrite8(1, &dc->guest_ack);
+}
+
+/*
+ * removes a virtio device if a hot remove event has been
+ * requested by the host.
+ */
+static int _vop_remove_device(struct mic_device_desc __iomem *d,
+			      unsigned int offset, struct vop_device *vpdev)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + _vop_aligned_desc_size(d);
+	struct _vop_vdev *vdev = (struct _vop_vdev *)readq(&dc->vdev);
+	u8 status;
+	int ret = -1;
+
+	if (ioread8(&dc->config_change) == MIC_VIRTIO_PARAM_DEV_REMOVE) {
+		dev_dbg(&vpdev->dev,
+			"%s %d config_change %d type %d vdev %p\n",
+			__func__, __LINE__,
+			ioread8(&dc->config_change), ioread8(&d->type), vdev);
+		status = ioread8(&d->status);
+		reinit_completion(&vdev->reset_done);
+		unregister_virtio_device(&vdev->vdev);
+		vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+		iowrite8(-1, &dc->h2c_vdev_db);
+		if (status & VIRTIO_CONFIG_S_DRIVER_OK)
+			wait_for_completion(&vdev->reset_done);
+		kfree(vdev);
+		iowrite8(1, &dc->guest_ack);
+		dev_dbg(&vpdev->dev, "%s %d guest_ack %d\n",
+			__func__, __LINE__, ioread8(&dc->guest_ack));
+		iowrite8(-1, &d->type);
+		ret = 0;
+	}
+	return ret;
+}
+
+#define REMOVE_DEVICES true
+
+static void _vop_scan_devices(void __iomem *dp, struct vop_device *vpdev,
+			      bool remove, int dnode)
+{
+	s8 type;
+	unsigned int i;
+	struct mic_device_desc __iomem *d;
+	struct mic_device_ctrl __iomem *dc;
+	struct device *dev;
+	int ret;
+
+	for (i = sizeof(struct mic_bootparam);
+			i < MIC_DP_SIZE; i += _vop_total_desc_size(d)) {
+		d = dp + i;
+		dc = (void __iomem *)d + _vop_aligned_desc_size(d);
+		/*
+		 * This read barrier is paired with the corresponding write
+		 * barrier on the host which is inserted before adding or
+		 * removing a virtio device descriptor, by updating the type.
+		 */
+		rmb();
+		type = ioread8(&d->type);
+
+		/* end of list */
+		if (type == 0)
+			break;
+
+		if (type == -1)
+			continue;
+
+		/* device already exists */
+		dev = device_find_child(&vpdev->dev, (void __force *)d,
+					vop_match_desc);
+		if (dev) {
+			if (remove)
+				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
+					 &dc->config_change);
+			put_device(dev);
+			_vop_handle_config_change(d, i, vpdev);
+			ret = _vop_remove_device(d, i, vpdev);
+			if (remove) {
+				iowrite8(0, &dc->config_change);
+				iowrite8(0, &dc->guest_ack);
+			}
+			continue;
+		}
+
+		/* new device */
+		dev_dbg(&vpdev->dev, "%s %d Adding new virtio device %p\n",
+			__func__, __LINE__, d);
+		if (!remove)
+			_vop_add_device(d, i, vpdev, dnode);
+	}
+}
+
+static void vop_scan_devices(struct vop_info *vi,
+			     struct vop_device *vpdev, bool remove)
+{
+	void __iomem *dp = vpdev->hw_ops->get_remote_dp(vpdev);
+
+	if (!dp)
+		return;
+	mutex_lock(&vi->vop_mutex);
+	_vop_scan_devices(dp, vpdev, remove, vpdev->dnode);
+	mutex_unlock(&vi->vop_mutex);
+}
+
+/*
+ * vop_hotplug_devices tries to find changes in the device page.
+ */
+static void vop_hotplug_devices(struct work_struct *work)
+{
+	struct vop_info *vi = container_of(work, struct vop_info,
+					     hotplug_work);
+
+	vop_scan_devices(vi, vi->vpdev, !REMOVE_DEVICES);
+}
+
+/*
+ * Interrupt handler for hot plug/config changes etc.
+ */
+static irqreturn_t vop_extint_handler(int irq, void *data)
+{
+	struct vop_info *vi = data;
+	struct mic_bootparam __iomem *bp;
+	struct vop_device *vpdev = vi->vpdev;
+
+	bp = vpdev->hw_ops->get_remote_dp(vpdev);
+	dev_dbg(&vpdev->dev, "%s %d hotplug work\n",
+		__func__, __LINE__);
+	vpdev->hw_ops->ack_interrupt(vpdev, ioread8(&bp->h2c_config_db));
+	schedule_work(&vi->hotplug_work);
+	return IRQ_HANDLED;
+}
+
+static int vop_driver_probe(struct vop_device *vpdev)
+{
+	struct vop_info *vi;
+	int rc;
+
+	vi = kzalloc(sizeof(*vi), GFP_KERNEL);
+	if (!vi) {
+		rc = -ENOMEM;
+		goto exit;
+	}
+	dev_set_drvdata(&vpdev->dev, vi);
+	vi->vpdev = vpdev;
+
+	mutex_init(&vi->vop_mutex);
+	INIT_WORK(&vi->hotplug_work, vop_hotplug_devices);
+	if (vpdev->dnode) {
+		rc = vop_host_init(vi);
+		if (rc < 0)
+			goto free;
+	} else {
+		struct mic_bootparam __iomem *bootparam;
+
+		vop_scan_devices(vi, vpdev, !REMOVE_DEVICES);
+
+		vi->h2c_config_db = vpdev->hw_ops->next_db(vpdev);
+		vi->cookie = vpdev->hw_ops->request_irq(vpdev,
+							vop_extint_handler,
+							"virtio_config_intr",
+							vi, vi->h2c_config_db);
+		if (IS_ERR(vi->cookie)) {
+			rc = PTR_ERR(vi->cookie);
+			goto free;
+		}
+		bootparam = vpdev->hw_ops->get_remote_dp(vpdev);
+		iowrite8(vi->h2c_config_db, &bootparam->h2c_config_db);
+	}
+	vop_init_debugfs(vi);
+	return 0;
+free:
+	kfree(vi);
+exit:
+	return rc;
+}
+
+static void vop_driver_remove(struct vop_device *vpdev)
+{
+	struct vop_info *vi = dev_get_drvdata(&vpdev->dev);
+
+	if (vpdev->dnode) {
+		vop_host_uninit(vi);
+	} else {
+		struct mic_bootparam __iomem *bootparam =
+			vpdev->hw_ops->get_remote_dp(vpdev);
+		if (bootparam)
+			iowrite8(-1, &bootparam->h2c_config_db);
+		vpdev->hw_ops->free_irq(vpdev, vi->cookie, vi);
+		flush_work(&vi->hotplug_work);
+		vop_scan_devices(vi, vpdev, REMOVE_DEVICES);
+	}
+	vop_exit_debugfs(vi);
+	kfree(vi);
+}
+
+static struct vop_device_id id_table[] = {
+	{ VOP_DEV_TRNSP, VOP_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct vop_driver vop_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = vop_driver_probe,
+	.remove = vop_driver_remove,
+};
+
+module_vop_driver(vop_driver);
+
+MODULE_DEVICE_TABLE(mbus, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) Virtio Over PCIe (VOP) driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1

* [PATCH char-misc-testing v2 8/8] misc: mic: MIC host and card driver changes to enable VOP
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (6 preceding siblings ...)
  2016-02-08 23:48 67% ` [PATCH char-misc-testing v2 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
@ 2016-02-08 23:48 58% ` Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch modifies the MIC host and card drivers to start using the
VOP driver. The MIC host and card drivers now implement the VOP bus
operations and register a VOP device on the VOP bus. MIC driver stack
documentation is also updated to include the new VOP driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
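For illustration only (not part of the patch): the split described
above relies on the generic VOP code reaching the hardware only through
a table of callbacks supplied by the MIC host or card driver, as in the
vpdev->hw_ops->send_intr() calls in the VOP driver. A minimal
user-space sketch of that indirection follows. All demo_* names are
hypothetical, and the real ops table carries more callbacks than the
two shown here.

```c
#include <assert.h>

struct demo_vop_device;

/* Hypothetical subset of a hardware-ops table. */
struct demo_hw_ops {
	int  (*next_db)(struct demo_vop_device *vpdev);
	void (*send_intr)(struct demo_vop_device *vpdev, int db);
};

/* Stand-in for a device registered on the bus by a hardware driver. */
struct demo_vop_device {
	const struct demo_hw_ops *hw_ops;
	int last_db;	/* most recent doorbell rung, recorded for the demo */
	int db_count;	/* next doorbell number to hand out */
};

/* "Hardware driver" side: implementations of the callbacks. */
static int demo_next_db(struct demo_vop_device *vpdev)
{
	return vpdev->db_count++;
}

static void demo_send_intr(struct demo_vop_device *vpdev, int db)
{
	vpdev->last_db = db;
}

static const struct demo_hw_ops demo_card_ops = {
	.next_db = demo_next_db,
	.send_intr = demo_send_intr,
};

/* "Generic driver" side: rings a doorbell without knowing the hardware. */
static void demo_notify(struct demo_vop_device *vpdev, int db)
{
	assert(vpdev && vpdev->hw_ops && vpdev->hw_ops->send_intr);
	vpdev->hw_ops->send_intr(vpdev, db);
}
```

The point of the pattern is that demo_notify() compiles without any
knowledge of the hardware, so a different driver can reuse it by
registering a device with its own ops table.
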
 Documentation/mic/mic_overview.txt |  54 ++++++++--------
 drivers/misc/mic/Kconfig           |   7 ++-
 drivers/misc/mic/card/mic_device.h |   3 +
 drivers/misc/mic/host/mic_device.h |   3 +
 Documentation/mic/mpssd/mpssd.c    |   2 +-
 drivers/misc/mic/card/mic_device.c |  84 ++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |   1 +
 drivers/misc/mic/host/mic_boot.c   | 123 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/host/mic_main.c   |   1 +
 Documentation/mic/mpssd/mpss       |   2 +-
 10 files changed, 247 insertions(+), 33 deletions(-)

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 73f44fc..074adbd 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -12,10 +12,19 @@ for the X100 devices.
 
 Since it is a PCIe card, it does not have the ability to host hardware
 devices for networking, storage and console. We provide these devices
-on X100 coprocessors thus enabling a self-bootable equivalent environment
-for applications. A key benefit of our solution is that it leverages
-the standard virtio framework for network, disk and console devices,
-though in our case the virtio framework is used across a PCIe bus.
+on X100 coprocessors thus enabling a self-bootable equivalent
+environment for applications. A key benefit of our solution is that it
+leverages the standard virtio framework for network, disk and console
+devices, though in our case the virtio framework is used across a PCIe
+bus. A Virtio Over PCIe (VOP) driver allows creating user space
+backends or devices on the host which are used to probe virtio drivers
+for these devices on the MIC card. The existing VRINGH infrastructure
+in the kernel is used to access virtio rings from the host. The card
+VOP driver allows card virtio drivers to communicate with their user
+space backends on the host via a device page. Ring 3 apps on the host
+can add, remove and configure virtio devices. A thin MIC specific
+virtio_config_ops is implemented which is borrowed heavily from
+previous similar implementations in lguest and s390.
 
 MIC PCIe card has a dma controller with 8 channels. These channels are
 shared between the host s/w and the card s/w. 0 to 3 are used by host
@@ -38,7 +47,6 @@ single threaded performance for the host compared to MIC, the ability of
 the host to initiate DMA's to/from the card using the MIC DMA engine and
 the fact that the virtio block storage backend can only be on the host.
 
-                                      |
                +----------+           |             +----------+
                | Card OS  |           |             | Host OS  |
                +----------+           |             +----------+
@@ -47,27 +55,25 @@ the fact that the virtio block storage backend can only be on the host.
         | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
         | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
         | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
-        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+        +---+---+ +---+----+ +--+---+ | +---------+  +----+---+ +--------+
             |         |         |     |      |            |         |
             |         |         |     |User  |            |         |
-            |         |         |     |------|------------|---------|-------
-            +-------------------+     |Kernel +--------------------------+
-                      |               |       | Virtio over PCIe IOCTLs  |
-                      |               |       +--------------------------+
-+-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |      +------+ | +------+ +------+ |  | MIC DMA   |
-| Driver    |         |      | SCIF | | | SCIF | | COSM | |  | Driver    |
-+-----------+         |      +------+ | +------+ +--+---+ |  +-----------+
-      |               |         |     |    |        |     |        |
-+---------------+     |      +------+ | +--+---+ +--+---+ | +----------------+
-|MIC virtual Bus|     |      |SCIF  | | |SCIF  | | COSM | | |MIC virtual Bus |
-+---------------+     |      |HW Bus| | |HW Bus| | Bus  | | +----------------+
-      |               |      +------+ | +--+---+ +------+ |              |
-      |               |         |     |       |     |     |              |
-      |   +-----------+---+     |     |       |    +---------------+     |
-      |   |Intel MIC      |     |     |       |    |Intel MIC      |     |
-      +---|Card Driver    |     |     |       |    |Host Driver    |     |
-          +------------+--------+     |       +----+---------------+-----+
+            |         |         |     |------|------------|--+------|-------
+            +---------+---------+     |Kernel                |
+                      |               |                      |
+  +---------+     +---+----+ +------+ | +------+ +------+ +--+---+  +-------+
+  |MIC DMA  |     |  VOP   | | SCIF | | | SCIF | | COSM | | VOP  |  |MIC DMA|
+  +---+-----+     +---+----+ +--+---+ | +--+---+ +--+---+ +------+  +----+--+
+      |               |         |     |    |        |                    |
+  +---+-----+     +---+----+ +--+---+ | +--+---+ +--+---+ +------+  +----+--+
+  |MIC      |     |  VOP   | |SCIF  | | |SCIF  | | COSM | | VOP  |  | MIC   |
+  |HW Bus   |     |  HW Bus| |HW Bus| | |HW Bus| | Bus  | |HW Bus|  |HW Bus |
+  +---------+     +--------+ +--+---+ | +--+---+ +------+ +------+  +-------+
+      |               |         |     |       |     |                    |
+      |   +-----------+--+      |     |       |    +---------------+     |
+      |   |Intel MIC     |      |     |       |    |Intel MIC      |     |
+      |   |Card Driver   |      |     |       |    |Host Driver    |     |
+      +---+--------------+------+     |       +----+---------------+-----+
                  |                    |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index b03bb17..2e4f3ba 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -53,8 +53,8 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS && MIC_COSM
-	select VHOST_RING
+	depends on 64BIT && PCI && X86
+	depends on INTEL_MIC_BUS && SCIF_BUS && MIC_COSM && VOP_BUS
 	help
 	  This enables Host Driver support for the Intel Many Integrated
 	  Core (MIC) family of PCIe form factor coprocessor devices that
@@ -73,7 +73,8 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS && MIC_COSM
+	depends on 64BIT && X86
+	depends on INTEL_MIC_BUS && SCIF_BUS && MIC_COSM && VOP_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 1dbf83c..333dbed 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -32,6 +32,7 @@
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -76,6 +77,7 @@ struct mic_device {
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
  * @scdev: SCIF device on the SCIF virtual bus.
+ * @vpdev: Virtio over PCIe device on the VOP virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -90,6 +92,7 @@ struct mic_driver {
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
 	struct scif_hw_dev *scdev;
+	struct vop_device *vpdev;
 };
 
 /**
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 8460de1..52b12b2 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -29,6 +29,7 @@
 #include <linux/miscdevice.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 #include "../bus/cosm_bus.h"
 #include "mic_intr.h"
 
@@ -68,6 +69,7 @@ extern struct cosm_hw_ops cosm_hw_ops;
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
  * @scdev: SCIF device on the SCIF virtual bus.
+ * @vpdev: Virtio over PCIe device on the VOP virtual bus.
  * @cosm_dev: COSM device
  */
 struct mic_device {
@@ -92,6 +94,7 @@ struct mic_device {
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
 	struct scif_hw_dev *scdev;
+	struct vop_device *vpdev;
 	struct cosm_device *cosm_dev;
 };
 
diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
index aaeafa1..518dece 100644
--- a/Documentation/mic/mpssd/mpssd.c
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -926,7 +926,7 @@ add_virtio_device(struct mic_info *mic, struct mic_device_desc *dd)
 	char path[PATH_MAX];
 	int fd, err;
 
-	snprintf(path, PATH_MAX, "/dev/mic%d", mic->id);
+	snprintf(path, PATH_MAX, "/dev/vop_virtio%d", mic->id);
 	fd = open(path, O_RDWR);
 	if (fd < 0) {
 		mpsslog("Could not open %s %s\n", path, strerror(errno));
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index ff03c63..e749af4 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -249,12 +249,82 @@ static struct scif_hw_ops scif_hw_ops = {
 	.iounmap = ___mic_iounmap,
 };
 
+static inline struct mic_driver *vpdev_to_mdrv(struct vop_device *vpdev)
+{
+	return dev_get_drvdata(vpdev->dev.parent);
+}
+
+static struct mic_irq *
+__mic_request_irq(struct vop_device *vpdev,
+		  irqreturn_t (*func)(int irq, void *data),
+		   const char *name, void *data, int intr_src)
+{
+	return mic_request_card_irq(func, NULL, name, data, intr_src);
+}
+
+static void __mic_free_irq(struct vop_device *vpdev,
+			   struct mic_irq *cookie, void *data)
+{
+	return mic_free_card_irq(cookie, data);
+}
+
+static void __mic_ack_interrupt(struct vop_device *vpdev, int num)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int __mic_next_db(struct vop_device *vpdev)
+{
+	return mic_next_card_db();
+}
+
+static void __iomem *__mic_get_remote_dp(struct vop_device *vpdev)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	return mdrv->dp;
+}
+
+static void __mic_send_intr(struct vop_device *vpdev, int db)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void __iomem *__mic_ioremap(struct vop_device *vpdev,
+				   dma_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void __mic_iounmap(struct vop_device *vpdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct vop_hw_ops vop_hw_ops = {
+	.request_irq = __mic_request_irq,
+	.free_irq = __mic_free_irq,
+	.ack_interrupt = __mic_ack_interrupt,
+	.next_db = __mic_next_db,
+	.get_remote_dp = __mic_get_remote_dp,
+	.send_intr = __mic_send_intr,
+	.ioremap = __mic_ioremap,
+	.iounmap = __mic_iounmap,
+};
+
 static int mic_request_dma_chans(struct mic_driver *mdrv)
 {
 	dma_cap_mask_t mask;
 	struct dma_chan *chan;
 
-	request_module("mic_x100_dma");
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_MEMCPY, mask);
 
@@ -308,6 +378,13 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 		rc = -ENODEV;
 		goto irq_uninit;
 	}
+	mdrv->vpdev = vop_register_device(mdrv->dev, VOP_DEV_TRNSP,
+					  NULL, &vop_hw_ops, 0,
+					  NULL, mdrv->dma_ch[0]);
+	if (IS_ERR(mdrv->vpdev)) {
+		rc = PTR_ERR(mdrv->vpdev);
+		goto dma_free;
+	}
 	bootparam = mdrv->dp;
 	node_id = ioread8(&bootparam->node_id);
 	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
@@ -317,11 +394,13 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 					   mdrv->num_dma_ch, true);
 	if (IS_ERR(mdrv->scdev)) {
 		rc = PTR_ERR(mdrv->scdev);
-		goto dma_free;
+		goto vop_remove;
 	}
 	mic_create_card_debug_dir(mdrv);
 done:
 	return rc;
+vop_remove:
+	vop_unregister_device(mdrv->vpdev);
 dma_free:
 	mic_free_dma_chans(mdrv);
 irq_uninit:
@@ -342,6 +421,7 @@ void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
 	scif_unregister_device(mdrv->scdev);
+	vop_unregister_device(mdrv->vpdev);
 	mic_free_dma_chans(mdrv);
 	mic_uninit_irq();
 	mic_dp_uninit();
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index b2958ce..b9f0710 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -326,6 +326,7 @@ static int __init mic_init(void)
 		goto done;
 	}
 
+	request_module("mic_x100_dma");
 	mic_init_card_debugfs();
 	ret = platform_device_register(&mic_platform_dev);
 	if (ret) {
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index 3df305f..8c91c99 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -25,10 +25,118 @@
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 
+static inline struct mic_device *vpdev_to_mdev(struct device *dev)
+{
+	return dev_get_drvdata(dev->parent);
+}
+
+static dma_addr_t
+_mic_dma_map_page(struct device *dev, struct page *page,
+		  unsigned long offset, size_t size,
+		  enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct mic_device *mdev = vpdev_to_mdev(dev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void _mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+				size_t size, enum dma_data_direction dir,
+				struct dma_attrs *attrs)
+{
+	struct mic_device *mdev = vpdev_to_mdev(dev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static const struct dma_map_ops _mic_dma_ops = {
+	.map_page = _mic_dma_map_page,
+	.unmap_page = _mic_dma_unmap_page,
+};
+
+static struct mic_irq *
+__mic_request_irq(struct vop_device *vpdev,
+		  irqreturn_t (*func)(int irq, void *data),
+		  const char *name, void *data, int intr_src)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					intr_src, MIC_INTR_DB);
+}
+
+static void __mic_free_irq(struct vop_device *vpdev,
+			   struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	return mic_free_irq(mdev, cookie, data);
+}
+
+static void __mic_ack_interrupt(struct vop_device *vpdev, int num)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int __mic_next_db(struct vop_device *vpdev)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	return mic_next_db(mdev);
+}
+
+static void *__mic_get_dp(struct vop_device *vpdev)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	return mdev->dp;
+}
+
+static void __iomem *__mic_get_remote_dp(struct vop_device *vpdev)
+{
+	return NULL;
+}
+
+static void __mic_send_intr(struct vop_device *vpdev, int db)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *__mic_ioremap(struct vop_device *vpdev,
+				   dma_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = vpdev_to_mdev(&vpdev->dev);
+
+	return mdev->aper.va + pa;
+}
+
+static void __mic_iounmap(struct vop_device *vpdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct vop_hw_ops vop_hw_ops = {
+	.request_irq = __mic_request_irq,
+	.free_irq = __mic_free_irq,
+	.ack_interrupt = __mic_ack_interrupt,
+	.next_db = __mic_next_db,
+	.get_dp = __mic_get_dp,
+	.get_remote_dp = __mic_get_remote_dp,
+	.send_intr = __mic_send_intr,
+	.ioremap = __mic_ioremap,
+	.iounmap = __mic_iounmap,
+};
+
 static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
 {
 	return dev_get_drvdata(scdev->dev.parent);
@@ -314,7 +422,6 @@ static int mic_request_dma_chans(struct mic_device *mdev)
 	dma_cap_mask_t mask;
 	struct dma_chan *chan;
 
-	request_module("mic_x100_dma");
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_MEMCPY, mask);
 
@@ -386,9 +493,18 @@ static int _mic_start(struct cosm_device *cdev, int id)
 		goto dma_free;
 	}
 
+	mdev->vpdev = vop_register_device(&mdev->pdev->dev,
+					  VOP_DEV_TRNSP, &_mic_dma_ops,
+					  &vop_hw_ops, id + 1, &mdev->aper,
+					  mdev->dma_ch[0]);
+	if (IS_ERR(mdev->vpdev)) {
+		rc = PTR_ERR(mdev->vpdev);
+		goto scif_remove;
+	}
+
 	rc = mdev->ops->load_mic_fw(mdev, NULL);
 	if (rc)
-		goto scif_remove;
+		goto vop_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -396,6 +512,8 @@ static int _mic_start(struct cosm_device *cdev, int id)
 	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
 	mdev->ops->send_firmware_intr(mdev);
 	goto unlock_ret;
+vop_remove:
+	vop_unregister_device(mdev->vpdev);
 scif_remove:
 	scif_unregister_device(mdev->scdev);
 dma_free:
@@ -422,6 +540,7 @@ static void _mic_stop(struct cosm_device *cdev, bool force)
 	 * will be the first to be registered and the last to be
 	 * unregistered.
 	 */
+	vop_unregister_device(mdev->vpdev);
 	scif_unregister_device(mdev->scdev);
 	mic_free_dma_chans(mdev);
 	mbus_unregister_device(mdev->dma_mbdev);
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 400def2..035be3e 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -317,6 +317,7 @@ static int __init mic_init(void)
 {
 	int ret;
 
+	request_module("mic_x100_dma");
 	mic_init_debugfs();
 	ida_init(&g_mic_ida);
 	ret = pci_register_driver(&mic_driver);
diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index 09ea9093..5fcf9fa 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -35,7 +35,7 @@
 
 exec=/usr/sbin/mpssd
 sysfs="/sys/class/mic"
-mic_modules="mic_host mic_x100_dma scif"
+mic_modules="mic_host mic_x100_dma scif vop"
 
 start()
 {
-- 
1.8.2.1

^ permalink raw reply related	[relevance 58%]

* [PATCH char-misc-testing v2 3/8] misc: mic: MIC VOP Bus
  2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
  2016-02-08 23:48 29% ` [PATCH char-misc-testing v2 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
  2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
@ 2016-02-08 23:48 61% ` Sudeep Dutt
  2016-02-08 23:48 75% ` [PATCH char-misc-testing v2 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

The Virtio Over PCIe (VOP) bus abstracts low-level hardware details
such as interrupts and remote memory mapping, so that the same VOP
driver can work unchanged with different MIC host or card drivers, as
long as the hardware bus operations are implemented. The VOP driver
registers itself on the VOP bus. The base PCIe drivers implement the
bus ops and register VOP devices on the bus, which results in the VOP
driver being probed with those devices. This allows the VOP
functionality to be shared between multiple generations of Intel MIC
products.
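
As a rough illustration of the flow described above, here is a minimal
userspace sketch, not the kernel code. All demo_* and host_* names are
hypothetical stand-ins: demo_hw_ops plays the role of struct vop_hw_ops,
demo_bus_attach() stands in for the bus core's match-then-probe step,
and the matching rule is modelled on vop_id_match() from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEV_ANY_ID 0xffffffffu	/* wildcard, modelled on VOP_DEV_ANY_ID */

struct demo_device;

/* Hardware operations supplied by a (hypothetical) base PCIe driver. */
struct demo_hw_ops {
	int  (*next_db)(struct demo_device *dev);
	void (*send_intr)(struct demo_device *dev, int db);
};

struct demo_device_id {
	uint32_t device;
	uint32_t vendor;
};

struct demo_device {
	const struct demo_hw_ops *hw_ops;
	struct demo_device_id id;
	int next_free_db;	/* private state of the base driver */
	int last_intr_db;	/* last doorbell rung, -1 if none */
};

struct demo_driver {
	const struct demo_device_id *id_table;	/* ends with device == 0 */
	int (*probe)(struct demo_device *dev);
};

/* Each field matches either exactly or through the wildcard. */
static int demo_id_match(const struct demo_device *dev,
			 const struct demo_device_id *id)
{
	if (id->device != dev->id.device && id->device != DEMO_DEV_ANY_ID)
		return 0;

	return id->vendor == DEMO_DEV_ANY_ID || id->vendor == dev->id.vendor;
}

/*
 * Stand-in for the bus core: probe the driver if any table entry
 * matches the device. Returns the probe result, or -1 if nothing matched.
 */
static int demo_bus_attach(struct demo_driver *drv, struct demo_device *dev)
{
	size_t i;

	for (i = 0; drv->id_table[i].device; i++)
		if (demo_id_match(dev, &drv->id_table[i]))
			return drv->probe(dev);
	return -1;
}

/* "Host" flavour of the hardware ops; a card driver would supply its own. */
static int host_next_db(struct demo_device *dev)
{
	return dev->next_free_db++;
}

static void host_send_intr(struct demo_device *dev, int db)
{
	dev->last_intr_db = db;
}

static const struct demo_hw_ops host_hw_ops = {
	.next_db = host_next_db,
	.send_intr = host_send_intr,
};

/* Generic driver: touches the hardware only through dev->hw_ops. */
static int generic_probe(struct demo_device *dev)
{
	int db;

	assert(dev->hw_ops && dev->hw_ops->next_db && dev->hw_ops->send_intr);
	db = dev->hw_ops->next_db(dev);
	dev->hw_ops->send_intr(dev, db);
	return 0;
}

static const struct demo_device_id generic_ids[] = {
	{ 1, DEMO_DEV_ANY_ID },
	{ 0, 0 },
};

static struct demo_driver generic_driver = {
	.id_table = generic_ids,
	.probe = generic_probe,
};
```

Because generic_probe() never calls a host or card function directly,
supporting another hardware generation in this sketch would only mean
providing a different demo_hw_ops table; the driver side stays as is.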

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig       |  17 ++++
 drivers/misc/mic/bus/Makefile  |   1 +
 drivers/misc/mic/bus/vop_bus.h | 140 ++++++++++++++++++++++++++++
 drivers/misc/mic/bus/vop_bus.c | 203 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 361 insertions(+)
 create mode 100644 drivers/misc/mic/bus/vop_bus.h
 create mode 100644 drivers/misc/mic/bus/vop_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 40677df..840f7ef 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -32,6 +32,23 @@ config SCIF_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "VOP Bus Driver"
+
+config VOP_BUS
+	tristate "VOP Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the VOP Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index 761842b..8758a7d 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
 obj-$(CONFIG_SCIF_BUS) += scif_bus.o
 obj-$(CONFIG_MIC_COSM) += cosm_bus.o
+obj-$(CONFIG_VOP_BUS) += vop_bus.o
diff --git a/drivers/misc/mic/bus/vop_bus.h b/drivers/misc/mic/bus/vop_bus.h
new file mode 100644
index 0000000..fff7a86
--- /dev/null
+++ b/drivers/misc/mic/bus/vop_bus.h
@@ -0,0 +1,140 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio over PCIe Bus driver.
+ */
+#ifndef _VOP_BUS_H_
+#define _VOP_BUS_H_
+/*
+ * Everything a vop driver needs to work with any particular vop
+ * implementation.
+ */
+#include <linux/dmaengine.h>
+#include <linux/interrupt.h>
+
+#include "../common/mic_dev.h"
+
+struct vop_device_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define VOP_DEV_TRNSP 1
+#define VOP_DEV_ANY_ID 0xffffffff
+/*
+ * Size of the internal buffer used during DMAs as an intermediate buffer
+ * for copy to/from user. Must be an integral number of pages.
+ */
+#define VOP_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
+
+/**
+ * struct vop_device - representation of a device using vop
+ * @hw_ops: the hardware ops supported by this device.
+ * @id: the device type identification (used to match it with a driver).
+ * @dev: underlying device.
+ * @dnode: The destination node which this device will communicate with.
+ * @aper: Aperture memory window
+ * @dma_ch: DMA channel
+ * @index: unique position on the vop bus
+ */
+struct vop_device {
+	struct vop_hw_ops *hw_ops;
+	struct vop_device_id id;
+	struct device dev;
+	u8 dnode;
+	struct mic_mw *aper;
+	struct dma_chan *dma_ch;
+	int index;
+};
+
+/**
+ * struct vop_driver - operations for a vop I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct vop_driver {
+	struct device_driver driver;
+	const struct vop_device_id *id_table;
+	int (*probe)(struct vop_device *dev);
+	void (*remove)(struct vop_device *dev);
+};
+
+/**
+ * struct vop_hw_ops - Hardware ops for accessing a VOP device on the VOP bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @get_remote_dp: Get access to the virtio device page used by the remote
+ *                 node to add/remove/configure virtio devices.
+ * @get_dp: Get access to the virtio device page used by the self
+ *          node to add/remove/configure virtio devices.
+ * @send_intr: Send an interrupt to the peer node on a specified doorbell.
+ * @ioremap: Map a buffer with the specified DMA address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ * @dma_filter: The DMA filter function to use for obtaining access to
+ *		a DMA channel on the peer node.
+ */
+struct vop_hw_ops {
+	int (*next_db)(struct vop_device *vpdev);
+	struct mic_irq *(*request_irq)(struct vop_device *vpdev,
+				       irqreturn_t (*func)(int irq, void *data),
+				       const char *name, void *data,
+				       int intr_src);
+	void (*free_irq)(struct vop_device *vpdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct vop_device *vpdev, int num);
+	void __iomem * (*get_remote_dp)(struct vop_device *vpdev);
+	void * (*get_dp)(struct vop_device *vpdev);
+	void (*send_intr)(struct vop_device *vpdev, int db);
+	void __iomem * (*ioremap)(struct vop_device *vpdev,
+				  dma_addr_t pa, size_t len);
+	void (*iounmap)(struct vop_device *vpdev, void __iomem *va);
+};
+
+struct vop_device *
+vop_register_device(struct device *pdev, int id,
+		    const struct dma_map_ops *dma_ops,
+		    struct vop_hw_ops *hw_ops, u8 dnode, struct mic_mw *aper,
+		    struct dma_chan *chan);
+void vop_unregister_device(struct vop_device *dev);
+int vop_register_driver(struct vop_driver *drv);
+void vop_unregister_driver(struct vop_driver *drv);
+
+/*
+ * module_vop_driver() - Helper macro for drivers that don't do
+ * anything special in module init/exit.  This eliminates a lot of
+ * boilerplate.  Each module may only use this macro once, and
+ * calling it replaces module_init() and module_exit()
+ */
+#define module_vop_driver(__vop_driver) \
+	module_driver(__vop_driver, vop_register_driver, \
+			vop_unregister_driver)
+
+static inline struct vop_device *dev_to_vop(struct device *dev)
+{
+	return container_of(dev, struct vop_device, dev);
+}
+
+static inline struct vop_driver *drv_to_vop(struct device_driver *drv)
+{
+	return container_of(drv, struct vop_driver, driver);
+}
+#endif /* _VOP_BUS_H_ */
diff --git a/drivers/misc/mic/bus/vop_bus.c b/drivers/misc/mic/bus/vop_bus.c
new file mode 100644
index 0000000..303da22
--- /dev/null
+++ b/drivers/misc/mic/bus/vop_bus.c
@@ -0,0 +1,203 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "vop_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "vop:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *vop_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(vop_dev);
+
+static inline int vop_id_match(const struct vop_device *dev,
+			       const struct vop_device_id *id)
+{
+	if (id->device != dev->id.device && id->device != VOP_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == VOP_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call vop_dev_probe().
+ */
+static int vop_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct vop_device *dev = dev_to_vop(dv);
+	const struct vop_device_id *ids;
+
+	ids = drv_to_vop(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (vop_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int vop_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct vop_device *dev = dev_to_vop(dv);
+
+	return add_uevent_var(env, "MODALIAS=vop:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int vop_dev_probe(struct device *d)
+{
+	struct vop_device *dev = dev_to_vop(d);
+	struct vop_driver *drv = drv_to_vop(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int vop_dev_remove(struct device *d)
+{
+	struct vop_device *dev = dev_to_vop(d);
+	struct vop_driver *drv = drv_to_vop(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type vop_bus = {
+	.name  = "vop_bus",
+	.match = vop_dev_match,
+	.dev_groups = vop_dev_groups,
+	.uevent = vop_uevent,
+	.probe = vop_dev_probe,
+	.remove = vop_dev_remove,
+};
+
+int vop_register_driver(struct vop_driver *driver)
+{
+	driver->driver.bus = &vop_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(vop_register_driver);
+
+void vop_unregister_driver(struct vop_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(vop_unregister_driver);
+
+static void vop_release_dev(struct device *d)
+{
+	kfree(dev_to_vop(d));
+}
+
+struct vop_device *
+vop_register_device(struct device *pdev, int id,
+		    const struct dma_map_ops *dma_ops,
+		    struct vop_hw_ops *hw_ops, u8 dnode, struct mic_mw *aper,
+		    struct dma_chan *chan)
+{
+	int ret;
+	struct vop_device *vdev;
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return ERR_PTR(-ENOMEM);
+
+	vdev->dev.parent = pdev;
+	vdev->id.device = id;
+	vdev->id.vendor = VOP_DEV_ANY_ID;
+	vdev->dev.archdata.dma_ops = (struct dma_map_ops *)dma_ops;
+	vdev->dev.dma_mask = &vdev->dev.coherent_dma_mask;
+	dma_set_mask(&vdev->dev, DMA_BIT_MASK(64));
+	vdev->dev.release = vop_release_dev;
+	vdev->hw_ops = hw_ops;
+	vdev->dev.bus = &vop_bus;
+	vdev->dnode = dnode;
+	vdev->aper = aper;
+	vdev->dma_ch = chan;
+	vdev->index = dnode - 1;
+	dev_set_name(&vdev->dev, "vop-dev%u", vdev->index);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&vdev->dev);
+	if (ret)
+		goto free_vdev;
+	return vdev;
+free_vdev:
+	kfree(vdev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(vop_register_device);
+
+void vop_unregister_device(struct vop_device *dev)
+{
+	device_unregister(&dev->dev);
+}
+EXPORT_SYMBOL_GPL(vop_unregister_device);
+
+static int __init vop_init(void)
+{
+	return bus_register(&vop_bus);
+}
+
+static void __exit vop_exit(void)
+{
+	bus_unregister(&vop_bus);
+}
+
+core_initcall(vop_init);
+module_exit(vop_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) VOP Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1

^ permalink raw reply related	[relevance 61%]

* [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver
@ 2016-02-08 23:48 84% Sudeep Dutt
  2016-02-08 23:48 29% ` [PATCH char-misc-testing v2 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 23:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

ChangeLog:
==========
v1 => v2:
a) Rebased to latest char-misc-testing
b) Use default private pointer in the device structure instead of adding
   a new priv field as per feedback from Greg Kroah-Hartman

v1: Initial post @ https://lkml.org/lkml/2016/2/1/1015

Description:
============
This patch series moves virtio functionality from the MIC host/card
driver into a separate, hardware-independent Virtio Over PCIe (VOP)
driver. Apart from being moved into a separate driver, the
functionality is essentially unchanged. This refactoring allows the
hardware-independent logic to be shared easily across multiple
generations of MIC devices. The original commits are listed below for
reference:
- commit f69bcbf3b4c4 ("Intel MIC Host Driver Changes for Virtio Devices.")
in drivers/misc/mic/host/mic_virtio.c
- commit 2141c7c5ee67 ("Intel MIC Card Driver Changes for Virtio Devices.")
in drivers/misc/mic/card/mic_virtio.c

The patch series is partitioned as follows:
1) Removes MIC X100 host virtio functionality
2) Removes MIC X100 card virtio functionality
3) Enables the Virtio Over PCIe (VOP) bus which abstracts the
   low level hardware details like interrupts and mapping remote
   memory so that the same VOP driver can work without changes
   with different MIC host or card drivers as long as the hardware
   bus operations are implemented.
4) Adds data structures for the VOP driver
5) Enables VOP host side functionality
6) Enables VOP card side functionality
7) Enables VOP debugfs and driver build
8) Implements the MIC host and card driver changes to enable VOP

Ashutosh Dixit (1):
  misc: mic: Enable VOP card side functionality

Sudeep Dutt (7):
  misc: mic: Remove MIC X100 host virtio functionality
  misc: mic: Remove MIC X100 card virtio functionality
  misc: mic: MIC VOP Bus
  misc: mic: Add data structures for the VOP driver
  misc: mic: Enable VOP host side functionality
  misc: mic: Enable VOP debugfs and driver build
  misc: mic: MIC host and card driver changes to enable VOP

 Documentation/mic/mic_overview.txt                 |   54 +-
 drivers/misc/mic/Kconfig                           |   44 +-
 drivers/misc/mic/Makefile                          |    1 +
 drivers/misc/mic/bus/Makefile                      |    1 +
 drivers/misc/mic/card/Makefile                     |    1 -
 drivers/misc/mic/host/Makefile                     |    2 -
 drivers/misc/mic/vop/Makefile                      |    9 +
 drivers/misc/mic/bus/vop_bus.h                     |  140 +++
 drivers/misc/mic/card/mic_device.h                 |    3 +
 drivers/misc/mic/card/mic_virtio.h                 |   76 --
 drivers/misc/mic/host/mic_device.h                 |    9 +-
 drivers/misc/mic/host/mic_fops.h                   |   32 -
 .../misc/mic/{host/mic_virtio.h => vop/vop_main.h} |  129 ++-
 Documentation/mic/mpssd/mpssd.c                    |    2 +-
 drivers/misc/mic/bus/vop_bus.c                     |  203 ++++
 drivers/misc/mic/card/mic_device.c                 |   89 +-
 drivers/misc/mic/card/mic_virtio.c                 |  634 -----------
 drivers/misc/mic/card/mic_x100.c                   |    1 +
 drivers/misc/mic/host/mic_boot.c                   |  125 ++-
 drivers/misc/mic/host/mic_debugfs.c                |  190 ----
 drivers/misc/mic/host/mic_fops.c                   |  222 ----
 drivers/misc/mic/host/mic_main.c                   |   49 +-
 drivers/misc/mic/host/mic_virtio.c                 |  811 --------------
 drivers/misc/mic/vop/vop_debugfs.c                 |  232 ++++
 drivers/misc/mic/vop/vop_main.c                    |  755 +++++++++++++
 drivers/misc/mic/vop/vop_vringh.c                  | 1164 ++++++++++++++++++++
 Documentation/mic/mpssd/mpss                       |    2 +-
 27 files changed, 2864 insertions(+), 2116 deletions(-)
 create mode 100644 drivers/misc/mic/vop/Makefile
 create mode 100644 drivers/misc/mic/bus/vop_bus.h
 delete mode 100644 drivers/misc/mic/card/mic_virtio.h
 delete mode 100644 drivers/misc/mic/host/mic_fops.h
 rename drivers/misc/mic/{host/mic_virtio.h => vop/vop_main.h} (58%)
 create mode 100644 drivers/misc/mic/bus/vop_bus.c
 delete mode 100644 drivers/misc/mic/card/mic_virtio.c
 delete mode 100644 drivers/misc/mic/host/mic_fops.c
 delete mode 100644 drivers/misc/mic/host/mic_virtio.c
 create mode 100644 drivers/misc/mic/vop/vop_debugfs.c
 create mode 100644 drivers/misc/mic/vop/vop_main.c
 create mode 100644 drivers/misc/mic/vop/vop_vringh.c

-- 
1.8.2.1

^ permalink raw reply	[relevance 84%]

* Re: [PATCH char-misc-next 3/8] misc: mic: MIC VOP Bus
  @ 2016-02-08 17:26 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-08 17:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Sudeep Dutt, linux-kernel, Ashutosh Dixit, Nikhil Rao

On Sun, 2016-02-07 at 22:57 -0800, Greg Kroah-Hartman wrote:
> On Mon, Feb 01, 2016 at 08:23:40PM -0800, Sudeep Dutt wrote:
> > The Virtio Over PCIe (VOP) bus abstracts the low level hardware
> > details like interrupts and mapping remote memory so that the same VOP
> > driver can work without changes with different MIC host or card
> > drivers as long as the hardware bus operations are implemented. The
> > VOP driver registers itself on the VOP bus. The base PCIe drivers
> > implement the bus ops and register VOP devices on the bus, resulting
> > in the VOP driver being probed with the VOP devices. This allows the
> > VOP functionality to be shared between multiple generations of Intel
> > MIC products.
> > 
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > ---
> >  drivers/misc/mic/Kconfig       |  17 ++++
> >  drivers/misc/mic/bus/Makefile  |   1 +
> >  drivers/misc/mic/bus/vop_bus.h | 142 ++++++++++++++++++++++++++++
> >  drivers/misc/mic/bus/vop_bus.c | 204 +++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 364 insertions(+)
> >  create mode 100644 drivers/misc/mic/bus/vop_bus.h
> >  create mode 100644 drivers/misc/mic/bus/vop_bus.c
> > 
> > diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
> > index 40677df..840f7ef 100644
> > --- a/drivers/misc/mic/Kconfig
> > +++ b/drivers/misc/mic/Kconfig
> > @@ -32,6 +32,23 @@ config SCIF_BUS
> >  	  OS and tools for MIC to use with this driver are available from
> >  	  <http://software.intel.com/en-us/mic-developer>.
> >  
> > +comment "VOP Bus Driver"
> > +
> > +config VOP_BUS
> > +	tristate "VOP Bus Driver"
> > +	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
> > +	help
> > +	  This option is selected by any driver which registers a
> > +	  device or driver on the VOP Bus, such as CONFIG_INTEL_MIC_HOST
> > +	  and CONFIG_INTEL_MIC_CARD.
> > +
> > +	  If you are building a host/card kernel with an Intel MIC device
> > +	  then say M (recommended) or Y, else say N. If unsure say N.
> > +
> > +	  More information about the Intel MIC family as well as the Linux
> > +	  OS and tools for MIC to use with this driver are available from
> > +	  <http://software.intel.com/en-us/mic-developer>.
> > +
> >  comment "Intel MIC Host Driver"
> >  
> >  config INTEL_MIC_HOST
> > diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
> > index 761842b..8758a7d 100644
> > --- a/drivers/misc/mic/bus/Makefile
> > +++ b/drivers/misc/mic/bus/Makefile
> > @@ -5,3 +5,4 @@
> >  obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
> >  obj-$(CONFIG_SCIF_BUS) += scif_bus.o
> >  obj-$(CONFIG_MIC_COSM) += cosm_bus.o
> > +obj-$(CONFIG_VOP_BUS) += vop_bus.o
> > diff --git a/drivers/misc/mic/bus/vop_bus.h b/drivers/misc/mic/bus/vop_bus.h
> > new file mode 100644
> > index 0000000..97fa5d6
> > --- /dev/null
> > +++ b/drivers/misc/mic/bus/vop_bus.h
> > @@ -0,0 +1,142 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * Copyright(c) 2016 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License, version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * General Public License for more details.
> > + *
> > + * The full GNU General Public License is included in this distribution in
> > + * the file called "COPYING".
> > + *
> > + * Intel Virtio over PCIe Bus driver.
> > + */
> > +#ifndef _VOP_BUS_H_
> > +#define _VOP_BUS_H_
> > +/*
> > + * Everything a vop driver needs to work with any particular vop
> > + * implementation.
> > + */
> > +#include <linux/dmaengine.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "../common/mic_dev.h"
> > +
> > +struct vop_device_id {
> > +	u32 device;
> > +	u32 vendor;
> > +};
> > +
> > +#define VOP_DEV_TRNSP 1
> > +#define VOP_DEV_ANY_ID 0xffffffff
> > +/*
> > + * Size of the internal buffer used during DMAs as an intermediate buffer
> > + * for copy to/from user. Must be an integral number of pages.
> > + */
> > +#define VOP_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
> > +
> > +/**
> > + * vop_device - representation of a device using vop
> > + * @priv: private pointer for the driver's use.
> > + * @hw_ops: the hardware ops supported by this device.
> > + * @id: the device type identification (used to match it with a driver).
> > + * @dev: underlying device.
> > + * @dnode: The destination node which this device will communicate with.
> > + * @aper: Aperture memory window
> > + * @dma_ch: DMA channel
> > + * @index: unique position on the vop bus
> > + */
> > +struct vop_device {
> > +	void *priv;
> 
> You don't need this pointer, use the one in struct device instead.
> 
> Other than that, looks good, nice job with this bus.
> 

Hi Greg,

I will clean this up, refresh this patch series against the latest
char-misc-next tree and resend today. 

Thanks for reviewing!

Sudeep Dutt

^ permalink raw reply	[relevance 99%]
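Greg's review comment above asks for the private `priv` member to be dropped in favour of the driver-data pointer that `struct device` already carries. A minimal userspace sketch of that idea follows. It assumes mock versions of `struct device`, `dev_set_drvdata()` and `dev_get_drvdata()` that only model the single pointer slot; `vop_attach_info()` and `vop_lookup_info()` are hypothetical helper names that do not appear in the patch, and the trimmed `struct vop_device` and `struct vop_info` are illustrative only.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct device: only the
 * driver-data slot is modelled here. */
struct device {
	void *driver_data;
};

/* Mock helpers mirroring the kernel's dev_set_drvdata()/dev_get_drvdata(). */
static void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

static void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/* Trimmed vop_device: no separate "void *priv" member any more. */
struct vop_device {
	struct device dev;
	int dnode;
};

/* Hypothetical per-driver state, standing in for struct vop_info. */
struct vop_info {
	struct vop_device *vpdev;
};

/* Probe side (hypothetical helper): stash the driver state in the
 * embedded struct device instead of vpdev->priv. */
static void vop_attach_info(struct vop_device *vpdev, struct vop_info *vi)
{
	vi->vpdev = vpdev;
	dev_set_drvdata(&vpdev->dev, vi);
}

/* Remove side (hypothetical helper): fetch the same pointer back. */
static struct vop_info *vop_lookup_info(struct vop_device *vpdev)
{
	return dev_get_drvdata(&vpdev->dev);
}
```

With this shape, a probe routine would call something like `vop_attach_info(vpdev, vi)` where the posted patch assigns `vpdev->priv = vi`, and a remove routine would read the pointer back through the embedded device.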

* [PATCH char-misc-next 6/8] misc: mic: Enable VOP card side functionality
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (4 preceding siblings ...)
  2016-02-02  4:23 39% ` [PATCH char-misc-next 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
@ 2016-02-02  4:23 49% ` Sudeep Dutt
  2016-02-02  4:23 67% ` [PATCH char-misc-next 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
  2016-02-02  4:23 58% ` [PATCH char-misc-next 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

This patch moves virtio functionality from the MIC card driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit 2141c7c5ee67 ("Intel MIC Card
Driver Changes for Virtio Devices.") in
drivers/misc/mic/card/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/misc/mic/vop/vop_main.c | 755 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 755 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_main.c

diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c
new file mode 100644
index 0000000..c86aeeb
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_main.c
@@ -0,0 +1,755 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Adapted from:
+ *
+ * virtio for kvm on s390
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License (version 2 only)
+ * as published by the Free Software Foundation.
+ *
+ *    Author(s): Christian Borntraeger <borntraeger@de.ibm.com>
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/dma-mapping.h>
+
+#include "vop_main.h"
+
+#define VOP_MAX_VRINGS 4
+
+/*
+ * _vop_vdev - Allocated per virtio device instance injected by the peer.
+ *
+ * @vdev: Virtio device
+ * @desc: Virtio device page descriptor
+ * @dc: Virtio device control
+ * @vpdev: VOP device which is the parent for this virtio device
+ * @vr: Buffer for accessing the VRING
+ * @used: Buffer for used
+ * @used_size: Size of the used buffer
+ * @reset_done: Track whether VOP reset is complete
+ * @virtio_cookie: Cookie returned upon requesting an interrupt
+ * @c2h_vdev_db: The doorbell used by the guest to interrupt the host
+ * @h2c_vdev_db: The doorbell used by the host to interrupt the guest
+ * @dnode: The destination node
+ */
+struct _vop_vdev {
+	struct virtio_device vdev;
+	struct mic_device_desc __iomem *desc;
+	struct mic_device_ctrl __iomem *dc;
+	struct vop_device *vpdev;
+	void __iomem *vr[VOP_MAX_VRINGS];
+	dma_addr_t used[VOP_MAX_VRINGS];
+	int used_size[VOP_MAX_VRINGS];
+	struct completion reset_done;
+	struct mic_irq *virtio_cookie;
+	int c2h_vdev_db;
+	int h2c_vdev_db;
+	int dnode;
+};
+
+#define to_vopvdev(vd) container_of(vd, struct _vop_vdev, vdev)
+
+#define _vop_aligned_desc_size(d) __mic_align(_vop_desc_size(d), 8)
+
+/* Helper API to obtain the parent of the virtio device */
+static inline struct device *_vop_dev(struct _vop_vdev *vdev)
+{
+	return vdev->vdev.dev.parent;
+}
+
+static inline unsigned _vop_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return sizeof(*desc)
+		+ ioread8(&desc->num_vq) * sizeof(struct mic_vqconfig)
+		+ ioread8(&desc->feature_len) * 2
+		+ ioread8(&desc->config_len);
+}
+
+static inline struct mic_vqconfig __iomem *
+_vop_vq_config(struct mic_device_desc __iomem *desc)
+{
+	return (struct mic_vqconfig __iomem *)(desc + 1);
+}
+
+static inline u8 __iomem *
+_vop_vq_features(struct mic_device_desc __iomem *desc)
+{
+	return (u8 __iomem *)(_vop_vq_config(desc) + ioread8(&desc->num_vq));
+}
+
+static inline u8 __iomem *
+_vop_vq_configspace(struct mic_device_desc __iomem *desc)
+{
+	return _vop_vq_features(desc) + ioread8(&desc->feature_len) * 2;
+}
+
+static inline unsigned
+_vop_total_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return _vop_aligned_desc_size(desc) + sizeof(struct mic_device_ctrl);
+}
+
+/* This gets the device's feature bits. */
+static u64 vop_get_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	u32 features = 0;
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+	u8 __iomem *in_features = _vop_vq_features(desc);
+	int feature_len = ioread8(&desc->feature_len);
+
+	bits = min_t(unsigned, feature_len, sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++)
+		if (ioread8(&in_features[i / 8]) & (BIT(i % 8)))
+			features |= BIT(i);
+
+	return features;
+}
+
+static int vop_finalize_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+	u8 feature_len = ioread8(&desc->feature_len);
+	/* Second half of bitmap is features we accept. */
+	u8 __iomem *out_features =
+		_vop_vq_features(desc) + feature_len;
+
+	/* Give virtio_ring a chance to accept features. */
+	vring_transport_features(vdev);
+
+	memset_io(out_features, 0, feature_len);
+	bits = min_t(unsigned, feature_len,
+		     sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++) {
+		if (__virtio_test_bit(vdev, i))
+			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
+				 &out_features[i / 8]);
+	}
+	return 0;
+}
+
+/*
+ * Reading and writing elements in config space
+ */
+static void vop_get(struct virtio_device *vdev, unsigned int offset,
+		    void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_fromio(buf, _vop_vq_configspace(desc) + offset, len);
+}
+
+static void vop_set(struct virtio_device *vdev, unsigned int offset,
+		    const void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_vopvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_toio(_vop_vq_configspace(desc) + offset, buf, len);
+}
+
+/*
+ * The operations to get and set the status word just access the status
+ * field of the device descriptor. set_status also interrupts the host
+ * to tell about status changes.
+ */
+static u8 vop_get_status(struct virtio_device *vdev)
+{
+	return ioread8(&to_vopvdev(vdev)->desc->status);
+}
+
+static void vop_set_status(struct virtio_device *dev, u8 status)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+
+	if (!status)
+		return;
+	iowrite8(status, &vdev->desc->status);
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+}
+
+/* Inform host on a virtio device reset and wait for ack from host */
+static void vop_reset_inform_host(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct mic_device_ctrl __iomem *dc = vdev->dc;
+	struct vop_device *vpdev = vdev->vpdev;
+	int retry;
+
+	iowrite8(0, &dc->host_ack);
+	iowrite8(1, &dc->vdev_reset);
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+
+	/* Wait till host completes all card accesses and acks the reset */
+	for (retry = 100; retry--;) {
+		if (ioread8(&dc->host_ack))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(_vop_dev(vdev), "%s: retry: %d\n", __func__, retry);
+
+	/* Reset status to 0 in case we timed out */
+	iowrite8(0, &vdev->desc->status);
+}
+
+static void vop_reset(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+
+	dev_dbg(_vop_dev(vdev), "%s: virtio id %d\n",
+		__func__, dev->id.device);
+
+	vop_reset_inform_host(dev);
+	complete_all(&vdev->reset_done);
+}
+
+/*
+ * The virtio_ring code calls this API when it wants to notify the Host.
+ */
+static bool vop_notify(struct virtqueue *vq)
+{
+	struct _vop_vdev *vdev = vq->priv;
+	struct vop_device *vpdev = vdev->vpdev;
+
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+	return true;
+}
+
+static void vop_del_vq(struct virtqueue *vq, int n)
+{
+	struct _vop_vdev *vdev = to_vopvdev(vq->vdev);
+	struct vring *vr = (struct vring *)(vq + 1);
+	struct vop_device *vpdev = vdev->vpdev;
+
+	dma_unmap_single(&vpdev->dev, vdev->used[n],
+			 vdev->used_size[n], DMA_BIDIRECTIONAL);
+	free_pages((unsigned long)vr->used, get_order(vdev->used_size[n]));
+	vring_del_virtqueue(vq);
+	vpdev->hw_ops->iounmap(vpdev, vdev->vr[n]);
+	vdev->vr[n] = NULL;
+}
+
+static void vop_del_vqs(struct virtio_device *dev)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct virtqueue *vq, *n;
+	int idx = 0;
+
+	dev_dbg(_vop_dev(vdev), "%s\n", __func__);
+
+	list_for_each_entry_safe(vq, n, &dev->vqs, list)
+		vop_del_vq(vq, idx++);
+}
+
+/*
+ * This routine will assign vrings allocated in host/io memory. Code in
+ * virtio_ring.c however continues to access this io memory as if it were local
+ * memory without io accessors.
+ */
+static struct virtqueue *vop_find_vq(struct virtio_device *dev,
+				     unsigned index,
+				     void (*callback)(struct virtqueue *vq),
+				     const char *name)
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_vqconfig __iomem *vqconfig;
+	struct mic_vqconfig config;
+	struct virtqueue *vq;
+	void __iomem *va;
+	struct _mic_vring_info __iomem *info;
+	void *used;
+	int vr_size, _vr_size, err, magic;
+	struct vring *vr;
+	u8 type = ioread8(&vdev->desc->type);
+
+	if (index >= ioread8(&vdev->desc->num_vq))
+		return ERR_PTR(-ENOENT);
+
+	if (!name)
+		return ERR_PTR(-ENOENT);
+
+	/* First assign the vrings allocated in host memory */
+	vqconfig = _vop_vq_config(vdev->desc) + index;
+	memcpy_fromio(&config, vqconfig, sizeof(config));
+	_vr_size = vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN);
+	vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info));
+	va = vpdev->hw_ops->ioremap(vpdev, le64_to_cpu(config.address),
+			vr_size);
+	if (!va)
+		return ERR_PTR(-ENOMEM);
+	vdev->vr[index] = va;
+	memset_io(va, 0x0, _vr_size);
+	vq = vring_new_virtqueue(
+				index,
+				le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN,
+				dev,
+				false,
+				(void __force *)va, vop_notify, callback, name);
+	if (!vq) {
+		err = -ENOMEM;
+		goto unmap;
+	}
+	info = va + _vr_size;
+	magic = ioread32(&info->magic);
+
+	if (WARN(magic != MIC_MAGIC + type + index, "magic mismatch")) {
+		err = -EIO;
+		goto unmap;
+	}
+
+	/* Allocate and reassign used ring now */
+	vdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
+					     sizeof(struct vring_used_elem) *
+					     le16_to_cpu(config.num));
+	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(vdev->used_size[index]));
+	if (!used) {
+		err = -ENOMEM;
+		dev_err(_vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto del_vq;
+	}
+	vdev->used[index] = dma_map_single(&vpdev->dev, used,
+					    vdev->used_size[index],
+					    DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(&vpdev->dev, vdev->used[index])) {
+		err = -ENOMEM;
+		dev_err(_vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto free_used;
+	}
+	writeq(vdev->used[index], &vqconfig->used_address);
+	/*
+	 * To reassign the used ring here we are directly accessing
+	 * struct vring_virtqueue which is a private data structure
+	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
+	 * vring_new_virtqueue() would ensure that
+	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
+	 */
+	vr = (struct vring *)(vq + 1);
+	vr->used = used;
+
+	vq->priv = vdev;
+	return vq;
+free_used:
+	free_pages((unsigned long)used,
+		   get_order(vdev->used_size[index]));
+del_vq:
+	vring_del_virtqueue(vq);
+unmap:
+	vpdev->hw_ops->iounmap(vpdev, vdev->vr[index]);
+	return ERR_PTR(err);
+}
+
+static int vop_find_vqs(struct virtio_device *dev, unsigned nvqs,
+			struct virtqueue *vqs[],
+			vq_callback_t *callbacks[],
+			const char * const names[])
+{
+	struct _vop_vdev *vdev = to_vopvdev(dev);
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_device_ctrl __iomem *dc = vdev->dc;
+	int i, err, retry;
+
+	/* We must have this many virtqueues. */
+	if (nvqs > ioread8(&vdev->desc->num_vq))
+		return -ENOENT;
+
+	for (i = 0; i < nvqs; ++i) {
+		dev_dbg(_vop_dev(vdev), "%s: %d: %s\n",
+			__func__, i, names[i]);
+		vqs[i] = vop_find_vq(dev, i, callbacks[i], names[i]);
+		if (IS_ERR(vqs[i])) {
+			err = PTR_ERR(vqs[i]);
+			goto error;
+		}
+	}
+
+	iowrite8(1, &dc->used_address_updated);
+	/*
+	 * Send an interrupt to the host to inform it that used
+	 * rings have been re-assigned.
+	 */
+	vpdev->hw_ops->send_intr(vpdev, vdev->c2h_vdev_db);
+	for (retry = 100; retry--;) {
+		if (!ioread8(&dc->used_address_updated))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(_vop_dev(vdev), "%s: retry: %d\n", __func__, retry);
+	if (retry < 0) {
+		err = -ENODEV;
+		goto error;
+	}
+
+	return 0;
+error:
+	vop_del_vqs(dev);
+	return err;
+}
+
+/*
+ * The config ops structure as defined by virtio config
+ */
+static struct virtio_config_ops vop_vq_config_ops = {
+	.get_features = vop_get_features,
+	.finalize_features = vop_finalize_features,
+	.get = vop_get,
+	.set = vop_set,
+	.get_status = vop_get_status,
+	.set_status = vop_set_status,
+	.reset = vop_reset,
+	.find_vqs = vop_find_vqs,
+	.del_vqs = vop_del_vqs,
+};
+
+static irqreturn_t vop_virtio_intr_handler(int irq, void *data)
+{
+	struct _vop_vdev *vdev = data;
+	struct vop_device *vpdev = vdev->vpdev;
+	struct virtqueue *vq;
+
+	vpdev->hw_ops->ack_interrupt(vpdev, vdev->h2c_vdev_db);
+	list_for_each_entry(vq, &vdev->vdev.vqs, list)
+		vring_interrupt(0, vq);
+
+	return IRQ_HANDLED;
+}
+
+static void vop_virtio_release_dev(struct device *_d)
+{
+	/*
+	 * No need for a release method similar to virtio PCI.
+	 * Provide an empty one to avoid getting a warning from core.
+	 */
+}
+
+/*
+ * adds a new device and registers it with virtio
+ * appropriate drivers are loaded by the device model
+ */
+static int _vop_add_device(struct mic_device_desc __iomem *d,
+			   unsigned int offset, struct vop_device *vpdev,
+			   int dnode)
+{
+	struct _vop_vdev *vdev;
+	int ret;
+	u8 type = ioread8(&d->type);
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return -ENOMEM;
+
+	vdev->vpdev = vpdev;
+	vdev->vdev.dev.parent = &vpdev->dev;
+	vdev->vdev.dev.release = vop_virtio_release_dev;
+	vdev->vdev.id.device = type;
+	vdev->vdev.config = &vop_vq_config_ops;
+	vdev->desc = d;
+	vdev->dc = (void __iomem *)d + _vop_aligned_desc_size(d);
+	vdev->dnode = dnode;
+	vdev->vdev.priv = (void *)(u64)dnode;
+	init_completion(&vdev->reset_done);
+
+	vdev->h2c_vdev_db = vpdev->hw_ops->next_db(vpdev);
+	vdev->virtio_cookie = vpdev->hw_ops->request_irq(vpdev,
+			vop_virtio_intr_handler, "virtio intr",
+			vdev, vdev->h2c_vdev_db);
+	if (IS_ERR(vdev->virtio_cookie)) {
+		ret = PTR_ERR(vdev->virtio_cookie);
+		goto kfree;
+	}
+	iowrite8((u8)vdev->h2c_vdev_db, &vdev->dc->h2c_vdev_db);
+	vdev->c2h_vdev_db = ioread8(&vdev->dc->c2h_vdev_db);
+
+	ret = register_virtio_device(&vdev->vdev);
+	if (ret) {
+		dev_err(_vop_dev(vdev),
+			"Failed to register vop device %u type %u\n",
+			offset, type);
+		goto free_irq;
+	}
+	writeq((u64)vdev, &vdev->dc->vdev);
+	dev_dbg(_vop_dev(vdev), "%s: registered vop device %u type %u vdev %p\n",
+		__func__, offset, type, vdev);
+
+	return 0;
+
+free_irq:
+	vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+kfree:
+	kfree(vdev);
+	return ret;
+}
+
+/*
+ * match for a vop device with a specific desc pointer
+ */
+static int vop_match_desc(struct device *dev, void *data)
+{
+	struct virtio_device *_dev = dev_to_virtio(dev);
+	struct _vop_vdev *vdev = to_vopvdev(_dev);
+
+	return vdev->desc == (void __iomem *)data;
+}
+
+static void _vop_handle_config_change(struct mic_device_desc __iomem *d,
+				      unsigned int offset,
+				      struct vop_device *vpdev)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + _vop_aligned_desc_size(d);
+	struct _vop_vdev *vdev = (struct _vop_vdev *)readq(&dc->vdev);
+
+	if (ioread8(&dc->config_change) != MIC_VIRTIO_PARAM_CONFIG_CHANGED)
+		return;
+
+	dev_dbg(&vpdev->dev, "%s %d\n", __func__, __LINE__);
+	virtio_config_changed(&vdev->vdev);
+	iowrite8(1, &dc->guest_ack);
+}
+
+/*
+ * removes a virtio device if a hot remove event has been
+ * requested by the host.
+ */
+static int _vop_remove_device(struct mic_device_desc __iomem *d,
+			      unsigned int offset, struct vop_device *vpdev)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + _vop_aligned_desc_size(d);
+	struct _vop_vdev *vdev = (struct _vop_vdev *)readq(&dc->vdev);
+	u8 status;
+	int ret = -1;
+
+	if (ioread8(&dc->config_change) == MIC_VIRTIO_PARAM_DEV_REMOVE) {
+		dev_dbg(&vpdev->dev,
+			"%s %d config_change %d type %d vdev %p\n",
+			__func__, __LINE__,
+			ioread8(&dc->config_change), ioread8(&d->type), vdev);
+		status = ioread8(&d->status);
+		reinit_completion(&vdev->reset_done);
+		unregister_virtio_device(&vdev->vdev);
+		vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+		iowrite8(-1, &dc->h2c_vdev_db);
+		if (status & VIRTIO_CONFIG_S_DRIVER_OK)
+			wait_for_completion(&vdev->reset_done);
+		kfree(vdev);
+		iowrite8(1, &dc->guest_ack);
+		dev_dbg(&vpdev->dev, "%s %d guest_ack %d\n",
+			__func__, __LINE__, ioread8(&dc->guest_ack));
+		iowrite8(-1, &d->type);
+		ret = 0;
+	}
+	return ret;
+}
+
+#define REMOVE_DEVICES true
+
+static void _vop_scan_devices(void __iomem *dp, struct vop_device *vpdev,
+			      bool remove, int dnode)
+{
+	s8 type;
+	unsigned int i;
+	struct mic_device_desc __iomem *d;
+	struct mic_device_ctrl __iomem *dc;
+	struct device *dev;
+	int ret;
+
+	for (i = sizeof(struct mic_bootparam);
+			i < MIC_DP_SIZE; i += _vop_total_desc_size(d)) {
+		d = dp + i;
+		dc = (void __iomem *)d + _vop_aligned_desc_size(d);
+		/*
+		 * This read barrier is paired with the corresponding write
+		 * barrier on the host which is inserted before adding or
+		 * removing a virtio device descriptor, by updating the type.
+		 */
+		rmb();
+		type = ioread8(&d->type);
+
+		/* end of list */
+		if (type == 0)
+			break;
+
+		if (type == -1)
+			continue;
+
+		/* device already exists */
+		dev = device_find_child(&vpdev->dev, (void __force *)d,
+					vop_match_desc);
+		if (dev) {
+			if (remove)
+				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
+					 &dc->config_change);
+			put_device(dev);
+			_vop_handle_config_change(d, i, vpdev);
+			ret = _vop_remove_device(d, i, vpdev);
+			if (remove) {
+				iowrite8(0, &dc->config_change);
+				iowrite8(0, &dc->guest_ack);
+			}
+			continue;
+		}
+
+		/* new device */
+		dev_dbg(&vpdev->dev, "%s %d Adding new virtio device %p\n",
+			__func__, __LINE__, d);
+		if (!remove)
+			_vop_add_device(d, i, vpdev, dnode);
+	}
+}
+
+static void vop_scan_devices(struct vop_info *vi,
+			     struct vop_device *vpdev, bool remove)
+{
+	void __iomem *dp = vpdev->hw_ops->get_remote_dp(vpdev);
+
+	if (!dp)
+		return;
+	mutex_lock(&vi->vop_mutex);
+	_vop_scan_devices(dp, vpdev, remove, vpdev->dnode);
+	mutex_unlock(&vi->vop_mutex);
+}
+
+/*
+ * vop_hotplug_devices tries to find changes in the device page.
+ */
+static void vop_hotplug_devices(struct work_struct *work)
+{
+	struct vop_info *vi = container_of(work, struct vop_info,
+					     hotplug_work);
+
+	vop_scan_devices(vi, vi->vpdev, !REMOVE_DEVICES);
+}
+
+/*
+ * Interrupt handler for hot plug/config changes etc.
+ */
+static irqreturn_t vop_extint_handler(int irq, void *data)
+{
+	struct vop_info *vi = data;
+	struct mic_bootparam __iomem *bp;
+	struct vop_device *vpdev = vi->vpdev;
+
+	bp = vpdev->hw_ops->get_remote_dp(vpdev);
+	dev_dbg(&vpdev->dev, "%s %d hotplug work\n",
+		__func__, __LINE__);
+	vpdev->hw_ops->ack_interrupt(vpdev, ioread8(&bp->h2c_config_db));
+	schedule_work(&vi->hotplug_work);
+	return IRQ_HANDLED;
+}
+
+static int vop_driver_probe(struct vop_device *vpdev)
+{
+	struct vop_info *vi;
+	int rc;
+
+	vi = kzalloc(sizeof(*vi), GFP_KERNEL);
+	if (!vi) {
+		rc = -ENOMEM;
+		goto exit;
+	}
+	vpdev->priv = vi;
+	vi->vpdev = vpdev;
+
+	mutex_init(&vi->vop_mutex);
+	INIT_WORK(&vi->hotplug_work, vop_hotplug_devices);
+	if (vpdev->dnode) {
+		rc = vop_host_init(vi);
+		if (rc < 0)
+			goto free;
+	} else {
+		struct mic_bootparam __iomem *bootparam;
+
+		vop_scan_devices(vi, vpdev, !REMOVE_DEVICES);
+
+		vi->h2c_config_db = vpdev->hw_ops->next_db(vpdev);
+		vi->cookie = vpdev->hw_ops->request_irq(vpdev,
+							vop_extint_handler,
+							"virtio_config_intr",
+							vi, vi->h2c_config_db);
+		if (IS_ERR(vi->cookie)) {
+			rc = PTR_ERR(vi->cookie);
+			goto free;
+		}
+		bootparam = vpdev->hw_ops->get_remote_dp(vpdev);
+		iowrite8(vi->h2c_config_db, &bootparam->h2c_config_db);
+	}
+	vop_init_debugfs(vi);
+	return 0;
+free:
+	kfree(vi);
+exit:
+	return rc;
+}
+
+static void vop_driver_remove(struct vop_device *vpdev)
+{
+	struct vop_info *vi = vpdev->priv;
+
+	if (vpdev->dnode) {
+		vop_host_uninit(vi);
+	} else {
+		struct mic_bootparam __iomem *bootparam =
+			vpdev->hw_ops->get_remote_dp(vpdev);
+		if (bootparam)
+			iowrite8(-1, &bootparam->h2c_config_db);
+		vpdev->hw_ops->free_irq(vpdev, vi->cookie, vi);
+		flush_work(&vi->hotplug_work);
+		vop_scan_devices(vi, vpdev, REMOVE_DEVICES);
+	}
+	vop_exit_debugfs(vi);
+	kfree(vi);
+}
+
+static struct vop_device_id id_table[] = {
+	{ VOP_DEV_TRNSP, VOP_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct vop_driver vop_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = vop_driver_probe,
+	.remove = vop_driver_remove,
+};
+
+module_vop_driver(vop_driver);
+
+MODULE_DEVICE_TABLE(mbus, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) Virtio Over PCIe (VOP) driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1

^ permalink raw reply related	[relevance 49%]
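The card-side code in this patch polls for a host acknowledgement with the idiom `for (retry = 100; retry--;)`, in both vop_reset_inform_host() and vop_find_vqs(). The value `retry` holds once such a loop ends is easy to misread, so here is a small userspace model of it. This is only a sketch: `poll_with_retries()`, `poll_timed_out()`, `struct countdown` and the callback type are hypothetical names introduced for illustration, and the callback stands in for the driver's `ioread8()` check.

```c
#include <stdbool.h>

/* Hypothetical callback type standing in for the ioread8() poll. */
typedef bool (*poll_fn)(void *ctx);

/*
 * Userspace model of the polling idiom "for (retry = N; retry--;)".
 * Returns the value of retry once the loop has finished.
 */
static int poll_with_retries(int attempts, poll_fn done, void *ctx)
{
	int retry;

	for (retry = attempts; retry--;) {
		if (done(ctx))
			break;
		/* the driver sleeps here with msleep(100) */
	}
	return retry;
}

/* Hypothetical condition that becomes true on the Nth call. */
struct countdown {
	int calls_until_done;
};

static bool countdown_done(void *ctx)
{
	struct countdown *c = ctx;

	return --c->calls_until_done <= 0;
}

/*
 * The post-decrement in the loop condition runs one last time when the
 * attempts are exhausted, so a timeout leaves retry at -1. A value of 0
 * means the condition was met on the final attempt.
 */
static bool poll_timed_out(int retry)
{
	return retry < 0;
}
```

Under this model, success on the very last attempt returns 0 and only an exhausted loop returns -1, so a timeout test after a loop of this form has to compare against a negative value rather than zero.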

* [PATCH char-misc-next 5/8] misc: mic: Enable VOP host side functionality
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (3 preceding siblings ...)
  2016-02-02  4:23 75% ` [PATCH char-misc-next 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
@ 2016-02-02  4:23 39% ` Sudeep Dutt
  2016-02-02  4:23 49% ` [PATCH char-misc-next 6/8] misc: mic: Enable VOP card " Sudeep Dutt
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch moves virtio functionality from the MIC host driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit f69bcbf3b4c4 ("Intel MIC Host
Driver Changes for Virtio Devices.") in
drivers/misc/mic/host/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/vop/vop_vringh.c | 1164 +++++++++++++++++++++++++++++++++++++
 1 file changed, 1164 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_vringh.c

diff --git a/drivers/misc/mic/vop/vop_vringh.c b/drivers/misc/mic/vop/vop_vringh.c
new file mode 100644
index 0000000..6dc41fe
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_vringh.c
@@ -0,0 +1,1164 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/sched.h>
+#include <linux/poll.h>
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+#include <linux/mic_ioctl.h>
+#include "vop_main.h"
+
+/* Helper API to obtain the VOP PCIe device */
+static inline struct device *vop_dev(struct vop_vdev *vdev)
+{
+	return vdev->vpdev->dev.parent;
+}
+
+/* Helper API to check if a virtio device is initialized */
+static inline int vop_vdev_inited(struct vop_vdev *vdev)
+{
+	if (!vdev)
+		return -EINVAL;
+	/* Device has not been created yet */
+	if (!vdev->dd || !vdev->dd->type) {
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, -EINVAL);
+		return -EINVAL;
+	}
+	/* Device has been removed/deleted */
+	if (vdev->dd->type == -1) {
+		dev_dbg(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, -ENODEV);
+		return -ENODEV;
+	}
+	return 0;
+}
+
+static void _vop_notify(struct vringh *vrh)
+{
+	struct vop_vringh *vvrh = container_of(vrh, struct vop_vringh, vrh);
+	struct vop_vdev *vdev = vvrh->vdev;
+	struct vop_device *vpdev = vdev->vpdev;
+	s8 db = vdev->dc->h2c_vdev_db;
+
+	if (db != -1)
+		vpdev->hw_ops->send_intr(vpdev, db);
+}
+
+static void vop_virtio_init_post(struct vop_vdev *vdev)
+{
+	struct mic_vqconfig *vqconfig = mic_vq_config(vdev->dd);
+	struct vop_device *vpdev = vdev->vpdev;
+	int i, used_size;
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		used_size = PAGE_ALIGN(sizeof(u16) * 3 +
+				sizeof(struct vring_used_elem) *
+				le16_to_cpu(vqconfig->num));
+		if (!le64_to_cpu(vqconfig[i].used_address)) {
+			dev_warn(vop_dev(vdev), "used_address zero??\n");
+			continue;
+		}
+		vdev->vvr[i].vrh.vring.used =
+			(void __force *)vpdev->hw_ops->ioremap(
+			vpdev,
+			le64_to_cpu(vqconfig[i].used_address),
+			used_size);
+	}
+
+	vdev->dc->used_address_updated = 0;
+
+	dev_info(vop_dev(vdev), "%s: device type %d LINKUP\n",
+		 __func__, vdev->virtio_id);
+}
+
+static inline void vop_virtio_device_reset(struct vop_vdev *vdev)
+{
+	int i;
+
+	dev_dbg(vop_dev(vdev), "%s: status %d device type %d RESET\n",
+		__func__, vdev->dd->status, vdev->virtio_id);
+
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		/*
+		 * Avoid lockdep false positive. The + 1 is for the vop
+		 * mutex which is held in the reset devices code path.
+		 */
+		mutex_lock_nested(&vdev->vvr[i].vr_mutex, i + 1);
+
+	/* 0 status means "reset" */
+	vdev->dd->status = 0;
+	vdev->dc->vdev_reset = 0;
+	vdev->dc->host_ack = 1;
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vringh *vrh = &vdev->vvr[i].vrh;
+
+		vdev->vvr[i].vring.info->avail_idx = 0;
+		vrh->completed = 0;
+		vrh->last_avail_idx = 0;
+		vrh->last_used_idx = 0;
+	}
+
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_unlock(&vdev->vvr[i].vr_mutex);
+}
+
+static void vop_virtio_reset_devices(struct vop_info *vi)
+{
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		vop_virtio_device_reset(vdev);
+		vdev->poll_wake = 1;
+		wake_up(&vdev->waitq);
+	}
+}
+
+static void vop_bh_handler(struct work_struct *work)
+{
+	struct vop_vdev *vdev = container_of(work, struct vop_vdev,
+			virtio_bh_work);
+
+	if (vdev->dc->used_address_updated)
+		vop_virtio_init_post(vdev);
+
+	if (vdev->dc->vdev_reset)
+		vop_virtio_device_reset(vdev);
+
+	vdev->poll_wake = 1;
+	wake_up(&vdev->waitq);
+}
+
+static irqreturn_t _vop_virtio_intr_handler(int irq, void *data)
+{
+	struct vop_vdev *vdev = data;
+	struct vop_device *vpdev = vdev->vpdev;
+
+	vpdev->hw_ops->ack_interrupt(vpdev, vdev->virtio_db);
+	schedule_work(&vdev->virtio_bh_work);
+	return IRQ_HANDLED;
+}
+
+static int vop_virtio_config_change(struct vop_vdev *vdev, void *argp)
+{
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+	int ret = 0, retry, i;
+	struct vop_device *vpdev = vdev->vpdev;
+	struct vop_info *vi = vpdev->priv;
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	s8 db = bootparam->h2c_config_db;
+
+	mutex_lock(&vi->vop_mutex);
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_lock_nested(&vdev->vvr[i].vr_mutex, i + 1);
+
+	if (db == -1 || vdev->dd->type == -1) {
+		ret = -EIO;
+		goto exit;
+	}
+
+	memcpy(mic_vq_configspace(vdev->dd), argp, vdev->dd->config_len);
+	vdev->dc->config_change = MIC_VIRTIO_PARAM_CONFIG_CHANGED;
+	vpdev->hw_ops->send_intr(vpdev, db);
+
+	for (retry = 100; retry--;) {
+		ret = wait_event_timeout(wake, vdev->dc->guest_ack,
+					 msecs_to_jiffies(100));
+		if (ret)
+			break;
+	}
+
+	dev_dbg(vop_dev(vdev),
+		"%s %d retry: %d\n", __func__, __LINE__, retry);
+	vdev->dc->config_change = 0;
+	vdev->dc->guest_ack = 0;
+exit:
+	for (i = 0; i < vdev->dd->num_vq; i++)
+		mutex_unlock(&vdev->vvr[i].vr_mutex);
+	mutex_unlock(&vi->vop_mutex);
+	return ret;
+}
+
+static int vop_copy_dp_entry(struct vop_vdev *vdev,
+			     struct mic_device_desc *argp, __u8 *type,
+			     struct mic_device_desc **devpage)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	struct mic_device_desc *devp;
+	struct mic_vqconfig *vqconfig;
+	int ret = 0, i;
+	bool slot_found = false;
+
+	vqconfig = mic_vq_config(argp);
+	for (i = 0; i < argp->num_vq; i++) {
+		if (le16_to_cpu(vqconfig[i].num) > MIC_MAX_VRING_ENTRIES) {
+			ret = -EINVAL;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto exit;
+		}
+	}
+
+	/* Find the first free device page entry */
+	for (i = sizeof(struct mic_bootparam);
+		i < MIC_DP_SIZE - mic_total_desc_size(argp);
+		i += mic_total_desc_size(devp)) {
+		devp = vpdev->hw_ops->get_dp(vpdev) + i;
+		if (devp->type == 0 || devp->type == -1) {
+			slot_found = true;
+			break;
+		}
+	}
+	if (!slot_found) {
+		ret = -EINVAL;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		goto exit;
+	}
+	/*
+	 * Save off the type before doing the memcpy. Type will be set at the
+	 * end after completing all initialization for the new device.
+	 */
+	*type = argp->type;
+	argp->type = 0;
+	memcpy(devp, argp, mic_desc_size(argp));
+
+	*devpage = devp;
+exit:
+	return ret;
+}
+
+static void vop_init_device_ctrl(struct vop_vdev *vdev,
+				 struct mic_device_desc *devpage)
+{
+	struct mic_device_ctrl *dc;
+
+	dc = (void *)devpage + mic_aligned_desc_size(devpage);
+
+	dc->config_change = 0;
+	dc->guest_ack = 0;
+	dc->vdev_reset = 0;
+	dc->host_ack = 0;
+	dc->used_address_updated = 0;
+	dc->c2h_vdev_db = -1;
+	dc->h2c_vdev_db = -1;
+	vdev->dc = dc;
+}
+
+static int vop_virtio_add_device(struct vop_vdev *vdev,
+				 struct mic_device_desc *argp)
+{
+	struct vop_info *vi = vdev->vi;
+	struct vop_device *vpdev = vi->vpdev;
+	struct mic_device_desc *dd = NULL;
+	struct mic_vqconfig *vqconfig;
+	int vr_size, i, j, ret;
+	u8 type = 0;
+	s8 db = -1;
+	char irqname[16];
+	struct mic_bootparam *bootparam;
+	u16 num;
+	dma_addr_t vr_addr;
+
+	bootparam = vpdev->hw_ops->get_dp(vpdev);
+	init_waitqueue_head(&vdev->waitq);
+	INIT_LIST_HEAD(&vdev->list);
+	vdev->vpdev = vpdev;
+
+	ret = vop_copy_dp_entry(vdev, argp, &type, &dd);
+	if (ret) {
+		/* vdev is owned by the file and freed in vop_release() */
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		return ret;
+	}
+
+	vop_init_device_ctrl(vdev, dd);
+
+	vdev->dd = dd;
+	vdev->virtio_id = type;
+	vqconfig = mic_vq_config(dd);
+	INIT_WORK(&vdev->virtio_bh_work, vop_bh_handler);
+
+	for (i = 0; i < dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+		struct mic_vring *vr = &vdev->vvr[i].vring;
+
+		num = le16_to_cpu(vqconfig[i].num);
+		mutex_init(&vvr->vr_mutex);
+		vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) +
+			sizeof(struct _mic_vring_info));
+		vr->va = (void *)
+			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					 get_order(vr_size));
+		if (!vr->va) {
+			ret = -ENOMEM;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vr->len = vr_size;
+		vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN);
+		vr->info->magic = cpu_to_le32(MIC_MAGIC + vdev->virtio_id + i);
+		vr_addr = dma_map_single(&vpdev->dev, vr->va, vr_size,
+					 DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&vpdev->dev, vr_addr)) {
+			free_pages((unsigned long)vr->va, get_order(vr_size));
+			ret = -ENOMEM;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vqconfig[i].address = cpu_to_le64(vr_addr);
+
+		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
+		ret = vringh_init_kern(&vvr->vrh,
+				       *(u32 *)mic_vq_features(vdev->dd),
+				       num, false, vr->vr.desc, vr->vr.avail,
+				       vr->vr.used);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vringh_kiov_init(&vvr->riov, NULL, 0);
+		vringh_kiov_init(&vvr->wiov, NULL, 0);
+		vvr->head = USHRT_MAX;
+		vvr->vdev = vdev;
+		vvr->vrh.notify = _vop_notify;
+		dev_dbg(&vpdev->dev,
+			"%s %d index %d va %p info %p vr_size 0x%x\n",
+			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+		vvr->buf = (void *)__get_free_pages(GFP_KERNEL,
+					get_order(VOP_INT_DMA_BUF_SIZE));
+		vvr->buf_da = dma_map_single(&vpdev->dev,
+					  vvr->buf, VOP_INT_DMA_BUF_SIZE,
+					  DMA_BIDIRECTIONAL);
+	}
+
+	snprintf(irqname, sizeof(irqname), "vop%dvirtio%d", vpdev->index,
+		 vdev->virtio_id);
+	vdev->virtio_db = vpdev->hw_ops->next_db(vpdev);
+	vdev->virtio_cookie = vpdev->hw_ops->request_irq(vpdev,
+			_vop_virtio_intr_handler, irqname, vdev,
+			vdev->virtio_db);
+	if (IS_ERR(vdev->virtio_cookie)) {
+		ret = PTR_ERR(vdev->virtio_cookie);
+		dev_dbg(&vpdev->dev, "request irq failed\n");
+		goto err;
+	}
+
+	vdev->dc->c2h_vdev_db = vdev->virtio_db;
+
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	dd->type = type;
+	argp->type = type;
+
+	if (bootparam) {
+		db = bootparam->h2c_config_db;
+		if (db != -1)
+			vpdev->hw_ops->send_intr(vpdev, db);
+	}
+	dev_dbg(&vpdev->dev, "Added virtio id %d db %d\n", dd->type, db);
+	return 0;
+err:
+	vqconfig = mic_vq_config(dd);
+	for (j = 0; j < i; j++) {
+		struct vop_vringh *vvr = &vdev->vvr[j];
+
+		dma_unmap_single(&vpdev->dev, le64_to_cpu(vqconfig[j].address),
+				 vvr->vring.len, DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->vring.va,
+			   get_order(vvr->vring.len));
+	}
+	return ret;
+}
+
+static void vop_dev_remove(struct vop_info *pvi, struct mic_device_ctrl *devp,
+			   struct vop_device *vpdev)
+{
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	s8 db;
+	int ret, retry;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+
+	devp->config_change = MIC_VIRTIO_PARAM_DEV_REMOVE;
+	db = bootparam->h2c_config_db;
+	if (db != -1)
+		vpdev->hw_ops->send_intr(vpdev, db);
+	else
+		goto done;
+	for (retry = 15; retry--;) {
+		ret = wait_event_timeout(wake, devp->guest_ack,
+					 msecs_to_jiffies(1000));
+		if (ret)
+			break;
+	}
+done:
+	devp->config_change = 0;
+	devp->guest_ack = 0;
+}
+
+static void vop_virtio_del_device(struct vop_vdev *vdev)
+{
+	struct vop_info *vi = vdev->vi;
+	struct vop_device *vpdev = vdev->vpdev;
+	int i;
+	struct mic_vqconfig *vqconfig;
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+
+	if (!bootparam)
+		goto skip_hot_remove;
+	vop_dev_remove(vi, vdev->dc, vpdev);
+skip_hot_remove:
+	vpdev->hw_ops->free_irq(vpdev, vdev->virtio_cookie, vdev);
+	flush_work(&vdev->virtio_bh_work);
+	vqconfig = mic_vq_config(vdev->dd);
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+
+		dma_unmap_single(&vpdev->dev,
+				 vvr->buf_da, VOP_INT_DMA_BUF_SIZE,
+				 DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->buf,
+			   get_order(VOP_INT_DMA_BUF_SIZE));
+		vringh_kiov_cleanup(&vvr->riov);
+		vringh_kiov_cleanup(&vvr->wiov);
+		dma_unmap_single(&vpdev->dev, le64_to_cpu(vqconfig[i].address),
+				 vvr->vring.len, DMA_BIDIRECTIONAL);
+		free_pages((unsigned long)vvr->vring.va,
+			   get_order(vvr->vring.len));
+	}
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	vdev->dd->type = -1;
+}
+
+/*
+ * vop_sync_dma - Wrapper for synchronous DMAs.
+ *
+ * @vdev - The VOP virtio device whose DMA channel is used
+ * for the transfer.
+ * @dst - destination DMA address.
+ * @src - source DMA address.
+ * @len - size of the transfer.
+ *
+ * Return 0 (DMA_COMPLETE) on success, a negative errno or DMA status otherwise
+ */
+static int vop_sync_dma(struct vop_vdev *vdev, dma_addr_t dst, dma_addr_t src,
+			size_t len)
+{
+	int err = 0;
+	struct dma_device *ddev;
+	struct dma_async_tx_descriptor *tx;
+	struct vop_info *vi = vdev->vpdev->priv;
+	struct dma_chan *vop_ch = vi->dma_ch;
+
+	if (!vop_ch) {
+		err = -EBUSY;
+		goto error;
+	}
+	ddev = vop_ch->device;
+	tx = ddev->device_prep_dma_memcpy(vop_ch, dst, src, len,
+		DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	} else {
+		dma_cookie_t cookie;
+
+		cookie = tx->tx_submit(tx);
+		if (dma_submit_error(cookie)) {
+			err = -ENOMEM;
+			goto error;
+		}
+		dma_async_issue_pending(vop_ch);
+		err = dma_sync_wait(vop_ch, cookie);
+	}
+error:
+	if (err)
+		dev_err(&vi->vpdev->dev, "%s %d err %d\n",
+			__func__, __LINE__, err);
+	return err;
+}
+
+#define VOP_USE_DMA true
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to a user
+ * space buffer. When transfers are done using DMA, source/destination
+ * addresses and transfer length must follow the alignment requirements of
+ * the MIC DMA engine.
+ */
+static int vop_virtio_copy_to_user(struct vop_vdev *vdev, void __user *ubuf,
+				   size_t len, u64 daddr, size_t dlen,
+				   int vr_idx)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	void __iomem *dbuf = vpdev->hw_ops->ioremap(vpdev, daddr, len);
+	struct vop_vringh *vvr = &vdev->vvr[vr_idx];
+	struct vop_info *vi = vdev->vpdev->priv;
+	size_t dma_alignment = 1 << vi->dma_ch->device->copy_align;
+	bool x200 = is_dma_copy_aligned(vi->dma_ch->device, 1, 1, 1);
+	size_t dma_offset, partlen;
+	int err;
+
+	if (!VOP_USE_DMA) {
+		if (copy_to_user(ubuf, (void __force *)dbuf, len)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		vdev->in_bytes += len;
+		err = 0;
+		goto err;
+	}
+
+	dma_offset = daddr - round_down(daddr, dma_alignment);
+	daddr -= dma_offset;
+	len += dma_offset;
+	/*
+	 * X100 uses DMA addresses as seen by the card so adding
+	 * the aperture base is not required for DMA. However x200
+	 * requires DMA addresses to be an offset into the bar so
+	 * add the aperture base for x200.
+	 */
+	if (x200)
+		daddr += vpdev->aper->pa;
+	while (len) {
+		partlen = min_t(size_t, len, VOP_INT_DMA_BUF_SIZE);
+		err = vop_sync_dma(vdev, vvr->buf_da, daddr,
+				   ALIGN(partlen, dma_alignment));
+		if (err) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		if (copy_to_user(ubuf, vvr->buf + dma_offset,
+				 partlen - dma_offset)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		vdev->in_bytes_dma += partlen;
+		vdev->in_bytes += partlen;
+		len -= partlen;
+		dma_offset = 0;
+	}
+	err = 0;
+err:
+	vpdev->hw_ops->iounmap(vpdev, dbuf);
+	dev_dbg(vop_dev(vdev),
+		"%s: ubuf %p dbuf %p len 0x%lx vr_idx 0x%x\n",
+		__func__, ubuf, dbuf, len, vr_idx);
+	return err;
+}
+
+/*
+ * Initiates copies across the PCIe bus from a user space buffer to card
+ * memory. When transfers are done using DMA, source/destination addresses
+ * and transfer length must follow the alignment requirements of the MIC
+ * DMA engine.
+ */
+static int vop_virtio_copy_from_user(struct vop_vdev *vdev, void __user *ubuf,
+				     size_t len, u64 daddr, size_t dlen,
+				     int vr_idx)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	void __iomem *dbuf = vpdev->hw_ops->ioremap(vpdev, daddr, len);
+	struct vop_vringh *vvr = &vdev->vvr[vr_idx];
+	struct vop_info *vi = vdev->vpdev->priv;
+	size_t dma_alignment = 1 << vi->dma_ch->device->copy_align;
+	bool x200 = is_dma_copy_aligned(vi->dma_ch->device, 1, 1, 1);
+	size_t partlen;
+	bool dma = VOP_USE_DMA;
+	int err = 0;
+
+	if (daddr & (dma_alignment - 1)) {
+		vdev->tx_dst_unaligned += len;
+		dma = false;
+	} else if (ALIGN(len, dma_alignment) > dlen) {
+		vdev->tx_len_unaligned += len;
+		dma = false;
+	}
+
+	if (!dma)
+		goto memcpy;
+
+	/*
+	 * X100 uses DMA addresses as seen by the card so adding
+	 * the aperture base is not required for DMA. However x200
+	 * requires DMA addresses to be an offset into the bar so
+	 * add the aperture base for x200.
+	 */
+	if (x200)
+		daddr += vpdev->aper->pa;
+	while (len) {
+		partlen = min_t(size_t, len, VOP_INT_DMA_BUF_SIZE);
+
+		if (copy_from_user(vvr->buf, ubuf, partlen)) {
+			err = -EFAULT;
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		err = vop_sync_dma(vdev, daddr, vvr->buf_da,
+				   ALIGN(partlen, dma_alignment));
+		if (err) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, err);
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		vdev->out_bytes_dma += partlen;
+		vdev->out_bytes += partlen;
+		len -= partlen;
+	}
+memcpy:
+	/*
+	 * We are copying to IO below and should ideally use something
+	 * like copy_from_user_toio(..) if it existed.
+	 */
+	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
+		err = -EFAULT;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	vdev->out_bytes += len;
+	err = 0;
+err:
+	vpdev->hw_ops->iounmap(vpdev, dbuf);
+	dev_dbg(vop_dev(vdev),
+		"%s: ubuf %p dbuf %p len 0x%lx vr_idx 0x%x\n",
+		__func__, ubuf, dbuf, len, vr_idx);
+	return err;
+}
+
+#define MIC_VRINGH_READ true
+
+/* Determine the total number of bytes consumed in a VRINGH KIOV */
+static inline u32 vop_vringh_iov_consumed(struct vringh_kiov *iov)
+{
+	int i;
+	u32 total = iov->consumed;
+
+	for (i = 0; i < iov->i; i++)
+		total += iov->iov[i].iov_len;
+	return total;
+}
+
+/*
+ * Traverse the VRINGH KIOV and issue the APIs to trigger the copies.
+ * This API is heavily based on the vringh_iov_xfer(..) implementation
+ * in vringh.c. We cannot reuse vringh_iov_pull_kern(..) and
+ * vringh_iov_push_kern(..) directly because there is no way to
+ * override the VRINGH xfer(..) routines as of v3.10.
+ */
+static int vop_vringh_copy(struct vop_vdev *vdev, struct vringh_kiov *iov,
+			   void __user *ubuf, size_t len, bool read, int vr_idx,
+			   size_t *out_len)
+{
+	int ret = 0;
+	size_t partlen, tot_len = 0;
+
+	while (len && iov->i < iov->used) {
+		struct kvec *kiov = &iov->iov[iov->i];
+
+		partlen = min(kiov->iov_len, len);
+		if (read)
+			ret = vop_virtio_copy_to_user(vdev, ubuf, partlen,
+						      (u64)kiov->iov_base,
+						      kiov->iov_len,
+						      vr_idx);
+		else
+			ret = vop_virtio_copy_from_user(vdev, ubuf, partlen,
+							(u64)kiov->iov_base,
+							kiov->iov_len,
+							vr_idx);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= partlen;
+		ubuf += partlen;
+		tot_len += partlen;
+		iov->consumed += partlen;
+		kiov->iov_len -= partlen;
+		kiov->iov_base += partlen;
+		if (!kiov->iov_len) {
+			/* Fix up old iov element then increment. */
+			kiov->iov_len = iov->consumed;
+			kiov->iov_base -= iov->consumed;
+
+			iov->consumed = 0;
+			iov->i++;
+		}
+	}
+	*out_len = tot_len;
+	return ret;
+}
+
+/*
+ * Use the standard VRINGH infrastructure in the kernel to fetch new
+ * descriptors, initiate the copies and update the used ring.
+ */
+static int _vop_virtio_copy(struct vop_vdev *vdev, struct mic_copy_desc *copy)
+{
+	int ret = 0;
+	u32 iovcnt = copy->iovcnt;
+	struct iovec iov;
+	struct iovec __user *u_iov = copy->iov;
+	void __user *ubuf = NULL;
+	struct vop_vringh *vvr = &vdev->vvr[copy->vr_idx];
+	struct vringh_kiov *riov = &vvr->riov;
+	struct vringh_kiov *wiov = &vvr->wiov;
+	struct vringh *vrh = &vvr->vrh;
+	u16 *head = &vvr->head;
+	struct mic_vring *vr = &vvr->vring;
+	size_t len = 0, out_len;
+
+	copy->out_len = 0;
+	/* Fetch a new IOVEC if all previous elements have been processed */
+	if (riov->i == riov->used && wiov->i == wiov->used) {
+		ret = vringh_getdesc_kern(vrh, riov, wiov,
+					  head, GFP_KERNEL);
+		/* Check if there are available descriptors */
+		if (ret <= 0)
+			return ret;
+	}
+	while (iovcnt) {
+		if (!len) {
+			/* Copy over a new iovec from user space. */
+			ret = copy_from_user(&iov, u_iov, sizeof(*u_iov));
+			if (ret) {
+				ret = -EFAULT;
+				dev_err(vop_dev(vdev), "%s %d err %d\n",
+					__func__, __LINE__, ret);
+				break;
+			}
+			len = iov.iov_len;
+			ubuf = iov.iov_base;
+		}
+		/* Issue all the read descriptors first */
+		ret = vop_vringh_copy(vdev, riov, ubuf, len,
+				      MIC_VRINGH_READ, copy->vr_idx, &out_len);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		/* Issue the write descriptors next */
+		ret = vop_vringh_copy(vdev, wiov, ubuf, len,
+				      !MIC_VRINGH_READ, copy->vr_idx, &out_len);
+		if (ret) {
+			dev_err(vop_dev(vdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		if (!len) {
+			/* One user space iovec is now completed */
+			iovcnt--;
+			u_iov++;
+		}
+		/* Exit loop if all elements in KIOVs have been processed. */
+		if (riov->i == riov->used && wiov->i == wiov->used)
+			break;
+	}
+	/*
+	 * Update the used ring if a descriptor was available and some data was
+	 * copied in/out and the user asked for a used ring update.
+	 */
+	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
+		u32 total = 0;
+
+		/* Determine the total data consumed */
+		total += vop_vringh_iov_consumed(riov);
+		total += vop_vringh_iov_consumed(wiov);
+		vringh_complete_kern(vrh, *head, total);
+		*head = USHRT_MAX;
+		if (vringh_need_notify_kern(vrh) > 0)
+			vringh_notify(vrh);
+		vringh_kiov_cleanup(riov);
+		vringh_kiov_cleanup(wiov);
+		/* Update avail idx for user space */
+		vr->info->avail_idx = vrh->last_avail_idx;
+	}
+	return ret;
+}
+
+static inline int vop_verify_copy_args(struct vop_vdev *vdev,
+				       struct mic_copy_desc *copy)
+{
+	if (!vdev || copy->vr_idx >= vdev->dd->num_vq)
+		return -EINVAL;
+	return 0;
+}
+
+/* Copy a specified number of virtio descriptors in a chain */
+static int vop_virtio_copy_desc(struct vop_vdev *vdev,
+				struct mic_copy_desc *copy)
+{
+	int err;
+	struct vop_vringh *vvr = &vdev->vvr[copy->vr_idx];
+
+	err = vop_verify_copy_args(vdev, copy);
+	if (err)
+		return err;
+
+	mutex_lock(&vvr->vr_mutex);
+	if (!vop_vdevup(vdev)) {
+		err = -ENODEV;
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	err = _vop_virtio_copy(vdev, copy);
+	if (err) {
+		dev_err(vop_dev(vdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+	}
+err:
+	mutex_unlock(&vvr->vr_mutex);
+	return err;
+}
+
+static int vop_open(struct inode *inode, struct file *f)
+{
+	struct vop_vdev *vdev;
+	struct vop_info *vi = container_of(f->private_data,
+		struct vop_info, miscdev);
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return -ENOMEM;
+	vdev->vi = vi;
+	mutex_init(&vdev->vdev_mutex);
+	f->private_data = vdev;
+	init_completion(&vdev->destroy);
+	complete(&vdev->destroy);
+	return 0;
+}
+
+static int vop_release(struct inode *inode, struct file *f)
+{
+	struct vop_vdev *vdev = f->private_data, *vdev_tmp;
+	struct vop_info *vi = vdev->vi;
+	struct list_head *pos, *tmp;
+	bool found = false;
+
+	mutex_lock(&vdev->vdev_mutex);
+	if (vdev->deleted)
+		goto unlock;
+	mutex_lock(&vi->vop_mutex);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev_tmp = list_entry(pos, struct vop_vdev, list);
+		if (vdev == vdev_tmp) {
+			vop_virtio_del_device(vdev);
+			list_del(pos);
+			found = true;
+			break;
+		}
+	}
+	mutex_unlock(&vi->vop_mutex);
+unlock:
+	mutex_unlock(&vdev->vdev_mutex);
+	if (!found)
+		wait_for_completion(&vdev->destroy);
+	f->private_data = NULL;
+	kfree(vdev);
+	return 0;
+}
+
+static long vop_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct vop_vdev *vdev = f->private_data;
+	struct vop_info *vi = vdev->vi;
+	void __user *argp = (void __user *)arg;
+	int ret;
+
+	switch (cmd) {
+	case MIC_VIRTIO_ADD_DEVICE:
+	{
+		struct mic_device_desc dd, *dd_config;
+
+		if (copy_from_user(&dd, argp, sizeof(dd)))
+			return -EFAULT;
+
+		if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
+		    dd.num_vq > MIC_MAX_VRINGS)
+			return -EINVAL;
+
+		dd_config = kzalloc(mic_desc_size(&dd), GFP_KERNEL);
+		if (!dd_config)
+			return -ENOMEM;
+		if (copy_from_user(dd_config, argp, mic_desc_size(&dd))) {
+			ret = -EFAULT;
+			goto free_ret;
+		}
+		mutex_lock(&vdev->vdev_mutex);
+		mutex_lock(&vi->vop_mutex);
+		ret = vop_virtio_add_device(vdev, dd_config);
+		if (ret)
+			goto unlock_ret;
+		list_add_tail(&vdev->list, &vi->vdev_list);
+unlock_ret:
+		mutex_unlock(&vi->vop_mutex);
+		mutex_unlock(&vdev->vdev_mutex);
+free_ret:
+		kfree(dd_config);
+		return ret;
+	}
+	case MIC_VIRTIO_COPY_DESC:
+	{
+		struct mic_copy_desc copy;
+
+		mutex_lock(&vdev->vdev_mutex);
+		ret = vop_vdev_inited(vdev);
+		if (ret)
+			goto _unlock_ret;
+
+		if (copy_from_user(&copy, argp, sizeof(copy))) {
+			ret = -EFAULT;
+			goto _unlock_ret;
+		}
+
+		ret = vop_virtio_copy_desc(vdev, &copy);
+		if (ret < 0)
+			goto _unlock_ret;
+		if (copy_to_user(
+			&((struct mic_copy_desc __user *)argp)->out_len,
+			&copy.out_len, sizeof(copy.out_len)))
+			ret = -EFAULT;
+_unlock_ret:
+		mutex_unlock(&vdev->vdev_mutex);
+		return ret;
+	}
+	case MIC_VIRTIO_CONFIG_CHANGE:
+	{
+		void *buf;
+
+		mutex_lock(&vdev->vdev_mutex);
+		ret = vop_vdev_inited(vdev);
+		if (ret)
+			goto __unlock_ret;
+		buf = kzalloc(vdev->dd->config_len, GFP_KERNEL);
+		if (!buf) {
+			ret = -ENOMEM;
+			goto __unlock_ret;
+		}
+		if (copy_from_user(buf, argp, vdev->dd->config_len)) {
+			ret = -EFAULT;
+			goto done;
+		}
+		ret = vop_virtio_config_change(vdev, buf);
+done:
+		kfree(buf);
+__unlock_ret:
+		mutex_unlock(&vdev->vdev_mutex);
+		return ret;
+	}
+	default:
+		return -ENOIOCTLCMD;
+	}
+	return 0;
+}
+
+/*
+ * We return POLLIN | POLLOUT from poll when new buffers are enqueued, and
+ * not when previously enqueued buffers may be available. This means that
+ * in the card->host (TX) path, when userspace is unblocked by poll it
+ * must drain all available descriptors or it can stall.
+ */
+static unsigned int vop_poll(struct file *f, poll_table *wait)
+{
+	struct vop_vdev *vdev = f->private_data;
+	int mask = 0;
+
+	mutex_lock(&vdev->vdev_mutex);
+	if (vop_vdev_inited(vdev)) {
+		mask = POLLERR;
+		goto done;
+	}
+	poll_wait(f, &vdev->waitq, wait);
+	if (vop_vdev_inited(vdev)) {
+		mask = POLLERR;
+	} else if (vdev->poll_wake) {
+		vdev->poll_wake = 0;
+		mask = POLLIN | POLLOUT;
+	}
+done:
+	mutex_unlock(&vdev->vdev_mutex);
+	return mask;
+}
+
+static inline int
+vop_query_offset(struct vop_vdev *vdev, unsigned long offset,
+		 unsigned long *size, unsigned long *pa)
+{
+	struct vop_device *vpdev = vdev->vpdev;
+	unsigned long start = MIC_DP_SIZE;
+	int i;
+
+	/*
+	 * MMAP interface is as follows:
+	 * offset				region
+	 * 0x0					virtio device_page
+	 * 0x1000				first vring
+	 * 0x1000 + size of 1st vring		second vring
+	 * ....
+	 */
+	if (!offset) {
+		*pa = virt_to_phys(vpdev->hw_ops->get_dp(vpdev));
+		*size = MIC_DP_SIZE;
+		return 0;
+	}
+
+	for (i = 0; i < vdev->dd->num_vq; i++) {
+		struct vop_vringh *vvr = &vdev->vvr[i];
+
+		if (offset == start) {
+			*pa = virt_to_phys(vvr->vring.va);
+			*size = vvr->vring.len;
+			return 0;
+		}
+		start += vvr->vring.len;
+	}
+	return -1;
+}
+
+/*
+ * Maps the device page and virtio rings to user space for read-only access.
+ */
+static int vop_mmap(struct file *f, struct vm_area_struct *vma)
+{
+	struct vop_vdev *vdev = f->private_data;
+	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+	unsigned long pa, size = vma->vm_end - vma->vm_start, size_rem = size;
+	int i, err;
+
+	err = vop_vdev_inited(vdev);
+	if (err)
+		goto ret;
+	if (vma->vm_flags & VM_WRITE) {
+		err = -EACCES;
+		goto ret;
+	}
+	while (size_rem) {
+		i = vop_query_offset(vdev, offset, &size, &pa);
+		if (i < 0) {
+			err = -EINVAL;
+			goto ret;
+		}
+		err = remap_pfn_range(vma, vma->vm_start + offset,
+				      pa >> PAGE_SHIFT, size,
+				      vma->vm_page_prot);
+		if (err)
+			goto ret;
+		size_rem -= size;
+		offset += size;
+	}
+ret:
+	return err;
+}
+
+static const struct file_operations vop_fops = {
+	.open = vop_open,
+	.release = vop_release,
+	.unlocked_ioctl = vop_ioctl,
+	.poll = vop_poll,
+	.mmap = vop_mmap,
+	.owner = THIS_MODULE,
+};
+
+int vop_host_init(struct vop_info *vi)
+{
+	int rc;
+	struct miscdevice *mdev;
+	struct vop_device *vpdev = vi->vpdev;
+
+	INIT_LIST_HEAD(&vi->vdev_list);
+	vi->dma_ch = vpdev->dma_ch;
+	mdev = &vi->miscdev;
+	mdev->minor = MISC_DYNAMIC_MINOR;
+	snprintf(vi->name, sizeof(vi->name), "vop_virtio%d", vpdev->index);
+	mdev->name = vi->name;
+	mdev->fops = &vop_fops;
+	mdev->parent = &vpdev->dev;
+
+	rc = misc_register(mdev);
+	if (rc)
+		dev_err(&vpdev->dev, "%s failed rc %d\n", __func__, rc);
+	return rc;
+}
+
+void vop_host_uninit(struct vop_info *vi)
+{
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+
+	mutex_lock(&vi->vop_mutex);
+	vop_virtio_reset_devices(vi);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		list_del(pos);
+		reinit_completion(&vdev->destroy);
+		mutex_unlock(&vi->vop_mutex);
+		mutex_lock(&vdev->vdev_mutex);
+		vop_virtio_del_device(vdev);
+		vdev->deleted = true;
+		mutex_unlock(&vdev->vdev_mutex);
+		complete(&vdev->destroy);
+		mutex_lock(&vi->vop_mutex);
+	}
+	mutex_unlock(&vi->vop_mutex);
+	misc_deregister(&vi->miscdev);
+}
-- 
1.8.2.1

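For readers following the mmap path in vop_query_offset() above, here is a
minimal user-space sketch of the same offset-to-region lookup. It is
illustrative only and not part of the patch: region_for_offset() and DP_SIZE
are hypothetical names, and DP_SIZE is assumed to be 0x1000 as in the layout
comment of that function.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for illustration: the device page occupies the first 4 KiB. */
#define DP_SIZE 0x1000UL

/*
 * Hypothetical user-space mirror of vop_query_offset(): given the length of
 * each vring, return the index of the region that starts at 'offset'
 * (0 = device page, 1 = first vring, ...) and store its size in '*size'.
 * Returns -1 if 'offset' is not the start of any region.
 */
static int region_for_offset(const unsigned long *ring_len, size_t num_vq,
			     unsigned long offset, unsigned long *size)
{
	unsigned long start = DP_SIZE;
	size_t i;

	assert(size != NULL);
	assert(ring_len != NULL || num_vq == 0);

	/* Offset 0 always selects the device page. */
	if (offset == 0) {
		*size = DP_SIZE;
		return 0;
	}
	/* Vrings are laid out back to back after the device page. */
	for (i = 0; i < num_vq; i++) {
		if (offset == start) {
			*size = ring_len[i];
			return (int)i + 1;
		}
		start += ring_len[i];
	}
	return -1;
}
```

With ring lengths {0x3000, 0x2000}, offset 0x1000 selects the first vring and
offset 0x4000 the second. An offset that is not the start of a region is
rejected, which corresponds to the -EINVAL return in vop_mmap().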

* [PATCH char-misc-next 7/8] misc: mic: Enable VOP debugfs and driver build
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (5 preceding siblings ...)
  2016-02-02  4:23 49% ` [PATCH char-misc-next 6/8] misc: mic: Enable VOP card " Sudeep Dutt
@ 2016-02-02  4:23 67% ` Sudeep Dutt
  2016-02-02  4:23 58% ` [PATCH char-misc-next 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch moves the virtio specific debugfs hooks previously in
mic_debugfs.c in the MIC host driver into the VOP driver. The
Kconfig/Makefile is also updated to allow building the VOP driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig           |  20 ++++
 drivers/misc/mic/Makefile          |   1 +
 drivers/misc/mic/vop/Makefile      |   9 ++
 drivers/misc/mic/vop/vop_debugfs.c | 232 +++++++++++++++++++++++++++++++++++++
 4 files changed, 262 insertions(+)
 create mode 100644 drivers/misc/mic/vop/Makefile
 create mode 100644 drivers/misc/mic/vop/vop_debugfs.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 840f7ef..b03bb17 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -124,3 +124,23 @@ config MIC_COSM
 	  More information about the Intel MIC family as well as the Linux
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "VOP Driver"
+
+config VOP
+	tristate "VOP Driver"
+	depends on 64BIT && PCI && X86 && VOP_BUS
+	select VHOST_RING
+	help
+	  This enables VOP (Virtio over PCIe) Driver support for the Intel
+	  Many Integrated Core (MIC) family of PCIe form factor coprocessor
+	  devices. The VOP driver allows virtio drivers, e.g. net, console
+	  and block drivers, on the card to connect to user space virtio
+	  devices on the host.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e288a11..f2b1323 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -8,3 +8,4 @@ obj-y += bus/
 obj-$(CONFIG_SCIF) += scif/
 obj-$(CONFIG_MIC_COSM) += cosm/
 obj-$(CONFIG_MIC_COSM) += cosm_client/
+obj-$(CONFIG_VOP) += vop/
diff --git a/drivers/misc/mic/vop/Makefile b/drivers/misc/mic/vop/Makefile
new file mode 100644
index 0000000..78819c8
--- /dev/null
+++ b/drivers/misc/mic/vop/Makefile
@@ -0,0 +1,9 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2016, Intel Corporation.
+#
+obj-$(CONFIG_VOP) := vop.o
+
+vop-objs += vop_main.o
+vop-objs += vop_debugfs.o
+vop-objs += vop_vringh.o
diff --git a/drivers/misc/mic/vop/vop_debugfs.c b/drivers/misc/mic/vop/vop_debugfs.c
new file mode 100644
index 0000000..ab43884
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_debugfs.c
@@ -0,0 +1,232 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "vop_main.h"
+
+static int vop_dp_show(struct seq_file *s, void *pos)
+{
+	struct mic_device_desc *d;
+	struct mic_device_ctrl *dc;
+	struct mic_vqconfig *vqconfig;
+	__u32 *features;
+	__u8 *config;
+	struct vop_info *vi = s->private;
+	struct vop_device *vpdev = vi->vpdev;
+	struct mic_bootparam *bootparam = vpdev->hw_ops->get_dp(vpdev);
+	int j, k;
+
+	seq_printf(s, "Bootparam: magic 0x%x\n",
+		   bootparam->magic);
+	seq_printf(s, "Bootparam: h2c_config_db %d\n",
+		   bootparam->h2c_config_db);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
+	for (j = sizeof(*bootparam);
+		j < MIC_DP_SIZE; j += mic_total_desc_size(d)) {
+		d = (void *)bootparam + j;
+		dc = (void *)d + mic_aligned_desc_size(d);
+
+		/* end of list */
+		if (d->type == 0)
+			break;
+
+		if (d->type == -1)
+			continue;
+
+		seq_printf(s, "Type %d ", d->type);
+		seq_printf(s, "Num VQ %d ", d->num_vq);
+		seq_printf(s, "Feature Len %d\n", d->feature_len);
+		seq_printf(s, "Config Len %d ", d->config_len);
+		seq_printf(s, "Shutdown Status %d\n", d->status);
+
+		for (k = 0; k < d->num_vq; k++) {
+			vqconfig = mic_vq_config(d) + k;
+			seq_printf(s, "vqconfig[%d]: ", k);
+			seq_printf(s, "address 0x%llx ",
+				   vqconfig->address);
+			seq_printf(s, "num %d ", vqconfig->num);
+			seq_printf(s, "used address 0x%llx\n",
+				   vqconfig->used_address);
+		}
+
+		features = (__u32 *)mic_vq_features(d);
+		seq_printf(s, "Features: Host 0x%x ", features[0]);
+		seq_printf(s, "Guest 0x%x\n", features[1]);
+
+		config = mic_vq_configspace(d);
+		for (k = 0; k < d->config_len; k++)
+			seq_printf(s, "config[%d]=%d\n", k, config[k]);
+
+		seq_puts(s, "Device control:\n");
+		seq_printf(s, "Config Change %d ", dc->config_change);
+		seq_printf(s, "Vdev reset %d\n", dc->vdev_reset);
+		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
+		seq_printf(s, "Host ack %d\n", dc->host_ack);
+		seq_printf(s, "Used address updated %d ",
+			   dc->used_address_updated);
+		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
+		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
+		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
+	}
+	schedule_work(&vi->hotplug_work);
+	return 0;
+}
+
+static int vop_dp_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, vop_dp_show, inode->i_private);
+}
+
+static int vop_dp_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations dp_ops = {
+	.owner   = THIS_MODULE,
+	.open    = vop_dp_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = vop_dp_debug_release
+};
+
+static int vop_vdev_info_show(struct seq_file *s, void *unused)
+{
+	struct vop_info *vi = s->private;
+	struct list_head *pos, *tmp;
+	struct vop_vdev *vdev;
+	int i, j;
+
+	mutex_lock(&vi->vop_mutex);
+	list_for_each_safe(pos, tmp, &vi->vdev_list) {
+		vdev = list_entry(pos, struct vop_vdev, list);
+		seq_printf(s, "VDEV type %d state %s in %ld out %ld in_dma %ld out_dma %ld\n",
+			   vdev->virtio_id,
+			   vop_vdevup(vdev) ? "UP" : "DOWN",
+			   vdev->in_bytes,
+			   vdev->out_bytes,
+			   vdev->in_bytes_dma,
+			   vdev->out_bytes_dma);
+		for (i = 0; i < MIC_MAX_VRINGS; i++) {
+			struct vring_desc *desc;
+			struct vring_avail *avail;
+			struct vring_used *used;
+			struct vop_vringh *vvr = &vdev->vvr[i];
+			struct vringh *vrh = &vvr->vrh;
+			int num = vrh->vring.num;
+
+			if (!num)
+				continue;
+			desc = vrh->vring.desc;
+			seq_printf(s, "vring i %d avail_idx %d",
+				   i, vvr->vring.info->avail_idx & (num - 1));
+			seq_printf(s, " vring i %d avail_idx %d\n",
+				   i, vvr->vring.info->avail_idx);
+			seq_printf(s, "vrh i %d weak_barriers %d",
+				   i, vrh->weak_barriers);
+			seq_printf(s, " last_avail_idx %d last_used_idx %d",
+				   vrh->last_avail_idx, vrh->last_used_idx);
+			seq_printf(s, " completed %d\n", vrh->completed);
+			for (j = 0; j < num; j++) {
+				seq_printf(s, "desc[%d] addr 0x%llx len %d",
+					   j, desc->addr, desc->len);
+				seq_printf(s, " flags 0x%x next %d\n",
+					   desc->flags, desc->next);
+				desc++;
+			}
+			avail = vrh->vring.avail;
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, avail->flags),
+				   vringh16_to_cpu(vrh,
+						   avail->idx) & (num - 1));
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, avail->flags),
+				   vringh16_to_cpu(vrh, avail->idx));
+			for (j = 0; j < num; j++)
+				seq_printf(s, "avail ring[%d] %d\n",
+					   j, avail->ring[j]);
+			used = vrh->vring.used;
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, used->flags),
+				   vringh16_to_cpu(vrh, used->idx) & (num - 1));
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				   vringh16_to_cpu(vrh, used->flags),
+				   vringh16_to_cpu(vrh, used->idx));
+			for (j = 0; j < num; j++)
+				seq_printf(s, "used ring[%d] id %d len %d\n",
+					   j, vringh32_to_cpu(vrh,
+							      used->ring[j].id),
+					   vringh32_to_cpu(vrh,
+							   used->ring[j].len));
+		}
+	}
+	mutex_unlock(&vi->vop_mutex);
+
+	return 0;
+}
+
+static int vop_vdev_info_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, vop_vdev_info_show, inode->i_private);
+}
+
+static int vop_vdev_info_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations vdev_info_ops = {
+	.owner   = THIS_MODULE,
+	.open    = vop_vdev_info_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = vop_vdev_info_debug_release
+};
+
+void vop_init_debugfs(struct vop_info *vi)
+{
+	char name[16];
+
+	snprintf(name, sizeof(name), "%s%d", KBUILD_MODNAME, vi->vpdev->dnode);
+	vi->dbg = debugfs_create_dir(name, NULL);
+	if (!vi->dbg) {
+		pr_err("can't create debugfs dir vop\n");
+		return;
+	}
+	debugfs_create_file("dp", 0444, vi->dbg, vi, &dp_ops);
+	debugfs_create_file("vdev_info", 0444, vi->dbg, vi, &vdev_info_ops);
+}
+
+void vop_exit_debugfs(struct vop_info *vi)
+{
+	debugfs_remove_recursive(vi->dbg);
+}
-- 
1.8.2.1

* [PATCH char-misc-next 8/8] misc: mic: MIC host and card driver changes to enable VOP
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (6 preceding siblings ...)
  2016-02-02  4:23 67% ` [PATCH char-misc-next 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
@ 2016-02-02  4:23 58% ` Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch modifies the MIC host and card drivers to start using the
VOP driver. The MIC host and card drivers now implement the VOP bus
operations and register a VOP device on the VOP bus. MIC driver stack
documentation is also updated to include the new VOP driver.
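
As a rough illustration of the split described above, the sketch below
models a platform driver that fills in a table of hardware callbacks and
hands it to generic code, which then drives the hardware only through
that table. It is a userspace toy, not the kernel code: every demo_*
name is hypothetical and only loosely mirrors struct vop_hw_ops and
vop_register_device() from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vop_hw_ops: callbacks a platform driver supplies. */
struct demo_hw_ops {
	int (*next_db)(void *priv);            /* pick the next doorbell number */
	void (*send_intr)(void *priv, int db); /* ring that doorbell on the peer */
};

/* Hypothetical stand-in for struct vop_device as seen by the generic driver. */
struct demo_device {
	const struct demo_hw_ops *hw_ops;
	void *priv; /* platform driver state, similar in spirit to the parent's drvdata */
	int dnode;  /* node id passed at registration time */
};

/* Toy "host" platform state; the real drivers keep this in struct mic_device. */
struct demo_host {
	int next_db;
	int last_intr;
};

static int host_next_db(void *priv)
{
	struct demo_host *h = priv;

	return h->next_db++;
}

static void host_send_intr(void *priv, int db)
{
	struct demo_host *h = priv;

	h->last_intr = db;
}

/* The host's ops table; a card driver would provide its own table. */
static const struct demo_hw_ops demo_host_ops = {
	.next_db = host_next_db,
	.send_intr = host_send_intr,
};

/* Registration only records the ops table and platform state. */
static void demo_register(struct demo_device *dev,
			  const struct demo_hw_ops *ops, void *priv, int dnode)
{
	assert(ops && ops->next_db && ops->send_intr);
	dev->hw_ops = ops;
	dev->priv = priv;
	dev->dnode = dnode;
}

/* Generic code: it does not know whether it runs on the host or the card. */
static int demo_ring_doorbell(struct demo_device *dev)
{
	int db = dev->hw_ops->next_db(dev->priv);

	dev->hw_ops->send_intr(dev->priv, db);
	return db;
}
```

A caller would set up a struct demo_host, pass &demo_host_ops to
demo_register(), and then call demo_ring_doorbell(); swapping in a
different ops table changes the hardware behaviour without touching the
generic function.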

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mic_overview.txt |  54 +++++++++-------
 drivers/misc/mic/Kconfig           |   7 ++-
 drivers/misc/mic/card/mic_device.h |   3 +
 drivers/misc/mic/host/mic_device.h |   3 +
 Documentation/mic/mpssd/mpssd.c    |   2 +-
 drivers/misc/mic/card/mic_device.c |  84 ++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |   1 +
 drivers/misc/mic/host/mic_boot.c   | 125 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/host/mic_main.c   |   1 +
 Documentation/mic/mpssd/mpss       |   2 +-
 10 files changed, 249 insertions(+), 33 deletions(-)

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 73f44fc..074adbd 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -12,10 +12,19 @@ for the X100 devices.
 
 Since it is a PCIe card, it does not have the ability to host hardware
 devices for networking, storage and console. We provide these devices
-on X100 coprocessors thus enabling a self-bootable equivalent environment
-for applications. A key benefit of our solution is that it leverages
-the standard virtio framework for network, disk and console devices,
-though in our case the virtio framework is used across a PCIe bus.
+on X100 coprocessors thus enabling a self-bootable equivalent
+environment for applications. A key benefit of our solution is that it
+leverages the standard virtio framework for network, disk and console
+devices, though in our case the virtio framework is used across a PCIe
+bus. A Virtio Over PCIe (VOP) driver allows creating user space
+backends or devices on the host which are used to probe virtio drivers
+for these devices on the MIC card. The existing VRINGH infrastructure
+in the kernel is used to access virtio rings from the host. The card
+VOP driver allows card virtio drivers to communicate with their user
+space backends on the host via a device page. Ring 3 apps on the host
+can add, remove and configure virtio devices. A thin, MIC-specific
+virtio_config_ops is implemented, borrowing heavily from the earlier
+implementations in lguest and s390.
 
 MIC PCIe card has a dma controller with 8 channels. These channels are
 shared between the host s/w and the card s/w. 0 to 3 are used by host
@@ -38,7 +47,6 @@ single threaded performance for the host compared to MIC, the ability of
 the host to initiate DMA's to/from the card using the MIC DMA engine and
 the fact that the virtio block storage backend can only be on the host.
 
-                                      |
                +----------+           |             +----------+
                | Card OS  |           |             | Host OS  |
                +----------+           |             +----------+
@@ -47,27 +55,25 @@ the fact that the virtio block storage backend can only be on the host.
         | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
         | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
         | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
-        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+        +---+---+ +---+----+ +--+---+ | +---------+  +----+---+ +--------+
             |         |         |     |      |            |         |
             |         |         |     |User  |            |         |
-            |         |         |     |------|------------|---------|-------
-            +-------------------+     |Kernel +--------------------------+
-                      |               |       | Virtio over PCIe IOCTLs  |
-                      |               |       +--------------------------+
-+-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |      +------+ | +------+ +------+ |  | MIC DMA   |
-| Driver    |         |      | SCIF | | | SCIF | | COSM | |  | Driver    |
-+-----------+         |      +------+ | +------+ +--+---+ |  +-----------+
-      |               |         |     |    |        |     |        |
-+---------------+     |      +------+ | +--+---+ +--+---+ | +----------------+
-|MIC virtual Bus|     |      |SCIF  | | |SCIF  | | COSM | | |MIC virtual Bus |
-+---------------+     |      |HW Bus| | |HW Bus| | Bus  | | +----------------+
-      |               |      +------+ | +--+---+ +------+ |              |
-      |               |         |     |       |     |     |              |
-      |   +-----------+---+     |     |       |    +---------------+     |
-      |   |Intel MIC      |     |     |       |    |Intel MIC      |     |
-      +---|Card Driver    |     |     |       |    |Host Driver    |     |
-          +------------+--------+     |       +----+---------------+-----+
+            |         |         |     |------|------------|--+------|-------
+            +---------+---------+     |Kernel                |
+                      |               |                      |
+  +---------+     +---+----+ +------+ | +------+ +------+ +--+---+  +-------+
+  |MIC DMA  |     |  VOP   | | SCIF | | | SCIF | | COSM | | VOP  |  |MIC DMA|
+  +---+-----+     +---+----+ +--+---+ | +--+---+ +--+---+ +------+  +----+--+
+      |               |         |     |    |        |                    |
+  +---+-----+     +---+----+ +--+---+ | +--+---+ +--+---+ +------+  +----+--+
+  |MIC      |     |  VOP   | |SCIF  | | |SCIF  | | COSM | | VOP  |  | MIC   |
+  |HW Bus   |     |  HW Bus| |HW Bus| | |HW Bus| | Bus  | |HW Bus|  |HW Bus |
+  +---------+     +--------+ +--+---+ | +--+---+ +------+ +------+  +-------+
+      |               |         |     |       |     |                    |
+      |   +-----------+--+      |     |       |    +---------------+     |
+      |   |Intel MIC     |      |     |       |    |Intel MIC      |     |
+      |   |Card Driver   |      |     |       |    |Host Driver    |     |
+      +---+--------------+------+     |       +----+---------------+-----+
                  |                    |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index b03bb17..2e4f3ba 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -53,8 +53,8 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS && MIC_COSM
-	select VHOST_RING
+	depends on 64BIT && PCI && X86
+	depends on INTEL_MIC_BUS && SCIF_BUS && MIC_COSM && VOP_BUS
 	help
 	  This enables Host Driver support for the Intel Many Integrated
 	  Core (MIC) family of PCIe form factor coprocessor devices that
@@ -73,7 +73,8 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS && MIC_COSM
+	depends on 64BIT && X86
+	depends on INTEL_MIC_BUS && SCIF_BUS && MIC_COSM && VOP_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 1dbf83c..333dbed 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -32,6 +32,7 @@
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -76,6 +77,7 @@ struct mic_device {
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
  * @scdev: SCIF device on the SCIF virtual bus.
+ * @vpdev: Virtio over PCIe device on the VOP virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -90,6 +92,7 @@ struct mic_driver {
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
 	struct scif_hw_dev *scdev;
+	struct vop_device *vpdev;
 };
 
 /**
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 8460de1..52b12b2 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -29,6 +29,7 @@
 #include <linux/miscdevice.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 #include "../bus/cosm_bus.h"
 #include "mic_intr.h"
 
@@ -68,6 +69,7 @@ extern struct cosm_hw_ops cosm_hw_ops;
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
  * @scdev: SCIF device on the SCIF virtual bus.
+ * @vpdev: Virtio over PCIe device on the VOP virtual bus.
  * @cosm_dev: COSM device
  */
 struct mic_device {
@@ -92,6 +94,7 @@ struct mic_device {
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
 	struct scif_hw_dev *scdev;
+	struct vop_device *vpdev;
 	struct cosm_device *cosm_dev;
 };
 
diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
index aaeafa1..518dece 100644
--- a/Documentation/mic/mpssd/mpssd.c
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -926,7 +926,7 @@ add_virtio_device(struct mic_info *mic, struct mic_device_desc *dd)
 	char path[PATH_MAX];
 	int fd, err;
 
-	snprintf(path, PATH_MAX, "/dev/mic%d", mic->id);
+	snprintf(path, PATH_MAX, "/dev/vop_virtio%d", mic->id);
 	fd = open(path, O_RDWR);
 	if (fd < 0) {
 		mpsslog("Could not open %s %s\n", path, strerror(errno));
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index ff03c63..e749af4 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -249,12 +249,82 @@ static struct scif_hw_ops scif_hw_ops = {
 	.iounmap = ___mic_iounmap,
 };
 
+static inline struct mic_driver *vpdev_to_mdrv(struct vop_device *vpdev)
+{
+	return dev_get_drvdata(vpdev->dev.parent);
+}
+
+static struct mic_irq *
+__mic_request_irq(struct vop_device *vpdev,
+		  irqreturn_t (*func)(int irq, void *data),
+		  const char *name, void *data, int intr_src)
+{
+	return mic_request_card_irq(func, NULL, name, data, intr_src);
+}
+
+static void __mic_free_irq(struct vop_device *vpdev,
+			   struct mic_irq *cookie, void *data)
+{
+	mic_free_card_irq(cookie, data);
+}
+
+static void __mic_ack_interrupt(struct vop_device *vpdev, int num)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int __mic_next_db(struct vop_device *vpdev)
+{
+	return mic_next_card_db();
+}
+
+static void __iomem *__mic_get_remote_dp(struct vop_device *vpdev)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	return mdrv->dp;
+}
+
+static void __mic_send_intr(struct vop_device *vpdev, int db)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void __iomem *__mic_ioremap(struct vop_device *vpdev,
+				   dma_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void __mic_iounmap(struct vop_device *vpdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = vpdev_to_mdrv(vpdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct vop_hw_ops vop_hw_ops = {
+	.request_irq = __mic_request_irq,
+	.free_irq = __mic_free_irq,
+	.ack_interrupt = __mic_ack_interrupt,
+	.next_db = __mic_next_db,
+	.get_remote_dp = __mic_get_remote_dp,
+	.send_intr = __mic_send_intr,
+	.ioremap = __mic_ioremap,
+	.iounmap = __mic_iounmap,
+};
+
 static int mic_request_dma_chans(struct mic_driver *mdrv)
 {
 	dma_cap_mask_t mask;
 	struct dma_chan *chan;
 
-	request_module("mic_x100_dma");
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_MEMCPY, mask);
 
@@ -308,6 +378,13 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 		rc = -ENODEV;
 		goto irq_uninit;
 	}
+	mdrv->vpdev = vop_register_device(mdrv->dev, VOP_DEV_TRNSP,
+					  NULL, &vop_hw_ops, 0,
+					  NULL, mdrv->dma_ch[0]);
+	if (IS_ERR(mdrv->vpdev)) {
+		rc = PTR_ERR(mdrv->vpdev);
+		goto dma_free;
+	}
 	bootparam = mdrv->dp;
 	node_id = ioread8(&bootparam->node_id);
 	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
@@ -317,11 +394,13 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 					   mdrv->num_dma_ch, true);
 	if (IS_ERR(mdrv->scdev)) {
 		rc = PTR_ERR(mdrv->scdev);
-		goto dma_free;
+		goto vop_remove;
 	}
 	mic_create_card_debug_dir(mdrv);
 done:
 	return rc;
+vop_remove:
+	vop_unregister_device(mdrv->vpdev);
 dma_free:
 	mic_free_dma_chans(mdrv);
 irq_uninit:
@@ -342,6 +421,7 @@ void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
 	scif_unregister_device(mdrv->scdev);
+	vop_unregister_device(mdrv->vpdev);
 	mic_free_dma_chans(mdrv);
 	mic_uninit_irq();
 	mic_dp_uninit();
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index b2958ce..b9f0710 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -326,6 +326,7 @@ static int __init mic_init(void)
 		goto done;
 	}
 
+	request_module("mic_x100_dma");
 	mic_init_card_debugfs();
 	ret = platform_device_register(&mic_platform_dev);
 	if (ret) {
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index 3df305f..bceb88a 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -25,10 +25,120 @@
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
 #include "../bus/scif_bus.h"
+#include "../bus/vop_bus.h"
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 
+static inline struct mic_device *vpdev_to_mdev(struct vop_device *vpdev)
+{
+	return dev_get_drvdata(vpdev->dev.parent);
+}
+
+static dma_addr_t
+_mic_dma_map_page(struct device *dev, struct page *page,
+		  unsigned long offset, size_t size,
+		  enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct vop_device *vpdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void _mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+				size_t size, enum dma_data_direction dir,
+				struct dma_attrs *attrs)
+{
+	struct vop_device *vpdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static const struct dma_map_ops _mic_dma_ops = {
+	.map_page = _mic_dma_map_page,
+	.unmap_page = _mic_dma_unmap_page,
+};
+
+static struct mic_irq *
+__mic_request_irq(struct vop_device *vpdev,
+		  irqreturn_t (*func)(int irq, void *data),
+		  const char *name, void *data, int intr_src)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					intr_src, MIC_INTR_DB);
+}
+
+static void __mic_free_irq(struct vop_device *vpdev,
+			   struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	mic_free_irq(mdev, cookie, data);
+}
+
+static void __mic_ack_interrupt(struct vop_device *vpdev, int num)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int __mic_next_db(struct vop_device *vpdev)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	return mic_next_db(mdev);
+}
+
+static void *__mic_get_dp(struct vop_device *vpdev)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	return mdev->dp;
+}
+
+static void __iomem *__mic_get_remote_dp(struct vop_device *vpdev)
+{
+	return NULL;
+}
+
+static void __mic_send_intr(struct vop_device *vpdev, int db)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *__mic_ioremap(struct vop_device *vpdev,
+				   dma_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = vpdev_to_mdev(vpdev);
+
+	return mdev->aper.va + pa;
+}
+
+static void __mic_iounmap(struct vop_device *vpdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct vop_hw_ops vop_hw_ops = {
+	.request_irq = __mic_request_irq,
+	.free_irq = __mic_free_irq,
+	.ack_interrupt = __mic_ack_interrupt,
+	.next_db = __mic_next_db,
+	.get_dp = __mic_get_dp,
+	.get_remote_dp = __mic_get_remote_dp,
+	.send_intr = __mic_send_intr,
+	.ioremap = __mic_ioremap,
+	.iounmap = __mic_iounmap,
+};
+
 static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
 {
 	return dev_get_drvdata(scdev->dev.parent);
@@ -314,7 +424,6 @@ static int mic_request_dma_chans(struct mic_device *mdev)
 	dma_cap_mask_t mask;
 	struct dma_chan *chan;
 
-	request_module("mic_x100_dma");
 	dma_cap_zero(mask);
 	dma_cap_set(DMA_MEMCPY, mask);
 
@@ -386,9 +495,18 @@ static int _mic_start(struct cosm_device *cdev, int id)
 		goto dma_free;
 	}
 
+	mdev->vpdev = vop_register_device(&mdev->pdev->dev,
+					  VOP_DEV_TRNSP, &_mic_dma_ops,
+					  &vop_hw_ops, id + 1, &mdev->aper,
+					  mdev->dma_ch[0]);
+	if (IS_ERR(mdev->vpdev)) {
+		rc = PTR_ERR(mdev->vpdev);
+		goto scif_remove;
+	}
+
 	rc = mdev->ops->load_mic_fw(mdev, NULL);
 	if (rc)
-		goto scif_remove;
+		goto vop_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -396,6 +514,8 @@ static int _mic_start(struct cosm_device *cdev, int id)
 	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
 	mdev->ops->send_firmware_intr(mdev);
 	goto unlock_ret;
+vop_remove:
+	vop_unregister_device(mdev->vpdev);
 scif_remove:
 	scif_unregister_device(mdev->scdev);
 dma_free:
@@ -422,6 +542,7 @@ static void _mic_stop(struct cosm_device *cdev, bool force)
 	 * will be the first to be registered and the last to be
 	 * unregistered.
 	 */
+	vop_unregister_device(mdev->vpdev);
 	scif_unregister_device(mdev->scdev);
 	mic_free_dma_chans(mdev);
 	mbus_unregister_device(mdev->dma_mbdev);
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 400def2..035be3e 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -317,6 +317,7 @@ static int __init mic_init(void)
 {
 	int ret;
 
+	request_module("mic_x100_dma");
 	mic_init_debugfs();
 	ida_init(&g_mic_ida);
 	ret = pci_register_driver(&mic_driver);
diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index 09ea9093..5fcf9fa 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -35,7 +35,7 @@
 
 exec=/usr/sbin/mpssd
 sysfs="/sys/class/mic"
-mic_modules="mic_host mic_x100_dma scif"
+mic_modules="mic_host mic_x100_dma scif vop"
 
 start()
 {
-- 
1.8.2.1

* [PATCH char-misc-next 2/8] misc: mic: Remove MIC X100 card virtio functionality
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
  2016-02-02  4:23 29% ` [PATCH char-misc-next 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
@ 2016-02-02  4:23 49% ` Sudeep Dutt
  2016-02-02  4:23 61% ` [PATCH char-misc-next 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch deletes the virtio functionality from the MIC X100 card
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/card/Makefile     |   1 -
 drivers/misc/mic/card/mic_virtio.h |  76 -----
 drivers/misc/mic/card/mic_device.c |   9 +-
 drivers/misc/mic/card/mic_virtio.c | 634 -------------------------------------
 4 files changed, 1 insertion(+), 719 deletions(-)
 delete mode 100644 drivers/misc/mic/card/mic_virtio.h
 delete mode 100644 drivers/misc/mic/card/mic_virtio.c

diff --git a/drivers/misc/mic/card/Makefile b/drivers/misc/mic/card/Makefile
index 69d58be..6e9675e 100644
--- a/drivers/misc/mic/card/Makefile
+++ b/drivers/misc/mic/card/Makefile
@@ -8,4 +8,3 @@ obj-$(CONFIG_INTEL_MIC_CARD) += mic_card.o
 mic_card-y += mic_x100.o
 mic_card-y += mic_device.o
 mic_card-y += mic_debugfs.o
-mic_card-y += mic_virtio.o
diff --git a/drivers/misc/mic/card/mic_virtio.h b/drivers/misc/mic/card/mic_virtio.h
deleted file mode 100644
index d0407ba..0000000
--- a/drivers/misc/mic/card/mic_virtio.h
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Disclaimer: The codes contained in these modules may be specific to
- * the Intel Software Development Platform codenamed: Knights Ferry, and
- * the Intel product codenamed: Knights Corner, and are not backward
- * compatible with other Intel products. Additionally, Intel will NOT
- * support the codes or instruction set in future products.
- *
- * Intel MIC Card driver.
- *
- */
-#ifndef __MIC_CARD_VIRTIO_H
-#define __MIC_CARD_VIRTIO_H
-
-#include <linux/mic_common.h>
-#include "mic_device.h"
-
-/*
- * 64 bit I/O access
- */
-#ifndef ioread64
-#define ioread64 readq
-#endif
-#ifndef iowrite64
-#define iowrite64 writeq
-#endif
-
-static inline unsigned mic_desc_size(struct mic_device_desc __iomem *desc)
-{
-	return sizeof(*desc)
-		+ ioread8(&desc->num_vq) * sizeof(struct mic_vqconfig)
-		+ ioread8(&desc->feature_len) * 2
-		+ ioread8(&desc->config_len);
-}
-
-static inline struct mic_vqconfig __iomem *
-mic_vq_config(struct mic_device_desc __iomem *desc)
-{
-	return (struct mic_vqconfig __iomem *)(desc + 1);
-}
-
-static inline __u8 __iomem *
-mic_vq_features(struct mic_device_desc __iomem *desc)
-{
-	return (__u8 __iomem *)(mic_vq_config(desc) + ioread8(&desc->num_vq));
-}
-
-static inline __u8 __iomem *
-mic_vq_configspace(struct mic_device_desc __iomem *desc)
-{
-	return mic_vq_features(desc) + ioread8(&desc->feature_len) * 2;
-}
-static inline unsigned mic_total_desc_size(struct mic_device_desc __iomem *desc)
-{
-	return mic_aligned_desc_size(desc) + sizeof(struct mic_device_ctrl);
-}
-
-int mic_devices_init(struct mic_driver *mdrv);
-void mic_devices_uninit(struct mic_driver *mdrv);
-
-#endif
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index d0edaf7..ff03c63 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -34,7 +34,6 @@
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
 #include "mic_device.h"
-#include "mic_virtio.h"
 
 static struct mic_driver *g_drv;
 
@@ -309,9 +308,6 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 		rc = -ENODEV;
 		goto irq_uninit;
 	}
-	rc = mic_devices_init(mdrv);
-	if (rc)
-		goto dma_free;
 	bootparam = mdrv->dp;
 	node_id = ioread8(&bootparam->node_id);
 	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
@@ -321,13 +317,11 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 					   mdrv->num_dma_ch, true);
 	if (IS_ERR(mdrv->scdev)) {
 		rc = PTR_ERR(mdrv->scdev);
-		goto device_uninit;
+		goto dma_free;
 	}
 	mic_create_card_debug_dir(mdrv);
 done:
 	return rc;
-device_uninit:
-	mic_devices_uninit(mdrv);
 dma_free:
 	mic_free_dma_chans(mdrv);
 irq_uninit:
@@ -348,7 +342,6 @@ void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
 	scif_unregister_device(mdrv->scdev);
-	mic_devices_uninit(mdrv);
 	mic_free_dma_chans(mdrv);
 	mic_uninit_irq();
 	mic_dp_uninit();
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
deleted file mode 100644
index f6ed57d..0000000
--- a/drivers/misc/mic/card/mic_virtio.c
+++ /dev/null
@@ -1,634 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Disclaimer: The codes contained in these modules may be specific to
- * the Intel Software Development Platform codenamed: Knights Ferry, and
- * the Intel product codenamed: Knights Corner, and are not backward
- * compatible with other Intel products. Additionally, Intel will NOT
- * support the codes or instruction set in future products.
- *
- * Adapted from:
- *
- * virtio for kvm on s390
- *
- * Copyright IBM Corp. 2008
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License (version 2 only)
- * as published by the Free Software Foundation.
- *
- *    Author(s): Christian Borntraeger <borntraeger@de.ibm.com>
- *
- * Intel MIC Card driver.
- *
- */
-#include <linux/delay.h>
-#include <linux/slab.h>
-#include <linux/virtio_config.h>
-
-#include "../common/mic_dev.h"
-#include "mic_virtio.h"
-
-#define VIRTIO_SUBCODE_64 0x0D00
-
-#define MIC_MAX_VRINGS                4
-struct mic_vdev {
-	struct virtio_device vdev;
-	struct mic_device_desc __iomem *desc;
-	struct mic_device_ctrl __iomem *dc;
-	struct mic_device *mdev;
-	void __iomem *vr[MIC_MAX_VRINGS];
-	int used_size[MIC_MAX_VRINGS];
-	struct completion reset_done;
-	struct mic_irq *virtio_cookie;
-	int c2h_vdev_db;
-};
-
-static struct mic_irq *virtio_config_cookie;
-#define to_micvdev(vd) container_of(vd, struct mic_vdev, vdev)
-
-/* Helper API to obtain the parent of the virtio device */
-static inline struct device *mic_dev(struct mic_vdev *mvdev)
-{
-	return mvdev->vdev.dev.parent;
-}
-
-/* This gets the device's feature bits. */
-static u64 mic_get_features(struct virtio_device *vdev)
-{
-	unsigned int i, bits;
-	u32 features = 0;
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-	u8 __iomem *in_features = mic_vq_features(desc);
-	int feature_len = ioread8(&desc->feature_len);
-
-	bits = min_t(unsigned, feature_len, sizeof(features)) * 8;
-	for (i = 0; i < bits; i++)
-		if (ioread8(&in_features[i / 8]) & (BIT(i % 8)))
-			features |= BIT(i);
-
-	return features;
-}
-
-static int mic_finalize_features(struct virtio_device *vdev)
-{
-	unsigned int i, bits;
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-	u8 feature_len = ioread8(&desc->feature_len);
-	/* Second half of bitmap is features we accept. */
-	u8 __iomem *out_features =
-		mic_vq_features(desc) + feature_len;
-
-	/* Give virtio_ring a chance to accept features. */
-	vring_transport_features(vdev);
-
-	/* Make sure we don't have any features > 32 bits! */
-	BUG_ON((u32)vdev->features != vdev->features);
-
-	memset_io(out_features, 0, feature_len);
-	bits = min_t(unsigned, feature_len,
-		sizeof(vdev->features)) * 8;
-	for (i = 0; i < bits; i++) {
-		if (__virtio_test_bit(vdev, i))
-			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
-				 &out_features[i / 8]);
-	}
-
-	return 0;
-}
-
-/*
- * Reading and writing elements in config space
- */
-static void mic_get(struct virtio_device *vdev, unsigned int offset,
-		   void *buf, unsigned len)
-{
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-
-	if (offset + len > ioread8(&desc->config_len))
-		return;
-	memcpy_fromio(buf, mic_vq_configspace(desc) + offset, len);
-}
-
-static void mic_set(struct virtio_device *vdev, unsigned int offset,
-		   const void *buf, unsigned len)
-{
-	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
-
-	if (offset + len > ioread8(&desc->config_len))
-		return;
-	memcpy_toio(mic_vq_configspace(desc) + offset, buf, len);
-}
-
-/*
- * The operations to get and set the status word just access the status
- * field of the device descriptor. set_status also interrupts the host
- * to tell about status changes.
- */
-static u8 mic_get_status(struct virtio_device *vdev)
-{
-	return ioread8(&to_micvdev(vdev)->desc->status);
-}
-
-static void mic_set_status(struct virtio_device *vdev, u8 status)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	if (!status)
-		return;
-	iowrite8(status, &mvdev->desc->status);
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-}
-
-/* Inform host on a virtio device reset and wait for ack from host */
-static void mic_reset_inform_host(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_device_ctrl __iomem *dc = mvdev->dc;
-	int retry;
-
-	iowrite8(0, &dc->host_ack);
-	iowrite8(1, &dc->vdev_reset);
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-
-	/* Wait till host completes all card accesses and acks the reset */
-	for (retry = 100; retry--;) {
-		if (ioread8(&dc->host_ack))
-			break;
-		msleep(100);
-	};
-
-	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, retry);
-
-	/* Reset status to 0 in case we timed out */
-	iowrite8(0, &mvdev->desc->status);
-}
-
-static void mic_reset(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-
-	dev_dbg(mic_dev(mvdev), "%s: virtio id %d\n",
-		__func__, vdev->id.device);
-
-	mic_reset_inform_host(vdev);
-	complete_all(&mvdev->reset_done);
-}
-
-/*
- * The virtio_ring code calls this API when it wants to notify the Host.
- */
-static bool mic_notify(struct virtqueue *vq)
-{
-	struct mic_vdev *mvdev = vq->priv;
-
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-	return true;
-}
-
-static void mic_del_vq(struct virtqueue *vq, int n)
-{
-	struct mic_vdev *mvdev = to_micvdev(vq->vdev);
-	struct vring *vr = (struct vring *)(vq + 1);
-
-	free_pages((unsigned long) vr->used, get_order(mvdev->used_size[n]));
-	vring_del_virtqueue(vq);
-	mic_card_unmap(mvdev->mdev, mvdev->vr[n]);
-	mvdev->vr[n] = NULL;
-}
-
-static void mic_del_vqs(struct virtio_device *vdev)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct virtqueue *vq, *n;
-	int idx = 0;
-
-	dev_dbg(mic_dev(mvdev), "%s\n", __func__);
-
-	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
-		mic_del_vq(vq, idx++);
-}
-
-/*
- * This routine will assign vring's allocated in host/io memory. Code in
- * virtio_ring.c however continues to access this io memory as if it were local
- * memory without io accessors.
- */
-static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
-				     unsigned index,
-				     void (*callback)(struct virtqueue *vq),
-				     const char *name)
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_vqconfig __iomem *vqconfig;
-	struct mic_vqconfig config;
-	struct virtqueue *vq;
-	void __iomem *va;
-	struct _mic_vring_info __iomem *info;
-	void *used;
-	int vr_size, _vr_size, err, magic;
-	struct vring *vr;
-	u8 type = ioread8(&mvdev->desc->type);
-
-	if (index >= ioread8(&mvdev->desc->num_vq))
-		return ERR_PTR(-ENOENT);
-
-	if (!name)
-		return ERR_PTR(-ENOENT);
-
-	/* First assign the vring's allocated in host memory */
-	vqconfig = mic_vq_config(mvdev->desc) + index;
-	memcpy_fromio(&config, vqconfig, sizeof(config));
-	_vr_size = vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN);
-	vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info));
-	va = mic_card_map(mvdev->mdev, le64_to_cpu(config.address), vr_size);
-	if (!va)
-		return ERR_PTR(-ENOMEM);
-	mvdev->vr[index] = va;
-	memset_io(va, 0x0, _vr_size);
-	vq = vring_new_virtqueue(index, le16_to_cpu(config.num),
-				 MIC_VIRTIO_RING_ALIGN, vdev, false,
-				 (void __force *)va, mic_notify, callback,
-				 name);
-	if (!vq) {
-		err = -ENOMEM;
-		goto unmap;
-	}
-	info = va + _vr_size;
-	magic = ioread32(&info->magic);
-
-	if (WARN(magic != MIC_MAGIC + type + index, "magic mismatch")) {
-		err = -EIO;
-		goto unmap;
-	}
-
-	/* Allocate and reassign used ring now */
-	mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
-					     sizeof(struct vring_used_elem) *
-					     le16_to_cpu(config.num));
-	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-					get_order(mvdev->used_size[index]));
-	if (!used) {
-		err = -ENOMEM;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-		goto del_vq;
-	}
-	iowrite64(virt_to_phys(used), &vqconfig->used_address);
-
-	/*
-	 * To reassign the used ring here we are directly accessing
-	 * struct vring_virtqueue which is a private data structure
-	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
-	 * vring_new_virtqueue() would ensure that
-	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
-	 */
-	vr = (struct vring *)(vq + 1);
-	vr->used = used;
-
-	vq->priv = mvdev;
-	return vq;
-del_vq:
-	vring_del_virtqueue(vq);
-unmap:
-	mic_card_unmap(mvdev->mdev, mvdev->vr[index]);
-	return ERR_PTR(err);
-}
-
-static int mic_find_vqs(struct virtio_device *vdev, unsigned nvqs,
-			struct virtqueue *vqs[],
-			vq_callback_t *callbacks[],
-			const char * const names[])
-{
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-	struct mic_device_ctrl __iomem *dc = mvdev->dc;
-	int i, err, retry;
-
-	/* We must have this many virtqueues. */
-	if (nvqs > ioread8(&mvdev->desc->num_vq))
-		return -ENOENT;
-
-	for (i = 0; i < nvqs; ++i) {
-		dev_dbg(mic_dev(mvdev), "%s: %d: %s\n",
-			__func__, i, names[i]);
-		vqs[i] = mic_find_vq(vdev, i, callbacks[i], names[i]);
-		if (IS_ERR(vqs[i])) {
-			err = PTR_ERR(vqs[i]);
-			goto error;
-		}
-	}
-
-	iowrite8(1, &dc->used_address_updated);
-	/*
-	 * Send an interrupt to the host to inform it that used
-	 * rings have been re-assigned.
-	 */
-	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
-	for (retry = 100; retry--;) {
-		if (!ioread8(&dc->used_address_updated))
-			break;
-		msleep(100);
-	};
-
-	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, retry);
-	if (!retry) {
-		err = -ENODEV;
-		goto error;
-	}
-
-	return 0;
-error:
-	mic_del_vqs(vdev);
-	return err;
-}
-
-/*
- * The config ops structure as defined by virtio config
- */
-static struct virtio_config_ops mic_vq_config_ops = {
-	.get_features = mic_get_features,
-	.finalize_features = mic_finalize_features,
-	.get = mic_get,
-	.set = mic_set,
-	.get_status = mic_get_status,
-	.set_status = mic_set_status,
-	.reset = mic_reset,
-	.find_vqs = mic_find_vqs,
-	.del_vqs = mic_del_vqs,
-};
-
-static irqreturn_t
-mic_virtio_intr_handler(int irq, void *data)
-{
-	struct mic_vdev *mvdev = data;
-	struct virtqueue *vq;
-
-	mic_ack_interrupt(mvdev->mdev);
-	list_for_each_entry(vq, &mvdev->vdev.vqs, list)
-		vring_interrupt(0, vq);
-
-	return IRQ_HANDLED;
-}
-
-static void mic_virtio_release_dev(struct device *_d)
-{
-	/*
-	 * No need for a release method similar to virtio PCI.
-	 * Provide an empty one to avoid getting a warning from core.
-	 */
-}
-
-/*
- * adds a new device and register it with virtio
- * appropriate drivers are loaded by the device model
- */
-static int mic_add_device(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_vdev *mvdev;
-	int ret;
-	int virtio_db;
-	u8 type = ioread8(&d->type);
-
-	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
-	if (!mvdev) {
-		dev_err(mdrv->dev, "Cannot allocate mic dev %u type %u\n",
-			offset, type);
-		return -ENOMEM;
-	}
-
-	mvdev->mdev = &mdrv->mdev;
-	mvdev->vdev.dev.parent = mdrv->dev;
-	mvdev->vdev.dev.release = mic_virtio_release_dev;
-	mvdev->vdev.id.device = type;
-	mvdev->vdev.config = &mic_vq_config_ops;
-	mvdev->desc = d;
-	mvdev->dc = (void __iomem *)d + mic_aligned_desc_size(d);
-	init_completion(&mvdev->reset_done);
-
-	virtio_db = mic_next_card_db();
-	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
-			NULL, "virtio intr", mvdev, virtio_db);
-	if (IS_ERR(mvdev->virtio_cookie)) {
-		ret = PTR_ERR(mvdev->virtio_cookie);
-		goto kfree;
-	}
-	iowrite8((u8)virtio_db, &mvdev->dc->h2c_vdev_db);
-	mvdev->c2h_vdev_db = ioread8(&mvdev->dc->c2h_vdev_db);
-
-	ret = register_virtio_device(&mvdev->vdev);
-	if (ret) {
-		dev_err(mic_dev(mvdev),
-			"Failed to register mic device %u type %u\n",
-			offset, type);
-		goto free_irq;
-	}
-	iowrite64((u64)mvdev, &mvdev->dc->vdev);
-	dev_dbg(mic_dev(mvdev), "%s: registered mic device %u type %u mvdev %p\n",
-		__func__, offset, type, mvdev);
-
-	return 0;
-
-free_irq:
-	mic_free_card_irq(mvdev->virtio_cookie, mvdev);
-kfree:
-	kfree(mvdev);
-	return ret;
-}
-
-/*
- * match for a mic device with a specific desc pointer
- */
-static int mic_match_desc(struct device *dev, void *data)
-{
-	struct virtio_device *vdev = dev_to_virtio(dev);
-	struct mic_vdev *mvdev = to_micvdev(vdev);
-
-	return mvdev->desc == (void __iomem *)data;
-}
-
-static void mic_handle_config_change(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_device_ctrl __iomem *dc
-		= (void __iomem *)d + mic_aligned_desc_size(d);
-	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
-
-	if (ioread8(&dc->config_change) != MIC_VIRTIO_PARAM_CONFIG_CHANGED)
-		return;
-
-	dev_dbg(mdrv->dev, "%s %d\n", __func__, __LINE__);
-	virtio_config_changed(&mvdev->vdev);
-	iowrite8(1, &dc->guest_ack);
-}
-
-/*
- * removes a virtio device if a hot remove event has been
- * requested by the host.
- */
-static int mic_remove_device(struct mic_device_desc __iomem *d,
-	unsigned int offset, struct mic_driver *mdrv)
-{
-	struct mic_device_ctrl __iomem *dc
-		= (void __iomem *)d + mic_aligned_desc_size(d);
-	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
-	u8 status;
-	int ret = -1;
-
-	if (ioread8(&dc->config_change) == MIC_VIRTIO_PARAM_DEV_REMOVE) {
-		dev_dbg(mdrv->dev,
-			"%s %d config_change %d type %d mvdev %p\n",
-			__func__, __LINE__,
-			ioread8(&dc->config_change), ioread8(&d->type), mvdev);
-
-		status = ioread8(&d->status);
-		reinit_completion(&mvdev->reset_done);
-		unregister_virtio_device(&mvdev->vdev);
-		mic_free_card_irq(mvdev->virtio_cookie, mvdev);
-		if (status & VIRTIO_CONFIG_S_DRIVER_OK)
-			wait_for_completion(&mvdev->reset_done);
-		kfree(mvdev);
-		iowrite8(1, &dc->guest_ack);
-		dev_dbg(mdrv->dev, "%s %d guest_ack %d\n",
-			__func__, __LINE__, ioread8(&dc->guest_ack));
-		ret = 0;
-	}
-
-	return ret;
-}
-
-#define REMOVE_DEVICES true
-
-static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
-{
-	s8 type;
-	unsigned int i;
-	struct mic_device_desc __iomem *d;
-	struct mic_device_ctrl __iomem *dc;
-	struct device *dev;
-	int ret;
-
-	for (i = sizeof(struct mic_bootparam); i < MIC_DP_SIZE;
-		i += mic_total_desc_size(d)) {
-		d = mdrv->dp + i;
-		dc = (void __iomem *)d + mic_aligned_desc_size(d);
-		/*
-		 * This read barrier is paired with the corresponding write
-		 * barrier on the host which is inserted before adding or
-		 * removing a virtio device descriptor, by updating the type.
-		 */
-		rmb();
-		type = ioread8(&d->type);
-
-		/* end of list */
-		if (type == 0)
-			break;
-
-		if (type == -1)
-			continue;
-
-		/* device already exists */
-		dev = device_find_child(mdrv->dev, (void __force *)d,
-					mic_match_desc);
-		if (dev) {
-			if (remove)
-				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
-					 &dc->config_change);
-			put_device(dev);
-			mic_handle_config_change(d, i, mdrv);
-			ret = mic_remove_device(d, i, mdrv);
-			if (!ret && !remove)
-				iowrite8(-1, &d->type);
-			if (remove) {
-				iowrite8(0, &dc->config_change);
-				iowrite8(0, &dc->guest_ack);
-			}
-			continue;
-		}
-
-		/* new device */
-		dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n",
-			__func__, __LINE__, d);
-		if (!remove)
-			mic_add_device(d, i, mdrv);
-	}
-}
-
-/*
- * mic_hotplug_device tries to find changes in the device page.
- */
-static void mic_hotplug_devices(struct work_struct *work)
-{
-	struct mic_driver *mdrv = container_of(work,
-		struct mic_driver, hotplug_work);
-
-	mic_scan_devices(mdrv, !REMOVE_DEVICES);
-}
-
-/*
- * Interrupt handler for hot plug/config changes etc.
- */
-static irqreturn_t
-mic_extint_handler(int irq, void *data)
-{
-	struct mic_driver *mdrv = (struct mic_driver *)data;
-
-	dev_dbg(mdrv->dev, "%s %d hotplug work\n",
-		__func__, __LINE__);
-	mic_ack_interrupt(&mdrv->mdev);
-	schedule_work(&mdrv->hotplug_work);
-	return IRQ_HANDLED;
-}
-
-/*
- * Init function for virtio
- */
-int mic_devices_init(struct mic_driver *mdrv)
-{
-	int rc;
-	struct mic_bootparam __iomem *bootparam;
-	int config_db;
-
-	INIT_WORK(&mdrv->hotplug_work, mic_hotplug_devices);
-	mic_scan_devices(mdrv, !REMOVE_DEVICES);
-
-	config_db = mic_next_card_db();
-	virtio_config_cookie = mic_request_card_irq(mic_extint_handler, NULL,
-						    "virtio_config_intr", mdrv,
-						    config_db);
-	if (IS_ERR(virtio_config_cookie)) {
-		rc = PTR_ERR(virtio_config_cookie);
-		goto exit;
-	}
-
-	bootparam = mdrv->dp;
-	iowrite8(config_db, &bootparam->h2c_config_db);
-	return 0;
-exit:
-	return rc;
-}
-
-/*
- * Uninit function for virtio
- */
-void mic_devices_uninit(struct mic_driver *mdrv)
-{
-	struct mic_bootparam __iomem *bootparam = mdrv->dp;
-	iowrite8(-1, &bootparam->h2c_config_db);
-	mic_free_card_irq(virtio_config_cookie, mdrv);
-	flush_work(&mdrv->hotplug_work);
-	mic_scan_devices(mdrv, REMOVE_DEVICES);
-}
-- 
1.8.2.1

^ permalink raw reply related	[relevance 49%]
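
For readers following the removed mic_get_features() and
mic_finalize_features() above: the feature area of a device descriptor
is treated as two byte arrays of feature_len bytes each. The first half
holds the bits offered by the host, and the second half is where the
card writes the bits it accepts. The sketch below is a userspace
approximation of that layout only. The function names gather_features()
and publish_features() are hypothetical, plain memory stands in for the
ioread8()/iowrite8() accessors, and nothing here is taken from the
driver beyond the bit arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: collect up to 32 offered feature bits from the
 * first half of the feature area (bit i lives in byte i / 8, at
 * position i % 8).
 */
uint32_t gather_features(const uint8_t *in_features, unsigned int feature_len)
{
	uint32_t features = 0;
	unsigned int i, bits;

	assert(in_features != NULL);
	/* Never read more bits than the 32-bit result can hold. */
	bits = (feature_len < sizeof(features) ?
		feature_len : (unsigned int)sizeof(features)) * 8;
	for (i = 0; i < bits; i++)
		if (in_features[i / 8] & (1u << (i % 8)))
			features |= (uint32_t)1 << i;

	return features;
}

/*
 * Hypothetical helper: clear the second half of the feature area and
 * write back the accepted bits using the same byte/bit layout.
 */
void publish_features(uint8_t *out_features, unsigned int feature_len,
		      uint32_t accepted)
{
	unsigned int i, bits;

	assert(out_features != NULL);
	memset(out_features, 0, feature_len);
	bits = (feature_len < sizeof(accepted) ?
		feature_len : (unsigned int)sizeof(accepted)) * 8;
	for (i = 0; i < bits; i++)
		if (accepted & ((uint32_t)1 << i))
			out_features[i / 8] |= (uint8_t)(1u << (i % 8));
}
```

With a feature_len of 4, a caller would pass the start of the area to
gather_features() and the start of the area plus 4 to
publish_features(), which mirrors the "mic_vq_features(desc) +
feature_len" pointer arithmetic in the removed code.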

* [PATCH char-misc-next 3/8] misc: mic: MIC VOP Bus
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
  2016-02-02  4:23 29% ` [PATCH char-misc-next 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
  2016-02-02  4:23 49% ` [PATCH char-misc-next 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
@ 2016-02-02  4:23 61% ` Sudeep Dutt
    2016-02-02  4:23 75% ` [PATCH char-misc-next 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

The Virtio Over PCIe (VOP) bus abstracts low-level hardware details
such as interrupts and remote memory mapping, so that the same VOP
driver can work unchanged with different MIC host or card drivers as
long as the hardware bus operations are implemented. The VOP driver
registers itself on the VOP bus. The base PCIe drivers implement the
bus ops and register VOP devices on the bus, which results in the VOP
driver being probed with those devices. This allows the VOP
functionality to be shared between multiple generations of Intel MIC
products.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig       |  17 ++++
 drivers/misc/mic/bus/Makefile  |   1 +
 drivers/misc/mic/bus/vop_bus.h | 142 ++++++++++++++++++++++++++++
 drivers/misc/mic/bus/vop_bus.c | 204 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 364 insertions(+)
 create mode 100644 drivers/misc/mic/bus/vop_bus.h
 create mode 100644 drivers/misc/mic/bus/vop_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 40677df..840f7ef 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -32,6 +32,23 @@ config SCIF_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "VOP Bus Driver"
+
+config VOP_BUS
+	tristate "VOP Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the VOP Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index 761842b..8758a7d 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
 obj-$(CONFIG_SCIF_BUS) += scif_bus.o
 obj-$(CONFIG_MIC_COSM) += cosm_bus.o
+obj-$(CONFIG_VOP_BUS) += vop_bus.o
diff --git a/drivers/misc/mic/bus/vop_bus.h b/drivers/misc/mic/bus/vop_bus.h
new file mode 100644
index 0000000..97fa5d6
--- /dev/null
+++ b/drivers/misc/mic/bus/vop_bus.h
@@ -0,0 +1,142 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio over PCIe Bus driver.
+ */
+#ifndef _VOP_BUS_H_
+#define _VOP_BUS_H_
+/*
+ * Everything a vop driver needs to work with any particular vop
+ * implementation.
+ */
+#include <linux/dmaengine.h>
+#include <linux/interrupt.h>
+
+#include "../common/mic_dev.h"
+
+struct vop_device_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define VOP_DEV_TRNSP 1
+#define VOP_DEV_ANY_ID 0xffffffff
+/*
+ * Size of the internal buffer used during DMA's as an intermediate buffer
+ * for copy to/from user. Must be an integral number of pages.
+ */
+#define VOP_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
+
+/**
+ * vop_device - representation of a device using vop
+ * @priv: private pointer for the driver's use.
+ * @hw_ops: the hardware ops supported by this device.
+ * @id: the device type identification (used to match it with a driver).
+ * @dev: underlying device.
+ * @dnode: The destination node which this device will communicate with.
+ * @aper: Aperture memory window
+ * @dma_ch: DMA channel
+ * @index: unique position on the vop bus
+ */
+struct vop_device {
+	void *priv;
+	struct vop_hw_ops *hw_ops;
+	struct vop_device_id id;
+	struct device dev;
+	u8 dnode;
+	struct mic_mw *aper;
+	struct dma_chan *dma_ch;
+	int index;
+};
+
+/**
+ * vop_driver - operations for a vop I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct vop_driver {
+	struct device_driver driver;
+	const struct vop_device_id *id_table;
+	int (*probe)(struct vop_device *dev);
+	void (*remove)(struct vop_device *dev);
+};
+
+/**
+ * vop_hw_ops - Hardware operations for accessing a VOP device on the VOP bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @get_remote_dp: Get access to the virtio device page used by the remote
+ *                 node to add/remove/configure virtio devices.
+ * @get_dp: Get access to the virtio device page used by the self
+ *          node to add/remove/configure virtio devices.
+ * @send_intr: Send an interrupt to the peer node on a specified doorbell.
+ * @ioremap: Map a buffer with the specified DMA address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ * @dma_filter: The DMA filter function to use for obtaining access to
+ *		a DMA channel on the peer node.
+ */
+struct vop_hw_ops {
+	int (*next_db)(struct vop_device *vpdev);
+	struct mic_irq *(*request_irq)(struct vop_device *vpdev,
+				       irqreturn_t (*func)(int irq, void *data),
+				       const char *name, void *data,
+				       int intr_src);
+	void (*free_irq)(struct vop_device *vpdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct vop_device *vpdev, int num);
+	void __iomem * (*get_remote_dp)(struct vop_device *vpdev);
+	void * (*get_dp)(struct vop_device *vpdev);
+	void (*send_intr)(struct vop_device *vpdev, int db);
+	void __iomem * (*ioremap)(struct vop_device *vpdev,
+				  dma_addr_t pa, size_t len);
+	void (*iounmap)(struct vop_device *vpdev, void __iomem *va);
+};
+
+struct vop_device *
+vop_register_device(struct device *pdev, int id,
+		    const struct dma_map_ops *dma_ops,
+		    struct vop_hw_ops *hw_ops, u8 dnode, struct mic_mw *aper,
+		    struct dma_chan *chan);
+void vop_unregister_device(struct vop_device *dev);
+int vop_register_driver(struct vop_driver *drv);
+void vop_unregister_driver(struct vop_driver *drv);
+
+/*
+ * module_vop_driver() - Helper macro for drivers that don't do
+ * anything special in module init/exit.  This eliminates a lot of
+ * boilerplate.  Each module may only use this macro once, and
+ * calling it replaces module_init() and module_exit()
+ */
+#define module_vop_driver(__vop_driver) \
+	module_driver(__vop_driver, vop_register_driver, \
+			vop_unregister_driver)
+
+static inline struct vop_device *dev_to_vop(struct device *dev)
+{
+	return container_of(dev, struct vop_device, dev);
+}
+
+static inline struct vop_driver *drv_to_vop(struct device_driver *drv)
+{
+	return container_of(drv, struct vop_driver, driver);
+}
+#endif /* _VOP_BUS_H_ */
diff --git a/drivers/misc/mic/bus/vop_bus.c b/drivers/misc/mic/bus/vop_bus.c
new file mode 100644
index 0000000..02449f5
--- /dev/null
+++ b/drivers/misc/mic/bus/vop_bus.c
@@ -0,0 +1,204 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "vop_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct vop_device *dev = dev_to_vop(d);
+
+	return sprintf(buf, "vop:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *vop_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(vop_dev);
+
+static inline int vop_id_match(const struct vop_device *dev,
+			       const struct vop_device_id *id)
+{
+	if (id->device != dev->id.device && id->device != VOP_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == VOP_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call vop_dev_probe().
+ */
+static int vop_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct vop_device *dev = dev_to_vop(dv);
+	const struct vop_device_id *ids;
+
+	ids = drv_to_vop(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (vop_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int vop_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct vop_device *dev = dev_to_vop(dv);
+
+	return add_uevent_var(env, "MODALIAS=vop:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int vop_dev_probe(struct device *d)
+{
+	struct vop_device *dev = dev_to_vop(d);
+	struct vop_driver *drv = drv_to_vop(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int vop_dev_remove(struct device *d)
+{
+	struct vop_device *dev = dev_to_vop(d);
+	struct vop_driver *drv = drv_to_vop(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type vop_bus = {
+	.name  = "vop_bus",
+	.match = vop_dev_match,
+	.dev_groups = vop_dev_groups,
+	.uevent = vop_uevent,
+	.probe = vop_dev_probe,
+	.remove = vop_dev_remove,
+};
+
+int vop_register_driver(struct vop_driver *driver)
+{
+	driver->driver.bus = &vop_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(vop_register_driver);
+
+void vop_unregister_driver(struct vop_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(vop_unregister_driver);
+
+static void vop_release_dev(struct device *d)
+{
+	put_device(d);
+}
+
+struct vop_device *
+vop_register_device(struct device *pdev, int id,
+		    const struct dma_map_ops *dma_ops,
+		    struct vop_hw_ops *hw_ops, u8 dnode, struct mic_mw *aper,
+		    struct dma_chan *chan)
+{
+	int ret;
+	struct vop_device *vdev;
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev)
+		return ERR_PTR(-ENOMEM);
+
+	vdev->dev.parent = pdev;
+	vdev->id.device = id;
+	vdev->id.vendor = VOP_DEV_ANY_ID;
+	vdev->dev.archdata.dma_ops = (struct dma_map_ops *)dma_ops;
+	vdev->dev.dma_mask = &vdev->dev.coherent_dma_mask;
+	dma_set_mask(&vdev->dev, DMA_BIT_MASK(64));
+	vdev->dev.release = vop_release_dev;
+	vdev->hw_ops = hw_ops;
+	dev_set_drvdata(&vdev->dev, vdev);
+	vdev->dev.bus = &vop_bus;
+	vdev->dnode = dnode;
+	vdev->aper = aper;
+	vdev->dma_ch = chan;
+	vdev->index = dnode - 1;
+	dev_set_name(&vdev->dev, "vop-dev%u", vdev->index);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&vdev->dev);
+	if (ret)
+		goto free_vdev;
+	return vdev;
+free_vdev:
+	kfree(vdev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(vop_register_device);
+
+void vop_unregister_device(struct vop_device *dev)
+{
+	device_unregister(&dev->dev);
+}
+EXPORT_SYMBOL_GPL(vop_unregister_device);
+
+static int __init vop_init(void)
+{
+	return bus_register(&vop_bus);
+}
+
+static void __exit vop_exit(void)
+{
+	bus_unregister(&vop_bus);
+}
+
+core_initcall(vop_init);
+module_exit(vop_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) VOP Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1

^ permalink raw reply related	[relevance 61%]
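
The ID matching done by vop_id_match() and vop_dev_match() in the patch
above can be tried outside the kernel. The following is a minimal
userspace sketch, assuming a table terminated by an entry whose device
field is 0, as in the loop in vop_dev_match(). The names sketch_id,
sketch_id_match() and sketch_table_match() are hypothetical, and
SKETCH_ANY_ID simply mirrors the value of VOP_DEV_ANY_ID.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ANY_ID 0xffffffffu	/* same value as VOP_DEV_ANY_ID */

/* Hypothetical stand-in for struct vop_device_id. */
struct sketch_id {
	uint32_t device;
	uint32_t vendor;
};

/*
 * One table entry matches when its device field equals the device's
 * (or is the wildcard) and its vendor field equals the device's (or
 * is the wildcard).
 */
int sketch_id_match(const struct sketch_id *dev, const struct sketch_id *id)
{
	if (id->device != dev->device && id->device != SKETCH_ANY_ID)
		return 0;

	return id->vendor == SKETCH_ANY_ID || id->vendor == dev->vendor;
}

/*
 * Walk the driver's table until the terminating entry (device == 0)
 * and report whether any entry matches.
 */
int sketch_table_match(const struct sketch_id *dev,
		       const struct sketch_id *ids)
{
	size_t i;

	for (i = 0; ids[i].device; i++)
		if (sketch_id_match(dev, &ids[i]))
			return 1;
	return 0;
}
```

One consequence of this scheme is that a device ID of 0 cannot appear
in a driver's table, since it would be read as the terminator. Note
also that the wildcard is only honoured on the table side: a device
registered with its vendor set to VOP_DEV_ANY_ID, as
vop_register_device() does, is matched by table entries whose vendor is
either the wildcard or that same value.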

* [PATCH char-misc-next 4/8] misc: mic: Add data structures for the VOP driver
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
                   ` (2 preceding siblings ...)
  2016-02-02  4:23 61% ` [PATCH char-misc-next 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
@ 2016-02-02  4:23 75% ` Sudeep Dutt
  2016-02-02  4:23 39% ` [PATCH char-misc-next 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch adds VOP driver data structures used in subsequent
patches. These data structures are refactored from similar data
structures used in the virtio parts of previous MIC host and card
drivers.

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/vop/vop_main.h | 170 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)
 create mode 100644 drivers/misc/mic/vop/vop_main.h

diff --git a/drivers/misc/mic/vop/vop_main.h b/drivers/misc/mic/vop/vop_main.h
new file mode 100644
index 0000000..ba47ec7
--- /dev/null
+++ b/drivers/misc/mic/vop/vop_main.h
@@ -0,0 +1,170 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel Virtio Over PCIe (VOP) driver.
+ *
+ */
+#ifndef _VOP_MAIN_H_
+#define _VOP_MAIN_H_
+
+#include <linux/vringh.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio.h>
+#include <linux/miscdevice.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+#include "../bus/vop_bus.h"
+
+/*
+ * Note on endianness.
+ * 1. Host can be both BE or LE
+ * 2. Guest/card is LE. Host uses le_to_cpu to access desc/avail
+ *    rings and ioreadXX/iowriteXX to access used ring.
+ * 3. Device page exposed by host to guest contains LE values. Guest
+ *    accesses these using ioreadXX/iowriteXX etc. This way in general we
+ *    obey the virtio spec according to which guest works with native
+ *    endianness and host is aware of guest endianness and does all
+ *    required endianness conversion.
+ * 4. Data provided from user space to guest (in ADD_DEVICE and
+ *    CONFIG_CHANGE ioctl's) is not interpreted by the driver and should be
+ *    in guest endianness.
+ */
+
+/*
+ * vop_info - Allocated per invocation of VOP probe
+ *
+ * @vpdev: VOP device
+ * @hotplug_work: Handle virtio device creation, deletion and configuration
+ * @cookie: Cookie received upon requesting a virtio configuration interrupt
+ * @h2c_config_db: The doorbell used by the peer to indicate a config change
+ * @vdev_list: List of "active" virtio devices injected in the peer node
+ * @vop_mutex: Synchronize access to the device page as well as serialize
+ *             creation/deletion of virtio devices on the peer node
+ * @dp: Peer device page information
+ * @dbg: Debugfs entry
+ * @dma_ch: The DMA channel used by this transport for data transfers.
+ * @name: Name for this transport used in misc device creation.
+ * @miscdev: The misc device registered.
+ */
+struct vop_info {
+	struct vop_device *vpdev;
+	struct work_struct hotplug_work;
+	struct mic_irq *cookie;
+	int h2c_config_db;
+	struct list_head vdev_list;
+	struct mutex vop_mutex;
+	void __iomem *dp;
+	struct dentry *dbg;
+	struct dma_chan *dma_ch;
+	char name[16];
+	struct miscdevice miscdev;
+};
+
+/**
+ * struct vop_vringh - Virtio ring host information.
+ *
+ * @vring: The VOP vring used for setting up user space mappings.
+ * @vrh: The host VRINGH used for accessing the card vrings.
+ * @riov: The VRINGH read kernel IOV.
+ * @wiov: The VRINGH write kernel IOV.
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
+ * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @buf: Temporary kernel buffer used to copy in/out data
+ * from/to the card via DMA.
+ * @buf_da: dma address of buf.
+ * @vdev: Back pointer to VOP virtio device for vringh_notify(..).
+ */
+struct vop_vringh {
+	struct mic_vring vring;
+	struct vringh vrh;
+	struct vringh_kiov riov;
+	struct vringh_kiov wiov;
+	u16 head;
+	struct mutex vr_mutex;
+	void *buf;
+	dma_addr_t buf_da;
+	struct vop_vdev *vdev;
+};
+
+/**
+ * struct vop_vdev - Host information for a card Virtio device.
+ *
+ * @virtio_id - Virtio device id.
+ * @waitq - Waitqueue to allow ring3 apps to poll.
+ * @vpdev - pointer to VOP bus device.
+ * @poll_wake - Used for waking up threads blocked in poll.
+ * @out_bytes - Debug stats for number of bytes copied from host to card.
+ * @in_bytes - Debug stats for number of bytes copied from card to host.
+ * @out_bytes_dma - Debug stats for number of bytes copied from host to card
+ * using DMA.
+ * @in_bytes_dma - Debug stats for number of bytes copied from card to host
+ * using DMA.
+ * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
+ * the transfer length did not have the required DMA alignment.
+ * @tx_dst_unaligned - Debug stats for number of bytes copied where the
+ * destination address on the card did not have the required DMA alignment.
+ * @vvr - Store per VRING data structures.
+ * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
+ * @dd - Virtio device descriptor.
+ * @dc - Virtio device control fields.
+ * @list - List of Virtio devices.
+ * @virtio_db - The doorbell used by the card to interrupt the host.
+ * @virtio_cookie - The cookie returned while requesting interrupts.
+ * @vi: Transport information.
+ * @vdev_mutex: Mutex synchronizing virtio device injection,
+ *              removal and data transfers.
+ * @destroy: Track if a virtio device is being destroyed.
+ * @deleted: The virtio device has been deleted.
+ */
+struct vop_vdev {
+	int virtio_id;
+	wait_queue_head_t waitq;
+	struct vop_device *vpdev;
+	int poll_wake;
+	unsigned long out_bytes;
+	unsigned long in_bytes;
+	unsigned long out_bytes_dma;
+	unsigned long in_bytes_dma;
+	unsigned long tx_len_unaligned;
+	unsigned long tx_dst_unaligned;
+	unsigned long rx_dst_unaligned;
+	struct vop_vringh vvr[MIC_MAX_VRINGS];
+	struct work_struct virtio_bh_work;
+	struct mic_device_desc *dd;
+	struct mic_device_ctrl *dc;
+	struct list_head list;
+	int virtio_db;
+	struct mic_irq *virtio_cookie;
+	struct vop_info *vi;
+	struct mutex vdev_mutex;
+	struct completion destroy;
+	bool deleted;
+};
+
+/* Helper API to check if a virtio device is running */
+static inline bool vop_vdevup(struct vop_vdev *vdev)
+{
+	return !!vdev->dd->status;
+}
+
+void vop_init_debugfs(struct vop_info *vi);
+void vop_exit_debugfs(struct vop_info *vi);
+int vop_host_init(struct vop_info *vi);
+void vop_host_uninit(struct vop_info *vi);
+#endif
-- 
1.8.2.1


* [PATCH char-misc-next 1/8] misc: mic: Remove MIC X100 host virtio functionality
  2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
@ 2016-02-02  4:23 29% ` Sudeep Dutt
  2016-02-02  4:23 49% ` [PATCH char-misc-next 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch deletes the virtio functionality from the MIC X100 host
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/host/Makefile      |   2 -
 drivers/misc/mic/host/mic_device.h  |   6 -
 drivers/misc/mic/host/mic_fops.h    |  32 --
 drivers/misc/mic/host/mic_virtio.h  | 155 -------
 drivers/misc/mic/host/mic_boot.c    |   2 -
 drivers/misc/mic/host/mic_debugfs.c | 190 ---------
 drivers/misc/mic/host/mic_fops.c    | 222 ----------
 drivers/misc/mic/host/mic_main.c    |  48 +--
 drivers/misc/mic/host/mic_virtio.c  | 811 ------------------------------------
 9 files changed, 4 insertions(+), 1464 deletions(-)
 delete mode 100644 drivers/misc/mic/host/mic_fops.h
 delete mode 100644 drivers/misc/mic/host/mic_virtio.h
 delete mode 100644 drivers/misc/mic/host/mic_fops.c
 delete mode 100644 drivers/misc/mic/host/mic_virtio.c
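
For reference, the mmap offset layout implemented by the removed
mic_query_offset() (device page at offset 0, vrings packed back to back
after it) can be sketched in plain C as below. This is a simplified
userspace model: SKETCH_DP_SIZE is assumed to be 0x1000, as the layout
comment in mic_fops.c implies, and sketch_query_offset() with its
array-of-lengths argument is hypothetical, not a kernel interface.

```c
#include <stddef.h>

#define SKETCH_DP_SIZE 0x1000UL	/* assumed device page size */

/*
 * Return the index of the region that starts exactly at 'offset':
 * 0 for the device page, i + 1 for vring i, or -1 if 'offset' is not
 * the start of any region. '*size' receives the region length.
 */
static int sketch_query_offset(unsigned long offset,
			       const unsigned long *vring_len, size_t num_vq,
			       unsigned long *size)
{
	unsigned long start = SKETCH_DP_SIZE;
	size_t i;

	/* Offset 0 always selects the device page. */
	if (!offset) {
		*size = SKETCH_DP_SIZE;
		return 0;
	}

	/* Vrings follow the device page, each starting where the last ended. */
	for (i = 0; i < num_vq; i++) {
		if (offset == start) {
			*size = vring_len[i];
			return (int)i + 1;
		}
		start += vring_len[i];
	}
	return -1;
}
```

The removed mic_mmap() walks a mapping by calling the lookup, mapping
the returned size, and advancing the offset by that size until the
requested length is consumed; an offset that does not land on a region
boundary is rejected.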

diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
index 004d3db..f3b5023 100644
--- a/drivers/misc/mic/host/Makefile
+++ b/drivers/misc/mic/host/Makefile
@@ -9,5 +9,3 @@ mic_host-objs += mic_smpt.o
 mic_host-objs += mic_intr.o
 mic_host-objs += mic_boot.o
 mic_host-objs += mic_debugfs.o
-mic_host-objs += mic_fops.o
-mic_host-objs += mic_virtio.o
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 461184a..8460de1 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -64,9 +64,6 @@ extern struct cosm_hw_ops cosm_hw_ops;
  * @bootaddr: MIC boot address.
  * @dp: virtio device page
  * @dp_dma_addr: virtio device page DMA address.
- * @name: name for the misc char device
- * @miscdev: registered misc char device
- * @vdev_list: list of virtio devices.
  * @dma_mbdev: MIC BUS DMA device.
  * @dma_ch - Array of DMA channels
  * @num_dma_ch - Number of DMA channels available
@@ -91,9 +88,6 @@ struct mic_device {
 	u32 bootaddr;
 	void *dp;
 	dma_addr_t dp_dma_addr;
-	char name[16];
-	struct miscdevice miscdev;
-	struct list_head vdev_list;
 	struct mbus_device *dma_mbdev;
 	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
 	int num_dma_ch;
diff --git a/drivers/misc/mic/host/mic_fops.h b/drivers/misc/mic/host/mic_fops.h
deleted file mode 100644
index dc3893d..0000000
--- a/drivers/misc/mic/host/mic_fops.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#ifndef _MIC_FOPS_H_
-#define _MIC_FOPS_H_
-
-int mic_open(struct inode *inode, struct file *filp);
-int mic_release(struct inode *inode, struct file *filp);
-ssize_t mic_read(struct file *filp, char __user *buf,
-			size_t count, loff_t *pos);
-long mic_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
-int mic_mmap(struct file *f, struct vm_area_struct *vma);
-unsigned int mic_poll(struct file *f, poll_table *wait);
-
-#endif
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
deleted file mode 100644
index a80631f..0000000
--- a/drivers/misc/mic/host/mic_virtio.h
+++ /dev/null
@@ -1,155 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#ifndef MIC_VIRTIO_H
-#define MIC_VIRTIO_H
-
-#include <linux/virtio_config.h>
-#include <linux/mic_ioctl.h>
-
-/*
- * Note on endianness.
- * 1. Host can be both BE or LE
- * 2. Guest/card is LE. Host uses le_to_cpu to access desc/avail
- *    rings and ioreadXX/iowriteXX to access used ring.
- * 3. Device page exposed by host to guest contains LE values. Guest
- *    accesses these using ioreadXX/iowriteXX etc. This way in general we
- *    obey the virtio spec according to which guest works with native
- *    endianness and host is aware of guest endianness and does all
- *    required endianness conversion.
- * 4. Data provided from user space to guest (in ADD_DEVICE and
- *    CONFIG_CHANGE ioctl's) is not interpreted by the driver and should be
- *    in guest endianness.
- */
-
-/**
- * struct mic_vringh - Virtio ring host information.
- *
- * @vring: The MIC vring used for setting up user space mappings.
- * @vrh: The host VRINGH used for accessing the card vrings.
- * @riov: The VRINGH read kernel IOV.
- * @wiov: The VRINGH write kernel IOV.
- * @vr_mutex: Mutex for synchronizing access to the VRING.
- * @buf: Temporary kernel buffer used to copy in/out data
- * from/to the card via DMA.
- * @buf_da: dma address of buf.
- * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
- * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
- */
-struct mic_vringh {
-	struct mic_vring vring;
-	struct vringh vrh;
-	struct vringh_kiov riov;
-	struct vringh_kiov wiov;
-	struct mutex vr_mutex;
-	void *buf;
-	dma_addr_t buf_da;
-	struct mic_vdev *mvdev;
-	u16 head;
-};
-
-/**
- * struct mic_vdev - Host information for a card Virtio device.
- *
- * @virtio_id - Virtio device id.
- * @waitq - Waitqueue to allow ring3 apps to poll.
- * @mdev - Back pointer to host MIC device.
- * @poll_wake - Used for waking up threads blocked in poll.
- * @out_bytes - Debug stats for number of bytes copied from host to card.
- * @in_bytes - Debug stats for number of bytes copied from card to host.
- * @out_bytes_dma - Debug stats for number of bytes copied from host to card
- * using DMA.
- * @in_bytes_dma - Debug stats for number of bytes copied from card to host
- * using DMA.
- * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
- * the transfer length did not have the required DMA alignment.
- * @tx_dst_unaligned - Debug stats for number of bytes copied where the
- * destination address on the card did not have the required DMA alignment.
- * @mvr - Store per VRING data structures.
- * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
- * @dd - Virtio device descriptor.
- * @dc - Virtio device control fields.
- * @list - List of Virtio devices.
- * @virtio_db - The doorbell used by the card to interrupt the host.
- * @virtio_cookie - The cookie returned while requesting interrupts.
- */
-struct mic_vdev {
-	int virtio_id;
-	wait_queue_head_t waitq;
-	struct mic_device *mdev;
-	int poll_wake;
-	unsigned long out_bytes;
-	unsigned long in_bytes;
-	unsigned long out_bytes_dma;
-	unsigned long in_bytes_dma;
-	unsigned long tx_len_unaligned;
-	unsigned long tx_dst_unaligned;
-	struct mic_vringh mvr[MIC_MAX_VRINGS];
-	struct work_struct virtio_bh_work;
-	struct mic_device_desc *dd;
-	struct mic_device_ctrl *dc;
-	struct list_head list;
-	int virtio_db;
-	struct mic_irq *virtio_cookie;
-};
-
-void mic_virtio_uninit(struct mic_device *mdev);
-int mic_virtio_add_device(struct mic_vdev *mvdev,
-			void __user *argp);
-void mic_virtio_del_device(struct mic_vdev *mvdev);
-int mic_virtio_config_change(struct mic_vdev *mvdev,
-			void __user *argp);
-int mic_virtio_copy_desc(struct mic_vdev *mvdev,
-	struct mic_copy_desc *request);
-void mic_virtio_reset_devices(struct mic_device *mdev);
-void mic_bh_handler(struct work_struct *work);
-
-/* Helper API to obtain the MIC PCIe device */
-static inline struct device *mic_dev(struct mic_vdev *mvdev)
-{
-	return &mvdev->mdev->pdev->dev;
-}
-
-/* Helper API to check if a virtio device is initialized */
-static inline int mic_vdev_inited(struct mic_vdev *mvdev)
-{
-	/* Device has not been created yet */
-	if (!mvdev->dd || !mvdev->dd->type) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-
-	/* Device has been removed/deleted */
-	if (mvdev->dd->type == -1) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -ENODEV);
-		return -ENODEV;
-	}
-
-	return 0;
-}
-
-/* Helper API to check if a virtio device is running */
-static inline bool mic_vdevup(struct mic_vdev *mvdev)
-{
-	return !!mvdev->dd->status;
-}
-#endif
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index 7845564..3df305f 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -28,7 +28,6 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
-#include "mic_virtio.h"
 
 static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
 {
@@ -423,7 +422,6 @@ static void _mic_stop(struct cosm_device *cdev, bool force)
 	 * will be the first to be registered and the last to be
 	 * unregistered.
 	 */
-	mic_virtio_reset_devices(mdev);
 	scif_unregister_device(mdev->scdev);
 	mic_free_dma_chans(mdev);
 	mbus_unregister_device(mdev->dma_mbdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 1058160..0a9daba 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -26,7 +26,6 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
-#include "mic_virtio.h"
 
 /* Debugfs parent dir */
 static struct dentry *mic_dbg;
@@ -100,190 +99,6 @@ static const struct file_operations post_code_ops = {
 	.release = mic_post_code_debug_release
 };
 
-static int mic_dp_show(struct seq_file *s, void *pos)
-{
-	struct mic_device *mdev = s->private;
-	struct mic_device_desc *d;
-	struct mic_device_ctrl *dc;
-	struct mic_vqconfig *vqconfig;
-	__u32 *features;
-	__u8 *config;
-	struct mic_bootparam *bootparam = mdev->dp;
-	int i, j;
-
-	seq_printf(s, "Bootparam: magic 0x%x\n",
-		   bootparam->magic);
-	seq_printf(s, "Bootparam: h2c_config_db %d\n",
-		   bootparam->h2c_config_db);
-	seq_printf(s, "Bootparam: node_id %d\n",
-		   bootparam->node_id);
-	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
-		   bootparam->c2h_scif_db);
-	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
-		   bootparam->h2c_scif_db);
-	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
-		   bootparam->scif_host_dma_addr);
-	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
-		   bootparam->scif_card_dma_addr);
-
-
-	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
-	     i += mic_total_desc_size(d)) {
-		d = mdev->dp + i;
-		dc = (void *)d + mic_aligned_desc_size(d);
-
-		/* end of list */
-		if (d->type == 0)
-			break;
-
-		if (d->type == -1)
-			continue;
-
-		seq_printf(s, "Type %d ", d->type);
-		seq_printf(s, "Num VQ %d ", d->num_vq);
-		seq_printf(s, "Feature Len %d\n", d->feature_len);
-		seq_printf(s, "Config Len %d ", d->config_len);
-		seq_printf(s, "Shutdown Status %d\n", d->status);
-
-		for (j = 0; j < d->num_vq; j++) {
-			vqconfig = mic_vq_config(d) + j;
-			seq_printf(s, "vqconfig[%d]: ", j);
-			seq_printf(s, "address 0x%llx ", vqconfig->address);
-			seq_printf(s, "num %d ", vqconfig->num);
-			seq_printf(s, "used address 0x%llx\n",
-				   vqconfig->used_address);
-		}
-
-		features = (__u32 *)mic_vq_features(d);
-		seq_printf(s, "Features: Host 0x%x ", features[0]);
-		seq_printf(s, "Guest 0x%x\n", features[1]);
-
-		config = mic_vq_configspace(d);
-		for (j = 0; j < d->config_len; j++)
-			seq_printf(s, "config[%d]=%d\n", j, config[j]);
-
-		seq_puts(s, "Device control:\n");
-		seq_printf(s, "Config Change %d ", dc->config_change);
-		seq_printf(s, "Vdev reset %d\n", dc->vdev_reset);
-		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
-		seq_printf(s, "Host ack %d\n", dc->host_ack);
-		seq_printf(s, "Used address updated %d ",
-			   dc->used_address_updated);
-		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
-		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
-		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
-	}
-
-	return 0;
-}
-
-static int mic_dp_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, mic_dp_show, inode->i_private);
-}
-
-static int mic_dp_debug_release(struct inode *inode, struct file *file)
-{
-	return single_release(inode, file);
-}
-
-static const struct file_operations dp_ops = {
-	.owner   = THIS_MODULE,
-	.open    = mic_dp_debug_open,
-	.read    = seq_read,
-	.llseek  = seq_lseek,
-	.release = mic_dp_debug_release
-};
-
-static int mic_vdev_info_show(struct seq_file *s, void *unused)
-{
-	struct mic_device *mdev = s->private;
-	struct list_head *pos, *tmp;
-	struct mic_vdev *mvdev;
-	int i, j;
-
-	mutex_lock(&mdev->mic_mutex);
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		mvdev = list_entry(pos, struct mic_vdev, list);
-		seq_printf(s, "VDEV type %d state %s in %ld out %ld\n",
-			   mvdev->virtio_id,
-			   mic_vdevup(mvdev) ? "UP" : "DOWN",
-			   mvdev->in_bytes,
-			   mvdev->out_bytes);
-		for (i = 0; i < MIC_MAX_VRINGS; i++) {
-			struct vring_desc *desc;
-			struct vring_avail *avail;
-			struct vring_used *used;
-			struct mic_vringh *mvr = &mvdev->mvr[i];
-			struct vringh *vrh = &mvr->vrh;
-			int num = vrh->vring.num;
-			if (!num)
-				continue;
-			desc = vrh->vring.desc;
-			seq_printf(s, "vring i %d avail_idx %d",
-				   i, mvr->vring.info->avail_idx & (num - 1));
-			seq_printf(s, " vring i %d avail_idx %d\n",
-				   i, mvr->vring.info->avail_idx);
-			seq_printf(s, "vrh i %d weak_barriers %d",
-				   i, vrh->weak_barriers);
-			seq_printf(s, " last_avail_idx %d last_used_idx %d",
-				   vrh->last_avail_idx, vrh->last_used_idx);
-			seq_printf(s, " completed %d\n", vrh->completed);
-			for (j = 0; j < num; j++) {
-				seq_printf(s, "desc[%d] addr 0x%llx len %d",
-					   j, desc->addr, desc->len);
-				seq_printf(s, " flags 0x%x next %d\n",
-					   desc->flags, desc->next);
-				desc++;
-			}
-			avail = vrh->vring.avail;
-			seq_printf(s, "avail flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, avail->flags),
-				   vringh16_to_cpu(vrh, avail->idx) & (num - 1));
-			seq_printf(s, "avail flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, avail->flags),
-				   vringh16_to_cpu(vrh, avail->idx));
-			for (j = 0; j < num; j++)
-				seq_printf(s, "avail ring[%d] %d\n",
-					   j, avail->ring[j]);
-			used = vrh->vring.used;
-			seq_printf(s, "used flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, used->flags),
-				   vringh16_to_cpu(vrh, used->idx) & (num - 1));
-			seq_printf(s, "used flags 0x%x idx %d\n",
-				   vringh16_to_cpu(vrh, used->flags),
-				   vringh16_to_cpu(vrh, used->idx));
-			for (j = 0; j < num; j++)
-				seq_printf(s, "used ring[%d] id %d len %d\n",
-					   j, vringh32_to_cpu(vrh,
-							      used->ring[j].id),
-					   vringh32_to_cpu(vrh,
-							   used->ring[j].len));
-		}
-	}
-	mutex_unlock(&mdev->mic_mutex);
-
-	return 0;
-}
-
-static int mic_vdev_info_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, mic_vdev_info_show, inode->i_private);
-}
-
-static int mic_vdev_info_debug_release(struct inode *inode, struct file *file)
-{
-	return single_release(inode, file);
-}
-
-static const struct file_operations vdev_info_ops = {
-	.owner   = THIS_MODULE,
-	.open    = mic_vdev_info_debug_open,
-	.read    = seq_read,
-	.llseek  = seq_lseek,
-	.release = mic_vdev_info_debug_release
-};
-
 static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 {
 	struct mic_device *mdev  = s->private;
@@ -367,11 +182,6 @@ void mic_create_debug_dir(struct mic_device *mdev)
 	debugfs_create_file("post_code", 0444, mdev->dbg_dir, mdev,
 			    &post_code_ops);
 
-	debugfs_create_file("dp", 0444, mdev->dbg_dir, mdev, &dp_ops);
-
-	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir, mdev,
-			    &vdev_info_ops);
-
 	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir, mdev,
 			    &msi_irq_info_ops);
 }
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
deleted file mode 100644
index 8cc1d90..0000000
--- a/drivers/misc/mic/host/mic_fops.c
+++ /dev/null
@@ -1,222 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#include <linux/poll.h>
-#include <linux/pci.h>
-
-#include <linux/mic_common.h>
-#include "../common/mic_dev.h"
-#include "mic_device.h"
-#include "mic_fops.h"
-#include "mic_virtio.h"
-
-int mic_open(struct inode *inode, struct file *f)
-{
-	struct mic_vdev *mvdev;
-	struct mic_device *mdev = container_of(f->private_data,
-		struct mic_device, miscdev);
-
-	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
-	if (!mvdev)
-		return -ENOMEM;
-
-	init_waitqueue_head(&mvdev->waitq);
-	INIT_LIST_HEAD(&mvdev->list);
-	mvdev->mdev = mdev;
-	mvdev->virtio_id = -1;
-
-	f->private_data = mvdev;
-	return 0;
-}
-
-int mic_release(struct inode *inode, struct file *f)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-
-	if (-1 != mvdev->virtio_id)
-		mic_virtio_del_device(mvdev);
-	f->private_data = NULL;
-	kfree(mvdev);
-	return 0;
-}
-
-long mic_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	void __user *argp = (void __user *)arg;
-	int ret;
-
-	switch (cmd) {
-	case MIC_VIRTIO_ADD_DEVICE:
-	{
-		ret = mic_virtio_add_device(mvdev, argp);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		break;
-	}
-	case MIC_VIRTIO_COPY_DESC:
-	{
-		struct mic_copy_desc copy;
-
-		ret = mic_vdev_inited(mvdev);
-		if (ret)
-			return ret;
-
-		if (copy_from_user(&copy, argp, sizeof(copy)))
-			return -EFAULT;
-
-		dev_dbg(mic_dev(mvdev),
-			"%s %d === iovcnt 0x%x vr_idx 0x%x update_used %d\n",
-			__func__, __LINE__, copy.iovcnt, copy.vr_idx,
-			copy.update_used);
-
-		ret = mic_virtio_copy_desc(mvdev, &copy);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		if (copy_to_user(
-			&((struct mic_copy_desc __user *)argp)->out_len,
-			&copy.out_len, sizeof(copy.out_len))) {
-			dev_err(mic_dev(mvdev), "%s %d errno ret %d\n",
-				__func__, __LINE__, -EFAULT);
-			return -EFAULT;
-		}
-		break;
-	}
-	case MIC_VIRTIO_CONFIG_CHANGE:
-	{
-		ret = mic_vdev_inited(mvdev);
-		if (ret)
-			return ret;
-
-		ret = mic_virtio_config_change(mvdev, argp);
-		if (ret < 0) {
-			dev_err(mic_dev(mvdev),
-				"%s %d errno ret %d\n",
-				__func__, __LINE__, ret);
-			return ret;
-		}
-		break;
-	}
-	default:
-		return -ENOIOCTLCMD;
-	};
-	return 0;
-}
-
-/*
- * We return POLLIN | POLLOUT from poll when new buffers are enqueued, and
- * not when previously enqueued buffers may be available. This means that
- * in the card->host (TX) path, when userspace is unblocked by poll it
- * must drain all available descriptors or it can stall.
- */
-unsigned int mic_poll(struct file *f, poll_table *wait)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	int mask = 0;
-
-	poll_wait(f, &mvdev->waitq, wait);
-
-	if (mic_vdev_inited(mvdev)) {
-		mask = POLLERR;
-	} else if (mvdev->poll_wake) {
-		mvdev->poll_wake = 0;
-		mask = POLLIN | POLLOUT;
-	}
-
-	return mask;
-}
-
-static inline int
-mic_query_offset(struct mic_vdev *mvdev, unsigned long offset,
-		 unsigned long *size, unsigned long *pa)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	unsigned long start = MIC_DP_SIZE;
-	int i;
-
-	/*
-	 * MMAP interface is as follows:
-	 * offset				region
-	 * 0x0					virtio device_page
-	 * 0x1000				first vring
-	 * 0x1000 + size of 1st vring		second vring
-	 * ....
-	 */
-	if (!offset) {
-		*pa = virt_to_phys(mdev->dp);
-		*size = MIC_DP_SIZE;
-		return 0;
-	}
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-		if (offset == start) {
-			*pa = virt_to_phys(mvr->vring.va);
-			*size = mvr->vring.len;
-			return 0;
-		}
-		start += mvr->vring.len;
-	}
-	return -1;
-}
-
-/*
- * Maps the device page and virtio rings to user space for readonly access.
- */
-int
-mic_mmap(struct file *f, struct vm_area_struct *vma)
-{
-	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
-	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
-	unsigned long pa, size = vma->vm_end - vma->vm_start, size_rem = size;
-	int i, err;
-
-	err = mic_vdev_inited(mvdev);
-	if (err)
-		return err;
-
-	if (vma->vm_flags & VM_WRITE)
-		return -EACCES;
-
-	while (size_rem) {
-		i = mic_query_offset(mvdev, offset, &size, &pa);
-		if (i < 0)
-			return -EINVAL;
-		err = remap_pfn_range(vma, vma->vm_start + offset,
-			pa >> PAGE_SHIFT, size, vma->vm_page_prot);
-		if (err)
-			return err;
-		dev_dbg(mic_dev(mvdev),
-			"%s %d type %d size 0x%lx off 0x%lx pa 0x%lx vma 0x%lx\n",
-			__func__, __LINE__, mvdev->virtio_id, size, offset,
-			pa, vma->vm_start + offset);
-		size_rem -= size;
-		offset += size;
-	}
-	return 0;
-}
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 153894e..400def2 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -27,8 +27,6 @@
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
-#include "mic_fops.h"
-#include "mic_virtio.h"
 
 static const char mic_driver_name[] = "mic";
 
@@ -57,17 +55,6 @@ MODULE_DEVICE_TABLE(pci, mic_pci_tbl);
 
 /* ID allocator for MIC devices */
 static struct ida g_mic_ida;
-/* Base device node number for MIC devices */
-static dev_t g_mic_devno;
-
-static const struct file_operations mic_fops = {
-	.open = mic_open,
-	.release = mic_release,
-	.unlocked_ioctl = mic_ioctl,
-	.poll = mic_poll,
-	.mmap = mic_mmap,
-	.owner = THIS_MODULE,
-};
 
 /* Initialize the device page */
 static int mic_dp_init(struct mic_device *mdev)
@@ -169,7 +156,6 @@ mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
 	mic_ops_init(mdev);
 	mutex_init(&mdev->mic_mutex);
 	mdev->irq_info.next_avail_src = 0;
-	INIT_LIST_HEAD(&mdev->vdev_list);
 }
 
 /**
@@ -259,30 +245,15 @@ static int mic_probe(struct pci_dev *pdev,
 		goto smpt_uninit;
 	}
 	mic_bootparam_init(mdev);
-
 	mic_create_debug_dir(mdev);
 
-	mdev->miscdev.minor = MISC_DYNAMIC_MINOR;
-	snprintf(mdev->name, sizeof(mdev->name), "mic%d", mdev->id);
-	mdev->miscdev.name = mdev->name;
-	mdev->miscdev.fops = &mic_fops;
-	mdev->miscdev.parent = &mdev->pdev->dev;
-	rc = misc_register(&mdev->miscdev);
-	if (rc) {
-		dev_err(&pdev->dev, "misc_register err id %d rc %d\n",
-			mdev->id, rc);
-		goto cleanup_debug_dir;
-	}
-
 	mdev->cosm_dev = cosm_register_device(&mdev->pdev->dev, &cosm_hw_ops);
 	if (IS_ERR(mdev->cosm_dev)) {
 		rc = PTR_ERR(mdev->cosm_dev);
 		dev_err(&pdev->dev, "cosm_add_device failed rc %d\n", rc);
-		goto misc_dereg;
+		goto cleanup_debug_dir;
 	}
 	return 0;
-misc_dereg:
-	misc_deregister(&mdev->miscdev);
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
 	mic_dp_uninit(mdev);
@@ -323,7 +294,6 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	cosm_unregister_device(mdev->cosm_dev);
-	misc_deregister(&mdev->miscdev);
 	mic_delete_debug_dir(mdev);
 	mic_dp_uninit(mdev);
 	mic_smpt_uninit(mdev);
@@ -347,26 +317,17 @@ static int __init mic_init(void)
 {
 	int ret;
 
-	ret = alloc_chrdev_region(&g_mic_devno, 0,
-				  MIC_MAX_NUM_DEVS, mic_driver_name);
-	if (ret) {
-		pr_err("alloc_chrdev_region failed ret %d\n", ret);
-		goto error;
-	}
-
 	mic_init_debugfs();
 	ida_init(&g_mic_ida);
 	ret = pci_register_driver(&mic_driver);
 	if (ret) {
 		pr_err("pci_register_driver failed ret %d\n", ret);
-		goto cleanup_chrdev;
+		goto cleanup_debugfs;
 	}
-	return ret;
-cleanup_chrdev:
+	return 0;
+cleanup_debugfs:
 	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
-	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
-error:
 	return ret;
 }
 
@@ -375,7 +336,6 @@ static void __exit mic_exit(void)
 	pci_unregister_driver(&mic_driver);
 	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
-	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
 }
 
 module_init(mic_init);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
deleted file mode 100644
index 58b107a..0000000
--- a/drivers/misc/mic/host/mic_virtio.c
+++ /dev/null
@@ -1,811 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC Host driver.
- *
- */
-#include <linux/pci.h>
-#include <linux/sched.h>
-#include <linux/uaccess.h>
-#include <linux/dmaengine.h>
-#include <linux/mic_common.h>
-#include "../common/mic_dev.h"
-#include "mic_device.h"
-#include "mic_smpt.h"
-#include "mic_virtio.h"
-
-/*
- * Size of the internal buffer used during DMA's as an intermediate buffer
- * for copy to/from user.
- */
-#define MIC_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
-
-static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
-			dma_addr_t src, size_t len)
-{
-	int err = 0;
-	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch[0];
-
-	if (!mic_ch) {
-		err = -EBUSY;
-		goto error;
-	}
-
-	tx = mic_ch->device->device_prep_dma_memcpy(mic_ch, dst, src, len,
-						    DMA_PREP_FENCE);
-	if (!tx) {
-		err = -ENOMEM;
-		goto error;
-	} else {
-		dma_cookie_t cookie = tx->tx_submit(tx);
-
-		err = dma_submit_error(cookie);
-		if (err)
-			goto error;
-		err = dma_sync_wait(mic_ch, cookie);
-	}
-error:
-	if (err)
-		dev_err(&mdev->pdev->dev, "%s %d err %d\n",
-			__func__, __LINE__, err);
-	return err;
-}
-
-/*
- * Initiates the copies across the PCIe bus from card memory to a user
- * space buffer. When transfers are done using DMA, source/destination
- * addresses and transfer length must follow the alignment requirements of
- * the MIC DMA engine.
- */
-static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
-				   size_t len, u64 daddr, size_t dlen,
-				   int vr_idx)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	void __iomem *dbuf = mdev->aper.va + daddr;
-	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
-	size_t dma_offset;
-	size_t partlen;
-	int err;
-
-	dma_offset = daddr - round_down(daddr, dma_alignment);
-	daddr -= dma_offset;
-	len += dma_offset;
-
-	while (len) {
-		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
-
-		err = mic_sync_dma(mdev, mvr->buf_da, daddr,
-				   ALIGN(partlen, dma_alignment));
-		if (err)
-			goto err;
-
-		if (copy_to_user(ubuf, mvr->buf + dma_offset,
-				 partlen - dma_offset)) {
-			err = -EFAULT;
-			goto err;
-		}
-		daddr += partlen;
-		ubuf += partlen;
-		dbuf += partlen;
-		mvdev->in_bytes_dma += partlen;
-		mvdev->in_bytes += partlen;
-		len -= partlen;
-		dma_offset = 0;
-	}
-	return 0;
-err:
-	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
-	return err;
-}
-
-/*
- * Initiates copies across the PCIe bus from a user space buffer to card
- * memory. When transfers are done using DMA, source/destination addresses
- * and transfer length must follow the alignment requirements of the MIC
- * DMA engine.
- */
-static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
-				     size_t len, u64 daddr, size_t dlen,
-				     int vr_idx)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	void __iomem *dbuf = mdev->aper.va + daddr;
-	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
-	size_t partlen;
-	int err;
-
-	if (daddr & (dma_alignment - 1)) {
-		mvdev->tx_dst_unaligned += len;
-		goto memcpy;
-	} else if (ALIGN(len, dma_alignment) > dlen) {
-		mvdev->tx_len_unaligned += len;
-		goto memcpy;
-	}
-
-	while (len) {
-		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
-
-		if (copy_from_user(mvr->buf, ubuf, partlen)) {
-			err = -EFAULT;
-			goto err;
-		}
-		err = mic_sync_dma(mdev, daddr, mvr->buf_da,
-				   ALIGN(partlen, dma_alignment));
-		if (err)
-			goto err;
-		daddr += partlen;
-		ubuf += partlen;
-		dbuf += partlen;
-		mvdev->out_bytes_dma += partlen;
-		mvdev->out_bytes += partlen;
-		len -= partlen;
-	}
-memcpy:
-	/*
-	 * We are copying to IO below and should ideally use something
-	 * like copy_from_user_toio(..) if it existed.
-	 */
-	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
-		err = -EFAULT;
-		goto err;
-	}
-	mvdev->out_bytes += len;
-	return 0;
-err:
-	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
-	return err;
-}
-
-#define MIC_VRINGH_READ true
-
-/* The function to call to notify the card about added buffers */
-static void mic_notify(struct vringh *vrh)
-{
-	struct mic_vringh *mvrh = container_of(vrh, struct mic_vringh, vrh);
-	struct mic_vdev *mvdev = mvrh->mvdev;
-	s8 db = mvdev->dc->h2c_vdev_db;
-
-	if (db != -1)
-		mvdev->mdev->ops->send_intr(mvdev->mdev, db);
-}
-
-/* Determine the total number of bytes consumed in a VRINGH KIOV */
-static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
-{
-	int i;
-	u32 total = iov->consumed;
-
-	for (i = 0; i < iov->i; i++)
-		total += iov->iov[i].iov_len;
-	return total;
-}
-
-/*
- * Traverse the VRINGH KIOV and issue the APIs to trigger the copies.
- * This API is heavily based on the vringh_iov_xfer(..) implementation
- * in vringh.c. The reason we cannot reuse vringh_iov_pull_kern(..)
- * and vringh_iov_push_kern(..) directly is because there is no
- * way to override the VRINGH xfer(..) routines as of v3.10.
- */
-static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
-			void __user *ubuf, size_t len, bool read, int vr_idx,
-			size_t *out_len)
-{
-	int ret = 0;
-	size_t partlen, tot_len = 0;
-
-	while (len && iov->i < iov->used) {
-		partlen = min(iov->iov[iov->i].iov_len, len);
-		if (read)
-			ret = mic_virtio_copy_to_user(mvdev, ubuf, partlen,
-						(u64)iov->iov[iov->i].iov_base,
-						iov->iov[iov->i].iov_len,
-						vr_idx);
-		else
-			ret = mic_virtio_copy_from_user(mvdev, ubuf, partlen,
-						(u64)iov->iov[iov->i].iov_base,
-						iov->iov[iov->i].iov_len,
-						vr_idx);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= partlen;
-		ubuf += partlen;
-		tot_len += partlen;
-		iov->consumed += partlen;
-		iov->iov[iov->i].iov_len -= partlen;
-		iov->iov[iov->i].iov_base += partlen;
-		if (!iov->iov[iov->i].iov_len) {
-			/* Fix up old iov element then increment. */
-			iov->iov[iov->i].iov_len = iov->consumed;
-			iov->iov[iov->i].iov_base -= iov->consumed;
-
-			iov->consumed = 0;
-			iov->i++;
-		}
-	}
-	*out_len = tot_len;
-	return ret;
-}
-
-/*
- * Use the standard VRINGH infrastructure in the kernel to fetch new
- * descriptors, initiate the copies and update the used ring.
- */
-static int _mic_virtio_copy(struct mic_vdev *mvdev,
-	struct mic_copy_desc *copy)
-{
-	int ret = 0;
-	u32 iovcnt = copy->iovcnt;
-	struct iovec iov;
-	struct iovec __user *u_iov = copy->iov;
-	void __user *ubuf = NULL;
-	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
-	struct vringh_kiov *riov = &mvr->riov;
-	struct vringh_kiov *wiov = &mvr->wiov;
-	struct vringh *vrh = &mvr->vrh;
-	u16 *head = &mvr->head;
-	struct mic_vring *vr = &mvr->vring;
-	size_t len = 0, out_len;
-
-	copy->out_len = 0;
-	/* Fetch a new IOVEC if all previous elements have been processed */
-	if (riov->i == riov->used && wiov->i == wiov->used) {
-		ret = vringh_getdesc_kern(vrh, riov, wiov,
-				head, GFP_KERNEL);
-		/* Check if there are available descriptors */
-		if (ret <= 0)
-			return ret;
-	}
-	while (iovcnt) {
-		if (!len) {
-			/* Copy over a new iovec from user space. */
-			ret = copy_from_user(&iov, u_iov, sizeof(*u_iov));
-			if (ret) {
-				ret = -EINVAL;
-				dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
-				break;
-			}
-			len = iov.iov_len;
-			ubuf = iov.iov_base;
-		}
-		/* Issue all the read descriptors first */
-		ret = mic_vringh_copy(mvdev, riov, ubuf, len, MIC_VRINGH_READ,
-				      copy->vr_idx, &out_len);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= out_len;
-		ubuf += out_len;
-		copy->out_len += out_len;
-		/* Issue the write descriptors next */
-		ret = mic_vringh_copy(mvdev, wiov, ubuf, len, !MIC_VRINGH_READ,
-				      copy->vr_idx, &out_len);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			break;
-		}
-		len -= out_len;
-		ubuf += out_len;
-		copy->out_len += out_len;
-		if (!len) {
-			/* One user space iovec is now completed */
-			iovcnt--;
-			u_iov++;
-		}
-		/* Exit loop if all elements in KIOVs have been processed. */
-		if (riov->i == riov->used && wiov->i == wiov->used)
-			break;
-	}
-	/*
-	 * Update the used ring if a descriptor was available and some data was
-	 * copied in/out and the user asked for a used ring update.
-	 */
-	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
-		u32 total = 0;
-
-		/* Determine the total data consumed */
-		total += mic_vringh_iov_consumed(riov);
-		total += mic_vringh_iov_consumed(wiov);
-		vringh_complete_kern(vrh, *head, total);
-		*head = USHRT_MAX;
-		if (vringh_need_notify_kern(vrh) > 0)
-			vringh_notify(vrh);
-		vringh_kiov_cleanup(riov);
-		vringh_kiov_cleanup(wiov);
-		/* Update avail idx for user space */
-		vr->info->avail_idx = vrh->last_avail_idx;
-	}
-	return ret;
-}
-
-static inline int mic_verify_copy_args(struct mic_vdev *mvdev,
-		struct mic_copy_desc *copy)
-{
-	if (copy->vr_idx >= mvdev->dd->num_vq) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-	return 0;
-}
-
-/* Copy a specified number of virtio descriptors in a chain */
-int mic_virtio_copy_desc(struct mic_vdev *mvdev,
-		struct mic_copy_desc *copy)
-{
-	int err;
-	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
-
-	err = mic_verify_copy_args(mvdev, copy);
-	if (err)
-		return err;
-
-	mutex_lock(&mvr->vr_mutex);
-	if (!mic_vdevup(mvdev)) {
-		err = -ENODEV;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-		goto err;
-	}
-	err = _mic_virtio_copy(mvdev, copy);
-	if (err) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
-	}
-err:
-	mutex_unlock(&mvr->vr_mutex);
-	return err;
-}
-
-static void mic_virtio_init_post(struct mic_vdev *mvdev)
-{
-	struct mic_vqconfig *vqconfig = mic_vq_config(mvdev->dd);
-	int i;
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		if (!le64_to_cpu(vqconfig[i].used_address)) {
-			dev_warn(mic_dev(mvdev), "used_address zero??\n");
-			continue;
-		}
-		mvdev->mvr[i].vrh.vring.used =
-			(void __force *)mvdev->mdev->aper.va +
-			le64_to_cpu(vqconfig[i].used_address);
-	}
-
-	mvdev->dc->used_address_updated = 0;
-
-	dev_dbg(mic_dev(mvdev), "%s: device type %d LINKUP\n",
-		__func__, mvdev->virtio_id);
-}
-
-static inline void mic_virtio_device_reset(struct mic_vdev *mvdev)
-{
-	int i;
-
-	dev_dbg(mic_dev(mvdev), "%s: status %d device type %d RESET\n",
-		__func__, mvdev->dd->status, mvdev->virtio_id);
-
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		/*
-		 * Avoid lockdep false positive. The + 1 is for the mic
-		 * mutex which is held in the reset devices code path.
-		 */
-		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
-
-	/* 0 status means "reset" */
-	mvdev->dd->status = 0;
-	mvdev->dc->vdev_reset = 0;
-	mvdev->dc->host_ack = 1;
-
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct vringh *vrh = &mvdev->mvr[i].vrh;
-		mvdev->mvr[i].vring.info->avail_idx = 0;
-		vrh->completed = 0;
-		vrh->last_avail_idx = 0;
-		vrh->last_used_idx = 0;
-	}
-
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_unlock(&mvdev->mvr[i].vr_mutex);
-}
-
-void mic_virtio_reset_devices(struct mic_device *mdev)
-{
-	struct list_head *pos, *tmp;
-	struct mic_vdev *mvdev;
-
-	dev_dbg(&mdev->pdev->dev, "%s\n",  __func__);
-
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		mvdev = list_entry(pos, struct mic_vdev, list);
-		mic_virtio_device_reset(mvdev);
-		mvdev->poll_wake = 1;
-		wake_up(&mvdev->waitq);
-	}
-}
-
-void mic_bh_handler(struct work_struct *work)
-{
-	struct mic_vdev *mvdev = container_of(work, struct mic_vdev,
-			virtio_bh_work);
-
-	if (mvdev->dc->used_address_updated)
-		mic_virtio_init_post(mvdev);
-
-	if (mvdev->dc->vdev_reset)
-		mic_virtio_device_reset(mvdev);
-
-	mvdev->poll_wake = 1;
-	wake_up(&mvdev->waitq);
-}
-
-static irqreturn_t mic_virtio_intr_handler(int irq, void *data)
-{
-	struct mic_vdev *mvdev = data;
-	struct mic_device *mdev = mvdev->mdev;
-
-	mdev->ops->intr_workarounds(mdev);
-	schedule_work(&mvdev->virtio_bh_work);
-	return IRQ_HANDLED;
-}
-
-int mic_virtio_config_change(struct mic_vdev *mvdev,
-			void __user *argp)
-{
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
-	int ret = 0, retry, i;
-	struct mic_bootparam *bootparam = mvdev->mdev->dp;
-	s8 db = bootparam->h2c_config_db;
-
-	mutex_lock(&mvdev->mdev->mic_mutex);
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
-
-	if (db == -1 || mvdev->dd->type == -1) {
-		ret = -EIO;
-		goto exit;
-	}
-
-	if (copy_from_user(mic_vq_configspace(mvdev->dd),
-			   argp, mvdev->dd->config_len)) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EFAULT);
-		ret = -EFAULT;
-		goto exit;
-	}
-	mvdev->dc->config_change = MIC_VIRTIO_PARAM_CONFIG_CHANGED;
-	mvdev->mdev->ops->send_intr(mvdev->mdev, db);
-
-	for (retry = 100; retry--;) {
-		ret = wait_event_timeout(wake,
-			mvdev->dc->guest_ack, msecs_to_jiffies(100));
-		if (ret)
-			break;
-	}
-
-	dev_dbg(mic_dev(mvdev),
-		"%s %d retry: %d\n", __func__, __LINE__, retry);
-	mvdev->dc->config_change = 0;
-	mvdev->dc->guest_ack = 0;
-exit:
-	for (i = 0; i < mvdev->dd->num_vq; i++)
-		mutex_unlock(&mvdev->mvr[i].vr_mutex);
-	mutex_unlock(&mvdev->mdev->mic_mutex);
-	return ret;
-}
-
-static int mic_copy_dp_entry(struct mic_vdev *mvdev,
-					void __user *argp,
-					__u8 *type,
-					struct mic_device_desc **devpage)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	struct mic_device_desc dd, *dd_config, *devp;
-	struct mic_vqconfig *vqconfig;
-	int ret = 0, i;
-	bool slot_found = false;
-
-	if (copy_from_user(&dd, argp, sizeof(dd))) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EFAULT);
-		return -EFAULT;
-	}
-
-	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
-	    dd.num_vq > MIC_MAX_VRINGS) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -EINVAL);
-		return -EINVAL;
-	}
-
-	dd_config = kmalloc(mic_desc_size(&dd), GFP_KERNEL);
-	if (dd_config == NULL) {
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, -ENOMEM);
-		return -ENOMEM;
-	}
-	if (copy_from_user(dd_config, argp, mic_desc_size(&dd))) {
-		ret = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, ret);
-		goto exit;
-	}
-
-	vqconfig = mic_vq_config(dd_config);
-	for (i = 0; i < dd.num_vq; i++) {
-		if (le16_to_cpu(vqconfig[i].num) > MIC_MAX_VRING_ENTRIES) {
-			ret =  -EINVAL;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto exit;
-		}
-	}
-
-	/* Find the first free device page entry */
-	for (i = sizeof(struct mic_bootparam);
-		i < MIC_DP_SIZE - mic_total_desc_size(dd_config);
-		i += mic_total_desc_size(devp)) {
-		devp = mdev->dp + i;
-		if (devp->type == 0 || devp->type == -1) {
-			slot_found = true;
-			break;
-		}
-	}
-	if (!slot_found) {
-		ret =  -EINVAL;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, ret);
-		goto exit;
-	}
-	/*
-	 * Save off the type before doing the memcpy. Type will be set in the
-	 * end after completing all initialization for the new device.
-	 */
-	*type = dd_config->type;
-	dd_config->type = 0;
-	memcpy(devp, dd_config, mic_desc_size(dd_config));
-
-	*devpage = devp;
-exit:
-	kfree(dd_config);
-	return ret;
-}
-
-static void mic_init_device_ctrl(struct mic_vdev *mvdev,
-				struct mic_device_desc *devpage)
-{
-	struct mic_device_ctrl *dc;
-
-	dc = (void *)devpage + mic_aligned_desc_size(devpage);
-
-	dc->config_change = 0;
-	dc->guest_ack = 0;
-	dc->vdev_reset = 0;
-	dc->host_ack = 0;
-	dc->used_address_updated = 0;
-	dc->c2h_vdev_db = -1;
-	dc->h2c_vdev_db = -1;
-	mvdev->dc = dc;
-}
-
-int mic_virtio_add_device(struct mic_vdev *mvdev,
-			void __user *argp)
-{
-	struct mic_device *mdev = mvdev->mdev;
-	struct mic_device_desc *dd = NULL;
-	struct mic_vqconfig *vqconfig;
-	int vr_size, i, j, ret;
-	u8 type = 0;
-	s8 db;
-	char irqname[10];
-	struct mic_bootparam *bootparam = mdev->dp;
-	u16 num;
-	dma_addr_t vr_addr;
-
-	mutex_lock(&mdev->mic_mutex);
-
-	ret = mic_copy_dp_entry(mvdev, argp, &type, &dd);
-	if (ret) {
-		mutex_unlock(&mdev->mic_mutex);
-		return ret;
-	}
-
-	mic_init_device_ctrl(mvdev, dd);
-
-	mvdev->dd = dd;
-	mvdev->virtio_id = type;
-	vqconfig = mic_vq_config(dd);
-	INIT_WORK(&mvdev->virtio_bh_work, mic_bh_handler);
-
-	for (i = 0; i < dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-		struct mic_vring *vr = &mvdev->mvr[i].vring;
-		num = le16_to_cpu(vqconfig[i].num);
-		mutex_init(&mvr->vr_mutex);
-		vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) +
-			sizeof(struct _mic_vring_info));
-		vr->va = (void *)
-			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-					 get_order(vr_size));
-		if (!vr->va) {
-			ret = -ENOMEM;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vr->len = vr_size;
-		vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN);
-		vr->info->magic = cpu_to_le32(MIC_MAGIC + mvdev->virtio_id + i);
-		vr_addr = mic_map_single(mdev, vr->va, vr_size);
-		if (mic_map_error(vr_addr)) {
-			free_pages((unsigned long)vr->va, get_order(vr_size));
-			ret = -ENOMEM;
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vqconfig[i].address = cpu_to_le64(vr_addr);
-
-		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
-		ret = vringh_init_kern(&mvr->vrh,
-			*(u32 *)mic_vq_features(mvdev->dd), num, false,
-			vr->vr.desc, vr->vr.avail, vr->vr.used);
-		if (ret) {
-			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-				__func__, __LINE__, ret);
-			goto err;
-		}
-		vringh_kiov_init(&mvr->riov, NULL, 0);
-		vringh_kiov_init(&mvr->wiov, NULL, 0);
-		mvr->head = USHRT_MAX;
-		mvr->mvdev = mvdev;
-		mvr->vrh.notify = mic_notify;
-		dev_dbg(&mdev->pdev->dev,
-			"%s %d index %d va %p info %p vr_size 0x%x\n",
-			__func__, __LINE__, i, vr->va, vr->info, vr_size);
-		mvr->buf = (void *)__get_free_pages(GFP_KERNEL,
-					get_order(MIC_INT_DMA_BUF_SIZE));
-		mvr->buf_da = mic_map_single(mvdev->mdev, mvr->buf,
-					  MIC_INT_DMA_BUF_SIZE);
-	}
-
-	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
-		 mvdev->virtio_id);
-	mvdev->virtio_db = mic_next_db(mdev);
-	mvdev->virtio_cookie = mic_request_threaded_irq(mdev,
-					       mic_virtio_intr_handler,
-					       NULL, irqname, mvdev,
-					       mvdev->virtio_db, MIC_INTR_DB);
-	if (IS_ERR(mvdev->virtio_cookie)) {
-		ret = PTR_ERR(mvdev->virtio_cookie);
-		dev_dbg(&mdev->pdev->dev, "request irq failed\n");
-		goto err;
-	}
-
-	mvdev->dc->c2h_vdev_db = mvdev->virtio_db;
-
-	list_add_tail(&mvdev->list, &mdev->vdev_list);
-	/*
-	 * Order the type update with previous stores. This write barrier
-	 * is paired with the corresponding read barrier before the uncached
-	 * system memory read of the type, on the card while scanning the
-	 * device page.
-	 */
-	smp_wmb();
-	dd->type = type;
-
-	dev_dbg(&mdev->pdev->dev, "Added virtio device id %d\n", dd->type);
-
-	db = bootparam->h2c_config_db;
-	if (db != -1)
-		mdev->ops->send_intr(mdev, db);
-	mutex_unlock(&mdev->mic_mutex);
-	return 0;
-err:
-	vqconfig = mic_vq_config(dd);
-	for (j = 0; j < i; j++) {
-		struct mic_vringh *mvr = &mvdev->mvr[j];
-		mic_unmap_single(mdev, le64_to_cpu(vqconfig[j].address),
-				 mvr->vring.len);
-		free_pages((unsigned long)mvr->vring.va,
-			   get_order(mvr->vring.len));
-	}
-	mutex_unlock(&mdev->mic_mutex);
-	return ret;
-}
-
-void mic_virtio_del_device(struct mic_vdev *mvdev)
-{
-	struct list_head *pos, *tmp;
-	struct mic_vdev *tmp_mvdev;
-	struct mic_device *mdev = mvdev->mdev;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
-	int i, ret, retry;
-	struct mic_vqconfig *vqconfig;
-	struct mic_bootparam *bootparam = mdev->dp;
-	s8 db;
-
-	mutex_lock(&mdev->mic_mutex);
-	db = bootparam->h2c_config_db;
-	if (db == -1)
-		goto skip_hot_remove;
-	dev_dbg(&mdev->pdev->dev,
-		"Requesting hot remove id %d\n", mvdev->virtio_id);
-	mvdev->dc->config_change = MIC_VIRTIO_PARAM_DEV_REMOVE;
-	mdev->ops->send_intr(mdev, db);
-	for (retry = 100; retry--;) {
-		ret = wait_event_timeout(wake,
-			mvdev->dc->guest_ack, msecs_to_jiffies(100));
-		if (ret)
-			break;
-	}
-	dev_dbg(&mdev->pdev->dev,
-		"Device id %d config_change %d guest_ack %d retry %d\n",
-		mvdev->virtio_id, mvdev->dc->config_change,
-		mvdev->dc->guest_ack, retry);
-	mvdev->dc->config_change = 0;
-	mvdev->dc->guest_ack = 0;
-skip_hot_remove:
-	mic_free_irq(mdev, mvdev->virtio_cookie, mvdev);
-	flush_work(&mvdev->virtio_bh_work);
-	vqconfig = mic_vq_config(mvdev->dd);
-	for (i = 0; i < mvdev->dd->num_vq; i++) {
-		struct mic_vringh *mvr = &mvdev->mvr[i];
-
-		mic_unmap_single(mvdev->mdev, mvr->buf_da,
-				 MIC_INT_DMA_BUF_SIZE);
-		free_pages((unsigned long)mvr->buf,
-			   get_order(MIC_INT_DMA_BUF_SIZE));
-		vringh_kiov_cleanup(&mvr->riov);
-		vringh_kiov_cleanup(&mvr->wiov);
-		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
-				 mvr->vring.len);
-		free_pages((unsigned long)mvr->vring.va,
-			   get_order(mvr->vring.len));
-	}
-
-	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
-		tmp_mvdev = list_entry(pos, struct mic_vdev, list);
-		if (tmp_mvdev == mvdev) {
-			list_del(pos);
-			dev_dbg(&mdev->pdev->dev,
-				"Removing virtio device id %d\n",
-				mvdev->virtio_id);
-			break;
-		}
-	}
-	/*
-	 * Order the type update with previous stores. This write barrier
-	 * is paired with the corresponding read barrier before the uncached
-	 * system memory read of the type, on the card while scanning the
-	 * device page.
-	 */
-	smp_wmb();
-	mvdev->dd->type = -1;
-	mutex_unlock(&mdev->mic_mutex);
-}
-- 
1.8.2.1

^ permalink raw reply related	[relevance 29%]

* [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver
@ 2016-02-02  4:23 86% Sudeep Dutt
  2016-02-02  4:23 29% ` [PATCH char-misc-next 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Sudeep Dutt @ 2016-02-02  4:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

This patch series moves virtio functionality from the MIC host/card
driver into a separate, hardware-independent Virtio Over PCIe (VOP)
driver. Apart from being moved into a separate driver, the functionality
is essentially unchanged. The refactoring lets this hardware-independent
logic be shared easily across multiple generations of MIC devices. The
original commits are listed below for reference:
- commit f69bcbf3b4c4 ("Intel MIC Host Driver Changes for Virtio Devices.")
in drivers/misc/mic/host/mic_virtio.c
- commit 2141c7c5ee67 ("Intel MIC Card Driver Changes for Virtio Devices.")
in drivers/misc/mic/card/mic_virtio.c

The patch series is partitioned as follows:
1) Removes MIC X100 host virtio functionality
2) Removes MIC X100 card virtio functionality
3) Enables the Virtio Over PCIe (VOP) bus, which abstracts
   low-level hardware details such as interrupts and remote memory
   mapping, so that the same VOP driver works unchanged with
   different MIC host or card drivers as long as the hardware
   bus operations are implemented.
4) Adds data structures for the VOP driver
5) Enables VOP host side functionality
6) Enables VOP card side functionality
7) Enables VOP debugfs and driver build
8) Implements the MIC host and card driver changes to enable VOP
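
The bus abstraction described in item 3 can be pictured as a table of
function pointers that each hardware driver fills in. The following is
only a userspace sketch under that assumption: every name in it
(demo_hw_ops, demo_vop_dev, demo_x100 and so on) is hypothetical and is
not taken from vop_bus.h, and the two operations shown are merely
stand-ins for "interrupts and mapping remote memory".

```c
#include <stddef.h>

/* Hypothetical per-hardware operations; names are illustrative only. */
struct demo_hw_ops {
	int (*send_intr)(void *hw, int db);		/* ring a doorbell */
	void *(*map_remote)(void *hw, size_t offset);	/* map peer memory */
};

/* Generic, hardware-independent device that only sees the ops table. */
struct demo_vop_dev {
	const struct demo_hw_ops *ops;
	void *hw;	/* opaque handle owned by the hardware driver */
};

/* Generic code path: no knowledge of which hardware sits underneath. */
static int demo_vop_notify(struct demo_vop_dev *vdev, int db)
{
	if (!vdev->ops || !vdev->ops->send_intr)
		return -1;
	return vdev->ops->send_intr(vdev->hw, db);
}

/* One hypothetical hardware backend implementing the operations. */
struct demo_x100 {
	int last_db;
	unsigned char aperture[64];
};

static int demo_x100_send_intr(void *hw, int db)
{
	struct demo_x100 *x = hw;

	x->last_db = db;	/* a real driver would write a doorbell register */
	return 0;
}

static void *demo_x100_map_remote(void *hw, size_t offset)
{
	struct demo_x100 *x = hw;

	/* Reject offsets outside the (fake) aperture. */
	return offset < sizeof(x->aperture) ? &x->aperture[offset] : NULL;
}

static const struct demo_hw_ops demo_x100_ops = {
	.send_intr = demo_x100_send_intr,
	.map_remote = demo_x100_map_remote,
};
```

In this sketch a second hardware generation would supply its own
demo_hw_ops table, and demo_vop_notify() would stay untouched, which is
the property the cover letter is after.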

Ashutosh Dixit (1):
  misc: mic: Enable VOP card side functionality

Sudeep Dutt (7):
  misc: mic: Remove MIC X100 host virtio functionality
  misc: mic: Remove MIC X100 card virtio functionality
  misc: mic: MIC VOP Bus
  misc: mic: Add data structures for the VOP driver
  misc: mic: Enable VOP host side functionality
  misc: mic: Enable VOP debugfs and driver build
  misc: mic: MIC host and card driver changes to enable VOP

 Documentation/mic/mic_overview.txt                 |   54 +-
 drivers/misc/mic/Kconfig                           |   44 +-
 drivers/misc/mic/Makefile                          |    1 +
 drivers/misc/mic/bus/Makefile                      |    1 +
 drivers/misc/mic/card/Makefile                     |    1 -
 drivers/misc/mic/host/Makefile                     |    2 -
 drivers/misc/mic/vop/Makefile                      |    9 +
 drivers/misc/mic/bus/vop_bus.h                     |  142 +++
 drivers/misc/mic/card/mic_device.h                 |    3 +
 drivers/misc/mic/card/mic_virtio.h                 |   76 --
 drivers/misc/mic/host/mic_device.h                 |    9 +-
 drivers/misc/mic/host/mic_fops.h                   |   32 -
 .../misc/mic/{host/mic_virtio.h => vop/vop_main.h} |  129 ++-
 Documentation/mic/mpssd/mpssd.c                    |    2 +-
 drivers/misc/mic/bus/vop_bus.c                     |  204 ++++
 drivers/misc/mic/card/mic_device.c                 |   89 +-
 drivers/misc/mic/card/mic_virtio.c                 |  634 -----------
 drivers/misc/mic/card/mic_x100.c                   |    1 +
 drivers/misc/mic/host/mic_boot.c                   |  127 ++-
 drivers/misc/mic/host/mic_debugfs.c                |  190 ----
 drivers/misc/mic/host/mic_fops.c                   |  222 ----
 drivers/misc/mic/host/mic_main.c                   |   49 +-
 drivers/misc/mic/host/mic_virtio.c                 |  811 --------------
 drivers/misc/mic/vop/vop_debugfs.c                 |  232 ++++
 drivers/misc/mic/vop/vop_main.c                    |  755 +++++++++++++
 drivers/misc/mic/vop/vop_vringh.c                  | 1164 ++++++++++++++++++++
 Documentation/mic/mpssd/mpss                       |    2 +-
 27 files changed, 2869 insertions(+), 2116 deletions(-)
 create mode 100644 drivers/misc/mic/vop/Makefile
 create mode 100644 drivers/misc/mic/bus/vop_bus.h
 delete mode 100644 drivers/misc/mic/card/mic_virtio.h
 delete mode 100644 drivers/misc/mic/host/mic_fops.h
 rename drivers/misc/mic/{host/mic_virtio.h => vop/vop_main.h} (58%)
 create mode 100644 drivers/misc/mic/bus/vop_bus.c
 delete mode 100644 drivers/misc/mic/card/mic_virtio.c
 delete mode 100644 drivers/misc/mic/host/mic_fops.c
 delete mode 100644 drivers/misc/mic/host/mic_virtio.c
 create mode 100644 drivers/misc/mic/vop/vop_debugfs.c
 create mode 100644 drivers/misc/mic/vop/vop_main.c
 create mode 100644 drivers/misc/mic/vop/vop_vringh.c

-- 
1.8.2.1

^ permalink raw reply	[relevance 86%]

* Re: [PATCH] misc: mic: fix incorrect use of error codes in SCIF DMA driver
  @ 2015-12-12  2:50 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-12-12  2:50 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Sudeep Dutt, ashutosh.dixit, nikhil.rao, gregkh, linux-kernel

On Fri, 2015-12-11 at 20:09 -0600, Eric Biggers wrote:
> The error code passed to ERR_PTR() always should be negated.  Also, the
> return value of scif_add_mmu_notifier() was never checked.
> 

Thanks for the patch Eric.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

> Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
> ---
>  drivers/misc/mic/scif/scif_dma.c | 14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c
> index 95a13c6..f6aeebd 100644
> --- a/drivers/misc/mic/scif/scif_dma.c
> +++ b/drivers/misc/mic/scif/scif_dma.c
> @@ -276,13 +276,10 @@ static struct scif_mmu_notif *
>  scif_find_mmu_notifier(struct mm_struct *mm, struct scif_endpt_rma_info *rma)
>  {
>  	struct scif_mmu_notif *mmn;
> -	struct list_head *item;
>  
> -	list_for_each(item, &rma->mmn_list) {
> -		mmn = list_entry(item, struct scif_mmu_notif, list);
> +	list_for_each_entry(mmn, &rma->mmn_list, list)
>  		if (mmn->mm == mm)
>  			return mmn;
> -	}
>  	return NULL;
>  }
>  
> @@ -293,13 +290,12 @@ scif_add_mmu_notifier(struct mm_struct *mm, struct scif_endpt *ep)
>  		 = kzalloc(sizeof(*mmn), GFP_KERNEL);
>  
>  	if (!mmn)
> -		return ERR_PTR(ENOMEM);
> +		return ERR_PTR(-ENOMEM);
>  
>  	scif_init_mmu_notifier(mmn, current->mm, ep);
> -	if (mmu_notifier_register(&mmn->ep_mmu_notifier,
> -				  current->mm)) {
> +	if (mmu_notifier_register(&mmn->ep_mmu_notifier, current->mm)) {
>  		kfree(mmn);
> -		return ERR_PTR(EBUSY);
> +		return ERR_PTR(-EBUSY);
>  	}
>  	list_add(&mmn->list, &ep->rma_info.mmn_list);
>  	return mmn;
> @@ -1730,7 +1726,7 @@ static int scif_rma_copy(scif_epd_t epd, off_t loffset, unsigned long addr,
>  		mutex_lock(&ep->rma_info.mmn_lock);
>  		mmn = scif_find_mmu_notifier(current->mm, &ep->rma_info);
>  		if (!mmn)
> -			scif_add_mmu_notifier(current->mm, ep);
> +			mmn = scif_add_mmu_notifier(current->mm, ep);
>  		mutex_unlock(&ep->rma_info.mmn_lock);
>  		if (IS_ERR(mmn)) {
>  			scif_put_peer_dev(spdev);



^ permalink raw reply	[relevance 99%]
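
Why the sign matters in the ERR_PTR() fix above can be shown with a
small userspace model. This is a sketch, not the kernel's err.h: the
demo_* helpers are hypothetical re-implementations that assume, as the
kernel does, that error pointers occupy the top 4095 values of the
address space.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode an errno value in a pointer (model of ERR_PTR). */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from a pointer (model of PTR_ERR). */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Model of IS_ERR: only the last DEMO_MAX_ERRNO addresses count as
 * errors, i.e. values that are small negative numbers when viewed
 * as signed integers.
 */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}
```

Under this model a positive code such as demo_err_ptr(ENOMEM) yields a
small, non-NULL pointer that demo_is_err() does not flag, so a caller
would go on to dereference it. A NULL pointer is not flagged either,
which is why the unassigned return value of scif_add_mmu_notifier()
also slipped past the IS_ERR() check before the patch.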

* Re: [PATCH char-misc-next 4/4] misc: mic: use common error path
  @ 2015-12-12  2:46 99%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-12-12  2:46 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Sudeep Dutt, Greg Kroah-Hartman, linux-kernel, Dixit, Ashutosh

On Mon, 2015-11-23 at 17:24 +0530, Sudip Mukherjee wrote:
> Instead of calling release_firmware() on every error and then jumping,
> let's have a common release_firmware() in the error path.
> This patch also fixes a memory leak where we missed release_firmware()
> if mic_x100_load_command_line() fails.
> 

Thanks for this patch series Sudip. All 4 patches look good.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>


> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
> ---
>  drivers/misc/mic/host/mic_x100.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
> index 37fa898..82a973c 100644
> --- a/drivers/misc/mic/host/mic_x100.c
> +++ b/drivers/misc/mic/host/mic_x100.c
> @@ -463,8 +463,7 @@ mic_x100_load_firmware(struct mic_device *mdev, const char *buf)
>  		rc = -EINVAL;
>  		dev_err(&mdev->pdev->dev, "%s %d rc %d bootaddr 0x%x\n",
>  			__func__, __LINE__, rc, mdev->bootaddr);
> -		release_firmware(fw);
> -		goto done;
> +		goto error;
>  	}
>  	memcpy_toio(mdev->aper.va + mdev->bootaddr, fw->data, fw->size);
>  	mdev->ops->write_spad(mdev, MIC_X100_FW_SIZE, fw->size);
> @@ -472,22 +471,24 @@ mic_x100_load_firmware(struct mic_device *mdev, const char *buf)
>  		rc = -EINVAL;
>  		dev_err(&mdev->pdev->dev, "%s %d rc %d\n",
>  			__func__, __LINE__, rc);
> -		release_firmware(fw);
> -		goto done;
> +		goto error;
>  	}
>  	/* load command line */
>  	rc = mic_x100_load_command_line(mdev, fw);
>  	if (rc) {
>  		dev_err(&mdev->pdev->dev, "%s %d rc %d\n",
>  			__func__, __LINE__, rc);
> -		goto done;
> +		goto error;
>  	}
>  	release_firmware(fw);
>  	/* load ramdisk */
>  	if (mdev->cosm_dev->ramdisk)
>  		rc = mic_x100_load_ramdisk(mdev);
>  
> -done:
> +	return rc;
> +
> +error:
> +	release_firmware(fw);
>  	return rc;
>  }
>  



^ permalink raw reply	[relevance 99%]

* Re: [PATCH 1/3] misc:scif: do not define list_entry_next
  @ 2015-11-18 16:17 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-11-18 16:17 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sudeep Dutt, Greg Kroah-Hartman, Dan Carpenter, linux-kernel,
	Sergey Senozhatsky

On Wed, 2015-11-18 at 21:10 +0900, Sergey Senozhatsky wrote:
> Cosmetic.
> 
> Do not define list_entry_next() and use list_next_entry()
> from list.h.
> 

Hi Sergey,

A similar patch was posted @ https://lkml.org/lkml/2015/11/16/328

Regards,
Sudeep Dutt

> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  drivers/misc/mic/scif/scif_dma.c  | 27 +++++++++++----------------
>  drivers/misc/mic/scif/scif_main.h |  1 +
>  2 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c
> index 95a13c6..8804bcc 100644
> --- a/drivers/misc/mic/scif/scif_dma.c
> +++ b/drivers/misc/mic/scif/scif_dma.c
> @@ -74,11 +74,6 @@ struct scif_copy_work {
>  	bool ordered;
>  };
>  
> -#ifndef list_entry_next
> -#define list_entry_next(pos, member) \
> -	list_entry(pos->member.next, typeof(*pos), member)
> -#endif
> -
>  /**
>   * scif_reserve_dma_chan:
>   * @ep: Endpoint Descriptor.
> @@ -851,7 +846,7 @@ static void scif_rma_local_cpu_copy(s64 offset, struct scif_window *window,
>  		(window->nr_pages << PAGE_SHIFT);
>  	while (rem_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -957,7 +952,7 @@ scif_rma_list_dma_copy_unaligned(struct scif_copy_work *work,
>  	remaining_len -= tail_len;
>  	while (remaining_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -1064,7 +1059,7 @@ scif_rma_list_dma_copy_unaligned(struct scif_copy_work *work,
>  	}
>  	if (tail_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -1147,13 +1142,13 @@ static int _scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  		(dst_window->nr_pages << PAGE_SHIFT);
>  	while (remaining_len) {
>  		if (src_offset == end_src_offset) {
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  			end_src_offset = src_window->offset +
>  				(src_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(src_window, &src_win_iter);
>  		}
>  		if (dst_offset == end_dst_offset) {
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(dst_window, &dst_win_iter);
> @@ -1314,13 +1309,13 @@ static int scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  	remaining_len -= tail_len;
>  	while (remaining_len) {
>  		if (src_offset == end_src_offset) {
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  			end_src_offset = src_window->offset +
>  				(src_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(src_window, &src_win_iter);
>  		}
>  		if (dst_offset == end_dst_offset) {
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(dst_window, &dst_win_iter);
> @@ -1405,9 +1400,9 @@ static int scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  	if (remaining_len) {
>  		loop_len = remaining_len;
>  		if (src_offset == end_src_offset)
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  		if (dst_offset == end_dst_offset)
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  
>  		src_dma_addr = __scif_off_to_dma_addr(src_window, src_offset);
>  		dst_dma_addr = __scif_off_to_dma_addr(dst_window, dst_offset);
> @@ -1550,12 +1545,12 @@ static int scif_rma_list_cpu_copy(struct scif_copy_work *work)
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			if (src_offset == end_src_offset) {
> -				src_window = list_entry_next(src_window, list);
> +				src_window = list_next_entry(src_window, list);
>  				scif_init_window_iter(src_window,
>  						      &src_win_iter);
>  			}
>  			if (dst_offset == end_dst_offset) {
> -				dst_window = list_entry_next(dst_window, list);
> +				dst_window = list_next_entry(dst_window, list);
>  				scif_init_window_iter(dst_window,
>  						      &dst_win_iter);
>  			}
> diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
> index a08f0b6..9082f83 100644
> --- a/drivers/misc/mic/scif/scif_main.h
> +++ b/drivers/misc/mic/scif/scif_main.h
> @@ -27,6 +27,7 @@
>  #include <linux/file.h>
>  #include <linux/vmalloc.h>
>  #include <linux/scif.h>
> +#include <linux/list.h>
>  #include "../common/mic_dev.h"
>  
>  #define SCIF_MGMT_NODE 0
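
For reference, list_next_entry() takes the same (pos, member) arguments
as the local macro being removed. A minimal userspace sketch, assuming
simplified stand-ins for the <linux/list.h> definitions and a compiler
that supports __typeof__ (GCC or Clang); struct demo_window and the
demo_* helpers are hypothetical:

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel's <linux/list.h> definitions. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
/* Same argument order as the removed list_entry_next(): (pos, member). */
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)

/* Hypothetical element type; only the embedded list_head matters. */
struct demo_window {
	long offset;
	struct list_head list;
};

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Builds head -> a -> b and returns the offset of the entry after a. */
static long demo_next_offset(void)
{
	struct list_head head = { &head, &head };
	struct demo_window a = { .offset = 0 };
	struct demo_window b = { .offset = 4096 };

	demo_list_add_tail(&a.list, &head);
	demo_list_add_tail(&b.list, &head);
	return list_next_entry(&a, list)->offset;
}
```

As with the removed macro, nothing here checks whether the next node is
the list head, so callers still need to know another entry exists.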




* Re: [PATCH 1/3] misc: mic/scif: use list_next_entry instead of list_entry_next
  @ 2015-11-16 17:08 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-11-16 17:08 UTC (permalink / raw)
  To: Geliang Tang; +Cc: Sudeep Dutt, Greg Kroah-Hartman, linux-kernel

On Mon, 2015-11-16 at 21:46 +0800, Geliang Tang wrote:
> list_next_entry has been defined in list.h, so I replace list_entry_next
> with it.
> 

Thanks for the cleanup.

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

> Signed-off-by: Geliang Tang <geliangtang@163.com>
> ---
>  drivers/misc/mic/scif/scif_dma.c | 27 +++++++++++----------------
>  1 file changed, 11 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c
> index 95a13c6..8804bcc 100644
> --- a/drivers/misc/mic/scif/scif_dma.c
> +++ b/drivers/misc/mic/scif/scif_dma.c
> @@ -74,11 +74,6 @@ struct scif_copy_work {
>  	bool ordered;
>  };
>  
> -#ifndef list_entry_next
> -#define list_entry_next(pos, member) \
> -	list_entry(pos->member.next, typeof(*pos), member)
> -#endif
> -
>  /**
>   * scif_reserve_dma_chan:
>   * @ep: Endpoint Descriptor.
> @@ -851,7 +846,7 @@ static void scif_rma_local_cpu_copy(s64 offset, struct scif_window *window,
>  		(window->nr_pages << PAGE_SHIFT);
>  	while (rem_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -957,7 +952,7 @@ scif_rma_list_dma_copy_unaligned(struct scif_copy_work *work,
>  	remaining_len -= tail_len;
>  	while (remaining_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -1064,7 +1059,7 @@ scif_rma_list_dma_copy_unaligned(struct scif_copy_work *work,
>  	}
>  	if (tail_len) {
>  		if (offset == end_offset) {
> -			window = list_entry_next(window, list);
> +			window = list_next_entry(window, list);
>  			end_offset = window->offset +
>  				(window->nr_pages << PAGE_SHIFT);
>  		}
> @@ -1147,13 +1142,13 @@ static int _scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  		(dst_window->nr_pages << PAGE_SHIFT);
>  	while (remaining_len) {
>  		if (src_offset == end_src_offset) {
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  			end_src_offset = src_window->offset +
>  				(src_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(src_window, &src_win_iter);
>  		}
>  		if (dst_offset == end_dst_offset) {
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(dst_window, &dst_win_iter);
> @@ -1314,13 +1309,13 @@ static int scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  	remaining_len -= tail_len;
>  	while (remaining_len) {
>  		if (src_offset == end_src_offset) {
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  			end_src_offset = src_window->offset +
>  				(src_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(src_window, &src_win_iter);
>  		}
>  		if (dst_offset == end_dst_offset) {
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			scif_init_window_iter(dst_window, &dst_win_iter);
> @@ -1405,9 +1400,9 @@ static int scif_rma_list_dma_copy_aligned(struct scif_copy_work *work,
>  	if (remaining_len) {
>  		loop_len = remaining_len;
>  		if (src_offset == end_src_offset)
> -			src_window = list_entry_next(src_window, list);
> +			src_window = list_next_entry(src_window, list);
>  		if (dst_offset == end_dst_offset)
> -			dst_window = list_entry_next(dst_window, list);
> +			dst_window = list_next_entry(dst_window, list);
>  
>  		src_dma_addr = __scif_off_to_dma_addr(src_window, src_offset);
>  		dst_dma_addr = __scif_off_to_dma_addr(dst_window, dst_offset);
> @@ -1550,12 +1545,12 @@ static int scif_rma_list_cpu_copy(struct scif_copy_work *work)
>  			end_dst_offset = dst_window->offset +
>  				(dst_window->nr_pages << PAGE_SHIFT);
>  			if (src_offset == end_src_offset) {
> -				src_window = list_entry_next(src_window, list);
> +				src_window = list_next_entry(src_window, list);
>  				scif_init_window_iter(src_window,
>  						      &src_win_iter);
>  			}
>  			if (dst_offset == end_dst_offset) {
> -				dst_window = list_entry_next(dst_window, list);
> +				dst_window = list_next_entry(dst_window, list);
>  				scif_init_window_iter(dst_window,
>  						      &dst_win_iter);
>  			}




* Re: [patch 2/2] misc: mic/scif: fix wrap around tests
    2015-10-11  9:14 99% ` Sudeep Dutt
@ 2015-10-14  3:21 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-14  3:21 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sudeep Dutt, Nikhil Rao, Greg Kroah-Hartman, linux-kernel,
	kernel-janitors

On Fri, 2015-10-09 at 09:40 +0300, Dan Carpenter wrote:
> Signed integer overflow is undefined.  Also I added a check for
> "(offset < 0)" in scif_unregister() because that makes it match the
> other conditions and because I didn't want to subtract a negative.
> 
> Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> 
> Imagine you are on 64 bit and len is larger than INT_MAX << 12, it means
> that we truncate it because scif_get_window_offset() takes an integer
> argument.  I don't know if this is an issue.  Maybe I should use
> INT_MAX instead of LONG_MAX?  I am working on a static checker warning
> for these types of issues:
> drivers/misc/mic/scif/scif_rma.c:1631 scif_register() warn: truncating user data 'len >> 12' '0-4503599627370495'
> drivers/misc/mic/scif/scif_rma.c:1643 scif_register() warn: truncating user data 'len >> 12' '0-4503599627370495'
> 
> The other static warnings here are:
> 
> drivers/misc/mic/scif/scif_rma.c:745 scif_unregister_window() warn: inconsistent returns 'mutex:&ep->rma_info.rma_lock'.
>   Locked on:   line 745
>   Unlocked on: line 687
> drivers/misc/mic/scif/scif_rma.c:1463 scif_unpin_pages() warn: passing __func__ while the format string already contains the name of the function 'scif_unpin_pages'
> 
> diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
> index bc2dccb..fea7d2c 100644
> --- a/drivers/misc/mic/scif/scif_rma.c
> +++ b/drivers/misc/mic/scif/scif_rma.c
> @@ -1510,7 +1510,7 @@ off_t scif_register_pinned_pages(scif_epd_t epd,
>  	if ((map_flags & SCIF_MAP_FIXED) &&
>  	    ((ALIGN(offset, PAGE_SIZE) != offset) ||
>  	    (offset < 0) ||
> -	    (offset + (off_t)len < offset)))
> +	    (len > LONG_MAX - offset)))
>  		return -EINVAL;
>  
>  	might_sleep();
> @@ -1613,7 +1613,7 @@ off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
>  	if ((map_flags & SCIF_MAP_FIXED) &&
>  	    ((ALIGN(offset, PAGE_SIZE) != offset) ||
>  	    (offset < 0) ||
> -	    (offset + (off_t)len < offset)))
> +	    (len < LONG_MAX - offset)))

Hi Dan,
Should this be > instead of < like the others?
Thanks,
Sudeep Dutt

>  		return -EINVAL;
>  
>  	/* Unsupported protection requested */
> @@ -1731,7 +1731,8 @@ scif_unregister(scif_epd_t epd, off_t offset, size_t len)
>  
>  	/* Offset is not page aligned or offset+len wraps around */
>  	if ((ALIGN(offset, PAGE_SIZE) != offset) ||
> -	    (offset + (off_t)len < offset))
> +	    (offset < 0) ||
> +	    (len > LONG_MAX - offset))
>  		return -EINVAL;
>  
>  	err = scif_verify_epd(ep);




* [PATCH char-misc-next v2 1/1] MAINTAINERS: Add maintainership for MIC drivers
@ 2015-10-13 18:21 99% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-13 18:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Joe Perches, Ashutosh Dixit, Sudeep Dutt

Add entry for MIC drivers to the MAINTAINERS file

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 MAINTAINERS | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fe5bbfe..994a7ae 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5575,6 +5575,22 @@ F:	include/linux/mei_cl_bus.h
 F:	drivers/misc/mei/*
 F:	Documentation/misc-devices/mei/*
 
+INTEL MIC DRIVERS (mic)
+M:	Sudeep Dutt <sudeep.dutt@intel.com>
+M:	Ashutosh Dixit <ashutosh.dixit@intel.com>
+S:	Supported
+W:	https://github.com/sudeepdutt/mic
+W:	http://software.intel.com/en-us/mic-developer
+F:	include/linux/mic_bus.h
+F:	include/linux/scif.h
+F:	include/uapi/linux/mic_common.h
+F:	include/uapi/linux/mic_ioctl.h
+F:	include/uapi/linux/scif_ioctl.h
+F:	drivers/misc/mic/
+F:	drivers/dma/mic_x100_dma.c
+F:	drivers/dma/mic_x100_dma.h
+F:	Documentation/mic/
+
 INTEL PMC IPC DRIVER
 M:	Zha Qipeng<qipeng.zha@intel.com>
 L:	platform-driver-x86@vger.kernel.org
-- 
1.8.2.1



* [PATCH char-misc-next 1/1] MAINTAINERS: Add maintainership for MIC drivers
@ 2015-10-13 17:58 99% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-13 17:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Sudeep Dutt

Add entry for MIC drivers to the MAINTAINERS file

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 MAINTAINERS | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fe5bbfe..840993b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5575,6 +5575,21 @@ F:	include/linux/mei_cl_bus.h
 F:	drivers/misc/mei/*
 F:	Documentation/misc-devices/mei/*
 
+INTEL MIC DRIVERS (mic)
+M:	Sudeep Dutt <sudeep.dutt@intel.com>
+M:	Ashutosh Dixit <ashutosh.dixit@intel.com>
+S:	Supported
+W:	https://github.com/sudeepdutt/mic
+W:	http://software.intel.com/en-us/mic-developer
+F:	include/linux/mic_bus.h
+F:	include/linux/scif.h
+F:	include/uapi/linux/mic_common.h
+F:	include/uapi/linux/mic_ioctl.h
+F:	include/uapi/linux/scif_ioctl.h
+F:	drivers/misc/mic/*
+F:	drivers/dma/mic_x100_dma*
+F:	Documentation/mic/*
+
 INTEL PMC IPC DRIVER
 M:	Zha Qipeng<qipeng.zha@intel.com>
 L:	platform-driver-x86@vger.kernel.org
-- 
1.8.2.1



* Re: [PATCH] misc: mic: fix memory leak
  @ 2015-10-13 17:25 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-13 17:25 UTC (permalink / raw)
  To: Sudip Mukherjee; +Cc: Sudeep Dutt, Greg Kroah-Hartman, linux-kernel

On Tue, 2015-10-13 at 22:46 +0530, Sudip Mukherjee wrote:
> On Tue, Oct 13, 2015 at 09:34:21AM -0700, Sudeep Dutt wrote:
> > On Tue, 2015-10-13 at 14:38 +0530, Sudip Mukherjee wrote:
> <snip>
> > > +	struct scif_hw_dev *sdev = scifdev->sdev;
> > > +
> > > +	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
> > > +		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
> > 
> > Hi Sudip,
> > 
> > You missed unmapping the aperture here:
> > 
> > dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
> > 	     p2p->sg_nentries[SCIF_PPI_APER], DMA_BIDIRECTIONAL);
> > 
> > Can you please resubmit this patch with the change above?
> 
> Hi Sudeep,
> v3 sent for your review.   
> > 
> > Also you should mention [PATCH char-misc-next] in the next revision.
> 
> Ohhh... yes, I keep forgetting that Fengguang will need this tag for
> his build bot.
> 
> But I have one more doubt. If scif_nodeqp_send() fails then shouldn't we
> remove these two nodes of the p2p connection from the lists and release
> the resources?

The resources will get cleaned up via scif_destroy_p2p(..) once they are
added to the lists.

Sudeep Dutt



* Re: [PATCH char-misc-next v3] misc: mic: fix memory leak
  @ 2015-10-13 17:08 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-13 17:08 UTC (permalink / raw)
  To: Sudip Mukherjee; +Cc: Sudeep Dutt, Greg Kroah-Hartman, linux-kernel

On Tue, 2015-10-13 at 22:36 +0530, Sudip Mukherjee wrote:
> In scif_node_connect() we were returning if the initialization of p2p_ji
> fails. But at that time p2p_ij has already been initialized and
> resources allocated for it. Since p2p_ij has not yet been added to the
> list, we will have a leak.
> Let's deinitialize and release the resources connected to p2p_ij.
> 

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

Thanks again for the fix.
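
The leak described above can be sketched in userspace as follows. This
is only an illustration under stated assumptions: malloc()/free() stand
in for the real setup and teardown, a counter tracks live objects, and
all demo_* names are hypothetical rather than SCIF functions:

```c
#include <stdlib.h>

/* Live allocation counter so a leak shows up as a non-zero balance. */
static int demo_live;

struct demo_p2p {
	int peer;
};

/* Hypothetical stand-in for scif_init_p2p_info(); fails when asked to. */
static struct demo_p2p *demo_init_p2p(int peer, int fail)
{
	struct demo_p2p *p2p;

	if (fail)
		return NULL;
	p2p = malloc(sizeof(*p2p));
	if (!p2p)
		return NULL;
	p2p->peer = peer;
	demo_live++;
	return p2p;
}

/* Hypothetical stand-in for the new deinit helper. */
static void demo_deinit_p2p(struct demo_p2p *p2p)
{
	demo_live--;
	free(p2p);
}

/*
 * Sets up both directions and returns the number of objects still live
 * afterwards: 0 means nothing leaked. The real function hands both
 * objects to lists on success; here they are simply released again.
 */
static int demo_connect(int fail_second)
{
	struct demo_p2p *p2p_ij, *p2p_ji;

	demo_live = 0;
	p2p_ij = demo_init_p2p(1, 0);
	if (!p2p_ij)
		return demo_live;
	p2p_ji = demo_init_p2p(0, fail_second);
	if (!p2p_ji) {
		demo_deinit_p2p(p2p_ij);	/* without this, p2p_ij leaks */
		return demo_live;
	}
	demo_deinit_p2p(p2p_ji);
	demo_deinit_p2p(p2p_ij);
	return demo_live;
}
```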

> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
> ---
> 
> v3: added unmapping the aperture
> v2: missed making the function as static
> 
>  drivers/misc/mic/scif/scif_nodeqp.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
> index 7180d56..c66ca1a 100644
> --- a/drivers/misc/mic/scif/scif_nodeqp.c
> +++ b/drivers/misc/mic/scif/scif_nodeqp.c
> @@ -435,6 +435,21 @@ free_p2p:
>  	return NULL;
>  }
>  
> +/* Uninitialize and release resources from a p2p mapping */
> +static void scif_deinit_p2p_info(struct scif_dev *scifdev,
> +				 struct scif_p2p_info *p2p)
> +{
> +	struct scif_hw_dev *sdev = scifdev->sdev;
> +
> +	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
> +		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
> +	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
> +		     p2p->sg_nentries[SCIF_PPI_APER], DMA_BIDIRECTIONAL);
> +	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
> +	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
> +	kfree(p2p);
> +}
> +
>  /**
>   * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
>   * @dst: Destination node
> @@ -477,8 +492,10 @@ static void scif_node_connect(struct scif_dev *scifdev, int dst)
>  	if (!p2p_ij)
>  		return;
>  	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
> -	if (!p2p_ji)
> +	if (!p2p_ji) {
> +		scif_deinit_p2p_info(dev_i, p2p_ij);
>  		return;
> +	}
>  	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
>  	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
>  




* Re: [PATCH] misc: mic: fix memory leak
  @ 2015-10-13 16:34 99% ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-10-13 16:34 UTC (permalink / raw)
  To: Sudip Mukherjee; +Cc: Sudeep Dutt, Greg Kroah-Hartman, linux-kernel

On Tue, 2015-10-13 at 14:38 +0530, Sudip Mukherjee wrote:
> In scif_node_connect() we were returning if the initialization of p2p_ji
> fails. But at that time p2p_ij has already been initialized and
> resources allocated for it. Since p2p_ij has not yet been added to the
> list, we will have a leak.
> Let's deinitialize and release the resources connected to p2p_ij.
> 
> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
> ---
> 
> One sudip sending patch to another sudeep... :)
> 
>  drivers/misc/mic/scif/scif_nodeqp.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
> index 7180d56..5e94fac 100644
> --- a/drivers/misc/mic/scif/scif_nodeqp.c
> +++ b/drivers/misc/mic/scif/scif_nodeqp.c
> @@ -435,6 +435,18 @@ free_p2p:
>  	return NULL;
>  }
>  
> +/* Uninitialize and release resources from a p2p mapping */
> +void scif_deinit_p2p_info(struct scif_dev *scifdev, struct scif_p2p_info *p2p)
> +{
> +	struct scif_hw_dev *sdev = scifdev->sdev;
> +
> +	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
> +		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);

Hi Sudip,

You missed unmapping the aperture here:

dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
	     p2p->sg_nentries[SCIF_PPI_APER], DMA_BIDIRECTIONAL);

Can you please resubmit this patch with the change above?

Also you should mention [PATCH char-misc-next] in the next revision.

Thanks for the fix!

Sudeep Dutt

> +	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
> +	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
> +	kfree(p2p);
> +}
> +
>  /**
>   * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
>   * @dst: Destination node
> @@ -477,8 +489,10 @@ static void scif_node_connect(struct scif_dev *scifdev, int dst)
>  	if (!p2p_ij)
>  		return;
>  	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
> -	if (!p2p_ji)
> +	if (!p2p_ji) {
> +		scif_deinit_p2p_info(dev_i, p2p_ij);
>  		return;
> +	}
>  	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
>  	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
>  




* Re: [patch] misc: mic/scif: re-take a lock on error path
  @ 2015-10-13 16:20 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-13 16:20 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sudeep Dutt, Nikhil Rao, Greg Kroah-Hartman, linux-kernel,
	kernel-janitors

On Tue, 2015-10-13 at 15:52 +0300, Dan Carpenter wrote:
> The caller expects that we take this lock again before returning
> otherwise you get double unlocks and races.
> 

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

Thanks for the fix!
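
The lock balance at stake can be sketched with a toy lock that only
tracks depth. This assumes, as the description implies, that the
function is entered with the lock held and drops it internally before
the early return; struct demo_lock and the demo_* functions are
hypothetical and not part of the driver:

```c
#include <errno.h>

/* Toy lock that only tracks depth; not a real mutex. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
}

static void demo_lock_release(struct demo_lock *l)
{
	l->held--;
}

/*
 * Called with the lock held and expected to return with it held.
 * It drops the lock internally, so every return path after that point
 * has to take it again, including the early error return.
 */
static int demo_unregister(struct demo_lock *l, int in_progress)
{
	demo_lock_release(l);		/* dropped while waiting */
	if (in_progress) {
		demo_lock_acquire(l);	/* the re-take added by the fix */
		return -ENXIO;
	}
	demo_lock_acquire(l);
	return 0;
}

/* Mimics a caller: lock, call, unlock. Returns the final lock depth. */
static int demo_caller(int in_progress)
{
	struct demo_lock l = { 0 };

	demo_lock_acquire(&l);
	(void)demo_unregister(&l, in_progress);
	demo_lock_release(&l);
	return l.held;	/* 0 when balanced, -1 after a double unlock */
}
```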

> Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
> index e288996..980ef13 100644
> --- a/drivers/misc/mic/scif/scif_rma.c
> +++ b/drivers/misc/mic/scif/scif_rma.c
> @@ -680,6 +680,7 @@ int scif_unregister_window(struct scif_window *window)
>  			}
>  		} else {
>  			/* Return ENXIO since unregistration is in progress */
> +			mutex_lock(&ep->rma_info.rma_lock);
>  			return -ENXIO;
>  		}
>  retry:




* Re: [patch 2/2] misc: mic/scif: fix wrap around tests
  @ 2015-10-11  9:14 99% ` Sudeep Dutt
  2015-10-14  3:21 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-11  9:14 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sudeep Dutt, Nikhil Rao, Greg Kroah-Hartman, linux-kernel,
	kernel-janitors

On Fri, 2015-10-09 at 09:40 +0300, Dan Carpenter wrote:
> Signed integer overflow is undefined.  Also I added a check for
> "(offset < 0)" in scif_unregister() because that makes it match the
> other conditions and because I didn't want to subtract a negative.
> 
> Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> 
> Imagine you are on 64 bit and len is larger than INT_MAX << 12, it means
> that we truncate it because scif_get_window_offset() takes an integer
> argument.  I don't know if this is an issue. 

scif_get_window_offset(..) takes an integer argument for the number of
pages. We believe that an int for number of 4K pages is sufficient for
current systems. I don't think there is an issue here.

>  Maybe I should use
> INT_MAX instead of LONG_MAX?  I am working on a static checker warning
> for these types of issues:
> drivers/misc/mic/scif/scif_rma.c:1631 scif_register() warn: truncating user data 'len >> 12' '0-4503599627370495'
> drivers/misc/mic/scif/scif_rma.c:1643 scif_register() warn: truncating user data 'len >> 12' '0-4503599627370495'
> 
> The other static warnings here are:
> 
> drivers/misc/mic/scif/scif_rma.c:745 scif_unregister_window() warn: inconsistent returns 'mutex:&ep->rma_info.rma_lock'.
>   Locked on:   line 745
>   Unlocked on: line 687

The function expects the lock to be held by the caller so there is no
issue here.

> drivers/misc/mic/scif/scif_rma.c:1463 scif_unpin_pages() warn: passing __func__ while the format string already contains the name of the function 'scif_unpin_pages'
> 

It might be useful to enhance checkpatch to catch such issues.

> diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
> index bc2dccb..fea7d2c 100644
> --- a/drivers/misc/mic/scif/scif_rma.c
> +++ b/drivers/misc/mic/scif/scif_rma.c
> @@ -1510,7 +1510,7 @@ off_t scif_register_pinned_pages(scif_epd_t epd,
>  	if ((map_flags & SCIF_MAP_FIXED) &&
>  	    ((ALIGN(offset, PAGE_SIZE) != offset) ||
>  	    (offset < 0) ||
> -	    (offset + (off_t)len < offset)))
> +	    (len > LONG_MAX - offset)))
>  		return -EINVAL;
>  
>  	might_sleep();
> @@ -1613,7 +1613,7 @@ off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
>  	if ((map_flags & SCIF_MAP_FIXED) &&
>  	    ((ALIGN(offset, PAGE_SIZE) != offset) ||
>  	    (offset < 0) ||
> -	    (offset + (off_t)len < offset)))
> +	    (len < LONG_MAX - offset)))

Why is this change required? The earlier code was being used to detect
wraparound and I think it works fine.

Thanks,
Sudeep Dutt
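
For reference, the form the patch moves to compares len against the
remaining headroom instead of adding first, so the sum is never
computed. A minimal userspace sketch, assuming long stands in for off_t;
demo_range_wraps() is a hypothetical name, not a SCIF function:

```c
#include <limits.h>
#include <stddef.h>

/*
 * Returns 1 when offset + len would exceed LONG_MAX, without ever
 * computing the sum. offset stands in for an off_t as wide as long.
 */
static int demo_range_wraps(long offset, size_t len)
{
	if (offset < 0)
		return 1;	/* rejected up front, as in the patch */
	/* offset >= 0 here, so LONG_MAX - offset cannot overflow. */
	return len > (size_t)(LONG_MAX - offset);
}
```

Note that the comparison is ">" here: a length larger than the headroom
is the case that has to be rejected.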



* Re: [patch 1/2] misc: mic/scif: fix error code in scif_create_remote_lookup()
  @ 2015-10-11  9:14 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-11  9:14 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Sudeep Dutt, Nikhil Rao, Ashutosh Dixit, Greg Kroah-Hartman,
	linux-kernel, kernel-janitors

On Fri, 2015-10-09 at 09:41 +0300, Dan Carpenter wrote:
> We should be returning -ENOMEM here instead of success.
> 

Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>

Thanks for the fix!
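
The pattern being fixed can be sketched as below: err starts at 0, so
jumping to the cleanup label after a failed allocation without setting
it reports success. A minimal userspace sketch, assuming malloc() in
place of scif_alloc_coherent(); demo_alloc() and demo_create_lookup()
are hypothetical names:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator that can be forced to fail for the demo. */
static void *demo_alloc(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * err starts at 0, so a bare jump to the cleanup label after a failed
 * allocation would return success. Setting err first is the fix.
 */
static int demo_create_lookup(int fail_alloc)
{
	int err = 0;
	void *lookup = demo_alloc(64, fail_alloc);

	if (!lookup) {
		err = -ENOMEM;
		goto out;
	}
	/* ... the lookup table would be filled in and used here ... */
out:
	free(lookup);	/* the demo keeps nothing; free(NULL) is a no-op */
	return err;
}
```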

> Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
> index e288996..bc2dccb 100644
> --- a/drivers/misc/mic/scif/scif_rma.c
> +++ b/drivers/misc/mic/scif/scif_rma.c
> @@ -386,16 +386,20 @@ static int scif_create_remote_lookup(struct scif_dev *remote_dev,
>  				    remote_dev, window->nr_lookup *
>  				    sizeof(*window->dma_addr_lookup.lookup),
>  				    GFP_KERNEL | __GFP_ZERO);
> -	if (!window->dma_addr_lookup.lookup)
> +	if (!window->dma_addr_lookup.lookup) {
> +		err = -ENOMEM;
>  		goto error_window;
> +	}
>  
>  	window->num_pages_lookup.lookup =
>  		scif_alloc_coherent(&window->num_pages_lookup.offset,
>  				    remote_dev, window->nr_lookup *
>  				    sizeof(*window->num_pages_lookup.lookup),
>  				    GFP_KERNEL | __GFP_ZERO);
> -	if (!window->num_pages_lookup.lookup)
> +	if (!window->num_pages_lookup.lookup) {
> +		err = -ENOMEM;
>  		goto error_window;
> +	}
>  
>  	vmalloc_dma_phys = is_vmalloc_addr(&window->dma_addr[0]);
>  	vmalloc_num_pages = is_vmalloc_addr(&window->num_pages[0]);




* Re: [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT
  @ 2015-10-06 12:04 99%                   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-06 12:04 UTC (permalink / raw)
  To: gregkh
  Cc: Sudeep Dutt, Woodhouse, David, Dixit, Ashutosh, linux-kernel,
	Rao, Nikhil, joro, sakari.ailus, iommu

On Tue, 2015-10-06 at 09:33 +0100, gregkh@linuxfoundation.org wrote:
> On Tue, Oct 06, 2015 at 01:05:27AM -0700, Sudeep Dutt wrote:
> > On Tue, 2015-10-06 at 08:56 +0100, gregkh@linuxfoundation.org wrote:
> > > On Mon, Oct 05, 2015 at 10:23:38PM -0700, Sudeep Dutt wrote:
> > > > On Tue, 2015-10-06 at 06:20 +0100, gregkh@linuxfoundation.org wrote:
> > > > > On Tue, Oct 06, 2015 at 06:12:40AM +0100, gregkh@linuxfoundation.org wrote:
> > > > > > On Mon, Oct 05, 2015 at 10:38:43AM -0700, Sudeep Dutt wrote:
> > > > > > > On Mon, 2015-10-05 at 03:50 -0700, Woodhouse, David wrote:
> > > > > > > > On Tue, 2015-09-29 at 18:09 -0700, Ashutosh Dixit wrote:
> > > > > > > > > From: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > > > > > > 
> > > > > > > > > iova is a library which can be built without IOMMU_SUPPORT
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > > > > > 
> > > > > > > > The first three of these patches are in 4.3-rc4 already. Apologies for
> > > > > > > > the delay in pushing them out.
> > > > > > > > 
> > > > > > > > This one looks sane enough too, but perhaps in that case we should move
> > > > > > > > the code *out* of drivers/iommu/ and into lib/iova/ ?
> > > > > > > > 
> > > > > > > 
> > > > > > > Yes, moving the code into lib/iova is the correct long term solution. I
> > > > > > > have sent Greg a patch which reverts this commit since it is no longer
> > > > > > > required and will create a merge conflict for him unnecessarily as well
> > > > > > > with 4.3-rc4.
> > > > > > 
> > > > > > I can handle merge issues, that's trivial.  Reverting the patch
> > > > > > shouldn't really be needed, right?  Let me see what happens when I merge
> > > > > > to see if your patch is necessary...
> > > > > 
> > > > > Ok, I don't think it is needed, the merge was pretty trivial.
> > > > > 
> > > > > Can you test out my char-misc-testing branch right now to see if it's
> > > > > all ok with the merge?  If so, I'll move it all over to the "real" place
> > > > > for it to start showing up in linux-next, i.e. my char-misc-next branch.
> > > > > 
> > > > 
> > > > Hi Greg,
> > > > 
> > > > I think it is best to revert this patch as it is incorrect. The iommu
> > > > folder gets compiled only if IOMMU_SUPPORT is enabled so IOMMU_IOVA
> > > > should indeed be included only when IOMMU_SUPPORT is enabled.
> > > > 
> > > > Sincere apologies for the mess here but I believe it will all get fixed
> > > > up if you accept the revert of 353649e5da I sent across earlier today.
> > > 
> > > Again, look at the merge, I think I already handled this in that manner.
> > > If not, let me know.
> > > 
> > 
> > Hi Greg,
> > 
> > I took a look at your latest char-misc-testing tree and it needs to be
> > fixed up. IOMMU_IOVA should be inside the "if IOMMU_SUPPORT" block
> > instead of above it.
> > 
> > git revert 353649e5da in the char-misc-testing tree will fix everything
> > up or you could also apply the patch I sent earlier today which has the
> > same revert.
> 
> Ok, I've applied your patch, rolled back the merge, and that should be
> good, right?  If so, I'll push this branch out to char-misc-next and
> then merge in 4.3-rc4 just to keep everything up to date.

Yes, it looks good now.

Thanks,
Sudeep Dutt



* Re: [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT
  @ 2015-10-06  8:05 99%               ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-10-06  8:05 UTC (permalink / raw)
  To: gregkh
  Cc: Sudeep Dutt, Woodhouse, David, Dixit, Ashutosh, linux-kernel,
	Rao, Nikhil, joro, sakari.ailus, iommu

On Tue, 2015-10-06 at 08:56 +0100, gregkh@linuxfoundation.org wrote:
> On Mon, Oct 05, 2015 at 10:23:38PM -0700, Sudeep Dutt wrote:
> > On Tue, 2015-10-06 at 06:20 +0100, gregkh@linuxfoundation.org wrote:
> > > On Tue, Oct 06, 2015 at 06:12:40AM +0100, gregkh@linuxfoundation.org wrote:
> > > > On Mon, Oct 05, 2015 at 10:38:43AM -0700, Sudeep Dutt wrote:
> > > > > On Mon, 2015-10-05 at 03:50 -0700, Woodhouse, David wrote:
> > > > > > On Tue, 2015-09-29 at 18:09 -0700, Ashutosh Dixit wrote:
> > > > > > > From: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > > > > 
> > > > > > > iova is a library which can be built without IOMMU_SUPPORT
> > > > > > > 
> > > > > > > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > > > 
> > > > > > The first three of these patches are in 4.3-rc4 already. Apologies for
> > > > > > the delay in pushing them out.
> > > > > > 
> > > > > > This one looks sane enough too, but perhaps in that case we should move
> > > > > > the code *out* of drivers/iommu/ and into lib/iova/ ?
> > > > > > 
> > > > > 
> > > > > Yes, moving the code into lib/iova is the correct long term solution. I
> > > > > have sent Greg a patch which reverts this commit since it is no longer
> > > > > required and will create a merge conflict for him unnecessarily as well
> > > > > with 4.3-rc4.
> > > > 
> > > > I can handle merge issues, that's trivial.  Reverting the patch
> > > > shouldn't really be needed, right?  Let me see what happens when I merge
> > > > to see if your patch is necessary...
> > > 
> > > Ok, I don't think it is needed, the merge was pretty trivial.
> > > 
> > > Can you test out my char-misc-testing branch right now to see if it's
> > > all ok with the merge?  If so, I'll move it all over to the "real" place
> > > for it to start showing up in linux-next, i.e. my char-misc-next branch.
> > > 
> > 
> > Hi Greg,
> > 
> > I think it is best to revert this patch as it is incorrect. The iommu
> > folder gets compiled only if IOMMU_SUPPORT is enabled so IOMMU_IOVA
> > should indeed be included only when IOMMU_SUPPORT is enabled.
> > 
> > Sincere apologies for the mess here but I believe it will all get fixed
> > up if you accept the revert of 353649e5da I sent across earlier today.
> 
> Again, look at the merge, I think I already handled this in that manner.
> If not, let me know.
> 

Hi Greg,

I took a look at your latest char-misc-testing tree and it needs to be
fixed up. IOMMU_IOVA should be inside the "if IOMMU_SUPPORT" block
instead of above it.

git revert 353649e5da in the char-misc-testing tree will fix everything
up or you could also apply the patch I sent earlier today which has the
same revert.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT
  @ 2015-10-06  5:23 99%           ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-10-06  5:23 UTC (permalink / raw)
  To: gregkh
  Cc: Sudeep Dutt, Woodhouse, David, Dixit, Ashutosh, linux-kernel,
	Rao, Nikhil, joro, sakari.ailus, iommu

On Tue, 2015-10-06 at 06:20 +0100, gregkh@linuxfoundation.org wrote:
> On Tue, Oct 06, 2015 at 06:12:40AM +0100, gregkh@linuxfoundation.org wrote:
> > On Mon, Oct 05, 2015 at 10:38:43AM -0700, Sudeep Dutt wrote:
> > > On Mon, 2015-10-05 at 03:50 -0700, Woodhouse, David wrote:
> > > > On Tue, 2015-09-29 at 18:09 -0700, Ashutosh Dixit wrote:
> > > > > From: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > > 
> > > > > iova is a library which can be built without IOMMU_SUPPORT
> > > > > 
> > > > > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > 
> > > > The first three of these patches are in 4.3-rc4 already. Apologies for
> > > > the delay in pushing them out.
> > > > 
> > > > This one looks sane enough too, but perhaps in that case we should move
> > > > the code *out* of drivers/iommu/ and into lib/iova/ ?
> > > > 
> > > 
> > > Yes, moving the code into lib/iova is the correct long term solution. I
> > > have sent Greg a patch which reverts this commit since it is no longer
> > > required and will create a merge conflict for him unnecessarily as well
> > > with 4.3-rc4.
> > 
> > I can handle merge issues, that's trivial.  Reverting the patch
> > shouldn't really be needed, right?  Let me see what happens when I merge
> > to see if your patch is necessary...
> 
> Ok, I don't think it is needed, the merge was pretty trivial.
> 
> Can you test out my char-misc-testing branch right now to see if it's
> all ok with the merge?  If so, I'll move it all over to the "real" place
> for it to start showing up in linux-next, i.e. my char-misc-next branch.
> 

Hi Greg,

I think it is best to revert this patch as it is incorrect. The iommu
folder gets compiled only if IOMMU_SUPPORT is enabled so IOMMU_IOVA
should indeed be included only when IOMMU_SUPPORT is enabled.

Sincere apologies for the mess here but I believe it will all get fixed
up if you accept the revert of 353649e5da I sent across earlier today.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT
  @ 2015-10-05 17:38 99%     ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-10-05 17:38 UTC (permalink / raw)
  To: Woodhouse, David
  Cc: Sudeep Dutt, Dixit, Ashutosh, linux-kernel, gregkh, Rao, Nikhil,
	joro, sakari.ailus, iommu

On Mon, 2015-10-05 at 03:50 -0700, Woodhouse, David wrote:
> On Tue, 2015-09-29 at 18:09 -0700, Ashutosh Dixit wrote:
> > From: Sudeep Dutt <sudeep.dutt@intel.com>
> > 
> > iova is a library which can be built without IOMMU_SUPPORT
> > 
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> 
> The first three of these patches are in 4.3-rc4 already. Apologies for
> the delay in pushing them out.
> 
> This one looks sane enough too, but perhaps in that case we should move
> the code *out* of drivers/iommu/ and into lib/iova/ ?
> 

Yes, moving the code into lib/iova is the correct long term solution. I
have sent Greg a patch which reverts this commit since it is no longer
required and will create a merge conflict for him unnecessarily as well
with 4.3-rc4.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-testing 1/1] Revert "iommu: Allow iova to be used without requiring IOMMU_SUPPORT"
@ 2015-10-05 17:37 99% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-05 17:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, iommu, David Woodhouse, Sakari Ailus, joro,
	Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

Revert commit 353649e5da90 ("iommu: Allow iova to be used without
requiring IOMMU_SUPPORT"). This commit is made unnecessary by
commit ac6d83ccd9c5 ("misc: mic: Fix SCIF build failure with
IOMMU_SUPPORT disabled") and will create a conflict upon merging
with 4.3-rc4. The correct long term solution is to move the iova
library from drivers/iommu into lib/iova, which will be done in
a future patch.

Cc: David Woodhouse <David.Woodhouse@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/iommu/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 71d1c46..d9da766 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -12,9 +12,6 @@ menuconfig IOMMU_SUPPORT
 	  remap DMA requests and/or remap interrupts from other devices on the
 	  system.
 
-config IOMMU_IOVA
-	tristate
-
 if IOMMU_SUPPORT
 
 menu "Generic IOMMU Pagetable Support"
@@ -45,6 +42,9 @@ config IOMMU_IO_PGTABLE_LPAE_SELFTEST
 
 endmenu
 
+config IOMMU_IOVA
+	tristate
+
 config OF_IOMMU
        def_bool y
        depends on OF && IOMMU_API
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* [PATCH char-misc-next 1/1] misc: mic: Fix SCIF build failure with IOMMU_SUPPORT disabled
@ 2015-10-04 20:38 99% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-10-04 20:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt

SCIF depends on IOVA which requires IOMMU_SUPPORT to be enabled.
The long term fix is to move IOVA from drivers/iommu to lib/
but this current patch should fix the reported issue.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 60376fb..40677df 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -74,7 +74,7 @@ comment "SCIF Driver"
 
 config SCIF
 	tristate "SCIF Driver"
-	depends on 64BIT && PCI && X86 && SCIF_BUS
+	depends on 64BIT && PCI && X86 && SCIF_BUS && IOMMU_SUPPORT
 	select IOMMU_IOVA
 	help
 	  This enables SCIF Driver support for the Intel Many Integrated
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* Re: [PATCH char-misc-next 11/19] dma: Add support to program MIC x100 status descriptiors
  @ 2015-08-21 16:14 99%         ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-08-21 16:14 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Ashutosh Dixit, Greg Kroah-Hartman, Dan Williams, linux-kernel,
	Nikhil Rao, Siva Yerramreddy, Sudeep Dutt

On Fri, 2015-08-21 at 13:59 +0530, Vinod Koul wrote:
> On Thu, Aug 20, 2015 at 10:30:50AM -0700, Sudeep Dutt wrote:
> > On Thu, 2015-08-20 at 10:50 +0530, Vinod Koul wrote:
> > > On Mon, Jul 27, 2015 at 04:58:17PM -0700, Ashutosh Dixit wrote:
> 
> > > > +		/*
> > > > +		 * This is a hack to program status descriptor since
> > > > +		 * DMA engine API doesn't have support for this.
> > > > +		 */
> > > what do you mean by programming status descriptor, what do you need to
> > > program?
> > > 
> > 
> > The MIC X100 DMA engine has a special status descriptor which writes an
> > 8 byte immediate data value to a destination address. It is used to
> > signal completion of all DMA descriptors prior to the status descriptor.
> > The DMA engine API does not allow drivers to pass an 8 byte value. We are
> > allowed to pass a source physical address but we cannot determine the
> > source value within the DMA driver using that information, particularly
> > with the IOMMU enabled. We have added this workaround so that we can
> > program this special status update descriptor without making any changes
> > to the DMA engine API.
> I think this is a special case so you should export an API and let client
> program using that API.
> 

Sounds good. We will export a new API from the MIC X100 DMA driver in
the next revision of this patch series.

Thanks for the review!

Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next 11/19] dma: Add support to program MIC x100 status descriptiors
  @ 2015-08-20 17:30 98%     ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-08-20 17:30 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Ashutosh Dixit, Greg Kroah-Hartman, Dan Williams, linux-kernel,
	Nikhil Rao, Siva Yerramreddy, Sudeep Dutt

On Thu, 2015-08-20 at 10:50 +0530, Vinod Koul wrote:
> On Mon, Jul 27, 2015 at 04:58:17PM -0700, Ashutosh Dixit wrote:
> > +/* Program a status descriptor with phys as address and value to be written */
> > +static int mic_dma_do_status_update(struct mic_dma_chan *ch, dma_addr_t phys,
> > +				    u64 value)
> > +{
> > +	int ret = mic_dma_avail_desc_ring_space(ch, 4);
> > +
> > +	if (ret < 0)
> > +		return ret;
> > +	ret = 0;
> ?
> > +	mic_dma_prep_status_desc(&ch->desc_ring[ch->head],
> > +				 value, phys, false);
> > +	mic_dma_hw_ring_inc_head(ch);
> > +	return ret;
> 
> return 0 then?
> 

Sounds good. We will change this in the next revision of the patch
series which will be posted after the 4.3 merge window closes.
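
For reference, a minimal sketch of the simplified return path, written as a
user-space stand-in so it compiles outside the kernel. Only the function names
come from the quoted hunk; the struct layout, the ring size, the meaning of the
boolean argument, and the helper bodies are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8 /* assumed power-of-two ring size for this sketch */

/* Assumed descriptor layout; the real hardware format is not shown here. */
struct status_desc {
	uint64_t value;
	uint64_t phys;
	bool flag; /* meaning of this argument is not shown in the quoted hunk */
};

struct mic_dma_chan {
	struct status_desc desc_ring[RING_SIZE];
	unsigned int head;
	unsigned int tail;
};

/* Stand-in: 0 if 'required' slots are free, -ENOMEM otherwise. */
static int mic_dma_avail_desc_ring_space(struct mic_dma_chan *ch, int required)
{
	int used = (int)((ch->head - ch->tail) % RING_SIZE);
	int avail = RING_SIZE - 1 - used; /* one slot is kept empty */

	return avail >= required ? 0 : -ENOMEM;
}

static void mic_dma_prep_status_desc(struct status_desc *d, uint64_t value,
				     uint64_t phys, bool flag)
{
	d->value = value;
	d->phys = phys;
	d->flag = flag;
}

static void mic_dma_hw_ring_inc_head(struct mic_dma_chan *ch)
{
	ch->head = (ch->head + 1) % RING_SIZE;
}

/* Same flow as the quoted function, without the redundant "ret = 0". */
static int mic_dma_do_status_update(struct mic_dma_chan *ch, uint64_t phys,
				    uint64_t value)
{
	int ret = mic_dma_avail_desc_ring_space(ch, 4);

	if (ret < 0)
		return ret;
	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], value, phys, false);
	mic_dma_hw_ring_inc_head(ch);
	return 0;
}
```

The error path still propagates the value from the space check, and the
success path returns a literal 0, which is what the review comment asks for.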

> > +}
> > +
> >  static inline void mic_dma_issue_pending(struct dma_chan *ch)
> >  {
> >  	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
> > @@ -287,9 +310,28 @@ mic_dma_prep_memcpy_lock(struct dma_chan *ch, dma_addr_t dma_dest,
> >  		return NULL;
> >  
> >  	spin_lock(&mic_ch->prep_lock);
> > +	if (len == 8) {
> magic number?
> 

The MIC X100 DMA engine requires cache line (64 bytes) aligned
src/dst/len. We use the length of 8 bytes as a special case for
programming the status descriptor since it writes an 8 byte value. We
will use a macro in the next revision of the patch series.
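
As a rough sketch of what the macro-based check could look like: the names
MIC_DMA_ALIGN_BYTES, MIC_DMA_STATUS_LEN and both helper functions below are
hypothetical, chosen only for this illustration, and are not taken from the
driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIC_DMA_ALIGN_BYTES 64 /* cache line alignment for src/dst/len */
#define MIC_DMA_STATUS_LEN   8 /* length reserved for status descriptor writes */

/* True when the request should be programmed as a status descriptor. */
static bool mic_dma_is_status_update(size_t len)
{
	return len == MIC_DMA_STATUS_LEN;
}

/* True when src, dst and len all satisfy the 64 byte alignment rule. */
static bool mic_dma_is_valid_memcpy(uint64_t src, uint64_t dst, size_t len)
{
	/* OR-ing the values checks the low bits of all three at once. */
	return len != 0 && ((src | dst | len) % MIC_DMA_ALIGN_BYTES) == 0;
}
```

Because a length of 8 can never satisfy the 64 byte alignment rule, the two
cases cannot be confused with each other.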

> > +		/*
> > +		 * This is a hack to program status descriptor since
> > +		 * DMA engine API doesn't have support for this.
> > +		 */
> what do you mean by programming status descriptor, what do you need to
> program?
> 

The MIC X100 DMA engine has a special status descriptor which writes an
8 byte immediate data value to a destination address. It is used to
signal completion of all DMA descriptors prior to the status descriptor.
The DMA engine API does not allow drivers to pass an 8 byte value. We are
allowed to pass a source physical address but we cannot determine the
source value within the DMA driver using that information, particularly
with the IOMMU enabled. We have added this workaround so that we can
program this special status update descriptor without making any changes
to the DMA engine API.
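
To illustrate the completion-signalling idea described above, here is a small
user-space model of an in-order descriptor ring. It is only a sketch: the
descriptor struct, the enum, and engine_run() are invented for this example
and do not reflect the real MIC X100 descriptor format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum desc_type { DESC_MEMCPY, DESC_STATUS };

/* Assumed, simplified descriptor; not the hardware layout. */
struct desc {
	enum desc_type type;
	void *dst;
	const void *src; /* DESC_MEMCPY only */
	size_t len;      /* DESC_MEMCPY only */
	uint64_t value;  /* DESC_STATUS only: 8 byte immediate data */
};

/*
 * Process descriptors strictly in order. Because a status descriptor is
 * handled only after everything queued before it, observing its value at
 * the destination implies the earlier copies have completed.
 */
static void engine_run(const struct desc *ring, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (ring[i].type == DESC_MEMCPY)
			memcpy(ring[i].dst, ring[i].src, ring[i].len);
		else
			memcpy(ring[i].dst, &ring[i].value,
			       sizeof(ring[i].value));
	}
}
```

A client would queue its copies, append one status descriptor whose
destination is a completion word, and then watch that word for the expected
value.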

Thanks for the review!

Sudeep Dutt


^ permalink raw reply	[relevance 98%]

* Re: [PATCH char-misc-next 10/19] lib: convert iova.c into a library
  @ 2015-07-28 17:01 97%         ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-07-28 17:01 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Robin Murphy, Ashutosh Dixit, Joerg Roedel, Greg Kroah-Hartman,
	iommu, linux-kernel, Andrew Morton, David S. Miller, Nikhil Rao,
	Anil S Keshavamurthy, Harish Chegondi, sakari.ailus, Dutt,
	Sudeep

On Tue, 2015-07-28 at 15:38 +0100, David Woodhouse wrote:
> On Tue, 2015-07-28 at 11:41 +0100, Robin Murphy wrote:
> > On 28/07/15 11:03, Joerg Roedel wrote:
> > > On Mon, Jul 27, 2015 at 04:57:32PM -0700, Ashutosh Dixit wrote:
> > > > From: Harish Chegondi <harish.chegondi@intel.com>
> > > > 
> > > > This patch converts iova.c into a library, moving it from
> > > > drivers/iommu/ to lib/, and exports its virtual address 
> > > > allocation and
> > > > management functions so that other modules can reuse them.
> > > > 
> > > > Cc: Joerg Roedel <joro@8bytes.org>
> > > > Reviewed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com
> > > > >
> > > > Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > > > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > > 
> > > Where is this going to be used outside of the IOMMU world?
> > > 

We are using the IOVA generator in the SCIF driver posted @
http://thread.gmane.org/gmane.linux.kernel/2005895 under
drivers/misc/mic/scif

> > 
> > ...and how does it relate to the patches from Sakari (+CC) doing much 
> > the same thing[1]?
> 

The patch series from Sakari does the right thing by moving the IOVA
cache management to the IOVA library. We will simply drop this current
patch as it is incorrect.

> I merged Sakari's patches into the intel-iommu git tree today, FWIW.
> 
> If there's really a need to move it from drivers/iommu/ to lib/ then we
> could feasibly do that too.
> 

The patch series from Sakari should work perfectly for us. We will post
a v2 of the current SCIF patch series without this IOVA patch and modify
the SCIF driver to use the newly added iova_cache_get(..) and
iova_cache_put(..) APIs once they are available in Linus's tree. It would
make it easier for us to integrate if Sakari's patches reach mainline
soon.
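
A minimal sketch of how a driver might pair those two calls, assuming they
behave as a reference-counted get/put pair (this thread only names them). The
mock_ functions and the scif_like_ wrappers below are user-space stand-ins
invented for this illustration, not the kernel implementation.

```c
#include <stdbool.h>

static int iova_cache_users;  /* stand-in for the library's user count */
static bool iova_cache_alive; /* stand-in for "the shared cache exists" */

/* Stand-in for iova_cache_get(): the first user creates the cache. */
static int mock_iova_cache_get(void)
{
	if (iova_cache_users == 0)
		iova_cache_alive = true;
	iova_cache_users++;
	return 0;
}

/* Stand-in for iova_cache_put(): the last user tears the cache down. */
static void mock_iova_cache_put(void)
{
	if (iova_cache_users == 0)
		return; /* unbalanced put: ignored in this sketch */
	if (--iova_cache_users == 0)
		iova_cache_alive = false;
}

/* Hypothetical driver hooks: take the reference on load, drop it on unload. */
static int scif_like_init(void)
{
	return mock_iova_cache_get();
}

static void scif_like_exit(void)
{
	mock_iova_cache_put();
}
```

The point of the pairing is that several subsystems can share one cache
without any of them needing to know who initialized it first.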

It might be cleaner to move IOVA to lib/ in the longer term since we
will have multiple driver subsystems using it, but it should work just
fine for now.

Thanks for the review!

Sudeep Dutt


^ permalink raw reply	[relevance 97%]

* [PATCH char-misc-next 2/2] misc: mic: Fix reported static checker warning
  2015-06-10  3:58 99% [PATCH char-misc-next 0/2] misc: mic: bug fixes Sudeep Dutt
  2015-06-10  3:58 99% ` [PATCH char-misc-next 1/2] misc: mic: Fix randconfig build error by including errno.h Sudeep Dutt
@ 2015-06-10  3:58 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-06-10  3:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jim Davis, Dan Carpenter, Nikhil Rao,
	Ashutosh Dixit, Sudeep Dutt

Delete unnecessary prints resulting in an "spdev could be null"
warning from a static checker in scif_peer_remove(..).

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_main.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
index f7187dc..6ce851f 100644
--- a/drivers/misc/mic/scif/scif_main.c
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -91,8 +91,6 @@ static int scif_peer_probe(struct scif_peer_dev *spdev)
 	rcu_assign_pointer(scifdev->spdev, spdev);
 
 	/* In the future SCIF kernel client devices will be added here */
-	dev_info(&spdev->dev, "Peer added dnode %d\n",
-		 spdev->dnode);
 	return 0;
 }
 
@@ -109,8 +107,6 @@ static void scif_peer_remove(struct scif_peer_dev *spdev)
 	mutex_lock(&scif_info.conflock);
 	scif_info.total--;
 	mutex_unlock(&scif_info.conflock);
-	dev_info(&spdev->dev, "Peer removed dnode %d\n",
-		 spdev->dnode);
 }
 
 static void scif_qp_setup_handler(struct work_struct *work)
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* [PATCH char-misc-next 1/2] misc: mic: Fix randconfig build error by including errno.h
  2015-06-10  3:58 99% [PATCH char-misc-next 0/2] misc: mic: bug fixes Sudeep Dutt
@ 2015-06-10  3:58 99% ` Sudeep Dutt
  2015-06-10  3:58 99% ` [PATCH char-misc-next 2/2] misc: mic: Fix reported static checker warning Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-06-10  3:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jim Davis, Dan Carpenter, Nikhil Rao,
	Ashutosh Dixit, Sudeep Dutt

This issue was reported @ https://lkml.org/lkml/2015/6/9/731

Reported-by: Jim Davis <jim.epost@gmail.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_rb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/mic/scif/scif_rb.c b/drivers/misc/mic/scif/scif_rb.c
index 47de222..637cc46 100644
--- a/drivers/misc/mic/scif/scif_rb.c
+++ b/drivers/misc/mic/scif/scif_rb.c
@@ -18,6 +18,7 @@
 #include <linux/circ_buf.h>
 #include <linux/types.h>
 #include <linux/io.h>
+#include <linux/errno.h>
 
 #include "scif_rb.h"
 
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* [PATCH char-misc-next 0/2] misc: mic: bug fixes
@ 2015-06-10  3:58 99% Sudeep Dutt
  2015-06-10  3:58 99% ` [PATCH char-misc-next 1/2] misc: mic: Fix randconfig build error by including errno.h Sudeep Dutt
  2015-06-10  3:58 99% ` [PATCH char-misc-next 2/2] misc: mic: Fix reported static checker warning Sudeep Dutt
  0 siblings, 2 replies; 200+ results
From: Sudeep Dutt @ 2015-06-10  3:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Jim Davis, Dan Carpenter, Nikhil Rao,
	Ashutosh Dixit, Sudeep Dutt

This patch series fixes reported issues including a
randconfig build error and a warning from a static checker.

Sudeep Dutt (2):
  misc: mic: Fix randconfig build error by including errno.h
  misc: mic: Fix reported static checker warning

 drivers/misc/mic/scif/scif_main.c | 4 ----
 drivers/misc/mic/scif/scif_rb.c   | 1 +
 2 files changed, 1 insertion(+), 4 deletions(-)

-- 
1.8.2.1


^ permalink raw reply	[relevance 99%]

* Re: randconfig build error with next-20150609, in drivers/misc/mic/scif/scif_rb.c
  @ 2015-06-09 18:47 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-06-09 18:47 UTC (permalink / raw)
  To: Jim Davis
  Cc: Stephen Rothwell, linux-kernel, linux-next, Greg Kroah-Hartman,
	ashutosh.dixit, nikhil.rao, Sudeep Dutt

On Tue, 2015-06-09 at 10:48 -0700, Jim Davis wrote:
> Building with the attached random configuration file,
> 
> drivers/misc/mic/scif/scif_rb.c: In function ‘scif_rb_write’:
> drivers/misc/mic/scif/scif_rb.c:115:11: error: ‘ENOMEM’ undeclared (first use in
>  this function)
>    return -ENOMEM;
>            ^
> drivers/misc/mic/scif/scif_rb.c:115:11: note: each undeclared
> identifier is reported only once for each function it appears in

Thanks for the report Jim. I will post a patch with the fix.

Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver
  @ 2015-04-29 12:42 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-04-29 12:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Wed, 2015-04-29 at 13:27 +0200, Greg Kroah-Hartman wrote:
> On Tue, Apr 28, 2015 at 01:45:12PM -0700, Sudeep Dutt wrote:
> > On Mon, 2015-03-30 at 18:36 -0700, Sudeep Dutt wrote:
> > > ChangeLog:
> > > =========
> > > 
> > > v3 => v4:
> > > a) Delete unused IOCTL definitions and IOCTL interface fixes as per
> > > feedback from Greg Kroah-Hartman
> > > b) Couple of tiny bug fixes since v3
> > > 
> > > v2 => v3 @ https://lkml.org/lkml/2015/1/26/849
> > > a) Formatting fixes for SCIF header file documentation, data type fixes
> > >    for SCIF IOCTL interface and added SCIF user space documentation in
> > >    scif_overview.txt as per feedback from Greg Kroah-Hartman
> > > 
> > > v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
> > > a) Use kernel-doc formatting for SCIF header file documentation, enhance
> > >    SCIF ring buffer documentation and formatting cleanup in patch 4 as
> > >    per feedback from Greg Kroah-Hartman
> > > b) SCIF bug fixes and cleanups since initial post
> > > 
> > > v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> > > 
> > > Description:
> > > ============
> > > 
> > > The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> > > level communications API across PCIe currently implemented for MIC. Currently
> > > SCIF provides inter-node communication within a single host platform, where a
> > > node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> > > communicating over the PCIe bus while providing an API that is symmetric
> > > across all the nodes in the PCIe network. An important design objective for SCIF
> > > is to deliver the maximum possible performance given the communication
> > > abilities of the hardware. SCIF has been used to implement an offload compiler
> > > runtime and OFED support for MPI implementations for MIC coprocessors.
> > > 
> > 
> > Hi Greg,
> > 
> > Please take a look at this patch series.
> 
> What patch series?  I don't have this in my todo queue at all, what
> happened to it?

That is strange indeed since it does show up on the mailing lists :( I
resent the patch series a few minutes ago. Please let me know if
you did not receive it.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (12 preceding siblings ...)
  2015-03-31  1:36 98% ` [PATCH char-misc-next v4 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
@ 2015-04-28 20:45 99% ` Sudeep Dutt
    13 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-04-28 20:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Mon, 2015-03-30 at 18:36 -0700, Sudeep Dutt wrote:
> ChangeLog:
> =========
> 
> v3 => v4:
> a) Delete unused IOCTL definitions and IOCTL interface fixes as per
> feedback from Greg Kroah-Hartman
> b) Couple of tiny bug fixes since v3
> 
> v2 => v3 @ https://lkml.org/lkml/2015/1/26/849
> a) Formatting fixes for SCIF header file documentation, data type fixes
>    for SCIF IOCTL interface and added SCIF user space documentation in
>    scif_overview.txt as per feedback from Greg Kroah-Hartman
> 
> v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
> a) Use kernel-doc formatting for SCIF header file documentation, enhance
>    SCIF ring buffer documentation and formatting cleanup in patch 4 as
>    per feedback from Greg Kroah-Hartman
> b) SCIF bug fixes and cleanups since initial post
> 
> v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> 
> Description:
> ============
> 
> The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> level communications API across PCIe currently implemented for MIC. Currently
> SCIF provides inter-node communication within a single host platform, where a
> node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> communicating over the PCIe bus while providing an API that is symmetric
> across all the nodes in the PCIe network. An important design objective for SCIF
> is to deliver the maximum possible performance given the communication
> abilities of the hardware. SCIF has been used to implement an offload compiler
> runtime and OFED support for MPI implementations for MIC coprocessors.
> 

Hi Greg,

Please take a look at this patch series.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v4 08/13] misc: mic: SCIF open close bind and listen APIs
  @ 2015-03-31 21:13 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31 21:13 UTC (permalink / raw)
  To: Paul Bolle
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Tue, 2015-03-31 at 10:39 +0200, Paul Bolle wrote:
> On Mon, 2015-03-30 at 18:36 -0700, Sudeep Dutt wrote:
> > --- a/drivers/misc/mic/Kconfig
> > +++ b/drivers/misc/mic/Kconfig
> > +comment "SCIF Driver"
> > +
> > +config SCIF
> > +	tristate "SCIF Driver"
> > +	depends on 64BIT && PCI && X86 && SCIF_BUS
> 
> Won't this work with just
> 	depends on SCIF_BUS
> 

Hi Paul,

Yes, that should work as well. We will incorporate the Kconfig cleanup
suggestions from your three emails if there is another revision of this
patch series, or post a follow-on cleanup patch along with our next patch
series.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-next v4 01/13] misc: mic: SCIF header file and IOCTL interface
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
@ 2015-03-31  1:36 22% ` Sudeep Dutt
  2015-03-31  1:36 58% ` [PATCH char-misc-next v4 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

This patch introduces the SCIF documentation in the header file
and describes the IOCTL interface for user mode. mic_overview.txt
is updated with documentation on SCIF and a new document
describing SCIF in more detail is available in scif_overview.txt.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mic_overview.txt  |  28 +-
 Documentation/mic/scif_overview.txt |  98 ++++
 include/uapi/linux/Kbuild           |   1 +
 include/linux/scif.h                | 993 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/scif_ioctl.h     | 130 +++++
 5 files changed, 1238 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 77c5418..1a2f2c8 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
 created on it by the host/card drivers. On host the channels are private
 and used only by the host driver to transfer data for the virtio devices.
 
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
+low level communications API across PCIe currently implemented for MIC.
+More details are available at scif_overview.txt.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
@@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
                       |               |       | Virtio over PCIe IOCTLs  |
                       |               |       +--------------------------+
 +-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |               |                   |  | MIC DMA   |
-| Driver    |         |               |                   |  | Driver    |
-+-----------+         |               |                   |  +-----------+
-      |               |               |                   |        |
-+---------------+     |               |                   |  +----------------+
-|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
-+---------------+     |               |                   |  +----------------+
-      |               |               |                   |              |
-      |   +--------------+            |            +---------------+     |
-      |   |Intel MIC     |            |            |Intel MIC      |     |
-      +---|Card Driver   |            |            |Host Driver    |     |
-          +--------------+            |            +---------------+-----+
+| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
+| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
++-----------+         |  +----------+ | +-----------+     |  +-----------+
+      |               |       |       |       |           |        |
++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
+|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW BUS|     | |MIC virtual Bus|
++---------------+     | +-----------+ | +-----+-----+     | +---------------+
+      |               |       |       |       |           |              |
+      |   +--------------+    |       |       |    +---------------+     |
+      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
+      +---|Card Driver   +----+       |       |    |Host Driver    |     |
+          +--------------+            |       +----+---------------+-----+
                       |               |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
new file mode 100644
index 0000000..0a280d9
--- /dev/null
+++ b/Documentation/mic/scif_overview.txt
@@ -0,0 +1,98 @@
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
+level communications API across PCIe currently implemented for MIC. Currently
+SCIF provides inter-node communication within a single host platform, where a
+node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
+communicating over the PCIe bus while providing an API that is symmetric
+across all the nodes in the PCIe network. An important design objective for SCIF
+is to deliver the maximum possible performance given the communication
+abilities of the hardware. SCIF has been used to implement an offload compiler
+runtime and OFED support for MPI implementations for MIC coprocessors.
+
+==== SCIF API Components ====
+The SCIF API has the following parts:
+1. Connection establishment using a client server model
+2. Byte stream messaging intended for short messages
+3. Node enumeration to determine online nodes
+4. Poll semantics for detection of incoming connections and messages
+5. Memory registration to pin down pages
+6. Remote memory mapping for low latency CPU accesses via mmap
+7. Remote DMA (RDMA) for high bandwidth DMA transfers
+8. Fence APIs for RDMA synchronization
+
+SCIF exposes the notion of a connection which can be used by peer processes on
+nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
+process in a SCIF node initiates a SCIF connection to a peer process on a
+different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
+which are similar to connection oriented socket APIs. Connected SCIF endpoints
+can also register local memory which is followed by data transfer using either
+DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
+kernel mode clients which are functionally equivalent.
+
+==== SCIF Performance for MIC ====
+A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
+SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
+
+             Comparison of TCP and SCIF based BW
+
+  Throughput (GB/sec)
+    8 +                                             PCIe Bandwidth ******
+      +                                                        TCP ######
+    7 +    **************************************             SCIF %%%%%%
+      |                       %%%%%%%%%%%%%%%%%%%
+    6 +                   %%%%
+      |                 %%
+      |               %%%
+    5 +              %%
+      |            %%
+    4 +           %%
+      |          %%
+    3 +         %%
+      |        %
+    2 +      %%
+      |     %%
+      |    %
+    1 +
+      +    ######################################
+    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
+      1       10     100      1000   10000   100000
+                   Transfer Size (KBytes)
+
+SCIF allows memory sharing via mmap(..) between processes on different PCIe
+nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
+latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
+
+SCIF has a user space library which is a thin IOCTL wrapper providing a user
+space API similar to the kernel API in scif.h. The SCIF user space library
+is distributed at https://software.intel.com/en-us/mic-developer
+
+Here is some pseudo code for an example of how two applications on two PCIe
+nodes would typically use the SCIF API:
+
+Process A (on node A)			Process B (on node B)
+
+/* get online node information */
+scif_get_node_ids(..)			scif_get_node_ids(..)
+scif_open(..)				scif_open(..)
+scif_bind(..)				scif_bind(..)
+scif_listen(..)
+scif_accept(..)				scif_connect(..)
+/* SCIF connection established */
+
+/* Send and receive short messages */
+scif_send(..)/scif_recv(..)		scif_send(..)/scif_recv(..)
+
+/* Register memory */
+scif_register(..)			scif_register(..)
+
+/* RDMA */
+scif_readfrom(..)/scif_writeto(..)	scif_readfrom(..)/scif_writeto(..)
+
+/* Fence DMAs */
+scif_fence_signal(..)			scif_fence_signal(..)
+
+mmap(..)				mmap(..)
+
+/* Access remote registered memory */
+
+/* Close the endpoints */
+scif_close(..)				scif_close(..)
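The call ordering in the pseudo code above follows from the error rules that include/linux/scif.h (added later in this patch) documents for each call. Below is a minimal sketch of those rules as a toy state model in C. The toy_* names, the state enum and the order in which conditions are checked are assumptions made for illustration; none of this is taken from the SCIF driver itself.

```c
#include <errno.h>

/* Lowest port handed out when port 0 is requested (SCIF_PORT_RSVD in scif.h). */
#define TOY_PORT_RSVD 1088

/* Hypothetical endpoint states; the driver's real state machine may differ. */
enum toy_state { TOY_UNBOUND, TOY_BOUND, TOY_LISTENING, TOY_CONNECTED };

struct toy_ep {
	enum toy_state state;
	int port;
};

/* Like scif_bind(): -EISCONN if connected, -EINVAL if already bound. */
static int toy_bind(struct toy_ep *ep, int pn)
{
	if (ep->state == TOY_CONNECTED)
		return -EISCONN;
	if (ep->state != TOY_UNBOUND)
		return -EINVAL;
	/* pn == 0 asks for any port >= SCIF_PORT_RSVD; the toy picks the first. */
	ep->port = pn ? pn : TOY_PORT_RSVD;
	ep->state = TOY_BOUND;
	return ep->port;
}

/* Like scif_listen(): the endpoint must be bound and not yet in use. */
static int toy_listen(struct toy_ep *ep)
{
	if (ep->state == TOY_CONNECTED || ep->state == TOY_LISTENING)
		return -EISCONN;
	if (ep->state == TOY_UNBOUND)
		return -EINVAL;
	ep->state = TOY_LISTENING;
	return 0;
}

/* Like scif_connect(): binds an unbound endpoint, rejects a listening one. */
static int toy_connect(struct toy_ep *ep)
{
	if (ep->state == TOY_CONNECTED)
		return -EISCONN;
	if (ep->state == TOY_LISTENING)
		return -EOPNOTSUPP;
	if (ep->state == TOY_UNBOUND)
		toy_bind(ep, 0);
	ep->state = TOY_CONNECTED;
	return ep->port;
}

/* Like scif_send(): only a connected endpoint may send. */
static int toy_send(const struct toy_ep *ep, int len)
{
	if (len < 0)
		return -EINVAL;
	if (ep->state != TOY_CONNECTED)
		return -ENOTCONN;
	return len;
}
```

In this model, toy_listen() on a fresh endpoint returns -EINVAL, which is why scif_bind(..) precedes scif_listen(..) in the Process A column. toy_send() before toy_connect() returns -ENOTCONN, which is why messaging only starts once the connection is established.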
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 68ceb97..cdb3095 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -351,6 +351,7 @@ header-y += rtc.h
 header-y += rtnetlink.h
 header-y += scc.h
 header-y += sched.h
+header-y += scif_ioctl.h
 header-y += screen_info.h
 header-y += sctp.h
 header-y += sdla.h
diff --git a/include/linux/scif.h b/include/linux/scif.h
new file mode 100644
index 0000000..44f4f38
--- /dev/null
+++ b/include/linux/scif.h
@@ -0,0 +1,993 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef __SCIF_H__
+#define __SCIF_H__
+
+#include <linux/types.h>
+#include <linux/poll.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_ACCEPT_SYNC	1
+#define SCIF_SEND_BLOCK		1
+#define SCIF_RECV_BLOCK		1
+
+enum {
+	SCIF_PROT_READ = (1 << 0),
+	SCIF_PROT_WRITE = (1 << 1)
+};
+
+enum {
+	SCIF_MAP_FIXED = 0x10,
+	SCIF_MAP_KERNEL	= 0x20,
+};
+
+enum {
+	SCIF_FENCE_INIT_SELF = (1 << 0),
+	SCIF_FENCE_INIT_PEER = (1 << 1),
+	SCIF_SIGNAL_LOCAL = (1 << 4),
+	SCIF_SIGNAL_REMOTE = (1 << 5)
+};
+
+enum {
+	SCIF_RMA_USECPU = (1 << 0),
+	SCIF_RMA_USECACHE = (1 << 1),
+	SCIF_RMA_SYNC = (1 << 2),
+	SCIF_RMA_ORDERED = (1 << 3)
+};
+
+/* End of SCIF Admin Reserved Ports */
+#define SCIF_ADMIN_PORT_END	1024
+
+/* End of SCIF Reserved Ports */
+#define SCIF_PORT_RSVD		1088
+
+typedef struct scif_endpt *scif_epd_t;
+
+#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
+#define SCIF_REGISTER_FAILED ((off_t)-1)
+#define SCIF_MMAP_FAILED ((void *)-1)
+
+/**
+ * scif_open() - Create an endpoint
+ *
+ * Return:
+ * Upon successful completion, scif_open() returns an endpoint descriptor to
+ * be used in subsequent SCIF function calls to refer to that endpoint;
+ * otherwise in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
+ * returned and errno is set to indicate the error; in kernel mode a NULL
+ * scif_epd_t is returned.
+ *
+ * Errors:
+ * ENOMEM - Insufficient kernel memory was available
+ */
+scif_epd_t scif_open(void);
+
+/**
+ * scif_bind() - Bind an endpoint to a port
+ * @epd:	endpoint descriptor
+ * @pn:		port number
+ *
+ * scif_bind() binds endpoint epd to port pn, where pn is a port number on the
+ * local node. If pn is zero, a port number greater than or equal to
+ * SCIF_PORT_RSVD is assigned and returned. Each endpoint may be bound to
+ * exactly one local port. Ports less than 1024, when requested, can only be
+ * bound by system (or root) processes or by processes run by privileged users.
+ *
+ * Return:
+ * Upon successful completion, scif_bind() returns the port number to which epd
+ * is bound; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINVAL - the endpoint or the port is already bound
+ * EISCONN - The endpoint is already connected
+ * ENOSPC - No port number available for assignment
+ * EACCES - The port requested is protected and the user is not the superuser
+ */
+int scif_bind(scif_epd_t epd, u16 pn);
+
+/**
+ * scif_listen() - Listen for connections on an endpoint
+ * @epd:	endpoint descriptor
+ * @backlog:	maximum pending connection requests
+ *
+ * scif_listen() marks the endpoint epd as a listening endpoint - that is, as
+ * an endpoint that will be used to accept incoming connection requests. Once
+ * so marked, the endpoint is said to be in the listening state and may not be
+ * used as the endpoint of a connection.
+ *
+ * The endpoint, epd, must have been bound to a port.
+ *
+ * The backlog argument defines the maximum length to which the queue of
+ * pending connections for epd may grow. If a connection request arrives when
+ * the queue is full, the client may receive an error with an indication that
+ * the connection was refused.
+ *
+ * Return:
+ * Upon successful completion, scif_listen() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINVAL - the endpoint is not bound to a port
+ * EISCONN - The endpoint is already connected or listening
+ */
+int scif_listen(scif_epd_t epd, int backlog);
+
+/**
+ * scif_connect() - Initiate a connection on a port
+ * @epd:	endpoint descriptor
+ * @dst:	global id of port to which to connect
+ *
+ * The scif_connect() function requests the connection of endpoint epd to remote
+ * port dst. If the connection is successful, a peer endpoint, bound to dst, is
+ * created on node dst.node. On successful return, the connection is complete.
+ *
+ * If the endpoint epd has not already been bound to a port, scif_connect()
+ * will bind it to an unused local port.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * In user space, scif_connect() supports an asynchronous connection mode
+ * if the application has set the O_NONBLOCK flag on the endpoint via the
+ * fcntl() system call. With this flag set, the calling process does not wait
+ * during scif_connect().
+ *
+ * Return:
+ * Upon successful completion, scif_connect() returns the port ID to which the
+ * endpoint, epd, is bound; otherwise in user mode -1 is returned and errno is
+ * set to indicate the error; in kernel mode the negative of one of the
+ * following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNREFUSED - The destination was not listening for connections or refused
+ * the connection request
+ * EINVAL - dst.port is not a valid port ID
+ * EISCONN - The endpoint is already connected
+ * ENOMEM - No buffer space is available
+ * ENODEV - The destination node does not exist, or the node is lost, or it
+ * existed but is not currently in the network since it may have crashed
+ * ENOSPC - No port number available for assignment
+ * EOPNOTSUPP - The endpoint is listening and cannot be connected
+ */
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst);
+
+/**
+ * scif_accept() - Accept a connection on an endpoint
+ * @epd:	endpoint descriptor
+ * @peer:	global id of port to which connected
+ * @newepd:	new connected endpoint descriptor
+ * @flags:	flags
+ *
+ * The scif_accept() call extracts the first connection request from the queue
+ * of pending connections for the port on which epd is listening. scif_accept()
+ * creates a new endpoint, bound to the same port as epd, and allocates a new
+ * SCIF endpoint descriptor, returned in newepd, for the endpoint. The new
+ * endpoint is connected to the endpoint through which the connection was
+ * requested. epd is unaffected by this call, and remains in the listening
+ * state.
+ *
+ * On successful return, peer holds the global port identifier (node id and
+ * local port number) of the port which requested the connection.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * The number of connections that can (subsequently) be accepted on epd is only
+ * limited by system resources (memory).
+ *
+ * The flags argument is formed by OR'ing together zero or more of the
+ * following values.
+ * SCIF_ACCEPT_SYNC - block until a connection request is presented. If
+ *			SCIF_ACCEPT_SYNC is not in flags, and no pending
+ *			connections are present on the queue, scif_accept()
+ *			fails with an EAGAIN error
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when there is a connection request. In kernel mode, the scif_poll()
+ * function may be used for this purpose. A readable event will be delivered
+ * when a connection is requested.
+ *
+ * Return:
+ * Upon successful completion, scif_accept() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EAGAIN - SCIF_ACCEPT_SYNC is not set and no connections are present to be
+ * accepted or SCIF_ACCEPT_SYNC is not set and remote node failed to complete
+ * its connection request
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINTR - Interrupted function
+ * EINVAL - epd is not a listening endpoint, or flags is invalid, or peer is
+ * NULL, or newepd is NULL
+ * ENODEV - The requesting node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOMEM - Not enough space
+ * ENOENT - Secondary part of epd registration failed
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
+		scif_epd_t *newepd, int flags);
+
+/**
+ * scif_close() - Close an endpoint
+ * @epd:	endpoint descriptor
+ *
+ * scif_close() closes an endpoint and performs necessary teardown of
+ * facilities associated with that endpoint.
+ *
+ * If epd is a listening endpoint then it will no longer accept connection
+ * requests on the port to which it is bound. Any pending connection requests
+ * are rejected.
+ *
+ * If epd is a connected endpoint, then its peer endpoint is also closed. RMAs
+ * which are in-process through epd or its peer endpoint will complete before
+ * scif_close() returns. Registered windows of the local and peer endpoints are
+ * released as if scif_unregister() was called against each window.
+ *
+ * Closing a SCIF endpoint does not affect local registered memory mapped by
+ * a SCIF endpoint on a remote node. The local memory remains mapped by the peer
+ * SCIF endpoint until the peer explicitly removes the mapping via munmap(..).
+ *
+ * If the peer endpoint's receive queue is not empty at the time that epd is
+ * closed, then the peer endpoint can be passed as the endpoint parameter to
+ * scif_recv() until the receive queue is empty.
+ *
+ * epd is freed and may no longer be accessed.
+ *
+ * Return:
+ * Upon successful completion, scif_close() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ */
+int scif_close(scif_epd_t epd);
+
+/**
+ * scif_send() - Send a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	blocking mode flags
+ *
+ * scif_send() sends data to the peer of endpoint epd. Up to len bytes of data
+ * are copied from memory starting at address msg. On successful execution the
+ * return value of scif_send() is the number of bytes that were sent, and is
+ * zero if no bytes were sent because len was zero. scif_send() may be called
+ * only when the endpoint is in a connected state.
+ *
+ * If a scif_send() call is non-blocking, then it sends only those bytes which
+ * can be sent without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_send() call is blocking, then it normally returns after sending
+ * all len bytes. If a blocking call is interrupted or the connection is
+ * reset, the call is considered successful if some bytes were sent or len is
+ * zero, otherwise the call is considered unsuccessful.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when the send queue is not full. In kernel mode, the scif_poll() function
+ * may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_send() will block until the entire message is sent if SCIF_SEND_BLOCK
+ * is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_send() returns the number of bytes sent;
+ * otherwise in user mode -1 is returned and errno is set to indicate the
+ * error; in kernel mode the negative of one of the following errors is
+ * returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - An invalid address was specified for a parameter
+ * EINVAL - flags is invalid, or len is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_recv() - Receive a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message buffer length
+ * @flags:	blocking mode flags
+ *
+ * scif_recv() receives data from the peer of endpoint epd. Up to len bytes of
+ * data are copied to memory starting at address msg. On successful execution
+ * the return value of scif_recv() is the number of bytes that were received,
+ * and is zero if no bytes were received because len was zero. scif_recv() may
+ * be called only when the endpoint is in a connected state.
+ *
+ * If a scif_recv() call is non-blocking, then it receives only those bytes
+ * which can be received without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_recv() call is blocking, then it normally returns after receiving
+ * all len bytes. If the blocking call was interrupted due to a disconnection,
+ * subsequent calls to scif_recv() will copy all bytes received up to the point
+ * of disconnection.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when data is available to be received. In kernel mode, the scif_poll()
+ * function may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_recv() will block until the entire message is received if
+ * SCIF_RECV_BLOCK is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_recv() returns the number of bytes
+ * received; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EAGAIN - The destination node is returning from a low power state
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - An invalid address was specified for a parameter
+ * EINVAL - flags is invalid, or len is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_register() - Mark a memory region for remote access.
+ * @epd:		endpoint descriptor
+ * @addr:		starting virtual address
+ * @len:		length of range
+ * @offset:		offset of window
+ * @prot_flags:		read/write protection flags
+ * @map_flags:		mapping flags
+ *
+ * The scif_register() function opens a window, a range of whole pages of the
+ * registered address space of the endpoint epd, starting at offset po and
+ * continuing for len bytes. The value of po, further described below, is a
+ * function of the parameters offset and len, and the value of map_flags. Each
+ * page of the window represents the physical memory page which backs the
+ * corresponding page of the range of virtual address pages starting at addr
+ * and continuing for len bytes. addr and len are constrained to be multiples
+ * of the page size. A successful scif_register() call returns po.
+ *
+ * When SCIF_MAP_FIXED is set in the map_flags argument, po will be offset
+ * exactly, and offset is constrained to be a multiple of the page size. The
+ * mapping established by scif_register() will not replace any existing
+ * registration; an error is returned if any page within the range [offset,
+ * offset + len - 1] intersects an existing window.
+ *
+ * When SCIF_MAP_FIXED is not set, the implementation uses offset in an
+ * implementation-defined manner to arrive at po. The po value so chosen will
+ * be an area of the registered address space that the implementation deems
+ * suitable for a mapping of len bytes. An offset value of 0 is interpreted as
+ * granting the implementation complete freedom in selecting po, subject to
+ * constraints described below. A non-zero value of offset is taken to be a
+ * suggestion of an offset near which the mapping should be placed. When the
+ * implementation selects a value for po, it does not replace any extant
+ * window. In all cases, po will be a multiple of the page size.
+ *
+ * The physical pages which are so represented by a window are available for
+ * access in calls to mmap(), scif_readfrom(), scif_writeto(),
+ * scif_vreadfrom(), and scif_vwriteto(). While a window is registered, the
+ * physical pages represented by the window will not be reused by the memory
+ * subsystem for any other purpose. Note that the same physical page may be
+ * represented by multiple windows.
+ *
+ * Subsequent operations which change the memory pages to which virtual
+ * addresses are mapped (such as mmap(), munmap()) have no effect on
+ * existing windows.
+ *
+ * If the process will fork(), it is recommended that the registered
+ * virtual address range be marked with MADV_DONTFORK. Doing so will prevent
+ * problems due to copy-on-write semantics.
+ *
+ * The prot_flags argument is formed by OR'ing together one or more of the
+ * following values.
+ * SCIF_PROT_READ - allow read operations from the window
+ * SCIF_PROT_WRITE - allow write operations to the window
+ *
+ * The map_flags argument can be set to SCIF_MAP_FIXED, which causes offset to
+ * be interpreted as a fixed offset.
+ *
+ * Return:
+ * Upon successful completion, scif_register() returns the offset at which the
+ * mapping was placed (po); otherwise in user mode SCIF_REGISTER_FAILED (that
+ * is ((off_t)-1)) is returned and errno is set to indicate the error; in
+ * kernel mode the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EADDRINUSE - SCIF_MAP_FIXED is set in map_flags, and pages in the range
+ * [offset, offset + len - 1] are already registered
+ * EAGAIN - The mapping could not be performed due to lack of resources
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - map_flags is invalid, or prot_flags is invalid, or SCIF_MAP_FIXED is
+ * set in map_flags and offset is not a multiple of the page size, or addr is
+ * not a multiple of the page size, or len is not a multiple of the page size
+ * or is 0, or offset is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
+		    int prot_flags, int map_flags);
+
+/**
+ * scif_unregister() - Unmark a memory region for remote access.
+ * @epd:	endpoint descriptor
+ * @offset:	start of range to unregister
+ * @len:	length of range to unregister
+ *
+ * The scif_unregister() function closes those previously registered windows
+ * which are entirely within the range [offset, offset + len - 1]. It is an
+ * error to specify a range which intersects only a subrange of a window.
+ *
+ * On a successful return, pages within the window may no longer be specified
+ * in calls to mmap(), scif_readfrom(), scif_writeto(), scif_vreadfrom(),
+ * scif_vwriteto(), scif_get_pages(), and scif_fence_signal(). The window,
+ * however, continues to exist until all previous references against it are
+ * removed. A window is referenced if there is a mapping to it created by
+ * mmap(), or if scif_get_pages() was called against the window
+ * (and the pages have not been returned via scif_put_pages()). A window is
+ * also referenced while an RMA, in which some range of the window is a source
+ * or destination, is in progress. Finally a window is referenced while some
+ * offset in that window was specified to scif_fence_signal(), and the RMAs
+ * marked by that call to scif_fence_signal() have not completed. While a
+ * window is in this state, its registered address space pages are not
+ * available for use in a new registered window.
+ *
+ * When all such references to the window have been removed, its references to
+ * all the physical pages which it represents are removed. Similarly, the
+ * registered address space pages of the window become available for
+ * registration in a new window.
+ *
+ * Return:
+ * Upon successful completion, scif_unregister() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned. In the event of an
+ * error, no windows are unregistered.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - the range [offset, offset + len - 1] intersects a subrange of a
+ * window, or offset is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [offset, offset + len - 1] are invalid for the
+ * registered address space of epd
+ */
+int scif_unregister(scif_epd_t epd, off_t offset, size_t len);
+
+/**
+ * scif_readfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space to
+ *		which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_readfrom() copies len bytes from the remote registered address space of
+ * the peer of endpoint epd, starting at the offset roffset to the local
+ * registered address space of epd, starting at the offset loffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_readfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if loffset and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by OR'ing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_readfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
+ * address space of epd, or the range [roffset, roffset + len - 1] is invalid
+ * for the registered address space of the peer of epd, or loffset or roffset
+ * is negative
+ */
+int scif_readfrom(scif_epd_t epd, off_t loffset, size_t len,
+		  off_t roffset, int rma_flags);
+
+/**
+ * scif_writeto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space
+ *		from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_writeto() copies len bytes from the local registered address space of
+ * epd, starting at the offset loffset to the remote registered address space
+ * of the peer of endpoint epd, starting at the offset roffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_writeto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to the
+ * completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if loffset and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *			engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_writeto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
+ * address space of epd, or the range [roffset, roffset + len - 1] is invalid
+ * for the registered address space of the peer of epd, or loffset or roffset
+ * is negative
+ */
+int scif_writeto(scif_epd_t epd, off_t loffset, size_t len, off_t
+		 roffset, int rma_flags);
+
+/**
+ * scif_vreadfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address to which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vreadfrom() copies len bytes from the remote registered address
+ * space of the peer of endpoint epd, starting at the offset roffset, to local
+ * memory, starting at addr.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may
+ * intersect multiple registered windows, but only if those windows are
+ * contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vreadfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to the
+ * completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if addr and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - enable registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *	the source range becomes visible on the destination node
+ *	after all other transferred data in the source range has
+ *	become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vreadfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ * registered address space of epd
+ */
+int scif_vreadfrom(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		   int rma_flags);
+
+/**
+ * scif_vwriteto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vwriteto() copies len bytes from the local memory, starting at addr, to
+ * the remote registered address space of the peer of endpoint epd, starting at
+ * the offset roffset.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may intersect
+ * multiple registered windows, but only if those windows are contiguous in the
+ * registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vwriteto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to the
+ * completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if addr and roffset are not separated by a multiple of
+ * 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - allow registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vwriteto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ * registered address space of epd
+ */
+int scif_vwriteto(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		  int rma_flags);
+
+/**
+ * scif_fence_mark() - Mark previously issued RMAs
+ * @epd:	endpoint descriptor
+ * @flags:	control flags
+ * @mark:	marked value returned as output.
+ *
+ * scif_fence_mark() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd. The RMAs are
+ * marked with a value returned at mark. The application may subsequently call
+ * scif_fence_wait(), passing the value returned at mark, to await completion
+ * of all RMAs so marked.
+ *
+ * The flags argument has exactly one of the following values.
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ *
+ * Return:
+ * Upon successful completion, scif_fence_mark() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENOMEM - Insufficient kernel memory was available
+ */
+int scif_fence_mark(scif_epd_t epd, int flags, int *mark);
+
+/**
+ * scif_fence_wait() - Wait for completion of marked RMAs
+ * @epd:	endpoint descriptor
+ * @mark:	mark request
+ *
+ * scif_fence_wait() returns after all RMAs marked with mark have completed.
+ * The value passed in mark must have been obtained in a previous call to
+ * scif_fence_mark().
+ *
+ * Return:
+ * Upon successful completion, scif_fence_wait() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENOMEM - Insufficient kernel memory was available
+ */
+int scif_fence_wait(scif_epd_t epd, int mark);
+
+/**
+ * scif_fence_signal() - Request a memory update on completion of RMAs
+ * @epd:	endpoint descriptor
+ * @loff:	local offset
+ * @lval:	local value to write to loff
+ * @roff:	remote offset
+ * @rval:	remote value to write to roff
+ * @flags:	flags
+ *
+ * scif_fence_signal() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or marking the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd.
+ *
+ * If flags includes SCIF_SIGNAL_LOCAL, then on completion of the RMAs in the
+ * marked set, lval is written to memory at the address corresponding to offset
+ * loff in the local registered address space of epd. loff must be within a
+ * registered window. If flags includes SCIF_SIGNAL_REMOTE, then on completion
+ * of the RMAs in the marked set, rval is written to memory at the address
+ * corresponding to offset roff in the remote registered address space of epd.
+ * roff must be within a remote registered window of the peer of epd. Note
+ * that any specified offset must be DWORD (4 byte / 32 bit) aligned.
+ *
+ * The flags argument is formed by OR'ing together the following.
+ * Exactly one of the following values.
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ * One or more of the following values.
+ * SCIF_SIGNAL_LOCAL - On completion of the marked set of RMAs, write lval to
+ *	memory at the address corresponding to offset loff in the local
+ *	registered address space of epd.
+ * SCIF_SIGNAL_REMOTE - On completion of the marked set of RMAs, write rval to
+ *	memory at the address corresponding to offset roff in the remote
+ *	registered address space of epd.
+ *
+ * Return:
+ * Upon successful completion, scif_fence_signal() returns 0; otherwise in
+ * user mode -1 is returned and errno is set to indicate the error; in kernel
+ * mode the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - flags is invalid, or loff or roff is not DWORD aligned
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - loff is invalid for the registered address space of epd, or roff is
+ * invalid for the registered address space of the peer of epd
+ */
+int scif_fence_signal(scif_epd_t epd, off_t loff, u64 lval, off_t roff,
+		      u64 rval, int flags);
+
+/**
+ * scif_get_node_ids() - Return information about online nodes
+ * @nodes:	array in which to return online node IDs
+ * @len:	number of entries in the nodes array
+ * @self:	address to place the node ID of the local node
+ *
+ * scif_get_node_ids() fills in the nodes array with up to len node IDs of the
+ * nodes in the SCIF network. If there is not enough space in nodes, as
+ * indicated by the len parameter, only len node IDs are returned in nodes. The
+ * return value of scif_get_node_ids() is the total number of nodes currently in
+ * the SCIF network. By checking the return value against the len parameter,
+ * the user may determine if enough space for nodes was allocated.
+ *
+ * The node ID of the local node is returned at self.
+ *
+ * Return:
+ * Upon successful completion, scif_get_node_ids() returns the actual number of
+ * online nodes in the SCIF network including 'self'; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode no
+ * errors are returned.
+ *
+ * Errors:
+ * EFAULT - Bad address
+ */
+int scif_get_node_ids(u16 *nodes, int len, u16 *self);
+
+#endif /* __SCIF_H__ */
diff --git a/include/uapi/linux/scif_ioctl.h b/include/uapi/linux/scif_ioctl.h
new file mode 100644
index 0000000..4a94d91
--- /dev/null
+++ b/include/uapi/linux/scif_ioctl.h
@@ -0,0 +1,130 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+/*
+ * -----------------------------------------
+ * SCIF IOCTL interface information
+ * -----------------------------------------
+ */
+#ifndef SCIF_IOCTL_H
+#define SCIF_IOCTL_H
+
+#include <linux/types.h>
+
+/**
+ * struct scif_port_id - SCIF port information
+ * @node:	node on which port resides
+ * @port:	local port number
+ */
+struct scif_port_id {
+	__u16 node;
+	__u16 port;
+};
+
+/**
+ * struct scifioctl_connect - used for SCIF_CONNECT IOCTL
+ * @self:	used to read back the assigned port_id
+ * @peer:	destination node and port to connect to
+ */
+struct scifioctl_connect {
+	struct scif_port_id	self;
+	struct scif_port_id	peer;
+};
+
+/**
+ * struct scifioctl_accept - used for SCIF_ACCEPTREQ IOCTL
+ * @flags:	flags
+ * @peer:	global id of peer endpoint
+ * @endpt:	new connected endpoint descriptor
+ */
+struct scifioctl_accept {
+	__s32			flags;
+	struct scif_port_id	peer;
+	__u64			endpt;
+};
+
+/**
+ * struct scifioctl_msg - used for SCIF_SEND/SCIF_RECV IOCTL
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	flags
+ * @out_len:	number of bytes sent/received
+ */
+struct scifioctl_msg {
+	__u64	msg;
+	__s32	len;
+	__s32	flags;
+	__s32	out_len;
+};
+
+/**
+ * struct scifioctl_node_ids - used for SCIF_GET_NODEIDS IOCTL
+ * @nodes:	pointer to an array of node_ids
+ * @self:	ID of the current node
+ * @len:	length of array
+ */
+struct scifioctl_node_ids {
+	__u64	nodes;
+	__u64	self;
+	__s32	len;
+};
+
+#define SCIF_BIND		_IOWR('s', 1, __u64)
+#define SCIF_LISTEN		_IOW('s', 2, __s32)
+#define SCIF_CONNECT		_IOWR('s', 3, struct scifioctl_connect)
+#define SCIF_ACCEPTREQ		_IOWR('s', 4, struct scifioctl_accept)
+#define SCIF_ACCEPTREG		_IOWR('s', 5, __u64)
+#define SCIF_SEND		_IOWR('s', 6, struct scifioctl_msg)
+#define SCIF_RECV		_IOWR('s', 7, struct scifioctl_msg)
+#define SCIF_GET_NODEIDS	_IOWR('s', 14, struct scifioctl_node_ids)
+
+#endif /* SCIF_IOCTL_H */
-- 
1.8.2.1
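
The alignment guidance in the RMA kernel-doc above (best when both offsets
are multiples of 64, lower when they are only separated by a multiple of 64,
lowest otherwise) can be checked before issuing a transfer. The following is
a minimal userspace sketch of that classification, not part of the patch. The
names rma_align_tier and RMA_CACHELINE are hypothetical; only the value 64
comes from the header comments.

```c
#include <stdint.h>

/* Hypothetical helper names; the 64-byte cacheline is from the kernel-doc. */
#define RMA_CACHELINE 64u

enum rma_align_tier {
	RMA_ALIGN_BEST,		/* both offsets are multiples of 64 */
	RMA_ALIGN_LOWER,	/* unaligned, but separated by a multiple of 64 */
	RMA_ALIGN_LOWEST,	/* not separated by a multiple of 64 */
};

/* Classify a (local, remote) offset pair into the three documented tiers. */
static enum rma_align_tier rma_align_tier(uint64_t loffset, uint64_t roffset)
{
	uint64_t lrem = loffset % RMA_CACHELINE;
	uint64_t rrem = roffset % RMA_CACHELINE;

	if (lrem == 0 && rrem == 0)
		return RMA_ALIGN_BEST;
	/* Equal remainders mean the offsets differ by a multiple of 64. */
	if (lrem == rrem)
		return RMA_ALIGN_LOWER;
	return RMA_ALIGN_LOWEST;
}
```

For scif_vreadfrom()/scif_vwriteto() the same check would presumably apply
with addr cast to uintptr_t in place of loffset. The tiers only describe the
likely relative DMA performance stated in the comments; they say nothing
about correctness of the transfer.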


^ permalink raw reply related	[relevance 22%]

* [PATCH char-misc-next v4 02/13] misc: mic: SCIF ring buffer infrastructure
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
  2015-03-31  1:36 22% ` [PATCH char-misc-next v4 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
@ 2015-03-31  1:36 58% ` Sudeep Dutt
  2015-03-31  1:36 61% ` [PATCH char-misc-next v4 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF ring buffer is a single producer, single consumer byte stream
ring buffer optimized for avoiding reads across the PCIe bus while
adding the required barriers and hardware workarounds for the MIC
Coprocessor. The ring buffer is used to implement a receive queue for
SCIF driver messaging between two nodes and for byte stream messaging
between SCIF endpoints. The existing in-kernel ring buffer was not
reused since it has not been designed for our use across the PCIe bus
where each node runs an independent OS. Each SCIF node has a receive
queue for every other SCIF node, and each connected endpoint has a
receive queue for messages from its peer. This pair of receive
queues is referred to as a SCIF queue pair.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_rb.h | 100 ++++++++++++++++
 drivers/misc/mic/scif/scif_rb.c | 248 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 348 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.c
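
For readers new to the design described in the commit message, here is a
minimal single-threaded userspace sketch of the same idea: a power-of-two
byte ring where each side keeps a private offset and publishes it in a
separate step (write then commit, read then update). It is an illustration
under assumptions, not the driver code. All demo_* names are hypothetical,
the shared offsets are plain struct fields instead of pointers into peer
memory, and the memory barriers and MMIO copies the real code needs across
PCIe are omitted. The count/space arithmetic mirrors CIRC_CNT()/CIRC_SPACE()
from <linux/circ_buf.h>.

```c
#include <stdint.h>
#include <string.h>

/* Same arithmetic as CIRC_CNT()/CIRC_SPACE(); size must be a power of two. */
#define DEMO_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define DEMO_SPACE(head, tail, size) DEMO_CNT((tail), ((head) + 1), (size))

struct demo_rb {
	uint8_t *base;
	uint32_t size;		/* bytes, power of two */
	uint32_t shared_read;	/* stands in for *read_ptr */
	uint32_t shared_write;	/* stands in for *write_ptr */
	uint32_t cur_read;	/* reader's private offset */
	uint32_t cur_write;	/* writer's private offset */
};

static void demo_rb_init(struct demo_rb *rb, uint8_t *base, uint8_t log2_size)
{
	memset(rb, 0, sizeof(*rb));
	rb->base = base;
	rb->size = 1u << log2_size;
}

/* Stage a message; the reader cannot see it until demo_rb_commit(). */
static int demo_rb_write(struct demo_rb *rb, const void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->cur_write;

	if (DEMO_SPACE(rb->cur_write, rb->shared_read, rb->size) < len)
		return -1;
	if (first > len)
		first = len;
	/* Two copies handle the wrap-around case. */
	memcpy(rb->base + rb->cur_write, msg, first);
	memcpy(rb->base, (const uint8_t *)msg + first, len - first);
	rb->cur_write = (rb->cur_write + len) & (rb->size - 1);
	return 0;
}

/* Publish staged data. A cross-bus version needs a write barrier first. */
static void demo_rb_commit(struct demo_rb *rb)
{
	rb->shared_write = rb->cur_write;
}

/* Read len bytes if that many are committed; returns len or 0. */
static uint32_t demo_rb_get_next(struct demo_rb *rb, void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->cur_read;

	if (DEMO_CNT(rb->shared_write, rb->cur_read, rb->size) < len)
		return 0;
	if (first > len)
		first = len;
	memcpy(msg, rb->base + rb->cur_read, first);
	memcpy((uint8_t *)msg + first, rb->base, len - first);
	rb->cur_read = (rb->cur_read + len) & (rb->size - 1);
	return len;
}

/* Hand consumed space back to the writer. */
static void demo_rb_update_read_ptr(struct demo_rb *rb)
{
	rb->shared_read = rb->cur_read;
}
```

Because each side only writes its own shared offset, neither side needs a
lock against the other, which is what makes the scheme usable between two
independent operating systems. Note that CIRC_SPACE() always leaves one byte
unused so that a full ring can be told apart from an empty one.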

diff --git a/drivers/misc/mic/scif/scif_rb.h b/drivers/misc/mic/scif/scif_rb.h
new file mode 100644
index 0000000..166dffe
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.h
@@ -0,0 +1,100 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef SCIF_RB_H
+#define SCIF_RB_H
+/*
+ * This file describes a general purpose, byte based ring buffer. Writers to the
+ * ring buffer need to synchronize using a lock. The same is true for readers,
+ * although in practice, the ring buffer has a single reader. It is lockless
+ * between producer and consumer so it can handle being used across the PCIe
+ * bus. The ring buffer ensures that there are no reads across the PCIe bus for
+ * performance reasons. Two of these are used to form a single bidirectional
+ * queue-pair across PCIe.
+ */
+/*
+ * struct scif_rb - SCIF Ring Buffer
+ *
+ * @rb_base: The base of the memory used for storing RB messages
+ * @read_ptr: Pointer to the read offset
+ * @write_ptr: Pointer to the write offset
+ * @size: Size of the memory in rb_base
+ * @current_read_offset: Cached read offset for performance
+ * @current_write_offset: Cached write offset for performance
+ */
+struct scif_rb {
+	void *rb_base;
+	u32 *read_ptr;
+	u32 *write_ptr;
+	u32 size;
+	u32 current_read_offset;
+	u32 current_write_offset;
+};
+
+/* methods used by both readers and writers */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size);
+/* writer only methods */
+/* write a new command, then scif_rb_commit() */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size);
+/* after write(), then scif_rb_commit() */
+void scif_rb_commit(struct scif_rb *rb);
+/* query space available for writing to a RB. */
+u32 scif_rb_space(struct scif_rb *rb);
+
+/* reader only methods */
+/* read a new message from the ring buffer of size bytes */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size);
+/* update the read pointer so that the space can be reused */
+void scif_rb_update_read_ptr(struct scif_rb *rb);
+/* count the number of bytes that can be read */
+u32 scif_rb_count(struct scif_rb *rb, u32 size);
+#endif
diff --git a/drivers/misc/mic/scif/scif_rb.c b/drivers/misc/mic/scif/scif_rb.c
new file mode 100644
index 0000000..47de222
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.c
@@ -0,0 +1,248 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/circ_buf.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+#include "scif_rb.h"
+
+#define scif_rb_ring_cnt(head, tail, size) CIRC_CNT(head, tail, size)
+#define scif_rb_ring_space(head, tail, size) CIRC_SPACE(head, tail, size)
+
+/**
+ * scif_rb_init - Initializes the ring buffer
+ * @rb: ring buffer
+ * @read_ptr: A pointer to the read offset
+ * @write_ptr: A pointer to the write offset
+ * @rb_base: A pointer to the base of the ring buffer
+ * @size: log2 of the ring buffer size in bytes; rb->size is set to 1 << size
+ */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size)
+{
+	rb->rb_base = rb_base;
+	rb->size = (1 << size);
+	rb->read_ptr = read_ptr;
+	rb->write_ptr = write_ptr;
+	rb->current_read_offset = *read_ptr;
+	rb->current_write_offset = *write_ptr;
+}
+
+/* Copies a message to the ring buffer -- handles the wrap around case */
+static void memcpy_torb(struct scif_rb *rb, void *header,
+			void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_toio((void __iomem __force *)header, msg, size1);
+		memcpy_toio((void __iomem __force *)rb->rb_base,
+			    msg + size1, size2);
+	} else {
+		memcpy_toio((void __iomem __force *)header, msg, size);
+	}
+}
+
+/* Copies a message from the ring buffer -- handles the wrap around case */
+static void memcpy_fromrb(struct scif_rb *rb, void *header,
+			  void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_fromio(msg, (void __iomem __force *)header, size1);
+		memcpy_fromio(msg + size1,
+			      (void __iomem __force *)rb->rb_base, size2);
+	} else {
+		memcpy_fromio(msg, (void __iomem __force *)header, size);
+	}
+}
+
+/**
+ * scif_rb_space - Query space available for writing to the RB
+ * @rb: ring buffer
+ *
+ * Return: size available for writing to RB in bytes.
+ */
+u32 scif_rb_space(struct scif_rb *rb)
+{
+	rb->current_read_offset = *rb->read_ptr;
+	/*
+	 * Update from the HW read pointer only once the peer has exposed the
+	 * new empty slot. This barrier is paired with the memory barrier in
+	 * scif_rb_update_read_ptr()
+	 */
+	mb();
+	return scif_rb_ring_space(rb->current_write_offset,
+				  rb->current_read_offset, rb->size);
+}
+
+/**
+ * scif_rb_write - Write a message to the RB
+ * @rb: ring buffer
+ * @msg: buffer to send the message.  Must be at least size bytes long
+ * @size: the size (in bytes) to be copied to the RB
+ *
+ * This API does not block if there isn't enough space in the RB.
+ * Returns: 0 on success or -ENOMEM on failure
+ */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header;
+
+	if (scif_rb_space(rb) < size)
+		return -ENOMEM;
+	header = rb->rb_base + rb->current_write_offset;
+	memcpy_torb(rb, header, msg, size);
+	/*
+	 * Only the local write offset is advanced here. The shared write
+	 * pointer is not updated until scif_rb_commit().
+	 */
+	rb->current_write_offset =
+		(rb->current_write_offset + size) & (rb->size - 1);
+	return 0;
+}
+
+/**
+ * scif_rb_commit - Publish the written messages so the peer can fetch them
+ * @rb: ring buffer
+ */
+void scif_rb_commit(struct scif_rb *rb)
+{
+	/*
+	 * We must ensure ordering between all the data committed
+	 * previously before we expose the new message to the peer by
+	 * updating the write_ptr. This write barrier is paired with
+	 * the read barrier in scif_rb_count(..)
+	 */
+	wmb();
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#endif
+}
+
+/**
+ * scif_rb_get - Get the next message from the ring buffer
+ * @rb: ring buffer
+ * @size: Number of bytes to be read
+ *
+ * Return: NULL if no bytes to be read from the ring buffer, otherwise the
+ *	pointer to the next byte
+ */
+static void *scif_rb_get(struct scif_rb *rb, u32 size)
+{
+	void *header = NULL;
+
+	if (scif_rb_count(rb, size) >= size)
+		header = rb->rb_base + rb->current_read_offset;
+	return header;
+}
+
+/**
+ * scif_rb_get_next - Read from ring buffer.
+ * @rb: ring buffer
+ * @msg: buffer to hold the message.  Must be at least size bytes long
+ * @size: Number of bytes to be read
+ *
+ * Return: number of bytes read if available bytes are >= size, otherwise
+ * returns zero.
+ */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header = NULL;
+	int read_size = 0;
+
+	header = scif_rb_get(rb, size);
+	if (header) {
+		u32 next_cmd_offset =
+			(rb->current_read_offset + size) & (rb->size - 1);
+
+		read_size = size;
+		rb->current_read_offset = next_cmd_offset;
+		memcpy_fromrb(rb, header, msg, size);
+	}
+	return read_size;
+}
+
+/**
+ * scif_rb_update_read_ptr - Publish the local read offset to the peer
+ * @rb: ring buffer
+ */
+void scif_rb_update_read_ptr(struct scif_rb *rb)
+{
+	u32 new_offset;
+
+	new_offset = rb->current_read_offset;
+	/*
+	 * We must ensure ordering between all the data committed or read
+	 * previously before we expose the empty slot to the peer by updating
+	 * the read_ptr. This barrier is paired with the memory barrier in
+	 * scif_rb_space(..)
+	 */
+	mb();
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si Bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#endif
+}
+
+/**
+ * scif_rb_count - Query the number of bytes available for reading
+ * @rb: ring buffer
+ * @size: Number of bytes expected to be read
+ *
+ * Return: number of bytes that can be read from the RB
+ */
+u32 scif_rb_count(struct scif_rb *rb, u32 size)
+{
+	if (scif_rb_ring_cnt(rb->current_write_offset,
+			     rb->current_read_offset,
+			     rb->size) < size) {
+		rb->current_write_offset = *rb->write_ptr;
+		/*
+		 * Update from the HW write pointer if empty only once the peer
+		 * has exposed the new message. This read barrier is paired
+		 * with the write barrier in scif_rb_commit(..)
+		 */
+		smp_rmb();
+	}
+	return scif_rb_ring_cnt(rb->current_write_offset,
+				rb->current_read_offset,
+				rb->size);
+}
-- 
1.8.2.1
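
For readers following the offset arithmetic in the ring buffer patch
above, here is a minimal userspace sketch of the same power-of-two
masking and wrap-around copy. It is an illustration only, not the driver
code: struct demo_rb, the demo_rb_* helpers and the RB_CNT/RB_SPACE
macros are hypothetical stand-ins (the macros mirror the kernel's
CIRC_CNT()/CIRC_SPACE()), and the shared read/write pointers and memory
barriers are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel's CIRC_CNT()/CIRC_SPACE(); size is a power of 2. */
#define RB_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define RB_SPACE(head, tail, size) RB_CNT((tail), ((head) + 1), (size))

/* Hypothetical single-process ring: no shared pointers, no barriers. */
struct demo_rb {
	uint8_t *base;
	uint32_t size;	/* bytes, always a power of two */
	uint32_t rd;	/* read offset */
	uint32_t wr;	/* write offset */
};

static void demo_rb_init(struct demo_rb *rb, uint8_t *base, uint8_t log2_size)
{
	assert(log2_size < 32);
	rb->base = base;
	rb->size = 1u << log2_size;
	rb->rd = 0;
	rb->wr = 0;
}

/* Copy len bytes in, splitting the copy in two if it wraps. */
static int demo_rb_write(struct demo_rb *rb, const void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->wr;

	if (RB_SPACE(rb->wr, rb->rd, rb->size) < len)
		return -1;
	if (first > len)
		first = len;
	memcpy(rb->base + rb->wr, msg, first);
	memcpy(rb->base, (const uint8_t *)msg + first, len - first);
	rb->wr = (rb->wr + len) & (rb->size - 1);
	return 0;
}

/* Copy len bytes out; returns len, or 0 if fewer than len are queued. */
static uint32_t demo_rb_read(struct demo_rb *rb, void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->rd;

	if (RB_CNT(rb->wr, rb->rd, rb->size) < len)
		return 0;
	if (first > len)
		first = len;
	memcpy(msg, rb->base + rb->rd, first);
	memcpy((uint8_t *)msg + first, rb->base, len - first);
	rb->rd = (rb->rd + len) & (rb->size - 1);
	return len;
}
```

With an 8 byte ring at most 7 bytes can be queued, because one slot is
kept free to tell a full ring from an empty one.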



* [PATCH char-misc-next v4 03/13] misc: mic: SCIF Hardware Bus
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
  2015-03-31  1:36 22% ` [PATCH char-misc-next v4 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
  2015-03-31  1:36 58% ` [PATCH char-misc-next v4 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
@ 2015-03-31  1:36 61% ` Sudeep Dutt
  2015-03-31  1:36 78% ` [PATCH char-misc-next v4 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF hardware bus abstracts the low level hardware driver details
like interrupts and mapping remote memory so that the same SCIF driver
can work without any changes with the MIC host or card driver as long
as the hardware bus operations are implemented. The SCIF hardware
device is registered by the host and card drivers on the SCIF hardware
bus resulting in probing the SCIF driver.
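
As a rough illustration of the abstraction described above, the
userspace sketch below routes one common "driver" function through a
table of hardware operations, so that the same code runs against two
different back ends. All demo_* names are hypothetical, the ops table is
a trimmed-down analogue of struct scif_hw_ops, and the doorbell numbers
are invented for the example.

```c
#include <assert.h>

struct demo_hw_dev;

/* Hypothetical subset of the hardware operations. */
struct demo_hw_ops {
	int (*next_db)(struct demo_hw_dev *dev);
	void (*send_intr)(struct demo_hw_dev *dev, int db);
};

struct demo_hw_dev {
	const struct demo_hw_ops *ops;
	int next_free_db;	/* next doorbell to hand out */
	int last_db_rung;	/* last doorbell an interrupt was sent on */
};

/* "Host" back end: hands out doorbells counting up. */
static int demo_host_next_db(struct demo_hw_dev *dev)
{
	return dev->next_free_db++;
}

/* "Card" back end: always reuses one fixed doorbell (an assumption). */
static int demo_card_next_db(struct demo_hw_dev *dev)
{
	return dev->next_free_db;
}

/* Both back ends simply record the doorbell that was rung. */
static void demo_send_intr(struct demo_hw_dev *dev, int db)
{
	dev->last_db_rung = db;
}

static const struct demo_hw_ops demo_host_ops = {
	.next_db = demo_host_next_db,
	.send_intr = demo_send_intr,
};

static const struct demo_hw_ops demo_card_ops = {
	.next_db = demo_card_next_db,
	.send_intr = demo_send_intr,
};

/* Common code: never looks at which back end it is talking to. */
static int demo_probe(struct demo_hw_dev *dev)
{
	int db;

	assert(dev && dev->ops);
	db = dev->ops->next_db(dev);
	dev->ops->send_intr(dev, db);
	return db;
}
```

demo_probe() plays the role of the SCIF driver here: only the ops table
passed in at registration time differs between host and card.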

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig        |  17 ++++
 drivers/misc/mic/Makefile       |   2 +-
 drivers/misc/mic/bus/Makefile   |   1 +
 drivers/misc/mic/bus/scif_bus.h | 129 ++++++++++++++++++++++++
 drivers/misc/mic/bus/scif_bus.c | 210 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 358 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index cc4eef0..afc9f53 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -15,6 +15,23 @@ config INTEL_MIC_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "SCIF Bus Driver"
+
+config SCIF_BUS
+	tristate "SCIF Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the SCIF Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e9bf148..6bc6083 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,4 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
-obj-$(CONFIG_INTEL_MIC_BUS) += bus/
+obj-y += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index d85c7f2..1ed37e2 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -3,3 +3,4 @@
 # Copyright(c) 2014, Intel Corporation.
 #
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
+obj-$(CONFIG_SCIF_BUS) += scif_bus.o
diff --git a/drivers/misc/mic/bus/scif_bus.h b/drivers/misc/mic/bus/scif_bus.h
new file mode 100644
index 0000000..335a228
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.h
@@ -0,0 +1,129 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#ifndef _SCIF_BUS_H_
+#define _SCIF_BUS_H_
+/*
+ * Everything a scif driver needs to work with any particular scif
+ * hardware abstraction layer.
+ */
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+struct scif_hw_dev_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define MIC_SCIF_DEV 1
+#define SCIF_DEV_ANY_ID 0xffffffff
+
+/**
+ * scif_hw_dev - representation of a hardware device abstracted for scif
+ * @hw_ops: the hardware ops supported by this device
+ * @id: the device type identification (used to match it with a driver)
+ * @mmio: MMIO memory window
+ * @aper: Aperture memory window
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ * @snode: The source node for this device.
+ * @dp: Self device page
+ * @rdp: Remote device page
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ */
+struct scif_hw_dev {
+	struct scif_hw_ops *hw_ops;
+	struct scif_hw_dev_id id;
+	struct mic_mw *mmio;
+	struct mic_mw *aper;
+	struct device dev;
+	u8 dnode;
+	u8 snode;
+	void *dp;
+	void __iomem *rdp;
+	struct dma_chan **dma_ch;
+	int num_dma_ch;
+};
+
+/**
+ * scif_driver - operations for a scif I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_driver {
+	struct device_driver driver;
+	const struct scif_hw_dev_id *id_table;
+	int (*probe)(struct scif_hw_dev *dev);
+	void (*remove)(struct scif_hw_dev *dev);
+};
+
+/**
+ * scif_hw_ops - Hardware operations for accessing a SCIF device on the SCIF bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @send_intr: Send an interrupt to the remote node on a specified doorbell.
+ * @send_p2p_intr: Send an interrupt to the peer node on a specified doorbell
+ * which is specifically targeted for a peer to peer node.
+ * @ioremap: Map a buffer with the specified physical address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ */
+struct scif_hw_ops {
+	int (*next_db)(struct scif_hw_dev *sdev);
+	struct mic_irq * (*request_irq)(struct scif_hw_dev *sdev,
+					irqreturn_t (*func)(int irq,
+							    void *data),
+					const char *name, void *data,
+					int db);
+	void (*free_irq)(struct scif_hw_dev *sdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct scif_hw_dev *sdev, int num);
+	void (*send_intr)(struct scif_hw_dev *sdev, int db);
+	void (*send_p2p_intr)(struct scif_hw_dev *sdev, int db,
+			      struct mic_mw *mw);
+	void __iomem * (*ioremap)(struct scif_hw_dev *sdev,
+				  phys_addr_t pa, size_t len);
+	void (*iounmap)(struct scif_hw_dev *sdev, void __iomem *va);
+};
+
+int scif_register_driver(struct scif_driver *driver);
+void scif_unregister_driver(struct scif_driver *driver);
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id,
+		     struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper,
+		     void *dp, void __iomem *rdp,
+		     struct dma_chan **chan, int num_chan);
+void scif_unregister_device(struct scif_hw_dev *sdev);
+
+static inline struct scif_hw_dev *dev_to_scif(struct device *dev)
+{
+	return container_of(dev, struct scif_hw_dev, dev);
+}
+
+static inline struct scif_driver *drv_to_scif(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_driver, driver);
+}
+#endif /* _SCIF_BUS_H_ */
diff --git a/drivers/misc/mic/bus/scif_bus.c b/drivers/misc/mic/bus/scif_bus.c
new file mode 100644
index 0000000..2da7cee
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.c
@@ -0,0 +1,210 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "scif_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "scif:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *scif_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(scif_dev);
+
+static inline int scif_id_match(const struct scif_hw_dev *dev,
+				const struct scif_hw_dev_id *id)
+{
+	if (id->device != dev->id.device && id->device != SCIF_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == SCIF_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call scif_dev_probe().
+ */
+static int scif_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+	const struct scif_hw_dev_id *ids;
+
+	ids = drv_to_scif(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (scif_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int scif_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+
+	return add_uevent_var(env, "MODALIAS=scif:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int scif_dev_probe(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_dev_remove(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_bus = {
+	.name  = "scif_bus",
+	.match = scif_dev_match,
+	.dev_groups = scif_dev_groups,
+	.uevent = scif_uevent,
+	.probe = scif_dev_probe,
+	.remove = scif_dev_remove,
+};
+
+int scif_register_driver(struct scif_driver *driver)
+{
+	driver->driver.bus = &scif_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_register_driver);
+
+void scif_unregister_driver(struct scif_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_driver);
+
+static void scif_release_dev(struct device *d)
+{
+	struct scif_hw_dev *sdev = dev_to_scif(d);
+
+	kfree(sdev);
+}
+
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper, void *dp,
+		     void __iomem *rdp, struct dma_chan **chan, int num_chan)
+{
+	int ret;
+	struct scif_hw_dev *sdev;
+
+	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
+	if (!sdev)
+		return ERR_PTR(-ENOMEM);
+
+	sdev->dev.parent = pdev;
+	sdev->id.device = id;
+	sdev->id.vendor = SCIF_DEV_ANY_ID;
+	sdev->dev.archdata.dma_ops = dma_ops;
+	sdev->dev.release = scif_release_dev;
+	sdev->hw_ops = hw_ops;
+	sdev->dnode = dnode;
+	sdev->snode = snode;
+	dev_set_drvdata(&sdev->dev, sdev);
+	sdev->dev.bus = &scif_bus;
+	sdev->mmio = mmio;
+	sdev->aper = aper;
+	sdev->dp = dp;
+	sdev->rdp = rdp;
+	sdev->dev.dma_mask = &sdev->dev.coherent_dma_mask;
+	dma_set_mask(&sdev->dev, DMA_BIT_MASK(64));
+	sdev->dma_ch = chan;
+	sdev->num_dma_ch = num_chan;
+	dev_set_name(&sdev->dev, "scif-dev%u", sdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&sdev->dev);
+	if (ret)
+		goto free_sdev;
+	return sdev;
+free_sdev:
+	put_device(&sdev->dev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(scif_register_device);
+
+void scif_unregister_device(struct scif_hw_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_device);
+
+static int __init scif_init(void)
+{
+	return bus_register(&scif_bus);
+}
+
+static void __exit scif_exit(void)
+{
+	bus_unregister(&scif_bus);
+}
+
+core_initcall(scif_init);
+module_exit(scif_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1



* [PATCH char-misc-next v4 04/13] misc: mic: SCIF Peer Bus
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (2 preceding siblings ...)
  2015-03-31  1:36 61% ` [PATCH char-misc-next v4 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
@ 2015-03-31  1:36 78% ` Sudeep Dutt
  2015-03-31  1:36 96% ` [PATCH char-misc-next v4 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF peer bus is used to register and unregister SCIF peer devices
internally by the SCIF driver to signify the addition and removal of
peer nodes respectively from the SCIF network. This simplifies remote node
handling within SCIF and will also be used to support device probe/remove
for SCIF client drivers (e.g. netdev over SCIF).
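
To make the peer bus matching rule concrete, the userspace sketch below
reproduces the two pieces this patch relies on: the "scif_peer-dev%u"
device name and the comparison of the first four characters of the
device name against the driver name. The demo_* function names and the
driver names used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the peer device name the same way the patch formats it. */
static void demo_peer_dev_name(char *buf, size_t len, unsigned int dnode)
{
	assert(buf && len > 0);
	snprintf(buf, len, "scif_peer-dev%u", dnode);
}

/* Match on the first four characters only; returns 1 on a match. */
static int demo_peer_match(const char *dev_name, const char *drv_name)
{
	return !strncmp(dev_name, drv_name, 4);
}
```

Under this rule any driver whose name starts with "scif" matches every
peer device, for example a driver called "scif_peer" matches
"scif_peer-dev2", while a driver called "mic_bus" does not.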

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_peer_bus.h |  65 ++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c | 124 ++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c

diff --git a/drivers/misc/mic/scif/scif_peer_bus.h b/drivers/misc/mic/scif/scif_peer_bus.h
new file mode 100644
index 0000000..33f0dbb
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.h
@@ -0,0 +1,65 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef _SCIF_PEER_BUS_H_
+#define _SCIF_PEER_BUS_H_
+
+#include <linux/device.h>
+#include <linux/mic_common.h>
+
+/*
+ * Peer devices show up as PCIe devices for the mgmt node but not the cards.
+ * The mgmt node discovers all the cards on the PCIe bus and informs the other
+ * cards about their peers. Upon notification of a peer, a node adds a peer
+ * device to the peer bus to maintain symmetry in the way devices are
+ * discovered across all nodes in the SCIF network.
+ */
+/**
+ * scif_peer_dev - representation of a peer SCIF device
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ */
+struct scif_peer_dev {
+	struct device dev;
+	u8 dnode;
+};
+
+/**
+ * scif_peer_driver - operations for a scif_peer I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_peer_driver {
+	struct device_driver driver;
+	const struct scif_peer_dev_id *id_table;
+
+	int (*probe)(struct scif_peer_dev *dev);
+	void (*remove)(struct scif_peer_dev *dev);
+};
+
+struct scif_dev;
+
+int scif_peer_register_driver(struct scif_peer_driver *driver);
+void scif_peer_unregister_driver(struct scif_peer_driver *driver);
+
+struct scif_peer_dev *scif_peer_register_device(struct scif_dev *sdev);
+void scif_peer_unregister_device(struct scif_peer_dev *sdev);
+
+int scif_peer_bus_init(void);
+void scif_peer_bus_exit(void);
+#endif /* _SCIF_PEER_BUS_H_ */
diff --git a/drivers/misc/mic/scif/scif_peer_bus.c b/drivers/misc/mic/scif/scif_peer_bus.c
new file mode 100644
index 0000000..589ae9a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#include "scif_main.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+
+static inline struct scif_peer_dev *
+dev_to_scif_peer(struct device *dev)
+{
+	return container_of(dev, struct scif_peer_dev, dev);
+}
+
+static inline struct scif_peer_driver *
+drv_to_scif_peer(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_peer_driver, driver);
+}
+
+static int scif_peer_dev_match(struct device *dv, struct device_driver *dr)
+{
+	return !strncmp(dev_name(dv), dr->name, 4);
+}
+
+static int scif_peer_dev_probe(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_peer_dev_remove(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_peer_bus = {
+	.name  = "scif_peer_bus",
+	.match = scif_peer_dev_match,
+	.probe = scif_peer_dev_probe,
+	.remove = scif_peer_dev_remove,
+};
+
+int scif_peer_register_driver(struct scif_peer_driver *driver)
+{
+	driver->driver.bus = &scif_peer_bus;
+	return driver_register(&driver->driver);
+}
+
+void scif_peer_unregister_driver(struct scif_peer_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+
+static void scif_peer_release_dev(struct device *d)
+{
+	struct scif_peer_dev *sdev = dev_to_scif_peer(d);
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	scif_cleanup_scifdev(scifdev);
+	kfree(sdev);
+}
+
+struct scif_peer_dev *
+scif_peer_register_device(struct scif_dev *scifdev)
+{
+	int ret;
+	struct scif_peer_dev *spdev;
+
+	spdev = kzalloc(sizeof(*spdev), GFP_KERNEL);
+	if (!spdev)
+		return ERR_PTR(-ENOMEM);
+
+	spdev->dev.parent = scifdev->sdev->dev.parent;
+	spdev->dev.release = scif_peer_release_dev;
+	spdev->dnode = scifdev->node;
+	spdev->dev.bus = &scif_peer_bus;
+
+	dev_set_name(&spdev->dev, "scif_peer-dev%u", spdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&spdev->dev);
+	if (ret)
+		goto free_spdev;
+	return spdev;
+free_spdev:
+	kfree(spdev);
+	return ERR_PTR(ret);
+}
+
+void scif_peer_unregister_device(struct scif_peer_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+
+int scif_peer_bus_init(void)
+{
+	return bus_register(&scif_peer_bus);
+}
+
+void scif_peer_bus_exit(void)
+{
+	bus_unregister(&scif_peer_bus);
+}
-- 
1.8.2.1



* [PATCH char-misc-next v4 08/13] misc: mic: SCIF open close bind and listen APIs
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (6 preceding siblings ...)
  2015-03-31  1:36 28% ` [PATCH char-misc-next v4 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
@ 2015-03-31  1:36 41% ` Sudeep Dutt
    2015-03-31  1:36 41% ` [PATCH char-misc-next v4 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF character device file operations and kernel APIs for opening and
closing a user and kernel mode SCIF endpoint. This patch also enables
binding to a SCIF port and listening for incoming SCIF connections.
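
The open, bind and listen ordering can be pictured as a small state
machine over a subset of the endpoint states. The userspace sketch below
is a simplified model and not the driver logic: the demo_* names are
hypothetical, locking and port reservation are omitted, and the use of
-EINVAL for every rejected call is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>

/* Subset of the endpoint states, renamed to mark them as a model. */
enum demo_ep_state {
	DEMO_UNBOUND,
	DEMO_BOUND,
	DEMO_LISTENING,
};

struct demo_ep {
	enum demo_ep_state state;
	int port;
	int backlog;
};

/* A freshly opened endpoint starts out unbound. */
static void demo_open(struct demo_ep *ep)
{
	assert(ep);
	ep->state = DEMO_UNBOUND;
	ep->port = 0;
	ep->backlog = 0;
}

/* Binding is only allowed once, from the unbound state. */
static int demo_bind(struct demo_ep *ep, int port)
{
	if (ep->state != DEMO_UNBOUND || port <= 0)
		return -EINVAL;
	ep->port = port;
	ep->state = DEMO_BOUND;
	return 0;
}

/* Listening requires an endpoint that has already been bound. */
static int demo_listen(struct demo_ep *ep, int backlog)
{
	if (ep->state != DEMO_BOUND || backlog <= 0)
		return -EINVAL;
	ep->backlog = backlog;
	ep->state = DEMO_LISTENING;
	return 0;
}
```

In this model a call such as demo_listen() on an unbound endpoint is
rejected, which mirrors the requirement that an endpoint be bound to a
port before it can accept incoming connection requests.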

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |  19 ++
 drivers/misc/mic/Makefile           |   1 +
 drivers/misc/mic/scif/Makefile      |  15 ++
 drivers/misc/mic/scif/scif_epd.h    | 148 +++++++++++++
 drivers/misc/mic/scif/scif_main.h   |   2 +
 drivers/misc/mic/scif/scif_nodeqp.h |  12 +-
 drivers/misc/mic/scif/scif_api.c    | 417 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  92 ++++++++
 drivers/misc/mic/scif/scif_fd.c     | 104 +++++++++
 drivers/misc/mic/scif/scif_ports.c  | 124 +++++++++++
 10 files changed, 933 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index afc9f53..bcf6e36 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -69,3 +69,22 @@ config INTEL_MIC_CARD
 
 	  For more information see
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "SCIF Driver"
+
+config SCIF
+	tristate "SCIF Driver"
+	depends on 64BIT && PCI && X86 && SCIF_BUS
+	help
+	  This enables SCIF Driver support for the Intel Many Integrated
+	  Core (MIC) family of PCIe form factor coprocessor devices that
+	  run a 64 bit Linux OS. The Symmetric Communications Interface
+	  (SCIF, pronounced "skiff") is a low level communications API
+	  across PCIe currently implemented for MIC.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 6bc6083..a74042c 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
 obj-y += bus/
+obj-$(CONFIG_SCIF) += scif/
diff --git a/drivers/misc/mic/scif/Makefile b/drivers/misc/mic/scif/Makefile
new file mode 100644
index 0000000..bf10bb7
--- /dev/null
+++ b/drivers/misc/mic/scif/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile - SCIF driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_SCIF) += scif.o
+scif-objs := scif_main.o
+scif-objs += scif_peer_bus.o
+scif-objs += scif_ports.o
+scif-objs += scif_debugfs.o
+scif-objs += scif_fd.o
+scif-objs += scif_api.o
+scif-objs += scif_epd.o
+scif-objs += scif_rb.o
+scif-objs += scif_nodeqp.o
+scif-objs += scif_nm.o
diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
new file mode 100644
index 0000000..b973b99
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -0,0 +1,148 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_EPD_H
+#define SCIF_EPD_H
+
+#include <linux/delay.h>
+#include <linux/scif.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_EPLOCK_HELD true
+
+enum scif_epd_state {
+	SCIFEP_UNBOUND,
+	SCIFEP_BOUND,
+	SCIFEP_LISTENING,
+	SCIFEP_CONNECTED,
+	SCIFEP_CONNECTING,
+	SCIFEP_MAPPING,
+	SCIFEP_CLOSING,
+	SCIFEP_CLLISTEN,
+	SCIFEP_DISCONNECTED,
+	SCIFEP_ZOMBIE
+};
+
+/*
+ * struct scif_conreq - Data structure added to the connection list.
+ *
+ * @msg: connection request message received
+ * @list: link to list of connection requests
+ */
+struct scif_conreq {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+/* Size of the RB for the Endpoint QP */
+#define SCIF_ENDPT_QP_SIZE 0x1000
+
+/*
+ * scif_endpt_qp_info - SCIF endpoint queue pair
+ *
+ * @qp: Qpair for this endpoint
+ * @qp_offset: DMA address of the QP
+ * @gnt_pld: Payload in a SCIF_CNCT_GNT message containing the
+ * physical address of the remote_qp.
+ */
+struct scif_endpt_qp_info {
+	struct scif_qp *qp;
+	dma_addr_t qp_offset;
+	dma_addr_t gnt_pld;
+};
+
+/*
+ * struct scif_endpt - The SCIF endpoint data structure
+ *
+ * @state: end point state
+ * @lock: lock synchronizing access to endpoint fields like state etc
+ * @port: self port information
+ * @peer: peer port information
+ * @backlog: maximum pending connection requests
+ * @qp_info: Endpoint QP information for SCIF messaging
+ * @remote_dev: scifdev used by this endpt to communicate with remote node.
+ * @remote_ep: remote endpoint
+ * @conreqcnt: Keep track of number of connection requests.
+ * @files: Open file information used to match the id passed in with
+ *         the flush routine.
+ * @conlist: list of connection requests
+ * @conwq: waitqueue for connection processing
+ * @discon: completion used during disconnection
+ * @sendwq: waitqueue used during sending messages
+ * @recvwq: waitqueue used during message receipt
+ * @sendlock: Synchronize ordering of messages sent
+ * @recvlock: Synchronize ordering of messages received
+ * @list: link to list of various endpoints like connected, listening etc
+ * @li_accept: pending ACCEPTREG
+ * @acceptcnt: pending ACCEPTREG cnt
+ * @liacceptlist: link to listen accept
+ * @miacceptlist: link to uaccept
+ * @listenep: associated listen ep
+ * @conn_work: Non blocking connect work
+ * @conn_port: Connection port
+ * @conn_err: Errors during connection
+ * @conn_async_state: Async connection
+ * @conn_list: List of async connection requests
+ */
+struct scif_endpt {
+	enum scif_epd_state state;
+	spinlock_t lock;
+	struct scif_port_id port;
+	struct scif_port_id peer;
+	int backlog;
+	struct scif_endpt_qp_info qp_info;
+	struct scif_dev *remote_dev;
+	u64 remote_ep;
+	int conreqcnt;
+	struct files_struct *files;
+	struct list_head conlist;
+	wait_queue_head_t conwq;
+	struct completion discon;
+	wait_queue_head_t sendwq;
+	wait_queue_head_t recvwq;
+	struct mutex sendlock;
+	struct mutex recvlock;
+	struct list_head list;
+	struct list_head li_accept;
+	int acceptcnt;
+	struct list_head liacceptlist;
+	struct list_head miacceptlist;
+	struct scif_endpt *listenep;
+	struct scif_port_id conn_port;
+	int conn_err;
+	int conn_async_state;
+	struct list_head conn_list;
+};
+
+static inline int scifdev_alive(struct scif_endpt *ep)
+{
+	return _scifdev_alive(ep->remote_dev);
+}
+
+void scif_cleanup_zombie_epd(void);
+void scif_teardown_ep(void *endpt);
+void scif_cleanup_ep_qp(struct scif_endpt *ep);
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held);
+void scif_get_node_info(void);
+void scif_send_acks(struct scif_dev *dev);
+void scif_conn_handler(struct work_struct *work);
+int scif_rsrv_port(u16 port);
+void scif_get_port(u16 port);
+int scif_get_new_port(void);
+void scif_put_port(u16 port);
+int __scif_flush(scif_epd_t epd);
+#endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
index acd1233..580bc63 100644
--- a/drivers/misc/mic/scif/scif_main.h
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -238,6 +238,8 @@ static inline int _scifdev_alive(struct scif_dev *scifdev)
 	return !!spdev;
 }
 
+#include "scif_epd.h"
+
 void __init scif_init_debugfs(void);
 void scif_exit_debugfs(void);
 int scif_setup_intr_wq(struct scif_dev *scifdev);
diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
index 6200292..6c0ed67 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.h
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -64,7 +64,17 @@
 #define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
 #define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
 #define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
-#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+#define SCIF_CNCT_REQ 9  /* Phys addr of Request connection to a port */
+#define SCIF_CNCT_GNT 10  /* Phys addr of new Grant connection request */
+#define SCIF_CNCT_GNTACK 11  /* Acknowledge a SCIF_CNCT_GNT message */
+#define SCIF_CNCT_GNTNACK 12  /* Negative ack of a SCIF_CNCT_GNT message */
+#define SCIF_CNCT_REJ 13  /* Error type Reject a connection request */
+#define SCIF_DISCNCT 14 /* Notify peer that connection is being terminated */
+#define SCIF_DISCNT_ACK 15 /* Response to a SCIF_DISCNCT message */
+#define SCIF_CLIENT_SENT 16 /* Notify the peer that data has been written */
+#define SCIF_CLIENT_RCVD 17 /* Notify the peer that data has been read */
+#define SCIF_GET_NODE_INFO 18 /* Get current node mask from the mgmt node */
+#define SCIF_MAX_MSG SCIF_GET_NODE_INFO
 
 /*
  * struct scifmsg - Node QP message format
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
new file mode 100644
index 0000000..cbdc27b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -0,0 +1,417 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/scif.h>
+#include "scif_main.h"
+#include "scif_map.h"
+
+static const char * const scif_ep_states[] = {
+	"Unbound",
+	"Bound",
+	"Listening",
+	"Connected",
+	"Connecting",
+	"Mapping",
+	"Closing",
+	"Close Listening",
+	"Disconnected",
+	"Zombie"};
+
+enum conn_async_state {
+	ASYNC_CONN_IDLE = 1,	/* ep setup for async connect */
+	ASYNC_CONN_INPROGRESS,	/* async connect in progress */
+	ASYNC_CONN_FLUSH_WORK	/* async work flush in progress  */
+};
+
+scif_epd_t scif_open(void)
+{
+	struct scif_endpt *ep;
+
+	might_sleep();
+	ep = kzalloc(sizeof(*ep), GFP_KERNEL);
+	if (!ep)
+		goto err_ep_alloc;
+
+	ep->qp_info.qp = kzalloc(sizeof(*ep->qp_info.qp), GFP_KERNEL);
+	if (!ep->qp_info.qp)
+		goto err_qp_alloc;
+
+	spin_lock_init(&ep->lock);
+	mutex_init(&ep->sendlock);
+	mutex_init(&ep->recvlock);
+
+	ep->state = SCIFEP_UNBOUND;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI open: ep %p success\n", ep);
+	return ep;
+
+err_qp_alloc:
+	kfree(ep);
+err_ep_alloc:
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(scif_open);
+
+/*
+ * scif_disconnect_ep - Disconnects the endpoint if found
+ * @ep: The end point returned from scif_open()
+ */
+static struct scif_endpt *scif_disconnect_ep(struct scif_endpt *ep)
+{
+	struct scifmsg msg;
+	struct scif_endpt *fep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	int err;
+
+	/*
+	 * Wake up any threads blocked in send()/recv() before closing
+	 * out the connection. Grabbing and releasing the send/recv lock
+	 * will ensure that any blocked senders/receivers have exited for
+	 * Ring 0 endpoints. It is a Ring 0 bug to call send/recv after
+	 * close. Ring 3 endpoints are not affected since close will not
+	 * be called while there are IOCTLs executing.
+	 */
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	mutex_lock(&ep->sendlock);
+	mutex_unlock(&ep->sendlock);
+	mutex_lock(&ep->recvlock);
+	mutex_unlock(&ep->recvlock);
+
+	/* Remove from the connected list */
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		if (tmpep == ep) {
+			list_del(pos);
+			fep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	if (!fep) {
+		/*
+		 * The other side has completed the disconnect before
+		 * the end point can be removed from the list. Therefore
+		 * the ep lock is not locked, traverse the disconnected
+		 * list to find the endpoint and release the conn lock.
+		 */
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		return NULL;
+	}
+
+	init_completion(&ep->discon);
+	msg.uop = SCIF_DISCNCT;
+	msg.src = ep->port;
+	msg.dst = ep->peer;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->remote_ep;
+
+	err = scif_nodeqp_send(ep->remote_dev, &msg);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+	if (!err)
+		/* Wait for the remote node to respond with SCIF_DISCNT_ACK */
+		wait_for_completion_timeout(&ep->discon,
+					    SCIF_NODE_ALIVE_TIMEOUT);
+	return ep;
+}
+
+int scif_close(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	enum scif_epd_state oldstate;
+	bool flush_conn;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI close: ep %p %s\n",
+		ep, scif_ep_states[ep->state]);
+	might_sleep();
+	spin_lock(&ep->lock);
+	flush_conn = (ep->conn_async_state == ASYNC_CONN_INPROGRESS);
+	spin_unlock(&ep->lock);
+
+	if (flush_conn)
+		flush_work(&scif_info.conn_work);
+
+	spin_lock(&ep->lock);
+	oldstate = ep->state;
+
+	ep->state = SCIFEP_CLOSING;
+
+	switch (oldstate) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		/* Remove from the disconnected list */
+		mutex_lock(&scif_info.connlock);
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		break;
+	case SCIFEP_UNBOUND:
+	case SCIFEP_BOUND:
+	case SCIFEP_CONNECTING:
+		spin_unlock(&ep->lock);
+		break;
+	case SCIFEP_MAPPING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CLOSING:
+	{
+		spin_unlock(&ep->lock);
+		scif_disconnect_ep(ep);
+		break;
+	}
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+	{
+		struct scif_conreq *conreq;
+		struct scifmsg msg;
+		struct scif_endpt *aep;
+
+		spin_unlock(&ep->lock);
+		spin_lock(&scif_info.eplock);
+
+		/* remove from listen list */
+		list_for_each_safe(pos, tmpq, &scif_info.listen) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep)
+				list_del(pos);
+		}
+		/* Remove any dangling accepts */
+		while (ep->acceptcnt) {
+			aep = list_first_entry(&ep->li_accept,
+					       struct scif_endpt, liacceptlist);
+			list_del(&aep->liacceptlist);
+			scif_put_port(aep->port.port);
+			list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+				tmpep = list_entry(pos, struct scif_endpt,
+						   miacceptlist);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			spin_unlock(&scif_info.eplock);
+			mutex_lock(&scif_info.connlock);
+			list_for_each_safe(pos, tmpq, &scif_info.connected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			mutex_unlock(&scif_info.connlock);
+			scif_teardown_ep(aep);
+			spin_lock(&scif_info.eplock);
+			scif_add_epd_to_zombie_list(aep, SCIF_EPLOCK_HELD);
+			ep->acceptcnt--;
+		}
+
+		spin_lock(&ep->lock);
+		spin_unlock(&scif_info.eplock);
+
+		/* Remove and reject any pending connection requests. */
+		while (ep->conreqcnt) {
+			conreq = list_first_entry(&ep->conlist,
+						  struct scif_conreq, list);
+			list_del(&conreq->list);
+
+			msg.uop = SCIF_CNCT_REJ;
+			msg.dst.node = conreq->msg.src.node;
+			msg.dst.port = conreq->msg.src.port;
+			msg.payload[0] = conreq->msg.payload[0];
+			msg.payload[1] = conreq->msg.payload[1];
+			/*
+			 * No Error Handling on purpose for scif_nodeqp_send().
+			 * If the remote node is lost we still want to free the
+			 * connection requests on the self node.
+			 */
+			scif_nodeqp_send(&scif_dev[conreq->msg.src.node],
+					 &msg);
+			ep->conreqcnt--;
+			kfree(conreq);
+		}
+
+		spin_unlock(&ep->lock);
+		/* If a kSCIF accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	}
+	scif_put_port(ep->port.port);
+	scif_teardown_ep(ep);
+	scif_add_epd_to_zombie_list(ep, !SCIF_EPLOCK_HELD);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_close);
+
+/**
+ * __scif_flush() - Wakes up any blocking accepts. The endpoint will no longer
+ *			accept new connections.
+ * @epd: The end point returned from scif_open()
+ */
+int __scif_flush(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	switch (ep->state) {
+	case SCIFEP_LISTENING:
+	{
+		ep->state = SCIFEP_CLLISTEN;
+
+		/* If an accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	default:
+		break;
+	}
+	return 0;
+}
+
+int scif_bind(scif_epd_t epd, u16 pn)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret = 0;
+	int tmp;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: ep %p %s requested port number %d\n",
+		ep, scif_ep_states[ep->state], pn);
+	if (pn) {
+		/*
+		 * Similar to IETF RFC 1700, SCIF ports below
+		 * SCIF_ADMIN_PORT_END can only be bound by system (or root)
+		 * processes or by processes executed by privileged users.
+		 */
+		if (pn < SCIF_ADMIN_PORT_END && !capable(CAP_SYS_ADMIN)) {
+			ret = -EACCES;
+			goto scif_bind_admin_exit;
+		}
+	}
+
+	spin_lock(&ep->lock);
+	if (ep->state == SCIFEP_BOUND) {
+		ret = -EINVAL;
+		goto scif_bind_exit;
+	} else if (ep->state != SCIFEP_UNBOUND) {
+		ret = -EISCONN;
+		goto scif_bind_exit;
+	}
+
+	if (pn) {
+		tmp = scif_rsrv_port(pn);
+		if (tmp != pn) {
+			ret = -EINVAL;
+			goto scif_bind_exit;
+		}
+	} else {
+		pn = scif_get_new_port();
+		if (!pn) {
+			ret = -ENOSPC;
+			goto scif_bind_exit;
+		}
+	}
+
+	ep->state = SCIFEP_BOUND;
+	ep->port.node = scif_info.nodeid;
+	ep->port.port = pn;
+	ep->conn_async_state = ASYNC_CONN_IDLE;
+	ret = pn;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: bound to port number %d\n", pn);
+scif_bind_exit:
+	spin_unlock(&ep->lock);
+scif_bind_admin_exit:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_bind);
+
+int scif_listen(scif_epd_t epd, int backlog)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI listen: ep %p %s\n", ep, scif_ep_states[ep->state]);
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+	case SCIFEP_CLLISTEN:
+	case SCIFEP_UNBOUND:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		return -EINVAL;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		spin_unlock(&ep->lock);
+		return -EISCONN;
+	case SCIFEP_BOUND:
+		break;
+	}
+
+	ep->state = SCIFEP_LISTENING;
+	ep->backlog = backlog;
+
+	ep->conreqcnt = 0;
+	ep->acceptcnt = 0;
+	INIT_LIST_HEAD(&ep->conlist);
+	init_waitqueue_head(&ep->conwq);
+	INIT_LIST_HEAD(&ep->li_accept);
+	spin_unlock(&ep->lock);
+
+	/*
+	 * Listen status is complete so delete the qp information not needed
+	 * on a listen before placing on the list of listening ep's
+	 */
+	scif_teardown_ep(ep);
+	ep->qp_info.qp = NULL;
+
+	spin_lock(&scif_info.eplock);
+	list_add_tail(&ep->list, &scif_info.listen);
+	spin_unlock(&scif_info.eplock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_listen);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
new file mode 100644
index 0000000..a2b728a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -0,0 +1,92 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_cleanup_ep_qp(struct scif_endpt *ep)
+{
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp->outbound_q.rb_base) {
+		scif_iounmap((void *)qp->outbound_q.rb_base,
+			     qp->outbound_q.size, ep->remote_dev);
+		qp->outbound_q.rb_base = NULL;
+	}
+	if (qp->remote_qp) {
+		scif_iounmap((void *)qp->remote_qp,
+			     sizeof(struct scif_qp), ep->remote_dev);
+		qp->remote_qp = NULL;
+	}
+	if (qp->local_qp) {
+		scif_unmap_single(qp->local_qp, ep->remote_dev,
+				  sizeof(struct scif_qp));
+		qp->local_qp = 0x0;
+	}
+	if (qp->local_buf) {
+		scif_unmap_single(qp->local_buf, ep->remote_dev,
+				  SCIF_ENDPT_QP_SIZE);
+		qp->local_buf = 0;
+	}
+}
+
+void scif_teardown_ep(void *endpt)
+{
+	struct scif_endpt *ep = endpt;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp) {
+		spin_lock(&ep->lock);
+		scif_cleanup_ep_qp(ep);
+		spin_unlock(&ep->lock);
+		kfree(qp->inbound_q.rb_base);
+		kfree(qp);
+	}
+}
+
+/*
+ * Enqueue the endpoint to the zombie list for cleanup.
+ * The endpoint should not be accessed once this API returns.
+ */
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
+{
+	if (!eplock_held)
+		spin_lock(&scif_info.eplock);
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_ZOMBIE;
+	spin_unlock(&ep->lock);
+	list_add_tail(&ep->list, &scif_info.zombie);
+	scif_info.nr_zombies++;
+	if (!eplock_held)
+		spin_unlock(&scif_info.eplock);
+	schedule_work(&scif_info.misc_work);
+}
+
+void scif_cleanup_zombie_epd(void)
+{
+	struct list_head *pos, *tmpq;
+	struct scif_endpt *ep;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.zombie) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		list_del(pos);
+		scif_info.nr_zombies--;
+		kfree(ep);
+	}
+	spin_unlock(&scif_info.eplock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
new file mode 100644
index 0000000..812c23f
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -0,0 +1,104 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+
+static int scif_fdopen(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = scif_open();
+
+	if (!priv)
+		return -ENOMEM;
+	f->private_data = priv;
+	return 0;
+}
+
+static int scif_fdclose(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = f->private_data;
+
+	return scif_close(priv);
+}
+
+static int scif_fdflush(struct file *f, fl_owner_t id)
+{
+	struct scif_endpt *ep = f->private_data;
+
+	spin_lock(&ep->lock);
+	/*
+	 * The listening endpoint stashes the open file information before
+	 * waiting for incoming connections. The release callback would never be
+	 * called if the application closed the endpoint, while waiting for
+	 * incoming connections from a separate thread since the file descriptor
+	 * reference count is bumped up in the accept IOCTL. Call the flush
+	 * routine if the id matches the endpoint open file information so that
+	 * the listening endpoint can be woken up and the fd released.
+	 */
+	if (ep->files == id)
+		__scif_flush(ep);
+	spin_unlock(&ep->lock);
+	return 0;
+}
+
+static __always_inline void scif_err_debug(int err, const char *str)
+{
+	/*
+	 * ENOTCONN is a common uninteresting error which is
+	 * flooding debug messages to the console unnecessarily.
+	 */
+	if (err < 0 && err != -ENOTCONN)
+		dev_dbg(scif_info.mdev.this_device, "%s err %d\n", str, err);
+}
+
+static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct scif_endpt *priv = f->private_data;
+	void __user *argp = (void __user *)arg;
+	bool non_block = false;
+
+	non_block = !!(f->f_flags & O_NONBLOCK);
+
+	switch (cmd) {
+	case SCIF_BIND:
+	{
+		int pn;
+
+		if (copy_from_user(&pn, argp, sizeof(pn)))
+			return -EFAULT;
+
+		pn = scif_bind(priv, pn);
+		if (pn < 0)
+			return pn;
+
+		if (copy_to_user(argp, &pn, sizeof(pn)))
+			return -EFAULT;
+
+		return 0;
+	}
+	case SCIF_LISTEN:
+		return scif_listen(priv, arg);
+	}
+	return -EINVAL;
+}
+
+const struct file_operations scif_fops = {
+	.open = scif_fdopen,
+	.release = scif_fdclose,
+	.unlocked_ioctl = scif_fdioctl,
+	.flush = scif_fdflush,
+	.owner = THIS_MODULE,
+};
diff --git a/drivers/misc/mic/scif/scif_ports.c b/drivers/misc/mic/scif/scif_ports.c
new file mode 100644
index 0000000..594e18d
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_ports.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/idr.h>
+
+#include "scif_main.h"
+
+#define SCIF_PORT_COUNT	0x10000	/* Ports available */
+
+struct idr scif_ports;
+
+/*
+ * struct scif_port - SCIF port information
+ *
+ * @ref_cnt - Reference count since there can be multiple endpoints
+ *		created via scif_accept(..) simultaneously using a port.
+ */
+struct scif_port {
+	int ref_cnt;
+};
+
+/**
+ * __scif_get_port - Reserve a SCIF port # in [start, end) in the global IDR.
+ * @start : lowest port # that may be reserved.
+ * @end : one past the highest port # that may be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+static int __scif_get_port(int start, int end)
+{
+	int id;
+	struct scif_port *port = kzalloc(sizeof(*port), GFP_ATOMIC);
+
+	if (!port)
+		return -ENOMEM;
+	spin_lock(&scif_info.port_lock);
+	id = idr_alloc(&scif_ports, port, start, end, GFP_ATOMIC);
+	if (id >= 0)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+	return id;
+}
+
+/**
+ * scif_rsrv_port - Reserve a specified port # for SCIF.
+ * @port : port # to be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_rsrv_port(u16 port)
+{
+	return __scif_get_port(port, port + 1);
+}
+
+/**
+ * scif_get_new_port - Get and reserve any port # for SCIF in the range
+ *			SCIF_PORT_RSVD + 1 to SCIF_PORT_COUNT - 1.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if no ports available.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_get_new_port(void)
+{
+	return __scif_get_port(SCIF_PORT_RSVD + 1, SCIF_PORT_COUNT);
+}
+
+/**
+ * scif_get_port - Increment the reference count for a SCIF port
+ * @id : SCIF port
+ *
+ * @return : None
+ */
+void scif_get_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+}
+
+/**
+ * scif_put_port - Release a reserved SCIF port
+ * @id : SCIF port to be released.
+ *
+ * @return : None
+ */
+void scif_put_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port) {
+		port->ref_cnt--;
+		if (!port->ref_cnt) {
+			idr_remove(&scif_ports, id);
+			kfree(port);
+		}
+	}
+	spin_unlock(&scif_info.port_lock);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 41%]

* [PATCH char-misc-next v4 07/13] misc: mic: SCIF node queue pair setup management
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (5 preceding siblings ...)
  2015-03-31  1:36 47% ` [PATCH char-misc-next v4 06/13] misc: mic: SCIF module initialization Sudeep Dutt
@ 2015-03-31  1:36 28% ` Sudeep Dutt
  2015-03-31  1:36 41% ` [PATCH char-misc-next v4 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF node queue pair setup creates the SCIF driver's private
kernel-mode node queue pairs between all nodes. These carry the
driver's internal control messages once SCIF has been probed by
the SCIF hardware bus. Peer-to-peer communication between MIC
coprocessor nodes is supported.
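
As a rough illustration of the control messages exchanged over a node
queue pair, the following user-space sketch mirrors struct scifmsg from
scif_nodeqp.h and builds a SCIF_EXIT_ACK reply. It assumes that struct
scif_port_id is a pair of 16-bit node and port fields, and that a
GCC-compatible compiler is used for the packed attribute. The helpers
make_exit_ack() and scif_uop_valid() are hypothetical names and not part
of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Opcodes copied from scif_nodeqp.h in this patch. */
#define SCIF_INIT	1
#define SCIF_EXIT	2
#define SCIF_EXIT_ACK	3
#define SCIF_MAX_MSG	8	/* SCIF_NODE_REMOVE_ACK in this patch */

/* Assumed layout: 16-bit node and port numbers. */
struct scif_port_id {
	uint16_t node;
	uint16_t port;
};

/* User-space mirror of struct scifmsg from this patch. */
struct scifmsg {
	struct scif_port_id src;
	struct scif_port_id dst;
	uint32_t uop;
	uint64_t payload[4];
} __attribute__((packed));

/* 4 + 4 + 4 + 4 * 8 bytes, with no padding because of the packing. */
static_assert(sizeof(struct scifmsg) == 44, "unexpected scifmsg size");

/* Hypothetical helper: build the reply to a SCIF_EXIT message. */
static struct scifmsg make_exit_ack(uint16_t self_node, uint16_t peer_node)
{
	struct scifmsg msg;

	memset(&msg, 0, sizeof(msg));
	msg.uop = SCIF_EXIT_ACK;
	msg.src.node = self_node;
	msg.dst.node = peer_node;
	return msg;
}

/* Hypothetical helper: accept only opcodes in the known range. */
static int scif_uop_valid(uint32_t uop)
{
	return uop >= SCIF_INIT && uop <= SCIF_MAX_MSG;
}
```

In the driver such a message would be handed to scif_nodeqp_send();
the sketch stops at constructing and validating it.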

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_nodeqp.h |  173 +++++
 drivers/misc/mic/scif/scif_nm.c     |  184 +++++
 drivers/misc/mic/scif/scif_nodeqp.c | 1264 +++++++++++++++++++++++++++++++++++
 3 files changed, 1621 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c

diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
new file mode 100644
index 0000000..6200292
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -0,0 +1,173 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_NODEQP
+#define SCIF_NODEQP
+
+#include "scif_rb.h"
+#include "scif_peer_bus.h"
+
+#define SCIF_INIT 1  /* First message sent to the peer node for discovery */
+#define SCIF_EXIT 2  /* Last message from the peer informing intent to exit */
+#define SCIF_EXIT_ACK 3 /* Response to SCIF_EXIT message */
+#define SCIF_NODE_ADD 4  /* Tell Online nodes a new node exists */
+#define SCIF_NODE_ADD_ACK 5  /* Confirm to mgmt node sequence is finished */
+#define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
+#define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
+#define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
+#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+
+/*
+ * struct scifmsg - Node QP message format
+ *
+ * @src: Source information
+ * @dst: Destination information
+ * @uop: The message opcode
+ * @payload: Unique payload format for each message
+ */
+struct scifmsg {
+	struct scif_port_id src;
+	struct scif_port_id dst;
+	u32 uop;
+	u64 payload[4];
+} __packed;
+
+/*
+ * struct scif_qp - Node Queue Pair
+ *
+ * The layout is constrained because we can only write across PCIe, so
+ * any r/w pointer we need to read must be local. We only need to read
+ * the read pointer of the inbound_q and the write pointer of the
+ * outbound_q.
+ *
+ * @magic: Magic value to ensure the peer sees the QP correctly
+ * @outbound_q: The outbound ring buffer for sending messages
+ * @inbound_q: The inbound ring buffer for receiving messages
+ * @local_write: Local write index
+ * @local_read: Local read index
+ * @remote_qp: The remote queue pair
+ * @local_buf: DMA address of local ring buffer
+ * @local_qp: DMA address of the local queue pair data structure
+ * @remote_buf: DMA address of remote ring buffer
+ * @qp_state: QP state i.e. online or offline used for P2P
+ * @send_lock: Synchronize access to outbound queue
+ * @recv_lock: Synchronize access to inbound queue
+ */
+struct scif_qp {
+	u64 magic;
+#define SCIFEP_MAGIC 0x5c1f000000005c1fULL
+	struct scif_rb outbound_q;
+	struct scif_rb inbound_q;
+
+	u32 local_write __aligned(64);
+	u32 local_read __aligned(64);
+	struct scif_qp *remote_qp;
+	dma_addr_t local_buf;
+	dma_addr_t local_qp;
+	dma_addr_t remote_buf;
+	u32 qp_state;
+#define SCIF_QP_OFFLINE 0xdead
+#define SCIF_QP_ONLINE 0xc0de
+	spinlock_t send_lock;
+	spinlock_t recv_lock;
+};
+
+/*
+ * struct scif_loopb_msg - An element in the loopback Node QP message list.
+ *
+ * @msg - The SCIF node QP message
+ * @list - link in the list of messages
+ */
+struct scif_loopb_msg {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_setup_qp(struct scif_dev *scifdev);
+int scif_qp_response(phys_addr_t phys, struct scif_dev *dev);
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev);
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev);
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload);
+int scif_setup_loopback_qp(struct scif_dev *scifdev);
+int scif_destroy_loopback_qp(struct scif_dev *scifdev);
+void scif_poll_qp_state(struct work_struct *work);
+void scif_qp_response_ack(struct work_struct *work);
+void scif_destroy_p2p(struct scif_dev *scifdev);
+void scif_send_exit(struct scif_dev *scifdev);
+static inline struct device *scif_get_peer_dev(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+	struct device *spdev_ret;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		spdev_ret = get_device(&spdev->dev);
+	else
+		spdev_ret = ERR_PTR(-ENODEV);
+	rcu_read_unlock();
+	return spdev_ret;
+}
+
+static inline void scif_put_peer_dev(struct device *dev)
+{
+	put_device(dev);
+}
+#endif  /* SCIF_NODEQP */
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
new file mode 100644
index 0000000..5fefacc
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -0,0 +1,184 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_peer_bus.h"
+
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_free_qp(struct scif_dev *scifdev)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+
+	if (!qp)
+		return;
+	scif_free_coherent((void *)qp->inbound_q.rb_base,
+			   qp->local_buf, scifdev, qp->inbound_q.size);
+	scif_unmap_single(qp->local_qp, scifdev, sizeof(struct scif_qp));
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+}
+
+static void scif_cleanup_qp(struct scif_dev *dev)
+{
+	struct scif_qp *qp = &dev->qpairs[0];
+
+	if (!qp)
+		return;
+	scif_iounmap((void *)qp->remote_qp, sizeof(struct scif_qp), dev);
+	scif_iounmap((void *)qp->outbound_q.rb_base,
+		     sizeof(struct scif_qp), dev);
+	qp->remote_qp = NULL;
+	qp->local_write = 0;
+	qp->inbound_q.current_write_offset = 0;
+	qp->inbound_q.current_read_offset = 0;
+	if (scifdev_is_p2p(dev))
+		scif_free_qp(dev);
+}
+
+void scif_send_acks(struct scif_dev *dev)
+{
+	struct scifmsg msg;
+
+	if (dev->node_remove_ack_pending) {
+		msg.uop = SCIF_NODE_REMOVE_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = SCIF_MGMT_NODE;
+		msg.payload[0] = dev->node;
+		scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg);
+		dev->node_remove_ack_pending = false;
+	}
+	if (dev->exit_ack_pending) {
+		msg.uop = SCIF_EXIT_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = dev->node;
+		scif_nodeqp_send(dev, &msg);
+		dev->exit_ack_pending = false;
+	}
+}
+
+/*
+ * scif_cleanup_scifdev
+ *
+ * @dev: Remote SCIF device.
+ * Uninitialize SCIF data structures for remote SCIF device.
+ */
+void scif_cleanup_scifdev(struct scif_dev *dev)
+{
+	struct scif_hw_dev *sdev = dev->sdev;
+
+	if (!dev->sdev)
+		return;
+	if (scifdev_is_p2p(dev)) {
+		if (dev->cookie) {
+			sdev->hw_ops->free_irq(sdev, dev->cookie, dev);
+			dev->cookie = NULL;
+		}
+		scif_destroy_intr_wq(dev);
+	}
+	scif_destroy_p2p(dev);
+	scif_send_acks(dev);
+	if (!dev->node && scif_info.card_initiated_exit) {
+		/*
+		 * Send an SCIF_EXIT message which is the last message from MIC
+		 * to the Host and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(dev);
+		scif_info.card_initiated_exit = false;
+	}
+	scif_cleanup_qp(dev);
+}
+
+/*
+ * scif_handle_remove_node:
+ *
+ * @node: Node to remove
+ */
+void scif_handle_remove_node(int node)
+{
+	struct scif_dev *scifdev = &scif_dev[node];
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	else
+		scif_send_acks(scifdev);
+}
+
+static int scif_send_rmnode_msg(int node, int remove_node)
+{
+	struct scifmsg notif_msg;
+	struct scif_dev *dev = &scif_dev[node];
+
+	notif_msg.uop = SCIF_NODE_REMOVE;
+	notif_msg.src.node = scif_info.nodeid;
+	notif_msg.dst.node = node;
+	notif_msg.payload[0] = remove_node;
+	return scif_nodeqp_send(dev, &notif_msg);
+}
+
+/**
+ * scif_disconnect_node:
+ *
+ * @node_id: source node id.
+ * @mgmt_initiated: Disconnection initiated from the mgmt node
+ *
+ * Disconnect a node from the scif network.
+ */
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
+{
+	int ret;
+	int msg_cnt = 0;
+	u32 i = 0;
+	struct scif_dev *scifdev = &scif_dev[node_id];
+
+	if (!node_id)
+		return;
+
+	atomic_set(&scifdev->disconn_rescnt, 0);
+
+	/* Destroy p2p network */
+	for (i = 1; i <= scif_info.maxid; i++) {
+		if (i == node_id)
+			continue;
+		ret = scif_send_rmnode_msg(i, node_id);
+		if (!ret)
+			msg_cnt++;
+	}
+	/* Wait for the remote nodes to respond with SCIF_NODE_REMOVE_ACK */
+	ret = wait_event_timeout(scifdev->disconn_wq,
+				 (atomic_read(&scifdev->disconn_rescnt)
+				 == msg_cnt), SCIF_NODE_ALIVE_TIMEOUT);
+	/* Tell the card to clean up */
+	if (mgmt_initiated && _scifdev_alive(scifdev))
+		/*
+		 * Send an SCIF_EXIT message which is the last message from Host
+		 * to the MIC and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(scifdev);
+	atomic_set(&scifdev->disconn_rescnt, 0);
+	/* Tell the mgmt node to clean up */
+	ret = scif_send_rmnode_msg(SCIF_MGMT_NODE, node_id);
+	if (!ret)
+		/* Wait for mgmt node to respond with SCIF_NODE_REMOVE_ACK */
+		wait_event_timeout(scifdev->disconn_wq,
+				   (atomic_read(&scifdev->disconn_rescnt) == 1),
+				   SCIF_NODE_ALIVE_TIMEOUT);
+}
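
To make the ack accounting in scif_disconnect_node() easier to follow, here is a small userspace model of its first loop. It is a sketch under stated assumptions, not driver code: the model_* names are hypothetical, the "alive" table stands in for whether scif_send_rmnode_msg() succeeds, and node 0 is assumed to be the mgmt node, which the function refuses to disconnect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the driver. */
#define MODEL_MAX_NODES 8

struct model_net {
	int maxid;				/* plays the role of scif_info.maxid */
	bool alive[MODEL_MAX_NODES + 1];	/* can node i be sent a message? */
};

/* Models scif_send_rmnode_msg(): 0 on success, non-zero on failure. */
static int model_send_rmnode(const struct model_net *net, int node)
{
	return net->alive[node] ? 0 : -1;
}

/*
 * Returns how many SCIF_NODE_REMOVE_ACKs the disconnect loop would wait
 * for: one per peer in 1..maxid, other than the node being removed,
 * whose SCIF_NODE_REMOVE send succeeded.
 */
static int model_expected_acks(const struct model_net *net, int node_id)
{
	int msg_cnt = 0;
	int i;

	assert(net->maxid <= MODEL_MAX_NODES);
	if (!node_id)
		return 0;	/* node 0 is never disconnected this way */
	for (i = 1; i <= net->maxid; i++) {
		if (i == node_id)
			continue;
		if (!model_send_rmnode(net, i))
			msg_cnt++;
	}
	return msg_cnt;
}
```

The point of counting only successful sends is that the later wait compares disconn_rescnt against msg_cnt, so an unreachable peer does not force the wait to run until SCIF_NODE_ALIVE_TIMEOUT expires.
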
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
new file mode 100644
index 0000000..0e0cd6b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -0,0 +1,1264 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_nodeqp.h"
+#include "scif_map.h"
+
+/*
+ ************************************************************************
+ * SCIF node Queue Pair (QP) setup flow:
+ *
+ * 1) SCIF driver gets probed with a scif_hw_dev via the scif_hw_bus
+ * 2) scif_setup_qp(..) allocates the local qp and calls
+ *	scif_setup_qp_connect(..) which allocates and maps the local
+ *	buffer for the inbound QP
+ * 3) The local node updates the device page with the DMA address of the QP
+ * 4) A delayed work is scheduled (qp_dwork) which periodically reads if
+ *	the peer node has updated its QP DMA address
+ * 5) Once a valid non zero address is found in the QP DMA address field
+ *	in the device page, the local node maps the remote node's QP,
+ *	updates its outbound QP and sends a SCIF_INIT message to the peer
+ * 6) The SCIF_INIT message is received by the peer node QP interrupt bottom
+ *	half handler by calling scif_init(..)
+ * 7) scif_init(..) registers a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node
+ * 8) On the mgmt node, P2P network setup/teardown is initiated if all the
+ *	remote nodes are online via scif_p2p_setup(..)
+ * 9) For P2P setup, the host maps the remote nodes' aperture and memory
+ *	bars and sends a SCIF_NODE_ADD message to both nodes
+ * 10) As part of scif_node_add(..), both nodes set up their local inbound
+ *	QPs and send a SCIF_NODE_ADD_ACK to the mgmt node
+ * 11) As part of scif_node_add_ack(..) the mgmt node forwards the
+ *	SCIF_NODE_ADD_ACK to the remote nodes
+ * 12) As part of scif_node_add_ack(..) the remote nodes update their
+ *	outbound QPs, make sure they can access memory on the remote node
+ *	and then add a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node.
+ * 13) The SCIF network is now established across all nodes.
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by non mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) A non mgmt node now cleans up all local data structures and sends a
+ *	SCIF_EXIT message to the peer and waits for a SCIF_EXIT_ACK
+ * 4) As part of scif_exit(..) handling scif_disconnect_node(..) is called
+ * 5) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the
+ *	peers and waits for a SCIF_NODE_REMOVE_ACK
+ * 6) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 7) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it sends itself a node remove message whose handling cleans up local
+ *	data structures and unregisters the peer node from the SCIF network
+ * 8) The mgmt node sends a SCIF_EXIT_ACK
+ * 9) Upon receipt of the SCIF_EXIT_ACK the node initiating the teardown
+ *	completes the SCIF remove routine
+ * 10) The SCIF network is now torn down for the node initiating the
+ *	teardown sequence
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) The mgmt node calls scif_disconnect_node(..)
+ * 4) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the peers
+ *	and waits for a SCIF_NODE_REMOVE_ACK
+ * 5) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 6) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it unregisters the peer node from the SCIF network
+ * 7) The mgmt node sends a SCIF_EXIT message and waits for a SCIF_EXIT_ACK.
+ * 8) A non mgmt node upon receipt of a SCIF_EXIT message calls scif_stop(..)
+ *	which would clean up local data structures for all SCIF nodes and
+ *	then send a SCIF_EXIT_ACK back to the mgmt node
+ * 9) Upon receipt of the SCIF_EXIT_ACK the mgmt node sends itself a node
+ *	remove message whose handling cleans up local data structures and
+ *	destroys any P2P mappings.
+ * 10) The SCIF hardware device for which a remove callback was received is now
+ *	disconnected from the SCIF network.
+ */
+/*
+ * Initializes "local" data structures for the QP. Allocates the QP
+ * ring buffer (rb) and initializes the "in bound" queue.
+ */
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev)
+{
+	void *local_q = NULL;
+	int err = 0;
+	u32 tmp_rd = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		return err;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	/*
+	 * To setup the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &tmp_rd,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	/*
+	 * The read pointer is NULL initially and it is unsafe to use the ring
+	 * buffer until this changes!
+	 */
+	qp->inbound_q.read_ptr = NULL;
+	err = scif_map_single(qp_offset, qp,
+			      scifdev, sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+	return err;
+}
+
+/* When the other side has already done its allocation, this is called */
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev)
+{
+	void *local_q;
+	void *remote_q;
+	struct scif_qp *remote_qp;
+	int remote_size;
+	int err = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+	/* Start by figuring out where we need to point */
+	remote_qp = scif_ioremap(phys, sizeof(struct scif_qp), scifdev);
+	if (!remote_qp)
+		return -EIO;
+	qp->remote_qp = remote_qp;
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_buf = remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	remote_q = scif_ioremap(qp->remote_buf, remote_size, scifdev);
+	if (!remote_q) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_qp->local_write = 0;
+	/*
+	 * To setup the outbound_q, the buffer lives in remote memory,
+	 * the read pointer is local, the write pointer is remote
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     remote_q,
+		     get_count_order(remote_size));
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto iounmap_1;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	qp->remote_qp->local_read = 0;
+	/*
+	 * To setup the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	err = scif_map_single(qp_offset, qp, scifdev,
+			      sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+iounmap_1:
+	scif_iounmap(remote_q, remote_size, scifdev);
+	qp->outbound_q.rb_base = NULL;
+iounmap:
+	scif_iounmap(qp->remote_qp, sizeof(struct scif_qp), scifdev);
+	qp->remote_qp = NULL;
+	return err;
+}
+
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload)
+{
+	int err = 0;
+	void *r_buf;
+	int remote_size;
+	phys_addr_t tmp_phys;
+
+	qp->remote_qp = scif_ioremap(payload, sizeof(struct scif_qp), scifdev);
+
+	if (!qp->remote_qp) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		dev_err(&scifdev->sdev->dev,
+			"SCIFEP_MAGIC mismatch between self %d remote %d\n",
+			scif_dev[scif_info.nodeid].node, scifdev->node);
+		err = -ENODEV;
+		goto error;
+	}
+
+	tmp_phys = qp->remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	r_buf = scif_ioremap(tmp_phys, remote_size, scifdev);
+
+	if (!r_buf)
+		return -EIO;
+
+	qp->local_read = 0;
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     r_buf,
+		     get_count_order(remote_size));
+	/*
+	 * resetup the inbound_q now that we know where the
+	 * inbound_read really is.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     qp->inbound_q.rb_base,
+		     get_count_order(qp->inbound_q.size));
+error:
+	return err;
+}
+
+static __always_inline void
+scif_send_msg_intr(struct scif_dev *scifdev)
+{
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_is_p2p(scifdev))
+		sdev->hw_ops->send_p2p_intr(sdev, scifdev->rdb, &scifdev->mmio);
+	else
+		sdev->hw_ops->send_intr(sdev, scifdev->rdb);
+}
+
+int scif_qp_response(phys_addr_t phys, struct scif_dev *scifdev)
+{
+	int err = 0;
+	struct scifmsg msg;
+
+	err = scif_setup_qp_connect_response(scifdev, scifdev->qpairs, phys);
+	if (!err) {
+		/*
+		 * Now that everything is setup and mapped, we're ready
+		 * to tell the peer about our queue's location
+		 */
+		msg.uop = SCIF_INIT;
+		msg.dst.node = scifdev->node;
+		err = scif_nodeqp_send(scifdev, &msg);
+	}
+	return err;
+}
+
+void scif_send_exit(struct scif_dev *scifdev)
+{
+	struct scifmsg msg;
+	int ret;
+
+	scifdev->exit = OP_IN_PROGRESS;
+	msg.uop = SCIF_EXIT;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = scifdev->node;
+	ret = scif_nodeqp_send(scifdev, &msg);
+	if (ret)
+		goto done;
+	/* Wait for a SCIF_EXIT_ACK message */
+	wait_event_timeout(scif_info.exitwq, scifdev->exit == OP_COMPLETED,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+done:
+	scifdev->exit = OP_IDLE;
+}
+
+int scif_setup_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	int local_size;
+	struct scif_qp *qp;
+
+	local_size = SCIF_NODE_QP_SIZE;
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp) {
+		err = -ENOMEM;
+		return err;
+	}
+	qp->magic = SCIFEP_MAGIC;
+	scifdev->qpairs = qp;
+	err = scif_setup_qp_connect(qp, &scifdev->qp_dma_addr,
+				    local_size, scifdev);
+	if (err)
+		goto free_qp;
+	/*
+	 * We're as set up as we can be. The inbound_q is set up, without a
+	 * usable outbound q.  When we get a message, the read_ptr will be
+	 * updated, and we will pull the message.
+	 */
+	return err;
+free_qp:
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+	return err;
+}
+
+static void scif_p2p_freesg(struct scatterlist *sg)
+{
+	kfree(sg);
+}
+
+static struct scatterlist *
+scif_p2p_setsg(void __iomem *va, int page_size, int page_cnt)
+{
+	struct scatterlist *sg;
+	struct page *page;
+	int i;
+
+	sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
+	if (!sg)
+		return NULL;
+	sg_init_table(sg, page_cnt);
+	for (i = 0; i < page_cnt; i++) {
+		page = vmalloc_to_page((void __force *)va);
+		if (!page)
+			goto p2p_sg_err;
+		sg_set_page(&sg[i], page, page_size, 0);
+		va += page_size;
+	}
+	return sg;
+p2p_sg_err:
+	kfree(sg);
+	return NULL;
+}
+
+/* Init p2p mappings required to access peerdev from scifdev */
+static struct scif_p2p_info *
+scif_init_p2p_info(struct scif_dev *scifdev, struct scif_dev *peerdev)
+{
+	struct scif_p2p_info *p2p;
+	int num_mmio_pages, num_aper_pages, sg_page_shift, err, num_aper_chunks;
+	struct scif_hw_dev *psdev = peerdev->sdev;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	num_mmio_pages = psdev->mmio->len >> PAGE_SHIFT;
+	num_aper_pages = psdev->aper->len >> PAGE_SHIFT;
+
+	p2p = kzalloc(sizeof(*p2p), GFP_KERNEL);
+	if (!p2p)
+		return NULL;
+	p2p->ppi_sg[SCIF_PPI_MMIO] = scif_p2p_setsg(psdev->mmio->va,
+						    PAGE_SIZE, num_mmio_pages);
+	if (!p2p->ppi_sg[SCIF_PPI_MMIO])
+		goto free_p2p;
+	p2p->sg_nentries[SCIF_PPI_MMIO] = num_mmio_pages;
+	sg_page_shift = get_order(min(psdev->aper->len, (u64)(1 << 30)));
+	num_aper_chunks = num_aper_pages >> (sg_page_shift - PAGE_SHIFT);
+	p2p->ppi_sg[SCIF_PPI_APER] = scif_p2p_setsg(psdev->aper->va,
+						    1 << sg_page_shift,
+						    num_aper_chunks);
+	p2p->sg_nentries[SCIF_PPI_APER] = num_aper_chunks;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			 num_mmio_pages, DMA_BIDIRECTIONAL);
+	if (err != num_mmio_pages)
+		goto scif_p2p_free;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			 num_aper_chunks, DMA_BIDIRECTIONAL);
+	if (err != num_aper_chunks)
+		goto dma_unmap;
+	p2p->ppi_da[SCIF_PPI_MMIO] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	p2p->ppi_da[SCIF_PPI_APER] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_APER]);
+	p2p->ppi_len[SCIF_PPI_MMIO] = num_mmio_pages;
+	p2p->ppi_len[SCIF_PPI_APER] = num_aper_pages;
+	p2p->ppi_peer_id = peerdev->node;
+	return p2p;
+dma_unmap:
+	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
+scif_p2p_free:
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+free_p2p:
+	kfree(p2p);
+	return NULL;
+}
+
+/**
+ * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
+ * @dst: Destination node
+ *
+ * Connect the src and dst node by setting up the p2p connection
+ * between them. Management node here acts like a proxy.
+ */
+static void scif_node_connect(struct scif_dev *scifdev, int dst)
+{
+	struct scif_dev *dev_j = scifdev;
+	struct scif_dev *dev_i = NULL;
+	struct scif_p2p_info *p2p_ij = NULL;    /* bus addr for j from i */
+	struct scif_p2p_info *p2p_ji = NULL;    /* bus addr for i from j */
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	struct scifmsg msg;
+	int err;
+	u64 tmppayload;
+
+	if (dst < 1 || dst > scif_info.maxid)
+		return;
+
+	dev_i = &scif_dev[dst];
+
+	if (!_scifdev_alive(dev_i))
+		return;
+	/*
+	 * If the p2p connection is already setup or in the process of setting
+	 * up then just ignore this request. The requested node will get
+	 * informed by SCIF_NODE_ADD_ACK or SCIF_NODE_ADD_NACK
+	 */
+	if (!list_empty(&dev_i->p2p)) {
+		list_for_each_safe(pos, tmp, &dev_i->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == dev_j->node)
+				return;
+		}
+	}
+	p2p_ij = scif_init_p2p_info(dev_i, dev_j);
+	if (!p2p_ij)
+		return;
+	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
+	if (!p2p_ji)
+		return;
+	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
+	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
+
+	/*
+	 * Send a SCIF_NODE_ADD to dev_i, pass it its bus address
+	 * as seen from dev_j
+	 */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_j->node;
+	msg.dst.node = dev_i->node;
+
+	msg.payload[0] = p2p_ji->ppi_da[SCIF_PPI_APER];
+	msg.payload[1] = p2p_ij->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[2] = p2p_ij->ppi_da[SCIF_PPI_APER];
+	msg.payload[3] = p2p_ij->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	err = scif_nodeqp_send(dev_i,  &msg);
+	if (err) {
+		dev_err(&scifdev->sdev->dev,
+			"%s %d error %d\n", __func__, __LINE__, err);
+		return;
+	}
+
+	/* Same as above but to dev_j */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_i->node;
+	msg.dst.node = dev_j->node;
+
+	tmppayload = msg.payload[0];
+	msg.payload[0] = msg.payload[2];
+	msg.payload[2] = tmppayload;
+	msg.payload[1] = p2p_ji->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[3] = p2p_ji->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	scif_nodeqp_send(dev_j, &msg);
+}
+
+static void scif_p2p_setup(void)
+{
+	int i, j;
+
+	if (!scif_info.p2p_enable)
+		return;
+
+	for (i = 1; i <= scif_info.maxid; i++)
+		if (!_scifdev_alive(&scif_dev[i]))
+			return;
+
+	for (i = 1; i <= scif_info.maxid; i++) {
+		for (j = 1; j <= scif_info.maxid; j++) {
+			struct scif_dev *scifdev = &scif_dev[i];
+
+			if (i == j)
+				continue;
+			scif_node_connect(scifdev, j);
+		}
+	}
+}
+
+void scif_qp_response_ack(struct work_struct *work)
+{
+	struct scif_dev *scifdev = container_of(work, struct scif_dev,
+						init_msg_work);
+	struct scif_peer_dev *spdev;
+
+	/* Drop the INIT message if it has already been received */
+	if (_scifdev_alive(scifdev))
+		return;
+
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev))
+		return;
+
+	if (scif_is_mgmt_node()) {
+		mutex_lock(&scif_info.conflock);
+		scif_p2p_setup();
+		mutex_unlock(&scif_info.conflock);
+	}
+}
+
+static char *message_types[] = {"BAD",
+				"INIT",
+				"EXIT",
+				"SCIF_EXIT_ACK",
+				"SCIF_NODE_ADD",
+				"SCIF_NODE_ADD_ACK",
+				"SCIF_NODE_ADD_NACK",
+				"REMOVE_NODE",
+				"REMOVE_NODE_ACK"};
+
+static void
+scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
+		     const char *label)
+{
+	if (!scif_info.en_msg_log)
+		return;
+	if (msg->uop > SCIF_MAX_MSG) {
+		dev_err(&scifdev->sdev->dev,
+			"%s: unknown msg type %d\n", label, msg->uop);
+		return;
+	}
+	dev_info(&scifdev->sdev->dev,
+		 "%s: msg type %s, src %d:%d, dest %d:%d payload 0x%llx:0x%llx:0x%llx:0x%llx\n",
+		 label, message_types[msg->uop], msg->src.node, msg->src.port,
+		 msg->dst.node, msg->dst.port, msg->payload[0], msg->payload[1],
+		 msg->payload[2], msg->payload[3]);
+}
+
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+	int err = -ENOMEM, loop_cnt = 0;
+
+	scif_display_message(scifdev, msg, "Sent");
+	if (!qp) {
+		err = -EINVAL;
+		goto error;
+	}
+	spin_lock(&qp->send_lock);
+
+	while ((err = scif_rb_write(&qp->outbound_q,
+				    msg, sizeof(struct scifmsg)))) {
+		mdelay(1);
+#define SCIF_NODEQP_SEND_TO_MSEC (3 * 1000)
+		if (loop_cnt++ > (SCIF_NODEQP_SEND_TO_MSEC)) {
+			err = -ENODEV;
+			break;
+		}
+	}
+	if (!err)
+		scif_rb_commit(&qp->outbound_q);
+	spin_unlock(&qp->send_lock);
+	if (!err) {
+		if (scifdev_self(scifdev))
+			/*
+			 * For loopback we need to emulate an interrupt by
+			 * queuing work for the queue handling real node
+			 * Qp interrupts.
+			 */
+			queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+		else
+			scif_send_msg_intr(scifdev);
+	}
+error:
+	if (err)
+		dev_dbg(&scifdev->sdev->dev,
+			"%s %d error %d uop %d\n",
+			 __func__, __LINE__, err, msg->uop);
+	return err;
+}
+
+/**
+ * scif_nodeqp_send - Send a message on the node queue pair
+ * @scifdev: Scif Device.
+ * @msg: The message to be sent.
+ */
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int err;
+	struct device *spdev = NULL;
+
+	if (msg->uop > SCIF_EXIT_ACK) {
+		/* Don't send messages once the exit flow has begun */
+		if (OP_IDLE != scifdev->exit)
+			return -ENODEV;
+		spdev = scif_get_peer_dev(scifdev);
+		if (IS_ERR(spdev)) {
+			err = PTR_ERR(spdev);
+			return err;
+		}
+	}
+	err = _scif_nodeqp_send(scifdev, msg);
+	if (msg->uop > SCIF_EXIT_ACK)
+		scif_put_peer_dev(spdev);
+	return err;
+}
+
+/*
+ * scif_misc_handler:
+ *
+ * Work queue handler for servicing miscellaneous SCIF tasks.
+ * Examples include:
+ * 1) Cleanup of zombie endpoints.
+ */
+void scif_misc_handler(struct work_struct *work)
+{
+	scif_cleanup_zombie_epd();
+}
+
+/**
+ * scif_init() - Respond to SCIF_INIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ */
+static __always_inline void
+scif_init(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	/*
+	 * Allow the thread waiting for device page updates for the peer QP DMA
+	 * address to complete initializing the inbound_q.
+	 */
+	flush_delayed_work(&scifdev->qp_dwork);
+	/*
+	 * Delegate the peer device registration to a workqueue, otherwise if
+	 * SCIF client probe (called during peer device registration) calls
+	 * scif_connect(..), it will block the message processing thread causing
+	 * a deadlock.
+	 */
+	schedule_work(&scifdev->init_msg_work);
+}
+
+/**
+ * scif_exit() - Respond to SCIF_EXIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * This function stops the SCIF interface for the node which sent
+ * the SCIF_EXIT message and starts waiting for that node to
+ * set up the queue pair again.
+ */
+static __always_inline void
+scif_exit(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit_ack_pending = true;
+	if (scif_is_mgmt_node())
+		scif_disconnect_node(scifdev->node, false);
+	else
+		scif_stop(scifdev);
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+}
+
+/**
+ * scif_exit_ack() - Respond to SCIF_EXIT_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ */
+static __always_inline void
+scif_exit_ack(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit = OP_COMPLETED;
+	wake_up(&scif_info.exitwq);
+}
+
+/**
+ * scif_node_add() - Respond to SCIF_NODE_ADD interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * When the mgmt node driver has finished initializing a MIC node queue pair it
+ * marks the node as online. It then looks for all currently online MIC cards
+ * and sends a SCIF_NODE_ADD message to identify the ID of the new card for
+ * peer to peer initialization.
+ *
+ * The local node allocates its incoming queue and sends its address in the
+ * SCIF_NODE_ADD_ACK message back to the mgmt node, the mgmt node "reflects"
+ * this message to the new node
+ */
+static __always_inline void
+scif_node_add(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *newdev;
+	dma_addr_t qp_offset;
+	int qp_connect;
+	struct scif_hw_dev *sdev;
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d:%d received NODE_ADD msg for node %d\n",
+		scifdev->node, msg->dst.node, msg->src.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"Remote address for this node's aperture %llx\n",
+		msg->payload[0]);
+	newdev = &scif_dev[msg->src.node];
+	newdev->node = msg->src.node;
+	newdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	sdev = newdev->sdev;
+
+	if (scif_setup_intr_wq(newdev)) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup interrupts for %d\n", msg->src.node);
+		goto interrupt_setup_error;
+	}
+	newdev->mmio.va = ioremap_nocache(msg->payload[1], sdev->mmio->len);
+	if (!newdev->mmio.va) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to map mmio for %d\n", msg->src.node);
+		goto mmio_map_error;
+	}
+	newdev->qpairs = kzalloc(sizeof(*newdev->qpairs), GFP_KERNEL);
+	if (!newdev->qpairs)
+		goto qp_alloc_error;
+	/*
+	 * Set the base address of the remote node's memory since it gets
+	 * added to qp_offset
+	 */
+	newdev->base_addr = msg->payload[0];
+
+	qp_connect = scif_setup_qp_connect(newdev->qpairs, &qp_offset,
+					   SCIF_NODE_QP_SIZE, newdev);
+	if (qp_connect) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup qp_connect %d\n", qp_connect);
+		goto qp_connect_error;
+	}
+
+	newdev->db = sdev->hw_ops->next_db(sdev);
+	newdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						   "SCIF_INTR", newdev,
+						   newdev->db);
+	if (IS_ERR(newdev->cookie))
+		goto qp_connect_error;
+	newdev->qpairs->magic = SCIFEP_MAGIC;
+	newdev->qpairs->qp_state = SCIF_QP_OFFLINE;
+
+	msg->uop = SCIF_NODE_ADD_ACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	msg->payload[0] = qp_offset;
+	msg->payload[2] = newdev->db;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+	return;
+qp_connect_error:
+	kfree(newdev->qpairs);
+	newdev->qpairs = NULL;
+qp_alloc_error:
+	iounmap(newdev->mmio.va);
+	newdev->mmio.va = NULL;
+mmio_map_error:
+interrupt_setup_error:
+	dev_err(&scifdev->sdev->dev,
+		"node add failed for node %d\n", msg->src.node);
+	msg->uop = SCIF_NODE_ADD_NACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+}
+
+void scif_poll_qp_state(struct work_struct *work)
+{
+#define SCIF_NODE_QP_RETRY 100
+#define SCIF_NODE_QP_TIMEOUT 100
+	struct scif_dev *peerdev = container_of(work, struct scif_dev,
+							p2p_dwork.work);
+	struct scif_qp *qp = &peerdev->qpairs[0];
+
+	if (qp->qp_state != SCIF_QP_ONLINE ||
+	    qp->remote_qp->qp_state != SCIF_QP_ONLINE) {
+		if (peerdev->p2p_retry++ == SCIF_NODE_QP_RETRY) {
+			dev_err(&peerdev->sdev->dev,
+				"Warning: QP check timeout with state %d\n",
+				qp->qp_state);
+			goto timeout;
+		}
+		schedule_delayed_work(&peerdev->p2p_dwork,
+				      msecs_to_jiffies(SCIF_NODE_QP_TIMEOUT));
+		return;
+	}
+	scif_peer_register_device(peerdev);
+	return;
+timeout:
+	dev_err(&peerdev->sdev->dev,
+		"%s %d remote node %d offline,  state = 0x%x\n",
+		__func__, __LINE__, peerdev->node, qp->qp_state);
+	qp->remote_qp->qp_state = SCIF_QP_OFFLINE;
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_ack() - Respond to SCIF_NODE_ADD_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * After a MIC node receives the SCIF_NODE_ADD_ACK message it sends this
+ * message to the mgmt node to confirm the sequence is finished.
+ *
+ */
+static __always_inline void
+scif_node_add_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *peerdev;
+	struct scif_qp *qp;
+	struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d received SCIF_NODE_ADD_ACK msg src %d dst %d\n",
+		scifdev->node, msg->src.node, msg->dst.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"payload %llx %llx %llx %llx\n", msg->payload[0],
+		msg->payload[1], msg->payload[2], msg->payload[3]);
+	if (scif_is_mgmt_node()) {
+		/*
+		 * the lock serializes with scif_qp_response_ack. The mgmt node
+		 * is forwarding the NODE_ADD_ACK message from src to dst we
+		 * need to make sure that the dst has already received a
+		 * NODE_ADD for src and setup its end of the qp to dst
+		 */
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		scif_nodeqp_send(dst_dev, msg);
+		mutex_unlock(&scif_info.conflock);
+		return;
+	}
+	peerdev = &scif_dev[msg->src.node];
+	peerdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	peerdev->node = msg->src.node;
+
+	qp = &peerdev->qpairs[0];
+
+	if ((scif_setup_qp_connect_response(peerdev, &peerdev->qpairs[0],
+					    msg->payload[0])))
+		goto local_error;
+	peerdev->rdb = msg->payload[2];
+	qp->remote_qp->qp_state = SCIF_QP_ONLINE;
+	schedule_delayed_work(&peerdev->p2p_dwork, 0);
+	return;
+local_error:
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_nack: Respond to SCIF_NODE_ADD_NACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * SCIF_NODE_ADD failed, so inform the waiting wq.
+ */
+static __always_inline void
+scif_node_add_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+		dev_dbg(&scifdev->sdev->dev,
+			"SCIF_NODE_ADD_NACK received from %d\n", scifdev->node);
+		scif_nodeqp_send(dst_dev, msg);
+	}
+}
+
+/*
+ * scif_node_remove: Handle SCIF_NODE_REMOVE message
+ * @msg: Interrupt message
+ *
+ * Handle node removal.
+ */
+static __always_inline void
+scif_node_remove(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int node = msg->payload[0];
+	struct scif_dev *scdev = &scif_dev[node];
+
+	scdev->node_remove_ack_pending = true;
+	scif_handle_remove_node(node);
+}
+
+/*
+ * scif_node_remove_ack: Handle SCIF_NODE_REMOVE_ACK message
+ * @msg: Interrupt message
+ *
+ * The peer has acked a SCIF_NODE_REMOVE message.
+ */
+static __always_inline void
+scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *sdev = &scif_dev[msg->payload[0]];
+
+	atomic_inc(&sdev->disconn_rescnt);
+	wake_up(&sdev->disconn_wq);
+}
+
+static void
+scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	/* Bogus Node Qp Message? */
+	dev_err(&scifdev->sdev->dev,
+		"Unknown message 0x%x scifdev->node 0x%x\n",
+		msg->uop, scifdev->node);
+}
+
+static void (*scif_intr_func[SCIF_MAX_MSG + 1])
+	    (struct scif_dev *, struct scifmsg *msg) = {
+	scif_msg_unknown,	/* Error */
+	scif_init,		/* SCIF_INIT */
+	scif_exit,		/* SCIF_EXIT */
+	scif_exit_ack,		/* SCIF_EXIT_ACK */
+	scif_node_add,		/* SCIF_NODE_ADD */
+	scif_node_add_ack,	/* SCIF_NODE_ADD_ACK */
+	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
+	scif_node_remove,	/* SCIF_NODE_REMOVE */
+	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+};
+
+/**
+ * scif_nodeqp_msg_handler() - Common handler for node messages
+ * @scifdev: Remote device to respond to
+ * @qp: Remote memory pointer
+ * @msg: The message to be handled.
+ *
+ * This routine calls the appropriate routine to handle a Node Qp
+ * message receipt
+ */
+static int scif_max_msg_id = SCIF_MAX_MSG;
+
+static void
+scif_nodeqp_msg_handler(struct scif_dev *scifdev,
+			struct scif_qp *qp, struct scifmsg *msg)
+{
+	scif_display_message(scifdev, msg, "Rcvd");
+
+	if (msg->uop > (u32)scif_max_msg_id) {
+		/* Bogus Node Qp Message? */
+		dev_err(&scifdev->sdev->dev,
+			"Unknown message 0x%x scifdev->node 0x%x\n",
+			msg->uop, scifdev->node);
+		return;
+	}
+
+	scif_intr_func[msg->uop](scifdev, msg);
+}
+
+/**
+ * scif_nodeqp_intrhandler() - Interrupt handler for node messages
+ * @scifdev:    Remote device to respond to
+ * @qp:         Remote memory pointer
+ *
+ * This routine is triggered by the interrupt mechanism.  It reads
+ * messages from the node queue RB and calls the Node QP Message handling
+ * routine.
+ */
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	struct scifmsg msg;
+	int read_size;
+
+	do {
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg, sizeof(msg));
+		if (!read_size)
+			break;
+		scif_nodeqp_msg_handler(scifdev, qp, &msg);
+		/*
+		 * The node queue pair is unmapped so skip the read pointer
+		 * update after receipt of a SCIF_EXIT_ACK
+		 */
+		if (SCIF_EXIT_ACK == msg.uop)
+			break;
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (1);
+}
+
+/**
+ * scif_loopb_wq_handler - Loopback Workqueue Handler.
+ * @work: loop back work
+ *
+ * This work queue routine is invoked by the loopback work queue handler.
+ * It grabs the recv lock, dequeues any available messages from the head
+ * of the loopback message list, calls the node QP message handler,
+ * waits for it to return, then frees up this message and dequeues more
+ * elements of the list if available.
+ */
+static void scif_loopb_wq_handler(struct work_struct *unused)
+{
+	struct scif_dev *scifdev = scif_info.loopb_dev;
+	struct scif_qp *qp = scifdev->qpairs;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = NULL;
+		spin_lock(&qp->recv_lock);
+		if (!list_empty(&scif_info.loopb_recv_q)) {
+			msg = list_first_entry(&scif_info.loopb_recv_q,
+					       struct scif_loopb_msg,
+					       list);
+			list_del(&msg->list);
+		}
+		spin_unlock(&qp->recv_lock);
+
+		if (msg) {
+			scif_nodeqp_msg_handler(scifdev, qp, &msg->msg);
+			kfree(msg);
+		}
+	} while (msg);
+}
+
+/**
+ * scif_loopb_msg_handler() - Workqueue handler for loopback messages.
+ * @scifdev: SCIF device
+ * @qp: Queue pair.
+ *
+ * This work queue routine is triggered when a loopback message is received.
+ *
+ * We need special handling for receiving Node Qp messages on a loopback SCIF
+ * device via two workqueues for receiving messages.
+ *
+ * The reason we need the extra workqueue which is not required with *normal*
+ * non-loopback SCIF devices is the potential classic deadlock described below:
+ *
+ * Thread A tries to send a message on a loopback SCIF device and blocks since
+ * there is no space in the RB while it has the send_lock held or another
+ * lock called lock X for example.
+ *
+ * Thread B: The Loopback Node QP message receive workqueue receives the message
+ * and tries to send a message (eg an ACK) to the loopback SCIF device. It tries
+ * to grab the send lock again or lock X and deadlocks with Thread A. The RB
+ * cannot be drained any further due to this classic deadlock.
+ *
+ * In order to avoid deadlocks as mentioned above we have an extra level of
+ * indirection achieved by having two workqueues.
+ * 1) The first workqueue whose handler is scif_loopb_msg_handler reads
+ * messages from the Node QP RB, adds them to a list and queues work for the
+ * second workqueue.
+ *
+ * 2) The second workqueue whose handler is scif_loopb_wq_handler dequeues
+ * messages from the list, handles them, frees up the memory and dequeues
+ * more elements from the list if possible.
+ */
+int
+scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	int read_size;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = kmalloc(sizeof(*msg), GFP_KERNEL);
+		if (!msg)
+			return -ENOMEM;
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg->msg,
+					     sizeof(struct scifmsg));
+		if (read_size != sizeof(struct scifmsg)) {
+			kfree(msg);
+			scif_rb_update_read_ptr(&qp->inbound_q);
+			break;
+		}
+		spin_lock(&qp->recv_lock);
+		list_add_tail(&msg->list, &scif_info.loopb_recv_q);
+		spin_unlock(&qp->recv_lock);
+		queue_work(scif_info.loopb_wq, &scif_info.loopb_work);
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (read_size == sizeof(struct scifmsg));
+	return read_size;
+}
+
+/**
+ * scif_setup_loopback_qp - One time setup work for Loopback Node Qp.
+ * @scifdev: SCIF device
+ *
+ * Sets up the required loopback workqueues, queue pairs and ring buffers
+ */
+int scif_setup_loopback_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	void *local_q;
+	struct scif_qp *qp;
+	struct scif_peer_dev *spdev;
+
+	err = scif_setup_intr_wq(scifdev);
+	if (err)
+		goto exit;
+	INIT_LIST_HEAD(&scif_info.loopb_recv_q);
+	snprintf(scif_info.loopb_wqname, sizeof(scif_info.loopb_wqname),
+		 "SCIF LOOPB %d", scifdev->node);
+	scif_info.loopb_wq =
+		alloc_ordered_workqueue(scif_info.loopb_wqname, 0);
+	if (!scif_info.loopb_wq) {
+		err = -ENOMEM;
+		goto destroy_intr;
+	}
+	INIT_WORK(&scif_info.loopb_work, scif_loopb_wq_handler);
+	/* Allocate Self Qpair */
+	scifdev->qpairs = kzalloc(sizeof(*scifdev->qpairs), GFP_KERNEL);
+	if (!scifdev->qpairs) {
+		err = -ENOMEM;
+		goto destroy_loopb_wq;
+	}
+
+	qp = scifdev->qpairs;
+	qp->magic = SCIFEP_MAGIC;
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(SCIF_NODE_QP_SIZE, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto free_qpairs;
+	}
+	/*
+	 * For loopback the inbound_q and outbound_q are essentially the same
+	 * since the Node sends a message on the loopback interface to the
+	 * outbound_q which is then received on the inbound_q.
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+
+	scif_rb_init(&qp->inbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+	scif_info.nodeid = scifdev->node;
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto free_local_q;
+	}
+	scif_info.loopb_dev = scifdev;
+	return err;
+free_local_q:
+	kfree(local_q);
+free_qpairs:
+	kfree(scifdev->qpairs);
+destroy_loopb_wq:
+	destroy_workqueue(scif_info.loopb_wq);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+exit:
+	return err;
+}
+
+/**
+ * scif_destroy_loopback_qp - One time uninit work for Loopback Node Qp
+ * @scifdev: SCIF device
+ *
+ * Destroys the workqueues and frees up the Ring Buffer and Queue Pair memory.
+ */
+int scif_destroy_loopback_qp(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	destroy_workqueue(scif_info.loopb_wq);
+	scif_destroy_intr_wq(scifdev);
+	kfree(scifdev->qpairs->outbound_q.rb_base);
+	kfree(scifdev->qpairs);
+	scifdev->sdev = NULL;
+	scif_info.loopb_dev = NULL;
+	return 0;
+}
+
+void scif_destroy_p2p(struct scif_dev *scifdev)
+{
+	struct scif_dev *peer_dev;
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	int bd;
+
+	mutex_lock(&scif_info.conflock);
+	/* Free P2P mappings in the given node for all its peer nodes */
+	list_for_each_safe(pos, tmp, &scifdev->p2p) {
+		p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			     p2p->sg_nentries[SCIF_PPI_MMIO],
+			     DMA_BIDIRECTIONAL);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			     p2p->sg_nentries[SCIF_PPI_APER],
+			     DMA_BIDIRECTIONAL);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+		list_del(pos);
+		kfree(p2p);
+	}
+
+	/* Free P2P mapping created in the peer nodes for the given node */
+	for (bd = SCIF_MGMT_NODE + 1; bd <= scif_info.maxid; bd++) {
+		peer_dev = &scif_dev[bd];
+		list_for_each_safe(pos, tmp, &peer_dev->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == scifdev->node) {
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_MMIO],
+					     p2p->sg_nentries[SCIF_PPI_MMIO],
+					     DMA_BIDIRECTIONAL);
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_APER],
+					     p2p->sg_nentries[SCIF_PPI_APER],
+					     DMA_BIDIRECTIONAL);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+				list_del(pos);
+				kfree(p2p);
+			}
+		}
+	}
+	mutex_unlock(&scif_info.conflock);
+}
-- 
1.8.2.1



* [PATCH char-misc-next v4 09/13] misc: mic: SCIF connections APIs i.e. accept and connect
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (7 preceding siblings ...)
  2015-03-31  1:36 41% ` [PATCH char-misc-next v4 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
@ 2015-03-31  1:36 41% ` Sudeep Dutt
  2015-03-31  1:36 52% ` [PATCH char-misc-next v4 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

From: Nikhil Rao <nikhil.rao@intel.com>

Add the SCIF connection APIs, which establish a SCIF connection
between a pair of SCIF endpoints. A SCIF connection consists of a
dedicated queue-pair between the endpoints. Client messages are
sent over the queue-pair, whereas the signaling associated with
each message is multiplexed over the node queue-pair. Other
control messages, such as those exposing registered memory, are
also sent over the node queue-pair. The SCIF endpoints must be in
the connected state to exchange messages, register memory, map
remote memory and trigger DMA transfers. SCIF connections can be
set up asynchronously or synchronously.
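
As an illustration only, the handshake described above can be modelled
as a small state machine. The sketch below is not part of the patch: the
enum names and the three helper functions are hypothetical
simplifications, and only the transitions themselves are taken from the
message handlers in this series (SCIF_CNCT_GNT, SCIF_CNCT_REJ,
SCIF_CNCT_GNTACK and SCIF_CNCT_GNTNACK). Locking, wait queues and QP
mapping are deliberately left out.

```c
#include <assert.h>

/* Hypothetical, simplified mirror of the endpoint states in the patch. */
enum ep_state { EP_BOUND, EP_CONNECTING, EP_MAPPING, EP_CONNECTED, EP_CLOSING };

/* Hypothetical subset of the connection messages. */
enum conn_msg { MSG_CNCT_GNT, MSG_CNCT_REJ, MSG_CNCT_GNTACK, MSG_CNCT_GNTNACK };

/*
 * Connecting side: a reply to the connect request only has an effect
 * while the endpoint is still CONNECTING; stale replies are ignored.
 */
static enum ep_state connector_on_msg(enum ep_state s, enum conn_msg m)
{
	if (s != EP_CONNECTING)
		return s;
	switch (m) {
	case MSG_CNCT_GNT:
		return EP_MAPPING;	/* grant: go map the remote QP */
	case MSG_CNCT_REJ:
		return EP_BOUND;	/* reject: back to bound */
	default:
		return s;
	}
}

/*
 * Connecting side, after trying to map the remote QP. Updates the state
 * and returns the message that would be sent back to the acceptor.
 */
static enum conn_msg connector_finish(enum ep_state *s, int map_err)
{
	assert(*s == EP_MAPPING);
	if (map_err) {
		*s = EP_BOUND;
		return MSG_CNCT_GNTNACK;
	}
	*s = EP_CONNECTED;
	return MSG_CNCT_GNTACK;
}

/* Accepting side: the new endpoint waits for the (N)ACK of its grant. */
static enum ep_state acceptor_on_msg(enum ep_state s, enum conn_msg m)
{
	switch (m) {
	case MSG_CNCT_GNTACK:
		return EP_CONNECTED;
	case MSG_CNCT_GNTNACK:
		return EP_CLOSING;
	default:
		return s;
	}
}
```

In this model a successful connect walks the connecting endpoint
through CONNECTING, MAPPING and CONNECTED, while the accepted endpoint
moves from CONNECTING straight to CONNECTED once the GNTACK arrives.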

Thanks to Johnnie S Peters for authoring parts of this patch during
early bring up of the SCIF driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   8 +
 drivers/misc/mic/scif/scif_api.c    | 457 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    | 229 ++++++++++++++++++
 drivers/misc/mic/scif/scif_fd.c     | 106 +++++++++
 drivers/misc/mic/scif/scif_main.c   |   1 +
 drivers/misc/mic/scif/scif_nm.c     |  36 +++
 drivers/misc/mic/scif/scif_nodeqp.c |  16 +-
 7 files changed, 852 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index b973b99..65c628e 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,5 +144,13 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index cbdc27b..da64cee 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -415,3 +415,460 @@ int scif_listen(scif_epd_t epd, int backlog)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(scif_listen);
+
+/*
+ ************************************************************************
+ * SCIF connection flow:
+ *
+ * 1) A SCIF listening endpoint can call scif_accept(..) to wait for SCIF
+ *	connections via a SCIF_CNCT_REQ message
+ * 2) A SCIF endpoint can initiate a SCIF connection by calling
+ *	scif_connect(..) which calls scif_setup_qp_connect(..) which
+ *	allocates the local qp for the endpoint ring buffer and then sends
+ *	a SCIF_CNCT_REQ to the remote node and waits for a SCIF_CNCT_GNT or
+ *	a SCIF_CNCT_REJ message
+ * 3) The peer node handles a SCIF_CNCT_REQ via scif_cnctreq(..) which
+ *	wakes up any threads blocked in step 1 or sends a SCIF_CNCT_REJ
+ *	message otherwise
+ * 4) A thread blocked waiting for incoming connections allocates its local
+ *	endpoint QP and ring buffer following which it sends a SCIF_CNCT_GNT
+ *	and waits for a SCIF_CNCT_GNT(N)ACK. If the allocation fails then
+ *	the node sends a SCIF_CNCT_REJ message
+ * 5) Upon receipt of a SCIF_CNCT_GNT or a SCIF_CNCT_REJ message the
+ *	connecting endpoint is woken up by scif_cnctgnt(..) or
+ *	scif_cnctrej(..). On a grant it maps the remote endpoint's QP,
+ *	updates its outbound QP and sends a SCIF_CNCT_GNTACK message on
+ *	success or a SCIF_CNCT_GNTNACK message on failure and completes
+ *	the scif_connect(..) API
+ * 6) Upon receipt of a SCIF_CNCT_GNT(N)ACK the accepting endpoint blocked
+ *	in step 4 is woken up and completes the scif_accept(..) API
+ * 7) The SCIF connection is now established between the two SCIF endpoints.
+ */
+static int scif_conn_func(struct scif_endpt *ep)
+{
+	int err = 0;
+	struct scifmsg msg;
+	struct device *spdev;
+
+	/* Initiate the first part of the endpoint QP setup */
+	err = scif_setup_qp_connect(ep->qp_info.qp, &ep->qp_info.qp_offset,
+				    SCIF_ENDPT_QP_SIZE, ep->remote_dev);
+	if (err) {
+		dev_err(&ep->remote_dev->sdev->dev,
+			"%s err %d qp_offset 0x%llx\n",
+			__func__, err, ep->qp_info.qp_offset);
+		ep->state = SCIFEP_BOUND;
+		goto connect_error_simple;
+	}
+
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	/* Format connect message and send it */
+	msg.src = ep->port;
+	msg.dst = ep->conn_port;
+	msg.uop = SCIF_CNCT_REQ;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->qp_info.qp_offset;
+	err = _scif_nodeqp_send(ep->remote_dev, &msg);
+	if (err)
+		goto connect_error_dec;
+	scif_put_peer_dev(spdev);
+	/*
+	 * Wait for the remote node to respond with SCIF_CNCT_GNT or
+	 * SCIF_CNCT_REJ message.
+	 */
+	err = wait_event_timeout(ep->conwq, ep->state != SCIFEP_CONNECTING,
+				 SCIF_NODE_ALIVE_TIMEOUT);
+	if (!err) {
+		dev_err(&ep->remote_dev->sdev->dev,
+			"%s %d timeout\n", __func__, __LINE__);
+		ep->state = SCIFEP_BOUND;
+	}
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	if (ep->state == SCIFEP_MAPPING) {
+		err = scif_setup_qp_connect_response(ep->remote_dev,
+						     ep->qp_info.qp,
+						     ep->qp_info.gnt_pld);
+		/*
+		 * If the resources to map the queue are not available then
+		 * we need to tell the other side to terminate the accept
+		 */
+		if (err) {
+			dev_err(&ep->remote_dev->sdev->dev,
+				"%s %d err %d\n", __func__, __LINE__, err);
+			msg.uop = SCIF_CNCT_GNTNACK;
+			msg.payload[0] = ep->remote_ep;
+			_scif_nodeqp_send(ep->remote_dev, &msg);
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+
+		msg.uop = SCIF_CNCT_GNTACK;
+		msg.payload[0] = ep->remote_ep;
+		err = _scif_nodeqp_send(ep->remote_dev, &msg);
+		if (err) {
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+		ep->state = SCIFEP_CONNECTED;
+		mutex_lock(&scif_info.connlock);
+		list_add_tail(&ep->list, &scif_info.connected);
+		mutex_unlock(&scif_info.connlock);
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connected\n", ep);
+	} else if (ep->state == SCIFEP_BOUND) {
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connection refused\n", ep);
+		err = -ECONNREFUSED;
+		goto connect_error_dec;
+	}
+	scif_put_peer_dev(spdev);
+	return err;
+connect_error_dec:
+	scif_put_peer_dev(spdev);
+cleanup_qp:
+	scif_cleanup_ep_qp(ep);
+connect_error_simple:
+	return err;
+}
+
+/*
+ * scif_conn_handler:
+ *
+ * Workqueue handler for servicing non-blocking SCIF connect
+ *
+ */
+void scif_conn_handler(struct work_struct *work)
+{
+	struct scif_endpt *ep;
+
+	do {
+		ep = NULL;
+		spin_lock(&scif_info.nb_connect_lock);
+		if (!list_empty(&scif_info.nb_connect_list)) {
+			ep = list_first_entry(&scif_info.nb_connect_list,
+					      struct scif_endpt, conn_list);
+			list_del(&ep->conn_list);
+		}
+		spin_unlock(&scif_info.nb_connect_lock);
+		if (ep)
+			ep->conn_err = scif_conn_func(ep);
+	} while (ep);
+}
+
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	struct scif_dev *remote_dev;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI connect: ep %p %s\n", ep,
+		scif_ep_states[ep->state]);
+
+	if (!scif_dev || dst->node > scif_info.maxid)
+		return -ENODEV;
+
+	might_sleep();
+
+	remote_dev = &scif_dev[dst->node];
+	spdev = scif_get_peer_dev(remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		return err;
+	}
+
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+		err = -EINVAL;
+		break;
+	case SCIFEP_DISCONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EINVAL;
+		break;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+		err = -EOPNOTSUPP;
+		break;
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			err = -EINPROGRESS;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_CONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_UNBOUND:
+		ep->port.port = scif_get_new_port();
+		if (!ep->port.port) {
+			err = -ENOSPC;
+		} else {
+			ep->port.node = scif_info.nodeid;
+			ep->conn_async_state = ASYNC_CONN_IDLE;
+		}
+		/* Fall through */
+	case SCIFEP_BOUND:
+		/*
+		 * If a non-blocking connect has been already initiated
+		 * (conn_async_state is either ASYNC_CONN_INPROGRESS or
+		 * ASYNC_CONN_FLUSH_WORK), the end point could end up in
+		 * SCIFEP_BOUND due to an error in the connection process
+		 * (e.g., connection refused). If conn_async_state is
+		 * ASYNC_CONN_INPROGRESS - transition to ASYNC_CONN_FLUSH_WORK
+		 * so that the error status can be collected. If the state is
+		 * already ASYNC_CONN_FLUSH_WORK - then set the error to
+		 * EINPROGRESS since some other thread is waiting to collect
+		 * error status.
+		 */
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+			err = -EINPROGRESS;
+		} else {
+			ep->conn_port = *dst;
+			init_waitqueue_head(&ep->sendwq);
+			init_waitqueue_head(&ep->recvwq);
+			init_waitqueue_head(&ep->conwq);
+			ep->conn_async_state = 0;
+
+			if (unlikely(non_block))
+				ep->conn_async_state = ASYNC_CONN_INPROGRESS;
+		}
+		break;
+	}
+
+	if (err || ep->conn_async_state == ASYNC_CONN_FLUSH_WORK)
+		goto connect_simple_unlock1;
+
+	ep->state = SCIFEP_CONNECTING;
+	ep->remote_dev = &scif_dev[dst->node];
+	ep->qp_info.qp->magic = SCIFEP_MAGIC;
+	if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+		spin_lock(&scif_info.nb_connect_lock);
+		list_add_tail(&ep->conn_list, &scif_info.nb_connect_list);
+		spin_unlock(&scif_info.nb_connect_lock);
+		err = -EINPROGRESS;
+		schedule_work(&scif_info.conn_work);
+	}
+connect_simple_unlock1:
+	spin_unlock(&ep->lock);
+	scif_put_peer_dev(spdev);
+	if (err) {
+		return err;
+	} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+		flush_work(&scif_info.conn_work);
+		err = ep->conn_err;
+		spin_lock(&ep->lock);
+		ep->conn_async_state = ASYNC_CONN_IDLE;
+		spin_unlock(&ep->lock);
+	} else {
+		err = scif_conn_func(ep);
+	}
+	return err;
+}
+
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst)
+{
+	return __scif_connect(epd, dst, false);
+}
+EXPORT_SYMBOL_GPL(scif_connect);
+
+/**
+ * scif_accept() - Accept a connection request from the remote node
+ *
+ * The function accepts a connection request from the remote node.  Successful
+ * completion is indicated by a new end point being created and passed back
+ * to the caller for future reference.
+ *
+ * Upon successful completion zero will be returned and the peer information
+ * will be filled in.
+ *
+ * If the end point is not in the listening state -EINVAL will be returned.
+ *
+ * If resource allocation fails during the connection sequence -ENOMEM
+ * will be returned.
+ *
+ * If the function is called without the SCIF_ACCEPT_SYNC flag and no
+ * connection requests are pending it will return -EAGAIN.
+ *
+ * If the remote side is not sending any connection requests the caller may
+ * terminate this function with a signal.  If so, -EINTR will be returned.
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
+		scif_epd_t *newepd, int flags)
+{
+	struct scif_endpt *lep = (struct scif_endpt *)epd;
+	struct scif_endpt *cep;
+	struct scif_conreq *conreq;
+	struct scifmsg msg;
+	int err;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI accept: ep %p %s\n", lep, scif_ep_states[lep->state]);
+
+	if (flags & ~SCIF_ACCEPT_SYNC)
+		return -EINVAL;
+
+	if (!peer || !newepd)
+		return -EINVAL;
+
+	might_sleep();
+	spin_lock(&lep->lock);
+	if (lep->state != SCIFEP_LISTENING) {
+		spin_unlock(&lep->lock);
+		return -EINVAL;
+	}
+
+	if (!lep->conreqcnt && !(flags & SCIF_ACCEPT_SYNC)) {
+		/* No connection request present and we do not want to wait */
+		spin_unlock(&lep->lock);
+		return -EAGAIN;
+	}
+
+	lep->files = current->files;
+retry_connection:
+	spin_unlock(&lep->lock);
+	/* Wait for the remote node to send us a SCIF_CNCT_REQ */
+	err = wait_event_interruptible(lep->conwq,
+				       (lep->conreqcnt ||
+				       (lep->state != SCIFEP_LISTENING)));
+	if (err)
+		return err;
+
+	if (lep->state != SCIFEP_LISTENING)
+		return -EINTR;
+
+	spin_lock(&lep->lock);
+
+	if (!lep->conreqcnt)
+		goto retry_connection;
+
+	/* Get the first connect request off the list */
+	conreq = list_first_entry(&lep->conlist, struct scif_conreq, list);
+	list_del(&conreq->list);
+	lep->conreqcnt--;
+	spin_unlock(&lep->lock);
+
+	/* Fill in the peer information */
+	peer->node = conreq->msg.src.node;
+	peer->port = conreq->msg.src.port;
+
+	cep = kzalloc(sizeof(*cep), GFP_KERNEL);
+	if (!cep) {
+		err = -ENOMEM;
+		goto scif_accept_error_epalloc;
+	}
+	spin_lock_init(&cep->lock);
+	mutex_init(&cep->sendlock);
+	mutex_init(&cep->recvlock);
+	cep->state = SCIFEP_CONNECTING;
+	cep->remote_dev = &scif_dev[peer->node];
+	cep->remote_ep = conreq->msg.payload[0];
+
+	cep->qp_info.qp = kzalloc(sizeof(*cep->qp_info.qp), GFP_KERNEL);
+	if (!cep->qp_info.qp) {
+		err = -ENOMEM;
+		goto scif_accept_error_qpalloc;
+	}
+
+	cep->qp_info.qp->magic = SCIFEP_MAGIC;
+	spdev = scif_get_peer_dev(cep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto scif_accept_error_map;
+	}
+	err = scif_setup_qp_accept(cep->qp_info.qp, &cep->qp_info.qp_offset,
+				   conreq->msg.payload[1], SCIF_ENDPT_QP_SIZE,
+				   cep->remote_dev);
+	if (err) {
+		dev_dbg(&cep->remote_dev->sdev->dev,
+			"SCIFAPI accept: ep %p new %p scif_setup_qp_accept %d qp_offset 0x%llx\n",
+			lep, cep, err, cep->qp_info.qp_offset);
+		scif_put_peer_dev(spdev);
+		goto scif_accept_error_map;
+	}
+
+	cep->port.node = lep->port.node;
+	cep->port.port = lep->port.port;
+	cep->peer.node = peer->node;
+	cep->peer.port = peer->port;
+	init_waitqueue_head(&cep->sendwq);
+	init_waitqueue_head(&cep->recvwq);
+	init_waitqueue_head(&cep->conwq);
+
+	msg.uop = SCIF_CNCT_GNT;
+	msg.src = cep->port;
+	msg.payload[0] = cep->remote_ep;
+	msg.payload[1] = cep->qp_info.qp_offset;
+	msg.payload[2] = (u64)cep;
+
+	err = _scif_nodeqp_send(cep->remote_dev, &msg);
+	scif_put_peer_dev(spdev);
+	if (err)
+		goto scif_accept_error_map;
+retry:
+	/* Wait for the remote node to respond with SCIF_CNCT_GNT(N)ACK */
+	err = wait_event_timeout(cep->conwq, cep->state != SCIFEP_CONNECTING,
+				 SCIF_NODE_ACCEPT_TIMEOUT);
+	if (!err && scifdev_alive(cep))
+		goto retry;
+	err = !err ? -ENODEV : 0;
+	if (err)
+		goto scif_accept_error_map;
+	kfree(conreq);
+
+	spin_lock(&cep->lock);
+
+	if (cep->state == SCIFEP_CLOSING) {
+		/*
+		 * Remote failed to allocate resources and NAKed the grant.
+		 * There is at this point nothing referencing the new end point.
+		 */
+		spin_unlock(&cep->lock);
+		scif_teardown_ep(cep);
+		kfree(cep);
+
+		/* If called with the sync flag then go back and wait. */
+		if (flags & SCIF_ACCEPT_SYNC) {
+			spin_lock(&lep->lock);
+			goto retry_connection;
+		}
+		return -EAGAIN;
+	}
+
+	scif_get_port(cep->port.port);
+	*newepd = (scif_epd_t)cep;
+	spin_unlock(&cep->lock);
+	return 0;
+scif_accept_error_map:
+	scif_teardown_ep(cep);
+scif_accept_error_qpalloc:
+	kfree(cep);
+scif_accept_error_epalloc:
+	msg.uop = SCIF_CNCT_REJ;
+	msg.dst.node = conreq->msg.src.node;
+	msg.dst.port = conreq->msg.src.port;
+	msg.payload[0] = conreq->msg.payload[0];
+	msg.payload[1] = conreq->msg.payload[1];
+	scif_nodeqp_send(&scif_dev[conreq->msg.src.node], &msg);
+	kfree(conreq);
+	return err;
+}
+EXPORT_SYMBOL_GPL(scif_accept);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index a2b728a..685ee5c 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -76,6 +76,24 @@ void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
 	schedule_work(&scif_info.misc_work);
 }
 
+static struct scif_endpt *scif_find_listen_ep(u16 port)
+{
+	struct scif_endpt *ep = NULL;
+	struct list_head *pos, *tmpq;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.listen) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->port.port == port) {
+			spin_lock(&ep->lock);
+			spin_unlock(&scif_info.eplock);
+			return ep;
+		}
+	}
+	spin_unlock(&scif_info.eplock);
+	return NULL;
+}
+
 void scif_cleanup_zombie_epd(void)
 {
 	struct list_head *pos, *tmpq;
@@ -90,3 +108,214 @@ void scif_cleanup_zombie_epd(void)
 	}
 	spin_unlock(&scif_info.eplock);
 }
+
+/**
+ * scif_cnctreq() - Respond to SCIF_CNCT_REQ interrupt message
+ * @msg:        Interrupt message
+ *
+ * This message is initiated by the remote node to request a connection
+ * to the local node.  This function looks for an end point in the
+ * listen state on the requested port id.
+ *
+ * If it finds a listening port it places the connect request on the
+ * listening end point's queue and wakes up any pending accept calls.
+ *
+ * If it does not find a listening end point it sends a connection
+ * reject message to the remote node.
+ */
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_conreq *conreq;
+
+	conreq = kmalloc(sizeof(*conreq), GFP_KERNEL);
+	if (!conreq)
+		/* Lack of resources so reject the request. */
+		goto conreq_sendrej;
+
+	ep = scif_find_listen_ep(msg->dst.port);
+	if (!ep)
+		/*  Send reject due to no listening ports */
+		goto conreq_sendrej_free;
+
+	if (ep->backlog <= ep->conreqcnt) {
+		/*  Send reject due to too many pending requests */
+		spin_unlock(&ep->lock);
+		goto conreq_sendrej_free;
+	}
+
+	conreq->msg = *msg;
+	list_add_tail(&conreq->list, &ep->conlist);
+	ep->conreqcnt++;
+	wake_up_interruptible(&ep->conwq);
+	spin_unlock(&ep->lock);
+	return;
+
+conreq_sendrej_free:
+	kfree(conreq);
+conreq_sendrej:
+	msg->uop = SCIF_CNCT_REJ;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_cnctgnt() - Respond to SCIF_CNCT_GNT interrupt message
+ * @msg:        Interrupt message
+ *
+ * An accept() on the remote node has occurred and sent this message
+ * to indicate success.  Place the end point in the MAPPING state and
+ * save the remote node's memory information.  Then wake up the connect
+ * request so it can finish.
+ */
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->peer.node = msg->src.node;
+		ep->peer.port = msg->src.port;
+		ep->qp_info.gnt_pld = msg->payload[1];
+		ep->remote_ep = msg->payload[2];
+		ep->state = SCIFEP_MAPPING;
+
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctgnt_ack() - Respond to SCIF_CNCT_GNTACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request has finished mapping the local memory.
+ * Place the connection in the connected state and wake up the pending
+ * accept() call.
+ */
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	mutex_lock(&scif_info.connlock);
+	spin_lock(&ep->lock);
+	/* New ep is now connected with all resources set. */
+	ep->state = SCIFEP_CONNECTED;
+	list_add_tail(&ep->list, &scif_info.connected);
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+}
+
+/**
+ * scif_cnctgnt_nack() - Respond to SCIF_CNCT_GNTNACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request failed to map the local memory it was sent.
+ * Place the end point in the CLOSING state to indicate it and wake up
+ * the pending accept().
+ */
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_CLOSING;
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctrej() - Respond to SCIF_CNCT_REJ interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote end has rejected the connection request.  Set the end
+ * point back to the bound state and wake up the pending connect().
+ */
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->state = SCIFEP_BOUND;
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_discnct() - Respond to SCIF_DISCNCT interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote node has indicated close() has been called on its end
+ * point.  Remove the local end point from the connected list, set its
+ * state to disconnected and ensure accesses to the remote node are
+ * shutdown.
+ *
+ * When all accesses to the remote end have completed then send a
+ * DISCNT_ACK to indicate it can remove its resources and complete
+ * the close routine.
+ */
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		/*
+		 * The local ep may have sent a disconnect and been closed
+		 * due to a message response time out. It may have been
+		 * allocated again and formed a new connection so we want to
+		 * check if the remote ep matches
+		 */
+		if (((u64)tmpep == msg->payload[1]) &&
+		    ((u64)tmpep->remote_ep == msg->payload[0])) {
+			list_del(pos);
+			ep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	/*
+	 * If the terminated end is not found then this side started closing
+	 * before the other side sent the disconnect.  If so the ep will no
+	 * longer be on the connected list.  Regardless the other side
+	 * needs to be acked to let it know close is complete.
+	 */
+	if (!ep) {
+		mutex_unlock(&scif_info.connlock);
+		goto discnct_ack;
+	}
+
+	ep->state = SCIFEP_DISCONNECTED;
+	list_add_tail(&ep->list, &scif_info.disconnected);
+
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+discnct_ack:
+	msg->uop = SCIF_DISCNT_ACK;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_discnt_ack() - Respond to SCIF_DISCNT_ACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has indicated it has no more references to local resources
+ */
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_DISCONNECTED;
+	spin_unlock(&ep->lock);
+	complete(&ep->discon);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index 812c23f..c1d6bc3 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -68,6 +68,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
+	int err = 0;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -91,6 +92,111 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	}
 	case SCIF_LISTEN:
 		return scif_listen(priv, arg);
+	case SCIF_CONNECT:
+	{
+		struct scifioctl_connect req;
+		struct scif_endpt *ep = (struct scif_endpt *)priv;
+
+		if (copy_from_user(&req, argp, sizeof(req)))
+			return -EFAULT;
+
+		err = __scif_connect(priv, &req.peer, non_block);
+		if (err < 0)
+			return err;
+
+		req.self.node = ep->port.node;
+		req.self.port = ep->port.port;
+
+		if (copy_to_user(argp, &req, sizeof(req)))
+			return -EFAULT;
+
+		return 0;
+	}
+	/*
+	 * Accept is done in two halves.  The request ioctl does the basic
+	 * functionality of accepting the request and returning the information
+	 * about it including the internal ID of the end point.  The register
+	 * is done with the internal ID on a new file descriptor opened by the
+	 * requesting process.
+	 */
+	case SCIF_ACCEPTREQ:
+	{
+		struct scifioctl_accept request;
+		scif_epd_t *ep = (scif_epd_t *)&request.endpt;
+
+		if (copy_from_user(&request, argp, sizeof(request)))
+			return -EFAULT;
+
+		err = scif_accept(priv, &request.peer, ep, request.flags);
+		if (err < 0)
+			return err;
+
+		if (copy_to_user(argp, &request, sizeof(request))) {
+			scif_close(*ep);
+			return -EFAULT;
+		}
+		/*
+		 * Add to the list of user mode eps where the second half
+		 * of the accept is not yet completed.
+		 */
+		spin_lock(&scif_info.eplock);
+		list_add_tail(&((*ep)->miacceptlist), &scif_info.uaccept);
+		list_add_tail(&((*ep)->liacceptlist), &priv->li_accept);
+		(*ep)->listenep = priv;
+		priv->acceptcnt++;
+		spin_unlock(&scif_info.eplock);
+
+		return 0;
+	}
+	case SCIF_ACCEPTREG:
+	{
+		struct scif_endpt *priv = f->private_data;
+		struct scif_endpt *newep;
+		struct scif_endpt *lisep;
+		struct scif_endpt *fep = NULL;
+		struct scif_endpt *tmpep;
+		struct list_head *pos, *tmpq;
+
+		/* Finally replace the pointer to the accepted endpoint */
+		if (copy_from_user(&newep, argp, sizeof(void *)))
+			return -EFAULT;
+
+		/* Remove from the user accept queue */
+		spin_lock(&scif_info.eplock);
+		list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, miacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				fep = tmpep;
+				break;
+			}
+		}
+
+		if (!fep) {
+			spin_unlock(&scif_info.eplock);
+			return -ENOENT;
+		}
+
+		lisep = newep->listenep;
+		list_for_each_safe(pos, tmpq, &lisep->li_accept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, liacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				lisep->acceptcnt--;
+				break;
+			}
+		}
+
+		spin_unlock(&scif_info.eplock);
+
+		/* Free the resources automatically created from the open. */
+		scif_teardown_ep(priv);
+		scif_add_epd_to_zombie_list(priv, !SCIF_EPLOCK_HELD);
+		f->private_data = newep;
+		return 0;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
index 82792b1..f7187dc 100644
--- a/drivers/misc/mic/scif/scif_main.c
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -331,6 +331,7 @@ static int _scif_init(void)
 	scif_info.en_msg_log = 0;
 	scif_info.p2p_enable = 1;
 	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	INIT_WORK(&scif_info.conn_work, scif_conn_handler);
 	idr_init(&scif_ports);
 	return 0;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 5fefacc..0469aad 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -20,6 +20,41 @@
 #include "scif_main.h"
 #include "scif_map.h"
 
+/**
+ * scif_invalidate_ep() - Set state for all connected endpoints
+ * to disconnected and wake up all send/recv waitqueues
+ */
+static void scif_invalidate_ep(int node)
+{
+	struct scif_endpt *ep;
+	struct list_head *pos, *tmpq;
+
+	flush_work(&scif_info.conn_work);
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			spin_lock(&ep->lock);
+			scif_cleanup_ep_qp(ep);
+			spin_unlock(&ep->lock);
+		}
+	}
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			list_del(pos);
+			spin_lock(&ep->lock);
+			ep->state = SCIFEP_DISCONNECTED;
+			list_add_tail(&ep->list, &scif_info.disconnected);
+			scif_cleanup_ep_qp(ep);
+			wake_up_interruptible(&ep->sendwq);
+			wake_up_interruptible(&ep->recvwq);
+			spin_unlock(&ep->lock);
+		}
+	}
+	mutex_unlock(&scif_info.connlock);
+}
+
 void scif_free_qp(struct scif_dev *scifdev)
 {
 	struct scif_qp *qp = scifdev->qpairs;
@@ -91,6 +126,7 @@ void scif_cleanup_scifdev(struct scif_dev *dev)
 		scif_destroy_intr_wq(dev);
 	}
 	scif_destroy_p2p(dev);
+	scif_invalidate_ep(dev->node);
 	scif_send_acks(dev);
 	if (!dev->node && scif_info.card_initiated_exit) {
 		/*
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index 0e0cd6b..c35451e 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -563,7 +563,14 @@ static char *message_types[] = {"BAD",
 				"SCIF_NODE_ADD_ACK",
 				"SCIF_NODE_ADD_NACK",
 				"REMOVE_NODE",
-				"REMOVE_NODE_ACK"};
+				"REMOVE_NODE_ACK",
+				"CNCT_REQ",
+				"CNCT_GNT",
+				"CNCT_GNTACK",
+				"CNCT_GNTNACK",
+				"CNCT_REJ",
+				"DISCNCT",
+				"DISCNT_ACK"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -964,6 +971,13 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
 	scif_node_remove,	/* SCIF_NODE_REMOVE */
 	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+	scif_cnctreq,		/* SCIF_CNCT_REQ */
+	scif_cnctgnt,		/* SCIF_CNCT_GNT */
+	scif_cnctgnt_ack,	/* SCIF_CNCT_GNTACK */
+	scif_cnctgnt_nack,	/* SCIF_CNCT_GNTNACK */
+	scif_cnctrej,		/* SCIF_CNCT_REJ */
+	scif_discnct,		/* SCIF_DISCNCT */
+	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
 };
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 41%]

* [PATCH char-misc-next v4 10/13] misc: mic: SCIF messaging and node enumeration APIs
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (8 preceding siblings ...)
  2015-03-31  1:36 41% ` [PATCH char-misc-next v4 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
@ 2015-03-31  1:36 52% ` Sudeep Dutt
  2015-03-31  1:36 57% ` [PATCH char-misc-next v4 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF messaging APIs which allow sending messages between the SCIF
endpoints via a byte stream based ring buffer which has been
optimized to avoid reads across PCIe. The SCIF messaging APIs
are typically used for short < 1024 byte messages for best
performance while the RDMA APIs which will be submitted in a future
patch series are recommended for larger transfers. The node
enumeration API enables a user to query for the number of nodes
online in the SCIF network and their node ids.
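
The byte stream behaviour described above can be sketched in user space
as follows. This is only an illustration under stated assumptions:
struct rb, rb_space(), rb_count(), rb_send() and rb_recv() are
hypothetical names and do not correspond to the driver's scif_rb
helpers, and there is no locking, peer notification or PCIe involved.
It shows the best-effort, non-blocking case in which a send copies as
many bytes as currently fit and reports how many were accepted.

```c
#include <stddef.h>

/* Hypothetical ring size: a power of two, one slot is always left empty. */
#define RB_SIZE 16u

struct rb {
	unsigned char buf[RB_SIZE];
	unsigned int rd;	/* index of the next byte to read */
	unsigned int wr;	/* index of the next byte to write */
};

/* Number of bytes the producer can still write. */
static unsigned int rb_space(const struct rb *rb)
{
	return (rb->rd - rb->wr - 1) & (RB_SIZE - 1);
}

/* Number of bytes the consumer can currently read. */
static unsigned int rb_count(const struct rb *rb)
{
	return (rb->wr - rb->rd) & (RB_SIZE - 1);
}

/* Non-blocking send: copy what fits and return the number of bytes taken. */
static int rb_send(struct rb *rb, const void *msg, int len)
{
	const unsigned char *src = msg;
	unsigned int space = rb_space(rb);
	int n, i;

	if (len < 0)
		return -1;
	n = (unsigned int)len < space ? len : (int)space;
	for (i = 0; i < n; i++) {
		rb->buf[rb->wr] = src[i];
		rb->wr = (rb->wr + 1) & (RB_SIZE - 1);
	}
	return n;
}

/* Non-blocking receive: copy what is available and return that count. */
static int rb_recv(struct rb *rb, void *msg, int len)
{
	unsigned char *dst = msg;
	unsigned int avail = rb_count(rb);
	int n, i;

	if (len < 0)
		return -1;
	n = (unsigned int)len < avail ? len : (int)avail;
	for (i = 0; i < n; i++) {
		dst[i] = rb->buf[rb->rd];
		rb->rd = (rb->rd + 1) & (RB_SIZE - 1);
	}
	return n;
}
```

A caller that gets back fewer bytes than requested is expected to retry
with the remainder once the reader has drained the ring, which is
roughly what the blocking mode automates by sleeping until space is
reported.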

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   4 +
 drivers/misc/mic/scif/scif_api.c    | 402 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  32 +++
 drivers/misc/mic/scif/scif_fd.c     |  93 +++++++++
 drivers/misc/mic/scif/scif_nm.c     |  17 ++
 drivers/misc/mic/scif/scif_nodeqp.c |  36 +++-
 6 files changed, 583 insertions(+), 1 deletion(-)
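
A note on the node enumeration semantics, as a user-space sketch only.
The names get_node_ids(), node_alive[] and MAX_NODES are hypothetical
stand-ins and not the driver's symbols; the sketch also omits the
conflock and the clamp of len to the total node count that the patch
applies. It illustrates the one property callers rely on: the return
value is the number of online nodes even when the supplied array is too
small to hold all of their ids.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8	/* hypothetical upper bound on node ids */

/* Hypothetical liveness table standing in for a per-node alive check. */
static bool node_alive[MAX_NODES] = { true, true, false, true };

/*
 * Fill at most len ids into nodes[], store the caller's own id in *self
 * and return the number of nodes that are online.
 */
static int get_node_ids(uint16_t *nodes, int len, uint16_t *self)
{
	int online = 0;
	int filled = 0;
	int node;

	*self = 0;	/* assumption: the caller runs on node 0 */
	for (node = 0; node < MAX_NODES; node++) {
		if (!node_alive[node])
			continue;
		online++;
		if (filled < len)
			nodes[filled++] = (uint16_t)node;
	}
	return online;
}
```

A caller can therefore pass a short array first, compare the return
value with the array length, and retry with a larger array if some ids
did not fit.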

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index 65c628e..331322a 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,6 +144,8 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags);
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags);
 void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
@@ -151,6 +153,8 @@ void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg);
 int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index da64cee..f39d313 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -872,3 +872,405 @@ scif_accept_error_epalloc:
 	return err;
 }
 EXPORT_SYMBOL_GPL(scif_accept);
+
+/*
+ * scif_msg_param_check:
+ * @epd: The end point returned from scif_open()
+ * @len: Length to send or receive
+ * @flags: blocking or non blocking
+ *
+ * Validate parameters for messaging APIs scif_send(..)/scif_recv(..).
+ */
+static inline int scif_msg_param_check(scif_epd_t epd, int len, int flags)
+{
+	int ret = -EINVAL;
+
+	if (len < 0)
+		goto err_ret;
+	if (flags && (!(flags & SCIF_RECV_BLOCK)))
+		goto err_ret;
+	ret = 0;
+err_ret:
+	return ret;
+}
+
+static int _scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_xfer_len = 0, sent_len = 0, write_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_SEND_BLOCK)
+		might_sleep();
+
+	spin_lock(&ep->lock);
+	while (sent_len != len && SCIFEP_CONNECTED == ep->state) {
+		write_count = scif_rb_space(&qp->outbound_q);
+		if (write_count) {
+			/* Best effort to send as much data as possible */
+			curr_xfer_len = min(len - sent_len, write_count);
+			ret = scif_rb_write(&qp->outbound_q, msg,
+					    curr_xfer_len);
+			if (ret < 0)
+				break;
+			/* Success. Update write pointer */
+			scif_rb_commit(&qp->outbound_q);
+			/*
+			 * Send a notification to the peer about the
+			 * produced data message.
+			 */
+			notif_msg.src = ep->port;
+			notif_msg.uop = SCIF_CLIENT_SENT;
+			notif_msg.payload[0] = ep->remote_ep;
+			ret = _scif_nodeqp_send(ep->remote_dev, &notif_msg);
+			if (ret)
+				break;
+			sent_len += curr_xfer_len;
+			msg = msg + curr_xfer_len;
+			continue;
+		}
+		curr_xfer_len = min(len - sent_len, SCIF_ENDPT_QP_SIZE - 1);
+		/* Not enough RB space. return for the Non Blocking case */
+		if (!(flags & SCIF_SEND_BLOCK))
+			break;
+
+		spin_unlock(&ep->lock);
+		/* Wait for a SCIF_CLIENT_RCVD message in the Blocking case */
+		ret =
+		wait_event_interruptible(ep->sendwq,
+					 (SCIFEP_CONNECTED != ep->state) ||
+					 (scif_rb_space(&qp->outbound_q) >=
+					 curr_xfer_len));
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (sent_len)
+		ret = sent_len;
+	else if (!ret && SCIFEP_CONNECTED != ep->state)
+		ret = SCIFEP_DISCONNECTED == ep->state ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+static int _scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	int read_size;
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_recv_len = 0, remaining_len = len, read_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_RECV_BLOCK)
+		might_sleep();
+	spin_lock(&ep->lock);
+	while (remaining_len && (SCIFEP_CONNECTED == ep->state ||
+				 SCIFEP_DISCONNECTED == ep->state)) {
+		read_count = scif_rb_count(&qp->inbound_q, remaining_len);
+		if (read_count) {
+			/*
+			 * Best effort to recv as much data as there
+			 * are bytes to read in the RB particularly
+			 * important for the Non Blocking case.
+			 */
+			curr_recv_len = min(remaining_len, read_count);
+			read_size = scif_rb_get_next(&qp->inbound_q,
+						     msg, curr_recv_len);
+			if (ep->state == SCIFEP_CONNECTED) {
+				/*
+				 * Update the read pointer only if the endpoint
+				 * is still connected else the read pointer
+				 * might no longer exist since the peer has
+				 * freed resources!
+				 */
+				scif_rb_update_read_ptr(&qp->inbound_q);
+				/*
+				 * Send a notification to the peer about the
+				 * consumed data message only if the EP is in
+				 * SCIFEP_CONNECTED state.
+				 */
+				notif_msg.src = ep->port;
+				notif_msg.uop = SCIF_CLIENT_RCVD;
+				notif_msg.payload[0] = ep->remote_ep;
+				ret = _scif_nodeqp_send(ep->remote_dev,
+							&notif_msg);
+				if (ret)
+					break;
+			}
+			remaining_len -= curr_recv_len;
+			msg = msg + curr_recv_len;
+			continue;
+		}
+		/*
+		 * Bail out now if the EP is in SCIFEP_DISCONNECTED state else
+		 * we will keep looping forever.
+		 */
+		if (ep->state == SCIFEP_DISCONNECTED)
+			break;
+		/*
+		 * Return in the Non Blocking case if there is no data
+		 * to read in this iteration.
+		 */
+		if (!(flags & SCIF_RECV_BLOCK))
+			break;
+		curr_recv_len = min(remaining_len, SCIF_ENDPT_QP_SIZE - 1);
+		spin_unlock(&ep->lock);
+		/*
+		 * Wait for a SCIF_CLIENT_SENT message in the blocking case
+		 * or until other side disconnects.
+		 */
+		ret =
+		wait_event_interruptible(ep->recvwq,
+					 SCIFEP_CONNECTED != ep->state ||
+					 scif_rb_count(&qp->inbound_q,
+						       curr_recv_len)
+					 >= curr_recv_len);
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (len - remaining_len)
+		ret = len - remaining_len;
+	else if (!ret && ep->state != SCIFEP_CONNECTED)
+		ret = ep->state == SCIFEP_DISCONNECTED ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+/**
+ * scif_user_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_send().
+ */
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int sent_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto send_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto send_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->sendlock);
+	while (sent_len != len) {
+		loop_len = len - sent_len;
+		loop_len = min(chunk_len, loop_len);
+		if (copy_from_user(tmp, msg, loop_len)) {
+			err = -EFAULT;
+			goto send_free_err;
+		}
+		err = _scif_send(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto send_free_err;
+		sent_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto send_free_err;
+	}
+send_free_err:
+	mutex_unlock(&ep->sendlock);
+	kfree(tmp);
+send_err:
+	return err < 0 ? err : sent_len;
+}
+
+/**
+ * scif_user_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_recv().
+ */
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int recv_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto recv_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto recv_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->recvlock);
+	while (recv_len != len) {
+		loop_len = len - recv_len;
+		loop_len = min(chunk_len, loop_len);
+		err = _scif_recv(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto recv_free_err;
+		if (copy_to_user(msg, tmp, err)) {
+			err = -EFAULT;
+			goto recv_free_err;
+		}
+		recv_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto recv_free_err;
+	}
+recv_free_err:
+	mutex_unlock(&ep->recvlock);
+	kfree(tmp);
+recv_err:
+	return err < 0 ? err : recv_len;
+}
+
+/**
+ * scif_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_send().
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	if (!ep->remote_dev)
+		return -ENOTCONN;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_send().
+	 */
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_lock(&ep->sendlock);
+
+	ret = _scif_send(epd, msg, len, flags);
+
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_unlock(&ep->sendlock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_send);
+
+/**
+ * scif_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_recv().
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_recv().
+	 */
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_lock(&ep->recvlock);
+
+	ret = _scif_recv(epd, msg, len, flags);
+
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_unlock(&ep->recvlock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_recv);
+
+int scif_get_node_ids(u16 *nodes, int len, u16 *self)
+{
+	int online = 0;
+	int offset = 0;
+	int node;
+
+	if (!scif_is_mgmt_node())
+		scif_get_node_info();
+
+	*self = scif_info.nodeid;
+	mutex_lock(&scif_info.conflock);
+	len = min_t(int, len, scif_info.total);
+	for (node = 0; node <= scif_info.maxid; node++) {
+		if (_scifdev_alive(&scif_dev[node])) {
+			online++;
+			if (offset < len)
+				nodes[offset++] = node;
+		}
+	}
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI get_node_ids total %d online %d filled in %d nodes\n",
+		scif_info.total, online, offset);
+	mutex_unlock(&scif_info.conflock);
+
+	return online;
+}
+EXPORT_SYMBOL_GPL(scif_get_node_ids);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index 685ee5c..b4bfbb0 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -319,3 +319,35 @@ void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	spin_unlock(&ep->lock);
 	complete(&ep->discon);
 }
+
+/**
+ * scif_clientsend() - Respond to SCIF_CLIENT_SENT interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side is confirming send or receive interrupt handling is complete.
+ */
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_clientrcvd() - Respond to SCIF_CLIENT_RCVD interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side is confirming send or receive interrupt handling is complete.
+ */
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->sendwq);
+	spin_unlock(&ep->lock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index c1d6bc3..eccf7e7 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -69,6 +69,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
 	int err = 0;
+	struct scifioctl_msg request;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -197,6 +198,98 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 		f->private_data = newep;
 		return 0;
 	}
+	case SCIF_SEND:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = scif_user_send(priv, (void __user *)request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto send_err;
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+				 &err, sizeof(err))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = 0;
+send_err:
+		scif_err_debug(err, "scif_send");
+		return err;
+	}
+	case SCIF_RECV:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+
+		err = scif_user_recv(priv, (void __user *)request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto recv_err;
+
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+			&err, sizeof(err))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+		err = 0;
+recv_err:
+		scif_err_debug(err, "scif_recv");
+		return err;
+	}
+	case SCIF_GET_NODEIDS:
+	{
+		struct scifioctl_node_ids node_ids;
+		int entries;
+		u16 *nodes;
+		void __user *unodes, *uself;
+		u16 self;
+
+		if (copy_from_user(&node_ids, argp, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err2;
+		}
+
+		entries = min_t(int, scif_info.maxid, node_ids.len);
+		nodes = kmalloc_array(entries, sizeof(u16), GFP_KERNEL);
+		if (entries && !nodes) {
+			err = -ENOMEM;
+			goto getnodes_err2;
+		}
+		node_ids.len = scif_get_node_ids(nodes, entries, &self);
+
+		unodes = (void __user *)node_ids.nodes;
+		if (copy_to_user(unodes, nodes, sizeof(u16) * entries)) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		uself = (void __user *)node_ids.self;
+		if (copy_to_user(uself, &self, sizeof(u16))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		if (copy_to_user(argp, &node_ids, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+getnodes_err1:
+		kfree(nodes);
+getnodes_err2:
+		return err;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 0469aad..9b4c538 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -218,3 +218,20 @@ void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
 				   (atomic_read(&scifdev->disconn_rescnt) == 1),
 				   SCIF_NODE_ALIVE_TIMEOUT);
 }
+
+void scif_get_node_info(void)
+{
+	struct scifmsg msg;
+	DECLARE_COMPLETION_ONSTACK(node_info);
+
+	msg.uop = SCIF_GET_NODE_INFO;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = SCIF_MGMT_NODE;
+	msg.payload[3] = (u64)&node_info;
+
+	if ((scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg)))
+		return;
+
+	/* Wait for a response with SCIF_GET_NODE_INFO */
+	wait_for_completion(&node_info);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index c35451e..41e3bdb 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -570,7 +570,10 @@ static char *message_types[] = {"BAD",
 				"CNCT_GNTNACK",
 				"CNCT_REJ",
 				"DISCNCT",
-				"DISCNT_ACK"};
+				"DISCNT_ACK",
+				"CLIENT_SENT",
+				"CLIENT_RCVD",
+				"SCIF_GET_NODE_INFO"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -951,6 +954,34 @@ scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	wake_up(&sdev->disconn_wq);
 }
 
+/**
+ * scif_get_node_info_resp() - Respond to SCIF_GET_NODE_INFO interrupt message
+ * @msg:        Interrupt message
+ *
+ * Retrieve node info i.e. maxid and total from the mgmt node.
+ */
+static __always_inline void
+scif_get_node_info_resp(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		swap(msg->dst.node, msg->src.node);
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		msg->payload[2] = scif_info.total;
+		mutex_unlock(&scif_info.conflock);
+		scif_nodeqp_send(scifdev, msg);
+	} else {
+		struct completion *node_info =
+			(struct completion *)msg->payload[3];
+
+		mutex_lock(&scif_info.conflock);
+		scif_info.maxid = msg->payload[1];
+		scif_info.total = msg->payload[2];
+		complete_all(node_info);
+		mutex_unlock(&scif_info.conflock);
+	}
+}
+
 static void
 scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
 {
@@ -978,6 +1009,9 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_cnctrej,		/* SCIF_CNCT_REJ */
 	scif_discnct,		/* SCIF_DISCNCT */
 	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
+	scif_clientsend,	/* SCIF_CLIENT_SENT */
+	scif_clientrcvd,	/* SCIF_CLIENT_RCVD */
+	scif_get_node_info_resp,/* SCIF_GET_NODE_INFO */
 };
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 52%]

* [PATCH char-misc-next v4 11/13] misc: mic: MIC host driver specific changes to enable SCIF
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (9 preceding siblings ...)
  2015-03-31  1:36 52% ` [PATCH char-misc-next v4 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
@ 2015-03-31  1:36 57% ` Sudeep Dutt
  2015-03-31  1:36 70% ` [PATCH char-misc-next v4 12/13] misc: mic: MIC card " Sudeep Dutt
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC host driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
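The mic_request_dma_chans()/mic_free_dma_chans() pair added to mic_boot.c
below keeps requesting channels until the provider runs dry or the
dma_ch[] array is full. A minimal userspace sketch of that loop follows.
The names chan_pool, pool_request() and dev_state are hypothetical
stand-ins for the dmaengine provider and struct mic_device, and the value
of MAX_CHAN is assumed rather than taken from MIC_MAX_DMA_CHAN.

```c
#include <stddef.h>

#define MAX_CHAN 4	/* assumed stand-in for MIC_MAX_DMA_CHAN */

struct chan { int id; };

/* Hypothetical provider: hands out slots until 'avail' is reached. */
struct chan_pool {
	struct chan slots[8];
	int avail;	/* how many channels the provider can supply */
	int next;	/* index of the next slot to hand out */
};

static struct chan *pool_request(struct chan_pool *p)
{
	if (p->next >= p->avail)
		return NULL;	/* plays the role of a failed request */
	return &p->slots[p->next++];
}

/* Hypothetical stand-in for the dma_ch/num_dma_ch members. */
struct dev_state {
	struct chan *dma_ch[MAX_CHAN];
	int num_dma_ch;
};

/*
 * Collect channels until the provider returns NULL or the array is
 * full. Like the patch, this assumes num_dma_ch is 0 on entry.
 */
static int request_chans(struct dev_state *d, struct chan_pool *p)
{
	struct chan *c;

	do {
		c = pool_request(p);
		if (c) {
			d->dma_ch[d->num_dma_ch++] = c;
			if (d->num_dma_ch >= MAX_CHAN)
				break;
		}
	} while (c);
	return d->num_dma_ch;
}

/* Drop every recorded channel and reset the count for the next start. */
static void free_chans(struct dev_state *d)
{
	for (int i = 0; i < d->num_dma_ch; i++)
		d->dma_ch[i] = NULL;
	d->num_dma_ch = 0;
}
```

A return value of 0 means no channel was obtained, which is the case
mic_start() turns into -ENODEV. Resetting the count in the free helper
is what lets a later start refill the array from index 0.
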
 drivers/misc/mic/Kconfig            |   2 +-
 drivers/misc/mic/host/mic_device.h  |  11 +-
 drivers/misc/mic/host/mic_intr.h    |   3 +-
 drivers/misc/mic/host/mic_smpt.h    |   1 +
 drivers/misc/mic/host/mic_boot.c    | 264 ++++++++++++++++++++++++++++++++++--
 drivers/misc/mic/host/mic_debugfs.c |  13 ++
 drivers/misc/mic/host/mic_main.c    |   6 +
 drivers/misc/mic/host/mic_smpt.c    |   7 +-
 drivers/misc/mic/host/mic_virtio.c  |   6 +-
 drivers/misc/mic/host/mic_x100.c    |   3 +-
 10 files changed, 292 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bcf6e36..a2ea3ab 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -36,7 +36,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 016bd15..01a7555 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -27,7 +27,7 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 #include <linux/mic_bus.h>
-
+#include "../bus/scif_bus.h"
 #include "mic_intr.h"
 
 /* The maximum number of MIC devices supported in a single host system. */
@@ -90,7 +90,9 @@ enum mic_stepping {
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
  * @dma_mbdev: MIC BUS DMA device.
- * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
+ * @dma_ch: Array of DMA channels.
+ * @num_dma_ch: Number of DMA channels available.
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -129,7 +131,9 @@ struct mic_device {
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
 	struct mbus_device *dma_mbdev;
-	struct dma_chan *dma_ch;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -228,4 +232,5 @@ void mic_exit_debugfs(void);
 void mic_prepare_suspend(struct mic_device *mdev);
 void mic_complete_resume(struct mic_device *mdev);
 void mic_suspend(struct mic_device *mdev);
+extern atomic_t g_num_mics;
 #endif
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 9f783d4a..cce2882 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -28,8 +28,9 @@
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
  * 4 for host owned DMA channels.
+ * 1 for SCIF
  */
-#define MIC_MIN_MSIX 8
+#define MIC_MIN_MSIX 9
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_smpt.h b/drivers/misc/mic/host/mic_smpt.h
index 51970ab..68721c6 100644
--- a/drivers/misc/mic/host/mic_smpt.h
+++ b/drivers/misc/mic/host/mic_smpt.h
@@ -78,6 +78,7 @@ void mic_unmap_single(struct mic_device *mdev,
 dma_addr_t mic_map(struct mic_device *mdev,
 	dma_addr_t dma_addr, size_t size);
 void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size);
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr);
 
 /**
  * mic_map_error - Check a MIC address for errors.
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index ff2b0fb..bb15506 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/firmware.h>
 #include <linux/pci.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
@@ -29,6 +30,188 @@
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static void *__mic_dma_alloc(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle, gfp_t gfp,
+			     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	dma_addr_t tmp;
+	void *va = kmalloc(size, gfp);
+
+	if (va) {
+		tmp = mic_map_single(mdev, va, size);
+		if (dma_mapping_error(dev, tmp)) {
+			kfree(va);
+			va = NULL;
+		} else {
+			*dma_handle = tmp;
+		}
+	}
+	return va;
+}
+
+static void __mic_dma_free(struct device *dev, size_t size, void *vaddr,
+			   dma_addr_t dma_handle, struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_handle, size);
+	kfree(vaddr);
+}
+
+static dma_addr_t
+__mic_dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+__mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		     size_t size, enum dma_data_direction dir,
+		     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static int __mic_dma_map_sg(struct device *dev, struct scatterlist *sg,
+			    int nents, enum dma_data_direction dir,
+			    struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	int i, j, ret;
+	dma_addr_t da;
+
+	ret = dma_map_sg(mdev->sdev->parent, sg, nents, dir);
+	if (ret <= 0)
+		return 0;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_map(mdev, sg_dma_address(s) + s->offset, s->length);
+		if (!da)
+			goto err;
+		sg_dma_address(s) = da;
+	}
+	return nents;
+err:
+	for_each_sg(sg, s, i, j) {
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = mic_to_dma_addr(mdev, sg_dma_address(s));
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+	return 0;
+}
+
+static void __mic_dma_unmap_sg(struct device *dev,
+			       struct scatterlist *sg, int nents,
+			       enum dma_data_direction dir,
+			       struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	dma_addr_t da;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_to_dma_addr(mdev, sg_dma_address(s));
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = da;
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+}
+
+static struct dma_map_ops __mic_dma_ops = {
+	.alloc = __mic_dma_alloc,
+	.free = __mic_dma_free,
+	.map_page = __mic_dma_map_page,
+	.unmap_page = __mic_dma_unmap_page,
+	.map_sg = __mic_dma_map_sg,
+	.unmap_sg = __mic_dma_unmap_sg,
+};
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name,
+				       void *data, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					db, MIC_INTR_DB);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_free_irq(mdev, cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_next_db(mdev);
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *___mic_ioremap(struct scif_hw_dev *scdev,
+				    phys_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mdev->aper.va + pa;
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
 static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
 {
 	return dev_get_drvdata(mbdev->dev.parent);
@@ -127,6 +310,58 @@ void mic_bootparam_init(struct mic_device *mdev)
 	bootparam->h2c_config_db = -1;
 	bootparam->shutdown_status = 0;
 	bootparam->shutdown_card = 0;
+	/* Total nodes = number of MICs + 1 for self node */
+	bootparam->tot_nodes = atomic_read(&g_num_mics) + 1;
+	bootparam->node_id = mdev->id + 1;
+	bootparam->scif_host_dma_addr = 0x0;
+	bootparam->scif_card_dma_addr = 0x0;
+	bootparam->c2h_scif_db = -1;
+	bootparam->h2c_scif_db = -1;
+}
+
+/**
+ * mic_request_dma_chans - Request DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns number of DMA channels acquired
+ */
+static int mic_request_dma_chans(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, mdev->ops->dma_filter,
+					   mdev->sdev->parent);
+		if (chan) {
+			mdev->dma_ch[mdev->num_dma_ch++] = chan;
+			if (mdev->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdev->sdev->parent, "DMA channels # %d\n", mdev->num_dma_ch);
+	return mdev->num_dma_ch;
+}
+
+/**
+ * mic_free_dma_chans - release DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none
+ */
+static void mic_free_dma_chans(struct mic_device *mdev)
+{
+	int i = 0;
+
+	for (i = 0; i < mdev->num_dma_ch; i++) {
+		dma_release_channel(mdev->dma_ch[i]);
+		mdev->dma_ch[i] = NULL;
+	}
+	mdev->num_dma_ch = 0;
 }
 
 /**
@@ -141,6 +376,7 @@ int mic_start(struct mic_device *mdev, const char *buf)
 {
 	int rc;
 	mutex_lock(&mdev->mic_mutex);
+	mic_bootparam_init(mdev);
 retry:
 	if (MIC_OFFLINE != mdev->state) {
 		rc = -EINVAL;
@@ -161,14 +397,22 @@ retry:
 		rc = PTR_ERR(mdev->dma_mbdev);
 		goto unlock_ret;
 	}
-	mdev->dma_ch = mic_request_dma_chan(mdev);
-	if (!mdev->dma_ch) {
-		rc = -ENXIO;
+	if (!mic_request_dma_chans(mdev)) {
+		rc = -ENODEV;
 		goto dma_remove;
 	}
+	mdev->scdev = scif_register_device(mdev->sdev->parent, MIC_SCIF_DEV,
+					   &__mic_dma_ops, &scif_hw_ops,
+					   mdev->id + 1, 0, &mdev->mmio,
+					   &mdev->aper, mdev->dp, NULL,
+					   mdev->dma_ch, mdev->num_dma_ch);
+	if (IS_ERR(mdev->scdev)) {
+		rc = PTR_ERR(mdev->scdev);
+		goto dma_free;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto dma_release;
+		goto scif_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -177,8 +421,10 @@ retry:
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
 	goto unlock_ret;
-dma_release:
-	dma_release_channel(mdev->dma_ch);
+scif_remove:
+	scif_unregister_device(mdev->scdev);
+dma_free:
+	mic_free_dma_chans(mdev);
 dma_remove:
 	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
@@ -197,11 +443,9 @@ void mic_stop(struct mic_device *mdev, bool force)
 {
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
+		scif_unregister_device(mdev->scdev);
 		mic_virtio_reset_devices(mdev);
-		if (mdev->dma_ch) {
-			dma_release_channel(mdev->dma_ch);
-			mdev->dma_ch = NULL;
-		}
+		mic_free_dma_chans(mdev);
 		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 687e9aa..3c9ea48 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -214,6 +214,19 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
 		   bootparam->shutdown_card);
+	seq_printf(s, "Bootparam: tot_nodes %d\n",
+		   bootparam->tot_nodes);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index ab37a31..4564629 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -67,6 +67,8 @@ static struct ida g_mic_ida;
 static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
+/* Track the total number of MIC devices */
+atomic_t g_num_mics;
 
 static const struct file_operations mic_fops = {
 	.open = mic_open,
@@ -408,6 +410,7 @@ static int mic_probe(struct pci_dev *pdev,
 		dev_err(&pdev->dev, "cdev_add err id %d rc %d\n", mdev->id, rc);
 		goto cleanup_debug_dir;
 	}
+	atomic_inc(&g_num_mics);
 	return 0;
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
@@ -459,6 +462,7 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	mic_stop(mdev, false);
+	atomic_dec(&g_num_mics);
 	cdev_del(&mdev->cdev);
 	mic_delete_debug_dir(mdev);
 	mutex_lock(&mdev->mic_mutex);
@@ -478,6 +482,7 @@ static void mic_remove(struct pci_dev *pdev)
 	ida_simple_remove(&g_mic_ida, mdev->id);
 	kfree(mdev);
 }
+
 static struct pci_driver mic_driver = {
 	.name = mic_driver_name,
 	.id_table = mic_pci_tbl,
@@ -512,6 +517,7 @@ static int __init mic_init(void)
 	}
 	return ret;
 cleanup_debugfs:
+	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 cleanup_chrdev:
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index fae474c..cec8203 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -174,8 +174,7 @@ static int mic_get_smpt_ref_count(struct mic_device *mdev, dma_addr_t dma_addr,
  *
  * returns a DMA address.
  */
-static dma_addr_t
-mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 {
 	struct mic_smpt_info *smpt_info = mdev->smpt;
 	int spt;
@@ -214,7 +213,7 @@ dma_addr_t mic_map(struct mic_device *mdev, dma_addr_t dma_addr, size_t size)
 	if (!size || size > mic_max_system_memory(mdev))
 		return mic_addr;
 
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return mic_addr;
 
@@ -271,7 +270,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 	}
 
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return;
 
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index a020e4e..cc08e9f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -40,7 +40,7 @@ static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
 {
 	int err = 0;
 	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch;
+	struct dma_chan *mic_ch = mdev->dma_ch[0];
 
 	if (!mic_ch) {
 		err = -EBUSY;
@@ -80,7 +80,7 @@ static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t dma_offset;
 	size_t partlen;
 	int err;
@@ -129,7 +129,7 @@ static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t partlen;
 	int err;
 
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index b7a21e1..3341e90 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -167,8 +167,7 @@ static void mic_x100_send_intr(struct mic_device *mdev, int doorbell)
 	if (doorbell < MIC_X100_NUM_SBOX_IRQ) {
 		mic_x100_send_sbox_intr(mdev, doorbell);
 	} else {
-		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ;
 		mic_x100_send_rdmasr_intr(mdev, rdmasr_db);
 	}
 }
-- 
1.8.2.1



* [PATCH char-misc-next v4 12/13] misc: mic: MIC card driver specific changes to enable SCIF
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (10 preceding siblings ...)
  2015-03-31  1:36 57% ` [PATCH char-misc-next v4 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
@ 2015-03-31  1:36 70% ` Sudeep Dutt
  2015-03-31  1:36 98% ` [PATCH char-misc-next v4 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
  2015-04-28 20:45 99% ` [PATCH char-misc-next v4 00/13] misc: mic: " Sudeep Dutt
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC card driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
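The mic_db_to_irq() hunk in mic_x100.c below drops an extra
MIC_X100_RDMASR_IRQ_BASE term: mic_get_rdmasr_irq() already adds the
base, so the old code added it twice. A minimal sketch of the mapping
before and after the change follows. The numeric constants are
placeholders chosen for illustration and are not taken from mic_x100.h.

```c
/* Placeholder values, assumed for illustration only. */
#define NUM_SBOX_IRQ	8	/* stand-in for MIC_X100_NUM_SBOX_IRQ */
#define SBOX_IRQ_BASE	0	/* hypothetical base of the SBOX range */
#define RDMASR_IRQ_BASE	17	/* stand-in for MIC_X100_RDMASR_IRQ_BASE */

static int get_sbox_irq(int db)
{
	return SBOX_IRQ_BASE + db;
}

/* Mirrors mic_get_rdmasr_irq(): the base is added here, once. */
static int get_rdmasr_irq(int index)
{
	return RDMASR_IRQ_BASE + index;
}

/* Mapping after the patch: pass a zero-based RDMASR index. */
static int db_to_irq(int db)
{
	if (db < NUM_SBOX_IRQ)
		return get_sbox_irq(db);
	return get_rdmasr_irq(db - NUM_SBOX_IRQ);
}

/* Mapping before the patch: the base ends up in the result twice. */
static int db_to_irq_old(int db)
{
	if (db < NUM_SBOX_IRQ)
		return get_sbox_irq(db);
	return get_rdmasr_irq(db - NUM_SBOX_IRQ + RDMASR_IRQ_BASE);
}
```

With these placeholder values, doorbell 8 maps to the first RDMASR
interrupt after the change, whereas the old arithmetic lands one full
base further on. Doorbells below NUM_SBOX_IRQ are unaffected.
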
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/card/mic_device.h |  11 +++-
 drivers/misc/mic/card/mic_x100.h   |   1 +
 drivers/misc/mic/card/mic_device.c | 132 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |  61 ++++++++++++++++-
 5 files changed, 201 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index a2ea3ab..e9f2f56 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -56,7 +56,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS
+	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 844be8f..1dbf83c 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -29,9 +29,9 @@
 
 #include <linux/workqueue.h>
 #include <linux/io.h>
-#include <linux/irqreturn.h>
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
+#include "../bus/scif_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -73,6 +73,9 @@ struct mic_device {
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
  * @dma_mbdev: dma device on the MIC virtual bus.
+ * @dma_ch: Array of DMA channels.
+ * @num_dma_ch: Number of DMA channels available.
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -84,6 +87,9 @@ struct mic_driver {
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
 	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -122,10 +128,11 @@ void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
 struct mic_irq *
 mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
-		     const char *name, void *data, int intr_src);
+		     const char *name, void *data, int db);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
+void mic_send_p2p_intr(int doorbell, struct mic_mw *mw);
 int mic_db_to_irq(struct mic_driver *mdrv, int db);
 u32 mic_ack_interrupt(struct mic_device *mdev);
 void mic_hw_intr_init(struct mic_driver *mdrv);
diff --git a/drivers/misc/mic/card/mic_x100.h b/drivers/misc/mic/card/mic_x100.h
index d66ea55..7e22249 100644
--- a/drivers/misc/mic/card/mic_x100.h
+++ b/drivers/misc/mic/card/mic_x100.h
@@ -35,6 +35,7 @@
 #define MIC_X100_SBOX_SDBIC0 0x0000CC90
 #define MIC_X100_SBOX_SDBIC0_DBREQ_BIT 0x80000000
 #define MIC_X100_SBOX_RDMASR0	0x0000B180
+#define MIC_X100_SBOX_APICICR0 0x0000A9D0
 
 #define MIC_X100_MAX_DOORBELL_IDX 8
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 83819ee..6338908 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -28,6 +28,8 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/reboot.h>
+#include <linux/dmaengine.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
@@ -240,6 +242,111 @@ static void mic_uninit_irq(void)
 	kfree(mdrv->irq_info.irq_usage_count);
 }
 
+static inline struct mic_driver *scdev_to_mdrv(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name, void *data,
+				       int db)
+{
+	return mic_request_card_irq(func, NULL, name, data, db);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	mic_free_card_irq(cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	return mic_next_card_db();
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void ___mic_send_p2p_intr(struct scif_hw_dev *scdev, int db,
+				 struct mic_mw *mw)
+{
+	mic_send_p2p_intr(db, mw);
+}
+
+static void __iomem *
+___mic_ioremap(struct scif_hw_dev *scdev,
+	       phys_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.send_p2p_intr = ___mic_send_p2p_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
+static int mic_request_dma_chans(struct mic_driver *mdrv)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, NULL, NULL);
+		if (chan) {
+			mdrv->dma_ch[mdrv->num_dma_ch++] = chan;
+			if (mdrv->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdrv->dev, "DMA channels # %d\n", mdrv->num_dma_ch);
+	return mdrv->num_dma_ch;
+}
+
+static void mic_free_dma_chans(struct mic_driver *mdrv)
+{
+	int i = 0;
+
+	for (i = 0; i < mdrv->num_dma_ch; i++) {
+		dma_release_channel(mdrv->dma_ch[i]);
+		mdrv->dma_ch[i] = NULL;
+	}
+	mdrv->num_dma_ch = 0;
+}
+
 /*
  * mic_driver_init - MIC driver initialization tasks.
  *
@@ -248,6 +355,8 @@ static void mic_uninit_irq(void)
 int __init mic_driver_init(struct mic_driver *mdrv)
 {
 	int rc;
+	struct mic_bootparam __iomem *bootparam;
+	u8 node_id;
 
 	g_drv = mdrv;
 	/*
@@ -268,13 +377,32 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 	rc = mic_shutdown_init();
 	if (rc)
 		goto irq_uninit;
+	if (!mic_request_dma_chans(mdrv)) {
+		rc = -ENODEV;
+		goto shutdown_uninit;
+	}
 	rc = mic_devices_init(mdrv);
 	if (rc)
-		goto shutdown_uninit;
+		goto dma_free;
+	bootparam = mdrv->dp;
+	node_id = ioread8(&bootparam->node_id);
+	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
+					   NULL, &scif_hw_ops,
+					   0, node_id, &mdrv->mdev.mmio, NULL,
+					   NULL, mdrv->dp, mdrv->dma_ch,
+					   mdrv->num_dma_ch);
+	if (IS_ERR(mdrv->scdev)) {
+		rc = PTR_ERR(mdrv->scdev);
+		goto device_uninit;
+	}
 	mic_create_card_debug_dir(mdrv);
 	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
 done:
 	return rc;
+device_uninit:
+	mic_devices_uninit(mdrv);
+dma_free:
+	mic_free_dma_chans(mdrv);
 shutdown_uninit:
 	mic_shutdown_uninit();
 irq_uninit:
@@ -294,7 +422,9 @@ put:
 void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
+	scif_unregister_device(mdrv->scdev);
 	mic_devices_uninit(mdrv);
+	mic_free_dma_chans(mdrv);
 	/*
 	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
 	 * The module cannot be unloaded so the only code path to call
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index e98e537..77fd41781 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -70,6 +70,41 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
+/*
+ * mic_x100_send_sbox_intr - Send an MIC_X100_SBOX interrupt to MIC.
+ */
+static void mic_x100_send_sbox_intr(struct mic_mw *mw, int doorbell)
+{
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
+
+	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
+	apicicr_low = (apicicr_low | (1 << 13));
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+static void mic_x100_send_rdmasr_intr(struct mic_mw *mw, int doorbell)
+{
+	int rdmasr_offset = MIC_X100_SBOX_RDMASR0 + (doorbell << 2);
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, 0, MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+}
+
 /**
  * mic_ack_interrupt - Device specific interrupt handling.
  * @mdev: pointer to mic_device instance
@@ -91,6 +126,18 @@ static inline int mic_get_rdmasr_irq(int index)
 	return  MIC_X100_RDMASR_IRQ_BASE + index;
 }
 
+void mic_send_p2p_intr(int db, struct mic_mw *mw)
+{
+	int rdmasr_index;
+
+	if (db < MIC_X100_NUM_SBOX_IRQ) {
+		mic_x100_send_sbox_intr(mw, db);
+	} else {
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
+		mic_x100_send_rdmasr_intr(mw, rdmasr_index);
+	}
+}
+
 /**
  * mic_hw_intr_init - Initialize h/w specific interrupt
  * information.
@@ -113,11 +160,15 @@ void mic_hw_intr_init(struct mic_driver *mdrv)
 int mic_db_to_irq(struct mic_driver *mdrv, int db)
 {
 	int rdmasr_index;
+
+	/*
+	 * The total number of doorbell interrupts on the card is 16. Indices
+	 * 0-7 fall in the SBOX category and 8-15 fall in the RDMASR category.
+	 */
 	if (db < MIC_X100_NUM_SBOX_IRQ) {
 		return mic_get_sbox_irq(db);
 	} else {
-		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
 		return mic_get_rdmasr_irq(rdmasr_index);
 	}
 }
@@ -243,10 +294,16 @@ static void mic_platform_shutdown(struct platform_device *pdev)
 	mic_remove(pdev);
 }
 
+static u64 mic_dma_mask = DMA_BIT_MASK(64);
+
 static struct platform_device mic_platform_dev = {
 	.name = mic_driver_name,
 	.id   = 0,
 	.num_resources = 0,
+	.dev = {
+		.dma_mask = &mic_dma_mask,
+		.coherent_dma_mask = DMA_BIT_MASK(64),
+	},
 };
 
 static struct platform_driver __refdata mic_platform_driver = {
-- 
1.8.2.1



* [PATCH char-misc-next v4 13/13] misc: mic: add support for loading/unloading SCIF driver
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (11 preceding siblings ...)
  2015-03-31  1:36 70% ` [PATCH char-misc-next v4 12/13] misc: mic: MIC card " Sudeep Dutt
@ 2015-03-31  1:36 98% ` Sudeep Dutt
  2015-04-28 20:45 99% ` [PATCH char-misc-next v4 00/13] misc: mic: " Sudeep Dutt
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Load the SCIF driver with modprobe on start and remove it on unload.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpss | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index cacbdb0..582aad4 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -35,6 +35,7 @@
 
 exec=/usr/sbin/mpssd
 sysfs="/sys/class/mic"
+mic_modules="mic_host mic_x100_dma scif"
 
 start()
 {
@@ -48,18 +49,15 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
+	echo -e $"Loading MIC drivers:" $mic_modules
 
-	for f in "mic_host" "mic_x100_dma"
-	do
-		modprobe $f
-		RETVAL=$?
-		if [ $RETVAL -ne 0 ]; then
-			failure
-			echo
-			return $RETVAL
-		fi
-	done
+	modprobe -a $mic_modules
+	RETVAL=$?
+	if [ $RETVAL -ne 0 ]; then
+		failure
+		echo
+		return $RETVAL
+	fi
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +168,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
-	modprobe -r mic_host mic_x100_dma
+	echo -n $"Removing MIC drivers:" $mic_modules
+	modprobe -r $mic_modules
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1



* [PATCH char-misc-next v4 06/13] misc: mic: SCIF module initialization
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (4 preceding siblings ...)
  2015-03-31  1:36 96% ` [PATCH char-misc-next v4 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
@ 2015-03-31  1:36 47% ` Sudeep Dutt
  2015-03-31  1:36 28% ` [PATCH char-misc-next v4 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF module initialization, DMA mapping, ioremap wrapper APIs
and debugfs hooks. SCIF gets probed by the SCIF hardware bus
if SCIF devices were registered by base drivers. A MISC device
is registered to provide the SCIF character device interface.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_main.h    | 252 ++++++++++++++++++++++
 drivers/misc/mic/scif/scif_map.h     | 113 ++++++++++
 drivers/misc/mic/scif/scif_debugfs.c |  85 ++++++++
 drivers/misc/mic/scif/scif_main.c    | 391 +++++++++++++++++++++++++++++++++++
 4 files changed, 841 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c

diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
new file mode 100644
index 0000000..acd1233
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -0,0 +1,252 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAIN_H
+#define SCIF_MAIN_H
+
+#include <linux/sched.h>
+#include <linux/pci.h>
+#include <linux/miscdevice.h>
+#include <linux/dmaengine.h>
+#include <linux/file.h>
+#include <linux/scif.h>
+
+#include "../common/mic_dev.h"
+
+#define SCIF_MGMT_NODE 0
+#define SCIF_DEFAULT_WATCHDOG_TO 30
+#define SCIF_NODE_ACCEPT_TIMEOUT (3 * HZ)
+#define SCIF_NODE_ALIVE_TIMEOUT (SCIF_DEFAULT_WATCHDOG_TO * HZ)
+
+/*
+ * Generic state used for certain node QP message exchanges
+ * like Unregister, Alloc etc.
+ */
+enum scif_msg_state {
+	OP_IDLE = 1,
+	OP_IN_PROGRESS,
+	OP_COMPLETED,
+	OP_FAILED
+};
+
+/*
+ * struct scif_info - Global SCIF information
+ *
+ * @nodeid: Node ID this node is to others
+ * @maxid: Max known node ID
+ * @total: Total number of SCIF nodes
+ * @nr_zombies: number of zombie endpoints
+ * @eplock: Lock to synchronize listening, zombie endpoint lists
+ * @connlock: Lock to synchronize connected and disconnected lists
+ * @nb_connect_lock: Synchronize non blocking connect operations
+ * @port_lock: Synchronize access to SCIF ports
+ * @uaccept: List of user acceptreq waiting for acceptreg
+ * @listen: List of listening end points
+ * @zombie: List of zombie end points with pending RMAs
+ * @connected: List of end points in connected state
+ * @disconnected: List of end points in disconnected state
+ * @nb_connect_list: List for non blocking connections
+ * @misc_work: miscellaneous SCIF tasks
+ * @conflock: Lock to synchronize SCIF node configuration changes
+ * @en_msg_log: Enable debug message logging
+ * @p2p_enable: Enable P2P SCIF network
+ * @mdev: The MISC device
+ * @conn_work: Work for workqueue handling all connections
+ * @exitwq: Wait queue for waiting for an EXIT node QP message response
+ * @loopb_dev: Dummy SCIF device used for loopback
+ * @loopb_wq: Workqueue used for handling loopback messages
+ * @loopb_wqname[16]: Name of loopback workqueue
+ * @loopb_work: Used for submitting work to loopb_wq
+ * @loopb_recv_q: List of messages received on the loopb_wq
+ * @card_initiated_exit: set when the card has initiated the exit
+ */
+struct scif_info {
+	u8 nodeid;
+	u8 maxid;
+	u8 total;
+	u32 nr_zombies;
+	spinlock_t eplock;
+	struct mutex connlock;
+	spinlock_t nb_connect_lock;
+	spinlock_t port_lock;
+	struct list_head uaccept;
+	struct list_head listen;
+	struct list_head zombie;
+	struct list_head connected;
+	struct list_head disconnected;
+	struct list_head nb_connect_list;
+	struct work_struct misc_work;
+	struct mutex conflock;
+	u8 en_msg_log;
+	u8 p2p_enable;
+	struct miscdevice mdev;
+	struct work_struct conn_work;
+	wait_queue_head_t exitwq;
+	struct scif_dev *loopb_dev;
+	struct workqueue_struct *loopb_wq;
+	char loopb_wqname[16];
+	struct work_struct loopb_work;
+	struct list_head loopb_recv_q;
+	bool card_initiated_exit;
+};
+
+/*
+ * struct scif_p2p_info - SCIF mapping information used for P2P
+ *
+ * @ppi_peer_id: SCIF peer node id
+ * @ppi_sg: Scatter list for bar information (One for mmio and one for aper)
+ * @sg_nentries: Number of entries in the scatterlist
+ * @ppi_da: DMA address for MMIO and APER bars
+ * @ppi_len: Length of MMIO and APER bars
+ * @ppi_list: Link in list of mapping information
+ */
+struct scif_p2p_info {
+	u8 ppi_peer_id;
+	struct scatterlist *ppi_sg[2];
+	u64 sg_nentries[2];
+	dma_addr_t ppi_da[2];
+	u64 ppi_len[2];
+#define SCIF_PPI_MMIO 0
+#define SCIF_PPI_APER 1
+	struct list_head ppi_list;
+};
+
+/*
+ * struct scif_dev - SCIF remote device specific fields
+ *
+ * @node: Node id
+ * @p2p: List of P2P mapping information
+ * @qpairs: The node queue pair for exchanging control messages
+ * @intr_wq: Workqueue for handling Node QP messages
+ * @intr_wqname: Name of node QP workqueue for handling interrupts
+ * @intr_bh: Used for submitting work to intr_wq
+ * @lock: Lock used for synchronizing access to the scif device
+ * @sdev: SCIF hardware device on the SCIF hardware bus
+ * @db: doorbell the peer will trigger to generate an interrupt on self
+ * @rdb: Doorbell to trigger on the peer to generate an interrupt on the peer
+ * @cookie: Cookie received while registering the interrupt handler
+ * @init_msg_work: Work scheduled for SCIF_INIT message processing
+ * @p2p_dwork: Delayed work to enable polling for P2P state
+ * @qp_dwork: Delayed work for enabling polling for remote QP information
+ * @p2p_retry: Number of times to retry polling of P2P state
+ * @base_addr: P2P aperture bar base address
+ * @mmio: The peer MMIO information used for P2P
+ * @spdev: SCIF peer device on the SCIF peer bus
+ * @node_remove_ack_pending: True if a node_remove_ack is pending
+ * @exit_ack_pending: true if an exit_ack is pending
+ * @disconn_wq: Used while waiting for a node remove response
+ * @disconn_rescnt: Keeps track of number of node remove requests sent
+ * @exit: Status of exit message
+ * @qp_dma_addr: Queue pair DMA address passed to the peer
+ */
+struct scif_dev {
+	u8 node;
+	struct list_head p2p;
+	struct scif_qp *qpairs;
+	struct workqueue_struct *intr_wq;
+	char intr_wqname[16];
+	struct work_struct intr_bh;
+	struct mutex lock;
+	struct scif_hw_dev *sdev;
+	int db;
+	int rdb;
+	struct mic_irq *cookie;
+	struct work_struct init_msg_work;
+	struct delayed_work p2p_dwork;
+	struct delayed_work qp_dwork;
+	int p2p_retry;
+	dma_addr_t base_addr;
+	struct mic_mw mmio;
+	struct scif_peer_dev __rcu *spdev;
+	bool node_remove_ack_pending;
+	bool exit_ack_pending;
+	wait_queue_head_t disconn_wq;
+	atomic_t disconn_rescnt;
+	enum scif_msg_state exit;
+	dma_addr_t qp_dma_addr;
+};
+
+extern struct scif_info scif_info;
+extern struct idr scif_ports;
+extern struct scif_dev *scif_dev;
+extern const struct file_operations scif_fops;
+
+/* Size of the RB for the Node QP */
+#define SCIF_NODE_QP_SIZE 0x10000
+
+#include "scif_nodeqp.h"
+
+/*
+ * scifdev_self:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device passed is the self aka Loopback SCIF device.
+ */
+static inline int scifdev_self(struct scif_dev *dev)
+{
+	return dev->node == scif_info.nodeid;
+}
+
+static inline bool scif_is_mgmt_node(void)
+{
+	return !scif_info.nodeid;
+}
+
+/*
+ * scifdev_is_p2p:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device is a MIC Peer to Peer SCIF device.
+ */
+static inline bool scifdev_is_p2p(struct scif_dev *dev)
+{
+	if (scif_is_mgmt_node())
+		return false;
+	else
+		return dev != &scif_dev[SCIF_MGMT_NODE] &&
+			!scifdev_self(dev);
+}
+
+/*
+ * _scifdev_alive:
+ * @scifdev: The remote SCIF Device
+ *
+ * Returns true if the remote SCIF Device is running or sleeping for
+ * this endpoint.
+ */
+static inline int _scifdev_alive(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	return !!spdev;
+}
+
+void __init scif_init_debugfs(void);
+void scif_exit_debugfs(void);
+int scif_setup_intr_wq(struct scif_dev *scifdev);
+void scif_destroy_intr_wq(struct scif_dev *scifdev);
+void scif_cleanup_scifdev(struct scif_dev *dev);
+void scif_handle_remove_node(int node);
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated);
+void scif_free_qp(struct scif_dev *dev);
+void scif_misc_handler(struct work_struct *work);
+void scif_stop(struct scif_dev *scifdev);
+irqreturn_t scif_intr_handler(int irq, void *data);
+#endif /* SCIF_MAIN_H */
diff --git a/drivers/misc/mic/scif/scif_map.h b/drivers/misc/mic/scif/scif_map.h
new file mode 100644
index 0000000..20e50b4
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_map.h
@@ -0,0 +1,113 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAP_H
+#define SCIF_MAP_H
+
+#include "../bus/scif_bus.h"
+
+static __always_inline void *
+scif_alloc_coherent(dma_addr_t *dma_handle,
+		    struct scif_dev *scifdev, size_t size,
+		    gfp_t gfp)
+{
+	void *va;
+
+	if (scifdev_self(scifdev)) {
+		va = kmalloc(size, gfp);
+		if (va)
+			*dma_handle = virt_to_phys(va);
+	} else {
+		va = dma_alloc_coherent(&scifdev->sdev->dev,
+					size, dma_handle, gfp);
+		if (va && scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	return va;
+}
+
+static __always_inline void
+scif_free_coherent(void *va, dma_addr_t local,
+		   struct scif_dev *scifdev, size_t size)
+{
+	if (scifdev_self(scifdev)) {
+		kfree(va);
+	} else {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_free_coherent(&scifdev->sdev->dev,
+				  size, va, local);
+	}
+}
+
+static __always_inline int
+scif_map_single(dma_addr_t *dma_handle,
+		void *local, struct scif_dev *scifdev, size_t size)
+{
+	int err = 0;
+
+	if (scifdev_self(scifdev)) {
+		*dma_handle = virt_to_phys((local));
+	} else {
+		*dma_handle = dma_map_single(&scifdev->sdev->dev,
+					     local, size, DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&scifdev->sdev->dev, *dma_handle))
+			err = -ENOMEM;
+		else if (scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	if (err)
+		*dma_handle = 0;
+	return err;
+}
+
+static __always_inline void
+scif_unmap_single(dma_addr_t local, struct scif_dev *scifdev,
+		  size_t size)
+{
+	if (!scifdev_self(scifdev)) {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_unmap_single(&scifdev->sdev->dev, local,
+				 size, DMA_BIDIRECTIONAL);
+	}
+}
+
+static __always_inline void *
+scif_ioremap(dma_addr_t phys, size_t size, struct scif_dev *scifdev)
+{
+	void *out_virt;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_self(scifdev))
+		out_virt = phys_to_virt(phys);
+	else
+		out_virt = (void __force *)
+			   sdev->hw_ops->ioremap(sdev, phys, size);
+	return out_virt;
+}
+
+static __always_inline void
+scif_iounmap(void *virt, size_t len, struct scif_dev *scifdev)
+{
+	if (!scifdev_self(scifdev)) {
+		struct scif_hw_dev *sdev = scifdev->sdev;
+
+		sdev->hw_ops->iounmap(sdev, (void __force __iomem *)virt);
+	}
+}
+#endif  /* SCIF_MAP_H */
diff --git a/drivers/misc/mic/scif/scif_debugfs.c b/drivers/misc/mic/scif/scif_debugfs.c
new file mode 100644
index 0000000..51f14e2
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_debugfs.c
@@ -0,0 +1,85 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "../common/mic_dev.h"
+#include "scif_main.h"
+
+/* Debugfs parent dir */
+static struct dentry *scif_dbg;
+
+static int scif_dev_test(struct seq_file *s, void *unused)
+{
+	int node;
+
+	seq_printf(s, "Total Nodes %d Self Node Id %d Maxid %d\n",
+		   scif_info.total, scif_info.nodeid,
+		   scif_info.maxid);
+
+	if (!scif_dev)
+		return 0;
+
+	seq_printf(s, "%-16s\t%-16s\n", "node_id", "state");
+
+	for (node = 0; node <= scif_info.maxid; node++)
+		seq_printf(s, "%-16d\t%-16s\n", scif_dev[node].node,
+			   _scifdev_alive(&scif_dev[node]) ?
+			   "Running" : "Offline");
+	return 0;
+}
+
+static int scif_dev_test_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, scif_dev_test, inode->i_private);
+}
+
+static int scif_dev_test_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations scif_dev_ops = {
+	.owner   = THIS_MODULE,
+	.open    = scif_dev_test_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = scif_dev_test_release
+};
+
+void __init scif_init_debugfs(void)
+{
+	struct dentry *d;
+
+	scif_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!scif_dbg) {
+		dev_err(scif_info.mdev.this_device,
+			"can't create debugfs dir scif\n");
+		return;
+	}
+
+	d = debugfs_create_file("scif_dev", 0444, scif_dbg,
+				NULL, &scif_dev_ops);
+	debugfs_create_u8("en_msg_log", 0666, scif_dbg, &scif_info.en_msg_log);
+	debugfs_create_u8("p2p_enable", 0666, scif_dbg, &scif_info.p2p_enable);
+}
+
+void scif_exit_debugfs(void)
+{
+	debugfs_remove_recursive(scif_dbg);
+}
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
new file mode 100644
index 0000000..82792b1
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -0,0 +1,391 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/idr.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_map.h"
+
+struct scif_info scif_info = {
+	.mdev = {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = "scif",
+		.fops = &scif_fops,
+	}
+};
+
+struct scif_dev *scif_dev;
+static atomic_t g_loopb_cnt;
+
+/* Runs in the context of intr_wq */
+static void scif_intr_bh_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev =
+			container_of(work, struct scif_dev, intr_bh);
+
+	if (scifdev_self(scifdev))
+		scif_loopb_msg_handler(scifdev, scifdev->qpairs);
+	else
+		scif_nodeqp_intrhandler(scifdev, scifdev->qpairs);
+}
+
+int scif_setup_intr_wq(struct scif_dev *scifdev)
+{
+	if (!scifdev->intr_wq) {
+		snprintf(scifdev->intr_wqname, sizeof(scifdev->intr_wqname),
+			 "SCIF INTR %d", scifdev->node);
+		scifdev->intr_wq =
+			alloc_ordered_workqueue(scifdev->intr_wqname, 0);
+		if (!scifdev->intr_wq)
+			return -ENOMEM;
+		INIT_WORK(&scifdev->intr_bh, scif_intr_bh_handler);
+	}
+	return 0;
+}
+
+void scif_destroy_intr_wq(struct scif_dev *scifdev)
+{
+	if (scifdev->intr_wq) {
+		destroy_workqueue(scifdev->intr_wq);
+		scifdev->intr_wq = NULL;
+	}
+}
+
+irqreturn_t scif_intr_handler(int irq, void *data)
+{
+	struct scif_dev *scifdev = data;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	sdev->hw_ops->ack_interrupt(sdev, scifdev->db);
+	queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+	return IRQ_HANDLED;
+}
+
+static int scif_peer_probe(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total++;
+	scif_info.maxid = max_t(u32, spdev->dnode, scif_info.maxid);
+	mutex_unlock(&scif_info.conflock);
+	rcu_assign_pointer(scifdev->spdev, spdev);
+
+	/* In the future SCIF kernel client devices will be added here */
+	dev_info(&spdev->dev, "Peer added dnode %d\n",
+		 spdev->dnode);
+	return 0;
+}
+
+static void scif_peer_remove(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	/* In the future SCIF kernel client devices will be removed here */
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	synchronize_rcu();
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total--;
+	mutex_unlock(&scif_info.conflock);
+	dev_info(&spdev->dev, "Peer removed dnode %d\n",
+		 spdev->dnode);
+}
+
+static void scif_qp_setup_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev = container_of(work, struct scif_dev,
+						qp_dwork.work);
+	struct scif_hw_dev *sdev = scifdev->sdev;
+	dma_addr_t da = 0;
+	int err;
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		da = bp->scif_card_dma_addr;
+		scifdev->rdb = bp->h2c_scif_db;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		da = readq(&bp->scif_host_dma_addr);
+		scifdev->rdb = ioread8(&bp->c2h_scif_db);
+	}
+	if (da) {
+		err = scif_qp_response(da, scifdev);
+		if (err)
+			dev_err(&scifdev->sdev->dev,
+				"scif_qp_response err %d\n", err);
+	} else {
+		schedule_delayed_work(&scifdev->qp_dwork,
+				      msecs_to_jiffies(1000));
+	}
+}
+
+static int scif_setup_scifdev(struct scif_hw_dev *sdev)
+{
+	int i;
+	u8 num_nodes;
+
+	if (sdev->snode) {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		num_nodes = ioread8(&bp->tot_nodes);
+	} else {
+		struct mic_bootparam *bp = sdev->dp;
+
+		num_nodes = bp->tot_nodes;
+	}
+	scif_dev = kcalloc(num_nodes, sizeof(*scif_dev), GFP_KERNEL);
+	if (!scif_dev)
+		return -ENOMEM;
+	for (i = 0; i < num_nodes; i++) {
+		struct scif_dev *scifdev = &scif_dev[i];
+
+		scifdev->node = i;
+		scifdev->exit = OP_IDLE;
+		init_waitqueue_head(&scifdev->disconn_wq);
+		mutex_init(&scifdev->lock);
+		INIT_WORK(&scifdev->init_msg_work, scif_qp_response_ack);
+		INIT_DELAYED_WORK(&scifdev->p2p_dwork,
+				  scif_poll_qp_state);
+		INIT_DELAYED_WORK(&scifdev->qp_dwork,
+				  scif_qp_setup_handler);
+		INIT_LIST_HEAD(&scifdev->p2p);
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	}
+	return 0;
+}
+
+static void scif_destroy_scifdev(void)
+{
+	kfree(scif_dev);
+}
+
+static int scif_probe(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev;
+	int rc;
+
+	dev_set_drvdata(&sdev->dev, sdev);
+	if (1 == atomic_add_return(1, &g_loopb_cnt)) {
+		struct scif_dev *loopb_dev;
+
+		rc = scif_setup_scifdev(sdev);
+		if (rc)
+			goto exit;
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+		loopb_dev = &scif_dev[sdev->snode];
+		loopb_dev->sdev = sdev;
+		rc = scif_setup_loopback_qp(loopb_dev);
+		if (rc)
+			goto free_sdev;
+	} else {
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+	}
+	rc = scif_setup_intr_wq(scifdev);
+	if (rc)
+		goto destroy_loopb;
+	rc = scif_setup_qp(scifdev);
+	if (rc)
+		goto destroy_intr;
+	scifdev->db = sdev->hw_ops->next_db(sdev);
+	scifdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						    "SCIF_INTR", scifdev,
+						    scifdev->db);
+	if (IS_ERR(scifdev->cookie)) {
+		rc = PTR_ERR(scifdev->cookie);
+		goto free_qp;
+	}
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = scifdev->db;
+		bp->scif_host_dma_addr = scifdev->qp_dma_addr;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(scifdev->db, &bp->h2c_scif_db);
+		writeq(scifdev->qp_dma_addr, &bp->scif_card_dma_addr);
+	}
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+	return rc;
+free_qp:
+	scif_free_qp(scifdev);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+destroy_loopb:
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+free_sdev:
+	scif_destroy_scifdev();
+exit:
+	return rc;
+}
+
+void scif_stop(struct scif_dev *scifdev)
+{
+	struct scif_dev *dev;
+	int i;
+
+	for (i = scif_info.maxid; i >= 0; i--) {
+		dev = &scif_dev[i];
+		if (scifdev_self(dev))
+			continue;
+		scif_handle_remove_node(i);
+	}
+}
+
+static void scif_remove(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = -1;
+		bp->scif_host_dma_addr = 0x0;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(-1, &bp->h2c_scif_db);
+		writeq(0x0, &bp->scif_card_dma_addr);
+	}
+	if (scif_is_mgmt_node()) {
+		scif_disconnect_node(scifdev->node, true);
+	} else {
+		scif_info.card_initiated_exit = true;
+		scif_stop(scifdev);
+	}
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+	if (scifdev->cookie) {
+		sdev->hw_ops->free_irq(sdev, scifdev->cookie, scifdev);
+		scifdev->cookie = NULL;
+	}
+	scif_destroy_intr_wq(scifdev);
+	cancel_delayed_work(&scifdev->qp_dwork);
+	scif_free_qp(scifdev);
+	scifdev->rdb = -1;
+	scifdev->sdev = NULL;
+}
+
+static struct scif_peer_driver scif_peer_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.probe = scif_peer_probe,
+	.remove = scif_peer_remove,
+};
+
+static struct scif_hw_dev_id id_table[] = {
+	{ MIC_SCIF_DEV, SCIF_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct scif_driver scif_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = scif_probe,
+	.remove = scif_remove,
+};
+
+static int _scif_init(void)
+{
+	spin_lock_init(&scif_info.eplock);
+	spin_lock_init(&scif_info.nb_connect_lock);
+	spin_lock_init(&scif_info.port_lock);
+	mutex_init(&scif_info.conflock);
+	mutex_init(&scif_info.connlock);
+	INIT_LIST_HEAD(&scif_info.uaccept);
+	INIT_LIST_HEAD(&scif_info.listen);
+	INIT_LIST_HEAD(&scif_info.zombie);
+	INIT_LIST_HEAD(&scif_info.connected);
+	INIT_LIST_HEAD(&scif_info.disconnected);
+	INIT_LIST_HEAD(&scif_info.nb_connect_list);
+	init_waitqueue_head(&scif_info.exitwq);
+	scif_info.en_msg_log = 0;
+	scif_info.p2p_enable = 1;
+	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	idr_init(&scif_ports);
+	return 0;
+}
+
+static void _scif_exit(void)
+{
+	idr_destroy(&scif_ports);
+	scif_destroy_scifdev();
+}
+
+static int __init scif_init(void)
+{
+	struct miscdevice *mdev = &scif_info.mdev;
+	int rc;
+
+	_scif_init();
+	rc = scif_peer_bus_init();
+	if (rc)
+		goto exit;
+	rc = scif_peer_register_driver(&scif_peer_driver);
+	if (rc)
+		goto peer_bus_exit;
+	rc = scif_register_driver(&scif_driver);
+	if (rc)
+		goto unreg_scif_peer;
+	rc = misc_register(mdev);
+	if (rc)
+		goto unreg_scif;
+	scif_init_debugfs();
+	return 0;
+unreg_scif:
+	scif_unregister_driver(&scif_driver);
+unreg_scif_peer:
+	scif_peer_unregister_driver(&scif_peer_driver);
+peer_bus_exit:
+	scif_peer_bus_exit();
+exit:
+	_scif_exit();
+	return rc;
+}
+
+static void __exit scif_exit(void)
+{
+	scif_exit_debugfs();
+	misc_deregister(&scif_info.mdev);
+	scif_unregister_driver(&scif_driver);
+	scif_peer_unregister_driver(&scif_peer_driver);
+	scif_peer_bus_exit();
+	_scif_exit();
+}
+
+module_init(scif_init);
+module_exit(scif_exit);
+
+MODULE_DEVICE_TABLE(scif, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1


* [PATCH char-misc-next v4 05/13] misc: mic: Common MIC header file changes in preparation for SCIF
  2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (3 preceding siblings ...)
  2015-03-31  1:36 78% ` [PATCH char-misc-next v4 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
@ 2015-03-31  1:36 96% ` Sudeep Dutt
  2015-03-31  1:36 47% ` [PATCH char-misc-next v4 06/13] misc: mic: SCIF module initialization Sudeep Dutt
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Update mic_bootparam and define the maximum number of DMA channels

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/common/mic_dev.h |  3 +++
 include/uapi/linux/mic_common.h   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/misc/mic/common/mic_dev.h b/drivers/misc/mic/common/mic_dev.h
index 92999c2..0b58c46 100644
--- a/drivers/misc/mic/common/mic_dev.h
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -48,4 +48,7 @@ struct mic_mw {
 #define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
 #define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
 
+/* Maximum number of DMA channels */
+#define MIC_MAX_DMA_CHAN 4
+
 #endif
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
index 6eb4024..302a2ce 100644
--- a/include/uapi/linux/mic_common.h
+++ b/include/uapi/linux/mic_common.h
@@ -80,6 +80,12 @@ struct mic_device_ctrl {
  * @h2c_config_db: Host to Card Virtio config doorbell set by card
  * @shutdown_status: Card shutdown status set by card
  * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
+ * @tot_nodes: Total number of nodes in the SCIF network
+ * @node_id: Unique id of the node
+ * @h2c_scif_db: Host to card SCIF doorbell set by card
+ * @c2h_scif_db: Card to host SCIF doorbell set by host
+ * @scif_host_dma_addr: SCIF host queue pair DMA address
+ * @scif_card_dma_addr: SCIF card queue pair DMA address
  */
 struct mic_bootparam {
 	__le32 magic;
@@ -88,6 +94,12 @@ struct mic_bootparam {
 	__s8 h2c_config_db;
 	__u8 shutdown_status;
 	__u8 shutdown_card;
+	__u8 tot_nodes;
+	__u8 node_id;
+	__u8 h2c_scif_db;
+	__u8 c2h_scif_db;
+	__u64 scif_host_dma_addr;
+	__u64 scif_card_dma_addr;
 } __attribute__ ((aligned(8)));
 
 /**
-- 
1.8.2.1


* [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver
@ 2015-03-31  1:36 63% Sudeep Dutt
  2015-03-31  1:36 22% ` [PATCH char-misc-next v4 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
                   ` (13 more replies)
  0 siblings, 14 replies; 200+ results
From: Sudeep Dutt @ 2015-03-31  1:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

ChangeLog:
==========

v3 => v4:
a) Delete unused IOCTL definitions and fix the IOCTL interface as per
feedback from Greg Kroah-Hartman
b) A couple of small bug fixes since v3

v2 => v3 @ https://lkml.org/lkml/2015/1/26/849
a) Formatting fixes for SCIF header file documentation, data type fixes
   for SCIF IOCTL interface and added SCIF user space documentation in
   scif_overview.txt as per feedback from Greg Kroah-Hartman

v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
a) Use kernel-doc formatting for SCIF header file documentation, enhance
   SCIF ring buffer documentation and formatting cleanup in patch 4 as
   per feedback from Greg Kroah-Hartman
b) SCIF bug fixes and cleanups since initial post

v1: Initial post @ https://lkml.org/lkml/2014/12/10/546

Description:
============

The Symmetric Communication Interface (SCIF, pronounced "skiff") is a low-level
communications API across PCIe, currently implemented for MIC. At present it
provides inter-node communication within a single host platform, where a node
is a MIC coprocessor or a Xeon-based host. SCIF abstracts the details of
communicating over the PCIe bus while providing an API that is symmetric
across all the nodes in the PCIe network. An important design objective for SCIF
is to deliver the maximum possible performance given the communication
abilities of the hardware. SCIF has been used to implement an offload compiler
runtime and OFED support for MPI implementations for MIC coprocessors.

==== SCIF API Components ====
The SCIF API has the following parts:
1. Connection establishment using a client server model
2. Byte stream messaging intended for short messages
3. Node enumeration to determine online nodes
4. Poll semantics for detection of incoming connections and messages
5. Memory registration to pin down pages
6. Remote memory mapping for low latency CPU accesses via mmap
7. Remote DMA (RDMA) for high bandwidth DMA transfers
8. Fence APIs for RDMA synchronization

SCIF exposes the notion of a connection which can be used by peer processes on
nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
process in a SCIF node initiates a SCIF connection to a peer process on a
different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
which are similar to connection oriented socket APIs. Connected SCIF endpoints
can also register local memory which is followed by data transfer using either
DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
kernel mode clients which are functionally equivalent.

==== SCIF Performance for MIC ====
A DMA bandwidth comparison of the TCP (over Ethernet over PCIe) stack and SCIF
shows the performance advantages of SCIF for HPC applications and runtimes.

             Comparison of TCP and SCIF based BW

  Throughput (GB/sec)
    8 +                                             PCIe Bandwidth ******
      +                                                        TCP ######
    7 +    **************************************             SCIF %%%%%%
      |                       %%%%%%%%%%%%%%%%%%%
    6 +                   %%%%
      |                 %%
      |               %%%
    5 +              %%
      |            %%
    4 +           %%
      |          %%
    3 +         %%
      |        %
    2 +      %%
      |     %%
      |    %
    1 +
      +    ######################################
    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
      1       10     100      1000   10000   100000
                   Transfer Size (KBytes)

SCIF allows memory sharing via mmap() between processes on different PCIe
nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
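
The programming model behind that latency figure can be illustrated locally.
The following is a minimal sketch and not SCIF code: it assumes a Linux/glibc
system, and the helper name roundtrip_via_shared_mapping() is hypothetical. It
shares one anonymous page between a parent and a child process, so that
"sending" an 8 byte message is a plain store and "receiving" it is a plain
load, with no system call on the data path. A SCIF remote mapping would cross
PCIe instead of staying in host RAM; only the access pattern is analogous.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustration only: an anonymous shared mapping between a parent and a
 * child on one machine, not SCIF. The helper name is hypothetical.
 *
 * Returns the 8 byte value the parent reads back after the child has
 * stored it into the shared page, or 0 on any failure.
 */
static uint64_t roundtrip_via_shared_mapping(uint64_t msg)
{
	uint64_t *shared;
	uint64_t seen = 0;
	pid_t pid;
	int status;

	shared = mmap(NULL, sizeof(*shared), PROT_READ | PROT_WRITE,
		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (shared == MAP_FAILED)
		return 0;
	*shared = 0;

	pid = fork();
	if (pid == 0) {
		/* Child: "send" is a single store into the mapping. */
		*shared = msg;
		_exit(0);
	}
	/* waitpid() orders the child's store before the parent's load. */
	if (pid > 0 && waitpid(pid, &status, 0) == pid &&
	    WIFEXITED(status) && WEXITSTATUS(status) == 0)
		/* Parent: "receive" is a single load from the mapping. */
		seen = *shared;

	munmap(shared, sizeof(*shared));
	return seen;
}
```

Calling roundtrip_via_shared_mapping(0x1122334455667788ULL) from main() should
hand back the same value. The sketch synchronizes with waitpid() for
simplicity; a latency measurement would poll the mapped location instead.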

For more information on SCIF please refer to the LinuxCon 2013 slides titled
"Intel® MIC x100 Coprocessor Driver - on the Frontiers of Linux & HPC" at
http://events.linuxfoundation.org/sites/events/files/slides/LinuxConMicDriver.pdf
The talk can also be viewed at https://www.youtube.com/watch?v=dylbmmQs4W0

This initial patch series introduces the SCIF API and implements the SCIF
connection, messaging and node enumeration APIs. SCIF poll, mmap, RDMA and
fence APIs will be submitted in a future patch series once the initial base
patches are accepted upstream.

This patch series is divided into 13 patches as follows:
1) SCIF header file documenting the API along with the IOCTL interface

2) SCIF ring buffer is a single producer, single consumer byte stream ring
   buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
   used to implement a receive queue for SCIF driver messaging between nodes and
   for byte stream messaging between SCIF endpoints. Each SCIF node has a
   receive queue for every other SCIF node, and each connected endpoint has a
   receive queue for messages from its peer. This pair of receive queues is
   referred to as a SCIF queue pair.

3) SCIF hardware bus which abstracts the low level hardware details so that the
   same SCIF driver can work without any changes on the host or the card as long
   as the hardware bus operations are implemented.

4) SCIF peer bus is used to register and unregister SCIF peer devices internally
   by the SCIF driver to signify the addition and removal of peer nodes
   respectively from the SCIF network. This simplifies remote node handling
   within SCIF and will also be used to support device probe/remove for SCIF
   client drivers (e.g. netdev over SCIF).

5) Common MIC header file changes to enable SCIF

6) SCIF module initialization, DMA/remote memory mapping APIs and debugfs hooks.

7) SCIF node queue pair setup sets up the kernel mode private node
   queue pairs between all the nodes to enable internal control
   message communication.

8) SCIF character device file operations and kernel APIs for opening and
   closing a user and kernel mode SCIF endpoint. This patch also enables
   binding to a SCIF port and listening for incoming SCIF connections.

9) SCIF connection APIs which establish a SCIF connection between
   a pair of user or kernel mode endpoints.

10) SCIF messaging APIs which allow sending messages between the SCIF endpoints
    via a byte stream based ring buffer which has been optimized to avoid reads
    across PCIe. The node enumeration API enables a user to query for the number
    of online nodes.

11) MIC host driver specific changes to enable SCIF

12) MIC card driver specific changes to enable SCIF

13) Add support for loading/unloading SCIF driver
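
To make the ring buffer in item 2 more concrete, here is a minimal user space
sketch of a single producer, single consumer byte stream ring buffer. It is an
illustration under stated assumptions and not the code in scif_rb.c: the
demo_rb names, the 16 byte size and the free running head/tail counters are
all hypothetical, and the memory barriers and PCIe read avoidance that a real
driver needs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative single producer, single consumer byte stream ring buffer.
 * Every name here is hypothetical; none is taken from the SCIF driver.
 */
#define DEMO_RB_SIZE 16u	/* must be a power of two */

static_assert((DEMO_RB_SIZE & (DEMO_RB_SIZE - 1)) == 0,
	      "DEMO_RB_SIZE must be a power of two");

struct demo_rb {
	uint8_t data[DEMO_RB_SIZE];
	uint32_t head;	/* total bytes written, advanced only by the producer */
	uint32_t tail;	/* total bytes read, advanced only by the consumer */
};

/* Bytes currently queued; unsigned arithmetic handles counter wraparound. */
static uint32_t demo_rb_count(const struct demo_rb *rb)
{
	return rb->head - rb->tail;
}

/* Bytes that can still be written without overwriting unread data. */
static uint32_t demo_rb_space(const struct demo_rb *rb)
{
	return DEMO_RB_SIZE - demo_rb_count(rb);
}

/* Non-blocking write: copies at most len bytes, returns the number copied. */
static uint32_t demo_rb_write(struct demo_rb *rb, const void *msg, uint32_t len)
{
	const uint8_t *src = msg;
	uint32_t n = demo_rb_space(rb);
	uint32_t i;

	if (len < n)
		n = len;
	for (i = 0; i < n; i++)
		rb->data[(rb->head + i) & (DEMO_RB_SIZE - 1)] = src[i];
	/* A real shared-memory queue needs a write barrier before this update. */
	rb->head += n;
	return n;
}

/* Non-blocking read: copies at most len bytes, returns the number copied. */
static uint32_t demo_rb_read(struct demo_rb *rb, void *msg, uint32_t len)
{
	uint8_t *dst = msg;
	uint32_t n = demo_rb_count(rb);
	uint32_t i;

	if (len < n)
		n = len;
	for (i = 0; i < n; i++)
		dst[i] = rb->data[(rb->tail + i) & (DEMO_RB_SIZE - 1)];
	/* A real shared-memory queue needs a barrier before this update. */
	rb->tail += n;
	return n;
}
```

In this sketch a write larger than the free space is truncated rather than
blocked, and bytes always come out in the order they went in, including across
the wrap point of the underlying array.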

Nikhil Rao (1):
  misc: mic: SCIF connections APIs i.e. accept and connect

Sudeep Dutt (12):
  misc: mic: SCIF header file and IOCTL interface
  misc: mic: SCIF ring buffer infrastructure
  misc: mic: SCIF Hardware Bus
  misc: mic: SCIF Peer Bus
  misc: mic: Common MIC header file changes in preparation for SCIF
  misc: mic: SCIF module initialization
  misc: mic: SCIF node queue pair setup management
  misc: mic: SCIF open close bind and listen APIs
  misc: mic: SCIF messaging and node enumeration APIs
  misc: mic: MIC host driver specific changes to enable SCIF
  misc: mic: MIC card driver specific changes to enable SCIF
  misc: mic: add support for loading/unloading SCIF driver

 Documentation/mic/mic_overview.txt    |   28 +-
 Documentation/mic/scif_overview.txt   |   98 +++
 drivers/misc/mic/Kconfig              |   40 +-
 drivers/misc/mic/Makefile             |    3 +-
 drivers/misc/mic/bus/Makefile         |    1 +
 drivers/misc/mic/scif/Makefile        |   15 +
 include/uapi/linux/Kbuild             |    1 +
 drivers/misc/mic/bus/scif_bus.h       |  129 ++++
 drivers/misc/mic/card/mic_device.h    |   11 +-
 drivers/misc/mic/card/mic_x100.h      |    1 +
 drivers/misc/mic/common/mic_dev.h     |    3 +
 drivers/misc/mic/host/mic_device.h    |   11 +-
 drivers/misc/mic/host/mic_intr.h      |    3 +-
 drivers/misc/mic/host/mic_smpt.h      |    1 +
 drivers/misc/mic/scif/scif_epd.h      |  160 ++++
 drivers/misc/mic/scif/scif_main.h     |  254 +++++++
 drivers/misc/mic/scif/scif_map.h      |  113 +++
 drivers/misc/mic/scif/scif_nodeqp.h   |  183 +++++
 drivers/misc/mic/scif/scif_peer_bus.h |   65 ++
 drivers/misc/mic/scif/scif_rb.h       |  100 +++
 include/linux/scif.h                  |  993 +++++++++++++++++++++++++
 include/uapi/linux/mic_common.h       |   12 +
 include/uapi/linux/scif_ioctl.h       |  130 ++++
 drivers/misc/mic/bus/scif_bus.c       |  210 ++++++
 drivers/misc/mic/card/mic_device.c    |  132 +++-
 drivers/misc/mic/card/mic_x100.c      |   61 +-
 drivers/misc/mic/host/mic_boot.c      |  264 ++++++-
 drivers/misc/mic/host/mic_debugfs.c   |   13 +
 drivers/misc/mic/host/mic_main.c      |    6 +
 drivers/misc/mic/host/mic_smpt.c      |    7 +-
 drivers/misc/mic/host/mic_virtio.c    |    6 +-
 drivers/misc/mic/host/mic_x100.c      |    3 +-
 drivers/misc/mic/scif/scif_api.c      | 1276 ++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_debugfs.c  |   85 +++
 drivers/misc/mic/scif/scif_epd.c      |  353 +++++++++
 drivers/misc/mic/scif/scif_fd.c       |  303 ++++++++
 drivers/misc/mic/scif/scif_main.c     |  392 ++++++++++
 drivers/misc/mic/scif/scif_nm.c       |  237 ++++++
 drivers/misc/mic/scif/scif_nodeqp.c   | 1312 +++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c |  124 ++++
 drivers/misc/mic/scif/scif_ports.c    |  124 ++++
 drivers/misc/mic/scif/scif_rb.c       |  248 +++++++
 Documentation/mic/mpssd/mpss          |   24 +-
 43 files changed, 7479 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

-- 
1.8.2.1



* Re: [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface
From: Sudeep Dutt @ 2015-03-26 17:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Wed, 2015-03-25 at 12:04 +0100, Greg Kroah-Hartman wrote:
> On Mon, Jan 26, 2015 at 04:32:42PM -0800, Sudeep Dutt wrote:
> > This patch introduces the SCIF documentation in the header file
> > and describes the IOCTL interface for user mode. mic_overview.txt
> > is updated with documentation on SCIF and a new document
> > > describing SCIF in more detail is available in scif_overview.txt.
> > 
> > Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > ---
> >  Documentation/mic/mic_overview.txt  |  28 +-
> >  Documentation/mic/scif_overview.txt |  98 ++++
> >  include/uapi/linux/Kbuild           |   1 +
> >  include/linux/scif.h                | 993 ++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/scif_ioctl.h     | 215 ++++++++
> >  5 files changed, 1323 insertions(+), 12 deletions(-)
> >  create mode 100644 Documentation/mic/scif_overview.txt
> >  create mode 100644 include/linux/scif.h
> >  create mode 100644 include/uapi/linux/scif_ioctl.h
> > 
> > diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
> > index 77c5418..1a2f2c8 100644
> > --- a/Documentation/mic/mic_overview.txt
> > +++ b/Documentation/mic/mic_overview.txt
> > @@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
> >  created on it by the host/card drivers. On host the channels are private
> >  and used only by the host driver to transfer data for the virtio devices.
> >  
> > +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
> > +low level communications API across PCIe currently implemented for MIC.
> > +More details are available at scif_overview.txt.
> > +
> >  Here is a block diagram of the various components described above. The
> >  virtio backends are situated on the host rather than the card given better
> >  single threaded performance for the host compared to MIC, the ability of
> > @@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
> >                        |               |       | Virtio over PCIe IOCTLs  |
> >                        |               |       +--------------------------+
> >  +-----------+         |               |                   |  +-----------+
> > -| MIC DMA   |         |               |                   |  | MIC DMA   |
> > -| Driver    |         |               |                   |  | Driver    |
> > -+-----------+         |               |                   |  +-----------+
> > -      |               |               |                   |        |
> > -+---------------+     |               |                   |  +----------------+
> > -|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
> > -+---------------+     |               |                   |  +----------------+
> > -      |               |               |                   |              |
> > -      |   +--------------+            |            +---------------+     |
> > -      |   |Intel MIC     |            |            |Intel MIC      |     |
> > -      +---|Card Driver   |            |            |Host Driver    |     |
> > -          +--------------+            |            +---------------+-----+
> > +| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
> > +| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
> > ++-----------+         |  +----------+ | +-----------+     |  +-----------+
> > +      |               |       |       |       |           |        |
> > ++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
> > > +|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW Bus|     | |MIC virtual Bus|
> > ++---------------+     | +-----------+ | +-----+-----+     | +---------------+
> > +      |               |       |       |       |           |              |
> > +      |   +--------------+    |       |       |    +---------------+     |
> > +      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
> > +      +---|Card Driver   +----+       |       |    |Host Driver    |     |
> > +          +--------------+            |       +----+---------------+-----+
> >                        |               |                   |
> >               +-------------------------------------------------------------+
> >               |                                                             |
> > diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
> > new file mode 100644
> > index 0000000..0a280d9
> > --- /dev/null
> > +++ b/Documentation/mic/scif_overview.txt
> > @@ -0,0 +1,98 @@
> > +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> > > +level communications API across PCIe, currently implemented for MIC. SCIF
> > > +provides inter-node communication within a single host platform, where a
> > +node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> > +communicating over the PCIe bus while providing an API that is symmetric
> > +across all the nodes in the PCIe network. An important design objective for SCIF
> > +is to deliver the maximum possible performance given the communication
> > +abilities of the hardware. SCIF has been used to implement an offload compiler
> > +runtime and OFED support for MPI implementations for MIC coprocessors.
> > +
> > +==== SCIF API Components ====
> > +The SCIF API has the following parts:
> > +1. Connection establishment using a client server model
> > +2. Byte stream messaging intended for short messages
> > +3. Node enumeration to determine online nodes
> > +4. Poll semantics for detection of incoming connections and messages
> > +5. Memory registration to pin down pages
> > +6. Remote memory mapping for low latency CPU accesses via mmap
> > +7. Remote DMA (RDMA) for high bandwidth DMA transfers
> > +8. Fence APIs for RDMA synchronization
> > +
> > +SCIF exposes the notion of a connection which can be used by peer processes on
> > +nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
> > +process in a SCIF node initiates a SCIF connection to a peer process on a
> > +different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
> > > +which are similar to connection oriented socket APIs. Connected SCIF endpoints
> > > +can also register local memory, after which data is transferred using DMA, CPU
> > > +copies or remote memory mapping via mmap. SCIF supports both user and kernel
> > > +mode clients, which are functionally equivalent.
> > +
> > +==== SCIF Performance for MIC ====
> > > +A DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack and
> > > +SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
> > +
> > +             Comparison of TCP and SCIF based BW
> > +
> > +  Throughput (GB/sec)
> > +    8 +                                             PCIe Bandwidth ******
> > +      +                                                        TCP ######
> > +    7 +    **************************************             SCIF %%%%%%
> > +      |                       %%%%%%%%%%%%%%%%%%%
> > +    6 +                   %%%%
> > +      |                 %%
> > +      |               %%%
> > +    5 +              %%
> > +      |            %%
> > +    4 +           %%
> > +      |          %%
> > +    3 +         %%
> > +      |        %
> > +    2 +      %%
> > +      |     %%
> > +      |    %
> > +    1 +
> > +      +    ######################################
> > +    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
> > +      1       10     100      1000   10000   100000
> > +                   Transfer Size (KBytes)
> > +
> > +SCIF allows memory sharing via mmap(..) between processes on different PCIe
> > +nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
> > +latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
> > +
> > +SCIF has a user space library which is a thin IOCTL wrapper providing a user
> > +space API similar to the kernel API in scif.h. The SCIF user space library
> > > +is distributed at https://software.intel.com/en-us/mic-developer
> > +
> > +Here is some pseudo code for an example of how two applications on two PCIe
> > +nodes would typically use the SCIF API:
> > +
> > +Process A (on node A)			Process B (on node B)
> > +
> > +/* get online node information */
> > +scif_get_node_ids(..)			scif_get_node_ids(..)
> > +scif_open(..)				scif_open(..)
> > +scif_bind(..)				scif_bind(..)
> > +scif_listen(..)
> > +scif_accept(..)				scif_connect(..)
> > +/* SCIF connection established */
> > +
> > +/* Send and receive short messages */
> > +scif_send(..)/scif_recv(..)		scif_send(..)/scif_recv(..)
> > +
> > +/* Register memory */
> > +scif_register(..)			scif_register(..)
> > +
> > +/* RDMA */
> > +scif_readfrom(..)/scif_writeto(..)	scif_readfrom(..)/scif_writeto(..)
> > +
> > +/* Fence DMAs */
> > +scif_fence_signal(..)			scif_fence_signal(..)
> > +
> > +mmap(..)				mmap(..)
> > +
> > +/* Access remote registered memory */
> > +
> > +/* Close the endpoints */
> > +scif_close(..)				scif_close(..)
> > diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> > index 00b10002..5083817 100644
> > --- a/include/uapi/linux/Kbuild
> > +++ b/include/uapi/linux/Kbuild
> > @@ -349,6 +349,7 @@ header-y += rtc.h
> >  header-y += rtnetlink.h
> >  header-y += scc.h
> >  header-y += sched.h
> > +header-y += scif_ioctl.h
> >  header-y += screen_info.h
> >  header-y += sctp.h
> >  header-y += sdla.h
> > diff --git a/include/linux/scif.h b/include/linux/scif.h
> > new file mode 100644
> > index 0000000..44f4f38
> > --- /dev/null
> > +++ b/include/linux/scif.h
> > @@ -0,0 +1,993 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * This file is provided under a dual BSD/GPLv2 license.  When using or
> > + * redistributing this file, you may do so under either license.
> > + *
> > + * GPL LICENSE SUMMARY
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of version 2 of the GNU General Public License as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * General Public License for more details.
> > + *
> > + * BSD LICENSE
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + *
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above copyright
> > + *   notice, this list of conditions and the following disclaimer in
> > + *   the documentation and/or other materials provided with the
> > + *   distribution.
> > + * * Neither the name of Intel Corporation nor the names of its
> > + *   contributors may be used to endorse or promote products derived
> > + *   from this software without specific prior written permission.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + *
> > + * Intel SCIF driver.
> > + *
> > + */
> > +#ifndef __SCIF_H__
> > +#define __SCIF_H__
> > +
> > +#include <linux/types.h>
> > +#include <linux/poll.h>
> > +#include <linux/scif_ioctl.h>
> > +
> > +#define SCIF_ACCEPT_SYNC	1
> > +#define SCIF_SEND_BLOCK		1
> > +#define SCIF_RECV_BLOCK		1
> > +
> > +enum {
> > +	SCIF_PROT_READ = (1 << 0),
> > +	SCIF_PROT_WRITE = (1 << 1)
> > +};
> > +
> > +enum {
> > +	SCIF_MAP_FIXED = 0x10,
> > +	SCIF_MAP_KERNEL	= 0x20,
> > +};
> > +
> > +enum {
> > +	SCIF_FENCE_INIT_SELF = (1 << 0),
> > +	SCIF_FENCE_INIT_PEER = (1 << 1),
> > +	SCIF_SIGNAL_LOCAL = (1 << 4),
> > +	SCIF_SIGNAL_REMOTE = (1 << 5)
> > +};
> > +
> > +enum {
> > +	SCIF_RMA_USECPU = (1 << 0),
> > +	SCIF_RMA_USECACHE = (1 << 1),
> > +	SCIF_RMA_SYNC = (1 << 2),
> > +	SCIF_RMA_ORDERED = (1 << 3)
> > +};
> > +
> > +/* End of SCIF Admin Reserved Ports */
> > +#define SCIF_ADMIN_PORT_END	1024
> > +
> > +/* End of SCIF Reserved Ports */
> > +#define SCIF_PORT_RSVD		1088
> > +
> > +typedef struct scif_endpt *scif_epd_t;
> > +
> > +#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
> > +#define SCIF_REGISTER_FAILED ((off_t)-1)
> > +#define SCIF_MMAP_FAILED ((void *)-1)
> > +
> > +/**
> > + * scif_open() - Create an endpoint
> > + *
> > + * Return:
> > + * Upon successful completion, scif_open() returns an endpoint descriptor to
> > > + * be used in subsequent SCIF function calls to refer to that endpoint;
> > + * otherwise in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
> > + * returned and errno is set to indicate the error; in kernel mode a NULL
> > + * scif_epd_t is returned.
> > + *
> > + * Errors:
> > + * ENOMEM - Insufficient kernel memory was available
> > + */
> > +scif_epd_t scif_open(void);
> > +
> > +/**
> > + * scif_bind() - Bind an endpoint to a port
> > + * @epd:	endpoint descriptor
> > + * @pn:		port number
> > + *
> > + * scif_bind() binds endpoint epd to port pn, where pn is a port number on the
> > + * local node. If pn is zero, a port number greater than or equal to
> > + * SCIF_PORT_RSVD is assigned and returned. Each endpoint may be bound to
> > > + * exactly one local port. Ports below 1024, when requested, can only be bound
> > + * by system (or root) processes or by processes executed by privileged users.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_bind() returns the port number to which epd
> > + * is bound; otherwise in user mode -1 is returned and errno is set to
> > + * indicate the error; in kernel mode the negative of one of the following
> > + * errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * EINVAL - the endpoint or the port is already bound
> > + * EISCONN - The endpoint is already connected
> > + * ENOSPC - No port number available for assignment
> > + * EACCES - The port requested is protected and the user is not the superuser
> > + */
> > +int scif_bind(scif_epd_t epd, u16 pn);
> > +
> > +/**
> > + * scif_listen() - Listen for connections on an endpoint
> > + * @epd:	endpoint descriptor
> > + * @backlog:	maximum pending connection requests
> > + *
> > + * scif_listen() marks the endpoint epd as a listening endpoint - that is, as
> > + * an endpoint that will be used to accept incoming connection requests. Once
> > + * so marked, the endpoint is said to be in the listening state and may not be
> > + * used as the endpoint of a connection.
> > + *
> > + * The endpoint, epd, must have been bound to a port.
> > + *
> > + * The backlog argument defines the maximum length to which the queue of
> > + * pending connections for epd may grow. If a connection request arrives when
> > + * the queue is full, the client may receive an error with an indication that
> > + * the connection was refused.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_listen() returns 0; otherwise in user mode
> > + * -1 is returned and errno is set to indicate the error; in kernel mode the
> > + * negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * EINVAL - the endpoint is not bound to a port
> > + * EISCONN - The endpoint is already connected or listening
> > + */
> > +int scif_listen(scif_epd_t epd, int backlog);
> > +
> > +/**
> > + * scif_connect() - Initiate a connection on a port
> > + * @epd:	endpoint descriptor
> > + * @dst:	global id of port to which to connect
> > + *
> > + * The scif_connect() function requests the connection of endpoint epd to remote
> > + * port dst. If the connection is successful, a peer endpoint, bound to dst, is
> > + * created on node dst.node. On successful return, the connection is complete.
> > + *
> > + * If the endpoint epd has not already been bound to a port, scif_connect()
> > + * will bind it to an unused local port.
> > + *
> > + * A connection is terminated when an endpoint of the connection is closed,
> > + * either explicitly by scif_close(), or when a process that owns one of the
> > + * endpoints of the connection is terminated.
> > + *
> > + * In user space, scif_connect() supports an asynchronous connection mode
> > + * if the application has set the O_NONBLOCK flag on the endpoint via the
> > > + * fcntl() system call. With this flag set, the calling process does not
> > > + * wait during scif_connect().
> > + *
> > + * Return:
> > + * Upon successful completion, scif_connect() returns the port ID to which the
> > + * endpoint, epd, is bound; otherwise in user mode -1 is returned and errno is
> > + * set to indicate the error; in kernel mode the negative of one of the
> > + * following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNREFUSED - The destination was not listening for connections or refused
> > + * the connection request
> > + * EINVAL - dst.port is not a valid port ID
> > + * EISCONN - The endpoint is already connected
> > + * ENOMEM - No buffer space is available
> > > + * ENODEV - The destination node does not exist, or the node is lost, or it
> > > + * existed but is not currently in the network since it may have crashed
> > + * ENOSPC - No port number available for assignment
> > + * EOPNOTSUPP - The endpoint is listening and cannot be connected
> > + */
> > +int scif_connect(scif_epd_t epd, struct scif_port_id *dst);
> > +
> > +/**
> > + * scif_accept() - Accept a connection on an endpoint
> > + * @epd:	endpoint descriptor
> > + * @peer:	global id of port to which connected
> > + * @newepd:	new connected endpoint descriptor
> > + * @flags:	flags
> > + *
> > + * The scif_accept() call extracts the first connection request from the queue
> > + * of pending connections for the port on which epd is listening. scif_accept()
> > + * creates a new endpoint, bound to the same port as epd, and allocates a new
> > + * SCIF endpoint descriptor, returned in newepd, for the endpoint. The new
> > + * endpoint is connected to the endpoint through which the connection was
> > + * requested. epd is unaffected by this call, and remains in the listening
> > + * state.
> > + *
> > + * On successful return, peer holds the global port identifier (node id and
> > + * local port number) of the port which requested the connection.
> > + *
> > + * A connection is terminated when an endpoint of the connection is closed,
> > + * either explicitly by scif_close(), or when a process that owns one of the
> > + * endpoints of the connection is terminated.
> > + *
> > + * The number of connections that can (subsequently) be accepted on epd is only
> > + * limited by system resources (memory).
> > + *
> > + * The flags argument is formed by OR'ing together zero or more of the
> > + * following values.
> > + * SCIF_ACCEPT_SYNC - block until a connection request is presented. If
> > + *			SCIF_ACCEPT_SYNC is not in flags, and no pending
> > + *			connections are present on the queue, scif_accept()
> > + *			fails with an EAGAIN error
> > + *
> > + * In user mode, the select() and poll() functions can be used to determine
> > + * when there is a connection request. In kernel mode, the scif_poll()
> > + * function may be used for this purpose. A readable event will be delivered
> > + * when a connection is requested.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_accept() returns 0; otherwise in user mode
> > + * -1 is returned and errno is set to indicate the error; in kernel mode the
> > + *	negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EAGAIN - SCIF_ACCEPT_SYNC is not set and no connections are present to be
> > + * accepted or SCIF_ACCEPT_SYNC is not set and remote node failed to complete
> > + * its connection request
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * EINTR - Interrupted function
> > + * EINVAL - epd is not a listening endpoint, or flags is invalid, or peer is
> > + * NULL, or newepd is NULL
> > > + * ENODEV - The requesting node is lost, or it existed but is not currently in
> > > + * the network since it may have crashed
> > + * ENOMEM - Not enough space
> > + * ENOENT - Secondary part of epd registration failed
> > + */
> > > +int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
> > > +		scif_epd_t *newepd, int flags);
> > +
> > +/**
> > + * scif_close() - Close an endpoint
> > + * @epd:	endpoint descriptor
> > + *
> > + * scif_close() closes an endpoint and performs necessary teardown of
> > + * facilities associated with that endpoint.
> > + *
> > + * If epd is a listening endpoint then it will no longer accept connection
> > + * requests on the port to which it is bound. Any pending connection requests
> > + * are rejected.
> > + *
> > + * If epd is a connected endpoint, then its peer endpoint is also closed. RMAs
> > + * which are in-process through epd or its peer endpoint will complete before
> > + * scif_close() returns. Registered windows of the local and peer endpoints are
> > + * released as if scif_unregister() was called against each window.
> > + *
> > + * Closing a SCIF endpoint does not affect local registered memory mapped by
> > > + * a SCIF endpoint on a remote node. The local memory remains mapped by the peer
> > > + * SCIF endpoint until the peer explicitly removes the mapping with munmap(..).
> > + *
> > + * If the peer endpoint's receive queue is not empty at the time that epd is
> > + * closed, then the peer endpoint can be passed as the endpoint parameter to
> > + * scif_recv() until the receive queue is empty.
> > + *
> > + * epd is freed and may no longer be accessed.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_close() returns 0; otherwise in user mode
> > + * -1 is returned and errno is set to indicate the error; in kernel mode the
> > + * negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + */
> > +int scif_close(scif_epd_t epd);
> > +
> > +/**
> > + * scif_send() - Send a message
> > + * @epd:	endpoint descriptor
> > + * @msg:	message buffer address
> > + * @len:	message length
> > + * @flags:	blocking mode flags
> > + *
> > + * scif_send() sends data to the peer of endpoint epd. Up to len bytes of data
> > + * are copied from memory starting at address msg. On successful execution the
> > + * return value of scif_send() is the number of bytes that were sent, and is
> > + * zero if no bytes were sent because len was zero. scif_send() may be called
> > + * only when the endpoint is in a connected state.
> > + *
> > + * If a scif_send() call is non-blocking, then it sends only those bytes which
> > + * can be sent without waiting, up to a maximum of len bytes.
> > + *
> > + * If a scif_send() call is blocking, then it normally returns after sending
> > + * all len bytes. If a blocking call is interrupted or the connection is
> > + * reset, the call is considered successful if some bytes were sent or len is
> > + * zero, otherwise the call is considered unsuccessful.
> > + *
> > + * In user mode, the select() and poll() functions can be used to determine
> > + * when the send queue is not full. In kernel mode, the scif_poll() function
> > + * may be used for this purpose.
> > + *
> > + * It is recommended that scif_send()/scif_recv() only be used for short
> > + * control-type message communication between SCIF endpoints. The SCIF RMA
> > + * APIs are expected to provide better performance for transfer sizes of
> > + * 1024 bytes or longer for the current MIC hardware and software
> > + * implementation.
> > + *
> > + * scif_send() will block until the entire message is sent if SCIF_SEND_BLOCK
> > + * is passed as the flags argument.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_send() returns the number of bytes sent;
> > + * otherwise in user mode -1 is returned and errno is set to indicate the
> > + * error; in kernel mode the negative of one of the following errors is
> > + * returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EFAULT - An invalid address was specified for a parameter
> > + * EINVAL - flags is invalid, or len is negative
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOMEM - Not enough space
> > + * ENOTCONN - The endpoint is not connected
> > + */
> > +int scif_send(scif_epd_t epd, void *msg, int len, int flags);
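
Since a non-blocking scif_send() may accept fewer than len bytes, callers
generally end up looping. A minimal user-space sketch of that loop follows; it
is not SCIF code. send_fn, chunk_ctx, chunk_send() and send_all() are
hypothetical names, with chunk_send() standing in for a non-blocking
scif_send() that takes only a few bytes per call.

```c
#include <errno.h>

/* Hypothetical stand-in for a non-blocking send: returns the number of
 * bytes accepted (possibly fewer than len) or a negative errno. */
typedef int (*send_fn)(void *ctx, const void *msg, int len);

/* Mock sender that accepts at most max_per_call bytes per call. */
struct chunk_ctx {
	int max_per_call;
	int calls;
};

static int chunk_send(void *ctx, const void *msg, int len)
{
	struct chunk_ctx *c = ctx;

	(void)msg;
	c->calls++;
	return len < c->max_per_call ? len : c->max_per_call;
}

/* Call send until all len bytes are accepted. A partial transfer counts
 * as success, as in the comment above; an error is reported only if
 * nothing was sent. */
static int send_all(send_fn send, void *ctx, const void *msg, int len)
{
	const char *p = msg;
	int sent = 0;

	if (len < 0)
		return -EINVAL;

	while (sent < len) {
		int n = send(ctx, p + sent, len - sent);

		if (n < 0)
			return sent ? sent : n;
		if (n == 0)
			break;	/* queue full; a real caller would poll here */
		sent += n;
	}
	return sent;
}
```

A real caller would wait in poll() (or scif_poll() in kernel mode) where the
sketch simply stops on a zero-byte return.
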
> > +
> > +/**
> > + * scif_recv() - Receive a message
> > + * @epd:	endpoint descriptor
> > + * @msg:	message buffer address
> > + * @len:	message buffer length
> > + * @flags:	blocking mode flags
> > + *
> > + * scif_recv() receives data from the peer of endpoint epd. Up to len bytes of
> > + * data are copied to memory starting at address msg. On successful execution
> > + * the return value of scif_recv() is the number of bytes that were received,
> > + * and is zero if no bytes were received because len was zero. scif_recv() may
> > + * be called only when the endpoint is in a connected state.
> > + *
> > + * If a scif_recv() call is non-blocking, then it receives only those bytes
> > + * which can be received without waiting, up to a maximum of len bytes.
> > + *
> > + * If a scif_recv() call is blocking, then it normally returns after receiving
> > + * all len bytes. If the blocking call was interrupted due to a disconnection,
> > + * subsequent calls to scif_recv() will copy all bytes received up to the point
> > + * of disconnection.
> > + *
> > + * In user mode, the select() and poll() functions can be used to determine
> > + * when data is available to be received. In kernel mode, the scif_poll()
> > + * function may be used for this purpose.
> > + *
> > + * It is recommended that scif_send()/scif_recv() only be used for short
> > + * control-type message communication between SCIF endpoints. The SCIF RMA
> > + * APIs are expected to provide better performance for transfer sizes of
> > + * 1024 bytes or longer for the current MIC hardware and software
> > + * implementation.
> > + *
> > + * scif_recv() will block until the entire message is received if
> > + * SCIF_RECV_BLOCK is passed as the flags argument.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_recv() returns the number of bytes
> > + * received; otherwise in user mode -1 is returned and errno is set to
> > + * indicate the error; in kernel mode the negative of one of the following
> > + * errors is returned.
> > + *
> > + * Errors:
> > + * EAGAIN - The destination node is returning from a low power state
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EFAULT - An invalid address was specified for a parameter
> > + * EINVAL - flags is invalid, or len is negative
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOMEM - Not enough space
> > + * ENOTCONN - The endpoint is not connected
> > + */
> > +int scif_recv(scif_epd_t epd, void *msg, int len, int flags);
> > +
> > +/**
> > + * scif_register() - Mark a memory region for remote access.
> > + * @epd:		endpoint descriptor
> > + * @addr:		starting virtual address
> > + * @len:		length of range
> > + * @offset:		offset of window
> > + * @prot_flags:		read/write protection flags
> > + * @map_flags:		mapping flags
> > + *
> > + * The scif_register() function opens a window, a range of whole pages of the
> > + * registered address space of the endpoint epd, starting at offset po and
> > + * continuing for len bytes. The value of po, further described below, is a
> > + * function of the parameters offset and len, and the value of map_flags. Each
> > + * page of the window represents the physical memory page which backs the
> > + * corresponding page of the range of virtual address pages starting at addr
> > + * and continuing for len bytes. addr and len are constrained to be multiples
> > + * of the page size. A successful scif_register() call returns po.
> > + *
> > + * When SCIF_MAP_FIXED is set in the map_flags argument, po will be offset
> > + * exactly, and offset is constrained to be a multiple of the page size. The
> > + * mapping established by scif_register() will not replace any existing
> > + * registration; an error is returned if any page within the range [offset,
> > + * offset + len - 1] intersects an existing window.
> > + *
> > + * When SCIF_MAP_FIXED is not set, the implementation uses offset in an
> > + * implementation-defined manner to arrive at po. The po value so chosen will
> > + * be an area of the registered address space that the implementation deems
> > + * suitable for a mapping of len bytes. An offset value of 0 is interpreted as
> > + * granting the implementation complete freedom in selecting po, subject to
> > + * constraints described below. A non-zero value of offset is taken to be a
> > + * suggestion of an offset near which the mapping should be placed. When the
> > + * implementation selects a value for po, it does not replace any extant
> > + * window. In all cases, po will be a multiple of the page size.
> > + *
> > + * The physical pages which are so represented by a window are available for
> > + * access in calls to mmap(), scif_readfrom(), scif_writeto(),
> > + * scif_vreadfrom(), and scif_vwriteto(). While a window is registered, the
> > + * physical pages represented by the window will not be reused by the memory
> > + * subsystem for any other purpose. Note that the same physical page may be
> > + * represented by multiple windows.
> > + *
> > + * Subsequent operations which change the memory pages to which virtual
> > + * addresses are mapped (such as mmap(), munmap()) have no effect on
> > + * existing windows.
> > + *
> > + * If the process will fork(), it is recommended that the registered
> > + * virtual address range be marked with MADV_DONTFORK. Doing so will prevent
> > + * problems due to copy-on-write semantics.
> > + *
> > + * The prot_flags argument is formed by OR'ing together one or more of the
> > + * following values.
> > + * SCIF_PROT_READ - allow read operations from the window
> > + * SCIF_PROT_WRITE - allow write operations to the window
> > + *
> > + * The map_flags argument can be set to SCIF_MAP_FIXED, in which case offset
> > + * is interpreted as a fixed offset.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_register() returns the offset at which the
> > + * mapping was placed (po); otherwise in user mode SCIF_REGISTER_FAILED (that
> > + * is (off_t)-1) is returned and errno is set to indicate the error; in
> > + * kernel mode the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EADDRINUSE - SCIF_MAP_FIXED is set in map_flags, and pages in the range
> > + * [offset, offset + len - 1] are already registered
> > + * EAGAIN - The mapping could not be performed due to lack of resources
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
> > + * EINVAL - map_flags is invalid, or prot_flags is invalid, or SCIF_MAP_FIXED is
> > + * set in map_flags and offset is not a multiple of the page size, or addr is
> > + * not a multiple of the page size, or len is not a multiple of the page size
> > + * or is 0, or offset is negative
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOMEM - Not enough space
> > + * ENOTCONN - The endpoint is not connected
> > + */
> > +off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
> > +		    int prot_flags, int map_flags);
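
The EINVAL conditions listed for scif_register() can be written as a small
argument check. This is only a sketch of the documented constraints, not the
driver's validation code. SKETCH_PAGE_SIZE and SKETCH_MAP_FIXED are
placeholder values, and check_register_args() is a hypothetical helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_MAP_FIXED 0x10	/* placeholder, not the real flag value */

/* Returns 0 if the arguments satisfy the documented constraints,
 * -EINVAL otherwise. */
static int check_register_args(uintptr_t addr, size_t len, int64_t offset,
			       int map_flags)
{
	/* Only SCIF_MAP_FIXED is a valid map flag. */
	if (map_flags & ~SKETCH_MAP_FIXED)
		return -EINVAL;

	/* addr and len must be page multiples; len must be non-zero. */
	if (addr % SKETCH_PAGE_SIZE || len == 0 || len % SKETCH_PAGE_SIZE)
		return -EINVAL;

	if (offset < 0)
		return -EINVAL;

	/* A fixed offset must be page aligned; otherwise it is only a hint. */
	if ((map_flags & SKETCH_MAP_FIXED) && offset % SKETCH_PAGE_SIZE)
		return -EINVAL;

	return 0;
}
```

Note that an unaligned offset is accepted when SCIF_MAP_FIXED is not set,
since the comment describes it as a suggestion in that case.
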
> > +
> > +/**
> > + * scif_unregister() - Close previously registered windows.
> > + * @epd:	endpoint descriptor
> > + * @offset:	start of range to unregister
> > + * @len:	length of range to unregister
> > + *
> > + * The scif_unregister() function closes those previously registered windows
> > + * which are entirely within the range [offset, offset + len - 1]. It is an
> > + * error to specify a range which intersects only a subrange of a window.
> > + *
> > + * On a successful return, pages within the window may no longer be specified
> > + * in calls to mmap(), scif_readfrom(), scif_writeto(), scif_vreadfrom(),
> > + * scif_vwriteto(), scif_get_pages(), and scif_fence_signal(). The window,
> > + * however, continues to exist until all previous references against it are
> > + * removed. A window is referenced if there is a mapping to it created by
> > + * mmap(), or if scif_get_pages() was called against the window
> > + * (and the pages have not been returned via scif_put_pages()). A window is
> > + * also referenced while an RMA, in which some range of the window is a source
> > + * or destination, is in progress. Finally, a window is referenced while some
> > + * offset in that window was specified to scif_fence_signal(), and the RMAs
> > + * marked by that call to scif_fence_signal() have not completed. While a
> > + * window is in this state, its registered address space pages are not
> > + * available for use in a new registered window.
> > + *
> > + * When all such references to the window have been removed, its references to
> > + * all the physical pages which it represents are removed. Similarly, the
> > + * registered address space pages of the window become available for
> > + * registration in a new window.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_unregister() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned. In the event of an
> > + * error, no windows are unregistered.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EINVAL - the range [offset, offset + len - 1] intersects a subrange of a
> > + * window, or offset is negative
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - Offsets in the range [offset, offset + len - 1] are invalid for the
> > + * registered address space of epd
> > + */
> > +int scif_unregister(scif_epd_t epd, off_t offset, size_t len);
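
The "entirely within the range" rule for scif_unregister() amounts to a
three-way classification of each window against the requested range. A sketch
of that rule, with hypothetical names (classify_window(), WIN_UNTOUCHED,
WIN_CLOSED) and half-open intervals for simplicity:

```c
#include <errno.h>
#include <stdint.h>

enum { WIN_UNTOUCHED = 0, WIN_CLOSED = 1 };

/* Window is [w_off, w_off + w_len), request is [off, off + len).
 * Returns WIN_CLOSED if the window lies entirely within the request,
 * WIN_UNTOUCHED if they do not intersect, and -EINVAL if the request
 * covers only part of the window. */
static int classify_window(uint64_t w_off, uint64_t w_len,
			   uint64_t off, uint64_t len)
{
	uint64_t w_end = w_off + w_len;	/* exclusive end of window */
	uint64_t end = off + len;	/* exclusive end of request */

	if (w_end <= off || end <= w_off)
		return WIN_UNTOUCHED;
	if (off <= w_off && w_end <= end)
		return WIN_CLOSED;
	return -EINVAL;
}
```

Per the comment, a single -EINVAL result would mean no windows at all are
unregistered by the call.
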
> > +
> > +/**
> > + * scif_readfrom() - Copy from a remote address space
> > + * @epd:	endpoint descriptor
> > + * @loffset:	offset in local registered address space to
> > + *		which to copy
> > + * @len:	length of range to copy
> > + * @roffset:	offset in remote registered address space
> > + *		from which to copy
> > + * @rma_flags:	transfer mode flags
> > + *
> > + * scif_readfrom() copies len bytes from the remote registered address space of
> > + * the peer of endpoint epd, starting at the offset roffset to the local
> > + * registered address space of epd, starting at the offset loffset.
> > + *
> > + * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
> > + * roffset + len - 1] must be within some registered window or windows of the
> > + * local and remote nodes. A range may intersect multiple registered windows,
> > + * but only if those windows are contiguous in the registered address space.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
> > + * programmed read/writes. Otherwise the data is copied using DMA. If
> > + * rma_flags includes SCIF_RMA_SYNC, then scif_readfrom() will return after
> > + * the transfer is complete. Otherwise, the transfer may be performed
> > + * asynchronously. The order in which any two asynchronous RMA operations
> > + * complete is non-deterministic. The synchronization functions,
> > + * scif_fence_mark()/scif_fence_wait() and scif_fence_signal(), can be used to
> > + * synchronize to the completion of asynchronous RMA operations on the same
> > + * endpoint.
> > + *
> > + * The DMA transfer of individual bytes is not guaranteed to complete in
> > + * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
> > + * cacheline or partial cacheline of the source range will become visible on
> > + * the destination node after all other transferred data in the source
> > + * range has become visible on the destination node.
> > + *
> > + * The optimal DMA performance will likely be realized if both
> > + * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
> > + * performance will likely be realized if loffset and roffset are not
> > + * cacheline aligned but are separated by some multiple of 64. The lowest level
> > + * of performance is likely if loffset and roffset are not separated by a
> > + * multiple of 64.
> > + *
> > + * The rma_flags argument is formed by ORing together zero or more of the
> > + * following values.
> > + * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
> > + *	engine.
> > + * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
> > + *		transfer has completed. Passing this flag results in the
> > + *		current implementation busy waiting and consuming CPU cycles
> > + *		while the DMA transfer is in progress for best performance by
> > + *		avoiding the interrupt latency.
> > + * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
> > + *		the source range becomes visible on the destination node
> > + *		after all other transferred data in the source range has
> > + *		become visible on the destination
> > + *
> > + * Return:
> > + * Upon successful completion, scif_readfrom() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EACCES - Attempt to write to a read-only range
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EINVAL - rma_flags is invalid
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
> > + * address space of epd, or the range [roffset, roffset + len - 1] is invalid
> > + * for the registered address space of the peer of epd, or loffset or roffset
> > + * is negative
> > + */
> > +int scif_readfrom(scif_epd_t epd, off_t loffset, size_t len, off_t
> > +		  roffset, int rma_flags);
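
The three alignment cases described for DMA performance can be told apart with
simple modular arithmetic. The sketch below only classifies a pair of offsets
as the comment describes; it makes no claim about actual throughput. The enum
and dma_align_class() are hypothetical names, and the 64-byte cacheline size
is taken from the comment.

```c
#include <stdint.h>

#define SKETCH_CACHELINE 64u	/* cacheline size stated in the comment */

enum dma_align {
	DMA_ALIGN_BEST,		/* both offsets cacheline aligned */
	DMA_ALIGN_MID,		/* unaligned, but a multiple of 64 apart */
	DMA_ALIGN_WORST,	/* not a multiple of 64 apart */
};

static enum dma_align dma_align_class(uint64_t loffset, uint64_t roffset)
{
	uint64_t lrem = loffset % SKETCH_CACHELINE;
	uint64_t rrem = roffset % SKETCH_CACHELINE;

	if (lrem == 0 && rrem == 0)
		return DMA_ALIGN_BEST;
	/* Equal remainders mean the offsets differ by a multiple of 64. */
	if (lrem == rrem)
		return DMA_ALIGN_MID;
	return DMA_ALIGN_WORST;
}
```
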
> > +
> > +/**
> > + * scif_writeto() - Copy to a remote address space
> > + * @epd:	endpoint descriptor
> > + * @loffset:	offset in local registered address space
> > + *		from which to copy
> > + * @len:	length of range to copy
> > + * @roffset:	offset in remote registered address space to
> > + *		which to copy
> > + * @rma_flags:	transfer mode flags
> > + *
> > + * scif_writeto() copies len bytes from the local registered address space of
> > + * epd, starting at the offset loffset to the remote registered address space
> > + * of the peer of endpoint epd, starting at the offset roffset.
> > + *
> > + * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
> > + * roffset + len - 1] must be within some registered window or windows of the
> > + * local and remote nodes. A range may intersect multiple registered windows,
> > + * but only if those windows are contiguous in the registered address space.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
> > + * programmed read/writes. Otherwise the data is copied using DMA. If
> > + * rma_flags includes SCIF_RMA_SYNC, then scif_writeto() will return after
> > + * the transfer is complete. Otherwise, the transfer may be performed
> > + * asynchronously. The order in which any two asynchronous RMA operations
> > + * complete is non-deterministic. The synchronization functions,
> > + * scif_fence_mark()/scif_fence_wait() and scif_fence_signal(), can be used to
> > + * synchronize to the completion of asynchronous RMA operations on the same
> > + * endpoint.
> > + *
> > + * The DMA transfer of individual bytes is not guaranteed to complete in
> > + * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
> > + * cacheline or partial cacheline of the source range will become visible on
> > + * the destination node after all other transferred data in the source
> > + * range has become visible on the destination node.
> > + *
> > + * The optimal DMA performance will likely be realized if both
> > + * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
> > + * performance will likely be realized if loffset and roffset are not cacheline
> > + * aligned but are separated by some multiple of 64. The lowest level of
> > + * performance is likely if loffset and roffset are not separated by a multiple
> > + * of 64.
> > + *
> > + * The rma_flags argument is formed by ORing together zero or more of the
> > + * following values.
> > + * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
> > + *			engine.
> > + * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
> > + *		transfer has completed. Passing this flag results in the
> > + *		current implementation busy waiting and consuming CPU cycles
> > + *		while the DMA transfer is in progress for best performance by
> > + *		avoiding the interrupt latency.
> > + * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
> > + *		the source range becomes visible on the destination node
> > + *		after all other transferred data in the source range has
> > + *		become visible on the destination
> > + *
> > + * Return:
> > + * Upon successful completion, scif_writeto() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EACCES - Attempt to write to a read-only range
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EINVAL - rma_flags is invalid
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
> > + * address space of epd, or the range [roffset, roffset + len - 1] is invalid
> > + * for the registered address space of the peer of epd, or loffset or roffset
> > + * is negative
> > + */
> > +int scif_writeto(scif_epd_t epd, off_t loffset, size_t len, off_t
> > +		 roffset, int rma_flags);
> > +
> > +/**
> > + * scif_vreadfrom() - Copy from a remote address space
> > + * @epd:	endpoint descriptor
> > + * @addr:	address to which to copy
> > + * @len:	length of range to copy
> > + * @roffset:	offset in remote registered address space
> > + *		from which to copy
> > + * @rma_flags:	transfer mode flags
> > + *
> > + * scif_vreadfrom() copies len bytes from the remote registered address
> > + * space of the peer of endpoint epd, starting at the offset roffset, to local
> > + * memory, starting at addr.
> > + *
> > + * The specified range [roffset, roffset + len - 1] must be within some
> > + * registered window or windows of the remote node. The range may
> > + * intersect multiple registered windows, but only if those windows are
> > + * contiguous in the registered address space.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
> > + * programmed read/writes. Otherwise the data is copied using DMA. If
> > + * rma_flags includes SCIF_RMA_SYNC, then scif_vreadfrom() will return after
> > + * the transfer is complete. Otherwise, the transfer may be performed
> > + * asynchronously. The order in which any two asynchronous RMA operations
> > + * complete is non-deterministic. The synchronization functions,
> > + * scif_fence_mark()/scif_fence_wait() and scif_fence_signal(), can be used to
> > + * synchronize to the completion of asynchronous RMA operations on the same
> > + * endpoint.
> > + *
> > + * The DMA transfer of individual bytes is not guaranteed to complete in
> > + * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
> > + * cacheline or partial cacheline of the source range will become visible on
> > + * the destination node after all other transferred data in the source
> > + * range has become visible on the destination node.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
> > + * the specified local memory range may remain in a pinned state even after
> > + * the specified transfer completes. This may reduce overhead if some or all of
> > + * the same virtual address range is referenced in a subsequent call of
> > + * scif_vreadfrom() or scif_vwriteto().
> > + *
> > + * The optimal DMA performance will likely be realized if both
> > + * addr and roffset are cacheline aligned (are a multiple of 64). Lower
> > + * performance will likely be realized if addr and roffset are not
> > + * cacheline aligned but are separated by some multiple of 64. The lowest level
> > + * of performance is likely if addr and roffset are not separated by a
> > + * multiple of 64.
> > + *
> > + * The rma_flags argument is formed by ORing together zero or more of the
> > + * following values.
> > + * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
> > + *	engine.
> > + * SCIF_RMA_USECACHE - enable registration caching
> > + * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
> > + *		transfer has completed. Passing this flag results in the
> > + *		current implementation busy waiting and consuming CPU cycles
> > + *		while the DMA transfer is in progress for best performance by
> > + *		avoiding the interrupt latency.
> > + * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
> > + *	the source range becomes visible on the destination node
> > + *	after all other transferred data in the source range has
> > + *	become visible on the destination
> > + *
> > + * Return:
> > + * Upon successful completion, scif_vreadfrom() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EACCES - Attempt to write to a read-only range
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
> > + * EINVAL - rma_flags is invalid
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
> > + * registered address space of the peer of epd
> > + */
> > +int scif_vreadfrom(scif_epd_t epd, void *addr, size_t len, off_t roffset,
> > +		   int rma_flags);
> > +
> > +/**
> > + * scif_vwriteto() - Copy to a remote address space
> > + * @epd:	endpoint descriptor
> > + * @addr:	address from which to copy
> > + * @len:	length of range to copy
> > + * @roffset:	offset in remote registered address space to
> > + *		which to copy
> > + * @rma_flags:	transfer mode flags
> > + *
> > + * scif_vwriteto() copies len bytes from the local memory, starting at addr, to
> > + * the remote registered address space of the peer of endpoint epd, starting at
> > + * the offset roffset.
> > + *
> > + * The specified range [roffset, roffset + len - 1] must be within some
> > + * registered window or windows of the remote node. The range may intersect
> > + * multiple registered windows, but only if those windows are contiguous in the
> > + * registered address space.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
> > + * programmed read/writes. Otherwise the data is copied using DMA. If
> > + * rma_flags includes SCIF_RMA_SYNC, then scif_vwriteto() will return after
> > + * the transfer is complete. Otherwise, the transfer may be performed
> > + * asynchronously. The order in which any two asynchronous RMA operations
> > + * complete is non-deterministic. The synchronization functions,
> > + * scif_fence_mark()/scif_fence_wait() and scif_fence_signal(), can be used to
> > + * synchronize to the completion of asynchronous RMA operations on the same
> > + * endpoint.
> > + *
> > + * The DMA transfer of individual bytes is not guaranteed to complete in
> > + * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
> > + * cacheline or partial cacheline of the source range will become visible on
> > + * the destination node after all other transferred data in the source
> > + * range has become visible on the destination node.
> > + *
> > + * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
> > + * the specified local memory range may remain in a pinned state even after
> > + * the specified transfer completes. This may reduce overhead if some or all of
> > + * the same virtual address range is referenced in a subsequent call of
> > + * scif_vreadfrom() or scif_vwriteto().
> > + *
> > + * The optimal DMA performance will likely be realized if both
> > + * addr and roffset are cacheline aligned (are a multiple of 64). Lower
> > + * performance will likely be realized if addr and roffset are not cacheline
> > + * aligned but are separated by some multiple of 64. The lowest level of
> > + * performance is likely if addr and roffset are not separated by a multiple of
> > + * 64.
> > + *
> > + * The rma_flags argument is formed by ORing together zero or more of the
> > + * following values.
> > + * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
> > + *	engine.
> > + * SCIF_RMA_USECACHE - allow registration caching
> > + * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
> > + *		transfer has completed. Passing this flag results in the
> > + *		current implementation busy waiting and consuming CPU cycles
> > + *		while the DMA transfer is in progress for best performance by
> > + *		avoiding the interrupt latency.
> > + * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
> > + *		the source range becomes visible on the destination node
> > + *		after all other transferred data in the source range has
> > + *		become visible on the destination
> > + *
> > + * Return:
> > + * Upon successful completion, scif_vwriteto() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EACCES - Attempt to write to a read-only range
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
> > + * EINVAL - rma_flags is invalid
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
> > + * registered address space of the peer of epd
> > + */
> > +int scif_vwriteto(scif_epd_t epd, void *addr, size_t len, off_t roffset,
> > +		  int rma_flags);
> > +
> > +/**
> > + * scif_fence_mark() - Mark previously issued RMAs
> > + * @epd:	endpoint descriptor
> > + * @flags:	control flags
> > + * @mark:	marked value returned as output.
> > + *
> > + * scif_fence_mark() returns after marking the current set of all uncompleted
> > + * RMAs initiated through the endpoint epd or the current set of all
> > + * uncompleted RMAs initiated through the peer of endpoint epd. The RMAs are
> > + * marked with a value returned at mark. The application may subsequently call
> > + * scif_fence_wait(), passing the value returned at mark, to await completion
> > + * of all RMAs so marked.
> > + *
> > + * The flags argument has exactly one of the following values.
> > + * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
> > + *	epd are marked
> > + * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
> > + *	of endpoint epd are marked
> > + *
> > + * Return:
> > + * Upon successful completion, scif_fence_mark() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EINVAL - flags is invalid
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENOMEM - Insufficient kernel memory was available
> > + */
> > +int scif_fence_mark(scif_epd_t epd, int flags, int *mark);
> > +
> > +/**
> > + * scif_fence_wait() - Wait for completion of marked RMAs
> > + * @epd:	endpoint descriptor
> > + * @mark:	mark request
> > + *
> > + * scif_fence_wait() returns after all RMAs marked with mark have completed.
> > + * The value passed in mark must have been obtained in a previous call to
> > + * scif_fence_mark().
> > + *
> > + * Return:
> > + * Upon successful completion, scif_fence_wait() returns 0; otherwise in user
> > + * mode -1 is returned and errno is set to indicate the error; in kernel mode
> > + * the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * ENODEV - The remote node is lost or existed, but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENOMEM - Insufficient kernel memory was available
> > + */
> > +int scif_fence_wait(scif_epd_t epd, int mark);
> > +
> > +/**
> > + * scif_fence_signal() - Request a memory update on completion of RMAs
> > + * @epd:	endpoint descriptor
> > + * @loff:	local offset
> > + * @lval:	local value to write to loff
> > + * @roff:	remote offset
> > + * @rval:	remote value to write to roff
> > + * @flags:	flags
> > + *
> > + * scif_fence_signal() returns after marking the current set of all uncompleted
> > + * RMAs initiated through the endpoint epd or marking the current set of all
> > + * uncompleted RMAs initiated through the peer of endpoint epd.
> > + *
> > + * If flags includes SCIF_SIGNAL_LOCAL, then on completion of the RMAs in the
> > + * marked set, lval is written to memory at the address corresponding to offset
> > + * loff in the local registered address space of epd. loff must be within a
> > + * registered window. If flags includes SCIF_SIGNAL_REMOTE, then on completion
> > + * of the RMAs in the marked set, rval is written to memory at the address
> > + * corresponding to offset roff in the remote registered address space of epd.
> > + * roff must be within a remote registered window of the peer of epd. Note
> > + * that any specified offset must be DWORD (4 byte / 32 bit) aligned.
> > + *
> > + * The flags argument is formed by OR'ing together the following.
> > + * Exactly one of the following values.
> > + * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
> > + *	epd are marked
> > + * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
> > + *	of endpoint epd are marked
> > + * One or more of the following values.
> > + * SCIF_SIGNAL_LOCAL - On completion of the marked set of RMAs, write lval to
> > + *	memory at the address corresponding to offset loff in the local
> > + *	registered address space of epd.
> > + * SCIF_SIGNAL_REMOTE - On completion of the marked set of RMAs, write rval to
> > + *	memory at the address corresponding to offset roff in the remote
> > + *	registered address space of epd.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_fence_signal() returns 0; otherwise in
> > + * user mode -1 is returned and errno is set to indicate the error; in kernel
> > + * mode the negative of one of the following errors is returned.
> > + *
> > + * Errors:
> > + * EBADF, ENOTTY - epd is not a valid endpoint descriptor
> > + * ECONNRESET - Connection reset by peer
> > + * EINVAL - flags is invalid, or loff or roff are not DWORD aligned
> > + * ENODEV - The remote node is lost, or existed but is not currently in the
> > + * network since it may have crashed
> > + * ENOTCONN - The endpoint is not connected
> > + * ENXIO - loff is invalid for the registered address of epd, or roff is invalid
> > + * for the registered address space, of the peer of epd
> > + */
> > +int scif_fence_signal(scif_epd_t epd, off_t loff, u64 lval, off_t roff,
> > +		      u64 rval, int flags);
> > +
> > +/**
> > + * scif_get_node_ids() - Return information about online nodes
> > + * @nodes:	array in which to return online node IDs
> > + * @len:	number of entries in the nodes array
> > + * @self:	address to place the node ID of the local node
> > + *
> > + * scif_get_node_ids() fills in the nodes array with up to len node IDs of the
> > + * nodes in the SCIF network. If there is not enough space in nodes, as
> > + * indicated by the len parameter, only len node IDs are returned in nodes. The
> > + * return value of scif_get_node_ids() is the total number of nodes currently in
> > + * the SCIF network. By checking the return value against the len parameter,
> > + * the user may determine if enough space for nodes was allocated.
> > + *
> > + * The node ID of the local node is returned at self.
> > + *
> > + * Return:
> > + * Upon successful completion, scif_get_node_ids() returns the actual number of
> > + * online nodes in the SCIF network including 'self'; otherwise in user mode
> > + * -1 is returned and errno is set to indicate the error; in kernel mode no
> > + * errors are returned.
> > + *
> > + * Errors:
> > + * EFAULT - Bad address
> > + */
> > +int scif_get_node_ids(u16 *nodes, int len, u16 *self);
> > +
> > +#endif /* __SCIF_H__ */
> > diff --git a/include/uapi/linux/scif_ioctl.h b/include/uapi/linux/scif_ioctl.h
> > new file mode 100644
> > index 0000000..32449de
> > --- /dev/null
> > +++ b/include/uapi/linux/scif_ioctl.h
> > @@ -0,0 +1,215 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * This file is provided under a dual BSD/GPLv2 license.  When using or
> > + * redistributing this file, you may do so under either license.
> > + *
> > + * GPL LICENSE SUMMARY
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of version 2 of the GNU General Public License as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * General Public License for more details.
> > + *
> > + * BSD LICENSE
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + *
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above copyright
> > + *   notice, this list of conditions and the following disclaimer in
> > + *   the documentation and/or other materials provided with the
> > + *   distribution.
> > + * * Neither the name of Intel Corporation nor the names of its
> > + *   contributors may be used to endorse or promote products derived
> > + *   from this software without specific prior written permission.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + *
> > + * Intel SCIF driver.
> > + *
> > + */
> > +/*
> > + * -----------------------------------------
> > + * SCIF IOCTL interface information
> > + * -----------------------------------------
> > + */
> > +#ifndef SCIF_IOCTL_H
> > +#define SCIF_IOCTL_H
> > +
> > +#include <linux/types.h>
> > +
> > +/**
> > + * struct scif_port_id - SCIF port information
> > + * @node:	node on which port resides
> > + * @port:	local port number
> > + */
> > +struct scif_port_id {
> > +	__u16 node;
> > +	__u16 port;
> > +};
> > +
> > +/**
> > + * struct scifioctl_connect - used for SCIF_CONNECT IOCTL
> > + * @self:	used to read back the assigned port_id
> > + * @peer:	destination node and port to connect to
> > + */
> > +struct scifioctl_connect {
> > +	struct scif_port_id	self;
> > +	struct scif_port_id	peer;
> > +};
> > +
> > +/**
> > + * struct scifioctl_accept - used for SCIF_ACCEPTREQ IOCTL
> > + * @flags:	flags
> > + * @peer:	global id of peer endpoint
> > + * @endpt:	new connected endpoint descriptor
> > + */
> > +struct scifioctl_accept {
> > +	__s32			flags;
> > +	struct scif_port_id	peer;
> > +	void			*endpt;
> > +};
> > +
> > +/**
> > + * struct scifioctl_msg - used for SCIF_SEND/SCIF_RECV IOCTL
> > + * @msg:	message buffer address
> > + * @len:	message length
> > + * @flags:	flags
> > + * @out_len:	number of bytes sent/received
> > + */
> > +struct scifioctl_msg {
> > +	void	__user *msg;
> > +	__s32	len;
> > +	__s32	flags;
> > +	__s32	out_len;
> > +};
> > +
> > +/**
> > + * struct scifioctl_reg - used for SCIF_REG IOCTL
> > + * @addr:	starting virtual address
> > + * @len:	length of range
> > + * @offset:	offset of window
> > + * @prot:	read/write protection
> > + * @flags:	flags
> > + * @out_offset:	offset returned
> > + */
> > +struct scifioctl_reg {
> > +	void		*addr;
> > +	__u64		len;
> > +	__s64		offset;
> > +	__s32		prot;
> > +	__s32		flags;
> > +	__s64		out_offset;
> > +};
> > +
> > +/**
> > + * struct scifioctl_unreg - used for SCIF_UNREG IOCTL
> > + * @offset:	start of range to unregister
> > + * @len:	length of range to unregister
> > + */
> > +struct scifioctl_unreg {
> > +	__s64		offset;
> > +	__u64		len;
> > +};
> > +
> > +/**
> > + * struct scifioctl_copy - used for SCIF DMA copy IOCTLs
> > + *
> > + * @loffset:	offset in local registered address space to/from
> > + *		which to copy
> > + * @len:	length of range to copy
> > + * @roffset:	offset in remote registered address space to/from
> > + *		which to copy
> > + * @addr:	user virtual address to/from which to copy
> > + * @flags:	flags
> > + *
> > + * This structure is used for SCIF_READFROM, SCIF_WRITETO, SCIF_VREADFROM
> > + * and SCIF_VWRITETO IOCTLs.
> > + */
> > +struct scifioctl_copy {
> > +	__s64		loffset;
> > +	__u64		len;
> > +	__s64		roffset;
> > +	__u8		__user *addr;
> > +	__s32		flags;
> > +};
> > +
> > +/**
> > + * struct scifioctl_fence_mark  - used for SCIF_FENCE_MARK IOCTL
> > + * @flags:	flags
> > + * @mark:	fence handle returned by reference
> > + */
> > +struct scifioctl_fence_mark {
> > +	__s32	flags;
> > +	__s32	__user *mark;
> > +};
> > +
> > +/**
> > + * struct scifioctl_fence_signal - used for SCIF_FENCE_SIGNAL IOCTL
> > + * @loff:	local offset
> > + * @lval:	local value to write to loffset
> > + * @roff:	remote offset
> > + * @rval:	remote value to write to roffset
> > + * @flags:	flags
> > + */
> > +struct scifioctl_fence_signal {
> > +	__s64		loff;
> > +	__u64		lval;
> > +	__s64		roff;
> > +	__u64		rval;
> > +	__s32		flags;
> > +};
> > +
> > +/**
> > + * struct scifioctl_node_ids - used for SCIF_GET_NODEIDS IOCTL
> > + * @nodes:	pointer to an array of node_ids
> > + * @len:	length of array
> > + * @self:	ID of the current node
> > + */
> > +struct scifioctl_node_ids {
> > +	__u16	__user *nodes;
> > +	__s32	len;
> > +	__u16	__user *self;
> > +};
> > +
> > +#define SCIF_BIND		_IOWR('s', 1, __s32 *)
> > +#define SCIF_LISTEN		_IOW('s', 2, __s32)
> > +#define SCIF_CONNECT		_IOWR('s', 3, struct scifioctl_connect *)
> > +#define SCIF_ACCEPTREQ		_IOWR('s', 4, struct scifioctl_accept *)
> > +#define SCIF_ACCEPTREG		_IOWR('s', 5, void *)
> > +#define SCIF_SEND		_IOWR('s', 6, struct scifioctl_msg *)
> > +#define SCIF_RECV		_IOWR('s', 7, struct scifioctl_msg *)
> > +#define SCIF_REG		_IOWR('s', 8, struct scifioctl_reg *)
> 
> I just picked one random ioctl, this one, and I don't see it used
> anywhere in this series.  Why are you defining it if you don't use it?
> 

This initial patch series implements the SCIF connection, messaging and
node enumeration APIs. Our intention was to add the implementation of
the remaining SCIF functionality in the next patch series while these
unused IOCTLs from the first patch series return an error so that the
interfaces can be reviewed up front. However, we will delete the unused
IOCTLs in the next revision of this patch series and add them back
later.
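
For reference, the node enumeration call documented in the quoted header
(scif_get_node_ids()) returns the total number of online nodes even when
the caller's array is too small, so a caller can retry with a larger
array. A minimal user-space sketch of that pattern follows. Note that
fake_get_node_ids() is a hypothetical stand-in for the real call with
made-up node IDs, and get_all_node_ids() is an illustrative helper, not
part of SCIF.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for scif_get_node_ids(): pretends that nodes
 * 0, 1 and 2 are online and that the local node is 0. Like the
 * documented call, it fills at most len entries and returns the total.
 */
static int fake_get_node_ids(uint16_t *nodes, int len, uint16_t *self)
{
	static const uint16_t online[] = { 0, 1, 2 };
	const int total = 3;

	for (int i = 0; i < len && i < total; i++)
		nodes[i] = online[i];
	*self = 0;
	return total;
}

/*
 * Illustrative helper: grow the array until it holds every node ID.
 * Returns a malloc'd array (caller frees) or NULL on failure.
 */
static uint16_t *get_all_node_ids(int *count, uint16_t *self)
{
	uint16_t *nodes = NULL;
	int len = 1;

	for (;;) {
		uint16_t *tmp = realloc(nodes, (size_t)len * sizeof(*nodes));
		int total;

		if (!tmp) {
			free(nodes);
			return NULL;
		}
		nodes = tmp;

		total = fake_get_node_ids(nodes, len, self);
		if (total < 0) {
			free(nodes);
			return NULL;
		}
		if (total <= len) {	/* everything fit */
			*count = total;
			return nodes;
		}
		len = total;		/* too small: retry with more room */
	}
}
```

The loop also copes with a node coming online between two calls, since
it simply retries until the returned total fits.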

> Also, these are not the correct way to define ioctls, someone at Intel
> should have caught this, which makes me really worry about the
> user/kernel api here :(
> 
> Please fix.
> 

Sorry for missing the incorrect size field and the issues with running
32 bit applications on a 64 bit system during our internal reviews. We
will fix these up and post another revision of this patch series next
week.
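
To make the two problems concrete: passing a pointer type as the third
argument of _IOWR() encodes sizeof(pointer) in the command number rather
than the size of the structure, and a raw pointer member changes the
structure layout between 32 bit and 64 bit callers. The sketch below
shows one plausible shape of a fix. The names example_reg, EXAMPLE_REG
and EXAMPLE_REG_BAD are hypothetical and only mirror scifioctl_reg from
the quoted patch; this is not the posted revision.

```c
#include <assert.h>
#include <linux/ioctl.h>
#include <linux/types.h>

/*
 * Hypothetical fixed-layout variant of scifioctl_reg: the user address
 * is carried as a __u64 instead of a pointer, so the structure has the
 * same size and member offsets for 32 bit and 64 bit user space.
 */
struct example_reg {
	__u64 addr;		/* user virtual address, cast by the caller */
	__u64 len;
	__s64 offset;
	__s32 prot;
	__s32 flags;
	__s64 out_offset;
};

/* Size field describes the structure that is actually copied. */
#define EXAMPLE_REG	_IOWR('s', 8, struct example_reg)

/* Pattern from the quoted patch: size field is sizeof(pointer). */
#define EXAMPLE_REG_BAD	_IOWR('s', 8, struct example_reg *)

/* 8 + 8 + 8 + 4 + 4 + 8 bytes, with no implicit padding. */
static_assert(sizeof(struct example_reg) == 40,
	      "layout must not depend on the pointer size");

/* Size encoded in the command number, as a driver would decode it. */
static unsigned int example_reg_cmd_size(void)
{
	return _IOC_SIZE(EXAMPLE_REG);
}
```

With a layout like this, a 32 bit application on a 64 bit kernel issues
the same command number and passes the same bytes as a native one.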

Thanks for the review!

Sudeep Dutt
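
For readers of the quoted scif_fence_signal() documentation, the flag
and alignment rules can be summarized as a small argument check. This is
only a sketch: the flag values below are assumptions chosen for the
example (the quoted text does not show them), and
check_fence_signal_args() is a hypothetical helper, not a SCIF function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SCIF_FENCE_INIT_SELF	0x01
#define SCIF_FENCE_INIT_PEER	0x02
#define SCIF_SIGNAL_LOCAL	0x10
#define SCIF_SIGNAL_REMOTE	0x20

/*
 * Hypothetical helper applying the documented rules:
 *  - exactly one of SCIF_FENCE_INIT_SELF / SCIF_FENCE_INIT_PEER
 *  - at least one of SCIF_SIGNAL_LOCAL / SCIF_SIGNAL_REMOTE
 *  - every offset that is actually used must be 4 byte aligned
 * Returns 0 if the arguments are acceptable, -EINVAL otherwise.
 */
static int check_fence_signal_args(int64_t loff, int64_t roff, int flags)
{
	const int init_mask = SCIF_FENCE_INIT_SELF | SCIF_FENCE_INIT_PEER;
	const int sig_mask = SCIF_SIGNAL_LOCAL | SCIF_SIGNAL_REMOTE;
	int init = flags & init_mask;

	if (flags & ~(init_mask | sig_mask))
		return -EINVAL;		/* unknown flag bits */
	if (init != SCIF_FENCE_INIT_SELF && init != SCIF_FENCE_INIT_PEER)
		return -EINVAL;		/* need exactly one initiator */
	if (!(flags & sig_mask))
		return -EINVAL;		/* need at least one signal target */
	if ((flags & SCIF_SIGNAL_LOCAL) && (loff & 0x3))
		return -EINVAL;		/* loff not DWORD aligned */
	if ((flags & SCIF_SIGNAL_REMOTE) && (roff & 0x3))
		return -EINVAL;		/* roff not DWORD aligned */
	return 0;
}
```

An offset is only checked when its SIGNAL flag is set, which matches the
wording "any specified offset" in the quoted comment.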



* Re: [PATCH char-misc-next v3 00/13] misc: mic: SCIF driver
  2015-03-09 16:13 99%   ` Sudeep Dutt
@ 2015-03-23 16:44 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-03-23 16:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Mon, 2015-03-09 at 09:13 -0700, Sudeep Dutt wrote:
> On Mon, 2015-02-23 at 16:45 -0800, Sudeep Dutt wrote:
> > On Mon, 2015-01-26 at 16:32 -0800, Sudeep Dutt wrote:
> > > ChangeLog:
> > > =========
> > > 
> > > v2 => v3:
> > > a) Formatting fixes for SCIF header file documentation, data type fixes
> > >    for SCIF IOCTL interface and added SCIF user space documentation in
> > >    scif_overview.txt as per feedback from Greg Kroah-Hartman
> > > 
> > > v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
> > > a) Use kernel-doc formatting for SCIF header file documentation, enhance
> > >    SCIF ring buffer documentation and formatting cleanup in patch 4 as
> > >    per feedback from Greg Kroah-Hartman
> > > b) SCIF bug fixes and cleanups since initial post
> > > 
> > > v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> > > 
> > > Description:
> > > ============
> > > 
> > > The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> > > level communications API across PCIe currently implemented for MIC. Currently
> > > SCIF provides inter-node communication within a single host platform, where a
> > > node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> > > communicating over the PCIe bus while providing an API that is symmetric
> > > across all the nodes in the PCIe network. An important design objective for SCIF
> > > is to deliver the maximum possible performance given the communication
> > > abilities of the hardware. SCIF has been used to implement an offload compiler
> > > runtime and OFED support for MPI implementations for MIC coprocessors.
> > > 
> > 
> > Hi Greg,
> > 
> > Please take a look at this patch series.
> > 
> 
> Hi Greg,
> 
> Another gentle reminder for reviewing this patch series. Please let us
> know if there is anything we can do to speed up the review process.
> 

Hi Greg,

Please review this patch series.

Thanks,
Sudeep Dutt



* Re: [PATCH char-misc-next v3 00/13] misc: mic: SCIF driver
  2015-02-24  0:45 99% ` [PATCH char-misc-next v3 00/13] misc: mic: " Sudeep Dutt
@ 2015-03-09 16:13 99%   ` Sudeep Dutt
  2015-03-23 16:44 99%     ` Sudeep Dutt
  0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-03-09 16:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Mon, 2015-02-23 at 16:45 -0800, Sudeep Dutt wrote:
> On Mon, 2015-01-26 at 16:32 -0800, Sudeep Dutt wrote:
> > ChangeLog:
> > =========
> > 
> > v2 => v3:
> > a) Formatting fixes for SCIF header file documentation, data type fixes
> >    for SCIF IOCTL interface and added SCIF user space documentation in
> >    scif_overview.txt as per feedback from Greg Kroah-Hartman
> > 
> > v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
> > a) Use kernel-doc formatting for SCIF header file documentation, enhance
> >    SCIF ring buffer documentation and formatting cleanup in patch 4 as
> >    per feedback from Greg Kroah-Hartman
> > b) SCIF bug fixes and cleanups since initial post
> > 
> > v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> > 
> > Description:
> > ============
> > 
> > The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> > level communications API across PCIe currently implemented for MIC. Currently
> > SCIF provides inter-node communication within a single host platform, where a
> > node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> > communicating over the PCIe bus while providing an API that is symmetric
> > across all the nodes in the PCIe network. An important design objective for SCIF
> > is to deliver the maximum possible performance given the communication
> > abilities of the hardware. SCIF has been used to implement an offload compiler
> > runtime and OFED support for MPI implementations for MIC coprocessors.
> > 
> 
> Hi Greg,
> 
> Please take a look at this patch series.
> 

Hi Greg,

Another gentle reminder for reviewing this patch series. Please let us
know if there is anything we can do to speed up the review process.

Thanks,
Sudeep Dutt



* Re: [PATCH char-misc-next v3 00/13] misc: mic: SCIF driver
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (12 preceding siblings ...)
  2015-01-27  0:32 98% ` [PATCH char-misc-next v3 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
@ 2015-02-24  0:45 99% ` Sudeep Dutt
  2015-03-09 16:13 99%   ` Sudeep Dutt
  13 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-02-24  0:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Mon, 2015-01-26 at 16:32 -0800, Sudeep Dutt wrote:
> ChangeLog:
> =========
> 
> v2 => v3:
> a) Formatting fixes for SCIF header file documentation, data type fixes
>    for SCIF IOCTL interface and added SCIF user space documentation in
>    scif_overview.txt as per feedback from Greg Kroah-Hartman
> 
> v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
> a) Use kernel-doc formatting for SCIF header file documentation, enhance
>    SCIF ring buffer documentation and formatting cleanup in patch 4 as
>    per feedback from Greg Kroah-Hartman
> b) SCIF bug fixes and cleanups since initial post
> 
> v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> 
> Description:
> ============
> 
> The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> level communications API across PCIe currently implemented for MIC. Currently
> SCIF provides inter-node communication within a single host platform, where a
> node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> communicating over the PCIe bus while providing an API that is symmetric
> across all the nodes in the PCIe network. An important design objective for SCIF
> is to deliver the maximum possible performance given the communication
> abilities of the hardware. SCIF has been used to implement an offload compiler
> runtime and OFED support for MPI implementations for MIC coprocessors.
> 

Hi Greg,

Please take a look at this patch series.

Thanks,
Sudeep Dutt



* Re: [PATCH] misc: mic: fixup return type of wait_for_completion_timeout
  @ 2015-02-09 22:00 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-02-09 22:00 UTC (permalink / raw)
  To: Nicholas Mc Guire
  Cc: sudeep.dutt, Ashutosh Dixit, Siva Yerramreddy,
	Greg Kroah-Hartman, Nikhil Rao, linux-kernel

On Mon, 2015-02-09 at 14:09 -0500, Nicholas Mc Guire wrote:
> return type of wait_for_completion_timeout is unsigned long not int. The
> rc variable is renamed timeout to reflect its use and the type adjusted to
> unsigned long.
> 

Acked-by: Sudeep Dutt <sudeep.dutt@intel.com>

Thanks for the patch!

> Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
> ---
> 
> Patch was only compile tested with x86_64_defconfig + CONFIG_INTEL_MIC_BUS=m
> CONFIG_INTEL_MIC_HOST=m
> 
> Patch is against 3.19.0-rc7 (localversion-next is -next-20150209)
> 
>  drivers/misc/mic/host/mic_boot.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
> index ff2b0fb..d9fa609 100644
> --- a/drivers/misc/mic/host/mic_boot.c
> +++ b/drivers/misc/mic/host/mic_boot.c
> @@ -309,7 +309,7 @@ void mic_complete_resume(struct mic_device *mdev)
>   */
>  void mic_prepare_suspend(struct mic_device *mdev)
>  {
> -	int rc;
> +	unsigned long timeout;
>  
>  #define MIC_SUSPEND_TIMEOUT (60 * HZ)
>  
> @@ -331,10 +331,10 @@ void mic_prepare_suspend(struct mic_device *mdev)
>  		 */
>  		mic_set_state(mdev, MIC_SUSPENDING);
>  		mutex_unlock(&mdev->mic_mutex);
> -		rc = wait_for_completion_timeout(&mdev->reset_wait,
> -						MIC_SUSPEND_TIMEOUT);
> +		timeout = wait_for_completion_timeout(&mdev->reset_wait,
> +						      MIC_SUSPEND_TIMEOUT);
>  		/* Force reset the card if the shutdown completion timed out */
> -		if (!rc) {
> +		if (!timeout) {
>  			mutex_lock(&mdev->mic_mutex);
>  			mic_set_state(mdev, MIC_SUSPENDED);
>  			mutex_unlock(&mdev->mic_mutex);
> @@ -348,10 +348,10 @@ void mic_prepare_suspend(struct mic_device *mdev)
>  		 */
>  		mic_set_state(mdev, MIC_SUSPENDED);
>  		mutex_unlock(&mdev->mic_mutex);
> -		rc = wait_for_completion_timeout(&mdev->reset_wait,
> -						MIC_SUSPEND_TIMEOUT);
> +		timeout = wait_for_completion_timeout(&mdev->reset_wait,
> +						      MIC_SUSPEND_TIMEOUT);
>  		/* Force reset the card if the shutdown completion timed out */
> -		if (!rc)
> +		if (!timeout)
>  			mic_stop(mdev, true);
>  		break;
>  	default:
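
As background for the type change in the quoted patch:
wait_for_completion_timeout() returns 0 on timeout and the remaining
jiffies (at least 1) otherwise, as an unsigned long. With the 60 * HZ
timeout used here the value fits in an int, so the change is about type
correctness rather than an observed failure. The user-space sketch below
shows what a narrowing store can do in general. fake_wait() is a
hypothetical stand-in for the kernel call, and the out-of-range
conversion to int is implementation-defined (GCC and Clang wrap).

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for wait_for_completion_timeout(): simply
 * returns the "remaining jiffies" it is given, 0 meaning timed out.
 */
static unsigned long fake_wait(unsigned long remaining)
{
	return remaining;
}

/* Old pattern: store the result in an int and test its sign. */
static int looks_like_error_with_int(unsigned long remaining)
{
	int rc = (int)fake_wait(remaining);	/* narrowing conversion */

	return rc < 0;	/* a large remaining time can look like -errno */
}

/* Patched pattern: keep the unsigned long and compare against 0. */
static int timed_out(unsigned long remaining)
{
	unsigned long timeout = fake_wait(remaining);

	return timeout == 0;
}
```

Keeping the variable unsigned long and naming it timeout also makes it
clear at the call site that the value is not an error code.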




* [PATCH char-misc-next v3 03/13] misc: mic: SCIF Hardware Bus
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
  2015-01-27  0:32 22% ` [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
  2015-01-27  0:32 58% ` [PATCH char-misc-next v3 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
@ 2015-01-27  0:32 61% ` Sudeep Dutt
  2015-01-27  0:32 78% ` [PATCH char-misc-next v3 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF hardware bus abstracts the low level hardware driver details
like interrupts and mapping remote memory so that the same SCIF driver
can work without any changes with the MIC host or card driver as long
as the hardware bus operations are implemented. The SCIF hardware
device is registered by the host and card drivers on the SCIF hardware
bus resulting in probing the SCIF driver.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig        |  17 ++++
 drivers/misc/mic/Makefile       |   2 +-
 drivers/misc/mic/bus/Makefile   |   1 +
 drivers/misc/mic/bus/scif_bus.h | 129 ++++++++++++++++++++++++
 drivers/misc/mic/bus/scif_bus.c | 210 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 358 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index cc4eef0..afc9f53 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -15,6 +15,23 @@ config INTEL_MIC_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "SCIF Bus Driver"
+
+config SCIF_BUS
+	tristate "SCIF Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the SCIF Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e9bf148..6bc6083 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,4 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
-obj-$(CONFIG_INTEL_MIC_BUS) += bus/
+obj-y += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index d85c7f2..1ed37e2 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -3,3 +3,4 @@
 # Copyright(c) 2014, Intel Corporation.
 #
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
+obj-$(CONFIG_SCIF_BUS) += scif_bus.o
diff --git a/drivers/misc/mic/bus/scif_bus.h b/drivers/misc/mic/bus/scif_bus.h
new file mode 100644
index 0000000..335a228
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.h
@@ -0,0 +1,129 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#ifndef _SCIF_BUS_H_
+#define _SCIF_BUS_H_
+/*
+ * Everything a scif driver needs to work with any particular scif
+ * hardware abstraction layer.
+ */
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+struct scif_hw_dev_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define MIC_SCIF_DEV 1
+#define SCIF_DEV_ANY_ID 0xffffffff
+
+/**
+ * scif_hw_dev - representation of a hardware device abstracted for scif
+ * @hw_ops: the hardware ops supported by this device
+ * @id: the device type identification (used to match it with a driver)
+ * @mmio: MMIO memory window
+ * @aper: Aperture memory window
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ * @snode: The source node for this device.
+ * @dp: Self device page
+ * @rdp: Remote device page
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ */
+struct scif_hw_dev {
+	struct scif_hw_ops *hw_ops;
+	struct scif_hw_dev_id id;
+	struct mic_mw *mmio;
+	struct mic_mw *aper;
+	struct device dev;
+	u8 dnode;
+	u8 snode;
+	void *dp;
+	void __iomem *rdp;
+	struct dma_chan **dma_ch;
+	int num_dma_ch;
+};
+
+/**
+ * scif_driver - operations for a scif I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_driver {
+	struct device_driver driver;
+	const struct scif_hw_dev_id *id_table;
+	int (*probe)(struct scif_hw_dev *dev);
+	void (*remove)(struct scif_hw_dev *dev);
+};
+
+/**
+ * scif_hw_ops - Hardware operations for accessing a SCIF device on the SCIF bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @send_intr: Send an interrupt to the remote node on a specified doorbell.
+ * @send_p2p_intr: Send an interrupt to the peer node on a specified doorbell
+ * which is specifically targeted for a peer to peer node.
+ * @ioremap: Map a buffer with the specified physical address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ */
+struct scif_hw_ops {
+	int (*next_db)(struct scif_hw_dev *sdev);
+	struct mic_irq * (*request_irq)(struct scif_hw_dev *sdev,
+					irqreturn_t (*func)(int irq,
+							    void *data),
+					const char *name, void *data,
+					int db);
+	void (*free_irq)(struct scif_hw_dev *sdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct scif_hw_dev *sdev, int num);
+	void (*send_intr)(struct scif_hw_dev *sdev, int db);
+	void (*send_p2p_intr)(struct scif_hw_dev *sdev, int db,
+			      struct mic_mw *mw);
+	void __iomem * (*ioremap)(struct scif_hw_dev *sdev,
+				  phys_addr_t pa, size_t len);
+	void (*iounmap)(struct scif_hw_dev *sdev, void __iomem *va);
+};
+
+int scif_register_driver(struct scif_driver *driver);
+void scif_unregister_driver(struct scif_driver *driver);
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id,
+		     struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper,
+		     void *dp, void __iomem *rdp,
+		     struct dma_chan **chan, int num_chan);
+void scif_unregister_device(struct scif_hw_dev *sdev);
+
+static inline struct scif_hw_dev *dev_to_scif(struct device *dev)
+{
+	return container_of(dev, struct scif_hw_dev, dev);
+}
+
+static inline struct scif_driver *drv_to_scif(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_driver, driver);
+}
+#endif /* _SCIF_BUS_H_ */
diff --git a/drivers/misc/mic/bus/scif_bus.c b/drivers/misc/mic/bus/scif_bus.c
new file mode 100644
index 0000000..2da7cee
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.c
@@ -0,0 +1,210 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "scif_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "scif:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *scif_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(scif_dev);
+
+static inline int scif_id_match(const struct scif_hw_dev *dev,
+				const struct scif_hw_dev_id *id)
+{
+	if (id->device != dev->id.device && id->device != SCIF_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == SCIF_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call scif_dev_probe().
+ */
+static int scif_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+	const struct scif_hw_dev_id *ids;
+
+	ids = drv_to_scif(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (scif_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int scif_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+
+	return add_uevent_var(env, "MODALIAS=scif:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int scif_dev_probe(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_dev_remove(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_bus = {
+	.name  = "scif_bus",
+	.match = scif_dev_match,
+	.dev_groups = scif_dev_groups,
+	.uevent = scif_uevent,
+	.probe = scif_dev_probe,
+	.remove = scif_dev_remove,
+};
+
+int scif_register_driver(struct scif_driver *driver)
+{
+	driver->driver.bus = &scif_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_register_driver);
+
+void scif_unregister_driver(struct scif_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_driver);
+
+static void scif_release_dev(struct device *d)
+{
+	struct scif_hw_dev *sdev = dev_to_scif(d);
+
+	kfree(sdev);
+}
+
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper, void *dp,
+		     void __iomem *rdp, struct dma_chan **chan, int num_chan)
+{
+	int ret;
+	struct scif_hw_dev *sdev;
+
+	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
+	if (!sdev)
+		return ERR_PTR(-ENOMEM);
+
+	sdev->dev.parent = pdev;
+	sdev->id.device = id;
+	sdev->id.vendor = SCIF_DEV_ANY_ID;
+	sdev->dev.archdata.dma_ops = dma_ops;
+	sdev->dev.release = scif_release_dev;
+	sdev->hw_ops = hw_ops;
+	sdev->dnode = dnode;
+	sdev->snode = snode;
+	dev_set_drvdata(&sdev->dev, sdev);
+	sdev->dev.bus = &scif_bus;
+	sdev->mmio = mmio;
+	sdev->aper = aper;
+	sdev->dp = dp;
+	sdev->rdp = rdp;
+	sdev->dev.dma_mask = &sdev->dev.coherent_dma_mask;
+	dma_set_mask(&sdev->dev, DMA_BIT_MASK(64));
+	sdev->dma_ch = chan;
+	sdev->num_dma_ch = num_chan;
+	dev_set_name(&sdev->dev, "scif-dev%u", sdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&sdev->dev);
+	if (ret)
+		goto free_sdev;
+	return sdev;
+free_sdev:
+	put_device(&sdev->dev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(scif_register_device);
+
+void scif_unregister_device(struct scif_hw_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_device);
+
+static int __init scif_init(void)
+{
+	return bus_register(&scif_bus);
+}
+
+static void __exit scif_exit(void)
+{
+	bus_unregister(&scif_bus);
+}
+
+core_initcall(scif_init);
+module_exit(scif_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1
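
Note on the matching logic in scif_id_match()/scif_dev_match() above: a
minimal user-space sketch of the same wildcard rule, not part of the patch.
The struct and function names are hypothetical stand-ins, and the value used
for the wildcard (ANY_ID, standing in for SCIF_DEV_ANY_ID) is an assumption,
since scif_bus.h is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wildcard value; stands in for SCIF_DEV_ANY_ID. */
#define ANY_ID 0xffffffffu

/* Hypothetical stand-in for struct scif_hw_dev_id. */
struct hw_dev_id {
	uint32_t device;
	uint32_t vendor;
};

/* Same rule as scif_id_match(): each field matches exactly or via wildcard. */
int id_match(const struct hw_dev_id *dev, const struct hw_dev_id *id)
{
	if (id->device != dev->device && id->device != ANY_ID)
		return 0;
	return id->vendor == ANY_ID || id->vendor == dev->vendor;
}

/* Same walk as scif_dev_match(): the table ends at an entry with device 0. */
int table_match(const struct hw_dev_id *dev, const struct hw_dev_id *ids)
{
	unsigned int i;

	for (i = 0; ids[i].device; i++)
		if (id_match(dev, &ids[i]))
			return 1;
	return 0;
}

/* Small usage example. */
void id_match_demo(void)
{
	struct hw_dev_id dev = { 1, ANY_ID };
	struct hw_dev_id ids[] = { { 2, ANY_ID }, { 1, ANY_ID }, { 0, 0 } };

	assert(table_match(&dev, ids) == 1);
}
```

Under this rule the wildcard is honoured only on the driver-table side. Since
scif_register_device() sets the device's vendor to SCIF_DEV_ANY_ID, a table
entry naming a specific vendor would presumably not match such a device.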


^ permalink raw reply related	[relevance 61%]

* [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
@ 2015-01-27  0:32 22% ` Sudeep Dutt
    2015-01-27  0:32 58% ` [PATCH char-misc-next v3 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

This patch introduces the SCIF documentation in the header file
and describes the IOCTL interface for user mode. mic_overview.txt
is updated with documentation on SCIF and a new document
describing SCIF in more detail is available in scif_overview.txt.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
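
Note on the scif_send() semantics documented in include/linux/scif.h in this
patch: a non-blocking call sends only the bytes that can go out without
waiting, so a caller that needs the whole message delivered has to loop. The
sketch below illustrates that loop in plain C. It is not part of the patch:
send_fn, send_all(), mock_chan and mock_send() are hypothetical names, and the
mock (which accepts at most 3 bytes per call) only stands in for a real
endpoint.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback standing in for a non-blocking scif_send(). */
typedef int (*send_fn)(void *ctx, const void *msg, int len);

/*
 * Retry until len bytes are sent, an error occurs, or max_tries is used up.
 * Returns the number of bytes sent, or the negative error from send().
 */
int send_all(send_fn send, void *ctx, const void *msg, int len, int max_tries)
{
	const char *p = msg;
	int sent = 0;

	while (sent < len && max_tries-- > 0) {
		int n = send(ctx, p + sent, len - sent);

		if (n < 0)
			return n;
		sent += n;
	}
	return sent;
}

/* Mock channel: accepts at most 3 bytes per call, like a nearly full queue. */
struct mock_chan {
	char buf[64];
	int used;
};

int mock_send(void *ctx, const void *msg, int len)
{
	struct mock_chan *c = ctx;
	int n = len < 3 ? len : 3;

	if (c->used + n > (int)sizeof(c->buf))
		n = (int)sizeof(c->buf) - c->used;
	memcpy(c->buf + c->used, msg, n);
	c->used += n;
	return n;
}

/* Small usage example. */
void send_all_demo(void)
{
	struct mock_chan c = { { 0 }, 0 };

	assert(send_all(mock_send, &c, "hello SCIF", 10, 16) == 10);
	assert(memcmp(c.buf, "hello SCIF", 10) == 0);
}
```

A real caller would wait for the send queue to drain between attempts (poll()
in user mode, scif_poll() in kernel mode) rather than retry immediately, or
pass SCIF_SEND_BLOCK and let scif_send() do the waiting.
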
 Documentation/mic/mic_overview.txt  |  28 +-
 Documentation/mic/scif_overview.txt |  98 ++++
 include/uapi/linux/Kbuild           |   1 +
 include/linux/scif.h                | 993 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/scif_ioctl.h     | 215 ++++++++
 5 files changed, 1323 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h
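
Note on the scif_register() argument rules documented in include/linux/scif.h
in this patch: the EINVAL conditions can be summarised as a small checker.
This is a sketch under stated assumptions, not the driver's implementation:
check_register_args() is a hypothetical helper, the 4096-byte page size is
assumed, and treating prot_flags == 0 as invalid is one reading of "one or
more of the following values". The flag values are copied from the header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed page size */

/* Values copied from include/linux/scif.h in this patch. */
enum { SCIF_PROT_READ = (1 << 0), SCIF_PROT_WRITE = (1 << 1) };
enum { SCIF_MAP_FIXED = 0x10, SCIF_MAP_KERNEL = 0x20 };

/* Hypothetical helper: returns 0 if the arguments pass, -EINVAL otherwise. */
int check_register_args(uintptr_t addr, size_t len, long long offset,
			int prot_flags, int map_flags)
{
	/* prot_flags: at least one of READ/WRITE, and nothing else. */
	if (!prot_flags || (prot_flags & ~(SCIF_PROT_READ | SCIF_PROT_WRITE)))
		return -EINVAL;
	if (map_flags & ~(SCIF_MAP_FIXED | SCIF_MAP_KERNEL))
		return -EINVAL;
	/* addr and len must be page multiples, and len must be non-zero. */
	if (!len || len % PAGE_SZ || addr % PAGE_SZ)
		return -EINVAL;
	if (offset < 0)
		return -EINVAL;
	/* offset must be page aligned only when it is taken literally. */
	if ((map_flags & SCIF_MAP_FIXED) && offset % PAGE_SZ)
		return -EINVAL;
	return 0;
}

/* Small usage example. */
void check_register_args_demo(void)
{
	assert(check_register_args(0x10000, 8192, 0, SCIF_PROT_READ, 0) == 0);
	assert(check_register_args(0x10000, 8192, 100, SCIF_PROT_READ,
				   SCIF_MAP_FIXED) == -EINVAL);
}
```

Without SCIF_MAP_FIXED the offset is only a placement hint, which is why the
sketch accepts an unaligned offset in that case.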

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 77c5418..1a2f2c8 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
 created on it by the host/card drivers. On host the channels are private
 and used only by the host driver to transfer data for the virtio devices.
 
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
+low level communications API across PCIe currently implemented for MIC.
+More details are available at scif_overview.txt.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
@@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
                       |               |       | Virtio over PCIe IOCTLs  |
                       |               |       +--------------------------+
 +-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |               |                   |  | MIC DMA   |
-| Driver    |         |               |                   |  | Driver    |
-+-----------+         |               |                   |  +-----------+
-      |               |               |                   |        |
-+---------------+     |               |                   |  +----------------+
-|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
-+---------------+     |               |                   |  +----------------+
-      |               |               |                   |              |
-      |   +--------------+            |            +---------------+     |
-      |   |Intel MIC     |            |            |Intel MIC      |     |
-      +---|Card Driver   |            |            |Host Driver    |     |
-          +--------------+            |            +---------------+-----+
+| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
+| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
++-----------+         |  +----------+ | +-----------+     |  +-----------+
+      |               |       |       |       |           |        |
++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
+|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW BUS|     | |MIC virtual Bus|
++---------------+     | +-----------+ | +-----+-----+     | +---------------+
+      |               |       |       |       |           |              |
+      |   +--------------+    |       |       |    +---------------+     |
+      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
+      +---|Card Driver   +----+       |       |    |Host Driver    |     |
+          +--------------+            |       +----+---------------+-----+
                       |               |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
new file mode 100644
index 0000000..0a280d9
--- /dev/null
+++ b/Documentation/mic/scif_overview.txt
@@ -0,0 +1,98 @@
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
+level communications API across PCIe currently implemented for MIC. Currently
+SCIF provides inter-node communication within a single host platform, where a
+node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
+communicating over the PCIe bus while providing an API that is symmetric
+across all the nodes in the PCIe network. An important design objective for SCIF
+is to deliver the maximum possible performance given the communication
+abilities of the hardware. SCIF has been used to implement an offload compiler
+runtime and OFED support for MPI implementations for MIC coprocessors.
+
+==== SCIF API Components ====
+The SCIF API has the following parts:
+1. Connection establishment using a client server model
+2. Byte stream messaging intended for short messages
+3. Node enumeration to determine online nodes
+4. Poll semantics for detection of incoming connections and messages
+5. Memory registration to pin down pages
+6. Remote memory mapping for low latency CPU accesses via mmap
+7. Remote DMA (RDMA) for high bandwidth DMA transfers
+8. Fence APIs for RDMA synchronization
+
+SCIF exposes the notion of a connection which can be used by peer processes on
+nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
+process in a SCIF node initiates a SCIF connection to a peer process on a
+different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
+which are similar to connection oriented socket APIs. Connected SCIF endpoints
+can also register local memory which is followed by data transfer using either
+DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
+kernel mode clients which are functionally equivalent.
+
+==== SCIF Performance for MIC ====
+A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
+SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
+
+             Comparison of TCP and SCIF based BW
+
+  Throughput (GB/sec)
+    8 +                                             PCIe Bandwidth ******
+      +                                                        TCP ######
+    7 +    **************************************             SCIF %%%%%%
+      |                       %%%%%%%%%%%%%%%%%%%
+    6 +                   %%%%
+      |                 %%
+      |               %%%
+    5 +              %%
+      |            %%
+    4 +           %%
+      |          %%
+    3 +         %%
+      |        %
+    2 +      %%
+      |     %%
+      |    %
+    1 +
+      +    ######################################
+    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
+      1       10     100      1000   10000   100000
+                   Transfer Size (KBytes)
+
+SCIF allows memory sharing via mmap(..) between processes on different PCIe
+nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
+latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
+
+SCIF has a user space library which is a thin IOCTL wrapper providing a user
+space API similar to the kernel API in scif.h. The SCIF user space library
+is distributed @ https://software.intel.com/en-us/mic-developer
+
+Here is some pseudo code for an example of how two applications on two PCIe
+nodes would typically use the SCIF API:
+
+Process A (on node A)			Process B (on node B)
+
+/* get online node information */
+scif_get_node_ids(..)			scif_get_node_ids(..)
+scif_open(..)				scif_open(..)
+scif_bind(..)				scif_bind(..)
+scif_listen(..)
+scif_accept(..)				scif_connect(..)
+/* SCIF connection established */
+
+/* Send and receive short messages */
+scif_send(..)/scif_recv(..)		scif_send(..)/scif_recv(..)
+
+/* Register memory */
+scif_register(..)			scif_register(..)
+
+/* RDMA */
+scif_readfrom(..)/scif_writeto(..)	scif_readfrom(..)/scif_writeto(..)
+
+/* Fence DMAs */
+scif_fence_signal(..)			scif_fence_signal(..)
+
+mmap(..)				mmap(..)
+
+/* Access remote registered memory */
+
+/* Close the endpoints */
+scif_close(..)				scif_close(..)
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 00b10002..5083817 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -349,6 +349,7 @@ header-y += rtc.h
 header-y += rtnetlink.h
 header-y += scc.h
 header-y += sched.h
+header-y += scif_ioctl.h
 header-y += screen_info.h
 header-y += sctp.h
 header-y += sdla.h
diff --git a/include/linux/scif.h b/include/linux/scif.h
new file mode 100644
index 0000000..44f4f38
--- /dev/null
+++ b/include/linux/scif.h
@@ -0,0 +1,993 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef __SCIF_H__
+#define __SCIF_H__
+
+#include <linux/types.h>
+#include <linux/poll.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_ACCEPT_SYNC	1
+#define SCIF_SEND_BLOCK		1
+#define SCIF_RECV_BLOCK		1
+
+enum {
+	SCIF_PROT_READ = (1 << 0),
+	SCIF_PROT_WRITE = (1 << 1)
+};
+
+enum {
+	SCIF_MAP_FIXED = 0x10,
+	SCIF_MAP_KERNEL	= 0x20,
+};
+
+enum {
+	SCIF_FENCE_INIT_SELF = (1 << 0),
+	SCIF_FENCE_INIT_PEER = (1 << 1),
+	SCIF_SIGNAL_LOCAL = (1 << 4),
+	SCIF_SIGNAL_REMOTE = (1 << 5)
+};
+
+enum {
+	SCIF_RMA_USECPU = (1 << 0),
+	SCIF_RMA_USECACHE = (1 << 1),
+	SCIF_RMA_SYNC = (1 << 2),
+	SCIF_RMA_ORDERED = (1 << 3)
+};
+
+/* End of SCIF Admin Reserved Ports */
+#define SCIF_ADMIN_PORT_END	1024
+
+/* End of SCIF Reserved Ports */
+#define SCIF_PORT_RSVD		1088
+
+typedef struct scif_endpt *scif_epd_t;
+
+#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
+#define SCIF_REGISTER_FAILED ((off_t)-1)
+#define SCIF_MMAP_FAILED ((void *)-1)
+
+/**
+ * scif_open() - Create an endpoint
+ *
+ * Return:
+ * Upon successful completion, scif_open() returns an endpoint descriptor to
+ * be used in subsequent SCIF function calls to refer to that endpoint;
+ * otherwise in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
+ * returned and errno is set to indicate the error; in kernel mode a NULL
+ * scif_epd_t is returned.
+ *
+ * Errors:
+ * ENOMEM - Insufficient kernel memory was available
+ */
+scif_epd_t scif_open(void);
+
+/**
+ * scif_bind() - Bind an endpoint to a port
+ * @epd:	endpoint descriptor
+ * @pn:		port number
+ *
+ * scif_bind() binds endpoint epd to port pn, where pn is a port number on the
+ * local node. If pn is zero, a port number greater than or equal to
+ * SCIF_PORT_RSVD is assigned and returned. Each endpoint may be bound to
+ * exactly one local port. Ports less than 1024 when requested can only be bound
+ * by system (or root) processes or by processes executed by privileged users.
+ *
+ * Return:
+ * Upon successful completion, scif_bind() returns the port number to which epd
+ * is bound; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINVAL - the endpoint or the port is already bound
+ * EISCONN - The endpoint is already connected
+ * ENOSPC - No port number available for assignment
+ * EACCES - The port requested is protected and the user is not the superuser
+ */
+int scif_bind(scif_epd_t epd, u16 pn);
+
+/**
+ * scif_listen() - Listen for connections on an endpoint
+ * @epd:	endpoint descriptor
+ * @backlog:	maximum pending connection requests
+ *
+ * scif_listen() marks the endpoint epd as a listening endpoint - that is, as
+ * an endpoint that will be used to accept incoming connection requests. Once
+ * so marked, the endpoint is said to be in the listening state and may not be
+ * used as the endpoint of a connection.
+ *
+ * The endpoint, epd, must have been bound to a port.
+ *
+ * The backlog argument defines the maximum length to which the queue of
+ * pending connections for epd may grow. If a connection request arrives when
+ * the queue is full, the client may receive an error with an indication that
+ * the connection was refused.
+ *
+ * Return:
+ * Upon successful completion, scif_listen() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINVAL - the endpoint is not bound to a port
+ * EISCONN - The endpoint is already connected or listening
+ */
+int scif_listen(scif_epd_t epd, int backlog);
+
+/**
+ * scif_connect() - Initiate a connection on a port
+ * @epd:	endpoint descriptor
+ * @dst:	global id of port to which to connect
+ *
+ * The scif_connect() function requests the connection of endpoint epd to remote
+ * port dst. If the connection is successful, a peer endpoint, bound to dst, is
+ * created on node dst.node. On successful return, the connection is complete.
+ *
+ * If the endpoint epd has not already been bound to a port, scif_connect()
+ * will bind it to an unused local port.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * In user space, scif_connect() supports an asynchronous connection mode
+ * if the application has set the O_NONBLOCK flag on the endpoint via the
+ * fcntl() system call. When this flag is set, the calling process does not
+ * wait during scif_connect().
+ *
+ * Return:
+ * Upon successful completion, scif_connect() returns the port ID to which the
+ * endpoint, epd, is bound; otherwise in user mode -1 is returned and errno is
+ * set to indicate the error; in kernel mode the negative of one of the
+ * following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNREFUSED - The destination was not listening for connections or refused
+ * the connection request
+ * EINVAL - dst.port is not a valid port ID
+ * EISCONN - The endpoint is already connected
+ * ENOMEM - No buffer space is available
+ * ENODEV - The destination node does not exist, or the node is lost, or it
+ * existed but is no longer in the network (it may have crashed)
+ * ENOSPC - No port number available for assignment
+ * EOPNOTSUPP - The endpoint is listening and cannot be connected
+ */
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst);
+
+/**
+ * scif_accept() - Accept a connection on an endpoint
+ * @epd:	endpoint descriptor
+ * @peer:	global id of port to which connected
+ * @newepd:	new connected endpoint descriptor
+ * @flags:	flags
+ *
+ * The scif_accept() call extracts the first connection request from the queue
+ * of pending connections for the port on which epd is listening. scif_accept()
+ * creates a new endpoint, bound to the same port as epd, and allocates a new
+ * SCIF endpoint descriptor, returned in newepd, for the endpoint. The new
+ * endpoint is connected to the endpoint through which the connection was
+ * requested. epd is unaffected by this call, and remains in the listening
+ * state.
+ *
+ * On successful return, peer holds the global port identifier (node id and
+ * local port number) of the port which requested the connection.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * The number of connections that can (subsequently) be accepted on epd is only
+ * limited by system resources (memory).
+ *
+ * The flags argument is formed by OR'ing together zero or more of the
+ * following values.
+ * SCIF_ACCEPT_SYNC - block until a connection request is presented. If
+ *			SCIF_ACCEPT_SYNC is not in flags, and no pending
+ *			connections are present on the queue, scif_accept()
+ *			fails with an EAGAIN error
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when there is a connection request. In kernel mode, the scif_poll()
+ * function may be used for this purpose. A readable event will be delivered
+ * when a connection is requested.
+ *
+ * Return:
+ * Upon successful completion, scif_accept() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ *	negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EAGAIN - SCIF_ACCEPT_SYNC is not set and no connections are present to be
+ * accepted or SCIF_ACCEPT_SYNC is not set and remote node failed to complete
+ * its connection request
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * EINTR - Interrupted function
+ * EINVAL - epd is not a listening endpoint, or flags is invalid, or peer is
+ * NULL, or newepd is NULL
+ * ENODEV - The requesting node is lost, or it existed but is no longer in the
+ * network (it may have crashed)
+ * ENOMEM - Not enough space
+ * ENOENT - Secondary part of epd registration failed
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer, scif_epd_t
+		*newepd, int flags);
+
+/**
+ * scif_close() - Close an endpoint
+ * @epd:	endpoint descriptor
+ *
+ * scif_close() closes an endpoint and performs necessary teardown of
+ * facilities associated with that endpoint.
+ *
+ * If epd is a listening endpoint then it will no longer accept connection
+ * requests on the port to which it is bound. Any pending connection requests
+ * are rejected.
+ *
+ * If epd is a connected endpoint, then its peer endpoint is also closed. RMAs
+ * which are in-process through epd or its peer endpoint will complete before
+ * scif_close() returns. Registered windows of the local and peer endpoints are
+ * released as if scif_unregister() was called against each window.
+ *
+ * Closing a SCIF endpoint does not affect local registered memory mapped by
+ * a SCIF endpoint on a remote node. The local memory remains mapped by the peer
+ * SCIF endpoint until explicitly removed by the peer calling munmap(..).
+ *
+ * If the peer endpoint's receive queue is not empty at the time that epd is
+ * closed, then the peer endpoint can be passed as the endpoint parameter to
+ * scif_recv() until the receive queue is empty.
+ *
+ * epd is freed and may no longer be accessed.
+ *
+ * Return:
+ * Upon successful completion, scif_close() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ */
+int scif_close(scif_epd_t epd);
+
+/**
+ * scif_send() - Send a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	blocking mode flags
+ *
+ * scif_send() sends data to the peer of endpoint epd. Up to len bytes of data
+ * are copied from memory starting at address msg. On successful execution the
+ * return value of scif_send() is the number of bytes that were sent, and is
+ * zero if no bytes were sent because len was zero. scif_send() may be called
+ * only when the endpoint is in a connected state.
+ *
+ * If a scif_send() call is non-blocking, then it sends only those bytes which
+ * can be sent without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_send() call is blocking, then it normally returns after sending
+ * all len bytes. If a blocking call is interrupted or the connection is
+ * reset, the call is considered successful if some bytes were sent or len is
+ * zero, otherwise the call is considered unsuccessful.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when the send queue is not full. In kernel mode, the scif_poll() function
+ * may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_send() will block until the entire message is sent if SCIF_SEND_BLOCK
+ * is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_send() returns the number of bytes sent;
+ * otherwise in user mode -1 is returned and errno is set to indicate the
+ * error; in kernel mode the negative of one of the following errors is
+ * returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - An invalid address was specified for a parameter
+ * EINVAL - flags is invalid, or len is negative
+ * ENODEV - The remote node is lost, or it existed but is no longer in the
+ * network (it may have crashed)
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_recv() - Receive a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message buffer length
+ * @flags:	blocking mode flags
+ *
+ * scif_recv() receives data from the peer of endpoint epd. Up to len bytes of
+ * data are copied to memory starting at address msg. On successful execution
+ * the return value of scif_recv() is the number of bytes that were received,
+ * and is zero if no bytes were received because len was zero. scif_recv() may
+ * be called only when the endpoint is in a connected state.
+ *
+ * If a scif_recv() call is non-blocking, then it receives only those bytes
+ * which can be received without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_recv() call is blocking, then it normally returns after receiving
+ * all len bytes. If the blocking call was interrupted due to a disconnection,
+ * subsequent calls to scif_recv() will copy all bytes received up to the point
+ * of disconnection.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when data is available to be received. In kernel mode, the scif_poll()
+ * function may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_recv() will block until the entire message is received if
+ * SCIF_RECV_BLOCK is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_recv() returns the number of bytes
+ * received; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EAGAIN - The destination node is returning from a low power state
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - An invalid address was specified for a parameter
+ * EINVAL - flags is invalid, or len is negative
+ * ENODEV - The remote node is lost, or it existed but is no longer in the
+ * network (it may have crashed)
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_register() - Mark a memory region for remote access.
+ * @epd:		endpoint descriptor
+ * @addr:		starting virtual address
+ * @len:		length of range
+ * @offset:		offset of window
+ * @prot_flags:		read/write protection flags
+ * @map_flags:		mapping flags
+ *
+ * The scif_register() function opens a window, a range of whole pages of the
+ * registered address space of the endpoint epd, starting at offset po and
+ * continuing for len bytes. The value of po, further described below, is a
+ * function of the parameters offset and len, and the value of map_flags. Each
+ * page of the window represents the physical memory page which backs the
+ * corresponding page of the range of virtual address pages starting at addr
+ * and continuing for len bytes. addr and len are constrained to be multiples
+ * of the page size. A successful scif_register() call returns po.
+ *
+ * When SCIF_MAP_FIXED is set in the map_flags argument, po will be offset
+ * exactly, and offset is constrained to be a multiple of the page size. The
+ * mapping established by scif_register() will not replace any existing
+ * registration; an error is returned if any page within the range [offset,
+ * offset + len - 1] intersects an existing window.
+ *
+ * When SCIF_MAP_FIXED is not set, the implementation uses offset in an
+ * implementation-defined manner to arrive at po. The po value so chosen will
+ * be an area of the registered address space that the implementation deems
+ * suitable for a mapping of len bytes. An offset value of 0 is interpreted as
+ * granting the implementation complete freedom in selecting po, subject to
+ * constraints described below. A non-zero value of offset is taken to be a
+ * suggestion of an offset near which the mapping should be placed. When the
+ * implementation selects a value for po, it does not replace any extant
+ * window. In all cases, po will be a multiple of the page size.
+ *
+ * The physical pages which are so represented by a window are available for
+ * access in calls to mmap(), scif_readfrom(), scif_writeto(),
+ * scif_vreadfrom(), and scif_vwriteto(). While a window is registered, the
+ * physical pages represented by the window will not be reused by the memory
+ * subsystem for any other purpose. Note that the same physical page may be
+ * represented by multiple windows.
+ *
+ * Subsequent operations which change the memory pages to which virtual
+ * addresses are mapped (such as mmap(), munmap()) have no effect on
+ * existing windows.
+ *
+ * If the process will fork(), it is recommended that the registered
+ * virtual address range be marked with MADV_DONTFORK. Doing so will prevent
+ * problems due to copy-on-write semantics.
+ *
+ * The prot_flags argument is formed by OR'ing together one or more of the
+ * following values.
+ * SCIF_PROT_READ - allow read operations from the window
+ * SCIF_PROT_WRITE - allow write operations to the window
+ *
+ * The map_flags argument can be set to SCIF_MAP_FIXED which interprets a
+ * fixed offset.
+ *
+ * Return:
+ * Upon successful completion, scif_register() returns the offset at which the
+ * mapping was placed (po); otherwise in user mode SCIF_REGISTER_FAILED (that
+ * is ((off_t)-1)) is returned and errno is set to indicate the error; in
+ * kernel mode the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EADDRINUSE - SCIF_MAP_FIXED is set in map_flags, and pages in the range
+ * [offset, offset + len - 1] are already registered
+ * EAGAIN - The mapping could not be performed due to lack of resources
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - map_flags is invalid, or prot_flags is invalid, or SCIF_MAP_FIXED is
+ * set in map_flags and offset is not a multiple of the page size, or addr is
+ * not a multiple of the page size, or len is not a multiple of the page size
+ * or is 0, or offset is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOMEM - Not enough space
+ * ENOTCONN - The endpoint is not connected
+ */
+off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
+		    int prot_flags, int map_flags);
+
+/**
+ * scif_unregister() - Unregister a memory region from remote access
+ * @epd:	endpoint descriptor
+ * @offset:	start of range to unregister
+ * @len:	length of range to unregister
+ *
+ * The scif_unregister() function closes those previously registered windows
+ * which are entirely within the range [offset, offset + len - 1]. It is an
+ * error to specify a range which intersects only a subrange of a window.
+ *
+ * On a successful return, pages within the window may no longer be specified
+ * in calls to mmap(), scif_readfrom(), scif_writeto(), scif_vreadfrom(),
+ * scif_vwriteto(), scif_get_pages(), and scif_fence_signal(). The window,
+ * however, continues to exist until all previous references against it are
+ * removed. A window is referenced if there is a mapping to it created by
+ * mmap(), or if scif_get_pages() was called against the window
+ * (and the pages have not been returned via scif_put_pages()). A window is
+ * also referenced while an RMA, in which some range of the window is a source
+ * or destination, is in progress. Finally, a window is referenced while some
+ * offset in that window was specified to scif_fence_signal(), and the RMAs
+ * marked by that call to scif_fence_signal() have not completed. While a
+ * window is in this state, its registered address space pages are not
+ * available for use in a new registered window.
+ *
+ * When all such references to the window have been removed, its references to
+ * all the physical pages which it represents are removed. Similarly, the
+ * registered address space pages of the window become available for
+ * registration in a new window.
+ *
+ * Return:
+ * Upon successful completion, scif_unregister() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned. In the event of an
+ * error, no windows are unregistered.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - the range [offset, offset + len - 1] intersects a subrange of a
+ * window, or offset is negative
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [offset, offset + len - 1] are invalid for the
+ * registered address space of epd
+ */
+int scif_unregister(scif_epd_t epd, off_t offset, size_t len);
+
+/**
+ * scif_readfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space to
+ *		which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_readfrom() copies len bytes from the remote registered address space of
+ * the peer of endpoint epd, starting at the offset roffset, to the local
+ * registered address space of epd, starting at the offset loffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_readfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if loffset and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_readfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
+ * address space of epd, or the range [roffset, roffset + len - 1] is invalid
+ * for the registered address space of the peer of epd, or loffset or roffset
+ * is negative
+ */
+int scif_readfrom(scif_epd_t epd, off_t loffset, size_t len,
+		  off_t roffset, int rma_flags);
+
+/**
+ * scif_writeto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space
+ *		from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_writeto() copies len bytes from the local registered address space of
+ * epd, starting at the offset loffset to the remote registered address space
+ * of the peer of endpoint epd, starting at the offset roffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_writeto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if loffset and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *			engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_writeto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - The range [loffset, loffset + len - 1] is invalid for the registered
+ * address space of epd, or the range [roffset, roffset + len - 1] is invalid
+ * for the registered address space of the peer of epd, or loffset or roffset
+ * is negative
+ */
+int scif_writeto(scif_epd_t epd, off_t loffset, size_t len,
+		 off_t roffset, int rma_flags);
+
+/**
+ * scif_vreadfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address to which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vreadfrom() copies len bytes from the remote registered address
+ * space of the peer of endpoint epd, starting at the offset roffset, to local
+ * memory, starting at addr.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may
+ * intersect multiple registered windows, but only if those windows are
+ * contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vreadfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if addr and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - enable registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *	the source range becomes visible on the destination node
+ *	after all other transferred data in the source range has
+ *	become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vreadfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ * registered address space of the peer of epd
+ */
+int scif_vreadfrom(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		   int rma_flags);
+
+/**
+ * scif_vwriteto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vwriteto() copies len bytes from the local memory, starting at addr, to
+ * the remote registered address space of the peer of endpoint epd, starting at
+ * the offset roffset.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may intersect
+ * multiple registered windows, but only if those windows are contiguous in the
+ * registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vwriteto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if addr and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - allow registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vwriteto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES - Attempt to write to a read-only range
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EFAULT - Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL - rma_flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ * registered address space of the peer of epd
+ */
+int scif_vwriteto(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		  int rma_flags);
+
+/**
+ * scif_fence_mark() - Mark previously issued RMAs
+ * @epd:	endpoint descriptor
+ * @flags:	control flags
+ * @mark:	marked value returned as output.
+ *
+ * scif_fence_mark() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd. The RMAs are
+ * marked with a value returned at mark. The application may subsequently call
+ * scif_fence_wait(), passing the value returned at mark, to await completion
+ * of all RMAs so marked.
+ *
+ * The flags argument has exactly one of the following values.
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ *
+ * Return:
+ * Upon successful completion, scif_fence_mark() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - flags is invalid
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENOMEM - Insufficient kernel memory was available
+ */
+int scif_fence_mark(scif_epd_t epd, int flags, int *mark);
+
+/**
+ * scif_fence_wait() - Wait for completion of marked RMAs
+ * @epd:	endpoint descriptor
+ * @mark:	mark request
+ *
+ * scif_fence_wait() returns after all RMAs marked with mark have completed.
+ * The value passed in mark must have been obtained in a previous call to
+ * scif_fence_mark().
+ *
+ * Return:
+ * Upon successful completion, scif_fence_wait() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENOMEM - Insufficient kernel memory was available
+ */
+int scif_fence_wait(scif_epd_t epd, int mark);
+
+/**
+ * scif_fence_signal() - Request a memory update on completion of RMAs
+ * @epd:	endpoint descriptor
+ * @loff:	local offset
+ * @lval:	local value to write to loff
+ * @roff:	remote offset
+ * @rval:	remote value to write to roff
+ * @flags:	flags
+ *
+ * scif_fence_signal() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or marking the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd.
+ *
+ * If flags includes SCIF_SIGNAL_LOCAL, then on completion of the RMAs in the
+ * marked set, lval is written to memory at the address corresponding to offset
+ * loff in the local registered address space of epd. loff must be within a
+ * registered window. If flags includes SCIF_SIGNAL_REMOTE, then on completion
+ * of the RMAs in the marked set, rval is written to memory at the address
+ * corresponding to offset roff in the remote registered address space of the
+ * peer of epd. roff must be within a remote registered window of the peer of
+ * epd. Note that any specified offset must be DWORD (4 byte / 32 bit) aligned.
+ *
+ * The flags argument is formed by OR'ing together exactly one of the
+ * following values:
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ * and one or more of the following values:
+ * SCIF_SIGNAL_LOCAL - On completion of the marked set of RMAs, write lval to
+ *	memory at the address corresponding to offset loff in the local
+ *	registered address space of epd.
+ * SCIF_SIGNAL_REMOTE - On completion of the marked set of RMAs, write rval to
+ *	memory at the address corresponding to offset roff in the remote
+ *	registered address space of the peer of epd.
+ *
+ * Return:
+ * Upon successful completion, scif_fence_signal() returns 0; otherwise in
+ * user mode -1 is returned and errno is set to indicate the error; in kernel
+ * mode the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY - epd is not a valid endpoint descriptor
+ * ECONNRESET - Connection reset by peer
+ * EINVAL - flags is invalid, or loff or roff are not DWORD aligned
+ * ENODEV - The remote node is lost, or existed but is not currently in the
+ * network since it may have crashed
+ * ENOTCONN - The endpoint is not connected
+ * ENXIO - loff is invalid for the registered address space of epd, or roff is
+ * invalid for the registered address space of the peer of epd
+ */
+int scif_fence_signal(scif_epd_t epd, off_t loff, u64 lval, off_t roff,
+		      u64 rval, int flags);
+
+/**
+ * scif_get_node_ids() - Return information about online nodes
+ * @nodes:	array in which to return online node IDs
+ * @len:	number of entries in the nodes array
+ * @self:	address to place the node ID of the local node
+ *
+ * scif_get_node_ids() fills in the nodes array with up to len node IDs of the
+ * nodes in the SCIF network. If there is not enough space in nodes, as
+ * indicated by the len parameter, only len node IDs are returned in nodes. The
+ * return value of scif_get_node_ids() is the total number of nodes currently in
+ * the SCIF network. By checking the return value against the len parameter,
+ * the user may determine if enough space for nodes was allocated.
+ *
+ * The node ID of the local node is returned at self.
+ *
+ * Return:
+ * Upon successful completion, scif_get_node_ids() returns the actual number of
+ * online nodes in the SCIF network including 'self'; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode no
+ * errors are returned.
+ *
+ * Errors:
+ * EFAULT - Bad address
+ */
+int scif_get_node_ids(u16 *nodes, int len, u16 *self);
+
+#endif /* __SCIF_H__ */
diff --git a/include/uapi/linux/scif_ioctl.h b/include/uapi/linux/scif_ioctl.h
new file mode 100644
index 0000000..32449de
--- /dev/null
+++ b/include/uapi/linux/scif_ioctl.h
@@ -0,0 +1,215 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+/*
+ * -----------------------------------------
+ * SCIF IOCTL interface information
+ * -----------------------------------------
+ */
+#ifndef SCIF_IOCTL_H
+#define SCIF_IOCTL_H
+
+#include <linux/types.h>
+
+/**
+ * struct scif_port_id - SCIF port information
+ * @node:	node on which port resides
+ * @port:	local port number
+ */
+struct scif_port_id {
+	__u16 node;
+	__u16 port;
+};
+
+/**
+ * struct scifioctl_connect - used for SCIF_CONNECT IOCTL
+ * @self:	used to read back the assigned port_id
+ * @peer:	destination node and port to connect to
+ */
+struct scifioctl_connect {
+	struct scif_port_id	self;
+	struct scif_port_id	peer;
+};
+
+/**
+ * struct scifioctl_accept - used for SCIF_ACCEPTREQ IOCTL
+ * @flags:	flags
+ * @peer:	global id of peer endpoint
+ * @endpt:	new connected endpoint descriptor
+ */
+struct scifioctl_accept {
+	__s32			flags;
+	struct scif_port_id	peer;
+	void			*endpt;
+};
+
+/**
+ * struct scifioctl_msg - used for SCIF_SEND/SCIF_RECV IOCTL
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	flags
+ * @out_len:	number of bytes sent/received
+ */
+struct scifioctl_msg {
+	void	__user *msg;
+	__s32	len;
+	__s32	flags;
+	__s32	out_len;
+};
+
+/**
+ * struct scifioctl_reg - used for SCIF_REG IOCTL
+ * @addr:	starting virtual address
+ * @len:	length of range
+ * @offset:	offset of window
+ * @prot:	read/write protection
+ * @flags:	flags
+ * @out_offset:	offset returned
+ */
+struct scifioctl_reg {
+	void		*addr;
+	__u64		len;
+	__s64		offset;
+	__s32		prot;
+	__s32		flags;
+	__s64		out_offset;
+};
+
+/**
+ * struct scifioctl_unreg - used for SCIF_UNREG IOCTL
+ * @offset:	start of range to unregister
+ * @len:	length of range to unregister
+ */
+struct scifioctl_unreg {
+	__s64		offset;
+	__u64		len;
+};
+
+/**
+ * struct scifioctl_copy - used for SCIF DMA copy IOCTLs
+ *
+ * @loffset:	offset in local registered address space to/from
+ *		which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to/from
+ *		which to copy
+ * @addr:	user virtual address to/from which to copy
+ * @flags:	flags
+ *
+ * This structure is used for the SCIF_READFROM, SCIF_WRITETO, SCIF_VREADFROM
+ * and SCIF_VWRITETO IOCTLs.
+ */
+struct scifioctl_copy {
+	__s64		loffset;
+	__u64		len;
+	__s64		roffset;
+	__u8		__user *addr;
+	__s32		flags;
+};
+
+/**
+ * struct scifioctl_fence_mark - used for SCIF_FENCE_MARK IOCTL
+ * @flags:	flags
+ * @mark:	fence handle returned by reference
+ */
+struct scifioctl_fence_mark {
+	__s32	flags;
+	__s32	__user *mark;
+};
+
+/**
+ * struct scifioctl_fence_signal - used for SCIF_FENCE_SIGNAL IOCTL
+ * @loff:	local offset
+ * @lval:	local value to write to loff
+ * @roff:	remote offset
+ * @rval:	remote value to write to roff
+ * @flags:	flags
+ */
+struct scifioctl_fence_signal {
+	__s64		loff;
+	__u64		lval;
+	__s64		roff;
+	__u64		rval;
+	__s32		flags;
+};
+
+/**
+ * struct scifioctl_node_ids - used for SCIF_GET_NODEIDS IOCTL
+ * @nodes:	pointer to an array of node_ids
+ * @len:	length of array
+ * @self:	ID of the current node
+ */
+struct scifioctl_node_ids {
+	__u16	__user *nodes;
+	__s32	len;
+	__u16	__user *self;
+};
+
+#define SCIF_BIND		_IOWR('s', 1, __s32 *)
+#define SCIF_LISTEN		_IOW('s', 2, __s32)
+#define SCIF_CONNECT		_IOWR('s', 3, struct scifioctl_connect *)
+#define SCIF_ACCEPTREQ		_IOWR('s', 4, struct scifioctl_accept *)
+#define SCIF_ACCEPTREG		_IOWR('s', 5, void *)
+#define SCIF_SEND		_IOWR('s', 6, struct scifioctl_msg *)
+#define SCIF_RECV		_IOWR('s', 7, struct scifioctl_msg *)
+#define SCIF_REG		_IOWR('s', 8, struct scifioctl_reg *)
+#define SCIF_UNREG		_IOWR('s', 9, struct scifioctl_unreg *)
+#define SCIF_READFROM		_IOWR('s', 10, struct scifioctl_copy *)
+#define SCIF_WRITETO		_IOWR('s', 11, struct scifioctl_copy *)
+#define SCIF_VREADFROM		_IOWR('s', 12, struct scifioctl_copy *)
+#define SCIF_VWRITETO		_IOWR('s', 13, struct scifioctl_copy *)
+#define SCIF_GET_NODEIDS	_IOWR('s', 14, struct scifioctl_node_ids *)
+#define SCIF_FENCE_MARK		_IOWR('s', 15, struct scifioctl_fence_mark *)
+#define SCIF_FENCE_WAIT		_IOWR('s', 16, __s32)
+#define SCIF_FENCE_SIGNAL	_IOWR('s', 17, struct scifioctl_fence_signal *)
+
+#endif /* SCIF_IOCTL_H */
-- 
1.8.2.1



* [PATCH char-misc-next v3 07/13] misc: mic: SCIF node queue pair setup management
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (5 preceding siblings ...)
  2015-01-27  0:32 47% ` [PATCH char-misc-next v3 06/13] misc: mic: SCIF module initialization Sudeep Dutt
@ 2015-01-27  0:32 28% ` Sudeep Dutt
  2015-01-27  0:32 41% ` [PATCH char-misc-next v3 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF node queue pair setup creates the SCIF driver's private
kernel-mode node queue pairs between all nodes. These queue pairs
carry internal control messages once SCIF has been probed by the
SCIF hardware bus. Peer-to-peer communication between MIC
coprocessor nodes is supported.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
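Note for readers (illustration only, not part of the patch): the
struct scif_qp comment in scif_nodeqp.h says a node can only write
across PCIe, so every index a node needs to read has to live in its
own memory. The sketch below models that rule in plain user-space C.
It rests on several assumptions: all demo_* names are hypothetical,
the ring holds fixed-size slots instead of the byte ring that
scif_rb_init() sets up, and it is single-threaded with no memory
barriers or MMIO accessors, which real code would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names come from the patch. */
#define DEMO_SLOTS 8u	/* power of two, so wrapping is a simple mask */

struct demo_msg {
	uint32_t uop;
	uint64_t payload[4];
};

/* Memory owned by the receiver. The sender only ever stores into it. */
struct demo_rx_side {
	struct demo_msg slot[DEMO_SLOTS];
	uint32_t write_idx;	/* stored by the sender, read by the receiver */
	uint32_t my_read;	/* receiver's private shadow of its read index */
};

/* Memory owned by the sender. The receiver only ever stores into it. */
struct demo_tx_side {
	uint32_t read_idx;	/* stored by the receiver, read by the sender */
	uint32_t my_write;	/* sender's private shadow of its write index */
};

/* Sender: reads only tx (local), writes only into rx (remote). */
static int demo_send(struct demo_tx_side *tx, struct demo_rx_side *rx,
		     const struct demo_msg *m)
{
	assert(tx->my_write - tx->read_idx <= DEMO_SLOTS);
	if (tx->my_write - tx->read_idx == DEMO_SLOTS)
		return -1;	/* ring is full */
	rx->slot[tx->my_write & (DEMO_SLOTS - 1)] = *m;
	tx->my_write++;
	rx->write_idx = tx->my_write;	/* publish the new write index */
	return 0;
}

/* Receiver: reads only rx (local), writes only into tx (remote). */
static int demo_recv(struct demo_rx_side *rx, struct demo_tx_side *tx,
		     struct demo_msg *m)
{
	if (rx->write_idx == rx->my_read)
		return -1;	/* ring is empty */
	*m = rx->slot[rx->my_read & (DEMO_SLOTS - 1)];
	rx->my_read++;
	tx->read_idx = rx->my_read;	/* hand the slot back to the sender */
	return 0;
}

/* Returns 0 if fill, overflow, drain-one and refill behave as expected. */
static int demo_selfcheck(void)
{
	struct demo_rx_side rx = { .write_idx = 0 };
	struct demo_tx_side tx = { .read_idx = 0 };
	struct demo_msg m = { .uop = 1 }, out;
	uint32_t i;

	for (i = 0; i < DEMO_SLOTS; i++) {
		m.payload[0] = i;
		if (demo_send(&tx, &rx, &m))
			return 1;
	}
	if (demo_send(&tx, &rx, &m) != -1)
		return 2;	/* a full ring must reject the message */
	if (demo_recv(&rx, &tx, &out) || out.payload[0] != 0)
		return 3;	/* the oldest message comes out first */
	if (demo_send(&tx, &rx, &m))
		return 4;	/* the freed slot is usable again */
	return 0;
}
```

Each side keeps a private shadow of the index it advances, so neither
function has to read the other side's memory. That is the property
the scif_qp comment describes.
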
 drivers/misc/mic/scif/scif_nodeqp.h |  172 +++++
 drivers/misc/mic/scif/scif_nm.c     |  184 +++++
 drivers/misc/mic/scif/scif_nodeqp.c | 1259 +++++++++++++++++++++++++++++++++++
 3 files changed, 1615 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c
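
Illustration only, not part of the patch: scif_nodeqp_send() applies
two extra checks to any opcode above SCIF_EXIT_ACK. It refuses the
message once the exit flow has begun, and it refuses it when no peer
device is registered. SCIF_INIT, SCIF_EXIT and SCIF_EXIT_ACK skip both
checks. The user-space sketch below restates that rule together with
the bounds-checked name lookup used by scif_display_message(). The
opcode numbers are taken from scif_nodeqp.h in this patch. The demo_*
names, the name strings and the exit-state enum values are assumptions
made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode numbers copied from scif_nodeqp.h; the DEMO_ prefix is ours. */
enum {
	DEMO_INIT = 1,
	DEMO_EXIT = 2,
	DEMO_EXIT_ACK = 3,
	DEMO_NODE_ADD = 4,
	DEMO_NODE_ADD_ACK = 5,
	DEMO_NODE_ADD_NACK = 6,
	DEMO_NODE_REMOVE = 7,
	DEMO_NODE_REMOVE_ACK = 8,
	DEMO_MAX_MSG = DEMO_NODE_REMOVE_ACK
};

/* Hypothetical stand-in for the states stored in scifdev->exit. */
enum demo_exit_state { DEMO_OP_IDLE, DEMO_OP_IN_PROGRESS, DEMO_OP_COMPLETED };

/* Index 0 is reserved for an invalid opcode, as in message_types[]. */
static const char *const demo_names[] = {
	"BAD", "INIT", "EXIT", "EXIT_ACK", "NODE_ADD",
	"NODE_ADD_ACK", "NODE_ADD_NACK", "NODE_REMOVE", "NODE_REMOVE_ACK"
};

static_assert(sizeof(demo_names) / sizeof(demo_names[0]) == DEMO_MAX_MSG + 1,
	      "one name per opcode, plus index 0");

/* Returns the opcode name, or NULL when uop is out of range. */
static const char *demo_msg_name(uint32_t uop)
{
	if (uop > DEMO_MAX_MSG)
		return NULL;
	return demo_names[uop];
}

/*
 * Returns 0 if the message may be handed to the ring, -ENODEV otherwise.
 * peer_registered models whether scif_get_peer_dev() would succeed.
 */
static int demo_may_send(uint32_t uop, enum demo_exit_state exit_state,
			 int peer_registered)
{
	if (uop > DEMO_EXIT_ACK) {
		if (exit_state != DEMO_OP_IDLE)
			return -ENODEV;	/* exit flow has begun */
		if (!peer_registered)
			return -ENODEV;	/* no peer device to pin */
	}
	return 0;
}
```

With these definitions, demo_may_send(DEMO_INIT, DEMO_OP_IN_PROGRESS, 0)
returns 0, while demo_may_send(DEMO_NODE_ADD, DEMO_OP_IN_PROGRESS, 1)
returns -ENODEV.
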

diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
new file mode 100644
index 0000000..511ad12
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -0,0 +1,172 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_NODEQP
+#define SCIF_NODEQP
+
+#include "scif_rb.h"
+#include "scif_peer_bus.h"
+
+#define SCIF_INIT 1  /* First message sent to the peer node for discovery */
+#define SCIF_EXIT 2  /* Last message from the peer informing intent to exit */
+#define SCIF_EXIT_ACK 3 /* Response to SCIF_EXIT message */
+#define SCIF_NODE_ADD 4  /* Tell Online nodes a new node exists */
+#define SCIF_NODE_ADD_ACK 5  /* Confirm to mgmt node sequence is finished */
+#define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
+#define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
+#define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
+#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+
+/*
+ * struct scifmsg - Node QP message format
+ *
+ * @src: Source information
+ * @dst: Destination information
+ * @uop: The message opcode
+ * @payload: Unique payload format for each message
+ */
+struct scifmsg {
+	struct scif_port_id src;
+	struct scif_port_id dst;
+	u32 uop;
+	u64 payload[4];
+} __packed;
+
+/*
+ * struct scif_qp - Node Queue Pair
+ *
+ * Interesting structure -- a little difficult because we can only
+ * write across the PCIe, so any r/w pointer we need to read is
+ * local. We only need to read the read pointer on the inbound_q
+ * and read the write pointer in the outbound_q
+ *
+ * @magic: Magic value to ensure the peer sees the QP correctly
+ * @outbound_q: The outbound ring buffer for sending messages
+ * @inbound_q: The inbound ring buffer for receiving messages
+ * @local_write: Local write index
+ * @local_read: Local read index
+ * @remote_qp: The remote queue pair
+ * @local_buf: DMA address of local ring buffer
+ * @local_qp: DMA address of the local queue pair data structure
+ * @remote_buf: DMA address of remote ring buffer
+ * @qp_state: QP state i.e. online or offline used for P2P
+ * @send_lock: Synchronize access to outbound queue
+ * @recv_lock: Synchronize access to inbound queue
+ */
+struct scif_qp {
+	u64 magic;
+#define SCIFEP_MAGIC 0x5c1f000000005c1fULL
+	struct scif_rb outbound_q;
+	struct scif_rb inbound_q;
+
+	u32 local_write __aligned(64);
+	u32 local_read __aligned(64);
+	struct scif_qp *remote_qp;
+	dma_addr_t local_buf;
+	dma_addr_t local_qp;
+	dma_addr_t remote_buf;
+	u32 qp_state;
+#define SCIF_QP_OFFLINE 0xdead
+#define SCIF_QP_ONLINE 0xc0de
+	spinlock_t send_lock;
+	spinlock_t recv_lock;
+};
+
+/*
+ * struct scif_loopb_msg - An element in the loopback Node QP message list.
+ *
+ * @msg: The SCIF node QP message
+ * @list: link in the list of messages
+ */
+struct scif_loopb_msg {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_setup_qp(struct scif_dev *scifdev);
+int scif_qp_response(phys_addr_t phys, struct scif_dev *dev);
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev);
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev);
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload);
+int scif_setup_loopback_qp(struct scif_dev *scifdev);
+int scif_destroy_loopback_qp(struct scif_dev *scifdev);
+void scif_poll_qp_state(struct work_struct *work);
+void scif_destroy_p2p(struct scif_dev *scifdev);
+void scif_send_exit(struct scif_dev *scifdev);
+static inline struct device *scif_get_peer_dev(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+	struct device *spdev_ret;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		spdev_ret = get_device(&spdev->dev);
+	else
+		spdev_ret = ERR_PTR(-ENODEV);
+	rcu_read_unlock();
+	return spdev_ret;
+}
+
+static inline void scif_put_peer_dev(struct device *dev)
+{
+	put_device(dev);
+}
+#endif  /* SCIF_NODEQP */
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
new file mode 100644
index 0000000..5fefacc
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -0,0 +1,184 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_peer_bus.h"
+
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_free_qp(struct scif_dev *scifdev)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+
+	if (!qp)
+		return;
+	scif_free_coherent((void *)qp->inbound_q.rb_base,
+			   qp->local_buf, scifdev, qp->inbound_q.size);
+	scif_unmap_single(qp->local_qp, scifdev, sizeof(struct scif_qp));
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+}
+
+static void scif_cleanup_qp(struct scif_dev *dev)
+{
+	struct scif_qp *qp = dev->qpairs;
+
+	if (!qp)
+		return;
+	scif_iounmap((void *)qp->remote_qp, sizeof(struct scif_qp), dev);
+	scif_iounmap((void *)qp->outbound_q.rb_base,
+		     sizeof(struct scif_qp), dev);
+	qp->remote_qp = NULL;
+	qp->local_write = 0;
+	qp->inbound_q.current_write_offset = 0;
+	qp->inbound_q.current_read_offset = 0;
+	if (scifdev_is_p2p(dev))
+		scif_free_qp(dev);
+}
+
+void scif_send_acks(struct scif_dev *dev)
+{
+	struct scifmsg msg;
+
+	if (dev->node_remove_ack_pending) {
+		msg.uop = SCIF_NODE_REMOVE_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = SCIF_MGMT_NODE;
+		msg.payload[0] = dev->node;
+		scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg);
+		dev->node_remove_ack_pending = false;
+	}
+	if (dev->exit_ack_pending) {
+		msg.uop = SCIF_EXIT_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = dev->node;
+		scif_nodeqp_send(dev, &msg);
+		dev->exit_ack_pending = false;
+	}
+}
+
+/*
+ * scif_cleanup_scifdev
+ *
+ * @dev: Remote SCIF device.
+ * Uninitialize SCIF data structures for remote SCIF device.
+ */
+void scif_cleanup_scifdev(struct scif_dev *dev)
+{
+	struct scif_hw_dev *sdev = dev->sdev;
+
+	if (!dev->sdev)
+		return;
+	if (scifdev_is_p2p(dev)) {
+		if (dev->cookie) {
+			sdev->hw_ops->free_irq(sdev, dev->cookie, dev);
+			dev->cookie = NULL;
+		}
+		scif_destroy_intr_wq(dev);
+	}
+	scif_destroy_p2p(dev);
+	scif_send_acks(dev);
+	if (!dev->node && scif_info.card_initiated_exit) {
+		/*
+		 * Send an SCIF_EXIT message which is the last message from MIC
+		 * to the Host and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(dev);
+		scif_info.card_initiated_exit = false;
+	}
+	scif_cleanup_qp(dev);
+}
+
+/*
+ * scif_handle_remove_node:
+ *
+ * @node: Node to remove
+ */
+void scif_handle_remove_node(int node)
+{
+	struct scif_dev *scifdev = &scif_dev[node];
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	else
+		scif_send_acks(scifdev);
+}
+
+static int scif_send_rmnode_msg(int node, int remove_node)
+{
+	struct scifmsg notif_msg;
+	struct scif_dev *dev = &scif_dev[node];
+
+	notif_msg.uop = SCIF_NODE_REMOVE;
+	notif_msg.src.node = scif_info.nodeid;
+	notif_msg.dst.node = node;
+	notif_msg.payload[0] = remove_node;
+	return scif_nodeqp_send(dev, &notif_msg);
+}
+
+/**
+ * scif_disconnect_node:
+ *
+ * @node_id: ID of the node to disconnect.
+ * @mgmt_initiated: Disconnection initiated from the mgmt node
+ *
+ * Disconnect a node from the scif network.
+ */
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
+{
+	int ret;
+	int msg_cnt = 0;
+	u32 i = 0;
+	struct scif_dev *scifdev = &scif_dev[node_id];
+
+	if (!node_id)
+		return;
+
+	atomic_set(&scifdev->disconn_rescnt, 0);
+
+	/* Destroy p2p network */
+	for (i = 1; i <= scif_info.maxid; i++) {
+		if (i == node_id)
+			continue;
+		ret = scif_send_rmnode_msg(i, node_id);
+		if (!ret)
+			msg_cnt++;
+	}
+	/* Wait for the remote nodes to respond with SCIF_NODE_REMOVE_ACK */
+	ret = wait_event_timeout(scifdev->disconn_wq,
+				 (atomic_read(&scifdev->disconn_rescnt)
+				 == msg_cnt), SCIF_NODE_ALIVE_TIMEOUT);
+	/* Tell the card to clean up */
+	if (mgmt_initiated && _scifdev_alive(scifdev))
+		/*
+		 * Send an SCIF_EXIT message which is the last message from Host
+		 * to the MIC and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(scifdev);
+	atomic_set(&scifdev->disconn_rescnt, 0);
+	/* Tell the mgmt node to clean up */
+	ret = scif_send_rmnode_msg(SCIF_MGMT_NODE, node_id);
+	if (!ret)
+		/* Wait for mgmt node to respond with SCIF_NODE_REMOVE_ACK */
+		wait_event_timeout(scifdev->disconn_wq,
+				   (atomic_read(&scifdev->disconn_rescnt) == 1),
+				   SCIF_NODE_ALIVE_TIMEOUT);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
new file mode 100644
index 0000000..8978af9
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -0,0 +1,1259 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_nodeqp.h"
+#include "scif_map.h"
+
+/*
+ ************************************************************************
+ * SCIF node Queue Pair (QP) setup flow:
+ *
+ * 1) SCIF driver gets probed with a scif_hw_dev via the scif_hw_bus
+ * 2) scif_setup_qp(..) allocates the local qp and calls
+ *	scif_setup_qp_connect(..) which allocates and maps the local
+ *	buffer for the inbound QP
+ * 3) The local node updates the device page with the DMA address of the QP
+ * 4) A delayed work is scheduled (qp_dwork) which periodically reads if
+ *	the peer node has updated its QP DMA address
+ * 5) Once a valid non zero address is found in the QP DMA address field
+ *	in the device page, the local node maps the remote node's QP,
+ *	updates its outbound QP and sends a SCIF_INIT message to the peer
+ * 6) The SCIF_INIT message is received by the peer node QP interrupt bottom
+ *	half handler by calling scif_init(..)
+ * 7) scif_init(..) registers a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node
+ * 8) On the mgmt node, P2P network setup/teardown is initiated if all the
+ *	remote nodes are online via scif_p2p_setup(..)
+ * 9) For P2P setup, the host maps the remote nodes' aperture and memory
+ *	bars and sends a SCIF_NODE_ADD message to both nodes
+ * 10) As part of scif_nodeadd, both nodes set up their local inbound
+ *	QPs and send a SCIF_NODE_ADD_ACK to the mgmt node
+ * 11) As part of scif_node_add_ack(..) the mgmt node forwards the
+ *	SCIF_NODE_ADD_ACK to the remote nodes
+ * 12) As part of scif_node_add_ack(..) the remote nodes update their
+ *	outbound QPs, make sure they can access memory on the remote node
+ *	and then add a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node.
+ * 13) The SCIF network is now established across all nodes.
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by non mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) A non mgmt node now cleans up all local data structures and sends a
+ *	SCIF_EXIT message to the peer and waits for a SCIF_EXIT_ACK
+ * 4) As part of scif_exit(..) handling scif_disconnect_node(..) is called
+ * 5) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the
+ *	peers and waits for a SCIF_NODE_REMOVE_ACK
+ * 6) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 7) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it sends itself a node remove message whose handling cleans up local
+ *	data structures and unregisters the peer node from the SCIF network
+ * 8) The mgmt node sends a SCIF_EXIT_ACK
+ * 9) Upon receipt of the SCIF_EXIT_ACK the node initiating the teardown
+ *	completes the SCIF remove routine
+ * 10) The SCIF network is now torn down for the node initiating the
+ *	teardown sequence
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) The mgmt node calls scif_disconnect_node(..)
+ * 4) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the peers
+ *	and waits for a SCIF_NODE_REMOVE_ACK
+ * 5) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 6) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it unregisters the peer node from the SCIF network
+ * 7) The mgmt node sends a SCIF_EXIT message and waits for a SCIF_EXIT_ACK.
+ * 8) A non mgmt node upon receipt of a SCIF_EXIT message calls scif_stop(..)
+ *	which would clean up local data structures for all SCIF nodes and
+ *	then send a SCIF_EXIT_ACK back to the mgmt node
+ * 9) Upon receipt of the SCIF_EXIT_ACK the mgmt node sends itself a node
+ *	remove message whose handling cleans up local data structures and
+ *	destroys any P2P mappings.
+ * 10) The SCIF hardware device for which a remove callback was received is now
+ *	disconnected from the SCIF network.
+ */
+/*
+ * Initializes "local" data structures for the QP. Allocates the QP
+ * ring buffer (rb) and initializes the "in bound" queue.
+ */
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev)
+{
+	void *local_q = NULL;
+	int err = 0;
+	u32 tmp_rd = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		return err;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	/*
+	 * To set up the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &tmp_rd,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	/*
+	 * The read pointer is NULL initially and it is unsafe to use the ring
+	 * buffer until this changes!
+	 */
+	qp->inbound_q.read_ptr = NULL;
+	err = scif_map_single(qp_offset, qp,
+			      scifdev, sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+	return err;
+}
+
+/* When the other side has already done its allocation, this is called */
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev)
+{
+	void *local_q;
+	void *remote_q;
+	struct scif_qp *remote_qp;
+	int remote_size;
+	int err = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+	/* Start by figuring out where we need to point */
+	remote_qp = scif_ioremap(phys, sizeof(struct scif_qp), scifdev);
+	if (!remote_qp)
+		return -EIO;
+	qp->remote_qp = remote_qp;
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_buf = remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	remote_q = scif_ioremap(qp->remote_buf, remote_size, scifdev);
+	if (!remote_q) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_qp->local_write = 0;
+	/*
+	 * To set up the outbound_q, the buffer lives in remote memory,
+	 * the read pointer is local, the write pointer is remote
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     remote_q,
+		     get_count_order(remote_size));
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto iounmap_1;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	qp->remote_qp->local_read = 0;
+	/*
+	 * To set up the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	err = scif_map_single(qp_offset, qp, scifdev,
+			      sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+iounmap_1:
+	scif_iounmap(remote_q, remote_size, scifdev);
+	qp->outbound_q.rb_base = NULL;
+iounmap:
+	scif_iounmap(qp->remote_qp, sizeof(struct scif_qp), scifdev);
+	qp->remote_qp = NULL;
+	return err;
+}
+
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload)
+{
+	int err = 0;
+	void *r_buf;
+	int remote_size;
+	phys_addr_t tmp_phys;
+
+	qp->remote_qp = scif_ioremap(payload, sizeof(struct scif_qp), scifdev);
+
+	if (!qp->remote_qp) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		dev_err(&scifdev->sdev->dev,
+			"SCIFEP_MAGIC mismatch between self %d remote %d\n",
+			scif_dev[scif_info.nodeid].node, scifdev->node);
+		err = -ENODEV;
+		goto error;
+	}
+
+	tmp_phys = qp->remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	r_buf = scif_ioremap(tmp_phys, remote_size, scifdev);
+
+	if (!r_buf)
+		return -EIO;
+
+	qp->local_read = 0;
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     r_buf,
+		     get_count_order(remote_size));
+	/*
+	 * Set up the inbound_q again now that we know where the
+	 * inbound_read really is.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     qp->inbound_q.rb_base,
+		     get_count_order(qp->inbound_q.size));
+error:
+	return err;
+}
+
+static __always_inline void
+scif_send_msg_intr(struct scif_dev *scifdev)
+{
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_is_p2p(scifdev))
+		sdev->hw_ops->send_p2p_intr(sdev, scifdev->rdb, &scifdev->mmio);
+	else
+		sdev->hw_ops->send_intr(sdev, scifdev->rdb);
+}
+
+int scif_qp_response(phys_addr_t phys, struct scif_dev *scifdev)
+{
+	int err = 0;
+	struct scifmsg msg;
+
+	err = scif_setup_qp_connect_response(scifdev, scifdev->qpairs, phys);
+	if (!err) {
+		/*
+		 * Now that everything is set up and mapped, we're ready
+		 * to tell the peer about our queue's location
+		 */
+		msg.uop = SCIF_INIT;
+		msg.dst.node = scifdev->node;
+		err = scif_nodeqp_send(scifdev, &msg);
+	}
+	return err;
+}
+
+void scif_send_exit(struct scif_dev *scifdev)
+{
+	struct scifmsg msg;
+	int ret;
+
+	scifdev->exit = OP_IN_PROGRESS;
+	msg.uop = SCIF_EXIT;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = scifdev->node;
+	ret = scif_nodeqp_send(scifdev, &msg);
+	if (ret)
+		goto done;
+	/* Wait for a SCIF_EXIT_ACK message */
+	wait_event_timeout(scif_info.exitwq, scifdev->exit == OP_COMPLETED,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+done:
+	scifdev->exit = OP_IDLE;
+}
+
+int scif_setup_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	int local_size;
+	struct scif_qp *qp;
+
+	local_size = SCIF_NODE_QP_SIZE;
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp) {
+		err = -ENOMEM;
+		return err;
+	}
+	qp->magic = SCIFEP_MAGIC;
+	scifdev->qpairs = qp;
+	err = scif_setup_qp_connect(qp, &scifdev->qp_dma_addr,
+				    local_size, scifdev);
+	if (err)
+		goto free_qp;
+	/*
+	 * We're as set up as we can be. The inbound_q is set up, without a
+	 * usable outbound_q. When we get a message, the read_ptr will be
+	 * updated, and we will pull the message.
+	 */
+	return err;
+free_qp:
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+	return err;
+}
+
+static void scif_p2p_freesg(struct scatterlist *sg)
+{
+	kfree(sg);
+}
+
+static struct scatterlist *
+scif_p2p_setsg(void __iomem *va, int page_size, int page_cnt)
+{
+	struct scatterlist *sg;
+	struct page *page;
+	int i;
+
+	sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
+	if (!sg)
+		return NULL;
+	sg_init_table(sg, page_cnt);
+	for (i = 0; i < page_cnt; i++) {
+		page = vmalloc_to_page((void __force *)va);
+		if (!page)
+			goto p2p_sg_err;
+		sg_set_page(&sg[i], page, page_size, 0);
+		va += page_size;
+	}
+	return sg;
+p2p_sg_err:
+	kfree(sg);
+	return NULL;
+}
+
+/* Init p2p mappings required to access peerdev from scifdev */
+static struct scif_p2p_info *
+scif_init_p2p_info(struct scif_dev *scifdev, struct scif_dev *peerdev)
+{
+	struct scif_p2p_info *p2p;
+	int num_mmio_pages, num_aper_pages, sg_page_shift, err, num_aper_chunks;
+	struct scif_hw_dev *psdev = peerdev->sdev;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	num_mmio_pages = psdev->mmio->len >> PAGE_SHIFT;
+	num_aper_pages = psdev->aper->len >> PAGE_SHIFT;
+
+	p2p = kzalloc(sizeof(*p2p), GFP_KERNEL);
+	if (!p2p)
+		return NULL;
+	p2p->ppi_sg[SCIF_PPI_MMIO] = scif_p2p_setsg(psdev->mmio->va,
+						    PAGE_SIZE, num_mmio_pages);
+	if (!p2p->ppi_sg[SCIF_PPI_MMIO])
+		goto free_p2p;
+	p2p->sg_nentries[SCIF_PPI_MMIO] = num_mmio_pages;
+	sg_page_shift = get_order(min(psdev->aper->len, (u64)(1 << 30)));
+	num_aper_chunks = num_aper_pages >> (sg_page_shift - PAGE_SHIFT);
+	p2p->ppi_sg[SCIF_PPI_APER] = scif_p2p_setsg(psdev->aper->va,
+						    1 << sg_page_shift,
+						    num_aper_chunks);
+	p2p->sg_nentries[SCIF_PPI_APER] = num_aper_chunks;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			 num_mmio_pages, DMA_BIDIRECTIONAL);
+	if (err != num_mmio_pages)
+		goto scif_p2p_free;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			 num_aper_chunks, DMA_BIDIRECTIONAL);
+	if (err != num_aper_chunks)
+		goto dma_unmap;
+	p2p->ppi_da[SCIF_PPI_MMIO] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	p2p->ppi_da[SCIF_PPI_APER] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_APER]);
+	p2p->ppi_len[SCIF_PPI_MMIO] = num_mmio_pages;
+	p2p->ppi_len[SCIF_PPI_APER] = num_aper_pages;
+	p2p->ppi_peer_id = peerdev->node;
+	return p2p;
+dma_unmap:
+	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
+scif_p2p_free:
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+free_p2p:
+	kfree(p2p);
+	return NULL;
+}
+
+/**
+ * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
+ * @dst: Destination node
+ *
+ * Connect the src and dst node by setting up the p2p connection
+ * between them. Management node here acts like a proxy.
+ */
+static void scif_node_connect(struct scif_dev *scifdev, int dst)
+{
+	struct scif_dev *dev_j = scifdev;
+	struct scif_dev *dev_i = NULL;
+	struct scif_p2p_info *p2p_ij = NULL;    /* bus addr for j from i */
+	struct scif_p2p_info *p2p_ji = NULL;    /* bus addr for i from j */
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	struct scifmsg msg;
+	int err;
+	u64 tmppayload;
+
+	if (dst < 1 || dst > scif_info.maxid)
+		return;
+
+	dev_i = &scif_dev[dst];
+
+	if (!_scifdev_alive(dev_i))
+		return;
+	/*
+	 * If the p2p connection is already setup or in the process of setting
+	 * up then just ignore this request. The requested node will get
+	 * informed by SCIF_NODE_ADD_ACK or SCIF_NODE_ADD_NACK
+	 */
+	if (!list_empty(&dev_i->p2p)) {
+		list_for_each_safe(pos, tmp, &dev_i->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == dev_j->node)
+				return;
+		}
+	}
+	p2p_ij = scif_init_p2p_info(dev_i, dev_j);
+	if (!p2p_ij)
+		return;
+	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
+	if (!p2p_ji)
+		return;
+	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
+	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
+
+	/*
+	 * Send a SCIF_NODE_ADD to dev_i, pass it its bus address
+	 * as seen from dev_j
+	 */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_j->node;
+	msg.dst.node = dev_i->node;
+
+	msg.payload[0] = p2p_ji->ppi_da[SCIF_PPI_APER];
+	msg.payload[1] = p2p_ij->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[2] = p2p_ij->ppi_da[SCIF_PPI_APER];
+	msg.payload[3] = p2p_ij->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	err = scif_nodeqp_send(dev_i,  &msg);
+	if (err) {
+		dev_err(&scifdev->sdev->dev,
+			"%s %d error %d\n", __func__, __LINE__, err);
+		return;
+	}
+
+	/* Same as above but to dev_j */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_i->node;
+	msg.dst.node = dev_j->node;
+
+	tmppayload = msg.payload[0];
+	msg.payload[0] = msg.payload[2];
+	msg.payload[2] = tmppayload;
+	msg.payload[1] = p2p_ji->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[3] = p2p_ji->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	scif_nodeqp_send(dev_j, &msg);
+}
+
+static void scif_p2p_setup(void)
+{
+	int i, j;
+
+	if (!scif_info.p2p_enable)
+		return;
+
+	for (i = 1; i <= scif_info.maxid; i++)
+		if (!_scifdev_alive(&scif_dev[i]))
+			return;
+
+	for (i = 1; i <= scif_info.maxid; i++) {
+		for (j = 1; j <= scif_info.maxid; j++) {
+			struct scif_dev *scifdev = &scif_dev[i];
+
+			if (i == j)
+				continue;
+			scif_node_connect(scifdev, j);
+		}
+	}
+}
+
+static int scif_qp_response_ack(struct scif_dev *scifdev, u64 payload)
+{
+	int err = 0;
+	struct scif_peer_dev *spdev;
+
+	/* Drop the INIT message if it has already been received */
+	if (_scifdev_alive(scifdev))
+		goto done;
+	/*
+	 * Allow the thread waiting for device page updates for the peer QP DMA
+	 * address to complete initializing the inbound_q.
+	 */
+	flush_delayed_work(&scifdev->qp_dwork);
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto done;
+	}
+	if (scif_is_mgmt_node()) {
+		mutex_lock(&scif_info.conflock);
+		scif_p2p_setup();
+		mutex_unlock(&scif_info.conflock);
+	}
+done:
+	return err;
+}
+
+static char *message_types[] = {"BAD",
+				"INIT",
+				"EXIT",
+				"SCIF_EXIT_ACK",
+				"SCIF_NODE_ADD",
+				"SCIF_NODE_ADD_ACK",
+				"SCIF_NODE_ADD_NACK",
+				"REMOVE_NODE",
+				"REMOVE_NODE_ACK"};
+
+static void
+scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
+		     const char *label)
+{
+	if (!scif_info.en_msg_log)
+		return;
+	if (msg->uop > SCIF_MAX_MSG) {
+		dev_err(&scifdev->sdev->dev,
+			"%s: unknown msg type %d\n", label, msg->uop);
+		return;
+	}
+	dev_info(&scifdev->sdev->dev,
+		 "%s: msg type %s, src %d:%d, dest %d:%d payload 0x%llx:0x%llx:0x%llx:0x%llx\n",
+		 label, message_types[msg->uop], msg->src.node, msg->src.port,
+		 msg->dst.node, msg->dst.port, msg->payload[0], msg->payload[1],
+		 msg->payload[2], msg->payload[3]);
+}
+
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+	int err = -ENOMEM, loop_cnt = 0;
+
+	scif_display_message(scifdev, msg, "Sent");
+	if (!qp) {
+		err = -EINVAL;
+		goto error;
+	}
+	spin_lock(&qp->send_lock);
+
+	while ((err = scif_rb_write(&qp->outbound_q,
+				    msg, sizeof(struct scifmsg)))) {
+		mdelay(1);
+#define SCIF_NODEQP_SEND_TO_MSEC (3 * 1000)
+		if (loop_cnt++ > (SCIF_NODEQP_SEND_TO_MSEC)) {
+			err = -ENODEV;
+			break;
+		}
+	}
+	if (!err)
+		scif_rb_commit(&qp->outbound_q);
+	spin_unlock(&qp->send_lock);
+	if (!err) {
+		if (scifdev_self(scifdev))
+			/*
+			 * For loopback we need to emulate an interrupt by
+			 * queuing work for the queue handling real node
+			 * Qp interrupts.
+			 */
+			queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+		else
+			scif_send_msg_intr(scifdev);
+	}
+error:
+	if (err)
+		dev_dbg(&scifdev->sdev->dev,
+			"%s %d error %d uop %d\n",
+			 __func__, __LINE__, err, msg->uop);
+	return err;
+}
+
+/**
+ * scif_nodeqp_send - Send a message on the node queue pair
+ * @scifdev: Scif Device.
+ * @msg: The message to be sent.
+ */
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int err;
+	struct device *spdev = NULL;
+
+	if (msg->uop > SCIF_EXIT_ACK) {
+		/* Don't send messages once the exit flow has begun */
+		if (OP_IDLE != scifdev->exit)
+			return -ENODEV;
+		spdev = scif_get_peer_dev(scifdev);
+		if (IS_ERR(spdev)) {
+			err = PTR_ERR(spdev);
+			return err;
+		}
+	}
+	err = _scif_nodeqp_send(scifdev, msg);
+	if (msg->uop > SCIF_EXIT_ACK)
+		scif_put_peer_dev(spdev);
+	return err;
+}
+
+/*
+ * scif_misc_handler:
+ *
+ * Work queue handler for servicing miscellaneous SCIF tasks.
+ * Examples include:
+ * 1) Cleanup of zombie endpoints.
+ */
+void scif_misc_handler(struct work_struct *work)
+{
+	scif_cleanup_zombie_epd();
+}
+
+/**
+ * scif_init() - Respond to SCIF_INIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ */
+static __always_inline void
+scif_init(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	scif_qp_response_ack(scifdev, msg->payload[0]);
+}
+
+/**
+ * scif_exit() - Respond to SCIF_EXIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @unused:     Interrupt message (not used)
+ *
+ * This function stops the SCIF interface for the node which sent
+ * the SCIF_EXIT message and starts waiting for that node to
+ * set up the queue pair again.
+ */
+static __always_inline void
+scif_exit(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit_ack_pending = true;
+	if (scif_is_mgmt_node())
+		scif_disconnect_node(scifdev->node, false);
+	else
+		scif_stop(scifdev);
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+}
+
+/**
+ * scif_exit_ack() - Respond to SCIF_EXIT_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @unused:     Interrupt message (not used)
+ *
+ */
+static __always_inline void
+scif_exit_ack(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit = OP_COMPLETED;
+	wake_up(&scif_info.exitwq);
+}
+
+/**
+ * scif_node_add() - Respond to SCIF_NODE_ADD interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * When the mgmt node driver has finished initializing a MIC node queue pair it
+ * marks the node as online. It then looks for all currently online MIC cards
+ * and sends a SCIF_NODE_ADD message to identify the ID of the new card for
+ * peer to peer initialization.
+ *
+ * The local node allocates its incoming queue and sends its address in the
+ * SCIF_NODE_ADD_ACK message back to the mgmt node. The mgmt node "reflects"
+ * this message to the new node.
+ */
+static __always_inline void
+scif_node_add(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *newdev;
+	dma_addr_t qp_offset;
+	int qp_connect;
+	struct scif_hw_dev *sdev;
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d:%d received NODE_ADD msg for node %d\n",
+		scifdev->node, msg->dst.node, msg->src.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"Remote address for this node's aperture %llx\n",
+		msg->payload[0]);
+	newdev = &scif_dev[msg->src.node];
+	newdev->node = msg->src.node;
+	newdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	sdev = newdev->sdev;
+
+	if (scif_setup_intr_wq(newdev)) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup interrupts for %d\n", msg->src.node);
+		goto interrupt_setup_error;
+	}
+	newdev->mmio.va = ioremap_nocache(msg->payload[1], sdev->mmio->len);
+	if (!newdev->mmio.va) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to map mmio for %d\n", msg->src.node);
+		goto mmio_map_error;
+	}
+	newdev->qpairs = kzalloc(sizeof(*newdev->qpairs), GFP_KERNEL);
+	if (!newdev->qpairs)
+		goto qp_alloc_error;
+	/*
+	 * Set the base address of the remote node's memory since it gets
+	 * added to qp_offset
+	 */
+	newdev->base_addr = msg->payload[0];
+
+	qp_connect = scif_setup_qp_connect(newdev->qpairs, &qp_offset,
+					   SCIF_NODE_QP_SIZE, newdev);
+	if (qp_connect) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup qp_connect %d\n", qp_connect);
+		goto qp_connect_error;
+	}
+
+	newdev->db = sdev->hw_ops->next_db(sdev);
+	newdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						   "SCIF_INTR", newdev,
+						   newdev->db);
+	if (IS_ERR(newdev->cookie))
+		goto qp_connect_error;
+	newdev->qpairs->magic = SCIFEP_MAGIC;
+	newdev->qpairs->qp_state = SCIF_QP_OFFLINE;
+
+	msg->uop = SCIF_NODE_ADD_ACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	msg->payload[0] = qp_offset;
+	msg->payload[2] = newdev->db;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+	return;
+qp_connect_error:
+	kfree(newdev->qpairs);
+	newdev->qpairs = NULL;
+qp_alloc_error:
+	iounmap(newdev->mmio.va);
+	newdev->mmio.va = NULL;
+mmio_map_error:
+interrupt_setup_error:
+	dev_err(&scifdev->sdev->dev,
+		"node add failed for node %d\n", msg->src.node);
+	msg->uop = SCIF_NODE_ADD_NACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+}
+
+void scif_poll_qp_state(struct work_struct *work)
+{
+#define SCIF_NODE_QP_RETRY 100
+#define SCIF_NODE_QP_TIMEOUT 100
+	struct scif_dev *peerdev = container_of(work, struct scif_dev,
+							p2p_dwork.work);
+	struct scif_qp *qp = &peerdev->qpairs[0];
+
+	if (qp->qp_state != SCIF_QP_ONLINE ||
+	    qp->remote_qp->qp_state != SCIF_QP_ONLINE) {
+		if (peerdev->p2p_retry++ == SCIF_NODE_QP_RETRY) {
+			dev_err(&peerdev->sdev->dev,
+				"Warning: QP check timeout with state %d\n",
+				qp->qp_state);
+			goto timeout;
+		}
+		schedule_delayed_work(&peerdev->p2p_dwork,
+				      msecs_to_jiffies(SCIF_NODE_QP_TIMEOUT));
+		return;
+	}
+	scif_peer_register_device(peerdev);
+	return;
+timeout:
+	dev_err(&peerdev->sdev->dev,
+		"%s %d remote node %d offline,  state = 0x%x\n",
+		__func__, __LINE__, peerdev->node, qp->qp_state);
+	qp->remote_qp->qp_state = SCIF_QP_OFFLINE;
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_ack() - Respond to SCIF_NODE_ADD_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * After a MIC node receives the SCIF_NODE_ADD_ACK message it sends this
+ * message to the mgmt node to confirm the sequence is finished.
+ *
+ */
+static __always_inline void
+scif_node_add_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *peerdev;
+	struct scif_qp *qp;
+	struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d received SCIF_NODE_ADD_ACK msg src %d dst %d\n",
+		scifdev->node, msg->src.node, msg->dst.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"payload %llx %llx %llx %llx\n", msg->payload[0],
+		msg->payload[1], msg->payload[2], msg->payload[3]);
+	if (scif_is_mgmt_node()) {
+		/*
+		 * The lock serializes with scif_qp_response_ack. The mgmt node
+		 * is forwarding the NODE_ADD_ACK message from src to dst, so we
+		 * need to make sure that the dst has already received a
+		 * NODE_ADD for src and set up its end of the qp to dst.
+		 */
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		scif_nodeqp_send(dst_dev, msg);
+		mutex_unlock(&scif_info.conflock);
+		return;
+	}
+	peerdev = &scif_dev[msg->src.node];
+	peerdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	peerdev->node = msg->src.node;
+
+	qp = &peerdev->qpairs[0];
+
+	if ((scif_setup_qp_connect_response(peerdev, &peerdev->qpairs[0],
+					    msg->payload[0])))
+		goto local_error;
+	peerdev->rdb = msg->payload[2];
+	qp->remote_qp->qp_state = SCIF_QP_ONLINE;
+	schedule_delayed_work(&peerdev->p2p_dwork, 0);
+	return;
+local_error:
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_nack: Respond to SCIF_NODE_ADD_NACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * SCIF_NODE_ADD failed, so inform the waiting wq.
+ */
+static __always_inline void
+scif_node_add_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+		dev_dbg(&scifdev->sdev->dev,
+			"SCIF_NODE_ADD_NACK received from %d\n", scifdev->node);
+		scif_nodeqp_send(dst_dev, msg);
+	}
+}
+
+/*
+ * scif_node_remove: Handle SCIF_NODE_REMOVE message
+ * @msg: Interrupt message
+ *
+ * Handle node removal.
+ */
+static __always_inline void
+scif_node_remove(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int node = msg->payload[0];
+	struct scif_dev *scdev = &scif_dev[node];
+
+	scdev->node_remove_ack_pending = true;
+	scif_handle_remove_node(node);
+}
+
+/*
+ * scif_node_remove_ack: Handle SCIF_NODE_REMOVE_ACK message
+ * @msg: Interrupt message
+ *
+ * The peer has acked a SCIF_NODE_REMOVE message.
+ */
+static __always_inline void
+scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *sdev = &scif_dev[msg->payload[0]];
+
+	atomic_inc(&sdev->disconn_rescnt);
+	wake_up(&sdev->disconn_wq);
+}
+
+static void
+scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	/* Bogus Node Qp Message? */
+	dev_err(&scifdev->sdev->dev,
+		"Unknown message 0x%x scifdev->node 0x%x\n",
+		msg->uop, scifdev->node);
+}
+
+static void (*scif_intr_func[SCIF_MAX_MSG + 1])
+	    (struct scif_dev *, struct scifmsg *msg) = {
+	scif_msg_unknown,	/* Error */
+	scif_init,		/* SCIF_INIT */
+	scif_exit,		/* SCIF_EXIT */
+	scif_exit_ack,		/* SCIF_EXIT_ACK */
+	scif_node_add,		/* SCIF_NODE_ADD */
+	scif_node_add_ack,	/* SCIF_NODE_ADD_ACK */
+	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
+	scif_node_remove,	/* SCIF_NODE_REMOVE */
+	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+};
+
+/**
+ * scif_nodeqp_msg_handler() - Common handler for node messages
+ * @scifdev: Remote device to respond to
+ * @qp: Remote memory pointer
+ * @msg: The message to be handled.
+ *
+ * This routine calls the appropriate routine to handle a Node Qp
+ * message receipt
+ */
+static int scif_max_msg_id = SCIF_MAX_MSG;
+
+static void
+scif_nodeqp_msg_handler(struct scif_dev *scifdev,
+			struct scif_qp *qp, struct scifmsg *msg)
+{
+	scif_display_message(scifdev, msg, "Rcvd");
+
+	if (msg->uop > (u32)scif_max_msg_id) {
+		/* Bogus Node Qp Message? */
+		dev_err(&scifdev->sdev->dev,
+			"Unknown message 0x%x scifdev->node 0x%x\n",
+			msg->uop, scifdev->node);
+		return;
+	}
+
+	scif_intr_func[msg->uop](scifdev, msg);
+}
+
+/**
+ * scif_nodeqp_intrhandler() - Interrupt handler for node messages
+ * @scifdev:    Remote device to respond to
+ * @qp:         Remote memory pointer
+ *
+ * This routine is triggered by the interrupt mechanism.  It reads
+ * messages from the node queue RB and calls the Node QP Message handling
+ * routine.
+ */
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	struct scifmsg msg;
+	int read_size;
+
+	do {
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg, sizeof(msg));
+		if (!read_size)
+			break;
+		scif_nodeqp_msg_handler(scifdev, qp, &msg);
+		/*
+		 * The node queue pair is unmapped so skip the read pointer
+		 * update after receipt of a SCIF_EXIT_ACK
+		 */
+		if (SCIF_EXIT_ACK == msg.uop)
+			break;
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (1);
+}
+
+/**
+ * scif_loopb_wq_handler - Loopback Workqueue Handler.
+ * @work: loop back work
+ *
+ * This work queue routine is invoked by the loopback work queue handler.
+ * It grabs the recv lock, dequeues any available messages from the head
+ * of the loopback message list, calls the node QP message handler,
+ * waits for it to return, then frees up this message and dequeues more
+ * elements of the list if available.
+ */
+static void scif_loopb_wq_handler(struct work_struct *unused)
+{
+	struct scif_dev *scifdev = scif_info.loopb_dev;
+	struct scif_qp *qp = scifdev->qpairs;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = NULL;
+		spin_lock(&qp->recv_lock);
+		if (!list_empty(&scif_info.loopb_recv_q)) {
+			msg = list_first_entry(&scif_info.loopb_recv_q,
+					       struct scif_loopb_msg,
+					       list);
+			list_del(&msg->list);
+		}
+		spin_unlock(&qp->recv_lock);
+
+		if (msg) {
+			scif_nodeqp_msg_handler(scifdev, qp, &msg->msg);
+			kfree(msg);
+		}
+	} while (msg);
+}
+
+/**
+ * scif_loopb_msg_handler() - Workqueue handler for loopback messages.
+ * @scifdev: SCIF device
+ * @qp: Queue pair.
+ *
+ * This work queue routine is triggered when a loopback message is received.
+ *
+ * We need special handling for receiving Node Qp messages on a loopback SCIF
+ * device via two workqueues for receiving messages.
+ *
+ * The reason we need the extra workqueue which is not required with *normal*
+ * non-loopback SCIF devices is the potential classic deadlock described below:
+ *
+ * Thread A tries to send a message on a loopback SCIF device and blocks since
+ * there is no space in the RB while it has the send_lock held or another
+ * lock called lock X for example.
+ *
+ * Thread B: The Loopback Node QP message receive workqueue receives the message
+ * and tries to send a message (eg an ACK) to the loopback SCIF device. It tries
+ * to grab the send lock again or lock X and deadlocks with Thread A. The RB
+ * cannot be drained any further due to this classic deadlock.
+ *
+ * In order to avoid deadlocks as mentioned above we have an extra level of
+ * indirection achieved by having two workqueues.
+ * 1) The first workqueue whose handler is scif_loopb_msg_handler reads
+ * messages from the Node QP RB, adds them to a list and queues work for the
+ * second workqueue.
+ *
+ * 2) The second workqueue whose handler is scif_loopb_wq_handler dequeues
+ * messages from the list, handles them, frees up the memory and dequeues
+ * more elements from the list if possible.
+ */
+int
+scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	int read_size;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = kmalloc(sizeof(*msg), GFP_KERNEL);
+		if (!msg)
+			return -ENOMEM;
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg->msg,
+					     sizeof(struct scifmsg));
+		if (read_size != sizeof(struct scifmsg)) {
+			kfree(msg);
+			scif_rb_update_read_ptr(&qp->inbound_q);
+			break;
+		}
+		spin_lock(&qp->recv_lock);
+		list_add_tail(&msg->list, &scif_info.loopb_recv_q);
+		spin_unlock(&qp->recv_lock);
+		queue_work(scif_info.loopb_wq, &scif_info.loopb_work);
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (read_size == sizeof(struct scifmsg));
+	return read_size;
+}
+
+/**
+ * scif_setup_loopback_qp - One time setup work for Loopback Node Qp.
+ * @scifdev: SCIF device
+ *
+ * Sets up the required loopback workqueues, queue pairs and ring buffers
+ */
+int scif_setup_loopback_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	void *local_q;
+	struct scif_qp *qp;
+	struct scif_peer_dev *spdev;
+
+	err = scif_setup_intr_wq(scifdev);
+	if (err)
+		goto exit;
+	INIT_LIST_HEAD(&scif_info.loopb_recv_q);
+	snprintf(scif_info.loopb_wqname, sizeof(scif_info.loopb_wqname),
+		 "SCIF LOOPB %d", scifdev->node);
+	scif_info.loopb_wq =
+		alloc_ordered_workqueue(scif_info.loopb_wqname, 0);
+	if (!scif_info.loopb_wq) {
+		err = -ENOMEM;
+		goto destroy_intr;
+	}
+	INIT_WORK(&scif_info.loopb_work, scif_loopb_wq_handler);
+	/* Allocate Self Qpair */
+	scifdev->qpairs = kzalloc(sizeof(*scifdev->qpairs), GFP_KERNEL);
+	if (!scifdev->qpairs) {
+		err = -ENOMEM;
+		goto destroy_loopb_wq;
+	}
+
+	qp = scifdev->qpairs;
+	qp->magic = SCIFEP_MAGIC;
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(SCIF_NODE_QP_SIZE, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto free_qpairs;
+	}
+	/*
+	 * For loopback the inbound_q and outbound_q are essentially the same
+	 * since the Node sends a message on the loopback interface to the
+	 * outbound_q which is then received on the inbound_q.
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+
+	scif_rb_init(&qp->inbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+	scif_info.nodeid = scifdev->node;
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto free_local_q;
+	}
+	scif_info.loopb_dev = scifdev;
+	return err;
+free_local_q:
+	kfree(local_q);
+free_qpairs:
+	kfree(scifdev->qpairs);
+destroy_loopb_wq:
+	destroy_workqueue(scif_info.loopb_wq);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+exit:
+	return err;
+}
+
+/**
+ * scif_destroy_loopback_qp - One time uninit work for Loopback Node Qp
+ * @scifdev: SCIF device
+ *
+ * Destroys the workqueues and frees up the Ring Buffer and Queue Pair memory.
+ */
+int scif_destroy_loopback_qp(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	destroy_workqueue(scif_info.loopb_wq);
+	scif_destroy_intr_wq(scifdev);
+	kfree(scifdev->qpairs->outbound_q.rb_base);
+	kfree(scifdev->qpairs);
+	scifdev->sdev = NULL;
+	scif_info.loopb_dev = NULL;
+	return 0;
+}
+
+void scif_destroy_p2p(struct scif_dev *scifdev)
+{
+	struct scif_dev *peer_dev;
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	int bd;
+
+	mutex_lock(&scif_info.conflock);
+	/* Free P2P mappings in the given node for all its peer nodes */
+	list_for_each_safe(pos, tmp, &scifdev->p2p) {
+		p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			     p2p->sg_nentries[SCIF_PPI_MMIO],
+			     DMA_BIDIRECTIONAL);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			     p2p->sg_nentries[SCIF_PPI_APER],
+			     DMA_BIDIRECTIONAL);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+		list_del(pos);
+		kfree(p2p);
+	}
+
+	/* Free P2P mapping created in the peer nodes for the given node */
+	for (bd = SCIF_MGMT_NODE + 1; bd <= scif_info.maxid; bd++) {
+		peer_dev = &scif_dev[bd];
+		list_for_each_safe(pos, tmp, &peer_dev->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == scifdev->node) {
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_MMIO],
+					     p2p->sg_nentries[SCIF_PPI_MMIO],
+					     DMA_BIDIRECTIONAL);
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_APER],
+					     p2p->sg_nentries[SCIF_PPI_APER],
+					     DMA_BIDIRECTIONAL);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+				list_del(pos);
+				kfree(p2p);
+			}
+		}
+	}
+	mutex_unlock(&scif_info.conflock);
+}
-- 
1.8.2.1
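
Editor's note on the loopback comment in scif_loopb_msg_handler() above:
the two-stage handoff can be sketched in plain user-space C. This is only
an illustration under stated assumptions. Every demo_* name is
hypothetical, the ring buffer is a toy rather than the SCIF RB, and the
locking and workqueue scheduling of the real driver are left out. Stage 1
only moves messages from the ring to a list, and stage 2 runs the handlers,
which may themselves send into the same ring.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the SCIF driver. */
enum { DEMO_REQ = 1, DEMO_ACK = 2 };
#define DEMO_RB_SLOTS 4	/* power of two so counter wrap-around stays consistent */

struct demo_msg { unsigned int uop; };

struct demo_rb {
	struct demo_msg slot[DEMO_RB_SLOTS];
	unsigned int rd, wr;	/* free-running read/write counters */
};

struct demo_node {
	struct demo_msg msg;
	struct demo_node *next;
};

struct demo_list {
	struct demo_node *head, *tail;
};

/* Returns 0 on success, -1 if the ring is full. */
static int demo_rb_write(struct demo_rb *rb, const struct demo_msg *m)
{
	if (rb->wr - rb->rd == DEMO_RB_SLOTS)
		return -1;
	rb->slot[rb->wr++ % DEMO_RB_SLOTS] = *m;
	return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int demo_rb_read(struct demo_rb *rb, struct demo_msg *m)
{
	if (rb->rd == rb->wr)
		return -1;
	*m = rb->slot[rb->rd++ % DEMO_RB_SLOTS];
	return 0;
}

/*
 * Stage 1: copy messages out of the ring onto a list. It never calls a
 * message handler, so it never has to send and the ring keeps draining.
 * Returns the number of messages moved, or -1 on allocation failure.
 */
static int demo_stage1_drain(struct demo_rb *rb, struct demo_list *q)
{
	int moved = 0;

	for (;;) {
		/* Allocate before reading so a failed allocation loses nothing. */
		struct demo_node *n = malloc(sizeof(*n));

		if (!n)
			return -1;
		if (demo_rb_read(rb, &n->msg)) {
			free(n);
			return moved;
		}
		n->next = NULL;
		if (q->tail)
			q->tail->next = n;
		else
			q->head = n;
		q->tail = n;
		moved++;
	}
}

/*
 * Stage 2: handle queued messages. A handler may send a reply into the
 * same ring; stage 1 has already made room for it.
 * Returns the number of messages handled.
 */
static int demo_stage2_handle(struct demo_rb *rb, struct demo_list *q)
{
	int handled = 0;

	while (q->head) {
		struct demo_node *n = q->head;
		struct demo_msg ack = { .uop = DEMO_ACK };

		q->head = n->next;
		if (!q->head)
			q->tail = NULL;
		if (n->msg.uop == DEMO_REQ)
			demo_rb_write(rb, &ack);	/* reply over loopback */
		free(n);
		handled++;
	}
	return handled;
}
```

Because stage 1 never blocks on a send, a sender waiting for ring space is
always eventually served, which is the property the comment in the patch
relies on.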



* [PATCH char-misc-next v3 12/13] misc: mic: MIC card driver specific changes to enable SCIF
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (10 preceding siblings ...)
  2015-01-27  0:32 57% ` [PATCH char-misc-next v3 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
@ 2015-01-27  0:32 70% ` Sudeep Dutt
  2015-01-27  0:32 98% ` [PATCH char-misc-next v3 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
  2015-02-24  0:45 99% ` [PATCH char-misc-next v3 00/13] misc: mic: " Sudeep Dutt
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC card driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.
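
For readers unfamiliar with the pattern, the ops-table registration can be
sketched in user-space C as below. This is a minimal illustration and not
the real scif_hw_ops layout: all demo_* names and fields are hypothetical,
and "registering" is reduced to binding the ops table and driver state to
a bus device.

```c
/* Hypothetical miniature of the pattern; not the real scif_hw_ops layout. */
struct demo_hw_dev;

struct demo_hw_ops {
	int (*next_db)(struct demo_hw_dev *hdev);
	void (*send_intr)(struct demo_hw_dev *hdev, int db);
};

struct demo_hw_dev {
	const struct demo_hw_ops *ops;
	void *drvdata;		/* back-pointer to the owning driver state */
};

/* Card-side driver state, standing in for struct mic_driver. */
struct demo_card {
	int next_db;		/* next doorbell number to hand out */
	int last_db_rung;	/* last doorbell an interrupt was sent on */
};

/* Thin adapters: recover the driver state, then do the card-specific work. */
static int demo_card_next_db(struct demo_hw_dev *hdev)
{
	struct demo_card *card = hdev->drvdata;

	return card->next_db++;
}

static void demo_card_send_intr(struct demo_hw_dev *hdev, int db)
{
	struct demo_card *card = hdev->drvdata;

	card->last_db_rung = db;
}

static const struct demo_hw_ops demo_card_ops = {
	.next_db = demo_card_next_db,
	.send_intr = demo_card_send_intr,
};

/* "Registering" here just binds ops and driver state to the bus device. */
static void demo_register(struct demo_hw_dev *hdev, struct demo_card *card)
{
	hdev->ops = &demo_card_ops;
	hdev->drvdata = card;
}
```

The bus-side code only ever calls through hdev->ops, so the same SCIF core
can sit on top of either the host or the card implementation.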

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/card/mic_device.h |  11 +++-
 drivers/misc/mic/card/mic_x100.h   |   1 +
 drivers/misc/mic/card/mic_device.c | 132 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |  61 ++++++++++++++++-
 5 files changed, 201 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index a2ea3ab..e9f2f56 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -56,7 +56,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS
+	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 844be8f..1dbf83c 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -29,9 +29,9 @@
 
 #include <linux/workqueue.h>
 #include <linux/io.h>
-#include <linux/irqreturn.h>
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
+#include "../bus/scif_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -73,6 +73,9 @@ struct mic_device {
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
  * @dma_mbdev: dma device on the MIC virtual bus.
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -84,6 +87,9 @@ struct mic_driver {
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
 	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -122,10 +128,11 @@ void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
 struct mic_irq *
 mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
-		     const char *name, void *data, int intr_src);
+		     const char *name, void *data, int db);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
+void mic_send_p2p_intr(int doorbell, struct mic_mw *mw);
 int mic_db_to_irq(struct mic_driver *mdrv, int db);
 u32 mic_ack_interrupt(struct mic_device *mdev);
 void mic_hw_intr_init(struct mic_driver *mdrv);
diff --git a/drivers/misc/mic/card/mic_x100.h b/drivers/misc/mic/card/mic_x100.h
index d66ea55..7e22249 100644
--- a/drivers/misc/mic/card/mic_x100.h
+++ b/drivers/misc/mic/card/mic_x100.h
@@ -35,6 +35,7 @@
 #define MIC_X100_SBOX_SDBIC0 0x0000CC90
 #define MIC_X100_SBOX_SDBIC0_DBREQ_BIT 0x80000000
 #define MIC_X100_SBOX_RDMASR0	0x0000B180
+#define MIC_X100_SBOX_APICICR0 0x0000A9D0
 
 #define MIC_X100_MAX_DOORBELL_IDX 8
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 83819ee..6338908 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -28,6 +28,8 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/reboot.h>
+#include <linux/dmaengine.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
@@ -240,6 +242,111 @@ static void mic_uninit_irq(void)
 	kfree(mdrv->irq_info.irq_usage_count);
 }
 
+static inline struct mic_driver *scdev_to_mdrv(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+		   const char *name, void *data,
+		   int db)
+{
+	return mic_request_card_irq(func, NULL, name, data, db);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	mic_free_card_irq(cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	return mic_next_card_db();
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void ___mic_send_p2p_intr(struct scif_hw_dev *scdev, int db,
+				 struct mic_mw *mw)
+{
+	mic_send_p2p_intr(db, mw);
+}
+
+static void __iomem *
+___mic_ioremap(struct scif_hw_dev *scdev,
+	       phys_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.send_p2p_intr = ___mic_send_p2p_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
+static int mic_request_dma_chans(struct mic_driver *mdrv)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, NULL, NULL);
+		if (chan) {
+			mdrv->dma_ch[mdrv->num_dma_ch++] = chan;
+			if (mdrv->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdrv->dev, "DMA channels # %d\n", mdrv->num_dma_ch);
+	return mdrv->num_dma_ch;
+}
+
+static void mic_free_dma_chans(struct mic_driver *mdrv)
+{
+	int i = 0;
+
+	for (i = 0; i < mdrv->num_dma_ch; i++) {
+		dma_release_channel(mdrv->dma_ch[i]);
+		mdrv->dma_ch[i] = NULL;
+	}
+	mdrv->num_dma_ch = 0;
+}
+
 /*
  * mic_driver_init - MIC driver initialization tasks.
  *
@@ -248,6 +355,8 @@ static void mic_uninit_irq(void)
 int __init mic_driver_init(struct mic_driver *mdrv)
 {
 	int rc;
+	struct mic_bootparam __iomem *bootparam;
+	u8 node_id;
 
 	g_drv = mdrv;
 	/*
@@ -268,13 +377,32 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 	rc = mic_shutdown_init();
 	if (rc)
 		goto irq_uninit;
+	if (!mic_request_dma_chans(mdrv)) {
+		rc = -ENODEV;
+		goto shutdown_uninit;
+	}
 	rc = mic_devices_init(mdrv);
 	if (rc)
-		goto shutdown_uninit;
+		goto dma_free;
+	bootparam = mdrv->dp;
+	node_id = ioread8(&bootparam->node_id);
+	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
+					   NULL, &scif_hw_ops,
+					   0, node_id, &mdrv->mdev.mmio, NULL,
+					   NULL, mdrv->dp, mdrv->dma_ch,
+					   mdrv->num_dma_ch);
+	if (IS_ERR(mdrv->scdev)) {
+		rc = PTR_ERR(mdrv->scdev);
+		goto device_uninit;
+	}
 	mic_create_card_debug_dir(mdrv);
 	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
 done:
 	return rc;
+device_uninit:
+	mic_devices_uninit(mdrv);
+dma_free:
+	mic_free_dma_chans(mdrv);
 shutdown_uninit:
 	mic_shutdown_uninit();
 irq_uninit:
@@ -294,7 +422,9 @@ put:
 void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
+	scif_unregister_device(mdrv->scdev);
 	mic_devices_uninit(mdrv);
+	mic_free_dma_chans(mdrv);
 	/*
 	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
 	 * The module cannot be unloaded so the only code path to call
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index e98e537..77fd41781 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -70,6 +70,41 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
+/*
+ * mic_x100_send_sbox_intr - Send an MIC_X100_SBOX interrupt to MIC.
+ */
+static void mic_x100_send_sbox_intr(struct mic_mw *mw, int doorbell)
+{
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
+
+	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
+	apicicr_low = (apicicr_low | (1 << 13));
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+static void mic_x100_send_rdmasr_intr(struct mic_mw *mw, int doorbell)
+{
+	int rdmasr_offset = MIC_X100_SBOX_RDMASR0 + (doorbell << 2);
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, 0, MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+}
+
 /**
  * mic_ack_interrupt - Device specific interrupt handling.
  * @mdev: pointer to mic_device instance
@@ -91,6 +126,18 @@ static inline int mic_get_rdmasr_irq(int index)
 	return  MIC_X100_RDMASR_IRQ_BASE + index;
 }
 
+void mic_send_p2p_intr(int db, struct mic_mw *mw)
+{
+	int rdmasr_index;
+
+	if (db < MIC_X100_NUM_SBOX_IRQ) {
+		mic_x100_send_sbox_intr(mw, db);
+	} else {
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
+		mic_x100_send_rdmasr_intr(mw, rdmasr_index);
+	}
+}
+
 /**
  * mic_hw_intr_init - Initialize h/w specific interrupt
  * information.
@@ -113,11 +160,15 @@ void mic_hw_intr_init(struct mic_driver *mdrv)
 int mic_db_to_irq(struct mic_driver *mdrv, int db)
 {
 	int rdmasr_index;
+
+	/*
+	 * The total number of doorbell interrupts on the card is 16. Indices
+	 * 0-7 fall in the SBOX category and 8-15 fall in the RDMASR category.
+	 */
 	if (db < MIC_X100_NUM_SBOX_IRQ) {
 		return mic_get_sbox_irq(db);
 	} else {
-		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
 		return mic_get_rdmasr_irq(rdmasr_index);
 	}
 }
@@ -243,10 +294,16 @@ static void mic_platform_shutdown(struct platform_device *pdev)
 	mic_remove(pdev);
 }
 
+static u64 mic_dma_mask = DMA_BIT_MASK(64);
+
 static struct platform_device mic_platform_dev = {
 	.name = mic_driver_name,
 	.id   = 0,
 	.num_resources = 0,
+	.dev = {
+		.dma_mask = &mic_dma_mask,
+		.coherent_dma_mask = DMA_BIT_MASK(64),
+	},
 };
 
 static struct platform_driver __refdata mic_platform_driver = {
-- 
1.8.2.1
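
Editor's note on the doorbell split used by mic_db_to_irq() and
mic_send_p2p_intr() above: the routing can be sketched as a small
stand-alone C function. The two register offsets and the per-doorbell
strides are copied from the patch. The 8/8 split is an assumption taken
from the comment in mic_db_to_irq(), and the range check, the demo_* names
and the result struct are hypothetical additions for illustration.

```c
/*
 * Illustrative constants. The two register offsets are copied from
 * mic_x100.h in the patch; the 8/8 split is an assumption taken from the
 * comment in mic_db_to_irq().
 */
#define DEMO_NUM_SBOX_IRQ	8
#define DEMO_NUM_DOORBELLS	16
#define DEMO_SBOX_APICICR0	0x0000A9D0
#define DEMO_SBOX_RDMASR0	0x0000B180

enum demo_db_kind { DEMO_DB_INVALID, DEMO_DB_SBOX, DEMO_DB_RDMASR };

struct demo_db_route {
	enum demo_db_kind kind;
	int index;		/* index within the SBOX or RDMASR bank */
	unsigned int offset;	/* register offset from the SBOX base */
};

static struct demo_db_route demo_route_doorbell(int db)
{
	struct demo_db_route r = { DEMO_DB_INVALID, -1, 0 };

	if (db < 0 || db >= DEMO_NUM_DOORBELLS)
		return r;	/* range check added for the sketch only */
	if (db < DEMO_NUM_SBOX_IRQ) {
		r.kind = DEMO_DB_SBOX;
		r.index = db;
		/* stride of 8, as in mic_x100_send_sbox_intr() */
		r.offset = DEMO_SBOX_APICICR0 + db * 8;
	} else {
		r.kind = DEMO_DB_RDMASR;
		r.index = db - DEMO_NUM_SBOX_IRQ;
		/* stride of 4, as in mic_x100_send_rdmasr_intr() */
		r.offset = DEMO_SBOX_RDMASR0 + (r.index << 2);
	}
	return r;
}
```

Doorbell 8 therefore maps to RDMASR index 0, which is the off-by-base case
the mic_db_to_irq() hunk corrects.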



* [PATCH char-misc-next v3 04/13] misc: mic: SCIF Peer Bus
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (2 preceding siblings ...)
  2015-01-27  0:32 61% ` [PATCH char-misc-next v3 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
@ 2015-01-27  0:32 78% ` Sudeep Dutt
  2015-01-27  0:32 96% ` [PATCH char-misc-next v3 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF peer bus is used to register and unregister SCIF peer devices
internally by the SCIF driver to signify the addition and removal of
peer nodes respectively from the SCIF network. This simplifies remote node
handling within SCIF and will also be used to support device probe/remove
for SCIF client drivers (e.g. netdev over SCIF).
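
As a rough illustration of how devices and drivers pair up on this bus, the
sketch below mirrors two details from the diff in user space: the device
name format passed to dev_set_name() and the four-character prefix compare
in scif_peer_dev_match(). The function names and the driver names used with
them are hypothetical; only the format string and the compare length come
from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a peer device name in the patch's format: "scif_peer-dev<node>". */
static void peer_dev_name(char *buf, size_t len, unsigned int dnode)
{
	assert(len > 0);
	snprintf(buf, len, "scif_peer-dev%u", dnode);
}

/* Same rule as the bus match callback: compare the first four characters. */
static int peer_dev_match(const char *dev_name, const char *drv_name)
{
	return !strncmp(dev_name, drv_name, 4);
}
```

Under this rule any driver whose name starts with "scif" matches every peer
device, so a client driver does not need a per-node ID table to be probed.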

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_peer_bus.h |  65 ++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c | 124 ++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c

diff --git a/drivers/misc/mic/scif/scif_peer_bus.h b/drivers/misc/mic/scif/scif_peer_bus.h
new file mode 100644
index 0000000..33f0dbb
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.h
@@ -0,0 +1,65 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef _SCIF_PEER_BUS_H_
+#define _SCIF_PEER_BUS_H_
+
+#include <linux/device.h>
+#include <linux/mic_common.h>
+
+/*
+ * Peer devices show up as PCIe devices for the mgmt node but not the cards.
+ * The mgmt node discovers all the cards on the PCIe bus and informs the other
+ * cards about their peers. Upon notification of a peer a node adds a peer
+ * device to the peer bus to maintain symmetry in the way devices are
+ * discovered across all nodes in the SCIF network.
+ */
+/**
+ * struct scif_peer_dev - representation of a peer SCIF device
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ */
+struct scif_peer_dev {
+	struct device dev;
+	u8 dnode;
+};
+
+/**
+ * struct scif_peer_driver - operations for a scif_peer I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_peer_driver {
+	struct device_driver driver;
+	const struct scif_peer_dev_id *id_table;
+
+	int (*probe)(struct scif_peer_dev *dev);
+	void (*remove)(struct scif_peer_dev *dev);
+};
+
+struct scif_dev;
+
+int scif_peer_register_driver(struct scif_peer_driver *driver);
+void scif_peer_unregister_driver(struct scif_peer_driver *driver);
+
+struct scif_peer_dev *scif_peer_register_device(struct scif_dev *sdev);
+void scif_peer_unregister_device(struct scif_peer_dev *sdev);
+
+int scif_peer_bus_init(void);
+void scif_peer_bus_exit(void);
+#endif /* _SCIF_PEER_BUS_H_ */
diff --git a/drivers/misc/mic/scif/scif_peer_bus.c b/drivers/misc/mic/scif/scif_peer_bus.c
new file mode 100644
index 0000000..589ae9a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#include "scif_main.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+
+static inline struct scif_peer_dev *
+dev_to_scif_peer(struct device *dev)
+{
+	return container_of(dev, struct scif_peer_dev, dev);
+}
+
+static inline struct scif_peer_driver *
+drv_to_scif_peer(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_peer_driver, driver);
+}
+
+static int scif_peer_dev_match(struct device *dv, struct device_driver *dr)
+{
+	return !strncmp(dev_name(dv), dr->name, 4);
+}
+
+static int scif_peer_dev_probe(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_peer_dev_remove(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_peer_bus = {
+	.name  = "scif_peer_bus",
+	.match = scif_peer_dev_match,
+	.probe = scif_peer_dev_probe,
+	.remove = scif_peer_dev_remove,
+};
+
+int scif_peer_register_driver(struct scif_peer_driver *driver)
+{
+	driver->driver.bus = &scif_peer_bus;
+	return driver_register(&driver->driver);
+}
+
+void scif_peer_unregister_driver(struct scif_peer_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+
+static void scif_peer_release_dev(struct device *d)
+{
+	struct scif_peer_dev *sdev = dev_to_scif_peer(d);
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	scif_cleanup_scifdev(scifdev);
+	kfree(sdev);
+}
+
+struct scif_peer_dev *
+scif_peer_register_device(struct scif_dev *scifdev)
+{
+	int ret;
+	struct scif_peer_dev *spdev;
+
+	spdev = kzalloc(sizeof(*spdev), GFP_KERNEL);
+	if (!spdev)
+		return ERR_PTR(-ENOMEM);
+
+	spdev->dev.parent = scifdev->sdev->dev.parent;
+	spdev->dev.release = scif_peer_release_dev;
+	spdev->dnode = scifdev->node;
+	spdev->dev.bus = &scif_peer_bus;
+
+	dev_set_name(&spdev->dev, "scif_peer-dev%u", spdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&spdev->dev);
+	if (ret)
+		goto free_spdev;
+	return spdev;
+free_spdev:
+	kfree(spdev);
+	return ERR_PTR(ret);
+}
+
+void scif_peer_unregister_device(struct scif_peer_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+
+int scif_peer_bus_init(void)
+{
+	return bus_register(&scif_peer_bus);
+}
+
+void scif_peer_bus_exit(void)
+{
+	bus_unregister(&scif_peer_bus);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 78%]

* [PATCH char-misc-next v3 02/13] misc: mic: SCIF ring buffer infrastructure
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
  2015-01-27  0:32 22% ` [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
@ 2015-01-27  0:32 58% ` Sudeep Dutt
  2015-01-27  0:32 61% ` [PATCH char-misc-next v3 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF ring buffer is a single producer, single consumer byte stream
ring buffer optimized for avoiding reads across the PCIe bus while
adding the required barriers and hardware workarounds for the MIC
Coprocessor. The ring buffer is used to implement a receive queue for
SCIF driver messaging between two nodes and for byte stream messaging
between SCIF endpoints. The existing in-kernel ring buffer was not
reused since it has not been designed for our use across the PCIe bus
where each node runs an independent OS. Each SCIF node has a receive
queue for every other SCIF node, and each connected endpoint has a
receive queue for messages from its peer. This pair of receive
queues is referred to as a SCIF queue pair.
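
The scheme described above can be sketched in user space as follows. This is
a simplified model, not the driver code: C11 acquire/release atomics stand in
for the kernel barriers, plain memcpy() stands in for the I/O copies, the
X100 double-write workaround is left out, and all names are hypothetical.
Each side holds its own struct rb view with privately cached offsets, so the
writer's data becomes visible only when the shared write offset is published.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* One side's view of the ring; each side keeps its own cached offsets. */
struct rb {
	uint8_t *base;               /* storage shared by both sides */
	uint32_t size;               /* capacity in bytes, a power of two */
	_Atomic uint32_t *read_ptr;  /* shared offset, stored by the reader */
	_Atomic uint32_t *write_ptr; /* shared offset, stored by the writer */
	uint32_t cur_read;           /* cached read offset */
	uint32_t cur_write;          /* cached write offset */
};

/* Bytes readable and bytes writable, same arithmetic as CIRC_CNT/CIRC_SPACE. */
static uint32_t rb_cnt(const struct rb *rb)
{
	return (rb->cur_write - rb->cur_read) & (rb->size - 1);
}

static uint32_t rb_space(const struct rb *rb)
{
	return (rb->cur_read - (rb->cur_write + 1)) & (rb->size - 1);
}

static void rb_init(struct rb *rb, _Atomic uint32_t *read_ptr,
		    _Atomic uint32_t *write_ptr, void *base, uint8_t log2_size)
{
	assert(log2_size < 32);
	rb->base = base;
	rb->size = UINT32_C(1) << log2_size;
	rb->read_ptr = read_ptr;
	rb->write_ptr = write_ptr;
	rb->cur_read = atomic_load(read_ptr);
	rb->cur_write = atomic_load(write_ptr);
}

/* Copy into the ring at off, splitting in two when the copy wraps. */
static void rb_copy_in(struct rb *rb, uint32_t off, const uint8_t *src,
		       uint32_t n)
{
	uint32_t first = rb->size - off < n ? rb->size - off : n;

	memcpy(rb->base + off, src, first);
	memcpy(rb->base, src + first, n - first);
}

/* Copy out of the ring at off, splitting in two when the copy wraps. */
static void rb_copy_out(const struct rb *rb, uint32_t off, uint8_t *dst,
			uint32_t n)
{
	uint32_t first = rb->size - off < n ? rb->size - off : n;

	memcpy(dst, rb->base + off, first);
	memcpy(dst + first, rb->base, n - first);
}

/* Writer: stage a message locally; the peer cannot see it until rb_commit(). */
static int rb_write(struct rb *rb, const void *msg, uint32_t n)
{
	rb->cur_read = atomic_load_explicit(rb->read_ptr, memory_order_acquire);
	if (rb_space(rb) < n)
		return -1;
	rb_copy_in(rb, rb->cur_write, msg, n);
	rb->cur_write = (rb->cur_write + n) & (rb->size - 1);
	return 0;
}

/* Writer: publish everything staged so far. */
static void rb_commit(struct rb *rb)
{
	atomic_store_explicit(rb->write_ptr, rb->cur_write, memory_order_release);
}

/* Reader: refresh the cached write offset only when the cache looks short. */
static uint32_t rb_read(struct rb *rb, void *msg, uint32_t n)
{
	if (rb_cnt(rb) < n)
		rb->cur_write = atomic_load_explicit(rb->write_ptr,
						     memory_order_acquire);
	if (rb_cnt(rb) < n)
		return 0;
	rb_copy_out(rb, rb->cur_read, msg, n);
	rb->cur_read = (rb->cur_read + n) & (rb->size - 1);
	return n;
}

/* Reader: hand the consumed space back to the writer. */
static void rb_update_read_ptr(struct rb *rb)
{
	atomic_store_explicit(rb->read_ptr, rb->cur_read, memory_order_release);
}
```

Because each side only ever stores to its own shared offset and works from a
cached copy of the other, neither side has to poll remote memory on the fast
path, which is the property the description above is after.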

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_rb.h | 100 ++++++++++++++++
 drivers/misc/mic/scif/scif_rb.c | 248 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 348 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

diff --git a/drivers/misc/mic/scif/scif_rb.h b/drivers/misc/mic/scif/scif_rb.h
new file mode 100644
index 0000000..166dffe
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.h
@@ -0,0 +1,100 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef SCIF_RB_H
+#define SCIF_RB_H
+/*
+ * This file describes a general purpose, byte based ring buffer. Writers to the
+ * ring buffer need to synchronize using a lock. The same is true for readers,
+ * although in practice, the ring buffer has a single reader. It is lockless
+ * between producer and consumer so it can handle being used across the PCIe
+ * bus. The ring buffer ensures that there are no reads across the PCIe bus for
+ * performance reasons. Two of these are used to form a single bidirectional
+ * queue-pair across PCIe.
+ */
+/*
+ * struct scif_rb - SCIF Ring Buffer
+ *
+ * @rb_base: The base of the memory used for storing RB messages
+ * @read_ptr: Pointer to the read offset
+ * @write_ptr: Pointer to the write offset
+ * @size: Size of the memory in rb_base
+ * @current_read_offset: Cached read offset for performance
+ * @current_write_offset: Cached write offset for performance
+ */
+struct scif_rb {
+	void *rb_base;
+	u32 *read_ptr;
+	u32 *write_ptr;
+	u32 size;
+	u32 current_read_offset;
+	u32 current_write_offset;
+};
+
+/* methods used by both */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size);
+/* writer only methods */
+/* write a new command, then scif_rb_commit() */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size);
+/* after write(), then scif_rb_commit() */
+void scif_rb_commit(struct scif_rb *rb);
+/* query space available for writing to a RB. */
+u32 scif_rb_space(struct scif_rb *rb);
+
+/* reader only methods */
+/* read a new message from the ring buffer of size bytes */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size);
+/* update the read pointer so that the space can be reused */
+void scif_rb_update_read_ptr(struct scif_rb *rb);
+/* count the number of bytes that can be read */
+u32 scif_rb_count(struct scif_rb *rb, u32 size);
+#endif
diff --git a/drivers/misc/mic/scif/scif_rb.c b/drivers/misc/mic/scif/scif_rb.c
new file mode 100644
index 0000000..28e99ea
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.c
@@ -0,0 +1,248 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/circ_buf.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+#include "scif_rb.h"
+
+#define scif_rb_ring_cnt(head, tail, size) CIRC_CNT(head, tail, size)
+#define scif_rb_ring_space(head, tail, size) CIRC_SPACE(head, tail, size)
+
+/**
+ * scif_rb_init - Initializes the ring buffer
+ * @rb: ring buffer
+ * @read_ptr: A pointer to the read offset
+ * @write_ptr: A pointer to the write offset
+ * @rb_base: A pointer to the base of the ring buffer
+ * @size: log2 of the ring buffer size in bytes (buffer is 1 << size bytes)
+ */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size)
+{
+	rb->rb_base = rb_base;
+	rb->size = (1 << size);
+	rb->read_ptr = read_ptr;
+	rb->write_ptr = write_ptr;
+	rb->current_read_offset = *read_ptr;
+	rb->current_write_offset = *write_ptr;
+}
+
+/* Copies a message to the ring buffer -- handles the wrap around case */
+static void memcpy_torb(struct scif_rb *rb, void *header,
+			void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_toio((void __iomem __force *)header, msg, size1);
+		memcpy_toio((void __iomem __force *)rb->rb_base,
+			    msg + size1, size2);
+	} else {
+		memcpy_toio((void __iomem __force *)header, msg, size);
+	}
+}
+
+/* Copies a message from the ring buffer -- handles the wrap around case */
+static void memcpy_fromrb(struct scif_rb *rb, void *header,
+			  void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_fromio(msg, (void __iomem __force *)header, size1);
+		memcpy_fromio(msg+size1,
+			      (void __iomem __force *)rb->rb_base, size2);
+	} else {
+		memcpy_fromio(msg, (void __iomem __force *)header, size);
+	}
+}
+
+/**
+ * scif_rb_space - Query space available for writing to the RB
+ * @rb: ring buffer
+ *
+ * Return: size available for writing to RB in bytes.
+ */
+u32 scif_rb_space(struct scif_rb *rb)
+{
+	rb->current_read_offset = *rb->read_ptr;
+	/*
+	 * Update from the HW read pointer only once the peer has exposed the
+	 * new empty slot. This barrier is paired with the memory barrier in
+	 * scif_rb_update_read_ptr().
+	 */
+	mb();
+	return scif_rb_ring_space(rb->current_write_offset,
+				  rb->current_read_offset, rb->size);
+}
+
+/**
+ * scif_rb_write - Write a message to the RB
+ * @rb: ring buffer
+ * @msg: buffer to send the message.  Must be at least size bytes long
+ * @size: the size (in bytes) to be copied to the RB
+ *
+ * This API does not block if there isn't enough space in the RB.
+ * Returns: 0 on success or -ENOMEM on failure
+ */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header;
+
+	if (scif_rb_space(rb) < size)
+		return -ENOMEM;
+	header = rb->rb_base + rb->current_write_offset;
+	memcpy_torb(rb, header, msg, size);
+	/*
+	 * Wait until scif_rb_commit(). Update the local ring
+	 * buffer data, not the shared data until commit.
+	 */
+	rb->current_write_offset =
+		(rb->current_write_offset + size) & (rb->size - 1);
+	return 0;
+}
+
+/**
+ * scif_rb_commit - To submit the message to let the peer fetch it
+ * @rb: ring buffer
+ */
+void scif_rb_commit(struct scif_rb *rb)
+{
+	/*
+	 * We must ensure ordering between all the data committed
+	 * previously before we expose the new message to the peer by
+	 * updating the write_ptr. This write barrier is paired with
+	 * the read barrier in scif_rb_count(..)
+	 */
+	wmb();
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#endif
+}
+
+/**
+ * scif_rb_get - To get next message from the ring buffer
+ * @rb: ring buffer
+ * @size: Number of bytes to be read
+ *
+ * Return: NULL if no bytes to be read from the ring buffer, otherwise the
+ *	pointer to the next byte
+ */
+static void *scif_rb_get(struct scif_rb *rb, u32 size)
+{
+	void *header = NULL;
+
+	if (scif_rb_count(rb, size) >= size)
+		header = rb->rb_base + rb->current_read_offset;
+	return header;
+}
+
+/*
+ * scif_rb_get_next - Read from ring buffer.
+ * @rb: ring buffer
+ * @msg: buffer to hold the message.  Must be at least size bytes long
+ * @size: Number of bytes to be read
+ *
+ * Return: number of bytes read if available bytes are >= size, otherwise
+ * returns zero.
+ */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header = NULL;
+	int read_size = 0;
+
+	header = scif_rb_get(rb, size);
+	if (header) {
+		u32 next_cmd_offset =
+			(rb->current_read_offset + size) & (rb->size - 1);
+
+		read_size = size;
+		rb->current_read_offset = next_cmd_offset;
+		memcpy_fromrb(rb, header, msg, size);
+	}
+	return read_size;
+}
+
+/**
+ * scif_rb_update_read_ptr
+ * @rb: ring buffer
+ */
+void scif_rb_update_read_ptr(struct scif_rb *rb)
+{
+	u32 new_offset;
+
+	new_offset = rb->current_read_offset;
+	/*
+	 * We must ensure ordering between all the data committed or read
+	 * previously before we expose the empty slot to the peer by updating
+	 * the read_ptr. This barrier is paired with the memory barrier in
+	 * scif_rb_space(..)
+	 */
+	mb();
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si Bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#endif
+}
+
+/**
+ * scif_rb_count
+ * @rb: ring buffer
+ * @size: Number of bytes expected to be read
+ *
+ * Return: number of bytes that can be read from the RB
+ */
+u32 scif_rb_count(struct scif_rb *rb, u32 size)
+{
+	if (scif_rb_ring_cnt(rb->current_write_offset,
+			     rb->current_read_offset,
+			     rb->size) < size) {
+		rb->current_write_offset = *rb->write_ptr;
+		/*
+		 * Update from the HW write pointer if empty only once the peer
+		 * has exposed the new message. This read barrier is paired
+		 * with the write barrier in scif_rb_commit(..)
+		 */
+		smp_rmb();
+	}
+	return scif_rb_ring_cnt(rb->current_write_offset,
+				rb->current_read_offset,
+				rb->size);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 58%]

* [PATCH char-misc-next v3 05/13] misc: mic: Common MIC header file changes in preparation for SCIF
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (3 preceding siblings ...)
  2015-01-27  0:32 78% ` [PATCH char-misc-next v3 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
@ 2015-01-27  0:32 96% ` Sudeep Dutt
  2015-01-27  0:32 47% ` [PATCH char-misc-next v3 06/13] misc: mic: SCIF module initialization Sudeep Dutt
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Update mic_bootparam and define the maximum number of DMA channels.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/common/mic_dev.h |  3 +++
 include/uapi/linux/mic_common.h   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/misc/mic/common/mic_dev.h b/drivers/misc/mic/common/mic_dev.h
index 92999c2..0b58c46 100644
--- a/drivers/misc/mic/common/mic_dev.h
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -48,4 +48,7 @@ struct mic_mw {
 #define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
 #define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
 
+/* Maximum number of DMA channels */
+#define MIC_MAX_DMA_CHAN 4
+
 #endif
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
index 6eb4024..302a2ce 100644
--- a/include/uapi/linux/mic_common.h
+++ b/include/uapi/linux/mic_common.h
@@ -80,6 +80,12 @@ struct mic_device_ctrl {
  * @h2c_config_db: Host to Card Virtio config doorbell set by card
  * @shutdown_status: Card shutdown status set by card
  * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
+ * @tot_nodes: Total number of nodes in the SCIF network
+ * @node_id: Unique id of the node
+ * @h2c_scif_db: Host to card SCIF doorbell set by card
+ * @c2h_scif_db: Card to host SCIF doorbell set by host
+ * @scif_host_dma_addr: SCIF host queue pair DMA address
+ * @scif_card_dma_addr: SCIF card queue pair DMA address
  */
 struct mic_bootparam {
 	__le32 magic;
@@ -88,6 +94,12 @@ struct mic_bootparam {
 	__s8 h2c_config_db;
 	__u8 shutdown_status;
 	__u8 shutdown_card;
+	__u8 tot_nodes;
+	__u8 node_id;
+	__u8 h2c_scif_db;
+	__u8 c2h_scif_db;
+	__u64 scif_host_dma_addr;
+	__u64 scif_card_dma_addr;
 } __attribute__ ((aligned(8)));
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 96%]

* [PATCH char-misc-next v3 09/13] misc: mic: SCIF connections APIs i.e. accept and connect
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (7 preceding siblings ...)
  2015-01-27  0:32 41% ` [PATCH char-misc-next v3 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
@ 2015-01-27  0:32 41% ` Sudeep Dutt
  2015-01-27  0:32 52% ` [PATCH char-misc-next v3 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

From: Nikhil Rao <nikhil.rao@intel.com>

SCIF connection APIs which establish a SCIF connection between
a pair of SCIF endpoints. A SCIF connection consists of a
dedicated queue-pair between the endpoints. Client messages are
sent over the queue-pair whereas the signaling associated with the
message is multiplexed over the node queue-pair. Similarly other
control messages such as exposing registered memory are also sent
over the node queue-pair. The SCIF endpoints must be in connected
state to exchange messages, register memory, map remote memory and
trigger DMA transfers. SCIF connections can be set up
asynchronously or synchronously.
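
The connecting side of this handshake (scif_conn_func() in the diff below)
can be modelled roughly as the small state function here. It is a sketch
under stated assumptions: the enum and function names are hypothetical, the
timeout path is not modelled, the move to MAPPING on a grant and back to
BOUND on a reject is inferred from the state checks in scif_conn_func(), and
-ENOMEM is only a stand-in for whatever error the queue mapping step returns.

```c
#include <assert.h>
#include <errno.h>

enum ep_state { EP_BOUND, EP_CONNECTING, EP_MAPPING, EP_CONNECTED };
enum reply { REPLY_GNT, REPLY_REJ };          /* peer's answer to CNCT_REQ */
enum ack { ACK_NONE, ACK_GNTACK, ACK_GNTNACK }; /* what we send back */

struct conn_result {
	enum ep_state state; /* endpoint state when connect returns */
	enum ack sent;       /* acknowledgement sent to the accepting side */
	int err;             /* 0 or a negative errno */
};

/* Connecting endpoint after the peer has replied to the connect request. */
static struct conn_result connect_step(enum reply reply, int map_ok)
{
	struct conn_result r = { EP_CONNECTING, ACK_NONE, 0 };

	/* Assumed: a grant moves the endpoint to MAPPING, a reject to BOUND. */
	r.state = (reply == REPLY_GNT) ? EP_MAPPING : EP_BOUND;

	if (r.state == EP_MAPPING) {
		if (!map_ok) {
			/* Could not map the remote QP: tell the peer to give up. */
			r.sent = ACK_GNTNACK;
			r.state = EP_BOUND;
			r.err = -ENOMEM; /* stand-in error code */
		} else {
			r.sent = ACK_GNTACK;
			r.state = EP_CONNECTED;
		}
	} else {
		r.err = -ECONNREFUSED;
	}
	return r;
}
```

The accepting side is the mirror image: it blocks after sending the grant
and completes scif_accept(..) once either acknowledgement arrives.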

Thanks to Johnnie S Peters for authoring parts of this patch during
early bring up of the SCIF driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   8 +
 drivers/misc/mic/scif/scif_api.c    | 452 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    | 229 ++++++++++++++++++
 drivers/misc/mic/scif/scif_fd.c     | 106 +++++++++
 drivers/misc/mic/scif/scif_main.c   |   1 +
 drivers/misc/mic/scif/scif_nm.c     |  36 +++
 drivers/misc/mic/scif/scif_nodeqp.c |  16 +-
 7 files changed, 847 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index b973b99..65c628e 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,5 +144,13 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index cbdc27b..8b55b59 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -415,3 +415,455 @@ int scif_listen(scif_epd_t epd, int backlog)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(scif_listen);
+
+/*
+ ************************************************************************
+ * SCIF connection flow:
+ *
+ * 1) A SCIF listening endpoint can call scif_accept(..) to wait for SCIF
+ *	connections via a SCIF_CNCT_REQ message
+ * 2) A SCIF endpoint can initiate a SCIF connection by calling
+ *	scif_connect(..) which calls scif_setup_qp_connect(..) which
+ *	allocates the local qp for the endpoint ring buffer and then sends
+ *	a SCIF_CNCT_REQ to the remote node and waits for a SCIF_CNCT_GNT or
+ *	a SCIF_CNCT_REJ message
+ * 3) The peer node handles a SCIF_CNCT_REQ via scif_cnctreq_resp(..) which
+ *	wakes up any threads blocked in step 1 or sends a SCIF_CNCT_REJ
+ *	message otherwise
+ * 4) A thread blocked waiting for incoming connections allocates its local
+ *	endpoint QP and ring buffer following which it sends a SCIF_CNCT_GNT
+ *	and waits for a SCIF_CNCT_GNT(N)ACK. If the allocation fails then
+ *	the node sends a SCIF_CNCT_REJ message
+ * 5) Upon receipt of a SCIF_CNCT_GNT or a SCIF_CNCT_REJ message the
+ *	connecting endpoint is woken up as part of handling
+ *	scif_cnctgnt_resp(..) following which it maps the remote endpoint's
+ *	QP, updates its outbound QP and sends a SCIF_CNCT_GNTACK message on
+ *	success or a SCIF_CNCT_GNTNACK message on failure and completes
+ *	the scif_connect(..) API
+ * 6) Upon receipt of a SCIF_CNCT_GNT(N)ACK the accepting endpoint blocked
+ *	in step 4 is woken up and completes the scif_accept(..) API
+ * 7) The SCIF connection is now established between the two SCIF endpoints.
+ */
+static int scif_conn_func(struct scif_endpt *ep)
+{
+	int err = 0;
+	struct scifmsg msg;
+	struct device *spdev;
+
+	/* Initiate the first part of the endpoint QP setup */
+	err = scif_setup_qp_connect(ep->qp_info.qp, &ep->qp_info.qp_offset,
+				    SCIF_ENDPT_QP_SIZE, ep->remote_dev);
+	if (err) {
+		dev_err(&ep->remote_dev->sdev->dev,
+			"%s err %d qp_offset 0x%llx\n",
+			__func__, err, ep->qp_info.qp_offset);
+		ep->state = SCIFEP_BOUND;
+		goto connect_error_simple;
+	}
+
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	/* Format connect message and send it */
+	msg.src = ep->port;
+	msg.dst = ep->conn_port;
+	msg.uop = SCIF_CNCT_REQ;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->qp_info.qp_offset;
+	err = _scif_nodeqp_send(ep->remote_dev, &msg);
+	if (err)
+		goto connect_error_dec;
+	scif_put_peer_dev(spdev);
+	/*
+	 * Wait for the remote node to respond with SCIF_CNCT_GNT or
+	 * SCIF_CNCT_REJ message.
+	 */
+	wait_event_timeout(ep->conwq, ep->state != SCIFEP_CONNECTING,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	if (ep->state == SCIFEP_MAPPING) {
+		err = scif_setup_qp_connect_response(ep->remote_dev,
+						     ep->qp_info.qp,
+						     ep->qp_info.gnt_pld);
+		/*
+		 * If the resources to map the queue are not available then
+		 * we need to tell the other side to terminate the accept
+		 */
+		if (err) {
+			dev_err(&ep->remote_dev->sdev->dev,
+				"%s %d err %d\n", __func__, __LINE__, err);
+			msg.uop = SCIF_CNCT_GNTNACK;
+			msg.payload[0] = ep->remote_ep;
+			_scif_nodeqp_send(ep->remote_dev, &msg);
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+
+		msg.uop = SCIF_CNCT_GNTACK;
+		msg.payload[0] = ep->remote_ep;
+		err = _scif_nodeqp_send(ep->remote_dev, &msg);
+		if (err) {
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+		ep->state = SCIFEP_CONNECTED;
+		mutex_lock(&scif_info.connlock);
+		list_add_tail(&ep->list, &scif_info.connected);
+		mutex_unlock(&scif_info.connlock);
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connected\n", ep);
+	} else if (ep->state == SCIFEP_BOUND) {
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connection refused\n", ep);
+		err = -ECONNREFUSED;
+		goto connect_error_dec;
+	}
+	scif_put_peer_dev(spdev);
+	return err;
+connect_error_dec:
+	scif_put_peer_dev(spdev);
+cleanup_qp:
+	scif_cleanup_ep_qp(ep);
+connect_error_simple:
+	return err;
+}
+
+/*
+ * scif_conn_handler:
+ *
+ * Workqueue handler for servicing non-blocking SCIF connect
+ *
+ */
+void scif_conn_handler(struct work_struct *work)
+{
+	struct scif_endpt *ep;
+
+	do {
+		ep = NULL;
+		spin_lock(&scif_info.nb_connect_lock);
+		if (!list_empty(&scif_info.nb_connect_list)) {
+			ep = list_first_entry(&scif_info.nb_connect_list,
+					      struct scif_endpt, conn_list);
+			list_del(&ep->conn_list);
+		}
+		spin_unlock(&scif_info.nb_connect_lock);
+		if (ep)
+			ep->conn_err = scif_conn_func(ep);
+	} while (ep);
+}
+
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	struct scif_dev *remote_dev;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI connect: ep %p %s\n", ep,
+		scif_ep_states[ep->state]);
+
+	if (!scif_dev || dst->node > scif_info.maxid)
+		return -ENODEV;
+
+	might_sleep();
+
+	remote_dev = &scif_dev[dst->node];
+	spdev = scif_get_peer_dev(remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		return err;
+	}
+
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+		err = -EINVAL;
+		break;
+	case SCIFEP_DISCONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EINVAL;
+		break;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+		err = -EOPNOTSUPP;
+		break;
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			err = -EINPROGRESS;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_CONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_UNBOUND:
+		ep->port.port = scif_get_new_port();
+		if (!ep->port.port) {
+			err = -ENOSPC;
+		} else {
+			ep->port.node = scif_info.nodeid;
+			ep->conn_async_state = ASYNC_CONN_IDLE;
+		}
+		/* Fall through */
+	case SCIFEP_BOUND:
+		/*
+		 * If a non-blocking connect has been already initiated
+		 * (conn_async_state is either ASYNC_CONN_INPROGRESS or
+		 * ASYNC_CONN_FLUSH_WORK), the end point could end up in
+		 * SCIFEP_BOUND due to an error in the connection process
+		 * (e.g., connection refused). If conn_async_state is
+		 * ASYNC_CONN_INPROGRESS - transition to ASYNC_CONN_FLUSH_WORK
+		 * so that the error status can be collected. If the state is
+		 * already ASYNC_CONN_FLUSH_WORK - then set the error to
+		 * EINPROGRESS since some other thread is waiting to collect
+		 * error status.
+		 */
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+			err = -EINPROGRESS;
+		} else {
+			ep->conn_port = *dst;
+			init_waitqueue_head(&ep->sendwq);
+			init_waitqueue_head(&ep->recvwq);
+			init_waitqueue_head(&ep->conwq);
+			ep->conn_async_state = 0;
+
+			if (unlikely(non_block))
+				ep->conn_async_state = ASYNC_CONN_INPROGRESS;
+		}
+		break;
+	}
+
+	if (err || ep->conn_async_state == ASYNC_CONN_FLUSH_WORK)
+		goto connect_simple_unlock1;
+
+	ep->state = SCIFEP_CONNECTING;
+	ep->remote_dev = &scif_dev[dst->node];
+	ep->qp_info.qp->magic = SCIFEP_MAGIC;
+	if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+		spin_lock(&scif_info.nb_connect_lock);
+		list_add_tail(&ep->conn_list, &scif_info.nb_connect_list);
+		spin_unlock(&scif_info.nb_connect_lock);
+		err = -EINPROGRESS;
+		schedule_work(&scif_info.conn_work);
+	}
+connect_simple_unlock1:
+	spin_unlock(&ep->lock);
+	scif_put_peer_dev(spdev);
+	if (err) {
+		return err;
+	} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+		flush_work(&scif_info.conn_work);
+		err = ep->conn_err;
+		spin_lock(&ep->lock);
+		ep->conn_async_state = ASYNC_CONN_IDLE;
+		spin_unlock(&ep->lock);
+	} else {
+		err = scif_conn_func(ep);
+	}
+	return err;
+}
+
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst)
+{
+	return __scif_connect(epd, dst, false);
+}
+EXPORT_SYMBOL_GPL(scif_connect);
+
+/**
+ * scif_accept() - Accept a connection request from the remote node
+ *
+ * The function accepts a connection request from the remote node.  Successful
+ * completion is indicated by a new end point being created and passed back
+ * to the caller for future reference.
+ *
+ * Upon successful completion, zero will be returned and the peer information
+ * will be filled in.
+ *
+ * If the end point is not in the listening state -EINVAL will be returned.
+ *
+ * If resource allocation fails during the connection sequence, -ENOMEM
+ * will be returned.
+ *
+ * If the function is called without the SCIF_ACCEPT_SYNC flag set and no
+ * connection requests are pending it will return -EAGAIN.
+ *
+ * If the remote side is not sending any connection requests the caller may
+ * terminate this function with a signal.  If so a -EINTR will be returned.
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
+		scif_epd_t *newepd, int flags)
+{
+	struct scif_endpt *lep = (struct scif_endpt *)epd;
+	struct scif_endpt *cep;
+	struct scif_conreq *conreq;
+	struct scifmsg msg;
+	int err;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI accept: ep %p %s\n", lep, scif_ep_states[lep->state]);
+
+	if (flags & ~SCIF_ACCEPT_SYNC)
+		return -EINVAL;
+
+	if (!peer || !newepd)
+		return -EINVAL;
+
+	might_sleep();
+	spin_lock(&lep->lock);
+	if (lep->state != SCIFEP_LISTENING) {
+		spin_unlock(&lep->lock);
+		return -EINVAL;
+	}
+
+	if (!lep->conreqcnt && !(flags & SCIF_ACCEPT_SYNC)) {
+		/* No connection request present and we do not want to wait */
+		spin_unlock(&lep->lock);
+		return -EAGAIN;
+	}
+
+	lep->files = current->files;
+retry_connection:
+	spin_unlock(&lep->lock);
+	/* Wait for the remote node to send us a SCIF_CNCT_REQ */
+	err = wait_event_interruptible(lep->conwq,
+				       (lep->conreqcnt ||
+				       (lep->state != SCIFEP_LISTENING)));
+	if (err)
+		return err;
+
+	if (lep->state != SCIFEP_LISTENING)
+		return -EINTR;
+
+	spin_lock(&lep->lock);
+
+	if (!lep->conreqcnt)
+		goto retry_connection;
+
+	/* Get the first connect request off the list */
+	conreq = list_first_entry(&lep->conlist, struct scif_conreq, list);
+	list_del(&conreq->list);
+	lep->conreqcnt--;
+	spin_unlock(&lep->lock);
+
+	/* Fill in the peer information */
+	peer->node = conreq->msg.src.node;
+	peer->port = conreq->msg.src.port;
+
+	cep = kzalloc(sizeof(*cep), GFP_KERNEL);
+	if (!cep) {
+		err = -ENOMEM;
+		goto scif_accept_error_epalloc;
+	}
+	spin_lock_init(&cep->lock);
+	mutex_init(&cep->sendlock);
+	mutex_init(&cep->recvlock);
+	cep->state = SCIFEP_CONNECTING;
+	cep->remote_dev = &scif_dev[peer->node];
+	cep->remote_ep = conreq->msg.payload[0];
+
+	cep->qp_info.qp = kzalloc(sizeof(*cep->qp_info.qp), GFP_KERNEL);
+	if (!cep->qp_info.qp) {
+		err = -ENOMEM;
+		goto scif_accept_error_qpalloc;
+	}
+
+	cep->qp_info.qp->magic = SCIFEP_MAGIC;
+	spdev = scif_get_peer_dev(cep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto scif_accept_error_map;
+	}
+	err = scif_setup_qp_accept(cep->qp_info.qp, &cep->qp_info.qp_offset,
+				   conreq->msg.payload[1], SCIF_ENDPT_QP_SIZE,
+				   cep->remote_dev);
+	if (err) {
+		dev_dbg(&cep->remote_dev->sdev->dev,
+			"SCIFAPI accept: ep %p new %p scif_setup_qp_accept %d qp_offset 0x%llx\n",
+			lep, cep, err, cep->qp_info.qp_offset);
+		scif_put_peer_dev(spdev);
+		goto scif_accept_error_map;
+	}
+
+	cep->port.node = lep->port.node;
+	cep->port.port = lep->port.port;
+	cep->peer.node = peer->node;
+	cep->peer.port = peer->port;
+	init_waitqueue_head(&cep->sendwq);
+	init_waitqueue_head(&cep->recvwq);
+	init_waitqueue_head(&cep->conwq);
+
+	msg.uop = SCIF_CNCT_GNT;
+	msg.src = cep->port;
+	msg.payload[0] = cep->remote_ep;
+	msg.payload[1] = cep->qp_info.qp_offset;
+	msg.payload[2] = (u64)cep;
+
+	err = _scif_nodeqp_send(cep->remote_dev, &msg);
+	scif_put_peer_dev(spdev);
+	if (err)
+		goto scif_accept_error_map;
+retry:
+	/* Wait for the remote node to respond with SCIF_CNCT_GNT(N)ACK */
+	err = wait_event_timeout(cep->conwq, cep->state != SCIFEP_CONNECTING,
+				 SCIF_NODE_ACCEPT_TIMEOUT);
+	if (!err && scifdev_alive(cep))
+		goto retry;
+	err = !err ? -ENODEV : 0;
+	if (err)
+		goto scif_accept_error_map;
+	kfree(conreq);
+
+	spin_lock(&cep->lock);
+
+	if (cep->state == SCIFEP_CLOSING) {
+		/*
+		 * Remote failed to allocate resources and NAKed the grant.
+		 * There is at this point nothing referencing the new end point.
+		 */
+		spin_unlock(&cep->lock);
+		scif_teardown_ep(cep);
+		kfree(cep);
+
+		/* If called with the sync flag, go back and wait. */
+		if (flags & SCIF_ACCEPT_SYNC) {
+			spin_lock(&lep->lock);
+			goto retry_connection;
+		}
+		return -EAGAIN;
+	}
+
+	scif_get_port(cep->port.port);
+	*newepd = (scif_epd_t)cep;
+	spin_unlock(&cep->lock);
+	return 0;
+scif_accept_error_map:
+	scif_teardown_ep(cep);
+scif_accept_error_qpalloc:
+	kfree(cep);
+scif_accept_error_epalloc:
+	msg.uop = SCIF_CNCT_REJ;
+	msg.dst.node = conreq->msg.src.node;
+	msg.dst.port = conreq->msg.src.port;
+	msg.payload[0] = conreq->msg.payload[0];
+	msg.payload[1] = conreq->msg.payload[1];
+	scif_nodeqp_send(&scif_dev[conreq->msg.src.node], &msg);
+	kfree(conreq);
+	return err;
+}
+EXPORT_SYMBOL_GPL(scif_accept);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index a2b728a..685ee5c 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -76,6 +76,24 @@ void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
 	schedule_work(&scif_info.misc_work);
 }
 
+static struct scif_endpt *scif_find_listen_ep(u16 port)
+{
+	struct scif_endpt *ep = NULL;
+	struct list_head *pos, *tmpq;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.listen) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->port.port == port) {
+			spin_lock(&ep->lock);
+			spin_unlock(&scif_info.eplock);
+			return ep;
+		}
+	}
+	spin_unlock(&scif_info.eplock);
+	return NULL;
+}
+
 void scif_cleanup_zombie_epd(void)
 {
 	struct list_head *pos, *tmpq;
@@ -90,3 +108,214 @@ void scif_cleanup_zombie_epd(void)
 	}
 	spin_unlock(&scif_info.eplock);
 }
+
+/**
+ * scif_cnctreq() - Respond to SCIF_CNCT_REQ interrupt message
+ * @msg:        Interrupt message
+ *
+ * This message is initiated by the remote node to request a connection
+ * to the local node.  This function looks for an end point in the
+ * listen state on the requested port id.
+ *
+ * If it finds a listening port it places the connect request on the
+ * listening end point's queue and wakes up any pending accept calls.
+ *
+ * If it does not find a listening end point it sends a connection
+ * reject message to the remote node.
+ */
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_conreq *conreq;
+
+	conreq = kmalloc(sizeof(*conreq), GFP_KERNEL);
+	if (!conreq)
+		/* Lack of resources so reject the request. */
+		goto conreq_sendrej;
+
+	ep = scif_find_listen_ep(msg->dst.port);
+	if (!ep)
+		/*  Send reject due to no listening ports */
+		goto conreq_sendrej_free;
+
+	if (ep->backlog <= ep->conreqcnt) {
+		/*  Send reject due to too many pending requests */
+		spin_unlock(&ep->lock);
+		goto conreq_sendrej_free;
+	}
+
+	conreq->msg = *msg;
+	list_add_tail(&conreq->list, &ep->conlist);
+	ep->conreqcnt++;
+	wake_up_interruptible(&ep->conwq);
+	spin_unlock(&ep->lock);
+	return;
+
+conreq_sendrej_free:
+	kfree(conreq);
+conreq_sendrej:
+	msg->uop = SCIF_CNCT_REJ;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_cnctgnt() - Respond to SCIF_CNCT_GNT interrupt message
+ * @msg:        Interrupt message
+ *
+ * An accept() on the remote node has occurred and sent this message
+ * to indicate success.  Place the end point in the MAPPING state and
+ * save the remote node's memory information.  Then wake up the connect
+ * request so it can finish.
+ */
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->peer.node = msg->src.node;
+		ep->peer.port = msg->src.port;
+		ep->qp_info.gnt_pld = msg->payload[1];
+		ep->remote_ep = msg->payload[2];
+		ep->state = SCIFEP_MAPPING;
+
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctgnt_ack() - Respond to SCIF_CNCT_GNTACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request has finished mapping the local memory.
+ * Place the connection in the connected state and wake up the pending
+ * accept() call.
+ */
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	mutex_lock(&scif_info.connlock);
+	spin_lock(&ep->lock);
+	/* New ep is now connected with all resources set. */
+	ep->state = SCIFEP_CONNECTED;
+	list_add_tail(&ep->list, &scif_info.connected);
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+}
+
+/**
+ * scif_cnctgnt_nack() - Respond to SCIF_CNCT_GNTNACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request failed to map the local memory it was sent.
+ * Place the end point in the CLOSING state to indicate it and wake up
+ * the pending accept().
+ */
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_CLOSING;
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctrej() - Respond to SCIF_CNCT_REJ interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote end has rejected the connection request.  Set the end
+ * point back to the bound state and wake up the pending connect().
+ */
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->state = SCIFEP_BOUND;
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_discnct() - Respond to SCIF_DISCNCT interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote node has indicated close() has been called on its end
+ * point.  Remove the local end point from the connected list, set its
+ * state to disconnected and ensure accesses to the remote node are
+ * shut down.
+ *
+ * When all accesses to the remote end have completed then send a
+ * DISCNT_ACK to indicate it can remove its resources and complete
+ * the close routine.
+ */
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		/*
+		 * The local ep may have sent a disconnect and been closed
+		 * due to a message response time out. It may have been
+		 * allocated again and formed a new connection so we want to
+		 * check if the remote ep matches
+		 */
+		if (((u64)tmpep == msg->payload[1]) &&
+		    ((u64)tmpep->remote_ep == msg->payload[0])) {
+			list_del(pos);
+			ep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	/*
+	 * If the terminated end is not found then this side started closing
+	 * before the other side sent the disconnect.  If so the ep will no
+	 * longer be on the connected list.  Regardless the other side
+	 * needs to be acked to let it know close is complete.
+	 */
+	if (!ep) {
+		mutex_unlock(&scif_info.connlock);
+		goto discnct_ack;
+	}
+
+	ep->state = SCIFEP_DISCONNECTED;
+	list_add_tail(&ep->list, &scif_info.disconnected);
+
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+discnct_ack:
+	msg->uop = SCIF_DISCNT_ACK;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_discnt_ack() - Respond to SCIF_DISCNT_ACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has indicated it has no more references to local resources
+ */
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_DISCONNECTED;
+	spin_unlock(&ep->lock);
+	complete(&ep->discon);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index 812c23f..c1d6bc3 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -68,6 +68,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
+	int err = 0;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -91,6 +92,111 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	}
 	case SCIF_LISTEN:
 		return scif_listen(priv, arg);
+	case SCIF_CONNECT:
+	{
+		struct scifioctl_connect req;
+		struct scif_endpt *ep = (struct scif_endpt *)priv;
+
+		if (copy_from_user(&req, argp, sizeof(req)))
+			return -EFAULT;
+
+		err = __scif_connect(priv, &req.peer, non_block);
+		if (err < 0)
+			return err;
+
+		req.self.node = ep->port.node;
+		req.self.port = ep->port.port;
+
+		if (copy_to_user(argp, &req, sizeof(req)))
+			return -EFAULT;
+
+		return 0;
+	}
+	/*
+	 * Accept is done in two halves.  The request ioctl does the basic
+	 * functionality of accepting the request and returning the information
+	 * about it including the internal ID of the end point.  The register
+	 * is done with the internal ID on a new file descriptor opened by the
+	 * requesting process.
+	 */
+	case SCIF_ACCEPTREQ:
+	{
+		struct scifioctl_accept request;
+		scif_epd_t *ep = (scif_epd_t *)&request.endpt;
+
+		if (copy_from_user(&request, argp, sizeof(request)))
+			return -EFAULT;
+
+		err = scif_accept(priv, &request.peer, ep, request.flags);
+		if (err < 0)
+			return err;
+
+		if (copy_to_user(argp, &request, sizeof(request))) {
+			scif_close(*ep);
+			return -EFAULT;
+		}
+		/*
+		 * Add to the list of user mode eps where the second half
+		 * of the accept is not yet completed.
+		 */
+		spin_lock(&scif_info.eplock);
+		list_add_tail(&((*ep)->miacceptlist), &scif_info.uaccept);
+		list_add_tail(&((*ep)->liacceptlist), &priv->li_accept);
+		(*ep)->listenep = priv;
+		priv->acceptcnt++;
+		spin_unlock(&scif_info.eplock);
+
+		return 0;
+	}
+	case SCIF_ACCEPTREG:
+	{
+		struct scif_endpt *priv = f->private_data;
+		struct scif_endpt *newep;
+		struct scif_endpt *lisep;
+		struct scif_endpt *fep = NULL;
+		struct scif_endpt *tmpep;
+		struct list_head *pos, *tmpq;
+
+		/* Finally replace the pointer to the accepted endpoint */
+		if (copy_from_user(&newep, argp, sizeof(void *)))
+			return -EFAULT;
+
+		/* Remove from the user accept queue */
+		spin_lock(&scif_info.eplock);
+		list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, miacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				fep = tmpep;
+				break;
+			}
+		}
+
+		if (!fep) {
+			spin_unlock(&scif_info.eplock);
+			return -ENOENT;
+		}
+
+		lisep = newep->listenep;
+		list_for_each_safe(pos, tmpq, &lisep->li_accept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, liacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				lisep->acceptcnt--;
+				break;
+			}
+		}
+
+		spin_unlock(&scif_info.eplock);
+
+		/* Free the resources automatically created from the open. */
+		scif_teardown_ep(priv);
+		scif_add_epd_to_zombie_list(priv, !SCIF_EPLOCK_HELD);
+		f->private_data = newep;
+		return 0;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
index 714bfea..2393274 100644
--- a/drivers/misc/mic/scif/scif_main.c
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -330,6 +330,7 @@ static int _scif_init(void)
 	scif_info.en_msg_log = 0;
 	scif_info.p2p_enable = 1;
 	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	INIT_WORK(&scif_info.conn_work, scif_conn_handler);
 	idr_init(&scif_ports);
 	return 0;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 5fefacc..0469aad 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -20,6 +20,41 @@
 #include "scif_main.h"
 #include "scif_map.h"
 
+/**
+ * scif_invalidate_ep() - Set state for all connected endpoints
+ * to disconnected and wake up all send/recv waitqueues
+ */
+static void scif_invalidate_ep(int node)
+{
+	struct scif_endpt *ep;
+	struct list_head *pos, *tmpq;
+
+	flush_work(&scif_info.conn_work);
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			spin_lock(&ep->lock);
+			scif_cleanup_ep_qp(ep);
+			spin_unlock(&ep->lock);
+		}
+	}
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			list_del(pos);
+			spin_lock(&ep->lock);
+			ep->state = SCIFEP_DISCONNECTED;
+			list_add_tail(&ep->list, &scif_info.disconnected);
+			scif_cleanup_ep_qp(ep);
+			wake_up_interruptible(&ep->sendwq);
+			wake_up_interruptible(&ep->recvwq);
+			spin_unlock(&ep->lock);
+		}
+	}
+	mutex_unlock(&scif_info.connlock);
+}
+
 void scif_free_qp(struct scif_dev *scifdev)
 {
 	struct scif_qp *qp = scifdev->qpairs;
@@ -91,6 +126,7 @@ void scif_cleanup_scifdev(struct scif_dev *dev)
 		scif_destroy_intr_wq(dev);
 	}
 	scif_destroy_p2p(dev);
+	scif_invalidate_ep(dev->node);
 	scif_send_acks(dev);
 	if (!dev->node && scif_info.card_initiated_exit) {
 		/*
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index 8978af9..aee9025 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -569,7 +569,14 @@ static char *message_types[] = {"BAD",
 				"SCIF_NODE_ADD_ACK",
 				"SCIF_NODE_ADD_NACK",
 				"REMOVE_NODE",
-				"REMOVE_NODE_ACK"};
+				"REMOVE_NODE_ACK",
+				"CNCT_REQ",
+				"CNCT_GNT",
+				"CNCT_GNTACK",
+				"CNCT_GNTNACK",
+				"CNCT_REJ",
+				"DISCNCT",
+				"DISCNT_ACK"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -959,6 +966,13 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
 	scif_node_remove,	/* SCIF_NODE_REMOVE */
 	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+	scif_cnctreq,		/* SCIF_CNCT_REQ */
+	scif_cnctgnt,		/* SCIF_CNCT_GNT */
+	scif_cnctgnt_ack,	/* SCIF_CNCT_GNTACK */
+	scif_cnctgnt_nack,	/* SCIF_CNCT_GNTNACK */
+	scif_cnctrej,		/* SCIF_CNCT_REJ */
+	scif_discnct,		/* SCIF_DISCNCT */
+	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
 };
 
 /**
-- 
1.8.2.1
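
Note on the handshake handlers added to scif_epd.c above: the state changes
they apply can be summarised in a small userspace model. This is only a sketch
for readers of the patch, not driver code. The EP_* and MSG_* names and
model_ep_next_state() are hypothetical; the transitions are taken from
scif_cnctgnt(), scif_cnctrej(), scif_cnctgnt_ack(), scif_cnctgnt_nack() and
scif_discnt_ack() as posted, with locking, wakeups and list moves left out.

```c
/* Endpoint states, in the order used by enum scif_epd_state in this series. */
enum ep_state {
	EP_UNBOUND, EP_BOUND, EP_LISTENING, EP_CONNECTED, EP_CONNECTING,
	EP_MAPPING, EP_CLOSING, EP_CLLISTEN, EP_DISCONNECTED, EP_ZOMBIE
};

/* Subset of the connection messages dispatched via scif_intr_func[]. */
enum ep_msg {
	MSG_CNCT_GNT, MSG_CNCT_GNTACK, MSG_CNCT_GNTNACK,
	MSG_CNCT_REJ, MSG_DISCNT_ACK
};

/*
 * Hypothetical helper: return the endpoint state after the handler for
 * @msg has run on an endpoint currently in state @cur.
 */
static enum ep_state model_ep_next_state(enum ep_state cur, enum ep_msg msg)
{
	switch (msg) {
	case MSG_CNCT_GNT:
		/* Connector side: only honoured while still CONNECTING. */
		return cur == EP_CONNECTING ? EP_MAPPING : cur;
	case MSG_CNCT_REJ:
		/* Connector side: a rejected connect falls back to BOUND. */
		return cur == EP_CONNECTING ? EP_BOUND : cur;
	case MSG_CNCT_GNTACK:
		/* Acceptor side: the peer finished mapping. */
		return EP_CONNECTED;
	case MSG_CNCT_GNTNACK:
		/* Acceptor side: the peer failed to map. */
		return EP_CLOSING;
	case MSG_DISCNT_ACK:
		/* The peer dropped its references. */
		return EP_DISCONNECTED;
	}
	return cur;
}
```

The GNT and REJ handlers check the current state first, so a late message for
an endpoint that has already left CONNECTING is ignored in this model, as it
is in the handlers above.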

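
Note on the non-blocking path in __scif_connect() above: the SCIFEP_BOUND case
picks one of four outcomes from conn_async_state and the non_block argument.
The sketch below restates that decision as a pure function, assuming only the
three async states named in the patch. The enum values, struct
connect_decision and model_connect_bound() are hypothetical and do not reflect
the numeric values used by the driver.

```c
#include <errno.h>
#include <stdbool.h>

enum async_state { ASYNC_IDLE, ASYNC_INPROGRESS, ASYNC_FLUSH_WORK };

enum connect_action {
	ACT_FAIL,		/* return err to the caller */
	ACT_COLLECT_RESULT,	/* flush the work item, then read conn_err */
	ACT_QUEUE_WORK,		/* hand the endpoint to the connect worker */
	ACT_CONNECT_NOW		/* run the connect synchronously */
};

struct connect_decision {
	enum connect_action action;
	enum async_state next;	/* state while the action is carried out */
	int err;
};

/* Hypothetical model of the SCIFEP_BOUND branch of __scif_connect(). */
static struct connect_decision model_connect_bound(enum async_state cur,
						    bool non_block)
{
	struct connect_decision d = { ACT_FAIL, cur, 0 };

	if (cur == ASYNC_INPROGRESS) {
		/* An earlier non-blocking connect exists: collect its status. */
		d.action = ACT_COLLECT_RESULT;
		d.next = ASYNC_FLUSH_WORK;
	} else if (cur == ASYNC_FLUSH_WORK) {
		/* Another thread is already waiting to collect the status. */
		d.err = -EINPROGRESS;
	} else if (non_block) {
		/* Start a new non-blocking connect. */
		d.action = ACT_QUEUE_WORK;
		d.next = ASYNC_INPROGRESS;
		d.err = -EINPROGRESS;
	} else {
		/* Start a new blocking connect. */
		d.action = ACT_CONNECT_NOW;
	}
	return d;
}
```

In the patch, the caller that collects the result sets conn_async_state back
to ASYNC_CONN_IDLE once flush_work() has returned; the model stops at the
decision itself.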


* [PATCH char-misc-next v3 13/13] misc: mic: add support for loading/unloading SCIF driver
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (11 preceding siblings ...)
  2015-01-27  0:32 70% ` [PATCH char-misc-next v3 12/13] misc: mic: MIC card " Sudeep Dutt
@ 2015-01-27  0:32 98% ` Sudeep Dutt
  2015-02-24  0:45 99% ` [PATCH char-misc-next v3 00/13] misc: mic: " Sudeep Dutt
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Load the SCIF driver with modprobe on start and remove it on unload.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpss | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index cacbdb0..582aad4 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -35,6 +35,7 @@
 
 exec=/usr/sbin/mpssd
 sysfs="/sys/class/mic"
+mic_modules="mic_host mic_x100_dma scif"
 
 start()
 {
@@ -48,18 +49,15 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
+	echo -e $"Loading MIC drivers:" $mic_modules
 
-	for f in "mic_host" "mic_x100_dma"
-	do
-		modprobe $f
-		RETVAL=$?
-		if [ $RETVAL -ne 0 ]; then
-			failure
-			echo
-			return $RETVAL
-		fi
-	done
+	modprobe -a $mic_modules
+	RETVAL=$?
+	if [ $RETVAL -ne 0 ]; then
+		failure
+		echo
+		return $RETVAL
+	fi
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +168,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
-	modprobe -r mic_host mic_x100_dma
+	echo -n $"Removing MIC drivers:" $mic_modules
+	modprobe -r $mic_modules
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1



* [PATCH char-misc-next v3 08/13] misc: mic: SCIF open close bind and listen APIs
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (6 preceding siblings ...)
  2015-01-27  0:32 28% ` [PATCH char-misc-next v3 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
@ 2015-01-27  0:32 41% ` Sudeep Dutt
  2015-01-27  0:32 41% ` [PATCH char-misc-next v3 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF character device file operations and kernel APIs for opening and
closing a user and kernel mode SCIF endpoint. This patch also enables
binding to a SCIF port and listening for incoming SCIF connections.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |  19 ++
 drivers/misc/mic/Makefile           |   1 +
 drivers/misc/mic/scif/Makefile      |  15 ++
 drivers/misc/mic/scif/scif_epd.h    | 148 +++++++++++++
 drivers/misc/mic/scif/scif_main.h   |   2 +
 drivers/misc/mic/scif/scif_nodeqp.h |  12 +-
 drivers/misc/mic/scif/scif_api.c    | 417 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  92 ++++++++
 drivers/misc/mic/scif/scif_fd.c     | 104 +++++++++
 drivers/misc/mic/scif/scif_ports.c  | 124 +++++++++++
 10 files changed, 933 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
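
Note on listening endpoints: struct scif_endpt in scif_epd.h below carries a
@backlog limit and a @conreqcnt counter. A minimal userspace sketch of how the
two might interact follows. It assumes the reject-when-full rule that
scif_cnctreq() applies in patch 09 of this series (backlog <= conreqcnt);
struct model_listen_ep and the model_* helpers are hypothetical names, and
locking and the request list itself are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two fields of struct scif_endpt used here. */
struct model_listen_ep {
	int backlog;	/* maximum pending connection requests */
	int conreqcnt;	/* connection requests currently queued */
};

/*
 * Model an incoming connection request: returns true if it is queued,
 * false if the endpoint is full and the request would be rejected.
 */
static bool model_queue_conreq(struct model_listen_ep *ep)
{
	if (ep->backlog <= ep->conreqcnt)
		return false;
	ep->conreqcnt++;
	return true;
}

/*
 * Model accept taking one request off the queue: returns false when
 * nothing is pending.
 */
static bool model_take_conreq(struct model_listen_ep *ep)
{
	if (!ep->conreqcnt)
		return false;
	ep->conreqcnt--;
	return true;
}
```

With a backlog of 2, a third request is refused until an accept removes one of
the pending requests.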

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index afc9f53..bcf6e36 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -69,3 +69,22 @@ config INTEL_MIC_CARD
 
 	  For more information see
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "SCIF Driver"
+
+config SCIF
+	tristate "SCIF Driver"
+	depends on 64BIT && PCI && X86 && SCIF_BUS
+	help
+	  This enables SCIF Driver support for the Intel Many Integrated
+	  Core (MIC) family of PCIe form factor coprocessor devices that
+	  run a 64 bit Linux OS. The Symmetric Communication Interface
+	  (SCIF (pronounced as skiff)) is a low level communications API
+	  across PCIe currently implemented for MIC.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 6bc6083..a74042c 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
 obj-y += bus/
+obj-$(CONFIG_SCIF) += scif/
diff --git a/drivers/misc/mic/scif/Makefile b/drivers/misc/mic/scif/Makefile
new file mode 100644
index 0000000..bf10bb7
--- /dev/null
+++ b/drivers/misc/mic/scif/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile - SCIF driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_SCIF) += scif.o
+scif-objs := scif_main.o
+scif-objs += scif_peer_bus.o
+scif-objs += scif_ports.o
+scif-objs += scif_debugfs.o
+scif-objs += scif_fd.o
+scif-objs += scif_api.o
+scif-objs += scif_epd.o
+scif-objs += scif_rb.o
+scif-objs += scif_nodeqp.o
+scif-objs += scif_nm.o
diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
new file mode 100644
index 0000000..b973b99
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -0,0 +1,148 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_EPD_H
+#define SCIF_EPD_H
+
+#include <linux/delay.h>
+#include <linux/scif.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_EPLOCK_HELD true
+
+enum scif_epd_state {
+	SCIFEP_UNBOUND,
+	SCIFEP_BOUND,
+	SCIFEP_LISTENING,
+	SCIFEP_CONNECTED,
+	SCIFEP_CONNECTING,
+	SCIFEP_MAPPING,
+	SCIFEP_CLOSING,
+	SCIFEP_CLLISTEN,
+	SCIFEP_DISCONNECTED,
+	SCIFEP_ZOMBIE
+};
+
+/*
+ * struct scif_conreq - Data structure added to the connection list.
+ *
+ * @msg: connection request message received
+ * @list: link to list of connection requests
+ */
+struct scif_conreq {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+/* Size of the RB for the Endpoint QP */
+#define SCIF_ENDPT_QP_SIZE 0x1000
+
+/*
+ * scif_endpt_qp_info - SCIF endpoint queue pair
+ *
+ * @qp - Qpair for this endpoint
+ * @qp_offset - DMA address of the QP
+ * @gnt_pld - Payload in a SCIF_CNCT_GNT message containing the
+ * physical address of the remote_qp.
+ */
+struct scif_endpt_qp_info {
+	struct scif_qp *qp;
+	dma_addr_t qp_offset;
+	dma_addr_t gnt_pld;
+};
+
+/*
+ * struct scif_endpt - The SCIF endpoint data structure
+ *
+ * @state: end point state
+ * @lock: lock synchronizing access to endpoint fields like state etc
+ * @port: self port information
+ * @peer: peer port information
+ * @backlog: maximum pending connection requests
+ * @qp_info: Endpoint QP information for SCIF messaging
+ * @remote_dev: scifdev used by this endpt to communicate with remote node.
+ * @remote_ep: remote endpoint
+ * @conreqcnt: Keep track of number of connection requests.
+ * @files: Open file information used to match the id passed in with
+ *         the flush routine.
+ * @conlist: list of connection requests
+ * @conwq: waitqueue for connection processing
+ * @discon: completion used during disconnection
+ * @sendwq: waitqueue used during sending messages
+ * @recvwq: waitqueue used during message receipt
+ * @sendlock: Synchronize ordering of messages sent
+ * @recvlock: Synchronize ordering of messages received
+ * @list: link to list of various endpoints like connected, listening etc
+ * @li_accept: pending ACCEPTREG
+ * @acceptcnt: pending ACCEPTREG cnt
+ * @liacceptlist: link to listen accept
+ * @miacceptlist: link to uaccept
+ * @listenep: associated listen ep
+ * @conn_work: Non blocking connect work
+ * @conn_port: Connection port
+ * @conn_err: Errors during connection
+ * @conn_async_state: Async connection
+ * @conn_list: List of async connection requests
+ */
+struct scif_endpt {
+	enum scif_epd_state state;
+	spinlock_t lock;
+	struct scif_port_id port;
+	struct scif_port_id peer;
+	int backlog;
+	struct scif_endpt_qp_info qp_info;
+	struct scif_dev *remote_dev;
+	u64 remote_ep;
+	int conreqcnt;
+	struct files_struct *files;
+	struct list_head conlist;
+	wait_queue_head_t conwq;
+	struct completion discon;
+	wait_queue_head_t sendwq;
+	wait_queue_head_t recvwq;
+	struct mutex sendlock;
+	struct mutex recvlock;
+	struct list_head list;
+	struct list_head li_accept;
+	int acceptcnt;
+	struct list_head liacceptlist;
+	struct list_head miacceptlist;
+	struct scif_endpt *listenep;
+	struct scif_port_id conn_port;
+	int conn_err;
+	int conn_async_state;
+	struct list_head conn_list;
+};
+
+static inline int scifdev_alive(struct scif_endpt *ep)
+{
+	return _scifdev_alive(ep->remote_dev);
+}
+
+void scif_cleanup_zombie_epd(void);
+void scif_teardown_ep(void *endpt);
+void scif_cleanup_ep_qp(struct scif_endpt *ep);
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held);
+void scif_get_node_info(void);
+void scif_send_acks(struct scif_dev *dev);
+void scif_conn_handler(struct work_struct *work);
+int scif_rsrv_port(u16 port);
+void scif_get_port(u16 port);
+int scif_get_new_port(void);
+void scif_put_port(u16 port);
+int __scif_flush(scif_epd_t epd);
+#endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
index 0bc091b..ce8d0c9 100644
--- a/drivers/misc/mic/scif/scif_main.h
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -236,6 +236,8 @@ static inline int _scifdev_alive(struct scif_dev *scifdev)
 	return !!spdev;
 }
 
+#include "scif_epd.h"
+
 void __init scif_init_debugfs(void);
 void scif_exit_debugfs(void);
 int scif_setup_intr_wq(struct scif_dev *scifdev);
diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
index 511ad12..0edd2ad 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.h
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -64,7 +64,17 @@
 #define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
 #define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
 #define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
-#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+#define SCIF_CNCT_REQ 9  /* Request connection to a port (payload: phys addr) */
+#define SCIF_CNCT_GNT 10  /* Grant a connection request (payload: phys addr) */
+#define SCIF_CNCT_GNTACK 11  /* Acknowledge a connection grant */
+#define SCIF_CNCT_GNTNACK 12  /* Negative acknowledgment of a connection grant */
+#define SCIF_CNCT_REJ 13  /* Reject a connection request (payload: error type) */
+#define SCIF_DISCNCT 14 /* Notify peer that connection is being terminated */
+#define SCIF_DISCNT_ACK 15 /* Acknowledge a SCIF_DISCNCT message */
+#define SCIF_CLIENT_SENT 16 /* Notify the peer that data has been written */
+#define SCIF_CLIENT_RCVD 17 /* Notify the peer that data has been read */
+#define SCIF_GET_NODE_INFO 18 /* Get current node mask from the mgmt node */
+#define SCIF_MAX_MSG SCIF_GET_NODE_INFO
 
 /*
  * struct scifmsg - Node QP message format
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
new file mode 100644
index 0000000..cbdc27b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -0,0 +1,417 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/scif.h>
+#include "scif_main.h"
+#include "scif_map.h"
+
+static const char * const scif_ep_states[] = {
+	"Unbound",
+	"Bound",
+	"Listening",
+	"Connected",
+	"Connecting",
+	"Mapping",
+	"Closing",
+	"Close Listening",
+	"Disconnected",
+	"Zombie"};
+
+enum conn_async_state {
+	ASYNC_CONN_IDLE = 1,	/* ep setup for async connect */
+	ASYNC_CONN_INPROGRESS,	/* async connect in progress */
+	ASYNC_CONN_FLUSH_WORK	/* async work flush in progress  */
+};
+
+scif_epd_t scif_open(void)
+{
+	struct scif_endpt *ep;
+
+	might_sleep();
+	ep = kzalloc(sizeof(*ep), GFP_KERNEL);
+	if (!ep)
+		goto err_ep_alloc;
+
+	ep->qp_info.qp = kzalloc(sizeof(*ep->qp_info.qp), GFP_KERNEL);
+	if (!ep->qp_info.qp)
+		goto err_qp_alloc;
+
+	spin_lock_init(&ep->lock);
+	mutex_init(&ep->sendlock);
+	mutex_init(&ep->recvlock);
+
+	ep->state = SCIFEP_UNBOUND;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI open: ep %p success\n", ep);
+	return ep;
+
+err_qp_alloc:
+	kfree(ep);
+err_ep_alloc:
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(scif_open);
+
+/*
+ * scif_disconnect_ep - Disconnects the endpoint if found
+ * @ep: The end point returned from scif_open()
+ */
+static struct scif_endpt *scif_disconnect_ep(struct scif_endpt *ep)
+{
+	struct scifmsg msg;
+	struct scif_endpt *fep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	int err;
+
+	/*
+	 * Wake up any threads blocked in send()/recv() before closing
+	 * out the connection. Grabbing and releasing the send/recv lock
+	 * will ensure that any blocked senders/receivers have exited for
+	 * Ring 0 endpoints. It is a Ring 0 bug to call send/recv after
+	 * close. Ring 3 endpoints are not affected since close will not
+	 * be called while there are IOCTLs executing.
+	 */
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	mutex_lock(&ep->sendlock);
+	mutex_unlock(&ep->sendlock);
+	mutex_lock(&ep->recvlock);
+	mutex_unlock(&ep->recvlock);
+
+	/* Remove from the connected list */
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		if (tmpep == ep) {
+			list_del(pos);
+			fep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	if (!fep) {
+		/*
+		 * The other side completed the disconnect before this
+		 * end point could be removed from the connected list, so
+		 * the ep lock was never taken. Remove the end point from
+		 * the disconnected list instead and release the conn lock.
+		 */
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		return NULL;
+	}
+
+	init_completion(&ep->discon);
+	msg.uop = SCIF_DISCNCT;
+	msg.src = ep->port;
+	msg.dst = ep->peer;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->remote_ep;
+
+	err = scif_nodeqp_send(ep->remote_dev, &msg);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+	if (!err)
+		/* Wait for the remote node to respond with SCIF_DISCNT_ACK */
+		wait_for_completion_timeout(&ep->discon,
+					    SCIF_NODE_ALIVE_TIMEOUT);
+	return ep;
+}
+
+int scif_close(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	enum scif_epd_state oldstate;
+	bool flush_conn;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI close: ep %p %s\n",
+		ep, scif_ep_states[ep->state]);
+	might_sleep();
+	spin_lock(&ep->lock);
+	flush_conn = (ep->conn_async_state == ASYNC_CONN_INPROGRESS);
+	spin_unlock(&ep->lock);
+
+	if (flush_conn)
+		flush_work(&scif_info.conn_work);
+
+	spin_lock(&ep->lock);
+	oldstate = ep->state;
+
+	ep->state = SCIFEP_CLOSING;
+
+	switch (oldstate) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		/* Remove from the disconnected list */
+		mutex_lock(&scif_info.connlock);
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		break;
+	case SCIFEP_UNBOUND:
+	case SCIFEP_BOUND:
+	case SCIFEP_CONNECTING:
+		spin_unlock(&ep->lock);
+		break;
+	case SCIFEP_MAPPING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CLOSING:
+	{
+		spin_unlock(&ep->lock);
+		scif_disconnect_ep(ep);
+		break;
+	}
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+	{
+		struct scif_conreq *conreq;
+		struct scifmsg msg;
+		struct scif_endpt *aep;
+
+		spin_unlock(&ep->lock);
+		spin_lock(&scif_info.eplock);
+
+		/* remove from listen list */
+		list_for_each_safe(pos, tmpq, &scif_info.listen) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep)
+				list_del(pos);
+		}
+		/* Remove any dangling accepts */
+		while (ep->acceptcnt) {
+			aep = list_first_entry(&ep->li_accept,
+					       struct scif_endpt, liacceptlist);
+			list_del(&aep->liacceptlist);
+			scif_put_port(aep->port.port);
+			list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+				tmpep = list_entry(pos, struct scif_endpt,
+						   miacceptlist);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			spin_unlock(&scif_info.eplock);
+			mutex_lock(&scif_info.connlock);
+			list_for_each_safe(pos, tmpq, &scif_info.connected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			mutex_unlock(&scif_info.connlock);
+			scif_teardown_ep(aep);
+			spin_lock(&scif_info.eplock);
+			scif_add_epd_to_zombie_list(aep, SCIF_EPLOCK_HELD);
+			ep->acceptcnt--;
+		}
+
+		spin_lock(&ep->lock);
+		spin_unlock(&scif_info.eplock);
+
+		/* Remove and reject any pending connection requests. */
+		while (ep->conreqcnt) {
+			conreq = list_first_entry(&ep->conlist,
+						  struct scif_conreq, list);
+			list_del(&conreq->list);
+
+			msg.uop = SCIF_CNCT_REJ;
+			msg.dst.node = conreq->msg.src.node;
+			msg.dst.port = conreq->msg.src.port;
+			msg.payload[0] = conreq->msg.payload[0];
+			msg.payload[1] = conreq->msg.payload[1];
+			/*
+			 * No Error Handling on purpose for scif_nodeqp_send().
+			 * If the remote node is lost we still want to free the
+			 * connection requests on the self node.
+			 */
+			scif_nodeqp_send(&scif_dev[conreq->msg.src.node],
+					 &msg);
+			ep->conreqcnt--;
+			kfree(conreq);
+		}
+
+		spin_unlock(&ep->lock);
+		/* If a kSCIF accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	}
+	scif_put_port(ep->port.port);
+	scif_teardown_ep(ep);
+	scif_add_epd_to_zombie_list(ep, !SCIF_EPLOCK_HELD);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_close);
+
+/**
+ * __scif_flush() - Wakes up any blocking accepts. The endpoint will no longer
+ *			accept new connections.
+ * @epd: The end point returned from scif_open()
+ */
+int __scif_flush(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	switch (ep->state) {
+	case SCIFEP_LISTENING:
+	{
+		ep->state = SCIFEP_CLLISTEN;
+
+		/* If an accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	default:
+		break;
+	}
+	return 0;
+}
+
+int scif_bind(scif_epd_t epd, u16 pn)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret = 0;
+	int tmp;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: ep %p %s requested port number %d\n",
+		ep, scif_ep_states[ep->state], pn);
+	if (pn) {
+		/*
+		 * Similar to IETF RFC 1700, SCIF ports below
+		 * SCIF_ADMIN_PORT_END can only be bound by system (or root)
+		 * processes or by processes executed by privileged users.
+		 */
+		if (pn < SCIF_ADMIN_PORT_END && !capable(CAP_SYS_ADMIN)) {
+			ret = -EACCES;
+			goto scif_bind_admin_exit;
+		}
+	}
+
+	spin_lock(&ep->lock);
+	if (ep->state == SCIFEP_BOUND) {
+		ret = -EINVAL;
+		goto scif_bind_exit;
+	} else if (ep->state != SCIFEP_UNBOUND) {
+		ret = -EISCONN;
+		goto scif_bind_exit;
+	}
+
+	if (pn) {
+		tmp = scif_rsrv_port(pn);
+		if (tmp != pn) {
+			ret = -EINVAL;
+			goto scif_bind_exit;
+		}
+	} else {
+		/* scif_get_new_port() returns a negative errno on failure */
+		tmp = scif_get_new_port();
+		if (tmp < 0) {
+			ret = tmp;
+			goto scif_bind_exit;
+		}
+		pn = tmp;
+	}
+	ep->state = SCIFEP_BOUND;
+	ep->port.node = scif_info.nodeid;
+	ep->port.port = pn;
+	ep->conn_async_state = ASYNC_CONN_IDLE;
+	ret = pn;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: bound to port number %d\n", pn);
+scif_bind_exit:
+	spin_unlock(&ep->lock);
+scif_bind_admin_exit:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_bind);
+
+int scif_listen(scif_epd_t epd, int backlog)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI listen: ep %p %s\n", ep, scif_ep_states[ep->state]);
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+	case SCIFEP_CLLISTEN:
+	case SCIFEP_UNBOUND:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		return -EINVAL;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		spin_unlock(&ep->lock);
+		return -EISCONN;
+	case SCIFEP_BOUND:
+		break;
+	}
+
+	ep->state = SCIFEP_LISTENING;
+	ep->backlog = backlog;
+
+	ep->conreqcnt = 0;
+	ep->acceptcnt = 0;
+	INIT_LIST_HEAD(&ep->conlist);
+	init_waitqueue_head(&ep->conwq);
+	INIT_LIST_HEAD(&ep->li_accept);
+	spin_unlock(&ep->lock);
+
+	/*
+	 * Listen status is complete so delete the qp information not needed
+	 * on a listen before placing on the list of listening ep's
+	 */
+	scif_teardown_ep(ep);
+	ep->qp_info.qp = NULL;
+
+	spin_lock(&scif_info.eplock);
+	list_add_tail(&ep->list, &scif_info.listen);
+	spin_unlock(&scif_info.eplock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_listen);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
new file mode 100644
index 0000000..a2b728a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -0,0 +1,92 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_cleanup_ep_qp(struct scif_endpt *ep)
+{
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp->outbound_q.rb_base) {
+		scif_iounmap((void *)qp->outbound_q.rb_base,
+			     qp->outbound_q.size, ep->remote_dev);
+		qp->outbound_q.rb_base = NULL;
+	}
+	if (qp->remote_qp) {
+		scif_iounmap((void *)qp->remote_qp,
+			     sizeof(struct scif_qp), ep->remote_dev);
+		qp->remote_qp = NULL;
+	}
+	if (qp->local_qp) {
+		scif_unmap_single(qp->local_qp, ep->remote_dev,
+				  sizeof(struct scif_qp));
+		qp->local_qp = 0x0;
+	}
+	if (qp->local_buf) {
+		scif_unmap_single(qp->local_buf, ep->remote_dev,
+				  SCIF_ENDPT_QP_SIZE);
+		qp->local_buf = 0;
+	}
+}
+
+void scif_teardown_ep(void *endpt)
+{
+	struct scif_endpt *ep = endpt;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp) {
+		spin_lock(&ep->lock);
+		scif_cleanup_ep_qp(ep);
+		spin_unlock(&ep->lock);
+		kfree(qp->inbound_q.rb_base);
+		kfree(qp);
+	}
+}
+
+/*
+ * Enqueue the endpoint to the zombie list for cleanup.
+ * The endpoint should not be accessed once this API returns.
+ */
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
+{
+	if (!eplock_held)
+		spin_lock(&scif_info.eplock);
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_ZOMBIE;
+	spin_unlock(&ep->lock);
+	list_add_tail(&ep->list, &scif_info.zombie);
+	scif_info.nr_zombies++;
+	if (!eplock_held)
+		spin_unlock(&scif_info.eplock);
+	schedule_work(&scif_info.misc_work);
+}
+
+void scif_cleanup_zombie_epd(void)
+{
+	struct list_head *pos, *tmpq;
+	struct scif_endpt *ep;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.zombie) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		list_del(pos);
+		scif_info.nr_zombies--;
+		kfree(ep);
+	}
+	spin_unlock(&scif_info.eplock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
new file mode 100644
index 0000000..812c23f
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -0,0 +1,104 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+
+static int scif_fdopen(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = scif_open();
+
+	if (!priv)
+		return -ENOMEM;
+	f->private_data = priv;
+	return 0;
+}
+
+static int scif_fdclose(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = f->private_data;
+
+	return scif_close(priv);
+}
+
+static int scif_fdflush(struct file *f, fl_owner_t id)
+{
+	struct scif_endpt *ep = f->private_data;
+
+	spin_lock(&ep->lock);
+	/*
+	 * The listening endpoint stashes the open file information before
+	 * waiting for incoming connections. The release callback would never be
+	 * called if the application closed the endpoint, while waiting for
+	 * incoming connections from a separate thread since the file descriptor
+	 * reference count is bumped up in the accept IOCTL. Call the flush
+	 * routine if the id matches the endpoint open file information so that
+	 * the listening endpoint can be woken up and the fd released.
+	 */
+	if (ep->files == id)
+		__scif_flush(ep);
+	spin_unlock(&ep->lock);
+	return 0;
+}
+
+static __always_inline void scif_err_debug(int err, const char *str)
+{
+	/*
+	 * ENOTCONN is a common uninteresting error which is
+	 * flooding debug messages to the console unnecessarily.
+	 */
+	if (err < 0 && err != -ENOTCONN)
+		dev_dbg(scif_info.mdev.this_device, "%s err %d\n", str, err);
+}
+
+static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct scif_endpt *priv = f->private_data;
+	void __user *argp = (void __user *)arg;
+	bool non_block = false;
+
+	non_block = !!(f->f_flags & O_NONBLOCK);
+
+	switch (cmd) {
+	case SCIF_BIND:
+	{
+		int pn;
+
+		if (copy_from_user(&pn, argp, sizeof(pn)))
+			return -EFAULT;
+
+		pn = scif_bind(priv, pn);
+		if (pn < 0)
+			return pn;
+
+		if (copy_to_user(argp, &pn, sizeof(pn)))
+			return -EFAULT;
+
+		return 0;
+	}
+	case SCIF_LISTEN:
+		return scif_listen(priv, arg);
+	}
+	return -EINVAL;
+}
+
+const struct file_operations scif_fops = {
+	.open = scif_fdopen,
+	.release = scif_fdclose,
+	.unlocked_ioctl = scif_fdioctl,
+	.flush = scif_fdflush,
+	.owner = THIS_MODULE,
+};
diff --git a/drivers/misc/mic/scif/scif_ports.c b/drivers/misc/mic/scif/scif_ports.c
new file mode 100644
index 0000000..594e18d
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_ports.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/idr.h>
+
+#include "scif_main.h"
+
+#define SCIF_PORT_COUNT	0x10000	/* Ports available */
+
+struct idr scif_ports;
+
+/*
+ * struct scif_port - SCIF port information
+ *
+ * @ref_cnt - Reference count since there can be multiple endpoints
+ *		created via scif_accept(..) simultaneously using a port.
+ */
+struct scif_port {
+	int ref_cnt;
+};
+
+/**
+ * __scif_get_port - Reserve a SCIF port # and add it to the global list.
+ * @start : lowest port # that may be reserved.
+ * @end : one past the highest port # that may be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+static int __scif_get_port(int start, int end)
+{
+	int id;
+	struct scif_port *port = kzalloc(sizeof(*port), GFP_ATOMIC);
+
+	if (!port)
+		return -ENOMEM;
+	spin_lock(&scif_info.port_lock);
+	id = idr_alloc(&scif_ports, port, start, end, GFP_ATOMIC);
+	if (id >= 0)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+	return id;
+}
+
+/**
+ * scif_rsrv_port - Reserve a specified port # for SCIF.
+ * @port : port # to be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_rsrv_port(u16 port)
+{
+	return __scif_get_port(port, port + 1);
+}
+
+/**
+ * scif_get_new_port - Get and reserve any port # for SCIF in the range
+ *			SCIF_PORT_RSVD + 1 to SCIF_PORT_COUNT - 1.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if no ports available.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_get_new_port(void)
+{
+	return __scif_get_port(SCIF_PORT_RSVD + 1, SCIF_PORT_COUNT);
+}
+
+/**
+ * scif_get_port - Increment the reference count for a SCIF port
+ * @id : SCIF port
+ *
+ * @return : None
+ */
+void scif_get_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+}
+
+/**
+ * scif_put_port - Release a reserved SCIF port
+ * @id : SCIF port to be released.
+ *
+ * @return : None
+ */
+void scif_put_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port) {
+		port->ref_cnt--;
+		if (!port->ref_cnt) {
+			idr_remove(&scif_ports, id);
+			kfree(port);
+		}
+	}
+	spin_unlock(&scif_info.port_lock);
+}
-- 
1.8.2.1



* [PATCH char-misc-next v3 10/13] misc: mic: SCIF messaging and node enumeration APIs
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (8 preceding siblings ...)
  2015-01-27  0:32 41% ` [PATCH char-misc-next v3 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
@ 2015-01-27  0:32 52% ` Sudeep Dutt
  2015-01-27  0:32 57% ` [PATCH char-misc-next v3 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF messaging APIs allow sending messages between SCIF endpoints
via a byte stream based ring buffer which has been optimized to
avoid reads across PCIe. For best performance, the messaging APIs
are typically used for short (< 1024 byte) messages, while the RDMA
APIs, which will be submitted in a future patch series, are
recommended for larger transfers. The node enumeration API enables
a user to query for the number of nodes online in the SCIF network
and their node ids.
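
For readers unfamiliar with byte stream ring buffers, here is a
minimal user-space sketch of the best-effort transfer idea described
above. It is an illustration only and not the driver's scif_rb code:
every demo_* name, the buffer size and the single-threaded setting
are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RB_SIZE 16	/* hypothetical size; must be a power of two */

/* Hypothetical byte stream ring buffer; one slot always stays empty. */
struct demo_rb {
	unsigned char buf[DEMO_RB_SIZE];
	size_t head;	/* index of the next byte to write */
	size_t tail;	/* index of the next byte to read */
};

/* Bytes that can be written without overwriting unread data. */
static size_t demo_rb_space(const struct demo_rb *rb)
{
	return (rb->tail - rb->head - 1) & (DEMO_RB_SIZE - 1);
}

/* Bytes that are available to read. */
static size_t demo_rb_count(const struct demo_rb *rb)
{
	return (rb->head - rb->tail) & (DEMO_RB_SIZE - 1);
}

/* Best effort send: copy as much of msg as fits, return bytes copied. */
static size_t demo_send(struct demo_rb *rb, const void *msg, size_t len)
{
	const unsigned char *src = msg;
	size_t n, i;

	assert(rb && (msg || !len));
	n = demo_rb_space(rb);
	if (n > len)
		n = len;
	for (i = 0; i < n; i++) {
		rb->buf[rb->head] = src[i];
		rb->head = (rb->head + 1) & (DEMO_RB_SIZE - 1);
	}
	return n;
}

/* Best effort receive: copy out what is available, up to len bytes. */
static size_t demo_recv(struct demo_rb *rb, void *msg, size_t len)
{
	unsigned char *dst = msg;
	size_t n, i;

	assert(rb && (msg || !len));
	n = demo_rb_count(rb);
	if (n > len)
		n = len;
	for (i = 0; i < n; i++) {
		dst[i] = rb->buf[rb->tail];
		rb->tail = (rb->tail + 1) & (DEMO_RB_SIZE - 1);
	}
	return n;
}
```

A non-blocking caller would return the short count as is, while a
blocking caller would wait for the reader to free space and then call
demo_send() again with the remaining bytes, which is roughly the shape
of the loop in _scif_send() in this patch.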

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   4 +
 drivers/misc/mic/scif/scif_api.c    | 402 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  32 +++
 drivers/misc/mic/scif/scif_fd.c     |  93 +++++++++
 drivers/misc/mic/scif/scif_nm.c     |  17 ++
 drivers/misc/mic/scif/scif_nodeqp.c |  36 +++-
 6 files changed, 583 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index 65c628e..331322a 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,6 +144,8 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags);
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags);
 void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
@@ -151,6 +153,8 @@ void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg);
 int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index 8b55b59..41fba06 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -867,3 +867,405 @@ scif_accept_error_epalloc:
 	return err;
 }
 EXPORT_SYMBOL_GPL(scif_accept);
+
+/*
+ * scif_msg_param_check:
+ * @epd: The end point returned from scif_open()
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * Validate parameters for messaging APIs scif_send(..)/scif_recv(..).
+ */
+static inline int scif_msg_param_check(scif_epd_t epd, int len, int flags)
+{
+	int ret = -EINVAL;
+
+	if (len < 0)
+		goto err_ret;
+	if (flags && (!(flags & SCIF_RECV_BLOCK)))
+		goto err_ret;
+	ret = 0;
+err_ret:
+	return ret;
+}
+
+static int _scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_xfer_len = 0, sent_len = 0, write_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_SEND_BLOCK)
+		might_sleep();
+
+	spin_lock(&ep->lock);
+	while (sent_len != len && SCIFEP_CONNECTED == ep->state) {
+		write_count = scif_rb_space(&qp->outbound_q);
+		if (write_count) {
+			/* Best effort to send as much data as possible */
+			curr_xfer_len = min(len - sent_len, write_count);
+			ret = scif_rb_write(&qp->outbound_q, msg,
+					    curr_xfer_len);
+			if (ret < 0)
+				break;
+			/* Success. Update write pointer */
+			scif_rb_commit(&qp->outbound_q);
+			/*
+			 * Send a notification to the peer about the
+			 * produced data message.
+			 */
+			notif_msg.src = ep->port;
+			notif_msg.uop = SCIF_CLIENT_SENT;
+			notif_msg.payload[0] = ep->remote_ep;
+			ret = _scif_nodeqp_send(ep->remote_dev, &notif_msg);
+			if (ret)
+				break;
+			sent_len += curr_xfer_len;
+			msg = msg + curr_xfer_len;
+			continue;
+		}
+		curr_xfer_len = min(len - sent_len, SCIF_ENDPT_QP_SIZE - 1);
+		/* Not enough RB space. Return for the Non Blocking case */
+		if (!(flags & SCIF_SEND_BLOCK))
+			break;
+
+		spin_unlock(&ep->lock);
+		/* Wait for a SCIF_CLIENT_RCVD message in the Blocking case */
+		ret =
+		wait_event_interruptible(ep->sendwq,
+					 (SCIFEP_CONNECTED != ep->state) ||
+					 (scif_rb_space(&qp->outbound_q) >=
+					 curr_xfer_len));
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (sent_len)
+		ret = sent_len;
+	else if (!ret && SCIFEP_CONNECTED != ep->state)
+		ret = SCIFEP_DISCONNECTED == ep->state ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+static int _scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	int read_size;
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_recv_len = 0, remaining_len = len, read_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_RECV_BLOCK)
+		might_sleep();
+	spin_lock(&ep->lock);
+	while (remaining_len && (SCIFEP_CONNECTED == ep->state ||
+				 SCIFEP_DISCONNECTED == ep->state)) {
+		read_count = scif_rb_count(&qp->inbound_q, remaining_len);
+		if (read_count) {
+			/*
+			 * Best effort to recv as much data as there
+			 * are bytes to read in the RB. This is particularly
+			 * important for the Non Blocking case.
+			 */
+			curr_recv_len = min(remaining_len, read_count);
+			read_size = scif_rb_get_next(&qp->inbound_q,
+						     msg, curr_recv_len);
+			if (ep->state == SCIFEP_CONNECTED) {
+				/*
+				 * Update the read pointer only if the endpoint
+				 * is still connected else the read pointer
+				 * might no longer exist since the peer has
+				 * freed resources!
+				 */
+				scif_rb_update_read_ptr(&qp->inbound_q);
+				/*
+				 * Send a notification to the peer about the
+				 * consumed data message only if the EP is in
+				 * SCIFEP_CONNECTED state.
+				 */
+				notif_msg.src = ep->port;
+				notif_msg.uop = SCIF_CLIENT_RCVD;
+				notif_msg.payload[0] = ep->remote_ep;
+				ret = _scif_nodeqp_send(ep->remote_dev,
+							&notif_msg);
+				if (ret)
+					break;
+			}
+			remaining_len -= curr_recv_len;
+			msg = msg + curr_recv_len;
+			continue;
+		}
+		/*
+		 * Bail out now if the EP is in SCIFEP_DISCONNECTED state else
+		 * we will keep looping forever.
+		 */
+		if (ep->state == SCIFEP_DISCONNECTED)
+			break;
+		/*
+		 * Return in the Non Blocking case if there is no data
+		 * to read in this iteration.
+		 */
+		if (!(flags & SCIF_RECV_BLOCK))
+			break;
+		curr_recv_len = min(remaining_len, SCIF_ENDPT_QP_SIZE - 1);
+		spin_unlock(&ep->lock);
+		/*
+		 * Wait for a SCIF_CLIENT_SENT message in the blocking case
+		 * or until other side disconnects.
+		 */
+		ret =
+		wait_event_interruptible(ep->recvwq,
+					 SCIFEP_CONNECTED != ep->state ||
+					 scif_rb_count(&qp->inbound_q,
+						       curr_recv_len)
+					 >= curr_recv_len);
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (len - remaining_len)
+		ret = len - remaining_len;
+	else if (!ret && ep->state != SCIFEP_CONNECTED)
+		ret = ep->state == SCIFEP_DISCONNECTED ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+/**
+ * scif_user_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_send().
+ */
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int sent_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto send_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto send_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->sendlock);
+	while (sent_len != len) {
+		loop_len = len - sent_len;
+		loop_len = min(chunk_len, loop_len);
+		if (copy_from_user(tmp, msg, loop_len)) {
+			err = -EFAULT;
+			goto send_free_err;
+		}
+		err = _scif_send(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto send_free_err;
+		sent_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto send_free_err;
+	}
+send_free_err:
+	mutex_unlock(&ep->sendlock);
+	kfree(tmp);
+send_err:
+	return err < 0 ? err : sent_len;
+}
+
+/**
+ * scif_user_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_recv().
+ */
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int recv_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto recv_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto recv_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->recvlock);
+	while (recv_len != len) {
+		loop_len = len - recv_len;
+		loop_len = min(chunk_len, loop_len);
+		err = _scif_recv(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto recv_free_err;
+		if (copy_to_user(msg, tmp, err)) {
+			err = -EFAULT;
+			goto recv_free_err;
+		}
+		recv_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto recv_free_err;
+	}
+recv_free_err:
+	mutex_unlock(&ep->recvlock);
+	kfree(tmp);
+recv_err:
+	return err < 0 ? err : recv_len;
+}
+
+/**
+ * scif_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of the data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_send().
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	if (!ep->remote_dev)
+		return -ENOTCONN;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_send().
+	 */
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_lock(&ep->sendlock);
+
+	ret = _scif_send(epd, msg, len, flags);
+
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_unlock(&ep->sendlock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_send);
+
+/**
+ * scif_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_recv().
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_recv().
+	 */
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_lock(&ep->recvlock);
+
+	ret = _scif_recv(epd, msg, len, flags);
+
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_unlock(&ep->recvlock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_recv);
+
+int scif_get_node_ids(u16 *nodes, int len, u16 *self)
+{
+	int online = 0;
+	int offset = 0;
+	int node;
+
+	if (!scif_is_mgmt_node())
+		scif_get_node_info();
+
+	*self = scif_info.nodeid;
+	mutex_lock(&scif_info.conflock);
+	len = min_t(int, len, scif_info.total);
+	for (node = 0; node <= scif_info.maxid; node++) {
+		if (_scifdev_alive(&scif_dev[node])) {
+			online++;
+			if (offset < len)
+				nodes[offset++] = node;
+		}
+	}
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI get_node_ids total %d online %d filled in %d nodes\n",
+		scif_info.total, online, offset);
+	mutex_unlock(&scif_info.conflock);
+
+	return online;
+}
+EXPORT_SYMBOL_GPL(scif_get_node_ids);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index 685ee5c..b4bfbb0 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -319,3 +319,35 @@ void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	spin_unlock(&ep->lock);
 	complete(&ep->discon);
 }
+
+/**
+ * scif_clientsend() - Respond to SCIF_CLIENT_SENT interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has sent data; wake up any receiver waiting on this endpoint.
+ */
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_clientrcvd() - Respond to SCIF_CLIENT_RCVD interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has received data; wake up any sender waiting on this endpoint.
+ */
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->sendwq);
+	spin_unlock(&ep->lock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index c1d6bc3..b1678fe 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -69,6 +69,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
 	int err = 0;
+	struct scifioctl_msg request;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -197,6 +198,98 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 		f->private_data = newep;
 		return 0;
 	}
+	case SCIF_SEND:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = scif_user_send(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto send_err;
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+				 &err, sizeof(err))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = 0;
+send_err:
+		scif_err_debug(err, "scif_send");
+		return err;
+	}
+	case SCIF_RECV:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+
+		err = scif_user_recv(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto recv_err;
+
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+			&err, sizeof(err))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+		err = 0;
+recv_err:
+		scif_err_debug(err, "scif_recv");
+		return err;
+	}
+	case SCIF_GET_NODEIDS:
+	{
+		struct scifioctl_node_ids node_ids;
+		int entries;
+		u16 *nodes;
+		void __user *unodes, *uself;
+		u16 self;
+
+		if (copy_from_user(&node_ids, argp, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err2;
+		}
+
+		entries = min_t(int, scif_info.maxid, node_ids.len);
+		nodes = kcalloc(entries, sizeof(u16), GFP_KERNEL);
+		if (entries && !nodes) {
+			err = -ENOMEM;
+			goto getnodes_err2;
+		}
+		node_ids.len = scif_get_node_ids(nodes, entries, &self);
+
+		unodes = node_ids.nodes;
+		if (copy_to_user(unodes, nodes, sizeof(u16) * entries)) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		uself = node_ids.self;
+		if (copy_to_user(uself, &self, sizeof(u16))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		if (copy_to_user(argp, &node_ids, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+getnodes_err1:
+		kfree(nodes);
+getnodes_err2:
+		return err;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 0469aad..9b4c538 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -218,3 +218,20 @@ void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
 				   (atomic_read(&scifdev->disconn_rescnt) == 1),
 				   SCIF_NODE_ALIVE_TIMEOUT);
 }
+
+void scif_get_node_info(void)
+{
+	struct scifmsg msg;
+	DECLARE_COMPLETION_ONSTACK(node_info);
+
+	msg.uop = SCIF_GET_NODE_INFO;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = SCIF_MGMT_NODE;
+	msg.payload[3] = (u64)&node_info;
+
+	if ((scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg)))
+		return;
+
+	/* Wait for a response with SCIF_GET_NODE_INFO */
+	wait_for_completion(&node_info);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index aee9025..141923c 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -576,7 +576,10 @@ static char *message_types[] = {"BAD",
 				"CNCT_GNTNACK",
 				"CNCT_REJ",
 				"DISCNCT",
-				"DISCNT_ACK"};
+				"DISCNT_ACK",
+				"CLIENT_SENT",
+				"CLIENT_RCVD",
+				"SCIF_GET_NODE_INFO"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -946,6 +949,34 @@ scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	wake_up(&sdev->disconn_wq);
 }
 
+/**
+ * scif_get_node_info_resp() - Respond to SCIF_GET_NODE_INFO interrupt message
+ * @msg:        Interrupt message
+ *
+ * Retrieve node info i.e. maxid and total from the mgmt node.
+ */
+static __always_inline void
+scif_get_node_info_resp(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		swap(msg->dst.node, msg->src.node);
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		msg->payload[2] = scif_info.total;
+		mutex_unlock(&scif_info.conflock);
+		scif_nodeqp_send(scifdev, msg);
+	} else {
+		struct completion *node_info =
+			(struct completion *)msg->payload[3];
+
+		mutex_lock(&scif_info.conflock);
+		scif_info.maxid = msg->payload[1];
+		scif_info.total = msg->payload[2];
+		complete_all(node_info);
+		mutex_unlock(&scif_info.conflock);
+	}
+}
+
 static void
 scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
 {
@@ -973,6 +1004,9 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_cnctrej,		/* SCIF_CNCT_REJ */
 	scif_discnct,		/* SCIF_DISCNCT */
 	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
+	scif_clientsend,	/* SCIF_CLIENT_SENT */
+	scif_clientrcvd,	/* SCIF_CLIENT_RCVD */
+	scif_get_node_info_resp,/* SCIF_GET_NODE_INFO */
 };
 
 /**
-- 
1.8.2.1
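
For readers following the chunking logic in scif_user_send() and
scif_user_recv() above, here is a minimal user-space sketch of the same
loop shape. It is only an illustration under stated assumptions: struct
sink, sink_send() and chunked_send() are hypothetical names that do not
exist in the driver, sink_send() merely stands in for _scif_send(), and
the error path (err < 0) and the sendlock mutex are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the remote queue: a fixed-size byte sink. */
struct sink {
	char buf[64];
	size_t used;
};

/* Stand-in for _scif_send(): copies what fits, returns the byte count. */
static int sink_send(struct sink *s, const char *data, int len)
{
	size_t room = sizeof(s->buf) - s->used;
	int n = (size_t)len < room ? len : (int)room;

	memcpy(s->buf + s->used, data, (size_t)n);
	s->used += (size_t)n;
	return n;
}

/*
 * Send len bytes in chunks of at most chunk_len bytes and stop at the
 * first short transfer, as the patch does when err != loop_len.
 * Returns the number of bytes actually sent.
 */
static int chunked_send(struct sink *s, const char *msg, int len,
			int chunk_len)
{
	int sent_len = 0;

	assert(chunk_len > 0 && len >= 0);
	while (sent_len != len) {
		int loop_len = len - sent_len;
		int ret;

		if (loop_len > chunk_len)
			loop_len = chunk_len;
		ret = sink_send(s, msg, loop_len);
		sent_len += ret;
		msg += ret;
		if (ret != loop_len)
			break;	/* short transfer: report progress so far */
	}
	return sent_len;
}
```

The caller sees a byte count that may be smaller than len, which is why
the ioctl path copies the result back into out_len instead of assuming
the whole buffer was transferred.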


^ permalink raw reply related	[relevance 52%]

* [PATCH char-misc-next v3 06/13] misc: mic: SCIF module initialization
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (4 preceding siblings ...)
  2015-01-27  0:32 96% ` [PATCH char-misc-next v3 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
@ 2015-01-27  0:32 47% ` Sudeep Dutt
  2015-01-27  0:32 28% ` [PATCH char-misc-next v3 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Add SCIF module initialization, DMA mapping and ioremap wrapper APIs,
and debugfs hooks. The SCIF hardware bus probes the SCIF driver if base
drivers have registered SCIF devices. A MISC device is registered to
provide the SCIF character device interface.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
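The DMA wrappers in scif_map.h add scifdev->base_addr to a handle for
P2P peers and subtract it again before unmapping or freeing. A minimal
user-space sketch of that round trip follows. It is an illustration
only: struct peer, to_peer_addr() and to_local_addr() are hypothetical
names, and dma_addr_t is redefined here as a plain 64-bit integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;	/* user-space stand-in for the kernel type */

/* Hypothetical, reduced view of struct scif_dev for this sketch. */
struct peer {
	bool is_p2p;		/* what scifdev_is_p2p() would report */
	dma_addr_t base_addr;	/* P2P aperture bar base address */
};

/* Map path: a P2P peer sees the handle offset by base_addr. */
static dma_addr_t to_peer_addr(const struct peer *p, dma_addr_t local)
{
	return p->is_p2p ? local + p->base_addr : local;
}

/* Unmap path: strip the offset only if it can have been applied. */
static dma_addr_t to_local_addr(const struct peer *p, dma_addr_t addr)
{
	if (p->is_p2p && addr > p->base_addr)
		return addr - p->base_addr;
	return addr;
}
```

The "addr > base_addr" guard mirrors the check in scif_free_coherent()
and scif_unmap_single(), so a handle that never had the offset applied
is passed through unchanged.
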
 drivers/misc/mic/scif/scif_main.h    | 250 ++++++++++++++++++++++
 drivers/misc/mic/scif/scif_map.h     | 113 ++++++++++
 drivers/misc/mic/scif/scif_debugfs.c |  85 ++++++++
 drivers/misc/mic/scif/scif_main.c    | 390 +++++++++++++++++++++++++++++++++++
 4 files changed, 838 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c

diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
new file mode 100644
index 0000000..0bc091b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -0,0 +1,250 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAIN_H
+#define SCIF_MAIN_H
+
+#include <linux/sched.h>
+#include <linux/pci.h>
+#include <linux/miscdevice.h>
+#include <linux/dmaengine.h>
+#include <linux/file.h>
+#include <linux/scif.h>
+
+#include "../common/mic_dev.h"
+
+#define SCIF_MGMT_NODE 0
+#define SCIF_DEFAULT_WATCHDOG_TO 30
+#define SCIF_NODE_ACCEPT_TIMEOUT (3 * HZ)
+#define SCIF_NODE_ALIVE_TIMEOUT (SCIF_DEFAULT_WATCHDOG_TO * HZ)
+
+/*
+ * Generic state used for certain node QP message exchanges
+ * like Unregister, Alloc etc.
+ */
+enum scif_msg_state {
+	OP_IDLE = 1,
+	OP_IN_PROGRESS,
+	OP_COMPLETED,
+	OP_FAILED
+};
+
+/*
+ * struct scif_info - Global SCIF information
+ *
+ * @nodeid: Node ID this node is to others
+ * @maxid: Max known node ID
+ * @total: Total number of SCIF nodes
+ * @nr_zombies: number of zombie endpoints
+ * @eplock: Lock to synchronize listening, zombie endpoint lists
+ * @connlock: Lock to synchronize connected and disconnected lists
+ * @nb_connect_lock: Synchronize non blocking connect operations
+ * @port_lock: Synchronize access to SCIF ports
+ * @uaccept: List of user acceptreq waiting for acceptreg
+ * @listen: List of listening end points
+ * @zombie: List of zombie end points with pending RMAs
+ * @connected: List of end points in connected state
+ * @disconnected: List of end points in disconnected state
+ * @nb_connect_list: List for non blocking connections
+ * @misc_work: miscellaneous SCIF tasks
+ * @conflock: Lock to synchronize SCIF node configuration changes
+ * @en_msg_log: Enable debug message logging
+ * @p2p_enable: Enable P2P SCIF network
+ * @mdev: The MISC device
+ * @conn_work: Work for workqueue handling all connections
+ * @exitwq: Wait queue for waiting for an EXIT node QP message response
+ * @loopb_dev: Dummy SCIF device used for loopback
+ * @loopb_wq: Workqueue used for handling loopback messages
+ * @loopb_wqname: Name of loopback workqueue
+ * @loopb_work: Used for submitting work to loopb_wq
+ * @loopb_recv_q: List of messages received on the loopb_wq
+ * @card_initiated_exit: set when the card has initiated the exit
+ */
+struct scif_info {
+	u8 nodeid;
+	u8 maxid;
+	u8 total;
+	u32 nr_zombies;
+	spinlock_t eplock;
+	struct mutex connlock;
+	spinlock_t nb_connect_lock;
+	spinlock_t port_lock;
+	struct list_head uaccept;
+	struct list_head listen;
+	struct list_head zombie;
+	struct list_head connected;
+	struct list_head disconnected;
+	struct list_head nb_connect_list;
+	struct work_struct misc_work;
+	struct mutex conflock;
+	u8 en_msg_log;
+	u8 p2p_enable;
+	struct miscdevice mdev;
+	struct work_struct conn_work;
+	wait_queue_head_t exitwq;
+	struct scif_dev *loopb_dev;
+	struct workqueue_struct *loopb_wq;
+	char loopb_wqname[16];
+	struct work_struct loopb_work;
+	struct list_head loopb_recv_q;
+	bool card_initiated_exit;
+};
+
+/*
+ * struct scif_p2p_info - SCIF mapping information used for P2P
+ *
+ * @ppi_peer_id: SCIF peer node id
+ * @ppi_sg: Scatter list for bar information (One for mmio and one for aper)
+ * @sg_nentries: Number of entries in the scatterlist
+ * @ppi_da: DMA address for MMIO and APER bars
+ * @ppi_len: Length of MMIO and APER bars
+ * @ppi_list: Link in list of mapping information
+ */
+struct scif_p2p_info {
+	u8 ppi_peer_id;
+	struct scatterlist *ppi_sg[2];
+	u64 sg_nentries[2];
+	dma_addr_t ppi_da[2];
+	u64 ppi_len[2];
+#define SCIF_PPI_MMIO 0
+#define SCIF_PPI_APER 1
+	struct list_head ppi_list;
+};
+
+/*
+ * struct scif_dev - SCIF remote device specific fields
+ *
+ * @node: Node id
+ * @p2p: List of P2P mapping information
+ * @qpairs: The node queue pair for exchanging control messages
+ * @intr_wq: Workqueue for handling Node QP messages
+ * @intr_wqname: Name of node QP workqueue for handling interrupts
+ * @intr_bh: Used for submitting work to intr_wq
+ * @lock: Lock used for synchronizing access to the scif device
+ * @sdev: SCIF hardware device on the SCIF hardware bus
+ * @db: doorbell the peer will trigger to generate an interrupt on self
+ * @rdb: Doorbell to trigger on the peer to generate an interrupt on the peer
+ * @cookie: Cookie received while registering the interrupt handler
+ * @qp_dma_addr: Queue pair DMA address passed to the peer
+ * @p2p_dwork: Delayed work to enable polling for P2P state
+ * @qp_dwork: Delayed work for enabling polling for remote QP information
+ * @p2p_retry: Number of times to retry polling of P2P state
+ * @base_addr: P2P aperture bar base address
+ * @mmio: The peer MMIO information used for P2P
+ * @spdev: SCIF peer device on the SCIF peer bus
+ * @node_remove_ack_pending: True if a node_remove_ack is pending
+ * @exit_ack_pending: true if an exit_ack is pending
+ * @disconn_wq: Used while waiting for a node remove response
+ * @disconn_rescnt: Keeps track of number of node remove requests sent
+ * @exit: Status of exit message
+ */
+struct scif_dev {
+	u8 node;
+	struct list_head p2p;
+	struct scif_qp *qpairs;
+	struct workqueue_struct *intr_wq;
+	char intr_wqname[16];
+	struct work_struct intr_bh;
+	struct mutex lock;
+	struct scif_hw_dev *sdev;
+	int db;
+	int rdb;
+	struct mic_irq *cookie;
+	dma_addr_t qp_dma_addr;
+	struct delayed_work p2p_dwork;
+	struct delayed_work qp_dwork;
+	int p2p_retry;
+	dma_addr_t base_addr;
+	struct mic_mw mmio;
+	struct scif_peer_dev __rcu *spdev;
+	bool node_remove_ack_pending;
+	bool exit_ack_pending;
+	wait_queue_head_t disconn_wq;
+	atomic_t disconn_rescnt;
+	enum scif_msg_state exit;
+};
+
+extern struct scif_info scif_info;
+extern struct idr scif_ports;
+extern struct scif_dev *scif_dev;
+extern const struct file_operations scif_fops;
+
+/* Size of the RB for the Node QP */
+#define SCIF_NODE_QP_SIZE 0x10000
+
+#include "scif_nodeqp.h"
+
+/*
+ * scifdev_self:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device passed is the self aka Loopback SCIF device.
+ */
+static inline int scifdev_self(struct scif_dev *dev)
+{
+	return dev->node == scif_info.nodeid;
+}
+
+static inline bool scif_is_mgmt_node(void)
+{
+	return !scif_info.nodeid;
+}
+
+/*
+ * scifdev_is_p2p:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device is a MIC Peer to Peer SCIF device.
+ */
+static inline bool scifdev_is_p2p(struct scif_dev *dev)
+{
+	if (scif_is_mgmt_node())
+		return false;
+	else
+		return dev != &scif_dev[SCIF_MGMT_NODE] &&
+			!scifdev_self(dev);
+}
+
+/*
+ * _scifdev_alive:
+ * @scifdev: The remote SCIF Device
+ *
+ * Returns true if the remote SCIF Device is running or sleeping for
+ * this endpoint.
+ */
+static inline int _scifdev_alive(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	return !!spdev;
+}
+
+void __init scif_init_debugfs(void);
+void scif_exit_debugfs(void);
+int scif_setup_intr_wq(struct scif_dev *scifdev);
+void scif_destroy_intr_wq(struct scif_dev *scifdev);
+void scif_cleanup_scifdev(struct scif_dev *dev);
+void scif_handle_remove_node(int node);
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated);
+void scif_free_qp(struct scif_dev *dev);
+void scif_misc_handler(struct work_struct *work);
+void scif_stop(struct scif_dev *scifdev);
+irqreturn_t scif_intr_handler(int irq, void *data);
+#endif /* SCIF_MAIN_H */
diff --git a/drivers/misc/mic/scif/scif_map.h b/drivers/misc/mic/scif/scif_map.h
new file mode 100644
index 0000000..20e50b4
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_map.h
@@ -0,0 +1,113 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAP_H
+#define SCIF_MAP_H
+
+#include "../bus/scif_bus.h"
+
+static __always_inline void *
+scif_alloc_coherent(dma_addr_t *dma_handle,
+		    struct scif_dev *scifdev, size_t size,
+		    gfp_t gfp)
+{
+	void *va;
+
+	if (scifdev_self(scifdev)) {
+		va = kmalloc(size, gfp);
+		if (va)
+			*dma_handle = virt_to_phys(va);
+	} else {
+		va = dma_alloc_coherent(&scifdev->sdev->dev,
+					size, dma_handle, gfp);
+		if (va && scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	return va;
+}
+
+static __always_inline void
+scif_free_coherent(void *va, dma_addr_t local,
+		   struct scif_dev *scifdev, size_t size)
+{
+	if (scifdev_self(scifdev)) {
+		kfree(va);
+	} else {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_free_coherent(&scifdev->sdev->dev,
+				  size, va, local);
+	}
+}
+
+static __always_inline int
+scif_map_single(dma_addr_t *dma_handle,
+		void *local, struct scif_dev *scifdev, size_t size)
+{
+	int err = 0;
+
+	if (scifdev_self(scifdev)) {
+		*dma_handle = virt_to_phys((local));
+	} else {
+		*dma_handle = dma_map_single(&scifdev->sdev->dev,
+					     local, size, DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&scifdev->sdev->dev, *dma_handle))
+			err = -ENOMEM;
+		else if (scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	if (err)
+		*dma_handle = 0;
+	return err;
+}
+
+static __always_inline void
+scif_unmap_single(dma_addr_t local, struct scif_dev *scifdev,
+		  size_t size)
+{
+	if (!scifdev_self(scifdev)) {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_unmap_single(&scifdev->sdev->dev, local,
+				 size, DMA_BIDIRECTIONAL);
+	}
+}
+
+static __always_inline void *
+scif_ioremap(dma_addr_t phys, size_t size, struct scif_dev *scifdev)
+{
+	void *out_virt;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_self(scifdev))
+		out_virt = phys_to_virt(phys);
+	else
+		out_virt = (void __force *)
+			   sdev->hw_ops->ioremap(sdev, phys, size);
+	return out_virt;
+}
+
+static __always_inline void
+scif_iounmap(void *virt, size_t len, struct scif_dev *scifdev)
+{
+	if (!scifdev_self(scifdev)) {
+		struct scif_hw_dev *sdev = scifdev->sdev;
+
+		sdev->hw_ops->iounmap(sdev, (void __force __iomem *)virt);
+	}
+}
+#endif  /* SCIF_MAP_H */
diff --git a/drivers/misc/mic/scif/scif_debugfs.c b/drivers/misc/mic/scif/scif_debugfs.c
new file mode 100644
index 0000000..51f14e2
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_debugfs.c
@@ -0,0 +1,85 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "../common/mic_dev.h"
+#include "scif_main.h"
+
+/* Debugfs parent dir */
+static struct dentry *scif_dbg;
+
+static int scif_dev_test(struct seq_file *s, void *unused)
+{
+	int node;
+
+	seq_printf(s, "Total Nodes %d Self Node Id %d Maxid %d\n",
+		   scif_info.total, scif_info.nodeid,
+		   scif_info.maxid);
+
+	if (!scif_dev)
+		return 0;
+
+	seq_printf(s, "%-16s\t%-16s\n", "node_id", "state");
+
+	for (node = 0; node <= scif_info.maxid; node++)
+		seq_printf(s, "%-16d\t%-16s\n", scif_dev[node].node,
+			   _scifdev_alive(&scif_dev[node]) ?
+			   "Running" : "Offline");
+	return 0;
+}
+
+static int scif_dev_test_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, scif_dev_test, inode->i_private);
+}
+
+static int scif_dev_test_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations scif_dev_ops = {
+	.owner   = THIS_MODULE,
+	.open    = scif_dev_test_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = scif_dev_test_release
+};
+
+void __init scif_init_debugfs(void)
+{
+	struct dentry *d;
+
+	scif_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!scif_dbg) {
+		dev_err(scif_info.mdev.this_device,
+			"can't create debugfs dir scif\n");
+		return;
+	}
+
+	d = debugfs_create_file("scif_dev", 0444, scif_dbg,
+				NULL, &scif_dev_ops);
+	debugfs_create_u8("en_msg_log", 0666, scif_dbg, &scif_info.en_msg_log);
+	debugfs_create_u8("p2p_enable", 0666, scif_dbg, &scif_info.p2p_enable);
+}
+
+void scif_exit_debugfs(void)
+{
+	debugfs_remove_recursive(scif_dbg);
+}
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
new file mode 100644
index 0000000..714bfea
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -0,0 +1,390 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/idr.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_map.h"
+
+struct scif_info scif_info = {
+	.mdev = {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = "scif",
+		.fops = &scif_fops,
+	}
+};
+
+struct scif_dev *scif_dev;
+static atomic_t g_loopb_cnt;
+
+/* Runs in the context of intr_wq */
+static void scif_intr_bh_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev =
+			container_of(work, struct scif_dev, intr_bh);
+
+	if (scifdev_self(scifdev))
+		scif_loopb_msg_handler(scifdev, scifdev->qpairs);
+	else
+		scif_nodeqp_intrhandler(scifdev, scifdev->qpairs);
+}
+
+int scif_setup_intr_wq(struct scif_dev *scifdev)
+{
+	if (!scifdev->intr_wq) {
+		snprintf(scifdev->intr_wqname, sizeof(scifdev->intr_wqname),
+			 "SCIF INTR %d", scifdev->node);
+		scifdev->intr_wq =
+			alloc_ordered_workqueue(scifdev->intr_wqname, 0);
+		if (!scifdev->intr_wq)
+			return -ENOMEM;
+		INIT_WORK(&scifdev->intr_bh, scif_intr_bh_handler);
+	}
+	return 0;
+}
+
+void scif_destroy_intr_wq(struct scif_dev *scifdev)
+{
+	if (scifdev->intr_wq) {
+		destroy_workqueue(scifdev->intr_wq);
+		scifdev->intr_wq = NULL;
+	}
+}
+
+irqreturn_t scif_intr_handler(int irq, void *data)
+{
+	struct scif_dev *scifdev = data;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	sdev->hw_ops->ack_interrupt(sdev, scifdev->db);
+	queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+	return IRQ_HANDLED;
+}
+
+static int scif_peer_probe(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total++;
+	scif_info.maxid = max_t(u32, spdev->dnode, scif_info.maxid);
+	mutex_unlock(&scif_info.conflock);
+	rcu_assign_pointer(scifdev->spdev, spdev);
+
+	/* In the future SCIF kernel client devices will be added here */
+	dev_info(&spdev->dev, "Peer added dnode %d\n",
+		 spdev->dnode);
+	return 0;
+}
+
+static void scif_peer_remove(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	/* In the future SCIF kernel client devices will be removed here */
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	synchronize_rcu();
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total--;
+	mutex_unlock(&scif_info.conflock);
+	dev_info(&spdev->dev, "Peer removed dnode %d\n",
+		 spdev->dnode);
+}
+
+static void scif_qp_setup_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev = container_of(work, struct scif_dev,
+						qp_dwork.work);
+	struct scif_hw_dev *sdev = scifdev->sdev;
+	dma_addr_t da = 0;
+	int err;
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		da = bp->scif_card_dma_addr;
+		scifdev->rdb = bp->h2c_scif_db;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		da = readq(&bp->scif_host_dma_addr);
+		scifdev->rdb = ioread8(&bp->c2h_scif_db);
+	}
+	if (da) {
+		err = scif_qp_response(da, scifdev);
+		if (err)
+			dev_err(&scifdev->sdev->dev,
+				"scif_qp_response err %d\n", err);
+	} else {
+		schedule_delayed_work(&scifdev->qp_dwork,
+				      msecs_to_jiffies(1000));
+	}
+}
+
+static int scif_setup_scifdev(struct scif_hw_dev *sdev)
+{
+	int i;
+	u8 num_nodes;
+
+	if (sdev->snode) {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		num_nodes = ioread8(&bp->tot_nodes);
+	} else {
+		struct mic_bootparam *bp = sdev->dp;
+
+		num_nodes = bp->tot_nodes;
+	}
+	scif_dev = kcalloc(num_nodes, sizeof(*scif_dev), GFP_KERNEL);
+	if (!scif_dev)
+		return -ENOMEM;
+	for (i = 0; i < num_nodes; i++) {
+		struct scif_dev *scifdev = &scif_dev[i];
+
+		scifdev->node = i;
+		scifdev->exit = OP_IDLE;
+		init_waitqueue_head(&scifdev->disconn_wq);
+		mutex_init(&scifdev->lock);
+		INIT_DELAYED_WORK(&scifdev->p2p_dwork,
+				  scif_poll_qp_state);
+		INIT_DELAYED_WORK(&scifdev->qp_dwork,
+				  scif_qp_setup_handler);
+		INIT_LIST_HEAD(&scifdev->p2p);
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	}
+	return 0;
+}
+
+static void scif_destroy_scifdev(void)
+{
+	kfree(scif_dev);
+}
+
+static int scif_probe(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev;
+	int rc;
+
+	dev_set_drvdata(&sdev->dev, sdev);
+	if (1 == atomic_add_return(1, &g_loopb_cnt)) {
+		struct scif_dev *loopb_dev;
+
+		rc = scif_setup_scifdev(sdev);
+		if (rc)
+			goto exit;
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+		loopb_dev = &scif_dev[sdev->snode];
+		loopb_dev->sdev = sdev;
+		rc = scif_setup_loopback_qp(loopb_dev);
+		if (rc)
+			goto free_sdev;
+	} else {
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+	}
+	rc = scif_setup_intr_wq(scifdev);
+	if (rc)
+		goto destroy_loopb;
+	rc = scif_setup_qp(scifdev);
+	if (rc)
+		goto destroy_intr;
+	scifdev->db = sdev->hw_ops->next_db(sdev);
+	scifdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						    "SCIF_INTR", scifdev,
+						    scifdev->db);
+	if (IS_ERR(scifdev->cookie)) {
+		rc = PTR_ERR(scifdev->cookie);
+		goto free_qp;
+	}
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = scifdev->db;
+		bp->scif_host_dma_addr = scifdev->qp_dma_addr;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(scifdev->db, &bp->h2c_scif_db);
+		writeq(scifdev->qp_dma_addr, &bp->scif_card_dma_addr);
+	}
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+	return rc;
+free_qp:
+	scif_free_qp(scifdev);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+destroy_loopb:
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+free_sdev:
+	scif_destroy_scifdev();
+exit:
+	return rc;
+}
+
+void scif_stop(struct scif_dev *scifdev)
+{
+	struct scif_dev *dev;
+	int i;
+
+	for (i = scif_info.maxid; i >= 0; i--) {
+		dev = &scif_dev[i];
+		if (scifdev_self(dev))
+			continue;
+		scif_handle_remove_node(i);
+	}
+}
+
+static void scif_remove(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = -1;
+		bp->scif_host_dma_addr = 0x0;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(-1, &bp->h2c_scif_db);
+		writeq(0x0, &bp->scif_card_dma_addr);
+	}
+	if (scif_is_mgmt_node()) {
+		scif_disconnect_node(scifdev->node, true);
+	} else {
+		scif_info.card_initiated_exit = true;
+		scif_stop(scifdev);
+	}
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+	if (scifdev->cookie) {
+		sdev->hw_ops->free_irq(sdev, scifdev->cookie, scifdev);
+		scifdev->cookie = NULL;
+	}
+	scif_destroy_intr_wq(scifdev);
+	cancel_delayed_work(&scifdev->qp_dwork);
+	scif_free_qp(scifdev);
+	scifdev->rdb = -1;
+	scifdev->sdev = NULL;
+}
+
+static struct scif_peer_driver scif_peer_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.probe = scif_peer_probe,
+	.remove = scif_peer_remove,
+};
+
+static struct scif_hw_dev_id id_table[] = {
+	{ MIC_SCIF_DEV, SCIF_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct scif_driver scif_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = scif_probe,
+	.remove = scif_remove,
+};
+
+static int _scif_init(void)
+{
+	spin_lock_init(&scif_info.eplock);
+	spin_lock_init(&scif_info.nb_connect_lock);
+	spin_lock_init(&scif_info.port_lock);
+	mutex_init(&scif_info.conflock);
+	mutex_init(&scif_info.connlock);
+	INIT_LIST_HEAD(&scif_info.uaccept);
+	INIT_LIST_HEAD(&scif_info.listen);
+	INIT_LIST_HEAD(&scif_info.zombie);
+	INIT_LIST_HEAD(&scif_info.connected);
+	INIT_LIST_HEAD(&scif_info.disconnected);
+	INIT_LIST_HEAD(&scif_info.nb_connect_list);
+	init_waitqueue_head(&scif_info.exitwq);
+	scif_info.en_msg_log = 0;
+	scif_info.p2p_enable = 1;
+	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	idr_init(&scif_ports);
+	return 0;
+}
+
+static void _scif_exit(void)
+{
+	idr_destroy(&scif_ports);
+	scif_destroy_scifdev();
+}
+
+static int __init scif_init(void)
+{
+	struct miscdevice *mdev = &scif_info.mdev;
+	int rc;
+
+	_scif_init();
+	rc = scif_peer_bus_init();
+	if (rc)
+		goto exit;
+	rc = scif_peer_register_driver(&scif_peer_driver);
+	if (rc)
+		goto peer_bus_exit;
+	rc = scif_register_driver(&scif_driver);
+	if (rc)
+		goto unreg_scif_peer;
+	rc = misc_register(mdev);
+	if (rc)
+		goto unreg_scif;
+	scif_init_debugfs();
+	return 0;
+unreg_scif:
+	scif_unregister_driver(&scif_driver);
+unreg_scif_peer:
+	scif_peer_unregister_driver(&scif_peer_driver);
+peer_bus_exit:
+	scif_peer_bus_exit();
+exit:
+	_scif_exit();
+	return rc;
+}
+
+static void __exit scif_exit(void)
+{
+	scif_exit_debugfs();
+	misc_deregister(&scif_info.mdev);
+	scif_unregister_driver(&scif_driver);
+	scif_peer_unregister_driver(&scif_peer_driver);
+	scif_peer_bus_exit();
+	_scif_exit();
+}
+
+module_init(scif_init);
+module_exit(scif_exit);
+
+MODULE_DEVICE_TABLE(scif, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1



* [PATCH char-misc-next v3 11/13] misc: mic: MIC host driver specific changes to enable SCIF
  2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
                   ` (9 preceding siblings ...)
  2015-01-27  0:32 52% ` [PATCH char-misc-next v3 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
@ 2015-01-27  0:32 57% ` Sudeep Dutt
  2015-01-27  0:32 70% ` [PATCH char-misc-next v3 12/13] misc: mic: MIC card " Sudeep Dutt
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC host driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.
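
As an illustration of what "hardware bus operations" means here, the sketch
below shows a bus-agnostic core that reaches the hardware only through a table
of function pointers. It is a toy model and not the code in this patch:
struct toy_hw_ops, struct toy_hw_dev and the host_*/card_* helpers are
hypothetical names, and the "card" numbering scheme is made up. Only the hook
names next_db and send_intr are borrowed from the scif_hw_ops table below.

```c
#include <assert.h>
#include <stddef.h>

struct toy_hw_dev;

/* Hypothetical, reduced ops table; the real scif_hw_ops has more hooks. */
struct toy_hw_ops {
	int (*next_db)(struct toy_hw_dev *dev);            /* allocate a doorbell */
	void (*send_intr)(struct toy_hw_dev *dev, int db); /* ring a doorbell */
};

struct toy_hw_dev {
	const struct toy_hw_ops *ops;
	int next_free_db; /* next doorbell number to hand out */
	int last_db_rung; /* records the last doorbell rung, for inspection */
};

/* "Host" backend: doorbells are numbered from 0. */
static int host_next_db(struct toy_hw_dev *dev)
{
	return dev->next_free_db++;
}

static void host_send_intr(struct toy_hw_dev *dev, int db)
{
	dev->last_db_rung = db;
}

/* "Card" backend: same interface, different (made-up) numbering scheme. */
static int card_next_db(struct toy_hw_dev *dev)
{
	return 100 + dev->next_free_db++;
}

static void card_send_intr(struct toy_hw_dev *dev, int db)
{
	dev->last_db_rung = db;
}

static const struct toy_hw_ops host_ops = { host_next_db, host_send_intr };
static const struct toy_hw_ops card_ops = { card_next_db, card_send_intr };

/* Bus-agnostic core: it never touches hardware except through dev->ops. */
static int toy_core_ring_new_doorbell(struct toy_hw_dev *dev)
{
	int db;

	assert(dev != NULL && dev->ops != NULL);
	db = dev->ops->next_db(dev);
	dev->ops->send_intr(dev, db);
	return db;
}
```

The core function compiles once and behaves correctly with either backend,
which is the property the hardware bus is meant to give the SCIF driver.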

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |   2 +-
 drivers/misc/mic/host/mic_device.h  |  11 +-
 drivers/misc/mic/host/mic_intr.h    |   3 +-
 drivers/misc/mic/host/mic_smpt.h    |   1 +
 drivers/misc/mic/host/mic_boot.c    | 264 ++++++++++++++++++++++++++++++++++--
 drivers/misc/mic/host/mic_debugfs.c |  13 ++
 drivers/misc/mic/host/mic_main.c    |   6 +
 drivers/misc/mic/host/mic_smpt.c    |   7 +-
 drivers/misc/mic/host/mic_virtio.c  |   6 +-
 drivers/misc/mic/host/mic_x100.c    |   3 +-
 10 files changed, 292 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bcf6e36..a2ea3ab 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -36,7 +36,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 016bd15..01a7555 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -27,7 +27,7 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 #include <linux/mic_bus.h>
-
+#include "../bus/scif_bus.h"
 #include "mic_intr.h"
 
 /* The maximum number of MIC devices supported in a single host system. */
@@ -90,7 +90,9 @@ enum mic_stepping {
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
  * @dma_mbdev: MIC BUS DMA device.
- * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -129,7 +131,9 @@ struct mic_device {
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
 	struct mbus_device *dma_mbdev;
-	struct dma_chan *dma_ch;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -228,4 +232,5 @@ void mic_exit_debugfs(void);
 void mic_prepare_suspend(struct mic_device *mdev);
 void mic_complete_resume(struct mic_device *mdev);
 void mic_suspend(struct mic_device *mdev);
+extern atomic_t g_num_mics;
 #endif
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 9f783d4a..cce2882 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -28,8 +28,9 @@
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
  * 4 for host owned DMA channels.
+ * 1 for SCIF
  */
-#define MIC_MIN_MSIX 8
+#define MIC_MIN_MSIX 9
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_smpt.h b/drivers/misc/mic/host/mic_smpt.h
index 51970ab..68721c6 100644
--- a/drivers/misc/mic/host/mic_smpt.h
+++ b/drivers/misc/mic/host/mic_smpt.h
@@ -78,6 +78,7 @@ void mic_unmap_single(struct mic_device *mdev,
 dma_addr_t mic_map(struct mic_device *mdev,
 	dma_addr_t dma_addr, size_t size);
 void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size);
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr);
 
 /**
  * mic_map_error - Check a MIC address for errors.
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index ff2b0fb..bb15506 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/firmware.h>
 #include <linux/pci.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
@@ -29,6 +30,188 @@
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static void *__mic_dma_alloc(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle, gfp_t gfp,
+			     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	dma_addr_t tmp;
+	void *va = kmalloc(size, gfp);
+
+	if (va) {
+		tmp = mic_map_single(mdev, va, size);
+		if (dma_mapping_error(dev, tmp)) {
+			kfree(va);
+			va = NULL;
+		} else {
+			*dma_handle = tmp;
+		}
+	}
+	return va;
+}
+
+static void __mic_dma_free(struct device *dev, size_t size, void *vaddr,
+			   dma_addr_t dma_handle, struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_handle, size);
+	kfree(vaddr);
+}
+
+static dma_addr_t
+__mic_dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+__mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		     size_t size, enum dma_data_direction dir,
+		     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static int __mic_dma_map_sg(struct device *dev, struct scatterlist *sg,
+			    int nents, enum dma_data_direction dir,
+			    struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	int i, j, ret;
+	dma_addr_t da;
+
+	ret = dma_map_sg(mdev->sdev->parent, sg, nents, dir);
+	if (ret <= 0)
+		return 0;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_map(mdev, sg_dma_address(s) + s->offset, s->length);
+		if (!da)
+			goto err;
+		sg_dma_address(s) = da;
+	}
+	return nents;
+err:
+	for_each_sg(sg, s, i, j) {
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = mic_to_dma_addr(mdev, sg_dma_address(s));
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+	return 0;
+}
+
+static void __mic_dma_unmap_sg(struct device *dev,
+			       struct scatterlist *sg, int nents,
+			       enum dma_data_direction dir,
+			       struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	dma_addr_t da;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_to_dma_addr(mdev, sg_dma_address(s));
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = da;
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+}
+
+static struct dma_map_ops __mic_dma_ops = {
+	.alloc = __mic_dma_alloc,
+	.free = __mic_dma_free,
+	.map_page = __mic_dma_map_page,
+	.unmap_page = __mic_dma_unmap_page,
+	.map_sg = __mic_dma_map_sg,
+	.unmap_sg = __mic_dma_unmap_sg,
+};
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name,
+				       void *data, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					db, MIC_INTR_DB);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_free_irq(mdev, cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_next_db(mdev);
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *___mic_ioremap(struct scif_hw_dev *scdev,
+				    phys_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mdev->aper.va + pa;
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
 static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
 {
 	return dev_get_drvdata(mbdev->dev.parent);
@@ -127,6 +310,58 @@ void mic_bootparam_init(struct mic_device *mdev)
 	bootparam->h2c_config_db = -1;
 	bootparam->shutdown_status = 0;
 	bootparam->shutdown_card = 0;
+	/* Total nodes = number of MICs + 1 for self node */
+	bootparam->tot_nodes = atomic_read(&g_num_mics) + 1;
+	bootparam->node_id = mdev->id + 1;
+	bootparam->scif_host_dma_addr = 0x0;
+	bootparam->scif_card_dma_addr = 0x0;
+	bootparam->c2h_scif_db = -1;
+	bootparam->h2c_scif_db = -1;
+}
+
+/**
+ * mic_request_dma_chans - Request DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns number of DMA channels acquired
+ */
+static int mic_request_dma_chans(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, mdev->ops->dma_filter,
+					   mdev->sdev->parent);
+		if (chan) {
+			mdev->dma_ch[mdev->num_dma_ch++] = chan;
+			if (mdev->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdev->sdev->parent, "DMA channels # %d\n", mdev->num_dma_ch);
+	return mdev->num_dma_ch;
+}
+
+/**
+ * mic_free_dma_chans - release DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none
+ */
+static void mic_free_dma_chans(struct mic_device *mdev)
+{
+	int i = 0;
+
+	for (i = 0; i < mdev->num_dma_ch; i++) {
+		dma_release_channel(mdev->dma_ch[i]);
+		mdev->dma_ch[i] = NULL;
+	}
+	mdev->num_dma_ch = 0;
 }
 
 /**
@@ -141,6 +376,7 @@ int mic_start(struct mic_device *mdev, const char *buf)
 {
 	int rc;
 	mutex_lock(&mdev->mic_mutex);
+	mic_bootparam_init(mdev);
 retry:
 	if (MIC_OFFLINE != mdev->state) {
 		rc = -EINVAL;
@@ -161,14 +397,22 @@ retry:
 		rc = PTR_ERR(mdev->dma_mbdev);
 		goto unlock_ret;
 	}
-	mdev->dma_ch = mic_request_dma_chan(mdev);
-	if (!mdev->dma_ch) {
-		rc = -ENXIO;
+	if (!mic_request_dma_chans(mdev)) {
+		rc = -ENODEV;
 		goto dma_remove;
 	}
+	mdev->scdev = scif_register_device(mdev->sdev->parent, MIC_SCIF_DEV,
+					   &__mic_dma_ops, &scif_hw_ops,
+					   mdev->id + 1, 0, &mdev->mmio,
+					   &mdev->aper, mdev->dp, NULL,
+					   mdev->dma_ch, mdev->num_dma_ch);
+	if (IS_ERR(mdev->scdev)) {
+		rc = PTR_ERR(mdev->scdev);
+		goto dma_free;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto dma_release;
+		goto scif_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -177,8 +421,10 @@ retry:
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
 	goto unlock_ret;
-dma_release:
-	dma_release_channel(mdev->dma_ch);
+scif_remove:
+	scif_unregister_device(mdev->scdev);
+dma_free:
+	mic_free_dma_chans(mdev);
 dma_remove:
 	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
@@ -197,11 +443,9 @@ void mic_stop(struct mic_device *mdev, bool force)
 {
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
+		scif_unregister_device(mdev->scdev);
 		mic_virtio_reset_devices(mdev);
-		if (mdev->dma_ch) {
-			dma_release_channel(mdev->dma_ch);
-			mdev->dma_ch = NULL;
-		}
+		mic_free_dma_chans(mdev);
 		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 687e9aa..3c9ea48 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -214,6 +214,19 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
 		   bootparam->shutdown_card);
+	seq_printf(s, "Bootparam: tot_nodes %d\n",
+		   bootparam->tot_nodes);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index ab37a31..4564629 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -67,6 +67,8 @@ static struct ida g_mic_ida;
 static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
+/* Track the total number of MIC devices */
+atomic_t g_num_mics;
 
 static const struct file_operations mic_fops = {
 	.open = mic_open,
@@ -408,6 +410,7 @@ static int mic_probe(struct pci_dev *pdev,
 		dev_err(&pdev->dev, "cdev_add err id %d rc %d\n", mdev->id, rc);
 		goto cleanup_debug_dir;
 	}
+	atomic_inc(&g_num_mics);
 	return 0;
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
@@ -459,6 +462,7 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	mic_stop(mdev, false);
+	atomic_dec(&g_num_mics);
 	cdev_del(&mdev->cdev);
 	mic_delete_debug_dir(mdev);
 	mutex_lock(&mdev->mic_mutex);
@@ -478,6 +482,7 @@ static void mic_remove(struct pci_dev *pdev)
 	ida_simple_remove(&g_mic_ida, mdev->id);
 	kfree(mdev);
 }
+
 static struct pci_driver mic_driver = {
 	.name = mic_driver_name,
 	.id_table = mic_pci_tbl,
@@ -512,6 +517,7 @@ static int __init mic_init(void)
 	}
 	return ret;
 cleanup_debugfs:
+	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 cleanup_chrdev:
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index fae474c..cec8203 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -174,8 +174,7 @@ static int mic_get_smpt_ref_count(struct mic_device *mdev, dma_addr_t dma_addr,
  *
  * returns a DMA address.
  */
-static dma_addr_t
-mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 {
 	struct mic_smpt_info *smpt_info = mdev->smpt;
 	int spt;
@@ -214,7 +213,7 @@ dma_addr_t mic_map(struct mic_device *mdev, dma_addr_t dma_addr, size_t size)
 	if (!size || size > mic_max_system_memory(mdev))
 		return mic_addr;
 
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return mic_addr;
 
@@ -271,7 +270,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 	}
 
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return;
 
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index a020e4e..cc08e9f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -40,7 +40,7 @@ static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
 {
 	int err = 0;
 	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch;
+	struct dma_chan *mic_ch = mdev->dma_ch[0];
 
 	if (!mic_ch) {
 		err = -EBUSY;
@@ -80,7 +80,7 @@ static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t dma_offset;
 	size_t partlen;
 	int err;
@@ -129,7 +129,7 @@ static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t partlen;
 	int err;
 
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index b7a21e1..3341e90 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -167,8 +167,7 @@ static void mic_x100_send_intr(struct mic_device *mdev, int doorbell)
 	if (doorbell < MIC_X100_NUM_SBOX_IRQ) {
 		mic_x100_send_sbox_intr(mdev, doorbell);
 	} else {
-		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ;
 		mic_x100_send_rdmasr_intr(mdev, rdmasr_db);
 	}
 }
-- 
1.8.2.1



* [PATCH char-misc-next v3 00/13] misc: mic: SCIF driver
@ 2015-01-27  0:32 63% Sudeep Dutt
  2015-01-27  0:32 22% ` [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
                   ` (13 more replies)
  0 siblings, 14 replies; 200+ results
From: Sudeep Dutt @ 2015-01-27  0:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

ChangeLog:
=========

v2 => v3:
a) Formatting fixes for SCIF header file documentation, data type fixes
   for SCIF IOCTL interface and added SCIF user space documentation in
   scif_overview.txt as per feedback from Greg Kroah-Hartman

v1 => v2 @ https://lkml.org/lkml/2015/1/12/1029
a) Use kernel-doc formatting for SCIF header file documentation, enhance
   SCIF ring buffer documentation and formatting cleanup in patch 4 as
   per feedback from Greg Kroah-Hartman
b) SCIF bug fixes and cleanups since initial post

v1: Initial post @ https://lkml.org/lkml/2014/12/10/546

Description:
============

The Symmetric Communication Interface (SCIF, pronounced "skiff") is a
low-level communications API across PCIe, currently implemented for MIC. SCIF
provides inter-node communication within a single host platform, where a node
is either a MIC coprocessor or a Xeon-based host. It abstracts the details of
communicating over the PCIe bus while providing an API that is symmetric
across all the nodes in the PCIe network. An important design objective for
SCIF is to deliver the maximum possible performance given the communication
capabilities of the hardware. SCIF has been used to implement an offload
compiler runtime and OFED support for MPI implementations on MIC coprocessors.

==== SCIF API Components ====
The SCIF API has the following parts:
1. Connection establishment using a client server model
2. Byte stream messaging intended for short messages
3. Node enumeration to determine online nodes
4. Poll semantics for detection of incoming connections and messages
5. Memory registration to pin down pages
6. Remote memory mapping for low latency CPU accesses via mmap
7. Remote DMA (RDMA) for high bandwidth DMA transfers
8. Fence APIs for RDMA synchronization

SCIF exposes the notion of a connection which can be used by peer processes on
nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
process on a SCIF node initiates a SCIF connection to a peer process on a
different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
which are similar to connection-oriented socket APIs. Connected SCIF endpoints
can also register local memory and then transfer data using DMA, CPU copies,
or remote memory mapping via mmap. SCIF supports both user-mode and
kernel-mode clients, which are functionally equivalent.
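
The connection model described above can be sketched as a small state machine.
This is a toy, single-process model meant only to show the order of operations
(open, bind, listen, connect/accept, then byte stream send and receive). Every
toy_* name is hypothetical and none of this is the SCIF API itself. The
64-byte receive buffer is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these names are NOT the SCIF API. */
enum toy_state { TOY_UNBOUND, TOY_BOUND, TOY_LISTENING, TOY_CONNECTED };

struct toy_endpoint {
	enum toy_state state;
	int node;                  /* node the endpoint lives on */
	int port;                  /* valid once bound */
	struct toy_endpoint *peer; /* valid once connected */
	char rx[64];               /* bytes queued by the peer */
	size_t rx_len;
};

static void toy_open(struct toy_endpoint *ep, int node)
{
	assert(ep != NULL);
	memset(ep, 0, sizeof(*ep));
	ep->node = node;
	ep->state = TOY_UNBOUND;
}

static int toy_bind(struct toy_endpoint *ep, int port)
{
	if (ep->state != TOY_UNBOUND)
		return -1;
	ep->port = port;
	ep->state = TOY_BOUND;
	return 0;
}

static int toy_listen(struct toy_endpoint *ep)
{
	if (ep->state != TOY_BOUND)
		return -1;
	ep->state = TOY_LISTENING;
	return 0;
}

/*
 * Connect 'client' to a listening endpoint. 'accepted' becomes the
 * server-side endpoint for this connection, as with socket accept().
 */
static int toy_connect(struct toy_endpoint *client,
		       struct toy_endpoint *listener,
		       struct toy_endpoint *accepted)
{
	if (listener->state != TOY_LISTENING || client->state == TOY_CONNECTED)
		return -1;
	toy_open(accepted, listener->node);
	accepted->port = listener->port;
	accepted->peer = client;
	accepted->state = TOY_CONNECTED;
	client->peer = accepted;
	client->state = TOY_CONNECTED;
	return 0;
}

/* Append bytes to the peer's receive buffer; returns bytes queued. */
static int toy_send(struct toy_endpoint *ep, const void *msg, size_t len)
{
	struct toy_endpoint *peer = ep->peer;

	if (ep->state != TOY_CONNECTED)
		return -1;
	if (len > sizeof(peer->rx) - peer->rx_len)
		len = sizeof(peer->rx) - peer->rx_len;
	memcpy(peer->rx + peer->rx_len, msg, len);
	peer->rx_len += len;
	return (int)len;
}

/* Drain up to 'len' bytes from this endpoint's receive buffer. */
static int toy_recv(struct toy_endpoint *ep, void *buf, size_t len)
{
	if (ep->state != TOY_CONNECTED)
		return -1;
	if (len > ep->rx_len)
		len = ep->rx_len;
	memcpy(buf, ep->rx, len);
	memmove(ep->rx, ep->rx + len, ep->rx_len - len);
	ep->rx_len -= len;
	return (int)len;
}
```

A server would call toy_open(), toy_bind() and toy_listen(); a client on
another node would call toy_open() and toy_connect(), after which either side
can exchange bytes with toy_send() and toy_recv().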

==== SCIF Performance for MIC ====
A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
SCIF shows the performance advantages of SCIF for HPC applications and runtimes.

             Comparison of TCP and SCIF based BW

  Throughput (GB/sec)
    8 +                                             PCIe Bandwidth ******
      +                                                        TCP ######
    7 +    **************************************             SCIF %%%%%%
      |                       %%%%%%%%%%%%%%%%%%%
    6 +                   %%%%
      |                 %%
      |               %%%
    5 +              %%
      |            %%
    4 +           %%
      |          %%
    3 +         %%
      |        %
    2 +      %%
      |     %%
      |    %
    1 +
      +    ######################################
    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
      1       10     100      1000   10000   100000
                   Transfer Size (KBytes)

SCIF allows memory sharing via mmap() between processes on different PCIe
nodes and thus provides bare-metal PCIe latency. The round-trip SCIF mmap
latency from the host to an x100 MIC for an 8-byte message is 0.44 usecs.

For more information on SCIF please refer to the LinuxCon 2013 slides titled
"Intel® MIC x100 Coprocessor Driver - on the Frontiers of Linux & HPC" at
http://events.linuxfoundation.org/sites/events/files/slides/LinuxConMicDriver.pdf
The talk can also be viewed at https://www.youtube.com/watch?v=dylbmmQs4W0

This initial patch series introduces the SCIF API and implements the SCIF
connection, messaging and node enumeration APIs. SCIF poll, mmap, RDMA and
fence APIs will be submitted in a future patch series once the initial base
patches are accepted upstream.

This patch series is divided into 13 patches as follows:
1) SCIF header file documenting the API along with the IOCTL interface

2) SCIF ring buffer is a single producer, single consumer byte stream ring
   buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
   used to implement a receive queue for SCIF driver messaging between nodes and
   for byte stream messaging between SCIF endpoints. Each SCIF node has a
   receive queue for every other SCIF node, and each connected endpoint has a
   receive queue for messages from its peer. This pair of receive queues is
   referred to as a SCIF queue pair.

3) SCIF hardware bus which abstracts the low level hardware details so that the
   same SCIF driver can work without any changes on the host or the card as long
   as the hardware bus operations are implemented.

4) SCIF peer bus, which the SCIF driver uses internally to register and
   unregister SCIF peer devices, signifying the addition and removal of peer
   nodes from the SCIF network. This simplifies remote node handling within
   SCIF and will also be used to support device probe/remove for SCIF client
   drivers (e.g. netdev over SCIF).

5) Common MIC header file changes to enable SCIF

6) SCIF module initialization, DMA/remote memory mapping APIs and debugfs hooks.

7) SCIF node queue pair setup sets up the kernel mode private node
   queue pairs between all the nodes to enable internal control
   message communication.

8) SCIF character device file operations and kernel APIs for opening and
   closing a user and kernel mode SCIF endpoint. This patch also enables
   binding to a SCIF port and listening for incoming SCIF connections.

9) SCIF connection APIs which establish a SCIF connection between
   a pair of user or kernel mode endpoints.

10) SCIF messaging APIs which allow sending messages between the SCIF endpoints
    via a byte stream based ring buffer which has been optimized to avoid reads
    across PCIe. The node enumeration API enables a user to query for the number
    of online nodes.

11) MIC host driver specific changes to enable SCIF

12) MIC card driver specific changes to enable SCIF

13) Add support for loading/unloading SCIF driver
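
Item 2 above describes a single producer, single consumer byte ring buffer
optimized to avoid reads across the PCIe bus. One way to get that property,
assumed here for illustration and not taken from the patches, is to place the
data and the write offset in the consumer's memory and the read offset in the
producer's memory, so that each side only writes across the bus and only reads
its own local memory. All names below (consumer_mem, producer_mem, rb_write,
rb_read, RB_SIZE) are hypothetical, and the memory barriers a real
implementation needs are only marked in comments.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define RB_SIZE 16u /* power of two so offsets wrap with a mask */

/* Memory local to the consumer: the consumer only ever reads from here. */
struct consumer_mem {
	uint8_t data[RB_SIZE];
	uint32_t write_off; /* published by the producer */
};

/* Memory local to the producer: the producer only ever reads from here. */
struct producer_mem {
	uint32_t read_off; /* published by the consumer */
};

struct rb_producer {
	struct consumer_mem *remote; /* written across the bus, never read */
	struct producer_mem *local;
	uint32_t write_off;          /* private, free-running write offset */
};

struct rb_consumer {
	struct consumer_mem *local;
	struct producer_mem *remote; /* written across the bus, never read */
	uint32_t read_off;           /* private, free-running read offset */
};

/* Copy up to 'len' bytes into the ring; returns the number written. */
static uint32_t rb_write(struct rb_producer *p, const uint8_t *buf,
			 uint32_t len)
{
	/* Free space is computed from local state only. */
	uint32_t space = RB_SIZE - (p->write_off - p->local->read_off);
	uint32_t i;

	if (len > space)
		len = space;
	for (i = 0; i < len; i++)
		p->remote->data[(p->write_off + i) & (RB_SIZE - 1)] = buf[i];
	p->write_off += len;
	/* A real implementation needs a write barrier before publishing. */
	p->remote->write_off = p->write_off;
	return len;
}

/* Copy up to 'len' bytes out of the ring; returns the number read. */
static uint32_t rb_read(struct rb_consumer *c, uint8_t *buf, uint32_t len)
{
	/* Available bytes are computed from local state only. */
	uint32_t avail = c->local->write_off - c->read_off;
	uint32_t i;

	if (len > avail)
		len = avail;
	for (i = 0; i < len; i++)
		buf[i] = c->local->data[(c->read_off + i) & (RB_SIZE - 1)];
	c->read_off += len;
	/* Publishing the new read offset is again a write across the bus. */
	c->remote->read_off = c->read_off;
	return len;
}
```

Because the offsets are free-running 32-bit counters, "write_off - read_off"
gives the number of queued bytes even after the counters wrap, and the mask
turns an offset into an index into the data array.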

Nikhil Rao (1):
  misc: mic: SCIF connections APIs i.e. accept and connect

Sudeep Dutt (12):
  misc: mic: SCIF header file and IOCTL interface
  misc: mic: SCIF ring buffer infrastructure
  misc: mic: SCIF Hardware Bus
  misc: mic: SCIF Peer Bus
  misc: mic: Common MIC header file changes in preparation for SCIF
  misc: mic: SCIF module initialization
  misc: mic: SCIF node queue pair setup management
  misc: mic: SCIF open close bind and listen APIs
  misc: mic: SCIF messaging and node enumeration APIs
  misc: mic: MIC host driver specific changes to enable SCIF
  misc: mic: MIC card driver specific changes to enable SCIF
  misc: mic: add support for loading/unloading SCIF driver

 Documentation/mic/mic_overview.txt    |   28 +-
 Documentation/mic/scif_overview.txt   |   98 +++
 drivers/misc/mic/Kconfig              |   40 +-
 drivers/misc/mic/Makefile             |    3 +-
 drivers/misc/mic/bus/Makefile         |    1 +
 drivers/misc/mic/scif/Makefile        |   15 +
 include/uapi/linux/Kbuild             |    1 +
 drivers/misc/mic/bus/scif_bus.h       |  129 ++++
 drivers/misc/mic/card/mic_device.h    |   11 +-
 drivers/misc/mic/card/mic_x100.h      |    1 +
 drivers/misc/mic/common/mic_dev.h     |    3 +
 drivers/misc/mic/host/mic_device.h    |   11 +-
 drivers/misc/mic/host/mic_intr.h      |    3 +-
 drivers/misc/mic/host/mic_smpt.h      |    1 +
 drivers/misc/mic/scif/scif_epd.h      |  160 ++++
 drivers/misc/mic/scif/scif_main.h     |  252 +++++++
 drivers/misc/mic/scif/scif_map.h      |  113 +++
 drivers/misc/mic/scif/scif_nodeqp.h   |  182 +++++
 drivers/misc/mic/scif/scif_peer_bus.h |   65 ++
 drivers/misc/mic/scif/scif_rb.h       |  100 +++
 include/linux/scif.h                  |  993 +++++++++++++++++++++++++
 include/uapi/linux/mic_common.h       |   12 +
 include/uapi/linux/scif_ioctl.h       |  215 ++++++
 drivers/misc/mic/bus/scif_bus.c       |  210 ++++++
 drivers/misc/mic/card/mic_device.c    |  132 +++-
 drivers/misc/mic/card/mic_x100.c      |   61 +-
 drivers/misc/mic/host/mic_boot.c      |  264 ++++++-
 drivers/misc/mic/host/mic_debugfs.c   |   13 +
 drivers/misc/mic/host/mic_main.c      |    6 +
 drivers/misc/mic/host/mic_smpt.c      |    7 +-
 drivers/misc/mic/host/mic_virtio.c    |    6 +-
 drivers/misc/mic/host/mic_x100.c      |    3 +-
 drivers/misc/mic/scif/scif_api.c      | 1271 ++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_debugfs.c  |   85 +++
 drivers/misc/mic/scif/scif_epd.c      |  353 +++++++++
 drivers/misc/mic/scif/scif_fd.c       |  303 ++++++++
 drivers/misc/mic/scif/scif_main.c     |  391 ++++++++++
 drivers/misc/mic/scif/scif_nm.c       |  237 ++++++
 drivers/misc/mic/scif/scif_nodeqp.c   | 1307 +++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c |  124 ++++
 drivers/misc/mic/scif/scif_ports.c    |  124 ++++
 drivers/misc/mic/scif/scif_rb.c       |  248 +++++++
 Documentation/mic/mpssd/mpss          |   24 +-
 43 files changed, 7550 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

-- 
1.8.2.1


^ permalink raw reply	[relevance 63%]

* Re: [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface
  @ 2015-01-26 18:07 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-26 18:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Sun, 2015-01-25 at 20:39 +0800, Greg Kroah-Hartman wrote:
> On Mon, Jan 12, 2015 at 03:09:55PM -0800, Sudeep Dutt wrote:
> > --- /dev/null
> > +++ b/include/uapi/linux/scif_ioctl.h
> > @@ -0,0 +1,213 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * This file is provided under a dual BSD/GPLv2 license.  When using or
> > + * redistributing this file, you may do so under either license.
> > + *
> > + * GPL LICENSE SUMMARY
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of version 2 of the GNU General Public License as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * General Public License for more details.
> > + *
> > + * BSD LICENSE
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + *
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above copyright
> > + *   notice, this list of conditions and the following disclaimer in
> > + *   the documentation and/or other materials provided with the
> > + *   distribution.
> > + * * Neither the name of Intel Corporation nor the names of its
> > + *   contributors may be used to endorse or promote products derived
> > + *   from this software without specific prior written permission.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + *
> > + * Intel SCIF driver.
> > + *
> > + */
> > +/*
> > + * -----------------------------------------
> > + * SCIF IOCTL interface information
> > + * -----------------------------------------
> > + */
> > +#ifndef SCIF_IOCTL_H
> > +#define SCIF_IOCTL_H
> > +
> > +/**
> > + * struct scif_port_id - SCIF port information
> > + * @node:	node on which port resides
> > + * @port:	local port number
> > + */
> > +struct scif_port_id {
> > +	uint16_t node;
> > +	uint16_t port;
> 
> variables that cross the kernel/user interface need to use the __*
> types, so these would be __u16.  Please fix up all of these structures
> in this file.
> 

Agreed, I have fixed it in v3 which will be posted this week.


> > +};
> > +
> > +/**
> > + * struct scifioctl_connect - used for SCIF_CONNECT IOCTL
> > + * @self:	used to read back the assigned port_id
> > + * @peer:	destination node and port to connect to
> > + */
> > +struct scifioctl_connect {
> > +	struct scif_port_id	self;
> > +	struct scif_port_id	peer;
> > +};
> > +
> > +/**
> > + * struct scifioctl_accept - used for SCIF_ACCEPTREQ IOCTL
> > + * @flags:	flags
> > + * @peer:	global id of peer endpoint
> > + * @endpt:	new connected endpoint descriptor
> > + */
> > +struct scifioctl_accept {
> > +	int			flags;
> 
> int?  Please be specific of your types for ioctls.
> 

Agreed, I have fixed this in v3 which will be posted this week.

Thanks,
Sudeep Dutt




^ permalink raw reply	[relevance 99%]
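
As an illustration of the two review comments above, here is a minimal
sketch of how the quoted structures might look once every member uses a
fixed-width `__*` type. It is not the v3 patch: the thread does not show the
types chosen for v3, so `__s32` for `flags`, `__u64` for `endpt` and the
struct name `scifioctl_accept_sketch` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/types.h>	/* __u16, __s32, __u64 */

/* From the quoted patch, with uint16_t replaced by __u16 as requested. */
struct scif_port_id {
	__u16 node;
	__u16 port;
};

/* Unchanged apart from the member type of scif_port_id. */
struct scifioctl_connect {
	struct scif_port_id	self;
	struct scif_port_id	peer;
};

/*
 * Hypothetical: __s32 for flags and __u64 for endpt are assumptions,
 * the thread does not show the types used in v3.
 */
struct scifioctl_accept_sketch {
	__s32			flags;
	struct scif_port_id	peer;
	__u64			endpt;
};

/* With explicit widths the layout can be checked at compile time. */
static_assert(sizeof(struct scif_port_id) == 4, "two 16-bit members");
static_assert(sizeof(struct scifioctl_connect) == 8, "two port ids");
static_assert(offsetof(struct scifioctl_accept_sketch, endpt) == 8,
	      "endpt follows flags and peer without padding");
```

The point of the explicit widths is that a 32-bit user space process and a
64-bit kernel agree on the size and offset of every member, which plain
`int` or `long` does not guarantee.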

* Re: [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface
  @ 2015-01-26 18:07 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-26 18:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Sun, 2015-01-25 at 20:41 +0800, Greg Kroah-Hartman wrote:
> On Mon, Jan 12, 2015 at 03:09:55PM -0800, Sudeep Dutt wrote:
> > + * Errors:
> > + * EBADF, ENOTTY
> > + * epd is not a valid endpoint descriptor
> > + * ECONNREFUSED
> > + * The destination was not listening for connections or refused the
> > + *	connection request
> > + * EINVAL
> > + * epd is not a valid endpoint descriptor, or
> > + * dst.port is not a valid port ID
> > + * EISCONN
> > + * The endpoint is already connected
> > + * ENOMEM
> > + * No buffer space is available
> > + * ENODEV
> > + * The destination node does not exist, or
> > + * The node is lost or existed, but is not currently in the network since it
> > + *	may have crashed
> > + * ENOSPC
> > + * No port number available for assignment
> > + * EOPNOTSUPP
> > + * The endpoint is listening and cannot be connected
> > +*/
> 
> Very odd formatting of these error codes here, and other places in this
> .h file, please fix them all up and properly wrap your lines.
> 

Agreed, I have cleaned it up in v3 which will be posted this week.

Thanks,
Sudeep Dutt



^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v2 00/13] misc: mic: SCIF driver
  @ 2015-01-26 18:07 97%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-26 18:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Sun, 2015-01-25 at 20:44 +0800, Greg Kroah-Hartman wrote:
> On Mon, Jan 12, 2015 at 03:09:54PM -0800, Sudeep Dutt wrote:
> > ChangeLog:
> > =========
> > v1 => v2:
> > a) Use kernel-doc formatting for SCIF header file documentation, enhance
> >    SCIF ring buffer documentation and formatting cleanup in patch 4 as
> >    per feedback from Greg Kroah-Hartman
> > b) SCIF bug fixes and cleanups since initial post
> > 
> > v1: Initial post @ https://lkml.org/lkml/2014/12/10/546
> 
> Do you have any test code that can be added to the tree to exercise this
> api and at least give a hint as to how it all works from the userspace
> side?  You don't have any pointers to any userspace code, what is using
> this API?
> 

SCIF has been used to implement an ICC offload compiler runtime in user
space. We will be submitting a kernel client as well if this initial
patch series is accepted upstream. SCIF has a user space library which
is a thin IOCTL wrapper providing a user space API similar to the kernel
API in scif.h. The SCIF user space library is distributed separately via
https://software.intel.com/en-us/mic-developer

Here is some SCIF pseudo code as an example:

Process A (on host)                     Process B (on MIC Coprocessor)

scif_open(..)                           scif_open(..)
scif_bind(..)                           scif_bind(..)
scif_listen(..)
scif_accept(..)                         scif_connect(..)
/* SCIF connection established */

/* Send and receive control messages */
scif_send(..)/scif_recv(..)             scif_send(..)/scif_recv(..)

/* Register memory */
scif_register(..)                       scif_register(..)

/* RDMA */
scif_readfrom(..)/scif_writeto(..)      scif_readfrom(..)/scif_writeto(..)

mmap(..)                                mmap(..)
/* Access remote registered memory */

/* Close the endpoints */
scif_close(..)                          scif_close(..)

The user space code has been distributed and maintained separately from
the kernel so we did not include it as part of this patch series to
avoid bloating the kernel. Do you think we should include the user space
SCIF library and some sample user space test applications under
Documentation/mic with this patch series?

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 97%]
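
The call ordering in the pseudo code above can be modelled with a toy state
machine. This is only a sketch in plain user space C, not the SCIF driver or
library: the state names and `ep_*` helpers are hypothetical, the connect
side completes without waiting for the accept, and accepting a connect on an
unbound endpoint is an assumption. The `EISCONN` and `EOPNOTSUPP` cases
follow the error list quoted earlier in this thread.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical endpoint states, named for this sketch only. */
enum ep_state { EP_OPEN, EP_BOUND, EP_LISTENING, EP_CONNECTED, EP_CLOSED };

struct ep {
	enum ep_state state;
};

int ep_bind(struct ep *e)
{
	if (e->state != EP_OPEN)
		return -EINVAL;
	e->state = EP_BOUND;
	return 0;
}

int ep_listen(struct ep *e)
{
	if (e->state != EP_BOUND)
		return -EINVAL;
	e->state = EP_LISTENING;
	return 0;
}

/* The listener keeps listening; a new connected endpoint is produced. */
int ep_accept(const struct ep *listener, struct ep *newep)
{
	if (listener->state != EP_LISTENING)
		return -EINVAL;
	newep->state = EP_CONNECTED;
	return 0;
}

int ep_connect(struct ep *e)
{
	if (e->state == EP_CONNECTED)
		return -EISCONN;	/* already connected */
	if (e->state == EP_LISTENING)
		return -EOPNOTSUPP;	/* listening, cannot be connected */
	if (e->state == EP_CLOSED)
		return -EINVAL;		/* no longer a valid endpoint */
	/* Assumption: both unbound and bound endpoints may connect. */
	e->state = EP_CONNECTED;
	return 0;
}

int ep_close(struct ep *e)
{
	e->state = EP_CLOSED;
	return 0;
}

/* Walk both columns of the pseudo code once. */
void handshake_demo(void)
{
	struct ep host = { EP_OPEN }, card = { EP_OPEN }, conn = { EP_OPEN };

	assert(ep_bind(&host) == 0 && ep_listen(&host) == 0);
	assert(ep_bind(&card) == 0 && ep_connect(&card) == 0);
	assert(ep_accept(&host, &conn) == 0);
	assert(conn.state == EP_CONNECTED && host.state == EP_LISTENING);
}
```

The model makes one property of the sequence explicit: data transfer on the
host side uses the endpoint returned by the accept step, while the listening
endpoint stays available for further connections.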

* [PATCH char-misc-next v2 03/13] misc: mic: SCIF Hardware Bus
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
  2015-01-12 23:09 22% ` [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
  2015-01-12 23:09 58% ` [PATCH char-misc-next v2 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
@ 2015-01-12 23:09 61% ` Sudeep Dutt
  2015-01-12 23:09 78% ` [PATCH char-misc-next v2 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF hardware bus abstracts the low level hardware driver details
like interrupts and mapping remote memory so that the same SCIF driver
can work without any changes with the MIC host or card driver as long
as the hardware bus operations are implemented. The SCIF hardware
device is registered by the host and card drivers on the SCIF hardware
bus resulting in probing the SCIF driver.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig        |  17 ++++
 drivers/misc/mic/Makefile       |   2 +-
 drivers/misc/mic/bus/Makefile   |   1 +
 drivers/misc/mic/bus/scif_bus.h | 129 ++++++++++++++++++++++++
 drivers/misc/mic/bus/scif_bus.c | 210 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 358 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index cc4eef0..afc9f53 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -15,6 +15,23 @@ config INTEL_MIC_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "SCIF Bus Driver"
+
+config SCIF_BUS
+	tristate "SCIF Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the SCIF Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e9bf148..6bc6083 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,4 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
-obj-$(CONFIG_INTEL_MIC_BUS) += bus/
+obj-y += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index d85c7f2..1ed37e2 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -3,3 +3,4 @@
 # Copyright(c) 2014, Intel Corporation.
 #
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
+obj-$(CONFIG_SCIF_BUS) += scif_bus.o
diff --git a/drivers/misc/mic/bus/scif_bus.h b/drivers/misc/mic/bus/scif_bus.h
new file mode 100644
index 0000000..335a228
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.h
@@ -0,0 +1,129 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#ifndef _SCIF_BUS_H_
+#define _SCIF_BUS_H_
+/*
+ * Everything a scif driver needs to work with any particular scif
+ * hardware abstraction layer.
+ */
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+struct scif_hw_dev_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define MIC_SCIF_DEV 1
+#define SCIF_DEV_ANY_ID 0xffffffff
+
+/**
+ * scif_hw_dev - representation of a hardware device abstracted for scif
+ * @hw_ops: the hardware ops supported by this device
+ * @id: the device type identification (used to match it with a driver)
+ * @mmio: MMIO memory window
+ * @aper: Aperture memory window
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ * @snode: The source node for this device.
+ * @dp: Self device page
+ * @rdp: Remote device page
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ */
+struct scif_hw_dev {
+	struct scif_hw_ops *hw_ops;
+	struct scif_hw_dev_id id;
+	struct mic_mw *mmio;
+	struct mic_mw *aper;
+	struct device dev;
+	u8 dnode;
+	u8 snode;
+	void *dp;
+	void __iomem *rdp;
+	struct dma_chan **dma_ch;
+	int num_dma_ch;
+};
+
+/**
+ * scif_driver - operations for a scif I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_driver {
+	struct device_driver driver;
+	const struct scif_hw_dev_id *id_table;
+	int (*probe)(struct scif_hw_dev *dev);
+	void (*remove)(struct scif_hw_dev *dev);
+};
+
+/**
+ * scif_hw_ops - Hardware operations for accessing a SCIF device on the SCIF bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @send_intr: Send an interrupt to the remote node on a specified doorbell.
+ * @send_p2p_intr: Send an interrupt to the peer node on a specified doorbell
+ * which is specifically targeted for a peer to peer node.
+ * @ioremap: Map a buffer with the specified physical address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ */
+struct scif_hw_ops {
+	int (*next_db)(struct scif_hw_dev *sdev);
+	struct mic_irq * (*request_irq)(struct scif_hw_dev *sdev,
+					irqreturn_t (*func)(int irq,
+							    void *data),
+					const char *name, void *data,
+					int db);
+	void (*free_irq)(struct scif_hw_dev *sdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct scif_hw_dev *sdev, int num);
+	void (*send_intr)(struct scif_hw_dev *sdev, int db);
+	void (*send_p2p_intr)(struct scif_hw_dev *sdev, int db,
+			      struct mic_mw *mw);
+	void __iomem * (*ioremap)(struct scif_hw_dev *sdev,
+				  phys_addr_t pa, size_t len);
+	void (*iounmap)(struct scif_hw_dev *sdev, void __iomem *va);
+};
+
+int scif_register_driver(struct scif_driver *driver);
+void scif_unregister_driver(struct scif_driver *driver);
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id,
+		     struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper,
+		     void *dp, void __iomem *rdp,
+		     struct dma_chan **chan, int num_chan);
+void scif_unregister_device(struct scif_hw_dev *sdev);
+
+static inline struct scif_hw_dev *dev_to_scif(struct device *dev)
+{
+	return container_of(dev, struct scif_hw_dev, dev);
+}
+
+static inline struct scif_driver *drv_to_scif(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_driver, driver);
+}
+#endif /* _SCIF_BUS_H_ */
diff --git a/drivers/misc/mic/bus/scif_bus.c b/drivers/misc/mic/bus/scif_bus.c
new file mode 100644
index 0000000..2da7cee
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.c
@@ -0,0 +1,210 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "scif_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "scif:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *scif_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(scif_dev);
+
+static inline int scif_id_match(const struct scif_hw_dev *dev,
+				const struct scif_hw_dev_id *id)
+{
+	if (id->device != dev->id.device && id->device != SCIF_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == SCIF_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call scif_dev_probe().
+ */
+static int scif_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+	const struct scif_hw_dev_id *ids;
+
+	ids = drv_to_scif(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (scif_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int scif_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+
+	return add_uevent_var(env, "MODALIAS=scif:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int scif_dev_probe(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_dev_remove(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_bus = {
+	.name  = "scif_bus",
+	.match = scif_dev_match,
+	.dev_groups = scif_dev_groups,
+	.uevent = scif_uevent,
+	.probe = scif_dev_probe,
+	.remove = scif_dev_remove,
+};
+
+int scif_register_driver(struct scif_driver *driver)
+{
+	driver->driver.bus = &scif_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_register_driver);
+
+void scif_unregister_driver(struct scif_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_driver);
+
+static void scif_release_dev(struct device *d)
+{
+	struct scif_hw_dev *sdev = dev_to_scif(d);
+
+	kfree(sdev);
+}
+
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper, void *dp,
+		     void __iomem *rdp, struct dma_chan **chan, int num_chan)
+{
+	int ret;
+	struct scif_hw_dev *sdev;
+
+	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
+	if (!sdev)
+		return ERR_PTR(-ENOMEM);
+
+	sdev->dev.parent = pdev;
+	sdev->id.device = id;
+	sdev->id.vendor = SCIF_DEV_ANY_ID;
+	sdev->dev.archdata.dma_ops = dma_ops;
+	sdev->dev.release = scif_release_dev;
+	sdev->hw_ops = hw_ops;
+	sdev->dnode = dnode;
+	sdev->snode = snode;
+	dev_set_drvdata(&sdev->dev, sdev);
+	sdev->dev.bus = &scif_bus;
+	sdev->mmio = mmio;
+	sdev->aper = aper;
+	sdev->dp = dp;
+	sdev->rdp = rdp;
+	sdev->dev.dma_mask = &sdev->dev.coherent_dma_mask;
+	dma_set_mask(&sdev->dev, DMA_BIT_MASK(64));
+	sdev->dma_ch = chan;
+	sdev->num_dma_ch = num_chan;
+	dev_set_name(&sdev->dev, "scif-dev%u", sdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&sdev->dev);
+	if (ret)
+		goto free_sdev;
+	return sdev;
+free_sdev:
+	put_device(&sdev->dev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(scif_register_device);
+
+void scif_unregister_device(struct scif_hw_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_device);
+
+static int __init scif_init(void)
+{
+	return bus_register(&scif_bus);
+}
+
+static void __exit scif_exit(void)
+{
+	bus_unregister(&scif_bus);
+}
+
+core_initcall(scif_init);
+module_exit(scif_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1


^ permalink raw reply related	[relevance 61%]
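
The matching rules in the patch above can be exercised outside the kernel.
The following is a sketch that copies the comparison from scif_id_match(),
the table walk from scif_dev_match() and the format string from
modalias_show() into plain user space C. The helper names `id_match`,
`table_match`, `modalias` and `match_demo`, and the example id table, are
assumptions for illustration; no SCIF driver id table appears in this part
of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MIC_SCIF_DEV 1
#define SCIF_DEV_ANY_ID 0xffffffff

struct scif_hw_dev_id {
	uint32_t device;
	uint32_t vendor;
};

/* Same comparison as scif_id_match(), applied to plain ids. */
int id_match(const struct scif_hw_dev_id *dev,
	     const struct scif_hw_dev_id *id)
{
	if (id->device != dev->device && id->device != SCIF_DEV_ANY_ID)
		return 0;

	return id->vendor == SCIF_DEV_ANY_ID || id->vendor == dev->vendor;
}

/* As in scif_dev_match(): the table ends at the first device == 0. */
int table_match(const struct scif_hw_dev_id *dev,
		const struct scif_hw_dev_id *ids)
{
	unsigned int i;

	for (i = 0; ids[i].device; i++)
		if (id_match(dev, &ids[i]))
			return 1;
	return 0;
}

/* Same format as modalias_show(), without the trailing newline. */
int modalias(const struct scif_hw_dev_id *dev, char *buf, size_t len)
{
	return snprintf(buf, len, "scif:d%08Xv%08X",
			(unsigned int)dev->device,
			(unsigned int)dev->vendor);
}

void match_demo(void)
{
	/* scif_register_device() sets the vendor to SCIF_DEV_ANY_ID. */
	struct scif_hw_dev_id dev = { MIC_SCIF_DEV, SCIF_DEV_ANY_ID };
	/* Hypothetical driver id table. */
	const struct scif_hw_dev_id table[] = {
		{ MIC_SCIF_DEV, SCIF_DEV_ANY_ID },
		{ 0, 0 },
	};
	char buf[32];

	assert(table_match(&dev, table));
	modalias(&dev, buf, sizeof(buf));
	assert(strcmp(buf, "scif:d00000001vFFFFFFFF") == 0);
}
```

Because the table walk stops at the first entry whose device field is zero,
a device id of 0 can never be listed in an id table, and every table needs a
zeroed terminator entry.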

* [PATCH char-misc-next v2 05/13] misc: mic: Common MIC header file changes in preparation for SCIF
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (3 preceding siblings ...)
  2015-01-12 23:09 78% ` [PATCH char-misc-next v2 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
@ 2015-01-12 23:09 96% ` Sudeep Dutt
  2015-01-12 23:10 47% ` [PATCH char-misc-next v2 06/13] misc: mic: SCIF module initialization Sudeep Dutt
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Add the SCIF node count, node id, doorbell and queue pair DMA address
fields to mic_bootparam, and define the maximum number of DMA channels.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/common/mic_dev.h |  3 +++
 include/uapi/linux/mic_common.h   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/misc/mic/common/mic_dev.h b/drivers/misc/mic/common/mic_dev.h
index 92999c2..0b58c46 100644
--- a/drivers/misc/mic/common/mic_dev.h
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -48,4 +48,7 @@ struct mic_mw {
 #define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
 #define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
 
+/* Maximum number of DMA channels */
+#define MIC_MAX_DMA_CHAN 4
+
 #endif
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
index 6eb4024..302a2ce 100644
--- a/include/uapi/linux/mic_common.h
+++ b/include/uapi/linux/mic_common.h
@@ -80,6 +80,12 @@ struct mic_device_ctrl {
  * @h2c_config_db: Host to Card Virtio config doorbell set by card
  * @shutdown_status: Card shutdown status set by card
  * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
+ * @tot_nodes: Total number of nodes in the SCIF network
+ * @node_id: Unique id of the node
+ * @h2c_scif_db: Host to card SCIF doorbell set by card
+ * @c2h_scif_db: Card to host SCIF doorbell set by host
+ * @scif_host_dma_addr: SCIF host queue pair DMA address
+ * @scif_card_dma_addr: SCIF card queue pair DMA address
  */
 struct mic_bootparam {
 	__le32 magic;
@@ -88,6 +94,12 @@ struct mic_bootparam {
 	__s8 h2c_config_db;
 	__u8 shutdown_status;
 	__u8 shutdown_card;
+	__u8 tot_nodes;
+	__u8 node_id;
+	__u8 h2c_scif_db;
+	__u8 c2h_scif_db;
+	__u64 scif_host_dma_addr;
+	__u64 scif_card_dma_addr;
 } __attribute__ ((aligned(8)));
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 96%]

* [PATCH char-misc-next v2 06/13] misc: mic: SCIF module initialization
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (4 preceding siblings ...)
  2015-01-12 23:09 96% ` [PATCH char-misc-next v2 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
@ 2015-01-12 23:10 47% ` Sudeep Dutt
  2015-01-12 23:10 28% ` [PATCH char-misc-next v2 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF module initialization, DMA mapping, ioremap wrapper APIs
and debugfs hooks. SCIF gets probed by the SCIF hardware bus
if SCIF devices were registered by base drivers. A MISC device
is registered to provide the SCIF character device interface.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_main.h    | 250 ++++++++++++++++++++++
 drivers/misc/mic/scif/scif_map.h     | 113 ++++++++++
 drivers/misc/mic/scif/scif_debugfs.c |  85 ++++++++
 drivers/misc/mic/scif/scif_main.c    | 390 +++++++++++++++++++++++++++++++++++
 4 files changed, 838 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c

diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
new file mode 100644
index 0000000..0bc091b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -0,0 +1,250 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAIN_H
+#define SCIF_MAIN_H
+
+#include <linux/sched.h>
+#include <linux/pci.h>
+#include <linux/miscdevice.h>
+#include <linux/dmaengine.h>
+#include <linux/file.h>
+#include <linux/scif.h>
+
+#include "../common/mic_dev.h"
+
+#define SCIF_MGMT_NODE 0
+#define SCIF_DEFAULT_WATCHDOG_TO 30
+#define SCIF_NODE_ACCEPT_TIMEOUT (3 * HZ)
+#define SCIF_NODE_ALIVE_TIMEOUT (SCIF_DEFAULT_WATCHDOG_TO * HZ)
+
+/*
+ * Generic state used for certain node QP message exchanges
+ * like Unregister, Alloc etc.
+ */
+enum scif_msg_state {
+	OP_IDLE = 1,
+	OP_IN_PROGRESS,
+	OP_COMPLETED,
+	OP_FAILED
+};
+
+/*
+ * struct scif_info - Global SCIF information
+ *
+ * @nodeid: Node ID this node is to others
+ * @maxid: Max known node ID
+ * @total: Total number of SCIF nodes
+ * @nr_zombies: number of zombie endpoints
+ * @eplock: Lock to synchronize listening, zombie endpoint lists
+ * @connlock: Lock to synchronize connected and disconnected lists
+ * @nb_connect_lock: Synchronize non blocking connect operations
+ * @port_lock: Synchronize access to SCIF ports
+ * @uaccept: List of user acceptreq waiting for acceptreg
+ * @listen: List of listening end points
+ * @zombie: List of zombie end points with pending RMA's
+ * @connected: List of end points in connected state
+ * @disconnected: List of end points in disconnected state
+ * @nb_connect_list: List for non blocking connections
+ * @misc_work: miscellaneous SCIF tasks
+ * @conflock: Lock to synchronize SCIF node configuration changes
+ * @en_msg_log: Enable debug message logging
+ * @p2p_enable: Enable P2P SCIF network
+ * @mdev: The MISC device
+ * @conn_work: Work for workqueue handling all connections
+ * @exitwq: Wait queue for waiting for an EXIT node QP message response
+ * @loopb_dev: Dummy SCIF device used for loopback
+ * @loopb_wq: Workqueue used for handling loopback messages
+ * @loopb_wqname[16]: Name of loopback workqueue
+ * @loopb_work: Used for submitting work to loopb_wq
+ * @loopb_recv_q: List of messages received on the loopb_wq
+ * @card_initiated_exit: set when the card has initiated the exit
+ */
+struct scif_info {
+	u8 nodeid;
+	u8 maxid;
+	u8 total;
+	u32 nr_zombies;
+	spinlock_t eplock;
+	struct mutex connlock;
+	spinlock_t nb_connect_lock;
+	spinlock_t port_lock;
+	struct list_head uaccept;
+	struct list_head listen;
+	struct list_head zombie;
+	struct list_head connected;
+	struct list_head disconnected;
+	struct list_head nb_connect_list;
+	struct work_struct misc_work;
+	struct mutex conflock;
+	u8 en_msg_log;
+	u8 p2p_enable;
+	struct miscdevice mdev;
+	struct work_struct conn_work;
+	wait_queue_head_t exitwq;
+	struct scif_dev *loopb_dev;
+	struct workqueue_struct *loopb_wq;
+	char loopb_wqname[16];
+	struct work_struct loopb_work;
+	struct list_head loopb_recv_q;
+	bool card_initiated_exit;
+};
+
+/*
+ * struct scif_p2p_info - SCIF mapping information used for P2P
+ *
+ * @ppi_peer_id: SCIF peer node id
+ * @ppi_sg: Scatter list for bar information (One for mmio and one for aper)
+ * @sg_nentries: Number of entries in the scatterlist
+ * @ppi_da: DMA address for MMIO and APER bars
+ * @ppi_len: Length of MMIO and APER bars
+ * @ppi_list: Link in list of mapping information
+ */
+struct scif_p2p_info {
+	u8 ppi_peer_id;
+	struct scatterlist *ppi_sg[2];
+	u64 sg_nentries[2];
+	dma_addr_t ppi_da[2];
+	u64 ppi_len[2];
+#define SCIF_PPI_MMIO 0
+#define SCIF_PPI_APER 1
+	struct list_head ppi_list;
+};
+
+/*
+ * struct scif_dev - SCIF remote device specific fields
+ *
+ * @node: Node id
+ * @p2p: List of P2P mapping information
+ * @qpairs: The node queue pair for exchanging control messages
+ * @intr_wq: Workqueue for handling Node QP messages
+ * @intr_wqname: Name of node QP workqueue for handling interrupts
+ * @intr_bh: Used for submitting work to intr_wq
+ * @lock: Lock used for synchronizing access to the scif device
+ * @sdev: SCIF hardware device on the SCIF hardware bus
+ * @db: doorbell the peer will trigger to generate an interrupt on self
+ * @rdb: Doorbell to trigger on the peer to generate an interrupt on the peer
+ * @cookie: Cookie received while registering the interrupt handler
+ * @qp_dma_addr: Queue pair DMA address passed to the peer
+ * @p2p_dwork: Delayed work to enable polling for P2P state
+ * @qp_dwork: Delayed work for enabling polling for remote QP information
+ * @p2p_retry: Number of times to retry polling of P2P state
+ * @base_addr: P2P aperture bar base address
+ * @mmio: The peer MMIO information used for P2P
+ * @spdev: SCIF peer device on the SCIF peer bus
+ * @node_remove_ack_pending: True if a node_remove_ack is pending
+ * @exit_ack_pending: true if an exit_ack is pending
+ * @disconn_wq: Used while waiting for a node remove response
+ * @disconn_rescnt: Keeps track of number of node remove requests sent
+ * @exit: Status of exit message
+ */
+struct scif_dev {
+	u8 node;
+	struct list_head p2p;
+	struct scif_qp *qpairs;
+	struct workqueue_struct *intr_wq;
+	char intr_wqname[16];
+	struct work_struct intr_bh;
+	struct mutex lock;
+	struct scif_hw_dev *sdev;
+	int db;
+	int rdb;
+	struct mic_irq *cookie;
+	dma_addr_t qp_dma_addr;
+	struct delayed_work p2p_dwork;
+	struct delayed_work qp_dwork;
+	int p2p_retry;
+	dma_addr_t base_addr;
+	struct mic_mw mmio;
+	struct scif_peer_dev __rcu *spdev;
+	bool node_remove_ack_pending;
+	bool exit_ack_pending;
+	wait_queue_head_t disconn_wq;
+	atomic_t disconn_rescnt;
+	enum scif_msg_state exit;
+};
+
+extern struct scif_info scif_info;
+extern struct idr scif_ports;
+extern struct scif_dev *scif_dev;
+extern const struct file_operations scif_fops;
+
+/* Size of the RB for the Node QP */
+#define SCIF_NODE_QP_SIZE 0x10000
+
+#include "scif_nodeqp.h"
+
+/*
+ * scifdev_self:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device passed is the self aka Loopback SCIF device.
+ */
+static inline int scifdev_self(struct scif_dev *dev)
+{
+	return dev->node == scif_info.nodeid;
+}
+
+static inline bool scif_is_mgmt_node(void)
+{
+	return !scif_info.nodeid;
+}
+
+/*
+ * scifdev_is_p2p:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device is a MIC Peer to Peer SCIF device.
+ */
+static inline bool scifdev_is_p2p(struct scif_dev *dev)
+{
+	if (scif_is_mgmt_node())
+		return false;
+	else
+		return dev != &scif_dev[SCIF_MGMT_NODE] &&
+			!scifdev_self(dev);
+}
+
+/*
+ * _scifdev_alive:
+ * @scifdev: The remote SCIF Device
+ *
+ * Returns true if the remote SCIF Device is running or sleeping for
+ * this endpoint.
+ */
+static inline int _scifdev_alive(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	return !!spdev;
+}
+
+void __init scif_init_debugfs(void);
+void scif_exit_debugfs(void);
+int scif_setup_intr_wq(struct scif_dev *scifdev);
+void scif_destroy_intr_wq(struct scif_dev *scifdev);
+void scif_cleanup_scifdev(struct scif_dev *dev);
+void scif_handle_remove_node(int node);
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated);
+void scif_free_qp(struct scif_dev *dev);
+void scif_misc_handler(struct work_struct *work);
+void scif_stop(struct scif_dev *scifdev);
+irqreturn_t scif_intr_handler(int irq, void *data);
+#endif /* SCIF_MAIN_H */
diff --git a/drivers/misc/mic/scif/scif_map.h b/drivers/misc/mic/scif/scif_map.h
new file mode 100644
index 0000000..20e50b4
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_map.h
@@ -0,0 +1,113 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAP_H
+#define SCIF_MAP_H
+
+#include "../bus/scif_bus.h"
+
+static __always_inline void *
+scif_alloc_coherent(dma_addr_t *dma_handle,
+		    struct scif_dev *scifdev, size_t size,
+		    gfp_t gfp)
+{
+	void *va;
+
+	if (scifdev_self(scifdev)) {
+		va = kmalloc(size, gfp);
+		if (va)
+			*dma_handle = virt_to_phys(va);
+	} else {
+		va = dma_alloc_coherent(&scifdev->sdev->dev,
+					size, dma_handle, gfp);
+		if (va && scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	return va;
+}
+
+static __always_inline void
+scif_free_coherent(void *va, dma_addr_t local,
+		   struct scif_dev *scifdev, size_t size)
+{
+	if (scifdev_self(scifdev)) {
+		kfree(va);
+	} else {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_free_coherent(&scifdev->sdev->dev,
+				  size, va, local);
+	}
+}
+
+static __always_inline int
+scif_map_single(dma_addr_t *dma_handle,
+		void *local, struct scif_dev *scifdev, size_t size)
+{
+	int err = 0;
+
+	if (scifdev_self(scifdev)) {
+		*dma_handle = virt_to_phys((local));
+	} else {
+		*dma_handle = dma_map_single(&scifdev->sdev->dev,
+					     local, size, DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&scifdev->sdev->dev, *dma_handle))
+			err = -ENOMEM;
+		else if (scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	if (err)
+		*dma_handle = 0;
+	return err;
+}
+
+static __always_inline void
+scif_unmap_single(dma_addr_t local, struct scif_dev *scifdev,
+		  size_t size)
+{
+	if (!scifdev_self(scifdev)) {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_unmap_single(&scifdev->sdev->dev, local,
+				 size, DMA_BIDIRECTIONAL);
+	}
+}
+
+static __always_inline void *
+scif_ioremap(dma_addr_t phys, size_t size, struct scif_dev *scifdev)
+{
+	void *out_virt;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_self(scifdev))
+		out_virt = phys_to_virt(phys);
+	else
+		out_virt = (void __force *)
+			   sdev->hw_ops->ioremap(sdev, phys, size);
+	return out_virt;
+}
+
+static __always_inline void
+scif_iounmap(void *virt, size_t len, struct scif_dev *scifdev)
+{
+	if (!scifdev_self(scifdev)) {
+		struct scif_hw_dev *sdev = scifdev->sdev;
+
+		sdev->hw_ops->iounmap(sdev, (void __force __iomem *)virt);
+	}
+}
+#endif  /* SCIF_MAP_H */
diff --git a/drivers/misc/mic/scif/scif_debugfs.c b/drivers/misc/mic/scif/scif_debugfs.c
new file mode 100644
index 0000000..51f14e2
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_debugfs.c
@@ -0,0 +1,85 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "../common/mic_dev.h"
+#include "scif_main.h"
+
+/* Debugfs parent dir */
+static struct dentry *scif_dbg;
+
+static int scif_dev_test(struct seq_file *s, void *unused)
+{
+	int node;
+
+	seq_printf(s, "Total Nodes %d Self Node Id %d Maxid %d\n",
+		   scif_info.total, scif_info.nodeid,
+		   scif_info.maxid);
+
+	if (!scif_dev)
+		return 0;
+
+	seq_printf(s, "%-16s\t%-16s\n", "node_id", "state");
+
+	for (node = 0; node <= scif_info.maxid; node++)
+		seq_printf(s, "%-16d\t%-16s\n", scif_dev[node].node,
+			   _scifdev_alive(&scif_dev[node]) ?
+			   "Running" : "Offline");
+	return 0;
+}
+
+static int scif_dev_test_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, scif_dev_test, inode->i_private);
+}
+
+static int scif_dev_test_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations scif_dev_ops = {
+	.owner   = THIS_MODULE,
+	.open    = scif_dev_test_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = scif_dev_test_release
+};
+
+void __init scif_init_debugfs(void)
+{
+	struct dentry *d;
+
+	scif_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!scif_dbg) {
+		dev_err(scif_info.mdev.this_device,
+			"can't create debugfs dir scif\n");
+		return;
+	}
+
+	d = debugfs_create_file("scif_dev", 0444, scif_dbg,
+				NULL, &scif_dev_ops);
+	debugfs_create_u8("en_msg_log", 0666, scif_dbg, &scif_info.en_msg_log);
+	debugfs_create_u8("p2p_enable", 0666, scif_dbg, &scif_info.p2p_enable);
+}
+
+void scif_exit_debugfs(void)
+{
+	debugfs_remove_recursive(scif_dbg);
+}
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
new file mode 100644
index 0000000..714bfea
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -0,0 +1,390 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/idr.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_map.h"
+
+struct scif_info scif_info = {
+	.mdev = {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = "scif",
+		.fops = &scif_fops,
+	}
+};
+
+struct scif_dev *scif_dev;
+static atomic_t g_loopb_cnt;
+
+/* Runs in the context of intr_wq */
+static void scif_intr_bh_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev =
+			container_of(work, struct scif_dev, intr_bh);
+
+	if (scifdev_self(scifdev))
+		scif_loopb_msg_handler(scifdev, scifdev->qpairs);
+	else
+		scif_nodeqp_intrhandler(scifdev, scifdev->qpairs);
+}
+
+int scif_setup_intr_wq(struct scif_dev *scifdev)
+{
+	if (!scifdev->intr_wq) {
+		snprintf(scifdev->intr_wqname, sizeof(scifdev->intr_wqname),
+			 "SCIF INTR %d", scifdev->node);
+		scifdev->intr_wq =
+			alloc_ordered_workqueue(scifdev->intr_wqname, 0);
+		if (!scifdev->intr_wq)
+			return -ENOMEM;
+		INIT_WORK(&scifdev->intr_bh, scif_intr_bh_handler);
+	}
+	return 0;
+}
+
+void scif_destroy_intr_wq(struct scif_dev *scifdev)
+{
+	if (scifdev->intr_wq) {
+		destroy_workqueue(scifdev->intr_wq);
+		scifdev->intr_wq = NULL;
+	}
+}
+
+irqreturn_t scif_intr_handler(int irq, void *data)
+{
+	struct scif_dev *scifdev = data;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	sdev->hw_ops->ack_interrupt(sdev, scifdev->db);
+	queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+	return IRQ_HANDLED;
+}
+
+static int scif_peer_probe(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total++;
+	scif_info.maxid = max_t(u32, spdev->dnode, scif_info.maxid);
+	mutex_unlock(&scif_info.conflock);
+	rcu_assign_pointer(scifdev->spdev, spdev);
+
+	/* In the future SCIF kernel client devices will be added here */
+	dev_info(&spdev->dev, "Peer added dnode %d\n",
+		 spdev->dnode);
+	return 0;
+}
+
+static void scif_peer_remove(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	/* In the future SCIF kernel client devices will be removed here */
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	synchronize_rcu();
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total--;
+	mutex_unlock(&scif_info.conflock);
+	dev_info(&spdev->dev, "Peer removed dnode %d\n",
+		 spdev->dnode);
+}
+
+static void scif_qp_setup_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev = container_of(work, struct scif_dev,
+						qp_dwork.work);
+	struct scif_hw_dev *sdev = scifdev->sdev;
+	dma_addr_t da = 0;
+	int err;
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		da = bp->scif_card_dma_addr;
+		scifdev->rdb = bp->h2c_scif_db;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		da = readq(&bp->scif_host_dma_addr);
+		scifdev->rdb = ioread8(&bp->c2h_scif_db);
+	}
+	if (da) {
+		err = scif_qp_response(da, scifdev);
+		if (err)
+			dev_err(&scifdev->sdev->dev,
+				"scif_qp_response err %d\n", err);
+	} else {
+		schedule_delayed_work(&scifdev->qp_dwork,
+				      msecs_to_jiffies(1000));
+	}
+}
+
+static int scif_setup_scifdev(struct scif_hw_dev *sdev)
+{
+	int i;
+	u8 num_nodes;
+
+	if (sdev->snode) {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		num_nodes = ioread8(&bp->tot_nodes);
+	} else {
+		struct mic_bootparam *bp = sdev->dp;
+
+		num_nodes = bp->tot_nodes;
+	}
+	scif_dev = kcalloc(num_nodes, sizeof(*scif_dev), GFP_KERNEL);
+	if (!scif_dev)
+		return -ENOMEM;
+	for (i = 0; i < num_nodes; i++) {
+		struct scif_dev *scifdev = &scif_dev[i];
+
+		scifdev->node = i;
+		scifdev->exit = OP_IDLE;
+		init_waitqueue_head(&scifdev->disconn_wq);
+		mutex_init(&scifdev->lock);
+		INIT_DELAYED_WORK(&scifdev->p2p_dwork,
+				  scif_poll_qp_state);
+		INIT_DELAYED_WORK(&scifdev->qp_dwork,
+				  scif_qp_setup_handler);
+		INIT_LIST_HEAD(&scifdev->p2p);
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	}
+	return 0;
+}
+
+static void scif_destroy_scifdev(void)
+{
+	kfree(scif_dev);
+}
+
+static int scif_probe(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev;
+	int rc;
+
+	dev_set_drvdata(&sdev->dev, sdev);
+	if (1 == atomic_add_return(1, &g_loopb_cnt)) {
+		struct scif_dev *loopb_dev;
+
+		rc = scif_setup_scifdev(sdev);
+		if (rc)
+			goto exit;
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+		loopb_dev = &scif_dev[sdev->snode];
+		loopb_dev->sdev = sdev;
+		rc = scif_setup_loopback_qp(loopb_dev);
+		if (rc)
+			goto free_sdev;
+	} else {
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+	}
+	rc = scif_setup_intr_wq(scifdev);
+	if (rc)
+		goto destroy_loopb;
+	rc = scif_setup_qp(scifdev);
+	if (rc)
+		goto destroy_intr;
+	scifdev->db = sdev->hw_ops->next_db(sdev);
+	scifdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						    "SCIF_INTR", scifdev,
+						    scifdev->db);
+	if (IS_ERR(scifdev->cookie)) {
+		rc = PTR_ERR(scifdev->cookie);
+		goto free_qp;
+	}
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = scifdev->db;
+		bp->scif_host_dma_addr = scifdev->qp_dma_addr;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(scifdev->db, &bp->h2c_scif_db);
+		writeq(scifdev->qp_dma_addr, &bp->scif_card_dma_addr);
+	}
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+	return rc;
+free_qp:
+	scif_free_qp(scifdev);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+destroy_loopb:
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+free_sdev:
+	scif_destroy_scifdev();
+exit:
+	return rc;
+}
+
+void scif_stop(struct scif_dev *scifdev)
+{
+	struct scif_dev *dev;
+	int i;
+
+	for (i = scif_info.maxid; i >= 0; i--) {
+		dev = &scif_dev[i];
+		if (scifdev_self(dev))
+			continue;
+		scif_handle_remove_node(i);
+	}
+}
+
+static void scif_remove(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = -1;
+		bp->scif_host_dma_addr = 0x0;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(-1, &bp->h2c_scif_db);
+		writeq(0x0, &bp->scif_card_dma_addr);
+	}
+	if (scif_is_mgmt_node()) {
+		scif_disconnect_node(scifdev->node, true);
+	} else {
+		scif_info.card_initiated_exit = true;
+		scif_stop(scifdev);
+	}
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+	if (scifdev->cookie) {
+		sdev->hw_ops->free_irq(sdev, scifdev->cookie, scifdev);
+		scifdev->cookie = NULL;
+	}
+	scif_destroy_intr_wq(scifdev);
+	cancel_delayed_work(&scifdev->qp_dwork);
+	scif_free_qp(scifdev);
+	scifdev->rdb = -1;
+	scifdev->sdev = NULL;
+}
+
+static struct scif_peer_driver scif_peer_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.probe = scif_peer_probe,
+	.remove = scif_peer_remove,
+};
+
+static struct scif_hw_dev_id id_table[] = {
+	{ MIC_SCIF_DEV, SCIF_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct scif_driver scif_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = scif_probe,
+	.remove = scif_remove,
+};
+
+static int _scif_init(void)
+{
+	spin_lock_init(&scif_info.eplock);
+	spin_lock_init(&scif_info.nb_connect_lock);
+	spin_lock_init(&scif_info.port_lock);
+	mutex_init(&scif_info.conflock);
+	mutex_init(&scif_info.connlock);
+	INIT_LIST_HEAD(&scif_info.uaccept);
+	INIT_LIST_HEAD(&scif_info.listen);
+	INIT_LIST_HEAD(&scif_info.zombie);
+	INIT_LIST_HEAD(&scif_info.connected);
+	INIT_LIST_HEAD(&scif_info.disconnected);
+	INIT_LIST_HEAD(&scif_info.nb_connect_list);
+	init_waitqueue_head(&scif_info.exitwq);
+	scif_info.en_msg_log = 0;
+	scif_info.p2p_enable = 1;
+	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	idr_init(&scif_ports);
+	return 0;
+}
+
+static void _scif_exit(void)
+{
+	idr_destroy(&scif_ports);
+	scif_destroy_scifdev();
+}
+
+static int __init scif_init(void)
+{
+	struct miscdevice *mdev = &scif_info.mdev;
+	int rc;
+
+	_scif_init();
+	rc = scif_peer_bus_init();
+	if (rc)
+		goto exit;
+	rc = scif_peer_register_driver(&scif_peer_driver);
+	if (rc)
+		goto peer_bus_exit;
+	rc = scif_register_driver(&scif_driver);
+	if (rc)
+		goto unreg_scif_peer;
+	rc = misc_register(mdev);
+	if (rc)
+		goto unreg_scif;
+	scif_init_debugfs();
+	return 0;
+unreg_scif:
+	scif_unregister_driver(&scif_driver);
+unreg_scif_peer:
+	scif_peer_unregister_driver(&scif_peer_driver);
+peer_bus_exit:
+	scif_peer_bus_exit();
+exit:
+	_scif_exit();
+	return rc;
+}
+
+static void __exit scif_exit(void)
+{
+	scif_exit_debugfs();
+	misc_deregister(&scif_info.mdev);
+	scif_unregister_driver(&scif_driver);
+	scif_peer_unregister_driver(&scif_peer_driver);
+	scif_peer_bus_exit();
+	_scif_exit();
+}
+
+module_init(scif_init);
+module_exit(scif_exit);
+
+MODULE_DEVICE_TABLE(scif, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1
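
The scif_map.h helpers in the patch above add scifdev->base_addr to a DMA
handle when the remote device is a P2P peer, and subtract it again before
unmapping or freeing. The following is a minimal userspace sketch of that
round trip, not the driver code: struct sketch_dev, sketch_to_peer() and
sketch_to_local() are hypothetical names, and the two flags stand in for
scifdev_self() and scifdev_is_p2p().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct scif_dev used here. */
struct sketch_dev {
	bool is_self;       /* loopback device: no translation at all */
	bool is_p2p;        /* peer card reached through its aperture BAR */
	uint64_t base_addr; /* P2P aperture BAR base address */
};

/* Address handed to the peer: P2P peers see it offset by base_addr. */
static uint64_t sketch_to_peer(const struct sketch_dev *dev, uint64_t dma)
{
	if (!dev->is_self && dev->is_p2p)
		return dma + dev->base_addr;
	return dma;
}

/*
 * Inverse applied before unmap/free. The "handle > base_addr" guard
 * mirrors the check in scif_free_coherent() and scif_unmap_single().
 */
static uint64_t sketch_to_local(const struct sketch_dev *dev, uint64_t handle)
{
	if (!dev->is_self && dev->is_p2p && handle > dev->base_addr)
		return handle - dev->base_addr;
	return handle;
}
```

For a non-P2P device (the host as seen from a card, or a card as seen from
the host) both helpers return the address unchanged.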


^ permalink raw reply related	[relevance 47%]

* [PATCH char-misc-next v2 07/13] misc: mic: SCIF node queue pair setup management
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (5 preceding siblings ...)
  2015-01-12 23:10 47% ` [PATCH char-misc-next v2 06/13] misc: mic: SCIF module initialization Sudeep Dutt
@ 2015-01-12 23:10 28% ` Sudeep Dutt
  2015-01-12 23:10 41% ` [PATCH char-misc-next v2 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF node queue pair setup creates the SCIF driver's private
kernel-mode node queue pairs between all nodes, enabling internal
control message communication once SCIF is probed by the SCIF
hardware bus. Peer-to-peer communication between MIC coprocessor
nodes is supported.
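
As an illustration of the host/card exchange that starts queue pair setup,
here is a minimal userspace sketch, assuming a simplified boot parameter
block. struct sketch_bootparam, sketch_publish() and sketch_poll() are
hypothetical names; the field names follow those used by scif_probe() and
scif_qp_setup_handler() in patch 06, but the types and layout here are
assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the boot parameters shared by host and card. */
struct sketch_bootparam {
	uint64_t scif_host_dma_addr; /* QP address published by the host */
	uint64_t scif_card_dma_addr; /* QP address published by the card */
	int8_t c2h_scif_db;          /* doorbell the card rings on the host */
	int8_t h2c_scif_db;          /* doorbell the host rings on the card */
};

/* Each side publishes its own doorbell and queue pair DMA address. */
static void sketch_publish(struct sketch_bootparam *bp, bool is_mgmt,
			   uint64_t qp_dma_addr, int8_t db)
{
	if (is_mgmt) {
		bp->c2h_scif_db = db;
		bp->scif_host_dma_addr = qp_dma_addr;
	} else {
		bp->h2c_scif_db = db;
		bp->scif_card_dma_addr = qp_dma_addr;
	}
}

/* One polling step; returns true once the peer's address is visible. */
static bool sketch_poll(const struct sketch_bootparam *bp, bool is_mgmt,
			uint64_t *peer_da, int8_t *rdb)
{
	uint64_t da = is_mgmt ? bp->scif_card_dma_addr
			      : bp->scif_host_dma_addr;

	/* The remote doorbell is read on every poll, as in the handler. */
	*rdb = is_mgmt ? bp->h2c_scif_db : bp->c2h_scif_db;
	if (!da)
		return false; /* caller retries later; the driver waits 1000 ms */
	*peer_da = da;
	return true;
}
```

In the driver the retry is a delayed work item rescheduled every second, and
a non-zero address is passed on to scif_qp_response().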

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_nodeqp.h |  172 +++++
 drivers/misc/mic/scif/scif_nm.c     |  184 +++++
 drivers/misc/mic/scif/scif_nodeqp.c | 1259 +++++++++++++++++++++++++++++++++++
 3 files changed, 1615 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c

diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
new file mode 100644
index 0000000..4d8f53c
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -0,0 +1,172 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_NODEQP
+#define SCIF_NODEQP
+
+#include "scif_rb.h"
+#include "scif_peer_bus.h"
+
+#define SCIF_INIT 1  /* First message sent to the peer node for discovery */
+#define SCIF_EXIT 2  /* Last message from the peer informing intent to exit */
+#define SCIF_EXIT_ACK 3 /* Response to SCIF_EXIT message */
+#define SCIF_NODE_ADD 4  /* Tell Online nodes a new node exists */
+#define SCIF_NODE_ADD_ACK 5  /* Confirm to mgmt node sequence is finished */
+#define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
+#define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
+#define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
+#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+
+/*
+ * struct scifmsg - Node QP message format
+ *
+ * @src: Source information
+ * @dst: Destination information
+ * @uop: The message opcode
+ * @payload: Unique payload format for each message
+ */
+struct scifmsg {
+	struct scif_port_id src;
+	struct scif_port_id dst;
+	uint32_t uop;
+	uint64_t payload[4];
+} __packed;
+
+/*
+ * struct scif_qp - Node Queue Pair
+ *
+ * Interesting structure -- a little difficult because we can only
+ * write across the PCIe, so any r/w pointer we need to read is
+ * local. We only need to read the read pointer on the inbound_q
+ * and read the write pointer in the outbound_q
+ *
+ * @magic: Magic value to ensure the peer sees the QP correctly
+ * @outbound_q: The outbound ring buffer for sending messages
+ * @inbound_q: The inbound ring buffer for receiving messages
+ * @local_write: Local write index
+ * @local_read: Local read index
+ * @remote_qp: The remote queue pair
+ * @local_buf: DMA address of local ring buffer
+ * @local_qp: DMA address of the local queue pair data structure
+ * @remote_buf: DMA address of remote ring buffer
+ * @qp_state: QP state i.e. online or offline used for P2P
+ * @send_lock: synchronize access to outbound queue
+ * @recv_lock: Synchronize access to inbound queue
+ */
+struct scif_qp {
+	uint64_t magic;
+#define SCIFEP_MAGIC 0x5c1f000000005c1fULL
+	struct scif_rb outbound_q;
+	struct scif_rb inbound_q;
+
+	uint32_t local_write __aligned(64);
+	uint32_t local_read __aligned(64);
+	struct scif_qp *remote_qp;
+	dma_addr_t local_buf;
+	dma_addr_t local_qp;
+	dma_addr_t remote_buf;
+	uint32_t qp_state;
+#define SCIF_QP_OFFLINE 0xdead
+#define SCIF_QP_ONLINE 0xc0de
+	spinlock_t send_lock;
+	spinlock_t recv_lock;
+};
+
+/*
+ * struct scif_loopb_msg - An element in the loopback Node QP message list.
+ *
+ * @msg: The SCIF node QP message
+ * @list: link in the list of messages
+ */
+struct scif_loopb_msg {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_setup_qp(struct scif_dev *scifdev);
+int scif_qp_response(phys_addr_t phys, struct scif_dev *dev);
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev);
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev);
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload);
+int scif_setup_loopback_qp(struct scif_dev *scifdev);
+int scif_destroy_loopback_qp(struct scif_dev *scifdev);
+void scif_poll_qp_state(struct work_struct *work);
+void scif_destroy_p2p(struct scif_dev *scifdev);
+void scif_send_exit(struct scif_dev *scifdev);
+static inline struct device *scif_get_peer_dev(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+	struct device *spdev_ret;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		spdev_ret = get_device(&spdev->dev);
+	else
+		spdev_ret = ERR_PTR(-ENODEV);
+	rcu_read_unlock();
+	return spdev_ret;
+}
+
+static inline void scif_put_peer_dev(struct device *dev)
+{
+	put_device(dev);
+}
+#endif  /* SCIF_NODEQP */
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
new file mode 100644
index 0000000..5fefacc
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -0,0 +1,184 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_peer_bus.h"
+
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_free_qp(struct scif_dev *scifdev)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+
+	if (!qp)
+		return;
+	scif_free_coherent((void *)qp->inbound_q.rb_base,
+			   qp->local_buf, scifdev, qp->inbound_q.size);
+	scif_unmap_single(qp->local_qp, scifdev, sizeof(struct scif_qp));
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+}
+
+static void scif_cleanup_qp(struct scif_dev *dev)
+{
+	struct scif_qp *qp = &dev->qpairs[0];
+
+	if (!qp)
+		return;
+	scif_iounmap((void *)qp->remote_qp, sizeof(struct scif_qp), dev);
+	scif_iounmap((void *)qp->outbound_q.rb_base,
+		     sizeof(struct scif_qp), dev);
+	qp->remote_qp = NULL;
+	qp->local_write = 0;
+	qp->inbound_q.current_write_offset = 0;
+	qp->inbound_q.current_read_offset = 0;
+	if (scifdev_is_p2p(dev))
+		scif_free_qp(dev);
+}
+
+void scif_send_acks(struct scif_dev *dev)
+{
+	struct scifmsg msg;
+
+	if (dev->node_remove_ack_pending) {
+		msg.uop = SCIF_NODE_REMOVE_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = SCIF_MGMT_NODE;
+		msg.payload[0] = dev->node;
+		scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg);
+		dev->node_remove_ack_pending = false;
+	}
+	if (dev->exit_ack_pending) {
+		msg.uop = SCIF_EXIT_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = dev->node;
+		scif_nodeqp_send(dev, &msg);
+		dev->exit_ack_pending = false;
+	}
+}
+
+/*
+ * scif_cleanup_scifdev
+ *
+ * @dev: Remote SCIF device.
+ * Uninitialize SCIF data structures for remote SCIF device.
+ */
+void scif_cleanup_scifdev(struct scif_dev *dev)
+{
+	struct scif_hw_dev *sdev = dev->sdev;
+
+	if (!dev->sdev)
+		return;
+	if (scifdev_is_p2p(dev)) {
+		if (dev->cookie) {
+			sdev->hw_ops->free_irq(sdev, dev->cookie, dev);
+			dev->cookie = NULL;
+		}
+		scif_destroy_intr_wq(dev);
+	}
+	scif_destroy_p2p(dev);
+	scif_send_acks(dev);
+	if (!dev->node && scif_info.card_initiated_exit) {
+		/*
+		 * Send an SCIF_EXIT message which is the last message from MIC
+		 * to the Host and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(dev);
+		scif_info.card_initiated_exit = false;
+	}
+	scif_cleanup_qp(dev);
+}
+
+/*
+ * scif_handle_remove_node:
+ *
+ * @node: Node to remove
+ */
+void scif_handle_remove_node(int node)
+{
+	struct scif_dev *scifdev = &scif_dev[node];
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	else
+		scif_send_acks(scifdev);
+}
+
+static int scif_send_rmnode_msg(int node, int remove_node)
+{
+	struct scifmsg notif_msg;
+	struct scif_dev *dev = &scif_dev[node];
+
+	notif_msg.uop = SCIF_NODE_REMOVE;
+	notif_msg.src.node = scif_info.nodeid;
+	notif_msg.dst.node = node;
+	notif_msg.payload[0] = remove_node;
+	return scif_nodeqp_send(dev, &notif_msg);
+}
+
+/**
+ * scif_disconnect_node:
+ *
+ * @node_id: source node id.
+ * @mgmt_initiated: Disconnection initiated from the mgmt node
+ *
+ * Disconnect a node from the scif network.
+ */
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
+{
+	int ret;
+	int msg_cnt = 0;
+	u32 i = 0;
+	struct scif_dev *scifdev = &scif_dev[node_id];
+
+	if (!node_id)
+		return;
+
+	atomic_set(&scifdev->disconn_rescnt, 0);
+
+	/* Destroy p2p network */
+	for (i = 1; i <= scif_info.maxid; i++) {
+		if (i == node_id)
+			continue;
+		ret = scif_send_rmnode_msg(i, node_id);
+		if (!ret)
+			msg_cnt++;
+	}
+	/* Wait for the remote nodes to respond with SCIF_NODE_REMOVE_ACK */
+	ret = wait_event_timeout(scifdev->disconn_wq,
+				 (atomic_read(&scifdev->disconn_rescnt)
+				 == msg_cnt), SCIF_NODE_ALIVE_TIMEOUT);
+	/* Tell the card to clean up */
+	if (mgmt_initiated && _scifdev_alive(scifdev))
+		/*
+		 * Send an SCIF_EXIT message which is the last message from Host
+		 * to the MIC and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(scifdev);
+	atomic_set(&scifdev->disconn_rescnt, 0);
+	/* Tell the mgmt node to clean up */
+	ret = scif_send_rmnode_msg(SCIF_MGMT_NODE, node_id);
+	if (!ret)
+		/* Wait for mgmt node to respond with SCIF_NODE_REMOVE_ACK */
+		wait_event_timeout(scifdev->disconn_wq,
+				   (atomic_read(&scifdev->disconn_rescnt) == 1),
+				   SCIF_NODE_ALIVE_TIMEOUT);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
new file mode 100644
index 0000000..8978af9
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -0,0 +1,1259 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_nodeqp.h"
+#include "scif_map.h"
+
+/*
+ ************************************************************************
+ * SCIF node Queue Pair (QP) setup flow:
+ *
+ * 1) SCIF driver gets probed with a scif_hw_dev via the scif_hw_bus
+ * 2) scif_setup_qp(..) allocates the local qp and calls
+ *	scif_setup_qp_connect(..) which allocates and maps the local
+ *	buffer for the inbound QP
+ * 3) The local node updates the device page with the DMA address of the QP
+ * 4) A delayed work is scheduled (qp_dwork) which periodically reads if
+ *	the peer node has updated its QP DMA address
+ * 5) Once a valid non-zero address is found in the QP DMA address field
+ *	in the device page, the local node maps the remote node's QP,
+ *	updates its outbound QP and sends a SCIF_INIT message to the peer
+ * 6) The SCIF_INIT message is received by the peer node QP interrupt bottom
+ *	half handler by calling scif_init(..)
+ * 7) scif_init(..) registers a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node
+ * 8) On the mgmt node, P2P network setup/teardown is initiated via
+ *	scif_p2p_setup(..) if all the remote nodes are online
+ * 9) For P2P setup, the host maps the remote nodes' aperture and memory
+ *	bars and sends a SCIF_NODE_ADD message to both nodes
+ * 10) As part of scif_node_add(..), both nodes set up their local inbound
+ *	QPs and send a SCIF_NODE_ADD_ACK to the mgmt node
+ * 11) As part of scif_node_add_ack(..) the mgmt node forwards the
+ *	SCIF_NODE_ADD_ACK to the remote nodes
+ * 12) As part of scif_node_add_ack(..) the remote nodes update their
+ *	outbound QPs, make sure they can access memory on the remote node
+ *	and then add a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node.
+ * 13) The SCIF network is now established across all nodes.
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by non mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) A non mgmt node now cleans up all local data structures and sends a
+ *	SCIF_EXIT message to the peer and waits for a SCIF_EXIT_ACK
+ * 4) As part of scif_exit(..) handling scif_disconnect_node(..) is called
+ * 5) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the
+ *	peers and waits for a SCIF_NODE_REMOVE_ACK
+ * 6) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 7) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it sends itself a node remove message whose handling cleans up local
+ *	data structures and unregisters the peer node from the SCIF network
+ * 8) The mgmt node sends a SCIF_EXIT_ACK
+ * 9) Upon receipt of the SCIF_EXIT_ACK the node initiating the teardown
+ *	completes the SCIF remove routine
+ * 10) The SCIF network is now torn down for the node initiating the
+ *	teardown sequence
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) The mgmt node calls scif_disconnect_node(..)
+ * 4) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the peers
+ *	and waits for a SCIF_NODE_REMOVE_ACK
+ * 5) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 6) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it unregisters the peer node from the SCIF network
+ * 7) The mgmt node sends a SCIF_EXIT message and waits for a SCIF_EXIT_ACK.
+ * 8) A non mgmt node upon receipt of a SCIF_EXIT message calls scif_stop(..)
+ *	which would clean up local data structures for all SCIF nodes and
+ *	then send a SCIF_EXIT_ACK back to the mgmt node
+ * 9) Upon receipt of the SCIF_EXIT_ACK the mgmt node sends itself a node
+ *	remove message whose handling cleans up local data structures and
+ *	destroys any P2P mappings.
+ * 10) The SCIF hardware device for which a remove callback was received is now
+ *	disconnected from the SCIF network.
+ */
+/*
+ * Initializes "local" data structures for the QP. Allocates the QP
+ * ring buffer (rb) and initializes the "in bound" queue.
+ */
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev)
+{
+	void *local_q = NULL;
+	int err = 0;
+	u32 tmp_rd = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		return err;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	/*
+	 * To set up the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &tmp_rd,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	/*
+	 * The read pointer is NULL initially and it is unsafe to use the ring
+	 * buffer until this changes!
+	 */
+	qp->inbound_q.read_ptr = NULL;
+	err = scif_map_single(qp_offset, qp,
+			      scifdev, sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+	return err;
+}
+
+/* When the other side has already done its allocation, this is called */
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev)
+{
+	void *local_q;
+	void *remote_q;
+	struct scif_qp *remote_qp;
+	int remote_size;
+	int err = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+	/* Start by figuring out where we need to point */
+	remote_qp = scif_ioremap(phys, sizeof(struct scif_qp), scifdev);
+	if (!remote_qp)
+		return -EIO;
+	qp->remote_qp = remote_qp;
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_buf = remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	remote_q = scif_ioremap(qp->remote_buf, remote_size, scifdev);
+	if (!remote_q) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_qp->local_write = 0;
+	/*
+	 * To set up the outbound_q, the buffer lives in remote memory,
+	 * the read pointer is local, the write pointer is remote
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     remote_q,
+		     get_count_order(remote_size));
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto iounmap_1;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	qp->remote_qp->local_read = 0;
+	/*
+	 * To set up the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	err = scif_map_single(qp_offset, qp, scifdev,
+			      sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+iounmap_1:
+	scif_iounmap(remote_q, remote_size, scifdev);
+	qp->outbound_q.rb_base = NULL;
+iounmap:
+	scif_iounmap(qp->remote_qp, sizeof(struct scif_qp), scifdev);
+	qp->remote_qp = NULL;
+	return err;
+}
+
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload)
+{
+	int err = 0;
+	void *r_buf;
+	int remote_size;
+	phys_addr_t tmp_phys;
+
+	qp->remote_qp = scif_ioremap(payload, sizeof(struct scif_qp), scifdev);
+
+	if (!qp->remote_qp) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		dev_err(&scifdev->sdev->dev,
+			"SCIFEP_MAGIC mismatch between self %d remote %d\n",
+			scif_dev[scif_info.nodeid].node, scifdev->node);
+		err = -ENODEV;
+		goto error;
+	}
+
+	tmp_phys = qp->remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	r_buf = scif_ioremap(tmp_phys, remote_size, scifdev);
+
+	if (!r_buf)
+		return -EIO;
+
+	qp->local_read = 0;
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     r_buf,
+		     get_count_order(remote_size));
+	/*
+	 * Set up the inbound_q again now that we know where the
+	 * inbound_read really is.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     qp->inbound_q.rb_base,
+		     get_count_order(qp->inbound_q.size));
+error:
+	return err;
+}
+
+static __always_inline void
+scif_send_msg_intr(struct scif_dev *scifdev)
+{
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_is_p2p(scifdev))
+		sdev->hw_ops->send_p2p_intr(sdev, scifdev->rdb, &scifdev->mmio);
+	else
+		sdev->hw_ops->send_intr(sdev, scifdev->rdb);
+}
+
+int scif_qp_response(phys_addr_t phys, struct scif_dev *scifdev)
+{
+	int err = 0;
+	struct scifmsg msg;
+
+	err = scif_setup_qp_connect_response(scifdev, scifdev->qpairs, phys);
+	if (!err) {
+		/*
+		 * Now that everything is set up and mapped, we're ready
+		 * to tell the peer about our queue's location
+		 */
+		msg.uop = SCIF_INIT;
+		msg.dst.node = scifdev->node;
+		err = scif_nodeqp_send(scifdev, &msg);
+	}
+	return err;
+}
+
+void scif_send_exit(struct scif_dev *scifdev)
+{
+	struct scifmsg msg;
+	int ret;
+
+	scifdev->exit = OP_IN_PROGRESS;
+	msg.uop = SCIF_EXIT;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = scifdev->node;
+	ret = scif_nodeqp_send(scifdev, &msg);
+	if (ret)
+		goto done;
+	/* Wait for a SCIF_EXIT_ACK message */
+	wait_event_timeout(scif_info.exitwq, scifdev->exit == OP_COMPLETED,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+done:
+	scifdev->exit = OP_IDLE;
+}
+
+int scif_setup_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	int local_size;
+	struct scif_qp *qp;
+
+	local_size = SCIF_NODE_QP_SIZE;
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp) {
+		err = -ENOMEM;
+		return err;
+	}
+	qp->magic = SCIFEP_MAGIC;
+	scifdev->qpairs = qp;
+	err = scif_setup_qp_connect(qp, &scifdev->qp_dma_addr,
+				    local_size, scifdev);
+	if (err)
+		goto free_qp;
+	/*
+	 * We're as set up as we can be. The inbound_q is set up, without a
+	 * usable outbound_q. When we get a message, the read_ptr will be
+	 * updated, and we will pull the message.
+	 */
+	return err;
+free_qp:
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+	return err;
+}
+
+static void scif_p2p_freesg(struct scatterlist *sg)
+{
+	kfree(sg);
+}
+
+static struct scatterlist *
+scif_p2p_setsg(void __iomem *va, int page_size, int page_cnt)
+{
+	struct scatterlist *sg;
+	struct page *page;
+	int i;
+
+	sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
+	if (!sg)
+		return NULL;
+	sg_init_table(sg, page_cnt);
+	for (i = 0; i < page_cnt; i++) {
+		page = vmalloc_to_page((void __force *)va);
+		if (!page)
+			goto p2p_sg_err;
+		sg_set_page(&sg[i], page, page_size, 0);
+		va += page_size;
+	}
+	return sg;
+p2p_sg_err:
+	kfree(sg);
+	return NULL;
+}
+
+/* Init p2p mappings required to access peerdev from scifdev */
+static struct scif_p2p_info *
+scif_init_p2p_info(struct scif_dev *scifdev, struct scif_dev *peerdev)
+{
+	struct scif_p2p_info *p2p;
+	int num_mmio_pages, num_aper_pages, sg_page_shift, err, num_aper_chunks;
+	struct scif_hw_dev *psdev = peerdev->sdev;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	num_mmio_pages = psdev->mmio->len >> PAGE_SHIFT;
+	num_aper_pages = psdev->aper->len >> PAGE_SHIFT;
+
+	p2p = kzalloc(sizeof(*p2p), GFP_KERNEL);
+	if (!p2p)
+		return NULL;
+	p2p->ppi_sg[SCIF_PPI_MMIO] = scif_p2p_setsg(psdev->mmio->va,
+						    PAGE_SIZE, num_mmio_pages);
+	if (!p2p->ppi_sg[SCIF_PPI_MMIO])
+		goto free_p2p;
+	p2p->sg_nentries[SCIF_PPI_MMIO] = num_mmio_pages;
+	sg_page_shift = get_order(min(psdev->aper->len, (u64)(1 << 30)));
+	num_aper_chunks = num_aper_pages >> (sg_page_shift - PAGE_SHIFT);
+	p2p->ppi_sg[SCIF_PPI_APER] = scif_p2p_setsg(psdev->aper->va,
+						    1 << sg_page_shift,
+						    num_aper_chunks);
+	p2p->sg_nentries[SCIF_PPI_APER] = num_aper_chunks;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			 num_mmio_pages, DMA_BIDIRECTIONAL);
+	if (err != num_mmio_pages)
+		goto scif_p2p_free;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			 num_aper_chunks, DMA_BIDIRECTIONAL);
+	if (err != num_aper_chunks)
+		goto dma_unmap;
+	p2p->ppi_da[SCIF_PPI_MMIO] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	p2p->ppi_da[SCIF_PPI_APER] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_APER]);
+	p2p->ppi_len[SCIF_PPI_MMIO] = num_mmio_pages;
+	p2p->ppi_len[SCIF_PPI_APER] = num_aper_pages;
+	p2p->ppi_peer_id = peerdev->node;
+	return p2p;
+dma_unmap:
+	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
+scif_p2p_free:
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+free_p2p:
+	kfree(p2p);
+	return NULL;
+}
+
+/**
+ * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
+ * @dst: Destination node
+ *
+ * Connect the src and dst node by setting up the p2p connection
+ * between them. Management node here acts like a proxy.
+ */
+static void scif_node_connect(struct scif_dev *scifdev, int dst)
+{
+	struct scif_dev *dev_j = scifdev;
+	struct scif_dev *dev_i = NULL;
+	struct scif_p2p_info *p2p_ij = NULL;    /* bus addr for j from i */
+	struct scif_p2p_info *p2p_ji = NULL;    /* bus addr for i from j */
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	struct scifmsg msg;
+	int err;
+	u64 tmppayload;
+
+	if (dst < 1 || dst > scif_info.maxid)
+		return;
+
+	dev_i = &scif_dev[dst];
+
+	if (!_scifdev_alive(dev_i))
+		return;
+	/*
+	 * If the p2p connection is already set up or is being set up,
+	 * then just ignore this request. The requested node will get
+	 * informed by SCIF_NODE_ADD_ACK or SCIF_NODE_ADD_NACK
+	 */
+	if (!list_empty(&dev_i->p2p)) {
+		list_for_each_safe(pos, tmp, &dev_i->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == dev_j->node)
+				return;
+		}
+	}
+	p2p_ij = scif_init_p2p_info(dev_i, dev_j);
+	if (!p2p_ij)
+		return;
+	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
+	if (!p2p_ji)
+		return;
+	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
+	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
+
+	/*
+	 * Send a SCIF_NODE_ADD to dev_i, pass it its bus address
+	 * as seen from dev_j
+	 */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_j->node;
+	msg.dst.node = dev_i->node;
+
+	msg.payload[0] = p2p_ji->ppi_da[SCIF_PPI_APER];
+	msg.payload[1] = p2p_ij->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[2] = p2p_ij->ppi_da[SCIF_PPI_APER];
+	msg.payload[3] = p2p_ij->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	err = scif_nodeqp_send(dev_i,  &msg);
+	if (err) {
+		dev_err(&scifdev->sdev->dev,
+			"%s %d error %d\n", __func__, __LINE__, err);
+		return;
+	}
+
+	/* Same as above but to dev_j */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_i->node;
+	msg.dst.node = dev_j->node;
+
+	tmppayload = msg.payload[0];
+	msg.payload[0] = msg.payload[2];
+	msg.payload[2] = tmppayload;
+	msg.payload[1] = p2p_ji->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[3] = p2p_ji->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	scif_nodeqp_send(dev_j, &msg);
+}
+
+static void scif_p2p_setup(void)
+{
+	int i, j;
+
+	if (!scif_info.p2p_enable)
+		return;
+
+	for (i = 1; i <= scif_info.maxid; i++)
+		if (!_scifdev_alive(&scif_dev[i]))
+			return;
+
+	for (i = 1; i <= scif_info.maxid; i++) {
+		for (j = 1; j <= scif_info.maxid; j++) {
+			struct scif_dev *scifdev = &scif_dev[i];
+
+			if (i == j)
+				continue;
+			scif_node_connect(scifdev, j);
+		}
+	}
+}
+
+static int scif_qp_response_ack(struct scif_dev *scifdev, u64 payload)
+{
+	int err = 0;
+	struct scif_peer_dev *spdev;
+
+	/* Drop the INIT message if it has already been received */
+	if (_scifdev_alive(scifdev))
+		goto done;
+	/*
+	 * Allow the thread waiting for device page updates for the peer QP DMA
+	 * address to complete initializing the inbound_q.
+	 */
+	flush_delayed_work(&scifdev->qp_dwork);
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto done;
+	}
+	if (scif_is_mgmt_node()) {
+		mutex_lock(&scif_info.conflock);
+		scif_p2p_setup();
+		mutex_unlock(&scif_info.conflock);
+	}
+done:
+	return err;
+}
+
+static char *message_types[] = {"BAD",
+				"INIT",
+				"EXIT",
+				"SCIF_EXIT_ACK",
+				"SCIF_NODE_ADD",
+				"SCIF_NODE_ADD_ACK",
+				"SCIF_NODE_ADD_NACK",
+				"REMOVE_NODE",
+				"REMOVE_NODE_ACK"};
+
+static void
+scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
+		     const char *label)
+{
+	if (!scif_info.en_msg_log)
+		return;
+	if (msg->uop > SCIF_MAX_MSG) {
+		dev_err(&scifdev->sdev->dev,
+			"%s: unknown msg type %d\n", label, msg->uop);
+		return;
+	}
+	dev_info(&scifdev->sdev->dev,
+		 "%s: msg type %s, src %d:%d, dest %d:%d payload 0x%llx:0x%llx:0x%llx:0x%llx\n",
+		 label, message_types[msg->uop], msg->src.node, msg->src.port,
+		 msg->dst.node, msg->dst.port, msg->payload[0], msg->payload[1],
+		 msg->payload[2], msg->payload[3]);
+}
+
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+	int err = -ENOMEM, loop_cnt = 0;
+
+	scif_display_message(scifdev, msg, "Sent");
+	if (!qp) {
+		err = -EINVAL;
+		goto error;
+	}
+	spin_lock(&qp->send_lock);
+
+	while ((err = scif_rb_write(&qp->outbound_q,
+				    msg, sizeof(struct scifmsg)))) {
+		mdelay(1);
+#define SCIF_NODEQP_SEND_TO_MSEC (3 * 1000)
+		if (loop_cnt++ > (SCIF_NODEQP_SEND_TO_MSEC)) {
+			err = -ENODEV;
+			break;
+		}
+	}
+	if (!err)
+		scif_rb_commit(&qp->outbound_q);
+	spin_unlock(&qp->send_lock);
+	if (!err) {
+		if (scifdev_self(scifdev))
+			/*
+			 * For loopback we need to emulate an interrupt by
+			 * queuing work for the queue handling real node
+			 * Qp interrupts.
+			 */
+			queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+		else
+			scif_send_msg_intr(scifdev);
+	}
+error:
+	if (err)
+		dev_dbg(&scifdev->sdev->dev,
+			"%s %d error %d uop %d\n",
+			 __func__, __LINE__, err, msg->uop);
+	return err;
+}
+
+/**
+ * scif_nodeqp_send - Send a message on the node queue pair
+ * @scifdev: Scif Device.
+ * @msg: The message to be sent.
+ */
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int err;
+	struct device *spdev = NULL;
+
+	if (msg->uop > SCIF_EXIT_ACK) {
+		/* Don't send messages once the exit flow has begun */
+		if (OP_IDLE != scifdev->exit)
+			return -ENODEV;
+		spdev = scif_get_peer_dev(scifdev);
+		if (IS_ERR(spdev)) {
+			err = PTR_ERR(spdev);
+			return err;
+		}
+	}
+	err = _scif_nodeqp_send(scifdev, msg);
+	if (msg->uop > SCIF_EXIT_ACK)
+		scif_put_peer_dev(spdev);
+	return err;
+}
+
+/*
+ * scif_misc_handler:
+ *
+ * Work queue handler for servicing miscellaneous SCIF tasks.
+ * Examples include:
+ * 1) Cleanup of zombie endpoints.
+ */
+void scif_misc_handler(struct work_struct *work)
+{
+	scif_cleanup_zombie_epd();
+}
+
+/**
+ * scif_init() - Respond to SCIF_INIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ */
+static __always_inline void
+scif_init(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	scif_qp_response_ack(scifdev, msg->payload[0]);
+}
+
+/**
+ * scif_exit() - Respond to SCIF_EXIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * This function stops the SCIF interface for the node which sent
+ * the SCIF_EXIT message and starts waiting for that node to
+ * set up the queue pair again.
+ */
+static __always_inline void
+scif_exit(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit_ack_pending = true;
+	if (scif_is_mgmt_node())
+		scif_disconnect_node(scifdev->node, false);
+	else
+		scif_stop(scifdev);
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+}
+
+/**
+ * scif_exit_ack() - Respond to SCIF_EXIT_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ */
+static __always_inline void
+scif_exit_ack(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit = OP_COMPLETED;
+	wake_up(&scif_info.exitwq);
+}
+
+/**
+ * scif_node_add() - Respond to SCIF_NODE_ADD interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * When the mgmt node driver has finished initializing a MIC node queue pair it
+ * marks the node as online. It then looks for all currently online MIC cards
+ * and sends a SCIF_NODE_ADD message to identify the ID of the new card for
+ * peer to peer initialization.
+ *
+ * The local node allocates its incoming queue and sends its address in the
+ * SCIF_NODE_ADD_ACK message back to the mgmt node, the mgmt node "reflects"
+ * this message to the new node
+ */
+static __always_inline void
+scif_node_add(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *newdev;
+	dma_addr_t qp_offset;
+	int qp_connect;
+	struct scif_hw_dev *sdev;
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d:%d received NODE_ADD msg for node %d\n",
+		scifdev->node, msg->dst.node, msg->src.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"Remote address for this node's aperture %llx\n",
+		msg->payload[0]);
+	newdev = &scif_dev[msg->src.node];
+	newdev->node = msg->src.node;
+	newdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	sdev = newdev->sdev;
+
+	if (scif_setup_intr_wq(newdev)) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup interrupts for %d\n", msg->src.node);
+		goto interrupt_setup_error;
+	}
+	newdev->mmio.va = ioremap_nocache(msg->payload[1], sdev->mmio->len);
+	if (!newdev->mmio.va) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to map mmio for %d\n", msg->src.node);
+		goto mmio_map_error;
+	}
+	newdev->qpairs = kzalloc(sizeof(*newdev->qpairs), GFP_KERNEL);
+	if (!newdev->qpairs)
+		goto qp_alloc_error;
+	/*
+	 * Set the base address of the remote node's memory since it gets
+	 * added to qp_offset
+	 */
+	newdev->base_addr = msg->payload[0];
+
+	qp_connect = scif_setup_qp_connect(newdev->qpairs, &qp_offset,
+					   SCIF_NODE_QP_SIZE, newdev);
+	if (qp_connect) {
+		dev_err(&scifdev->sdev->dev,
+			"failed to setup qp_connect %d\n", qp_connect);
+		goto qp_connect_error;
+	}
+
+	newdev->db = sdev->hw_ops->next_db(sdev);
+	newdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						   "SCIF_INTR", newdev,
+						   newdev->db);
+	if (IS_ERR(newdev->cookie))
+		goto qp_connect_error;
+	newdev->qpairs->magic = SCIFEP_MAGIC;
+	newdev->qpairs->qp_state = SCIF_QP_OFFLINE;
+
+	msg->uop = SCIF_NODE_ADD_ACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	msg->payload[0] = qp_offset;
+	msg->payload[2] = newdev->db;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+	return;
+qp_connect_error:
+	kfree(newdev->qpairs);
+	newdev->qpairs = NULL;
+qp_alloc_error:
+	iounmap(newdev->mmio.va);
+	newdev->mmio.va = NULL;
+mmio_map_error:
+interrupt_setup_error:
+	dev_err(&scifdev->sdev->dev,
+		"node add failed for node %d\n", msg->src.node);
+	msg->uop = SCIF_NODE_ADD_NACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+}
+
+void scif_poll_qp_state(struct work_struct *work)
+{
+#define SCIF_NODE_QP_RETRY 100
+#define SCIF_NODE_QP_TIMEOUT 100
+	struct scif_dev *peerdev = container_of(work, struct scif_dev,
+							p2p_dwork.work);
+	struct scif_qp *qp = &peerdev->qpairs[0];
+
+	if (qp->qp_state != SCIF_QP_ONLINE ||
+	    qp->remote_qp->qp_state != SCIF_QP_ONLINE) {
+		if (peerdev->p2p_retry++ == SCIF_NODE_QP_RETRY) {
+			dev_err(&peerdev->sdev->dev,
+				"Warning: QP check timeout with state %d\n",
+				qp->qp_state);
+			goto timeout;
+		}
+		schedule_delayed_work(&peerdev->p2p_dwork,
+				      msecs_to_jiffies(SCIF_NODE_QP_TIMEOUT));
+		return;
+	}
+	scif_peer_register_device(peerdev);
+	return;
+timeout:
+	dev_err(&peerdev->sdev->dev,
+		"%s %d remote node %d offline,  state = 0x%x\n",
+		__func__, __LINE__, peerdev->node, qp->qp_state);
+	qp->remote_qp->qp_state = SCIF_QP_OFFLINE;
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_ack() - Respond to SCIF_NODE_ADD_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * After a MIC node receives the SCIF_NODE_ADD_ACK message it sends this
+ * message to the mgmt node to confirm the sequence is finished.
+ *
+ */
+static __always_inline void
+scif_node_add_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *peerdev;
+	struct scif_qp *qp;
+	struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+	dev_dbg(&scifdev->sdev->dev,
+		"Scifdev %d received SCIF_NODE_ADD_ACK msg src %d dst %d\n",
+		scifdev->node, msg->src.node, msg->dst.node);
+	dev_dbg(&scifdev->sdev->dev,
+		"payload %llx %llx %llx %llx\n", msg->payload[0],
+		msg->payload[1], msg->payload[2], msg->payload[3]);
+	if (scif_is_mgmt_node()) {
+		/*
+		 * the lock serializes with scif_qp_response_ack. The mgmt node
+		 * is forwarding the NODE_ADD_ACK message from src to dst we
+		 * need to make sure that the dst has already received a
+		 * NODE_ADD for src and setup its end of the qp to dst
+		 */
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		scif_nodeqp_send(dst_dev, msg);
+		mutex_unlock(&scif_info.conflock);
+		return;
+	}
+	peerdev = &scif_dev[msg->src.node];
+	peerdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	peerdev->node = msg->src.node;
+
+	qp = &peerdev->qpairs[0];
+
+	if ((scif_setup_qp_connect_response(peerdev, &peerdev->qpairs[0],
+					    msg->payload[0])))
+		goto local_error;
+	peerdev->rdb = msg->payload[2];
+	qp->remote_qp->qp_state = SCIF_QP_ONLINE;
+	schedule_delayed_work(&peerdev->p2p_dwork, 0);
+	return;
+local_error:
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_nack: Respond to SCIF_NODE_ADD_NACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * SCIF_NODE_ADD failed, so inform the waiting wq.
+ */
+static __always_inline void
+scif_node_add_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+		dev_dbg(&scifdev->sdev->dev,
+			"SCIF_NODE_ADD_NACK received from %d\n", scifdev->node);
+		scif_nodeqp_send(dst_dev, msg);
+	}
+}
+
+/*
+ * scif_node_remove: Handle SCIF_NODE_REMOVE message
+ * @msg: Interrupt message
+ *
+ * Handle node removal.
+ */
+static __always_inline void
+scif_node_remove(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int node = msg->payload[0];
+	struct scif_dev *scdev = &scif_dev[node];
+
+	scdev->node_remove_ack_pending = true;
+	scif_handle_remove_node(node);
+}
+
+/*
+ * scif_node_remove_ack: Handle SCIF_NODE_REMOVE_ACK message
+ * @msg: Interrupt message
+ *
+ * The peer has acked a SCIF_NODE_REMOVE message.
+ */
+static __always_inline void
+scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *sdev = &scif_dev[msg->payload[0]];
+
+	atomic_inc(&sdev->disconn_rescnt);
+	wake_up(&sdev->disconn_wq);
+}
+
+static void
+scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	/* Bogus Node Qp Message? */
+	dev_err(&scifdev->sdev->dev,
+		"Unknown message 0x%x scifdev->node 0x%x\n",
+		msg->uop, scifdev->node);
+}
+
+static void (*scif_intr_func[SCIF_MAX_MSG + 1])
+	    (struct scif_dev *, struct scifmsg *msg) = {
+	scif_msg_unknown,	/* Error */
+	scif_init,		/* SCIF_INIT */
+	scif_exit,		/* SCIF_EXIT */
+	scif_exit_ack,		/* SCIF_EXIT_ACK */
+	scif_node_add,		/* SCIF_NODE_ADD */
+	scif_node_add_ack,	/* SCIF_NODE_ADD_ACK */
+	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
+	scif_node_remove,	/* SCIF_NODE_REMOVE */
+	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+};
+
+static int scif_max_msg_id = SCIF_MAX_MSG;
+
+/**
+ * scif_nodeqp_msg_handler() - Common handler for node messages
+ * @scifdev: Remote device to respond to
+ * @qp: Remote memory pointer
+ * @msg: The message to be handled.
+ *
+ * This routine calls the appropriate routine to handle a Node Qp
+ * message receipt
+ */
+static void
+scif_nodeqp_msg_handler(struct scif_dev *scifdev,
+			struct scif_qp *qp, struct scifmsg *msg)
+{
+	scif_display_message(scifdev, msg, "Rcvd");
+
+	if (msg->uop > (u32)scif_max_msg_id) {
+		/* Bogus Node Qp Message? */
+		dev_err(&scifdev->sdev->dev,
+			"Unknown message 0x%x scifdev->node 0x%x\n",
+			msg->uop, scifdev->node);
+		return;
+	}
+
+	scif_intr_func[msg->uop](scifdev, msg);
+}
+
+/**
+ * scif_nodeqp_intrhandler() - Interrupt handler for node messages
+ * @scifdev:    Remote device to respond to
+ * @qp:         Remote memory pointer
+ *
+ * This routine is triggered by the interrupt mechanism.  It reads
+ * messages from the node queue RB and calls the Node QP Message handling
+ * routine.
+ */
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	struct scifmsg msg;
+	int read_size;
+
+	do {
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg, sizeof(msg));
+		if (!read_size)
+			break;
+		scif_nodeqp_msg_handler(scifdev, qp, &msg);
+		/*
+		 * The node queue pair is unmapped so skip the read pointer
+		 * update after receipt of a SCIF_EXIT_ACK
+		 */
+		if (SCIF_EXIT_ACK == msg.uop)
+			break;
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (1);
+}
+
+/**
+ * scif_loopb_wq_handler - Loopback Workqueue Handler.
+ * @work: loop back work
+ *
+ * This work queue routine is invoked by the loopback work queue handler.
+ * It grabs the recv lock, dequeues any available messages from the head
+ * of the loopback message list, calls the node QP message handler,
+ * waits for it to return, then frees up this message and dequeues more
+ * elements of the list if available.
+ */
+static void scif_loopb_wq_handler(struct work_struct *unused)
+{
+	struct scif_dev *scifdev = scif_info.loopb_dev;
+	struct scif_qp *qp = scifdev->qpairs;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = NULL;
+		spin_lock(&qp->recv_lock);
+		if (!list_empty(&scif_info.loopb_recv_q)) {
+			msg = list_first_entry(&scif_info.loopb_recv_q,
+					       struct scif_loopb_msg,
+					       list);
+			list_del(&msg->list);
+		}
+		spin_unlock(&qp->recv_lock);
+
+		if (msg) {
+			scif_nodeqp_msg_handler(scifdev, qp, &msg->msg);
+			kfree(msg);
+		}
+	} while (msg);
+}
+
+/**
+ * scif_loopb_msg_handler() - Workqueue handler for loopback messages.
+ * @scifdev: SCIF device
+ * @qp: Queue pair.
+ *
+ * This work queue routine is triggered when a loopback message is received.
+ *
+ * We need special handling for receiving Node Qp messages on a loopback SCIF
+ * device via two workqueues for receiving messages.
+ *
+ * The reason we need the extra workqueue which is not required with *normal*
+ * non-loopback SCIF devices is the potential classic deadlock described below:
+ *
+ * Thread A tries to send a message on a loopback SCIF device and blocks since
+ * there is no space in the RB while it has the send_lock held or another
+ * lock called lock X for example.
+ *
+ * Thread B: The Loopback Node QP message receive workqueue receives the message
+ * and tries to send a message (eg an ACK) to the loopback SCIF device. It tries
+ * to grab the send lock again or lock X and deadlocks with Thread A. The RB
+ * cannot be drained any further due to this classic deadlock.
+ *
+ * In order to avoid deadlocks as mentioned above we have an extra level of
+ * indirection achieved by having two workqueues.
+ * 1) The first workqueue whose handler is scif_loopb_msg_handler reads
+ * messages from the Node QP RB, adds them to a list and queues work for the
+ * second workqueue.
+ *
+ * 2) The second workqueue whose handler is scif_loopb_wq_handler dequeues
+ * messages from the list, handles them, frees up the memory and dequeues
+ * more elements from the list if possible.
+ */
+int
+scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	int read_size;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = kmalloc(sizeof(*msg), GFP_KERNEL);
+		if (!msg)
+			return -ENOMEM;
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg->msg,
+					     sizeof(struct scifmsg));
+		if (read_size != sizeof(struct scifmsg)) {
+			kfree(msg);
+			scif_rb_update_read_ptr(&qp->inbound_q);
+			break;
+		}
+		spin_lock(&qp->recv_lock);
+		list_add_tail(&msg->list, &scif_info.loopb_recv_q);
+		spin_unlock(&qp->recv_lock);
+		queue_work(scif_info.loopb_wq, &scif_info.loopb_work);
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (read_size == sizeof(struct scifmsg));
+	return read_size;
+}
+
+/**
+ * scif_setup_loopback_qp - One time setup work for Loopback Node Qp.
+ * @scifdev: SCIF device
+ *
+ * Sets up the required loopback workqueues, queue pairs and ring buffers
+ */
+int scif_setup_loopback_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	void *local_q;
+	struct scif_qp *qp;
+	struct scif_peer_dev *spdev;
+
+	err = scif_setup_intr_wq(scifdev);
+	if (err)
+		goto exit;
+	INIT_LIST_HEAD(&scif_info.loopb_recv_q);
+	snprintf(scif_info.loopb_wqname, sizeof(scif_info.loopb_wqname),
+		 "SCIF LOOPB %d", scifdev->node);
+	scif_info.loopb_wq =
+		alloc_ordered_workqueue(scif_info.loopb_wqname, 0);
+	if (!scif_info.loopb_wq) {
+		err = -ENOMEM;
+		goto destroy_intr;
+	}
+	INIT_WORK(&scif_info.loopb_work, scif_loopb_wq_handler);
+	/* Allocate Self Qpair */
+	scifdev->qpairs = kzalloc(sizeof(*scifdev->qpairs), GFP_KERNEL);
+	if (!scifdev->qpairs) {
+		err = -ENOMEM;
+		goto destroy_loopb_wq;
+	}
+
+	qp = scifdev->qpairs;
+	qp->magic = SCIFEP_MAGIC;
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(SCIF_NODE_QP_SIZE, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto free_qpairs;
+	}
+	/*
+	 * For loopback the inbound_q and outbound_q are essentially the same
+	 * since the Node sends a message on the loopback interface to the
+	 * outbound_q which is then received on the inbound_q.
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+
+	scif_rb_init(&qp->inbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+	scif_info.nodeid = scifdev->node;
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto free_local_q;
+	}
+	scif_info.loopb_dev = scifdev;
+	return err;
+free_local_q:
+	kfree(local_q);
+free_qpairs:
+	kfree(scifdev->qpairs);
+destroy_loopb_wq:
+	destroy_workqueue(scif_info.loopb_wq);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+exit:
+	return err;
+}
+
+/**
+ * scif_destroy_loopback_qp - One time uninit work for Loopback Node Qp
+ * @scifdev: SCIF device
+ *
+ * Destroys the workqueues and frees up the Ring Buffer and Queue Pair memory.
+ */
+int scif_destroy_loopback_qp(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	destroy_workqueue(scif_info.loopb_wq);
+	scif_destroy_intr_wq(scifdev);
+	kfree(scifdev->qpairs->outbound_q.rb_base);
+	kfree(scifdev->qpairs);
+	scifdev->sdev = NULL;
+	scif_info.loopb_dev = NULL;
+	return 0;
+}
+
+void scif_destroy_p2p(struct scif_dev *scifdev)
+{
+	struct scif_dev *peer_dev;
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	int bd;
+
+	mutex_lock(&scif_info.conflock);
+	/* Free P2P mappings in the given node for all its peer nodes */
+	list_for_each_safe(pos, tmp, &scifdev->p2p) {
+		p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			     p2p->sg_nentries[SCIF_PPI_MMIO],
+			     DMA_BIDIRECTIONAL);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			     p2p->sg_nentries[SCIF_PPI_APER],
+			     DMA_BIDIRECTIONAL);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+		list_del(pos);
+		kfree(p2p);
+	}
+
+	/* Free P2P mapping created in the peer nodes for the given node */
+	for (bd = SCIF_MGMT_NODE + 1; bd <= scif_info.maxid; bd++) {
+		peer_dev = &scif_dev[bd];
+		list_for_each_safe(pos, tmp, &peer_dev->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == scifdev->node) {
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_MMIO],
+					     p2p->sg_nentries[SCIF_PPI_MMIO],
+					     DMA_BIDIRECTIONAL);
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_APER],
+					     p2p->sg_nentries[SCIF_PPI_APER],
+					     DMA_BIDIRECTIONAL);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+				list_del(pos);
+				kfree(p2p);
+			}
+		}
+	}
+	mutex_unlock(&scif_info.conflock);
+}
-- 
1.8.2.1
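
For readers following the scif_loopb_msg_handler() comment in the patch
above, here is a minimal userspace sketch of the two-stage receive path it
describes. It is only a model under stated assumptions, not the driver code:
the demo_* names are hypothetical, a plain array stands in for the Node QP
ring buffer, and there is no locking or workqueue. Stage 1 only moves
messages from the ring to a list; stage 2 handles and frees them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct scifmsg and struct scif_loopb_msg. */
struct demo_msg {
	unsigned int uop;
};

struct demo_loopb_msg {
	struct demo_msg msg;
	struct demo_loopb_msg *next;
};

/* Singly linked FIFO standing in for scif_info.loopb_recv_q. */
struct demo_list {
	struct demo_loopb_msg *head;
	struct demo_loopb_msg *tail;
};

/*
 * Stage 1: copy every message out of the "ring buffer" (an array here)
 * onto the tail of the list. Nothing is handled at this stage, so it
 * never has to send and can always drain the ring.
 * Returns the number of messages moved, or -1 if an allocation fails
 * (messages moved before the failure stay on the list).
 */
int demo_stage1_drain(const struct demo_msg *rb, size_t count,
		      struct demo_list *list)
{
	size_t i;

	for (i = 0; i < count; i++) {
		struct demo_loopb_msg *m = malloc(sizeof(*m));

		if (!m)
			return -1;
		m->msg = rb[i];
		m->next = NULL;
		if (list->tail)
			list->tail->next = m;
		else
			list->head = m;
		list->tail = m;
	}
	return (int)count;
}

/*
 * Stage 2: pop messages off the list in FIFO order, pass each one to the
 * handler and free it. Returns the number of messages handled.
 */
int demo_stage2_handle(struct demo_list *list,
		       void (*handler)(const struct demo_msg *msg, void *ctx),
		       void *ctx)
{
	int handled = 0;

	while (list->head) {
		struct demo_loopb_msg *m = list->head;

		list->head = m->next;
		if (!list->head)
			list->tail = NULL;
		handler(&m->msg, ctx);
		free(m);
		handled++;
	}
	return handled;
}

/* Example handler: accumulates the uop values into an unsigned int. */
void demo_sum_handler(const struct demo_msg *msg, void *ctx)
{
	*(unsigned int *)ctx += msg->uop;
}
```

The point of the split is that stage 1 does nothing that could block on the
ring, so a handler running in stage 2 that wants to send a reply is not
stuck behind messages that are still waiting to be read.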



* [PATCH char-misc-next v2 08/13] misc: mic: SCIF open close bind and listen APIs
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (6 preceding siblings ...)
  2015-01-12 23:10 28% ` [PATCH char-misc-next v2 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
@ 2015-01-12 23:10 41% ` Sudeep Dutt
  2015-01-12 23:10 41% ` [PATCH char-misc-next v2 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF character device file operations and kernel APIs for opening and
closing a user and kernel mode SCIF endpoint. This patch also enables
binding to a SCIF port and listening for incoming SCIF connections.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |  19 ++
 drivers/misc/mic/Makefile           |   1 +
 drivers/misc/mic/scif/Makefile      |  15 ++
 drivers/misc/mic/scif/scif_epd.h    | 148 +++++++++++++
 drivers/misc/mic/scif/scif_main.h   |   2 +
 drivers/misc/mic/scif/scif_nodeqp.h |  12 +-
 drivers/misc/mic/scif/scif_api.c    | 417 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  92 ++++++++
 drivers/misc/mic/scif/scif_fd.c     | 104 +++++++++
 drivers/misc/mic/scif/scif_ports.c  | 124 +++++++++++
 10 files changed, 933 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
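
The state checks that scif_bind() and scif_listen() perform in this patch
can be summarised with a small userspace model. This is a hedged sketch
only: the demo_* names and DEMO_DYNAMIC_PORT are hypothetical, only four of
the endpoint states are modelled, and locking, port reservation and the
CAP_SYS_ADMIN check are left out.

```c
#include <errno.h>

/* Hypothetical port handed out when the caller passes pn == 0. */
#define DEMO_DYNAMIC_PORT 2000

/* Subset of enum scif_epd_state, renamed for this sketch. */
enum demo_ep_state {
	DEMO_UNBOUND,
	DEMO_BOUND,
	DEMO_LISTENING,
	DEMO_CONNECTED,
};

struct demo_ep {
	enum demo_ep_state state;
	int port;
	int backlog;
};

/*
 * Mirrors the state checks in scif_bind(): binding an already bound
 * endpoint is -EINVAL, any other state except unbound is -EISCONN.
 * Returns the bound port number on success.
 */
int demo_bind(struct demo_ep *ep, int pn)
{
	if (ep->state == DEMO_BOUND)
		return -EINVAL;
	if (ep->state != DEMO_UNBOUND)
		return -EISCONN;
	if (!pn)
		pn = DEMO_DYNAMIC_PORT;
	ep->state = DEMO_BOUND;
	ep->port = pn;
	return pn;
}

/*
 * Mirrors the state checks in scif_listen(): only a bound endpoint may
 * start listening. Returns 0 on success.
 */
int demo_listen(struct demo_ep *ep, int backlog)
{
	switch (ep->state) {
	case DEMO_UNBOUND:
		return -EINVAL;
	case DEMO_LISTENING:
	case DEMO_CONNECTED:
		return -EISCONN;
	case DEMO_BOUND:
		break;
	}
	ep->state = DEMO_LISTENING;
	ep->backlog = backlog;
	return 0;
}
```

A caller is therefore expected to go open, bind, listen in that order;
calling listen on an unbound endpoint fails with -EINVAL and calling bind
on a listening endpoint fails with -EISCONN.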

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index afc9f53..bcf6e36 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -69,3 +69,22 @@ config INTEL_MIC_CARD
 
 	  For more information see
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "SCIF Driver"
+
+config SCIF
+	tristate "SCIF Driver"
+	depends on 64BIT && PCI && X86 && SCIF_BUS
+	help
+	  This enables SCIF Driver support for the Intel Many Integrated
+	  Core (MIC) family of PCIe form factor coprocessor devices that
+	  run a 64 bit Linux OS. The Symmetric Communication Interface
+	  (SCIF (pronounced as skiff)) is a low level communications API
+	  across PCIe currently implemented for MIC.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 6bc6083..a74042c 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
 obj-y += bus/
+obj-$(CONFIG_SCIF) += scif/
diff --git a/drivers/misc/mic/scif/Makefile b/drivers/misc/mic/scif/Makefile
new file mode 100644
index 0000000..bf10bb7
--- /dev/null
+++ b/drivers/misc/mic/scif/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile - SCIF driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_SCIF) += scif.o
+scif-objs := scif_main.o
+scif-objs += scif_peer_bus.o
+scif-objs += scif_ports.o
+scif-objs += scif_debugfs.o
+scif-objs += scif_fd.o
+scif-objs += scif_api.o
+scif-objs += scif_epd.o
+scif-objs += scif_rb.o
+scif-objs += scif_nodeqp.o
+scif-objs += scif_nm.o
diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
new file mode 100644
index 0000000..b973b99
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -0,0 +1,148 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_EPD_H
+#define SCIF_EPD_H
+
+#include <linux/delay.h>
+#include <linux/scif.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_EPLOCK_HELD true
+
+enum scif_epd_state {
+	SCIFEP_UNBOUND,
+	SCIFEP_BOUND,
+	SCIFEP_LISTENING,
+	SCIFEP_CONNECTED,
+	SCIFEP_CONNECTING,
+	SCIFEP_MAPPING,
+	SCIFEP_CLOSING,
+	SCIFEP_CLLISTEN,
+	SCIFEP_DISCONNECTED,
+	SCIFEP_ZOMBIE
+};
+
+/*
+ * struct scif_conreq - Data structure added to the connection list.
+ *
+ * @msg: connection request message received
+ * @list: link to list of connection requests
+ */
+struct scif_conreq {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+/* Size of the RB for the Endpoint QP */
+#define SCIF_ENDPT_QP_SIZE 0x1000
+
+/*
+ * scif_endpt_qp_info - SCIF endpoint queue pair
+ *
+ * @qp - Qpair for this endpoint
+ * @qp_offset - DMA address of the QP
+ * @gnt_pld - Payload in a SCIF_CNCT_GNT message containing the
+ * physical address of the remote_qp.
+ */
+struct scif_endpt_qp_info {
+	struct scif_qp *qp;
+	dma_addr_t qp_offset;
+	dma_addr_t gnt_pld;
+};
+
+/*
+ * struct scif_endpt - The SCIF endpoint data structure
+ *
+ * @state: end point state
+ * @lock: lock synchronizing access to endpoint fields like state etc
+ * @port: self port information
+ * @peer: peer port information
+ * @backlog: maximum pending connection requests
+ * @qp_info: Endpoint QP information for SCIF messaging
+ * @remote_dev: scifdev used by this endpt to communicate with remote node.
+ * @remote_ep: remote endpoint
+ * @conreqcnt: Keep track of number of connection requests.
+ * @files: Open file information used to match the id passed in with
+ *         the flush routine.
+ * @conlist: list of connection requests
+ * @conwq: waitqueue for connection processing
+ * @discon: completion used during disconnection
+ * @sendwq: waitqueue used during sending messages
+ * @recvwq: waitqueue used during message receipt
+ * @sendlock: Synchronize ordering of messages sent
+ * @recvlock: Synchronize ordering of messages received
+ * @list: link to list of various endpoints like connected, listening etc
+ * @li_accept: pending ACCEPTREG
+ * @acceptcnt: pending ACCEPTREG cnt
+ * @liacceptlist: link to listen accept
+ * @miacceptlist: link to uaccept
+ * @listenep: associated listen ep
+ * @conn_work: Non blocking connect work
+ * @conn_port: Connection port
+ * @conn_err: Errors during connection
+ * @conn_async_state: Async connection
+ * @conn_list: List of async connection requests
+ */
+struct scif_endpt {
+	enum scif_epd_state state;
+	spinlock_t lock;
+	struct scif_port_id port;
+	struct scif_port_id peer;
+	int backlog;
+	struct scif_endpt_qp_info qp_info;
+	struct scif_dev *remote_dev;
+	u64 remote_ep;
+	int conreqcnt;
+	struct files_struct *files;
+	struct list_head conlist;
+	wait_queue_head_t conwq;
+	struct completion discon;
+	wait_queue_head_t sendwq;
+	wait_queue_head_t recvwq;
+	struct mutex sendlock;
+	struct mutex recvlock;
+	struct list_head list;
+	struct list_head li_accept;
+	int acceptcnt;
+	struct list_head liacceptlist;
+	struct list_head miacceptlist;
+	struct scif_endpt *listenep;
+	struct scif_port_id conn_port;
+	int conn_err;
+	int conn_async_state;
+	struct list_head conn_list;
+};
+
+static inline int scifdev_alive(struct scif_endpt *ep)
+{
+	return _scifdev_alive(ep->remote_dev);
+}
+
+void scif_cleanup_zombie_epd(void);
+void scif_teardown_ep(void *endpt);
+void scif_cleanup_ep_qp(struct scif_endpt *ep);
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held);
+void scif_get_node_info(void);
+void scif_send_acks(struct scif_dev *dev);
+void scif_conn_handler(struct work_struct *work);
+int scif_rsrv_port(u16 port);
+void scif_get_port(u16 port);
+int scif_get_new_port(void);
+void scif_put_port(u16 port);
+int __scif_flush(scif_epd_t epd);
+#endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
index 0bc091b..ce8d0c9 100644
--- a/drivers/misc/mic/scif/scif_main.h
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -236,6 +236,8 @@ static inline int _scifdev_alive(struct scif_dev *scifdev)
 	return !!spdev;
 }
 
+#include "scif_epd.h"
+
 void __init scif_init_debugfs(void);
 void scif_exit_debugfs(void);
 int scif_setup_intr_wq(struct scif_dev *scifdev);
diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
index 4d8f53c..9540233 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.h
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -64,7 +64,17 @@
 #define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
 #define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
 #define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
-#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+#define SCIF_CNCT_REQ 9  /* Phys addr of Request connection to a port */
+#define SCIF_CNCT_GNT 10  /* Phys addr of new Grant connection request */
+#define SCIF_CNCT_GNTACK 11  /* ACK of a connection grant */
+#define SCIF_CNCT_GNTNACK 12  /* NACK of a connection grant */
+#define SCIF_CNCT_REJ 13  /* Error type Reject a connection request */
+#define SCIF_DISCNCT 14 /* Notify peer that connection is being terminated */
+#define SCIF_DISCNT_ACK 15 /* Response to a SCIF_DISCNCT message */
+#define SCIF_CLIENT_SENT 16 /* Notify the peer that data has been written */
+#define SCIF_CLIENT_RCVD 17 /* Notify the peer that data has been read */
+#define SCIF_GET_NODE_INFO 18 /* Get current node mask from the mgmt node */
+#define SCIF_MAX_MSG SCIF_GET_NODE_INFO
 
 /*
  * struct scifmsg - Node QP message format
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
new file mode 100644
index 0000000..cbdc27b
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -0,0 +1,417 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/scif.h>
+#include "scif_main.h"
+#include "scif_map.h"
+
+static const char * const scif_ep_states[] = {
+	"Unbound",
+	"Bound",
+	"Listening",
+	"Connected",
+	"Connecting",
+	"Mapping",
+	"Closing",
+	"Close Listening",
+	"Disconnected",
+	"Zombie"};
+
+enum conn_async_state {
+	ASYNC_CONN_IDLE = 1,	/* ep setup for async connect */
+	ASYNC_CONN_INPROGRESS,	/* async connect in progress */
+	ASYNC_CONN_FLUSH_WORK	/* async work flush in progress  */
+};
+
+scif_epd_t scif_open(void)
+{
+	struct scif_endpt *ep;
+
+	might_sleep();
+	ep = kzalloc(sizeof(*ep), GFP_KERNEL);
+	if (!ep)
+		goto err_ep_alloc;
+
+	ep->qp_info.qp = kzalloc(sizeof(*ep->qp_info.qp), GFP_KERNEL);
+	if (!ep->qp_info.qp)
+		goto err_qp_alloc;
+
+	spin_lock_init(&ep->lock);
+	mutex_init(&ep->sendlock);
+	mutex_init(&ep->recvlock);
+
+	ep->state = SCIFEP_UNBOUND;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI open: ep %p success\n", ep);
+	return ep;
+
+err_qp_alloc:
+	kfree(ep);
+err_ep_alloc:
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(scif_open);
+
+/*
+ * scif_disconnect_ep - Disconnects the endpoint if found
+ * @ep: The end point returned from scif_open()
+ */
+static struct scif_endpt *scif_disconnect_ep(struct scif_endpt *ep)
+{
+	struct scifmsg msg;
+	struct scif_endpt *fep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	int err;
+
+	/*
+	 * Wake up any threads blocked in send()/recv() before closing
+	 * out the connection. Grabbing and releasing the send/recv lock
+	 * will ensure that any blocked senders/receivers have exited for
+	 * Ring 0 endpoints. It is a Ring 0 bug to call send/recv after
+	 * close. Ring 3 endpoints are not affected since close will not
+	 * be called while there are IOCTLs executing.
+	 */
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	mutex_lock(&ep->sendlock);
+	mutex_unlock(&ep->sendlock);
+	mutex_lock(&ep->recvlock);
+	mutex_unlock(&ep->recvlock);
+
+	/* Remove from the connected list */
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		if (tmpep == ep) {
+			list_del(pos);
+			fep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	if (!fep) {
+		/*
+		 * The other side has completed the disconnect before
+		 * the end point could be removed from the list, so the
+		 * ep lock is not held. Traverse the disconnected list
+		 * to find the endpoint and release the conn lock.
+		 */
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		return NULL;
+	}
+
+	init_completion(&ep->discon);
+	msg.uop = SCIF_DISCNCT;
+	msg.src = ep->port;
+	msg.dst = ep->peer;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->remote_ep;
+
+	err = scif_nodeqp_send(ep->remote_dev, &msg);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+	if (!err)
+		/* Wait for the remote node to respond with SCIF_DISCNT_ACK */
+		wait_for_completion_timeout(&ep->discon,
+					    SCIF_NODE_ALIVE_TIMEOUT);
+	return ep;
+}
+
+int scif_close(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	enum scif_epd_state oldstate;
+	bool flush_conn;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI close: ep %p %s\n",
+		ep, scif_ep_states[ep->state]);
+	might_sleep();
+	spin_lock(&ep->lock);
+	flush_conn = (ep->conn_async_state == ASYNC_CONN_INPROGRESS);
+	spin_unlock(&ep->lock);
+
+	if (flush_conn)
+		flush_work(&scif_info.conn_work);
+
+	spin_lock(&ep->lock);
+	oldstate = ep->state;
+
+	ep->state = SCIFEP_CLOSING;
+
+	switch (oldstate) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		/* Remove from the disconnected list */
+		mutex_lock(&scif_info.connlock);
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		break;
+	case SCIFEP_UNBOUND:
+	case SCIFEP_BOUND:
+	case SCIFEP_CONNECTING:
+		spin_unlock(&ep->lock);
+		break;
+	case SCIFEP_MAPPING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CLOSING:
+	{
+		spin_unlock(&ep->lock);
+		scif_disconnect_ep(ep);
+		break;
+	}
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+	{
+		struct scif_conreq *conreq;
+		struct scifmsg msg;
+		struct scif_endpt *aep;
+
+		spin_unlock(&ep->lock);
+		spin_lock(&scif_info.eplock);
+
+		/* remove from listen list */
+		list_for_each_safe(pos, tmpq, &scif_info.listen) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep)
+				list_del(pos);
+		}
+		/* Remove any dangling accepts */
+		while (ep->acceptcnt) {
+			aep = list_first_entry(&ep->li_accept,
+					       struct scif_endpt, liacceptlist);
+			list_del(&aep->liacceptlist);
+			scif_put_port(aep->port.port);
+			list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+				tmpep = list_entry(pos, struct scif_endpt,
+						   miacceptlist);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			spin_unlock(&scif_info.eplock);
+			mutex_lock(&scif_info.connlock);
+			list_for_each_safe(pos, tmpq, &scif_info.connected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			mutex_unlock(&scif_info.connlock);
+			scif_teardown_ep(aep);
+			spin_lock(&scif_info.eplock);
+			scif_add_epd_to_zombie_list(aep, SCIF_EPLOCK_HELD);
+			ep->acceptcnt--;
+		}
+
+		spin_lock(&ep->lock);
+		spin_unlock(&scif_info.eplock);
+
+		/* Remove and reject any pending connection requests. */
+		while (ep->conreqcnt) {
+			conreq = list_first_entry(&ep->conlist,
+						  struct scif_conreq, list);
+			list_del(&conreq->list);
+
+			msg.uop = SCIF_CNCT_REJ;
+			msg.dst.node = conreq->msg.src.node;
+			msg.dst.port = conreq->msg.src.port;
+			msg.payload[0] = conreq->msg.payload[0];
+			msg.payload[1] = conreq->msg.payload[1];
+			/*
+			 * No Error Handling on purpose for scif_nodeqp_send().
+			 * If the remote node is lost we still want to free the
+			 * connection requests on the self node.
+			 */
+			scif_nodeqp_send(&scif_dev[conreq->msg.src.node],
+					 &msg);
+			ep->conreqcnt--;
+			kfree(conreq);
+		}
+
+		spin_unlock(&ep->lock);
+		/* If a kSCIF accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	}
+	scif_put_port(ep->port.port);
+	scif_teardown_ep(ep);
+	scif_add_epd_to_zombie_list(ep, !SCIF_EPLOCK_HELD);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_close);
+
+/**
+ * __scif_flush() - Wakes up any blocking accepts. The endpoint will no longer
+ *			accept new connections.
+ * @epd: The end point returned from scif_open()
+ */
+int __scif_flush(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	switch (ep->state) {
+	case SCIFEP_LISTENING:
+	{
+		ep->state = SCIFEP_CLLISTEN;
+
+		/* If an accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	default:
+		break;
+	}
+	return 0;
+}
+
+int scif_bind(scif_epd_t epd, u16 pn)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret = 0;
+	int tmp;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: ep %p %s requested port number %d\n",
+		ep, scif_ep_states[ep->state], pn);
+	if (pn) {
+		/*
+		 * Similar to IETF RFC 1700, SCIF ports below
+		 * SCIF_ADMIN_PORT_END can only be bound by system (or root)
+		 * processes or by processes executed by privileged users.
+		 */
+		if (pn < SCIF_ADMIN_PORT_END && !capable(CAP_SYS_ADMIN)) {
+			ret = -EACCES;
+			goto scif_bind_admin_exit;
+		}
+	}
+
+	spin_lock(&ep->lock);
+	if (ep->state == SCIFEP_BOUND) {
+		ret = -EINVAL;
+		goto scif_bind_exit;
+	} else if (ep->state != SCIFEP_UNBOUND) {
+		ret = -EISCONN;
+		goto scif_bind_exit;
+	}
+
+	if (pn) {
+		tmp = scif_rsrv_port(pn);
+		if (tmp != pn) {
+			ret = -EINVAL;
+			goto scif_bind_exit;
+		}
+	} else {
+		pn = scif_get_new_port();
+		if (!pn) {
+			ret = -ENOSPC;
+			goto scif_bind_exit;
+		}
+	}
+
+	ep->state = SCIFEP_BOUND;
+	ep->port.node = scif_info.nodeid;
+	ep->port.port = pn;
+	ep->conn_async_state = ASYNC_CONN_IDLE;
+	ret = pn;
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI bind: bound to port number %d\n", pn);
+scif_bind_exit:
+	spin_unlock(&ep->lock);
+scif_bind_admin_exit:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_bind);
+
+int scif_listen(scif_epd_t epd, int backlog)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI listen: ep %p %s\n", ep, scif_ep_states[ep->state]);
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+	case SCIFEP_CLLISTEN:
+	case SCIFEP_UNBOUND:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		return -EINVAL;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		spin_unlock(&ep->lock);
+		return -EISCONN;
+	case SCIFEP_BOUND:
+		break;
+	}
+
+	ep->state = SCIFEP_LISTENING;
+	ep->backlog = backlog;
+
+	ep->conreqcnt = 0;
+	ep->acceptcnt = 0;
+	INIT_LIST_HEAD(&ep->conlist);
+	init_waitqueue_head(&ep->conwq);
+	INIT_LIST_HEAD(&ep->li_accept);
+	spin_unlock(&ep->lock);
+
+	/*
+	 * Listen status is complete so delete the qp information not needed
+	 * on a listen before placing on the list of listening ep's
+	 */
+	scif_teardown_ep(ep);
+	ep->qp_info.qp = NULL;
+
+	spin_lock(&scif_info.eplock);
+	list_add_tail(&ep->list, &scif_info.listen);
+	spin_unlock(&scif_info.eplock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_listen);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
new file mode 100644
index 0000000..a2b728a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -0,0 +1,92 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_cleanup_ep_qp(struct scif_endpt *ep)
+{
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp->outbound_q.rb_base) {
+		scif_iounmap((void *)qp->outbound_q.rb_base,
+			     qp->outbound_q.size, ep->remote_dev);
+		qp->outbound_q.rb_base = NULL;
+	}
+	if (qp->remote_qp) {
+		scif_iounmap((void *)qp->remote_qp,
+			     sizeof(struct scif_qp), ep->remote_dev);
+		qp->remote_qp = NULL;
+	}
+	if (qp->local_qp) {
+		scif_unmap_single(qp->local_qp, ep->remote_dev,
+				  sizeof(struct scif_qp));
+		qp->local_qp = 0x0;
+	}
+	if (qp->local_buf) {
+		scif_unmap_single(qp->local_buf, ep->remote_dev,
+				  SCIF_ENDPT_QP_SIZE);
+		qp->local_buf = 0;
+	}
+}
+
+void scif_teardown_ep(void *endpt)
+{
+	struct scif_endpt *ep = endpt;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp) {
+		spin_lock(&ep->lock);
+		scif_cleanup_ep_qp(ep);
+		spin_unlock(&ep->lock);
+		kfree(qp->inbound_q.rb_base);
+		kfree(qp);
+	}
+}
+
+/*
+ * Enqueue the endpoint to the zombie list for cleanup.
+ * The endpoint should not be accessed once this API returns.
+ */
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
+{
+	if (!eplock_held)
+		spin_lock(&scif_info.eplock);
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_ZOMBIE;
+	spin_unlock(&ep->lock);
+	list_add_tail(&ep->list, &scif_info.zombie);
+	scif_info.nr_zombies++;
+	if (!eplock_held)
+		spin_unlock(&scif_info.eplock);
+	schedule_work(&scif_info.misc_work);
+}
+
+void scif_cleanup_zombie_epd(void)
+{
+	struct list_head *pos, *tmpq;
+	struct scif_endpt *ep;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.zombie) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		list_del(pos);
+		scif_info.nr_zombies--;
+		kfree(ep);
+	}
+	spin_unlock(&scif_info.eplock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
new file mode 100644
index 0000000..812c23f
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -0,0 +1,104 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+
+static int scif_fdopen(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = scif_open();
+
+	if (!priv)
+		return -ENOMEM;
+	f->private_data = priv;
+	return 0;
+}
+
+static int scif_fdclose(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = f->private_data;
+
+	return scif_close(priv);
+}
+
+static int scif_fdflush(struct file *f, fl_owner_t id)
+{
+	struct scif_endpt *ep = f->private_data;
+
+	spin_lock(&ep->lock);
+	/*
+	 * The listening endpoint stashes the open file information before
+	 * waiting for incoming connections. The release callback would never be
+	 * called if the application closed the endpoint, while waiting for
+	 * incoming connections from a separate thread since the file descriptor
+	 * reference count is bumped up in the accept IOCTL. Call the flush
+	 * routine if the id matches the endpoint open file information so that
+	 * the listening endpoint can be woken up and the fd released.
+	 */
+	if (ep->files == id)
+		__scif_flush(ep);
+	spin_unlock(&ep->lock);
+	return 0;
+}
+
+static __always_inline void scif_err_debug(int err, const char *str)
+{
+	/*
+	 * ENOTCONN is a common uninteresting error which is
+	 * flooding debug messages to the console unnecessarily.
+	 */
+	if (err < 0 && err != -ENOTCONN)
+		dev_dbg(scif_info.mdev.this_device, "%s err %d\n", str, err);
+}
+
+static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct scif_endpt *priv = f->private_data;
+	void __user *argp = (void __user *)arg;
+	bool non_block = false;
+
+	non_block = !!(f->f_flags & O_NONBLOCK);
+
+	switch (cmd) {
+	case SCIF_BIND:
+	{
+		int pn;
+
+		if (copy_from_user(&pn, argp, sizeof(pn)))
+			return -EFAULT;
+
+		pn = scif_bind(priv, pn);
+		if (pn < 0)
+			return pn;
+
+		if (copy_to_user(argp, &pn, sizeof(pn)))
+			return -EFAULT;
+
+		return 0;
+	}
+	case SCIF_LISTEN:
+		return scif_listen(priv, arg);
+	}
+	return -EINVAL;
+}
+
+const struct file_operations scif_fops = {
+	.open = scif_fdopen,
+	.release = scif_fdclose,
+	.unlocked_ioctl = scif_fdioctl,
+	.flush = scif_fdflush,
+	.owner = THIS_MODULE,
+};
diff --git a/drivers/misc/mic/scif/scif_ports.c b/drivers/misc/mic/scif/scif_ports.c
new file mode 100644
index 0000000..594e18d
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_ports.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/idr.h>
+
+#include "scif_main.h"
+
+#define SCIF_PORT_COUNT	0x10000	/* Ports available */
+
+struct idr scif_ports;
+
+/*
+ * struct scif_port - SCIF port information
+ *
+ * @ref_cnt - Reference count since there can be multiple endpoints
+ *		created via scif_accept(..) simultaneously using a port.
+ */
+struct scif_port {
+	int ref_cnt;
+};
+
+/**
+ * __scif_get_port - Reserve a SCIF port # in the range [start, end).
+ * @start : first port # that may be reserved.
+ * @end : one past the last port # that may be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+static int __scif_get_port(int start, int end)
+{
+	int id;
+	struct scif_port *port = kzalloc(sizeof(*port), GFP_ATOMIC);
+
+	if (!port)
+		return -ENOMEM;
+	spin_lock(&scif_info.port_lock);
+	id = idr_alloc(&scif_ports, port, start, end, GFP_ATOMIC);
+	if (id >= 0)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+	return id;
+}
+
+/**
+ * scif_rsrv_port - Reserve a specified port # for SCIF.
+ * @port : port # to be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_rsrv_port(u16 port)
+{
+	return __scif_get_port(port, port + 1);
+}
+
+/**
+ * scif_get_new_port - Get and reserve any port # for SCIF in the range
+ *			SCIF_PORT_RSVD + 1 to SCIF_PORT_COUNT - 1.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if no ports available.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_get_new_port(void)
+{
+	return __scif_get_port(SCIF_PORT_RSVD + 1, SCIF_PORT_COUNT);
+}
+
+/**
+ * scif_get_port - Increment the reference count for a SCIF port
+ * @id : SCIF port
+ *
+ * @return : None
+ */
+void scif_get_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+}
+
+/**
+ * scif_put_port - Release a reserved SCIF port
+ * @id : SCIF port to be released.
+ *
+ * @return : None
+ */
+void scif_put_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port) {
+		port->ref_cnt--;
+		if (!port->ref_cnt) {
+			idr_remove(&scif_ports, id);
+			kfree(port);
+		}
+	}
+	spin_unlock(&scif_info.port_lock);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 41%]

* [PATCH char-misc-next v2 10/13] misc: mic: SCIF messaging and node enumeration APIs
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (8 preceding siblings ...)
  2015-01-12 23:10 41% ` [PATCH char-misc-next v2 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
@ 2015-01-12 23:10 52% ` Sudeep Dutt
  2015-01-12 23:10 57% ` [PATCH char-misc-next v2 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Add the SCIF messaging APIs, which allow sending messages between
SCIF endpoints via a byte stream based ring buffer that has been
optimized to avoid reads across PCIe. The SCIF messaging APIs are
typically used for short (< 1024 byte) messages for best performance,
while the RDMA APIs, which will be submitted in a future patch
series, are recommended for larger transfers. The node enumeration
API enables a user to query the number of nodes online in the SCIF
network and their node ids.
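
As an illustration only, the best-effort behaviour of a non-blocking
send on a byte stream ring buffer can be modelled in user space as
below. This is a simplified sketch and not the driver code: struct
ring, RB_SIZE, rb_send() and rb_recv() are hypothetical names, the
buffer lives in ordinary memory, and no peer notification is sent.

```c
#include <assert.h>
#include <stddef.h>

#define RB_SIZE 16u	/* hypothetical size; must be a power of two */

struct ring {
	unsigned char buf[RB_SIZE];
	size_t rd;	/* index of the next byte to read */
	size_t wr;	/* index of the next byte to write */
};

/* Bytes queued and not yet consumed. */
static size_t rb_count(const struct ring *rb)
{
	return (rb->wr - rb->rd) & (RB_SIZE - 1);
}

/* Free bytes; one slot stays empty so that full and empty differ. */
static size_t rb_space(const struct ring *rb)
{
	return RB_SIZE - 1 - rb_count(rb);
}

/* Non-blocking send: copy as much as fits, return the byte count. */
static size_t rb_send(struct ring *rb, const void *msg, size_t len)
{
	const unsigned char *src = msg;
	size_t n, i;

	assert(rb && msg);
	n = len < rb_space(rb) ? len : rb_space(rb);
	for (i = 0; i < n; i++) {
		rb->buf[rb->wr] = src[i];
		rb->wr = (rb->wr + 1) & (RB_SIZE - 1);
	}
	return n;
}

/* Non-blocking receive: copy what is available, return the count. */
static size_t rb_recv(struct ring *rb, void *msg, size_t len)
{
	unsigned char *dst = msg;
	size_t n, i;

	assert(rb && msg);
	n = len < rb_count(rb) ? len : rb_count(rb);
	for (i = 0; i < n; i++) {
		dst[i] = rb->buf[rb->rd];
		rb->rd = (rb->rd + 1) & (RB_SIZE - 1);
	}
	return n;
}
```

A caller that gets back fewer bytes than requested retries with the
remainder, which is what the blocking path in the patch does after
waiting for the peer to consume data.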

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   4 +
 drivers/misc/mic/scif/scif_api.c    | 402 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  32 +++
 drivers/misc/mic/scif/scif_fd.c     |  93 +++++++++
 drivers/misc/mic/scif/scif_nm.c     |  17 ++
 drivers/misc/mic/scif/scif_nodeqp.c |  36 +++-
 6 files changed, 583 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index 65c628e..331322a 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,6 +144,8 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags);
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags);
 void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
@@ -151,6 +153,8 @@ void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg);
 int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index 8b55b59..41fba06 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -867,3 +867,405 @@ scif_accept_error_epalloc:
 	return err;
 }
 EXPORT_SYMBOL_GPL(scif_accept);
+
+/*
+ * scif_msg_param_check:
+ * @epd: The end point returned from scif_open()
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * Validate parameters for messaging APIs scif_send(..)/scif_recv(..).
+ */
+static inline int scif_msg_param_check(scif_epd_t epd, int len, int flags)
+{
+	int ret = -EINVAL;
+
+	if (len < 0)
+		goto err_ret;
+	if (flags && (!(flags & SCIF_RECV_BLOCK)))
+		goto err_ret;
+	ret = 0;
+err_ret:
+	return ret;
+}
+
+static int _scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_xfer_len = 0, sent_len = 0, write_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_SEND_BLOCK)
+		might_sleep();
+
+	spin_lock(&ep->lock);
+	while (sent_len != len && SCIFEP_CONNECTED == ep->state) {
+		write_count = scif_rb_space(&qp->outbound_q);
+		if (write_count) {
+			/* Best effort to send as much data as possible */
+			curr_xfer_len = min(len - sent_len, write_count);
+			ret = scif_rb_write(&qp->outbound_q, msg,
+					    curr_xfer_len);
+			if (ret < 0)
+				break;
+			/* Success. Update write pointer */
+			scif_rb_commit(&qp->outbound_q);
+			/*
+			 * Send a notification to the peer about the
+			 * produced data message.
+			 */
+			notif_msg.src = ep->port;
+			notif_msg.uop = SCIF_CLIENT_SENT;
+			notif_msg.payload[0] = ep->remote_ep;
+			ret = _scif_nodeqp_send(ep->remote_dev, &notif_msg);
+			if (ret)
+				break;
+			sent_len += curr_xfer_len;
+			msg = msg + curr_xfer_len;
+			continue;
+		}
+		curr_xfer_len = min(len - sent_len, SCIF_ENDPT_QP_SIZE - 1);
+		/* Not enough RB space. return for the Non Blocking case */
+		if (!(flags & SCIF_SEND_BLOCK))
+			break;
+
+		spin_unlock(&ep->lock);
+		/* Wait for a SCIF_CLIENT_RCVD message in the Blocking case */
+		ret =
+		wait_event_interruptible(ep->sendwq,
+					 (SCIFEP_CONNECTED != ep->state) ||
+					 (scif_rb_space(&qp->outbound_q) >=
+					 curr_xfer_len));
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (sent_len)
+		ret = sent_len;
+	else if (!ret && SCIFEP_CONNECTED != ep->state)
+		ret = SCIFEP_DISCONNECTED == ep->state ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+static int _scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	int read_size;
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_recv_len = 0, remaining_len = len, read_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_RECV_BLOCK)
+		might_sleep();
+	spin_lock(&ep->lock);
+	while (remaining_len && (SCIFEP_CONNECTED == ep->state ||
+				 SCIFEP_DISCONNECTED == ep->state)) {
+		read_count = scif_rb_count(&qp->inbound_q, remaining_len);
+		if (read_count) {
+			/*
+			 * Best effort to recv as much data as there
+			 * are bytes to read in the RB particularly
+			 * important for the Non Blocking case.
+			 */
+			curr_recv_len = min(remaining_len, read_count);
+			read_size = scif_rb_get_next(&qp->inbound_q,
+						     msg, curr_recv_len);
+			if (ep->state == SCIFEP_CONNECTED) {
+				/*
+				 * Update the read pointer only if the endpoint
+				 * is still connected else the read pointer
+				 * might no longer exist since the peer has
+				 * freed resources!
+				 */
+				scif_rb_update_read_ptr(&qp->inbound_q);
+				/*
+				 * Send a notification to the peer about the
+				 * consumed data message only if the EP is in
+				 * SCIFEP_CONNECTED state.
+				 */
+				notif_msg.src = ep->port;
+				notif_msg.uop = SCIF_CLIENT_RCVD;
+				notif_msg.payload[0] = ep->remote_ep;
+				ret = _scif_nodeqp_send(ep->remote_dev,
+							&notif_msg);
+				if (ret)
+					break;
+			}
+			remaining_len -= curr_recv_len;
+			msg = msg + curr_recv_len;
+			continue;
+		}
+		/*
+		 * Bail out now if the EP is in SCIFEP_DISCONNECTED state else
+		 * we will keep looping forever.
+		 */
+		if (ep->state == SCIFEP_DISCONNECTED)
+			break;
+		/*
+		 * Return in the Non Blocking case if there is no data
+		 * to read in this iteration.
+		 */
+		if (!(flags & SCIF_RECV_BLOCK))
+			break;
+		curr_recv_len = min(remaining_len, SCIF_ENDPT_QP_SIZE - 1);
+		spin_unlock(&ep->lock);
+		/*
+		 * Wait for a SCIF_CLIENT_SENT message in the blocking case
+		 * or until other side disconnects.
+		 */
+		ret =
+		wait_event_interruptible(ep->recvwq,
+					 SCIFEP_CONNECTED != ep->state ||
+					 scif_rb_count(&qp->inbound_q,
+						       curr_recv_len)
+					 >= curr_recv_len);
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (len - remaining_len)
+		ret = len - remaining_len;
+	else if (!ret && ep->state != SCIFEP_CONNECTED)
+		ret = ep->state == SCIFEP_DISCONNECTED ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+/**
+ * scif_user_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of the data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_send().
+ */
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int sent_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto send_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto send_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->sendlock);
+	while (sent_len != len) {
+		loop_len = len - sent_len;
+		loop_len = min(chunk_len, loop_len);
+		if (copy_from_user(tmp, msg, loop_len)) {
+			err = -EFAULT;
+			goto send_free_err;
+		}
+		err = _scif_send(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto send_free_err;
+		sent_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto send_free_err;
+	}
+send_free_err:
+	mutex_unlock(&ep->sendlock);
+	kfree(tmp);
+send_err:
+	return err < 0 ? err : sent_len;
+}
+
+/**
+ * scif_user_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_recv().
+ */
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int recv_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto recv_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto recv_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->recvlock);
+	while (recv_len != len) {
+		loop_len = len - recv_len;
+		loop_len = min(chunk_len, loop_len);
+		err = _scif_recv(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto recv_free_err;
+		if (copy_to_user(msg, tmp, err)) {
+			err = -EFAULT;
+			goto recv_free_err;
+		}
+		recv_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto recv_free_err;
+	}
+recv_free_err:
+	mutex_unlock(&ep->recvlock);
+	kfree(tmp);
+recv_err:
+	return err < 0 ? err : recv_len;
+}
+
+/**
+ * scif_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of the data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_send().
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI send (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	if (!ep->remote_dev)
+		return -ENOTCONN;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_send().
+	 */
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_lock(&ep->sendlock);
+
+	ret = _scif_send(epd, msg, len, flags);
+
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_unlock(&ep->sendlock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_send);
+
+/**
+ * scif_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_recv().
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI recv (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_recv().
+	 */
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_lock(&ep->recvlock);
+
+	ret = _scif_recv(epd, msg, len, flags);
+
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_unlock(&ep->recvlock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_recv);
+
+int scif_get_node_ids(u16 *nodes, int len, u16 *self)
+{
+	int online = 0;
+	int offset = 0;
+	int node;
+
+	if (!scif_is_mgmt_node())
+		scif_get_node_info();
+
+	*self = scif_info.nodeid;
+	mutex_lock(&scif_info.conflock);
+	len = min_t(int, len, scif_info.total);
+	for (node = 0; node <= scif_info.maxid; node++) {
+		if (_scifdev_alive(&scif_dev[node])) {
+			online++;
+			if (offset < len)
+				nodes[offset++] = node;
+		}
+	}
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI get_node_ids total %d online %d filled in %d nodes\n",
+		scif_info.total, online, offset);
+	mutex_unlock(&scif_info.conflock);
+
+	return online;
+}
+EXPORT_SYMBOL_GPL(scif_get_node_ids);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index 685ee5c..b4bfbb0 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -319,3 +319,35 @@ void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	spin_unlock(&ep->lock);
 	complete(&ep->discon);
 }
+
+/**
+ * scif_clientsend() - Respond to SCIF_CLIENT_SENT interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has sent data; wake up a receiver blocked on this endpoint.
+ */
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_clientrcvd() - Respond to SCIF_CLIENT_RCVD interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has consumed data; wake up a sender blocked on this endpoint.
+ */
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->sendwq);
+	spin_unlock(&ep->lock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index c1d6bc3..b1678fe 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -69,6 +69,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
 	int err = 0;
+	struct scifioctl_msg request;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -197,6 +198,98 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 		f->private_data = newep;
 		return 0;
 	}
+	case SCIF_SEND:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = scif_user_send(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto send_err;
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+				 &err, sizeof(err))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = 0;
+send_err:
+		scif_err_debug(err, "scif_send");
+		return err;
+	}
+	case SCIF_RECV:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+
+		err = scif_user_recv(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto recv_err;
+
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+			&err, sizeof(err))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+		err = 0;
+recv_err:
+		scif_err_debug(err, "scif_recv");
+		return err;
+	}
+	case SCIF_GET_NODEIDS:
+	{
+		struct scifioctl_node_ids node_ids;
+		int entries;
+		u16 *nodes;
+		void __user *unodes, *uself;
+		u16 self;
+
+		if (copy_from_user(&node_ids, argp, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err2;
+		}
+
+		entries = min_t(int, scif_info.maxid, node_ids.len);
+		nodes = kmalloc_array(entries, sizeof(u16), GFP_KERNEL);
+		if (entries && !nodes) {
+			err = -ENOMEM;
+			goto getnodes_err2;
+		}
+		node_ids.len = scif_get_node_ids(nodes, entries, &self);
+
+		unodes = node_ids.nodes;
+		if (copy_to_user(unodes, nodes, sizeof(u16) * entries)) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		uself = node_ids.self;
+		if (copy_to_user(uself, &self, sizeof(u16))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		if (copy_to_user(argp, &node_ids, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+getnodes_err1:
+		kfree(nodes);
+getnodes_err2:
+		return err;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 0469aad..9b4c538 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -218,3 +218,20 @@ void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
 				   (atomic_read(&scifdev->disconn_rescnt) == 1),
 				   SCIF_NODE_ALIVE_TIMEOUT);
 }
+
+void scif_get_node_info(void)
+{
+	struct scifmsg msg;
+	DECLARE_COMPLETION_ONSTACK(node_info);
+
+	msg.uop = SCIF_GET_NODE_INFO;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = SCIF_MGMT_NODE;
+	msg.payload[3] = (u64)&node_info;
+
+	if ((scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg)))
+		return;
+
+	/* Wait for a response with SCIF_GET_NODE_INFO */
+	wait_for_completion(&node_info);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index aee9025..141923c 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -576,7 +576,10 @@ static char *message_types[] = {"BAD",
 				"CNCT_GNTNACK",
 				"CNCT_REJ",
 				"DISCNCT",
-				"DISCNT_ACK"};
+				"DISCNT_ACK",
+				"CLIENT_SENT",
+				"CLIENT_RCVD",
+				"SCIF_GET_NODE_INFO"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -946,6 +949,34 @@ scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	wake_up(&sdev->disconn_wq);
 }
 
+/**
+ * scif_get_node_info_resp: Respond to SCIF_GET_NODE_INFO interrupt message
+ * @msg:        Interrupt message
+ *
+ * Retrieve node info i.e. maxid and total from the mgmt node.
+ */
+static __always_inline void
+scif_get_node_info_resp(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		swap(msg->dst.node, msg->src.node);
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		msg->payload[2] = scif_info.total;
+		mutex_unlock(&scif_info.conflock);
+		scif_nodeqp_send(scifdev, msg);
+	} else {
+		struct completion *node_info =
+			(struct completion *)msg->payload[3];
+
+		mutex_lock(&scif_info.conflock);
+		scif_info.maxid = msg->payload[1];
+		scif_info.total = msg->payload[2];
+		complete_all(node_info);
+		mutex_unlock(&scif_info.conflock);
+	}
+}
+
 static void
 scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
 {
@@ -973,6 +1004,9 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_cnctrej,		/* SCIF_CNCT_REJ */
 	scif_discnct,		/* SCIF_DISCNCT */
 	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
+	scif_clientsend,	/* SCIF_CLIENT_SENT */
+	scif_clientrcvd,	/* SCIF_CLIENT_RCVD */
+	scif_get_node_info_resp,/* SCIF_GET_NODE_INFO */
 };
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 52%]

* [PATCH char-misc-next v2 09/13] misc: mic: SCIF connections APIs i.e. accept and connect
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (7 preceding siblings ...)
  2015-01-12 23:10 41% ` [PATCH char-misc-next v2 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
@ 2015-01-12 23:10 41% ` Sudeep Dutt
  2015-01-12 23:10 52% ` [PATCH char-misc-next v2 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

From: Nikhil Rao <nikhil.rao@intel.com>

Add the SCIF connection APIs, which establish a SCIF connection
between a pair of SCIF endpoints. A SCIF connection consists of a
dedicated queue-pair between the endpoints. Client messages are
sent over the queue-pair, whereas the signaling associated with a
message is multiplexed over the node queue-pair. Other control
messages, such as those exposing registered memory, are also sent
over the node queue-pair. The SCIF endpoints must be in the
connected state to exchange messages, register memory, map remote
memory and trigger DMA transfers. SCIF connections can be set up
asynchronously or synchronously.
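
As an illustration only, the state changes seen by the connecting
endpoint can be sketched as a small state machine. This is a
simplified model and not the driver code: the EP_* and MSG_* names
and both helpers are hypothetical stand-ins for the SCIFEP_* states
and SCIF_CNCT_* messages, and the move back to the bound state on a
reject is assumed from the -ECONNREFUSED path in scif_conn_func().

```c
#include <assert.h>

enum ep_state { EP_BOUND, EP_CONNECTING, EP_MAPPING, EP_CONNECTED };
enum ep_msg { MSG_CNCT_GNT, MSG_CNCT_REJ, MSG_CNCT_GNTACK, MSG_CNCT_GNTNACK };

/* Peer answered the connect request: a grant starts QP mapping. */
static enum ep_state on_peer_reply(enum ep_state s, enum ep_msg m)
{
	if (s != EP_CONNECTING)
		return s;	/* unexpected reply: state is unchanged */
	if (m == MSG_CNCT_GNT)
		return EP_MAPPING;
	if (m == MSG_CNCT_REJ)
		return EP_BOUND;
	return s;
}

/*
 * After mapping the remote QP: acknowledge on success, otherwise
 * send a negative acknowledgement and fall back to the bound state.
 */
static enum ep_state finish_connect(enum ep_state s, int map_ok,
				    enum ep_msg *reply)
{
	assert(reply);
	if (s != EP_MAPPING)
		return s;
	*reply = map_ok ? MSG_CNCT_GNTACK : MSG_CNCT_GNTNACK;
	return map_ok ? EP_CONNECTED : EP_BOUND;
}
```

The accepting side mirrors this: it sends the grant and completes
scif_accept(..) once the (negative) acknowledgement arrives.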

Thanks to Johnnie S Peters for authoring parts of this patch during
early bring up of the SCIF driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   8 +
 drivers/misc/mic/scif/scif_api.c    | 452 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    | 229 ++++++++++++++++++
 drivers/misc/mic/scif/scif_fd.c     | 106 +++++++++
 drivers/misc/mic/scif/scif_main.c   |   1 +
 drivers/misc/mic/scif/scif_nm.c     |  36 +++
 drivers/misc/mic/scif/scif_nodeqp.c |  16 +-
 7 files changed, 847 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index b973b99..65c628e 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,5 +144,13 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index cbdc27b..8b55b59 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -415,3 +415,455 @@ int scif_listen(scif_epd_t epd, int backlog)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(scif_listen);
+
+/*
+ ************************************************************************
+ * SCIF connection flow:
+ *
+ * 1) A SCIF listening endpoint can call scif_accept(..) to wait for SCIF
+ *	connections via a SCIF_CNCT_REQ message
+ * 2) A SCIF endpoint can initiate a SCIF connection by calling
+ *	scif_connect(..) which calls scif_setup_qp_connect(..) which
+ *	allocates the local qp for the endpoint ring buffer and then sends
+ *	a SCIF_CNCT_REQ to the remote node and waits for a SCIF_CNCT_GNT or
+ *	a SCIF_CNCT_REJ message
+ * 3) The peer node handles a SCIF_CNCT_REQ via scif_cnctreq_resp(..) which
+ *	wakes up any threads blocked in step 1 or sends a SCIF_CNCT_REJ
+ *	message otherwise
+ * 4) A thread blocked waiting for incoming connections allocates its local
+ *	endpoint QP and ring buffer following which it sends a SCIF_CNCT_GNT
+ *	and waits for a SCIF_CNCT_GNT(N)ACK. If the allocation fails then
+ *	the node sends a SCIF_CNCT_REJ message
+ * 5) Upon receipt of a SCIF_CNCT_GNT or a SCIF_CNCT_REJ message the
+ *	connecting endpoint is woken up as part of handling
+ *	scif_cnctgnt_resp(..) following which it maps the remote endpoints'
+ *	QP, updates its outbound QP and sends a SCIF_CNCT_GNTACK message on
+ *	success or a SCIF_CNCT_GNTNACK message on failure and completes
+ *	the scif_connect(..) API
+ * 6) Upon receipt of a SCIF_CNCT_GNT(N)ACK the accepting endpoint blocked
+ *	in step 4 is woken up and completes the scif_accept(..) API
+ * 7) The SCIF connection is now established between the two SCIF endpoints.
+ */
+static int scif_conn_func(struct scif_endpt *ep)
+{
+	int err = 0;
+	struct scifmsg msg;
+	struct device *spdev;
+
+	/* Initiate the first part of the endpoint QP setup */
+	err = scif_setup_qp_connect(ep->qp_info.qp, &ep->qp_info.qp_offset,
+				    SCIF_ENDPT_QP_SIZE, ep->remote_dev);
+	if (err) {
+		dev_err(&ep->remote_dev->sdev->dev,
+			"%s err %d qp_offset 0x%llx\n",
+			__func__, err, ep->qp_info.qp_offset);
+		ep->state = SCIFEP_BOUND;
+		goto connect_error_simple;
+	}
+
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	/* Format connect message and send it */
+	msg.src = ep->port;
+	msg.dst = ep->conn_port;
+	msg.uop = SCIF_CNCT_REQ;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->qp_info.qp_offset;
+	err = _scif_nodeqp_send(ep->remote_dev, &msg);
+	if (err)
+		goto connect_error_dec;
+	scif_put_peer_dev(spdev);
+	/*
+	 * Wait for the remote node to respond with SCIF_CNCT_GNT or
+	 * SCIF_CNCT_REJ message.
+	 */
+	wait_event_timeout(ep->conwq, ep->state != SCIFEP_CONNECTING,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	if (ep->state == SCIFEP_MAPPING) {
+		err = scif_setup_qp_connect_response(ep->remote_dev,
+						     ep->qp_info.qp,
+						     ep->qp_info.gnt_pld);
+		/*
+		 * If the resources to map the queue are not available then
+		 * we need to tell the other side to terminate the accept.
+		 */
+		if (err) {
+			dev_err(&ep->remote_dev->sdev->dev,
+				"%s %d err %d\n", __func__, __LINE__, err);
+			msg.uop = SCIF_CNCT_GNTNACK;
+			msg.payload[0] = ep->remote_ep;
+			_scif_nodeqp_send(ep->remote_dev, &msg);
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+
+		msg.uop = SCIF_CNCT_GNTACK;
+		msg.payload[0] = ep->remote_ep;
+		err = _scif_nodeqp_send(ep->remote_dev, &msg);
+		if (err) {
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+		ep->state = SCIFEP_CONNECTED;
+		mutex_lock(&scif_info.connlock);
+		list_add_tail(&ep->list, &scif_info.connected);
+		mutex_unlock(&scif_info.connlock);
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connected\n", ep);
+	} else if (ep->state == SCIFEP_BOUND) {
+		dev_dbg(&ep->remote_dev->sdev->dev,
+			"SCIFAPI connect: ep %p connection refused\n", ep);
+		err = -ECONNREFUSED;
+		goto connect_error_dec;
+	}
+	scif_put_peer_dev(spdev);
+	return err;
+connect_error_dec:
+	scif_put_peer_dev(spdev);
+cleanup_qp:
+	scif_cleanup_ep_qp(ep);
+connect_error_simple:
+	return err;
+}
+
+/*
+ * scif_conn_handler:
+ *
+ * Workqueue handler for servicing non-blocking SCIF connect
+ *
+ */
+void scif_conn_handler(struct work_struct *work)
+{
+	struct scif_endpt *ep;
+
+	do {
+		ep = NULL;
+		spin_lock(&scif_info.nb_connect_lock);
+		if (!list_empty(&scif_info.nb_connect_list)) {
+			ep = list_first_entry(&scif_info.nb_connect_list,
+					      struct scif_endpt, conn_list);
+			list_del(&ep->conn_list);
+		}
+		spin_unlock(&scif_info.nb_connect_lock);
+		if (ep)
+			ep->conn_err = scif_conn_func(ep);
+	} while (ep);
+}
+
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	struct scif_dev *remote_dev;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device, "SCIFAPI connect: ep %p %s\n", ep,
+		scif_ep_states[ep->state]);
+
+	if (!scif_dev || dst->node > scif_info.maxid)
+		return -ENODEV;
+
+	might_sleep();
+
+	remote_dev = &scif_dev[dst->node];
+	spdev = scif_get_peer_dev(remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		return err;
+	}
+
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+		err = -EINVAL;
+		break;
+	case SCIFEP_DISCONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EINVAL;
+		break;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+		err = -EOPNOTSUPP;
+		break;
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			err = -EINPROGRESS;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_CONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_UNBOUND:
+		ep->port.port = scif_get_new_port();
+		if (!ep->port.port) {
+			err = -ENOSPC;
+		} else {
+			ep->port.node = scif_info.nodeid;
+			ep->conn_async_state = ASYNC_CONN_IDLE;
+		}
+		/* Fall through */
+	case SCIFEP_BOUND:
+		/*
+		 * If a non-blocking connect has already been initiated
+		 * (conn_async_state is either ASYNC_CONN_INPROGRESS or
+		 * ASYNC_CONN_FLUSH_WORK), the end point could end up in
+		 * SCIFEP_BOUND due to an error in the connection process
+		 * (e.g., connection refused). If conn_async_state is
+		 * ASYNC_CONN_INPROGRESS, transition to ASYNC_CONN_FLUSH_WORK
+		 * so that the error status can be collected. If the state is
+		 * already ASYNC_CONN_FLUSH_WORK, set the error to
+		 * -EINPROGRESS since some other thread is waiting to collect
+		 * the error status.
+		 */
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+			err = -EINPROGRESS;
+		} else {
+			ep->conn_port = *dst;
+			init_waitqueue_head(&ep->sendwq);
+			init_waitqueue_head(&ep->recvwq);
+			init_waitqueue_head(&ep->conwq);
+			ep->conn_async_state = 0;
+
+			if (unlikely(non_block))
+				ep->conn_async_state = ASYNC_CONN_INPROGRESS;
+		}
+		break;
+	}
+
+	if (err || ep->conn_async_state == ASYNC_CONN_FLUSH_WORK)
+		goto connect_simple_unlock1;
+
+	ep->state = SCIFEP_CONNECTING;
+	ep->remote_dev = &scif_dev[dst->node];
+	ep->qp_info.qp->magic = SCIFEP_MAGIC;
+	if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+		spin_lock(&scif_info.nb_connect_lock);
+		list_add_tail(&ep->conn_list, &scif_info.nb_connect_list);
+		spin_unlock(&scif_info.nb_connect_lock);
+		err = -EINPROGRESS;
+		schedule_work(&scif_info.conn_work);
+	}
+connect_simple_unlock1:
+	spin_unlock(&ep->lock);
+	scif_put_peer_dev(spdev);
+	if (err) {
+		return err;
+	} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+		flush_work(&scif_info.conn_work);
+		err = ep->conn_err;
+		spin_lock(&ep->lock);
+		ep->conn_async_state = ASYNC_CONN_IDLE;
+		spin_unlock(&ep->lock);
+	} else {
+		err = scif_conn_func(ep);
+	}
+	return err;
+}
+
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst)
+{
+	return __scif_connect(epd, dst, false);
+}
+EXPORT_SYMBOL_GPL(scif_connect);
+
+/**
+ * scif_accept() - Accept a connection request from the remote node
+ *
+ * The function accepts a connection request from the remote node.  Successful
+ * completion is indicated by a new end point being created and passed back
+ * to the caller for future reference.
+ *
+ * Upon successful completion, zero will be returned and the peer information
+ * will be filled in.
+ *
+ * If the end point is not in the listening state -EINVAL will be returned.
+ *
+ * If resource allocation fails during the connection sequence, -ENOMEM
+ * will be returned.
+ *
+ * If the function is called without the SCIF_ACCEPT_SYNC flag and no
+ * connection requests are pending, it will return -EAGAIN.
+ *
+ * If the remote side is not sending any connection requests the caller may
+ * terminate this function with a signal.  If so a -EINTR will be returned.
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
+		scif_epd_t *newepd, int flags)
+{
+	struct scif_endpt *lep = (struct scif_endpt *)epd;
+	struct scif_endpt *cep;
+	struct scif_conreq *conreq;
+	struct scifmsg msg;
+	int err;
+	struct device *spdev;
+
+	dev_dbg(scif_info.mdev.this_device,
+		"SCIFAPI accept: ep %p %s\n", lep, scif_ep_states[lep->state]);
+
+	if (flags & ~SCIF_ACCEPT_SYNC)
+		return -EINVAL;
+
+	if (!peer || !newepd)
+		return -EINVAL;
+
+	might_sleep();
+	spin_lock(&lep->lock);
+	if (lep->state != SCIFEP_LISTENING) {
+		spin_unlock(&lep->lock);
+		return -EINVAL;
+	}
+
+	if (!lep->conreqcnt && !(flags & SCIF_ACCEPT_SYNC)) {
+		/* No connection request present and we do not want to wait */
+		spin_unlock(&lep->lock);
+		return -EAGAIN;
+	}
+
+	lep->files = current->files;
+retry_connection:
+	spin_unlock(&lep->lock);
+	/* Wait for the remote node to send us a SCIF_CNCT_REQ */
+	err = wait_event_interruptible(lep->conwq,
+				       (lep->conreqcnt ||
+				       (lep->state != SCIFEP_LISTENING)));
+	if (err)
+		return err;
+
+	if (lep->state != SCIFEP_LISTENING)
+		return -EINTR;
+
+	spin_lock(&lep->lock);
+
+	if (!lep->conreqcnt)
+		goto retry_connection;
+
+	/* Get the first connect request off the list */
+	conreq = list_first_entry(&lep->conlist, struct scif_conreq, list);
+	list_del(&conreq->list);
+	lep->conreqcnt--;
+	spin_unlock(&lep->lock);
+
+	/* Fill in the peer information */
+	peer->node = conreq->msg.src.node;
+	peer->port = conreq->msg.src.port;
+
+	cep = kzalloc(sizeof(*cep), GFP_KERNEL);
+	if (!cep) {
+		err = -ENOMEM;
+		goto scif_accept_error_epalloc;
+	}
+	spin_lock_init(&cep->lock);
+	mutex_init(&cep->sendlock);
+	mutex_init(&cep->recvlock);
+	cep->state = SCIFEP_CONNECTING;
+	cep->remote_dev = &scif_dev[peer->node];
+	cep->remote_ep = conreq->msg.payload[0];
+
+	cep->qp_info.qp = kzalloc(sizeof(*cep->qp_info.qp), GFP_KERNEL);
+	if (!cep->qp_info.qp) {
+		err = -ENOMEM;
+		goto scif_accept_error_qpalloc;
+	}
+
+	cep->qp_info.qp->magic = SCIFEP_MAGIC;
+	spdev = scif_get_peer_dev(cep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto scif_accept_error_map;
+	}
+	err = scif_setup_qp_accept(cep->qp_info.qp, &cep->qp_info.qp_offset,
+				   conreq->msg.payload[1], SCIF_ENDPT_QP_SIZE,
+				   cep->remote_dev);
+	if (err) {
+		dev_dbg(&cep->remote_dev->sdev->dev,
+			"SCIFAPI accept: ep %p new %p scif_setup_qp_accept %d qp_offset 0x%llx\n",
+			lep, cep, err, cep->qp_info.qp_offset);
+		scif_put_peer_dev(spdev);
+		goto scif_accept_error_map;
+	}
+
+	cep->port.node = lep->port.node;
+	cep->port.port = lep->port.port;
+	cep->peer.node = peer->node;
+	cep->peer.port = peer->port;
+	init_waitqueue_head(&cep->sendwq);
+	init_waitqueue_head(&cep->recvwq);
+	init_waitqueue_head(&cep->conwq);
+
+	msg.uop = SCIF_CNCT_GNT;
+	msg.src = cep->port;
+	msg.payload[0] = cep->remote_ep;
+	msg.payload[1] = cep->qp_info.qp_offset;
+	msg.payload[2] = (u64)cep;
+
+	err = _scif_nodeqp_send(cep->remote_dev, &msg);
+	scif_put_peer_dev(spdev);
+	if (err)
+		goto scif_accept_error_map;
+retry:
+	/* Wait for the remote node to respond with SCIF_CNCT_GNT(N)ACK */
+	err = wait_event_timeout(cep->conwq, cep->state != SCIFEP_CONNECTING,
+				 SCIF_NODE_ACCEPT_TIMEOUT);
+	if (!err && scifdev_alive(cep))
+		goto retry;
+	err = !err ? -ENODEV : 0;
+	if (err)
+		goto scif_accept_error_map;
+	kfree(conreq);
+
+	spin_lock(&cep->lock);
+
+	if (cep->state == SCIFEP_CLOSING) {
+		/*
+		 * Remote failed to allocate resources and NAKed the grant.
+		 * There is at this point nothing referencing the new end point.
+		 */
+		spin_unlock(&cep->lock);
+		scif_teardown_ep(cep);
+		kfree(cep);
+
+		/* If called with the sync flag, go back and wait. */
+		if (flags & SCIF_ACCEPT_SYNC) {
+			spin_lock(&lep->lock);
+			goto retry_connection;
+		}
+		return -EAGAIN;
+	}
+
+	scif_get_port(cep->port.port);
+	*newepd = (scif_epd_t)cep;
+	spin_unlock(&cep->lock);
+	return 0;
+scif_accept_error_map:
+	scif_teardown_ep(cep);
+scif_accept_error_qpalloc:
+	kfree(cep);
+scif_accept_error_epalloc:
+	msg.uop = SCIF_CNCT_REJ;
+	msg.dst.node = conreq->msg.src.node;
+	msg.dst.port = conreq->msg.src.port;
+	msg.payload[0] = conreq->msg.payload[0];
+	msg.payload[1] = conreq->msg.payload[1];
+	scif_nodeqp_send(&scif_dev[conreq->msg.src.node], &msg);
+	kfree(conreq);
+	return err;
+}
+EXPORT_SYMBOL_GPL(scif_accept);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index a2b728a..685ee5c 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -76,6 +76,24 @@ void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
 	schedule_work(&scif_info.misc_work);
 }
 
+static struct scif_endpt *scif_find_listen_ep(u16 port)
+{
+	struct scif_endpt *ep = NULL;
+	struct list_head *pos, *tmpq;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.listen) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->port.port == port) {
+			spin_lock(&ep->lock);
+			spin_unlock(&scif_info.eplock);
+			return ep;
+		}
+	}
+	spin_unlock(&scif_info.eplock);
+	return NULL;
+}
+
 void scif_cleanup_zombie_epd(void)
 {
 	struct list_head *pos, *tmpq;
@@ -90,3 +108,214 @@ void scif_cleanup_zombie_epd(void)
 	}
 	spin_unlock(&scif_info.eplock);
 }
+
+/**
+ * scif_cnctreq() - Respond to SCIF_CNCT_REQ interrupt message
+ * @msg:        Interrupt message
+ *
+ * This message is initiated by the remote node to request a connection
+ * to the local node.  This function looks for an end point in the
+ * listen state on the requested port id.
+ *
+ * If it finds a listening port it places the connect request on the
+ * listening end point's queue and wakes up any pending accept calls.
+ *
+ * If it does not find a listening end point it sends a connection
+ * reject message to the remote node.
+ */
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_conreq *conreq;
+
+	conreq = kmalloc(sizeof(*conreq), GFP_KERNEL);
+	if (!conreq)
+		/* Lack of resources so reject the request. */
+		goto conreq_sendrej;
+
+	ep = scif_find_listen_ep(msg->dst.port);
+	if (!ep)
+		/*  Send reject due to no listening ports */
+		goto conreq_sendrej_free;
+
+	if (ep->backlog <= ep->conreqcnt) {
+		/*  Send reject due to too many pending requests */
+		spin_unlock(&ep->lock);
+		goto conreq_sendrej_free;
+	}
+
+	conreq->msg = *msg;
+	list_add_tail(&conreq->list, &ep->conlist);
+	ep->conreqcnt++;
+	wake_up_interruptible(&ep->conwq);
+	spin_unlock(&ep->lock);
+	return;
+
+conreq_sendrej_free:
+	kfree(conreq);
+conreq_sendrej:
+	msg->uop = SCIF_CNCT_REJ;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_cnctgnt() - Respond to SCIF_CNCT_GNT interrupt message
+ * @msg:        Interrupt message
+ *
+ * An accept() on the remote node has occurred and sent this message
+ * to indicate success.  Place the end point in the MAPPING state and
+ * save the remote node's memory information.  Then wake up the connect
+ * request so it can finish.
+ */
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->peer.node = msg->src.node;
+		ep->peer.port = msg->src.port;
+		ep->qp_info.gnt_pld = msg->payload[1];
+		ep->remote_ep = msg->payload[2];
+		ep->state = SCIFEP_MAPPING;
+
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctgnt_ack() - Respond to SCIF_CNCT_GNTACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request has finished mapping the local memory.
+ * Place the connection in the connected state and wake up the pending
+ * accept() call.
+ */
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	mutex_lock(&scif_info.connlock);
+	spin_lock(&ep->lock);
+	/* New ep is now connected with all resources set. */
+	ep->state = SCIFEP_CONNECTED;
+	list_add_tail(&ep->list, &scif_info.connected);
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+}
+
+/**
+ * scif_cnctgnt_nack() - Respond to SCIF_CNCT_GNTNACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request failed to map the local memory it was sent.
+ * Place the end point in the CLOSING state to indicate it and wake up
+ * the pending accept().
+ */
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_CLOSING;
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctrej() - Respond to SCIF_CNCT_REJ interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote end has rejected the connection request.  Set the end
+ * point back to the bound state and wake up the pending connect().
+ */
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->state = SCIFEP_BOUND;
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_discnct() - Respond to SCIF_DISCNCT interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote node has indicated close() has been called on its end
+ * point.  Remove the local end point from the connected list, set its
+ * state to disconnected and ensure accesses to the remote node are
+ * shut down.
+ *
+ * When all accesses to the remote end have completed then send a
+ * DISCNT_ACK to indicate it can remove its resources and complete
+ * the close routine.
+ */
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		/*
+		 * The local ep may have sent a disconnect and been closed
+		 * due to a message response time out. It may have been
+		 * allocated again and formed a new connection so we want to
+		 * check if the remote ep matches
+		 */
+		if (((u64)tmpep == msg->payload[1]) &&
+		    ((u64)tmpep->remote_ep == msg->payload[0])) {
+			list_del(pos);
+			ep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	/*
+	 * If the terminated end is not found then this side started closing
+	 * before the other side sent the disconnect.  If so the ep will no
+	 * longer be on the connected list.  Regardless the other side
+	 * needs to be acked to let it know close is complete.
+	 */
+	if (!ep) {
+		mutex_unlock(&scif_info.connlock);
+		goto discnct_ack;
+	}
+
+	ep->state = SCIFEP_DISCONNECTED;
+	list_add_tail(&ep->list, &scif_info.disconnected);
+
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+discnct_ack:
+	msg->uop = SCIF_DISCNT_ACK;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_discnt_ack() - Respond to SCIF_DISCNT_ACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has indicated it has no more references to local resources
+ */
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_DISCONNECTED;
+	spin_unlock(&ep->lock);
+	complete(&ep->discon);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index 812c23f..c1d6bc3 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -68,6 +68,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
+	int err = 0;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -91,6 +92,111 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	}
 	case SCIF_LISTEN:
 		return scif_listen(priv, arg);
+	case SCIF_CONNECT:
+	{
+		struct scifioctl_connect req;
+		struct scif_endpt *ep = (struct scif_endpt *)priv;
+
+		if (copy_from_user(&req, argp, sizeof(req)))
+			return -EFAULT;
+
+		err = __scif_connect(priv, &req.peer, non_block);
+		if (err < 0)
+			return err;
+
+		req.self.node = ep->port.node;
+		req.self.port = ep->port.port;
+
+		if (copy_to_user(argp, &req, sizeof(req)))
+			return -EFAULT;
+
+		return 0;
+	}
+	/*
+	 * Accept is done in two halves.  The request ioctl does the basic
+	 * functionality of accepting the request and returning the information
+	 * about it including the internal ID of the end point.  The register
+	 * is done with the internal ID on a new file descriptor opened by the
+	 * requesting process.
+	 */
+	case SCIF_ACCEPTREQ:
+	{
+		struct scifioctl_accept request;
+		scif_epd_t *ep = (scif_epd_t *)&request.endpt;
+
+		if (copy_from_user(&request, argp, sizeof(request)))
+			return -EFAULT;
+
+		err = scif_accept(priv, &request.peer, ep, request.flags);
+		if (err < 0)
+			return err;
+
+		if (copy_to_user(argp, &request, sizeof(request))) {
+			scif_close(*ep);
+			return -EFAULT;
+		}
+		/*
+		 * Add to the list of user mode eps where the second half
+		 * of the accept is not yet completed.
+		 */
+		spin_lock(&scif_info.eplock);
+		list_add_tail(&((*ep)->miacceptlist), &scif_info.uaccept);
+		list_add_tail(&((*ep)->liacceptlist), &priv->li_accept);
+		(*ep)->listenep = priv;
+		priv->acceptcnt++;
+		spin_unlock(&scif_info.eplock);
+
+		return 0;
+	}
+	case SCIF_ACCEPTREG:
+	{
+		struct scif_endpt *priv = f->private_data;
+		struct scif_endpt *newep;
+		struct scif_endpt *lisep;
+		struct scif_endpt *fep = NULL;
+		struct scif_endpt *tmpep;
+		struct list_head *pos, *tmpq;
+
+		/* Finally replace the pointer to the accepted endpoint */
+		if (copy_from_user(&newep, argp, sizeof(void *)))
+			return -EFAULT;
+
+		/* Remove from the user accept queue */
+		spin_lock(&scif_info.eplock);
+		list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, miacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				fep = tmpep;
+				break;
+			}
+		}
+
+		if (!fep) {
+			spin_unlock(&scif_info.eplock);
+			return -ENOENT;
+		}
+
+		lisep = newep->listenep;
+		list_for_each_safe(pos, tmpq, &lisep->li_accept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, liacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				lisep->acceptcnt--;
+				break;
+			}
+		}
+
+		spin_unlock(&scif_info.eplock);
+
+		/* Free the resources automatically created from the open. */
+		scif_teardown_ep(priv);
+		scif_add_epd_to_zombie_list(priv, !SCIF_EPLOCK_HELD);
+		f->private_data = newep;
+		return 0;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
index 714bfea..2393274 100644
--- a/drivers/misc/mic/scif/scif_main.c
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -330,6 +330,7 @@ static int _scif_init(void)
 	scif_info.en_msg_log = 0;
 	scif_info.p2p_enable = 1;
 	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	INIT_WORK(&scif_info.conn_work, scif_conn_handler);
 	idr_init(&scif_ports);
 	return 0;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 5fefacc..0469aad 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -20,6 +20,41 @@
 #include "scif_main.h"
 #include "scif_map.h"
 
+/**
+ * scif_invalidate_ep() - Set state for all connected endpoints
+ * to disconnected and wake up all send/recv waitqueues
+ */
+static void scif_invalidate_ep(int node)
+{
+	struct scif_endpt *ep;
+	struct list_head *pos, *tmpq;
+
+	flush_work(&scif_info.conn_work);
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			spin_lock(&ep->lock);
+			scif_cleanup_ep_qp(ep);
+			spin_unlock(&ep->lock);
+		}
+	}
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			list_del(pos);
+			spin_lock(&ep->lock);
+			ep->state = SCIFEP_DISCONNECTED;
+			list_add_tail(&ep->list, &scif_info.disconnected);
+			scif_cleanup_ep_qp(ep);
+			wake_up_interruptible(&ep->sendwq);
+			wake_up_interruptible(&ep->recvwq);
+			spin_unlock(&ep->lock);
+		}
+	}
+	mutex_unlock(&scif_info.connlock);
+}
+
 void scif_free_qp(struct scif_dev *scifdev)
 {
 	struct scif_qp *qp = scifdev->qpairs;
@@ -91,6 +126,7 @@ void scif_cleanup_scifdev(struct scif_dev *dev)
 		scif_destroy_intr_wq(dev);
 	}
 	scif_destroy_p2p(dev);
+	scif_invalidate_ep(dev->node);
 	scif_send_acks(dev);
 	if (!dev->node && scif_info.card_initiated_exit) {
 		/*
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index 8978af9..aee9025 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -569,7 +569,14 @@ static char *message_types[] = {"BAD",
 				"SCIF_NODE_ADD_ACK",
 				"SCIF_NODE_ADD_NACK",
 				"REMOVE_NODE",
-				"REMOVE_NODE_ACK"};
+				"REMOVE_NODE_ACK",
+				"CNCT_REQ",
+				"CNCT_GNT",
+				"CNCT_GNTACK",
+				"CNCT_GNTNACK",
+				"CNCT_REJ",
+				"DISCNCT",
+				"DISCNT_ACK"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -959,6 +966,13 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
 	scif_node_remove,	/* SCIF_NODE_REMOVE */
 	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+	scif_cnctreq,		/* SCIF_CNCT_REQ */
+	scif_cnctgnt,		/* SCIF_CNCT_GNT */
+	scif_cnctgnt_ack,	/* SCIF_CNCT_GNTACK */
+	scif_cnctgnt_nack,	/* SCIF_CNCT_GNTNACK */
+	scif_cnctrej,		/* SCIF_CNCT_REJ */
+	scif_discnct,		/* SCIF_DISCNCT */
+	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
 };
 
 /**
-- 
1.8.2.1
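
For readers following the connect path in the patch above, the
connecting side of the handshake can be modelled as a small state
machine. What follows is only a user-space sketch, not driver code:
the names ep_state, reply, handle_reply and finish_connect are
hypothetical, map_ok stands in for the result of
scif_setup_qp_connect_response(), and -ENOMEM is an assumed placeholder
for whatever error that call would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Subset of endpoint states seen by the connecting side (hypothetical names). */
enum ep_state { EP_BOUND, EP_CONNECTING, EP_MAPPING, EP_CONNECTED };

/* Replies the accepting node can send to a connect request. */
enum reply { REPLY_GNT, REPLY_REJ };

struct endpoint {
	enum ep_state state;
};

/* Like scif_cnctgnt()/scif_cnctrej(): act only while CONNECTING. */
static void handle_reply(struct endpoint *ep, enum reply r)
{
	assert(ep != NULL);
	if (ep->state != EP_CONNECTING)
		return;	/* not waiting for a reply, ignore it */
	ep->state = (r == REPLY_GNT) ? EP_MAPPING : EP_BOUND;
}

/* Like the tail of scif_conn_func(), run after the wait returns. */
static int finish_connect(struct endpoint *ep, int map_ok)
{
	assert(ep != NULL);
	if (ep->state == EP_MAPPING) {
		if (!map_ok) {
			/* the driver sends SCIF_CNCT_GNTNACK here */
			ep->state = EP_BOUND;
			return -ENOMEM;	/* assumed placeholder error */
		}
		/* the driver sends SCIF_CNCT_GNTACK here */
		ep->state = EP_CONNECTED;
		return 0;
	}
	if (ep->state == EP_BOUND)
		return -ECONNREFUSED;
	/* still CONNECTING after the timeout: the posted code returns err as is */
	return 0;
}
```

A grant followed by a successful mapping leaves the endpoint in
EP_CONNECTED; a reject leaves it in EP_BOUND and the caller sees
-ECONNREFUSED.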


^ permalink raw reply related	[relevance 41%]

* [PATCH char-misc-next v2 11/13] misc: mic: MIC host driver specific changes to enable SCIF
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (9 preceding siblings ...)
  2015-01-12 23:10 52% ` [PATCH char-misc-next v2 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
@ 2015-01-12 23:10 57% ` Sudeep Dutt
  2015-01-12 23:10 70% ` [PATCH char-misc-next v2 12/13] misc: mic: MIC card " Sudeep Dutt
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC host driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.
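
As a rough illustration of the pattern, hardware bus operations of this
kind are usually a table of function pointers that the bus client calls
without knowing which host implements them. The sketch below is a
user-space model only; hw_dev, hw_ops, host, to_host and
hw_dev_register are hypothetical names and not the scif_hw_dev or
scif_hw_ops definitions from this series.

```c
#include <assert.h>
#include <stddef.h>

struct hw_dev;

/* Operations the bus client may call; the host driver fills them in. */
struct hw_ops {
	int (*next_db)(struct hw_dev *dev);
	void (*send_intr)(struct hw_dev *dev, int db);
};

struct hw_dev {
	const struct hw_ops *ops;
	void *parent_data;	/* stands in for the parent's drvdata */
};

/* Host-private state, reached from the bus device via its parent data. */
struct host {
	int next_free_db;
	int last_db_rung;
};

static struct host *to_host(struct hw_dev *dev)
{
	return dev->parent_data;
}

static int host_next_db(struct hw_dev *dev)
{
	return to_host(dev)->next_free_db++;
}

static void host_send_intr(struct hw_dev *dev, int db)
{
	to_host(dev)->last_db_rung = db;
}

static const struct hw_ops host_ops = {
	.next_db = host_next_db,
	.send_intr = host_send_intr,
};

/* "Registering" here only binds the ops table and the parent state. */
static void hw_dev_register(struct hw_dev *dev, struct host *h)
{
	assert(dev != NULL && h != NULL);
	dev->ops = &host_ops;
	dev->parent_data = h;
}
```

The client side then only ever goes through dev->ops, which is what
lets the same client code run on top of different host or card drivers.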

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |   2 +-
 drivers/misc/mic/host/mic_device.h  |  11 +-
 drivers/misc/mic/host/mic_intr.h    |   3 +-
 drivers/misc/mic/host/mic_smpt.h    |   1 +
 drivers/misc/mic/host/mic_boot.c    | 264 ++++++++++++++++++++++++++++++++++--
 drivers/misc/mic/host/mic_debugfs.c |  13 ++
 drivers/misc/mic/host/mic_main.c    |   6 +
 drivers/misc/mic/host/mic_smpt.c    |   7 +-
 drivers/misc/mic/host/mic_virtio.c  |   6 +-
 drivers/misc/mic/host/mic_x100.c    |   3 +-
 10 files changed, 292 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bcf6e36..a2ea3ab 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -36,7 +36,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 016bd15..01a7555 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -27,7 +27,7 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 #include <linux/mic_bus.h>
-
+#include "../bus/scif_bus.h"
 #include "mic_intr.h"
 
 /* The maximum number of MIC devices supported in a single host system. */
@@ -90,7 +90,9 @@ enum mic_stepping {
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
  * @dma_mbdev: MIC BUS DMA device.
- * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
+ * @dma_ch: Array of DMA channels.
+ * @num_dma_ch: Number of DMA channels available.
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -129,7 +131,9 @@ struct mic_device {
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
 	struct mbus_device *dma_mbdev;
-	struct dma_chan *dma_ch;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -228,4 +232,5 @@ void mic_exit_debugfs(void);
 void mic_prepare_suspend(struct mic_device *mdev);
 void mic_complete_resume(struct mic_device *mdev);
 void mic_suspend(struct mic_device *mdev);
+extern atomic_t g_num_mics;
 #endif
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 9f783d4a..cce2882 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -28,8 +28,9 @@
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
  * 4 for host owned DMA channels.
+ * 1 for SCIF.
  */
-#define MIC_MIN_MSIX 8
+#define MIC_MIN_MSIX 9
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_smpt.h b/drivers/misc/mic/host/mic_smpt.h
index 51970ab..68721c6 100644
--- a/drivers/misc/mic/host/mic_smpt.h
+++ b/drivers/misc/mic/host/mic_smpt.h
@@ -78,6 +78,7 @@ void mic_unmap_single(struct mic_device *mdev,
 dma_addr_t mic_map(struct mic_device *mdev,
 	dma_addr_t dma_addr, size_t size);
 void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size);
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr);
 
 /**
  * mic_map_error - Check a MIC address for errors.
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index ff2b0fb..bb15506 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/firmware.h>
 #include <linux/pci.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
@@ -29,6 +30,188 @@
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static void *__mic_dma_alloc(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle, gfp_t gfp,
+			     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	dma_addr_t tmp;
+	void *va = kmalloc(size, gfp);
+
+	if (va) {
+		tmp = mic_map_single(mdev, va, size);
+		if (dma_mapping_error(dev, tmp)) {
+			kfree(va);
+			va = NULL;
+		} else {
+			*dma_handle = tmp;
+		}
+	}
+	return va;
+}
+
+static void __mic_dma_free(struct device *dev, size_t size, void *vaddr,
+			   dma_addr_t dma_handle, struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_handle, size);
+	kfree(vaddr);
+}
+
+static dma_addr_t
+__mic_dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+__mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		     size_t size, enum dma_data_direction dir,
+		     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static int __mic_dma_map_sg(struct device *dev, struct scatterlist *sg,
+			    int nents, enum dma_data_direction dir,
+			    struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	int i, j, ret;
+	dma_addr_t da;
+
+	ret = dma_map_sg(mdev->sdev->parent, sg, nents, dir);
+	if (ret <= 0)
+		return 0;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_map(mdev, sg_dma_address(s) + s->offset, s->length);
+		if (!da)
+			goto err;
+		sg_dma_address(s) = da;
+	}
+	return nents;
+err:
+	for_each_sg(sg, s, i, j) {
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = mic_to_dma_addr(mdev, sg_dma_address(s));
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+	return 0;
+}
+
+static void __mic_dma_unmap_sg(struct device *dev,
+			       struct scatterlist *sg, int nents,
+			       enum dma_data_direction dir,
+			       struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	dma_addr_t da;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_to_dma_addr(mdev, sg_dma_address(s));
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = da;
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+}
+
+static struct dma_map_ops __mic_dma_ops = {
+	.alloc = __mic_dma_alloc,
+	.free = __mic_dma_free,
+	.map_page = __mic_dma_map_page,
+	.unmap_page = __mic_dma_unmap_page,
+	.map_sg = __mic_dma_map_sg,
+	.unmap_sg = __mic_dma_unmap_sg,
+};
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name,
+				       void *data, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					db, MIC_INTR_DB);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_free_irq(mdev, cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_next_db(mdev);
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *___mic_ioremap(struct scif_hw_dev *scdev,
+				    phys_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mdev->aper.va + pa;
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
 static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
 {
 	return dev_get_drvdata(mbdev->dev.parent);
@@ -127,6 +310,58 @@ void mic_bootparam_init(struct mic_device *mdev)
 	bootparam->h2c_config_db = -1;
 	bootparam->shutdown_status = 0;
 	bootparam->shutdown_card = 0;
+	/* Total nodes = number of MICs + 1 for self node */
+	bootparam->tot_nodes = atomic_read(&g_num_mics) + 1;
+	bootparam->node_id = mdev->id + 1;
+	bootparam->scif_host_dma_addr = 0x0;
+	bootparam->scif_card_dma_addr = 0x0;
+	bootparam->c2h_scif_db = -1;
+	bootparam->h2c_scif_db = -1;
+}
+
+/**
+ * mic_request_dma_chans - Request DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns number of DMA channels acquired
+ */
+static int mic_request_dma_chans(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, mdev->ops->dma_filter,
+					   mdev->sdev->parent);
+		if (chan) {
+			mdev->dma_ch[mdev->num_dma_ch++] = chan;
+			if (mdev->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdev->sdev->parent, "DMA channels # %d\n", mdev->num_dma_ch);
+	return mdev->num_dma_ch;
+}
+
+/**
+ * mic_free_dma_chans - release DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none
+ */
+static void mic_free_dma_chans(struct mic_device *mdev)
+{
+	int i = 0;
+
+	for (i = 0; i < mdev->num_dma_ch; i++) {
+		dma_release_channel(mdev->dma_ch[i]);
+		mdev->dma_ch[i] = NULL;
+	}
+	mdev->num_dma_ch = 0;
 }
 
 /**
@@ -141,6 +376,7 @@ int mic_start(struct mic_device *mdev, const char *buf)
 {
 	int rc;
 	mutex_lock(&mdev->mic_mutex);
+	mic_bootparam_init(mdev);
 retry:
 	if (MIC_OFFLINE != mdev->state) {
 		rc = -EINVAL;
@@ -161,14 +397,22 @@ retry:
 		rc = PTR_ERR(mdev->dma_mbdev);
 		goto unlock_ret;
 	}
-	mdev->dma_ch = mic_request_dma_chan(mdev);
-	if (!mdev->dma_ch) {
-		rc = -ENXIO;
+	if (!mic_request_dma_chans(mdev)) {
+		rc = -ENODEV;
 		goto dma_remove;
 	}
+	mdev->scdev = scif_register_device(mdev->sdev->parent, MIC_SCIF_DEV,
+					   &__mic_dma_ops, &scif_hw_ops,
+					   mdev->id + 1, 0, &mdev->mmio,
+					   &mdev->aper, mdev->dp, NULL,
+					   mdev->dma_ch, mdev->num_dma_ch);
+	if (IS_ERR(mdev->scdev)) {
+		rc = PTR_ERR(mdev->scdev);
+		goto dma_free;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto dma_release;
+		goto scif_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -177,8 +421,10 @@ retry:
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
 	goto unlock_ret;
-dma_release:
-	dma_release_channel(mdev->dma_ch);
+scif_remove:
+	scif_unregister_device(mdev->scdev);
+dma_free:
+	mic_free_dma_chans(mdev);
 dma_remove:
 	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
@@ -197,11 +443,9 @@ void mic_stop(struct mic_device *mdev, bool force)
 {
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
+		scif_unregister_device(mdev->scdev);
 		mic_virtio_reset_devices(mdev);
-		if (mdev->dma_ch) {
-			dma_release_channel(mdev->dma_ch);
-			mdev->dma_ch = NULL;
-		}
+		mic_free_dma_chans(mdev);
 		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 687e9aa..3c9ea48 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -214,6 +214,19 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
 		   bootparam->shutdown_card);
+	seq_printf(s, "Bootparam: tot_nodes %d\n",
+		   bootparam->tot_nodes);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index ab37a31..4564629 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -67,6 +67,8 @@ static struct ida g_mic_ida;
 static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
+/* Track the total number of MIC devices */
+atomic_t g_num_mics;
 
 static const struct file_operations mic_fops = {
 	.open = mic_open,
@@ -408,6 +410,7 @@ static int mic_probe(struct pci_dev *pdev,
 		dev_err(&pdev->dev, "cdev_add err id %d rc %d\n", mdev->id, rc);
 		goto cleanup_debug_dir;
 	}
+	atomic_inc(&g_num_mics);
 	return 0;
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
@@ -459,6 +462,7 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	mic_stop(mdev, false);
+	atomic_dec(&g_num_mics);
 	cdev_del(&mdev->cdev);
 	mic_delete_debug_dir(mdev);
 	mutex_lock(&mdev->mic_mutex);
@@ -478,6 +482,7 @@ static void mic_remove(struct pci_dev *pdev)
 	ida_simple_remove(&g_mic_ida, mdev->id);
 	kfree(mdev);
 }
+
 static struct pci_driver mic_driver = {
 	.name = mic_driver_name,
 	.id_table = mic_pci_tbl,
@@ -512,6 +517,7 @@ static int __init mic_init(void)
 	}
 	return ret;
 cleanup_debugfs:
+	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 cleanup_chrdev:
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index fae474c..cec8203 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -174,8 +174,7 @@ static int mic_get_smpt_ref_count(struct mic_device *mdev, dma_addr_t dma_addr,
  *
  * returns a DMA address.
  */
-static dma_addr_t
-mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 {
 	struct mic_smpt_info *smpt_info = mdev->smpt;
 	int spt;
@@ -214,7 +213,7 @@ dma_addr_t mic_map(struct mic_device *mdev, dma_addr_t dma_addr, size_t size)
 	if (!size || size > mic_max_system_memory(mdev))
 		return mic_addr;
 
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return mic_addr;
 
@@ -271,7 +270,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 	}
 
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return;
 
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index a020e4e..cc08e9f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -40,7 +40,7 @@ static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
 {
 	int err = 0;
 	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch;
+	struct dma_chan *mic_ch = mdev->dma_ch[0];
 
 	if (!mic_ch) {
 		err = -EBUSY;
@@ -80,7 +80,7 @@ static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t dma_offset;
 	size_t partlen;
 	int err;
@@ -129,7 +129,7 @@ static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t partlen;
 	int err;
 
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index b7a21e1..3341e90 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -167,8 +167,7 @@ static void mic_x100_send_intr(struct mic_device *mdev, int doorbell)
 	if (doorbell < MIC_X100_NUM_SBOX_IRQ) {
 		mic_x100_send_sbox_intr(mdev, doorbell);
 	} else {
-		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ;
 		mic_x100_send_rdmasr_intr(mdev, rdmasr_db);
 	}
 }
-- 
1.8.2.1


* [PATCH char-misc-next v2 12/13] misc: mic: MIC card driver specific changes to enable SCIF
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (10 preceding siblings ...)
  2015-01-12 23:10 57% ` [PATCH char-misc-next v2 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
@ 2015-01-12 23:10 70% ` Sudeep Dutt
  2015-01-12 23:10 98% ` [PATCH char-misc-next v2 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
    13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC card driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/card/mic_device.h |  11 +++-
 drivers/misc/mic/card/mic_x100.h   |   1 +
 drivers/misc/mic/card/mic_device.c | 132 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |  61 ++++++++++++++++-
 5 files changed, 201 insertions(+), 6 deletions(-)
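
Note for reviewers: one behavioural change in this patch is in
mic_db_to_irq(). The RDMASR index is now computed relative to the first
RDMASR doorbell, so the IRQ base is added only once, in
mic_get_rdmasr_irq(). A minimal standalone sketch of that mapping follows.
The ex_/EX_ names and the numeric values are assumptions chosen for
illustration and are not taken from mic_x100.h.

```c
#include <assert.h>

/* Illustrative values only (assumptions); the real ones are in mic_x100.h. */
#define EX_NUM_SBOX_IRQ     8   /* doorbells 0..7 are SBOX */
#define EX_NUM_DOORBELLS    16  /* doorbells 8..15 are RDMASR */
#define EX_SBOX_IRQ_BASE    0
#define EX_RDMASR_IRQ_BASE  17

/* Map a doorbell number to an IRQ number, adding each base exactly once. */
static int ex_db_to_irq(int db)
{
	assert(db >= 0 && db < EX_NUM_DOORBELLS);

	if (db < EX_NUM_SBOX_IRQ)
		return EX_SBOX_IRQ_BASE + db;

	/* Index within the RDMASR group, not an absolute IRQ number. */
	return EX_RDMASR_IRQ_BASE + (db - EX_NUM_SBOX_IRQ);
}
```

With these example values doorbell 8 maps to IRQ 17. The arithmetic removed
by this patch would have added the base twice and produced 34.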

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index a2ea3ab..e9f2f56 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -56,7 +56,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS
+	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 844be8f..1dbf83c 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -29,9 +29,9 @@
 
 #include <linux/workqueue.h>
 #include <linux/io.h>
-#include <linux/irqreturn.h>
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
+#include "../bus/scif_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -73,6 +73,9 @@ struct mic_device {
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
  * @dma_mbdev: dma device on the MIC virtual bus.
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -84,6 +87,9 @@ struct mic_driver {
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
 	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -122,10 +128,11 @@ void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
 struct mic_irq *
 mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
-		     const char *name, void *data, int intr_src);
+		     const char *name, void *data, int db);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
+void mic_send_p2p_intr(int doorbell, struct mic_mw *mw);
 int mic_db_to_irq(struct mic_driver *mdrv, int db);
 u32 mic_ack_interrupt(struct mic_device *mdev);
 void mic_hw_intr_init(struct mic_driver *mdrv);
diff --git a/drivers/misc/mic/card/mic_x100.h b/drivers/misc/mic/card/mic_x100.h
index d66ea55..7e22249 100644
--- a/drivers/misc/mic/card/mic_x100.h
+++ b/drivers/misc/mic/card/mic_x100.h
@@ -35,6 +35,7 @@
 #define MIC_X100_SBOX_SDBIC0 0x0000CC90
 #define MIC_X100_SBOX_SDBIC0_DBREQ_BIT 0x80000000
 #define MIC_X100_SBOX_RDMASR0	0x0000B180
+#define MIC_X100_SBOX_APICICR0 0x0000A9D0
 
 #define MIC_X100_MAX_DOORBELL_IDX 8
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 83819ee..6338908 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -28,6 +28,8 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/reboot.h>
+#include <linux/dmaengine.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
@@ -240,6 +242,111 @@ static void mic_uninit_irq(void)
 	kfree(mdrv->irq_info.irq_usage_count);
 }
 
+static inline struct mic_driver *scdev_to_mdrv(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name, void *data,
+				       int db)
+{
+	return mic_request_card_irq(func, NULL, name, data, db);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	mic_free_card_irq(cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	return mic_next_card_db();
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void ___mic_send_p2p_intr(struct scif_hw_dev *scdev, int db,
+				 struct mic_mw *mw)
+{
+	mic_send_p2p_intr(db, mw);
+}
+
+static void __iomem *
+___mic_ioremap(struct scif_hw_dev *scdev,
+	       phys_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.send_p2p_intr = ___mic_send_p2p_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
+static int mic_request_dma_chans(struct mic_driver *mdrv)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, NULL, NULL);
+		if (chan) {
+			mdrv->dma_ch[mdrv->num_dma_ch++] = chan;
+			if (mdrv->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdrv->dev, "DMA channels # %d\n", mdrv->num_dma_ch);
+	return mdrv->num_dma_ch;
+}
+
+static void mic_free_dma_chans(struct mic_driver *mdrv)
+{
+	int i = 0;
+
+	for (i = 0; i < mdrv->num_dma_ch; i++) {
+		dma_release_channel(mdrv->dma_ch[i]);
+		mdrv->dma_ch[i] = NULL;
+	}
+	mdrv->num_dma_ch = 0;
+}
+
 /*
  * mic_driver_init - MIC driver initialization tasks.
  *
@@ -248,6 +355,8 @@ static void mic_uninit_irq(void)
 int __init mic_driver_init(struct mic_driver *mdrv)
 {
 	int rc;
+	struct mic_bootparam __iomem *bootparam;
+	u8 node_id;
 
 	g_drv = mdrv;
 	/*
@@ -268,13 +377,32 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 	rc = mic_shutdown_init();
 	if (rc)
 		goto irq_uninit;
+	if (!mic_request_dma_chans(mdrv)) {
+		rc = -ENODEV;
+		goto shutdown_uninit;
+	}
 	rc = mic_devices_init(mdrv);
 	if (rc)
-		goto shutdown_uninit;
+		goto dma_free;
+	bootparam = mdrv->dp;
+	node_id = ioread8(&bootparam->node_id);
+	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
+					   NULL, &scif_hw_ops,
+					   0, node_id, &mdrv->mdev.mmio, NULL,
+					   NULL, mdrv->dp, mdrv->dma_ch,
+					   mdrv->num_dma_ch);
+	if (IS_ERR(mdrv->scdev)) {
+		rc = PTR_ERR(mdrv->scdev);
+		goto device_uninit;
+	}
 	mic_create_card_debug_dir(mdrv);
 	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
 done:
 	return rc;
+device_uninit:
+	mic_devices_uninit(mdrv);
+dma_free:
+	mic_free_dma_chans(mdrv);
 shutdown_uninit:
 	mic_shutdown_uninit();
 irq_uninit:
@@ -294,7 +422,9 @@ put:
 void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
+	scif_unregister_device(mdrv->scdev);
 	mic_devices_uninit(mdrv);
+	mic_free_dma_chans(mdrv);
 	/*
 	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
 	 * The module cannot be unloaded so the only code path to call
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index e98e537..77fd41781 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -70,6 +70,41 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
+/*
+ * mic_x100_send_sbox_intr - Send an MIC_X100_SBOX interrupt to MIC.
+ */
+static void mic_x100_send_sbox_intr(struct mic_mw *mw, int doorbell)
+{
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
+
+	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
+	apicicr_low = (apicicr_low | (1 << 13));
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+static void mic_x100_send_rdmasr_intr(struct mic_mw *mw, int doorbell)
+{
+	int rdmasr_offset = MIC_X100_SBOX_RDMASR0 + (doorbell << 2);
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, 0, MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+}
+
 /**
  * mic_ack_interrupt - Device specific interrupt handling.
  * @mdev: pointer to mic_device instance
@@ -91,6 +126,18 @@ static inline int mic_get_rdmasr_irq(int index)
 	return  MIC_X100_RDMASR_IRQ_BASE + index;
 }
 
+void mic_send_p2p_intr(int db, struct mic_mw *mw)
+{
+	int rdmasr_index;
+
+	if (db < MIC_X100_NUM_SBOX_IRQ) {
+		mic_x100_send_sbox_intr(mw, db);
+	} else {
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
+		mic_x100_send_rdmasr_intr(mw, rdmasr_index);
+	}
+}
+
 /**
  * mic_hw_intr_init - Initialize h/w specific interrupt
  * information.
@@ -113,11 +160,15 @@ void mic_hw_intr_init(struct mic_driver *mdrv)
 int mic_db_to_irq(struct mic_driver *mdrv, int db)
 {
 	int rdmasr_index;
+
+	/*
+	 * The total number of doorbell interrupts on the card is 16. Indices
+	 * 0-7 fall in the SBOX category and 8-15 fall in the RDMASR category.
+	 */
 	if (db < MIC_X100_NUM_SBOX_IRQ) {
 		return mic_get_sbox_irq(db);
 	} else {
-		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
 		return mic_get_rdmasr_irq(rdmasr_index);
 	}
 }
@@ -243,10 +294,16 @@ static void mic_platform_shutdown(struct platform_device *pdev)
 	mic_remove(pdev);
 }
 
+static u64 mic_dma_mask = DMA_BIT_MASK(64);
+
 static struct platform_device mic_platform_dev = {
 	.name = mic_driver_name,
 	.id   = 0,
 	.num_resources = 0,
+	.dev = {
+		.dma_mask = &mic_dma_mask,
+		.coherent_dma_mask = DMA_BIT_MASK(64),
+	},
 };
 
 static struct platform_driver __refdata mic_platform_driver = {
-- 
1.8.2.1


* [PATCH char-misc-next v2 13/13] misc: mic: add support for loading/unloading SCIF driver
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (11 preceding siblings ...)
  2015-01-12 23:10 70% ` [PATCH char-misc-next v2 12/13] misc: mic: MIC card " Sudeep Dutt
@ 2015-01-12 23:10 98% ` Sudeep Dutt
    13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Load the SCIF driver via modprobe on start and remove it on unload.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpss | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index cacbdb0..582aad4 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -35,6 +35,7 @@
 
 exec=/usr/sbin/mpssd
 sysfs="/sys/class/mic"
+mic_modules="mic_host mic_x100_dma scif"
 
 start()
 {
@@ -48,18 +49,15 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
+	echo -e $"Loading MIC drivers:" $mic_modules
 
-	for f in "mic_host" "mic_x100_dma"
-	do
-		modprobe $f
-		RETVAL=$?
-		if [ $RETVAL -ne 0 ]; then
-			failure
-			echo
-			return $RETVAL
-		fi
-	done
+	modprobe -a $mic_modules
+	RETVAL=$?
+	if [ $RETVAL -ne 0 ]; then
+		failure
+		echo
+		return $RETVAL
+	fi
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +168,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
-	modprobe -r mic_host mic_x100_dma
+	echo -n $"Removing MIC drivers:" $mic_modules
+	modprobe -r $mic_modules
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1


* [PATCH char-misc-next v2 00/13] misc: mic: SCIF driver
@ 2015-01-12 23:09 64% Sudeep Dutt
  2015-01-12 23:09 22% ` [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
                   ` (13 more replies)
  0 siblings, 14 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

ChangeLog:
=========
v1 => v2:
a) Use kernel-doc formatting for SCIF header file documentation, enhance
   SCIF ring buffer documentation and formatting cleanup in patch 4 as
   per feedback from Greg Kroah-Hartman
b) SCIF bug fixes and cleanups since initial post

v1: Initial post @ https://lkml.org/lkml/2014/12/10/546

Description:
============

The Symmetric Communication Interface (SCIF, pronounced "skiff") is a low
level communications API across PCIe, currently implemented for MIC. At present
SCIF provides inter-node communication within a single host platform, where a
node is a MIC coprocessor or a Xeon based host. SCIF abstracts the details of
communicating over the PCIe bus while providing an API that is symmetric
across all the nodes in the PCIe network. An important design objective for SCIF
is to deliver the maximum possible performance given the communication
abilities of the hardware. SCIF has been used to implement an offload compiler
runtime and OFED support for MPI implementations for MIC coprocessors.
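
The host driver patch in this series numbers the nodes so that the host is
node 0 and each MIC card is its device id plus one, with the total node count
being the number of MICs plus the host itself. A small sketch of that
numbering follows; the ex_ helper names are hypothetical and exist only for
this illustration.

```c
#include <assert.h>

#define EX_HOST_NODE 0 /* the host is always node 0 */

/* Node id handed to a card whose device id is mic_id (0-based). */
static int ex_card_node_id(int mic_id)
{
	assert(mic_id >= 0);
	return mic_id + 1;
}

/* Total nodes: every MIC card plus the host itself. */
static int ex_total_nodes(int num_mics)
{
	assert(num_mics >= 0);
	return num_mics + 1;
}
```

For a platform with two cards this gives three nodes: the host (0) and the
cards (1 and 2).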

==== SCIF API Components ====
The SCIF API has the following parts:
1. Connection establishment using a client server model
2. Byte stream messaging intended for short messages
3. Node enumeration to determine online nodes
4. Poll semantics for detection of incoming connections and messages
5. Memory registration to pin down pages
6. Remote memory mapping for low latency CPU accesses via mmap
7. Remote DMA (RDMA) for high bandwidth DMA transfers
8. Fence APIs for RDMA synchronization

SCIF exposes the notion of a connection which can be used by peer processes on
nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
process in a SCIF node initiates a SCIF connection to a peer process on a
different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
which are similar to connection oriented socket APIs. Connected SCIF endpoints
can also register local memory, after which data can be transferred using DMA,
CPU copies or remote memory mapping via mmap. SCIF supports both user and
kernel mode clients, which are functionally equivalent.
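
The connection model described above resembles the socket life cycle: a
listening endpoint is opened, bound to a port and put into the listening
state, while a connecting endpoint is opened and connected to it. The toy
state machine below sketches only that ordering. The ex_ names, the states
and the error convention are hypothetical and are not the SCIF API from
include/linux/scif.h.

```c
#include <assert.h>

enum ex_state { EX_UNBOUND, EX_BOUND, EX_LISTENING, EX_CONNECTED };

struct ex_ep {
	enum ex_state state;
	int port; /* 0 means no port assigned yet */
};

static void ex_open(struct ex_ep *ep)
{
	assert(ep);
	ep->state = EX_UNBOUND;
	ep->port = 0;
}

/* Binding is only valid once, on a fresh endpoint. */
static int ex_bind(struct ex_ep *ep, int port)
{
	if (ep->state != EX_UNBOUND || port <= 0)
		return -1;
	ep->port = port;
	ep->state = EX_BOUND;
	return 0;
}

/* Only a bound endpoint can start listening. */
static int ex_listen(struct ex_ep *ep)
{
	if (ep->state != EX_BOUND)
		return -1;
	ep->state = EX_LISTENING;
	return 0;
}

/* Connect a fresh or bound endpoint to a listener on the given port. */
static int ex_connect(struct ex_ep *ep, const struct ex_ep *listener, int port)
{
	if (ep->state != EX_UNBOUND && ep->state != EX_BOUND)
		return -1;
	if (listener->state != EX_LISTENING || listener->port != port)
		return -1;
	ep->state = EX_CONNECTED;
	return 0;
}
```

A connect attempt fails until the peer has both bound and started listening
on the requested port, which is the ordering the open, bind, listen and
connect patches in this series implement for real endpoints.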

==== SCIF Performance for MIC ====
A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
SCIF shows the performance advantages of SCIF for HPC applications and runtimes.

             Comparison of TCP and SCIF based BW

  Throughput (GB/sec)
    8 +                                             PCIe Bandwidth ******
      +                                                        TCP ######
    7 +    **************************************             SCIF %%%%%%
      |                       %%%%%%%%%%%%%%%%%%%
    6 +                   %%%%
      |                 %%
      |               %%%
    5 +              %%
      |            %%
    4 +           %%
      |          %%
    3 +         %%
      |        %
    2 +      %%
      |     %%
      |    %
    1 +
      +    ######################################
    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
      1       10     100      1000   10000   100000
                   Transfer Size (KBytes)

SCIF allows memory sharing via mmap(..) between processes on different PCIe
nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.

For more information on SCIF please refer to the LinuxCon 2013 slides titled
"Intel® MIC x100 Coprocessor Driver - on the Frontiers of Linux & HPC" at
http://events.linuxfoundation.org/sites/events/files/slides/LinuxConMicDriver.pdf
The talk can also be viewed at https://www.youtube.com/watch?v=dylbmmQs4W0

This initial patch series introduces the SCIF API and implements the SCIF
connection, messaging and node enumeration APIs. SCIF poll, mmap, RDMA and
fence APIs will be submitted in a future patch series once the initial base
patches are accepted upstream.

This patch series is divided into 13 patches as follows:
1) SCIF header file documenting the API along with the IOCTL interface

2) SCIF ring buffer is a single producer, single consumer byte stream ring
   buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
   used to implement a receive queue for SCIF driver messaging between nodes and
   for byte stream messaging between SCIF endpoints. Each SCIF node has a
   receive queue for every other SCIF node, and each connected endpoint has a
   receive queue for messages from its peer. This pair of receive queues is
   referred to as a SCIF queue pair.

3) SCIF hardware bus which abstracts the low level hardware details so that the
   same SCIF driver can work without any changes on the host or the card as long
   as the hardware bus operations are implemented.

4) SCIF peer bus is used to register and unregister SCIF peer devices internally
   by the SCIF driver to signify the addition and removal of peer nodes
   respectively from the SCIF network. This simplifies remote node handling
   within SCIF and will also be used to support device probe/remove for SCIF
   client drivers (e.g. netdev over SCIF)

5) Common MIC header file changes to enable SCIF

6) SCIF module initialization, DMA/remote memory mapping APIs and debugfs hooks.

7) SCIF node queue pair setup sets up the kernel mode private node
   queue pairs between all the nodes to enable internal control
   message communication.

8) SCIF character device file operations and kernel APIs for opening and
   closing a user and kernel mode SCIF endpoint. This patch also enables
   binding to a SCIF port and listening for incoming SCIF connections.

9) SCIF connection APIs which establish a SCIF connection between
   a pair of user or kernel mode endpoints.

10) SCIF messaging APIs which allow sending messages between the SCIF endpoints
    via a byte stream based ring buffer which has been optimized to avoid reads
    across PCIe. The node enumeration API enables a user to query for the number
    of online nodes.

11) MIC host driver specific changes to enable SCIF

12) MIC card driver specific changes to enable SCIF

13) Add support for loading/unloading SCIF driver
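
The ring buffer in patch 2 is described as a single producer, single consumer
byte stream. A minimal userspace sketch of that general technique is shown
here. It is not the code from scif_rb.c: the ex_rb names, the power-of-two
size restriction and the convention of keeping one byte free are assumptions
made for illustration. It uses plain memory with no barriers, so it
demonstrates only the index arithmetic, not cross-PCIe ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_RB_SIZE 16u /* must be a power of two */

struct ex_rb {
	uint8_t  buf[EX_RB_SIZE];
	uint32_t write; /* advanced only by the producer */
	uint32_t read;  /* advanced only by the consumer */
};

/* Bytes the producer may still write; one slot always stays empty. */
static uint32_t ex_rb_space(const struct ex_rb *rb)
{
	return (rb->read - rb->write - 1) & (EX_RB_SIZE - 1);
}

/* Bytes the consumer may read. */
static uint32_t ex_rb_count(const struct ex_rb *rb)
{
	return (rb->write - rb->read) & (EX_RB_SIZE - 1);
}

/* Copy len bytes in, or nothing at all; returns 0 on success, -1 if full. */
static int ex_rb_write(struct ex_rb *rb, const void *msg, uint32_t len)
{
	const uint8_t *src = msg;
	uint32_t i;

	assert((EX_RB_SIZE & (EX_RB_SIZE - 1)) == 0);
	if (len > ex_rb_space(rb))
		return -1;
	for (i = 0; i < len; i++)
		rb->buf[(rb->write + i) & (EX_RB_SIZE - 1)] = src[i];
	rb->write = (rb->write + len) & (EX_RB_SIZE - 1);
	return 0;
}

/* Copy up to len bytes out; returns the number of bytes actually read. */
static uint32_t ex_rb_read(struct ex_rb *rb, void *msg, uint32_t len)
{
	uint8_t *dst = msg;
	uint32_t i, avail = ex_rb_count(rb);

	if (len > avail)
		len = avail;
	for (i = 0; i < len; i++)
		dst[i] = rb->buf[(rb->read + i) & (EX_RB_SIZE - 1)];
	rb->read = (rb->read + len) & (EX_RB_SIZE - 1);
	return len;
}
```

Because each index has exactly one writer, the producer and the consumer never
update the same field, which is what makes a lock-free single producer, single
consumer queue possible in the first place.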

Nikhil Rao (1):
  misc: mic: SCIF connections APIs i.e. accept and connect

Sudeep Dutt (12):
  misc: mic: SCIF header file and IOCTL interface
  misc: mic: SCIF ring buffer infrastructure
  misc: mic: SCIF Hardware Bus
  misc: mic: SCIF Peer Bus
  misc: mic: Common MIC header file changes in preparation for SCIF
  misc: mic: SCIF module initialization
  misc: mic: SCIF node queue pair setup management
  misc: mic: SCIF open close bind and listen APIs
  misc: mic: SCIF messaging and node enumeration APIs
  misc: mic: MIC host driver specific changes to enable SCIF
  misc: mic: MIC card driver specific changes to enable SCIF
  misc: mic: add support for loading/unloading SCIF driver

 Documentation/mic/mic_overview.txt    |   28 +-
 Documentation/mic/scif_overview.txt   |   62 ++
 drivers/misc/mic/Kconfig              |   40 +-
 drivers/misc/mic/Makefile             |    3 +-
 drivers/misc/mic/bus/Makefile         |    1 +
 drivers/misc/mic/scif/Makefile        |   15 +
 include/uapi/linux/Kbuild             |    1 +
 drivers/misc/mic/bus/scif_bus.h       |  129 ++++
 drivers/misc/mic/card/mic_device.h    |   11 +-
 drivers/misc/mic/card/mic_x100.h      |    1 +
 drivers/misc/mic/common/mic_dev.h     |    3 +
 drivers/misc/mic/host/mic_device.h    |   11 +-
 drivers/misc/mic/host/mic_intr.h      |    3 +-
 drivers/misc/mic/host/mic_smpt.h      |    1 +
 drivers/misc/mic/scif/scif_epd.h      |  160 ++++
 drivers/misc/mic/scif/scif_main.h     |  252 +++++++
 drivers/misc/mic/scif/scif_map.h      |  113 +++
 drivers/misc/mic/scif/scif_nodeqp.h   |  182 +++++
 drivers/misc/mic/scif/scif_peer_bus.h |   65 ++
 drivers/misc/mic/scif/scif_rb.h       |  100 +++
 include/linux/scif.h                  | 1125 ++++++++++++++++++++++++++++
 include/uapi/linux/mic_common.h       |   12 +
 include/uapi/linux/scif_ioctl.h       |  213 ++++++
 drivers/misc/mic/bus/scif_bus.c       |  210 ++++++
 drivers/misc/mic/card/mic_device.c    |  132 +++-
 drivers/misc/mic/card/mic_x100.c      |   61 +-
 drivers/misc/mic/host/mic_boot.c      |  264 ++++++-
 drivers/misc/mic/host/mic_debugfs.c   |   13 +
 drivers/misc/mic/host/mic_main.c      |    6 +
 drivers/misc/mic/host/mic_smpt.c      |    7 +-
 drivers/misc/mic/host/mic_virtio.c    |    6 +-
 drivers/misc/mic/host/mic_x100.c      |    3 +-
 drivers/misc/mic/scif/scif_api.c      | 1271 ++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_debugfs.c  |   85 +++
 drivers/misc/mic/scif/scif_epd.c      |  353 +++++++++
 drivers/misc/mic/scif/scif_fd.c       |  303 ++++++++
 drivers/misc/mic/scif/scif_main.c     |  391 ++++++++++
 drivers/misc/mic/scif/scif_nm.c       |  237 ++++++
 drivers/misc/mic/scif/scif_nodeqp.c   | 1307 +++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c |  124 ++++
 drivers/misc/mic/scif/scif_ports.c    |  124 ++++
 drivers/misc/mic/scif/scif_rb.c       |  248 +++++++
 Documentation/mic/mpssd/mpss          |   24 +-
 43 files changed, 7644 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

-- 
1.8.2.1


^ permalink raw reply	[relevance 64%]

* [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
@ 2015-01-12 23:09 22% ` Sudeep Dutt
      2015-01-12 23:09 58% ` [PATCH char-misc-next v2 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

This patch introduces the SCIF documentation in the header file
and describes the IOCTL interface for user mode. mic_overview.txt
is updated with documentation on SCIF and a new document
describing SCIF in more detail is available in scif_overview.txt.
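
As one example of the rules the header documents, some of the EINVAL
conditions listed for scif_register() could be checked roughly as below. This
is only a sketch: demo_check_register_args() and DEMO_MAP_FIXED are
hypothetical names, not part of the patch, and the check for invalid
prot_flags/map_flags values is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as SCIF_MAP_FIXED in the header; renamed to keep this standalone. */
#define DEMO_MAP_FIXED 0x10

/*
 * Hypothetical helper: returns 0 if the arguments satisfy the documented
 * alignment rules, -EINVAL otherwise. page_size must be a power of two.
 */
static int demo_check_register_args(uintptr_t addr, size_t len, long offset,
				    int map_flags, size_t page_size)
{
	assert(page_size && !(page_size & (page_size - 1)));

	/* len must be a non-zero multiple of the page size */
	if (len == 0 || (len & (page_size - 1)))
		return -EINVAL;
	/* addr must be a multiple of the page size */
	if (addr & (page_size - 1))
		return -EINVAL;
	/* offset must not be negative */
	if (offset < 0)
		return -EINVAL;
	/* with a fixed mapping, offset must also be page aligned */
	if ((map_flags & DEMO_MAP_FIXED) &&
	    ((size_t)offset & (page_size - 1)))
		return -EINVAL;
	return 0;
}
```

Without the fixed-mapping flag an unaligned offset passes this check, since
the header describes offset as only a placement hint in that case.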

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mic_overview.txt  |   28 +-
 Documentation/mic/scif_overview.txt |   62 ++
 include/uapi/linux/Kbuild           |    1 +
 include/linux/scif.h                | 1125 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/scif_ioctl.h     |  213 +++++++
 5 files changed, 1417 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 77c5418..1a2f2c8 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
 created on it by the host/card drivers. On host the channels are private
 and used only by the host driver to transfer data for the virtio devices.
 
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
+low level communications API across PCIe currently implemented for MIC.
+More details are available at scif_overview.txt.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
@@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
                       |               |       | Virtio over PCIe IOCTLs  |
                       |               |       +--------------------------+
 +-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |               |                   |  | MIC DMA   |
-| Driver    |         |               |                   |  | Driver    |
-+-----------+         |               |                   |  +-----------+
-      |               |               |                   |        |
-+---------------+     |               |                   |  +----------------+
-|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
-+---------------+     |               |                   |  +----------------+
-      |               |               |                   |              |
-      |   +--------------+            |            +---------------+     |
-      |   |Intel MIC     |            |            |Intel MIC      |     |
-      +---|Card Driver   |            |            |Host Driver    |     |
-          +--------------+            |            +---------------+-----+
+| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
+| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
++-----------+         |  +----------+ | +-----------+     |  +-----------+
+      |               |       |       |       |           |        |
++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
+|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW BUS|     | |MIC virtual Bus|
++---------------+     | +-----------+ | +-----+-----+     | +---------------+
+      |               |       |       |       |           |              |
+      |   +--------------+    |       |       |    +---------------+     |
+      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
+      +---|Card Driver   +----+       |       |    |Host Driver    |     |
+          +--------------+            |       +----+---------------+-----+
                       |               |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
new file mode 100644
index 0000000..75549c4
--- /dev/null
+++ b/Documentation/mic/scif_overview.txt
@@ -0,0 +1,62 @@
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
+level communications API across PCIe currently implemented for MIC. Currently
+SCIF provides inter-node communication within a single host platform, where a
+node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
+communicating over the PCIe bus while providing an API that is symmetric
+across all the nodes in the PCIe network. An important design objective for SCIF
+is to deliver the maximum possible performance given the communication
+abilities of the hardware. SCIF has been used to implement an offload compiler
+runtime and OFED support for MPI implementations for MIC coprocessors.
+
+==== SCIF API Components ====
+The SCIF API has the following parts:
+1. Connection establishment using a client server model
+2. Byte stream messaging intended for short messages
+3. Node enumeration to determine online nodes
+4. Poll semantics for detection of incoming connections and messages
+5. Memory registration to pin down pages
+6. Remote memory mapping for low latency CPU accesses via mmap
+7. Remote DMA (RDMA) for high bandwidth DMA transfers
+8. Fence APIs for RDMA synchronization
+
+SCIF exposes the notion of a connection which can be used by peer processes on
+nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
+process in a SCIF node initiates a SCIF connection to a peer process on a
+different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
+which are similar to connection oriented socket APIs. Connected SCIF endpoints
+can also register local memory which is followed by data transfer using either
+DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
+kernel mode clients which are functionally equivalent.
+
+==== SCIF Performance for MIC ====
+A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
+SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
+
+             Comparison of TCP and SCIF based BW
+
+  Throughput (GB/sec)
+    8 +                                             PCIe Bandwidth ******
+      +                                                        TCP ######
+    7 +    **************************************             SCIF %%%%%%
+      |                       %%%%%%%%%%%%%%%%%%%
+    6 +                   %%%%
+      |                 %%
+      |               %%%
+    5 +              %%
+      |            %%
+    4 +           %%
+      |          %%
+    3 +         %%
+      |        %
+    2 +      %%
+      |     %%
+      |    %
+    1 +
+      +    ######################################
+    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
+      1       10     100      1000   10000   100000
+                   Transfer Size (KBytes)
+
+SCIF allows memory sharing via mmap(..) between processes on different PCIe
+nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
+latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 00b10002..5083817 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -349,6 +349,7 @@ header-y += rtc.h
 header-y += rtnetlink.h
 header-y += scc.h
 header-y += sched.h
+header-y += scif_ioctl.h
 header-y += screen_info.h
 header-y += sctp.h
 header-y += sdla.h
diff --git a/include/linux/scif.h b/include/linux/scif.h
new file mode 100644
index 0000000..556b032
--- /dev/null
+++ b/include/linux/scif.h
@@ -0,0 +1,1125 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef __SCIF_H__
+#define __SCIF_H__
+
+#include <linux/types.h>
+#include <linux/poll.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_ACCEPT_SYNC	1
+#define SCIF_SEND_BLOCK		1
+#define SCIF_RECV_BLOCK		1
+
+enum {
+	SCIF_PROT_READ = (1 << 0),
+	SCIF_PROT_WRITE = (1 << 1)
+};
+
+enum {
+	SCIF_MAP_FIXED = 0x10,
+	SCIF_MAP_KERNEL	= 0x20,
+};
+
+enum {
+	SCIF_FENCE_INIT_SELF = (1 << 0),
+	SCIF_FENCE_INIT_PEER = (1 << 1),
+	SCIF_SIGNAL_LOCAL = (1 << 4),
+	SCIF_SIGNAL_REMOTE = (1 << 5)
+};
+
+enum {
+	SCIF_RMA_USECPU = (1 << 0),
+	SCIF_RMA_USECACHE = (1 << 1),
+	SCIF_RMA_SYNC = (1 << 2),
+	SCIF_RMA_ORDERED = (1 << 3)
+};
+
+/* End of SCIF Admin Reserved Ports */
+#define SCIF_ADMIN_PORT_END	1024
+
+/* End of SCIF Reserved Ports */
+#define SCIF_PORT_RSVD		1088
+
+typedef struct scif_endpt *scif_epd_t;
+
+#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
+#define SCIF_REGISTER_FAILED ((off_t)-1)
+#define SCIF_MMAP_FAILED ((void *)-1)
+
+/**
+ * scif_open() - Create an endpoint
+ *
+ * Return:
+ * Upon successful completion, scif_open() returns an endpoint descriptor to
+ * be used in subsequent SCIF function calls to refer to that endpoint;
+ * otherwise in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
+ * returned and errno is set to indicate the error; in kernel mode a NULL
+ * scif_epd_t is returned.
+ *
+ * Errors:
+ * ENOMEM
+ * Insufficient kernel memory was available
+ */
+scif_epd_t scif_open(void);
+
+/**
+ * scif_bind() - Bind an endpoint to a port
+ * @epd:	endpoint descriptor
+ * @pn:		port number
+ *
+ * scif_bind() binds endpoint epd to port pn, where pn is a port number on the
+ * local node. If pn is zero, a port number greater than or equal to
+ * SCIF_PORT_RSVD is assigned and returned. Each endpoint may be bound to
+ * exactly one local port. Ports less than 1024 when requested can only be bound
+ * by system (or root) processes or by processes executed by privileged users.
+ *
+ * Return:
+ * Upon successful completion, scif_bind() returns the port number to which epd
+ * is bound; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * The endpoint or the port is already bound
+ * EISCONN
+ * The endpoint is already connected
+ * ENOSPC
+ * No port number available for assignment
+ * EACCES
+ * The port requested is protected and the user is not the superuser
+*/
+int scif_bind(scif_epd_t epd, uint16_t pn);
+
+/**
+ * scif_listen() - Listen for connections on an endpoint
+ * @epd:	endpoint descriptor
+ * @backlog:	maximum pending connection requests
+ *
+ * scif_listen() marks the endpoint epd as a listening endpoint - that is, as
+ * an endpoint that will be used to accept incoming connection requests. Once
+ * so marked, the endpoint is said to be in the listening state and may not be
+ * used as the endpoint of a connection.
+ *
+ * The endpoint, epd, must have been bound to a port.
+ *
+ * The backlog argument defines the maximum length to which the queue of
+ * pending connections for epd may grow. If a connection request arrives when
+ * the queue is full, the client may receive an error with an indication that
+ * the connection was refused.
+ *
+ * Return:
+ * Upon successful completion, scif_listen() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * The endpoint is not bound to a port
+ * EISCONN
+ * The endpoint is already connected or listening
+*/
+int scif_listen(scif_epd_t epd, int backlog);
+
+/**
+ * scif_connect() - Initiate a connection on a port
+ * @epd:	endpoint descriptor
+ * @dst:	global id of port to which to connect
+ *
+ * The scif_connect() function requests the connection of endpoint epd to remote
+ * port dst. If the connection is successful, a peer endpoint, bound to dst, is
+ * created on node dst.node. On successful return, the connection is complete.
+ *
+ * If the endpoint epd has not already been bound to a port, scif_connect()
+ * will bind it to an unused local port.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * In user space, scif_connect() supports an asynchronous connection mode
+ * if the application has set the O_NONBLOCK flag on the endpoint via the
+ * fcntl() system call. With this flag set, the calling process does not
+ * wait during scif_connect().
+ *
+ * Return:
+ * Upon successful completion, scif_connect() returns the port ID to which the
+ * endpoint, epd, is bound; otherwise in user mode -1 is returned and errno is
+ * set to indicate the error; in kernel mode the negative of one of the
+ * following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNREFUSED
+ * The destination was not listening for connections or refused the
+ *	connection request
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * dst.port is not a valid port ID
+ * EISCONN
+ * The endpoint is already connected
+ * ENOMEM
+ * No buffer space is available
+ * ENODEV
+ * The destination node does not exist, or
+ * The node is lost, or it existed but is not currently in the network since
+ *	it may have crashed
+ * ENOSPC
+ * No port number available for assignment
+ * EOPNOTSUPP
+ * The endpoint is listening and cannot be connected
+*/
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst);
+
+/**
+ * scif_accept() - Accept a connection on an endpoint
+ * @epd:	endpoint descriptor
+ * @peer:	global id of port to which connected
+ * @newepd:	new connected endpoint descriptor
+ * @flags:	flags
+ *
+ * The scif_accept() call extracts the first connection request from the queue
+ * of pending connections for the port on which epd is listening. scif_accept()
+ * creates a new endpoint, bound to the same port as epd, and allocates a new
+ * SCIF endpoint descriptor, returned in newepd, for the endpoint. The new
+ * endpoint is connected to the endpoint through which the connection was
+ * requested. epd is unaffected by this call, and remains in the listening
+ * state.
+ *
+ * On successful return, peer holds the global port identifier (node id and
+ * local port number) of the port which requested the connection.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * The number of connections that can (subsequently) be accepted on epd is only
+ * limited by system resources (memory).
+ *
+ * The flags argument is formed by OR'ing together zero or more of the
+ * following values.
+ * SCIF_ACCEPT_SYNC - block until a connection request is presented. If
+ *			SCIF_ACCEPT_SYNC is not in flags, and no pending
+ *			connections are present on the queue, scif_accept()
+ *			fails with an EAGAIN error
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when there is a connection request. In kernel mode, the scif_poll()
+ * function may be used for this purpose. A readable event will be delivered
+ * when a connection is requested.
+ *
+ * Return:
+ * Upon successful completion, scif_accept() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ *	negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EAGAIN
+ * SCIF_ACCEPT_SYNC is not set and no connections are present to be accepted,
+ *	or
+ * SCIF_ACCEPT_SYNC is not set and remote node failed to complete its
+ *	connection request
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * EINTR
+ * Interrupted function
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * epd is not a listening endpoint, or
+ * flags is invalid, or
+ * peer is NULL, or
+ * newepd is NULL
+ * ENODEV
+ * The requesting node is lost, or it existed but is not currently in the
+ *	network since it may have crashed
+ * ENOMEM
+ * Not enough space
+ * ENOENT
+ * Secondary part of epd registration failed
+*/
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer, scif_epd_t
+		*newepd, int flags);
+
+/**
+ * scif_close() - Close an endpoint
+ * @epd:	endpoint descriptor
+ *
+ * scif_close() closes an endpoint and performs necessary teardown of
+ * facilities associated with that endpoint.
+ *
+ * If epd is a listening endpoint then it will no longer accept connection
+ * requests on the port to which it is bound. Any pending connection requests
+ * are rejected.
+ *
+ * If epd is a connected endpoint, then its peer endpoint is also closed. RMAs
+ * which are in-process through epd or its peer endpoint will complete before
+ * scif_close() returns. Registered windows of the local and peer endpoints are
+ * released as if scif_unregister() was called against each window.
+ *
+ * Closing a SCIF endpoint does not affect local registered memory mapped by
+ * a SCIF endpoint on a remote node. The local memory remains mapped by the peer
+ * SCIF endpoint until explicitly removed by the peer calling munmap(..).
+ *
+ * If the peer endpoint's receive queue is not empty at the time that epd is
+ * closed, then the peer endpoint can be passed as the endpoint parameter to
+ * scif_recv() until the receive queue is empty.
+ *
+ * epd is freed and may no longer be accessed.
+ *
+ * Return:
+ * Upon successful completion, scif_close() returns 0; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * EINVAL
+ * epd is not a valid endpoint descriptor
+ */
+int scif_close(scif_epd_t epd);
+
+/**
+ * scif_send() - Send a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	blocking mode flags
+ *
+ * scif_send() sends data to the peer of endpoint epd. Up to len bytes of data
+ * are copied from memory starting at address msg. On successful execution the
+ * return value of scif_send() is the number of bytes that were sent, and is
+ * zero if no bytes were sent because len was zero. scif_send() may be called
+ * only when the endpoint is in a connected state.
+ *
+ * If a scif_send() call is non-blocking, then it sends only those bytes which
+ * can be sent without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_send() call is blocking, then it normally returns after sending
+ * all len bytes. If a blocking call is interrupted or the connection is
+ * reset, the call is considered successful if some bytes were sent or len is
+ * zero, otherwise the call is considered unsuccessful.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when the send queue is not full. In kernel mode, the scif_poll() function
+ * may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_send() will block until the entire message is sent if SCIF_SEND_BLOCK
+ * is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_send() returns the number of bytes sent;
+ * otherwise in user mode -1 is returned and errno is set to indicate the
+ * error; in kernel mode the negative of one of the following errors is
+ * returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EFAULT
+ * An invalid address was specified for a parameter
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * flags is invalid, or
+ * len is negative
+ * ENODEV
+ * The remote node is lost, or it existed but is not currently in the
+ *	network since it may have crashed
+ * ENOMEM
+ * Not enough space
+ * ENOTCONN
+ * The endpoint is not connected
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_recv() - Receive a message
+ * @epd:	endpoint descriptor
+ * @msg:	message buffer address
+ * @len:	message buffer length
+ * @flags:	blocking mode flags
+ *
+ * scif_recv() receives data from the peer of endpoint epd. Up to len bytes of
+ * data are copied to memory starting at address msg. On successful execution
+ * the return value of scif_recv() is the number of bytes that were received,
+ * and is zero if no bytes were received because len was zero. scif_recv() may
+ * be called only when the endpoint is in a connected state.
+ *
+ * If a scif_recv() call is non-blocking, then it receives only those bytes
+ * which can be received without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_recv() call is blocking, then it normally returns after receiving
+ * all len bytes. If the blocking call was interrupted due to a disconnection,
+ * subsequent calls to scif_recv() will copy all bytes received up to the point
+ * of disconnection.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when data is available to be received. In kernel mode, the scif_poll()
+ * function may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * scif_recv() will block until the entire message is received if
+ * SCIF_RECV_BLOCK is passed as the flags argument.
+ *
+ * Return:
+ * Upon successful completion, scif_recv() returns the number of bytes
+ * received; otherwise in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ * Errors:
+ * EAGAIN
+ * The destination node is returning from a low power state
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EFAULT
+ * An invalid address was specified for a parameter
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * flags is invalid, or
+ * len is negative
+ * ENODEV
+ * The remote node is lost, or it existed but is not currently in the
+ *	network since it may have crashed
+ * ENOMEM
+ * Not enough space
+ * ENOTCONN
+ * The endpoint is not connected
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_register() - Mark a memory region for remote access.
+ * @epd:		endpoint descriptor
+ * @addr:		starting virtual address
+ * @len:		length of range
+ * @offset:		offset of window
+ * @prot_flags:		read/write protection flags
+ * @map_flags:		mapping flags
+ *
+ * The scif_register() function opens a window, a range of whole pages of the
+ * registered address space of the endpoint epd, starting at offset po and
+ * continuing for len bytes. The value of po, further described below, is a
+ * function of the parameters offset and len, and the value of map_flags. Each
+ * page of the window represents the physical memory page which backs the
+ * corresponding page of the range of virtual address pages starting at addr
+ * and continuing for len bytes. addr and len are constrained to be multiples
+ * of the page size. A successful scif_register() call returns po.
+ *
+ * When SCIF_MAP_FIXED is set in the map_flags argument, po will be offset
+ * exactly, and offset is constrained to be a multiple of the page size. The
+ * mapping established by scif_register() will not replace any existing
+ * registration; an error is returned if any page within the range [offset,
+ * offset + len - 1] intersects an existing window.
+ *
+ * When SCIF_MAP_FIXED is not set, the implementation uses offset in an
+ * implementation-defined manner to arrive at po. The po value so chosen will
+ * be an area of the registered address space that the implementation deems
+ * suitable for a mapping of len bytes. An offset value of 0 is interpreted as
+ * granting the implementation complete freedom in selecting po, subject to
+ * constraints described below. A non-zero value of offset is taken to be a
+ * suggestion of an offset near which the mapping should be placed. When the
+ * implementation selects a value for po, it does not replace any extant
+ * window. In all cases, po will be a multiple of the page size.
+ *
+ * The physical pages which are so represented by a window are available for
+ * access in calls to mmap(), scif_readfrom(), scif_writeto(),
+ * scif_vreadfrom(), and scif_vwriteto(). While a window is registered, the
+ * physical pages represented by the window will not be reused by the memory
+ * subsystem for any other purpose. Note that the same physical page may be
+ * represented by multiple windows.
+ *
+ * Subsequent operations which change the memory pages to which virtual
+ * addresses are mapped (such as mmap(), munmap()) have no effect on
+ * existing windows.
+ *
+ * If the process will fork(), it is recommended that the registered
+ * virtual address range be marked with MADV_DONTFORK. Doing so will prevent
+ * problems due to copy-on-write semantics.
+ *
+ * The prot_flags argument is formed by OR'ing together one or more of the
+ * following values.
+ * SCIF_PROT_READ - allow read operations from the window
+ * SCIF_PROT_WRITE - allow write operations to the window
+ *
+ * The map_flags argument can be set to SCIF_MAP_FIXED, in which case offset is
+ * used exactly as the registered offset po.
+ *
+ * Return:
+ * Upon successful completion, scif_register() returns the offset at which the
+ * mapping was placed (po); otherwise in user mode SCIF_REGISTER_FAILED (that
+ * is (off_t)-1) is returned and errno is set to indicate the error; in
+ * kernel mode the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EADDRINUSE
+ * SCIF_MAP_FIXED is set in map_flags, and pages in the range [offset,
+ *	offset + len - 1] are already registered
+ * EAGAIN
+ * The mapping could not be performed due to lack of resources
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EFAULT
+ * Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * map_flags is invalid, or
+ * prot_flags is invalid, or
+ * SCIF_MAP_FIXED is set in map_flags, and offset is not a multiple of
+ *	the page size, or
+ * addr is not a multiple of the page size, or
+ * len is not a multiple of the page size, or is 0, or
+ * offset is negative
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOMEM
+ * Not enough space
+ * ENOTCONN
+ * The endpoint is not connected
+ */
+off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
+		    int prot_flags, int map_flags);
+
+/**
+ * scif_unregister() - Close previously registered windows
+ * @epd:	endpoint descriptor
+ * @offset:	start of range to unregister
+ * @len:	length of range to unregister
+ *
+ * The scif_unregister() function closes those previously registered windows
+ * which are entirely within the range [offset, offset + len - 1]. It is an
+ * error to specify a range which intersects only a subrange of a window.
+ *
+ * On a successful return, pages within the window may no longer be specified
+ * in calls to mmap(), scif_readfrom(), scif_writeto(), scif_vreadfrom(),
+ * scif_vwriteto(), scif_get_pages(), and scif_fence_signal(). The window,
+ * however, continues to exist until all previous references against it are
+ * removed. A window is referenced if there is a mapping to it created by
+ * mmap(), or if scif_get_pages() was called against the window
+ * (and the pages have not been returned via scif_put_pages()). A window is
+ * also referenced while an RMA, in which some range of the window is a source
+ * or destination, is in progress. Finally a window is referenced while some
+ * offset in that window was specified to scif_fence_signal(), and the RMAs
+ * marked by that call to scif_fence_signal() have not completed. While a
+ * window is in this state, its registered address space pages are not
+ * available for use in a new registered window.
+ *
+ * When all such references to the window have been removed, its references to
+ * all the physical pages which it represents are removed. Similarly, the
+ * registered address space pages of the window become available for
+ * registration in a new window.
+ *
+ * Return:
+ * Upon successful completion, scif_unregister() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned. In the event of an
+ * error, no windows are unregistered.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * The range [offset, offset + len - 1] intersects a subrange of a window, or
+ * offset is negative
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * Offsets in the range [offset, offset + len - 1] are invalid for the
+ *	registered address space of epd
+ */
+int scif_unregister(scif_epd_t epd, off_t offset, size_t len);
+
+/**
+ * scif_readfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space to
+ *		which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_readfrom() copies len bytes from the remote registered address space of
+ * the peer of endpoint epd, starting at the offset roffset to the local
+ * registered address space of epd, starting at the offset loffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If
+ * rma_flags includes SCIF_RMA_SYNC, then scif_readfrom() returns after the
+ * transfer is complete. Otherwise, the transfer may be asynchronous. The order
+ * in which any two asynchronous RMA operations complete is non-deterministic.
+ * The synchronization functions, scif_fence_mark()/scif_fence_wait() and
+ * scif_fence_signal(), can be used to synchronize to the completion of
+ * asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if loffset and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_readfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES
+ * Attempt to write to a read-only range
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * rma_flags is invalid
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * The range [loffset, loffset + len - 1] is invalid for the registered
+ *	address space of epd, or
+ * The range [roffset, roffset + len - 1] is invalid for the registered
+ *	address space of the peer of epd, or
+ * loffset or roffset is negative
+ */
+int scif_readfrom(scif_epd_t epd, off_t loffset, size_t len, off_t
+		  roffset, int rma_flags);
+
+/**
+ * scif_writeto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @loffset:	offset in local registered address space
+ *		from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_writeto() copies len bytes from the local registered address space of
+ * epd, starting at the offset loffset to the remote registered address space
+ * of the peer of endpoint epd, starting at the offset roffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If
+ * rma_flags includes SCIF_RMA_SYNC, then scif_writeto() returns after the
+ * transfer is complete. Otherwise, the transfer may be asynchronous. The order
+ * in which any two asynchronous RMA operations complete is non-deterministic.
+ * The synchronization functions, scif_fence_mark()/scif_fence_wait() and
+ * scif_fence_signal(), can be used to synchronize to the completion of
+ * asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if loffset and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *			engine.
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_writeto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES
+ * Attempt to write to a read-only range
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * rma_flags is invalid
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * The range [loffset, loffset + len - 1] is invalid for the registered
+ *	address space of epd, or
+ * The range [roffset, roffset + len - 1] is invalid for the registered
+ *	address space of the peer of epd, or
+ * loffset or roffset is negative
+ */
+int scif_writeto(scif_epd_t epd, off_t loffset, size_t len, off_t
+		 roffset, int rma_flags);
+
+/**
+ * scif_vreadfrom() - Copy from a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address to which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space
+ *		from which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vreadfrom() copies len bytes from the remote registered address
+ * space of the peer of endpoint epd, starting at the offset roffset, to local
+ * memory, starting at addr.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may
+ * intersect multiple registered windows, but only if those windows are
+ * contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If
+ * rma_flags includes SCIF_RMA_SYNC, then scif_vreadfrom() returns after the
+ * transfer is complete. Otherwise, the transfer may be asynchronous. The order
+ * in which any two asynchronous RMA operations complete is non-deterministic.
+ * The synchronization functions, scif_fence_mark()/scif_fence_wait() and
+ * scif_fence_signal(), can be used to synchronize to the completion of
+ * asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if addr and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - enable registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *	the source range becomes visible on the destination node
+ *	after all other transferred data in the source range has
+ *	become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vreadfrom() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES
+ * Attempt to write to a read-only range
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EFAULT
+ * Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * rma_flags is invalid
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ *	registered address space of the peer of epd
+ */
+int scif_vreadfrom(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		   int rma_flags);
+
+/**
+ * scif_vwriteto() - Copy to a remote address space
+ * @epd:	endpoint descriptor
+ * @addr:	address from which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to
+ *		which to copy
+ * @rma_flags:	transfer mode flags
+ *
+ * scif_vwriteto() copies len bytes from the local memory, starting at addr, to
+ * the remote registered address space of the peer of endpoint epd, starting at
+ * the offset roffset.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may intersect
+ * multiple registered windows, but only if those windows are contiguous in the
+ * registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If
+ * rma_flags includes SCIF_RMA_SYNC, then scif_vwriteto() returns after the
+ * transfer is complete. Otherwise, the transfer may be asynchronous. The order
+ * in which any two asynchronous RMA operations complete is non-deterministic.
+ * The synchronization functions, scif_fence_mark()/scif_fence_wait() and
+ * scif_fence_signal(), can be used to synchronize to the completion of
+ * asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if addr and roffset are not separated by a multiple of
+ * 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values.
+ * SCIF_RMA_USECPU - perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ * SCIF_RMA_USECACHE - allow registration caching
+ * SCIF_RMA_SYNC - perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ * SCIF_RMA_ORDERED - ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ * Return:
+ * Upon successful completion, scif_vwriteto() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EACCES
+ * Attempt to write to a read-only range
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EFAULT
+ * Addresses in the range [addr, addr + len - 1] are invalid
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * rma_flags is invalid
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ *	registered address space of the peer of epd
+ */
+int scif_vwriteto(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		  int rma_flags);
+
+/**
+ * scif_fence_mark() - Mark previously issued RMAs
+ * @epd:	endpoint descriptor
+ * @flags:	control flags
+ * @mark:	marked value returned as output.
+ *
+ * scif_fence_mark() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd. The RMAs are
+ * marked with a value returned at mark. The application may subsequently call
+ * scif_fence_wait(), passing the value returned at mark, to await completion
+ * of all RMAs so marked.
+ *
+ * The flags argument has exactly one of the following values.
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ *
+ * Return:
+ * Upon successful completion, scif_fence_mark() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * flags is invalid, or
+ * epd is not a valid endpoint descriptor
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENOMEM
+ * Insufficient kernel memory was available
+ */
+int scif_fence_mark(scif_epd_t epd, int flags, int *mark);
+
+/**
+ * scif_fence_wait() - Wait for completion of marked RMAs
+ * @epd:	endpoint descriptor
+ * @mark:	mark request
+ *
+ * scif_fence_wait() returns after all RMAs marked with mark have completed.
+ * The value passed in mark must have been obtained in a previous call to
+ * scif_fence_mark().
+ *
+ * Return:
+ * Upon successful completion, scif_fence_wait() returns 0; otherwise in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * epd is not a valid endpoint descriptor
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENOMEM
+ * Insufficient kernel memory was available
+ */
+int scif_fence_wait(scif_epd_t epd, int mark);
+
+/**
+ * scif_fence_signal() - Request a memory update on completion of RMAs
+ * @epd:	endpoint descriptor
+ * @loff:	local offset
+ * @lval:	local value to write to loff
+ * @roff:	remote offset
+ * @rval:	remote value to write to roff
+ * @flags:	flags
+ *
+ * scif_fence_signal() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or marking the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd.
+ *
+ * If flags includes SCIF_SIGNAL_LOCAL, then on completion of the RMAs in the
+ * marked set, lval is written to memory at the address corresponding to offset
+ * loff in the local registered address space of epd. loff must be within a
+ * registered window. If flags includes SCIF_SIGNAL_REMOTE, then on completion
+ * of the RMAs in the marked set, rval is written to memory at the address
+ * corresponding to offset roff in the remote registered address space of epd.
+ * roff must be within a remote registered window of the peer of epd. Note
+ * that any specified offset must be DWORD (4 byte / 32 bit) aligned.
+ *
+ * The flags argument is formed by OR'ing together the following.
+ * Exactly one of the following values.
+ * SCIF_FENCE_INIT_SELF - RMA operations initiated through endpoint
+ *	epd are marked
+ * SCIF_FENCE_INIT_PEER - RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ * One or more of the following values.
+ * SCIF_SIGNAL_LOCAL - On completion of the marked set of RMAs, write lval to
+ *	memory at the address corresponding to offset loff in the local
+ *	registered address space of epd.
+ * SCIF_SIGNAL_REMOTE - On completion of the marked set of RMAs, write rval to
+ *	memory at the address corresponding to offset roff in the remote
+ *	registered address space of epd.
+ *
+ * Return:
+ * Upon successful completion, scif_fence_signal() returns 0; otherwise in
+ * user mode -1 is returned and errno is set to indicate the error; in kernel
+ * mode the negative of one of the following errors is returned.
+ * Errors:
+ * EBADF, ENOTTY
+ * epd is not a valid endpoint descriptor
+ * ECONNRESET
+ * Connection reset by peer
+ * EINVAL
+ * epd is not a valid endpoint descriptor, or
+ * flags is invalid, or
+ * loff or roff are not DWORD aligned
+ * ENODEV
+ * The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ * ENOTCONN
+ * The endpoint is not connected
+ * ENXIO
+ * loff is invalid for the registered address space of epd, or
+ * roff is invalid for the registered address space of the peer of epd
+ */
+int scif_fence_signal(scif_epd_t epd, off_t loff, uint64_t lval, off_t roff,
+		      uint64_t rval, int flags);
+
+/**
+ * scif_get_node_ids() - Return information about online nodes
+ * @nodes:	array in which to return online node IDs
+ * @len:	number of entries in the nodes array
+ * @self:	address to place the node ID of the local node
+ *
+ * scif_get_node_ids() fills in the nodes array with up to len node IDs of the
+ * nodes in the SCIF network. If there is not enough space in nodes, as
+ * indicated by the len parameter, only len node IDs are returned in nodes. The
+ * return value of scif_get_node_ids() is the total number of nodes currently in
+ * the SCIF network. By checking the return value against the len parameter,
+ * the user may determine if enough space for nodes was allocated.
+ *
+ * The node ID of the local node is returned at self.
+ *
+ * Return:
+ * Upon successful completion, scif_get_node_ids() returns the actual number of
+ * online nodes in the SCIF network including 'self'; otherwise in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode no
+ * errors are returned.
+ *
+ * Errors:
+ * EFAULT
+ * Bad address
+ */
+int scif_get_node_ids(uint16_t *nodes, int len, uint16_t *self);
+
+#endif /* __SCIF_H__ */
diff --git a/include/uapi/linux/scif_ioctl.h b/include/uapi/linux/scif_ioctl.h
new file mode 100644
index 0000000..b8a2b2c
--- /dev/null
+++ b/include/uapi/linux/scif_ioctl.h
@@ -0,0 +1,213 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+/*
+ * -----------------------------------------
+ * SCIF IOCTL interface information
+ * -----------------------------------------
+ */
+#ifndef SCIF_IOCTL_H
+#define SCIF_IOCTL_H
+
+/**
+ * struct scif_port_id - SCIF port information
+ * @node:	node on which port resides
+ * @port:	local port number
+ */
+struct scif_port_id {
+	uint16_t node;
+	uint16_t port;
+};
+
+/**
+ * struct scifioctl_connect - used for SCIF_CONNECT IOCTL
+ * @self:	used to read back the assigned port_id
+ * @peer:	destination node and port to connect to
+ */
+struct scifioctl_connect {
+	struct scif_port_id	self;
+	struct scif_port_id	peer;
+};
+
+/**
+ * struct scifioctl_accept - used for SCIF_ACCEPTREQ IOCTL
+ * @flags:	flags
+ * @peer:	global id of peer endpoint
+ * @endpt:	new connected endpoint descriptor
+ */
+struct scifioctl_accept {
+	int			flags;
+	struct scif_port_id	peer;
+	void			*endpt;
+};
+
+/**
+ * struct scifioctl_msg - used for SCIF_SEND/SCIF_RECV IOCTL
+ * @msg:	message buffer address
+ * @len:	message length
+ * @flags:	flags
+ * @out_len:	number of bytes sent/received
+ */
+struct scifioctl_msg {
+	void	__user *msg;
+	int	len;
+	int	flags;
+	int	out_len;
+};
+
+/**
+ * struct scifioctl_reg - used for SCIF_REG IOCTL
+ * @addr:	starting virtual address
+ * @len:	length of range
+ * @offset:	offset of window
+ * @prot:	read/write protection
+ * @flags:	flags
+ * @out_offset:	offset returned
+ */
+struct scifioctl_reg {
+	void		*addr;
+	uint64_t	len;
+	off_t		offset;
+	int		prot;
+	int		flags;
+	off_t		out_offset;
+};
+
+/**
+ * struct scifioctl_unreg - used for SCIF_UNREG IOCTL
+ * @offset:	start of range to unregister
+ * @len:	length of range to unregister
+ */
+struct scifioctl_unreg {
+	off_t		offset;
+	uint64_t	len;
+};
+
+/**
+ * struct scifioctl_copy - used for SCIF DMA copy IOCTLs
+ *
+ * @loffset:	offset in local registered address space to/from
+ *		which to copy
+ * @len:	length of range to copy
+ * @roffset:	offset in remote registered address space to/from
+ *		which to copy
+ * @addr:	user virtual address to/from which to copy
+ * @flags:	flags
+ *
+ * This structure is used for SCIF_READFROM, SCIF_WRITETO, SCIF_VREADFROM
+ * and SCIF_VWRITETO IOCTLs.
+ */
+struct scifioctl_copy {
+	off_t		loffset;
+	uint64_t	len;
+	off_t		roffset;
+	uint8_t		__user *addr;
+	int		flags;
+};
+
+/**
+ * struct scifioctl_fence_mark - used for SCIF_FENCE_MARK IOCTL
+ * @flags:	flags
+ * @mark:	fence handle returned by reference
+ */
+struct scifioctl_fence_mark {
+	int	flags;
+	int	__user *mark;
+};
+
+/**
+ * struct scifioctl_fence_signal - used for SCIF_FENCE_SIGNAL IOCTL
+ * @loff:	local offset
+ * @lval:	local value to write to loffset
+ * @roff:	remote offset
+ * @rval:	remote value to write to roffset
+ * @flags:	flags
+ */
+struct scifioctl_fence_signal {
+	off_t		loff;
+	uint64_t	lval;
+	off_t		roff;
+	uint64_t	rval;
+	int		flags;
+};
+
+/**
+ * struct scifioctl_node_ids - used for SCIF_GET_NODEIDS IOCTL
+ * @nodes:	pointer to an array of node_ids
+ * @len:	length of array
+ * @self:	ID of the current node
+ */
+struct scifioctl_node_ids {
+	uint16_t __user		*nodes;
+	int			len;
+	uint16_t __user		*self;
+};
+
+#define SCIF_BIND		_IOWR('s', 1, int *)
+#define SCIF_LISTEN		_IOW('s', 2, int)
+#define SCIF_CONNECT		_IOWR('s', 3, struct scifioctl_connect *)
+#define SCIF_ACCEPTREQ		_IOWR('s', 4, struct scifioctl_accept *)
+#define SCIF_ACCEPTREG		_IOWR('s', 5, void *)
+#define SCIF_SEND		_IOWR('s', 6, struct scifioctl_msg *)
+#define SCIF_RECV		_IOWR('s', 7, struct scifioctl_msg *)
+#define SCIF_REG		_IOWR('s', 8, struct scifioctl_reg *)
+#define SCIF_UNREG		_IOWR('s', 9, struct scifioctl_unreg *)
+#define SCIF_READFROM		_IOWR('s', 10, struct scifioctl_copy *)
+#define SCIF_WRITETO		_IOWR('s', 11, struct scifioctl_copy *)
+#define SCIF_VREADFROM		_IOWR('s', 12, struct scifioctl_copy *)
+#define SCIF_VWRITETO		_IOWR('s', 13, struct scifioctl_copy *)
+#define SCIF_GET_NODEIDS	_IOWR('s', 14, struct scifioctl_node_ids *)
+#define SCIF_FENCE_MARK		_IOWR('s', 15, struct scifioctl_fence_mark *)
+#define SCIF_FENCE_WAIT		_IOWR('s', 16, int)
+#define SCIF_FENCE_SIGNAL	_IOWR('s', 17, struct scifioctl_fence_signal *)
+
+#endif /* SCIF_IOCTL_H */
-- 
1.8.2.1
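
A note on how the SCIF_* command numbers above are encoded. The sketch
below is a userspace illustration only, assuming the standard _IOWR()
and _IOC_*() macros from <linux/ioctl.h>. struct demo_msg is a
hypothetical mirror of struct scifioctl_msg (without the __user
annotation), and the DEMO_* macros and demo_ioc_size() are not part of
the patch. It compares the size field produced when the macro is given
a pointer type, as in the list above, with the size produced when it is
given the structure type itself.

```c
#include <assert.h>
#include <linux/ioctl.h>

/* Hypothetical mirror of struct scifioctl_msg, for illustration only. */
struct demo_msg {
	void	*msg;
	int	len;
	int	flags;
	int	out_len;
};

/* Same shape as SCIF_SEND in the patch: the type argument is a pointer. */
#define DEMO_SEND_AS_POSTED	_IOWR('s', 6, struct demo_msg *)
/* Variant for comparison: the type argument is the structure itself. */
#define DEMO_SEND_BY_VALUE	_IOWR('s', 6, struct demo_msg)

/* Size, in bytes, that a command number advertises for its argument. */
static unsigned int demo_ioc_size(unsigned int cmd)
{
	assert(_IOC_TYPE(cmd) == 's');	/* only 's' commands are expected here */
	return _IOC_SIZE(cmd);
}
```

With these definitions, demo_ioc_size(DEMO_SEND_AS_POSTED) equals
sizeof(struct demo_msg *), whereas demo_ioc_size(DEMO_SEND_BY_VALUE)
equals sizeof(struct demo_msg). The type, number and direction fields
are the same in both cases.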


^ permalink raw reply related	[relevance 22%]

* [PATCH char-misc-next v2 02/13] misc: mic: SCIF ring buffer infrastructure
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
  2015-01-12 23:09 22% ` [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
@ 2015-01-12 23:09 58% ` Sudeep Dutt
  2015-01-12 23:09 61% ` [PATCH char-misc-next v2 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF ring buffer is a single producer, single consumer byte stream
ring buffer optimized for avoiding reads across the PCIe bus while
adding the required barriers and hardware workarounds for the MIC
Coprocessor. The ring buffer is used to implement a receive queue for
SCIF driver messaging between two nodes and for byte stream messaging
between SCIF endpoints. The existing in-kernel ring buffer was not
reused since it has not been designed for our use across the PCIe bus
where each node runs an independent OS. Each SCIF node has a receive
queue for every other SCIF node, and each connected endpoint has a
receive queue for messages from its peer. This pair of receive
queues is referred to as a SCIF queue pair.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_rb.h | 100 ++++++++++++++++
 drivers/misc/mic/scif/scif_rb.c | 248 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 348 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.c
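
The write/commit and read/update protocol described above can be
sketched in userspace as follows. This is a minimal model under stated
assumptions, not the driver code: C11 acquire/release atomics stand in
for the kernel barriers, plain memcpy() stands in for memcpy_toio() and
memcpy_fromio(), the X100 double-write workaround is omitted, and every
demo_* name is hypothetical. The producer and the consumer are assumed
to each keep a private struct demo_rb that points at the same storage
and the same two shared offsets.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of the scheme; not the driver's code. */
struct demo_rb {
	uint8_t *base;			/* message storage */
	_Atomic uint32_t *read_ptr;	/* shared read offset */
	_Atomic uint32_t *write_ptr;	/* shared write offset */
	uint32_t size;			/* bytes, always a power of two */
	uint32_t cur_read;		/* locally cached read offset */
	uint32_t cur_write;		/* locally cached write offset */
};

static void demo_rb_init(struct demo_rb *rb, _Atomic uint32_t *read_ptr,
			 _Atomic uint32_t *write_ptr, void *base,
			 uint8_t log2_size)
{
	assert(log2_size < 32);
	rb->base = base;
	rb->size = UINT32_C(1) << log2_size;
	rb->read_ptr = read_ptr;
	rb->write_ptr = write_ptr;
	rb->cur_read = atomic_load(read_ptr);
	rb->cur_write = atomic_load(write_ptr);
}

/* Bytes queued between the two cached offsets. */
static uint32_t demo_rb_cached_count(const struct demo_rb *rb)
{
	return (rb->cur_write - rb->cur_read) & (rb->size - 1);
}

/* Producer: free bytes; one byte stays unused to tell full from empty. */
static uint32_t demo_rb_space(struct demo_rb *rb)
{
	rb->cur_read = atomic_load_explicit(rb->read_ptr,
					    memory_order_acquire);
	return (rb->cur_read - rb->cur_write - 1) & (rb->size - 1);
}

/* Producer: copy len bytes in, splitting the copy at the wrap point. */
static int demo_rb_write(struct demo_rb *rb, const void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->cur_write;

	if (demo_rb_space(rb) < len)
		return -1;
	if (first > len)
		first = len;
	memcpy(rb->base + rb->cur_write, msg, first);
	memcpy(rb->base, (const uint8_t *)msg + first, len - first);
	/* Only the cached offset moves; the peer sees nothing until commit. */
	rb->cur_write = (rb->cur_write + len) & (rb->size - 1);
	return 0;
}

/* Producer: publish everything written so far. */
static void demo_rb_commit(struct demo_rb *rb)
{
	atomic_store_explicit(rb->write_ptr, rb->cur_write,
			      memory_order_release);
}

/* Consumer: returns len on success, 0 if fewer than len bytes are queued. */
static uint32_t demo_rb_get_next(struct demo_rb *rb, void *msg, uint32_t len)
{
	uint32_t first = rb->size - rb->cur_read;

	/* Re-read the shared write offset only when the cache runs short. */
	if (demo_rb_cached_count(rb) < len)
		rb->cur_write = atomic_load_explicit(rb->write_ptr,
						     memory_order_acquire);
	if (demo_rb_cached_count(rb) < len)
		return 0;
	if (first > len)
		first = len;
	memcpy(msg, rb->base + rb->cur_read, first);
	memcpy((uint8_t *)msg + first, rb->base, len - first);
	rb->cur_read = (rb->cur_read + len) & (rb->size - 1);
	return len;
}

/* Consumer: hand the consumed space back to the producer. */
static void demo_rb_update_read_ptr(struct demo_rb *rb)
{
	atomic_store_explicit(rb->read_ptr, rb->cur_read,
			      memory_order_release);
}
```

In this model a message stays invisible to the consumer until
demo_rb_commit() runs, and the space it occupied is not reusable by the
producer until demo_rb_update_read_ptr() runs. Each side touches the
other side's offset only when its cached copy is insufficient, which is
the property the commit message relies on to limit traffic across the
bus.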

diff --git a/drivers/misc/mic/scif/scif_rb.h b/drivers/misc/mic/scif/scif_rb.h
new file mode 100644
index 0000000..ba4bfce
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.h
@@ -0,0 +1,100 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef SCIF_RB_H
+#define SCIF_RB_H
+/*
+ * This file describes a general purpose, byte based ring buffer. Writers to the
+ * ring buffer need to synchronize using a lock. The same is true for readers,
+ * although in practice, the ring buffer has a single reader. It is lockless
+ * between producer and consumer so it can handle being used across the PCIe
+ * bus. The ring buffer ensures that there are no reads across the PCIe bus for
+ * performance reasons. Two of these are used to form a single bidirectional
+ * queue-pair across PCIe.
+ */
+/*
+ * struct scif_rb - SCIF Ring Buffer
+ *
+ * @rb_base: The base of the memory used for storing RB messages
+ * @read_ptr: Pointer to the read offset
+ * @write_ptr: Pointer to the write offset
+ * @size: Size of the memory in rb_base
+ * @current_read_offset: Cached read offset for performance
+ * @current_write_offset: Cached write offset for performance
+ */
+struct scif_rb {
+	void *rb_base;
+	uint32_t *read_ptr;
+	uint32_t *write_ptr;
+	uint32_t size;
+	uint32_t current_read_offset;
+	uint32_t current_write_offset;
+};
+
+/* methods used by both */
+void scif_rb_init(struct scif_rb *rb, uint32_t *read_ptr, uint32_t *write_ptr,
+		  void *rb_base, uint8_t size);
+/* writer only methods */
+/* write a new command, then scif_rb_commit() */
+int scif_rb_write(struct scif_rb *rb, void *msg, uint32_t size);
+/* after write(), then scif_rb_commit() */
+void scif_rb_commit(struct scif_rb *rb);
+/* query space available for writing to a RB. */
+uint32_t scif_rb_space(struct scif_rb *rb);
+
+/* reader only methods */
+/* read a new message from the ring buffer of size bytes */
+uint32_t scif_rb_get_next(struct scif_rb *rb, void *msg, uint32_t size);
+/* update the read pointer so that the space can be reused */
+void scif_rb_update_read_ptr(struct scif_rb *rb);
+/* count the number of bytes that can be read */
+uint32_t scif_rb_count(struct scif_rb *rb, uint32_t size);
+#endif
diff --git a/drivers/misc/mic/scif/scif_rb.c b/drivers/misc/mic/scif/scif_rb.c
new file mode 100644
index 0000000..28e99ea
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.c
@@ -0,0 +1,248 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/circ_buf.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+#include "scif_rb.h"
+
+#define scif_rb_ring_cnt(head, tail, size) CIRC_CNT(head, tail, size)
+#define scif_rb_ring_space(head, tail, size) CIRC_SPACE(head, tail, size)
+
+/**
+ * scif_rb_init - Initializes the ring buffer
+ * @rb: ring buffer
+ * @read_ptr: A pointer to the read offset
+ * @write_ptr: A pointer to the write offset
+ * @rb_base: A pointer to the base of the ring buffer
+ * @size: The ring buffer size as a power-of-two exponent (1 << size bytes)
+ */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size)
+{
+	rb->rb_base = rb_base;
+	rb->size = (1 << size);
+	rb->read_ptr = read_ptr;
+	rb->write_ptr = write_ptr;
+	rb->current_read_offset = *read_ptr;
+	rb->current_write_offset = *write_ptr;
+}
+
+/* Copies a message to the ring buffer -- handles the wrap around case */
+static void memcpy_torb(struct scif_rb *rb, void *header,
+			void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_toio((void __iomem __force *)header, msg, size1);
+		memcpy_toio((void __iomem __force *)rb->rb_base,
+			    msg + size1, size2);
+	} else {
+		memcpy_toio((void __iomem __force *)header, msg, size);
+	}
+}
+
+/* Copies a message from the ring buffer -- handles the wrap around case */
+static void memcpy_fromrb(struct scif_rb *rb, void *header,
+			  void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_fromio(msg, (void __iomem __force *)header, size1);
+		memcpy_fromio(msg+size1,
+			      (void __iomem __force *)rb->rb_base, size2);
+	} else {
+		memcpy_fromio(msg, (void __iomem __force *)header, size);
+	}
+}
+
+/**
+ * scif_rb_space - Query space available for writing to the RB
+ * @rb: ring buffer
+ *
+ * Return: size available for writing to RB in bytes.
+ */
+u32 scif_rb_space(struct scif_rb *rb)
+{
+	rb->current_read_offset = *rb->read_ptr;
+	/*
+	 * Update from the HW read pointer only once the peer has exposed the
+	 * new empty slot. This barrier is paired with the memory barrier in
+	 * scif_rb_update_read_ptr()
+	 */
+	mb();
+	return scif_rb_ring_space(rb->current_write_offset,
+				  rb->current_read_offset, rb->size);
+}
+
+/**
+ * scif_rb_write - Write a message to the RB
+ * @rb: ring buffer
+ * @msg: buffer to send the message.  Must be at least size bytes long
+ * @size: the size (in bytes) to be copied to the RB
+ *
+ * This API does not block if there isn't enough space in the RB.
+ * Returns: 0 on success or -ENOMEM on failure
+ */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header;
+
+	if (scif_rb_space(rb) < size)
+		return -ENOMEM;
+	header = rb->rb_base + rb->current_write_offset;
+	memcpy_torb(rb, header, msg, size);
+	/*
+	 * Wait until scif_rb_commit(). Update the local ring
+	 * buffer data, not the shared data until commit.
+	 */
+	rb->current_write_offset =
+		(rb->current_write_offset + size) & (rb->size - 1);
+	return 0;
+}
+
+/**
+ * scif_rb_commit - To submit the message to let the peer fetch it
+ * @rb: ring buffer
+ */
+void scif_rb_commit(struct scif_rb *rb)
+{
+	/*
+	 * We must ensure ordering between all the data committed
+	 * previously before we expose the new message to the peer by
+	 * updating the write_ptr. This write barrier is paired with
+	 * the read barrier in scif_rb_count(..)
+	 */
+	wmb();
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#endif
+}
+
+/**
+ * scif_rb_get - To get next message from the ring buffer
+ * @rb: ring buffer
+ * @size: Number of bytes to be read
+ *
+ * Return: NULL if no bytes to be read from the ring buffer, otherwise the
+ *	pointer to the next byte
+ */
+static void *scif_rb_get(struct scif_rb *rb, u32 size)
+{
+	void *header = NULL;
+
+	if (scif_rb_count(rb, size) >= size)
+		header = rb->rb_base + rb->current_read_offset;
+	return header;
+}
+
+/**
+ * scif_rb_get_next - Read from ring buffer.
+ * @rb: ring buffer
+ * @msg: buffer to hold the message.  Must be at least size bytes long
+ * @size: Number of bytes to be read
+ *
+ * Return: number of bytes read if available bytes are >= size, otherwise
+ * returns zero.
+ */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header = NULL;
+	int read_size = 0;
+
+	header = scif_rb_get(rb, size);
+	if (header) {
+		u32 next_cmd_offset =
+			(rb->current_read_offset + size) & (rb->size - 1);
+
+		read_size = size;
+		rb->current_read_offset = next_cmd_offset;
+		memcpy_fromrb(rb, header, msg, size);
+	}
+	return read_size;
+}
+
+/**
+ * scif_rb_update_read_ptr
+ * @rb: ring buffer
+ */
+void scif_rb_update_read_ptr(struct scif_rb *rb)
+{
+	u32 new_offset;
+
+	new_offset = rb->current_read_offset;
+	/*
+	 * We must ensure ordering between all the data committed or read
+	 * previously before we expose the empty slot to the peer by updating
+	 * the read_ptr. This barrier is paired with the memory barrier in
+	 * scif_rb_space(..)
+	 */
+	mb();
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si Bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#endif
+}
+
+/**
+ * scif_rb_count
+ * @rb: ring buffer
+ * @size: Number of bytes expected to be read
+ *
+ * Return: number of bytes that can be read from the RB
+ */
+u32 scif_rb_count(struct scif_rb *rb, u32 size)
+{
+	if (scif_rb_ring_cnt(rb->current_write_offset,
+			     rb->current_read_offset,
+			     rb->size) < size) {
+		rb->current_write_offset = *rb->write_ptr;
+		/*
+		 * Update from the HW write pointer if empty only once the peer
+		 * has exposed the new message. This read barrier is paired
+		 * with the write barrier in scif_rb_commit(..)
+		 */
+		smp_rmb();
+	}
+	return scif_rb_ring_cnt(rb->current_write_offset,
+				rb->current_read_offset,
+				rb->size);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 58%]

* [PATCH char-misc-next v2 04/13] misc: mic: SCIF Peer Bus
  2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
                   ` (2 preceding siblings ...)
  2015-01-12 23:09 61% ` [PATCH char-misc-next v2 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
@ 2015-01-12 23:09 78% ` Sudeep Dutt
  2015-01-12 23:09 96% ` [PATCH char-misc-next v2 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-12 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF driver uses the SCIF peer bus internally to register and
unregister SCIF peer devices, signifying the addition and removal of
peer nodes from the SCIF network. This simplifies remote node handling
within SCIF and will also be used to support device probe/remove for
SCIF client drivers (e.g. netdev over SCIF).

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_peer_bus.h |  65 ++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c | 124 ++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
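
To illustrate the match rule in scif_peer_dev_match() as posted, the
sketch below repeats the same strncmp() comparison on plain strings in
userspace. demo_peer_dev_name() and demo_peer_match() are hypothetical
stand-ins, not functions from the patch; the name format is the one
passed to dev_set_name() in scif_peer_register_device(), and the driver
names used in the note that follows are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the name given to a peer device for dnode. */
static int demo_peer_dev_name(char *buf, size_t len, unsigned int dnode)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "scif_peer-dev%u", dnode);
}

/* Hypothetical stand-in for the bus match callback: first 4 bytes only. */
static int demo_peer_match(const char *dev_name, const char *drv_name)
{
	return !strncmp(dev_name, drv_name, 4);
}
```

Because only the first four characters are compared, a driver named
"scif_peer_client" or simply "scif" matches every peer device under this
rule, while a driver named "virtio" does not. The node number at the end
of the device name plays no part in the match.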

diff --git a/drivers/misc/mic/scif/scif_peer_bus.h b/drivers/misc/mic/scif/scif_peer_bus.h
new file mode 100644
index 0000000..33f0dbb
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.h
@@ -0,0 +1,65 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef _SCIF_PEER_BUS_H_
+#define _SCIF_PEER_BUS_H_
+
+#include <linux/device.h>
+#include <linux/mic_common.h>
+
+/*
+ * Peer devices show up as PCIe devices for the mgmt node but not the cards.
+ * The mgmt node discovers all the cards on the PCIe bus and informs the other
+ * cards about their peers. Upon notification of a peer a node adds a peer
+ * device to the peer bus to maintain symmetry in the way devices are
+ * discovered across all nodes in the SCIF network.
+ */
+/**
+ * scif_peer_dev - representation of a peer SCIF device
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ */
+struct scif_peer_dev {
+	struct device dev;
+	u8 dnode;
+};
+
+/**
+ * scif_peer_driver - operations for a scif_peer I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_peer_driver {
+	struct device_driver driver;
+	const struct scif_peer_dev_id *id_table;
+
+	int (*probe)(struct scif_peer_dev *dev);
+	void (*remove)(struct scif_peer_dev *dev);
+};
+
+struct scif_dev;
+
+int scif_peer_register_driver(struct scif_peer_driver *driver);
+void scif_peer_unregister_driver(struct scif_peer_driver *driver);
+
+struct scif_peer_dev *scif_peer_register_device(struct scif_dev *sdev);
+void scif_peer_unregister_device(struct scif_peer_dev *sdev);
+
+int scif_peer_bus_init(void);
+void scif_peer_bus_exit(void);
+#endif /* _SCIF_PEER_BUS_H_ */
diff --git a/drivers/misc/mic/scif/scif_peer_bus.c b/drivers/misc/mic/scif/scif_peer_bus.c
new file mode 100644
index 0000000..589ae9a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#include "scif_main.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+
+static inline struct scif_peer_dev *
+dev_to_scif_peer(struct device *dev)
+{
+	return container_of(dev, struct scif_peer_dev, dev);
+}
+
+static inline struct scif_peer_driver *
+drv_to_scif_peer(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_peer_driver, driver);
+}
+
+static int scif_peer_dev_match(struct device *dv, struct device_driver *dr)
+{
+	return !strncmp(dev_name(dv), dr->name, 4);
+}
+
+static int scif_peer_dev_probe(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_peer_dev_remove(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_peer_bus = {
+	.name  = "scif_peer_bus",
+	.match = scif_peer_dev_match,
+	.probe = scif_peer_dev_probe,
+	.remove = scif_peer_dev_remove,
+};
+
+int scif_peer_register_driver(struct scif_peer_driver *driver)
+{
+	driver->driver.bus = &scif_peer_bus;
+	return driver_register(&driver->driver);
+}
+
+void scif_peer_unregister_driver(struct scif_peer_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+
+static void scif_peer_release_dev(struct device *d)
+{
+	struct scif_peer_dev *sdev = dev_to_scif_peer(d);
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	scif_cleanup_scifdev(scifdev);
+	kfree(sdev);
+}
+
+struct scif_peer_dev *
+scif_peer_register_device(struct scif_dev *scifdev)
+{
+	int ret;
+	struct scif_peer_dev *spdev;
+
+	spdev = kzalloc(sizeof(*spdev), GFP_KERNEL);
+	if (!spdev)
+		return ERR_PTR(-ENOMEM);
+
+	spdev->dev.parent = scifdev->sdev->dev.parent;
+	spdev->dev.release = scif_peer_release_dev;
+	spdev->dnode = scifdev->node;
+	spdev->dev.bus = &scif_peer_bus;
+
+	dev_set_name(&spdev->dev, "scif_peer-dev%u", spdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&spdev->dev);
+	if (ret)
+		goto free_spdev;
+	return spdev;
+free_spdev:
+	kfree(spdev);
+	return ERR_PTR(ret);
+}
+
+void scif_peer_unregister_device(struct scif_peer_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+
+int scif_peer_bus_init(void)
+{
+	return bus_register(&scif_peer_bus);
+}
+
+void scif_peer_bus_exit(void)
+{
+	bus_unregister(&scif_peer_bus);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 78%]

* Re: [PATCH char-misc-next 02/13] misc: mic: SCIF ring buffer infrastructure
  @ 2015-01-10  0:28 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-10  0:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit

On Fri, 2015-01-09 at 15:05 -0800, Greg Kroah-Hartman wrote:
> On Wed, Dec 10, 2014 at 11:47:42AM -0800, Sudeep Dutt wrote:
> > SCIF ring buffer is a single producer, single consumer byte stream ring
> > buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
> > used to implement a receive queue for SCIF driver messaging between nodes and
> > for byte stream messaging between SCIF endpoints. Each SCIF node has a
> > receive queue for every other SCIF node, and each connected endpoint has a
> > receive queue for messages from its peer. This pair of receive queues is
> > referred to as a SCIF queue pair.
> 
> And the reason you aren't using the built-in kernel ring buffer code is
> why?
> 

This simple byte stream ring buffer is in our performance data path for
small messages and is optimized to avoid reads across the PCIe bus while
adding the required barriers and hardware workarounds for the MIC
Coprocessor. I will add some more documentation here in the v2 which I
will post early next week. 

We did not find other ring buffers in the kernel which were tailored for
our use case across PCIe. I am guessing (please correct me) that you are
referring to the ring buffer in include/linux/ring_buffer.h. It does not
seem to be designed for being used between two independent OSes across
the PCIe bus. Please let me know your thoughts.

Thanks,
Sudeep Dutt



^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next 04/13] misc: mic: SCIF Peer Bus
  @ 2015-01-09 23:58 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-09 23:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Fri, 2015-01-09 at 15:07 -0800, Greg Kroah-Hartman wrote:
> On Wed, Dec 10, 2014 at 11:47:44AM -0800, Sudeep Dutt wrote:
> > The SCIF peer bus is used to register and unregister SCIF peer devices
> > internally by the SCIF driver to signify the addition and removal of
> > peer nodes respectively from the SCIF network. This simplifies remote node
> > handling within SCIF and will also be used to support device probe/remove
> > for SCIF client drivers (e.g. netdev over SCIF)
> > 
> > Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > ---
> >  drivers/misc/mic/scif/scif_peer_bus.h |  63 +++++++++++++++++
> >  drivers/misc/mic/scif/scif_peer_bus.c | 124 ++++++++++++++++++++++++++++++++++
> >  2 files changed, 187 insertions(+)
> >  create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
> >  create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
> > 
> > diff --git a/drivers/misc/mic/scif/scif_peer_bus.h b/drivers/misc/mic/scif/scif_peer_bus.h
> > new file mode 100644
> > index 0000000..27ed508
> > --- /dev/null
> > +++ b/drivers/misc/mic/scif/scif_peer_bus.h
> > @@ -0,0 +1,63 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License, version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * General Public License for more details.
> > + *
> > + * Intel SCIF driver.
> > + */
> > +#ifndef _SCIF_PEER_BUS_H_
> > +#define _SCIF_PEER_BUS_H_
> > +
> > +#include <linux/device.h>
> > +#include <linux/mic_common.h>
> > +
> > +/*
> > + * Peer devices show up as PCIe devices for the mgmt node but not the cards.
> > + * The mgmt node discovers all the cards on the PCIe bus and informs the other
> > + * cards about their peers. Upon notification of a peer a node adds a peer
> > + * device to the peer bus to maintain symmetry in the way devices are
> > + * discovered across all nodes in the SCIF network.
> > + */
> > +/**
> > + * scif_peer_dev - representation of a peer SCIF device
> > + * @dev: underlying device
> > + * @dnode - The destination node which this device will communicate with.
> > + */
> > +struct scif_peer_dev {
> > +	struct device dev;
> > +	u8 dnode;
> > +};
> > +
> > +/**
> > + * scif_peer_driver - operations for a scif_peer I/O driver
> > + * @driver: underlying device driver (populate name and owner).
> > + * @id_table: the ids serviced by this driver.
> > + * @probe: the function to call when a device is found.  Returns 0 or -errno.
> > + * @remove: the function to call when a device is removed.
> > + */
> > +struct scif_peer_driver {
> > +	struct device_driver driver;
> > +	const struct scif_peer_dev_id *id_table;
> > +
> > +	int (*probe)(struct scif_peer_dev *dev);
> > +	void (*remove)(struct scif_peer_dev *dev);
> > +};
> > +
> > +int scif_peer_register_driver(struct scif_peer_driver *driver);
> > +void scif_peer_unregister_driver(struct scif_peer_driver *driver);
> > +struct scif_dev;
> > +struct scif_peer_dev *
> > +scif_peer_register_device(struct scif_dev *sdev);
> 
> Odd formatting, don't you think?
> 
> Please put your "predeclare scif_dev" on an earlier line by itself, as
> this is a mess.

I will fix it up and send across a v2 early next week.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface
  @ 2015-01-09 23:57 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2015-01-09 23:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Arnd Bergmann, Jonathan Corbet, linux-kernel,
	linux-doc, Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

On Fri, 2015-01-09 at 15:04 -0800, Greg Kroah-Hartman wrote:
> On Wed, Dec 10, 2014 at 11:47:41AM -0800, Sudeep Dutt wrote:
> > This patch introduces the SCIF documentation in the header file
> > and describes the IOCTL interface for user mode. mic_overview.txt
> > is updated with documentation on SCIF and a new document
> > describing SCIF in more details is available in scif_overview.txt.
> > 
> > Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > ---
> >  Documentation/mic/mic_overview.txt  |   28 +-
> >  Documentation/mic/scif_overview.txt |   62 ++
> >  include/uapi/linux/Kbuild           |    1 +
> >  include/linux/scif.h                | 1132 +++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/scif_ioctl.h     |  233 +++++++
> >  5 files changed, 1444 insertions(+), 12 deletions(-)
> >  create mode 100644 Documentation/mic/scif_overview.txt
> >  create mode 100644 include/linux/scif.h
> >  create mode 100644 include/uapi/linux/scif_ioctl.h
> > 
> > diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
> > index 77c5418..1a2f2c8 100644
> > --- a/Documentation/mic/mic_overview.txt
> > +++ b/Documentation/mic/mic_overview.txt
> > @@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
> >  created on it by the host/card drivers. On host the channels are private
> >  and used only by the host driver to transfer data for the virtio devices.
> >  
> > +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
> > +low level communications API across PCIe currently implemented for MIC.
> > +More details are available at scif_overview.txt.
> > +
> >  Here is a block diagram of the various components described above. The
> >  virtio backends are situated on the host rather than the card given better
> >  single threaded performance for the host compared to MIC, the ability of
> > @@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
> >                        |               |       | Virtio over PCIe IOCTLs  |
> >                        |               |       +--------------------------+
> >  +-----------+         |               |                   |  +-----------+
> > -| MIC DMA   |         |               |                   |  | MIC DMA   |
> > -| Driver    |         |               |                   |  | Driver    |
> > -+-----------+         |               |                   |  +-----------+
> > -      |               |               |                   |        |
> > -+---------------+     |               |                   |  +----------------+
> > -|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
> > -+---------------+     |               |                   |  +----------------+
> > -      |               |               |                   |              |
> > -      |   +--------------+            |            +---------------+     |
> > -      |   |Intel MIC     |            |            |Intel MIC      |     |
> > -      +---|Card Driver   |            |            |Host Driver    |     |
> > -          +--------------+            |            +---------------+-----+
> > +| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
> > +| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
> > ++-----------+         |  +----------+ | +-----------+     |  +-----------+
> > +      |               |       |       |       |           |        |
> > ++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
> > +|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW BUS|     | |MIC virtual Bus|
> > ++---------------+     | +-----------+ | +-----+-----+     | +---------------+
> > +      |               |       |       |       |           |              |
> > +      |   +--------------+    |       |       |    +---------------+     |
> > +      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
> > +      +---|Card Driver   +----+       |       |    |Host Driver    |     |
> > +          +--------------+            |       +----+---------------+-----+
> >                        |               |                   |
> >               +-------------------------------------------------------------+
> >               |                                                             |
> > diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
> > new file mode 100644
> > index 0000000..75549c4
> > --- /dev/null
> > +++ b/Documentation/mic/scif_overview.txt
> > @@ -0,0 +1,62 @@
> > +The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
> > +level communications API across PCIe currently implemented for MIC. Currently
> > +SCIF provides inter-node communication within a single host platform, where a
> > +node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
> > +communicating over the PCIe bus while providing an API that is symmetric
> > +across all the nodes in the PCIe network. An important design objective for SCIF
> > +is to deliver the maximum possible performance given the communication
> > +abilities of the hardware. SCIF has been used to implement an offload compiler
> > +runtime and OFED support for MPI implementations for MIC coprocessors.
> > +
> > +==== SCIF API Components ====
> > +The SCIF API has the following parts:
> > +1. Connection establishment using a client server model
> > +2. Byte stream messaging intended for short messages
> > +3. Node enumeration to determine online nodes
> > +4. Poll semantics for detection of incoming connections and messages
> > +5. Memory registration to pin down pages
> > +6. Remote memory mapping for low latency CPU accesses via mmap
> > +7. Remote DMA (RDMA) for high bandwidth DMA transfers
> > +8. Fence APIs for RDMA synchronization
> > +
> > +SCIF exposes the notion of a connection which can be used by peer processes on
> > +nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
> > +process in a SCIF node initiates a SCIF connection to a peer process on a
> > +different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
> > +which are similar to connection oriented socket APIs. Connected SCIF endpoints
> > +can also register local memory which is followed by data transfer using either
> > +DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
> > +kernel mode clients which are functionally equivalent.
> > +
> > +==== SCIF Performance for MIC ====
> > +DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus
> > +SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
> > +
> > +             Comparison of TCP and SCIF based BW
> > +
> > +  Throughput (GB/sec)
> > +    8 +                                             PCIe Bandwidth ******
> > +      +                                                        TCP ######
> > +    7 +    **************************************             SCIF %%%%%%
> > +      |                       %%%%%%%%%%%%%%%%%%%
> > +    6 +                   %%%%
> > +      |                 %%
> > +      |               %%%
> > +    5 +              %%
> > +      |            %%
> > +    4 +           %%
> > +      |          %%
> > +    3 +         %%
> > +      |        %
> > +    2 +      %%
> > +      |     %%
> > +      |    %
> > +    1 +
> > +      +    ######################################
> > +    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
> > +      1       10     100      1000   10000   100000
> > +                   Transfer Size (KBytes)
> > +
> > +SCIF allows memory sharing via mmap(..) between processes on different PCIe
> > +nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
> > +latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
> > diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> > index 4c94f31..9083b60 100644
> > --- a/include/uapi/linux/Kbuild
> > +++ b/include/uapi/linux/Kbuild
> > @@ -345,6 +345,7 @@ header-y += rtc.h
> >  header-y += rtnetlink.h
> >  header-y += scc.h
> >  header-y += sched.h
> > +header-y += scif_ioctl.h
> >  header-y += screen_info.h
> >  header-y += sctp.h
> >  header-y += sdla.h
> > diff --git a/include/linux/scif.h b/include/linux/scif.h
> > new file mode 100644
> > index 0000000..a0652a6
> > --- /dev/null
> > +++ b/include/linux/scif.h
> > @@ -0,0 +1,1132 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * This file is provided under a dual BSD/GPLv2 license.  When using or
> > + * redistributing this file, you may do so under either license.
> > + *
> > + * GPL LICENSE SUMMARY
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of version 2 of the GNU General Public License as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * General Public License for more details.
> > + *
> > + * BSD LICENSE
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + *
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above copyright
> > + *   notice, this list of conditions and the following disclaimer in
> > + *   the documentation and/or other materials provided with the
> > + *   distribution.
> > + * * Neither the name of Intel Corporation nor the names of its
> > + *   contributors may be used to endorse or promote products derived
> > + *   from this software without specific prior written permission.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + *
> > + * Intel SCIF driver.
> > + *
> > + */
> > +#ifndef __SCIF_H__
> > +#define __SCIF_H__
> > +
> > +#include <linux/types.h>
> > +#include <linux/poll.h>
> > +#include <linux/scif_ioctl.h>
> > +
> > +#define SCIF_ACCEPT_SYNC	1
> > +#define SCIF_SEND_BLOCK		1
> > +#define SCIF_RECV_BLOCK		1
> > +
> > +enum {
> > +	SCIF_PROT_READ = (1 << 0),
> > +	SCIF_PROT_WRITE = (1 << 1)
> > +};
> > +
> > +enum {
> > +	SCIF_MAP_FIXED = 0x10,
> > +	SCIF_MAP_KERNEL	= 0x20,
> > +};
> > +
> > +enum {
> > +	SCIF_FENCE_INIT_SELF = (1 << 0),
> > +	SCIF_FENCE_INIT_PEER = (1 << 1),
> > +	SCIF_SIGNAL_LOCAL = (1 << 4),
> > +	SCIF_SIGNAL_REMOTE = (1 << 5)
> > +};
> > +
> > +enum {
> > +	SCIF_RMA_USECPU = (1 << 0),
> > +	SCIF_RMA_USECACHE = (1 << 1),
> > +	SCIF_RMA_SYNC = (1 << 2),
> > +	SCIF_RMA_ORDERED = (1 << 3)
> > +};
> > +
> > +/* End of SCIF Admin Reserved Ports */
> > +#define SCIF_ADMIN_PORT_END	1024
> > +
> > +/* End of SCIF Reserved Ports */
> > +#define SCIF_PORT_RSVD		1088
> > +
> > +typedef struct scif_endpt *scif_epd_t;
> > +
> > +#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
> > +#define SCIF_REGISTER_FAILED ((off_t)-1)
> > +#define SCIF_MMAP_FAILED ((void *)-1)
> > +
> > +/**
> > + * scif_open - Create an endpoint
> > + *
> > + *\return
> > + * Upon successful completion, scif_open() returns an endpoint descriptor to
> > + * be used in subsequent SCIF functions calls to refer to that endpoint;
> > + * otherwise: in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
> > + * returned and errno is set to indicate the error; in kernel mode a NULL
> > + * scif_epd_t is returned.
> > + *
> > + *\par Errors:
> > + *- ENOMEM
> > + * - Insufficient kernel memory was available
> > + */
> 
> Documentation is great, but if you are going to do it, use the proper
> kerneldoc format and not some other odd variant that I have never seen
> before.  This whole patch has tons of oddly labeled comments, like this:
> 

This file was set up some time back as input for doxygen. I agree that
it looks really odd, so I will fix it up and send a v2 early next
week.
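
For illustration, the scif_open() comment quoted above could be rendered
in kernel-doc style roughly as follows. This is only a sketch of the
format, not the text of any later revision. The typedef and
SCIF_OPEN_FAILED are copied from the quoted header. The function body is
a placeholder (an assumption) so that the fragment compiles in user
space; it is not the driver's implementation.

```c
#include <stddef.h>

/* Opaque endpoint handle, as in the quoted include/linux/scif.h. */
struct scif_endpt;
typedef struct scif_endpt *scif_epd_t;

#define SCIF_OPEN_FAILED ((scif_epd_t)-1)

/**
 * scif_open() - Create an endpoint
 *
 * Return:
 * Upon successful completion, scif_open() returns an endpoint descriptor
 * to be used in subsequent SCIF function calls to refer to that endpoint.
 * Otherwise, in user mode SCIF_OPEN_FAILED is returned and errno is set
 * to indicate the error; in kernel mode a NULL scif_epd_t is returned.
 *
 * Errors:
 * ENOMEM - Insufficient kernel memory was available
 */
scif_epd_t scif_open(void)
{
	/* Placeholder only: always reports the kernel-mode failure value. */
	return NULL;
}
```

The doxygen markers (\return, \par Errors:) become plain section
headings, and the summary line uses the "name() - description" form
that the kernel-doc tool parses.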

Thanks,
Sudeep Dutt



* [PATCH char-misc-next 11/13] misc: mic: MIC host driver specific changes to enable SCIF
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (9 preceding siblings ...)
  2014-12-10 19:47 52% ` [PATCH char-misc-next 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
@ 2014-12-10 19:47 57% ` Sudeep Dutt
  2014-12-10 19:47 70% ` [PATCH char-misc-next 12/13] misc: mic: MIC card " Sudeep Dutt
  2014-12-10 19:47 98% ` [PATCH char-misc-next 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC host driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.
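
The pattern used here (a table of callbacks that the parent driver hands
to a device on the bus, with each callback looking up the parent from
the bus device) can be sketched in user-space C as below. All names in
the sketch (hw_dev, hw_ops, parent_dev, hw_register) are hypothetical
stand-ins and are not the kernel's scif_hw_dev, scif_hw_ops or
scif_register_device; the parent_data pointer only models what
dev_get_drvdata(scdev->dev.parent) does in the patch.

```c
#include <stddef.h>

struct hw_dev;

/* Callbacks the parent driver provides to the bus device (hypothetical). */
struct hw_ops {
	int  (*next_db)(struct hw_dev *hdev);
	void (*send_intr)(struct hw_dev *hdev, int db);
};

/* Device on the bus: holds the ops table and a link to its parent. */
struct hw_dev {
	const struct hw_ops *ops;
	void *parent_data;	/* models dev_get_drvdata(dev.parent) */
};

/* Parent driver state (hypothetical fields). */
struct parent_dev {
	int next_db;		/* next free doorbell number */
	int last_db_rung;	/* last doorbell an interrupt was sent on */
};

/* Same role as scdev_to_mdev() in the patch: bus device to parent. */
static struct parent_dev *to_parent(struct hw_dev *hdev)
{
	return hdev->parent_data;
}

static int parent_next_db(struct hw_dev *hdev)
{
	return to_parent(hdev)->next_db++;
}

static void parent_send_intr(struct hw_dev *hdev, int db)
{
	to_parent(hdev)->last_db_rung = db;
}

static const struct hw_ops parent_hw_ops = {
	.next_db = parent_next_db,
	.send_intr = parent_send_intr,
};

/* Models registration: bind the ops table and the parent to the device. */
static void hw_register(struct hw_dev *hdev, struct parent_dev *parent,
			const struct hw_ops *ops)
{
	hdev->ops = ops;
	hdev->parent_data = parent;
}
```

A client of the bus device only ever calls hdev->ops->...(hdev, ...), so
it needs no knowledge of the parent driver's internals.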

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig            |   2 +-
 drivers/misc/mic/host/mic_device.h  |  11 +-
 drivers/misc/mic/host/mic_intr.h    |   3 +-
 drivers/misc/mic/host/mic_smpt.h    |   1 +
 drivers/misc/mic/host/mic_boot.c    | 264 ++++++++++++++++++++++++++++++++++--
 drivers/misc/mic/host/mic_debugfs.c |  13 ++
 drivers/misc/mic/host/mic_main.c    |   6 +
 drivers/misc/mic/host/mic_smpt.c    |   7 +-
 drivers/misc/mic/host/mic_virtio.c  |   6 +-
 drivers/misc/mic/host/mic_x100.c    |   3 +-
 10 files changed, 292 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bcf6e36..a2ea3ab 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -36,7 +36,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 016bd15..01a7555 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -27,7 +27,7 @@
 #include <linux/irqreturn.h>
 #include <linux/dmaengine.h>
 #include <linux/mic_bus.h>
-
+#include "../bus/scif_bus.h"
 #include "mic_intr.h"
 
 /* The maximum number of MIC devices supported in a single host system. */
@@ -90,7 +90,9 @@ enum mic_stepping {
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
  * @dma_mbdev: MIC BUS DMA device.
- * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
+ * @dma_ch: Array of DMA channels.
+ * @num_dma_ch: Number of DMA channels available.
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -129,7 +131,9 @@ struct mic_device {
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
 	struct mbus_device *dma_mbdev;
-	struct dma_chan *dma_ch;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -228,4 +232,5 @@ void mic_exit_debugfs(void);
 void mic_prepare_suspend(struct mic_device *mdev);
 void mic_complete_resume(struct mic_device *mdev);
 void mic_suspend(struct mic_device *mdev);
+extern atomic_t g_num_mics;
 #endif
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 9f783d4a..cce2882 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -28,8 +28,9 @@
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
  * 4 for host owned DMA channels.
+ * 1 for SCIF
  */
-#define MIC_MIN_MSIX 8
+#define MIC_MIN_MSIX 9
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_smpt.h b/drivers/misc/mic/host/mic_smpt.h
index 51970ab..68721c6 100644
--- a/drivers/misc/mic/host/mic_smpt.h
+++ b/drivers/misc/mic/host/mic_smpt.h
@@ -78,6 +78,7 @@ void mic_unmap_single(struct mic_device *mdev,
 dma_addr_t mic_map(struct mic_device *mdev,
 	dma_addr_t dma_addr, size_t size);
 void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size);
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr);
 
 /**
  * mic_map_error - Check a MIC address for errors.
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index ff2b0fb..bb15506 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -21,6 +21,7 @@
 #include <linux/delay.h>
 #include <linux/firmware.h>
 #include <linux/pci.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include <linux/mic_bus.h>
@@ -29,6 +30,188 @@
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *scdev_to_mdev(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static void *__mic_dma_alloc(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle, gfp_t gfp,
+			     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	dma_addr_t tmp;
+	void *va = kmalloc(size, gfp);
+
+	if (va) {
+		tmp = mic_map_single(mdev, va, size);
+		if (dma_mapping_error(dev, tmp)) {
+			kfree(va);
+			va = NULL;
+		} else {
+			*dma_handle = tmp;
+		}
+	}
+	return va;
+}
+
+static void __mic_dma_free(struct device *dev, size_t size, void *vaddr,
+			   dma_addr_t dma_handle, struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_handle, size);
+	kfree(vaddr);
+}
+
+static dma_addr_t
+__mic_dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+__mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		     size_t size, enum dma_data_direction dir,
+		     struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static int __mic_dma_map_sg(struct device *dev, struct scatterlist *sg,
+			    int nents, enum dma_data_direction dir,
+			    struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	int i, j, ret;
+	dma_addr_t da;
+
+	ret = dma_map_sg(mdev->sdev->parent, sg, nents, dir);
+	if (ret <= 0)
+		return 0;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_map(mdev, sg_dma_address(s) + s->offset, s->length);
+		if (!da)
+			goto err;
+		sg_dma_address(s) = da;
+	}
+	return nents;
+err:
+	for_each_sg(sg, s, i, j) {
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = mic_to_dma_addr(mdev, sg_dma_address(s));
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+	return 0;
+}
+
+static void __mic_dma_unmap_sg(struct device *dev,
+			       struct scatterlist *sg, int nents,
+			       enum dma_data_direction dir,
+			       struct dma_attrs *attrs)
+{
+	struct scif_hw_dev *scdev = dev_get_drvdata(dev);
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+	struct scatterlist *s;
+	dma_addr_t da;
+	int i;
+
+	for_each_sg(sg, s, nents, i) {
+		da = mic_to_dma_addr(mdev, sg_dma_address(s));
+		mic_unmap(mdev, sg_dma_address(s), s->length);
+		sg_dma_address(s) = da;
+	}
+	dma_unmap_sg(mdev->sdev->parent, sg, nents, dir);
+}
+
+static struct dma_map_ops __mic_dma_ops = {
+	.alloc = __mic_dma_alloc,
+	.free = __mic_dma_free,
+	.map_page = __mic_dma_map_page,
+	.unmap_page = __mic_dma_unmap_page,
+	.map_sg = __mic_dma_map_sg,
+	.unmap_sg = __mic_dma_unmap_sg,
+};
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name,
+				       void *data, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_request_threaded_irq(mdev, func, NULL, name, data,
+					db, MIC_INTR_DB);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mic_free_irq(mdev, cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mic_next_db(mdev);
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	mdev->ops->send_intr(mdev, db);
+}
+
+static void __iomem *___mic_ioremap(struct scif_hw_dev *scdev,
+				    phys_addr_t pa, size_t len)
+{
+	struct mic_device *mdev = scdev_to_mdev(scdev);
+
+	return mdev->aper.va + pa;
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	/* nothing to do */
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
 static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
 {
 	return dev_get_drvdata(mbdev->dev.parent);
@@ -127,6 +310,58 @@ void mic_bootparam_init(struct mic_device *mdev)
 	bootparam->h2c_config_db = -1;
 	bootparam->shutdown_status = 0;
 	bootparam->shutdown_card = 0;
+	/* Total nodes = number of MICs + 1 for self node */
+	bootparam->tot_nodes = atomic_read(&g_num_mics) + 1;
+	bootparam->node_id = mdev->id + 1;
+	bootparam->scif_host_dma_addr = 0x0;
+	bootparam->scif_card_dma_addr = 0x0;
+	bootparam->c2h_scif_db = -1;
+	bootparam->h2c_scif_db = -1;
+}
+
+/**
+ * mic_request_dma_chans - Request DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns number of DMA channels acquired
+ */
+static int mic_request_dma_chans(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, mdev->ops->dma_filter,
+					   mdev->sdev->parent);
+		if (chan) {
+			mdev->dma_ch[mdev->num_dma_ch++] = chan;
+			if (mdev->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdev->sdev->parent, "DMA channels # %d\n", mdev->num_dma_ch);
+	return mdev->num_dma_ch;
+}
+
+/**
+ * mic_free_dma_chans - release DMA channels
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none
+ */
+static void mic_free_dma_chans(struct mic_device *mdev)
+{
+	int i = 0;
+
+	for (i = 0; i < mdev->num_dma_ch; i++) {
+		dma_release_channel(mdev->dma_ch[i]);
+		mdev->dma_ch[i] = NULL;
+	}
+	mdev->num_dma_ch = 0;
 }
 
 /**
@@ -141,6 +376,7 @@ int mic_start(struct mic_device *mdev, const char *buf)
 {
 	int rc;
 	mutex_lock(&mdev->mic_mutex);
+	mic_bootparam_init(mdev);
 retry:
 	if (MIC_OFFLINE != mdev->state) {
 		rc = -EINVAL;
@@ -161,14 +397,22 @@ retry:
 		rc = PTR_ERR(mdev->dma_mbdev);
 		goto unlock_ret;
 	}
-	mdev->dma_ch = mic_request_dma_chan(mdev);
-	if (!mdev->dma_ch) {
-		rc = -ENXIO;
+	if (!mic_request_dma_chans(mdev)) {
+		rc = -ENODEV;
 		goto dma_remove;
 	}
+	mdev->scdev = scif_register_device(mdev->sdev->parent, MIC_SCIF_DEV,
+					   &__mic_dma_ops, &scif_hw_ops,
+					   mdev->id + 1, 0, &mdev->mmio,
+					   &mdev->aper, mdev->dp, NULL,
+					   mdev->dma_ch, mdev->num_dma_ch);
+	if (IS_ERR(mdev->scdev)) {
+		rc = PTR_ERR(mdev->scdev);
+		goto dma_free;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto dma_release;
+		goto scif_remove;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -177,8 +421,10 @@ retry:
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
 	goto unlock_ret;
-dma_release:
-	dma_release_channel(mdev->dma_ch);
+scif_remove:
+	scif_unregister_device(mdev->scdev);
+dma_free:
+	mic_free_dma_chans(mdev);
 dma_remove:
 	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
@@ -197,11 +443,9 @@ void mic_stop(struct mic_device *mdev, bool force)
 {
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
+		scif_unregister_device(mdev->scdev);
 		mic_virtio_reset_devices(mdev);
-		if (mdev->dma_ch) {
-			dma_release_channel(mdev->dma_ch);
-			mdev->dma_ch = NULL;
-		}
+		mic_free_dma_chans(mdev);
 		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 028ba5d6..f024d29 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -214,6 +214,19 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
 		   bootparam->shutdown_card);
+	seq_printf(s, "Bootparam: tot_nodes %d\n",
+		   bootparam->tot_nodes);
+	seq_printf(s, "Bootparam: node_id %d\n",
+		   bootparam->node_id);
+	seq_printf(s, "Bootparam: c2h_scif_db %d\n",
+		   bootparam->c2h_scif_db);
+	seq_printf(s, "Bootparam: h2c_scif_db %d\n",
+		   bootparam->h2c_scif_db);
+	seq_printf(s, "Bootparam: scif_host_dma_addr 0x%llx\n",
+		   bootparam->scif_host_dma_addr);
+	seq_printf(s, "Bootparam: scif_card_dma_addr 0x%llx\n",
+		   bootparam->scif_card_dma_addr);
+
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index ab37a31..4564629 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -67,6 +67,8 @@ static struct ida g_mic_ida;
 static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
+/* Track the total number of MIC devices */
+atomic_t g_num_mics;
 
 static const struct file_operations mic_fops = {
 	.open = mic_open,
@@ -408,6 +410,7 @@ static int mic_probe(struct pci_dev *pdev,
 		dev_err(&pdev->dev, "cdev_add err id %d rc %d\n", mdev->id, rc);
 		goto cleanup_debug_dir;
 	}
+	atomic_inc(&g_num_mics);
 	return 0;
 cleanup_debug_dir:
 	mic_delete_debug_dir(mdev);
@@ -459,6 +462,7 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	mic_stop(mdev, false);
+	atomic_dec(&g_num_mics);
 	cdev_del(&mdev->cdev);
 	mic_delete_debug_dir(mdev);
 	mutex_lock(&mdev->mic_mutex);
@@ -478,6 +482,7 @@ static void mic_remove(struct pci_dev *pdev)
 	ida_simple_remove(&g_mic_ida, mdev->id);
 	kfree(mdev);
 }
+
 static struct pci_driver mic_driver = {
 	.name = mic_driver_name,
 	.id_table = mic_pci_tbl,
@@ -512,6 +517,7 @@ static int __init mic_init(void)
 	}
 	return ret;
 cleanup_debugfs:
+	ida_destroy(&g_mic_ida);
 	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 cleanup_chrdev:
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index fae474c..cec8203 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -174,8 +174,7 @@ static int mic_get_smpt_ref_count(struct mic_device *mdev, dma_addr_t dma_addr,
  *
  * returns a DMA address.
  */
-static dma_addr_t
-mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
+dma_addr_t mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 {
 	struct mic_smpt_info *smpt_info = mdev->smpt;
 	int spt;
@@ -214,7 +213,7 @@ dma_addr_t mic_map(struct mic_device *mdev, dma_addr_t dma_addr, size_t size)
 	if (!size || size > mic_max_system_memory(mdev))
 		return mic_addr;
 
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return mic_addr;
 
@@ -271,7 +270,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 	}
 
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
-	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	ref = kmalloc_array(mdev->smpt->info.num_reg, sizeof(s64), GFP_ATOMIC);
 	if (!ref)
 		return;
 
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index a020e4e..cc08e9f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -40,7 +40,7 @@ static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
 {
 	int err = 0;
 	struct dma_async_tx_descriptor *tx;
-	struct dma_chan *mic_ch = mdev->dma_ch;
+	struct dma_chan *mic_ch = mdev->dma_ch[0];
 
 	if (!mic_ch) {
 		err = -EBUSY;
@@ -80,7 +80,7 @@ static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t dma_offset;
 	size_t partlen;
 	int err;
@@ -129,7 +129,7 @@ static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
 	struct mic_device *mdev = mvdev->mdev;
 	void __iomem *dbuf = mdev->aper.va + daddr;
 	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
-	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_alignment = 1 << mdev->dma_ch[0]->device->copy_align;
 	size_t partlen;
 	int err;
 
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index b7a21e1..3341e90 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -167,8 +167,7 @@ static void mic_x100_send_intr(struct mic_device *mdev, int doorbell)
 	if (doorbell < MIC_X100_NUM_SBOX_IRQ) {
 		mic_x100_send_sbox_intr(mdev, doorbell);
 	} else {
-		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ;
 		mic_x100_send_rdmasr_intr(mdev, rdmasr_db);
 	}
 }
-- 
1.8.2.1



* [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
@ 2014-12-10 19:47 21% ` Sudeep Dutt
    2014-12-10 19:47 58% ` [PATCH char-misc-next 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

This patch introduces the SCIF documentation in the header file
and describes the IOCTL interface for user mode. mic_overview.txt
is updated with documentation on SCIF and a new document
describing SCIF in more detail is available in scif_overview.txt.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mic_overview.txt  |   28 +-
 Documentation/mic/scif_overview.txt |   62 ++
 include/uapi/linux/Kbuild           |    1 +
 include/linux/scif.h                | 1132 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/scif_ioctl.h     |  233 +++++++
 5 files changed, 1444 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index 77c5418..1a2f2c8 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -24,6 +24,10 @@ a virtual bus called mic bus is created and virtual dma devices are
 created on it by the host/card drivers. On host the channels are private
 and used only by the host driver to transfer data for the virtio devices.
 
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
+low level communications API across PCIe currently implemented for MIC.
+More details are available at scif_overview.txt.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
@@ -47,18 +51,18 @@ the fact that the virtio block storage backend can only be on the host.
                       |               |       | Virtio over PCIe IOCTLs  |
                       |               |       +--------------------------+
 +-----------+         |               |                   |  +-----------+
-| MIC DMA   |         |               |                   |  | MIC DMA   |
-| Driver    |         |               |                   |  | Driver    |
-+-----------+         |               |                   |  +-----------+
-      |               |               |                   |        |
-+---------------+     |               |                   |  +----------------+
-|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
-+---------------+     |               |                   |  +----------------+
-      |               |               |                   |              |
-      |   +--------------+            |            +---------------+     |
-      |   |Intel MIC     |            |            |Intel MIC      |     |
-      +---|Card Driver   |            |            |Host Driver    |     |
-          +--------------+            |            +---------------+-----+
+| MIC DMA   |         |  +----------+ | +-----------+     |  | MIC DMA   |
+| Driver    |         |  |  SCIF    | | |   SCIF    |     |  | Driver    |
++-----------+         |  +----------+ | +-----------+     |  +-----------+
+      |               |       |       |       |           |        |
++---------------+     | +-----+-----+ | +-----+-----+     | +---------------+
+|MIC virtual Bus|     | |SCIF HW Bus| | |SCIF HW Bus|     | |MIC virtual Bus|
++---------------+     | +-----------+ | +-----+-----+     | +---------------+
+      |               |       |       |       |           |              |
+      |   +--------------+    |       |       |    +---------------+     |
+      |   |Intel MIC     |    |       |       |    |Intel MIC      |     |
+      +---|Card Driver   +----+       |       |    |Host Driver    |     |
+          +--------------+            |       +----+---------------+-----+
                       |               |                   |
              +-------------------------------------------------------------+
              |                                                             |
diff --git a/Documentation/mic/scif_overview.txt b/Documentation/mic/scif_overview.txt
new file mode 100644
index 0000000..75549c4
--- /dev/null
+++ b/Documentation/mic/scif_overview.txt
@@ -0,0 +1,62 @@
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
+level communications API across PCIe currently implemented for MIC. Currently
+SCIF provides inter-node communication within a single host platform, where a
+node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
+communicating over the PCIe bus while providing an API that is symmetric
+across all the nodes in the PCIe network. An important design objective for SCIF
+is to deliver the maximum possible performance given the communication
+abilities of the hardware. SCIF has been used to implement an offload compiler
+runtime and OFED support for MPI implementations for MIC coprocessors.
+
+==== SCIF API Components ====
+The SCIF API has the following parts:
+1. Connection establishment using a client server model
+2. Byte stream messaging intended for short messages
+3. Node enumeration to determine online nodes
+4. Poll semantics for detection of incoming connections and messages
+5. Memory registration to pin down pages
+6. Remote memory mapping for low latency CPU accesses via mmap
+7. Remote DMA (RDMA) for high bandwidth DMA transfers
+8. Fence APIs for RDMA synchronization
+
+SCIF exposes the notion of a connection which can be used by peer processes on
+nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
+process in a SCIF node initiates a SCIF connection to a peer process on a
+different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
+which are similar to connection oriented socket APIs. Connected SCIF endpoints
+can also register local memory which is followed by data transfer using either
+DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
+kernel mode clients which are functionally equivalent.
+
+==== SCIF Performance for MIC ====
+A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
+SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
+
+             Comparison of TCP and SCIF based BW
+
+  Throughput (GB/sec)
+    8 +                                             PCIe Bandwidth ******
+      +                                                        TCP ######
+    7 +    **************************************             SCIF %%%%%%
+      |                       %%%%%%%%%%%%%%%%%%%
+    6 +                   %%%%
+      |                 %%
+      |               %%%
+    5 +              %%
+      |            %%
+    4 +           %%
+      |          %%
+    3 +         %%
+      |        %
+    2 +      %%
+      |     %%
+      |    %
+    1 +
+      +    ######################################
+    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
+      1       10     100      1000   10000   100000
+                   Transfer Size (KBytes)
+
+SCIF allows memory sharing via mmap(..) between processes on different PCIe
+nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
+latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
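
The split between byte stream messaging and RDMA described in the overview
above can be illustrated with a small sketch. It is not part of the patch. The
1024 byte threshold and the 64 byte cacheline size are taken from the
scif_send() and scif_readfrom() comments in the scif.h hunk of this patch; all
demo_* and DEMO_* names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of SCIF. The thresholds mirror the
 * guidance in the scif.h comments of this patch.
 */
#define DEMO_RMA_THRESHOLD	1024	/* bytes: RMA expected to win from here */
#define DEMO_CACHELINE		64	/* bytes */

enum demo_path { DEMO_PATH_MSG, DEMO_PATH_RMA };
enum demo_align { DEMO_ALIGN_BEST, DEMO_ALIGN_MIDDLE, DEMO_ALIGN_WORST };

/* Short control messages go through send/recv, larger transfers use RMA. */
static inline enum demo_path demo_choose_path(size_t len)
{
	return len >= DEMO_RMA_THRESHOLD ? DEMO_PATH_RMA : DEMO_PATH_MSG;
}

/*
 * Rank a (loffset, roffset) pair by the DMA performance the comments
 * lead one to expect:
 *   BEST   - both offsets are multiples of 64
 *   MIDDLE - offsets are unaligned but separated by a multiple of 64
 *   WORST  - offsets are not separated by a multiple of 64
 */
static inline enum demo_align demo_classify_offsets(uint64_t loffset,
						    uint64_t roffset)
{
	uint64_t lrem = loffset % DEMO_CACHELINE;
	uint64_t rrem = roffset % DEMO_CACHELINE;

	if (lrem == 0 && rrem == 0)
		return DEMO_ALIGN_BEST;
	if (lrem == rrem)
		return DEMO_ALIGN_MIDDLE;
	return DEMO_ALIGN_WORST;
}
```

A caller might consult demo_choose_path() before deciding whether to register
a window at all, since an 8 byte control message would stay on the messaging
path. The exact crossover point is hardware dependent, so the threshold should
be treated as a starting point rather than a measured value.
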
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 4c94f31..9083b60 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -345,6 +345,7 @@ header-y += rtc.h
 header-y += rtnetlink.h
 header-y += scc.h
 header-y += sched.h
+header-y += scif_ioctl.h
 header-y += screen_info.h
 header-y += sctp.h
 header-y += sdla.h
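
The header added by the next diff documents several EINVAL conditions for
scif_register(). As a rough sketch of how a caller could pre-check its
arguments, the helper below mirrors those documented conditions. It is not part
of the patch. The enum values are copied from the header; the 4 KiB page size,
the demo_check_register_args() name and the decision to accept only
SCIF_MAP_FIXED in map_flags (the only flag the scif_register() comment lists)
are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the enums in include/linux/scif.h in this patch. */
enum { SCIF_PROT_READ = (1 << 0), SCIF_PROT_WRITE = (1 << 1) };
enum { SCIF_MAP_FIXED = 0x10 };

/* Assumption: 4 KiB pages. A real caller would query the page size. */
#define DEMO_PAGE_SIZE	4096

/*
 * Hypothetical helper: returns 0 when the arguments pass the EINVAL
 * checks listed in the scif_register() comment, -EINVAL otherwise.
 * It does not talk to any driver.
 */
static inline int demo_check_register_args(uintptr_t addr, size_t len,
					   int64_t offset, int prot_flags,
					   int map_flags)
{
	const int prot_mask = SCIF_PROT_READ | SCIF_PROT_WRITE;

	/* prot_flags: one or more of READ/WRITE and nothing else */
	if (prot_flags == 0 || (prot_flags & ~prot_mask))
		return -EINVAL;
	/* map_flags: zero or more of SCIF_MAP_FIXED (assumed exhaustive) */
	if (map_flags & ~SCIF_MAP_FIXED)
		return -EINVAL;
	/* addr and len must be multiples of the page size, len non-zero */
	if (addr % DEMO_PAGE_SIZE)
		return -EINVAL;
	if (len == 0 || len % DEMO_PAGE_SIZE)
		return -EINVAL;
	if (offset < 0)
		return -EINVAL;
	/* offset must be page aligned only when it is interpreted exactly */
	if ((map_flags & SCIF_MAP_FIXED) && offset % DEMO_PAGE_SIZE)
		return -EINVAL;
	return 0;
}
```

Note that an unaligned offset is accepted when SCIF_MAP_FIXED is not set,
because in that case the comment describes offset as a suggestion only.
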
diff --git a/include/linux/scif.h b/include/linux/scif.h
new file mode 100644
index 0000000..a0652a6
--- /dev/null
+++ b/include/linux/scif.h
@@ -0,0 +1,1132 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef __SCIF_H__
+#define __SCIF_H__
+
+#include <linux/types.h>
+#include <linux/poll.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_ACCEPT_SYNC	1
+#define SCIF_SEND_BLOCK		1
+#define SCIF_RECV_BLOCK		1
+
+enum {
+	SCIF_PROT_READ = (1 << 0),
+	SCIF_PROT_WRITE = (1 << 1)
+};
+
+enum {
+	SCIF_MAP_FIXED = 0x10,
+	SCIF_MAP_KERNEL	= 0x20,
+};
+
+enum {
+	SCIF_FENCE_INIT_SELF = (1 << 0),
+	SCIF_FENCE_INIT_PEER = (1 << 1),
+	SCIF_SIGNAL_LOCAL = (1 << 4),
+	SCIF_SIGNAL_REMOTE = (1 << 5)
+};
+
+enum {
+	SCIF_RMA_USECPU = (1 << 0),
+	SCIF_RMA_USECACHE = (1 << 1),
+	SCIF_RMA_SYNC = (1 << 2),
+	SCIF_RMA_ORDERED = (1 << 3)
+};
+
+/* End of SCIF Admin Reserved Ports */
+#define SCIF_ADMIN_PORT_END	1024
+
+/* End of SCIF Reserved Ports */
+#define SCIF_PORT_RSVD		1088
+
+typedef struct scif_endpt *scif_epd_t;
+
+#define SCIF_OPEN_FAILED ((scif_epd_t)-1)
+#define SCIF_REGISTER_FAILED ((off_t)-1)
+#define SCIF_MMAP_FAILED ((void *)-1)
+
+/**
+ * scif_open - Create an endpoint
+ *
+ *\return
+ * Upon successful completion, scif_open() returns an endpoint descriptor to
+ * be used in subsequent SCIF function calls to refer to that endpoint;
+ * otherwise: in user mode SCIF_OPEN_FAILED (that is ((scif_epd_t)-1)) is
+ * returned and errno is set to indicate the error; in kernel mode a NULL
+ * scif_epd_t is returned.
+ *
+ *\par Errors:
+ *- ENOMEM
+ * - Insufficient kernel memory was available
+ */
+scif_epd_t scif_open(void);
+
+/**
+ * scif_bind - Bind an endpoint to a port
+ *	\param epd			endpoint descriptor
+ *	\param pn			port number
+ *
+ * scif_bind() binds endpoint epd to port pn, where pn is a port number on the
+ * local node. If pn is zero, a port number greater than or equal to
+ * SCIF_PORT_RSVD is assigned and returned. Each endpoint may be bound to
+ * exactly one local port. Ports below 1024, when requested, can only be bound
+ * by system (or root) processes or by processes executed by privileged users.
+ *
+ *\return
+ * Upon successful completion, scif_bind() returns the port number to which epd
+ * is bound; otherwise: in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - The endpoint or the port is already bound
+ *- EISCONN
+ * - The endpoint is already connected
+ *- ENOSPC
+ * - No port number available for assignment
+ *- EACCES
+ * - The port requested is protected and the user is not the superuser
+*/
+int scif_bind(scif_epd_t epd, uint16_t pn);
+
+/**
+ * scif_listen - Listen for connections on an endpoint
+ *
+ *	\param epd		endpoint descriptor
+ *	\param backlog		maximum pending connection requests
+ *
+ * scif_listen() marks the endpoint epd as a listening endpoint - that is, as
+ * an endpoint that will be used to accept incoming connection requests. Once
+ * so marked, the endpoint is said to be in the listening state and may not be
+ * used as the endpoint of a connection.
+ *
+ * The endpoint, epd, must have been bound to a port.
+ *
+ * The backlog argument defines the maximum length to which the queue of
+ * pending connections for epd may grow. If a connection request arrives when
+ * the queue is full, the client may receive an error with an indication that
+ * the connection was refused.
+ *
+ *\return
+ * Upon successful completion, scif_listen() returns 0; otherwise: in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - The endpoint is not bound to a port
+ *- EISCONN
+ * - The endpoint is already connected or listening
+*/
+int scif_listen(scif_epd_t epd, int backlog);
+
+/**
+ * scif_connect - Initiate a connection on a port
+ *	\param epd		endpoint descriptor
+ *	\param dst		global id of port to which to connect
+ *
+ * The scif_connect() function requests the connection of endpoint epd to remote
+ * port dst. If the connection is successful, a peer endpoint, bound to dst, is
+ * created on node dst.node. On successful return, the connection is complete.
+ *
+ * If the endpoint epd has not already been bound to a port, scif_connect()
+ * will bind it to an unused local port.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * In user space, scif_connect() supports an asynchronous connection mode
+ * if the application has set the O_NONBLOCK flag on the endpoint via the
+ * fcntl() system call. Setting this flag causes the calling process not to
+ * wait during scif_connect().
+ *
+ *\return
+ * Upon successful completion, scif_connect() returns the port ID to which the
+ * endpoint, epd, is bound; otherwise: in user mode -1 is returned and errno is
+ * set to indicate the error; in kernel mode the negative of one of the
+ * following errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNREFUSED
+ * - The destination was not listening for connections or refused the
+ *	connection request
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - dst.port is not a valid port ID
+ *- EISCONN
+ * - The endpoint is already connected
+ *- ENOMEM
+ * - No buffer space is available
+ *- ENODEV
+ * - The destination node does not exist, or
+ * - The node is lost, or existed but is not currently in the network since it
+ *	may have crashed
+ *- ENOSPC
+ * - No port number available for assignment
+ *- EOPNOTSUPP
+ * - The endpoint is listening and cannot be connected
+*/
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst);
+
+/**
+ * scif_accept - Accept a connection on an endpoint
+ *	\param epd		endpoint descriptor
+ *	\param peer		global id of port to which connected
+ *	\param newepd		new connected endpoint descriptor
+ *	\param flags		flags
+ *
+ * The scif_accept() call extracts the first connection request from the queue
+ * of pending connections for the port on which epd is listening. scif_accept()
+ * creates a new endpoint, bound to the same port as epd, and allocates a new
+ * SCIF endpoint descriptor, returned in newepd, for the endpoint. The new
+ * endpoint is connected to the endpoint through which the connection was
+ * requested. epd is unaffected by this call, and remains in the listening
+ * state.
+ *
+ * On successful return, peer holds the global port identifier (node id and
+ * local port number) of the port which requested the connection.
+ *
+ * A connection is terminated when an endpoint of the connection is closed,
+ * either explicitly by scif_close(), or when a process that owns one of the
+ * endpoints of the connection is terminated.
+ *
+ * The number of connections that can (subsequently) be accepted on epd is only
+ * limited by system resources (memory).
+ *
+ * The flags argument is formed by OR'ing together zero or more of the
+ * following values:
+ *- SCIF_ACCEPT_SYNC: block until a connection request is presented. If
+ *			SCIF_ACCEPT_SYNC is not in flags, and no pending
+ *			connections are present on the queue, scif_accept()
+ *			fails with an EAGAIN error
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when there is a connection request. In kernel mode, the scif_poll()
+ * function may be used for this purpose. A readable event will be delivered
+ * when a connection is requested.
+ *
+ *\return
+ * Upon successful completion, scif_accept() returns 0; otherwise: in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ *	negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EAGAIN
+ * - SCIF_ACCEPT_SYNC is not set and no connections are present to be accepted,
+ *	or
+ * - SCIF_ACCEPT_SYNC is not set and remote node failed to complete its
+ *	connection request
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- EINTR
+ * - Interrupted function
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - epd is not a listening endpoint, or
+ * - flags is invalid, or
+ * - peer is NULL, or
+ * - newepd is NULL
+ *- ENODEV
+ * - The requesting node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOMEM
+ * - Not enough space
+ *- ENOENT
+ * - Secondary part of epd registration failed
+*/
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer, scif_epd_t
+		*newepd, int flags);
+
+/**
+ * scif_close - Close an endpoint
+ *	\param epd	endpoint descriptor
+ *
+ * scif_close() closes an endpoint and performs necessary teardown of
+ * facilities associated with that endpoint.
+ *
+ * If epd is a listening endpoint then it will no longer accept connection
+ * requests on the port to which it is bound. Any pending connection requests
+ * are rejected.
+ *
+ * If epd is a connected endpoint, then its peer endpoint is also closed. RMAs
+ * which are in progress through epd or its peer endpoint will complete before
+ * scif_close() returns. Registered windows of the local and peer endpoints are
+ * released as if scif_unregister() was called against each window.
+ *
+ * Closing a SCIF endpoint does not affect local registered memory mapped by
+ * a SCIF endpoint on a remote node. The local memory remains mapped by the peer
+ * SCIF endpoint until the peer explicitly removes it by calling munmap(..).
+ *
+ * If the peer endpoint's receive queue is not empty at the time that epd is
+ * closed, then the peer endpoint can be passed as the endpoint parameter to
+ * scif_recv() until the receive queue is empty.
+ *
+ * epd is freed and may no longer be accessed.
+ *
+ *\return
+ * Upon successful completion, scif_close() returns 0; otherwise: in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode the
+ * negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor
+ */
+int scif_close(scif_epd_t epd);
+
+/**
+ * scif_send - Send a message
+ *	\param epd		endpoint descriptor
+ *	\param msg		message buffer address
+ *	\param len		message length
+ *	\param flags		blocking mode flags
+ *
+ * scif_send() sends data to the peer of endpoint epd. Up to len bytes of data
+ * are copied from memory starting at address msg. On successful execution the
+ * return value of scif_send() is the number of bytes that were sent, and is
+ * zero if no bytes were sent because len was zero. scif_send() may be called
+ * only when the endpoint is in a connected state.
+ *
+ * If a scif_send() call is non-blocking, then it sends only those bytes which
+ * can be sent without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_send() call is blocking, then it normally returns after sending
+ * all len bytes. If a blocking call is interrupted or the connection is
+ * reset, the call is considered successful if some bytes were sent or len is
+ * zero, otherwise the call is considered unsuccessful.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when the send queue is not full. In kernel mode, the scif_poll() function
+ * may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * The flags argument is formed by ORing together zero or more of the following
+ * values:
+ *- SCIF_SEND_BLOCK: block until the entire message is sent.
+ *
+ *\return
+ * Upon successful completion, scif_send() returns the number of bytes sent;
+ * otherwise: in user mode -1 is returned and errno is set to indicate the
+ * error; in kernel mode the negative of one of the following errors is
+ * returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EFAULT
+ * - An invalid address was specified for a parameter
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - flags is invalid, or
+ * - len is negative
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOMEM
+ * - Not enough space
+ *- ENOTCONN
+ * - The endpoint is not connected
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_recv - Receive a message
+ *	\param epd		endpoint descriptor
+ *	\param msg		message buffer address
+ *	\param len		message buffer length
+ *	\param flags		blocking mode flags
+ *
+ * scif_recv() receives data from the peer of endpoint epd. Up to len bytes of
+ * data are copied to memory starting at address msg. On successful execution
+ * the return value of scif_recv() is the number of bytes that were received,
+ * and is zero if no bytes were received because len was zero. scif_recv() may
+ * be called only when the endpoint is in a connected state.
+ *
+ * If a scif_recv() call is non-blocking, then it receives only those bytes
+ * which can be received without waiting, up to a maximum of len bytes.
+ *
+ * If a scif_recv() call is blocking, then it normally returns after receiving
+ * all len bytes. If the blocking call was interrupted due to a disconnection,
+ * subsequent calls to scif_recv() will copy all bytes received up to the point
+ * of disconnection.
+ *
+ * In user mode, the select() and poll() functions can be used to determine
+ * when data is available to be received. In kernel mode, the scif_poll()
+ * function may be used for this purpose.
+ *
+ * It is recommended that scif_send()/scif_recv() only be used for short
+ * control-type message communication between SCIF endpoints. The SCIF RMA
+ * APIs are expected to provide better performance for transfer sizes of
+ * 1024 bytes or longer for the current MIC hardware and software
+ * implementation.
+ *
+ * The flags argument is formed by ORing together zero or more of the following
+ * values:
+ *- SCIF_RECV_BLOCK: block until the entire message is received.
+ *
+ *\return
+ * Upon successful completion, scif_recv() returns the number of bytes
+ * received; otherwise: in user mode -1 is returned and errno is set to
+ * indicate the error; in kernel mode the negative of one of the following
+ * errors is returned.
+ *
+ *\par Errors:
+ *- EAGAIN
+ * - The destination node is returning from a low power state
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EFAULT
+ * - An invalid address was specified for a parameter
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - flags is invalid, or
+ * - len is negative
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOMEM
+ * - Not enough space
+ *- ENOTCONN
+ * - The endpoint is not connected
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags);
+
+/**
+ * scif_register - Mark a memory region for remote access.
+ *	\param epd		endpoint descriptor
+ *	\param addr		starting virtual address
+ *	\param len		length of range
+ *	\param offset		offset of window
+ *	\param prot_flags	read/write protection flags
+ *	\param map_flags	mapping flags
+ *
+ * The scif_register() function opens a window, a range of whole pages of the
+ * registered address space of the endpoint epd, starting at offset po and
+ * continuing for len bytes. The value of po, further described below, is a
+ * function of the parameters offset and len, and the value of map_flags. Each
+ * page of the window represents the physical memory page which backs the
+ * corresponding page of the range of virtual address pages starting at addr
+ * and continuing for len bytes. addr and len are constrained to be multiples
+ * of the page size. A successful scif_register() call returns po.
+ *
+ * When SCIF_MAP_FIXED is set in the map_flags argument, po will be offset
+ * exactly, and offset is constrained to be a multiple of the page size. The
+ * mapping established by scif_register() will not replace any existing
+ * registration; an error is returned if any page within the range [offset,
+ * offset + len - 1] intersects an existing window.
+ * Note: When SCIF_MAP_FIXED is set the current implementation limits
+ * offset to the range [0..2^62-1] and returns EADDRINUSE if the offset
+ * requested with SCIF_MAP_FIXED is in the range [2^62..2^63-1].
+ *
+ * When SCIF_MAP_FIXED is not set, the implementation uses offset in an
+ * implementation-defined manner to arrive at po. The po value so chosen will
+ * be an area of the registered address space that the implementation deems
+ * suitable for a mapping of len bytes. An offset value of 0 is interpreted as
+ * granting the implementation complete freedom in selecting po, subject to
+ * constraints described below. A non-zero value of offset is taken to be a
+ * suggestion of an offset near which the mapping should be placed. When the
+ * implementation selects a value for po, it does not replace any extant
+ * window. In all cases, po will be a multiple of the page size.
+ *
+ * The physical pages which are so represented by a window are available for
+ * access in calls to mmap(), scif_readfrom(), scif_writeto(),
+ * scif_vreadfrom(), and scif_vwriteto(). While a window is registered, the
+ * physical pages represented by the window will not be reused by the memory
+ * subsystem for any other purpose. Note that the same physical page may be
+ * represented by multiple windows.
+ *
+ * Subsequent operations which change the memory pages to which virtual
+ * addresses are mapped (such as mmap(), munmap()) have no effect on
+ * existing windows.
+ *
+ * If the process will fork(), it is recommended that the registered
+ * virtual address range be marked with MADV_DONTFORK. Doing so will prevent
+ * problems due to copy-on-write semantics.
+ *
+ * The prot_flags argument is formed by OR'ing together one or more of the
+ * following values:
+ *- SCIF_PROT_READ: allow read operations from the window
+ *- SCIF_PROT_WRITE: allow write operations to the window
+ *
+ * The map_flags argument is formed by OR'ing together zero or more of
+ * the following values:
+ *- SCIF_MAP_FIXED: interpret offset exactly
+ *
+ *\return
+ * Upon successful completion, scif_register() returns the offset at which the
+ * mapping was placed (po); otherwise: in user mode SCIF_REGISTER_FAILED (that
+ * is ((off_t)-1)) is returned and errno is set to indicate the error; in
+ * kernel mode the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EADDRINUSE
+ * - SCIF_MAP_FIXED is set in map_flags, and pages in the range [offset,
+ *	offset + len - 1] are already registered
+ *- EAGAIN
+ * - The mapping could not be performed due to lack of resources
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EFAULT
+ * - Addresses in the range [addr, addr + len - 1] are invalid
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - map_flags is invalid, or
+ * - prot_flags is invalid, or
+ * - SCIF_MAP_FIXED is set in map_flags, and offset is not a multiple of
+ *	the page size, or
+ * - addr is not a multiple of the page size, or
+ * - len is not a multiple of the page size, or is 0, or
+ * - offset is negative
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOMEM
+ * - Not enough space
+ *- ENOTCONN
+ * - The endpoint is not connected
+ */
+off_t scif_register(scif_epd_t epd, void *addr, size_t len, off_t offset,
+		    int prot_flags, int map_flags);
+
+/**
+ * scif_unregister - Close windows previously registered for remote access.
+ *	\param epd		endpoint descriptor
+ *	\param offset		start of range to unregister
+ *	\param len		length of range to unregister
+ *
+ * The scif_unregister() function closes those previously registered windows
+ * which are entirely within the range [offset, offset + len - 1]. It is an
+ * error to specify a range which intersects only a subrange of a window.
+ *
+ * On a successful return, pages within the window may no longer be specified
+ * in calls to mmap(), scif_readfrom(), scif_writeto(), scif_vreadfrom(),
+ * scif_vwriteto(), scif_get_pages(), and scif_fence_signal(). The window,
+ * however, continues to exist until all previous references against it are
+ * removed. A window is referenced if there is a mapping to it created by
+ * mmap(), or if scif_get_pages() was called against the window
+ * (and the pages have not been returned via scif_put_pages()). A window is
+ * also referenced while an RMA, in which some range of the window is a source
+ * or destination, is in progress. Finally a window is referenced while some
+ * offset in that window was specified to scif_fence_signal(), and the RMAs
+ * marked by that call to scif_fence_signal() have not completed. While a
+ * window is in this state, its registered address space pages are not
+ * available for use in a new registered window.
+ *
+ * When all such references to the window have been removed, its references to
+ * all the physical pages which it represents are removed. Similarly, the
+ * registered address space pages of the window become available for
+ * registration in a new window.
+ *
+ *\return
+ * Upon successful completion, scif_unregister() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned. In the event of an
+ * error, no windows are unregistered.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - The range [offset, offset + len - 1] intersects a subrange of a window, or
+ * - offset is negative
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - Offsets in the range [offset, offset + len - 1] are invalid for the
+ *	registered address space of epd
+ */
+int scif_unregister(scif_epd_t epd, off_t offset, size_t len);
+
+/**
+ * scif_readfrom - Copy from a remote address space
+ *	\param epd		endpoint descriptor
+ *	\param loffset		offset in local registered address space to
+ *				which to copy
+ *	\param len		length of range to copy
+ *	\param roffset		offset in remote registered address space
+ *				from which to copy
+ *	\param rma_flags	transfer mode flags
+ *
+ * scif_readfrom() copies len bytes from the remote registered address space of
+ * the peer of endpoint epd, starting at the offset roffset to the local
+ * registered address space of epd, starting at the offset loffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_readfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if loffset and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values:
+ *- SCIF_RMA_USECPU: perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ *- SCIF_RMA_SYNC: perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ *- SCIF_RMA_ORDERED: ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ *\return
+ * Upon successful completion, scif_readfrom() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EACCES
+ * - Attempt to write to a read-only range
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - rma_flags is invalid
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - The range [loffset, loffset + len - 1] is invalid for the registered
+ *	address space of epd, or
+ * - The range [roffset, roffset + len - 1] is invalid for the registered
+ *	address space of the peer of epd, or
+ * - loffset or roffset is negative
+ */
+int scif_readfrom(scif_epd_t epd, off_t loffset, size_t len,
+		  off_t roffset, int rma_flags);
+
+/**
+ * scif_writeto - Copy to a remote address space
+ *	\param epd		endpoint descriptor
+ *	\param loffset		offset in local registered address space
+ *				from which to copy
+ *	\param len		length of range to copy
+ *	\param roffset		offset in remote registered address space to
+ *				which to copy
+ *	\param rma_flags	transfer mode flags
+ *
+ * scif_writeto() copies len bytes from the local registered address space of
+ * epd, starting at the offset loffset to the remote registered address space
+ * of the peer of endpoint epd, starting at the offset roffset.
+ *
+ * Each of the specified ranges [loffset, loffset + len - 1] and [roffset,
+ * roffset + len - 1] must be within some registered window or windows of the
+ * local and remote nodes. A range may intersect multiple registered windows,
+ * but only if those windows are contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_writeto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * The optimal DMA performance will likely be realized if both
+ * loffset and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if loffset and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if loffset and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values:
+ *- SCIF_RMA_USECPU: perform the transfer using the CPU, otherwise use the DMA
+ *			engine.
+ *- SCIF_RMA_SYNC: perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ *- SCIF_RMA_ORDERED: ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ *\return
+ * Upon successful completion, scif_writeto() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EACCES
+ * - Attempt to write to a read-only range
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - rma_flags is invalid
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - The range [loffset, loffset + len - 1] is invalid for the registered
+ *	address space of epd, or
+ * - The range [roffset, roffset + len - 1] is invalid for the registered
+ *	address space of the peer of epd, or
+ * - loffset or roffset is negative
+ */
+int scif_writeto(scif_epd_t epd, off_t loffset, size_t len,
+		 off_t roffset, int rma_flags);
+
+/**
+ * scif_vreadfrom - Copy from a remote address space
+ *	\param epd		endpoint descriptor
+ *	\param addr		address to which to copy
+ *	\param len		length of range to copy
+ *	\param roffset		offset in remote registered address space
+ *				from which to copy
+ *	\param rma_flags	transfer mode flags
+ *
+ * scif_vreadfrom() copies len bytes from the remote registered address
+ * space of the peer of endpoint epd, starting at the offset roffset, to local
+ * memory, starting at addr.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may
+ * intersect multiple registered windows, but only if those windows are
+ * contiguous in the registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vreadfrom() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not
+ * cacheline aligned but are separated by some multiple of 64. The lowest level
+ * of performance is likely if addr and roffset are not separated by a
+ * multiple of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values:
+ *- SCIF_RMA_USECPU: perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ *- SCIF_RMA_USECACHE: enable registration caching
+ *- SCIF_RMA_SYNC: perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ *- SCIF_RMA_ORDERED: ensure that the last cacheline or partial cacheline of
+ *	the source range becomes visible on the destination node
+ *	after all other transferred data in the source range has
+ *	become visible on the destination
+ *
+ *\return
+ * Upon successful completion, scif_vreadfrom() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EACCES
+ * - Attempt to write to a read-only range
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EFAULT
+ * - Addresses in the range [addr, addr + len - 1] are invalid
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - rma_flags is invalid
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ *	registered address space of the peer of epd
+ */
+int scif_vreadfrom(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		   int rma_flags);
+
+/**
+ * scif_vwriteto - Copy to a remote address space
+ *	\param epd		endpoint descriptor
+ *	\param addr		address from which to copy
+ *	\param len		length of range to copy
+ *	\param roffset		offset in remote registered address space to
+ *				which to copy
+ *	\param rma_flags	transfer mode flags
+ *
+ * scif_vwriteto() copies len bytes from the local memory, starting at addr, to
+ * the remote registered address space of the peer of endpoint epd, starting at
+ * the offset roffset.
+ *
+ * The specified range [roffset, roffset + len - 1] must be within some
+ * registered window or windows of the remote node. The range may intersect
+ * multiple registered windows, but only if those windows are contiguous in the
+ * registered address space.
+ *
+ * If rma_flags includes SCIF_RMA_USECPU, then the data is copied using
+ * programmed read/writes. Otherwise the data is copied using DMA. If rma_flags
+ * includes SCIF_RMA_SYNC, then scif_vwriteto() will return after the transfer
+ * is complete. Otherwise, the transfer may be performed asynchronously. The
+ * order in which any two asynchronous RMA operations complete is
+ * non-deterministic. The synchronization functions, scif_fence_mark()/
+ * scif_fence_wait() and scif_fence_signal(), can be used to synchronize to
+ * the completion of asynchronous RMA operations on the same endpoint.
+ *
+ * The DMA transfer of individual bytes is not guaranteed to complete in
+ * address order. If rma_flags includes SCIF_RMA_ORDERED, then the last
+ * cacheline or partial cacheline of the source range will become visible on
+ * the destination node after all other transferred data in the source
+ * range has become visible on the destination node.
+ *
+ * If rma_flags includes SCIF_RMA_USECACHE, then the physical pages which back
+ * the specified local memory range may remain in a pinned state even after
+ * the specified transfer completes. This may reduce overhead if some or all of
+ * the same virtual address range is referenced in a subsequent call of
+ * scif_vreadfrom() or scif_vwriteto().
+ *
+ * The optimal DMA performance will likely be realized if both
+ * addr and roffset are cacheline aligned (are a multiple of 64). Lower
+ * performance will likely be realized if addr and roffset are not cacheline
+ * aligned but are separated by some multiple of 64. The lowest level of
+ * performance is likely if addr and roffset are not separated by a multiple
+ * of 64.
+ *
+ * The rma_flags argument is formed by ORing together zero or more of the
+ * following values:
+ *- SCIF_RMA_USECPU: perform the transfer using the CPU, otherwise use the DMA
+ *	engine.
+ *- SCIF_RMA_USECACHE: allow registration caching
+ *- SCIF_RMA_SYNC: perform the transfer synchronously, returning after the
+ *		transfer has completed. Passing this flag results in the
+ *		current implementation busy waiting and consuming CPU cycles
+ *		while the DMA transfer is in progress for best performance by
+ *		avoiding the interrupt latency.
+ *- SCIF_RMA_ORDERED: ensure that the last cacheline or partial cacheline of
+ *		the source range becomes visible on the destination node
+ *		after all other transferred data in the source range has
+ *		become visible on the destination
+ *
+ *\return
+ * Upon successful completion, scif_vwriteto() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EACCES
+ * - Attempt to write to a read-only range
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EFAULT
+ * - Addresses in the range [addr, addr + len - 1] are invalid
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - rma_flags is invalid
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - Offsets in the range [roffset, roffset + len - 1] are invalid for the
+ *	registered address space of the peer of epd
+ */
+int scif_vwriteto(scif_epd_t epd, void *addr, size_t len, off_t roffset,
+		  int rma_flags);
+
+/**
+ * scif_fence_mark - Mark previously issued RMAs
+ *	\param epd		endpoint descriptor
+ *	\param flags		control flags
+ *	\param mark		marked value returned as output.
+ *
+ * scif_fence_mark() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd. The RMAs are
+ * marked with a value returned at mark. The application may subsequently call
+ * scif_fence_wait(), passing the value returned at mark, to await completion
+ * of all RMAs so marked.
+ *
+ * The flags argument has exactly one of the following values:
+ *- SCIF_FENCE_INIT_SELF: RMA operations initiated through endpoint
+ *	epd are marked
+ *- SCIF_FENCE_INIT_PEER: RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ *
+ * \return
+ * Upon successful completion, scif_fence_mark() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - flags is invalid, or
+ * - epd is not a valid endpoint descriptor
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENOMEM
+ * - Insufficient kernel memory was available
+ */
+int scif_fence_mark(scif_epd_t epd, int flags, int *mark);
+
+/**
+ * scif_fence_wait - Wait for completion of marked RMAs
+ *
+ *	\param epd		endpoint descriptor
+ *	\param mark		mark request
+ *
+ * scif_fence_wait() returns after all RMAs marked with mark have completed.
+ * The value passed in mark must have been obtained in a previous call to
+ * scif_fence_mark().
+ *
+ *\return
+ * Upon successful completion, scif_fence_wait() returns 0; otherwise: in user
+ * mode -1 is returned and errno is set to indicate the error; in kernel mode
+ * the negative of one of the following errors is returned.
+ *
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENOMEM
+ * - Insufficient kernel memory was available
+ */
+int scif_fence_wait(scif_epd_t epd, int mark);
+
+/**
+ * scif_fence_signal - Request a memory update on completion of RMAs
+ *	\param loff		local offset
+ *	\param lval		local value to write to loff
+ *	\param roff		remote offset
+ *	\param rval		remote value to write to roff
+ *	\param flags		flags
+ *
+ * scif_fence_signal() returns after marking the current set of all uncompleted
+ * RMAs initiated through the endpoint epd or marking the current set of all
+ * uncompleted RMAs initiated through the peer of endpoint epd.
+ *
+ * If flags includes SCIF_SIGNAL_LOCAL, then on completion of the RMAs in the
+ * marked set, lval is written to memory at the address corresponding to offset
+ * loff in the local registered address space of epd. loff must be within a
+ * registered window. If flags includes SCIF_SIGNAL_REMOTE, then on completion
+ * of the RMAs in the marked set, rval is written to memory at the address
+ * corresponding to offset roff in the remote registered address space of epd.
+ * roff must be within a remote registered window of the peer of epd. Note
+ * that any specified offset must be DWORD (4 byte / 32 bit) aligned.
+ *
+ * The flags argument is formed by OR'ing together the following:
+ *- Exactly one of the following values:
+ * - SCIF_FENCE_INIT_SELF: RMA operations initiated through endpoint
+ *	epd are marked
+ * - SCIF_FENCE_INIT_PEER: RMA operations initiated through the peer
+ *	of endpoint epd are marked
+ *- One or more of the following values:
+ * - SCIF_SIGNAL_LOCAL: On completion of the marked set of RMAs, write lval to
+ *	memory at the address corresponding to offset loff in the local
+ *	registered address space of epd.
+ * - SCIF_SIGNAL_REMOTE: On completion of the marked set of RMAs, write rval to
+ *	memory at the address corresponding to offset roff in the remote
+ *	registered address space of epd.
+ *
+ *\return
+ * Upon successful completion, scif_fence_signal() returns 0; otherwise: in
+ * user mode -1 is returned and errno is set to indicate the error; in kernel
+ * mode the negative of one of the following errors is returned.
+ *\par Errors:
+ *- EBADF, ENOTTY
+ * - epd is not a valid endpoint descriptor
+ *- ECONNRESET
+ * - Connection reset by peer
+ *- EINVAL
+ * - epd is not a valid endpoint descriptor, or
+ * - flags is invalid, or
+ * - loff or roff is not DWORD aligned
+ *- ENODEV
+ * - The remote node is lost, or existed but is not currently in the
+ *	network since it may have crashed
+ *- ENOTCONN
+ * - The endpoint is not connected
+ *- ENXIO
+ * - loff is invalid for the registered address space of epd, or
+ * - roff is invalid for the registered address space of the peer of epd
+ */
+int scif_fence_signal(scif_epd_t epd, off_t loff, uint64_t lval, off_t roff,
+		      uint64_t rval, int flags);
+
+/**
+ * scif_get_node_ids - Return information about online nodes
+ *	\param nodes		array in which to return online node IDs
+ *	\param len		number of entries in the nodes array
+ *	\param self		address to place the node ID of the local node
+ *
+ * scif_get_node_ids() fills in the nodes array with up to len node IDs of the
+ * nodes in the SCIF network. If there is not enough space in nodes, as
+ * indicated by the len parameter, only len node IDs are returned in nodes. The
+ * return value of scif_get_node_ids() is the total number of nodes currently in
+ * the SCIF network. By checking the return value against the len parameter,
+ * the user may determine if enough space for nodes was allocated.
+ *
+ * The node ID of the local node is returned at self.
+ *
+ *\return
+ * Upon successful completion, scif_get_node_ids() returns the actual number of
+ * online nodes in the SCIF network including 'self'; otherwise: in user mode
+ * -1 is returned and errno is set to indicate the error; in kernel mode no
+ * errors are returned.
+ *
+ *\par Errors:
+ *- EFAULT
+ * - Bad address
+ */
+int scif_get_node_ids(uint16_t *nodes, int len, uint16_t *self);
+
+#endif /* __SCIF_H__ */
diff --git a/include/uapi/linux/scif_ioctl.h b/include/uapi/linux/scif_ioctl.h
new file mode 100644
index 0000000..22aa0ae
--- /dev/null
+++ b/include/uapi/linux/scif_ioctl.h
@@ -0,0 +1,233 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+/*
+ * -----------------------------------------
+ * SCIF IOCTL interface information
+ * -----------------------------------------
+ */
+#ifndef SCIF_IOCTL_H
+#define SCIF_IOCTL_H
+
+struct scif_port_id {
+	uint16_t node; /* node on which port resides */
+	uint16_t port; /* Local port number */
+};
+
+/**
+ * struct scifioctl_connect:
+ *
+ * \param self			used to read back the assigned port_id
+ * \param peer			destination node and port to connect to
+ *
+ * This structure is used for the SCIF_CONNECT IOCTL.
+ */
+struct scifioctl_connect {
+	struct scif_port_id	self;
+	struct scif_port_id	peer;
+};
+
+/**
+ * struct scifioctl_accept:
+ *
+ * \param flags			flags
+ * \param peer			global id of peer endpoint
+ * \param endpt			new connected endpoint descriptor
+ *
+ * This structure is used for SCIF_ACCEPTREQ IOCTL.
+ */
+struct scifioctl_accept {
+	int			flags;
+	struct scif_port_id	peer;
+	void			*endpt;
+};
+
+/**
+ * struct scifioctl_msg:
+ *
+ * \param msg			message buffer address
+ * \param len			message length
+ * \param flags			flags
+ * \param out_len		Number of bytes sent/received.
+ *
+ * This structure is used for SCIF_SEND/SCIF_RECV IOCTL.
+ */
+struct scifioctl_msg {
+	void	__user *msg;
+	int	len;
+	int	flags;
+	int	out_len;
+};
+
+/**
+ * struct scifioctl_reg:
+ *
+ * \param addr			starting virtual address
+ * \param len			length of range
+ * \param offset		offset of window
+ * \param prot			read/write protection
+ * \param flags			flags
+ * \param out_offset		offset returned.
+ *
+ * This structure is used for SCIF_REG IOCTL.
+ */
+struct scifioctl_reg {
+	void		*addr;
+	uint64_t	len;
+	off_t		offset;
+	int		prot;
+	int		flags;
+	off_t		out_offset;
+};
+
+/**
+ * struct scifioctl_unreg:
+ *
+ * \param offset		start of range to unregister
+ * \param len			length of range to unregister
+ *
+ * This structure is used for SCIF_UNREG IOCTL.
+ */
+struct scifioctl_unreg {
+	off_t		offset;
+	uint64_t	len;
+};
+
+/**
+ * struct scifioctl_copy:
+ *
+ * \param loffset	offset in local registered address space to/from
+ *			which to copy
+ * \param len		length of range to copy
+ * \param roffset	offset in remote registered address space to/from
+ *			which to copy
+ * \param addr		user virtual address to/from which to copy
+ * \param flags		flags
+ *
+ * This structure is used for the SCIF_READFROM and SCIF_WRITETO IOCTLs
+ * (which use loffset) and for the SCIF_VREADFROM and SCIF_VWRITETO IOCTLs
+ * (which use addr).
+ */
+struct scifioctl_copy {
+	off_t		loffset;
+	uint64_t	len;
+	off_t		roffset;
+	uint8_t		__user *addr;
+	int		flags;
+};
+
+/**
+ * struct scifioctl_fence_mark:
+ *
+ * \param flags		flags
+ * \param mark		Fence handle returned by reference.
+ *
+ * This structure is used for the SCIF_FENCE_MARK IOCTL.
+ */
+struct scifioctl_fence_mark {
+	int		flags;
+	int		__user *mark;
+};
+
+/**
+ * struct scifioctl_fence_signal:
+ *
+ * \param loff		local offset
+ * \param lval		local value to write to loff
+ * \param roff		remote offset
+ * \param rval		remote value to write to roff
+ * \param flags		flags
+ *
+ * This structure is used for SCIF_FENCE_SIGNAL IOCTL.
+ */
+struct scifioctl_fence_signal {
+	off_t		loff;
+	uint64_t	lval;
+	off_t		roff;
+	uint64_t	rval;
+	int		flags;
+};
+
+/**
+ * struct scifioctl_node_ids:
+ *
+ * \param nodes		pointer to an array of node_ids
+ * \param len		length of array
+ * \param self		ID of the current node
+ *
+ * This structure is used for the SCIF_GET_NODEIDS ioctl
+ */
+struct scifioctl_node_ids {
+	uint16_t __user		*nodes;
+	int			len;
+	uint16_t __user		*self;
+};
+
+#define SCIF_BIND		_IOWR('s', 1, int *)
+#define SCIF_LISTEN		_IOW('s', 2, int)
+#define SCIF_CONNECT		_IOWR('s', 3, struct scifioctl_connect *)
+#define SCIF_ACCEPTREQ		_IOWR('s', 4, struct scifioctl_accept *)
+#define SCIF_ACCEPTREG		_IOWR('s', 5, void *)
+#define SCIF_SEND		_IOWR('s', 6, struct scifioctl_msg *)
+#define SCIF_RECV		_IOWR('s', 7, struct scifioctl_msg *)
+#define SCIF_REG		_IOWR('s', 8, struct scifioctl_reg *)
+#define SCIF_UNREG		_IOWR('s', 9, struct scifioctl_unreg *)
+#define SCIF_READFROM		_IOWR('s', 10, struct scifioctl_copy *)
+#define SCIF_WRITETO		_IOWR('s', 11, struct scifioctl_copy *)
+#define SCIF_VREADFROM		_IOWR('s', 12, struct scifioctl_copy *)
+#define SCIF_VWRITETO		_IOWR('s', 13, struct scifioctl_copy *)
+#define SCIF_GET_NODEIDS	_IOWR('s', 14, struct scifioctl_node_ids *)
+#define SCIF_FENCE_MARK		_IOWR('s', 15, struct scifioctl_fence_mark *)
+#define SCIF_FENCE_WAIT		_IOWR('s', 16, int)
+#define SCIF_FENCE_SIGNAL	_IOWR('s', 17, struct scifioctl_fence_signal *)
+
+#endif /* SCIF_IOCTL_H */
-- 
1.8.2.1
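
An illustrative aside, not part of the patch: the alignment guidance that
the RMA comments in scif.h repeat for scif_readfrom(), scif_writeto(),
scif_vreadfrom() and scif_vwriteto() can be sketched as a small classifier.
The enum and function names are hypothetical; only the 64-byte cacheline
figure and the three performance levels come from the header comments.
"Separated by a multiple of 64" is taken here to mean that both offsets
have the same remainder modulo 64.

```c
#include <assert.h>
#include <stdint.h>

/* Cacheline size quoted in the scif.h RMA comments. */
#define RMA_CACHELINE 64u

static_assert((RMA_CACHELINE & (RMA_CACHELINE - 1)) == 0,
	      "cacheline size is assumed to be a power of two");

/* Hypothetical names for the three performance levels the comments list. */
enum rma_align_tier {
	RMA_ALIGN_BEST,		/* both offsets are multiples of 64 */
	RMA_ALIGN_MIDDLE,	/* unaligned, but separated by a multiple of 64 */
	RMA_ALIGN_WORST		/* not separated by a multiple of 64 */
};

/*
 * Classify a (local, remote) offset pair. For scif_vreadfrom() and
 * scif_vwriteto() the local value would be the address addr instead.
 */
static enum rma_align_tier rma_align_tier(uint64_t local, uint64_t remote)
{
	uint64_t lrem = local % RMA_CACHELINE;
	uint64_t rrem = remote % RMA_CACHELINE;

	if (lrem == 0 && rrem == 0)
		return RMA_ALIGN_BEST;
	if (lrem == rrem)
		return RMA_ALIGN_MIDDLE;
	return RMA_ALIGN_WORST;
}
```

For example, offsets 8 and 72 fall in the middle level because they differ
by exactly 64, while 8 and 16 fall in the lowest one.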


^ permalink raw reply related	[relevance 21%]
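
An illustrative aside on scif_get_node_ids() as documented in the scif.h
hunk of the patch above: because the call returns the total number of nodes
even when the array is too small, a caller can compare the return value with
len and retry with a larger buffer. The sketch below assumes nothing about
the real library beyond that documented behaviour; stub_get_node_ids() and
get_all_node_ids() are hypothetical names, and the stub stands in for the
real call so the example runs without SCIF hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in for scif_get_node_ids(): a made-up network of five nodes
 * (IDs 0..4) whose local node is 0. Like the documented call, it writes
 * at most len IDs and returns the total number of nodes.
 */
static int stub_get_node_ids(uint16_t *nodes, int len, uint16_t *self)
{
	const int total = 5;
	int i;

	for (i = 0; i < len && i < total; i++)
		nodes[i] = (uint16_t)i;
	*self = 0;
	return total;
}

/*
 * Fetch every node ID, growing the buffer whenever the return value shows
 * that it was too small. Returns a malloc()ed array that the caller must
 * free, or NULL on failure.
 */
static uint16_t *get_all_node_ids(int *count, uint16_t *self)
{
	uint16_t *nodes = NULL;
	int cap = 2;	/* deliberately small first guess */

	assert(count && self);
	for (;;) {
		uint16_t *tmp = realloc(nodes, (size_t)cap * sizeof(*nodes));
		int total;

		if (!tmp) {
			free(nodes);
			return NULL;
		}
		nodes = tmp;
		total = stub_get_node_ids(nodes, cap, self);
		if (total < 0) {	/* user-mode error path: -1 and errno */
			free(nodes);
			return NULL;
		}
		if (total <= cap) {	/* every ID fit in the array */
			*count = total;
			return nodes;
		}
		cap = total;		/* retry with the size just reported */
	}
}
```

The loop, rather than a single retry, is a design choice: with the real
call the set of online nodes could change between two invocations, so the
second attempt is not guaranteed to fit either.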

* [PATCH char-misc-next 06/13] misc: mic: SCIF module initialization
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (4 preceding siblings ...)
  2014-12-10 19:47 97% ` [PATCH char-misc-next 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
@ 2014-12-10 19:47 47% ` Sudeep Dutt
  2014-12-10 19:47 28% ` [PATCH char-misc-next 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF module initialization, DMA mapping, ioremap wrapper APIs
and debugfs hooks. SCIF gets probed by the SCIF hardware bus
if SCIF devices were registered by base drivers. A MISC device
is registered to provide the SCIF character device interface.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_main.h    | 248 ++++++++++++++++++++++
 drivers/misc/mic/scif/scif_map.h     | 113 ++++++++++
 drivers/misc/mic/scif/scif_debugfs.c |  82 ++++++++
 drivers/misc/mic/scif/scif_main.c    | 394 +++++++++++++++++++++++++++++++++++
 4 files changed, 837 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c

diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
new file mode 100644
index 0000000..861439e
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -0,0 +1,248 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAIN_H
+#define SCIF_MAIN_H
+
+#include <linux/sched.h>
+#include <linux/pci.h>
+#include <linux/miscdevice.h>
+#include <linux/dmaengine.h>
+#include <linux/file.h>
+#include <linux/scif.h>
+
+#include "../common/mic_dev.h"
+
+#define SCIF_MGMT_NODE 0
+#define SCIF_DEFAULT_WATCHDOG_TO 30
+#define SCIF_NODE_ACCEPT_TIMEOUT (3 * HZ)
+#define SCIF_NODE_ALIVE_TIMEOUT (SCIF_DEFAULT_WATCHDOG_TO * HZ)
+
+/*
+ * Generic state used for certain node QP message exchanges
+ * like Unregister, Alloc etc.
+ */
+enum scif_msg_state {
+	OP_IDLE = 1,
+	OP_IN_PROGRESS,
+	OP_COMPLETED,
+	OP_FAILED
+};
+
+/*
+ * struct scif_info - Global SCIF information
+ *
+ * @nodeid: Node ID this node is to others
+ * @maxid: Max known node ID
+ * @total: Total number of SCIF nodes
+ * @nr_zombies: number of zombie endpoints
+ * @eplock: Lock to synchronize listening, zombie endpoint lists
+ * @connlock: Lock to synchronize connected and disconnected lists
+ * @nb_connect_lock: Synchronize non blocking connect operations
+ * @port_lock: Synchronize access to SCIF ports
+ * @uaccept: List of user acceptreq waiting for acceptreg
+ * @listen: List of listening end points
+ * @zombie: List of zombie end points with pending RMA's
+ * @connected: List of end points in connected state
+ * @disconnected: List of end points in disconnected state
+ * @nb_connect_list: List for non blocking connections
+ * @misc_work: miscellaneous SCIF tasks
+ * @conflock: Lock to synchronize SCIF node configuration changes
+ * @en_msg_log: Enable debug message logging
+ * @p2p_enable: Enable P2P SCIF network
+ * @miscdev: The MISC device
+ * @conn_work: Work for workqueue handling all connections
+ * @exitwq: Wait queue for waiting for an EXIT node QP message response
+ * @loopb_dev: Dummy SCIF device used for loopback
+ * @loopb_wq: Workqueue used for handling loopback messages
+ * @loopb_wqname: Name of loopback workqueue
+ * @loopb_work: Used for submitting work to loopb_wq
+ * @loopb_recv_q: List of messages received on the loopb_wq
+ */
+struct scif_info {
+	u8 nodeid;
+	u8 maxid;
+	u8 total;
+	u32 nr_zombies;
+	spinlock_t eplock;
+	struct mutex connlock;
+	spinlock_t nb_connect_lock;
+	spinlock_t port_lock;
+	struct list_head uaccept;
+	struct list_head listen;
+	struct list_head zombie;
+	struct list_head connected;
+	struct list_head disconnected;
+	struct list_head nb_connect_list;
+	struct work_struct misc_work;
+	struct mutex conflock;
+	u8 en_msg_log;
+	u8 p2p_enable;
+	struct miscdevice miscdev;
+	struct work_struct conn_work;
+	wait_queue_head_t exitwq;
+	struct scif_dev *loopb_dev;
+	struct workqueue_struct *loopb_wq;
+	char loopb_wqname[16];
+	struct work_struct loopb_work;
+	struct list_head loopb_recv_q;
+};
+
+/*
+ * struct scif_p2p_info - SCIF mapping information used for P2P
+ *
+ * @ppi_peer_id: SCIF peer node id
+ * @ppi_sg: Scatter list for bar information (One for mmio and one for aper)
+ * @sg_nentries: Number of entries in the scatterlist
+ * @ppi_da: DMA address for MMIO and APER bars
+ * @ppi_len: Length of MMIO and APER bars
+ * @ppi_list: Link in list of mapping information
+ */
+struct scif_p2p_info {
+	u8 ppi_peer_id;
+	struct scatterlist *ppi_sg[2];
+	u64 sg_nentries[2];
+	dma_addr_t ppi_da[2];
+	u64 ppi_len[2];
+#define SCIF_PPI_MMIO 0
+#define SCIF_PPI_APER 1
+	struct list_head ppi_list;
+};
+
+/*
+ * struct scif_dev - SCIF remote device specific fields
+ *
+ * @node: Node id
+ * @p2p: List of P2P mapping information
+ * @qpairs: The node queue pair for exchanging control messages
+ * @intr_wq: Workqueue for handling Node QP messages
+ * @intr_wqname: Name of node QP workqueue for handling interrupts
+ * @intr_bh: Used for submitting work to intr_wq
+ * @lock: Lock used for synchronizing access to the scif device
+ * @sdev: SCIF hardware device on the SCIF hardware bus
+ * @db: doorbell the peer will trigger to generate an interrupt on self
+ * @rdb: Doorbell to trigger on the peer to generate an interrupt on the peer
+ * @cookie: Cookie received while registering the interrupt handler
+ * @qp_dma_addr: Queue pair DMA address passed to the peer
+ * @p2p_dwork: Delayed work to enable polling for P2P state
+ * @qp_dwork: Delayed work for enabling polling for remote QP information
+ * @p2p_retry: Number of times to retry polling of P2P state
+ * @base_addr: P2P aperture bar base address
+ * @mmio: The peer MMIO information used for P2P
+ * @spdev: SCIF peer device on the SCIF peer bus
+ * @node_remove_ack_pending: True if a node_remove_ack is pending
+ * @exit_ack_pending: true if an exit_ack is pending
+ * @disconn_wq: Used while waiting for a node remove response
+ * @disconn_rescnt: Keeps track of number of node remove requests sent
+ * @exit: Status of exit message
+ */
+struct scif_dev {
+	u8 node;
+	struct list_head p2p;
+	struct scif_qp *qpairs;
+	struct workqueue_struct *intr_wq;
+	char intr_wqname[16];
+	struct work_struct intr_bh;
+	struct mutex lock;
+	struct scif_hw_dev *sdev;
+	int db;
+	int rdb;
+	struct mic_irq *cookie;
+	dma_addr_t qp_dma_addr;
+	struct delayed_work p2p_dwork;
+	struct delayed_work qp_dwork;
+	int p2p_retry;
+	dma_addr_t base_addr;
+	struct mic_mw mmio;
+	struct scif_peer_dev __rcu *spdev;
+	bool node_remove_ack_pending;
+	bool exit_ack_pending;
+	wait_queue_head_t disconn_wq;
+	atomic_t disconn_rescnt;
+	enum scif_msg_state exit;
+};
+
+extern struct scif_info scif_info;
+extern struct idr scif_ports;
+extern struct scif_dev *scif_dev;
+extern const struct file_operations scif_fops;
+
+/* Size of the RB for the Node QP */
+#define SCIF_NODE_QP_SIZE 0x10000
+
+#include "scif_nodeqp.h"
+
+/*
+ * scifdev_self:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device passed is the self aka Loopback SCIF device.
+ */
+static inline int scifdev_self(struct scif_dev *dev)
+{
+	return dev->node == scif_info.nodeid;
+}
+
+static inline bool scif_is_mgmt_node(void)
+{
+	return !scif_info.nodeid;
+}
+
+/*
+ * scifdev_is_p2p:
+ * @dev: The remote SCIF Device
+ *
+ * Returns true if the SCIF Device is a MIC Peer to Peer SCIF device.
+ */
+static inline bool scifdev_is_p2p(struct scif_dev *dev)
+{
+	if (scif_is_mgmt_node())
+		return false;
+	else
+		return dev != &scif_dev[SCIF_MGMT_NODE] &&
+			!scifdev_self(dev);
+}
+
+/*
+ * _scifdev_alive:
+ * @scifdev: The remote SCIF Device
+ *
+ * Returns true if the remote SCIF Device is running or sleeping for
+ * this endpoint.
+ */
+static inline int _scifdev_alive(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	return !!spdev;
+}
+
+void __init scif_init_debugfs(void);
+void scif_exit_debugfs(void);
+int scif_setup_intr_wq(struct scif_dev *scifdev);
+void scif_destroy_intr_wq(struct scif_dev *scifdev);
+void scif_cleanup_scifdev(struct scif_dev *dev);
+void scif_handle_remove_node(int node);
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated);
+void scif_free_qp(struct scif_dev *dev);
+void scif_misc_handler(struct work_struct *work);
+void scif_stop(struct scif_dev *scifdev);
+irqreturn_t scif_intr_handler(int irq, void *data);
+#endif /* SCIF_MAIN_H */
diff --git a/drivers/misc/mic/scif/scif_map.h b/drivers/misc/mic/scif/scif_map.h
new file mode 100644
index 0000000..20e50b4
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_map.h
@@ -0,0 +1,113 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_MAP_H
+#define SCIF_MAP_H
+
+#include "../bus/scif_bus.h"
+
+static __always_inline void *
+scif_alloc_coherent(dma_addr_t *dma_handle,
+		    struct scif_dev *scifdev, size_t size,
+		    gfp_t gfp)
+{
+	void *va;
+
+	if (scifdev_self(scifdev)) {
+		va = kmalloc(size, gfp);
+		if (va)
+			*dma_handle = virt_to_phys(va);
+	} else {
+		va = dma_alloc_coherent(&scifdev->sdev->dev,
+					size, dma_handle, gfp);
+		if (va && scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	return va;
+}
+
+static __always_inline void
+scif_free_coherent(void *va, dma_addr_t local,
+		   struct scif_dev *scifdev, size_t size)
+{
+	if (scifdev_self(scifdev)) {
+		kfree(va);
+	} else {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_free_coherent(&scifdev->sdev->dev,
+				  size, va, local);
+	}
+}
+
+static __always_inline int
+scif_map_single(dma_addr_t *dma_handle,
+		void *local, struct scif_dev *scifdev, size_t size)
+{
+	int err = 0;
+
+	if (scifdev_self(scifdev)) {
+		*dma_handle = virt_to_phys((local));
+	} else {
+		*dma_handle = dma_map_single(&scifdev->sdev->dev,
+					     local, size, DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(&scifdev->sdev->dev, *dma_handle))
+			err = -ENOMEM;
+		else if (scifdev_is_p2p(scifdev))
+			*dma_handle = *dma_handle + scifdev->base_addr;
+	}
+	if (err)
+		*dma_handle = 0;
+	return err;
+}
+
+static __always_inline void
+scif_unmap_single(dma_addr_t local, struct scif_dev *scifdev,
+		  size_t size)
+{
+	if (!scifdev_self(scifdev)) {
+		if (scifdev_is_p2p(scifdev) && local > scifdev->base_addr)
+			local = local - scifdev->base_addr;
+		dma_unmap_single(&scifdev->sdev->dev, local,
+				 size, DMA_BIDIRECTIONAL);
+	}
+}
+
+static __always_inline void *
+scif_ioremap(dma_addr_t phys, size_t size, struct scif_dev *scifdev)
+{
+	void *out_virt;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_self(scifdev))
+		out_virt = phys_to_virt(phys);
+	else
+		out_virt = (void __force *)
+			   sdev->hw_ops->ioremap(sdev, phys, size);
+	return out_virt;
+}
+
+static __always_inline void
+scif_iounmap(void *virt, size_t len, struct scif_dev *scifdev)
+{
+	if (!scifdev_self(scifdev)) {
+		struct scif_hw_dev *sdev = scifdev->sdev;
+
+		sdev->hw_ops->iounmap(sdev, (void __force __iomem *)virt);
+	}
+}
+#endif  /* SCIF_MAP_H */
diff --git a/drivers/misc/mic/scif/scif_debugfs.c b/drivers/misc/mic/scif/scif_debugfs.c
new file mode 100644
index 0000000..f837761
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_debugfs.c
@@ -0,0 +1,82 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+#include "../common/mic_dev.h"
+#include "scif_main.h"
+
+/* Debugfs parent dir */
+static struct dentry *scif_dbg;
+
+static int scif_dev_test(struct seq_file *s, void *unused)
+{
+	int node;
+
+	seq_printf(s, "Total Nodes %d Self Node Id %d Maxid %d\n",
+		   scif_info.total, scif_info.nodeid,
+		   scif_info.maxid);
+
+	if (!scif_dev)
+		return 0;
+
+	seq_printf(s, "%-16s\t%-16s\n", "node_id", "state");
+
+	for (node = 0; node <= scif_info.maxid; node++)
+		seq_printf(s, "%-16d\t%-16s\n", scif_dev[node].node,
+			   _scifdev_alive(&scif_dev[node]) ?
+			   "Running" : "Offline");
+	return 0;
+}
+
+static int scif_dev_test_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, scif_dev_test, inode->i_private);
+}
+
+static int scif_dev_test_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations scif_dev_ops = {
+	.owner   = THIS_MODULE,
+	.open    = scif_dev_test_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = scif_dev_test_release
+};
+
+void __init scif_init_debugfs(void)
+{
+	scif_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!scif_dbg) {
+		pr_err("can't create debugfs dir scif\n");
+		return;
+	}
+
+	debugfs_create_file("scif_dev", 0444, scif_dbg,
+			    NULL, &scif_dev_ops);
+	debugfs_create_u8("en_msg_log", 0666, scif_dbg, &scif_info.en_msg_log);
+	debugfs_create_u8("p2p_enable", 0666, scif_dbg, &scif_info.p2p_enable);
+}
+
+void scif_exit_debugfs(void)
+{
+	debugfs_remove_recursive(scif_dbg);
+}
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
new file mode 100644
index 0000000..b81ad4d
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -0,0 +1,394 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/idr.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_map.h"
+
+struct scif_info scif_info = {
+	.miscdev = {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = "scif",
+		.fops = &scif_fops,
+	}
+};
+
+struct scif_dev *scif_dev;
+static atomic_t g_loopb_cnt;
+
+/* Runs in the context of intr_wq */
+static void scif_intr_bh_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev =
+			container_of(work, struct scif_dev, intr_bh);
+
+	if (scifdev_self(scifdev))
+		scif_loopb_msg_handler(scifdev, scifdev->qpairs);
+	else
+		scif_nodeqp_intrhandler(scifdev, scifdev->qpairs);
+}
+
+int scif_setup_intr_wq(struct scif_dev *scifdev)
+{
+	if (!scifdev->intr_wq) {
+		snprintf(scifdev->intr_wqname, sizeof(scifdev->intr_wqname),
+			 "SCIF INTR %d", scifdev->node);
+		scifdev->intr_wq =
+			alloc_ordered_workqueue(scifdev->intr_wqname, 0);
+		if (!scifdev->intr_wq)
+			return -ENOMEM;
+		INIT_WORK(&scifdev->intr_bh, scif_intr_bh_handler);
+	}
+	return 0;
+}
+
+void scif_destroy_intr_wq(struct scif_dev *scifdev)
+{
+	if (scifdev->intr_wq) {
+		destroy_workqueue(scifdev->intr_wq);
+		scifdev->intr_wq = NULL;
+	}
+}
+
+irqreturn_t scif_intr_handler(int irq, void *data)
+{
+	struct scif_dev *scifdev = data;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	sdev->hw_ops->ack_interrupt(sdev, scifdev->db);
+	queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+	return IRQ_HANDLED;
+}
+
+static int scif_peer_probe(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total++;
+	scif_info.maxid = max_t(u32, spdev->dnode, scif_info.maxid);
+	mutex_unlock(&scif_info.conflock);
+	rcu_assign_pointer(scifdev->spdev, spdev);
+
+	/* In the future SCIF kernel client devices will be added here */
+	dev_info(&spdev->dev, "Peer added dnode %d\n",
+		 spdev->dnode);
+	return 0;
+}
+
+static void scif_peer_remove(struct scif_peer_dev *spdev)
+{
+	struct scif_dev *scifdev = &scif_dev[spdev->dnode];
+
+	/* In the future SCIF kernel client devices will be removed here */
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	synchronize_rcu();
+
+	mutex_lock(&scif_info.conflock);
+	scif_info.total--;
+	mutex_unlock(&scif_info.conflock);
+	dev_info(&spdev->dev, "Peer removed dnode %d\n",
+		 spdev->dnode);
+}
+
+static void scif_qp_setup_handler(struct work_struct *work)
+{
+	struct scif_dev *scifdev = container_of(work, struct scif_dev,
+						qp_dwork.work);
+	struct scif_hw_dev *sdev = scifdev->sdev;
+	dma_addr_t da = 0;
+	int err;
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		da = bp->scif_card_dma_addr;
+		scifdev->rdb = bp->h2c_scif_db;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		da = readq(&bp->scif_host_dma_addr);
+		scifdev->rdb = ioread8(&bp->c2h_scif_db);
+	}
+	if (da) {
+		err = scif_qp_response(da, scifdev);
+		if (err)
+			dev_err(&scifdev->sdev->dev,
+				"scif_qp_response err %d\n", err);
+	} else {
+		schedule_delayed_work(&scifdev->qp_dwork,
+				      msecs_to_jiffies(1000));
+	}
+}
+
+static int scif_setup_scifdev(struct scif_hw_dev *sdev)
+{
+	int i;
+	u8 num_nodes;
+
+	if (sdev->snode) {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		num_nodes = ioread8(&bp->tot_nodes);
+	} else {
+		struct mic_bootparam *bp = sdev->dp;
+
+		num_nodes = bp->tot_nodes;
+	}
+	scif_dev = kcalloc(num_nodes, sizeof(*scif_dev), GFP_KERNEL);
+	if (!scif_dev)
+		return -ENOMEM;
+	for (i = 0; i < num_nodes; i++) {
+		struct scif_dev *scifdev = &scif_dev[i];
+
+		scifdev->node = i;
+		scifdev->exit = OP_IDLE;
+		init_waitqueue_head(&scifdev->disconn_wq);
+		mutex_init(&scifdev->lock);
+		INIT_DELAYED_WORK(&scifdev->p2p_dwork,
+				  scif_poll_qp_state);
+		INIT_DELAYED_WORK(&scifdev->qp_dwork,
+				  scif_qp_setup_handler);
+		INIT_LIST_HEAD(&scifdev->p2p);
+		RCU_INIT_POINTER(scifdev->spdev, NULL);
+	}
+	return 0;
+}
+
+static void scif_destroy_scifdev(void)
+{
+	kfree(scif_dev);
+}
+
+static int scif_probe(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev;
+	int rc;
+
+	dev_set_drvdata(&sdev->dev, sdev);
+	if (1 == atomic_add_return(1, &g_loopb_cnt)) {
+		struct scif_dev *loopb_dev;
+
+		rc = scif_setup_scifdev(sdev);
+		if (rc)
+			goto exit;
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+		loopb_dev = &scif_dev[sdev->snode];
+		loopb_dev->sdev = sdev;
+		rc = scif_setup_loopback_qp(loopb_dev);
+		if (rc)
+			goto free_sdev;
+	} else {
+		scifdev = &scif_dev[sdev->dnode];
+		scifdev->sdev = sdev;
+	}
+	rc = scif_setup_intr_wq(scifdev);
+	if (rc)
+		goto destroy_loopb;
+	rc = scif_setup_qp(scifdev);
+	if (rc)
+		goto destroy_intr;
+	scifdev->db = sdev->hw_ops->next_db(sdev);
+	scifdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						    "SCIF_INTR", scifdev,
+						    scifdev->db);
+	if (IS_ERR(scifdev->cookie)) {
+		rc = PTR_ERR(scifdev->cookie);
+		goto free_qp;
+	}
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = scifdev->db;
+		bp->scif_host_dma_addr = scifdev->qp_dma_addr;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(scifdev->db, &bp->h2c_scif_db);
+		writeq(scifdev->qp_dma_addr, &bp->scif_card_dma_addr);
+	}
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+	return rc;
+free_qp:
+	scif_free_qp(scifdev);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+destroy_loopb:
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+free_sdev:
+	scif_destroy_scifdev();
+exit:
+	return rc;
+}
+
+void scif_stop(struct scif_dev *scifdev)
+{
+	struct scif_dev *dev;
+	int i;
+
+	for (i = scif_info.maxid; i >= 0; i--) {
+		dev = &scif_dev[i];
+		if (scifdev_self(dev))
+			continue;
+		scif_handle_remove_node(i);
+	}
+}
+
+static void scif_remove(struct scif_hw_dev *sdev)
+{
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	if (scif_is_mgmt_node()) {
+		struct mic_bootparam *bp = sdev->dp;
+
+		bp->c2h_scif_db = -1;
+		bp->scif_host_dma_addr = 0x0;
+	} else {
+		struct mic_bootparam __iomem *bp = sdev->rdp;
+
+		iowrite8(-1, &bp->h2c_scif_db);
+		writeq(0x0, &bp->scif_card_dma_addr);
+	}
+	if (scif_is_mgmt_node()) {
+		scif_disconnect_node(scifdev->node, true);
+	} else {
+		/*
+		 * Send an SCIF_EXIT message which is the last message from MIC
+		 * to the Host and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(scifdev);
+		scif_stop(scifdev);
+	}
+	if (atomic_dec_and_test(&g_loopb_cnt))
+		scif_destroy_loopback_qp(&scif_dev[sdev->snode]);
+	if (scifdev->cookie) {
+		sdev->hw_ops->free_irq(sdev, scifdev->cookie, scifdev);
+		scifdev->cookie = NULL;
+	}
+	scif_destroy_intr_wq(scifdev);
+	cancel_delayed_work(&scifdev->qp_dwork);
+	scif_free_qp(scifdev);
+	scifdev->rdb = -1;
+	scifdev->sdev = NULL;
+}
+
+static struct scif_peer_driver scif_peer_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.probe = scif_peer_probe,
+	.remove = scif_peer_remove,
+};
+
+static struct scif_hw_dev_id id_table[] = {
+	{ MIC_SCIF_DEV, SCIF_DEV_ANY_ID },
+	{ 0 },
+};
+
+static struct scif_driver scif_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = scif_probe,
+	.remove = scif_remove,
+};
+
+static int _scif_init(void)
+{
+	spin_lock_init(&scif_info.eplock);
+	spin_lock_init(&scif_info.nb_connect_lock);
+	spin_lock_init(&scif_info.port_lock);
+	mutex_init(&scif_info.conflock);
+	mutex_init(&scif_info.connlock);
+	INIT_LIST_HEAD(&scif_info.uaccept);
+	INIT_LIST_HEAD(&scif_info.listen);
+	INIT_LIST_HEAD(&scif_info.zombie);
+	INIT_LIST_HEAD(&scif_info.connected);
+	INIT_LIST_HEAD(&scif_info.disconnected);
+	INIT_LIST_HEAD(&scif_info.nb_connect_list);
+	init_waitqueue_head(&scif_info.exitwq);
+	scif_info.en_msg_log = 0;
+	scif_info.p2p_enable = 1;
+	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	idr_init(&scif_ports);
+	return 0;
+}
+
+static void _scif_exit(void)
+{
+	idr_destroy(&scif_ports);
+	scif_destroy_scifdev();
+}
+
+static int __init scif_init(void)
+{
+	struct miscdevice *mdev = &scif_info.miscdev;
+	int rc;
+
+	_scif_init();
+	rc = scif_peer_bus_init();
+	if (rc)
+		goto exit;
+	rc = scif_peer_register_driver(&scif_peer_driver);
+	if (rc)
+		goto peer_bus_exit;
+	rc = scif_register_driver(&scif_driver);
+	if (rc)
+		goto unreg_scif_peer;
+	rc = misc_register(mdev);
+	if (rc)
+		goto unreg_scif;
+	scif_init_debugfs();
+	return 0;
+unreg_scif:
+	scif_unregister_driver(&scif_driver);
+unreg_scif_peer:
+	scif_peer_unregister_driver(&scif_peer_driver);
+peer_bus_exit:
+	scif_peer_bus_exit();
+exit:
+	_scif_exit();
+	return rc;
+}
+
+static void __exit scif_exit(void)
+{
+	scif_exit_debugfs();
+	misc_deregister(&scif_info.miscdev);
+	scif_unregister_driver(&scif_driver);
+	scif_peer_unregister_driver(&scif_peer_driver);
+	scif_peer_bus_exit();
+	_scif_exit();
+}
+
+module_init(scif_init);
+module_exit(scif_exit);
+
+MODULE_DEVICE_TABLE(scif, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1


^ permalink raw reply related	[relevance 47%]

* [PATCH char-misc-next 03/13] misc: mic: SCIF Hardware Bus
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
  2014-12-10 19:47 21% ` [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
  2014-12-10 19:47 58% ` [PATCH char-misc-next 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
@ 2014-12-10 19:47 61% ` Sudeep Dutt
  2014-12-10 19:47 78% ` [PATCH char-misc-next 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF hardware bus abstracts low-level hardware driver details,
such as interrupts and mapping remote memory, so that the same SCIF
driver works unchanged with the MIC host or card driver as long as the
hardware bus operations are implemented. The host and card drivers
register the SCIF hardware device on the SCIF hardware bus, which
results in the SCIF driver being probed.
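
To illustrate the abstraction, here is a minimal userspace sketch of an
operations table, loosely modelled on struct scif_hw_ops from this patch.
Everything prefixed model_, host_ or card_ is hypothetical, as is the
doorbell numbering; the sketch only shows how bus-agnostic code can drive
two different backends through the same function pointers.

```c
#include <assert.h>

struct model_hw_dev;

/* Small subset of operations, in the spirit of struct scif_hw_ops */
struct model_hw_ops {
	int (*next_db)(struct model_hw_dev *dev);
	void (*send_intr)(struct model_hw_dev *dev, int db);
};

struct model_hw_dev {
	const struct model_hw_ops *ops;
	int next_free_db;	/* next doorbell index to hand out */
	int last_db_rung;	/* last doorbell rung, recorded for illustration */
};

/* Hypothetical "host" backend: doorbells are numbered from 0 */
static int host_next_db(struct model_hw_dev *dev)
{
	return dev->next_free_db++;
}

static void host_send_intr(struct model_hw_dev *dev, int db)
{
	dev->last_db_rung = db;
}

/* Hypothetical "card" backend: doorbells are numbered from 8 */
static int card_next_db(struct model_hw_dev *dev)
{
	return 8 + dev->next_free_db++;
}

static void card_send_intr(struct model_hw_dev *dev, int db)
{
	dev->last_db_rung = db;
}

static const struct model_hw_ops host_ops = {
	.next_db = host_next_db,
	.send_intr = host_send_intr,
};

static const struct model_hw_ops card_ops = {
	.next_db = card_next_db,
	.send_intr = card_send_intr,
};

/* Bus-agnostic caller: it never needs to know which backend it drives. */
static int model_setup_and_ring(struct model_hw_dev *dev)
{
	int db;

	assert(dev && dev->ops);
	db = dev->ops->next_db(dev);
	dev->ops->send_intr(dev, db);
	return db;
}
```

Calling model_setup_and_ring() on a device initialised with &host_ops or
&card_ops runs the same caller code in both cases; only the ops table
differs, which is the property the SCIF driver relies on.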

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig        |  17 ++++
 drivers/misc/mic/Makefile       |   2 +-
 drivers/misc/mic/bus/Makefile   |   1 +
 drivers/misc/mic/bus/scif_bus.h | 129 ++++++++++++++++++++++++
 drivers/misc/mic/bus/scif_bus.c | 210 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 358 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index cc4eef0..afc9f53 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -15,6 +15,23 @@ config INTEL_MIC_BUS
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
 
+comment "SCIF Bus Driver"
+
+config SCIF_BUS
+	tristate "SCIF Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the SCIF Bus, such as CONFIG_INTEL_MIC_HOST
+	  and CONFIG_INTEL_MIC_CARD.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index e9bf148..6bc6083 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,4 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
-obj-$(CONFIG_INTEL_MIC_BUS) += bus/
+obj-y += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
index d85c7f2..1ed37e2 100644
--- a/drivers/misc/mic/bus/Makefile
+++ b/drivers/misc/mic/bus/Makefile
@@ -3,3 +3,4 @@
 # Copyright(c) 2014, Intel Corporation.
 #
 obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
+obj-$(CONFIG_SCIF_BUS) += scif_bus.o
diff --git a/drivers/misc/mic/bus/scif_bus.h b/drivers/misc/mic/bus/scif_bus.h
new file mode 100644
index 0000000..335a228
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.h
@@ -0,0 +1,129 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#ifndef _SCIF_BUS_H_
+#define _SCIF_BUS_H_
+/*
+ * Everything a scif driver needs to work with any particular scif
+ * hardware abstraction layer.
+ */
+#include <linux/dma-mapping.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_dev.h"
+
+struct scif_hw_dev_id {
+	u32 device;
+	u32 vendor;
+};
+
+#define MIC_SCIF_DEV 1
+#define SCIF_DEV_ANY_ID 0xffffffff
+
+/**
+ * scif_hw_dev - representation of a hardware device abstracted for scif
+ * @hw_ops: the hardware ops supported by this device
+ * @id: the device type identification (used to match it with a driver)
+ * @mmio: MMIO memory window
+ * @aper: Aperture memory window
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ * @snode: The source node for this device.
+ * @dp: Self device page
+ * @rdp: Remote device page
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ */
+struct scif_hw_dev {
+	struct scif_hw_ops *hw_ops;
+	struct scif_hw_dev_id id;
+	struct mic_mw *mmio;
+	struct mic_mw *aper;
+	struct device dev;
+	u8 dnode;
+	u8 snode;
+	void *dp;
+	void __iomem *rdp;
+	struct dma_chan **dma_ch;
+	int num_dma_ch;
+};
+
+/**
+ * scif_driver - operations for a scif I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_driver {
+	struct device_driver driver;
+	const struct scif_hw_dev_id *id_table;
+	int (*probe)(struct scif_hw_dev *dev);
+	void (*remove)(struct scif_hw_dev *dev);
+};
+
+/**
+ * scif_hw_ops - Hardware operations for accessing a SCIF device on the SCIF bus.
+ *
+ * @next_db: Obtain the next available doorbell.
+ * @request_irq: Request an interrupt on a particular doorbell.
+ * @free_irq: Free an interrupt requested previously.
+ * @ack_interrupt: acknowledge an interrupt in the ISR.
+ * @send_intr: Send an interrupt to the remote node on a specified doorbell.
+ * @send_p2p_intr: Send an interrupt to the peer node on a specified doorbell
+ * which is specifically targeted for a peer to peer node.
+ * @ioremap: Map a buffer with the specified physical address and length.
+ * @iounmap: Unmap a buffer previously mapped.
+ */
+struct scif_hw_ops {
+	int (*next_db)(struct scif_hw_dev *sdev);
+	struct mic_irq * (*request_irq)(struct scif_hw_dev *sdev,
+					irqreturn_t (*func)(int irq,
+							    void *data),
+					const char *name, void *data,
+					int db);
+	void (*free_irq)(struct scif_hw_dev *sdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct scif_hw_dev *sdev, int num);
+	void (*send_intr)(struct scif_hw_dev *sdev, int db);
+	void (*send_p2p_intr)(struct scif_hw_dev *sdev, int db,
+			      struct mic_mw *mw);
+	void __iomem * (*ioremap)(struct scif_hw_dev *sdev,
+				  phys_addr_t pa, size_t len);
+	void (*iounmap)(struct scif_hw_dev *sdev, void __iomem *va);
+};
+
+int scif_register_driver(struct scif_driver *driver);
+void scif_unregister_driver(struct scif_driver *driver);
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id,
+		     struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper,
+		     void *dp, void __iomem *rdp,
+		     struct dma_chan **chan, int num_chan);
+void scif_unregister_device(struct scif_hw_dev *sdev);
+
+static inline struct scif_hw_dev *dev_to_scif(struct device *dev)
+{
+	return container_of(dev, struct scif_hw_dev, dev);
+}
+
+static inline struct scif_driver *drv_to_scif(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_driver, driver);
+}
+#endif /* _SCIF_BUS_H_ */
diff --git a/drivers/misc/mic/bus/scif_bus.c b/drivers/misc/mic/bus/scif_bus.c
new file mode 100644
index 0000000..2da7cee
--- /dev/null
+++ b/drivers/misc/mic/bus/scif_bus.c
@@ -0,0 +1,210 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel Symmetric Communications Interface Bus driver.
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/dma-mapping.h>
+
+#include "scif_bus.h"
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+
+	return sprintf(buf, "scif:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *scif_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(scif_dev);
+
+static inline int scif_id_match(const struct scif_hw_dev *dev,
+				const struct scif_hw_dev_id *id)
+{
+	if (id->device != dev->id.device && id->device != SCIF_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == SCIF_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call scif_dev_probe().
+ */
+static int scif_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+	const struct scif_hw_dev_id *ids;
+
+	ids = drv_to_scif(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (scif_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int scif_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct scif_hw_dev *dev = dev_to_scif(dv);
+
+	return add_uevent_var(env, "MODALIAS=scif:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int scif_dev_probe(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_dev_remove(struct device *d)
+{
+	struct scif_hw_dev *dev = dev_to_scif(d);
+	struct scif_driver *drv = drv_to_scif(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_bus = {
+	.name  = "scif_bus",
+	.match = scif_dev_match,
+	.dev_groups = scif_dev_groups,
+	.uevent = scif_uevent,
+	.probe = scif_dev_probe,
+	.remove = scif_dev_remove,
+};
+
+int scif_register_driver(struct scif_driver *driver)
+{
+	driver->driver.bus = &scif_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_register_driver);
+
+void scif_unregister_driver(struct scif_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_driver);
+
+static void scif_release_dev(struct device *d)
+{
+	struct scif_hw_dev *sdev = dev_to_scif(d);
+
+	kfree(sdev);
+}
+
+struct scif_hw_dev *
+scif_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct scif_hw_ops *hw_ops, u8 dnode, u8 snode,
+		     struct mic_mw *mmio, struct mic_mw *aper, void *dp,
+		     void __iomem *rdp, struct dma_chan **chan, int num_chan)
+{
+	int ret;
+	struct scif_hw_dev *sdev;
+
+	sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
+	if (!sdev)
+		return ERR_PTR(-ENOMEM);
+
+	sdev->dev.parent = pdev;
+	sdev->id.device = id;
+	sdev->id.vendor = SCIF_DEV_ANY_ID;
+	sdev->dev.archdata.dma_ops = dma_ops;
+	sdev->dev.release = scif_release_dev;
+	sdev->hw_ops = hw_ops;
+	sdev->dnode = dnode;
+	sdev->snode = snode;
+	dev_set_drvdata(&sdev->dev, sdev);
+	sdev->dev.bus = &scif_bus;
+	sdev->mmio = mmio;
+	sdev->aper = aper;
+	sdev->dp = dp;
+	sdev->rdp = rdp;
+	sdev->dev.dma_mask = &sdev->dev.coherent_dma_mask;
+	dma_set_mask(&sdev->dev, DMA_BIT_MASK(64));
+	sdev->dma_ch = chan;
+	sdev->num_dma_ch = num_chan;
+	dev_set_name(&sdev->dev, "scif-dev%u", sdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&sdev->dev);
+	if (ret)
+		goto free_sdev;
+	return sdev;
+free_sdev:
+	put_device(&sdev->dev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(scif_register_device);
+
+void scif_unregister_device(struct scif_hw_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+EXPORT_SYMBOL_GPL(scif_unregister_device);
+
+static int __init scif_init(void)
+{
+	return bus_register(&scif_bus);
+}
+
+static void __exit scif_exit(void)
+{
+	bus_unregister(&scif_bus);
+}
+
+core_initcall(scif_init);
+module_exit(scif_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) SCIF Bus driver");
+MODULE_LICENSE("GPL v2");
-- 
1.8.2.1



* [PATCH char-misc-next 08/13] misc: mic: SCIF open close bind and listen APIs
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (6 preceding siblings ...)
  2014-12-10 19:47 28% ` [PATCH char-misc-next 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
@ 2014-12-10 19:47 41% ` Sudeep Dutt
  2014-12-10 19:47 41% ` [PATCH char-misc-next 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Add the SCIF character device file operations and kernel APIs for opening
and closing user mode and kernel mode SCIF endpoints. This patch also
enables binding to a SCIF port and listening for incoming SCIF connections.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
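Reviewer note, not part of the commit: the state checks that scif_bind()
and scif_listen() apply to an endpoint can be sketched as a small
userspace model. This is only an illustration under stated assumptions.
struct model_ep, model_bind() and model_listen() are hypothetical names,
only a subset of the endpoint states is modelled, and locking, port
reservation and the CAP_SYS_ADMIN check are left out.

```c
#include <assert.h>
#include <errno.h>

/* Subset of enum scif_epd_state from scif_epd.h (names are hypothetical). */
enum ep_state { EP_UNBOUND, EP_BOUND, EP_LISTENING, EP_CONNECTED };

/* Hypothetical stand-in for struct scif_endpt: state and port only. */
struct model_ep {
	enum ep_state state;
	int port;
};

/* Mirrors the state checks in scif_bind(): only an unbound endpoint binds. */
static int model_bind(struct model_ep *ep, int port)
{
	/* The model takes an explicit, valid port; port 0 is not handled. */
	assert(port > 0 && port < 0x10000);

	if (ep->state == EP_BOUND)
		return -EINVAL;
	if (ep->state != EP_UNBOUND)
		return -EISCONN;

	ep->state = EP_BOUND;
	ep->port = port;
	return port;
}

/* Mirrors the switch in scif_listen(): only a bound endpoint may listen. */
static int model_listen(struct model_ep *ep)
{
	switch (ep->state) {
	case EP_UNBOUND:
		return -EINVAL;
	case EP_LISTENING:
	case EP_CONNECTED:
		return -EISCONN;
	case EP_BOUND:
		break;
	}
	ep->state = EP_LISTENING;
	return 0;
}
```

In this model a second bind on a bound endpoint returns -EINVAL, while
bind or listen on an endpoint that is already listening returns -EISCONN,
matching the return codes in scif_api.c below.
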
 drivers/misc/mic/Kconfig            |  19 ++
 drivers/misc/mic/Makefile           |   1 +
 drivers/misc/mic/scif/Makefile      |  15 ++
 drivers/misc/mic/scif/scif_epd.h    | 148 +++++++++++++
 drivers/misc/mic/scif/scif_main.h   |   2 +
 drivers/misc/mic/scif/scif_nodeqp.h |  12 +-
 drivers/misc/mic/scif/scif_api.c    | 417 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  92 ++++++++
 drivers/misc/mic/scif/scif_fd.c     | 104 +++++++++
 drivers/misc/mic/scif/scif_ports.c  | 124 +++++++++++
 10 files changed, 933 insertions(+), 1 deletion(-)
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
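
Reviewer note, not part of the commit: the reference counting in
scif_ports.c can be pictured with the userspace sketch below. It is an
assumption-laden model, not the driver code. A flat array replaces the
kernel IDR, there is no locking, and the model_* names are hypothetical.
It shows why a port stays reserved until every endpoint sharing it has
dropped its reference.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PORT_COUNT 0x10000 /* same value as SCIF_PORT_COUNT */

/* Hypothetical flat table standing in for the IDR; 0 means "free". */
static int model_ref_cnt[MODEL_PORT_COUNT];

/* Reserve the first free port in [start, end), like __scif_get_port(). */
static int model_get_port_range(int start, int end)
{
	assert(start >= 0 && end <= MODEL_PORT_COUNT);

	for (int id = start; id < end; id++) {
		if (!model_ref_cnt[id]) {
			model_ref_cnt[id] = 1;
			return id;
		}
	}
	return -ENOSPC;
}

/* Take another reference on a reserved port, like scif_get_port(). */
static void model_get_port(int id)
{
	if (id && model_ref_cnt[id])
		model_ref_cnt[id]++;
}

/* Drop a reference; the port is reusable once the count reaches zero. */
static void model_put_port(int id)
{
	if (id && model_ref_cnt[id])
		model_ref_cnt[id]--;
}
```

Reserving a single port corresponds to the range [port, port + 1), which
is how scif_rsrv_port() calls __scif_get_port() in the patch.
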

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index afc9f53..bcf6e36 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -69,3 +69,22 @@ config INTEL_MIC_CARD
 
 	  For more information see
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "SCIF Driver"
+
+config SCIF
+	tristate "SCIF Driver"
+	depends on 64BIT && PCI && X86 && SCIF_BUS
+	help
+	  This enables SCIF Driver support for the Intel Many Integrated
+	  Core (MIC) family of PCIe form factor coprocessor devices that
+	  run a 64 bit Linux OS. The Symmetric Communication Interface
+	  (SCIF (pronounced as skiff)) is a low level communications API
+	  across PCIe currently implemented for MIC.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 6bc6083..a74042c 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
 obj-y += bus/
+obj-$(CONFIG_SCIF) += scif/
diff --git a/drivers/misc/mic/scif/Makefile b/drivers/misc/mic/scif/Makefile
new file mode 100644
index 0000000..bf10bb7
--- /dev/null
+++ b/drivers/misc/mic/scif/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile - SCIF driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_SCIF) += scif.o
+scif-objs := scif_main.o
+scif-objs += scif_peer_bus.o
+scif-objs += scif_ports.o
+scif-objs += scif_debugfs.o
+scif-objs += scif_fd.o
+scif-objs += scif_api.o
+scif-objs += scif_epd.o
+scif-objs += scif_rb.o
+scif-objs += scif_nodeqp.o
+scif-objs += scif_nm.o
diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
new file mode 100644
index 0000000..b973b99
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -0,0 +1,148 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_EPD_H
+#define SCIF_EPD_H
+
+#include <linux/delay.h>
+#include <linux/scif.h>
+#include <linux/scif_ioctl.h>
+
+#define SCIF_EPLOCK_HELD true
+
+enum scif_epd_state {
+	SCIFEP_UNBOUND,
+	SCIFEP_BOUND,
+	SCIFEP_LISTENING,
+	SCIFEP_CONNECTED,
+	SCIFEP_CONNECTING,
+	SCIFEP_MAPPING,
+	SCIFEP_CLOSING,
+	SCIFEP_CLLISTEN,
+	SCIFEP_DISCONNECTED,
+	SCIFEP_ZOMBIE
+};
+
+/*
+ * struct scif_conreq - Data structure added to the connection list.
+ *
+ * @msg: connection request message received
+ * @list: link to list of connection requests
+ */
+struct scif_conreq {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+/* Size of the RB for the Endpoint QP */
+#define SCIF_ENDPT_QP_SIZE 0x1000
+
+/*
+ * scif_endpt_qp_info - SCIF endpoint queue pair
+ *
+ * @qp - Qpair for this endpoint
+ * @qp_offset - DMA address of the QP
+ * @gnt_pld - Payload in a SCIF_CNCT_GNT message containing the
+ * physical address of the remote_qp.
+ */
+struct scif_endpt_qp_info {
+	struct scif_qp *qp;
+	dma_addr_t qp_offset;
+	dma_addr_t gnt_pld;
+};
+
+/*
+ * struct scif_endpt - The SCIF endpoint data structure
+ *
+ * @state: end point state
+ * @lock: lock synchronizing access to endpoint fields like state etc
+ * @port: self port information
+ * @peer: peer port information
+ * @backlog: maximum pending connection requests
+ * @qp_info: Endpoint QP information for SCIF messaging
+ * @remote_dev: scifdev used by this endpt to communicate with remote node.
+ * @remote_ep: remote endpoint
+ * @conreqcnt: Keep track of number of connection requests.
+ * @files: Open file information used to match the id passed in with
+ *         the flush routine.
+ * @conlist: list of connection requests
+ * @conwq: waitqueue for connection processing
+ * @discon: completion used during disconnection
+ * @sendwq: waitqueue used during sending messages
+ * @recvwq: waitqueue used during message receipt
+ * @sendlock: Synchronize ordering of messages sent
+ * @recvlock: Synchronize ordering of messages received
+ * @list: link to list of various endpoints like connected, listening etc
+ * @li_accept: pending ACCEPTREG
+ * @acceptcnt: pending ACCEPTREG cnt
+ * @liacceptlist: link to listen accept
+ * @miacceptlist: link to uaccept
+ * @listenep: associated listen ep
+ * @conn_work: Non blocking connect work
+ * @conn_port: Connection port
+ * @conn_err: Errors during connection
+ * @conn_async_state: Async connection
+ * @conn_list: List of async connection requests
+ */
+struct scif_endpt {
+	enum scif_epd_state state;
+	spinlock_t lock;
+	struct scif_port_id port;
+	struct scif_port_id peer;
+	int backlog;
+	struct scif_endpt_qp_info qp_info;
+	struct scif_dev *remote_dev;
+	u64 remote_ep;
+	int conreqcnt;
+	struct files_struct *files;
+	struct list_head conlist;
+	wait_queue_head_t conwq;
+	struct completion discon;
+	wait_queue_head_t sendwq;
+	wait_queue_head_t recvwq;
+	struct mutex sendlock;
+	struct mutex recvlock;
+	struct list_head list;
+	struct list_head li_accept;
+	int acceptcnt;
+	struct list_head liacceptlist;
+	struct list_head miacceptlist;
+	struct scif_endpt *listenep;
+	struct scif_port_id conn_port;
+	int conn_err;
+	int conn_async_state;
+	struct list_head conn_list;
+};
+
+static inline int scifdev_alive(struct scif_endpt *ep)
+{
+	return _scifdev_alive(ep->remote_dev);
+}
+
+void scif_cleanup_zombie_epd(void);
+void scif_teardown_ep(void *endpt);
+void scif_cleanup_ep_qp(struct scif_endpt *ep);
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held);
+void scif_get_node_info(void);
+void scif_send_acks(struct scif_dev *dev);
+void scif_conn_handler(struct work_struct *work);
+int scif_rsrv_port(u16 port);
+void scif_get_port(u16 port);
+int scif_get_new_port(void);
+void scif_put_port(u16 port);
+int __scif_flush(scif_epd_t epd);
+#endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_main.h b/drivers/misc/mic/scif/scif_main.h
index 861439e..591cf4a 100644
--- a/drivers/misc/mic/scif/scif_main.h
+++ b/drivers/misc/mic/scif/scif_main.h
@@ -234,6 +234,8 @@ static inline int _scifdev_alive(struct scif_dev *scifdev)
 	return !!spdev;
 }
 
+#include "scif_epd.h"
+
 void __init scif_init_debugfs(void);
 void scif_exit_debugfs(void);
 int scif_setup_intr_wq(struct scif_dev *scifdev);
diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
index 4d8f53c..9540233 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.h
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -64,7 +64,17 @@
 #define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
 #define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
 #define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
-#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+#define SCIF_CNCT_REQ 9  /* Request connection to a port (carries phys addr) */
+#define SCIF_CNCT_GNT 10  /* Grant a connection request (carries phys addr) */
+#define SCIF_CNCT_GNTACK 11  /* Ack a granted connection request */
+#define SCIF_CNCT_GNTNACK 12  /* Nack a granted connection request */
+#define SCIF_CNCT_REJ 13  /* Reject a connection request (carries error type) */
+#define SCIF_DISCNCT 14 /* Notify peer that connection is being terminated */
+#define SCIF_DISCNT_ACK 15 /* Response to a SCIF_DISCNCT message */
+#define SCIF_CLIENT_SENT 16 /* Notify the peer that data has been written */
+#define SCIF_CLIENT_RCVD 17 /* Notify the peer that data has been read */
+#define SCIF_GET_NODE_INFO 18 /* Get current node mask from the mgmt node */
+#define SCIF_MAX_MSG SCIF_GET_NODE_INFO
 
 /*
  * struct scifmsg - Node QP message format
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
new file mode 100644
index 0000000..d5f10f3
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -0,0 +1,417 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/scif.h>
+#include "scif_main.h"
+#include "scif_map.h"
+
+static const char * const scif_ep_states[] = {
+	"Unbound",
+	"Bound",
+	"Listening",
+	"Connected",
+	"Connecting",
+	"Mapping",
+	"Closing",
+	"Close Listening",
+	"Disconnected",
+	"Zombie"};
+
+enum conn_async_state {
+	ASYNC_CONN_IDLE = 1,	/* ep setup for async connect */
+	ASYNC_CONN_INPROGRESS,	/* async connect in progress */
+	ASYNC_CONN_FLUSH_WORK	/* async work flush in progress  */
+};
+
+scif_epd_t scif_open(void)
+{
+	struct scif_endpt *ep;
+
+	might_sleep();
+	ep = kzalloc(sizeof(*ep), GFP_KERNEL);
+	if (!ep)
+		goto err_ep_alloc;
+
+	ep->qp_info.qp = kzalloc(sizeof(*ep->qp_info.qp), GFP_KERNEL);
+	if (!ep->qp_info.qp)
+		goto err_qp_alloc;
+
+	spin_lock_init(&ep->lock);
+	mutex_init(&ep->sendlock);
+	mutex_init(&ep->recvlock);
+
+	ep->state = SCIFEP_UNBOUND;
+	pr_debug("SCIFAPI open: ep %p success\n", ep);
+	return ep;
+
+err_qp_alloc:
+	kfree(ep);
+err_ep_alloc:
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(scif_open);
+
+/*
+ * scif_disconnect_ep - Disconnects the endpoint if found
+ * @ep: The end point returned from scif_open()
+ */
+static struct scif_endpt *scif_disconnect_ep(struct scif_endpt *ep)
+{
+	struct scifmsg msg;
+	struct scif_endpt *fep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	int err;
+
+	/*
+	 * Wake up any threads blocked in send()/recv() before closing
+	 * out the connection. Grabbing and releasing the send/recv lock
+	 * will ensure that any blocked senders/receivers have exited for
+	 * Ring 0 endpoints. It is a Ring 0 bug to call send/recv after
+	 * close. Ring 3 endpoints are not affected since close will not
+	 * be called while there are IOCTLs executing.
+	 */
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	mutex_lock(&ep->sendlock);
+	mutex_unlock(&ep->sendlock);
+	mutex_lock(&ep->recvlock);
+	mutex_unlock(&ep->recvlock);
+
+	/* Remove from the connected list */
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		if (tmpep == ep) {
+			list_del(pos);
+			fep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	if (!fep) {
+		/*
+		 * The other side has completed the disconnect before
+		 * the end point can be removed from the list. Therefore
+		 * the ep lock is not locked, traverse the disconnected
+		 * list to find the endpoint and release the conn lock.
+		 */
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		return NULL;
+	}
+
+	init_completion(&ep->discon);
+	msg.uop = SCIF_DISCNCT;
+	msg.src = ep->port;
+	msg.dst = ep->peer;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->remote_ep;
+
+	err = scif_nodeqp_send(ep->remote_dev, &msg);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+	if (!err)
+		/* Wait for the remote node to respond with SCIF_DISCNT_ACK */
+		wait_for_completion_timeout(&ep->discon,
+					    SCIF_NODE_ALIVE_TIMEOUT);
+	return ep;
+}
+
+int scif_close(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+	enum scif_epd_state oldstate;
+	bool flush_conn;
+
+	pr_debug("SCIFAPI close: ep %p %s\n", ep, scif_ep_states[ep->state]);
+
+	might_sleep();
+
+	spin_lock(&ep->lock);
+	flush_conn = (ep->conn_async_state == ASYNC_CONN_INPROGRESS);
+	spin_unlock(&ep->lock);
+
+	if (flush_conn)
+		flush_work(&scif_info.conn_work);
+
+	spin_lock(&ep->lock);
+	oldstate = ep->state;
+
+	ep->state = SCIFEP_CLOSING;
+
+	switch (oldstate) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		/* Remove from the disconnected list */
+		mutex_lock(&scif_info.connlock);
+		list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep) {
+				list_del(pos);
+				break;
+			}
+		}
+		mutex_unlock(&scif_info.connlock);
+		break;
+	case SCIFEP_UNBOUND:
+	case SCIFEP_BOUND:
+	case SCIFEP_CONNECTING:
+		spin_unlock(&ep->lock);
+		break;
+	case SCIFEP_MAPPING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CLOSING:
+	{
+		spin_unlock(&ep->lock);
+		scif_disconnect_ep(ep);
+		break;
+	}
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+	{
+		struct scif_conreq *conreq;
+		struct scifmsg msg;
+		struct scif_endpt *aep;
+
+		spin_unlock(&ep->lock);
+		spin_lock(&scif_info.eplock);
+
+		/* remove from listen list */
+		list_for_each_safe(pos, tmpq, &scif_info.listen) {
+			tmpep = list_entry(pos, struct scif_endpt, list);
+			if (tmpep == ep)
+				list_del(pos);
+		}
+		/* Remove any dangling accepts */
+		while (ep->acceptcnt) {
+			aep = list_first_entry(&ep->li_accept,
+					       struct scif_endpt, liacceptlist);
+			list_del(&aep->liacceptlist);
+			scif_put_port(aep->port.port);
+			list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+				tmpep = list_entry(pos, struct scif_endpt,
+						   miacceptlist);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			spin_unlock(&scif_info.eplock);
+			mutex_lock(&scif_info.connlock);
+			list_for_each_safe(pos, tmpq, &scif_info.connected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+				tmpep = list_entry(pos,
+						   struct scif_endpt, list);
+				if (tmpep == aep) {
+					list_del(pos);
+					break;
+				}
+			}
+			mutex_unlock(&scif_info.connlock);
+			scif_teardown_ep(aep);
+			spin_lock(&scif_info.eplock);
+			scif_add_epd_to_zombie_list(aep, SCIF_EPLOCK_HELD);
+			ep->acceptcnt--;
+		}
+
+		spin_lock(&ep->lock);
+		spin_unlock(&scif_info.eplock);
+
+		/* Remove and reject any pending connection requests. */
+		while (ep->conreqcnt) {
+			conreq = list_first_entry(&ep->conlist,
+						  struct scif_conreq, list);
+			list_del(&conreq->list);
+
+			msg.uop = SCIF_CNCT_REJ;
+			msg.dst.node = conreq->msg.src.node;
+			msg.dst.port = conreq->msg.src.port;
+			msg.payload[0] = conreq->msg.payload[0];
+			msg.payload[1] = conreq->msg.payload[1];
+			/*
+			 * No Error Handling on purpose for scif_nodeqp_send().
+			 * If the remote node is lost we still want to free the
+			 * connection requests on the self node.
+			 */
+			scif_nodeqp_send(&scif_dev[conreq->msg.src.node],
+					 &msg);
+			ep->conreqcnt--;
+			kfree(conreq);
+		}
+
+		spin_unlock(&ep->lock);
+		/* If a kSCIF accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	}
+	scif_put_port(ep->port.port);
+	scif_teardown_ep(ep);
+	scif_add_epd_to_zombie_list(ep, !SCIF_EPLOCK_HELD);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_close);
+
+/**
+ * __scif_flush() - Wakes up any blocking accepts. The endpoint will no longer
+ *			accept new connections.
+ * @epd: The end point returned from scif_open()
+ */
+int __scif_flush(scif_epd_t epd)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	switch (ep->state) {
+	case SCIFEP_LISTENING:
+	{
+		ep->state = SCIFEP_CLLISTEN;
+
+		/* If an accept is waiting wake it up */
+		wake_up_interruptible(&ep->conwq);
+		break;
+	}
+	default:
+		break;
+	}
+	return 0;
+}
+
+int scif_bind(scif_epd_t epd, u16 pn)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret = 0;
+	int tmp;
+
+	pr_debug("SCIFAPI bind: ep %p %s requested port number %d\n",
+		 ep, scif_ep_states[ep->state], pn);
+
+	if (pn) {
+		/*
+		 * Similar to IETF RFC 1700, SCIF ports below
+		 * SCIF_ADMIN_PORT_END can only be bound by system (or root)
+		 * processes or by processes executed by privileged users.
+		 */
+		if (pn < SCIF_ADMIN_PORT_END && !capable(CAP_SYS_ADMIN)) {
+			ret = -EACCES;
+			goto scif_bind_admin_exit;
+		}
+	}
+
+	spin_lock(&ep->lock);
+	if (ep->state == SCIFEP_BOUND) {
+		ret = -EINVAL;
+		goto scif_bind_exit;
+	} else if (ep->state != SCIFEP_UNBOUND) {
+		ret = -EISCONN;
+		goto scif_bind_exit;
+	}
+
+	if (pn) {
+		tmp = scif_rsrv_port(pn);
+		if (tmp != pn) {
+			ret = -EINVAL;
+			goto scif_bind_exit;
+		}
+	} else {
+		/* Errors are negative, so check before narrowing to u16 */
+		ret = scif_get_new_port();
+		if (ret < 0)
+			goto scif_bind_exit;
+		pn = ret;
+	}
+
+	ep->state = SCIFEP_BOUND;
+	ep->port.node = scif_info.nodeid;
+	ep->port.port = pn;
+	ep->conn_async_state = ASYNC_CONN_IDLE;
+	ret = pn;
+	pr_debug("SCIFAPI bind: bound to port number %d\n", pn);
+
+scif_bind_exit:
+	spin_unlock(&ep->lock);
+scif_bind_admin_exit:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_bind);
+
+int scif_listen(scif_epd_t epd, int backlog)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+
+	pr_debug("SCIFAPI listen: ep %p %s\n", ep, scif_ep_states[ep->state]);
+
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+	case SCIFEP_CLLISTEN:
+	case SCIFEP_UNBOUND:
+	case SCIFEP_DISCONNECTED:
+		spin_unlock(&ep->lock);
+		return -EINVAL;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CONNECTED:
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		spin_unlock(&ep->lock);
+		return -EISCONN;
+	case SCIFEP_BOUND:
+		break;
+	}
+
+	ep->state = SCIFEP_LISTENING;
+	ep->backlog = backlog;
+
+	ep->conreqcnt = 0;
+	ep->acceptcnt = 0;
+	INIT_LIST_HEAD(&ep->conlist);
+	init_waitqueue_head(&ep->conwq);
+	INIT_LIST_HEAD(&ep->li_accept);
+	spin_unlock(&ep->lock);
+
+	/*
+	 * The endpoint is now listening. Free the QP information, which is not
+	 * needed on a listening endpoint, before adding it to the listen list.
+	 */
+	scif_teardown_ep(ep);
+	ep->qp_info.qp = NULL;
+
+	spin_lock(&scif_info.eplock);
+	list_add_tail(&ep->list, &scif_info.listen);
+	spin_unlock(&scif_info.eplock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scif_listen);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
new file mode 100644
index 0000000..a2b728a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -0,0 +1,92 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_cleanup_ep_qp(struct scif_endpt *ep)
+{
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp->outbound_q.rb_base) {
+		scif_iounmap((void *)qp->outbound_q.rb_base,
+			     qp->outbound_q.size, ep->remote_dev);
+		qp->outbound_q.rb_base = NULL;
+	}
+	if (qp->remote_qp) {
+		scif_iounmap((void *)qp->remote_qp,
+			     sizeof(struct scif_qp), ep->remote_dev);
+		qp->remote_qp = NULL;
+	}
+	if (qp->local_qp) {
+		scif_unmap_single(qp->local_qp, ep->remote_dev,
+				  sizeof(struct scif_qp));
+		qp->local_qp = 0x0;
+	}
+	if (qp->local_buf) {
+		scif_unmap_single(qp->local_buf, ep->remote_dev,
+				  SCIF_ENDPT_QP_SIZE);
+		qp->local_buf = 0;
+	}
+}
+
+void scif_teardown_ep(void *endpt)
+{
+	struct scif_endpt *ep = endpt;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (qp) {
+		spin_lock(&ep->lock);
+		scif_cleanup_ep_qp(ep);
+		spin_unlock(&ep->lock);
+		kfree(qp->inbound_q.rb_base);
+		kfree(qp);
+	}
+}
+
+/*
+ * Enqueue the endpoint to the zombie list for cleanup.
+ * The endpoint should not be accessed once this API returns.
+ */
+void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
+{
+	if (!eplock_held)
+		spin_lock(&scif_info.eplock);
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_ZOMBIE;
+	spin_unlock(&ep->lock);
+	list_add_tail(&ep->list, &scif_info.zombie);
+	scif_info.nr_zombies++;
+	if (!eplock_held)
+		spin_unlock(&scif_info.eplock);
+	schedule_work(&scif_info.misc_work);
+}
+
+void scif_cleanup_zombie_epd(void)
+{
+	struct list_head *pos, *tmpq;
+	struct scif_endpt *ep;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.zombie) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		list_del(pos);
+		scif_info.nr_zombies--;
+		kfree(ep);
+	}
+	spin_unlock(&scif_info.eplock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
new file mode 100644
index 0000000..d81ab7c
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -0,0 +1,104 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_main.h"
+
+static int scif_fdopen(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = scif_open();
+
+	if (!priv)
+		return -ENOMEM;
+	f->private_data = priv;
+	return 0;
+}
+
+static int scif_fdclose(struct inode *inode, struct file *f)
+{
+	struct scif_endpt *priv = f->private_data;
+
+	return scif_close(priv);
+}
+
+static int scif_fdflush(struct file *f, fl_owner_t id)
+{
+	struct scif_endpt *ep = f->private_data;
+
+	spin_lock(&ep->lock);
+	/*
+	 * The listening endpoint stashes the open file information before
+	 * waiting for incoming connections. The release callback would never be
+	 * called if the application closed the endpoint, while waiting for
+	 * incoming connections from a separate thread since the file descriptor
+	 * reference count is bumped up in the accept IOCTL. Call the flush
+	 * routine if the id matches the endpoint open file information so that
+	 * the listening endpoint can be woken up and the fd released.
+	 */
+	if (ep->files == id)
+		__scif_flush(ep);
+	spin_unlock(&ep->lock);
+	return 0;
+}
+
+static __always_inline void scif_err_debug(int err, const char *str)
+{
+	/*
+	 * ENOTCONN is a common uninteresting error which is
+	 * flooding debug messages to the console unnecessarily.
+	 */
+	if (err < 0 && err != -ENOTCONN)
+		pr_debug("%s err %d\n", str, err);
+}
+
+static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct scif_endpt *priv = f->private_data;
+	void __user *argp = (void __user *)arg;
+	bool non_block = false;
+
+	non_block = !!(f->f_flags & O_NONBLOCK);
+
+	switch (cmd) {
+	case SCIF_BIND:
+	{
+		int pn;
+
+		if (copy_from_user(&pn, argp, sizeof(pn)))
+			return -EFAULT;
+
+		pn = scif_bind(priv, pn);
+		if (pn < 0)
+			return pn;
+
+		if (copy_to_user(argp, &pn, sizeof(pn)))
+			return -EFAULT;
+
+		return 0;
+	}
+	case SCIF_LISTEN:
+		return scif_listen(priv, arg);
+	}
+	return -EINVAL;
+}
+
+const struct file_operations scif_fops = {
+	.open = scif_fdopen,
+	.release = scif_fdclose,
+	.unlocked_ioctl = scif_fdioctl,
+	.flush = scif_fdflush,
+	.owner = THIS_MODULE,
+};
diff --git a/drivers/misc/mic/scif/scif_ports.c b/drivers/misc/mic/scif/scif_ports.c
new file mode 100644
index 0000000..594e18d
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_ports.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/idr.h>
+
+#include "scif_main.h"
+
+#define SCIF_PORT_COUNT	0x10000	/* Ports available */
+
+struct idr scif_ports;
+
+/*
+ * struct scif_port - SCIF port information
+ *
+ * @ref_cnt - Reference count since there can be multiple endpoints
+ *		created via scif_accept(..) simultaneously using a port.
+ */
+struct scif_port {
+	int ref_cnt;
+};
+
+/**
+ * __scif_get_port - Reserve a free port # in [start, end) for SCIF.
+ * @start : lowest port # that may be reserved.
+ * @end : one past the highest port # that may be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+static int __scif_get_port(int start, int end)
+{
+	int id;
+	struct scif_port *port = kzalloc(sizeof(*port), GFP_ATOMIC);
+
+	if (!port)
+		return -ENOMEM;
+	spin_lock(&scif_info.port_lock);
+	id = idr_alloc(&scif_ports, port, start, end, GFP_ATOMIC);
+	if (id >= 0)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+	return id;
+}
+
+/**
+ * scif_rsrv_port - Reserve a specified port # for SCIF.
+ * @port : port # to be reserved.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if port unavailable.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_rsrv_port(u16 port)
+{
+	return __scif_get_port(port, port + 1);
+}
+
+/**
+ * scif_get_new_port - Get and reserve any port # for SCIF in the range
+ *			SCIF_PORT_RSVD + 1 to SCIF_PORT_COUNT - 1.
+ *
+ * @return : Allocated SCIF port #, or -ENOSPC if no ports available.
+ *		On memory allocation failure, returns -ENOMEM.
+ */
+int scif_get_new_port(void)
+{
+	return __scif_get_port(SCIF_PORT_RSVD + 1, SCIF_PORT_COUNT);
+}
+
+/**
+ * scif_get_port - Increment the reference count for a SCIF port
+ * @id : SCIF port
+ *
+ * @return : None
+ */
+void scif_get_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port)
+		port->ref_cnt++;
+	spin_unlock(&scif_info.port_lock);
+}
+
+/**
+ * scif_put_port - Release a reserved SCIF port
+ * @id : SCIF port to be released.
+ *
+ * @return : None
+ */
+void scif_put_port(u16 id)
+{
+	struct scif_port *port;
+
+	if (!id)
+		return;
+	spin_lock(&scif_info.port_lock);
+	port = idr_find(&scif_ports, id);
+	if (port) {
+		port->ref_cnt--;
+		if (!port->ref_cnt) {
+			idr_remove(&scif_ports, id);
+			kfree(port);
+		}
+	}
+	spin_unlock(&scif_info.port_lock);
+}
-- 
1.8.2.1
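
For readers following the port bookkeeping in scif_ports.c above, here
is a minimal user-space sketch of the same reserve/get/put pattern. It
is an illustration only, not the driver code: a plain array stands in
for the idr, the spinlock is omitted (single-threaded model), and every
name in it (port_alloc_range, port_reserve, port_get_new, port_get,
port_put, PORT_COUNT, PORT_RSVD) is hypothetical. The value of PORT_RSVD
is a placeholder and is not taken from the driver headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define PORT_COUNT 0x10000	/* same size as SCIF_PORT_COUNT in the patch */
#define PORT_RSVD  1088		/* placeholder value, assumed for illustration */

/* One slot per port; a NULL slot means the port is free. */
struct port {
	int ref_cnt;
};

static struct port *ports[PORT_COUNT];

/* Reserve the lowest free port in [start, end); end is exclusive. */
static int port_alloc_range(int start, int end)
{
	struct port *p;
	int id;

	assert(start >= 0 && start <= end && end <= PORT_COUNT);
	for (id = start; id < end; id++) {
		if (ports[id])
			continue;
		p = calloc(1, sizeof(*p));
		if (!p)
			return -ENOMEM;
		p->ref_cnt = 1;	/* the reservation holds the first reference */
		ports[id] = p;
		return id;
	}
	return -ENOSPC;
}

/* Reserve one specific port. */
static int port_reserve(unsigned short port)
{
	return port_alloc_range(port, port + 1);
}

/* Reserve any port above the reserved range. */
static int port_get_new(void)
{
	return port_alloc_range(PORT_RSVD + 1, PORT_COUNT);
}

/* Take an extra reference on an already reserved port. */
static void port_get(unsigned short id)
{
	if (id && ports[id])
		ports[id]->ref_cnt++;
}

/* Drop a reference; the port is freed when the last one goes away. */
static void port_put(unsigned short id)
{
	if (!id || !ports[id])
		return;
	if (--ports[id]->ref_cnt == 0) {
		free(ports[id]);
		ports[id] = NULL;
	}
}
```

In this model, reserving port 5 twice returns 5 and then -ENOSPC, and
the port only becomes reservable again once every reference taken with
port_get() has been dropped with port_put().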


^ permalink raw reply related	[relevance 41%]

* [PATCH char-misc-next 07/13] misc: mic: SCIF node queue pair setup management
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (5 preceding siblings ...)
  2014-12-10 19:47 47% ` [PATCH char-misc-next 06/13] misc: mic: SCIF module initialization Sudeep Dutt
@ 2014-12-10 19:47 28% ` Sudeep Dutt
  2014-12-10 19:47 41% ` [PATCH char-misc-next 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF node queue pair setup creates the kernel mode private node
queue pairs that the SCIF driver uses between all nodes for
internal control message communication. Setup begins once SCIF is
probed by the SCIF hardware bus. Peer to peer communication between
MIC Coprocessor nodes is supported.
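
As a rough illustration of the queue pair idea, the sketch below models
two nodes in user space. It follows the rule stated in the struct
scif_qp comment in scif_nodeqp.h: a node only writes across the link and
only reads indices held in its own memory. This is an assumption-laden
model, not the scif_rb implementation: the names (struct node,
node_send, node_recv, RB_SIZE, tx_off, rx_off) are hypothetical, memory
barriers and locking are omitted, and messages are plain byte runs.

```c
#include <assert.h>
#include <stdint.h>

#define RB_SIZE 64u	/* ring size in bytes; must be a power of two */

struct node {
	uint8_t inbound[RB_SIZE];	/* ring this node receives into */
	uint32_t local_write;	/* advanced by the peer after it writes inbound[] */
	uint32_t local_read;	/* advanced by the peer after it drains our data */
	uint32_t tx_off;	/* private: next write offset in peer's inbound[] */
	uint32_t rx_off;	/* private: next read offset in our inbound[] */
};

/* Copy len bytes into the peer's ring. Returns 0, or -1 if it is full. */
static int node_send(struct node *self, struct node *peer,
		     const void *msg, uint32_t len)
{
	const uint8_t *src = msg;
	uint32_t used = self->tx_off - self->local_read; /* local reads only */
	uint32_t i;

	assert(used <= RB_SIZE);
	if (len > RB_SIZE - used)
		return -1;
	for (i = 0; i < len; i++)
		peer->inbound[(self->tx_off + i) % RB_SIZE] = src[i];
	self->tx_off += len;
	peer->local_write = self->tx_off; /* remote write publishes the data */
	return 0;
}

/* Copy len bytes out of our ring. Returns 0, or -1 if too little data. */
static int node_recv(struct node *self, struct node *peer,
		     void *msg, uint32_t len)
{
	uint8_t *dst = msg;
	uint32_t avail = self->local_write - self->rx_off; /* local reads only */
	uint32_t i;

	if (len > avail)
		return -1;
	for (i = 0; i < len; i++)
		dst[i] = self->inbound[(self->rx_off + i) % RB_SIZE];
	self->rx_off += len;
	peer->local_read = self->rx_off; /* remote write frees the space */
	return 0;
}
```

The offsets are free-running 32-bit counters, so the difference between
them gives the fill level even after they wrap. A sender that has filled
the ring gets -1 until the receiver drains data and writes the new read
offset back into the sender's memory.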

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_nodeqp.h |  172 +++++
 drivers/misc/mic/scif/scif_nm.c     |  176 +++++
 drivers/misc/mic/scif/scif_nodeqp.c | 1255 +++++++++++++++++++++++++++++++++++
 3 files changed, 1603 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c

diff --git a/drivers/misc/mic/scif/scif_nodeqp.h b/drivers/misc/mic/scif/scif_nodeqp.h
new file mode 100644
index 0000000..4d8f53c
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.h
@@ -0,0 +1,172 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#ifndef SCIF_NODEQP
+#define SCIF_NODEQP
+
+#include "scif_rb.h"
+#include "scif_peer_bus.h"
+
+#define SCIF_INIT 1  /* First message sent to the peer node for discovery */
+#define SCIF_EXIT 2  /* Last message from the peer informing intent to exit */
+#define SCIF_EXIT_ACK 3 /* Response to SCIF_EXIT message */
+#define SCIF_NODE_ADD 4  /* Tell Online nodes a new node exists */
+#define SCIF_NODE_ADD_ACK 5  /* Confirm to mgmt node sequence is finished */
+#define SCIF_NODE_ADD_NACK 6 /* SCIF_NODE_ADD failed */
+#define SCIF_NODE_REMOVE 7 /* Request to deactivate a SCIF node */
+#define SCIF_NODE_REMOVE_ACK 8 /* Response to a SCIF_NODE_REMOVE message */
+#define SCIF_MAX_MSG SCIF_NODE_REMOVE_ACK
+
+/*
+ * struct scifmsg - Node QP message format
+ *
+ * @src: Source information
+ * @dst: Destination information
+ * @uop: The message opcode
+ * @payload: Unique payload format for each message
+ */
+struct scifmsg {
+	struct scif_port_id src;
+	struct scif_port_id dst;
+	uint32_t uop;
+	uint64_t payload[4];
+} __packed;
+
+/*
+ * struct scif_qp - Node Queue Pair
+ *
+ * Interesting structure -- a little difficult because we can only
+ * write across the PCIe, so any r/w pointer we need to read is
+ * local. We only need to read the read pointer on the inbound_q
+ * and read the write pointer in the outbound_q
+ *
+ * @magic: Magic value to ensure the peer sees the QP correctly
+ * @outbound_q: The outbound ring buffer for sending messages
+ * @inbound_q: The inbound ring buffer for receiving messages
+ * @local_write: Local write index
+ * @local_read: Local read index
+ * @remote_qp: The remote queue pair
+ * @local_buf: DMA address of local ring buffer
+ * @local_qp: DMA address of the local queue pair data structure
+ * @remote_buf: DMA address of remote ring buffer
+ * @qp_state: QP state i.e. online or offline used for P2P
+ * @send_lock: Synchronize access to outbound queue
+ * @recv_lock: Synchronize access to inbound queue
+ */
+struct scif_qp {
+	uint64_t magic;
+#define SCIFEP_MAGIC 0x5c1f000000005c1fULL
+	struct scif_rb outbound_q;
+	struct scif_rb inbound_q;
+
+	uint32_t local_write __aligned(64);
+	uint32_t local_read __aligned(64);
+	struct scif_qp *remote_qp;
+	dma_addr_t local_buf;
+	dma_addr_t local_qp;
+	dma_addr_t remote_buf;
+	uint32_t qp_state;
+#define SCIF_QP_OFFLINE 0xdead
+#define SCIF_QP_ONLINE 0xc0de
+	spinlock_t send_lock;
+	spinlock_t recv_lock;
+};
+
+/*
+ * struct scif_loopb_msg - An element in the loopback Node QP message list.
+ *
+ * @msg - The SCIF node QP message
+ * @list - link in the list of messages
+ */
+struct scif_loopb_msg {
+	struct scifmsg msg;
+	struct list_head list;
+};
+
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp);
+int scif_setup_qp(struct scif_dev *scifdev);
+int scif_qp_response(phys_addr_t phys, struct scif_dev *dev);
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev);
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev);
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload);
+int scif_setup_loopback_qp(struct scif_dev *scifdev);
+int scif_destroy_loopback_qp(struct scif_dev *scifdev);
+void scif_poll_qp_state(struct work_struct *work);
+void scif_destroy_p2p(struct scif_dev *scifdev);
+void scif_send_exit(struct scif_dev *scifdev);
+static inline struct device *scif_get_peer_dev(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+	struct device *spdev_ret;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	if (spdev)
+		spdev_ret = get_device(&spdev->dev);
+	else
+		spdev_ret = ERR_PTR(-ENODEV);
+	rcu_read_unlock();
+	return spdev_ret;
+}
+
+static inline void scif_put_peer_dev(struct device *dev)
+{
+	put_device(dev);
+}
+#endif  /* SCIF_NODEQP */
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
new file mode 100644
index 0000000..0c612db
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -0,0 +1,176 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "scif_peer_bus.h"
+
+#include "scif_main.h"
+#include "scif_map.h"
+
+void scif_free_qp(struct scif_dev *scifdev)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+
+	if (!qp)
+		return;
+	scif_free_coherent((void *)qp->inbound_q.rb_base,
+			   qp->local_buf, scifdev, qp->inbound_q.size);
+	scif_unmap_single(qp->local_qp, scifdev, sizeof(struct scif_qp));
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+}
+
+static void scif_cleanup_qp(struct scif_dev *dev)
+{
+	struct scif_qp *qp = &dev->qpairs[0];
+
+	if (!qp)
+		return;
+	scif_iounmap((void *)qp->remote_qp, sizeof(struct scif_qp), dev);
+	scif_iounmap((void *)qp->outbound_q.rb_base,
+		     sizeof(struct scif_qp), dev);
+	qp->remote_qp = NULL;
+	qp->local_write = 0;
+	qp->inbound_q.current_write_offset = 0;
+	qp->inbound_q.current_read_offset = 0;
+	if (scifdev_is_p2p(dev))
+		scif_free_qp(dev);
+}
+
+void scif_send_acks(struct scif_dev *dev)
+{
+	struct scifmsg msg;
+
+	if (dev->node_remove_ack_pending) {
+		msg.uop = SCIF_NODE_REMOVE_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = SCIF_MGMT_NODE;
+		msg.payload[0] = dev->node;
+		scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg);
+		dev->node_remove_ack_pending = false;
+	}
+	if (dev->exit_ack_pending) {
+		msg.uop = SCIF_EXIT_ACK;
+		msg.src.node = scif_info.nodeid;
+		msg.dst.node = dev->node;
+		scif_nodeqp_send(dev, &msg);
+		dev->exit_ack_pending = false;
+	}
+}
+
+/*
+ * scif_cleanup_scifdev
+ *
+ * @dev: Remote SCIF device.
+ * Uninitialize SCIF data structures for remote SCIF device.
+ */
+void scif_cleanup_scifdev(struct scif_dev *dev)
+{
+	struct scif_hw_dev *sdev = dev->sdev;
+
+	if (!dev->sdev)
+		return;
+	if (scifdev_is_p2p(dev)) {
+		if (dev->cookie) {
+			sdev->hw_ops->free_irq(sdev, dev->cookie, dev);
+			dev->cookie = NULL;
+		}
+		scif_destroy_intr_wq(dev);
+	}
+	scif_destroy_p2p(dev);
+	scif_send_acks(dev);
+	scif_cleanup_qp(dev);
+}
+
+/*
+ * scif_handle_remove_node:
+ *
+ * @node: Node to remove
+ */
+void scif_handle_remove_node(int node)
+{
+	struct scif_dev *scifdev = &scif_dev[node];
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	else
+		scif_send_acks(scifdev);
+}
+
+static int scif_send_rmnode_msg(int node, int remove_node)
+{
+	struct scifmsg notif_msg;
+	struct scif_dev *dev = &scif_dev[node];
+
+	notif_msg.uop = SCIF_NODE_REMOVE;
+	notif_msg.src.node = scif_info.nodeid;
+	notif_msg.dst.node = node;
+	notif_msg.payload[0] = remove_node;
+	return scif_nodeqp_send(dev, &notif_msg);
+}
+
+/**
+ * scif_disconnect_node:
+ *
+ * @node_id[in]: source node id.
+ * @mgmt_initiated: Disconnection initiated from the mgmt node
+ *
+ * Disconnect a node from the scif network.
+ */
+void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
+{
+	int ret;
+	int msg_cnt = 0;
+	u32 i = 0;
+	struct scif_dev *scifdev = &scif_dev[node_id];
+
+	if (!node_id)
+		return;
+
+	atomic_set(&scifdev->disconn_rescnt, 0);
+
+	/* Destroy p2p network */
+	for (i = 1; i <= scif_info.maxid; i++) {
+		if (i == node_id)
+			continue;
+		ret = scif_send_rmnode_msg(i, node_id);
+		if (!ret)
+			msg_cnt++;
+	}
+	/* Wait for the remote nodes to respond with SCIF_NODE_REMOVE_ACK */
+	ret = wait_event_timeout(scifdev->disconn_wq,
+				 (atomic_read(&scifdev->disconn_rescnt)
+				 == msg_cnt), SCIF_NODE_ALIVE_TIMEOUT);
+	/* Tell the card to clean up */
+	if (mgmt_initiated && _scifdev_alive(scifdev))
+		/*
+		 * Send an SCIF_EXIT message which is the last message from Host
+		 * to the MIC and wait for a SCIF_EXIT_ACK
+		 */
+		scif_send_exit(scifdev);
+	atomic_set(&scifdev->disconn_rescnt, 0);
+	/* Tell the mgmt node to clean up */
+	ret = scif_send_rmnode_msg(SCIF_MGMT_NODE, node_id);
+	if (!ret)
+		/* Wait for mgmt node to respond with SCIF_NODE_REMOVE_ACK */
+		wait_event_timeout(scifdev->disconn_wq,
+				   (atomic_read(&scifdev->disconn_rescnt) == 1),
+				   SCIF_NODE_ALIVE_TIMEOUT);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
new file mode 100644
index 0000000..7c72d8f
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -0,0 +1,1255 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+#include "scif_main.h"
+#include "scif_nodeqp.h"
+#include "scif_map.h"
+
+/*
+ ************************************************************************
+ * SCIF node Queue Pair (QP) setup flow:
+ *
+ * 1) SCIF driver gets probed with a scif_hw_dev via the scif_hw_bus
+ * 2) scif_setup_qp(..) allocates the local qp and calls
+ *	scif_setup_qp_connect(..) which allocates and maps the local
+ *	buffer for the inbound QP
+ * 3) The local node updates the device page with the DMA address of the QP
+ * 4) A delayed work is scheduled (qp_dwork) which periodically reads if
+ *	the peer node has updated its QP DMA address
+ * 5) Once a valid non zero address is found in the QP DMA address field
+ *	in the device page, the local node maps the remote node's QP,
+ *	updates its outbound QP and sends a SCIF_INIT message to the peer
+ * 6) The SCIF_INIT message is received by the peer node QP interrupt bottom
+ *	half handler by calling scif_init(..)
+ * 7) scif_init(..) registers a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node
+ * 8) On the mgmt node, P2P network setup/teardown is initiated if all the
+ *	remote nodes are online via scif_p2p_setup(..)
+ * 9) For P2P setup, the host maps the remote nodes' aperture and memory
+ *	bars and sends a SCIF_NODE_ADD message to both nodes
+ * 10) As part of scif_nodeadd, both nodes set up their local inbound
+ *	QPs and send a SCIF_NODE_ADD_ACK to the mgmt node
+ * 11) As part of scif_node_add_ack(..) the mgmt node forwards the
+ *	SCIF_NODE_ADD_ACK to the remote nodes
+ * 12) As part of scif_node_add_ack(..) the remote nodes update their
+ *	outbound QPs, make sure they can access memory on the remote node
+ *	and then add a new SCIF peer node by calling
+ *	scif_peer_register_device(..) which signifies the addition of a new
+ *	SCIF node.
+ * 13) The SCIF network is now established across all nodes.
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by non mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) A non mgmt node now sends a SCIF_EXIT message to the peer and waits for
+ *	a SCIF_EXIT_ACK
+ * 4) As part of scif_exit(..) handling scif_disconnect_node(..) is called
+ * 5) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the
+ *	peers and waits for a SCIF_NODE_REMOVE_ACK
+ * 6) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 7) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it sends itself a node remove message whose handling cleans up local
+ *	data structures and unregisters the peer node from the SCIF network
+ * 8) The mgmt node sends a SCIF_EXIT_ACK
+ * 9) Upon receipt of the SCIF_EXIT_ACK the node initiating the teardown
+ *	cleans up local data structures and completes the SCIF remove routine
+ * 10) The SCIF network is now torn down for the node initiating the
+ *	teardown sequence
+ *
+ ************************************************************************
+ * SCIF node QP teardown flow (initiated by mgmt node):
+ *
+ * 1) SCIF driver gets a remove callback with a scif_hw_dev via the scif_hw_bus
+ * 2) The device page QP DMA address field is updated with 0x0
+ * 3) The mgmt node calls scif_disconnect_node(..)
+ * 4) scif_disconnect_node(..) sends a SCIF_NODE_REMOVE message to all the peers
+ *	and waits for a SCIF_NODE_REMOVE_ACK
+ * 5) As part of scif_node_remove(..) a remote node unregisters the peer
+ *	node from the SCIF network and sends a SCIF_NODE_REMOVE_ACK
+ * 6) When the mgmt node has received all the SCIF_NODE_REMOVE_ACKs
+ *	it unregisters the peer node from the SCIF network
+ * 7) The mgmt node sends a SCIF_EXIT message and waits for a SCIF_EXIT_ACK.
+ * 8) A non mgmt node upon receipt of a SCIF_EXIT message calls scif_stop(..)
+ *	which would clean up local data structures for all SCIF nodes and
+ *	then send a SCIF_EXIT_ACK back to the mgmt node
+ * 9) Upon receipt of the SCIF_EXIT_ACK the mgmt node sends itself a node
+ *	remove message whose handling cleans up local data structures and
+ *	destroys any P2P mappings.
+ * 10) The SCIF hardware device for which a remove callback was received is now
+ *	disconnected from the SCIF network.
+ */
+/*
+ * Initializes "local" data structures for the QP. Allocates the QP
+ * ring buffer (rb) and initializes the "in bound" queue.
+ */
+int scif_setup_qp_connect(struct scif_qp *qp, dma_addr_t *qp_offset,
+			  int local_size, struct scif_dev *scifdev)
+{
+	void *local_q = NULL;
+	int err = 0;
+	u32 tmp_rd = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		return err;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	/*
+	 * To setup the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &tmp_rd,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	/*
+	 * The read pointer is NULL initially and it is unsafe to use the ring
+	 * buffer until this changes!
+	 */
+	qp->inbound_q.read_ptr = NULL;
+	err = scif_map_single(qp_offset, qp,
+			      scifdev, sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+	return err;
+}
+
+/* When the other side has already done its allocation, this is called */
+int scif_setup_qp_accept(struct scif_qp *qp, dma_addr_t *qp_offset,
+			 dma_addr_t phys, int local_size,
+			 struct scif_dev *scifdev)
+{
+	void *local_q;
+	void *remote_q;
+	struct scif_qp *remote_qp;
+	int remote_size;
+	int err = 0;
+
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+	/* Start by figuring out where we need to point */
+	remote_qp = scif_ioremap(phys, sizeof(struct scif_qp), scifdev);
+	if (!remote_qp)
+		return -EIO;
+	qp->remote_qp = remote_qp;
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_buf = remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	remote_q = scif_ioremap(qp->remote_buf, remote_size, scifdev);
+	if (!remote_q) {
+		err = -EIO;
+		goto iounmap;
+	}
+	qp->remote_qp->local_write = 0;
+	/*
+	 * To setup the outbound_q, the buffer lives in remote memory,
+	 * the read pointer is local, the write pointer is remote
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     remote_q,
+		     get_count_order(remote_size));
+	local_q = kzalloc(local_size, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto iounmap_1;
+	}
+	err = scif_map_single(&qp->local_buf, local_q, scifdev, local_size);
+	if (err)
+		goto kfree;
+	qp->remote_qp->local_read = 0;
+	/*
+	 * To setup the inbound_q, the buffer lives locally, the read pointer
+	 * is remote and the write pointer is local
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(local_size));
+	err = scif_map_single(qp_offset, qp, scifdev,
+			      sizeof(struct scif_qp));
+	if (err)
+		goto unmap;
+	qp->local_qp = *qp_offset;
+	return err;
+unmap:
+	scif_unmap_single(qp->local_buf, scifdev, local_size);
+	qp->local_buf = 0;
+kfree:
+	kfree(local_q);
+iounmap_1:
+	scif_iounmap(remote_q, remote_size, scifdev);
+	qp->outbound_q.rb_base = NULL;
+iounmap:
+	scif_iounmap(qp->remote_qp, sizeof(struct scif_qp), scifdev);
+	qp->remote_qp = NULL;
+	return err;
+}
+
+int scif_setup_qp_connect_response(struct scif_dev *scifdev,
+				   struct scif_qp *qp, u64 payload)
+{
+	int err = 0;
+	void *r_buf;
+	int remote_size;
+	phys_addr_t tmp_phys;
+
+	qp->remote_qp = scif_ioremap(payload, sizeof(struct scif_qp), scifdev);
+
+	if (!qp->remote_qp) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	if (qp->remote_qp->magic != SCIFEP_MAGIC) {
+		pr_err("SCIFEP_MAGIC mismatch between self %d remote %d\n",
+		       scif_dev[scif_info.nodeid].node, scifdev->node);
+		err = -ENODEV;
+		goto error;
+	}
+
+	tmp_phys = qp->remote_qp->local_buf;
+	remote_size = qp->remote_qp->inbound_q.size;
+	r_buf = scif_ioremap(tmp_phys, remote_size, scifdev);
+
+	if (!r_buf)
+		return -EIO;
+
+	qp->local_read = 0;
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->remote_qp->local_write,
+		     r_buf,
+		     get_count_order(remote_size));
+	/*
+	 * Set up the inbound_q again now that we know where the
+	 * inbound_read really is.
+	 */
+	scif_rb_init(&qp->inbound_q,
+		     &qp->remote_qp->local_read,
+		     &qp->local_write,
+		     qp->inbound_q.rb_base,
+		     get_count_order(qp->inbound_q.size));
+error:
+	return err;
+}
+
+static __always_inline void
+scif_send_msg_intr(struct scif_dev *scifdev)
+{
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	if (scifdev_is_p2p(scifdev))
+		sdev->hw_ops->send_p2p_intr(sdev, scifdev->rdb, &scifdev->mmio);
+	else
+		sdev->hw_ops->send_intr(sdev, scifdev->rdb);
+}
+
+int scif_qp_response(phys_addr_t phys, struct scif_dev *scifdev)
+{
+	int err = 0;
+	struct scifmsg msg;
+
+	err = scif_setup_qp_connect_response(scifdev, scifdev->qpairs, phys);
+	if (!err) {
+		/*
+		 * Now that everything is setup and mapped, we're ready
+		 * to tell the peer about our queue's location
+		 */
+		msg.uop = SCIF_INIT;
+		msg.dst.node = scifdev->node;
+		err = scif_nodeqp_send(scifdev, &msg);
+	}
+	return err;
+}
+
+void scif_send_exit(struct scif_dev *scifdev)
+{
+	struct scifmsg msg;
+	int ret;
+
+	scifdev->exit = OP_IN_PROGRESS;
+	msg.uop = SCIF_EXIT;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = scifdev->node;
+	ret = scif_nodeqp_send(scifdev, &msg);
+	if (ret)
+		goto done;
+	/* Wait for a SCIF_EXIT_ACK message */
+	wait_event_timeout(scif_info.exitwq, scifdev->exit == OP_COMPLETED,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+done:
+	scifdev->exit = OP_IDLE;
+}
+
+int scif_setup_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	int local_size;
+	struct scif_qp *qp;
+
+	local_size = SCIF_NODE_QP_SIZE;
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp) {
+		err = -ENOMEM;
+		return err;
+	}
+	qp->magic = SCIFEP_MAGIC;
+	scifdev->qpairs = qp;
+	err = scif_setup_qp_connect(qp, &scifdev->qp_dma_addr,
+				    local_size, scifdev);
+	if (err)
+		goto free_qp;
+	/*
+	 * We're as setup as we can be. The inbound_q is setup, w/o a usable
+	 * outbound q.  When we get a message, the read_ptr will be updated,
+	 * and we will pull the message.
+	 */
+	return err;
+free_qp:
+	kfree(scifdev->qpairs);
+	scifdev->qpairs = NULL;
+	return err;
+}
+
+static void scif_p2p_freesg(struct scatterlist *sg)
+{
+	kfree(sg);
+}
+
+static struct scatterlist *
+scif_p2p_setsg(void __iomem *va, int page_size, int page_cnt)
+{
+	struct scatterlist *sg;
+	struct page *page;
+	int i;
+
+	sg = kcalloc(page_cnt, sizeof(struct scatterlist), GFP_KERNEL);
+	if (!sg)
+		return NULL;
+	sg_init_table(sg, page_cnt);
+	for (i = 0; i < page_cnt; i++) {
+		page = vmalloc_to_page((void __force *)va);
+		if (!page)
+			goto p2p_sg_err;
+		sg_set_page(&sg[i], page, page_size, 0);
+		va += page_size;
+	}
+	return sg;
+p2p_sg_err:
+	kfree(sg);
+	return NULL;
+}
+
+/* Init p2p mappings required to access peerdev from scifdev */
+static struct scif_p2p_info *
+scif_init_p2p_info(struct scif_dev *scifdev, struct scif_dev *peerdev)
+{
+	struct scif_p2p_info *p2p;
+	int num_mmio_pages, num_aper_pages, sg_page_shift, err, num_aper_chunks;
+	struct scif_hw_dev *psdev = peerdev->sdev;
+	struct scif_hw_dev *sdev = scifdev->sdev;
+
+	num_mmio_pages = psdev->mmio->len >> PAGE_SHIFT;
+	num_aper_pages = psdev->aper->len >> PAGE_SHIFT;
+
+	p2p = kzalloc(sizeof(*p2p), GFP_KERNEL);
+	if (!p2p)
+		return NULL;
+	p2p->ppi_sg[SCIF_PPI_MMIO] = scif_p2p_setsg(psdev->mmio->va,
+						    PAGE_SIZE, num_mmio_pages);
+	if (!p2p->ppi_sg[SCIF_PPI_MMIO])
+		goto free_p2p;
+	p2p->sg_nentries[SCIF_PPI_MMIO] = num_mmio_pages;
+	sg_page_shift = get_order(min(psdev->aper->len, (u64)(1 << 30)));
+	num_aper_chunks = num_aper_pages >> (sg_page_shift - PAGE_SHIFT);
+	p2p->ppi_sg[SCIF_PPI_APER] = scif_p2p_setsg(psdev->aper->va,
+						    1 << sg_page_shift,
+						    num_aper_chunks);
+	p2p->sg_nentries[SCIF_PPI_APER] = num_aper_chunks;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			 num_mmio_pages, DMA_BIDIRECTIONAL);
+	if (err != num_mmio_pages)
+		goto scif_p2p_free;
+	err = dma_map_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			 num_aper_chunks, DMA_BIDIRECTIONAL);
+	if (err != num_aper_chunks)
+		goto dma_unmap;
+	p2p->ppi_da[SCIF_PPI_MMIO] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	p2p->ppi_da[SCIF_PPI_APER] = sg_dma_address(p2p->ppi_sg[SCIF_PPI_APER]);
+	p2p->ppi_len[SCIF_PPI_MMIO] = num_mmio_pages;
+	p2p->ppi_len[SCIF_PPI_APER] = num_aper_pages;
+	p2p->ppi_peer_id = peerdev->node;
+	return p2p;
+dma_unmap:
+	dma_unmap_sg(&sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+		     p2p->sg_nentries[SCIF_PPI_MMIO], DMA_BIDIRECTIONAL);
+scif_p2p_free:
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+	scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+free_p2p:
+	kfree(p2p);
+	return NULL;
+}
+
+/**
+ * scif_node_connect: Respond to SCIF_NODE_CONNECT interrupt message
+ * @scifdev: Source node
+ * @dst: Destination node
+ * Connect the src and dst node by setting up the p2p connection
+ * between them. Management node here acts like a proxy.
+ */
+static void scif_node_connect(struct scif_dev *scifdev, int dst)
+{
+	struct scif_dev *dev_j = scifdev;
+	struct scif_dev *dev_i = NULL;
+	struct scif_p2p_info *p2p_ij = NULL;    /* bus addr for j from i */
+	struct scif_p2p_info *p2p_ji = NULL;    /* bus addr for i from j */
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	struct scifmsg msg;
+	int err;
+	u64 tmppayload;
+
+	if (dst < 1 || dst > scif_info.maxid)
+		return;
+
+	dev_i = &scif_dev[dst];
+
+	if (!_scifdev_alive(dev_i))
+		return;
+	/*
+	 * If the p2p connection is already setup or in the process of setting
+	 * up then just ignore this request. The requested node will get
+	 * informed by SCIF_NODE_ADD_ACK or SCIF_NODE_ADD_NACK
+	 */
+	if (!list_empty(&dev_i->p2p)) {
+		list_for_each_safe(pos, tmp, &dev_i->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == dev_j->node)
+				return;
+		}
+	}
+	p2p_ij = scif_init_p2p_info(dev_i, dev_j);
+	if (!p2p_ij)
+		return;
+	p2p_ji = scif_init_p2p_info(dev_j, dev_i);
+	if (!p2p_ji)
+		return;
+	list_add_tail(&p2p_ij->ppi_list, &dev_i->p2p);
+	list_add_tail(&p2p_ji->ppi_list, &dev_j->p2p);
+
+	/*
+	 * Send a SCIF_NODE_ADD to dev_i, pass it its bus address
+	 * as seen from dev_j
+	 */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_j->node;
+	msg.dst.node = dev_i->node;
+
+	msg.payload[0] = p2p_ji->ppi_da[SCIF_PPI_APER];
+	msg.payload[1] = p2p_ij->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[2] = p2p_ij->ppi_da[SCIF_PPI_APER];
+	msg.payload[3] = p2p_ij->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	err = scif_nodeqp_send(dev_i,  &msg);
+	if (err) {
+		pr_err("%s %d error %d\n", __func__, __LINE__, err);
+		return;
+	}
+
+	/* Same as above but to dev_j */
+	msg.uop = SCIF_NODE_ADD;
+	msg.src.node = dev_i->node;
+	msg.dst.node = dev_j->node;
+
+	tmppayload = msg.payload[0];
+	msg.payload[0] = msg.payload[2];
+	msg.payload[2] = tmppayload;
+	msg.payload[1] = p2p_ji->ppi_da[SCIF_PPI_MMIO];
+	msg.payload[3] = p2p_ji->ppi_len[SCIF_PPI_APER] << PAGE_SHIFT;
+
+	scif_nodeqp_send(dev_j, &msg);
+}
+
+static void scif_p2p_setup(void)
+{
+	int i, j;
+
+	if (!scif_info.p2p_enable)
+		return;
+
+	for (i = 1; i <= scif_info.maxid; i++)
+		if (!_scifdev_alive(&scif_dev[i]))
+			return;
+
+	for (i = 1; i <= scif_info.maxid; i++) {
+		for (j = 1; j <= scif_info.maxid; j++) {
+			struct scif_dev *scifdev = &scif_dev[i];
+
+			if (i == j)
+				continue;
+			scif_node_connect(scifdev, j);
+		}
+	}
+}
+
+static int scif_qp_response_ack(struct scif_dev *scifdev, u64 payload)
+{
+	int err = 0;
+	struct scif_peer_dev *spdev;
+	struct scifmsg msg;
+
+	/*
+	 * Allow the thread waiting for device page updates for the peer QP DMA
+	 * address to complete initializing the inbound_q.
+	 */
+	flush_delayed_work(&scifdev->qp_dwork);
+	/*
+	 * scif_setup_qp_connect_response reinitializes the inbound qp including
+	 * the read pointer upon receipt of the INIT message. We need to
+	 * re-receive the INIT message so that the read offset matches the end
+	 * of the INIT message else the INIT message will be received twice
+	 * resulting in problems. This logic is unique to the SCIF kernel
+	 * node QP and is not required by the endpoint QP because the endpoint
+	 * QP does not have an INIT message itself. The endpoint QP gets set up
+	 * via the kernel mode node QP using the CNCT* messages so it does
+	 * not need an INIT message itself.
+	 */
+	scif_rb_get_next(&scifdev->qpairs->inbound_q, &msg,
+			 sizeof(struct scifmsg));
+	scif_rb_update_read_ptr(&scifdev->qpairs->inbound_q);
+
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto done;
+	}
+	if (scif_is_mgmt_node()) {
+		mutex_lock(&scif_info.conflock);
+		scif_p2p_setup();
+		mutex_unlock(&scif_info.conflock);
+	}
+done:
+	return err;
+}
+
+static char *message_types[] = {"BAD",
+				"INIT",
+				"EXIT",
+				"SCIF_EXIT_ACK",
+				"SCIF_NODE_ADD",
+				"SCIF_NODE_ADD_ACK",
+				"SCIF_NODE_ADD_NACK",
+				"REMOVE_NODE",
+				"REMOVE_NODE_ACK"};
+
+static void
+scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
+		     const char *label)
+{
+	if (!scif_info.en_msg_log)
+		return;
+	if (msg->uop > SCIF_MAX_MSG) {
+		pr_debug("%s: unknown msg type %d\n", label, msg->uop);
+		return;
+	}
+	pr_info("%s: msg type %s, src %d:%d, dest %d:%d payload 0x%llx:0x%llx:0x%llx:0x%llx\n",
+		label, message_types[msg->uop], msg->src.node, msg->src.port,
+		msg->dst.node, msg->dst.port, msg->payload[0], msg->payload[1],
+		msg->payload[2], msg->payload[3]);
+}
+
+int _scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_qp *qp = scifdev->qpairs;
+	int err = -ENOMEM, loop_cnt = 0;
+
+	scif_display_message(scifdev, msg, "Sent");
+	if (!qp) {
+		err = -EINVAL;
+		goto error;
+	}
+	spin_lock(&qp->send_lock);
+
+	while ((err = scif_rb_write(&qp->outbound_q,
+				    msg, sizeof(struct scifmsg)))) {
+		mdelay(1);
+#define SCIF_NODEQP_SEND_TO_MSEC (3 * 1000)
+		if (loop_cnt++ > (SCIF_NODEQP_SEND_TO_MSEC)) {
+			err = -ENODEV;
+			break;
+		}
+	}
+	if (!err)
+		scif_rb_commit(&qp->outbound_q);
+	spin_unlock(&qp->send_lock);
+	if (!err) {
+		if (scifdev_self(scifdev))
+			/*
+			 * For loopback we need to emulate an interrupt by
+			 * queuing work for the queue handling real node
+			 * Qp interrupts.
+			 */
+			queue_work(scifdev->intr_wq, &scifdev->intr_bh);
+		else
+			scif_send_msg_intr(scifdev);
+	}
+error:
+	if (err)
+		pr_debug("%s %d error %d uop %d\n",
+			 __func__, __LINE__, err, msg->uop);
+	return err;
+}
+
+/**
+ * scif_nodeqp_send - Send a message on the node queue pair
+ * @scifdev: Scif Device.
+ * @msg: The message to be sent.
+ */
+int scif_nodeqp_send(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int err;
+	struct device *spdev = NULL;
+
+	if (msg->uop > SCIF_EXIT_ACK) {
+		/* Don't send messages once the exit flow has begun */
+		if (OP_IDLE != scifdev->exit)
+			return -ENODEV;
+		spdev = scif_get_peer_dev(scifdev);
+		if (IS_ERR(spdev)) {
+			err = PTR_ERR(spdev);
+			return err;
+		}
+	}
+	err = _scif_nodeqp_send(scifdev, msg);
+	if (msg->uop > SCIF_EXIT_ACK)
+		scif_put_peer_dev(spdev);
+	return err;
+}
+
+/*
+ * scif_misc_handler:
+ *
+ * Work queue handler for servicing miscellaneous SCIF tasks.
+ * Examples include:
+ * 1) Cleanup of zombie endpoints.
+ */
+void scif_misc_handler(struct work_struct *work)
+{
+	scif_cleanup_zombie_epd();
+}
+
+/**
+ * scif_init() - Respond to SCIF_INIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ */
+static __always_inline void
+scif_init(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	scif_qp_response_ack(scifdev, msg->payload[0]);
+}
+
+/**
+ * scif_exit() - Respond to SCIF_EXIT interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * This function stops the SCIF interface for the node which sent
+ * the SCIF_EXIT message and starts waiting for that node to
+ * set up the queue pair again.
+ */
+static __always_inline void
+scif_exit(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit_ack_pending = true;
+	if (scif_is_mgmt_node())
+		scif_disconnect_node(scifdev->node, false);
+	else
+		scif_stop(scifdev);
+	schedule_delayed_work(&scifdev->qp_dwork,
+			      msecs_to_jiffies(1000));
+}
+
+/**
+ * scif_exit_ack() - Respond to SCIF_EXIT_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ */
+static __always_inline void
+scif_exit_ack(struct scif_dev *scifdev, struct scifmsg *unused)
+{
+	scifdev->exit = OP_COMPLETED;
+	wake_up(&scif_info.exitwq);
+}
+
+/**
+ * scif_node_add() - Respond to SCIF_NODE_ADD interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * When the mgmt node driver has finished initializing a MIC node queue pair it
+ * marks the node as online. It then looks for all currently online MIC cards
+ * and sends a SCIF_NODE_ADD message to identify the ID of the new card for
+ * peer to peer initialization.
+ *
+ * The local node allocates its incoming queue and sends its address in the
+ * SCIF_NODE_ADD_ACK message back to the mgmt node. The mgmt node "reflects"
+ * this message to the new node.
+ */
+static __always_inline void
+scif_node_add(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *newdev;
+	dma_addr_t qp_offset;
+	int qp_connect;
+	struct scif_hw_dev *sdev;
+
+	pr_debug("Scifdev %d:%d received NODE_ADD msg for node %d\n",
+		 scifdev->node, msg->dst.node, msg->src.node);
+	pr_debug("Remote address for this node's aperture %llx\n",
+		 msg->payload[0]);
+	newdev = &scif_dev[msg->src.node];
+	newdev->node = msg->src.node;
+	newdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	sdev = newdev->sdev;
+
+	if (scif_setup_intr_wq(newdev)) {
+		pr_err("failed to setup interrupts for %d\n", msg->src.node);
+		goto interrupt_setup_error;
+	}
+	newdev->mmio.va = ioremap_nocache(msg->payload[1], sdev->mmio->len);
+	if (!newdev->mmio.va) {
+		pr_err("failed to map mmio for %d\n", msg->src.node);
+		goto mmio_map_error;
+	}
+	newdev->qpairs = kzalloc(sizeof(*newdev->qpairs), GFP_KERNEL);
+	if (!newdev->qpairs)
+		goto qp_alloc_error;
+	/*
+	 * Set the base address of the remote node's memory since it gets
+	 * added to qp_offset
+	 */
+	newdev->base_addr = msg->payload[0];
+
+	qp_connect = scif_setup_qp_connect(newdev->qpairs, &qp_offset,
+					   SCIF_NODE_QP_SIZE, newdev);
+	if (qp_connect) {
+		pr_err("failed to setup qp_connect %d\n", qp_connect);
+		goto qp_connect_error;
+	}
+
+	newdev->db = sdev->hw_ops->next_db(sdev);
+	newdev->cookie = sdev->hw_ops->request_irq(sdev, scif_intr_handler,
+						   "SCIF_INTR", newdev,
+						   newdev->db);
+	if (IS_ERR(newdev->cookie))
+		goto qp_connect_error;
+	newdev->qpairs->magic = SCIFEP_MAGIC;
+	newdev->qpairs->qp_state = SCIF_QP_OFFLINE;
+
+	msg->uop = SCIF_NODE_ADD_ACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	msg->payload[0] = qp_offset;
+	msg->payload[2] = newdev->db;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+	return;
+qp_connect_error:
+	kfree(newdev->qpairs);
+	newdev->qpairs = NULL;
+qp_alloc_error:
+	iounmap(newdev->mmio.va);
+	newdev->mmio.va = NULL;
+mmio_map_error:
+interrupt_setup_error:
+	pr_err("node add failed for node %d\n", msg->src.node);
+	msg->uop = SCIF_NODE_ADD_NACK;
+	msg->dst.node = msg->src.node;
+	msg->src.node = scif_info.nodeid;
+	scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], msg);
+}
+
+void scif_poll_qp_state(struct work_struct *work)
+{
+#define SCIF_NODE_QP_RETRY 100
+#define SCIF_NODE_QP_TIMEOUT 100
+	struct scif_dev *peerdev = container_of(work, struct scif_dev,
+							p2p_dwork.work);
+	struct scif_qp *qp = &peerdev->qpairs[0];
+
+	if (qp->qp_state != SCIF_QP_ONLINE ||
+	    qp->remote_qp->qp_state != SCIF_QP_ONLINE) {
+		if (peerdev->p2p_retry++ == SCIF_NODE_QP_RETRY) {
+			pr_err("Warning: QP check timeout with state %d\n",
+			       qp->qp_state);
+			goto timeout;
+		}
+		schedule_delayed_work(&peerdev->p2p_dwork,
+				      msecs_to_jiffies(SCIF_NODE_QP_TIMEOUT));
+		return;
+	}
+	scif_peer_register_device(peerdev);
+	return;
+timeout:
+	pr_err("%s %d remote node %d offline,  state = 0x%x\n",
+	       __func__, __LINE__, peerdev->node, qp->qp_state);
+	qp->remote_qp->qp_state = SCIF_QP_OFFLINE;
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_ack() - Respond to SCIF_NODE_ADD_ACK interrupt message
+ * @scifdev:    Remote SCIF device node
+ * @msg:        Interrupt message
+ *
+ * After a MIC node receives the SCIF_NODE_ADD_ACK message it sends this
+ * message to the mgmt node to confirm the sequence is finished.
+ *
+ */
+static __always_inline void
+scif_node_add_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *peerdev;
+	struct scif_qp *qp;
+	struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+	pr_debug("Scifdev %d received SCIF_NODE_ADD_ACK msg src %d dst %d\n",
+		 scifdev->node, msg->src.node, msg->dst.node);
+	pr_debug("payload %llx %llx %llx %llx\n", msg->payload[0],
+		 msg->payload[1], msg->payload[2], msg->payload[3]);
+	if (scif_is_mgmt_node()) {
+		/*
+		 * The lock serializes with scif_qp_response_ack. The mgmt node
+		 * is forwarding the NODE_ADD_ACK message from src to dst, so we
+		 * need to make sure that the dst has already received a
+		 * NODE_ADD for src and set up its end of the qp to dst.
+		 */
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		scif_nodeqp_send(dst_dev, msg);
+		mutex_unlock(&scif_info.conflock);
+		return;
+	}
+	peerdev = &scif_dev[msg->src.node];
+	peerdev->sdev = scif_dev[SCIF_MGMT_NODE].sdev;
+	peerdev->node = msg->src.node;
+
+	qp = &peerdev->qpairs[0];
+
+	if ((scif_setup_qp_connect_response(peerdev, &peerdev->qpairs[0],
+					    msg->payload[0])))
+		goto local_error;
+	peerdev->rdb = msg->payload[2];
+	qp->remote_qp->qp_state = SCIF_QP_ONLINE;
+	schedule_delayed_work(&peerdev->p2p_dwork, 0);
+	return;
+local_error:
+	scif_cleanup_scifdev(peerdev);
+}
+
+/**
+ * scif_node_add_nack: Respond to SCIF_NODE_ADD_NACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * SCIF_NODE_ADD failed, so inform the waiting wq.
+ */
+static __always_inline void
+scif_node_add_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		struct scif_dev *dst_dev = &scif_dev[msg->dst.node];
+
+		pr_debug("SCIF_NODE_ADD_NACK received from %d\n",
+			 scifdev->node);
+		scif_nodeqp_send(dst_dev, msg);
+	}
+}
+
+/*
+ * scif_node_remove: Handle SCIF_NODE_REMOVE message
+ * @msg: Interrupt message
+ *
+ * Handle node removal.
+ */
+static __always_inline void
+scif_node_remove(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	int node = msg->payload[0];
+	struct scif_dev *scdev = &scif_dev[node];
+
+	scdev->node_remove_ack_pending = true;
+	scif_handle_remove_node(node);
+}
+
+/*
+ * scif_node_remove_ack: Handle SCIF_NODE_REMOVE_ACK message
+ * @msg: Interrupt message
+ *
+ * The peer has acked a SCIF_NODE_REMOVE message.
+ */
+static __always_inline void
+scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_dev *sdev = &scif_dev[msg->payload[0]];
+
+	atomic_inc(&sdev->disconn_rescnt);
+	wake_up(&sdev->disconn_wq);
+}
+
+static void
+scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	/* Bogus Node Qp Message? */
+	pr_err("Unknown message 0x%x scifdev->node 0x%x\n",
+	       msg->uop, scifdev->node);
+}
+
+static void (*scif_intr_func[SCIF_MAX_MSG + 1])
+	    (struct scif_dev *, struct scifmsg *msg) = {
+	scif_msg_unknown,	/* Error */
+	scif_init,		/* SCIF_INIT */
+	scif_exit,		/* SCIF_EXIT */
+	scif_exit_ack,		/* SCIF_EXIT_ACK */
+	scif_node_add,		/* SCIF_NODE_ADD */
+	scif_node_add_ack,	/* SCIF_NODE_ADD_ACK */
+	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
+	scif_node_remove,	/* SCIF_NODE_REMOVE */
+	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+};
+
+/**
+ * scif_nodeqp_msg_handler() - Common handler for node messages
+ * @scifdev: Remote device to respond to
+ * @qp: Remote memory pointer
+ * @msg: The message to be handled.
+ *
+ * This routine calls the appropriate routine to handle a Node Qp
+ * message receipt
+ */
+static int scif_max_msg_id = SCIF_MAX_MSG;
+
+static void
+scif_nodeqp_msg_handler(struct scif_dev *scifdev,
+			struct scif_qp *qp, struct scifmsg *msg)
+{
+	scif_display_message(scifdev, msg, "Rcvd");
+
+	if (msg->uop > (u32)scif_max_msg_id) {
+		/* Bogus Node Qp Message? */
+		pr_err("Unknown message 0x%x scifdev->node 0x%x\n",
+		       msg->uop, scifdev->node);
+		return;
+	}
+
+	scif_intr_func[msg->uop](scifdev, msg);
+}
+
+/**
+ * scif_nodeqp_intrhandler() - Interrupt handler for node messages
+ * @scifdev:    Remote device to respond to
+ * @qp:         Remote memory pointer
+ *
+ * This routine is triggered by the interrupt mechanism.  It reads
+ * messages from the node queue RB and calls the Node QP Message handling
+ * routine.
+ */
+void scif_nodeqp_intrhandler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	struct scifmsg msg;
+	int read_size;
+
+	do {
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg, sizeof(msg));
+		if (!read_size)
+			break;
+		scif_nodeqp_msg_handler(scifdev, qp, &msg);
+		/*
+		 * The node queue pair is unmapped so skip the read pointer
+		 * update after receipt of a SCIF_EXIT_ACK
+		 */
+		if (SCIF_EXIT_ACK == msg.uop)
+			break;
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (1);
+}
+
+/**
+ * scif_loopb_wq_handler - Loopback Workqueue Handler.
+ * @work: loop back work
+ *
+ * This work queue routine is invoked by the loopback work queue handler.
+ * It grabs the recv lock, dequeues any available messages from the head
+ * of the loopback message list, calls the node QP message handler,
+ * waits for it to return, then frees up this message and dequeues more
+ * elements of the list if available.
+ */
+static void scif_loopb_wq_handler(struct work_struct *unused)
+{
+	struct scif_dev *scifdev = scif_info.loopb_dev;
+	struct scif_qp *qp = scifdev->qpairs;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = NULL;
+		spin_lock(&qp->recv_lock);
+		if (!list_empty(&scif_info.loopb_recv_q)) {
+			msg = list_first_entry(&scif_info.loopb_recv_q,
+					       struct scif_loopb_msg,
+					       list);
+			list_del(&msg->list);
+		}
+		spin_unlock(&qp->recv_lock);
+
+		if (msg) {
+			scif_nodeqp_msg_handler(scifdev, qp, &msg->msg);
+			kfree(msg);
+		}
+	} while (msg);
+}
+
+/**
+ * scif_loopb_msg_handler() - Workqueue handler for loopback messages.
+ * @scifdev: SCIF device
+ * @qp: Queue pair.
+ *
+ * This work queue routine is triggered when a loopback message is received.
+ *
+ * We need special handling for receiving Node Qp messages on a loopback SCIF
+ * device via two workqueues for receiving messages.
+ *
+ * The reason we need the extra workqueue which is not required with *normal*
+ * non-loopback SCIF devices is the potential classic deadlock described below:
+ *
+ * Thread A tries to send a message on a loopback SCIF device and blocks since
+ * there is no space in the RB while it has the send_lock held or another
+ * lock called lock X for example.
+ *
+ * Thread B: The Loopback Node QP message receive workqueue receives the message
+ * and tries to send a message (eg an ACK) to the loopback SCIF device. It tries
+ * to grab the send lock again or lock X and deadlocks with Thread A. The RB
+ * cannot be drained any further due to this classic deadlock.
+ *
+ * In order to avoid deadlocks as mentioned above we have an extra level of
+ * indirection achieved by having two workqueues.
+ * 1) The first workqueue whose handler is scif_loopb_msg_handler reads
+ * messages from the Node QP RB, adds them to a list and queues work for the
+ * second workqueue.
+ *
+ * 2) The second workqueue whose handler is scif_loopb_wq_handler dequeues
+ * messages from the list, handles them, frees up the memory and dequeues
+ * more elements from the list if possible.
+ */
+int
+scif_loopb_msg_handler(struct scif_dev *scifdev, struct scif_qp *qp)
+{
+	int read_size;
+	struct scif_loopb_msg *msg;
+
+	do {
+		msg = kmalloc(sizeof(*msg), GFP_KERNEL);
+		if (!msg)
+			return -ENOMEM;
+		read_size = scif_rb_get_next(&qp->inbound_q, &msg->msg,
+					     sizeof(struct scifmsg));
+		if (read_size != sizeof(struct scifmsg)) {
+			kfree(msg);
+			scif_rb_update_read_ptr(&qp->inbound_q);
+			break;
+		}
+		spin_lock(&qp->recv_lock);
+		list_add_tail(&msg->list, &scif_info.loopb_recv_q);
+		spin_unlock(&qp->recv_lock);
+		queue_work(scif_info.loopb_wq, &scif_info.loopb_work);
+		scif_rb_update_read_ptr(&qp->inbound_q);
+	} while (read_size == sizeof(struct scifmsg));
+	return read_size;
+}
+
+/**
+ * scif_setup_loopback_qp - One time setup work for Loopback Node Qp.
+ * @scifdev: SCIF device
+ *
+ * Sets up the required loopback workqueues, queue pairs and ring buffers
+ */
+int scif_setup_loopback_qp(struct scif_dev *scifdev)
+{
+	int err = 0;
+	void *local_q;
+	struct scif_qp *qp;
+	struct scif_peer_dev *spdev;
+
+	err = scif_setup_intr_wq(scifdev);
+	if (err)
+		goto exit;
+	INIT_LIST_HEAD(&scif_info.loopb_recv_q);
+	snprintf(scif_info.loopb_wqname, sizeof(scif_info.loopb_wqname),
+		 "SCIF LOOPB %d", scifdev->node);
+	scif_info.loopb_wq =
+		alloc_ordered_workqueue(scif_info.loopb_wqname, 0);
+	if (!scif_info.loopb_wq) {
+		err = -ENOMEM;
+		goto destroy_intr;
+	}
+	INIT_WORK(&scif_info.loopb_work, scif_loopb_wq_handler);
+	/* Allocate Self Qpair */
+	scifdev->qpairs = kzalloc(sizeof(*scifdev->qpairs), GFP_KERNEL);
+	if (!scifdev->qpairs) {
+		err = -ENOMEM;
+		goto destroy_loopb_wq;
+	}
+
+	qp = scifdev->qpairs;
+	qp->magic = SCIFEP_MAGIC;
+	spin_lock_init(&qp->send_lock);
+	spin_lock_init(&qp->recv_lock);
+
+	local_q = kzalloc(SCIF_NODE_QP_SIZE, GFP_KERNEL);
+	if (!local_q) {
+		err = -ENOMEM;
+		goto free_qpairs;
+	}
+	/*
+	 * For loopback the inbound_q and outbound_q are essentially the same
+	 * since the Node sends a message on the loopback interface to the
+	 * outbound_q which is then received on the inbound_q.
+	 */
+	scif_rb_init(&qp->outbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+
+	scif_rb_init(&qp->inbound_q,
+		     &qp->local_read,
+		     &qp->local_write,
+		     local_q, get_count_order(SCIF_NODE_QP_SIZE));
+	scif_info.nodeid = scifdev->node;
+	spdev = scif_peer_register_device(scifdev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto free_local_q;
+	}
+	scif_info.loopb_dev = scifdev;
+	return err;
+free_local_q:
+	kfree(local_q);
+free_qpairs:
+	kfree(scifdev->qpairs);
+destroy_loopb_wq:
+	destroy_workqueue(scif_info.loopb_wq);
+destroy_intr:
+	scif_destroy_intr_wq(scifdev);
+exit:
+	return err;
+}
+
+/**
+ * scif_destroy_loopback_qp - One time uninit work for Loopback Node Qp
+ * @scifdev: SCIF device
+ *
+ * Destroys the workqueues and frees up the Ring Buffer and Queue Pair memory.
+ */
+int scif_destroy_loopback_qp(struct scif_dev *scifdev)
+{
+	struct scif_peer_dev *spdev;
+
+	rcu_read_lock();
+	spdev = rcu_dereference(scifdev->spdev);
+	rcu_read_unlock();
+	if (spdev)
+		scif_peer_unregister_device(spdev);
+	destroy_workqueue(scif_info.loopb_wq);
+	scif_destroy_intr_wq(scifdev);
+	kfree(scifdev->qpairs->outbound_q.rb_base);
+	kfree(scifdev->qpairs);
+	scifdev->sdev = NULL;
+	scif_info.loopb_dev = NULL;
+	return 0;
+}
+
+void scif_destroy_p2p(struct scif_dev *scifdev)
+{
+	struct scif_dev *peer_dev;
+	struct scif_p2p_info *p2p;
+	struct list_head *pos, *tmp;
+	int bd;
+
+	mutex_lock(&scif_info.conflock);
+	/* Free P2P mappings in the given node for all its peer nodes */
+	list_for_each_safe(pos, tmp, &scifdev->p2p) {
+		p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_MMIO],
+			     p2p->sg_nentries[SCIF_PPI_MMIO],
+			     DMA_BIDIRECTIONAL);
+		dma_unmap_sg(&scifdev->sdev->dev, p2p->ppi_sg[SCIF_PPI_APER],
+			     p2p->sg_nentries[SCIF_PPI_APER],
+			     DMA_BIDIRECTIONAL);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+		scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+		list_del(pos);
+		kfree(p2p);
+	}
+
+	/* Free P2P mapping created in the peer nodes for the given node */
+	for (bd = SCIF_MGMT_NODE + 1; bd <= scif_info.maxid; bd++) {
+		peer_dev = &scif_dev[bd];
+		list_for_each_safe(pos, tmp, &peer_dev->p2p) {
+			p2p = list_entry(pos, struct scif_p2p_info, ppi_list);
+			if (p2p->ppi_peer_id == scifdev->node) {
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_MMIO],
+					     p2p->sg_nentries[SCIF_PPI_MMIO],
+					     DMA_BIDIRECTIONAL);
+				dma_unmap_sg(&peer_dev->sdev->dev,
+					     p2p->ppi_sg[SCIF_PPI_APER],
+					     p2p->sg_nentries[SCIF_PPI_APER],
+					     DMA_BIDIRECTIONAL);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_MMIO]);
+				scif_p2p_freesg(p2p->ppi_sg[SCIF_PPI_APER]);
+				list_del(pos);
+				kfree(p2p);
+			}
+		}
+	}
+	mutex_unlock(&scif_info.conflock);
+}
-- 
1.8.2.1


* [PATCH char-misc-next 04/13] misc: mic: SCIF Peer Bus
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (2 preceding siblings ...)
  2014-12-10 19:47 61% ` [PATCH char-misc-next 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
@ 2014-12-10 19:47 78% ` Sudeep Dutt
    2014-12-10 19:47 97% ` [PATCH char-misc-next 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The SCIF driver uses the SCIF peer bus internally to register and
unregister SCIF peer devices, signifying the addition and removal of
peer nodes in the SCIF network. This simplifies remote node handling
within SCIF and will also be used to support device probe/remove for
SCIF client drivers (e.g. netdev over SCIF).
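
The bus match in this patch compares only the first four characters of
the device and driver names. A minimal userspace sketch of that
behaviour follows. peer_dev_name() and peer_dev_match() are stand-ins
written for illustration, and the driver names mentioned in the note
below ("scif_netdev", "virtio_net") are hypothetical, not drivers added
by this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Length of the prefix compared by scif_peer_dev_match() in the patch. */
#define PEER_MATCH_PREFIX_LEN 4

/* Userspace stand-in for dev_set_name(..., "scif_peer-dev%u", dnode). */
static void peer_dev_name(char *buf, size_t len, unsigned int dnode)
{
	assert(buf && len > 0);
	snprintf(buf, len, "scif_peer-dev%u", dnode);
}

/* Mirrors the patch: return !strncmp(dev_name(dv), dr->name, 4); */
static bool peer_dev_match(const char *dev_name, const char *drv_name)
{
	return strncmp(dev_name, drv_name, PEER_MATCH_PREFIX_LEN) == 0;
}
```

With these stand-ins, the peer device for node 2 is named
"scif_peer-dev2", so any driver whose name begins with "scif" (for
example "scif_netdev") matches it, while "virtio_net" does not.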

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_peer_bus.h |  63 +++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c | 124 ++++++++++++++++++++++++++++++++++
 2 files changed, 187 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c

diff --git a/drivers/misc/mic/scif/scif_peer_bus.h b/drivers/misc/mic/scif/scif_peer_bus.h
new file mode 100644
index 0000000..27ed508
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.h
@@ -0,0 +1,63 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef _SCIF_PEER_BUS_H_
+#define _SCIF_PEER_BUS_H_
+
+#include <linux/device.h>
+#include <linux/mic_common.h>
+
+/*
+ * Peer devices show up as PCIe devices for the mgmt node but not the cards.
+ * The mgmt node discovers all the cards on the PCIe bus and informs the other
+ * cards about their peers. Upon notification of a peer a node adds a peer
+ * device to the peer bus to maintain symmetry in the way devices are
+ * discovered across all nodes in the SCIF network.
+ */
+/**
+ * struct scif_peer_dev - representation of a peer SCIF device
+ * @dev: underlying device
+ * @dnode: The destination node which this device will communicate with.
+ */
+struct scif_peer_dev {
+	struct device dev;
+	u8 dnode;
+};
+
+/**
+ * struct scif_peer_driver - operations for a scif_peer I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct scif_peer_driver {
+	struct device_driver driver;
+	const struct scif_peer_dev_id *id_table;
+
+	int (*probe)(struct scif_peer_dev *dev);
+	void (*remove)(struct scif_peer_dev *dev);
+};
+
+int scif_peer_register_driver(struct scif_peer_driver *driver);
+void scif_peer_unregister_driver(struct scif_peer_driver *driver);
+struct scif_dev;
+struct scif_peer_dev *
+scif_peer_register_device(struct scif_dev *sdev);
+void scif_peer_unregister_device(struct scif_peer_dev *sdev);
+int scif_peer_bus_init(void);
+void scif_peer_bus_exit(void);
+#endif /* _SCIF_PEER_BUS_H_ */
diff --git a/drivers/misc/mic/scif/scif_peer_bus.c b/drivers/misc/mic/scif/scif_peer_bus.c
new file mode 100644
index 0000000..589ae9a
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_peer_bus.c
@@ -0,0 +1,124 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ */
+#include "scif_main.h"
+#include "../bus/scif_bus.h"
+#include "scif_peer_bus.h"
+
+static inline struct scif_peer_dev *
+dev_to_scif_peer(struct device *dev)
+{
+	return container_of(dev, struct scif_peer_dev, dev);
+}
+
+static inline struct scif_peer_driver *
+drv_to_scif_peer(struct device_driver *drv)
+{
+	return container_of(drv, struct scif_peer_driver, driver);
+}
+
+static int scif_peer_dev_match(struct device *dv, struct device_driver *dr)
+{
+	return !strncmp(dev_name(dv), dr->name, 4);
+}
+
+static int scif_peer_dev_probe(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	return drv->probe(dev);
+}
+
+static int scif_peer_dev_remove(struct device *d)
+{
+	struct scif_peer_dev *dev = dev_to_scif_peer(d);
+	struct scif_peer_driver *drv = drv_to_scif_peer(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type scif_peer_bus = {
+	.name  = "scif_peer_bus",
+	.match = scif_peer_dev_match,
+	.probe = scif_peer_dev_probe,
+	.remove = scif_peer_dev_remove,
+};
+
+int scif_peer_register_driver(struct scif_peer_driver *driver)
+{
+	driver->driver.bus = &scif_peer_bus;
+	return driver_register(&driver->driver);
+}
+
+void scif_peer_unregister_driver(struct scif_peer_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+
+static void scif_peer_release_dev(struct device *d)
+{
+	struct scif_peer_dev *sdev = dev_to_scif_peer(d);
+	struct scif_dev *scifdev = &scif_dev[sdev->dnode];
+
+	scif_cleanup_scifdev(scifdev);
+	kfree(sdev);
+}
+
+struct scif_peer_dev *
+scif_peer_register_device(struct scif_dev *scifdev)
+{
+	int ret;
+	struct scif_peer_dev *spdev;
+
+	spdev = kzalloc(sizeof(*spdev), GFP_KERNEL);
+	if (!spdev)
+		return ERR_PTR(-ENOMEM);
+
+	spdev->dev.parent = scifdev->sdev->dev.parent;
+	spdev->dev.release = scif_peer_release_dev;
+	spdev->dnode = scifdev->node;
+	spdev->dev.bus = &scif_peer_bus;
+
+	dev_set_name(&spdev->dev, "scif_peer-dev%u", spdev->dnode);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&spdev->dev);
+	if (ret)
+		goto free_spdev;
+	return spdev;
+free_spdev:
+	kfree(spdev);
+	return ERR_PTR(ret);
+}
+
+void scif_peer_unregister_device(struct scif_peer_dev *sdev)
+{
+	device_unregister(&sdev->dev);
+}
+
+int scif_peer_bus_init(void)
+{
+	return bus_register(&scif_peer_bus);
+}
+
+void scif_peer_bus_exit(void)
+{
+	bus_unregister(&scif_peer_bus);
+}
-- 
1.8.2.1


* [PATCH char-misc-next 10/13] misc: mic: SCIF messaging and node enumeration APIs
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (8 preceding siblings ...)
  2014-12-10 19:47 41% ` [PATCH char-misc-next 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
@ 2014-12-10 19:47 52% ` Sudeep Dutt
  2014-12-10 19:47 57% ` [PATCH char-misc-next 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Add the SCIF messaging APIs, which allow sending messages between SCIF
endpoints via a byte-stream ring buffer that has been optimized to
avoid reads across PCIe. The messaging APIs are typically used for
short (< 1024 byte) messages for best performance, while the RDMA
APIs, which will be submitted in a future patch series, are
recommended for larger transfers. The node enumeration API enables a
user to query the number of nodes online in the SCIF network and
their node IDs.
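
To illustrate the byte-stream semantics, here is a minimal userspace
sketch of a best-effort send into a ring buffer. It is a model under
stated assumptions, not the scif_rb code: struct byte_rb, RB_SIZE,
rb_space(), rb_count(), rb_send() and rb_recv() are hypothetical names,
the buffer size is assumed to be a power of two, and one byte is kept
free so that a full buffer can be told apart from an empty one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RB_SIZE 16u	/* hypothetical size; must be a power of two */

struct byte_rb {
	uint8_t buf[RB_SIZE];
	uint32_t read;	/* consumer offset */
	uint32_t write;	/* producer offset */
};

/* Free bytes; one byte stays unused to distinguish full from empty. */
static uint32_t rb_space(const struct byte_rb *rb)
{
	return (rb->read - rb->write - 1) & (RB_SIZE - 1);
}

/* Bytes currently queued for the consumer. */
static uint32_t rb_count(const struct byte_rb *rb)
{
	return (rb->write - rb->read) & (RB_SIZE - 1);
}

/* Non-blocking send: copy as much as fits and return the byte count. */
static int rb_send(struct byte_rb *rb, const void *msg, int len)
{
	const uint8_t *src = msg;
	int sent = 0;

	assert(len >= 0);
	while (sent != len) {
		uint32_t space = rb_space(rb);
		uint32_t want = (uint32_t)(len - sent);
		uint32_t chunk, i;

		if (!space)
			break;	/* a blocking sender would wait here */
		chunk = want < space ? want : space;
		for (i = 0; i < chunk; i++)
			rb->buf[(rb->write + i) & (RB_SIZE - 1)] = src[sent + i];
		rb->write = (rb->write + chunk) & (RB_SIZE - 1);
		sent += (int)chunk;
	}
	return sent;
}

/* Drain up to len bytes, which frees space for the sender. */
static int rb_recv(struct byte_rb *rb, void *msg, int len)
{
	uint8_t *dst = msg;
	uint32_t avail = rb_count(rb);
	uint32_t chunk, i;

	assert(len >= 0);
	chunk = (uint32_t)len < avail ? (uint32_t)len : avail;
	for (i = 0; i < chunk; i++)
		dst[i] = rb->buf[(rb->read + i) & (RB_SIZE - 1)];
	rb->read = (rb->read + chunk) & (RB_SIZE - 1);
	return (int)chunk;
}
```

In this model a non-blocking caller treats a short return from
rb_send() as "buffer full" and retries once the consumer has drained
data, whereas a blocking caller would sleep at the marked point.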

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   4 +
 drivers/misc/mic/scif/scif_api.c    | 398 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    |  32 +++
 drivers/misc/mic/scif/scif_fd.c     |  93 +++++++++
 drivers/misc/mic/scif/scif_nm.c     |  17 ++
 drivers/misc/mic/scif/scif_nodeqp.c |  36 +++-
 6 files changed, 579 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index 65c628e..331322a 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,6 +144,8 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags);
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags);
 void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
@@ -151,6 +153,8 @@ void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
 void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg);
 int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index 28494a3..2fb219b 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -870,3 +870,401 @@ scif_accept_error_epalloc:
 	return err;
 }
 EXPORT_SYMBOL_GPL(scif_accept);
+
+/*
+ * scif_msg_param_check:
+ * @epd: The end point returned from scif_open()
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * Validate parameters for messaging APIs scif_send(..)/scif_recv(..).
+ */
+static inline int scif_msg_param_check(scif_epd_t epd, int len, int flags)
+{
+	int ret = -EINVAL;
+
+	if (len < 0)
+		goto err_ret;
+	if (flags && (!(flags & SCIF_RECV_BLOCK)))
+		goto err_ret;
+	ret = 0;
+err_ret:
+	return ret;
+}
+
+static int _scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_xfer_len = 0, sent_len = 0, write_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_SEND_BLOCK)
+		might_sleep();
+
+	spin_lock(&ep->lock);
+	while (sent_len != len && SCIFEP_CONNECTED == ep->state) {
+		write_count = scif_rb_space(&qp->outbound_q);
+		if (write_count) {
+			/* Best effort to send as much data as possible */
+			curr_xfer_len = min(len - sent_len, write_count);
+			ret = scif_rb_write(&qp->outbound_q, msg,
+					    curr_xfer_len);
+			if (ret < 0)
+				break;
+			/* Success. Update write pointer */
+			scif_rb_commit(&qp->outbound_q);
+			/*
+			 * Send a notification to the peer about the
+			 * produced data message.
+			 */
+			notif_msg.src = ep->port;
+			notif_msg.uop = SCIF_CLIENT_SENT;
+			notif_msg.payload[0] = ep->remote_ep;
+			ret = _scif_nodeqp_send(ep->remote_dev, &notif_msg);
+			if (ret)
+				break;
+			sent_len += curr_xfer_len;
+			msg = msg + curr_xfer_len;
+			continue;
+		}
+		curr_xfer_len = min(len - sent_len, SCIF_ENDPT_QP_SIZE - 1);
+		/* Not enough RB space. return for the Non Blocking case */
+		if (!(flags & SCIF_SEND_BLOCK))
+			break;
+
+		spin_unlock(&ep->lock);
+		/* Wait for a SCIF_CLIENT_RCVD message in the Blocking case */
+		ret =
+		wait_event_interruptible(ep->sendwq,
+					 (SCIFEP_CONNECTED != ep->state) ||
+					 (scif_rb_space(&qp->outbound_q) >=
+					 curr_xfer_len));
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (sent_len)
+		ret = sent_len;
+	else if (!ret && SCIFEP_CONNECTED != ep->state)
+		ret = SCIFEP_DISCONNECTED == ep->state ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+static int _scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	int read_size;
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	struct scifmsg notif_msg;
+	int curr_recv_len = 0, remaining_len = len, read_count;
+	int ret = 0;
+	struct scif_qp *qp = ep->qp_info.qp;
+
+	if (flags & SCIF_RECV_BLOCK)
+		might_sleep();
+	spin_lock(&ep->lock);
+	while (remaining_len && (SCIFEP_CONNECTED == ep->state ||
+				 SCIFEP_DISCONNECTED == ep->state)) {
+		read_count = scif_rb_count(&qp->inbound_q, remaining_len);
+		if (read_count) {
+			/*
+			 * Best effort to recv as much data as there
+			 * are bytes to read in the RB particularly
+			 * important for the Non Blocking case.
+			 */
+			curr_recv_len = min(remaining_len, read_count);
+			read_size = scif_rb_get_next(&qp->inbound_q,
+						     msg, curr_recv_len);
+			if (ep->state == SCIFEP_CONNECTED) {
+				/*
+				 * Update the read pointer only if the endpoint
+				 * is still connected else the read pointer
+				 * might no longer exist since the peer has
+				 * freed resources!
+				 */
+				scif_rb_update_read_ptr(&qp->inbound_q);
+				/*
+				 * Send a notification to the peer about the
+				 * consumed data message only if the EP is in
+				 * SCIFEP_CONNECTED state.
+				 */
+				notif_msg.src = ep->port;
+				notif_msg.uop = SCIF_CLIENT_RCVD;
+				notif_msg.payload[0] = ep->remote_ep;
+				ret = _scif_nodeqp_send(ep->remote_dev,
+							&notif_msg);
+				if (ret)
+					break;
+			}
+			remaining_len -= curr_recv_len;
+			msg = msg + curr_recv_len;
+			continue;
+		}
+		/*
+		 * Bail out now if the EP is in SCIFEP_DISCONNECTED state else
+		 * we will keep looping forever.
+		 */
+		if (ep->state == SCIFEP_DISCONNECTED)
+			break;
+		/*
+		 * Return in the Non Blocking case if there is no data
+		 * to read in this iteration.
+		 */
+		if (!(flags & SCIF_RECV_BLOCK))
+			break;
+		curr_recv_len = min(remaining_len, SCIF_ENDPT_QP_SIZE - 1);
+		spin_unlock(&ep->lock);
+		/*
+		 * Wait for a SCIF_CLIENT_SENT message in the blocking case
+		 * or until other side disconnects.
+		 */
+		ret =
+		wait_event_interruptible(ep->recvwq,
+					 SCIFEP_CONNECTED != ep->state ||
+					 scif_rb_count(&qp->inbound_q,
+						       curr_recv_len)
+					 >= curr_recv_len);
+		spin_lock(&ep->lock);
+		if (ret)
+			break;
+	}
+	if (len - remaining_len)
+		ret = len - remaining_len;
+	else if (!ret && ep->state != SCIFEP_CONNECTED)
+		ret = ep->state == SCIFEP_DISCONNECTED ?
+			-ECONNRESET : -ENOTCONN;
+	spin_unlock(&ep->lock);
+	return ret;
+}
+
+/**
+ * scif_user_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of the user buffer holding the data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_send().
+ */
+int scif_user_send(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int sent_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	pr_debug("SCIFAPI send (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto send_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto send_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->sendlock);
+	while (sent_len != len) {
+		loop_len = len - sent_len;
+		loop_len = min(chunk_len, loop_len);
+		if (copy_from_user(tmp, msg, loop_len)) {
+			err = -EFAULT;
+			goto send_free_err;
+		}
+		err = _scif_send(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto send_free_err;
+		sent_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto send_free_err;
+	}
+send_free_err:
+	mutex_unlock(&ep->sendlock);
+	kfree(tmp);
+send_err:
+	return err < 0 ? err : sent_len;
+}
+
+/**
+ * scif_user_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the driver IOCTL entry point
+ * only and is a wrapper for _scif_recv().
+ */
+int scif_user_recv(scif_epd_t epd, void __user *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	int recv_len = 0;
+	char *tmp;
+	int loop_len;
+	int chunk_len = min(len, (1 << (MAX_ORDER + PAGE_SHIFT - 1)));
+
+	pr_debug("SCIFAPI recv (U): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	err = scif_msg_param_check(epd, len, flags);
+	if (err)
+		goto recv_err;
+
+	tmp = kmalloc(chunk_len, GFP_KERNEL);
+	if (!tmp) {
+		err = -ENOMEM;
+		goto recv_err;
+	}
+	/*
+	 * Grabbing the lock before breaking up the transfer in
+	 * multiple chunks is required to ensure that messages do
+	 * not get fragmented and reordered.
+	 */
+	mutex_lock(&ep->recvlock);
+	while (recv_len != len) {
+		loop_len = len - recv_len;
+		loop_len = min(chunk_len, loop_len);
+		err = _scif_recv(epd, tmp, loop_len, flags);
+		if (err < 0)
+			goto recv_free_err;
+		if (copy_to_user(msg, tmp, err)) {
+			err = -EFAULT;
+			goto recv_free_err;
+		}
+		recv_len += err;
+		msg += err;
+		if (err != loop_len)
+			goto recv_free_err;
+	}
+recv_free_err:
+	mutex_unlock(&ep->recvlock);
+	kfree(tmp);
+recv_err:
+	return err < 0 ? err : recv_len;
+}
+
+/**
+ * scif_send() - Send data to connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address of the data to send
+ * @len: Length to send
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_send().
+ */
+int scif_send(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	pr_debug("SCIFAPI send (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	if (!ep->remote_dev)
+		return -ENOTCONN;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_send().
+	 */
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_lock(&ep->sendlock);
+
+	ret = _scif_send(epd, msg, len, flags);
+
+	if (flags & SCIF_SEND_BLOCK)
+		mutex_unlock(&ep->sendlock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_send);
+
+/**
+ * scif_recv() - Receive data from connection queue
+ * @epd: The end point returned from scif_open()
+ * @msg: Address to place data
+ * @len: Length to receive
+ * @flags: blocking or non blocking
+ *
+ * This function is called from the kernel mode only and is
+ * a wrapper for _scif_recv().
+ */
+int scif_recv(scif_epd_t epd, void *msg, int len, int flags)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int ret;
+
+	pr_debug("SCIFAPI recv (K): ep %p %s\n", ep, scif_ep_states[ep->state]);
+
+	if (!len)
+		return 0;
+
+	ret = scif_msg_param_check(epd, len, flags);
+	if (ret)
+		return ret;
+	/*
+	 * Grab the mutex lock in the blocking case only
+	 * to ensure messages do not get fragmented/reordered.
+	 * The non blocking mode is protected using spin locks
+	 * in _scif_recv().
+	 */
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_lock(&ep->recvlock);
+
+	ret = _scif_recv(epd, msg, len, flags);
+
+	if (flags & SCIF_RECV_BLOCK)
+		mutex_unlock(&ep->recvlock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scif_recv);
+
+int scif_get_node_ids(u16 *nodes, int len, u16 *self)
+{
+	int online = 0;
+	int offset = 0;
+	int node;
+
+	if (!scif_is_mgmt_node())
+		scif_get_node_info();
+
+	*self = scif_info.nodeid;
+	mutex_lock(&scif_info.conflock);
+	len = min_t(int, len, scif_info.total);
+	for (node = 0; node <= scif_info.maxid; node++) {
+		if (_scifdev_alive(&scif_dev[node])) {
+			online++;
+			if (offset < len)
+				nodes[offset++] = node;
+		}
+	}
+	pr_debug("SCIFAPI get_node_ids total %d online %d filled in %d nodes\n",
+		 scif_info.total, online, offset);
+	mutex_unlock(&scif_info.conflock);
+
+	return online;
+}
+EXPORT_SYMBOL_GPL(scif_get_node_ids);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index 685ee5c..b4bfbb0 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -319,3 +319,35 @@ void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	spin_unlock(&ep->lock);
 	complete(&ep->discon);
 }
+
+/**
+ * scif_clientsend() - Respond to SCIF_CLIENT_SENT interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has produced data; wake up any receiver blocked on the endpoint.
+ */
+void scif_clientsend(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_clientrcvd() - Respond to SCIF_CLIENT_RCVD interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has consumed data; wake up any sender blocked on the endpoint.
+ */
+void scif_clientrcvd(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTED == ep->state)
+		wake_up_interruptible(&ep->sendwq);
+	spin_unlock(&ep->lock);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index 4ccdf7d..c590524 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -69,6 +69,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
 	int err = 0;
+	struct scifioctl_msg request;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -197,6 +198,98 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 		f->private_data = newep;
 		return 0;
 	}
+	case SCIF_SEND:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = scif_user_send(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto send_err;
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+				 &err, sizeof(err))) {
+			err = -EFAULT;
+			goto send_err;
+		}
+		err = 0;
+send_err:
+		scif_err_debug(err, "scif_send");
+		return err;
+	}
+	case SCIF_RECV:
+	{
+		struct scif_endpt *priv = f->private_data;
+
+		if (copy_from_user(&request, argp,
+				   sizeof(struct scifioctl_msg))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+
+		err = scif_user_recv(priv, request.msg,
+				     request.len, request.flags);
+		if (err < 0)
+			goto recv_err;
+
+		if (copy_to_user(&
+				 ((struct scifioctl_msg __user *)argp)->out_len,
+			&err, sizeof(err))) {
+			err = -EFAULT;
+			goto recv_err;
+		}
+		err = 0;
+recv_err:
+		scif_err_debug(err, "scif_recv");
+		return err;
+	}
+	case SCIF_GET_NODEIDS:
+	{
+		struct scifioctl_node_ids node_ids;
+		int entries;
+		u16 *nodes;
+		void __user *unodes, *uself;
+		u16 self;
+
+		if (copy_from_user(&node_ids, argp, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err2;
+		}
+
+		entries = min_t(int, scif_info.maxid, node_ids.len);
+		nodes = kmalloc_array(entries, sizeof(u16), GFP_KERNEL);
+		if (entries && !nodes) {
+			err = -ENOMEM;
+			goto getnodes_err2;
+		}
+		node_ids.len = scif_get_node_ids(nodes, entries, &self);
+
+		unodes = node_ids.nodes;
+		if (copy_to_user(unodes, nodes, sizeof(u16) * entries)) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		uself = node_ids.self;
+		if (copy_to_user(uself, &self, sizeof(u16))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+
+		if (copy_to_user(argp, &node_ids, sizeof(node_ids))) {
+			err = -EFAULT;
+			goto getnodes_err1;
+		}
+getnodes_err1:
+		kfree(nodes);
+getnodes_err2:
+		return err;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index f28b74a..e1298ee 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -210,3 +210,20 @@ void scif_disconnect_node(u32 node_id, bool mgmt_initiated)
 				   (atomic_read(&scifdev->disconn_rescnt) == 1),
 				   SCIF_NODE_ALIVE_TIMEOUT);
 }
+
+void scif_get_node_info(void)
+{
+	struct scifmsg msg;
+	DECLARE_COMPLETION_ONSTACK(node_info);
+
+	msg.uop = SCIF_GET_NODE_INFO;
+	msg.src.node = scif_info.nodeid;
+	msg.dst.node = SCIF_MGMT_NODE;
+	msg.payload[3] = (u64)&node_info;
+
+	if ((scif_nodeqp_send(&scif_dev[SCIF_MGMT_NODE], &msg)))
+		return;
+
+	/* Wait for a response with SCIF_GET_NODE_INFO */
+	wait_for_completion(&node_info);
+}
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index a51762d0..0f6bedf 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -587,7 +587,10 @@ static char *message_types[] = {"BAD",
 				"CNCT_GNTNACK",
 				"CNCT_REJ",
 				"DISCNCT",
-				"DISCNT_ACK"};
+				"DISCNT_ACK",
+				"CLIENT_SENT",
+				"CLIENT_RCVD",
+				"SCIF_GET_NODE_INFO"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -944,6 +947,34 @@ scif_node_remove_ack(struct scif_dev *scifdev, struct scifmsg *msg)
 	wake_up(&sdev->disconn_wq);
 }
 
+/**
+ * scif_get_node_info_resp() - Respond to SCIF_GET_NODE_INFO interrupt message
+ * @msg:        Interrupt message
+ *
+ * Retrieve node info, i.e. maxid and total, from the mgmt node.
+ */
+static __always_inline void
+scif_get_node_info_resp(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	if (scif_is_mgmt_node()) {
+		swap(msg->dst.node, msg->src.node);
+		mutex_lock(&scif_info.conflock);
+		msg->payload[1] = scif_info.maxid;
+		msg->payload[2] = scif_info.total;
+		mutex_unlock(&scif_info.conflock);
+		scif_nodeqp_send(scifdev, msg);
+	} else {
+		struct completion *node_info =
+			(struct completion *)msg->payload[3];
+
+		mutex_lock(&scif_info.conflock);
+		scif_info.maxid = msg->payload[1];
+		scif_info.total = msg->payload[2];
+		complete_all(node_info);
+		mutex_unlock(&scif_info.conflock);
+	}
+}
+
 static void
 scif_msg_unknown(struct scif_dev *scifdev, struct scifmsg *msg)
 {
@@ -970,6 +1001,9 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_cnctrej,		/* SCIF_CNCT_REJ */
 	scif_discnct,		/* SCIF_DISCNCT */
 	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
+	scif_clientsend,	/* SCIF_CLIENT_SENT */
+	scif_clientrcvd,	/* SCIF_CLIENT_RCVD */
+	scif_get_node_info_resp,/* SCIF_GET_NODE_INFO */
 };
 
 /**
-- 
1.8.2.1
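
The non-blocking path of _scif_send() in the patch above is a
best-effort copy into a ring buffer: it queues as many bytes as
currently fit and returns that count, which may be short. The following
is a minimal user-space sketch of that idea only. struct ring, RB_SIZE,
model_send() and model_recv() are hypothetical names chosen for
illustration, not part of the SCIF driver, and the locking, peer
notification and blocking wait are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RB_SIZE 8	/* hypothetical ring size; one slot always stays empty */

struct ring {
	unsigned char buf[RB_SIZE];
	size_t head;	/* next slot to write */
	size_t tail;	/* next slot to read */
};

/* Free bytes in the ring (at most RB_SIZE - 1). */
static size_t ring_space(const struct ring *rb)
{
	return (rb->tail + RB_SIZE - rb->head - 1) % RB_SIZE;
}

/* Bytes currently queued in the ring. */
static size_t ring_count(const struct ring *rb)
{
	return (rb->head + RB_SIZE - rb->tail) % RB_SIZE;
}

/*
 * Best-effort, non-blocking send: copy as much of msg as fits and
 * return the number of bytes queued (0 if the ring is full), or -1
 * for a negative length.
 */
int model_send(struct ring *rb, const void *msg, int len)
{
	const unsigned char *p = msg;
	int sent = 0;

	if (len < 0)
		return -1;
	while (sent != len) {
		size_t space = ring_space(rb);
		size_t want = (size_t)(len - sent);
		size_t chunk = want < space ? want : space;

		if (!chunk)
			break;	/* ring full: return what was queued so far */
		for (size_t i = 0; i < chunk; i++) {
			rb->buf[rb->head] = p[sent + i];
			rb->head = (rb->head + 1) % RB_SIZE;
		}
		sent += (int)chunk;
	}
	return sent;
}

/* Best-effort, non-blocking receive: drain up to len queued bytes. */
int model_recv(struct ring *rb, void *msg, int len)
{
	unsigned char *p = msg;
	int got = 0;

	if (len < 0)
		return -1;
	while (got != len && ring_count(rb)) {
		p[got++] = rb->buf[rb->tail];
		rb->tail = (rb->tail + 1) % RB_SIZE;
	}
	return got;
}
```

A caller that needs the whole message delivered has to loop on the
short count, which is what the blocking case in the patch does by
sleeping until the peer reports that it has consumed data.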



* [PATCH char-misc-next 09/13] misc: mic: SCIF connections APIs i.e. accept and connect
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (7 preceding siblings ...)
  2014-12-10 19:47 41% ` [PATCH char-misc-next 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
@ 2014-12-10 19:47 41% ` Sudeep Dutt
  2014-12-10 19:47 52% ` [PATCH char-misc-next 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

From: Nikhil Rao <nikhil.rao@intel.com>

Add the SCIF connection APIs, which establish a SCIF connection
between a pair of SCIF endpoints. A SCIF connection consists of a
dedicated queue-pair between the endpoints. Client messages are
sent over the queue-pair whereas the signaling associated with the
message is multiplexed over the node queue-pair. Similarly other
control messages such as exposing registered memory are also sent
over the node queue-pair. The SCIF endpoints must be in connected
state to exchange messages, register memory, map remote memory and
trigger DMA transfers. SCIF connections can be set up
asynchronously or synchronously.
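
The connect/accept handshake these APIs implement can be modelled as
two small state machines, one per side. The sketch below is
illustrative only: the enum values and the connector_on_msg(),
connector_finish() and acceptor_on_msg() helpers are hypothetical names,
and the transitions are a simplified reading of the SCIF_CNCT_* message
flow, not the driver code itself.

```c
#include <assert.h>

/* Simplified endpoint states (hypothetical names for this sketch). */
enum ep_state { EP_BOUND, EP_CONNECTING, EP_MAPPING, EP_CONNECTED, EP_CLOSING };

/* Control messages exchanged over the node queue-pair. */
enum msg_type { MSG_CNCT_GNT, MSG_CNCT_REJ, MSG_CNCT_GNTACK, MSG_CNCT_GNTNACK };

/* Connecting side: reaction to the peer's answer to a connect request. */
enum ep_state connector_on_msg(enum ep_state s, enum msg_type m)
{
	if (s != EP_CONNECTING)
		return s;		/* not waiting for an answer */
	switch (m) {
	case MSG_CNCT_GNT:
		return EP_MAPPING;	/* granted: go map the peer's QP */
	case MSG_CNCT_REJ:
		return EP_BOUND;	/* refused: fall back to bound */
	default:
		return s;
	}
}

/*
 * Connecting side, after wake-up: try to map the remote QP, update the
 * state and return the message to send back to the accepting side.
 */
enum msg_type connector_finish(enum ep_state *s, int map_ok)
{
	assert(*s == EP_MAPPING);
	if (map_ok) {
		*s = EP_CONNECTED;
		return MSG_CNCT_GNTACK;
	}
	*s = EP_BOUND;
	return MSG_CNCT_GNTNACK;
}

/* Accepting side: reaction of the new endpoint to the connector's reply. */
enum ep_state acceptor_on_msg(enum ep_state s, enum msg_type m)
{
	if (s != EP_CONNECTING)
		return s;
	switch (m) {
	case MSG_CNCT_GNTACK:
		return EP_CONNECTED;	/* both sides are now connected */
	case MSG_CNCT_GNTNACK:
		return EP_CLOSING;	/* peer could not map: tear down */
	default:
		return s;
	}
}
```

Both sides only reach the connected state after the grant has been
acknowledged, which is why messaging, memory registration and DMA
require a connected endpoint.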

Thanks to Johnnie S Peters for authoring parts of this patch during
early bring up of the SCIF driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
---
 drivers/misc/mic/scif/scif_epd.h    |   8 +
 drivers/misc/mic/scif/scif_api.c    | 455 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_epd.c    | 229 ++++++++++++++++++
 drivers/misc/mic/scif/scif_fd.c     | 106 +++++++++
 drivers/misc/mic/scif/scif_main.c   |   1 +
 drivers/misc/mic/scif/scif_nm.c     |  36 +++
 drivers/misc/mic/scif/scif_nodeqp.c |  16 +-
 7 files changed, 850 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mic/scif/scif_epd.h b/drivers/misc/mic/scif/scif_epd.h
index b973b99..65c628e 100644
--- a/drivers/misc/mic/scif/scif_epd.h
+++ b/drivers/misc/mic/scif/scif_epd.h
@@ -144,5 +144,13 @@ int scif_rsrv_port(u16 port);
 void scif_get_port(u16 port);
 int scif_get_new_port(void);
 void scif_put_port(u16 port);
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg);
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg);
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block);
 int __scif_flush(scif_epd_t epd);
 #endif /* SCIF_EPD_H */
diff --git a/drivers/misc/mic/scif/scif_api.c b/drivers/misc/mic/scif/scif_api.c
index d5f10f3..28494a3 100644
--- a/drivers/misc/mic/scif/scif_api.c
+++ b/drivers/misc/mic/scif/scif_api.c
@@ -415,3 +415,458 @@ int scif_listen(scif_epd_t epd, int backlog)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(scif_listen);
+
+/*
+ ************************************************************************
+ * SCIF connection flow:
+ *
+ * 1) A SCIF listening endpoint can call scif_accept(..) to wait for SCIF
+ *	connections via a SCIF_CNCT_REQ message
+ * 2) A SCIF endpoint can initiate a SCIF connection by calling
+ *	scif_connect(..) which calls scif_setup_qp_connect(..) which
+ *	allocates the local qp for the endpoint ring buffer and then sends
+ *	a SCIF_CNCT_REQ to the remote node and waits for a SCIF_CNCT_GNT or
+ *	a SCIF_CNCT_REJ message
+ * 3) The peer node handles a SCIF_CNCT_REQ via scif_cnctreq(..) which
+ *	wakes up any threads blocked in step 1 or sends a SCIF_CNCT_REJ
+ *	message otherwise
+ * 4) A thread blocked waiting for incoming connections allocates its local
+ *	endpoint QP and ring buffer following which it sends a SCIF_CNCT_GNT
+ *	and waits for a SCIF_CNCT_GNT(N)ACK. If the allocation fails then
+ *	the node sends a SCIF_CNCT_REJ message
+ * 5) Upon receipt of a SCIF_CNCT_GNT or a SCIF_CNCT_REJ message the
+ *	connecting endpoint is woken up as part of handling
+ *	scif_cnctgnt(..) following which it maps the remote endpoint's
+ *	QP, updates its outbound QP and sends a SCIF_CNCT_GNTACK message on
+ *	success or a SCIF_CNCT_GNTNACK message on failure and completes
+ *	the scif_connect(..) API
+ * 6) Upon receipt of a SCIF_CNCT_GNT(N)ACK the accepting endpoint blocked
+ *	in step 4 is woken up and completes the scif_accept(..) API
+ * 7) The SCIF connection is now established between the two SCIF endpoints.
+ */
+static int scif_conn_func(struct scif_endpt *ep)
+{
+	int err = 0;
+	struct scifmsg msg;
+	struct device *spdev;
+
+	/* Initiate the first part of the endpoint QP setup */
+	err = scif_setup_qp_connect(ep->qp_info.qp, &ep->qp_info.qp_offset,
+				    SCIF_ENDPT_QP_SIZE, ep->remote_dev);
+	if (err) {
+		pr_err("%s err %d qp_offset 0x%llx\n",
+		       __func__, err, ep->qp_info.qp_offset);
+		ep->state = SCIFEP_BOUND;
+		goto connect_error_simple;
+	}
+
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	/* Format connect message and send it */
+	msg.src = ep->port;
+	msg.dst = ep->conn_port;
+	msg.uop = SCIF_CNCT_REQ;
+	msg.payload[0] = (u64)ep;
+	msg.payload[1] = ep->qp_info.qp_offset;
+	err = _scif_nodeqp_send(ep->remote_dev, &msg);
+	if (err)
+		goto connect_error_dec;
+	scif_put_peer_dev(spdev);
+	/*
+	 * Wait for the remote node to respond with SCIF_CNCT_GNT or
+	 * SCIF_CNCT_REJ message.
+	 */
+	wait_event_timeout(ep->conwq, ep->state != SCIFEP_CONNECTING,
+			   SCIF_NODE_ALIVE_TIMEOUT);
+	spdev = scif_get_peer_dev(ep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto cleanup_qp;
+	}
+	if (ep->state == SCIFEP_MAPPING) {
+		err = scif_setup_qp_connect_response(ep->remote_dev,
+						     ep->qp_info.qp,
+						     ep->qp_info.gnt_pld);
+		/*
+		 * If the resources to map the queue are not available then
+		 * we need to tell the other side to terminate the accept
+		 */
+		if (err) {
+			pr_err("%s %d err %d\n", __func__, __LINE__, err);
+			msg.uop = SCIF_CNCT_GNTNACK;
+			msg.payload[0] = ep->remote_ep;
+			_scif_nodeqp_send(ep->remote_dev, &msg);
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+
+		msg.uop = SCIF_CNCT_GNTACK;
+		msg.payload[0] = ep->remote_ep;
+		err = _scif_nodeqp_send(ep->remote_dev, &msg);
+		if (err) {
+			ep->state = SCIFEP_BOUND;
+			goto connect_error_dec;
+		}
+		ep->state = SCIFEP_CONNECTED;
+		mutex_lock(&scif_info.connlock);
+		list_add_tail(&ep->list, &scif_info.connected);
+		mutex_unlock(&scif_info.connlock);
+		pr_debug("SCIFAPI connect: ep %p connected\n", ep);
+	} else if (ep->state == SCIFEP_BOUND) {
+		pr_debug("SCIFAPI connect: ep %p connection refused\n", ep);
+		err = -ECONNREFUSED;
+		goto connect_error_dec;
+	}
+	scif_put_peer_dev(spdev);
+	return err;
+connect_error_dec:
+	scif_put_peer_dev(spdev);
+cleanup_qp:
+	scif_cleanup_ep_qp(ep);
+connect_error_simple:
+	return err;
+}
+
+/*
+ * scif_conn_handler:
+ *
+ * Workqueue handler for servicing non-blocking SCIF connect
+ *
+ */
+void scif_conn_handler(struct work_struct *work)
+{
+	struct scif_endpt *ep;
+
+	do {
+		ep = NULL;
+		spin_lock(&scif_info.nb_connect_lock);
+		if (!list_empty(&scif_info.nb_connect_list)) {
+			ep = list_first_entry(&scif_info.nb_connect_list,
+					      struct scif_endpt, conn_list);
+			list_del(&ep->conn_list);
+		}
+		spin_unlock(&scif_info.nb_connect_lock);
+		if (ep)
+			ep->conn_err = scif_conn_func(ep);
+	} while (ep);
+}
+
+int __scif_connect(scif_epd_t epd, struct scif_port_id *dst, bool non_block)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)epd;
+	int err = 0;
+	struct scif_dev *remote_dev;
+	struct device *spdev;
+
+	pr_debug("SCIFAPI connect: ep %p %s\n", ep,
+		 scif_ep_states[ep->state]);
+
+	if (!scif_dev || dst->node > scif_info.maxid)
+		return -ENODEV;
+
+	might_sleep();
+
+	remote_dev = &scif_dev[dst->node];
+	spdev = scif_get_peer_dev(remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		return err;
+	}
+
+	spin_lock(&ep->lock);
+	switch (ep->state) {
+	case SCIFEP_ZOMBIE:
+	case SCIFEP_CLOSING:
+		err = -EINVAL;
+		break;
+	case SCIFEP_DISCONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EINVAL;
+		break;
+	case SCIFEP_LISTENING:
+	case SCIFEP_CLLISTEN:
+		err = -EOPNOTSUPP;
+		break;
+	case SCIFEP_CONNECTING:
+	case SCIFEP_MAPPING:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			err = -EINPROGRESS;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_CONNECTED:
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS)
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		else
+			err = -EISCONN;
+		break;
+	case SCIFEP_UNBOUND:
+		ep->port.port = scif_get_new_port();
+		if (!ep->port.port) {
+			err = -ENOSPC;
+		} else {
+			ep->port.node = scif_info.nodeid;
+			ep->conn_async_state = ASYNC_CONN_IDLE;
+		}
+		/* Fall through */
+	case SCIFEP_BOUND:
+		/*
+		 * If a non-blocking connect has been already initiated
+		 * (conn_async_state is either ASYNC_CONN_INPROGRESS or
+		 * ASYNC_CONN_FLUSH_WORK), the end point could end up in
+		 * SCIFEP_BOUND due to an error in the connection process
+		 * (e.g., connection refused). If conn_async_state is
+		 * ASYNC_CONN_INPROGRESS - transition to ASYNC_CONN_FLUSH_WORK
+		 * so that the error status can be collected. If the state is
+		 * already ASYNC_CONN_FLUSH_WORK - then set the error to
+		 * EINPROGRESS since some other thread is waiting to collect
+		 * error status.
+		 */
+		if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+			ep->conn_async_state = ASYNC_CONN_FLUSH_WORK;
+		} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+			err = -EINPROGRESS;
+		} else {
+			ep->conn_port = *dst;
+			init_waitqueue_head(&ep->sendwq);
+			init_waitqueue_head(&ep->recvwq);
+			init_waitqueue_head(&ep->conwq);
+			ep->conn_async_state = 0;
+
+			if (unlikely(non_block))
+				ep->conn_async_state = ASYNC_CONN_INPROGRESS;
+		}
+		break;
+	}
+
+	if (err || ep->conn_async_state == ASYNC_CONN_FLUSH_WORK)
+		goto connect_simple_unlock1;
+
+	ep->state = SCIFEP_CONNECTING;
+	ep->remote_dev = &scif_dev[dst->node];
+	ep->qp_info.qp->magic = SCIFEP_MAGIC;
+	if (ep->conn_async_state == ASYNC_CONN_INPROGRESS) {
+		spin_lock(&scif_info.nb_connect_lock);
+		list_add_tail(&ep->conn_list, &scif_info.nb_connect_list);
+		spin_unlock(&scif_info.nb_connect_lock);
+		err = -EINPROGRESS;
+		schedule_work(&scif_info.conn_work);
+	}
+connect_simple_unlock1:
+	spin_unlock(&ep->lock);
+	scif_put_peer_dev(spdev);
+	if (err) {
+		return err;
+	} else if (ep->conn_async_state == ASYNC_CONN_FLUSH_WORK) {
+		flush_work(&scif_info.conn_work);
+		err = ep->conn_err;
+		spin_lock(&ep->lock);
+		ep->conn_async_state = ASYNC_CONN_IDLE;
+		spin_unlock(&ep->lock);
+	} else {
+		err = scif_conn_func(ep);
+	}
+	return err;
+}
+
+int scif_connect(scif_epd_t epd, struct scif_port_id *dst)
+{
+	return __scif_connect(epd, dst, false);
+}
+EXPORT_SYMBOL_GPL(scif_connect);
+
+/**
+ * scif_accept() - Accept a connection request from the remote node
+ *
+ * The function accepts a connection request from the remote node.  Successful
+ * completion is indicated by a new end point being created and passed back
+ * to the caller for future reference.
+ *
+ * Upon successful completion a zero will be returned and the peer information
+ * will be filled in.
+ *
+ * If the end point is not in the listening state -EINVAL will be returned.
+ *
+ * If resource allocation fails during the connection sequence, -ENOMEM
+ * will be returned.
+ *
+ * If the function is called without the SCIF_ACCEPT_SYNC flag and no
+ * connection requests are pending it will return -EAGAIN.
+ *
+ * If the remote side is not sending any connection requests the caller may
+ * terminate this function with a signal.  If so a -EINTR will be returned.
+ */
+int scif_accept(scif_epd_t epd, struct scif_port_id *peer,
+		scif_epd_t *newepd, int flags)
+{
+	struct scif_endpt *lep = (struct scif_endpt *)epd;
+	struct scif_endpt *cep;
+	struct scif_conreq *conreq;
+	struct scifmsg msg;
+	int err;
+	struct device *spdev;
+
+	pr_debug("SCIFAPI accept: ep %p %s\n", lep, scif_ep_states[lep->state]);
+
+	if (flags & ~SCIF_ACCEPT_SYNC) {
+		pr_debug("SCIFAPI accept: ep %p invalid flags %x\n",
+			 lep, flags & ~SCIF_ACCEPT_SYNC);
+		return -EINVAL;
+	}
+
+	if (!peer || !newepd) {
+		pr_debug("SCIFAPI accept: ep %p peer %p or newepd %p NULL\n",
+			 lep, peer, newepd);
+		return -EINVAL;
+	}
+
+	might_sleep();
+	spin_lock(&lep->lock);
+	if (lep->state != SCIFEP_LISTENING) {
+		pr_debug("SCIFAPI accept: ep %p not listening\n", lep);
+		spin_unlock(&lep->lock);
+		return -EINVAL;
+	}
+
+	if (!lep->conreqcnt && !(flags & SCIF_ACCEPT_SYNC)) {
+		/* No connection request present and we do not want to wait */
+		spin_unlock(&lep->lock);
+		return -EAGAIN;
+	}
+
+	lep->files = current->files;
+retry_connection:
+	spin_unlock(&lep->lock);
+	/* Wait for the remote node to send us a SCIF_CNCT_REQ */
+	err = wait_event_interruptible(lep->conwq,
+				       (lep->conreqcnt ||
+				       (lep->state != SCIFEP_LISTENING)));
+	if (err)
+		return err;
+
+	if (lep->state != SCIFEP_LISTENING)
+		return -EINTR;
+
+	spin_lock(&lep->lock);
+
+	if (!lep->conreqcnt)
+		goto retry_connection;
+
+	/* Get the first connect request off the list */
+	conreq = list_first_entry(&lep->conlist, struct scif_conreq, list);
+	list_del(&conreq->list);
+	lep->conreqcnt--;
+	spin_unlock(&lep->lock);
+
+	/* Fill in the peer information */
+	peer->node = conreq->msg.src.node;
+	peer->port = conreq->msg.src.port;
+
+	cep = kzalloc(sizeof(*cep), GFP_KERNEL);
+	if (!cep) {
+		err = -ENOMEM;
+		goto scif_accept_error_epalloc;
+	}
+	spin_lock_init(&cep->lock);
+	mutex_init(&cep->sendlock);
+	mutex_init(&cep->recvlock);
+	cep->state = SCIFEP_CONNECTING;
+	cep->remote_dev = &scif_dev[peer->node];
+	cep->remote_ep = conreq->msg.payload[0];
+
+	cep->qp_info.qp = kzalloc(sizeof(*cep->qp_info.qp), GFP_KERNEL);
+	if (!cep->qp_info.qp) {
+		err = -ENOMEM;
+		goto scif_accept_error_qpalloc;
+	}
+
+	cep->qp_info.qp->magic = SCIFEP_MAGIC;
+	spdev = scif_get_peer_dev(cep->remote_dev);
+	if (IS_ERR(spdev)) {
+		err = PTR_ERR(spdev);
+		goto scif_accept_error_map;
+	}
+	err = scif_setup_qp_accept(cep->qp_info.qp, &cep->qp_info.qp_offset,
+				   conreq->msg.payload[1], SCIF_ENDPT_QP_SIZE,
+				   cep->remote_dev);
+	if (err) {
+		pr_debug("SCIFAPI accept: ep %p new %p scif_setup_qp_accept %d qp_offset 0x%llx\n",
+			 lep, cep, err, cep->qp_info.qp_offset);
+		scif_put_peer_dev(spdev);
+		goto scif_accept_error_map;
+	}
+
+	cep->port.node = lep->port.node;
+	cep->port.port = lep->port.port;
+	cep->peer.node = peer->node;
+	cep->peer.port = peer->port;
+	init_waitqueue_head(&cep->sendwq);
+	init_waitqueue_head(&cep->recvwq);
+	init_waitqueue_head(&cep->conwq);
+
+	msg.uop = SCIF_CNCT_GNT;
+	msg.src = cep->port;
+	msg.payload[0] = cep->remote_ep;
+	msg.payload[1] = cep->qp_info.qp_offset;
+	msg.payload[2] = (u64)cep;
+
+	err = _scif_nodeqp_send(cep->remote_dev, &msg);
+	scif_put_peer_dev(spdev);
+	if (err)
+		goto scif_accept_error_map;
+retry:
+	/* Wait for the remote node to respond with SCIF_CNCT_GNT(N)ACK */
+	err = wait_event_timeout(cep->conwq, cep->state != SCIFEP_CONNECTING,
+				 SCIF_NODE_ACCEPT_TIMEOUT);
+	if (!err && scifdev_alive(cep))
+		goto retry;
+	err = !err ? -ENODEV : 0;
+	if (err)
+		goto scif_accept_error_map;
+	kfree(conreq);
+
+	spin_lock(&cep->lock);
+
+	if (cep->state == SCIFEP_CLOSING) {
+		/*
+		 * Remote failed to allocate resources and NAKed the grant.
+		 * There is at this point nothing referencing the new end point.
+		 */
+		spin_unlock(&cep->lock);
+		scif_teardown_ep(cep);
+		kfree(cep);
+
+		/* If called with the sync flag, go back and wait. */
+		if (flags & SCIF_ACCEPT_SYNC) {
+			spin_lock(&lep->lock);
+			goto retry_connection;
+		}
+
+		pr_debug("SCIFAPI accept: ep %p new %p fail\n", lep, cep);
+		return -EAGAIN;
+	}
+
+	scif_get_port(cep->port.port);
+	*newepd = (scif_epd_t)cep;
+	spin_unlock(&cep->lock);
+	return 0;
+scif_accept_error_map:
+	scif_teardown_ep(cep);
+scif_accept_error_qpalloc:
+	kfree(cep);
+scif_accept_error_epalloc:
+	msg.uop = SCIF_CNCT_REJ;
+	msg.dst.node = conreq->msg.src.node;
+	msg.dst.port = conreq->msg.src.port;
+	msg.payload[0] = conreq->msg.payload[0];
+	msg.payload[1] = conreq->msg.payload[1];
+	scif_nodeqp_send(&scif_dev[conreq->msg.src.node], &msg);
+	kfree(conreq);
+	return err;
+}
+EXPORT_SYMBOL_GPL(scif_accept);
diff --git a/drivers/misc/mic/scif/scif_epd.c b/drivers/misc/mic/scif/scif_epd.c
index a2b728a..685ee5c 100644
--- a/drivers/misc/mic/scif/scif_epd.c
+++ b/drivers/misc/mic/scif/scif_epd.c
@@ -76,6 +76,24 @@ void scif_add_epd_to_zombie_list(struct scif_endpt *ep, bool eplock_held)
 	schedule_work(&scif_info.misc_work);
 }
 
+static struct scif_endpt *scif_find_listen_ep(u16 port)
+{
+	struct scif_endpt *ep = NULL;
+	struct list_head *pos, *tmpq;
+
+	spin_lock(&scif_info.eplock);
+	list_for_each_safe(pos, tmpq, &scif_info.listen) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->port.port == port) {
+			spin_lock(&ep->lock);
+			spin_unlock(&scif_info.eplock);
+			return ep;
+		}
+	}
+	spin_unlock(&scif_info.eplock);
+	return NULL;
+}
+
 void scif_cleanup_zombie_epd(void)
 {
 	struct list_head *pos, *tmpq;
@@ -90,3 +108,214 @@ void scif_cleanup_zombie_epd(void)
 	}
 	spin_unlock(&scif_info.eplock);
 }
+
+/**
+ * scif_cnctreq() - Respond to SCIF_CNCT_REQ interrupt message
+ * @msg:        Interrupt message
+ *
+ * This message is initiated by the remote node to request a connection
+ * to the local node.  This function looks for an end point in the
+ * listen state on the requested port id.
+ *
+ * If it finds a listening port it places the connect request on the
+ * listening end point's queue and wakes up any pending accept calls.
+ *
+ * If it does not find a listening end point it sends a connection
+ * reject message to the remote node.
+ */
+void scif_cnctreq(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_conreq *conreq;
+
+	conreq = kmalloc(sizeof(*conreq), GFP_KERNEL);
+	if (!conreq)
+		/* Lack of resources so reject the request. */
+		goto conreq_sendrej;
+
+	ep = scif_find_listen_ep(msg->dst.port);
+	if (!ep)
+		/* Send reject due to no listening ports */
+		goto conreq_sendrej_free;
+
+	if (ep->backlog <= ep->conreqcnt) {
+		/* Send reject due to too many pending requests */
+		spin_unlock(&ep->lock);
+		goto conreq_sendrej_free;
+	}
+
+	conreq->msg = *msg;
+	list_add_tail(&conreq->list, &ep->conlist);
+	ep->conreqcnt++;
+	wake_up_interruptible(&ep->conwq);
+	spin_unlock(&ep->lock);
+	return;
+
+conreq_sendrej_free:
+	kfree(conreq);
+conreq_sendrej:
+	msg->uop = SCIF_CNCT_REJ;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_cnctgnt() - Respond to SCIF_CNCT_GNT interrupt message
+ * @msg:        Interrupt message
+ *
+ * An accept() on the remote node has occurred and sent this message
+ * to indicate success.  Place the end point in the MAPPING state and
+ * save the remote node's memory information.  Then wake up the connect
+ * request so it can finish.
+ */
+void scif_cnctgnt(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->peer.node = msg->src.node;
+		ep->peer.port = msg->src.port;
+		ep->qp_info.gnt_pld = msg->payload[1];
+		ep->remote_ep = msg->payload[2];
+		ep->state = SCIFEP_MAPPING;
+
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctgnt_ack() - Respond to SCIF_CNCT_GNTACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request has finished mapping the local memory.
+ * Place the connection in the connected state and wake up the pending
+ * accept() call.
+ */
+void scif_cnctgnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	mutex_lock(&scif_info.connlock);
+	spin_lock(&ep->lock);
+	/* New ep is now connected with all resources set. */
+	ep->state = SCIFEP_CONNECTED;
+	list_add_tail(&ep->list, &scif_info.connected);
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+}
+
+/**
+ * scif_cnctgnt_nack() - Respond to SCIF_CNCT_GNTNACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote connection request failed to map the local memory it was sent.
+ * Place the end point in the CLOSING state to indicate it and wake up
+ * the pending accept().
+ */
+void scif_cnctgnt_nack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_CLOSING;
+	wake_up(&ep->conwq);
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_cnctrej() - Respond to SCIF_CNCT_REJ interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote end has rejected the connection request.  Set the end
+ * point back to the bound state and wake up the pending connect().
+ */
+void scif_cnctrej(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	if (SCIFEP_CONNECTING == ep->state) {
+		ep->state = SCIFEP_BOUND;
+		wake_up(&ep->conwq);
+	}
+	spin_unlock(&ep->lock);
+}
+
+/**
+ * scif_discnct() - Respond to SCIF_DISCNCT interrupt message
+ * @msg:        Interrupt message
+ *
+ * The remote node has indicated close() has been called on its end
+ * point.  Remove the local end point from the connected list, set its
+ * state to disconnected and ensure accesses to the remote node are
+ * shut down.
+ *
+ * When all accesses to the remote end have completed then send a
+ * DISCNT_ACK to indicate it can remove its resources and complete
+ * the close routine.
+ */
+void scif_discnct(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = NULL;
+	struct scif_endpt *tmpep;
+	struct list_head *pos, *tmpq;
+
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		tmpep = list_entry(pos, struct scif_endpt, list);
+		/*
+		 * The local ep may have sent a disconnect and been closed
+		 * due to a message response time out. It may have been
+		 * allocated again and formed a new connection so we want to
+		 * check if the remote ep matches
+		 */
+		if (((u64)tmpep == msg->payload[1]) &&
+		    ((u64)tmpep->remote_ep == msg->payload[0])) {
+			list_del(pos);
+			ep = tmpep;
+			spin_lock(&ep->lock);
+			break;
+		}
+	}
+
+	/*
+	 * If the terminated end is not found then this side started closing
+	 * before the other side sent the disconnect.  If so the ep will no
+	 * longer be on the connected list.  Regardless the other side
+	 * needs to be acked to let it know close is complete.
+	 */
+	if (!ep) {
+		mutex_unlock(&scif_info.connlock);
+		goto discnct_ack;
+	}
+
+	ep->state = SCIFEP_DISCONNECTED;
+	list_add_tail(&ep->list, &scif_info.disconnected);
+
+	wake_up_interruptible(&ep->sendwq);
+	wake_up_interruptible(&ep->recvwq);
+	spin_unlock(&ep->lock);
+	mutex_unlock(&scif_info.connlock);
+
+discnct_ack:
+	msg->uop = SCIF_DISCNT_ACK;
+	scif_nodeqp_send(&scif_dev[msg->src.node], msg);
+}
+
+/**
+ * scif_discnt_ack() - Respond to SCIF_DISCNT_ACK interrupt message
+ * @msg:        Interrupt message
+ *
+ * Remote side has indicated it has no more references to local resources
+ */
+void scif_discnt_ack(struct scif_dev *scifdev, struct scifmsg *msg)
+{
+	struct scif_endpt *ep = (struct scif_endpt *)msg->payload[0];
+
+	spin_lock(&ep->lock);
+	ep->state = SCIFEP_DISCONNECTED;
+	spin_unlock(&ep->lock);
+	complete(&ep->discon);
+}
diff --git a/drivers/misc/mic/scif/scif_fd.c b/drivers/misc/mic/scif/scif_fd.c
index d81ab7c..4ccdf7d 100644
--- a/drivers/misc/mic/scif/scif_fd.c
+++ b/drivers/misc/mic/scif/scif_fd.c
@@ -68,6 +68,7 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
 	struct scif_endpt *priv = f->private_data;
 	void __user *argp = (void __user *)arg;
+	int err = 0;
 	bool non_block = false;
 
 	non_block = !!(f->f_flags & O_NONBLOCK);
@@ -91,6 +92,111 @@ static long scif_fdioctl(struct file *f, unsigned int cmd, unsigned long arg)
 	}
 	case SCIF_LISTEN:
 		return scif_listen(priv, arg);
+	case SCIF_CONNECT:
+	{
+		struct scifioctl_connect req;
+		struct scif_endpt *ep = (struct scif_endpt *)priv;
+
+		if (copy_from_user(&req, argp, sizeof(req)))
+			return -EFAULT;
+
+		err = __scif_connect(priv, &req.peer, non_block);
+		if (err < 0)
+			return err;
+
+		req.self.node = ep->port.node;
+		req.self.port = ep->port.port;
+
+		if (copy_to_user(argp, &req, sizeof(req)))
+			return -EFAULT;
+
+		return 0;
+	}
+	/*
+	 * Accept is done in two halves.  The request ioctl does the basic
+	 * functionality of accepting the request and returning the information
+	 * about it including the internal ID of the end point.  The register
+	 * is done with the internal ID on a new file descriptor opened by the
+	 * requesting process.
+	 */
+	case SCIF_ACCEPTREQ:
+	{
+		struct scifioctl_accept request;
+		scif_epd_t *ep = (scif_epd_t *)&request.endpt;
+
+		if (copy_from_user(&request, argp, sizeof(request)))
+			return -EFAULT;
+
+		err = scif_accept(priv, &request.peer, ep, request.flags);
+		if (err < 0)
+			return err;
+
+		if (copy_to_user(argp, &request, sizeof(request))) {
+			scif_close(*ep);
+			return -EFAULT;
+		}
+		/*
+		 * Add to the list of user mode eps where the second half
+		 * of the accept is not yet completed.
+		 */
+		spin_lock(&scif_info.eplock);
+		list_add_tail(&((*ep)->miacceptlist), &scif_info.uaccept);
+		list_add_tail(&((*ep)->liacceptlist), &priv->li_accept);
+		(*ep)->listenep = priv;
+		priv->acceptcnt++;
+		spin_unlock(&scif_info.eplock);
+
+		return 0;
+	}
+	case SCIF_ACCEPTREG:
+	{
+		struct scif_endpt *priv = f->private_data;
+		struct scif_endpt *newep;
+		struct scif_endpt *lisep;
+		struct scif_endpt *fep = NULL;
+		struct scif_endpt *tmpep;
+		struct list_head *pos, *tmpq;
+
+		/* Finally replace the pointer to the accepted endpoint */
+		if (copy_from_user(&newep, argp, sizeof(void *)))
+			return -EFAULT;
+
+		/* Remove from the user accept queue */
+		spin_lock(&scif_info.eplock);
+		list_for_each_safe(pos, tmpq, &scif_info.uaccept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, miacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				fep = tmpep;
+				break;
+			}
+		}
+
+		if (!fep) {
+			spin_unlock(&scif_info.eplock);
+			return -ENOENT;
+		}
+
+		lisep = newep->listenep;
+		list_for_each_safe(pos, tmpq, &lisep->li_accept) {
+			tmpep = list_entry(pos,
+					   struct scif_endpt, liacceptlist);
+			if (tmpep == newep) {
+				list_del(pos);
+				lisep->acceptcnt--;
+				break;
+			}
+		}
+
+		spin_unlock(&scif_info.eplock);
+
+		/* Free the resources automatically created from the open. */
+		scif_teardown_ep(priv);
+		scif_add_epd_to_zombie_list(priv, !SCIF_EPLOCK_HELD);
+		f->private_data = newep;
+		return 0;
+	}
 	}
 	return -EINVAL;
 }
diff --git a/drivers/misc/mic/scif/scif_main.c b/drivers/misc/mic/scif/scif_main.c
index b81ad4d..d834552 100644
--- a/drivers/misc/mic/scif/scif_main.c
+++ b/drivers/misc/mic/scif/scif_main.c
@@ -334,6 +334,7 @@ static int _scif_init(void)
 	scif_info.en_msg_log = 0;
 	scif_info.p2p_enable = 1;
 	INIT_WORK(&scif_info.misc_work, scif_misc_handler);
+	INIT_WORK(&scif_info.conn_work, scif_conn_handler);
 	idr_init(&scif_ports);
 	return 0;
 }
diff --git a/drivers/misc/mic/scif/scif_nm.c b/drivers/misc/mic/scif/scif_nm.c
index 0c612db..f28b74a 100644
--- a/drivers/misc/mic/scif/scif_nm.c
+++ b/drivers/misc/mic/scif/scif_nm.c
@@ -20,6 +20,41 @@
 #include "scif_main.h"
 #include "scif_map.h"
 
+/**
+ * scif_invalidate_ep() - Set state for all connected endpoints
+ * to disconnected and wake up all send/recv waitqueues
+ */
+static void scif_invalidate_ep(int node)
+{
+	struct scif_endpt *ep;
+	struct list_head *pos, *tmpq;
+
+	flush_work(&scif_info.conn_work);
+	mutex_lock(&scif_info.connlock);
+	list_for_each_safe(pos, tmpq, &scif_info.disconnected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			spin_lock(&ep->lock);
+			scif_cleanup_ep_qp(ep);
+			spin_unlock(&ep->lock);
+		}
+	}
+	list_for_each_safe(pos, tmpq, &scif_info.connected) {
+		ep = list_entry(pos, struct scif_endpt, list);
+		if (ep->remote_dev->node == node) {
+			list_del(pos);
+			spin_lock(&ep->lock);
+			ep->state = SCIFEP_DISCONNECTED;
+			list_add_tail(&ep->list, &scif_info.disconnected);
+			scif_cleanup_ep_qp(ep);
+			wake_up_interruptible(&ep->sendwq);
+			wake_up_interruptible(&ep->recvwq);
+			spin_unlock(&ep->lock);
+		}
+	}
+	mutex_unlock(&scif_info.connlock);
+}
+
 void scif_free_qp(struct scif_dev *scifdev)
 {
 	struct scif_qp *qp = scifdev->qpairs;
@@ -91,6 +126,7 @@ void scif_cleanup_scifdev(struct scif_dev *dev)
 		scif_destroy_intr_wq(dev);
 	}
 	scif_destroy_p2p(dev);
+	scif_invalidate_ep(dev->node);
 	scif_send_acks(dev);
 	scif_cleanup_qp(dev);
 }
diff --git a/drivers/misc/mic/scif/scif_nodeqp.c b/drivers/misc/mic/scif/scif_nodeqp.c
index 7c72d8f..a51762d0 100644
--- a/drivers/misc/mic/scif/scif_nodeqp.c
+++ b/drivers/misc/mic/scif/scif_nodeqp.c
@@ -580,7 +580,14 @@ static char *message_types[] = {"BAD",
 				"SCIF_NODE_ADD_ACK",
 				"SCIF_NODE_ADD_NACK",
 				"REMOVE_NODE",
-				"REMOVE_NODE_ACK"};
+				"REMOVE_NODE_ACK",
+				"CNCT_REQ",
+				"CNCT_GNT",
+				"CNCT_GNTACK",
+				"CNCT_GNTNACK",
+				"CNCT_REJ",
+				"DISCNCT",
+				"DISCNT_ACK"};
 
 static void
 scif_display_message(struct scif_dev *scifdev, struct scifmsg *msg,
@@ -956,6 +963,13 @@ static void (*scif_intr_func[SCIF_MAX_MSG + 1])
 	scif_node_add_nack,	/* SCIF_NODE_ADD_NACK */
 	scif_node_remove,	/* SCIF_NODE_REMOVE */
 	scif_node_remove_ack,	/* SCIF_NODE_REMOVE_ACK */
+	scif_cnctreq,		/* SCIF_CNCT_REQ */
+	scif_cnctgnt,		/* SCIF_CNCT_GNT */
+	scif_cnctgnt_ack,	/* SCIF_CNCT_GNTACK */
+	scif_cnctgnt_nack,	/* SCIF_CNCT_GNTNACK */
+	scif_cnctrej,		/* SCIF_CNCT_REJ */
+	scif_discnct,		/* SCIF_DISCNCT */
+	scif_discnt_ack,	/* SCIF_DISCNT_ACK */
 };
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 41%]
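
For readers following the message handlers in the patch above, the
endpoint state changes they apply can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions, not the
driver code: the enum names, their values, and the functions
ep_next_state() and ep_accept_walk() are hypothetical, locking and wait
queues are left out, and only the transitions visible in scif_cnctgnt(),
scif_cnctgnt_ack(), scif_cnctgnt_nack() and scif_cnctrej() are covered.

```c
#include <assert.h>

/* Endpoint states named after the ones in the patch; values are arbitrary. */
enum ep_state {
	EP_BOUND,
	EP_CONNECTING,
	EP_MAPPING,
	EP_CONNECTED,
	EP_CLOSING,
};

/* Message types named after the SCIF_CNCT_* opcodes; values are arbitrary. */
enum ep_msg {
	MSG_CNCT_GNT,
	MSG_CNCT_GNTACK,
	MSG_CNCT_GNTNACK,
	MSG_CNCT_REJ,
};

/*
 * Return the state an endpoint moves to when a message arrives.
 * GNT and REJ are honoured only while CONNECTING, mirroring the state
 * checks in the handlers; GNTACK and GNTNACK are applied unconditionally.
 */
static enum ep_state ep_next_state(enum ep_state cur, enum ep_msg msg)
{
	switch (msg) {
	case MSG_CNCT_GNT:
		return cur == EP_CONNECTING ? EP_MAPPING : cur;
	case MSG_CNCT_REJ:
		return cur == EP_CONNECTING ? EP_BOUND : cur;
	case MSG_CNCT_GNTACK:
		return EP_CONNECTED;
	case MSG_CNCT_GNTNACK:
		return EP_CLOSING;
	}
	return cur;
}

/*
 * Hypothetical helper: apply one reply to an endpoint that is waiting in
 * CONNECTING. The waiter is only woken once the state has left CONNECTING,
 * which the assert records.
 */
static enum ep_state ep_accept_walk(enum ep_msg reply)
{
	enum ep_state s = ep_next_state(EP_CONNECTING, reply);

	assert(s != EP_CONNECTING);
	return s;
}
```

In this model a GNTACK takes the accepted endpoint to EP_CONNECTED and a
GNTNACK takes it to EP_CLOSING, which is the case where scif_accept()
either retries (sync flag set) or returns -EAGAIN.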

* [PATCH char-misc-next 05/13] misc: mic: Common MIC header file changes in preparation for SCIF
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (3 preceding siblings ...)
  2014-12-10 19:47 78% ` [PATCH char-misc-next 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
@ 2014-12-10 19:47 97% ` Sudeep Dutt
  2014-12-10 19:47 47% ` [PATCH char-misc-next 06/13] misc: mic: SCIF module initialization Sudeep Dutt
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/common/mic_dev.h |  3 +++
 include/uapi/linux/mic_common.h   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/misc/mic/common/mic_dev.h b/drivers/misc/mic/common/mic_dev.h
index 92999c2..0b58c46 100644
--- a/drivers/misc/mic/common/mic_dev.h
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -48,4 +48,7 @@ struct mic_mw {
 #define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
 #define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
 
+/* Maximum number of DMA channels */
+#define MIC_MAX_DMA_CHAN 4
+
 #endif
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
index 6eb4024..302a2ce 100644
--- a/include/uapi/linux/mic_common.h
+++ b/include/uapi/linux/mic_common.h
@@ -80,6 +80,12 @@ struct mic_device_ctrl {
  * @h2c_config_db: Host to Card Virtio config doorbell set by card
  * @shutdown_status: Card shutdown status set by card
  * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
+ * @tot_nodes: Total number of nodes in the SCIF network
+ * @node_id: Unique id of the node
+ * @h2c_scif_db: Host to card SCIF doorbell set by card
+ * @c2h_scif_db: Card to host SCIF doorbell set by host
+ * @scif_host_dma_addr: SCIF host queue pair DMA address
+ * @scif_card_dma_addr: SCIF card queue pair DMA address
  */
 struct mic_bootparam {
 	__le32 magic;
@@ -88,6 +94,12 @@ struct mic_bootparam {
 	__s8 h2c_config_db;
 	__u8 shutdown_status;
 	__u8 shutdown_card;
+	__u8 tot_nodes;
+	__u8 node_id;
+	__u8 h2c_scif_db;
+	__u8 c2h_scif_db;
+	__u64 scif_host_dma_addr;
+	__u64 scif_card_dma_addr;
 } __attribute__ ((aligned(8)));
 
 /**
-- 
1.8.2.1


^ permalink raw reply related	[relevance 97%]

* [PATCH char-misc-next 12/13] misc: mic: MIC card driver specific changes to enable SCIF
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (10 preceding siblings ...)
  2014-12-10 19:47 57% ` [PATCH char-misc-next 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
@ 2014-12-10 19:47 70% ` Sudeep Dutt
  2014-12-10 19:47 98% ` [PATCH char-misc-next 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

MIC card driver specific changes to enable SCIF. This patch implements
the SCIF hardware bus operations and registers a SCIF device on the
SCIF hardware bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/card/mic_device.h |  11 +++-
 drivers/misc/mic/card/mic_x100.h   |   1 +
 drivers/misc/mic/card/mic_device.c | 132 ++++++++++++++++++++++++++++++++++++-
 drivers/misc/mic/card/mic_x100.c   |  61 ++++++++++++++++-
 5 files changed, 201 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index a2ea3ab..e9f2f56 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -56,7 +56,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86 && INTEL_MIC_BUS
+	depends on 64BIT && X86 && INTEL_MIC_BUS && SCIF_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 844be8f..1dbf83c 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -29,9 +29,9 @@
 
 #include <linux/workqueue.h>
 #include <linux/io.h>
-#include <linux/irqreturn.h>
 #include <linux/interrupt.h>
 #include <linux/mic_bus.h>
+#include "../bus/scif_bus.h"
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -73,6 +73,9 @@ struct mic_device {
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
  * @dma_mbdev: dma device on the MIC virtual bus.
+ * @dma_ch: Array of DMA channels
+ * @num_dma_ch: Number of DMA channels available
+ * @scdev: SCIF device on the SCIF virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -84,6 +87,9 @@ struct mic_driver {
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
 	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch[MIC_MAX_DMA_CHAN];
+	int num_dma_ch;
+	struct scif_hw_dev *scdev;
 };
 
 /**
@@ -122,10 +128,11 @@ void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
 struct mic_irq *
 mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
-		     const char *name, void *data, int intr_src);
+		     const char *name, void *data, int db);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
+void mic_send_p2p_intr(int doorbell, struct mic_mw *mw);
 int mic_db_to_irq(struct mic_driver *mdrv, int db);
 u32 mic_ack_interrupt(struct mic_device *mdev);
 void mic_hw_intr_init(struct mic_driver *mdrv);
diff --git a/drivers/misc/mic/card/mic_x100.h b/drivers/misc/mic/card/mic_x100.h
index d66ea55..7e22249 100644
--- a/drivers/misc/mic/card/mic_x100.h
+++ b/drivers/misc/mic/card/mic_x100.h
@@ -35,6 +35,7 @@
 #define MIC_X100_SBOX_SDBIC0 0x0000CC90
 #define MIC_X100_SBOX_SDBIC0_DBREQ_BIT 0x80000000
 #define MIC_X100_SBOX_RDMASR0	0x0000B180
+#define MIC_X100_SBOX_APICICR0 0x0000A9D0
 
 #define MIC_X100_MAX_DOORBELL_IDX 8
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 83819ee..6338908 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -28,6 +28,8 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/reboot.h>
+#include <linux/dmaengine.h>
+#include <linux/kmod.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_dev.h"
@@ -240,6 +242,111 @@ static void mic_uninit_irq(void)
 	kfree(mdrv->irq_info.irq_usage_count);
 }
 
+static inline struct mic_driver *scdev_to_mdrv(struct scif_hw_dev *scdev)
+{
+	return dev_get_drvdata(scdev->dev.parent);
+}
+
+static struct mic_irq *
+___mic_request_irq(struct scif_hw_dev *scdev,
+		   irqreturn_t (*func)(int irq, void *data),
+				       const char *name, void *data,
+				       int db)
+{
+	return mic_request_card_irq(func, NULL, name, data, db);
+}
+
+static void
+___mic_free_irq(struct scif_hw_dev *scdev,
+		struct mic_irq *cookie, void *data)
+{
+	mic_free_card_irq(cookie, data);
+}
+
+static void ___mic_ack_interrupt(struct scif_hw_dev *scdev, int num)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_ack_interrupt(&mdrv->mdev);
+}
+
+static int ___mic_next_db(struct scif_hw_dev *scdev)
+{
+	return mic_next_card_db();
+}
+
+static void ___mic_send_intr(struct scif_hw_dev *scdev, int db)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_send_intr(&mdrv->mdev, db);
+}
+
+static void ___mic_send_p2p_intr(struct scif_hw_dev *scdev, int db,
+				 struct mic_mw *mw)
+{
+	mic_send_p2p_intr(db, mw);
+}
+
+static void __iomem *
+___mic_ioremap(struct scif_hw_dev *scdev,
+	       phys_addr_t pa, size_t len)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	return mic_card_map(&mdrv->mdev, pa, len);
+}
+
+static void ___mic_iounmap(struct scif_hw_dev *scdev, void __iomem *va)
+{
+	struct mic_driver *mdrv = scdev_to_mdrv(scdev);
+
+	mic_card_unmap(&mdrv->mdev, va);
+}
+
+static struct scif_hw_ops scif_hw_ops = {
+	.request_irq = ___mic_request_irq,
+	.free_irq = ___mic_free_irq,
+	.ack_interrupt = ___mic_ack_interrupt,
+	.next_db = ___mic_next_db,
+	.send_intr = ___mic_send_intr,
+	.send_p2p_intr = ___mic_send_p2p_intr,
+	.ioremap = ___mic_ioremap,
+	.iounmap = ___mic_iounmap,
+};
+
+static int mic_request_dma_chans(struct mic_driver *mdrv)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	request_module("mic_x100_dma");
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	do {
+		chan = dma_request_channel(mask, NULL, NULL);
+		if (chan) {
+			mdrv->dma_ch[mdrv->num_dma_ch++] = chan;
+			if (mdrv->num_dma_ch >= MIC_MAX_DMA_CHAN)
+				break;
+		}
+	} while (chan);
+	dev_info(mdrv->dev, "DMA channels # %d\n", mdrv->num_dma_ch);
+	return mdrv->num_dma_ch;
+}
+
+static void mic_free_dma_chans(struct mic_driver *mdrv)
+{
+	int i = 0;
+
+	for (i = 0; i < mdrv->num_dma_ch; i++) {
+		dma_release_channel(mdrv->dma_ch[i]);
+		mdrv->dma_ch[i] = NULL;
+	}
+	mdrv->num_dma_ch = 0;
+}
+
 /*
  * mic_driver_init - MIC driver initialization tasks.
  *
@@ -248,6 +355,8 @@ static void mic_uninit_irq(void)
 int __init mic_driver_init(struct mic_driver *mdrv)
 {
 	int rc;
+	struct mic_bootparam __iomem *bootparam;
+	u8 node_id;
 
 	g_drv = mdrv;
 	/*
@@ -268,13 +377,32 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 	rc = mic_shutdown_init();
 	if (rc)
 		goto irq_uninit;
+	if (!mic_request_dma_chans(mdrv)) {
+		rc = -ENODEV;
+		goto shutdown_uninit;
+	}
 	rc = mic_devices_init(mdrv);
 	if (rc)
-		goto shutdown_uninit;
+		goto dma_free;
+	bootparam = mdrv->dp;
+	node_id = ioread8(&bootparam->node_id);
+	mdrv->scdev = scif_register_device(mdrv->dev, MIC_SCIF_DEV,
+					   NULL, &scif_hw_ops,
+					   0, node_id, &mdrv->mdev.mmio, NULL,
+					   NULL, mdrv->dp, mdrv->dma_ch,
+					   mdrv->num_dma_ch);
+	if (IS_ERR(mdrv->scdev)) {
+		rc = PTR_ERR(mdrv->scdev);
+		goto device_uninit;
+	}
 	mic_create_card_debug_dir(mdrv);
 	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
 done:
 	return rc;
+device_uninit:
+	mic_devices_uninit(mdrv);
+dma_free:
+	mic_free_dma_chans(mdrv);
 shutdown_uninit:
 	mic_shutdown_uninit();
 irq_uninit:
@@ -294,7 +422,9 @@ put:
 void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
+	scif_unregister_device(mdrv->scdev);
 	mic_devices_uninit(mdrv);
+	mic_free_dma_chans(mdrv);
 	/*
 	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
 	 * The module cannot be unloaded so the only code path to call
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 9d57545..5ae1078 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -70,6 +70,41 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
+/*
+ * mic_x100_send_sbox_intr - Send an MIC_X100_SBOX interrupt to MIC.
+ */
+static void mic_x100_send_sbox_intr(struct mic_mw *mw, int doorbell)
+{
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
+
+	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
+	apicicr_low = (apicicr_low | (1 << 13));
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+static void mic_x100_send_rdmasr_intr(struct mic_mw *mw, int doorbell)
+{
+	int rdmasr_offset = MIC_X100_SBOX_RDMASR0 + (doorbell << 2);
+	/*
+	 * Ensure that the interrupt is ordered w.r.t. previous stores
+	 * to main memory. Fence instructions are not implemented in X100
+	 * since execution is in order but a compiler barrier is still
+	 * required.
+	 */
+	wmb();
+	mic_mmio_write(mw, 0, MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+}
+
 /**
  * mic_ack_interrupt - Device specific interrupt handling.
  * @mdev: pointer to mic_device instance
@@ -91,6 +126,18 @@ static inline int mic_get_rdmasr_irq(int index)
 	return  MIC_X100_RDMASR_IRQ_BASE + index;
 }
 
+void mic_send_p2p_intr(int db, struct mic_mw *mw)
+{
+	int rdmasr_index;
+
+	if (db < MIC_X100_NUM_SBOX_IRQ) {
+		mic_x100_send_sbox_intr(mw, db);
+	} else {
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
+		mic_x100_send_rdmasr_intr(mw, rdmasr_index);
+	}
+}
+
 /**
  * mic_hw_intr_init - Initialize h/w specific interrupt
  * information.
@@ -113,11 +160,15 @@ void mic_hw_intr_init(struct mic_driver *mdrv)
 int mic_db_to_irq(struct mic_driver *mdrv, int db)
 {
 	int rdmasr_index;
+
+	/*
+	 * The total number of doorbell interrupts on the card is 16. Indices
+	 * 0-7 fall in the SBOX category and 8-15 fall in the RDMASR category.
+	 */
 	if (db < MIC_X100_NUM_SBOX_IRQ) {
 		return mic_get_sbox_irq(db);
 	} else {
-		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ +
-			MIC_X100_RDMASR_IRQ_BASE;
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
 		return mic_get_rdmasr_irq(rdmasr_index);
 	}
 }
@@ -243,10 +294,16 @@ static void mic_platform_shutdown(struct platform_device *pdev)
 	mic_remove(pdev);
 }
 
+static u64 mic_dma_mask = DMA_BIT_MASK(64);
+
 static struct platform_device mic_platform_dev = {
 	.name = mic_driver_name,
 	.id   = 0,
 	.num_resources = 0,
+	.dev = {
+		.dma_mask = &mic_dma_mask,
+		.coherent_dma_mask = DMA_BIT_MASK(64),
+	},
 };
 
 static struct platform_driver __refdata mic_platform_driver = {
-- 
1.8.2.1


^ permalink raw reply related	[relevance 70%]
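
The mic_db_to_irq() hunk in the patch above stops adding the RDMASR base
twice. A small user-space sketch makes the difference visible. The
constant values below and the body of get_sbox_irq() are assumptions for
illustration only (the real definitions are not part of this patch);
get_rdmasr_irq() follows the mic_get_rdmasr_irq() context line shown in
the diff.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define NUM_SBOX_IRQ	8
#define SBOX_IRQ_BASE	0
#define RDMASR_IRQ_BASE	17

/* Assumed shape of the SBOX helper; its body is not shown in the patch. */
static int get_sbox_irq(int db)
{
	return SBOX_IRQ_BASE + db;
}

/* Same shape as mic_get_rdmasr_irq() in the diff context. */
static int get_rdmasr_irq(int index)
{
	return RDMASR_IRQ_BASE + index;
}

/* Post-patch mapping: the RDMASR base is applied once, in the helper. */
static int db_to_irq(int db)
{
	assert(db >= 0);
	if (db < NUM_SBOX_IRQ)
		return get_sbox_irq(db);
	return get_rdmasr_irq(db - NUM_SBOX_IRQ);
}

/* Pre-patch mapping: the base is added here and again in the helper. */
static int db_to_irq_old(int db)
{
	assert(db >= 0);
	if (db < NUM_SBOX_IRQ)
		return get_sbox_irq(db);
	return get_rdmasr_irq(db - NUM_SBOX_IRQ + RDMASR_IRQ_BASE);
}
```

With these assumed constants, the first RDMASR doorbell maps to the first
RDMASR irq after the change, while the old code overshoots by one full
RDMASR_IRQ_BASE.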

* [PATCH char-misc-next 13/13] misc: mic: add support for loading/unloading SCIF driver
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
                   ` (11 preceding siblings ...)
  2014-12-10 19:47 70% ` [PATCH char-misc-next 12/13] misc: mic: MIC card " Sudeep Dutt
@ 2014-12-10 19:47 98% ` Sudeep Dutt
  12 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

modprobe SCIF driver upon start and remove it upon unload

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpss | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index cacbdb0..fa0e99e 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -48,18 +48,15 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
+	echo -e $"Loading MIC_HOST, MIC_X100_DMA & SCIF Modules"
 
-	for f in "mic_host" "mic_x100_dma"
-	do
-		modprobe $f
-		RETVAL=$?
-		if [ $RETVAL -ne 0 ]; then
-			failure
-			echo
-			return $RETVAL
-		fi
-	done
+	modprobe -a mic_host mic_x100_dma scif
+	RETVAL=$?
+	if [ $RETVAL -ne 0 ]; then
+		failure
+		echo
+		return $RETVAL
+	fi
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +167,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
-	modprobe -r mic_host mic_x100_dma
+	echo -n $"Removing SCIF, MIC_HOST & MIC_X100_DMA Modules: "
+	modprobe -r scif mic_host mic_x100_dma
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1


^ permalink raw reply related	[relevance 98%]

* [PATCH char-misc-next 02/13] misc: mic: SCIF ring buffer infrastructure
  2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
  2014-12-10 19:47 21% ` [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
@ 2014-12-10 19:47 58% ` Sudeep Dutt
    2014-12-10 19:47 61% ` [PATCH char-misc-next 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

SCIF ring buffer is a single producer, single consumer byte stream ring
buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
used to implement a receive queue for SCIF driver messaging between nodes and
for byte stream messaging between SCIF endpoints. Each SCIF node has a
receive queue for every other SCIF node, and each connected endpoint has a
receive queue for messages from its peer. This pair of receive queues is
referred to as a SCIF queue pair.
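
As a rough illustration of the scheme described above, the following userspace
sketch mimics the offset arithmetic and the write/commit/read flow. It is not
the driver code: the demo_rb names are hypothetical, plain memcpy stands in
for memcpy_toio/memcpy_fromio, and the memory barriers the driver needs
between nodes are only noted in comments. RB_CNT and RB_SPACE use the same
arithmetic as CIRC_CNT and CIRC_SPACE in <linux/circ_buf.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same arithmetic as CIRC_CNT()/CIRC_SPACE(); size must be a power of two. */
#define RB_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define RB_SPACE(head, tail, size) RB_CNT((tail), ((head) + 1), (size))

/* Hypothetical userspace stand-in for struct scif_rb. */
struct demo_rb {
	uint8_t *base;        /* storage for the byte stream */
	uint32_t size;        /* size in bytes, a power of two */
	uint32_t *read_ptr;   /* shared read offset */
	uint32_t *write_ptr;  /* shared write offset */
	uint32_t cur_read;    /* cached read offset */
	uint32_t cur_write;   /* cached write offset */
};

static void demo_rb_init(struct demo_rb *rb, uint8_t *base, uint8_t order,
			 uint32_t *read_ptr, uint32_t *write_ptr)
{
	assert(order < 32);
	rb->base = base;
	rb->size = 1u << order;
	rb->read_ptr = read_ptr;
	rb->write_ptr = write_ptr;
	rb->cur_read = *read_ptr;
	rb->cur_write = *write_ptr;
}

/* Producer: copy the message in, but do not expose it yet. */
static int demo_rb_write(struct demo_rb *rb, const void *msg, uint32_t len)
{
	uint32_t first;

	rb->cur_read = *rb->read_ptr;
	if (RB_SPACE(rb->cur_write, rb->cur_read, rb->size) < len)
		return -1;
	/* Split the copy in two if the message wraps past the end. */
	first = rb->size - rb->cur_write;
	if (first > len)
		first = len;
	memcpy(rb->base + rb->cur_write, msg, first);
	memcpy(rb->base, (const uint8_t *)msg + first, len - first);
	rb->cur_write = (rb->cur_write + len) & (rb->size - 1);
	return 0;
}

/* Producer: publish everything written so far (the driver issues wmb()). */
static void demo_rb_commit(struct demo_rb *rb)
{
	*rb->write_ptr = rb->cur_write;
}

/* Consumer: returns len if a full message was read, otherwise 0. */
static uint32_t demo_rb_read(struct demo_rb *rb, void *msg, uint32_t len)
{
	uint32_t first;

	/* Refresh the cached write offset only when the cache looks short. */
	if (RB_CNT(rb->cur_write, rb->cur_read, rb->size) < len)
		rb->cur_write = *rb->write_ptr;
	if (RB_CNT(rb->cur_write, rb->cur_read, rb->size) < len)
		return 0;
	first = rb->size - rb->cur_read;
	if (first > len)
		first = len;
	memcpy(msg, rb->base + rb->cur_read, first);
	memcpy((uint8_t *)msg + first, rb->base, len - first);
	rb->cur_read = (rb->cur_read + len) & (rb->size - 1);
	/* Hand the space back to the producer (the driver issues mb()). */
	*rb->read_ptr = rb->cur_read;
	return len;
}
```

In this sketch the producer and the consumer each hold their own struct
demo_rb that points at the same storage and the same two shared offsets. One
byte is always left unused so that a full buffer can be told apart from an
empty one, which means an 8 byte buffer accepts at most 7 bytes.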

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/scif/scif_rb.h | 100 ++++++++++++++++
 drivers/misc/mic/scif/scif_rb.c | 248 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 348 insertions(+)
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

diff --git a/drivers/misc/mic/scif/scif_rb.h b/drivers/misc/mic/scif/scif_rb.h
new file mode 100644
index 0000000..ba4bfce
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.h
@@ -0,0 +1,100 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Intel SCIF driver.
+ */
+#ifndef SCIF_RB_H
+#define SCIF_RB_H
+/*
+ * This file describes a general purpose, byte based ring buffer. Writers to the
+ * ring buffer need to synchronize using a lock. The same is true for readers,
+ * although in practice, the ring buffer has a single reader. It is lockless
+ * between producer and consumer so it can handle being used across the PCIe
+ * bus. The ring buffer ensures that there are no reads across the PCIe bus for
+ * performance reasons. Two of these are used to form a single bidirectional
+ * queue-pair across PCIe.
+ */
+/*
+ * struct scif_rb - SCIF Ring Buffer
+ *
+ * @rb_base: The base of the memory used for storing RB messages
+ * @read_ptr: Pointer to the read offset
+ * @write_ptr: Pointer to the write offset
+ * @size: Size of the memory in rb_base
+ * @current_read_offset: Cached read offset for performance
+ * @current_write_offset: Cached write offset for performance
+ */
+struct scif_rb {
+	void *rb_base;
+	uint32_t *read_ptr;
+	uint32_t *write_ptr;
+	uint32_t size;
+	uint32_t current_read_offset;
+	uint32_t current_write_offset;
+};
+
+/* methods used by both */
+void scif_rb_init(struct scif_rb *rb, uint32_t *read_ptr, uint32_t *write_ptr,
+		  void *rb_base, uint8_t size);
+/* writer only methods */
+/* write a new command, then scif_rb_commit() */
+int scif_rb_write(struct scif_rb *rb, void *msg, uint32_t size);
+/* after write(), then scif_rb_commit() */
+void scif_rb_commit(struct scif_rb *rb);
+/* query space available for writing to a RB. */
+uint32_t scif_rb_space(struct scif_rb *rb);
+
+/* reader only methods */
+/* read a new message from the ring buffer of size bytes */
+uint32_t scif_rb_get_next(struct scif_rb *rb, void *msg, uint32_t size);
+/* update the read pointer so that the space can be reused */
+void scif_rb_update_read_ptr(struct scif_rb *rb);
+/* count the number of bytes that can be read */
+uint32_t scif_rb_count(struct scif_rb *rb, uint32_t size);
+#endif
diff --git a/drivers/misc/mic/scif/scif_rb.c b/drivers/misc/mic/scif/scif_rb.c
new file mode 100644
index 0000000..b72da07
--- /dev/null
+++ b/drivers/misc/mic/scif/scif_rb.c
@@ -0,0 +1,248 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * Intel SCIF driver.
+ *
+ */
+#include <linux/circ_buf.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+#include "scif_rb.h"
+
+#define scif_rb_ring_cnt(head, tail, size) CIRC_CNT(head, tail, size)
+#define scif_rb_ring_space(head, tail, size) CIRC_SPACE(head, tail, size)
+
+/**
+ * scif_rb_init - Initializes the ring buffer
+ * @rb: ring buffer
+ * @read_ptr: A pointer to the read offset
+ * @write_ptr: A pointer to the write offset
+ * @rb_base: A pointer to the base of the ring buffer
+ * @size: log2 of the ring buffer size in bytes (buffer is 1 << size bytes)
+ */
+void scif_rb_init(struct scif_rb *rb, u32 *read_ptr, u32 *write_ptr,
+		  void *rb_base, u8 size)
+{
+	rb->rb_base = rb_base;
+	rb->size = (1 << size);
+	rb->read_ptr = read_ptr;
+	rb->write_ptr = write_ptr;
+	rb->current_read_offset = *read_ptr;
+	rb->current_write_offset = *write_ptr;
+}
+
+/* Copies a message to the ring buffer -- handles the wrap around case */
+static void memcpy_torb(struct scif_rb *rb, void *header,
+			void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_toio((void __iomem __force *)header, msg, size1);
+		memcpy_toio((void __iomem __force *)rb->rb_base,
+			    msg + size1, size2);
+	} else {
+		memcpy_toio((void __iomem __force *)header, msg, size);
+	}
+}
+
+/* Copies a message from the ring buffer -- handles the wrap around case */
+static void memcpy_fromrb(struct scif_rb *rb, void *header,
+			  void *msg, u32 size)
+{
+	u32 size1, size2;
+
+	if (header + size >= rb->rb_base + rb->size) {
+		/* Need to call two copies if it wraps around */
+		size1 = (u32)(rb->rb_base + rb->size - header);
+		size2 = size - size1;
+		memcpy_fromio(msg, (void __iomem __force *)header, size1);
+		memcpy_fromio(msg+size1,
+			      (void __iomem __force *)rb->rb_base, size2);
+	} else {
+		memcpy_fromio(msg, (void __iomem __force *)header, size);
+	}
+}
+
+/**
+ * scif_rb_space - Query space available for writing to the RB
+ * @rb: ring buffer
+ *
+ * Return: size available for writing to RB in bytes.
+ */
+u32 scif_rb_space(struct scif_rb *rb)
+{
+	rb->current_read_offset = *rb->read_ptr;
+	/*
+	 * Update from the HW read pointer only once the peer has exposed the
+	 * new empty slot. This barrier is paired with the memory barrier in
+	 * scif_rb_update_read_ptr().
+	 */
+	mb();
+	return scif_rb_ring_space(rb->current_write_offset,
+				  rb->current_read_offset, rb->size);
+}
+
+/**
+ * scif_rb_write - Write a message to the RB
+ * @rb: ring buffer
+ * @msg: buffer to send the message.  Must be at least size bytes long
+ * @size: the size (in bytes) to be copied to the RB
+ *
+ * This API does not block if there isn't enough space in the RB.
+ * Returns: 0 on success or -ENOMEM on failure
+ */
+int scif_rb_write(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header;
+
+	if (scif_rb_space(rb) < size)
+		return -ENOMEM;
+	header = rb->rb_base + rb->current_write_offset;
+	memcpy_torb(rb, header, msg, size);
+	/*
+	 * Wait until scif_rb_commit(). Update the local ring
+	 * buffer data, not the shared data until commit.
+	 */
+	rb->current_write_offset =
+		(rb->current_write_offset + size) & (rb->size - 1);
+	return 0;
+}
+
+/**
+ * scif_rb_commit - To submit the message to let the peer fetch it
+ * @rb: ring buffer
+ */
+void scif_rb_commit(struct scif_rb *rb)
+{
+	/*
+	 * We must ensure ordering between all the data committed
+	 * previously before we expose the new message to the peer by
+	 * updating the write_ptr. This write barrier is paired with
+	 * the read barrier in scif_rb_count(..)
+	 */
+	wmb();
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->write_ptr) = rb->current_write_offset;
+#endif
+}
+
+/**
+ * scif_rb_get - To get next message from the ring buffer
+ * @rb: ring buffer
+ * @size: Number of bytes to be read
+ *
+ * Return: NULL if no bytes to be read from the ring buffer, otherwise the
+ *	pointer to the next byte
+ */
+static void *scif_rb_get(struct scif_rb *rb, u32 size)
+{
+	void *header = NULL;
+
+	if (scif_rb_count(rb, size) >= size)
+		header = rb->rb_base + rb->current_read_offset;
+	return header;
+}
+
+/**
+ * scif_rb_get_next - Read from ring buffer.
+ * @rb: ring buffer
+ * @msg: buffer to hold the message.  Must be at least size bytes long
+ * @size: Number of bytes to be read
+ *
+ * Return: number of bytes read if available bytes are >= size, otherwise
+ * returns zero.
+ */
+u32 scif_rb_get_next(struct scif_rb *rb, void *msg, u32 size)
+{
+	void *header = NULL;
+	u32 read_size = 0;
+
+	header = scif_rb_get(rb, size);
+	if (header) {
+		u32 next_cmd_offset =
+			(rb->current_read_offset + size) & (rb->size - 1);
+
+		read_size = size;
+		rb->current_read_offset = next_cmd_offset;
+		memcpy_fromrb(rb, header, msg, size);
+	}
+	return read_size;
+}
+
+/**
+ * scif_rb_update_read_ptr - Update the read pointer so the space can be reused
+ * @rb: ring buffer
+ */
+void scif_rb_update_read_ptr(struct scif_rb *rb)
+{
+	u32 new_offset;
+
+	new_offset = rb->current_read_offset;
+	/*
+	 * We must ensure ordering between all the data committed or read
+	 * previously before we expose the empty slot to the peer by updating
+	 * the read_ptr. This barrier is paired with the memory barrier in
+	 * scif_rb_space(..)
+	 */
+	mb();
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#ifdef CONFIG_INTEL_MIC_CARD
+	/*
+	 * X100 Si Bug: For the case where a Core is performing an EXT_WR
+	 * followed by a Doorbell Write, the Core must perform two EXT_WR to the
+	 * same address with the same data before it does the Doorbell Write.
+	 * This way, if ordering is violated for the Interrupt Message, it will
+	 * fall just behind the first Posted associated with the first EXT_WR.
+	 */
+	ACCESS_ONCE(*rb->read_ptr) = new_offset;
+#endif
+}
+
+/**
+ * scif_rb_count - Count the number of bytes that can be read
+ * @rb: ring buffer
+ * @size: Number of bytes expected to be read
+ *
+ * Return: number of bytes that can be read from the RB
+ */
+u32 scif_rb_count(struct scif_rb *rb, u32 size)
+{
+	if (scif_rb_ring_cnt(rb->current_write_offset,
+			     rb->current_read_offset,
+			     rb->size) < size) {
+		rb->current_write_offset = *rb->write_ptr;
+		/*
+		 * Update from the HW write pointer if empty only once the peer
+		 * has exposed the new message. This read barrier is paired
+		 * with the write barrier in scif_rb_commit(..)
+		 */
+		smp_rmb();
+	}
+	return scif_rb_ring_cnt(rb->current_write_offset,
+				rb->current_read_offset,
+				rb->size);
+}
-- 
1.8.2.1


^ permalink raw reply related	[relevance 58%]

* [PATCH char-misc-next 00/13] misc: mic: SCIF driver
@ 2014-12-10 19:47 65% Sudeep Dutt
  2014-12-10 19:47 21% ` [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
                   ` (12 more replies)
  0 siblings, 13 replies; 200+ results
From: Sudeep Dutt @ 2014-12-10 19:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Jonathan Corbet, linux-kernel, linux-doc,
	Dave Jiang, Nikhil Rao, Ashutosh Dixit, Sudeep Dutt

The Symmetric Communication Interface (SCIF, pronounced "skiff") is a low
level communications API across PCIe, currently implemented for MIC. At present
SCIF provides inter-node communication within a single host platform, where a
node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
communicating over the PCIe bus while providing an API that is symmetric
across all the nodes in the PCIe network. An important design objective for SCIF
is to deliver the maximum possible performance given the communication
abilities of the hardware. SCIF has been used to implement an offload compiler
runtime and OFED support for MPI implementations for MIC coprocessors.

==== SCIF API Components ====
The SCIF API has the following parts:
1. Connection establishment using a client server model
2. Byte stream messaging intended for short messages
3. Node enumeration to determine online nodes
4. Poll semantics for detection of incoming connections and messages
5. Memory registration to pin down pages
6. Remote memory mapping for low latency CPU accesses via mmap
7. Remote DMA (RDMA) for high bandwidth DMA transfers
8. Fence APIs for RDMA synchronization

SCIF exposes the notion of a connection which can be used by peer processes on
nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
process in a SCIF node initiates a SCIF connection to a peer process on a
different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
which are similar to connection oriented socket APIs. Connected SCIF endpoints
can also register local memory which is followed by data transfer using either
DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
kernel mode clients which are functionally equivalent.

==== SCIF Performance for MIC ====
A DMA bandwidth comparison between the TCP (over Ethernet over PCIe) stack and
SCIF shows the performance advantages of SCIF for HPC applications and runtimes.

             Comparison of TCP and SCIF based BW

  Throughput (GB/sec)
    8 +                                             PCIe Bandwidth ******
      +                                                        TCP ######
    7 +    **************************************             SCIF %%%%%%
      |                       %%%%%%%%%%%%%%%%%%%
    6 +                   %%%%
      |                 %%
      |               %%%
    5 +              %%
      |            %%
    4 +           %%
      |          %%
    3 +         %%
      |        %
    2 +      %%
      |     %%
      |    %
    1 +
      +    ######################################
    0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
      1       10     100      1000   10000   100000
                   Transfer Size (KBytes)

SCIF allows memory sharing via mmap(..) between processes on different PCIe
nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.

For more information on SCIF please refer to the LinuxCon 2013 slides titled
"Intel® MIC x100 Coprocessor Driver - on the Frontiers of Linux & HPC" at
http://events.linuxfoundation.org/sites/events/files/slides/LinuxConMicDriver.pdf
The talk can also be viewed at https://www.youtube.com/watch?v=dylbmmQs4W0

This initial patch series introduces the SCIF API and implements the SCIF
connection, messaging and node enumeration APIs. SCIF poll, mmap, RDMA and
fence APIs will be submitted in a future patch series once the initial base
patches are accepted upstream.

This patch series is divided into 13 patches as follows:
1) SCIF header file documenting the API along with the IOCTL interface

2) SCIF ring buffer is a single producer, single consumer byte stream ring
   buffer optimized for avoiding reads across the PCIe bus. The ring buffer is
   used to implement a receive queue for SCIF driver messaging between nodes and
   for byte stream messaging between SCIF endpoints. Each SCIF node has a
   receive queue for every other SCIF node, and each connected endpoint has a
   receive queue for messages from its peer. This pair of receive queues is
   referred to as a SCIF queue pair.

3) SCIF hardware bus which abstracts the low level hardware details so that the
   same SCIF driver can work without any changes on the host or the card as long
   as the hardware bus operations are implemented.

4) SCIF peer bus is used to register and unregister SCIF peer devices internally
   by the SCIF driver to signify the addition and removal of peer nodes
   respectively from the SCIF network. This simplifies remote node handling
   within SCIF and will also be used to support device probe/remove for SCIF
   client drivers (e.g. netdev over SCIF)

5) Common MIC header file changes to enable SCIF

6) SCIF module initialization, DMA/remote memory mapping APIs and debugfs hooks.

7) SCIF node queue pair setup sets up the kernel mode private node
   queue pairs between all the nodes to enable internal control
   message communication.

8) SCIF character device file operations and kernel APIs for opening and
   closing a user and kernel mode SCIF endpoint. This patch also enables
   binding to a SCIF port and listening for incoming SCIF connections.

9) SCIF connection APIs which establish a SCIF connection between
   a pair of user or kernel mode endpoints.

10) SCIF messaging APIs which allow sending messages between the SCIF endpoints
    via a byte stream based ring buffer which has been optimized to avoid reads
    across PCIe. The node enumeration API enables a user to query for the number
    of online nodes.

11) MIC host driver specific changes to enable SCIF

12) MIC card driver specific changes to enable SCIF

13) Add support for loading/unloading SCIF driver

Nikhil Rao (1):
  misc: mic: SCIF connections APIs i.e. accept and connect

Sudeep Dutt (12):
  misc: mic: SCIF header file and IOCTL interface
  misc: mic: SCIF ring buffer infrastructure
  misc: mic: SCIF Hardware Bus
  misc: mic: SCIF Peer Bus
  misc: mic: Common MIC header file changes in preparation for SCIF
  misc: mic: SCIF module initialization
  misc: mic: SCIF node queue pair setup management
  misc: mic: SCIF open close bind and listen APIs
  misc: mic: SCIF messaging and node enumeration APIs
  misc: mic: MIC host driver specific changes to enable SCIF
  misc: mic: MIC card driver specific changes to enable SCIF
  misc: mic: add support for loading/unloading SCIF driver

 Documentation/mic/mic_overview.txt    |   28 +-
 Documentation/mic/scif_overview.txt   |   62 ++
 drivers/misc/mic/Kconfig              |   40 +-
 drivers/misc/mic/Makefile             |    3 +-
 drivers/misc/mic/bus/Makefile         |    1 +
 drivers/misc/mic/scif/Makefile        |   15 +
 include/uapi/linux/Kbuild             |    1 +
 drivers/misc/mic/bus/scif_bus.h       |  129 ++++
 drivers/misc/mic/card/mic_device.h    |   11 +-
 drivers/misc/mic/card/mic_x100.h      |    1 +
 drivers/misc/mic/common/mic_dev.h     |    3 +
 drivers/misc/mic/host/mic_device.h    |   11 +-
 drivers/misc/mic/host/mic_intr.h      |    3 +-
 drivers/misc/mic/host/mic_smpt.h      |    1 +
 drivers/misc/mic/scif/scif_epd.h      |  160 ++++
 drivers/misc/mic/scif/scif_main.h     |  250 +++++++
 drivers/misc/mic/scif/scif_map.h      |  113 +++
 drivers/misc/mic/scif/scif_nodeqp.h   |  182 +++++
 drivers/misc/mic/scif/scif_peer_bus.h |   63 ++
 drivers/misc/mic/scif/scif_rb.h       |  100 +++
 include/linux/scif.h                  | 1132 ++++++++++++++++++++++++++++
 include/uapi/linux/mic_common.h       |   12 +
 include/uapi/linux/scif_ioctl.h       |  233 ++++++
 drivers/misc/mic/bus/scif_bus.c       |  210 ++++++
 drivers/misc/mic/card/mic_device.c    |  132 +++-
 drivers/misc/mic/card/mic_x100.c      |   61 +-
 drivers/misc/mic/host/mic_boot.c      |  264 ++++++-
 drivers/misc/mic/host/mic_debugfs.c   |   13 +
 drivers/misc/mic/host/mic_main.c      |    6 +
 drivers/misc/mic/host/mic_smpt.c      |    7 +-
 drivers/misc/mic/host/mic_virtio.c    |    6 +-
 drivers/misc/mic/host/mic_x100.c      |    3 +-
 drivers/misc/mic/scif/scif_api.c      | 1270 ++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_debugfs.c  |   82 +++
 drivers/misc/mic/scif/scif_epd.c      |  353 +++++++++
 drivers/misc/mic/scif/scif_fd.c       |  303 ++++++++
 drivers/misc/mic/scif/scif_main.c     |  395 ++++++++++
 drivers/misc/mic/scif/scif_nm.c       |  229 ++++++
 drivers/misc/mic/scif/scif_nodeqp.c   | 1303 +++++++++++++++++++++++++++++++++
 drivers/misc/mic/scif/scif_peer_bus.c |  124 ++++
 drivers/misc/mic/scif/scif_ports.c    |  124 ++++
 drivers/misc/mic/scif/scif_rb.c       |  248 +++++++
 Documentation/mic/mpssd/mpss          |   23 +-
 43 files changed, 7654 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/mic/scif_overview.txt
 create mode 100644 drivers/misc/mic/scif/Makefile
 create mode 100644 drivers/misc/mic/bus/scif_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_epd.h
 create mode 100644 drivers/misc/mic/scif/scif_main.h
 create mode 100644 drivers/misc/mic/scif/scif_map.h
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.h
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.h
 create mode 100644 drivers/misc/mic/scif/scif_rb.h
 create mode 100644 include/linux/scif.h
 create mode 100644 include/uapi/linux/scif_ioctl.h
 create mode 100644 drivers/misc/mic/bus/scif_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_api.c
 create mode 100644 drivers/misc/mic/scif/scif_debugfs.c
 create mode 100644 drivers/misc/mic/scif/scif_epd.c
 create mode 100644 drivers/misc/mic/scif/scif_fd.c
 create mode 100644 drivers/misc/mic/scif/scif_main.c
 create mode 100644 drivers/misc/mic/scif/scif_nm.c
 create mode 100644 drivers/misc/mic/scif/scif_nodeqp.c
 create mode 100644 drivers/misc/mic/scif/scif_peer_bus.c
 create mode 100644 drivers/misc/mic/scif/scif_ports.c
 create mode 100644 drivers/misc/mic/scif/scif_rb.c

-- 
1.8.2.1


^ permalink raw reply	[relevance 65%]

* [PATCH char-misc-next 1/2] misc: mic: Introduce the managed version of ioremap
  2014-07-14 21:23 99% [PATCH char-misc-next 0/2] Fix minor issues reported in MIC drivers Sudeep Dutt
@ 2014-07-14 21:23 96% ` Sudeep Dutt
  2014-07-14 21:23 99% ` [PATCH char-misc-next 2/2] dma: Fix MIC X100 DMA Driver Kconfig option Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-14 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dan Williams, Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao,
	Sudeep Dutt, Paul Bolle, Himangi Saraogi, linux-kernel,
	dmaengine

From: Himangi Saraogi <himangi774@gmail.com>

This patch replaces ioremap with the managed devm_ioremap and removes the
now unnecessary iounmap calls from the probe and remove functions. The
iounmap label, which is no longer needed, is removed as well.
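
To illustrate why the explicit iounmap calls can go away, here is a toy
userspace model of device-managed resources. It is only a sketch: the toy_*
names are hypothetical and this is not how the kernel implements devres. The
idea it models is that a resource registered against a device is released
automatically when probe fails or the driver is detached, so the error path
needs no unwind label for it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy model of a managed resource, not the kernel's devres. */
struct toy_res {
	void (*release)(void *data);
	void *data;
	struct toy_res *next;
};

struct toy_dev {
	struct toy_res *res;	/* most recently added resource first */
};

/* Register a release callback that runs when the device is torn down. */
static int toy_devm_add(struct toy_dev *dev, void (*release)(void *),
			void *data)
{
	struct toy_res *r;

	assert(dev && release);
	r = malloc(sizeof(*r));
	if (!r)
		return -1;
	r->release = release;
	r->data = data;
	r->next = dev->res;
	dev->res = r;
	return 0;
}

/* Release everything in reverse order of registration. */
static void toy_dev_release_all(struct toy_dev *dev)
{
	while (dev->res) {
		struct toy_res *r = dev->res;

		dev->res = r->next;
		r->release(r->data);
		free(r);
	}
}

/* Stands in for iounmap(): marks the mapping as gone. */
static void toy_unmap(void *data)
{
	*(int *)data = 0;
}

/* Probe with a managed mapping: the failure path has no unmap step. */
static int toy_probe(struct toy_dev *dev, int *mapped, int fail_later)
{
	*mapped = 1;	/* stands in for a successful ioremap() */
	if (toy_devm_add(dev, toy_unmap, mapped)) {
		*mapped = 0;
		return -1;
	}
	if (fail_later)
		return -1;	/* the caller releases the mapping for us */
	return 0;
}
```

In this model the caller plays the role of the driver core: after a failed
toy_probe() it calls toy_dev_release_all(), which undoes the mapping without
any cleanup code in the probe function itself.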

Link: https://lkml.org/lkml/2014/6/1/191
Tested-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
---
 drivers/misc/mic/card/mic_x100.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 55c9465..9d57545 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -200,7 +200,8 @@ static int __init mic_probe(struct platform_device *pdev)
 
 	mdev->mmio.pa = MIC_X100_MMIO_BASE;
 	mdev->mmio.len = MIC_X100_MMIO_LEN;
-	mdev->mmio.va = ioremap(MIC_X100_MMIO_BASE, MIC_X100_MMIO_LEN);
+	mdev->mmio.va = devm_ioremap(&pdev->dev, MIC_X100_MMIO_BASE,
+				     MIC_X100_MMIO_LEN);
 	if (!mdev->mmio.va) {
 		dev_err(&pdev->dev, "Cannot remap MMIO BAR\n");
 		rc = -EIO;
@@ -214,7 +215,7 @@ static int __init mic_probe(struct platform_device *pdev)
 	if (IS_ERR(mdrv->dma_mbdev)) {
 		rc = PTR_ERR(mdrv->dma_mbdev);
 		dev_err(&pdev->dev, "mbus_add_device failed rc %d\n", rc);
-		goto iounmap;
+		goto done;
 	}
 	rc = mic_driver_init(mdrv);
 	if (rc) {
@@ -225,19 +226,15 @@ done:
 	return rc;
 remove_dma:
 	mbus_unregister_device(mdrv->dma_mbdev);
-iounmap:
-	iounmap(mdev->mmio.va);
 	return rc;
 }
 
 static int mic_remove(struct platform_device *pdev)
 {
 	struct mic_driver *mdrv = &g_drv;
-	struct mic_device *mdev = &mdrv->mdev;
 
 	mic_driver_uninit(mdrv);
 	mbus_unregister_device(mdrv->dma_mbdev);
-	iounmap(mdev->mmio.va);
 	return 0;
 }
 
-- 
1.8.2.1


^ permalink raw reply related	[relevance 96%]

* [PATCH char-misc-next 2/2] dma: Fix MIC X100 DMA Driver Kconfig option
  2014-07-14 21:23 99% [PATCH char-misc-next 0/2] Fix minor issues reported in MIC drivers Sudeep Dutt
  2014-07-14 21:23 96% ` [PATCH char-misc-next 1/2] misc: mic: Introduce the managed version of ioremap Sudeep Dutt
@ 2014-07-14 21:23 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-14 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dan Williams, Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao,
	Sudeep Dutt, Paul Bolle, Himangi Saraogi, linux-kernel,
	dmaengine

Select DMA_ENGINE instead of DMAENGINE and delete the default
line as the default is 'n' anyway.

Link: https://lkml.org/lkml/2014/7/14/90
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/dma/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 7c8b8c4..8f6afbf 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -36,8 +36,7 @@ comment "DMA Devices"
 config INTEL_MIC_X100_DMA
 	tristate "Intel MIC X100 DMA Driver"
 	depends on 64BIT && X86 && INTEL_MIC_BUS
-	select DMAENGINE
-	default N
+	select DMA_ENGINE
 	help
 	  This enables DMA support for the Intel Many Integrated Core
 	  (MIC) family of PCIe form factor coprocessor X100 devices that
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* [PATCH char-misc-next 0/2] Fix minor issues reported in MIC drivers
@ 2014-07-14 21:23 99% Sudeep Dutt
  2014-07-14 21:23 96% ` [PATCH char-misc-next 1/2] misc: mic: Introduce the managed version of ioremap Sudeep Dutt
  2014-07-14 21:23 99% ` [PATCH char-misc-next 2/2] dma: Fix MIC X100 DMA Driver Kconfig option Sudeep Dutt
  0 siblings, 2 replies; 200+ results
From: Sudeep Dutt @ 2014-07-14 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dan Williams, Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao,
	Sudeep Dutt, Paul Bolle, Himangi Saraogi, linux-kernel,
	dmaengine

Himangi Saraogi (1):
  misc: mic: Introduce the managed version of ioremap

Sudeep Dutt (1):
  dma: Fix MIC X100 DMA Driver Kconfig option

 drivers/dma/Kconfig              | 3 +--
 drivers/misc/mic/card/mic_x100.c | 9 +++------
 2 files changed, 4 insertions(+), 8 deletions(-)

-- 
1.8.2.1


^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next v3 2/8] misc: mic: add a bus driver for virtual MIC devices
  @ 2014-07-12  6:49 99%       ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-12  6:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, linux-kernel,
	dmaengine, Sudeep Dutt

On Fri, 2014-07-11 at 18:31 -0700, Greg Kroah-Hartman wrote:
> On Fri, Jul 11, 2014 at 06:27:20PM -0700, Greg Kroah-Hartman wrote:
> > On Fri, Jul 11, 2014 at 02:04:20PM -0700, Sudeep Dutt wrote:
> > > This MIC virtual bus driver takes the responsibility of creating all
> > > the virtual devices connected to the PCIe device on the host and the
> > > platform device on the card. The MIC bus hardware operations provide
> > > a way to abstract certain hardware details from the base physical devices.
> > > Examples of devices added on the MIC virtual bus include host DMA and card DMA.
> > > This abstraction enables using a common DMA driver on host and card.
> > > 
> > > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
> > > Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> > > Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
> > > ---
> > >  drivers/misc/mic/Kconfig       |  17 ++++
> > >  drivers/misc/mic/Makefile      |   1 +
> > >  drivers/misc/mic/bus/Makefile  |   5 +
> > >  drivers/misc/mic/bus/mic_bus.c | 218 +++++++++++++++++++++++++++++++++++++++++
> > >  include/linux/mic_bus.h        | 110 +++++++++++++++++++++
> > >  5 files changed, 351 insertions(+)
> > >  create mode 100644 drivers/misc/mic/bus/Makefile
> > >  create mode 100644 drivers/misc/mic/bus/mic_bus.c
> > >  create mode 100644 include/linux/mic_bus.h
> > 
> > Looks good, but no addition to scripts/mod/file2alias.c to allow
> > autoloading of mic_bus drivers?
> > 
> > I'll take patches 1 and 2 here, as I need some acks from the DMA people
> > on patch 3.
> 
> And I took 4-8 as well, as they seemed independant from patch 3.
> 

Thanks for accepting the patches Greg. The good news is that the 7
patches accepted build fine without patch 3 and the drivers load
successfully.

The bad news is that the MIC Coprocessors no longer boot without patch 3
since the host driver does not find any DMA channels. Reverting patch 5
titled "add dma support in host driver" with SHA d4ef098 allows the
Coprocessors to boot again but accepting patch 3 instead is the cleanest
solution here. Dan, please review patch 3 as we would really like it to
get accepted before the 3.17 merge window opens.

Thanks,
Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-next v3 6/8] misc: mic: add threaded irq support in card driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (4 preceding siblings ...)
  2014-07-11 21:04 54% ` [PATCH char-misc-next v3 5/8] misc: mic: add dma " Sudeep Dutt
@ 2014-07-11 21:04 85% ` Sudeep Dutt
  2014-07-11 21:04 85% ` [PATCH char-misc-next v3 7/8] misc: mic: add dma " Sudeep Dutt
  2014-07-11 21:04 99% ` [PATCH char-misc-next v3 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Add threaded irq support to mic_request_card_irq, which will be used
for virtual devices added on the mic bus.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/card/mic_device.c | 23 +++++++++++++----------
 drivers/misc/mic/card/mic_device.h |  5 +++--
 drivers/misc/mic/card/mic_virtio.c |  7 ++++---
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index d0980ff..83819ee 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -83,8 +83,8 @@ static int mic_shutdown_init(void)
 	int shutdown_db;
 
 	shutdown_db = mic_next_card_db();
-	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr,
-			"Shutdown", mdrv, shutdown_db);
+	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr, NULL,
+					       "Shutdown", mdrv, shutdown_db);
 	if (IS_ERR(shutdown_cookie))
 		rc = PTR_ERR(shutdown_cookie);
 	else
@@ -136,7 +136,8 @@ static void mic_dp_uninit(void)
 /**
  * mic_request_card_irq - request an irq.
  *
- * @func: The callback function that handles the interrupt.
+ * @handler: interrupt handler passed to request_threaded_irq.
+ * @thread_fn: thread fn. passed to request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -149,17 +150,19 @@ static void mic_dp_uninit(void)
  * error code.
  *
  */
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int index)
+struct mic_irq *
+mic_request_card_irq(irq_handler_t handler,
+		     irq_handler_t thread_fn, const char *name,
+		     void *data, int index)
 {
 	int rc = 0;
 	unsigned long cookie;
 	struct mic_driver *mdrv = g_drv;
 
-	rc  = request_irq(mic_db_to_irq(mdrv, index), func,
-		0, name, data);
+	rc  = request_threaded_irq(mic_db_to_irq(mdrv, index), handler,
+				   thread_fn, 0, name, data);
 	if (rc) {
-		dev_err(mdrv->dev, "request_irq failed rc = %d\n", rc);
+		dev_err(mdrv->dev, "request_threaded_irq failed rc = %d\n", rc);
 		goto err;
 	}
 	mdrv->irq_info.irq_usage_count[index]++;
@@ -172,9 +175,9 @@ err:
 /**
  * mic_free_card_irq - free irq.
  *
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_threaded_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_threaded_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 306f502..e12a0c2 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -30,6 +30,7 @@
 #include <linux/workqueue.h>
 #include <linux/io.h>
 #include <linux/irqreturn.h>
+#include <linux/interrupt.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -116,8 +117,8 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 int mic_driver_init(struct mic_driver *mdrv);
 void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int intr_src);
+struct mic_irq *mic_request_card_irq(irq_handler_t handler,
+	irq_handler_t thread_fn, const char *name, void *data, int intr_src);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 653799b..f14b600 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -417,7 +417,7 @@ static int mic_add_device(struct mic_device_desc __iomem *d,
 
 	virtio_db = mic_next_card_db();
 	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
-			"virtio intr", mvdev, virtio_db);
+			NULL, "virtio intr", mvdev, virtio_db);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		goto kfree;
@@ -606,8 +606,9 @@ int mic_devices_init(struct mic_driver *mdrv)
 	mic_scan_devices(mdrv, !REMOVE_DEVICES);
 
 	config_db = mic_next_card_db();
-	virtio_config_cookie = mic_request_card_irq(mic_extint_handler,
-			"virtio_config_intr", mdrv, config_db);
+	virtio_config_cookie = mic_request_card_irq(mic_extint_handler, NULL,
+						    "virtio_config_intr", mdrv,
+						    config_db);
 	if (IS_ERR(virtio_config_cookie)) {
 		rc = PTR_ERR(virtio_config_cookie);
 		goto exit;
-- 
1.8.2.1
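
Note on the handler/thread_fn split used above: the following is a minimal
userspace sketch, not kernel code. It assumes a simplified model in which
the thread stage runs synchronously right after the hard handler, whereas
the real request_threaded_irq defers it to a kernel thread. All model_*
names, the count_* helpers and the stand-in typedefs are hypothetical and
exist only for illustration. It shows why passing NULL as thread_fn, as the
call sites in this patch do, keeps the old request_irq behaviour: the
handler finishes the work itself and no thread stage ever runs.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *data);

/* Hypothetical record of one registered interrupt. */
struct model_irq {
	irq_handler_t handler;   /* runs first, in "hard irq" context */
	irq_handler_t thread_fn; /* optional second stage, may be NULL */
	void *data;              /* cookie handed back to both stages */
};

/* Model of registering a handler pair; thread_fn may be NULL. */
void model_request_threaded_irq(struct model_irq *mi, irq_handler_t handler,
				irq_handler_t thread_fn, void *data)
{
	assert(mi != NULL && handler != NULL);
	mi->handler = handler;
	mi->thread_fn = thread_fn;
	mi->data = data;
}

/*
 * Model of one interrupt firing. Returns 1 if the thread stage ran and
 * 0 otherwise. The thread stage only runs when the handler asks for it
 * AND a thread_fn was registered.
 */
int model_fire(struct model_irq *mi, int irq)
{
	irqreturn_t ret = mi->handler(irq, mi->data);

	if (ret == IRQ_WAKE_THREAD && mi->thread_fn != NULL) {
		mi->thread_fn(irq, mi->data);
		return 1;
	}
	return 0;
}

/* Handler that does all of its work immediately (old request_irq style). */
irqreturn_t count_and_finish(int irq, void *data)
{
	(void)irq;
	*(int *)data += 1;
	return IRQ_HANDLED;
}

/* Handler that does a little work and defers the rest to the thread. */
irqreturn_t count_and_defer(int irq, void *data)
{
	(void)irq;
	*(int *)data += 1;
	return IRQ_WAKE_THREAD;
}

/* Thread stage; adds 10 so its effect is distinguishable from a handler. */
irqreturn_t count_in_thread(int irq, void *data)
{
	(void)irq;
	*(int *)data += 10;
	return IRQ_HANDLED;
}
```

Registering count_and_finish with a NULL thread_fn mirrors the "Shutdown"
and virtio call sites in the diff; registering count_and_defer together
with count_in_thread mirrors what a mic bus DMA device would presumably do.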



* [PATCH char-misc-next v3 4/8] misc: mic: add threaded irq support in host driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (2 preceding siblings ...)
  2014-07-11 21:04 41% ` [PATCH char-misc-next v3 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
@ 2014-07-11 21:04 62% ` Sudeep Dutt
  2014-07-11 21:04 54% ` [PATCH char-misc-next v3 5/8] misc: mic: add dma " Sudeep Dutt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Convert mic_request_irq to mic_request_threaded_irq to support threaded
irqs for virtual devices on the mic bus.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/host/mic_intr.c   | 121 ++++++++++++++++++++++---------------
 drivers/misc/mic/host/mic_intr.h   |  24 +++++---
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c |   6 +-
 4 files changed, 96 insertions(+), 60 deletions(-)

diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index dbc5afd..d686f28 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -24,28 +24,29 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 
-/*
- * mic_invoke_callback - Invoke callback functions registered for
- * the corresponding source id.
- *
- * @mdev: pointer to the mic_device instance
- * @idx: The interrupt source id.
- *
- * Returns none.
- */
-static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
+static irqreturn_t mic_thread_fn(int irq, void *dev)
 {
+	struct mic_device *mdev = dev;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
 	struct mic_intr_cb *intr_cb;
 	struct pci_dev *pdev = container_of(mdev->sdev->parent,
-		struct pci_dev, dev);
+					    struct pci_dev, dev);
+	int i;
 
-	spin_lock(&mdev->irq_info.mic_intr_lock);
-	list_for_each_entry(intr_cb, &mdev->irq_info.cb_list[idx], list)
-		if (intr_cb->func)
-			intr_cb->func(pdev->irq, intr_cb->data);
-	spin_unlock(&mdev->irq_info.mic_intr_lock);
+	spin_lock(&irq_info->mic_thread_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (test_and_clear_bit(i, &irq_info->mask)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->thread_fn)
+					intr_cb->thread_fn(pdev->irq,
+							 intr_cb->data);
+		}
+	spin_unlock(&irq_info->mic_thread_lock);
+	return IRQ_HANDLED;
 }
-
 /**
  * mic_interrupt - Generic interrupt handler for
  * MSI and INTx based interrupts.
@@ -53,7 +54,11 @@ static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
 static irqreturn_t mic_interrupt(int irq, void *dev)
 {
 	struct mic_device *mdev = dev;
-	struct mic_intr_info *info = mdev->intr_info;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
+	struct mic_intr_cb *intr_cb;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+					    struct pci_dev, dev);
 	u32 mask;
 	int i;
 
@@ -61,12 +66,19 @@ static irqreturn_t mic_interrupt(int irq, void *dev)
 	if (!mask)
 		return IRQ_NONE;
 
-	for (i = info->intr_start_idx[MIC_INTR_DB];
-			i < info->intr_len[MIC_INTR_DB]; i++)
-		if (mask & BIT(i))
-			mic_invoke_callback(mdev, i);
-
-	return IRQ_HANDLED;
+	spin_lock(&irq_info->mic_intr_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (mask & BIT(i)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->handler)
+					intr_cb->handler(pdev->irq,
+							 intr_cb->data);
+			set_bit(i, &irq_info->mask);
+		}
+	spin_unlock(&irq_info->mic_intr_lock);
+	return IRQ_WAKE_THREAD;
 }
 
 /* Return the interrupt offset from the index. Index is 0 based. */
@@ -99,14 +111,15 @@ static struct msix_entry *mic_get_available_vector(struct mic_device *mdev)
  *
  * @mdev: pointer to the mic_device instance
  * @idx: The source id to be registered.
- * @func: The function to be called when the source id receives
+ * @handler: The function to be called when the source id receives
  * the interrupt.
+ * @thread_fn: thread fn. corresponding to the handler
  * @data: Private data of the requester.
  * Return the callback structure that was registered or an
  * appropriate error on failure.
  */
 static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
-			u8 idx, irqreturn_t (*func) (int irq, void *dev),
+			u8 idx, irq_handler_t handler, irq_handler_t thread_fn,
 			void *data)
 {
 	struct mic_intr_cb *intr_cb;
@@ -117,7 +130,8 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 	if (!intr_cb)
 		return ERR_PTR(-ENOMEM);
 
-	intr_cb->func = func;
+	intr_cb->handler = handler;
+	intr_cb->thread_fn = thread_fn;
 	intr_cb->data = data;
 	intr_cb->cb_id = ida_simple_get(&mdev->irq_info.cb_ida,
 		0, 0, GFP_KERNEL);
@@ -126,9 +140,11 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 		goto ida_fail;
 	}
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
 	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	list_add_tail(&intr_cb->list, &mdev->irq_info.cb_list[idx]);
 	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 
 	return intr_cb;
 ida_fail:
@@ -152,8 +168,9 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 	unsigned long flags;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0;  i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
 			if (intr_cb->cb_id == idx) {
@@ -163,11 +180,13 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 				kfree(intr_cb);
 				spin_unlock_irqrestore(
 					&mdev->irq_info.mic_intr_lock, flags);
+				spin_unlock(&mdev->irq_info.mic_thread_lock);
 				return i;
 			}
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	return MIC_NUM_OFFSETS;
 }
 
@@ -242,6 +261,7 @@ static int mic_setup_callbacks(struct mic_device *mdev)
 		INIT_LIST_HEAD(&mdev->irq_info.cb_list[i]);
 	ida_init(&mdev->irq_info.cb_ida);
 	spin_lock_init(&mdev->irq_info.mic_intr_lock);
+	spin_lock_init(&mdev->irq_info.mic_thread_lock);
 	return 0;
 }
 
@@ -258,14 +278,12 @@ static void mic_release_callbacks(struct mic_device *mdev)
 	struct mic_intr_cb *intr_cb;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0; i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 
-		if (list_empty(&mdev->irq_info.cb_list[i])) {
-			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
-					       flags);
+		if (list_empty(&mdev->irq_info.cb_list[i]))
 			break;
-		}
 
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
@@ -274,8 +292,9 @@ static void mic_release_callbacks(struct mic_device *mdev)
 					  intr_cb->cb_id);
 			kfree(intr_cb);
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	ida_destroy(&mdev->irq_info.cb_ida);
 	kfree(mdev->irq_info.cb_list);
 }
@@ -313,7 +332,8 @@ static int mic_setup_msi(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem2;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt, 0 , "mic-msi", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  0, "mic-msi", mdev);
 	if (rc) {
 		dev_err(&pdev->dev, "Error allocating MSI interrupt\n");
 		goto err_irq_req_fail;
@@ -353,8 +373,8 @@ static int mic_setup_intx(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt,
-		IRQF_SHARED, "mic-intx", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  IRQF_SHARED, "mic-intx", mdev);
 	if (rc)
 		goto err;
 
@@ -391,13 +411,14 @@ int mic_next_db(struct mic_device *mdev)
 #define MK_COOKIE(x, y) ((x) | (y) << COOKIE_ID_SHIFT)
 
 /**
- * mic_request_irq - request an irq. mic_mutex needs
+ * mic_request_threaded_irq - request an irq. mic_mutex needs
  * to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @func: The callback function that handles the interrupt.
+ * @handler: The callback function that handles the interrupt.
  * The function needs to call ack_interrupts
  * (mdev->ops->ack_interrupt(mdev)) when handling the interrupts.
+ * @thread_fn: thread fn required by request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -412,10 +433,11 @@ int mic_next_db(struct mic_device *mdev)
  * error code.
  *
  */
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *dev),
-	const char *name, void *data, int intr_src,
-	enum mic_intr_type type)
+struct mic_irq *
+mic_request_threaded_irq(struct mic_device *mdev,
+			 irq_handler_t handler, irq_handler_t thread_fn,
+			 const char *name, void *data, int intr_src,
+			 enum mic_intr_type type)
 {
 	u16 offset;
 	int rc = 0;
@@ -444,7 +466,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 			goto err;
 		}
 
-		rc = request_irq(msix->vector, func, 0, name, data);
+		rc = request_threaded_irq(msix->vector, handler, thread_fn,
+					  0, name, data);
 		if (rc) {
 			dev_dbg(mdev->sdev->parent,
 				"request irq failed rc = %d\n", rc);
@@ -458,8 +481,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		dev_dbg(mdev->sdev->parent, "irq: %d assigned for src: %d\n",
 			msix->vector, intr_src);
 	} else {
-		intr_cb = mic_register_intr_callback(mdev,
-				offset, func, data);
+		intr_cb = mic_register_intr_callback(mdev, offset, handler,
+						     thread_fn, data);
 		if (IS_ERR(intr_cb)) {
 			dev_err(mdev->sdev->parent,
 				"No available callback entries for use\n");
@@ -487,9 +510,9 @@ err:
  *  needs to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_threaded_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_threaded_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 6091aa9..b1334dd 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -21,6 +21,8 @@
 #ifndef _MIC_INTR_H_
 #define _MIC_INTR_H_
 
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
 /*
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
@@ -68,7 +70,11 @@ struct mic_intr_info {
  * @num_vectors: The number of MSI/MSI-x vectors that have been allocated.
  * @cb_ida: callback ID allocator to track the callbacks registered.
  * @mic_intr_lock: spinlock to protect the interrupt callback list.
+ * @mic_thread_lock: spinlock to protect the thread callback list.
+ *		   This lock is used to protect against thread_fn while
+ *		   mic_intr_lock is used to protect against interrupt handler.
  * @cb_list: Array of callback lists one for each source.
+ * @mask: Mask used by the main thread fn to call the underlying thread fns.
  */
 struct mic_irq_info {
 	int next_avail_src;
@@ -77,19 +83,23 @@ struct mic_irq_info {
 	u16 num_vectors;
 	struct ida cb_ida;
 	spinlock_t mic_intr_lock;
+	spinlock_t mic_thread_lock;
 	struct list_head *cb_list;
+	unsigned long mask;
 };
 
 /**
  * struct mic_intr_cb - Interrupt callback structure.
  *
- * @func: The callback function
+ * @handler: The callback function
+ * @thread_fn: The thread_fn.
  * @data: Private data of the requester.
  * @cb_id: The callback id. Identifies this callback.
  * @list: list head pointing to the next callback structure.
  */
 struct mic_intr_cb {
-	irqreturn_t (*func) (int irq, void *data);
+	irq_handler_t handler;
+	irq_handler_t thread_fn;
 	void *data;
 	int cb_id;
 	struct list_head list;
@@ -124,11 +134,11 @@ struct mic_hw_intr_ops {
 };
 
 int mic_next_db(struct mic_device *mdev);
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int intr_src,
-	enum mic_intr_type type);
-
+struct mic_irq *
+mic_request_threaded_irq(struct mic_device *mdev,
+			 irq_handler_t handler, irq_handler_t thread_fn,
+			 const char *name, void *data, int intr_src,
+			 enum mic_intr_type type);
 void mic_free_irq(struct mic_device *mdev,
 		struct mic_irq *cookie, void *data);
 int mic_setup_interrupts(struct mic_device *mdev, struct pci_dev *pdev);
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index c04a021..fdc9c13 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -389,8 +389,9 @@ static int mic_probe(struct pci_dev *pdev,
 	mutex_lock(&mdev->mic_mutex);
 
 	mdev->shutdown_db = mic_next_db(mdev);
-	mdev->shutdown_cookie = mic_request_irq(mdev, mic_shutdown_db,
-		"shutdown-interrupt", mdev, mdev->shutdown_db, MIC_INTR_DB);
+	mdev->shutdown_cookie = mic_request_threaded_irq(mdev, mic_shutdown_db,
+					NULL, "shutdown-interrupt", mdev,
+					mdev->shutdown_db, MIC_INTR_DB);
 	if (IS_ERR(mdev->shutdown_cookie)) {
 		rc = PTR_ERR(mdev->shutdown_cookie);
 		mutex_unlock(&mdev->mic_mutex);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 7e1ef0e..aba3e83 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -594,8 +594,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
 		 mvdev->virtio_id);
 	mvdev->virtio_db = mic_next_db(mdev);
-	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
-			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
+	mvdev->virtio_cookie = mic_request_threaded_irq(mdev,
+					       mic_virtio_intr_handler,
+					       NULL, irqname, mvdev,
+					       mvdev->virtio_db, MIC_INTR_DB);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		dev_dbg(mdev->sdev->parent, "request irq failed\n");
-- 
1.8.2.1
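
Note on the dispatch scheme in mic_interrupt/mic_thread_fn: the sketch
below is a single-threaded userspace model, offered only as an
illustration. It assumes one callback pair per source instead of the
per-source callback list in the patch, uses plain bit operations in place
of set_bit/test_and_clear_bit, and omits both spinlocks. All model_* names
and MODEL_NUM_SRC are hypothetical. The idea it demonstrates is the one in
the diff: the hard-irq half calls each fast handler and records the source
in a pending mask, and the thread half later clears each pending bit and
calls the matching thread_fn if one was registered.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NUM_SRC 8 /* hypothetical number of doorbell sources */

typedef void (*model_cb_t)(int src, void *data);

/* Simplified stand-in for struct mic_irq_info. */
struct model_irq_info {
	unsigned long pending;               /* role of irq_info->mask */
	model_cb_t handler[MODEL_NUM_SRC];   /* fast, hard-irq callbacks */
	model_cb_t thread_fn[MODEL_NUM_SRC]; /* deferred callbacks, may be NULL */
	void *data[MODEL_NUM_SRC];           /* per-source private data */
};

/*
 * Hard-irq half. hw_mask is what the device reports as pending.
 * Returns 0 when nothing was pending (IRQ_NONE in the patch) and 1 when
 * the thread half should be woken (IRQ_WAKE_THREAD in the patch).
 */
int model_hard_irq(struct model_irq_info *info, unsigned long hw_mask)
{
	int i;

	assert(info != NULL);
	if (!hw_mask)
		return 0;

	for (i = 0; i < MODEL_NUM_SRC; i++) {
		if (!(hw_mask & (1UL << i)))
			continue;
		if (info->handler[i])
			info->handler[i](i, info->data[i]);
		info->pending |= 1UL << i; /* remember it for the thread half */
	}
	return 1;
}

/* Thread half. Returns the number of pending sources it serviced. */
int model_thread(struct model_irq_info *info)
{
	int i, serviced = 0;

	assert(info != NULL);
	for (i = 0; i < MODEL_NUM_SRC; i++) {
		unsigned long bit = 1UL << i;

		if (!(info->pending & bit))
			continue;
		info->pending &= ~bit; /* non-atomic test-and-clear */
		if (info->thread_fn[i])
			info->thread_fn[i](i, info->data[i]);
		serviced++;
	}
	return serviced;
}

/* Example callbacks: the hard stage adds 1, the thread stage adds 100. */
void model_hard_cb(int src, void *data)
{
	(void)src;
	*(int *)data += 1;
}

void model_thread_cb(int src, void *data)
{
	(void)src;
	*(int *)data += 100;
}
```

A source registered without a thread_fn (as the shutdown and virtio
doorbells are in this patch) still has its pending bit set and cleared, but
only its fast handler does any work.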



* [PATCH char-misc-next v3 8/8] misc: mic: add support for loading/unloading dma driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (6 preceding siblings ...)
  2014-07-11 21:04 85% ` [PATCH char-misc-next v3 7/8] misc: mic: add dma " Sudeep Dutt
@ 2014-07-11 21:04 99% ` Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Modprobe the dma driver upon start and remove it upon unload.

Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mpssd/mpss | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index 3136c68..cacbdb0 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -48,18 +48,18 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_HOST Module"
+	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
 
-	# Ensure the driver is loaded
-	if [ ! -d "$sysfs" ]; then
-		modprobe mic_host
+	for f in "mic_host" "mic_x100_dma"
+	do
+		modprobe $f
 		RETVAL=$?
 		if [ $RETVAL -ne 0 ]; then
 			failure
 			echo
 			return $RETVAL
 		fi
-	fi
+	done
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +170,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST Module: "
-	modprobe -r mic_host
+	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
+	modprobe -r mic_host mic_x100_dma
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1



* [PATCH char-misc-next v3 5/8] misc: mic: add dma support in host driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (3 preceding siblings ...)
  2014-07-11 21:04 62% ` [PATCH char-misc-next v3 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
@ 2014-07-11 21:04 54% ` Sudeep Dutt
  2014-07-11 21:04 85% ` [PATCH char-misc-next v3 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch adds a dma device on the mic virtual bus and uses this dmaengine
to transfer data for virtio devices.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/host/mic_boot.c   |  83 ++++++++++++++++-
 drivers/misc/mic/host/mic_device.h |  24 +++++
 drivers/misc/mic/host/mic_intr.h   |   3 +-
 drivers/misc/mic/host/mic_virtio.c | 181 +++++++++++++++++++++++++++++--------
 drivers/misc/mic/host/mic_virtio.h |  21 ++++-
 drivers/misc/mic/host/mic_x100.c   |   8 ++
 7 files changed, 281 insertions(+), 41 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index ee1d2ac..bf76313 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -19,7 +19,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index b75c6b5..ff2b0fb 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -23,11 +23,70 @@
 #include <linux/pci.h>
 
 #include <linux/mic_common.h>
+#include <linux/mic_bus.h>
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static dma_addr_t
+mic_dma_map_page(struct device *dev, struct page *page,
+		 unsigned long offset, size_t size, enum dma_data_direction dir,
+		 struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static struct dma_map_ops mic_dma_ops = {
+	.map_page = mic_dma_map_page,
+	.unmap_page = mic_dma_unmap_page,
+};
+
+static struct mic_irq *
+_mic_request_threaded_irq(struct mbus_device *mbdev,
+			  irq_handler_t handler, irq_handler_t thread_fn,
+			  const char *name, void *data, int intr_src)
+{
+	return mic_request_threaded_irq(mbdev_to_mdev(mbdev), handler,
+					thread_fn, name, data,
+					intr_src, MIC_INTR_DMA);
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+			  struct mic_irq *cookie, void *data)
+{
+	return mic_free_irq(mbdev_to_mdev(mbdev), cookie, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	struct mic_device *mdev = mbdev_to_mdev(mbdev);
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 /**
  * mic_reset - Reset the MIC device.
  * @mdev: pointer to mic_device instance
@@ -95,9 +154,21 @@ retry:
 		 */
 		goto retry;
 	}
+	mdev->dma_mbdev = mbus_register_device(mdev->sdev->parent,
+					       MBUS_DEV_DMA_HOST, &mic_dma_ops,
+					       &mbus_hw_ops, mdev->mmio.va);
+	if (IS_ERR(mdev->dma_mbdev)) {
+		rc = PTR_ERR(mdev->dma_mbdev);
+		goto unlock_ret;
+	}
+	mdev->dma_ch = mic_request_dma_chan(mdev);
+	if (!mdev->dma_ch) {
+		rc = -ENXIO;
+		goto dma_remove;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto unlock_ret;
+		goto dma_release;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -105,6 +176,11 @@ retry:
 	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
+	goto unlock_ret;
+dma_release:
+	dma_release_channel(mdev->dma_ch);
+dma_remove:
+	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
 	mutex_unlock(&mdev->mic_mutex);
 	return rc;
@@ -122,6 +198,11 @@ void mic_stop(struct mic_device *mdev, bool force)
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
 		mic_virtio_reset_devices(mdev);
+		if (mdev->dma_ch) {
+			dma_release_channel(mdev->dma_ch);
+			mdev->dma_ch = NULL;
+		}
+		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
 		if (MIC_RESET_FAILED == mdev->state)
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 0398c69..016bd15 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -25,6 +25,8 @@
 #include <linux/idr.h>
 #include <linux/notifier.h>
 #include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+#include <linux/mic_bus.h>
 
 #include "mic_intr.h"
 
@@ -87,6 +89,8 @@ enum mic_stepping {
  * @cdev: Character device for MIC.
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
+ * @dma_mbdev: MIC BUS DMA device.
+ * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -124,6 +128,8 @@ struct mic_device {
 	struct cdev cdev;
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
+	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch;
 };
 
 /**
@@ -144,6 +150,7 @@ struct mic_device {
  * @load_mic_fw: Load firmware segments required to boot the card
  * into card memory. This includes the kernel, command line, ramdisk etc.
  * @get_postcode: Get post code status from firmware.
+ * @dma_filter: DMA filter function to be used.
  */
 struct mic_hw_ops {
 	u8 aper_bar;
@@ -159,6 +166,7 @@ struct mic_hw_ops {
 	void (*send_firmware_intr)(struct mic_device *mdev);
 	int (*load_mic_fw)(struct mic_device *mdev, const char *buf);
 	u32 (*get_postcode)(struct mic_device *mdev);
+	bool (*dma_filter)(struct dma_chan *chan, void *param);
 };
 
 /**
@@ -187,6 +195,22 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 	iowrite32(val, mw->va + offset);
 }
 
+static inline struct dma_chan *mic_request_dma_chan(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+	chan = dma_request_channel(mask, mdev->ops->dma_filter,
+				   mdev->sdev->parent);
+	if (chan)
+		return chan;
+	dev_err(mdev->sdev->parent, "%s %d unable to acquire channel\n",
+		__func__, __LINE__);
+	return NULL;
+}
+
 void mic_sysfs_init(struct mic_device *mdev);
 int mic_start(struct mic_device *mdev, const char *buf);
 void mic_stop(struct mic_device *mdev, bool force);
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index b1334dd..9f783d4a 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -27,8 +27,9 @@
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
+ * 4 for host owned DMA channels.
  */
-#define MIC_MIN_MSIX 4
+#define MIC_MIN_MSIX 8
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index aba3e83..a020e4e 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -21,60 +21,157 @@
 #include <linux/pci.h>
 #include <linux/sched.h>
 #include <linux/uaccess.h>
-
+#include <linux/dmaengine.h>
 #include <linux/mic_common.h>
+
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
 /*
- * Initiates the copies across the PCIe bus from card memory to
- * a user space buffer.
+ * Size of the internal buffer used during DMA's as an intermediate buffer
+ * for copy to/from user.
  */
-static int mic_virtio_copy_to_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+#define MIC_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
+
+static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
+			dma_addr_t src, size_t len)
 {
-	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
-	/*
-	 * We are copying from IO below an should ideally use something
-	 * like copy_to_user_fromio(..) if it existed.
-	 */
-	if (copy_to_user(ubuf, (void __force *)dbuf, len)) {
-		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+	int err = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *mic_ch = mdev->dma_ch;
+
+	if (!mic_ch) {
+		err = -EBUSY;
+		goto error;
+	}
+
+	tx = mic_ch->device->device_prep_dma_memcpy(mic_ch, dst, src, len,
+						    DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	} else {
+		dma_cookie_t cookie = tx->tx_submit(tx);
+
+		err = dma_submit_error(cookie);
+		if (err)
+			goto error;
+		err = dma_sync_wait(mic_ch, cookie);
+	}
+error:
+	if (err)
+		dev_err(mdev->sdev->parent, "%s %d err %d\n",
 			__func__, __LINE__, err);
-		goto err;
+	return err;
+}
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to a user
+ * space buffer. When transfers are done using DMA, source/destination
+ * addresses and transfer length must follow the alignment requirements of
+ * the MIC DMA engine.
+ */
+static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
+				   size_t len, u64 daddr, size_t dlen,
+				   int vr_idx)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_offset;
+	size_t partlen;
+	int err;
+
+	dma_offset = daddr - round_down(daddr, dma_alignment);
+	daddr -= dma_offset;
+	len += dma_offset;
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		err = mic_sync_dma(mdev, mvr->buf_da, daddr,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+
+		if (copy_to_user(ubuf, mvr->buf + dma_offset,
+				 partlen - dma_offset)) {
+			err = -EFAULT;
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->in_bytes_dma += partlen;
+		mvdev->in_bytes += partlen;
+		len -= partlen;
+		dma_offset = 0;
 	}
-	mvdev->in_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
 /*
- * Initiates copies across the PCIe bus from a user space
- * buffer to card memory.
+ * Initiates copies across the PCIe bus from a user space buffer to card
+ * memory. When transfers are done using DMA, source/destination addresses
+ * and transfer length must follow the alignment requirements of the MIC
+ * DMA engine.
  */
-static int mic_virtio_copy_from_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
+				     size_t len, u64 daddr, size_t dlen,
+				     int vr_idx)
 {
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t partlen;
 	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
+
+	if (daddr & (dma_alignment - 1)) {
+		mvdev->tx_dst_unaligned += len;
+		goto memcpy;
+	} else if (ALIGN(len, dma_alignment) > dlen) {
+		mvdev->tx_len_unaligned += len;
+		goto memcpy;
+	}
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		if (copy_from_user(mvr->buf, ubuf, partlen)) {
+			err = -EFAULT;
+			goto err;
+		}
+		err = mic_sync_dma(mdev, daddr, mvr->buf_da,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->out_bytes_dma += partlen;
+		mvdev->out_bytes += partlen;
+		len -= partlen;
+	}
+memcpy:
 	/*
 	 * We are copying to IO below and should ideally use something
 	 * like copy_from_user_toio(..) if it existed.
 	 */
 	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
 		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
 		goto err;
 	}
 	mvdev->out_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
@@ -110,7 +207,8 @@ static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
  * way to override the VRINGH xfer(..) routines as of v3.10.
  */
 static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
-	void __user *ubuf, size_t len, bool read, size_t *out_len)
+			void __user *ubuf, size_t len, bool read, int vr_idx,
+			size_t *out_len)
 {
 	int ret = 0;
 	size_t partlen, tot_len = 0;
@@ -118,13 +216,15 @@ static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
 	while (len && iov->i < iov->used) {
 		partlen = min(iov->iov[iov->i].iov_len, len);
 		if (read)
-			ret = mic_virtio_copy_to_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_to_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		else
-			ret = mic_virtio_copy_from_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_from_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -192,8 +292,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			ubuf = iov.iov_base;
 		}
 		/* Issue all the read descriptors first */
-		ret = mic_vringh_copy(mvdev, riov, ubuf, len,
-			MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, riov, ubuf, len, MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -203,8 +303,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 		ubuf += out_len;
 		copy->out_len += out_len;
 		/* Issue the write descriptors next */
-		ret = mic_vringh_copy(mvdev, wiov, ubuf, len,
-			!MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, wiov, ubuf, len, !MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -589,6 +689,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		dev_dbg(mdev->sdev->parent,
 			"%s %d index %d va %p info %p vr_size 0x%x\n",
 			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+		mvr->buf = (void *)__get_free_pages(GFP_KERNEL,
+					get_order(MIC_INT_DMA_BUF_SIZE));
+		mvr->buf_da = mic_map_single(mvdev->mdev, mvr->buf,
+					  MIC_INT_DMA_BUF_SIZE);
 	}
 
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
@@ -673,6 +777,11 @@ skip_hot_remove:
 	vqconfig = mic_vq_config(mvdev->dd);
 	for (i = 0; i < mvdev->dd->num_vq; i++) {
 		struct mic_vringh *mvr = &mvdev->mvr[i];
+
+		mic_unmap_single(mvdev->mdev, mvr->buf_da,
+				 MIC_INT_DMA_BUF_SIZE);
+		free_pages((unsigned long)mvr->buf,
+			   get_order(MIC_INT_DMA_BUF_SIZE));
 		vringh_kiov_cleanup(&mvr->riov);
 		vringh_kiov_cleanup(&mvr->wiov);
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
index 184f3c8..d574efb 100644
--- a/drivers/misc/mic/host/mic_virtio.h
+++ b/drivers/misc/mic/host/mic_virtio.h
@@ -46,18 +46,23 @@
  * @vrh: The host VRINGH used for accessing the card vrings.
  * @riov: The VRINGH read kernel IOV.
  * @wiov: The VRINGH write kernel IOV.
- * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @buf: Temporary kernel buffer used to copy in/out data
+ * from/to the card via DMA.
+ * @buf_da: dma address of buf.
  * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  */
 struct mic_vringh {
 	struct mic_vring vring;
 	struct vringh vrh;
 	struct vringh_kiov riov;
 	struct vringh_kiov wiov;
-	u16 head;
 	struct mutex vr_mutex;
+	void *buf;
+	dma_addr_t buf_da;
 	struct mic_vdev *mvdev;
+	u16 head;
 };
 
 /**
@@ -69,6 +74,14 @@ struct mic_vringh {
  * @poll_wake - Used for waking up threads blocked in poll.
  * @out_bytes - Debug stats for number of bytes copied from host to card.
  * @in_bytes - Debug stats for number of bytes copied from card to host.
+ * @out_bytes_dma - Debug stats for number of bytes copied from host to card
+ * using DMA.
+ * @in_bytes_dma - Debug stats for number of bytes copied from card to host
+ * using DMA.
+ * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
+ * the transfer length did not have the required DMA alignment.
+ * @tx_dst_unaligned - Debug stats for number of bytes copied where the
+ * destination address on the card did not have the required DMA alignment.
  * @mvr - Store per VRING data structures.
  * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
  * @dd - Virtio device descriptor.
@@ -84,6 +97,10 @@ struct mic_vdev {
 	int poll_wake;
 	unsigned long out_bytes;
 	unsigned long in_bytes;
+	unsigned long out_bytes_dma;
+	unsigned long in_bytes_dma;
+	unsigned long tx_len_unaligned;
+	unsigned long tx_dst_unaligned;
 	struct mic_vringh mvr[MIC_MAX_VRINGS];
 	struct work_struct virtio_bh_work;
 	struct mic_device_desc *dd;
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index 5562fdd..b7a21e1 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -549,6 +549,13 @@ struct mic_smpt_ops mic_x100_smpt_ops = {
 	.set = mic_x100_smpt_set,
 };
 
+static bool mic_x100_dma_filter(struct dma_chan *chan, void *param)
+{
+	if (chan->device->dev->parent == (struct device *)param)
+		return true;
+	return false;
+}
+
 struct mic_hw_ops mic_x100_ops = {
 	.aper_bar = MIC_X100_APER_BAR,
 	.mmio_bar = MIC_X100_MMIO_BAR,
@@ -563,6 +570,7 @@ struct mic_hw_ops mic_x100_ops = {
 	.send_firmware_intr = mic_x100_send_firmware_intr,
 	.load_mic_fw = mic_x100_load_firmware,
 	.get_postcode = mic_x100_get_postcode,
+	.dma_filter = mic_x100_dma_filter,
 };
 
 struct mic_hw_intr_ops mic_x100_intr_ops = {
-- 
1.8.2.1



* [PATCH char-misc-next v3 3/8] dma: MIC X100 DMA Driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  2014-07-11 21:04 91% ` [PATCH char-misc-next v3 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
  2014-07-11 21:04 63% ` [PATCH char-misc-next v3 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
@ 2014-07-11 21:04 41% ` Sudeep Dutt
  2014-07-11 21:04 62% ` [PATCH char-misc-next v3 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch implements the DMA Engine API for the DMA controller on MIC X100
coprocessors. The DMA hardware is shared between host and card software:
channels 0 to 3 are used by the host and channels 4 to 7 by the card.
Since the DMA device does not show up as a PCIe device, a virtual bus
called the MIC bus is created, and virtual devices are added on that bus
to follow the device model. The allowed DMA transfer directions are host
to card, card to host and card to card.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
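For reviewers: mic_dma_prog_memcpy_desc() reserves one ring slot per
max_xfer_size chunk plus three more, which matches the one status
descriptor emitted for DMA_PREP_FENCE and the two emitted for
DMA_PREP_INTERRUPT. The sketch below reproduces only that budgeting
arithmetic; descs_needed() is a hypothetical helper, not a function in
the driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Ring slots reserved for a copy of len bytes: one memcpy descriptor per
 * max_xfer sized chunk (rounding up), plus 3 slots for status descriptors.
 */
static size_t descs_needed(size_t len, size_t max_xfer)
{
	size_t n;

	assert(max_xfer > 0);
	n = len / max_xfer + 3;
	if (len % max_xfer)
		n++;
	return n;
}
```

With the host limit of 512 KiB per descriptor, a 1 byte copy reserves 4
slots and a copy of 512 KiB + 1 bytes reserves 5.
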
 drivers/dma/Kconfig        |  19 ++
 drivers/dma/Makefile       |   1 +
 drivers/dma/mic_x100_dma.c | 774 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h | 286 +++++++++++++++++
 4 files changed, 1080 insertions(+)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h
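
The free-slot computation in mic_dma_ring_count() may be easier to check
in isolation. The sketch below mirrors it with a deliberately tiny ring;
RING_SIZE, ring_inc() and ring_free() are hypothetical names, and the
size of 8 is an assumption for illustration (the driver uses
MIC_DMA_DESC_RX_SIZE from its header). One slot is never handed out, so
an empty ring reports RING_SIZE - 1 free slots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size, chosen small so the wrap cases are easy to see. */
#define RING_SIZE 8u

/* Advance an index by one slot, wrapping at the end of the ring. */
static uint32_t ring_inc(uint32_t val)
{
	return (val + 1) % RING_SIZE;
}

/* Free slots between the producer index (head) and the consumer (tail). */
static uint32_t ring_free(uint32_t head, uint32_t tail)
{
	uint32_t count;

	assert(head < RING_SIZE && tail < RING_SIZE);
	if (head >= tail)
		count = tail + (RING_SIZE - head);
	else
		count = tail - head;
	return count - 1;
}
```

For example, with head at 7 and tail at 0 the ring is full and
ring_free() returns 0.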

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 1eca7b9..7c8b8c4 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -33,6 +33,25 @@ if DMADEVICES
 
 comment "DMA Devices"
 
+config INTEL_MIC_X100_DMA
+	tristate "Intel MIC X100 DMA Driver"
+	depends on 64BIT && X86 && INTEL_MIC_BUS
+	select DMAENGINE
+	default n
+	help
+	  This enables DMA support for the Intel Many Integrated Core
+	  (MIC) family of PCIe form factor coprocessor X100 devices that
+	  run a 64 bit Linux OS. This driver will be used by both MIC
+	  host and card drivers.
+
+	  If you are building a host kernel with a MIC device or a card
+	  kernel for a MIC device, then say M (recommended) or Y, else
+	  say N. If unsure, say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 config INTEL_MID_DMAC
 	tristate "Intel MID DMA support for Peripheral DMA controllers"
 	depends on PCI && X86
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index c779e1e..bd9e7fa 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -47,3 +47,4 @@ obj-$(CONFIG_MOXART_DMA) += moxart-dma.o
 obj-$(CONFIG_FSL_EDMA) += fsl-edma.o
 obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o
 obj-y += xilinx/
+obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
diff --git a/drivers/dma/mic_x100_dma.c b/drivers/dma/mic_x100_dma.c
new file mode 100644
index 0000000..6de2e67
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.c
@@ -0,0 +1,774 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#include <linux/module.h>
+#include <linux/io.h>
+#include <linux/seq_file.h>
+
+#include "mic_x100_dma.h"
+
+#define MIC_DMA_MAX_XFER_SIZE_CARD  (1 * 1024 * 1024 -\
+				       MIC_DMA_ALIGN_BYTES)
+#define MIC_DMA_MAX_XFER_SIZE_HOST  (1 * 1024 * 1024 >> 1)
+#define MIC_DMA_DESC_TYPE_SHIFT	60
+#define MIC_DMA_MEMCPY_LEN_SHIFT 46
+#define MIC_DMA_STAT_INTR_SHIFT 59
+
+/* high-water mark for pushing dma descriptors */
+static int mic_dma_pending_level = 4;
+
+/* Status descriptor is used to write a 64 bit value to a memory location */
+enum mic_dma_desc_format_type {
+	MIC_DMA_MEMCPY = 1,
+	MIC_DMA_STATUS,
+};
+
+static inline u32 mic_dma_hw_ring_inc(u32 val)
+{
+	return (val + 1) % MIC_DMA_DESC_RX_SIZE;
+}
+
+static inline u32 mic_dma_hw_ring_dec(u32 val)
+{
+	return val ? val - 1 : MIC_DMA_DESC_RX_SIZE - 1;
+}
+
+static inline void mic_dma_hw_ring_inc_head(struct mic_dma_chan *ch)
+{
+	ch->head = mic_dma_hw_ring_inc(ch->head);
+}
+
+/* Prepare a memcpy desc */
+static inline void mic_dma_memcpy_desc(struct mic_dma_desc *desc,
+	dma_addr_t src_phys, dma_addr_t dst_phys, u64 size)
+{
+	u64 qw0, qw1;
+
+	qw0 = src_phys;
+	qw0 |= (size >> MIC_DMA_ALIGN_SHIFT) << MIC_DMA_MEMCPY_LEN_SHIFT;
+	qw1 = MIC_DMA_MEMCPY;
+	qw1 <<= MIC_DMA_DESC_TYPE_SHIFT;
+	qw1 |= dst_phys;
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+/* Prepare a status desc. with @data to be written at @dst_phys */
+static inline void mic_dma_prep_status_desc(struct mic_dma_desc *desc, u64 data,
+	dma_addr_t dst_phys, bool generate_intr)
+{
+	u64 qw0, qw1;
+
+	qw0 = data;
+	qw1 = (u64) MIC_DMA_STATUS << MIC_DMA_DESC_TYPE_SHIFT | dst_phys;
+	if (generate_intr)
+		qw1 |= (1ULL << MIC_DMA_STAT_INTR_SHIFT);
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+static void mic_dma_cleanup(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	u32 tail;
+	u32 last_tail;
+
+	spin_lock(&ch->cleanup_lock);
+	tail = mic_dma_read_cmp_cnt(ch);
+	/*
+	 * This is the barrier pair for smp_wmb() in fn.
+	 * mic_dma_tx_submit_unlock. It's required so that we read the
+	 * updated cookie value from tx->cookie.
+	 */
+	smp_rmb();
+	for (last_tail = ch->last_tail; tail != last_tail;) {
+		tx = &ch->tx_array[last_tail];
+		if (tx->cookie) {
+			dma_cookie_complete(tx);
+			if (tx->callback) {
+				tx->callback(tx->callback_param);
+				tx->callback = NULL;
+			}
+		}
+		last_tail = mic_dma_hw_ring_inc(last_tail);
+	}
+	/* finish all completion callbacks before incrementing tail */
+	smp_mb();
+	ch->last_tail = last_tail;
+	spin_unlock(&ch->cleanup_lock);
+}
+
+static u32 mic_dma_ring_count(u32 head, u32 tail)
+{
+	u32 count;
+
+	if (head >= tail)
+		count = (tail - 0) + (MIC_DMA_DESC_RX_SIZE - head);
+	else
+		count = tail - head;
+	return count - 1;
+}
+
+/* Returns the num. of free descriptors on success, -ENOMEM on failure */
+static int mic_dma_avail_desc_ring_space(struct mic_dma_chan *ch, int required)
+{
+	struct device *dev = mic_dma_ch_to_device(ch);
+	u32 count;
+
+	count = mic_dma_ring_count(ch->head, ch->last_tail);
+	if (count < required) {
+		mic_dma_cleanup(ch);
+		count = mic_dma_ring_count(ch->head, ch->last_tail);
+	}
+
+	if (count < required) {
+		dev_dbg(dev, "Not enough desc space\n");
+		dev_dbg(dev, "%s %d required=%u, avail=%u\n",
+			__func__, __LINE__, required, count);
+		return -ENOMEM;
+	} else {
+		return count;
+	}
+}
+
+/* Program memcpy descriptors into the ring and update the s/w head pointer */
+static int mic_dma_prog_memcpy_desc(struct mic_dma_chan *ch, dma_addr_t src,
+				    dma_addr_t dst, size_t len)
+{
+	size_t current_transfer_len;
+	size_t max_xfer_size = to_mic_dma_dev(ch)->max_xfer_size;
+	/* 3 is added to make sure we have enough space for status desc */
+	int num_desc = len / max_xfer_size + 3;
+	int ret;
+
+	if (len % max_xfer_size)
+		num_desc++;
+
+	ret = mic_dma_avail_desc_ring_space(ch, num_desc);
+	if (ret < 0)
+		return ret;
+	do {
+		current_transfer_len = min(len, max_xfer_size);
+		mic_dma_memcpy_desc(&ch->desc_ring[ch->head],
+				    src, dst, current_transfer_len);
+		mic_dma_hw_ring_inc_head(ch);
+		len -= current_transfer_len;
+		dst = dst + current_transfer_len;
+		src = src + current_transfer_len;
+	} while (len > 0);
+	return 0;
+}
+
+/* It's a h/w quirk and h/w needs 2 status descriptors for every status desc */
+static void mic_dma_prog_intr(struct mic_dma_chan *ch)
+{
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, false);
+	mic_dma_hw_ring_inc_head(ch);
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, true);
+	mic_dma_hw_ring_inc_head(ch);
+}
+
+/* Wrapper function to program memcpy descriptors/status descriptors */
+static int mic_dma_do_dma(struct mic_dma_chan *ch, int flags, dma_addr_t src,
+			  dma_addr_t dst, size_t len)
+{
+	if (-ENOMEM == mic_dma_prog_memcpy_desc(ch, src, dst, len))
+		return -ENOMEM;
+	/* Above mic_dma_prog_memcpy_desc() makes sure we have enough space */
+	if (flags & DMA_PREP_FENCE) {
+		mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+					 ch->status_dest_micpa, false);
+		mic_dma_hw_ring_inc_head(ch);
+	}
+
+	if (flags & DMA_PREP_INTERRUPT)
+		mic_dma_prog_intr(ch);
+
+	return 0;
+}
+
+static inline void mic_dma_issue_pending(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	spin_lock(&mic_ch->issue_lock);
+	/*
+	 * Write to head triggers h/w to act on the descriptors.
+	 * On MIC, writing the same head value twice causes
+	 * a h/w error. On second write, h/w assumes we filled
+	 * the entire ring & overwrote some of the descriptors.
+	 */
+	if (mic_ch->issued == mic_ch->submitted)
+		goto out;
+	mic_ch->issued = mic_ch->submitted;
+	/*
+	 * make descriptor updates visible before advancing head,
+	 * this is purposefully not smp_wmb() since we are also
+	 * publishing the descriptor updates to a dma device
+	 */
+	wmb();
+	mic_dma_write_reg(mic_ch, MIC_DMA_REG_DHPR, mic_ch->issued);
+out:
+	spin_unlock(&mic_ch->issue_lock);
+}
+
+static inline void mic_dma_update_pending(struct mic_dma_chan *ch)
+{
+	if (mic_dma_ring_count(ch->issued, ch->submitted)
+			> mic_dma_pending_level)
+		mic_dma_issue_pending(&ch->api_ch);
+}
+
+static dma_cookie_t mic_dma_tx_submit_unlock(struct dma_async_tx_descriptor *tx)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(tx->chan);
+	dma_cookie_t cookie;
+
+	dma_cookie_assign(tx);
+	cookie = tx->cookie;
+	/*
+	 * We need an smp write barrier here because another CPU might see
+	 * an update to submitted and update h/w head even before we
+	 * assigned a cookie to this tx.
+	 */
+	smp_wmb();
+	mic_ch->submitted = mic_ch->head;
+	spin_unlock(&mic_ch->prep_lock);
+	mic_dma_update_pending(mic_ch);
+	return cookie;
+}
+
+static inline struct dma_async_tx_descriptor *
+allocate_tx(struct mic_dma_chan *ch)
+{
+	u32 idx = mic_dma_hw_ring_dec(ch->head);
+	struct dma_async_tx_descriptor *tx = &ch->tx_array[idx];
+
+	dma_async_tx_descriptor_init(tx, &ch->api_ch);
+	tx->tx_submit = mic_dma_tx_submit_unlock;
+	return tx;
+}
+
+/*
+ * Prepare a memcpy descriptor to be added to the ring.
+ * Note that the temporary descriptor adds an extra overhead of copying the
+ * descriptor to ring. So, we copy directly to the descriptor ring
+ */
+static struct dma_async_tx_descriptor *
+mic_dma_prep_memcpy_lock(struct dma_chan *ch, dma_addr_t dma_dest,
+			 dma_addr_t dma_src, size_t len, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	struct device *dev = mic_dma_ch_to_device(mic_ch);
+	int result;
+
+	if (!len && !flags)
+		return NULL;
+
+	spin_lock(&mic_ch->prep_lock);
+	result = mic_dma_do_dma(mic_ch, flags, dma_src, dma_dest, len);
+	if (result >= 0)
+		return allocate_tx(mic_ch);
+	dev_err(dev, "Error enqueueing dma, error=%d\n", result);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
+mic_dma_prep_interrupt_lock(struct dma_chan *ch, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	int ret;
+
+	spin_lock(&mic_ch->prep_lock);
+	ret = mic_dma_do_dma(mic_ch, flags, 0, 0, 0);
+	if (!ret)
+		return allocate_tx(mic_ch);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+/* Return the status of the transaction */
+static enum dma_status
+mic_dma_tx_status(struct dma_chan *ch, dma_cookie_t cookie,
+		  struct dma_tx_state *txstate)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	if (DMA_COMPLETE != dma_cookie_status(ch, cookie, txstate))
+		mic_dma_cleanup(mic_ch);
+
+	return dma_cookie_status(ch, cookie, txstate);
+}
+
+static irqreturn_t mic_dma_thread_fn(int irq, void *data)
+{
+	mic_dma_cleanup((struct mic_dma_chan *)data);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t mic_dma_intr_handler(int irq, void *data)
+{
+	struct mic_dma_chan *ch = ((struct mic_dma_chan *)data);
+
+	mic_dma_ack_interrupt(ch);
+	return IRQ_WAKE_THREAD;
+}
+
+static int mic_dma_alloc_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	ch->desc_ring = kzalloc(desc_ring_size, GFP_KERNEL);
+
+	if (!ch->desc_ring)
+		return -ENOMEM;
+
+	ch->desc_ring_micpa = dma_map_single(dev, ch->desc_ring,
+					     desc_ring_size, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->desc_ring_micpa))
+		goto map_error;
+
+	ch->tx_array = vzalloc(MIC_DMA_DESC_RX_SIZE * sizeof(*ch->tx_array));
+	if (!ch->tx_array)
+		goto tx_error;
+	return 0;
+tx_error:
+	dma_unmap_single(dev, ch->desc_ring_micpa, desc_ring_size,
+			 DMA_BIDIRECTIONAL);
+map_error:
+	kfree(ch->desc_ring);
+	return -ENOMEM;
+}
+
+static void mic_dma_free_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+
+	vfree(ch->tx_array);
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->desc_ring_micpa,
+			 desc_ring_size, DMA_BIDIRECTIONAL);
+	kfree(ch->desc_ring);
+	ch->desc_ring = NULL;
+}
+
+static void mic_dma_free_status_dest(struct mic_dma_chan *ch)
+{
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->status_dest_micpa,
+			 L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	kfree(ch->status_dest);
+}
+
+static int mic_dma_alloc_status_dest(struct mic_dma_chan *ch)
+{
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	ch->status_dest = kzalloc(L1_CACHE_BYTES, GFP_KERNEL);
+	if (!ch->status_dest)
+		return -ENOMEM;
+	ch->status_dest_micpa = dma_map_single(dev, ch->status_dest,
+					L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->status_dest_micpa)) {
+		kfree(ch->status_dest);
+		ch->status_dest = NULL;
+		return -ENOMEM;
+	}
+	return 0;
+}
+
+static int mic_dma_check_chan(struct mic_dma_chan *ch)
+{
+	if (mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR) ||
+	    mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) & MIC_DMA_CHAN_QUIESCE) {
+		mic_dma_disable_chan(ch);
+		mic_dma_chan_mask_intr(ch);
+		dev_err(mic_dma_ch_to_device(ch),
+			"%s %d error setting up mic dma chan %d\n",
+			__func__, __LINE__, ch->ch_num);
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static int mic_dma_chan_setup(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		mic_dma_chan_set_owner(ch);
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCHERRMSK, 0);
+	mic_dma_chan_set_desc_ring(ch);
+	ch->last_tail = mic_dma_read_reg(ch, MIC_DMA_REG_DTPR);
+	ch->head = ch->last_tail;
+	ch->issued = 0;
+	mic_dma_chan_unmask_intr(ch);
+	mic_dma_enable_chan(ch);
+	return mic_dma_check_chan(ch);
+}
+
+static void mic_dma_chan_destroy(struct mic_dma_chan *ch)
+{
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+}
+
+static void mic_dma_unregister_dma_device(struct mic_dma_device *mic_dma_dev)
+{
+	dma_async_device_unregister(&mic_dma_dev->dma_dev);
+}
+
+static int mic_dma_setup_irq(struct mic_dma_chan *ch)
+{
+	ch->cookie =
+		to_mbus_hw_ops(ch)->request_threaded_irq(to_mbus_device(ch),
+			mic_dma_intr_handler, mic_dma_thread_fn,
+			"mic dma_channel", ch, ch->ch_num);
+	if (IS_ERR(ch->cookie))
+		return PTR_ERR(ch->cookie);
+	return 0;
+}
+
+static inline void mic_dma_free_irq(struct mic_dma_chan *ch)
+{
+	to_mbus_hw_ops(ch)->free_irq(to_mbus_device(ch), ch->cookie, ch);
+}
+
+static int mic_dma_chan_init(struct mic_dma_chan *ch)
+{
+	int ret = mic_dma_alloc_desc_ring(ch);
+
+	if (ret)
+		goto ring_error;
+	ret = mic_dma_alloc_status_dest(ch);
+	if (ret)
+		goto status_error;
+	ret = mic_dma_chan_setup(ch);
+	if (ret)
+		goto chan_error;
+	return ret;
+chan_error:
+	mic_dma_free_status_dest(ch);
+status_error:
+	mic_dma_free_desc_ring(ch);
+ring_error:
+	return ret;
+}
+
+static int mic_dma_drain_chan(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	int err = 0;
+	dma_cookie_t cookie;
+
+	tx = mic_dma_prep_memcpy_lock(&ch->api_ch, 0, 0, 0, DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	cookie = tx->tx_submit(tx);
+	if (dma_submit_error(cookie))
+		err = -ENOMEM;
+	else
+		err = dma_sync_wait(&ch->api_ch, cookie);
+	if (err) {
+		dev_err(mic_dma_ch_to_device(ch), "%s %d TO chan 0x%x\n",
+			__func__, __LINE__, ch->ch_num);
+		err = -EIO;
+	}
+error:
+	mic_dma_cleanup(ch);
+	return err;
+}
+
+static inline void mic_dma_chan_uninit(struct mic_dma_chan *ch)
+{
+	mic_dma_chan_destroy(ch);
+	mic_dma_cleanup(ch);
+	mic_dma_free_status_dest(ch);
+	mic_dma_free_desc_ring(ch);
+}
+
+static int mic_dma_init(struct mic_dma_device *mic_dma_dev,
+			enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+	int ret;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		unsigned long data;
+		ch = &mic_dma_dev->mic_ch[i];
+		data = (unsigned long)ch;
+		ch->ch_num = i;
+		ch->owner = owner;
+		spin_lock_init(&ch->cleanup_lock);
+		spin_lock_init(&ch->prep_lock);
+		spin_lock_init(&ch->issue_lock);
+		ret = mic_dma_setup_irq(ch);
+		if (ret)
+			goto error;
+	}
+	return 0;
+error:
+	for (i = i - 1; i >= first_chan; i--)
+		mic_dma_free_irq(&mic_dma_dev->mic_ch[i]);
+	return ret;
+}
+
+static void mic_dma_uninit(struct mic_dma_device *mic_dma_dev)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		mic_dma_free_irq(ch);
+	}
+}
+
+static int mic_dma_alloc_chan_resources(struct dma_chan *ch)
+{
+	int ret = mic_dma_chan_init(to_mic_dma_chan(ch));
+	if (ret)
+		return ret;
+	return MIC_DMA_DESC_RX_SIZE;
+}
+
+static void mic_dma_free_chan_resources(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	mic_dma_drain_chan(mic_ch);
+	mic_dma_chan_uninit(mic_ch);
+}
+
+/* Set the fn. handlers and register the dma device with dma api */
+static int mic_dma_register_dma_device(struct mic_dma_device *mic_dma_dev,
+				       enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+
+	dma_cap_zero(mic_dma_dev->dma_dev.cap_mask);
+	/*
+	 * This dma engine is not capable of host memory to host memory
+	 * transfers
+	 */
+	dma_cap_set(DMA_MEMCPY, mic_dma_dev->dma_dev.cap_mask);
+
+	if (MIC_DMA_CHAN_HOST == owner)
+		dma_cap_set(DMA_PRIVATE, mic_dma_dev->dma_dev.cap_mask);
+	mic_dma_dev->dma_dev.device_alloc_chan_resources =
+		mic_dma_alloc_chan_resources;
+	mic_dma_dev->dma_dev.device_free_chan_resources =
+		mic_dma_free_chan_resources;
+	mic_dma_dev->dma_dev.device_tx_status = mic_dma_tx_status;
+	mic_dma_dev->dma_dev.device_prep_dma_memcpy = mic_dma_prep_memcpy_lock;
+	mic_dma_dev->dma_dev.device_prep_dma_interrupt =
+		mic_dma_prep_interrupt_lock;
+	mic_dma_dev->dma_dev.device_issue_pending = mic_dma_issue_pending;
+	mic_dma_dev->dma_dev.copy_align = MIC_DMA_ALIGN_SHIFT;
+	INIT_LIST_HEAD(&mic_dma_dev->dma_dev.channels);
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		mic_dma_dev->mic_ch[i].api_ch.device = &mic_dma_dev->dma_dev;
+		dma_cookie_init(&mic_dma_dev->mic_ch[i].api_ch);
+		list_add_tail(&mic_dma_dev->mic_ch[i].api_ch.device_node,
+			      &mic_dma_dev->dma_dev.channels);
+	}
+	return dma_async_device_register(&mic_dma_dev->dma_dev);
+}
+
+/*
+ * Initializes dma channels and registers the dma device with the
+ * dma engine api.
+ */
+static struct mic_dma_device *mic_dma_dev_reg(struct mbus_device *mbdev,
+					      enum mic_dma_chan_owner owner)
+{
+	struct mic_dma_device *mic_dma_dev;
+	int ret;
+	struct device *dev = &mbdev->dev;
+
+	mic_dma_dev = kzalloc(sizeof(*mic_dma_dev), GFP_KERNEL);
+	if (!mic_dma_dev) {
+		ret = -ENOMEM;
+		goto alloc_error;
+	}
+	mic_dma_dev->mbdev = mbdev;
+	mic_dma_dev->dma_dev.dev = dev;
+	mic_dma_dev->mmio = mbdev->mmio_va;
+	if (MIC_DMA_CHAN_HOST == owner) {
+		mic_dma_dev->start_ch = 0;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_HOST;
+	} else {
+		mic_dma_dev->start_ch = 4;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_CARD;
+	}
+	ret = mic_dma_init(mic_dma_dev, owner);
+	if (ret)
+		goto init_error;
+	ret = mic_dma_register_dma_device(mic_dma_dev, owner);
+	if (ret)
+		goto reg_error;
+	return mic_dma_dev;
+reg_error:
+	mic_dma_uninit(mic_dma_dev);
+init_error:
+	kfree(mic_dma_dev);
+	mic_dma_dev = NULL;
+alloc_error:
+	dev_err(dev, "Error at %s %d ret=%d\n", __func__, __LINE__, ret);
+	return mic_dma_dev;
+}
+
+static void mic_dma_dev_unreg(struct mic_dma_device *mic_dma_dev)
+{
+	mic_dma_unregister_dma_device(mic_dma_dev);
+	mic_dma_uninit(mic_dma_dev);
+	kfree(mic_dma_dev);
+}
+
+/* DEBUGFS CODE */
+static int mic_dma_reg_seq_show(struct seq_file *s, void *pos)
+{
+	struct mic_dma_device *mic_dma_dev = s->private;
+	int i, chan_num, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	seq_printf(s, "SBOX_DCR: %#x\n",
+		   mic_dma_mmio_read(&mic_dma_dev->mic_ch[first_chan],
+				     MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR));
+	seq_puts(s, "DMA Channel Registers\n");
+	seq_printf(s, "%-10s| %-10s %-10s %-10s %-10s %-10s",
+		   "Channel", "DCAR", "DTPR", "DHPR", "DRAR_HI", "DRAR_LO");
+	seq_printf(s, " %-11s %-14s %-10s\n", "DCHERR", "DCHERRMSK", "DSTAT");
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		chan_num = ch->ch_num;
+		seq_printf(s, "%-10i| %-#10x %-#10x %-#10x %-#10x",
+			   chan_num,
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCAR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DTPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DHPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_HI));
+		seq_printf(s, " %-#10x %-#10x %-#14x %-#10x\n",
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_LO),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERRMSK),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT));
+	}
+	return 0;
+}
+
+static int mic_dma_reg_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_dma_reg_seq_show, inode->i_private);
+}
+
+static int mic_dma_reg_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations mic_dma_reg_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_dma_reg_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_dma_reg_debug_release
+};
+
+/* Debugfs parent dir */
+static struct dentry *mic_dma_dbg;
+
+static int mic_dma_driver_probe(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+	enum mic_dma_chan_owner owner;
+
+	if (MBUS_DEV_DMA_MIC == mbdev->id.device)
+		owner = MIC_DMA_CHAN_MIC;
+	else
+		owner = MIC_DMA_CHAN_HOST;
+
+	mic_dma_dev = mic_dma_dev_reg(mbdev, owner);
+	dev_set_drvdata(&mbdev->dev, mic_dma_dev);
+
+	if (mic_dma_dbg) {
+		mic_dma_dev->dbg_dir = debugfs_create_dir(dev_name(&mbdev->dev),
+							  mic_dma_dbg);
+		if (mic_dma_dev->dbg_dir)
+			debugfs_create_file("mic_dma_reg", 0444,
+					    mic_dma_dev->dbg_dir, mic_dma_dev,
+					    &mic_dma_reg_ops);
+	}
+	return 0;
+}
+
+static void mic_dma_driver_remove(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+
+	mic_dma_dev = dev_get_drvdata(&mbdev->dev);
+	debugfs_remove_recursive(mic_dma_dev->dbg_dir);
+	mic_dma_dev_unreg(mic_dma_dev);
+}
+
+static struct mbus_device_id id_table[] = {
+	{MBUS_DEV_DMA_MIC, MBUS_DEV_ANY_ID},
+	{MBUS_DEV_DMA_HOST, MBUS_DEV_ANY_ID},
+	{0},
+};
+
+static struct mbus_driver mic_dma_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = mic_dma_driver_probe,
+	.remove = mic_dma_driver_remove,
+};
+
+static int __init mic_x100_dma_init(void)
+{
+	int rc = mbus_register_driver(&mic_dma_driver);
+	if (rc)
+		return rc;
+	mic_dma_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	return 0;
+}
+
+static void __exit mic_x100_dma_exit(void)
+{
+	debugfs_remove_recursive(mic_dma_dbg);
+	mbus_unregister_driver(&mic_dma_driver);
+}
+
+module_init(mic_x100_dma_init);
+module_exit(mic_x100_dma_exit);
+
+MODULE_DEVICE_TABLE(mbus, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC X100 DMA Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/dma/mic_x100_dma.h b/drivers/dma/mic_x100_dma.h
new file mode 100644
index 0000000..f663b0b
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.h
@@ -0,0 +1,286 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#ifndef _MIC_X100_DMA_H_
+#define _MIC_X100_DMA_H_
+
+#include <linux/kernel.h>
+#include <linux/delay.h>
+#include <linux/sched.h>
+#include <linux/debugfs.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/mic_bus.h>
+
+#include "dmaengine.h"
+
+/*
+ * MIC has a total of 8 dma channels.
+ * Four channels are assigned for host SW use & the remaining for MIC SW.
+ * MIC DMA transfer size & addresses need to be 64 byte aligned.
+ */
+#define MIC_DMA_MAX_NUM_CHAN	8
+#define MIC_DMA_NUM_CHAN	4
+#define MIC_DMA_ALIGN_SHIFT	6
+#define MIC_DMA_ALIGN_BYTES	(1 << MIC_DMA_ALIGN_SHIFT)
+#define MIC_DMA_DESC_RX_SIZE	(128 * 1024 - 4)
+
+/*
+ * Register descriptions
+ * All the registers are 32 bit registers.
+ * DCR is a global register and all others are per-channel.
+ * DCR - bits 0, 2, 4, 6, 8, 10, 12, 14 - enable bits for channels 0 to 7
+ *	 bits 1, 3, 5, 7, 9, 11, 13, 15 - owner bits for channels 0 to 7
+ * DCAR - bit 24 & 25 interrupt masks for mic owned & host owned channels
+ * DHPR - head of the descriptor ring updated by s/w
+ * DTPR - tail of the descriptor ring updated by h/w
+ * DRAR_LO - lower 32 bits of descriptor ring's mic address
+ * DRAR_HI - 3:0 - remaining 4 bits of descriptor ring's mic address
+ *	     20:4 descriptor ring size
+ *	     25:21 mic smpt entry number
+ * DSTAT - 16:0 h/w completion count; 31:28 dma engine status
+ * DCHERR - this register is non-zero on error
+ * DCHERRMSK - interrupt mask register
+ */
+#define MIC_DMA_HW_CMP_CNT_MASK		0x1ffff
+#define MIC_DMA_CHAN_QUIESCE		0x20000000
+#define MIC_DMA_SBOX_BASE		0x00010000
+#define MIC_DMA_SBOX_DCR		0x0000A280
+#define MIC_DMA_SBOX_CH_BASE		0x0001A000
+#define MIC_DMA_SBOX_CHAN_OFF		0x40
+#define MIC_DMA_SBOX_DCAR_IM0		(0x1 << 24)
+#define MIC_DMA_SBOX_DCAR_IM1		(0x1 << 25)
+#define MIC_DMA_SBOX_DRARHI_SYS_MASK	(0x1 << 26)
+#define MIC_DMA_REG_DCAR		0
+#define MIC_DMA_REG_DHPR		4
+#define MIC_DMA_REG_DTPR		8
+#define MIC_DMA_REG_DRAR_LO		20
+#define MIC_DMA_REG_DRAR_HI		24
+#define MIC_DMA_REG_DSTAT		32
+#define MIC_DMA_REG_DCHERR		44
+#define MIC_DMA_REG_DCHERRMSK		48
+
+/* HW dma desc */
+struct mic_dma_desc {
+	u64 qw0;
+	u64 qw1;
+};
+
+enum mic_dma_chan_owner {
+	MIC_DMA_CHAN_MIC = 0,
+	MIC_DMA_CHAN_HOST
+};
+
+/*
+ * mic_dma_chan - channel specific information
+ * @ch_num: channel number
+ * @owner: owner of this channel
+ * @last_tail: cached value of descriptor ring tail
+ * @head: index of next descriptor in desc_ring
+ * @issued: hardware notification point
+ * @submitted: index that will be used to submit descriptors to h/w
+ * @api_ch: dma engine api channel
+ * @desc_ring: dma descriptor ring
+ * @desc_ring_micpa: mic physical address of desc_ring
+ * @status_dest: destination for status (fence) descriptor
+ * @status_dest_micpa: mic address for status_dest,
+ *		       DMA controller uses this address
+ * @tx_array: array of async_tx
+ * @cleanup_lock: lock held when processing completed tx
+ * @prep_lock: lock held in prep_memcpy & released in tx_submit
+ * @issue_lock: lock used to synchronize writes to head
+ * @cookie: mic_irq cookie used with mic irq request
+ */
+struct mic_dma_chan {
+	int ch_num;
+	enum mic_dma_chan_owner owner;
+	u32 last_tail;
+	u32 head;
+	u32 issued;
+	u32 submitted;
+	struct dma_chan api_ch;
+	struct mic_dma_desc *desc_ring;
+	dma_addr_t desc_ring_micpa;
+	u64 *status_dest;
+	dma_addr_t status_dest_micpa;
+	struct dma_async_tx_descriptor *tx_array;
+	spinlock_t cleanup_lock;
+	spinlock_t prep_lock;
+	spinlock_t issue_lock;
+	struct mic_irq *cookie;
+};
+
+/*
+ * struct mic_dma_device - per mic device
+ * @mic_ch: dma channels
+ * @dma_dev: underlying dma device
+ * @mbdev: mic bus dma device
+ * @mmio: virtual address of the mmio space
+ * @dbg_dir: debugfs directory
+ * @start_ch: first channel number that can be used
+ * @max_xfer_size: maximum transfer size per dma descriptor
+ */
+struct mic_dma_device {
+	struct mic_dma_chan mic_ch[MIC_DMA_MAX_NUM_CHAN];
+	struct dma_device dma_dev;
+	struct mbus_device *mbdev;
+	void __iomem *mmio;
+	struct dentry *dbg_dir;
+	int start_ch;
+	size_t max_xfer_size;
+};
+
+static inline struct mic_dma_chan *to_mic_dma_chan(struct dma_chan *ch)
+{
+	return container_of(ch, struct mic_dma_chan, api_ch);
+}
+
+static inline struct mic_dma_device *to_mic_dma_dev(struct mic_dma_chan *ch)
+{
+	return
+	container_of((const typeof(((struct mic_dma_device *)0)->mic_ch)*)
+		     (ch - ch->ch_num), struct mic_dma_device, mic_ch);
+}
+
+static inline struct mbus_device *to_mbus_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mbdev;
+}
+
+static inline struct mbus_hw_ops *to_mbus_hw_ops(struct mic_dma_chan *ch)
+{
+	return to_mbus_device(ch)->hw_ops;
+}
+
+static inline struct device *mic_dma_ch_to_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->dma_dev.dev;
+}
+
+static inline void __iomem *mic_dma_chan_to_mmio(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mmio;
+}
+
+static inline u32 mic_dma_read_reg(struct mic_dma_chan *ch, u32 reg)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+			ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline void mic_dma_write_reg(struct mic_dma_chan *ch, u32 reg, u32 val)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+		  ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline u32 mic_dma_mmio_read(struct mic_dma_chan *ch, u32 offset)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline void mic_dma_mmio_write(struct mic_dma_chan *ch, u32 val,
+				      u32 offset)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline u32 mic_dma_read_cmp_cnt(struct mic_dma_chan *ch)
+{
+	return mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) &
+	       MIC_DMA_HW_CMP_CNT_MASK;
+}
+
+static inline void mic_dma_chan_set_owner(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+	u32 chan_num = ch->ch_num;
+
+	dcr = (dcr & ~(0x1 << (chan_num * 2))) | (ch->owner << (chan_num * 2));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_enable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr |= 2 << (ch->ch_num << 1);
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_disable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr &= ~(2 << (ch->ch_num << 1));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static void mic_dma_chan_set_desc_ring(struct mic_dma_chan *ch)
+{
+	u32 drar_hi;
+	dma_addr_t desc_ring_micpa = ch->desc_ring_micpa;
+
+	drar_hi = (MIC_DMA_DESC_RX_SIZE & 0x1ffff) << 4;
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		drar_hi |= (desc_ring_micpa >> 32) & 0xf;
+	} else {
+		drar_hi |= MIC_DMA_SBOX_DRARHI_SYS_MASK;
+		drar_hi |= ((desc_ring_micpa >> 34)
+			    & 0x1f) << 21;
+		drar_hi |= (desc_ring_micpa >> 32) & 0x3;
+	}
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_LO, (u32) desc_ring_micpa);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_HI, drar_hi);
+}
+
+static inline void mic_dma_chan_mask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar |= MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar |= MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static inline void mic_dma_chan_unmask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static void mic_dma_ack_interrupt(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		/* HW errata */
+		mic_dma_chan_mask_intr(ch);
+		mic_dma_chan_unmask_intr(ch);
+	}
+	to_mbus_hw_ops(ch)->ack_interrupt(to_mbus_device(ch), ch->ch_num);
+}
+#endif
-- 
1.8.2.1



* [PATCH char-misc-next v3 7/8] misc: mic: add dma support in card driver
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (5 preceding siblings ...)
  2014-07-11 21:04 85% ` [PATCH char-misc-next v3 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
@ 2014-07-11 21:04 85% ` Sudeep Dutt
  2014-07-11 21:04 99% ` [PATCH char-misc-next v3 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch adds a dma device on the mic virtual bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/misc/mic/Kconfig           |  2 +-
 drivers/misc/mic/card/mic_device.h |  8 ++++--
 drivers/misc/mic/card/mic_x100.c   | 55 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bf76313..cc4eef0 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -39,7 +39,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86
+	depends on 64BIT && X86 && INTEL_MIC_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index e12a0c2..844be8f 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -31,6 +31,7 @@
 #include <linux/io.h>
 #include <linux/irqreturn.h>
 #include <linux/interrupt.h>
+#include <linux/mic_bus.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -71,6 +72,7 @@ struct mic_device {
  * @hotplug_work: Hot plug work for adding/removing virtio devices.
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
+ * @dma_mbdev: dma device on the MIC virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -81,6 +83,7 @@ struct mic_driver {
 	struct work_struct hotplug_work;
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
+	struct mbus_device *dma_mbdev;
 };
 
 /**
@@ -117,8 +120,9 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 int mic_driver_init(struct mic_driver *mdrv);
 void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
-struct mic_irq *mic_request_card_irq(irq_handler_t handler,
-	irq_handler_t thread_fn, const char *name, void *data, int intr_src);
+struct mic_irq *
+mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
+		     const char *name, void *data, int intr_src);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 2868945..55c9465 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -148,6 +148,47 @@ void mic_card_unmap(struct mic_device *mdev, void __iomem *addr)
 	iounmap(addr);
 }
 
+static inline struct mic_driver *mbdev_to_mdrv(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static struct mic_irq *
+_mic_request_threaded_irq(struct mbus_device *mbdev,
+			  irq_handler_t handler, irq_handler_t thread_fn,
+			  const char *name, void *data, int intr_src)
+{
+	int rc = 0;
+	unsigned int irq = intr_src;
+	unsigned long cookie = irq;
+
+	rc  = request_threaded_irq(irq, handler, thread_fn, 0, name, data);
+	if (rc) {
+		dev_err(mbdev_to_mdrv(mbdev)->dev,
+			"request_threaded_irq failed rc = %d\n", rc);
+		return ERR_PTR(rc);
+	}
+	return (struct mic_irq *)cookie;
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+			  struct mic_irq *cookie, void *data)
+{
+	unsigned long irq = (unsigned long)cookie;
+	free_irq(irq, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	mic_ack_interrupt(&mbdev_to_mdrv(mbdev)->mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 static int __init mic_probe(struct platform_device *pdev)
 {
 	struct mic_driver *mdrv = &g_drv;
@@ -166,13 +207,24 @@ static int __init mic_probe(struct platform_device *pdev)
 		goto done;
 	}
 	mic_hw_intr_init(mdrv);
+	platform_set_drvdata(pdev, mdrv);
+	mdrv->dma_mbdev = mbus_register_device(mdrv->dev, MBUS_DEV_DMA_MIC,
+					       NULL, &mbus_hw_ops,
+					       mdrv->mdev.mmio.va);
+	if (IS_ERR(mdrv->dma_mbdev)) {
+		rc = PTR_ERR(mdrv->dma_mbdev);
+		dev_err(&pdev->dev, "mbus_register_device failed rc %d\n", rc);
+		goto iounmap;
+	}
 	rc = mic_driver_init(mdrv);
 	if (rc) {
 		dev_err(&pdev->dev, "mic_driver_init failed rc %d\n", rc);
-		goto iounmap;
+		goto remove_dma;
 	}
 done:
 	return rc;
+remove_dma:
+	mbus_unregister_device(mdrv->dma_mbdev);
 iounmap:
 	iounmap(mdev->mmio.va);
 	return rc;
@@ -184,6 +236,7 @@ static int mic_remove(struct platform_device *pdev)
 	struct mic_device *mdev = &mdrv->mdev;
 
 	mic_driver_uninit(mdrv);
+	mbus_unregister_device(mdrv->dma_mbdev);
 	iounmap(mdev->mmio.va);
 	return 0;
 }
-- 
1.8.2.1



* [PATCH char-misc-next v3 2/8] misc: mic: add a bus driver for virtual MIC devices
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  2014-07-11 21:04 91% ` [PATCH char-misc-next v3 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
@ 2014-07-11 21:04 63% ` Sudeep Dutt
    2014-07-11 21:04 41% ` [PATCH char-misc-next v3 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

This MIC virtual bus driver takes the responsibility of creating all
the virtual devices connected to the PCIe device on the host and the
platform device on the card. The MIC bus hardware operations provide
a way to abstract certain hardware details from the base physical devices.
Examples of devices added on the MIC virtual bus include host DMA and card DMA.
This abstraction enables using a common DMA driver on host and card.
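
The hardware-ops abstraction described above can be sketched in plain
userspace C. This is only an illustration of the pattern, not the kernel
code: struct demo_hw_ops, struct demo_device, host_ack(), card_ack() and
common_driver_ack() are hypothetical names, and the assumption that the
card side ignores the interrupt number mirrors _mic_ack_interrupt() in
patch 7 of this series.

```c
#include <assert.h>

struct demo_device;

/* Hypothetical, simplified analogue of struct mbus_hw_ops. */
struct demo_hw_ops {
	void (*ack_interrupt)(struct demo_device *dev, int num);
};

/* Hypothetical, simplified analogue of struct mbus_device. */
struct demo_device {
	const struct demo_hw_ops *hw_ops;
	int last_acked;	/* last interrupt source that was acknowledged */
};

/* Host flavour: acknowledges the specific source it is given. */
static void host_ack(struct demo_device *dev, int num)
{
	dev->last_acked = num;
}

/* Card flavour: assumed to have a single source, so num is ignored. */
static void card_ack(struct demo_device *dev, int num)
{
	(void)num;
	dev->last_acked = 0;
}

static const struct demo_hw_ops host_ops = { .ack_interrupt = host_ack };
static const struct demo_hw_ops card_ops = { .ack_interrupt = card_ack };

/*
 * Common driver code: it never checks whether it runs on the host or
 * on the card, it only calls through the ops table of its device.
 */
static void common_driver_ack(struct demo_device *dev, int num)
{
	dev->hw_ops->ack_interrupt(dev, num);
}
```

A caller would set hw_ops to &host_ops or &card_ops when creating the
device, much as the parent driver passes its ops table to
mbus_register_device(), and the shared driver code stays identical on
both sides.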

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig       |  17 ++++
 drivers/misc/mic/Makefile      |   1 +
 drivers/misc/mic/bus/Makefile  |   5 +
 drivers/misc/mic/bus/mic_bus.c | 218 +++++++++++++++++++++++++++++++++++++++++
 include/linux/mic_bus.h        | 110 +++++++++++++++++++++
 5 files changed, 351 insertions(+)
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 462a5b1..ee1d2ac 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -1,3 +1,20 @@
+comment "Intel MIC Bus Driver"
+
+config INTEL_MIC_BUS
+	tristate "Intel MIC Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the MIC Bus, such as CONFIG_INTEL_MIC_HOST,
+	  CONFIG_INTEL_MIC_CARD, CONFIG_INTEL_MIC_X100_DMA etc.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 05b34d6..e9bf148 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,3 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
+obj-$(CONFIG_INTEL_MIC_BUS) += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
new file mode 100644
index 0000000..d85c7f2
--- /dev/null
+++ b/drivers/misc/mic/bus/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
diff --git a/drivers/misc/mic/bus/mic_bus.c b/drivers/misc/mic/bus/mic_bus.c
new file mode 100644
index 0000000..961ae90
--- /dev/null
+++ b/drivers/misc/mic/bus/mic_bus.c
@@ -0,0 +1,218 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ drivers/virtio/virtio.c
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/mic_bus.h>
+
+/* Unique numbering for mbus devices. */
+static DEFINE_IDA(mbus_index_ida);
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "mbus:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *mbus_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(mbus_dev);
+
+static inline int mbus_id_match(const struct mbus_device *dev,
+				const struct mbus_device_id *id)
+{
+	if (id->device != dev->id.device && id->device != MBUS_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == MBUS_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call mbus_dev_probe().
+ */
+static int mbus_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct mbus_device *dev = dev_to_mbus(dv);
+	const struct mbus_device_id *ids;
+
+	ids = drv_to_mbus(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (mbus_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int mbus_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct mbus_device *dev = dev_to_mbus(dv);
+
+	return add_uevent_var(env, "MODALIAS=mbus:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int mbus_dev_probe(struct device *d)
+{
+	int err;
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	err = drv->probe(dev);
+	if (!err)
+		if (drv->scan)
+			drv->scan(dev);
+	return err;
+}
+
+static int mbus_dev_remove(struct device *d)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type mic_bus = {
+	.name  = "mic_bus",
+	.match = mbus_dev_match,
+	.dev_groups = mbus_dev_groups,
+	.uevent = mbus_uevent,
+	.probe = mbus_dev_probe,
+	.remove = mbus_dev_remove,
+};
+
+int mbus_register_driver(struct mbus_driver *driver)
+{
+	driver->driver.bus = &mic_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(mbus_register_driver);
+
+void mbus_unregister_driver(struct mbus_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(mbus_unregister_driver);
+
+static void mbus_release_dev(struct device *d)
+{
+	struct mbus_device *mbdev = dev_to_mbus(d);
+	kfree(mbdev);
+}
+
+struct mbus_device *
+mbus_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct mbus_hw_ops *hw_ops, void __iomem *mmio_va)
+{
+	int ret;
+	struct mbus_device *mbdev;
+
+	mbdev = kzalloc(sizeof(*mbdev), GFP_KERNEL);
+	if (!mbdev)
+		return ERR_PTR(-ENOMEM);
+
+	mbdev->mmio_va = mmio_va;
+	mbdev->dev.parent = pdev;
+	mbdev->id.device = id;
+	mbdev->id.vendor = MBUS_DEV_ANY_ID;
+	mbdev->dev.archdata.dma_ops = dma_ops;
+	mbdev->dev.dma_mask = &mbdev->dev.coherent_dma_mask;
+	dma_set_mask(&mbdev->dev, DMA_BIT_MASK(64));
+	mbdev->dev.release = mbus_release_dev;
+	mbdev->hw_ops = hw_ops;
+	mbdev->dev.bus = &mic_bus;
+
+	/* Assign a unique device index and hence name. */
+	ret = ida_simple_get(&mbus_index_ida, 0, 0, GFP_KERNEL);
+	if (ret < 0)
+		goto free_mbdev;
+
+	mbdev->index = ret;
+	dev_set_name(&mbdev->dev, "mbus-dev%u", mbdev->index);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&mbdev->dev);
+	if (ret)
+		goto ida_remove;
+	return mbdev;
+ida_remove:
+	ida_simple_remove(&mbus_index_ida, mbdev->index);
+free_mbdev:
+	kfree(mbdev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(mbus_register_device);
+
+void mbus_unregister_device(struct mbus_device *mbdev)
+{
+	int index = mbdev->index; /* save for after device release */
+
+	device_unregister(&mbdev->dev);
+	ida_simple_remove(&mbus_index_ida, index);
+}
+EXPORT_SYMBOL_GPL(mbus_unregister_device);
+
+static int __init mbus_init(void)
+{
+	return bus_register(&mic_bus);
+}
+
+static void __exit mbus_exit(void)
+{
+	bus_unregister(&mic_bus);
+	ida_destroy(&mbus_index_ida);
+}
+
+core_initcall(mbus_init);
+module_exit(mbus_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC Bus driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/mic_bus.h b/include/linux/mic_bus.h
new file mode 100644
index 0000000..d5b5f76
--- /dev/null
+++ b/include/linux/mic_bus.h
@@ -0,0 +1,110 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ include/linux/virtio.h.
+ */
+#ifndef _MIC_BUS_H_
+#define _MIC_BUS_H_
+/*
+ * Everything a mbus driver needs to work with any particular mbus
+ * implementation.
+ */
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+
+struct mbus_device_id {
+	__u32 device;
+	__u32 vendor;
+};
+
+#define MBUS_DEV_DMA_HOST 2
+#define MBUS_DEV_DMA_MIC 3
+#define MBUS_DEV_ANY_ID 0xffffffff
+
+/**
+ * mbus_device - representation of a device using mbus
+ * @mmio_va: virtual address of mmio space
+ * @hw_ops: the hardware ops supported by this device.
+ * @id: the device type identification (used to match it with a driver).
+ * @dev: underlying device.
+ * be used to communicate with.
+ * @index: unique position on the mbus bus
+ */
+struct mbus_device {
+	void __iomem *mmio_va;
+	struct mbus_hw_ops *hw_ops;
+	struct mbus_device_id id;
+	struct device dev;
+	int index;
+};
+
+/**
+ * mbus_driver - operations for a mbus I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct mbus_driver {
+	struct device_driver driver;
+	const struct mbus_device_id *id_table;
+	int (*probe)(struct mbus_device *dev);
+	void (*scan)(struct mbus_device *dev);
+	void (*remove)(struct mbus_device *dev);
+};
+
+/**
+ * struct mic_irq - opaque pointer used as cookie
+ */
+struct mic_irq;
+
+/**
+ * mbus_hw_ops - Hardware operations for accessing a MIC device on the MIC bus.
+ */
+struct mbus_hw_ops {
+	struct mic_irq* (*request_threaded_irq)(struct mbus_device *mbdev,
+						irq_handler_t handler,
+						irq_handler_t thread_fn,
+						const char *name, void *data,
+						int intr_src);
+	void (*free_irq)(struct mbus_device *mbdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct mbus_device *mbdev, int num);
+};
+
+struct mbus_device *
+mbus_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct mbus_hw_ops *hw_ops, void __iomem *mmio_va);
+void mbus_unregister_device(struct mbus_device *mbdev);
+
+int mbus_register_driver(struct mbus_driver *drv);
+void mbus_unregister_driver(struct mbus_driver *drv);
+
+static inline struct mbus_device *dev_to_mbus(struct device *_dev)
+{
+	return container_of(_dev, struct mbus_device, dev);
+}
+
+static inline struct mbus_driver *drv_to_mbus(struct device_driver *drv)
+{
+	return container_of(drv, struct mbus_driver, driver);
+}
+
+#endif /* _MIC_BUS_H_ */
-- 
1.8.2.1



* [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors
@ 2014-07-11 21:04 82% Sudeep Dutt
  2014-07-11 21:04 91% ` [PATCH char-misc-next v3 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

ChangeLog:
=========
v2 => v3:
a) MIC bus device private pointer cleanup as per feedback from
   Greg Kroah-Hartman.
b) Fixed minor Makefile conflict after rebasing to v3.16-rc2

v1 => v2: https://lkml.org/lkml/2014/5/29/981
a) MIC bus driver cleanups and device release callback fix in patch 2,
   as per feedback from Greg Kroah-Hartman.

v1: Initial post @ https://lkml.org/lkml/2014/5/27/819

These patches are being sent to char-misc because there is a dependency
between the patches for dma driver and other drivers.

Description:
============
This set of patches adds support for the MIC X100 DMA driver.
The MIC PCIe card has a DMA controller with 8 channels. These channels are
shared between the host s/w and the card s/w: channels 0 to 3 are used by
the host and 4 to 7 by the card. As the DMA device doesn't show up as a
PCIe device, a virtual bus called mic bus is created and virtual DMA
devices are created on it by the host/card drivers. On the host the
channels are private and used only by the host driver to transfer data for
the virtio devices.
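
As a rough illustration of the channel split described above, here is a
minimal userspace sketch. All SKETCH_*/sketch_* names are hypothetical and
are not taken from the patches; only the 8-channel total and the 0-3/4-7
partition come from the description.

```c
#include <assert.h>

/* Hypothetical constants: 8 channels total, the card owns 4..7. */
#define SKETCH_NUM_CHAN		8
#define SKETCH_CARD_FIRST_CHAN	4

/* Compile-time sanity check on the (assumed) partition point. */
static_assert(SKETCH_CARD_FIRST_CHAN > 0 &&
	      SKETCH_CARD_FIRST_CHAN < SKETCH_NUM_CHAN,
	      "partition point must fall inside the channel range");

enum sketch_owner {
	SKETCH_OWNER_HOST,
	SKETCH_OWNER_CARD,
	SKETCH_OWNER_INVALID,
};

/* Map a DMA channel number to the side that is allowed to program it. */
static enum sketch_owner sketch_chan_owner(int chan)
{
	if (chan < 0 || chan >= SKETCH_NUM_CHAN)
		return SKETCH_OWNER_INVALID;
	return chan < SKETCH_CARD_FIRST_CHAN ? SKETCH_OWNER_HOST
					     : SKETCH_OWNER_CARD;
}
```

Each side would then register only the channels it owns on its own
instance of the mic bus.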

Here is a higher level block diagram.
                                      |
               +----------+           |             +----------+
               | Card OS  |           |             | Host OS  |
               +----------+           |             +----------+
                                      |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
            |         |         |     |      |            |         |
            |         |         |     |User  |            |         |
            |         |         |     |------|------------|---------|-------
            +-------------------+     |Kernel +--------------------------+
                      |               |       | Virtio over PCIe IOCTLs  |
                      |               |       +--------------------------+
+-----------+         |               |                   |  +-----------+
| MIC DMA   |         |               |                   |  | MIC DMA   |
| Driver    |         |               |                   |  | Driver    |
+-----------+         |               |                   |  +-----------+
      |               |               |                   |        |
+---------------+     |               |                   |  +----------------+
|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
+---------------+     |               |                   |  +----------------+
      |               |               |                   |              |
      |   +--------------+            |            +---------------+     |
      |   |Intel MIC     |            |            |Intel MIC      |     |
      +---|Card Driver   |            |            |Host Driver    |     |
          +--------------+            |            +---------------+-----+
                      |               |                   |
             +-------------------------------------------------------------+
             |                                                             |
             |                    PCIe Bus                                 |
             +-------------------------------------------------------------+

The patch series is partitioned as follows:

Patch 1: Add mic bus and dma driver documentation.
Patch 2: Add a bus driver for virtual MIC devices.
Patch 3: MIC X100 DMA Driver.
Patch 4: Add threaded irq support in host driver.
	 This is needed as the dma driver uses threaded irq.
Patch 5: Use dma to transfer data between MIC and host.
	 Authors: Siva Yerramreddy, Ashutosh Dixit
Patch 6: Add threaded irq support in mic_request_card_irq.
	 This is needed as the dma driver uses threaded irq.
Patch 7: Add dma device on mic bus.
Patch 8: Modify the mpss script to load/unload mic_x100_dma.ko.

The patches have been compiled/validated against v3.16-rc2. Tested using
dmatest module with module parameter "threads_per_chan=60". These patches
have also been scanned by Fengguang Wu's 0-day infrastructure and no
issues have been reported.

Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
review.

Siva Yerramreddy (7):
  misc: mic: Add mic bus and dma driver documentation
  dma: MIC X100 DMA Driver
  misc: mic: add threaded irq support in host driver
  misc: mic: add dma support in host driver
  misc: mic: add threaded irq support in card driver
  misc: mic: add dma support in card driver
  misc: mic: add support for loading/unloading dma driver

Sudeep Dutt (1):
  misc: mic: add a bus driver for virtual MIC devices

 Documentation/mic/mic_overview.txt |  67 ++--
 Documentation/mic/mpssd/mpss       |  14 +-
 drivers/dma/Kconfig                |  19 +
 drivers/dma/Makefile               |   1 +
 drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
 drivers/misc/mic/Kconfig           |  21 +-
 drivers/misc/mic/Makefile          |   1 +
 drivers/misc/mic/bus/Makefile      |   5 +
 drivers/misc/mic/bus/mic_bus.c     | 218 +++++++++++
 drivers/misc/mic/card/mic_device.c |  23 +-
 drivers/misc/mic/card/mic_device.h |   9 +-
 drivers/misc/mic/card/mic_virtio.c |   7 +-
 drivers/misc/mic/card/mic_x100.c   |  55 ++-
 drivers/misc/mic/host/mic_boot.c   |  83 +++-
 drivers/misc/mic/host/mic_device.h |  24 ++
 drivers/misc/mic/host/mic_intr.c   | 121 +++---
 drivers/misc/mic/host/mic_intr.h   |  27 +-
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c | 187 +++++++--
 drivers/misc/mic/host/mic_virtio.h |  21 +-
 drivers/misc/mic/host/mic_x100.c   |   8 +
 include/linux/mic_bus.h            | 110 ++++++
 23 files changed, 1935 insertions(+), 151 deletions(-)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

-- 
1.8.2.1



* [PATCH char-misc-next v3 1/8] misc: mic: Add mic bus and dma driver documentation
  2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
@ 2014-07-11 21:04 91% ` Sudeep Dutt
  2014-07-11 21:04 63% ` [PATCH char-misc-next v3 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-11 21:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Added an overview of mic bus and dma driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mic_overview.txt | 67 +++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index b419292..77c5418 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -17,35 +17,50 @@ for applications. A key benefit of our solution is that it leverages
 the standard virtio framework for network, disk and console devices,
 though in our case the virtio framework is used across a PCIe bus.
 
+MIC PCIe card has a dma controller with 8 channels. These channels are
+shared between the host s/w and the card s/w. 0 to 3 are used by host
+and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
+a virtual bus called mic bus is created and virtual dma devices are
+created on it by the host/card drivers. On host the channels are private
+and used only by the host driver to transfer data for the virtio devices.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
 the host to initiate DMA's to/from the card using the MIC DMA engine and
 the fact that the virtio block storage backend can only be on the host.
 
-                              |
-       +----------+           |             +----------+
-       | Card OS  |           |             | Host OS  |
-       +----------+           |             +----------+
-                              |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-| Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
-| Net   | |Console | |Block | | |Net      |  |Console | |Block   |
-| Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-    |         |         |     |      |            |         |
-    |         |         |     |User  |            |         |
-    |         |         |     |------|------------|---------|-------
-    +-------------------+     |Kernel +--------------------------+
-              |               |       | Virtio over PCIe IOCTLs  |
-              |               |       +--------------------------+
-      +--------------+        |                   |
-      |Intel MIC     |        |            +---------------+
-      |Card Driver   |        |            |Intel MIC      |
-      +--------------+        |            |Host Driver    |
-              |               |            +---------------+
-              |               |                   |
-     +-------------------------------------------------------------+
-     |                                                             |
-     |                    PCIe Bus                                 |
-     +-------------------------------------------------------------+
+                                      |
+               +----------+           |             +----------+
+               | Card OS  |           |             | Host OS  |
+               +----------+           |             +----------+
+                                      |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
+        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
+        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+            |         |         |     |      |            |         |
+            |         |         |     |User  |            |         |
+            |         |         |     |------|------------|---------|-------
+            +-------------------+     |Kernel +--------------------------+
+                      |               |       | Virtio over PCIe IOCTLs  |
+                      |               |       +--------------------------+
++-----------+         |               |                   |  +-----------+
+| MIC DMA   |         |               |                   |  | MIC DMA   |
+| Driver    |         |               |                   |  | Driver    |
++-----------+         |               |                   |  +-----------+
+      |               |               |                   |        |
++---------------+     |               |                   |  +----------------+
+|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
++---------------+     |               |                   |  +----------------+
+      |               |               |                   |              |
+      |   +--------------+            |            +---------------+     |
+      |   |Intel MIC     |            |            |Intel MIC      |     |
+      +---|Card Driver   |            |            |Host Driver    |     |
+          +--------------+            |            +---------------+-----+
+                      |               |                   |
+             +-------------------------------------------------------------+
+             |                                                             |
+             |                    PCIe Bus                                 |
+             +-------------------------------------------------------------+
-- 
1.8.2.1



* Re: [PATCH char-misc-next v2 2/8] misc: mic: add a bus driver for virtual MIC devices
  @ 2014-07-10 16:43 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-07-10 16:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Dan Williams, Siva Yerramreddy, Ashutosh Dixit,
	Nikhil Rao, linux-kernel, dmaengine

On Wed, 2014-07-09 at 17:42 -0700, Greg Kroah-Hartman wrote:
> On Thu, May 29, 2014 at 02:23:59PM -0700, Sudeep Dutt wrote:
> > +/**
> > + * mbus_device - representation of a device using mbus
> > + * @priv: private pointer for the driver's use.
> > + * @mmio_va: virtual address of mmio space
> > + * @hw_ops: the hardware ops supported by this device.
> > + * @id: the device type identification (used to match it with a driver).
> > + * @dev: underlying device.
> > + * be used to communicate with.
> > + * @index: unique position on the mbus bus
> > + */
> > +struct mbus_device {
> > +	void *priv;
> 
> What's wrong with the built-in "private" pointer that struct device has?
> That is why it is there...
> 

We were using both mbus_device priv and the device driver_data fields
unnecessarily. I have a patch which gets rid of the mbus_device priv
field and uses only the device driver_data field via
dev_set/get_drvdata(..). The next revision of the patch series will have
this cleaned up.
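
A minimal userspace sketch of the idea follows. The struct device here is
a mock that keeps only the driver_data pointer, and dev_set_drvdata() /
dev_get_drvdata() are re-implemented locally to mirror the kernel helpers
of the same name; the sketch_* names and the reduced mbus_device layout
are assumptions made for illustration, not the code from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of struct device: only the driver-private pointer is kept. */
struct device {
	void *driver_data;
};

/* Local mocks mirroring the kernel helpers of the same name. */
static void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

static void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/* Reduced mbus_device: no separate priv field, just the embedded device. */
struct mbus_device {
	struct device dev;
	int index;
};

static struct mbus_device *dev_to_mbus(struct device *dev)
{
	return container_of(dev, struct mbus_device, dev);
}

/* Hypothetical per-device state owned by a driver on the bus. */
struct sketch_dma_state {
	int nr_channels;
};

/* The driver stores its state on the embedded device at probe time... */
static void sketch_probe(struct mbus_device *mbdev,
			 struct sketch_dma_state *st)
{
	dev_set_drvdata(&mbdev->dev, st);
}

/* ...and reads it back later from any path that has the device. */
static int sketch_nr_channels(struct device *dev)
{
	struct sketch_dma_state *st = dev_get_drvdata(dev);

	assert(st != NULL);
	return st->nr_channels;
}
```

With this layout there is a single place where driver-private state
lives, which is the point of the review comment.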

Thanks for the review!

Sudeep Dutt

> > +	void __iomem *mmio_va;
> > +	struct mbus_hw_ops *hw_ops;
> > +	struct mbus_device_id id;
> > +	struct device dev;
> > +	int index;
> > +};
> 




* Re: [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors
  @ 2014-06-30 19:04 99%       ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-06-30 19:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Dan Williams, Siva Yerramreddy, Ashutosh Dixit,
	Nikhil Rao, linux-kernel, dmaengine

On Mon, 2014-06-30 at 09:27 -0700, Greg Kroah-Hartman wrote:
> On Mon, Jun 30, 2014 at 08:33:53AM -0700, Sudeep Dutt wrote:
> > On Mon, 2014-06-16 at 08:58 -0700, Sudeep Dutt wrote:
> > > On Thu, 2014-05-29 at 14:23 -0700, Sudeep Dutt wrote:
> > > > ChangeLog:
> > > > =========
> > > > v1 => v2:
> > > > a) MIC bus driver cleanups and device release callback fix in patch 2,
> > > >    as per feedback from Greg Kroah-Hartman.
> > > > 
> > > > v1: Initial post @ https://lkml.org/lkml/2014/5/27/819
> > > > 
> > > 
> > > Hi Greg/Dan,
> > > 
> > > We have addressed the feedback received so far. Please take a look at
> > > this patch series.
> > > 
> > 
> > Another gentle reminder to take a look at this patch series. Greg/Dan,
> > please let us know what we can do to speed things up here.
> 
> Wait for me to get back from vacation next week :)
> 

Sounds good. I do see a minor conflict in patch 3 with v3.16-rc1 which
can be fixed up with a trivial Makefile change for adding the driver to
the end of the file. Please let me know if you want me to resend the
patch series with that fixed up.

Thanks,
Sudeep Dutt



* Re: [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors
  2014-06-16 15:58 99% ` [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
@ 2014-06-30 15:33 99%   ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-06-30 15:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Dan Williams, Siva Yerramreddy, Ashutosh Dixit,
	Nikhil Rao, linux-kernel, dmaengine

On Mon, 2014-06-16 at 08:58 -0700, Sudeep Dutt wrote:
> On Thu, 2014-05-29 at 14:23 -0700, Sudeep Dutt wrote:
> > ChangeLog:
> > =========
> > v1 => v2:
> > a) MIC bus driver cleanups and device release callback fix in patch 2,
> >    as per feedback from Greg Kroah-Hartman.
> > 
> > v1: Initial post @ https://lkml.org/lkml/2014/5/27/819
> > 
> 
> Hi Greg/Dan,
> 
> We have addressed the feedback received so far. Please take a look at
> this patch series.
> 

Another gentle reminder to take a look at this patch series. Greg/Dan,
please let us know what we can do to speed things up here.

Thanks,
Sudeep Dutt

> > These patches are being sent to char-misc because there is a dependency
> > between the patches for dma driver and other drivers.
> > 
> > Description:
> > ============
> > This set of patches add support for MIC X100 dma driver.
> > MIC PCIe card has a dma controller with 8 channels. These channels are
> > shared between the host s/w and the card s/w. 0 to 3 are used by host
> > and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
> > a virtual bus called mic bus is created and virtual dma devices are
> > created on it by the host/card drivers. On host the channels are private
> > and used only by the host driver to transfer data for the virtio devices.
> > 
> > Here is a higher level block diagram.
> >                                       |
> >                +----------+           |             +----------+
> >                | Card OS  |           |             | Host OS  |
> >                +----------+           |             +----------+
> >                                       |
> >         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> >         | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
> >         | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
> >         | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
> >         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> >             |         |         |     |      |            |         |
> >             |         |         |     |User  |            |         |
> >             |         |         |     |------|------------|---------|-------
> >             +-------------------+     |Kernel +--------------------------+
> >                       |               |       | Virtio over PCIe IOCTLs  |
> >                       |               |       +--------------------------+
> > +-----------+         |               |                   |  +-----------+
> > | MIC DMA   |         |               |                   |  | MIC DMA   |
> > | Driver    |         |               |                   |  | Driver    |
> > +-----------+         |               |                   |  +-----------+
> >       |               |               |                   |        |
> > +---------------+     |               |                   |  +----------------+
> > |MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
> > +---------------+     |               |                   |  +----------------+
> >       |               |               |                   |              |
> >       |   +--------------+            |            +---------------+     |
> >       |   |Intel MIC     |            |            |Intel MIC      |     |
> >       +---|Card Driver   |            |            |Host Driver    |     |
> >           +--------------+            |            +---------------+-----+
> >                       |               |                   |
> >              +-------------------------------------------------------------+
> >              |                                                             |
> >              |                    PCIe Bus                                 |
> >              +-------------------------------------------------------------+
> > 
> > The following series of patches are partitioned as follows:
> > 
> > Patch 1: Add mic bus and dma driver documentation.
> > 	 Author: Siva Yerramreddy
> > Patch 2: Add a bus driver for virtual MIC devices.
> > 	 Authors: Siva Yerramreddy, Sudeep Dutt
> > Patch 3: MIC X100 DMA Driver.
> > 	 Author: Siva Yerramreddy
> > Patch 4: Add threaded irq support in host driver.
> > 	 This is needed as the dma driver uses threaded irq.
> > 	 Author: Siva Yerramreddy
> > Patch 5: Use dma to transfer data between MIC and host.
> > 	 Authors: Siva Yerramreddy, Ashutosh Dixit
> > Patch 6: Add threaded irq support in mic_request_card_irq.
> > 	 This is needed as the dma driver uses threaded irq.
> > 	 Author: Siva Yerramreddy
> > Patch 7: Add dma device on mic bus.
> > 	 Author: Siva Yerramreddy
> > Patch 8: Modify the mpss script to load/unload mic_x100_dma.ko.
> > 	 Author: Siva Yerramreddy
> > 
> > The patches have been compiled/validated against v3.15-rc3. Tested using
> > dmatest module with module parameter "threads_per_chan=60". These patches
> > have also been scanned by Fengguang Wu's 0-day infrastructure and no
> > issues have been reported.
> > 
> > Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
> > review.
> > 
> > Siva Yerramreddy (7):
> >   misc: mic: Add mic bus and dma driver documentation
> >   dma: MIC X100 DMA Driver
> >   misc: mic: add threaded irq support in host driver
> >   misc: mic: add dma support in host driver
> >   misc: mic: add threaded irq support in card driver
> >   misc: mic: add dma support in card driver
> >   misc: mic: add support for loading/unloading dma driver
> > 
> > Sudeep Dutt (1):
> >   misc: mic: add a bus driver for virtual MIC devices
> > 
> >  Documentation/mic/mic_overview.txt |  67 ++--
> >  Documentation/mic/mpssd/mpss       |  14 +-
> >  drivers/dma/Kconfig                |  19 +
> >  drivers/dma/Makefile               |   1 +
> >  drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
> >  drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
> >  drivers/misc/mic/Kconfig           |  21 +-
> >  drivers/misc/mic/Makefile          |   1 +
> >  drivers/misc/mic/bus/Makefile      |   5 +
> >  drivers/misc/mic/bus/mic_bus.c     | 219 +++++++++++
> >  drivers/misc/mic/card/mic_device.c |  23 +-
> >  drivers/misc/mic/card/mic_device.h |   9 +-
> >  drivers/misc/mic/card/mic_virtio.c |   7 +-
> >  drivers/misc/mic/card/mic_x100.c   |  55 ++-
> >  drivers/misc/mic/host/mic_boot.c   |  83 +++-
> >  drivers/misc/mic/host/mic_device.h |  24 ++
> >  drivers/misc/mic/host/mic_intr.c   | 121 +++---
> >  drivers/misc/mic/host/mic_intr.h   |  27 +-
> >  drivers/misc/mic/host/mic_main.c   |   5 +-
> >  drivers/misc/mic/host/mic_virtio.c | 187 +++++++--
> >  drivers/misc/mic/host/mic_virtio.h |  21 +-
> >  drivers/misc/mic/host/mic_x100.c   |   8 +
> >  include/linux/mic_bus.h            | 112 ++++++
> >  23 files changed, 1938 insertions(+), 151 deletions(-)
> >  create mode 100644 drivers/dma/mic_x100_dma.c
> >  create mode 100644 drivers/dma/mic_x100_dma.h
> >  create mode 100644 drivers/misc/mic/bus/Makefile
> >  create mode 100644 drivers/misc/mic/bus/mic_bus.c
> >  create mode 100644 include/linux/mic_bus.h
> > 
> 
> 




* Re: [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (7 preceding siblings ...)
  2014-05-29 21:24 99% ` [PATCH char-misc-next v2 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
@ 2014-06-16 15:58 99% ` Sudeep Dutt
  2014-06-30 15:33 99%   ` Sudeep Dutt
  8 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-06-16 15:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, linux-kernel,
	dmaengine, Sudeep Dutt

On Thu, 2014-05-29 at 14:23 -0700, Sudeep Dutt wrote:
> ChangeLog:
> =========
> v1 => v2:
> a) MIC bus driver cleanups and device release callback fix in patch 2,
>    as per feedback from Greg Kroah-Hartman.
> 
> v1: Initial post @ https://lkml.org/lkml/2014/5/27/819
> 

Hi Greg/Dan,

We have addressed the feedback received so far. Please take a look at
this patch series.

Thanks,
Sudeep Dutt

> These patches are being sent to char-misc because there is a dependency
> between the patches for dma driver and other drivers.
> 
> Description:
> ============
> This set of patches add support for MIC X100 dma driver.
> MIC PCIe card has a dma controller with 8 channels. These channels are
> shared between the host s/w and the card s/w. 0 to 3 are used by host
> and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
> a virtual bus called mic bus is created and virtual dma devices are
> created on it by the host/card drivers. On host the channels are private
> and used only by the host driver to transfer data for the virtio devices.
> 
> Here is a higher level block diagram.
>                                       |
>                +----------+           |             +----------+
>                | Card OS  |           |             | Host OS  |
>                +----------+           |             +----------+
>                                       |
>         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
>         | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
>         | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
>         | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
>         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
>             |         |         |     |      |            |         |
>             |         |         |     |User  |            |         |
>             |         |         |     |------|------------|---------|-------
>             +-------------------+     |Kernel +--------------------------+
>                       |               |       | Virtio over PCIe IOCTLs  |
>                       |               |       +--------------------------+
> +-----------+         |               |                   |  +-----------+
> | MIC DMA   |         |               |                   |  | MIC DMA   |
> | Driver    |         |               |                   |  | Driver    |
> +-----------+         |               |                   |  +-----------+
>       |               |               |                   |        |
> +---------------+     |               |                   |  +----------------+
> |MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
> +---------------+     |               |                   |  +----------------+
>       |               |               |                   |              |
>       |   +--------------+            |            +---------------+     |
>       |   |Intel MIC     |            |            |Intel MIC      |     |
>       +---|Card Driver   |            |            |Host Driver    |     |
>           +--------------+            |            +---------------+-----+
>                       |               |                   |
>              +-------------------------------------------------------------+
>              |                                                             |
>              |                    PCIe Bus                                 |
>              +-------------------------------------------------------------+
> 
> The following series of patches are partitioned as follows:
> 
> Patch 1: Add mic bus and dma driver documentation.
> 	 Author: Siva Yerramreddy
> Patch 2: Add a bus driver for virtual MIC devices.
> 	 Authors: Siva Yerramreddy, Sudeep Dutt
> Patch 3: MIC X100 DMA Driver.
> 	 Author: Siva Yerramreddy
> Patch 4: Add threaded irq support in host driver.
> 	 This is needed as the dma driver uses threaded irq.
> 	 Author: Siva Yerramreddy
> Patch 5: Use dma to transfer data between MIC and host.
> 	 Authors: Siva Yerramreddy, Ashutosh Dixit
> Patch 6: Add threaded irq support in mic_request_card_irq.
> 	 This is needed as the dma driver uses threaded irq.
> 	 Author: Siva Yerramreddy
> Patch 7: Add dma device on mic bus.
> 	 Author: Siva Yerramreddy
> Patch 8: Modify the mpss script to load/unload mic_x100_dma.ko.
> 	 Author: Siva Yerramreddy
> 
> The patches have been compiled/validated against v3.15-rc3. Tested using
> dmatest module with module parameter "threads_per_chan=60". These patches
> have also been scanned by Fengguang Wu's 0-day infrastructure and no
> issues have been reported.
> 
> Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
> review.
> 
> Siva Yerramreddy (7):
>   misc: mic: Add mic bus and dma driver documentation
>   dma: MIC X100 DMA Driver
>   misc: mic: add threaded irq support in host driver
>   misc: mic: add dma support in host driver
>   misc: mic: add threaded irq support in card driver
>   misc: mic: add dma support in card driver
>   misc: mic: add support for loading/unloading dma driver
> 
> Sudeep Dutt (1):
>   misc: mic: add a bus driver for virtual MIC devices
> 
>  Documentation/mic/mic_overview.txt |  67 ++--
>  Documentation/mic/mpssd/mpss       |  14 +-
>  drivers/dma/Kconfig                |  19 +
>  drivers/dma/Makefile               |   1 +
>  drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
>  drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
>  drivers/misc/mic/Kconfig           |  21 +-
>  drivers/misc/mic/Makefile          |   1 +
>  drivers/misc/mic/bus/Makefile      |   5 +
>  drivers/misc/mic/bus/mic_bus.c     | 219 +++++++++++
>  drivers/misc/mic/card/mic_device.c |  23 +-
>  drivers/misc/mic/card/mic_device.h |   9 +-
>  drivers/misc/mic/card/mic_virtio.c |   7 +-
>  drivers/misc/mic/card/mic_x100.c   |  55 ++-
>  drivers/misc/mic/host/mic_boot.c   |  83 +++-
>  drivers/misc/mic/host/mic_device.h |  24 ++
>  drivers/misc/mic/host/mic_intr.c   | 121 +++---
>  drivers/misc/mic/host/mic_intr.h   |  27 +-
>  drivers/misc/mic/host/mic_main.c   |   5 +-
>  drivers/misc/mic/host/mic_virtio.c | 187 +++++++--
>  drivers/misc/mic/host/mic_virtio.h |  21 +-
>  drivers/misc/mic/host/mic_x100.c   |   8 +
>  include/linux/mic_bus.h            | 112 ++++++
>  23 files changed, 1938 insertions(+), 151 deletions(-)
>  create mode 100644 drivers/dma/mic_x100_dma.c
>  create mode 100644 drivers/dma/mic_x100_dma.h
>  create mode 100644 drivers/misc/mic/bus/Makefile
>  create mode 100644 drivers/misc/mic/bus/mic_bus.c
>  create mode 100644 include/linux/mic_bus.h
> 




* Re: [PATCH] misc: mic: Introduce the managed version of ioremap
  @ 2014-06-02 16:30 99% ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-06-02 16:30 UTC (permalink / raw)
  To: Himangi Saraogi
  Cc: sudeep.dutt, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Nikhil Rao, linux-kernel, julia.lawall

On Mon, 2014-06-02 at 00:24 +0530, Himangi Saraogi wrote:
> This patch moves data allocated using ioremap to managed data allocated
> using devm_ioremap and cleans now unnecessary iounmaps in probe and remove
> functions. Also the unnecessary label iounmap is done away with.
> 
> Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
> ---
> I wonder if mic_card_map and mic_card_unmap can be used on the same data
> as the devm_ioremap in the probe function. If yes, can I devmify them as well,
> in which case mic_card_unmap will not be required.
> 

Hi Himangi,

Thanks for the patch. I have tested it on MIC and it works fine.
mic_card_map(..) and mic_card_unmap(..) can be called multiple times
while the driver is loaded, so it would be best to either leave those APIs
unchanged or modify them to use devm_ioremap(..) and devm_iounmap(..)
instead.
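
To show why a managed map that is called repeatedly wants a paired
managed unmap, here is a minimal userspace sketch. It assumes a
simplified model of device-managed resources in which every managed
mapping is recorded on the device and released at detach; malloc() stands
in for ioremap() and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* One recorded mapping; malloc()ed memory stands in for an MMIO mapping. */
struct sketch_devres {
	void *va;
	struct sketch_devres *next;
};

struct sketch_device {
	struct sketch_devres *res;	/* recorded mappings, newest first */
	int live;			/* number of mappings still held */
};

/* Managed map: create the mapping and record it on the device. */
static void *sketch_devm_map(struct sketch_device *dev, size_t len)
{
	struct sketch_devres *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->va = malloc(len);
	if (!r->va) {
		free(r);
		return NULL;
	}
	r->next = dev->res;
	dev->res = r;
	dev->live++;
	return r->va;
}

/* Managed unmap: release one recorded mapping before detach. */
static void sketch_devm_unmap(struct sketch_device *dev, void *va)
{
	struct sketch_devres **pp;

	for (pp = &dev->res; *pp; pp = &(*pp)->next) {
		if ((*pp)->va == va) {
			struct sketch_devres *r = *pp;

			*pp = r->next;
			free(r->va);
			free(r);
			dev->live--;
			return;
		}
	}
}

/* Detach: whatever is still recorded is released automatically. */
static void sketch_detach(struct sketch_device *dev)
{
	while (dev->res)
		sketch_devm_unmap(dev, dev->res->va);
	assert(dev->live == 0);
}
```

In this model, calling the map helper repeatedly without the matching
unmap keeps every mapping alive until detach, which is why a map/unmap
pair that can run many times while the driver is loaded should not rely
on teardown alone.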

This patch does conflict with a MIC DMA patch series under
review @ https://lkml.org/lkml/2014/5/29/981. I will send your patch to
Greg K-H for inclusion in char-misc-next once the larger patch series is
accepted so that there are no conflicts.

Thanks,
Sudeep Dutt

>  drivers/misc/mic/card/mic_x100.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
> index 2868945..5f26b17 100644
> --- a/drivers/misc/mic/card/mic_x100.c
> +++ b/drivers/misc/mic/card/mic_x100.c
> @@ -159,7 +159,8 @@ static int __init mic_probe(struct platform_device *pdev)
>  
>  	mdev->mmio.pa = MIC_X100_MMIO_BASE;
>  	mdev->mmio.len = MIC_X100_MMIO_LEN;
> -	mdev->mmio.va = ioremap(MIC_X100_MMIO_BASE, MIC_X100_MMIO_LEN);
> +	mdev->mmio.va = devm_ioremap(&pdev->dev, MIC_X100_MMIO_BASE,
> +				     MIC_X100_MMIO_LEN);
>  	if (!mdev->mmio.va) {
>  		dev_err(&pdev->dev, "Cannot remap MMIO BAR\n");
>  		rc = -EIO;
> @@ -169,22 +170,17 @@ static int __init mic_probe(struct platform_device *pdev)
>  	rc = mic_driver_init(mdrv);
>  	if (rc) {
>  		dev_err(&pdev->dev, "mic_driver_init failed rc %d\n", rc);
> -		goto iounmap;
> +		goto done;
>  	}
>  done:
>  	return rc;
> -iounmap:
> -	iounmap(mdev->mmio.va);
> -	return rc;
>  }
>  
>  static int mic_remove(struct platform_device *pdev)
>  {
>  	struct mic_driver *mdrv = &g_drv;
> -	struct mic_device *mdev = &mdrv->mdev;
>  
>  	mic_driver_uninit(mdrv);
> -	iounmap(mdev->mmio.va);
>  	return 0;
>  }
>  



^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-next v2 2/8] misc: mic: add a bus driver for virtual MIC devices
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  2014-05-29 21:23 91% ` [PATCH char-misc-next v2 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
@ 2014-05-29 21:23 63% ` Sudeep Dutt
    2014-05-29 21:24 41% ` [PATCH char-misc-next v2 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

This MIC virtual bus driver takes the responsibility of creating all
the virtual devices connected to the PCIe device on the host and the
platform device on the card. The MIC bus hardware operations provide
a way to abstract certain hardware details from the base physical devices.
Examples of devices added on the MIC virtual bus include host DMA and card DMA.
This abstraction enables using a common DMA driver on host and card.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig       |  17 ++++
 drivers/misc/mic/Makefile      |   1 +
 drivers/misc/mic/bus/Makefile  |   5 +
 drivers/misc/mic/bus/mic_bus.c | 219 +++++++++++++++++++++++++++++++++++++++++
 include/linux/mic_bus.h        | 112 +++++++++++++++++++++
 5 files changed, 354 insertions(+)
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 462a5b1..ee1d2ac 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -1,3 +1,20 @@
+comment "Intel MIC Bus Driver"
+
+config INTEL_MIC_BUS
+	tristate "Intel MIC Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the MIC Bus, such as CONFIG_INTEL_MIC_HOST,
+	  CONFIG_INTEL_MIC_CARD, CONFIG_INTEL_MIC_X100_DMA etc.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 05b34d6..e9bf148 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,3 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
+obj-$(CONFIG_INTEL_MIC_BUS) += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
new file mode 100644
index 0000000..d85c7f2
--- /dev/null
+++ b/drivers/misc/mic/bus/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
diff --git a/drivers/misc/mic/bus/mic_bus.c b/drivers/misc/mic/bus/mic_bus.c
new file mode 100644
index 0000000..36d1233
--- /dev/null
+++ b/drivers/misc/mic/bus/mic_bus.c
@@ -0,0 +1,219 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ drivers/virtio/virtio.c
+ */
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/mic_bus.h>
+
+/* Unique numbering for mbus devices. */
+static DEFINE_IDA(mbus_index_ida);
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "mbus:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *mbus_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(mbus_dev);
+
+static inline int mbus_id_match(const struct mbus_device *dev,
+				const struct mbus_device_id *id)
+{
+	if (id->device != dev->id.device && id->device != MBUS_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == MBUS_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call mbus_dev_probe().
+ */
+static int mbus_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct mbus_device *dev = dev_to_mbus(dv);
+	const struct mbus_device_id *ids;
+
+	ids = drv_to_mbus(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (mbus_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int mbus_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct mbus_device *dev = dev_to_mbus(dv);
+
+	return add_uevent_var(env, "MODALIAS=mbus:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int mbus_dev_probe(struct device *d)
+{
+	int err;
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	err = drv->probe(dev);
+	if (!err)
+		if (drv->scan)
+			drv->scan(dev);
+	return err;
+}
+
+static int mbus_dev_remove(struct device *d)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type mic_bus = {
+	.name  = "mic_bus",
+	.match = mbus_dev_match,
+	.dev_groups = mbus_dev_groups,
+	.uevent = mbus_uevent,
+	.probe = mbus_dev_probe,
+	.remove = mbus_dev_remove,
+};
+
+int mbus_register_driver(struct mbus_driver *driver)
+{
+	driver->driver.bus = &mic_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(mbus_register_driver);
+
+void mbus_unregister_driver(struct mbus_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(mbus_unregister_driver);
+
+static void mbus_release_dev(struct device *d)
+{
+	struct mbus_device *mbdev = dev_to_mbus(d);
+	kfree(mbdev);
+}
+
+struct mbus_device *
+mbus_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct mbus_hw_ops *hw_ops, void __iomem *mmio_va)
+{
+	int ret;
+	struct mbus_device *mbdev;
+
+	mbdev = kzalloc(sizeof(*mbdev), GFP_KERNEL);
+	if (!mbdev)
+		return ERR_PTR(-ENOMEM);
+
+	mbdev->mmio_va = mmio_va;
+	mbdev->dev.parent = pdev;
+	mbdev->id.device = id;
+	mbdev->id.vendor = MBUS_DEV_ANY_ID;
+	mbdev->dev.archdata.dma_ops = dma_ops;
+	mbdev->dev.dma_mask = &mbdev->dev.coherent_dma_mask;
+	dma_set_mask(&mbdev->dev, DMA_BIT_MASK(64));
+	mbdev->dev.release = mbus_release_dev;
+	mbdev->hw_ops = hw_ops;
+	dev_set_drvdata(&mbdev->dev, mbdev);
+	mbdev->dev.bus = &mic_bus;
+
+	/* Assign a unique device index and hence name. */
+	ret = ida_simple_get(&mbus_index_ida, 0, 0, GFP_KERNEL);
+	if (ret < 0)
+		goto free_mbdev;
+
+	mbdev->index = ret;
+	dev_set_name(&mbdev->dev, "mbus-dev%u", mbdev->index);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	ret = device_register(&mbdev->dev);
+	if (ret)
+		goto ida_remove;
+	return mbdev;
+ida_remove:
+	ida_simple_remove(&mbus_index_ida, mbdev->index);
+free_mbdev:
+	kfree(mbdev);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(mbus_register_device);
+
+void mbus_unregister_device(struct mbus_device *mbdev)
+{
+	int index = mbdev->index; /* save for after device release */
+
+	device_unregister(&mbdev->dev);
+	ida_simple_remove(&mbus_index_ida, index);
+}
+EXPORT_SYMBOL_GPL(mbus_unregister_device);
+
+static int __init mbus_init(void)
+{
+	return bus_register(&mic_bus);
+}
+
+static void __exit mbus_exit(void)
+{
+	bus_unregister(&mic_bus);
+	ida_destroy(&mbus_index_ida);
+}
+
+core_initcall(mbus_init);
+module_exit(mbus_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC Bus driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/mic_bus.h b/include/linux/mic_bus.h
new file mode 100644
index 0000000..74b6b82
--- /dev/null
+++ b/include/linux/mic_bus.h
@@ -0,0 +1,112 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ include/linux/virtio.h.
+ */
+#ifndef _MIC_BUS_H_
+#define _MIC_BUS_H_
+/*
+ * Everything a mbus driver needs to work with any particular mbus
+ * implementation.
+ */
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+
+struct mbus_device_id {
+	__u32 device;
+	__u32 vendor;
+};
+
+#define MBUS_DEV_DMA_HOST 2
+#define MBUS_DEV_DMA_MIC 3
+#define MBUS_DEV_ANY_ID 0xffffffff
+
+/**
+ * mbus_device - representation of a device using mbus
+ * @priv: private pointer for the driver's use.
+ * @mmio_va: virtual address of mmio space
+ * @hw_ops: the hardware ops supported by this device.
+ * @id: the device type identification (used to match it with a driver).
+ * @dev: underlying device.
+ * be used to communicate with.
+ * @index: unique position on the mbus bus
+ */
+struct mbus_device {
+	void *priv;
+	void __iomem *mmio_va;
+	struct mbus_hw_ops *hw_ops;
+	struct mbus_device_id id;
+	struct device dev;
+	int index;
+};
+
+/**
+ * mbus_driver - operations for a mbus I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct mbus_driver {
+	struct device_driver driver;
+	const struct mbus_device_id *id_table;
+	int (*probe)(struct mbus_device *dev);
+	void (*scan)(struct mbus_device *dev);
+	void (*remove)(struct mbus_device *dev);
+};
+
+/**
+ * struct mic_irq - opaque pointer used as cookie
+ */
+struct mic_irq;
+
+/**
+ * mbus_hw_ops - Hardware operations for accessing a MIC device on the MIC bus.
+ */
+struct mbus_hw_ops {
+	struct mic_irq* (*request_threaded_irq)(struct mbus_device *mbdev,
+						irq_handler_t handler,
+						irq_handler_t thread_fn,
+						const char *name, void *data,
+						int intr_src);
+	void (*free_irq)(struct mbus_device *mbdev,
+			 struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct mbus_device *mbdev, int num);
+};
+
+struct mbus_device *
+mbus_register_device(struct device *pdev, int id, struct dma_map_ops *dma_ops,
+		     struct mbus_hw_ops *hw_ops, void __iomem *mmio_va);
+void mbus_unregister_device(struct mbus_device *mbdev);
+
+int mbus_register_driver(struct mbus_driver *drv);
+void mbus_unregister_driver(struct mbus_driver *drv);
+
+static inline struct mbus_device *dev_to_mbus(struct device *_dev)
+{
+	return container_of(_dev, struct mbus_device, dev);
+}
+
+static inline struct mbus_driver *drv_to_mbus(struct device_driver *drv)
+{
+	return container_of(drv, struct mbus_driver, driver);
+}
+
+#endif /* _MIC_BUS_H_ */
-- 
1.8.2.1
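
The matching rule used by mbus_id_match() and mbus_dev_match() in the patch
above can be tried out in userspace. The sketch below copies the comparison
logic and the MBUS_DEV_* values from the patch; the function names id_match
and table_match and the table dma_host_ids are hypothetical and are not part
of the posted driver. It assumes, as mbus_register_device() does, that a
registered device carries MBUS_DEV_ANY_ID as its vendor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBUS_DEV_DMA_HOST 2
#define MBUS_DEV_DMA_MIC  3
#define MBUS_DEV_ANY_ID   0xffffffffu

struct mbus_device_id {
	uint32_t device;
	uint32_t vendor;
};

/*
 * Same rule as mbus_id_match(): a field matches when the values are equal
 * or when the driver's table entry holds the wildcard MBUS_DEV_ANY_ID.
 */
static int id_match(const struct mbus_device_id *dev,
		    const struct mbus_device_id *id)
{
	if (id->device != dev->device && id->device != MBUS_DEV_ANY_ID)
		return 0;
	return id->vendor == MBUS_DEV_ANY_ID || id->vendor == dev->vendor;
}

/*
 * Walk a table terminated by an entry whose device field is 0, as
 * mbus_dev_match() does. Returns 1 on the first match, otherwise 0.
 */
static int table_match(const struct mbus_device_id *dev,
		       const struct mbus_device_id *ids)
{
	size_t i;

	for (i = 0; ids[i].device; i++)
		if (id_match(dev, &ids[i]))
			return 1;
	return 0;
}

/* Hypothetical id table for a driver that only handles the host DMA device. */
static const struct mbus_device_id dma_host_ids[] = {
	{ MBUS_DEV_DMA_HOST, MBUS_DEV_ANY_ID },
	{ 0, 0 },
};
```

Because the table is terminated by a zero device field, 0 cannot be used as
a real device id with this scheme.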


^ permalink raw reply related	[relevance 63%]

* [PATCH char-misc-next v2 4/8] misc: mic: add threaded irq support in host driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (2 preceding siblings ...)
  2014-05-29 21:24 41% ` [PATCH char-misc-next v2 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
@ 2014-05-29 21:24 62% ` Sudeep Dutt
  2014-05-29 21:24 54% ` [PATCH char-misc-next v2 5/8] misc: mic: add dma " Sudeep Dutt
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Convert mic_request_irq to mic_request_threaded_irq to support threaded
irq for virtual devices on mic bus.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/host/mic_intr.c   | 121 ++++++++++++++++++++++---------------
 drivers/misc/mic/host/mic_intr.h   |  24 +++++---
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c |   6 +-
 4 files changed, 96 insertions(+), 60 deletions(-)

diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index dbc5afd..d686f28 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -24,28 +24,29 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 
-/*
- * mic_invoke_callback - Invoke callback functions registered for
- * the corresponding source id.
- *
- * @mdev: pointer to the mic_device instance
- * @idx: The interrupt source id.
- *
- * Returns none.
- */
-static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
+static irqreturn_t mic_thread_fn(int irq, void *dev)
 {
+	struct mic_device *mdev = dev;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
 	struct mic_intr_cb *intr_cb;
 	struct pci_dev *pdev = container_of(mdev->sdev->parent,
-		struct pci_dev, dev);
+					    struct pci_dev, dev);
+	int i;
 
-	spin_lock(&mdev->irq_info.mic_intr_lock);
-	list_for_each_entry(intr_cb, &mdev->irq_info.cb_list[idx], list)
-		if (intr_cb->func)
-			intr_cb->func(pdev->irq, intr_cb->data);
-	spin_unlock(&mdev->irq_info.mic_intr_lock);
+	spin_lock(&irq_info->mic_thread_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (test_and_clear_bit(i, &irq_info->mask)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->thread_fn)
+					intr_cb->thread_fn(pdev->irq,
+							 intr_cb->data);
+		}
+	spin_unlock(&irq_info->mic_thread_lock);
+	return IRQ_HANDLED;
 }
-
 /**
  * mic_interrupt - Generic interrupt handler for
  * MSI and INTx based interrupts.
@@ -53,7 +54,11 @@ static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
 static irqreturn_t mic_interrupt(int irq, void *dev)
 {
 	struct mic_device *mdev = dev;
-	struct mic_intr_info *info = mdev->intr_info;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
+	struct mic_intr_cb *intr_cb;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+					    struct pci_dev, dev);
 	u32 mask;
 	int i;
 
@@ -61,12 +66,19 @@ static irqreturn_t mic_interrupt(int irq, void *dev)
 	if (!mask)
 		return IRQ_NONE;
 
-	for (i = info->intr_start_idx[MIC_INTR_DB];
-			i < info->intr_len[MIC_INTR_DB]; i++)
-		if (mask & BIT(i))
-			mic_invoke_callback(mdev, i);
-
-	return IRQ_HANDLED;
+	spin_lock(&irq_info->mic_intr_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (mask & BIT(i)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->handler)
+					intr_cb->handler(pdev->irq,
+							 intr_cb->data);
+			set_bit(i, &irq_info->mask);
+		}
+	spin_unlock(&irq_info->mic_intr_lock);
+	return IRQ_WAKE_THREAD;
 }
 
 /* Return the interrupt offset from the index. Index is 0 based. */
@@ -99,14 +111,15 @@ static struct msix_entry *mic_get_available_vector(struct mic_device *mdev)
  *
  * @mdev: pointer to the mic_device instance
  * @idx: The source id to be registered.
- * @func: The function to be called when the source id receives
+ * @handler: The function to be called when the source id receives
  * the interrupt.
+ * @thread_fn: thread fn. corresponding to the handler
  * @data: Private data of the requester.
  * Return the callback structure that was registered or an
  * appropriate error on failure.
  */
 static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
-			u8 idx, irqreturn_t (*func) (int irq, void *dev),
+			u8 idx, irq_handler_t handler, irq_handler_t thread_fn,
 			void *data)
 {
 	struct mic_intr_cb *intr_cb;
@@ -117,7 +130,8 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 	if (!intr_cb)
 		return ERR_PTR(-ENOMEM);
 
-	intr_cb->func = func;
+	intr_cb->handler = handler;
+	intr_cb->thread_fn = thread_fn;
 	intr_cb->data = data;
 	intr_cb->cb_id = ida_simple_get(&mdev->irq_info.cb_ida,
 		0, 0, GFP_KERNEL);
@@ -126,9 +140,11 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 		goto ida_fail;
 	}
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
 	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	list_add_tail(&intr_cb->list, &mdev->irq_info.cb_list[idx]);
 	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 
 	return intr_cb;
 ida_fail:
@@ -152,8 +168,9 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 	unsigned long flags;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0;  i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
 			if (intr_cb->cb_id == idx) {
@@ -163,11 +180,13 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 				kfree(intr_cb);
 				spin_unlock_irqrestore(
 					&mdev->irq_info.mic_intr_lock, flags);
+				spin_unlock(&mdev->irq_info.mic_thread_lock);
 				return i;
 			}
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	return MIC_NUM_OFFSETS;
 }
 
@@ -242,6 +261,7 @@ static int mic_setup_callbacks(struct mic_device *mdev)
 		INIT_LIST_HEAD(&mdev->irq_info.cb_list[i]);
 	ida_init(&mdev->irq_info.cb_ida);
 	spin_lock_init(&mdev->irq_info.mic_intr_lock);
+	spin_lock_init(&mdev->irq_info.mic_thread_lock);
 	return 0;
 }
 
@@ -258,14 +278,12 @@ static void mic_release_callbacks(struct mic_device *mdev)
 	struct mic_intr_cb *intr_cb;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0; i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 
-		if (list_empty(&mdev->irq_info.cb_list[i])) {
-			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
-					       flags);
+		if (list_empty(&mdev->irq_info.cb_list[i]))
 			break;
-		}
 
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
@@ -274,8 +292,9 @@ static void mic_release_callbacks(struct mic_device *mdev)
 					  intr_cb->cb_id);
 			kfree(intr_cb);
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	ida_destroy(&mdev->irq_info.cb_ida);
 	kfree(mdev->irq_info.cb_list);
 }
@@ -313,7 +332,8 @@ static int mic_setup_msi(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem2;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt, 0 , "mic-msi", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  0, "mic-msi", mdev);
 	if (rc) {
 		dev_err(&pdev->dev, "Error allocating MSI interrupt\n");
 		goto err_irq_req_fail;
@@ -353,8 +373,8 @@ static int mic_setup_intx(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt,
-		IRQF_SHARED, "mic-intx", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  IRQF_SHARED, "mic-intx", mdev);
 	if (rc)
 		goto err;
 
@@ -391,13 +411,14 @@ int mic_next_db(struct mic_device *mdev)
 #define MK_COOKIE(x, y) ((x) | (y) << COOKIE_ID_SHIFT)
 
 /**
- * mic_request_irq - request an irq. mic_mutex needs
+ * mic_request_threaded_irq - request an irq. mic_mutex needs
  * to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @func: The callback function that handles the interrupt.
+ * @handler: The callback function that handles the interrupt.
  * The function needs to call ack_interrupts
  * (mdev->ops->ack_interrupt(mdev)) when handling the interrupts.
+ * @thread_fn: thread fn required by request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -412,10 +433,11 @@ int mic_next_db(struct mic_device *mdev)
  * error code.
  *
  */
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *dev),
-	const char *name, void *data, int intr_src,
-	enum mic_intr_type type)
+struct mic_irq *
+mic_request_threaded_irq(struct mic_device *mdev,
+			 irq_handler_t handler, irq_handler_t thread_fn,
+			 const char *name, void *data, int intr_src,
+			 enum mic_intr_type type)
 {
 	u16 offset;
 	int rc = 0;
@@ -444,7 +466,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 			goto err;
 		}
 
-		rc = request_irq(msix->vector, func, 0, name, data);
+		rc = request_threaded_irq(msix->vector, handler, thread_fn,
+					  0, name, data);
 		if (rc) {
 			dev_dbg(mdev->sdev->parent,
 				"request irq failed rc = %d\n", rc);
@@ -458,8 +481,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		dev_dbg(mdev->sdev->parent, "irq: %d assigned for src: %d\n",
 			msix->vector, intr_src);
 	} else {
-		intr_cb = mic_register_intr_callback(mdev,
-				offset, func, data);
+		intr_cb = mic_register_intr_callback(mdev, offset, handler,
+						     thread_fn, data);
 		if (IS_ERR(intr_cb)) {
 			dev_err(mdev->sdev->parent,
 				"No available callback entries for use\n");
@@ -487,9 +510,9 @@ err:
  *  needs to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_threaded_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_threaded_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 6091aa9..b1334dd 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -21,6 +21,8 @@
 #ifndef _MIC_INTR_H_
 #define _MIC_INTR_H_
 
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
 /*
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
@@ -68,7 +70,11 @@ struct mic_intr_info {
  * @num_vectors: The number of MSI/MSI-x vectors that have been allocated.
  * @cb_ida: callback ID allocator to track the callbacks registered.
  * @mic_intr_lock: spinlock to protect the interrupt callback list.
+ * @mic_thread_lock: spinlock to protect the thread callback list.
+ *		   This lock is used to protect against thread_fn while
+ *		   mic_intr_lock is used to protect against interrupt handler.
  * @cb_list: Array of callback lists one for each source.
+ * @mask: Mask used by the main thread fn to call the underlying thread fns.
  */
 struct mic_irq_info {
 	int next_avail_src;
@@ -77,19 +83,23 @@ struct mic_irq_info {
 	u16 num_vectors;
 	struct ida cb_ida;
 	spinlock_t mic_intr_lock;
+	spinlock_t mic_thread_lock;
 	struct list_head *cb_list;
+	unsigned long mask;
 };
 
 /**
  * struct mic_intr_cb - Interrupt callback structure.
  *
- * @func: The callback function
+ * @handler: The callback function
+ * @thread_fn: The thread_fn.
  * @data: Private data of the requester.
  * @cb_id: The callback id. Identifies this callback.
  * @list: list head pointing to the next callback structure.
  */
 struct mic_intr_cb {
-	irqreturn_t (*func) (int irq, void *data);
+	irq_handler_t handler;
+	irq_handler_t thread_fn;
 	void *data;
 	int cb_id;
 	struct list_head list;
@@ -124,11 +134,11 @@ struct mic_hw_intr_ops {
 };
 
 int mic_next_db(struct mic_device *mdev);
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int intr_src,
-	enum mic_intr_type type);
-
+struct mic_irq *
+mic_request_threaded_irq(struct mic_device *mdev,
+			 irq_handler_t handler, irq_handler_t thread_fn,
+			 const char *name, void *data, int intr_src,
+			 enum mic_intr_type type);
 void mic_free_irq(struct mic_device *mdev,
 		struct mic_irq *cookie, void *data);
 int mic_setup_interrupts(struct mic_device *mdev, struct pci_dev *pdev);
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index c04a021..fdc9c13 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -389,8 +389,9 @@ static int mic_probe(struct pci_dev *pdev,
 	mutex_lock(&mdev->mic_mutex);
 
 	mdev->shutdown_db = mic_next_db(mdev);
-	mdev->shutdown_cookie = mic_request_irq(mdev, mic_shutdown_db,
-		"shutdown-interrupt", mdev, mdev->shutdown_db, MIC_INTR_DB);
+	mdev->shutdown_cookie = mic_request_threaded_irq(mdev, mic_shutdown_db,
+					NULL, "shutdown-interrupt", mdev,
+					mdev->shutdown_db, MIC_INTR_DB);
 	if (IS_ERR(mdev->shutdown_cookie)) {
 		rc = PTR_ERR(mdev->shutdown_cookie);
 		mutex_unlock(&mdev->mic_mutex);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 7e1ef0e..aba3e83 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -594,8 +594,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
 		 mvdev->virtio_id);
 	mvdev->virtio_db = mic_next_db(mdev);
-	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
-			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
+	mvdev->virtio_cookie = mic_request_threaded_irq(mdev,
+					       mic_virtio_intr_handler,
+					       NULL, irqname, mvdev,
+					       mvdev->virtio_db, MIC_INTR_DB);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		dev_dbg(mdev->sdev->parent, "request irq failed\n");
-- 
1.8.2.1
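
The split between mic_interrupt() and mic_thread_fn() in the patch above can
be modelled in userspace: the hard handler runs each signalled source's
handler and records the source in a pending mask, and the threaded half later
clears each pending bit and runs the matching thread_fn. The sketch below is
a simplified model under stated assumptions, not the driver code: it keeps
one callback per source instead of a list, takes no locks, scans a fixed
range of NUM_SRC sources, and every name in it (irq_model, src_cb,
model_interrupt, model_thread_fn, counters and the MODEL_* return values) is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SRC 8

/* Hypothetical per-source callback pair, modelled on struct mic_intr_cb. */
struct src_cb {
	void (*handler)(void *data);   /* runs in the "hard irq" half */
	void (*thread_fn)(void *data); /* runs later in the "thread" half */
	void *data;
};

struct irq_model {
	struct src_cb cb[NUM_SRC];
	unsigned long pending;         /* plays the role of irq_info->mask */
};

/* Stand-ins for IRQ_NONE, IRQ_WAKE_THREAD and IRQ_HANDLED. */
enum model_ret { MODEL_NONE, MODEL_WAKE_THREAD, MODEL_HANDLED };

/* Hard-irq half: call each signalled source's handler and mark it pending. */
static enum model_ret model_interrupt(struct irq_model *m, uint32_t hw_mask)
{
	int i;

	if (!hw_mask)
		return MODEL_NONE;
	for (i = 0; i < NUM_SRC; i++) {
		if (!(hw_mask & (1u << i)))
			continue;
		if (m->cb[i].handler)
			m->cb[i].handler(m->cb[i].data);
		m->pending |= 1ul << i;
	}
	return MODEL_WAKE_THREAD;
}

/* Threaded half: clear each pending bit and run that source's thread_fn. */
static enum model_ret model_thread_fn(struct irq_model *m)
{
	int i;

	for (i = 0; i < NUM_SRC; i++) {
		if (!(m->pending & (1ul << i)))
			continue;
		m->pending &= ~(1ul << i);
		if (m->cb[i].thread_fn)
			m->cb[i].thread_fn(m->cb[i].data);
	}
	return MODEL_HANDLED;
}

/* Example callbacks: count how often each half ran. */
struct counters {
	int hard;
	int threaded;
};

static void count_hard(void *data)
{
	((struct counters *)data)->hard++;
}

static void count_threaded(void *data)
{
	((struct counters *)data)->threaded++;
}
```

A caller that passes NULL as thread_fn, as the shutdown and virtio users in
the patch do, still has its source marked pending in this model; the threaded
half then simply clears the bit without calling anything.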


^ permalink raw reply related	[relevance 62%]

* [PATCH char-misc-next v2 6/8] misc: mic: add threaded irq support in card driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (4 preceding siblings ...)
  2014-05-29 21:24 54% ` [PATCH char-misc-next v2 5/8] misc: mic: add dma " Sudeep Dutt
@ 2014-05-29 21:24 85% ` Sudeep Dutt
  2014-05-29 21:24 85% ` [PATCH char-misc-next v2 7/8] misc: mic: add dma " Sudeep Dutt
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Add threaded irq support in mic_request_card_irq which will be used
for virtual devices added on mic bus.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/card/mic_device.c | 23 +++++++++++++----------
 drivers/misc/mic/card/mic_device.h |  5 +++--
 drivers/misc/mic/card/mic_virtio.c |  7 ++++---
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index d0980ff..83819ee 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -83,8 +83,8 @@ static int mic_shutdown_init(void)
 	int shutdown_db;
 
 	shutdown_db = mic_next_card_db();
-	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr,
-			"Shutdown", mdrv, shutdown_db);
+	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr, NULL,
+					       "Shutdown", mdrv, shutdown_db);
 	if (IS_ERR(shutdown_cookie))
 		rc = PTR_ERR(shutdown_cookie);
 	else
@@ -136,7 +136,8 @@ static void mic_dp_uninit(void)
 /**
  * mic_request_card_irq - request an irq.
  *
- * @func: The callback function that handles the interrupt.
+ * @handler: interrupt handler passed to request_threaded_irq.
+ * @thread_fn: thread fn. passed to request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -149,17 +150,19 @@ static void mic_dp_uninit(void)
  * error code.
  *
  */
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int index)
+struct mic_irq *
+mic_request_card_irq(irq_handler_t handler,
+		     irq_handler_t thread_fn, const char *name,
+		     void *data, int index)
 {
 	int rc = 0;
 	unsigned long cookie;
 	struct mic_driver *mdrv = g_drv;
 
-	rc  = request_irq(mic_db_to_irq(mdrv, index), func,
-		0, name, data);
+	rc  = request_threaded_irq(mic_db_to_irq(mdrv, index), handler,
+				   thread_fn, 0, name, data);
 	if (rc) {
-		dev_err(mdrv->dev, "request_irq failed rc = %d\n", rc);
+		dev_err(mdrv->dev, "request_threaded_irq failed rc = %d\n", rc);
 		goto err;
 	}
 	mdrv->irq_info.irq_usage_count[index]++;
@@ -172,9 +175,9 @@ err:
 /**
  * mic_free_card_irq - free irq.
  *
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_card_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_card_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 306f502..e12a0c2 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -30,6 +30,7 @@
 #include <linux/workqueue.h>
 #include <linux/io.h>
 #include <linux/irqreturn.h>
+#include <linux/interrupt.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -116,8 +117,8 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 int mic_driver_init(struct mic_driver *mdrv);
 void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int intr_src);
+struct mic_irq *mic_request_card_irq(irq_handler_t handler,
+	irq_handler_t thread_fn, const char *name, void *data, int intr_src);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 653799b..f14b600 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -417,7 +417,7 @@ static int mic_add_device(struct mic_device_desc __iomem *d,
 
 	virtio_db = mic_next_card_db();
 	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
-			"virtio intr", mvdev, virtio_db);
+			NULL, "virtio intr", mvdev, virtio_db);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		goto kfree;
@@ -606,8 +606,9 @@ int mic_devices_init(struct mic_driver *mdrv)
 	mic_scan_devices(mdrv, !REMOVE_DEVICES);
 
 	config_db = mic_next_card_db();
-	virtio_config_cookie = mic_request_card_irq(mic_extint_handler,
-			"virtio_config_intr", mdrv, config_db);
+	virtio_config_cookie = mic_request_card_irq(mic_extint_handler, NULL,
+						    "virtio_config_intr", mdrv,
+						    config_db);
 	if (IS_ERR(virtio_config_cookie)) {
 		rc = PTR_ERR(virtio_config_cookie);
 		goto exit;
-- 
1.8.2.1



* [PATCH char-misc-next v2 5/8] misc: mic: add dma support in host driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (3 preceding siblings ...)
  2014-05-29 21:24 62% ` [PATCH char-misc-next v2 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
@ 2014-05-29 21:24 54% ` Sudeep Dutt
  2014-05-29 21:24 85% ` [PATCH char-misc-next v2 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch adds a DMA device on the MIC virtual bus and uses its
dmaengine channel to transfer data for virtio devices.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/host/mic_boot.c   |  83 ++++++++++++++++-
 drivers/misc/mic/host/mic_device.h |  24 +++++
 drivers/misc/mic/host/mic_intr.h   |   3 +-
 drivers/misc/mic/host/mic_virtio.c | 181 +++++++++++++++++++++++++++++--------
 drivers/misc/mic/host/mic_virtio.h |  21 ++++-
 drivers/misc/mic/host/mic_x100.c   |   8 ++
 7 files changed, 281 insertions(+), 41 deletions(-)
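
Reader's note, not part of the commit: a minimal user-space sketch of the
alignment arithmetic that mic_virtio_copy_to_user() applies to the first
chunk of a card-to-host transfer. The struct, the helper names and
BOUNCE_BUF_SIZE are hypothetical; only the arithmetic follows the patch,
and copy_align is assumed to be a log2 value as in dmaengine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOUNCE_BUF_SIZE (64u * 1024u)	/* mirrors MIC_INT_DMA_BUF_SIZE */

/* Round x down or up to a power-of-two alignment a. */
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }
static uint64_t align_up(uint64_t x, uint64_t a) { return (x + a - 1) & ~(a - 1); }

struct chunk {
	uint64_t dma_src;	/* card address rounded down to the alignment */
	size_t dma_len;		/* DMA length rounded up to the alignment */
	size_t skip;		/* leading bounce-buffer bytes to discard */
	size_t user_len;	/* bytes actually handed to user space */
};

/* Plan the first chunk of a transfer of len bytes starting at daddr. */
static struct chunk plan_first_chunk(uint64_t daddr, size_t len,
				     unsigned int copy_align)
{
	uint64_t a = 1ull << copy_align;
	struct chunk c;
	size_t part;

	assert(copy_align < 16);	/* keeps a well below the buffer size */
	c.skip = (size_t)(daddr - align_down(daddr, a));
	c.dma_src = daddr - c.skip;
	len += c.skip;			/* the DMA also fetches the skipped bytes */
	part = len < BOUNCE_BUF_SIZE ? len : BOUNCE_BUF_SIZE;
	c.dma_len = (size_t)align_up(part, a);
	c.user_len = part - c.skip;
	return c;
}
```

For example, with 64-byte alignment a 100-byte read at card address
0x1010 becomes a 128-byte DMA from 0x1000, of which the first 16 bytes
are skipped before the copy to user space.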

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index ee1d2ac..bf76313 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -19,7 +19,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index b75c6b5..e820d6f 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -23,11 +23,70 @@
 #include <linux/pci.h>
 
 #include <linux/mic_common.h>
+#include <linux/mic_bus.h>
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static dma_addr_t
+mic_dma_map_page(struct device *dev, struct page *page,
+		 unsigned long offset, size_t size, enum dma_data_direction dir,
+		 struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct mic_device *mdev = mbdev_to_mdev(dev_get_drvdata(dev));
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void
+mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+		   size_t size, enum dma_data_direction dir,
+		   struct dma_attrs *attrs)
+{
+	struct mic_device *mdev = mbdev_to_mdev(dev_get_drvdata(dev));
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static struct dma_map_ops mic_dma_ops = {
+	.map_page = mic_dma_map_page,
+	.unmap_page = mic_dma_unmap_page,
+};
+
+static struct mic_irq *
+_mic_request_threaded_irq(struct mbus_device *mbdev,
+			  irq_handler_t handler, irq_handler_t thread_fn,
+			  const char *name, void *data, int intr_src)
+{
+	return mic_request_threaded_irq(mbdev_to_mdev(mbdev), handler,
+					thread_fn, name, data,
+					intr_src, MIC_INTR_DMA);
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+			  struct mic_irq *cookie, void *data)
+{
+	return mic_free_irq(mbdev_to_mdev(mbdev), cookie, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	struct mic_device *mdev = mbdev_to_mdev(mbdev);
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 /**
  * mic_reset - Reset the MIC device.
  * @mdev: pointer to mic_device instance
@@ -95,9 +154,21 @@ retry:
 		 */
 		goto retry;
 	}
+	mdev->dma_mbdev = mbus_register_device(mdev->sdev->parent,
+					       MBUS_DEV_DMA_HOST, &mic_dma_ops,
+					       &mbus_hw_ops, mdev->mmio.va);
+	if (IS_ERR(mdev->dma_mbdev)) {
+		rc = PTR_ERR(mdev->dma_mbdev);
+		goto unlock_ret;
+	}
+	mdev->dma_ch = mic_request_dma_chan(mdev);
+	if (!mdev->dma_ch) {
+		rc = -ENXIO;
+		goto dma_remove;
+	}
 	rc = mdev->ops->load_mic_fw(mdev, buf);
 	if (rc)
-		goto unlock_ret;
+		goto dma_release;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -105,6 +176,11 @@ retry:
 	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
+	goto unlock_ret;
+dma_release:
+	dma_release_channel(mdev->dma_ch);
+dma_remove:
+	mbus_unregister_device(mdev->dma_mbdev);
 unlock_ret:
 	mutex_unlock(&mdev->mic_mutex);
 	return rc;
@@ -122,6 +198,11 @@ void mic_stop(struct mic_device *mdev, bool force)
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
 		mic_virtio_reset_devices(mdev);
+		if (mdev->dma_ch) {
+			dma_release_channel(mdev->dma_ch);
+			mdev->dma_ch = NULL;
+		}
+		mbus_unregister_device(mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
 		if (MIC_RESET_FAILED == mdev->state)
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 0398c69..016bd15 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -25,6 +25,8 @@
 #include <linux/idr.h>
 #include <linux/notifier.h>
 #include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+#include <linux/mic_bus.h>
 
 #include "mic_intr.h"
 
@@ -87,6 +89,8 @@ enum mic_stepping {
  * @cdev: Character device for MIC.
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
+ * @dma_mbdev: MIC BUS DMA device.
+ * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -124,6 +128,8 @@ struct mic_device {
 	struct cdev cdev;
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
+	struct mbus_device *dma_mbdev;
+	struct dma_chan *dma_ch;
 };
 
 /**
@@ -144,6 +150,7 @@ struct mic_device {
  * @load_mic_fw: Load firmware segments required to boot the card
  * into card memory. This includes the kernel, command line, ramdisk etc.
  * @get_postcode: Get post code status from firmware.
+ * @dma_filter: DMA filter function to be used.
  */
 struct mic_hw_ops {
 	u8 aper_bar;
@@ -159,6 +166,7 @@ struct mic_hw_ops {
 	void (*send_firmware_intr)(struct mic_device *mdev);
 	int (*load_mic_fw)(struct mic_device *mdev, const char *buf);
 	u32 (*get_postcode)(struct mic_device *mdev);
+	bool (*dma_filter)(struct dma_chan *chan, void *param);
 };
 
 /**
@@ -187,6 +195,22 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 	iowrite32(val, mw->va + offset);
 }
 
+static inline struct dma_chan *mic_request_dma_chan(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+	chan = dma_request_channel(mask, mdev->ops->dma_filter,
+				   mdev->sdev->parent);
+	if (chan)
+		return chan;
+	dev_err(mdev->sdev->parent, "%s %d unable to acquire channel\n",
+		__func__, __LINE__);
+	return NULL;
+}
+
 void mic_sysfs_init(struct mic_device *mdev);
 int mic_start(struct mic_device *mdev, const char *buf);
 void mic_stop(struct mic_device *mdev, bool force);
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index b1334dd..9f783d4a 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -27,8 +27,9 @@
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
+ * 4 for host owned DMA channels.
  */
-#define MIC_MIN_MSIX 4
+#define MIC_MIN_MSIX 8
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index aba3e83..a020e4e 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -21,60 +21,157 @@
 #include <linux/pci.h>
 #include <linux/sched.h>
 #include <linux/uaccess.h>
-
+#include <linux/dmaengine.h>
 #include <linux/mic_common.h>
+
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
 /*
- * Initiates the copies across the PCIe bus from card memory to
- * a user space buffer.
+ * Size of the internal buffer used during DMA's as an intermediate buffer
+ * for copy to/from user.
  */
-static int mic_virtio_copy_to_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+#define MIC_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
+
+static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
+			dma_addr_t src, size_t len)
 {
-	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
-	/*
-	 * We are copying from IO below an should ideally use something
-	 * like copy_to_user_fromio(..) if it existed.
-	 */
-	if (copy_to_user(ubuf, (void __force *)dbuf, len)) {
-		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+	int err = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *mic_ch = mdev->dma_ch;
+
+	if (!mic_ch) {
+		err = -EBUSY;
+		goto error;
+	}
+
+	tx = mic_ch->device->device_prep_dma_memcpy(mic_ch, dst, src, len,
+						    DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	} else {
+		dma_cookie_t cookie = tx->tx_submit(tx);
+
+		err = dma_submit_error(cookie);
+		if (err)
+			goto error;
+		err = dma_sync_wait(mic_ch, cookie);
+	}
+error:
+	if (err)
+		dev_err(mdev->sdev->parent, "%s %d err %d\n",
 			__func__, __LINE__, err);
-		goto err;
+	return err;
+}
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to a user
+ * space buffer. When transfers are done using DMA, source/destination
+ * addresses and transfer length must follow the alignment requirements of
+ * the MIC DMA engine.
+ */
+static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
+				   size_t len, u64 daddr, size_t dlen,
+				   int vr_idx)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_offset;
+	size_t partlen;
+	int err;
+
+	dma_offset = daddr - round_down(daddr, dma_alignment);
+	daddr -= dma_offset;
+	len += dma_offset;
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		err = mic_sync_dma(mdev, mvr->buf_da, daddr,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+
+		if (copy_to_user(ubuf, mvr->buf + dma_offset,
+				 partlen - dma_offset)) {
+			err = -EFAULT;
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->in_bytes_dma += partlen;
+		mvdev->in_bytes += partlen;
+		len -= partlen;
+		dma_offset = 0;
 	}
-	mvdev->in_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
 /*
- * Initiates copies across the PCIe bus from a user space
- * buffer to card memory.
+ * Initiates copies across the PCIe bus from a user space buffer to card
+ * memory. When transfers are done using DMA, source/destination addresses
+ * and transfer length must follow the alignment requirements of the MIC
+ * DMA engine.
  */
-static int mic_virtio_copy_from_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
+				     size_t len, u64 daddr, size_t dlen,
+				     int vr_idx)
 {
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t partlen;
 	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
+
+	if (daddr & (dma_alignment - 1)) {
+		mvdev->tx_dst_unaligned += len;
+		goto memcpy;
+	} else if (ALIGN(len, dma_alignment) > dlen) {
+		mvdev->tx_len_unaligned += len;
+		goto memcpy;
+	}
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		if (copy_from_user(mvr->buf, ubuf, partlen)) {
+			err = -EFAULT;
+			goto err;
+		}
+		err = mic_sync_dma(mdev, daddr, mvr->buf_da,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->out_bytes_dma += partlen;
+		mvdev->out_bytes += partlen;
+		len -= partlen;
+	}
+memcpy:
 	/*
 	 * We are copying to IO below and should ideally use something
 	 * like copy_from_user_toio(..) if it existed.
 	 */
 	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
 		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
 		goto err;
 	}
 	mvdev->out_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
@@ -110,7 +207,8 @@ static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
  * way to override the VRINGH xfer(..) routines as of v3.10.
  */
 static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
-	void __user *ubuf, size_t len, bool read, size_t *out_len)
+			void __user *ubuf, size_t len, bool read, int vr_idx,
+			size_t *out_len)
 {
 	int ret = 0;
 	size_t partlen, tot_len = 0;
@@ -118,13 +216,15 @@ static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
 	while (len && iov->i < iov->used) {
 		partlen = min(iov->iov[iov->i].iov_len, len);
 		if (read)
-			ret = mic_virtio_copy_to_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_to_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		else
-			ret = mic_virtio_copy_from_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_from_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -192,8 +292,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			ubuf = iov.iov_base;
 		}
 		/* Issue all the read descriptors first */
-		ret = mic_vringh_copy(mvdev, riov, ubuf, len,
-			MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, riov, ubuf, len, MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -203,8 +303,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 		ubuf += out_len;
 		copy->out_len += out_len;
 		/* Issue the write descriptors next */
-		ret = mic_vringh_copy(mvdev, wiov, ubuf, len,
-			!MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, wiov, ubuf, len, !MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -589,6 +689,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		dev_dbg(mdev->sdev->parent,
 			"%s %d index %d va %p info %p vr_size 0x%x\n",
 			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+		mvr->buf = (void *)__get_free_pages(GFP_KERNEL,
+					get_order(MIC_INT_DMA_BUF_SIZE));
+		mvr->buf_da = mic_map_single(mvdev->mdev, mvr->buf,
+					  MIC_INT_DMA_BUF_SIZE);
 	}
 
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
@@ -673,6 +777,11 @@ skip_hot_remove:
 	vqconfig = mic_vq_config(mvdev->dd);
 	for (i = 0; i < mvdev->dd->num_vq; i++) {
 		struct mic_vringh *mvr = &mvdev->mvr[i];
+
+		mic_unmap_single(mvdev->mdev, mvr->buf_da,
+				 MIC_INT_DMA_BUF_SIZE);
+		free_pages((unsigned long)mvr->buf,
+			   get_order(MIC_INT_DMA_BUF_SIZE));
 		vringh_kiov_cleanup(&mvr->riov);
 		vringh_kiov_cleanup(&mvr->wiov);
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
index 184f3c8..d574efb 100644
--- a/drivers/misc/mic/host/mic_virtio.h
+++ b/drivers/misc/mic/host/mic_virtio.h
@@ -46,18 +46,23 @@
  * @vrh: The host VRINGH used for accessing the card vrings.
  * @riov: The VRINGH read kernel IOV.
  * @wiov: The VRINGH write kernel IOV.
- * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @buf: Temporary kernel buffer used to copy in/out data
+ * from/to the card via DMA.
+ * @buf_da: dma address of buf.
  * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  */
 struct mic_vringh {
 	struct mic_vring vring;
 	struct vringh vrh;
 	struct vringh_kiov riov;
 	struct vringh_kiov wiov;
-	u16 head;
 	struct mutex vr_mutex;
+	void *buf;
+	dma_addr_t buf_da;
 	struct mic_vdev *mvdev;
+	u16 head;
 };
 
 /**
@@ -69,6 +74,14 @@ struct mic_vringh {
  * @poll_wake - Used for waking up threads blocked in poll.
  * @out_bytes - Debug stats for number of bytes copied from host to card.
  * @in_bytes - Debug stats for number of bytes copied from card to host.
+ * @out_bytes_dma - Debug stats for number of bytes copied from host to card
+ * using DMA.
+ * @in_bytes_dma - Debug stats for number of bytes copied from card to host
+ * using DMA.
+ * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
+ * the transfer length did not have the required DMA alignment.
+ * @tx_dst_unaligned - Debug stats for number of bytes copied where the
+ * destination address on the card did not have the required DMA alignment.
  * @mvr - Store per VRING data structures.
  * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
  * @dd - Virtio device descriptor.
@@ -84,6 +97,10 @@ struct mic_vdev {
 	int poll_wake;
 	unsigned long out_bytes;
 	unsigned long in_bytes;
+	unsigned long out_bytes_dma;
+	unsigned long in_bytes_dma;
+	unsigned long tx_len_unaligned;
+	unsigned long tx_dst_unaligned;
 	struct mic_vringh mvr[MIC_MAX_VRINGS];
 	struct work_struct virtio_bh_work;
 	struct mic_device_desc *dd;
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index 5562fdd..b7a21e1 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -549,6 +549,13 @@ struct mic_smpt_ops mic_x100_smpt_ops = {
 	.set = mic_x100_smpt_set,
 };
 
+static bool mic_x100_dma_filter(struct dma_chan *chan, void *param)
+{
+	if (chan->device->dev->parent == (struct device *)param)
+		return true;
+	return false;
+}
+
 struct mic_hw_ops mic_x100_ops = {
 	.aper_bar = MIC_X100_APER_BAR,
 	.mmio_bar = MIC_X100_MMIO_BAR,
@@ -563,6 +570,7 @@ struct mic_hw_ops mic_x100_ops = {
 	.send_firmware_intr = mic_x100_send_firmware_intr,
 	.load_mic_fw = mic_x100_load_firmware,
 	.get_postcode = mic_x100_get_postcode,
+	.dma_filter = mic_x100_dma_filter,
 };
 
 struct mic_hw_intr_ops mic_x100_intr_ops = {
-- 
1.8.2.1



* [PATCH char-misc-next v2 7/8] misc: mic: add dma support in card driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (5 preceding siblings ...)
  2014-05-29 21:24 85% ` [PATCH char-misc-next v2 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
@ 2014-05-29 21:24 85% ` Sudeep Dutt
  2014-05-29 21:24 99% ` [PATCH char-misc-next v2 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
  2014-06-16 15:58 99% ` [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch adds a DMA device on the MIC virtual bus in the card driver.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/misc/mic/Kconfig           |  2 +-
 drivers/misc/mic/card/mic_device.h |  8 ++++--
 drivers/misc/mic/card/mic_x100.c   | 55 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 61 insertions(+), 4 deletions(-)
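
Reader's note, not part of the commit: a minimal user-space sketch of the
cookie convention used by the card-side _mic_request_threaded_irq() and
_mic_free_irq(), where the IRQ number itself is carried in an opaque
pointer instead of allocating a structure. The toy_* names are
hypothetical, and uintptr_t stands in for the unsigned long used in the
patch.

```c
#include <assert.h>
#include <stdint.h>

struct toy_irq;	/* opaque cookie type; never dereferenced */

/* Encode an IRQ number as a cookie, as the request path does. */
static struct toy_irq *irq_to_cookie(unsigned int irq)
{
	return (struct toy_irq *)(uintptr_t)irq;
}

/* Decode the cookie back to an IRQ number, as the free path does. */
static unsigned int cookie_to_irq(struct toy_irq *cookie)
{
	return (unsigned int)(uintptr_t)cookie;
}

/* The encoding is lossless for any value that fits in unsigned int. */
static void check_roundtrip(unsigned int irq)
{
	assert(cookie_to_irq(irq_to_cookie(irq)) == irq);
}
```

One consequence of this encoding is that IRQ 0 would produce a NULL
cookie, so callers should test the result for an error pointer rather
than for NULL.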

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bf76313..cc4eef0 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -39,7 +39,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86
+	depends on 64BIT && X86 && INTEL_MIC_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index e12a0c2..844be8f 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -31,6 +31,7 @@
 #include <linux/io.h>
 #include <linux/irqreturn.h>
 #include <linux/interrupt.h>
+#include <linux/mic_bus.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -71,6 +72,7 @@ struct mic_device {
  * @hotplug_work: Hot plug work for adding/removing virtio devices.
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
+ * @dma_mbdev: dma device on the MIC virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -81,6 +83,7 @@ struct mic_driver {
 	struct work_struct hotplug_work;
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
+	struct mbus_device *dma_mbdev;
 };
 
 /**
@@ -117,8 +120,9 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 int mic_driver_init(struct mic_driver *mdrv);
 void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
-struct mic_irq *mic_request_card_irq(irq_handler_t handler,
-	irq_handler_t thread_fn, const char *name, void *data, int intr_src);
+struct mic_irq *
+mic_request_card_irq(irq_handler_t handler, irq_handler_t thread_fn,
+		     const char *name, void *data, int intr_src);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 2868945..55c9465 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -148,6 +148,47 @@ void mic_card_unmap(struct mic_device *mdev, void __iomem *addr)
 	iounmap(addr);
 }
 
+static inline struct mic_driver *mbdev_to_mdrv(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static struct mic_irq *
+_mic_request_threaded_irq(struct mbus_device *mbdev,
+			  irq_handler_t handler, irq_handler_t thread_fn,
+			  const char *name, void *data, int intr_src)
+{
+	int rc = 0;
+	unsigned int irq = intr_src;
+	unsigned long cookie = irq;
+
+	rc  = request_threaded_irq(irq, handler, thread_fn, 0, name, data);
+	if (rc) {
+		dev_err(mbdev_to_mdrv(mbdev)->dev,
+			"request_threaded_irq failed rc = %d\n", rc);
+		return ERR_PTR(rc);
+	}
+	return (struct mic_irq *)cookie;
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+			  struct mic_irq *cookie, void *data)
+{
+	unsigned long irq = (unsigned long)cookie;
+	free_irq(irq, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	mic_ack_interrupt(&mbdev_to_mdrv(mbdev)->mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 static int __init mic_probe(struct platform_device *pdev)
 {
 	struct mic_driver *mdrv = &g_drv;
@@ -166,13 +207,24 @@ static int __init mic_probe(struct platform_device *pdev)
 		goto done;
 	}
 	mic_hw_intr_init(mdrv);
+	platform_set_drvdata(pdev, mdrv);
+	mdrv->dma_mbdev = mbus_register_device(mdrv->dev, MBUS_DEV_DMA_MIC,
+					       NULL, &mbus_hw_ops,
+					       mdrv->mdev.mmio.va);
+	if (IS_ERR(mdrv->dma_mbdev)) {
+		rc = PTR_ERR(mdrv->dma_mbdev);
+		dev_err(&pdev->dev, "mbus_register_device failed rc %d\n", rc);
+		goto iounmap;
+	}
 	rc = mic_driver_init(mdrv);
 	if (rc) {
 		dev_err(&pdev->dev, "mic_driver_init failed rc %d\n", rc);
-		goto iounmap;
+		goto remove_dma;
 	}
 done:
 	return rc;
+remove_dma:
+	mbus_unregister_device(mdrv->dma_mbdev);
 iounmap:
 	iounmap(mdev->mmio.va);
 	return rc;
@@ -184,6 +236,7 @@ static int mic_remove(struct platform_device *pdev)
 	struct mic_device *mdev = &mdrv->mdev;
 
 	mic_driver_uninit(mdrv);
+	mbus_unregister_device(mdrv->dma_mbdev);
 	iounmap(mdev->mmio.va);
 	return 0;
 }
-- 
1.8.2.1



* [PATCH char-misc-next v2 3/8] dma: MIC X100 DMA Driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  2014-05-29 21:23 91% ` [PATCH char-misc-next v2 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
  2014-05-29 21:23 63% ` [PATCH char-misc-next v2 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
@ 2014-05-29 21:24 41% ` Sudeep Dutt
  2014-05-29 21:24 62% ` [PATCH char-misc-next v2 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch implements the DMA Engine API for the DMA controller on MIC
X100 Coprocessors. The DMA hardware is shared between host and card
software: channels 0 to 3 are used by the host and channels 4 to 7 by
the card. Since the DMA device does not show up as a PCIe device, a
virtual bus called the MIC bus is created, and virtual devices are added
on that bus to follow the device model. The allowed DMA transfer
directions are host to card, card to host, and card to card.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/dma/Kconfig        |  19 ++
 drivers/dma/Makefile       |   1 +
 drivers/dma/mic_x100_dma.c | 774 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h | 286 +++++++++++++++++
 4 files changed, 1080 insertions(+)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h
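
Reader's note, not part of the commit: a minimal user-space sketch of the
ring-index helpers and the memcpy descriptor packing at the top of
mic_x100_dma.c. RING_SIZE and ALIGN_SHIFT are assumed illustration
values standing in for MIC_DMA_DESC_RX_SIZE and MIC_DMA_ALIGN_SHIFT,
which are defined in mic_x100_dma.h; the alignment assert is also an
assumption of this sketch. The two field shifts and the descriptor type
are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE		8u	/* assumed; see MIC_DMA_DESC_RX_SIZE */
#define ALIGN_SHIFT		6	/* assumed 64-byte length units */
#define DESC_TYPE_SHIFT		60	/* MIC_DMA_DESC_TYPE_SHIFT */
#define MEMCPY_LEN_SHIFT	46	/* MIC_DMA_MEMCPY_LEN_SHIFT */
#define DESC_MEMCPY		1ull	/* MIC_DMA_MEMCPY */

struct desc {
	uint64_t qw0;	/* source address plus length in aligned units */
	uint64_t qw1;	/* descriptor type plus destination address */
};

/* Advance a ring index, wrapping to 0 at the end of the ring. */
static uint32_t ring_inc(uint32_t val)
{
	return (val + 1) % RING_SIZE;
}

/* Step a ring index back, wrapping to the last slot from 0. */
static uint32_t ring_dec(uint32_t val)
{
	return val ? val - 1 : RING_SIZE - 1;
}

/* Pack a memcpy descriptor the way mic_dma_memcpy_desc() does. */
static struct desc memcpy_desc(uint64_t src, uint64_t dst, uint64_t size)
{
	struct desc d;

	/* Assumption: size is a whole number of aligned units. */
	assert((size & ((1ull << ALIGN_SHIFT) - 1)) == 0);
	d.qw0 = src | ((size >> ALIGN_SHIFT) << MEMCPY_LEN_SHIFT);
	d.qw1 = (DESC_MEMCPY << DESC_TYPE_SHIFT) | dst;
	return d;
}
```

With these values a 128-byte copy is encoded as a length of 2 units in
the bits of qw0 above the source address, and the type field occupies
the top four bits of qw1.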

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 5c58638..39b66a8 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -33,6 +33,25 @@ if DMADEVICES
 
 comment "DMA Devices"
 
+config INTEL_MIC_X100_DMA
+	tristate "Intel MIC X100 DMA Driver"
+	depends on 64BIT && X86 && INTEL_MIC_BUS
+	select DMAENGINE
+	default N
+	help
+	  This enables DMA support for the Intel Many Integrated Core
+	  (MIC) family of PCIe form factor coprocessor X100 devices that
+	  run a 64 bit Linux OS. This driver will be used by both MIC
+	  host and card drivers.
+
+	  If you are building host kernel with a MIC device or a card
+	  kernel for a MIC device, then say M (recommended) or Y, else
+	  say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 config INTEL_MID_DMAC
 	tristate "Intel MID DMA support for Peripheral DMA controllers"
 	depends on PCI && X86
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 5150c82..c933022 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -46,3 +46,4 @@ obj-$(CONFIG_K3_DMA) += k3dma.o
 obj-$(CONFIG_MOXART_DMA) += moxart-dma.o
 obj-$(CONFIG_FSL_EDMA) += fsl-edma.o
 obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o
+obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
diff --git a/drivers/dma/mic_x100_dma.c b/drivers/dma/mic_x100_dma.c
new file mode 100644
index 0000000..46ff856
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.c
@@ -0,0 +1,774 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#include <linux/module.h>
+#include <linux/io.h>
+#include <linux/seq_file.h>
+
+#include "mic_x100_dma.h"
+
+#define MIC_DMA_MAX_XFER_SIZE_CARD  (1 * 1024 * 1024 -\
+				       MIC_DMA_ALIGN_BYTES)
+#define MIC_DMA_MAX_XFER_SIZE_HOST  (1 * 1024 * 1024 >> 1)
+#define MIC_DMA_DESC_TYPE_SHIFT	60
+#define MIC_DMA_MEMCPY_LEN_SHIFT 46
+#define MIC_DMA_STAT_INTR_SHIFT 59
+
+/* high-water mark for pushing dma descriptors */
+static int mic_dma_pending_level = 4;
+
+/* Status descriptor is used to write a 64 bit value to a memory location */
+enum mic_dma_desc_format_type {
+	MIC_DMA_MEMCPY = 1,
+	MIC_DMA_STATUS,
+};
+
+static inline u32 mic_dma_hw_ring_inc(u32 val)
+{
+	return (val + 1) % MIC_DMA_DESC_RX_SIZE;
+}
+
+static inline u32 mic_dma_hw_ring_dec(u32 val)
+{
+	return val ? val - 1 : MIC_DMA_DESC_RX_SIZE - 1;
+}
+
+static inline void mic_dma_hw_ring_inc_head(struct mic_dma_chan *ch)
+{
+	ch->head = mic_dma_hw_ring_inc(ch->head);
+}
+
+/* Prepare a memcpy desc */
+static inline void mic_dma_memcpy_desc(struct mic_dma_desc *desc,
+	dma_addr_t src_phys, dma_addr_t dst_phys, u64 size)
+{
+	u64 qw0, qw1;
+
+	qw0 = src_phys;
+	qw0 |= (size >> MIC_DMA_ALIGN_SHIFT) << MIC_DMA_MEMCPY_LEN_SHIFT;
+	qw1 = MIC_DMA_MEMCPY;
+	qw1 <<= MIC_DMA_DESC_TYPE_SHIFT;
+	qw1 |= dst_phys;
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+/* Prepare a status desc. with @data to be written at @dst_phys */
+static inline void mic_dma_prep_status_desc(struct mic_dma_desc *desc, u64 data,
+	dma_addr_t dst_phys, bool generate_intr)
+{
+	u64 qw0, qw1;
+
+	qw0 = data;
+	qw1 = (u64) MIC_DMA_STATUS << MIC_DMA_DESC_TYPE_SHIFT | dst_phys;
+	if (generate_intr)
+		qw1 |= (1ULL << MIC_DMA_STAT_INTR_SHIFT);
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+static void mic_dma_cleanup(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	u32 tail;
+	u32 last_tail;
+
+	spin_lock(&ch->cleanup_lock);
+	tail = mic_dma_read_cmp_cnt(ch);
+	/*
+	 * This is the barrier pair for smp_wmb() in fn.
+	 * mic_dma_tx_submit_unlock. It's required so that we read the
+	 * updated cookie value from tx->cookie.
+	 */
+	smp_rmb();
+	for (last_tail = ch->last_tail; tail != last_tail;) {
+		tx = &ch->tx_array[last_tail];
+		if (tx->cookie) {
+			dma_cookie_complete(tx);
+			if (tx->callback) {
+				tx->callback(tx->callback_param);
+				tx->callback = NULL;
+			}
+		}
+		last_tail = mic_dma_hw_ring_inc(last_tail);
+	}
+	/* finish all completion callbacks before incrementing tail */
+	smp_mb();
+	ch->last_tail = last_tail;
+	spin_unlock(&ch->cleanup_lock);
+}
+
+static u32 mic_dma_ring_count(u32 head, u32 tail)
+{
+	u32 count;
+
+	if (head >= tail)
+		count = (tail - 0) + (MIC_DMA_DESC_RX_SIZE - head);
+	else
+		count = tail - head;
+	return count - 1;
+}
+
+/* Returns the num. of free descriptors on success, -ENOMEM on failure */
+static int mic_dma_avail_desc_ring_space(struct mic_dma_chan *ch, int required)
+{
+	struct device *dev = mic_dma_ch_to_device(ch);
+	u32 count;
+
+	count = mic_dma_ring_count(ch->head, ch->last_tail);
+	if (count < required) {
+		mic_dma_cleanup(ch);
+		count = mic_dma_ring_count(ch->head, ch->last_tail);
+	}
+
+	if (count < required) {
+		dev_dbg(dev, "Not enough desc space\n");
+		dev_dbg(dev, "%s %d required=%d, avail=%u\n",
+			__func__, __LINE__, required, count);
+		return -ENOMEM;
+	} else {
+		return count;
+	}
+}
+
+/* Program memcpy descriptors into the descriptor ring and update s/w head ptr*/
+static int mic_dma_prog_memcpy_desc(struct mic_dma_chan *ch, dma_addr_t src,
+				    dma_addr_t dst, size_t len)
+{
+	size_t current_transfer_len;
+	size_t max_xfer_size = to_mic_dma_dev(ch)->max_xfer_size;
+	/* 3 is added to make sure we have enough space for status desc */
+	int num_desc = len / max_xfer_size + 3;
+	int ret;
+
+	if (len % max_xfer_size)
+		num_desc++;
+
+	ret = mic_dma_avail_desc_ring_space(ch, num_desc);
+	if (ret < 0)
+		return ret;
+	do {
+		current_transfer_len = min(len, max_xfer_size);
+		mic_dma_memcpy_desc(&ch->desc_ring[ch->head],
+				    src, dst, current_transfer_len);
+		mic_dma_hw_ring_inc_head(ch);
+		len -= current_transfer_len;
+		dst = dst + current_transfer_len;
+		src = src + current_transfer_len;
+	} while (len > 0);
+	return 0;
+}
+
+/* It's a h/w quirk and h/w needs 2 status descriptors for every status desc */
+static void mic_dma_prog_intr(struct mic_dma_chan *ch)
+{
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, false);
+	mic_dma_hw_ring_inc_head(ch);
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, true);
+	mic_dma_hw_ring_inc_head(ch);
+}
+
+/* Wrapper function to program memcpy descriptors/status descriptors */
+static int mic_dma_do_dma(struct mic_dma_chan *ch, int flags, dma_addr_t src,
+			  dma_addr_t dst, size_t len)
+{
+	if (-ENOMEM == mic_dma_prog_memcpy_desc(ch, src, dst, len))
+		return -ENOMEM;
+	/* Above mic_dma_prog_memcpy_desc() makes sure we have enough space */
+	if (flags & DMA_PREP_FENCE) {
+		mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+					 ch->status_dest_micpa, false);
+		mic_dma_hw_ring_inc_head(ch);
+	}
+
+	if (flags & DMA_PREP_INTERRUPT)
+		mic_dma_prog_intr(ch);
+
+	return 0;
+}
+
+static inline void mic_dma_issue_pending(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	spin_lock(&mic_ch->issue_lock);
+	/*
+	 * Write to head triggers h/w to act on the descriptors.
+	 * On MIC, writing the same head value twice causes
+	 * a h/w error. On second write, h/w assumes we filled
+	 * the entire ring & overwrote some of the descriptors.
+	 */
+	if (mic_ch->issued == mic_ch->submitted)
+		goto out;
+	mic_ch->issued = mic_ch->submitted;
+	/*
+	 * make descriptor updates visible before advancing head,
+	 * this is purposefully not smp_wmb() since we are also
+	 * publishing the descriptor updates to a dma device
+	 */
+	wmb();
+	mic_dma_write_reg(mic_ch, MIC_DMA_REG_DHPR, mic_ch->issued);
+out:
+	spin_unlock(&mic_ch->issue_lock);
+}
+
+static inline void mic_dma_update_pending(struct mic_dma_chan *ch)
+{
+	if (mic_dma_ring_count(ch->issued, ch->submitted)
+			> mic_dma_pending_level)
+		mic_dma_issue_pending(&ch->api_ch);
+}
+
+static dma_cookie_t mic_dma_tx_submit_unlock(struct dma_async_tx_descriptor *tx)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(tx->chan);
+	dma_cookie_t cookie;
+
+	dma_cookie_assign(tx);
+	cookie = tx->cookie;
+	/*
+	 * We need an smp write barrier here because another CPU might see
+	 * an update to submitted and update h/w head even before we
+	 * assigned a cookie to this tx.
+	 */
+	smp_wmb();
+	mic_ch->submitted = mic_ch->head;
+	spin_unlock(&mic_ch->prep_lock);
+	mic_dma_update_pending(mic_ch);
+	return cookie;
+}
+
+static inline struct dma_async_tx_descriptor *
+allocate_tx(struct mic_dma_chan *ch)
+{
+	u32 idx = mic_dma_hw_ring_dec(ch->head);
+	struct dma_async_tx_descriptor *tx = &ch->tx_array[idx];
+
+	dma_async_tx_descriptor_init(tx, &ch->api_ch);
+	tx->tx_submit = mic_dma_tx_submit_unlock;
+	return tx;
+}
+
+/*
+ * Prepare a memcpy descriptor to be added to the ring.
+ * Note that the temporary descriptor adds an extra overhead of copying the
+ * descriptor to ring. So, we copy directly to the descriptor ring
+ */
+static struct dma_async_tx_descriptor *
+mic_dma_prep_memcpy_lock(struct dma_chan *ch, dma_addr_t dma_dest,
+			 dma_addr_t dma_src, size_t len, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	struct device *dev = mic_dma_ch_to_device(mic_ch);
+	int result;
+
+	if (!len && !flags)
+		return NULL;
+
+	spin_lock(&mic_ch->prep_lock);
+	result = mic_dma_do_dma(mic_ch, flags, dma_src, dma_dest, len);
+	if (result >= 0)
+		return allocate_tx(mic_ch);
+	dev_err(dev, "Error enqueueing dma, error=%d\n", result);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
+mic_dma_prep_interrupt_lock(struct dma_chan *ch, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	int ret;
+
+	spin_lock(&mic_ch->prep_lock);
+	ret = mic_dma_do_dma(mic_ch, flags, 0, 0, 0);
+	if (!ret)
+		return allocate_tx(mic_ch);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+/* Return the status of the transaction */
+static enum dma_status
+mic_dma_tx_status(struct dma_chan *ch, dma_cookie_t cookie,
+		  struct dma_tx_state *txstate)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	if (DMA_COMPLETE != dma_cookie_status(ch, cookie, txstate))
+		mic_dma_cleanup(mic_ch);
+
+	return dma_cookie_status(ch, cookie, txstate);
+}
+
+static irqreturn_t mic_dma_thread_fn(int irq, void *data)
+{
+	mic_dma_cleanup((struct mic_dma_chan *)data);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t mic_dma_intr_handler(int irq, void *data)
+{
+	struct mic_dma_chan *ch = ((struct mic_dma_chan *)data);
+
+	mic_dma_ack_interrupt(ch);
+	return IRQ_WAKE_THREAD;
+}
+
+static int mic_dma_alloc_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	ch->desc_ring = kzalloc(desc_ring_size, GFP_KERNEL);
+
+	if (!ch->desc_ring)
+		return -ENOMEM;
+
+	ch->desc_ring_micpa = dma_map_single(dev, ch->desc_ring,
+					     desc_ring_size, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->desc_ring_micpa))
+		goto map_error;
+
+	ch->tx_array = vzalloc(MIC_DMA_DESC_RX_SIZE * sizeof(*ch->tx_array));
+	if (!ch->tx_array)
+		goto tx_error;
+	return 0;
+tx_error:
+	dma_unmap_single(dev, ch->desc_ring_micpa, desc_ring_size,
+			 DMA_BIDIRECTIONAL);
+map_error:
+	kfree(ch->desc_ring);
+	return -ENOMEM;
+}
+
+static void mic_dma_free_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+
+	vfree(ch->tx_array);
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->desc_ring_micpa,
+			 desc_ring_size, DMA_BIDIRECTIONAL);
+	kfree(ch->desc_ring);
+	ch->desc_ring = NULL;
+}
+
+static void mic_dma_free_status_dest(struct mic_dma_chan *ch)
+{
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->status_dest_micpa,
+			 L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	kfree(ch->status_dest);
+}
+
+static int mic_dma_alloc_status_dest(struct mic_dma_chan *ch)
+{
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	ch->status_dest = kzalloc(L1_CACHE_BYTES, GFP_KERNEL);
+	if (!ch->status_dest)
+		return -ENOMEM;
+	ch->status_dest_micpa = dma_map_single(dev, ch->status_dest,
+					L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->status_dest_micpa)) {
+		kfree(ch->status_dest);
+		ch->status_dest = NULL;
+		return -ENOMEM;
+	}
+	return 0;
+}
+
+static int mic_dma_check_chan(struct mic_dma_chan *ch)
+{
+	if (mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR) ||
+	    mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) & MIC_DMA_CHAN_QUIESCE) {
+		mic_dma_disable_chan(ch);
+		mic_dma_chan_mask_intr(ch);
+		dev_err(mic_dma_ch_to_device(ch),
+			"%s %d error setting up mic dma chan %d\n",
+			__func__, __LINE__, ch->ch_num);
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static int mic_dma_chan_setup(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		mic_dma_chan_set_owner(ch);
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCHERRMSK, 0);
+	mic_dma_chan_set_desc_ring(ch);
+	ch->last_tail = mic_dma_read_reg(ch, MIC_DMA_REG_DTPR);
+	ch->head = ch->last_tail;
+	ch->issued = 0;
+	mic_dma_chan_unmask_intr(ch);
+	mic_dma_enable_chan(ch);
+	return mic_dma_check_chan(ch);
+}
+
+static void mic_dma_chan_destroy(struct mic_dma_chan *ch)
+{
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+}
+
+static void mic_dma_unregister_dma_device(struct mic_dma_device *mic_dma_dev)
+{
+	dma_async_device_unregister(&mic_dma_dev->dma_dev);
+}
+
+static int mic_dma_setup_irq(struct mic_dma_chan *ch)
+{
+	ch->cookie =
+		to_mbus_hw_ops(ch)->request_threaded_irq(to_mbus_device(ch),
+			mic_dma_intr_handler, mic_dma_thread_fn,
+			"mic dma_channel", ch, ch->ch_num);
+	if (IS_ERR(ch->cookie))
+		return PTR_ERR(ch->cookie);
+	return 0;
+}
+
+static inline void mic_dma_free_irq(struct mic_dma_chan *ch)
+{
+	to_mbus_hw_ops(ch)->free_irq(to_mbus_device(ch), ch->cookie, ch);
+}
+
+static int mic_dma_chan_init(struct mic_dma_chan *ch)
+{
+	int ret = mic_dma_alloc_desc_ring(ch);
+
+	if (ret)
+		goto ring_error;
+	ret = mic_dma_alloc_status_dest(ch);
+	if (ret)
+		goto status_error;
+	ret = mic_dma_chan_setup(ch);
+	if (ret)
+		goto chan_error;
+	return ret;
+chan_error:
+	mic_dma_free_status_dest(ch);
+status_error:
+	mic_dma_free_desc_ring(ch);
+ring_error:
+	return ret;
+}
+
+static int mic_dma_drain_chan(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	int err = 0;
+	dma_cookie_t cookie;
+
+	tx = mic_dma_prep_memcpy_lock(&ch->api_ch, 0, 0, 0, DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	cookie = tx->tx_submit(tx);
+	if (dma_submit_error(cookie))
+		err = -ENOMEM;
+	else
+		err = dma_sync_wait(&ch->api_ch, cookie);
+	if (err) {
+		dev_err(mic_dma_ch_to_device(ch), "%s %d TO chan 0x%x\n",
+			__func__, __LINE__, ch->ch_num);
+		err = -EIO;
+	}
+error:
+	mic_dma_cleanup(ch);
+	return err;
+}
+
+static inline void mic_dma_chan_uninit(struct mic_dma_chan *ch)
+{
+	mic_dma_chan_destroy(ch);
+	mic_dma_cleanup(ch);
+	mic_dma_free_status_dest(ch);
+	mic_dma_free_desc_ring(ch);
+}
+
+static int mic_dma_init(struct mic_dma_device *mic_dma_dev,
+			enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+	int ret;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		unsigned long data;
+		ch = &mic_dma_dev->mic_ch[i];
+		data = (unsigned long)ch;
+		ch->ch_num = i;
+		ch->owner = owner;
+		spin_lock_init(&ch->cleanup_lock);
+		spin_lock_init(&ch->prep_lock);
+		spin_lock_init(&ch->issue_lock);
+		ret = mic_dma_setup_irq(ch);
+		if (ret)
+			goto error;
+	}
+	return 0;
+error:
+	for (i = i - 1; i >= first_chan; i--)
+		mic_dma_free_irq(&mic_dma_dev->mic_ch[i]);
+	return ret;
+}
+
+static void mic_dma_uninit(struct mic_dma_device *mic_dma_dev)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		mic_dma_free_irq(ch);
+	}
+}
+
+static int mic_dma_alloc_chan_resources(struct dma_chan *ch)
+{
+	int ret = mic_dma_chan_init(to_mic_dma_chan(ch));
+	if (ret)
+		return ret;
+	return MIC_DMA_DESC_RX_SIZE;
+}
+
+static void mic_dma_free_chan_resources(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	mic_dma_drain_chan(mic_ch);
+	mic_dma_chan_uninit(mic_ch);
+}
+
+/* Set the fn. handlers and register the dma device with dma api */
+static int mic_dma_register_dma_device(struct mic_dma_device *mic_dma_dev,
+				       enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+
+	dma_cap_zero(mic_dma_dev->dma_dev.cap_mask);
+	/*
+	 * This dma engine is not capable of host memory to host memory
+	 * transfers
+	 */
+	dma_cap_set(DMA_MEMCPY, mic_dma_dev->dma_dev.cap_mask);
+
+	if (MIC_DMA_CHAN_HOST == owner)
+		dma_cap_set(DMA_PRIVATE, mic_dma_dev->dma_dev.cap_mask);
+	mic_dma_dev->dma_dev.device_alloc_chan_resources =
+		mic_dma_alloc_chan_resources;
+	mic_dma_dev->dma_dev.device_free_chan_resources =
+		mic_dma_free_chan_resources;
+	mic_dma_dev->dma_dev.device_tx_status = mic_dma_tx_status;
+	mic_dma_dev->dma_dev.device_prep_dma_memcpy = mic_dma_prep_memcpy_lock;
+	mic_dma_dev->dma_dev.device_prep_dma_interrupt =
+		mic_dma_prep_interrupt_lock;
+	mic_dma_dev->dma_dev.device_issue_pending = mic_dma_issue_pending;
+	mic_dma_dev->dma_dev.copy_align = MIC_DMA_ALIGN_SHIFT;
+	INIT_LIST_HEAD(&mic_dma_dev->dma_dev.channels);
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		mic_dma_dev->mic_ch[i].api_ch.device = &mic_dma_dev->dma_dev;
+		dma_cookie_init(&mic_dma_dev->mic_ch[i].api_ch);
+		list_add_tail(&mic_dma_dev->mic_ch[i].api_ch.device_node,
+			      &mic_dma_dev->dma_dev.channels);
+	}
+	return dma_async_device_register(&mic_dma_dev->dma_dev);
+}
+
+/*
+ * Initializes dma channels and registers the dma device with the
+ * dma engine api.
+ */
+static struct mic_dma_device *mic_dma_dev_reg(struct mbus_device *mbdev,
+					      enum mic_dma_chan_owner owner)
+{
+	struct mic_dma_device *mic_dma_dev;
+	int ret;
+	struct device *dev = &mbdev->dev;
+
+	mic_dma_dev = kzalloc(sizeof(*mic_dma_dev), GFP_KERNEL);
+	if (!mic_dma_dev) {
+		ret = -ENOMEM;
+		goto alloc_error;
+	}
+	mic_dma_dev->mbdev = mbdev;
+	mic_dma_dev->dma_dev.dev = dev;
+	mic_dma_dev->mmio = mbdev->mmio_va;
+	if (MIC_DMA_CHAN_HOST == owner) {
+		mic_dma_dev->start_ch = 0;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_HOST;
+	} else {
+		mic_dma_dev->start_ch = 4;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_CARD;
+	}
+	ret = mic_dma_init(mic_dma_dev, owner);
+	if (ret)
+		goto init_error;
+	ret = mic_dma_register_dma_device(mic_dma_dev, owner);
+	if (ret)
+		goto reg_error;
+	return mic_dma_dev;
+reg_error:
+	mic_dma_uninit(mic_dma_dev);
+init_error:
+	kfree(mic_dma_dev);
+	mic_dma_dev = NULL;
+alloc_error:
+	dev_err(dev, "Error at %s %d ret=%d\n", __func__, __LINE__, ret);
+	return mic_dma_dev;
+}
+
+static void mic_dma_dev_unreg(struct mic_dma_device *mic_dma_dev)
+{
+	mic_dma_unregister_dma_device(mic_dma_dev);
+	mic_dma_uninit(mic_dma_dev);
+	kfree(mic_dma_dev);
+}
+
+/* DEBUGFS CODE */
+static int mic_dma_reg_seq_show(struct seq_file *s, void *pos)
+{
+	struct mic_dma_device *mic_dma_dev = s->private;
+	int i, chan_num, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	seq_printf(s, "SBOX_DCR: %#x\n",
+		   mic_dma_mmio_read(&mic_dma_dev->mic_ch[first_chan],
+				     MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR));
+	seq_puts(s, "DMA Channel Registers\n");
+	seq_printf(s, "%-10s| %-10s %-10s %-10s %-10s %-10s",
+		   "Channel", "DCAR", "DTPR", "DHPR", "DRAR_HI", "DRAR_LO");
+	seq_printf(s, " %-11s %-14s %-10s\n", "DCHERR", "DCHERRMSK", "DSTAT");
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		chan_num = ch->ch_num;
+		seq_printf(s, "%-10i| %-#10x %-#10x %-#10x %-#10x",
+			   chan_num,
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCAR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DTPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DHPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_HI));
+		seq_printf(s, " %-#10x %-#10x %-#14x %-#10x\n",
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_LO),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERRMSK),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT));
+	}
+	return 0;
+}
+
+static int mic_dma_reg_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_dma_reg_seq_show, inode->i_private);
+}
+
+static int mic_dma_reg_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations mic_dma_reg_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_dma_reg_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_dma_reg_debug_release
+};
+
+/* Debugfs parent dir */
+static struct dentry *mic_dma_dbg;
+
+static int mic_dma_driver_probe(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+	enum mic_dma_chan_owner owner;
+
+	if (MBUS_DEV_DMA_MIC == mbdev->id.device)
+		owner = MIC_DMA_CHAN_MIC;
+	else
+		owner = MIC_DMA_CHAN_HOST;
+
+	mic_dma_dev = mic_dma_dev_reg(mbdev, owner);
+	mbdev->priv = mic_dma_dev;
+
+	if (mic_dma_dbg) {
+		mic_dma_dev->dbg_dir = debugfs_create_dir(dev_name(&mbdev->dev),
+							  mic_dma_dbg);
+		if (mic_dma_dev->dbg_dir)
+			debugfs_create_file("mic_dma_reg", 0444,
+					    mic_dma_dev->dbg_dir, mic_dma_dev,
+					    &mic_dma_reg_ops);
+	}
+	return 0;
+}
+
+static void mic_dma_driver_remove(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+
+	mic_dma_dev = (struct mic_dma_device *)mbdev->priv;
+	debugfs_remove_recursive(mic_dma_dev->dbg_dir);
+	mic_dma_dev_unreg(mic_dma_dev);
+}
+
+static struct mbus_device_id id_table[] = {
+	{MBUS_DEV_DMA_MIC, MBUS_DEV_ANY_ID},
+	{MBUS_DEV_DMA_HOST, MBUS_DEV_ANY_ID},
+	{0},
+};
+
+static struct mbus_driver mic_dma_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = mic_dma_driver_probe,
+	.remove = mic_dma_driver_remove,
+};
+
+static int __init mic_x100_dma_init(void)
+{
+	int rc = mbus_register_driver(&mic_dma_driver);
+	if (rc)
+		return rc;
+	mic_dma_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	return 0;
+}
+
+static void __exit mic_x100_dma_exit(void)
+{
+	debugfs_remove_recursive(mic_dma_dbg);
+	mbus_unregister_driver(&mic_dma_driver);
+}
+
+module_init(mic_x100_dma_init);
+module_exit(mic_x100_dma_exit);
+
+MODULE_DEVICE_TABLE(mbus, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC X100 DMA Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/dma/mic_x100_dma.h b/drivers/dma/mic_x100_dma.h
new file mode 100644
index 0000000..f663b0b
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.h
@@ -0,0 +1,286 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#ifndef _MIC_X100_DMA_H_
+#define _MIC_X100_DMA_H_
+
+#include <linux/kernel.h>
+#include <linux/delay.h>
+#include <linux/sched.h>
+#include <linux/debugfs.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/mic_bus.h>
+
+#include "dmaengine.h"
+
+/*
+ * MIC has a total of 8 dma channels.
+ * Four channels are assigned for host SW use & the remaining for MIC SW.
+ * MIC DMA transfer size & addresses need to be 64 byte aligned.
+ */
+#define MIC_DMA_MAX_NUM_CHAN	8
+#define MIC_DMA_NUM_CHAN	4
+#define MIC_DMA_ALIGN_SHIFT	6
+#define MIC_DMA_ALIGN_BYTES	(1 << MIC_DMA_ALIGN_SHIFT)
+#define MIC_DMA_DESC_RX_SIZE	(128 * 1024 - 4)
+
+/*
+ * Register descriptions
+ * All the registers are 32 bit registers.
+ * DCR is a global register and all others are per-channel.
+ * DCR - bits 0, 2, 4, 6, 8, 10, 12, 14 - enable bits for channels 0 to 7
+ *	 bits 1, 3, 5, 7, 9, 11, 13, 15 - owner bits for channels 0 to 7
+ * DCAR - bit 24 & 25 interrupt masks for mic owned & host owned channels
+ * DHPR - head of the descriptor ring updated by s/w
+ * DTPR - tail of the descriptor ring updated by h/w
+ * DRAR_LO - lower 32 bits of descriptor ring's mic address
+ * DRAR_HI - 3:0 - remaining 4 bits of descriptor ring's mic address
+ *	     20:4 descriptor ring size
+ *	     25:21 mic smpt entry number
+ * DSTAT - 16:0 h/w completion count; 31:28 dma engine status
+ * DCHERR - this register is non-zero on error
+ * DCHERRMSK - interrupt mask register
+ */
+#define MIC_DMA_HW_CMP_CNT_MASK		0x1ffff
+#define MIC_DMA_CHAN_QUIESCE		0x20000000
+#define MIC_DMA_SBOX_BASE		0x00010000
+#define MIC_DMA_SBOX_DCR		0x0000A280
+#define MIC_DMA_SBOX_CH_BASE		0x0001A000
+#define MIC_DMA_SBOX_CHAN_OFF		0x40
+#define MIC_DMA_SBOX_DCAR_IM0		(0x1 << 24)
+#define MIC_DMA_SBOX_DCAR_IM1		(0x1 << 25)
+#define MIC_DMA_SBOX_DRARHI_SYS_MASK	(0x1 << 26)
+#define MIC_DMA_REG_DCAR		0
+#define MIC_DMA_REG_DHPR		4
+#define MIC_DMA_REG_DTPR		8
+#define MIC_DMA_REG_DRAR_LO		20
+#define MIC_DMA_REG_DRAR_HI		24
+#define MIC_DMA_REG_DSTAT		32
+#define MIC_DMA_REG_DCHERR		44
+#define MIC_DMA_REG_DCHERRMSK		48
+
+/* HW dma desc */
+struct mic_dma_desc {
+	u64 qw0;
+	u64 qw1;
+};
+
+enum mic_dma_chan_owner {
+	MIC_DMA_CHAN_MIC = 0,
+	MIC_DMA_CHAN_HOST
+};
+
+/*
+ * mic_dma_chan - channel specific information
+ * @ch_num: channel number
+ * @owner: owner of this channel
+ * @last_tail: cached value of descriptor ring tail
+ * @head: index of next descriptor in desc_ring
+ * @issued: hardware notification point
+ * @submitted: index that will be used to submit descriptors to h/w
+ * @api_ch: dma engine api channel
+ * @desc_ring: dma descriptor ring
+ * @desc_ring_micpa: mic physical address of desc_ring
+ * @status_dest: destination for status (fence) descriptor
+ * @status_dest_micpa: mic address for status_dest,
+ *		       DMA controller uses this address
+ * @tx_array: array of async_tx
+ * @cleanup_lock: lock held when processing completed tx
+ * @prep_lock: lock held in prep_memcpy & released in tx_submit
+ * @issue_lock: lock used to synchronize writes to head
+ * @cookie: mic_irq cookie used with mic irq request
+ */
+struct mic_dma_chan {
+	int ch_num;
+	enum mic_dma_chan_owner owner;
+	u32 last_tail;
+	u32 head;
+	u32 issued;
+	u32 submitted;
+	struct dma_chan api_ch;
+	struct mic_dma_desc *desc_ring;
+	dma_addr_t desc_ring_micpa;
+	u64 *status_dest;
+	dma_addr_t status_dest_micpa;
+	struct dma_async_tx_descriptor *tx_array;
+	spinlock_t cleanup_lock;
+	spinlock_t prep_lock;
+	spinlock_t issue_lock;
+	struct mic_irq *cookie;
+};
+
+/*
+ * struct mic_dma_device - per mic device
+ * @mic_ch: dma channels
+ * @dma_dev: underlying dma device
+ * @mbdev: mic bus dma device
+ * @mmio: virtual address of the mmio space
+ * @dbg_dir: debugfs directory
+ * @start_ch: first channel number that can be used
+ * @max_xfer_size: maximum transfer size per dma descriptor
+ */
+struct mic_dma_device {
+	struct mic_dma_chan mic_ch[MIC_DMA_MAX_NUM_CHAN];
+	struct dma_device dma_dev;
+	struct mbus_device *mbdev;
+	void __iomem *mmio;
+	struct dentry *dbg_dir;
+	int start_ch;
+	size_t max_xfer_size;
+};
+
+static inline struct mic_dma_chan *to_mic_dma_chan(struct dma_chan *ch)
+{
+	return container_of(ch, struct mic_dma_chan, api_ch);
+}
+
+static inline struct mic_dma_device *to_mic_dma_dev(struct mic_dma_chan *ch)
+{
+	return
+	container_of((const typeof(((struct mic_dma_device *)0)->mic_ch)*)
+		     (ch - ch->ch_num), struct mic_dma_device, mic_ch);
+}
+
+static inline struct mbus_device *to_mbus_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mbdev;
+}
+
+static inline struct mbus_hw_ops *to_mbus_hw_ops(struct mic_dma_chan *ch)
+{
+	return to_mbus_device(ch)->hw_ops;
+}
+
+static inline struct device *mic_dma_ch_to_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->dma_dev.dev;
+}
+
+static inline void __iomem *mic_dma_chan_to_mmio(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mmio;
+}
+
+static inline u32 mic_dma_read_reg(struct mic_dma_chan *ch, u32 reg)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+			ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline void mic_dma_write_reg(struct mic_dma_chan *ch, u32 reg, u32 val)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+		  ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline u32 mic_dma_mmio_read(struct mic_dma_chan *ch, u32 offset)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline void mic_dma_mmio_write(struct mic_dma_chan *ch, u32 val,
+				      u32 offset)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline u32 mic_dma_read_cmp_cnt(struct mic_dma_chan *ch)
+{
+	return mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) &
+	       MIC_DMA_HW_CMP_CNT_MASK;
+}
+
+static inline void mic_dma_chan_set_owner(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+	u32 chan_num = ch->ch_num;
+
+	dcr = (dcr & ~(0x1 << (chan_num * 2))) | (ch->owner << (chan_num * 2));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_enable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr |= 2 << (ch->ch_num << 1);
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_disable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr &= ~(2 << (ch->ch_num << 1));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static void mic_dma_chan_set_desc_ring(struct mic_dma_chan *ch)
+{
+	u32 drar_hi;
+	dma_addr_t desc_ring_micpa = ch->desc_ring_micpa;
+
+	drar_hi = (MIC_DMA_DESC_RX_SIZE & 0x1ffff) << 4;
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		drar_hi |= (desc_ring_micpa >> 32) & 0xf;
+	} else {
+		drar_hi |= MIC_DMA_SBOX_DRARHI_SYS_MASK;
+		drar_hi |= ((desc_ring_micpa >> 34)
+			    & 0x1f) << 21;
+		drar_hi |= (desc_ring_micpa >> 32) & 0x3;
+	}
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_LO, (u32) desc_ring_micpa);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_HI, drar_hi);
+}
+
+static inline void mic_dma_chan_mask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar |= MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar |= MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static inline void mic_dma_chan_unmask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static void mic_dma_ack_interrupt(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		/* HW errata */
+		mic_dma_chan_mask_intr(ch);
+		mic_dma_chan_unmask_intr(ch);
+	}
+	to_mbus_hw_ops(ch)->ack_interrupt(to_mbus_device(ch), ch->ch_num);
+}
+#endif
-- 
1.8.2.1
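
For readers following the ring bookkeeping in mic_x100_dma.c, here is a
minimal user-space sketch of the index arithmetic. It mirrors
mic_dma_hw_ring_inc(), mic_dma_hw_ring_dec(), mic_dma_ring_count() and the
descriptor estimate in mic_dma_prog_memcpy_desc(). The names RING_SIZE,
ring_inc, ring_dec, ring_free and descs_needed are illustrative and do not
appear in the patch; the assert on max_xfer is an added guard, not driver
behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ring size copied from MIC_DMA_DESC_RX_SIZE in the patch. */
#define RING_SIZE (128u * 1024u - 4u)

/* Advance an index by one slot, wrapping at the end of the ring. */
static uint32_t ring_inc(uint32_t val)
{
	return (val + 1) % RING_SIZE;
}

/* Step an index back by one slot, wrapping at the start of the ring. */
static uint32_t ring_dec(uint32_t val)
{
	return val ? val - 1 : RING_SIZE - 1;
}

/*
 * Free slots between head (next slot software fills) and tail (oldest
 * slot not yet cleaned up). One slot always stays unused, so that
 * head == tail means "empty" rather than "full".
 */
static uint32_t ring_free(uint32_t head, uint32_t tail)
{
	uint32_t count;

	if (head >= tail)
		count = tail + (RING_SIZE - head);
	else
		count = tail - head;
	return count - 1;
}

/*
 * Slots reserved for one memcpy of len bytes: one descriptor per
 * max_xfer-sized chunk (rounded up), plus 3 for status descriptors
 * (one fence and the two-descriptor interrupt sequence).
 */
static size_t descs_needed(size_t len, size_t max_xfer)
{
	size_t n;

	assert(max_xfer > 0);	/* hypothetical guard against division by zero */
	n = len / max_xfer + 3;
	if (len % max_xfer)
		n++;
	return n;
}
```

With the host limit of 512 KiB per descriptor, a 1 MiB copy reserves
2 + 3 = 5 slots, and an empty ring reports RING_SIZE - 1 free slots.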


^ permalink raw reply related	[relevance 41%]
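
The descriptor layout used by mic_dma_memcpy_desc() and
mic_dma_prep_status_desc() in the patch above can be checked outside the
kernel with a small sketch like the following. The shift values and type
codes are copied from the patch. The struct and function names (dma_desc,
pack_memcpy, pack_status) are made up for illustration, and the alignment
assert is an assumption based on the header comment that transfer sizes
must be 64 byte aligned; the driver itself does not assert this.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_SHIFT      6	/* 64 byte alignment, as MIC_DMA_ALIGN_SHIFT */
#define DESC_TYPE_SHIFT  60	/* as MIC_DMA_DESC_TYPE_SHIFT */
#define MEMCPY_LEN_SHIFT 46	/* as MIC_DMA_MEMCPY_LEN_SHIFT */
#define STAT_INTR_SHIFT  59	/* as MIC_DMA_STAT_INTR_SHIFT */

/* Same values as enum mic_dma_desc_format_type. */
enum desc_type { DESC_MEMCPY = 1, DESC_STATUS = 2 };

/* Two 64-bit words, like struct mic_dma_desc. */
struct dma_desc {
	uint64_t qw0;
	uint64_t qw1;
};

/* qw0: source address plus length in 64 byte units; qw1: type plus dest. */
static struct dma_desc pack_memcpy(uint64_t src, uint64_t dst, uint64_t size)
{
	struct dma_desc d;

	/* Assumed precondition: size is a multiple of 64 bytes. */
	assert((size & ((1u << ALIGN_SHIFT) - 1)) == 0);
	d.qw0 = src | ((size >> ALIGN_SHIFT) << MEMCPY_LEN_SHIFT);
	d.qw1 = ((uint64_t)DESC_MEMCPY << DESC_TYPE_SHIFT) | dst;
	return d;
}

/* qw0: the 64-bit value to write; qw1: type, dest and optional intr bit. */
static struct dma_desc pack_status(uint64_t data, uint64_t dst, int intr)
{
	struct dma_desc d;

	d.qw0 = data;
	d.qw1 = ((uint64_t)DESC_STATUS << DESC_TYPE_SHIFT) | dst;
	if (intr)
		d.qw1 |= 1ULL << STAT_INTR_SHIFT;
	return d;
}
```

Packing into plain integers this way makes it easy to compare a computed
descriptor against a raw dump of the ring without needing the hardware.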

* [PATCH char-misc-next v2 8/8] misc: mic: add support for loading/unloading dma driver
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
                   ` (6 preceding siblings ...)
  2014-05-29 21:24 85% ` [PATCH char-misc-next v2 7/8] misc: mic: add dma " Sudeep Dutt
@ 2014-05-29 21:24 99% ` Sudeep Dutt
  2014-06-16 15:58 99% ` [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Load the dma driver with modprobe when the MPSS service starts and remove
it on unload.

Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mpssd/mpss | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index 3136c68..cacbdb0 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -48,18 +48,18 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_HOST Module"
+	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
 
-	# Ensure the driver is loaded
-	if [ ! -d "$sysfs" ]; then
-		modprobe mic_host
+	for f in "mic_host" "mic_x100_dma"
+	do
+		modprobe $f
 		RETVAL=$?
 		if [ $RETVAL -ne 0 ]; then
 			failure
 			echo
 			return $RETVAL
 		fi
-	fi
+	done
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +170,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST Module: "
-	modprobe -r mic_host
+	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
+	modprobe -r mic_host mic_x100_dma
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1



* [PATCH char-misc-next v2 1/8] misc: mic: Add mic bus and dma driver documentation
  2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
@ 2014-05-29 21:23 91% ` Sudeep Dutt
  2014-05-29 21:23 63% ` [PATCH char-misc-next v2 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Added an overview of mic bus and dma driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mic_overview.txt | 67 +++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index b419292..77c5418 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -17,35 +17,50 @@ for applications. A key benefit of our solution is that it leverages
 the standard virtio framework for network, disk and console devices,
 though in our case the virtio framework is used across a PCIe bus.
 
+MIC PCIe card has a dma controller with 8 channels. These channels are
+shared between the host s/w and the card s/w. 0 to 3 are used by host
+and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
+a virtual bus called mic bus is created and virtual dma devices are
+created on it by the host/card drivers. On host the channels are private
+and used only by the host driver to transfer data for the virtio devices.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
 the host to initiate DMA's to/from the card using the MIC DMA engine and
 the fact that the virtio block storage backend can only be on the host.
 
-                              |
-       +----------+           |             +----------+
-       | Card OS  |           |             | Host OS  |
-       +----------+           |             +----------+
-                              |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-| Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
-| Net   | |Console | |Block | | |Net      |  |Console | |Block   |
-| Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-    |         |         |     |      |            |         |
-    |         |         |     |User  |            |         |
-    |         |         |     |------|------------|---------|-------
-    +-------------------+     |Kernel +--------------------------+
-              |               |       | Virtio over PCIe IOCTLs  |
-              |               |       +--------------------------+
-      +--------------+        |                   |
-      |Intel MIC     |        |            +---------------+
-      |Card Driver   |        |            |Intel MIC      |
-      +--------------+        |            |Host Driver    |
-              |               |            +---------------+
-              |               |                   |
-     +-------------------------------------------------------------+
-     |                                                             |
-     |                    PCIe Bus                                 |
-     +-------------------------------------------------------------+
+                                      |
+               +----------+           |             +----------+
+               | Card OS  |           |             | Host OS  |
+               +----------+           |             +----------+
+                                      |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
+        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
+        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+            |         |         |     |      |            |         |
+            |         |         |     |User  |            |         |
+            |         |         |     |------|------------|---------|-------
+            +-------------------+     |Kernel +--------------------------+
+                      |               |       | Virtio over PCIe IOCTLs  |
+                      |               |       +--------------------------+
++-----------+         |               |                   |  +-----------+
+| MIC DMA   |         |               |                   |  | MIC DMA   |
+| Driver    |         |               |                   |  | Driver    |
++-----------+         |               |                   |  +-----------+
+      |               |               |                   |        |
++---------------+     |               |                   |  +----------------+
+|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
++---------------+     |               |                   |  +----------------+
+      |               |               |                   |              |
+      |   +--------------+            |            +---------------+     |
+      |   |Intel MIC     |            |            |Intel MIC      |     |
+      +---|Card Driver   |            |            |Host Driver    |     |
+          +--------------+            |            +---------------+-----+
+                      |               |                   |
+             +-------------------------------------------------------------+
+             |                                                             |
+             |                    PCIe Bus                                 |
+             +-------------------------------------------------------------+
-- 
1.8.2.1



* [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors
@ 2014-05-29 21:23 82% Sudeep Dutt
  2014-05-29 21:23 91% ` [PATCH char-misc-next v2 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
                   ` (8 more replies)
  0 siblings, 9 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29 21:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

ChangeLog:
=========
v1 => v2:
a) MIC bus driver cleanups and device release callback fix in patch 2,
   as per feedback from Greg Kroah-Hartman.

v1: Initial post @ https://lkml.org/lkml/2014/5/27/819

These patches are being sent to char-misc because there is a dependency
between the patches for dma driver and other drivers.

Description:
============
This set of patches adds support for the MIC X100 dma driver.
MIC PCIe card has a dma controller with 8 channels. These channels are
shared between the host s/w and the card s/w. 0 to 3 are used by host
and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
a virtual bus called mic bus is created and virtual dma devices are
created on it by the host/card drivers. On host the channels are private
and used only by the host driver to transfer data for the virtio devices.
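
The channel split described above can be sketched as a small helper. This
is only an illustration of the 0-3 / 4-7 partitioning: every name prefixed
with ex_ or EX_ is hypothetical and is not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the driver. */
#define EX_MIC_DMA_NUM_CHAN	8	/* channels on the controller */
#define EX_MIC_DMA_HOST_CHANS	4	/* channels 0..3 belong to the host */

enum ex_mic_dma_owner {
	EX_OWNER_HOST,
	EX_OWNER_CARD,
};

/* Return true if ch_num names one of the 8 channels. */
static bool ex_mic_dma_chan_valid(int ch_num)
{
	return ch_num >= 0 && ch_num < EX_MIC_DMA_NUM_CHAN;
}

/* Channels 0 to 3 are used by the host, 4 to 7 by the card. */
static enum ex_mic_dma_owner ex_mic_dma_chan_owner(int ch_num)
{
	assert(ex_mic_dma_chan_valid(ch_num));
	return ch_num < EX_MIC_DMA_HOST_CHANS ? EX_OWNER_HOST : EX_OWNER_CARD;
}
```

With this split, each side only ever programs its own four channels.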

Here is a higher level block diagram.
                                      |
               +----------+           |             +----------+
               | Card OS  |           |             | Host OS  |
               +----------+           |             +----------+
                                      |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
            |         |         |     |      |            |         |
            |         |         |     |User  |            |         |
            |         |         |     |------|------------|---------|-------
            +-------------------+     |Kernel +--------------------------+
                      |               |       | Virtio over PCIe IOCTLs  |
                      |               |       +--------------------------+
+-----------+         |               |                   |  +-----------+
| MIC DMA   |         |               |                   |  | MIC DMA   |
| Driver    |         |               |                   |  | Driver    |
+-----------+         |               |                   |  +-----------+
      |               |               |                   |        |
+---------------+     |               |                   |  +----------------+
|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
+---------------+     |               |                   |  +----------------+
      |               |               |                   |              |
      |   +--------------+            |            +---------------+     |
      |   |Intel MIC     |            |            |Intel MIC      |     |
      +---|Card Driver   |            |            |Host Driver    |     |
          +--------------+            |            +---------------+-----+
                      |               |                   |
             +-------------------------------------------------------------+
             |                                                             |
             |                    PCIe Bus                                 |
             +-------------------------------------------------------------+

The following series of patches are partitioned as follows:

Patch 1: Add mic bus and dma driver documentation.
	 Author: Siva Yerramreddy
Patch 2: Add a bus driver for virtual MIC devices.
	 Authors: Siva Yerramreddy, Sudeep Dutt
Patch 3: MIC X100 DMA Driver.
	 Author: Siva Yerramreddy
Patch 4: Add threaded irq support in host driver.
	 This is needed as the dma driver uses threaded irq.
	 Author: Siva Yerramreddy
Patch 5: Use dma to transfer data between MIC and host.
	 Authors: Siva Yerramreddy, Ashutosh Dixit
Patch 6: Add threaded irq support in mic_request_card_irq.
	 This is needed as the dma driver uses threaded irq.
	 Author: Siva Yerramreddy
Patch 7: Add dma device on mic bus.
	 Author: Siva Yerramreddy
Patch 8: Modify the mpss script to load/unload mic_x100_dma.ko.
	 Author: Siva Yerramreddy

The patches have been compiled/validated against v3.15-rc3. Tested using
dmatest module with module parameter "threads_per_chan=60". These patches
have also been scanned by Fengguang Wu's 0-day infrastructure and no
issues have been reported.

Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
review.

Siva Yerramreddy (7):
  misc: mic: Add mic bus and dma driver documentation
  dma: MIC X100 DMA Driver
  misc: mic: add threaded irq support in host driver
  misc: mic: add dma support in host driver
  misc: mic: add threaded irq support in card driver
  misc: mic: add dma support in card driver
  misc: mic: add support for loading/unloading dma driver

Sudeep Dutt (1):
  misc: mic: add a bus driver for virtual MIC devices

 Documentation/mic/mic_overview.txt |  67 ++--
 Documentation/mic/mpssd/mpss       |  14 +-
 drivers/dma/Kconfig                |  19 +
 drivers/dma/Makefile               |   1 +
 drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
 drivers/misc/mic/Kconfig           |  21 +-
 drivers/misc/mic/Makefile          |   1 +
 drivers/misc/mic/bus/Makefile      |   5 +
 drivers/misc/mic/bus/mic_bus.c     | 219 +++++++++++
 drivers/misc/mic/card/mic_device.c |  23 +-
 drivers/misc/mic/card/mic_device.h |   9 +-
 drivers/misc/mic/card/mic_virtio.c |   7 +-
 drivers/misc/mic/card/mic_x100.c   |  55 ++-
 drivers/misc/mic/host/mic_boot.c   |  83 +++-
 drivers/misc/mic/host/mic_device.h |  24 ++
 drivers/misc/mic/host/mic_intr.c   | 121 +++---
 drivers/misc/mic/host/mic_intr.h   |  27 +-
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c | 187 +++++++--
 drivers/misc/mic/host/mic_virtio.h |  21 +-
 drivers/misc/mic/host/mic_x100.c   |   8 +
 include/linux/mic_bus.h            | 112 ++++++
 23 files changed, 1938 insertions(+), 151 deletions(-)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

-- 
1.8.2.1



* Re: [PATCH char-misc-next 2/8] misc: mic: add a bus driver for virtual MIC devices
  @ 2014-05-29  2:56 96%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-29  2:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Dan Williams, Siva Yerramreddy, Ashutosh Dixit,
	Nikhil Rao, linux-kernel, dmaengine

On Wed, 2014-05-28 at 13:50 -0700, Greg Kroah-Hartman wrote: 
> On Tue, May 27, 2014 at 07:36:11PM -0700, Sudeep Dutt wrote:
> > +int register_mbus_driver(struct mbus_driver *driver)
> > +{
> > +	driver->driver.bus = &mic_bus;
> > +	return driver_register(&driver->driver);
> > +}
> > +EXPORT_SYMBOL_GPL(register_mbus_driver);
> 
> mbus_register_driver()?
> 

The idea was to follow the register_virtio_driver(..) naming convention
but will rename as your suggestion is better.

> > +void unregister_mbus_driver(struct mbus_driver *driver)
> > +{
> > +	driver_unregister(&driver->driver);
> > +}
> > +EXPORT_SYMBOL_GPL(unregister_mbus_driver);
> 
> mbus_unregister_driver()?

Will rename.

> 
> > +int register_mbus_device(struct mbus_device *dev)
> 
> mbus_register_device()?
> 
> Trying to keep the kernel namespace sane.

Will rename.

> 
> Why doesn't this function create the device structure?
> 

The mbus_device containing the device structure is allocated by the
driver calling this API. However that creates other issues as you
highlighted below. It is better to allocate mbus_device here so that it
can be freed in the device release callback correctly by the bus driver.
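
A rough userspace model of that ownership rule, for illustration only. It
is not the kernel API and not the code that will be posted: struct
ex_device stands in for struct device, and all ex_ names are hypothetical.
The point it shows is that whoever allocates the containing structure also
supplies the release callback that frees it once the last reference is
dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct device; illustration only. */
struct ex_device {
	int refcount;
	void (*release)(struct ex_device *dev);
};

/* Stand-in for struct mbus_device with the device embedded in it. */
struct ex_mbus_device {
	int id;
	struct ex_device dev;
};

#define ex_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int ex_release_calls;	/* counts frees, for demonstration */

/* Called when the last reference is dropped; frees the container. */
static void ex_mbus_release_dev(struct ex_device *d)
{
	struct ex_mbus_device *mbdev =
		ex_container_of(d, struct ex_mbus_device, dev);

	ex_release_calls++;
	free(mbdev);
}

static void ex_get_device(struct ex_device *dev)
{
	dev->refcount++;
}

static void ex_put_device(struct ex_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* The bus code allocates the device, so it also owns the free path. */
static struct ex_mbus_device *ex_mbus_register_device(int id)
{
	struct ex_mbus_device *mbdev = calloc(1, sizeof(*mbdev));

	if (!mbdev)
		return NULL;
	mbdev->id = id;
	mbdev->dev.release = ex_mbus_release_dev;
	mbdev->dev.refcount = 1;	/* reference held by the registration */
	return mbdev;
}

/* Drop the registration reference; memory goes away with the last put. */
static void ex_mbus_unregister_device(struct ex_mbus_device *mbdev)
{
	ex_put_device(&mbdev->dev);
}
```

In this model the memory is freed only after the final put, even if another
holder still has a reference at unregister time, and every registration
starts from a freshly zeroed allocation.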

> > +{
> > +	int err;
> > +
> > +	dev->dev.bus = &mic_bus;
> > +
> > +	/* Assign a unique device index and hence name. */
> > +	err = ida_simple_get(&mbus_index_ida, 0, 0, GFP_KERNEL);
> > +	if (err < 0)
> > +		return err;
> > +
> > +	dev->index = err;
> > +	dev_set_name(&dev->dev, "mbus-dev%u", dev->index);
> > +	/*
> > +	 * device_register() causes the bus infrastructure to look for a
> > +	 * matching driver.
> > +	 */
> > +	err = device_register(&dev->dev);
> > +	return err;
> > +}
> > +EXPORT_SYMBOL_GPL(register_mbus_device);
> > +
> > +void unregister_mbus_device(struct mbus_device *dev)
> > +{
> > +	int index = dev->index; /* save for after device release */
> > +
> > +	device_unregister(&dev->dev);
> > +	ida_simple_remove(&mbus_index_ida, index);
> > +}
> > +EXPORT_SYMBOL_GPL(unregister_mbus_device);
> > +
> > +static int __init mbus_init(void)
> > +{
> > +	return bus_register(&mic_bus);
> > +}
> > +
> > +static void __exit mbus_exit(void)
> > +{
> > +	bus_unregister(&mic_bus);
> > +}
> > +
> > +core_initcall(mbus_init);
> > +module_exit(mbus_exit);
> > +
> > +MODULE_AUTHOR("Intel Corporation");
> > +MODULE_DESCRIPTION("Intel(R) MIC Bus driver");
> > +MODULE_LICENSE("GPL v2");
> > diff --git a/include/linux/mic_bus.h b/include/linux/mic_bus.h
> > new file mode 100644
> > index 0000000..8297573
> > --- /dev/null
> > +++ b/include/linux/mic_bus.h
> > @@ -0,0 +1,148 @@
> > +/*
> > + * Intel MIC Platform Software Stack (MPSS)
> > + *
> > + * Copyright(c) 2014 Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License, version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but
> > + * WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * General Public License for more details.
> > + *
> > + * The full GNU General Public License is included in this distribution in
> > + * the file called "COPYING".
> > + *
> > + * Intel MIC Bus driver.
> > + *
> > + * This implementation is very similar to the the virtio bus driver
> > + * implementation @ include/linux/virtio.h.
> > + */
> > +#ifndef _MIC_BUS_H_
> > +#define _MIC_BUS_H_
> > +/*
> > + * Everything a mbus driver needs to work with any particular mbus
> > + * implementation.
> > + */
> > +#include <linux/types.h>
> > +#include <linux/device.h>
> > +#include <linux/mod_devicetable.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/dma-mapping.h>
> > +
> > +struct mbus_device_id {
> > +	__u32 device;
> > +	__u32 vendor;
> > +};
> > +
> > +#define MBUS_DEV_DMA_HOST 2
> > +#define MBUS_DEV_DMA_MIC 3
> > +#define MBUS_DEV_ANY_ID 0xffffffff
> > +
> > +/**
> > + * mbus_device - representation of a device using mbus
> > + * @priv: private pointer for the driver's use.
> > + * @mmio_va: virtual address of mmio space
> > + * @hw_ops: the hardware ops supported by this device.
> > + * @id: the device type identification (used to match it with a driver).
> > + * @dev: underlying device.
> > + * be used to communicate with.
> > + * @index: unique position on the mbus bus
> > + */
> > +struct mbus_device {
> > +	void *priv;
> > +	void __iomem *mmio_va;
> > +	struct mbus_hw_ops *hw_ops;
> > +	struct mbus_device_id id;
> > +	struct device dev;
> > +	int index;
> > +};
> > +
> > +/**
> > + * mbus_driver - operations for a mbus I/O driver
> > + * @driver: underlying device driver (populate name and owner).
> > + * @id_table: the ids serviced by this driver.
> > + * @probe: the function to call when a device is found.  Returns 0 or -errno.
> > + * @remove: the function to call when a device is removed.
> > + */
> > +struct mbus_driver {
> > +	struct device_driver driver;
> > +	const struct mbus_device_id *id_table;
> > +	int (*probe)(struct mbus_device *dev);
> > +	void (*scan)(struct mbus_device *dev);
> > +	void (*remove)(struct mbus_device *dev);
> > +};
> > +
> > +/**
> > + * struct mic_irq - opaque pointer used as cookie
> > + */
> > +struct mic_irq;
> > +
> > +/**
> > + * mbus_hw_ops - Hardware operations for accessing a MIC device on the MIC bus.
> > + */
> > +struct mbus_hw_ops {
> > +	struct mic_irq* (*request_threaded_irq)(struct mbus_device *mbdev,
> > +			irq_handler_t handler, irq_handler_t thread_fn,
> > +			const char *name, void *data, int intr_src);
> > +	void (*free_irq)(struct mbus_device *mbdev,
> > +			struct mic_irq *cookie, void *data);
> > +	void (*ack_interrupt)(struct mbus_device *mbdev, int num);
> > +};
> > +
> > +int register_mbus_device(struct mbus_device *dev);
> > +void unregister_mbus_device(struct mbus_device *dev);
> > +
> > +int register_mbus_driver(struct mbus_driver *drv);
> > +void unregister_mbus_driver(struct mbus_driver *drv);
> > +
> > +static inline struct mbus_device *dev_to_mbus(struct device *_dev)
> > +{
> > +	return container_of(_dev, struct mbus_device, dev);
> > +}
> > +
> > +static inline struct mbus_driver *drv_to_mbus(struct device_driver *drv)
> > +{
> > +	return container_of(drv, struct mbus_driver, driver);
> > +}
> > +
> > +static inline void mbus_release_dev(struct device *d)
> > +{
> > +}
> 
> As per the kernel documentation rules, I get to publicly make fun of you
> for doing this half-hearted attempt to get around the kernel providing
> you with a valid kernel message as to why you always need to provide a
> proper release function :(
> 
> Consider yourself mocked.
> 
> And go read the kernel documentation for why.

This will be fixed up by allocating mbus_device in
mbus_register_device(..) so that it can be freed in the device release
callback correctly by the bus driver.

> An inline function?  Why?
> 

This is no longer an inline function after the changes above.

> 
> > +static inline int
> > +mbus_add_device(struct mbus_device *mbdev, struct device *pdev, int id,
> > +		struct dma_map_ops *dma_ops, struct mbus_hw_ops *hw_ops,
> > +		void __iomem *mmio_va)
> 
> inline?  Why?
> 

This API was a wrapper for register_mbus_device(..). However after the
changes above this inline function is no longer required as
mbus_register_device(..) allocates and initializes mbus_device.

> > +{
> > +	int ret;
> > +
> > +	mbdev->mmio_va = mmio_va;
> > +	mbdev->dev.parent = pdev;
> > +	mbdev->id.device = id;
> > +	mbdev->id.vendor = MBUS_DEV_ANY_ID;
> > +	mbdev->dev.archdata.dma_ops = dma_ops;
> > +	mbdev->dev.dma_mask = &mbdev->dev.coherent_dma_mask;
> > +	dma_set_mask(&mbdev->dev, DMA_BIT_MASK(64));
> > +	mbdev->dev.release = mbus_release_dev;
> > +	mbdev->hw_ops = hw_ops;
> > +	dev_set_drvdata(&mbdev->dev, mbdev);
> > +
> > +	ret = register_mbus_device(mbdev);
> > +	if (ret) {
> > +		dev_err(mbdev->dev.parent,
> > +			"Failed to register mbus device type %u\n", id);
> > +		return ret;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static inline void mbus_remove_device(struct mbus_device *mbdev)
> > +{
> > +	unregister_mbus_device(mbdev);
> > +	memset(mbdev, 0x0, sizeof(*mbdev));
> 
> Why are you doing this?  (hint, I think I know why, and it's totally and
> unacceptably wrong.)
> 

mbus_device is registered by the host driver upon card boot and
unregistered when the card is reset. The mbus_device field in the host
driver was being reused since it was not allocated by the bus driver for
each reboot iteration and hence needed the ugly memset(..) here. The
change to allocate the mbus_device in the bus driver cleans this up
nicely and we no longer need the memset(..).

> Also, inline?
> 

This inline function is no longer required after the changes above. 

Rev2 of this patch series addressing your feedback will be posted this
week after some validation. Thanks for the review!

Sudeep Dutt 



* Re: [PATCH 0/8] Enable dma driver for MIC X100 Coprocessors.
  @ 2014-05-28  2:34 99%       ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: sudeep.dutt, Dan Williams, Nikhil Rao, Ashutosh Dixit,
	linux-kernel, dmaengine, Siva Yerramreddy

On Tue, 2014-05-27 at 14:14 -0700, Greg Kroah-Hartman wrote:
> On Wed, May 07, 2014 at 08:10:57PM -0700, Sudeep Dutt wrote:
> > On Thu, 2014-04-24 at 11:10 -0700, Siva Krishna Yerramreddy wrote:
> > > On Mon, 2014-04-14 at 13:14 -0700, Siva Yerramreddy wrote:
> > > > I am sending all these patches to char-misc because there is a dependency
> > > > between the patches for dma driver and other drivers.
> > > > 
> > > Greg, any feedback on the patches?
> > 
> > Hi Greg,
> > The primary author of this patch series Siva is no longer with Intel so
> > we will be taking ownership of addressing review feedback.
> 
> Care to resend these with an author email address that will not bounce?
> I don't like taking code from people with invalid email addresses...
> 

Sure, I have resent the patch series. Please take a look.

Thanks,
Sudeep Dutt



* [PATCH char-misc-next 1/8] misc: mic: Add mic bus and dma driver documentation
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
@ 2014-05-28  2:36 91% ` Sudeep Dutt
  2014-05-28  2:36 63% ` [PATCH char-misc-next 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Added an overview of mic bus and dma driver.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mic_overview.txt | 67 +++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
index b419292..77c5418 100644
--- a/Documentation/mic/mic_overview.txt
+++ b/Documentation/mic/mic_overview.txt
@@ -17,35 +17,50 @@ for applications. A key benefit of our solution is that it leverages
 the standard virtio framework for network, disk and console devices,
 though in our case the virtio framework is used across a PCIe bus.
 
+MIC PCIe card has a dma controller with 8 channels. These channels are
+shared between the host s/w and the card s/w. 0 to 3 are used by host
+and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
+a virtual bus called mic bus is created and virtual dma devices are
+created on it by the host/card drivers. On host the channels are private
+and used only by the host driver to transfer data for the virtio devices.
+
 Here is a block diagram of the various components described above. The
 virtio backends are situated on the host rather than the card given better
 single threaded performance for the host compared to MIC, the ability of
 the host to initiate DMA's to/from the card using the MIC DMA engine and
 the fact that the virtio block storage backend can only be on the host.
 
-                              |
-       +----------+           |             +----------+
-       | Card OS  |           |             | Host OS  |
-       +----------+           |             +----------+
-                              |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-| Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
-| Net   | |Console | |Block | | |Net      |  |Console | |Block   |
-| Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
-+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
-    |         |         |     |      |            |         |
-    |         |         |     |User  |            |         |
-    |         |         |     |------|------------|---------|-------
-    +-------------------+     |Kernel +--------------------------+
-              |               |       | Virtio over PCIe IOCTLs  |
-              |               |       +--------------------------+
-      +--------------+        |                   |
-      |Intel MIC     |        |            +---------------+
-      |Card Driver   |        |            |Intel MIC      |
-      +--------------+        |            |Host Driver    |
-              |               |            +---------------+
-              |               |                   |
-     +-------------------------------------------------------------+
-     |                                                             |
-     |                    PCIe Bus                                 |
-     +-------------------------------------------------------------+
+                                      |
+               +----------+           |             +----------+
+               | Card OS  |           |             | Host OS  |
+               +----------+           |             +----------+
+                                      |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
+        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
+        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
+        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+            |         |         |     |      |            |         |
+            |         |         |     |User  |            |         |
+            |         |         |     |------|------------|---------|-------
+            +-------------------+     |Kernel +--------------------------+
+                      |               |       | Virtio over PCIe IOCTLs  |
+                      |               |       +--------------------------+
++-----------+         |               |                   |  +-----------+
+| MIC DMA   |         |               |                   |  | MIC DMA   |
+| Driver    |         |               |                   |  | Driver    |
++-----------+         |               |                   |  +-----------+
+      |               |               |                   |        |
++---------------+     |               |                   |  +----------------+
+|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
++---------------+     |               |                   |  +----------------+
+      |               |               |                   |              |
+      |   +--------------+            |            +---------------+     |
+      |   |Intel MIC     |            |            |Intel MIC      |     |
+      +---|Card Driver   |            |            |Host Driver    |     |
+          +--------------+            |            +---------------+-----+
+                      |               |                   |
+             +-------------------------------------------------------------+
+             |                                                             |
+             |                    PCIe Bus                                 |
+             +-------------------------------------------------------------+
-- 
1.8.2.1



* [PATCH char-misc-next 4/8] misc: mic: add threaded irq support in host driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
                   ` (2 preceding siblings ...)
  2014-05-28  2:36 41% ` [PATCH char-misc-next 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
@ 2014-05-28  2:36 63% ` Sudeep Dutt
  2014-05-28  2:36 54% ` [PATCH char-misc-next 5/8] misc: mic: add dma " Sudeep Dutt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Convert mic_request_irq to mic_request_threaded_irq to support threaded
irq for virtual devices on mic bus.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
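Note: the split between mic_interrupt() and mic_thread_fn() in this patch
can be modelled in plain userspace C. The sketch below is only an
illustration under stated assumptions, not driver code: NUM_SRC,
struct src_cb, struct irq_state, hard_stage(), thread_stage() and
count_cb() are hypothetical names, and the two spinlocks taken by the real
handlers are left out.

```c
/* Userspace model of the two-stage doorbell dispatch; not kernel code. */
#include <assert.h>
#include <stddef.h>

#define NUM_SRC 8	/* hypothetical number of doorbell sources */

typedef void (*cb_fn)(void *data);

struct src_cb {
	cb_fn handler;		/* runs in the "hard irq" stage, may be NULL */
	cb_fn thread_fn;	/* runs in the "thread" stage, may be NULL */
	void *data;		/* private data handed to both callbacks */
};

struct irq_state {
	struct src_cb cb[NUM_SRC];
	unsigned long pending;	/* plays the role of irq_info->mask */
};

/* Stage 1: run the quick handlers and mark each signalled source pending. */
static void hard_stage(struct irq_state *st, unsigned long hw_mask)
{
	assert(st != NULL);
	for (int i = 0; i < NUM_SRC; i++) {
		if (!(hw_mask & (1UL << i)))
			continue;
		if (st->cb[i].handler)
			st->cb[i].handler(st->cb[i].data);
		st->pending |= 1UL << i;
	}
}

/* Stage 2: consume the pending bits; returns how many thread_fns ran. */
static int thread_stage(struct irq_state *st)
{
	int ran = 0;

	assert(st != NULL);
	for (int i = 0; i < NUM_SRC; i++) {
		if (!(st->pending & (1UL << i)))
			continue;
		st->pending &= ~(1UL << i);	/* test-and-clear */
		if (st->cb[i].thread_fn) {
			st->cb[i].thread_fn(st->cb[i].data);
			ran++;
		}
	}
	return ran;
}

/* Example callback: bump the counter passed as private data. */
static void count_cb(void *data)
{
	(*(int *)data)++;
}
```

Calling hard_stage() with bits 1 and 3 set marks both sources pending; a
following thread_stage() call clears both bits and runs only the callbacks
that registered a thread_fn, which mirrors the NULL check in mic_thread_fn().
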
 drivers/misc/mic/host/mic_intr.c   | 116 ++++++++++++++++++++++---------------
 drivers/misc/mic/host/mic_intr.h   |  18 ++++--
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c |   6 +-
 4 files changed, 90 insertions(+), 55 deletions(-)

diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index dbc5afd..e53b150 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -24,28 +24,29 @@
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 
-/*
- * mic_invoke_callback - Invoke callback functions registered for
- * the corresponding source id.
- *
- * @mdev: pointer to the mic_device instance
- * @idx: The interrupt source id.
- *
- * Returns none.
- */
-static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
+static irqreturn_t mic_thread_fn(int irq, void *dev)
 {
+	struct mic_device *mdev = dev;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
 	struct mic_intr_cb *intr_cb;
 	struct pci_dev *pdev = container_of(mdev->sdev->parent,
-		struct pci_dev, dev);
+					    struct pci_dev, dev);
+	int i;
 
-	spin_lock(&mdev->irq_info.mic_intr_lock);
-	list_for_each_entry(intr_cb, &mdev->irq_info.cb_list[idx], list)
-		if (intr_cb->func)
-			intr_cb->func(pdev->irq, intr_cb->data);
-	spin_unlock(&mdev->irq_info.mic_intr_lock);
+	spin_lock(&irq_info->mic_thread_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (test_and_clear_bit(i, &irq_info->mask)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->thread_fn)
+					intr_cb->thread_fn(pdev->irq,
+							 intr_cb->data);
+		}
+	spin_unlock(&irq_info->mic_thread_lock);
+	return IRQ_HANDLED;
 }
-
 /**
  * mic_interrupt - Generic interrupt handler for
  * MSI and INTx based interrupts.
@@ -53,7 +54,11 @@ static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
 static irqreturn_t mic_interrupt(int irq, void *dev)
 {
 	struct mic_device *mdev = dev;
-	struct mic_intr_info *info = mdev->intr_info;
+	struct mic_intr_info *intr_info = mdev->intr_info;
+	struct mic_irq_info *irq_info = &mdev->irq_info;
+	struct mic_intr_cb *intr_cb;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+					    struct pci_dev, dev);
 	u32 mask;
 	int i;
 
@@ -61,12 +66,19 @@ static irqreturn_t mic_interrupt(int irq, void *dev)
 	if (!mask)
 		return IRQ_NONE;
 
-	for (i = info->intr_start_idx[MIC_INTR_DB];
-			i < info->intr_len[MIC_INTR_DB]; i++)
-		if (mask & BIT(i))
-			mic_invoke_callback(mdev, i);
-
-	return IRQ_HANDLED;
+	spin_lock(&irq_info->mic_intr_lock);
+	for (i = intr_info->intr_start_idx[MIC_INTR_DB];
+			i < intr_info->intr_len[MIC_INTR_DB]; i++)
+		if (mask & BIT(i)) {
+			list_for_each_entry(intr_cb, &irq_info->cb_list[i],
+					    list)
+				if (intr_cb->handler)
+					intr_cb->handler(pdev->irq,
+							 intr_cb->data);
+			set_bit(i, &irq_info->mask);
+		}
+	spin_unlock(&irq_info->mic_intr_lock);
+	return IRQ_WAKE_THREAD;
 }
 
 /* Return the interrupt offset from the index. Index is 0 based. */
@@ -99,14 +111,15 @@ static struct msix_entry *mic_get_available_vector(struct mic_device *mdev)
  *
  * @mdev: pointer to the mic_device instance
  * @idx: The source id to be registered.
- * @func: The function to be called when the source id receives
+ * @handler: The function to be called when the source id receives
  * the interrupt.
+ * @thread_fn: thread fn. corresponding to the handler
  * @data: Private data of the requester.
  * Return the callback structure that was registered or an
  * appropriate error on failure.
  */
 static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
-			u8 idx, irqreturn_t (*func) (int irq, void *dev),
+			u8 idx, irq_handler_t handler, irq_handler_t thread_fn,
 			void *data)
 {
 	struct mic_intr_cb *intr_cb;
@@ -117,7 +130,8 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 	if (!intr_cb)
 		return ERR_PTR(-ENOMEM);
 
-	intr_cb->func = func;
+	intr_cb->handler = handler;
+	intr_cb->thread_fn = thread_fn;
 	intr_cb->data = data;
 	intr_cb->cb_id = ida_simple_get(&mdev->irq_info.cb_ida,
 		0, 0, GFP_KERNEL);
@@ -126,9 +140,11 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 		goto ida_fail;
 	}
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
 	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	list_add_tail(&intr_cb->list, &mdev->irq_info.cb_list[idx]);
 	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 
 	return intr_cb;
 ida_fail:
@@ -152,8 +168,9 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 	unsigned long flags;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0;  i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
 			if (intr_cb->cb_id == idx) {
@@ -163,11 +180,13 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 				kfree(intr_cb);
 				spin_unlock_irqrestore(
 					&mdev->irq_info.mic_intr_lock, flags);
+				spin_unlock(&mdev->irq_info.mic_thread_lock);
 				return i;
 			}
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	return MIC_NUM_OFFSETS;
 }
 
@@ -242,6 +261,7 @@ static int mic_setup_callbacks(struct mic_device *mdev)
 		INIT_LIST_HEAD(&mdev->irq_info.cb_list[i]);
 	ida_init(&mdev->irq_info.cb_ida);
 	spin_lock_init(&mdev->irq_info.mic_intr_lock);
+	spin_lock_init(&mdev->irq_info.mic_thread_lock);
 	return 0;
 }
 
@@ -258,14 +278,12 @@ static void mic_release_callbacks(struct mic_device *mdev)
 	struct mic_intr_cb *intr_cb;
 	int i;
 
+	spin_lock(&mdev->irq_info.mic_thread_lock);
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 	for (i = 0; i < MIC_NUM_OFFSETS; i++) {
-		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
 
-		if (list_empty(&mdev->irq_info.cb_list[i])) {
-			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
-					       flags);
+		if (list_empty(&mdev->irq_info.cb_list[i]))
 			break;
-		}
 
 		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
@@ -274,8 +292,9 @@ static void mic_release_callbacks(struct mic_device *mdev)
 					  intr_cb->cb_id);
 			kfree(intr_cb);
 		}
-		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
 	}
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	spin_unlock(&mdev->irq_info.mic_thread_lock);
 	ida_destroy(&mdev->irq_info.cb_ida);
 	kfree(mdev->irq_info.cb_list);
 }
@@ -313,7 +332,8 @@ static int mic_setup_msi(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem2;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt, 0 , "mic-msi", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  0, "mic-msi", mdev);
 	if (rc) {
 		dev_err(&pdev->dev, "Error allocating MSI interrupt\n");
 		goto err_irq_req_fail;
@@ -353,8 +373,8 @@ static int mic_setup_intx(struct mic_device *mdev, struct pci_dev *pdev)
 		goto err_nomem;
 	}
 
-	rc = request_irq(pdev->irq, mic_interrupt,
-		IRQF_SHARED, "mic-intx", mdev);
+	rc = request_threaded_irq(pdev->irq, mic_interrupt, mic_thread_fn,
+				  IRQF_SHARED, "mic-intx", mdev);
 	if (rc)
 		goto err;
 
@@ -391,13 +411,14 @@ int mic_next_db(struct mic_device *mdev)
 #define MK_COOKIE(x, y) ((x) | (y) << COOKIE_ID_SHIFT)
 
 /**
- * mic_request_irq - request an irq. mic_mutex needs
+ * mic_request_threaded_irq - request an irq. mic_mutex needs
  * to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @func: The callback function that handles the interrupt.
+ * @handler: The callback function that handles the interrupt.
  * The function needs to call ack_interrupts
  * (mdev->ops->ack_interrupt(mdev)) when handling the interrupts.
+ * @thread_fn: thread fn required by request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -412,8 +433,8 @@ int mic_next_db(struct mic_device *mdev)
  * error code.
  *
  */
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *dev),
+struct mic_irq *mic_request_threaded_irq(struct mic_device *mdev,
+	irq_handler_t handler, irq_handler_t thread_fn,
 	const char *name, void *data, int intr_src,
 	enum mic_intr_type type)
 {
@@ -444,7 +465,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 			goto err;
 		}
 
-		rc = request_irq(msix->vector, func, 0, name, data);
+		rc = request_threaded_irq(msix->vector, handler, thread_fn,
+					  0, name, data);
 		if (rc) {
 			dev_dbg(mdev->sdev->parent,
 				"request irq failed rc = %d\n", rc);
@@ -458,8 +480,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		dev_dbg(mdev->sdev->parent, "irq: %d assigned for src: %d\n",
 			msix->vector, intr_src);
 	} else {
-		intr_cb = mic_register_intr_callback(mdev,
-				offset, func, data);
+		intr_cb = mic_register_intr_callback(mdev, offset, handler,
+						     thread_fn, data);
 		if (IS_ERR(intr_cb)) {
 			dev_err(mdev->sdev->parent,
 				"No available callback entries for use\n");
@@ -487,9 +509,9 @@ err:
  *  needs to be held before calling this function.
  *
  * @mdev: pointer to mic_device instance
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_threaded_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_threaded_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 6091aa9..6b03f84 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -21,6 +21,8 @@
 #ifndef _MIC_INTR_H_
 #define _MIC_INTR_H_
 
+#include <linux/bitops.h>
+#include <linux/interrupt.h>
 /*
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
@@ -68,7 +70,11 @@ struct mic_intr_info {
  * @num_vectors: The number of MSI/MSI-x vectors that have been allocated.
  * @cb_ida: callback ID allocator to track the callbacks registered.
  * @mic_intr_lock: spinlock to protect the interrupt callback list.
+ * @mic_thread_lock: spinlock to protect the thread callback list.
+ *		   This lock is used to protect against thread_fn while
+ *		   mic_intr_lock is used to protect against interrupt handler.
  * @cb_list: Array of callback lists one for each source.
+ * @mask: Mask used by the main thread fn to call the underlying thread fns.
  */
 struct mic_irq_info {
 	int next_avail_src;
@@ -77,19 +83,23 @@ struct mic_irq_info {
 	u16 num_vectors;
 	struct ida cb_ida;
 	spinlock_t mic_intr_lock;
+	spinlock_t mic_thread_lock;
 	struct list_head *cb_list;
+	unsigned long mask;
 };
 
 /**
  * struct mic_intr_cb - Interrupt callback structure.
  *
- * @func: The callback function
+ * @handler: The callback function
+ * @thread_fn: The thread_fn.
  * @data: Private data of the requester.
  * @cb_id: The callback id. Identifies this callback.
  * @list: list head pointing to the next callback structure.
  */
 struct mic_intr_cb {
-	irqreturn_t (*func) (int irq, void *data);
+	irq_handler_t handler;
+	irq_handler_t thread_fn;
 	void *data;
 	int cb_id;
 	struct list_head list;
@@ -124,8 +134,8 @@ struct mic_hw_intr_ops {
 };
 
 int mic_next_db(struct mic_device *mdev);
-struct mic_irq *mic_request_irq(struct mic_device *mdev,
-	irqreturn_t (*func)(int irq, void *data),
+struct mic_irq *mic_request_threaded_irq(struct mic_device *mdev,
+	irq_handler_t handler, irq_handler_t thread_fn,
 	const char *name, void *data, int intr_src,
 	enum mic_intr_type type);
 
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index c04a021..fdc9c13 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -389,8 +389,9 @@ static int mic_probe(struct pci_dev *pdev,
 	mutex_lock(&mdev->mic_mutex);
 
 	mdev->shutdown_db = mic_next_db(mdev);
-	mdev->shutdown_cookie = mic_request_irq(mdev, mic_shutdown_db,
-		"shutdown-interrupt", mdev, mdev->shutdown_db, MIC_INTR_DB);
+	mdev->shutdown_cookie = mic_request_threaded_irq(mdev, mic_shutdown_db,
+					NULL, "shutdown-interrupt", mdev,
+					mdev->shutdown_db, MIC_INTR_DB);
 	if (IS_ERR(mdev->shutdown_cookie)) {
 		rc = PTR_ERR(mdev->shutdown_cookie);
 		mutex_unlock(&mdev->mic_mutex);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 7e1ef0e..aba3e83 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -594,8 +594,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
 		 mvdev->virtio_id);
 	mvdev->virtio_db = mic_next_db(mdev);
-	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
-			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
+	mvdev->virtio_cookie = mic_request_threaded_irq(mdev,
+					       mic_virtio_intr_handler,
+					       NULL, irqname, mvdev,
+					       mvdev->virtio_db, MIC_INTR_DB);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		dev_dbg(mdev->sdev->parent, "request irq failed\n");
-- 
1.8.2.1


^ permalink raw reply related	[relevance 63%]

* [PATCH char-misc-next 3/8] dma: MIC X100 DMA Driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
  2014-05-28  2:36 91% ` [PATCH char-misc-next 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
  2014-05-28  2:36 63% ` [PATCH char-misc-next 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
@ 2014-05-28  2:36 41% ` Sudeep Dutt
  2014-05-28  2:36 63% ` [PATCH char-misc-next 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch implements the DMA Engine API for the DMA controller on MIC
X100 coprocessors. The DMA h/w is shared between host and card s/w:
channels 0 to 3 are used by the host and channels 4 to 7 by the card.
Since the DMA device does not show up as a PCIe device, a virtual bus
called the mic bus is created and virtual devices are added on that bus
to follow the device model. Allowed DMA transfer directions are host to
card, card to host and card to card.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
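Note: the descriptor ring bookkeeping in this driver can be tried out in
userspace. The sketch below mirrors mic_dma_hw_ring_inc(),
mic_dma_hw_ring_dec(), mic_dma_ring_count() and the descriptor estimate in
mic_dma_prog_memcpy_desc(). It is an illustration only: RING_SIZE is a
hypothetical stand-in for MIC_DMA_DESC_RX_SIZE, and ring_inc(), ring_dec(),
ring_free() and descs_needed() are made-up names. Reading the 3 extra slots
as one fence status descriptor plus the two written by mic_dma_prog_intr()
is an interpretation of mic_dma_do_dma(), not something the patch states.

```c
/* Userspace model of the descriptor ring arithmetic; not kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical; the driver uses MIC_DMA_DESC_RX_SIZE */

/* Advance an index by one slot, wrapping at the end of the ring. */
static uint32_t ring_inc(uint32_t val)
{
	return (val + 1) % RING_SIZE;
}

/* Step an index back by one slot, wrapping at the start of the ring. */
static uint32_t ring_dec(uint32_t val)
{
	return val ? val - 1 : RING_SIZE - 1;
}

/*
 * Free slots between head (producer) and tail (consumer). One slot is
 * always kept unused so that a full ring and an empty ring differ.
 */
static uint32_t ring_free(uint32_t head, uint32_t tail)
{
	uint32_t count;

	assert(head < RING_SIZE && tail < RING_SIZE);
	if (head >= tail)
		count = tail + (RING_SIZE - head);
	else
		count = tail - head;
	return count - 1;
}

/* Slots reserved for a copy of len bytes: ceil(len / max_xfer) + 3. */
static size_t descs_needed(size_t len, size_t max_xfer)
{
	size_t n = len / max_xfer + 3;

	assert(max_xfer > 0);
	if (len % max_xfer)
		n++;
	return n;
}
```

With RING_SIZE set to 8, an empty ring (head equal to tail) reports 7 free
slots, and a copy of max_xfer + 1 bytes reserves 5 slots: two memcpy
descriptors plus the 3 spare ones.
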
 drivers/dma/Kconfig        |  19 ++
 drivers/dma/Makefile       |   1 +
 drivers/dma/mic_x100_dma.c | 774 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h | 286 +++++++++++++++++
 4 files changed, 1080 insertions(+)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 5c58638..39b66a8 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -33,6 +33,25 @@ if DMADEVICES
 
 comment "DMA Devices"
 
+config INTEL_MIC_X100_DMA
+	tristate "Intel MIC X100 DMA Driver"
+	depends on 64BIT && X86 && INTEL_MIC_BUS
+	select DMAENGINE
+	default N
+	help
+	  This enables DMA support for the Intel Many Integrated Core
+	  (MIC) family of PCIe form factor coprocessor X100 devices that
+	  run a 64 bit Linux OS. This driver will be used by both MIC
+	  host and card drivers.
+
+	  If you are building host kernel with a MIC device or a card
+	  kernel for a MIC device, then say M (recommended) or Y, else
+	  say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 config INTEL_MID_DMAC
 	tristate "Intel MID DMA support for Peripheral DMA controllers"
 	depends on PCI && X86
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 5150c82..c933022 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -46,3 +46,4 @@ obj-$(CONFIG_K3_DMA) += k3dma.o
 obj-$(CONFIG_MOXART_DMA) += moxart-dma.o
 obj-$(CONFIG_FSL_EDMA) += fsl-edma.o
 obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o
+obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
diff --git a/drivers/dma/mic_x100_dma.c b/drivers/dma/mic_x100_dma.c
new file mode 100644
index 0000000..6aec4df
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.c
@@ -0,0 +1,774 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#include <linux/module.h>
+#include <linux/io.h>
+#include <linux/seq_file.h>
+
+#include "mic_x100_dma.h"
+
+#define MIC_DMA_MAX_XFER_SIZE_CARD  (1 * 1024 * 1024 -\
+				       MIC_DMA_ALIGN_BYTES)
+#define MIC_DMA_MAX_XFER_SIZE_HOST  (1 * 1024 * 1024 >> 1)
+#define MIC_DMA_DESC_TYPE_SHIFT	60
+#define MIC_DMA_MEMCPY_LEN_SHIFT 46
+#define MIC_DMA_STAT_INTR_SHIFT 59
+
+/* high-water mark for pushing dma descriptors */
+static int mic_dma_pending_level = 4;
+
+/* Status descriptor is used to write a 64 bit value to a memory location */
+enum mic_dma_desc_format_type {
+	MIC_DMA_MEMCPY = 1,
+	MIC_DMA_STATUS,
+};
+
+static inline u32 mic_dma_hw_ring_inc(u32 val)
+{
+	return (val + 1) % MIC_DMA_DESC_RX_SIZE;
+}
+
+static inline u32 mic_dma_hw_ring_dec(u32 val)
+{
+	return val ? val - 1 : MIC_DMA_DESC_RX_SIZE - 1;
+}
+
+static inline void mic_dma_hw_ring_inc_head(struct mic_dma_chan *ch)
+{
+	ch->head = mic_dma_hw_ring_inc(ch->head);
+}
+
+/* Prepare a memcpy desc */
+static inline void mic_dma_memcpy_desc(struct mic_dma_desc *desc,
+	dma_addr_t src_phys, dma_addr_t dst_phys, u64 size)
+{
+	u64 qw0, qw1;
+
+	qw0 = src_phys;
+	qw0 |= (size >> MIC_DMA_ALIGN_SHIFT) << MIC_DMA_MEMCPY_LEN_SHIFT;
+	qw1 = MIC_DMA_MEMCPY;
+	qw1 <<= MIC_DMA_DESC_TYPE_SHIFT;
+	qw1 |= dst_phys;
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+/* Prepare a status desc. with @data to be written at @dst_phys */
+static inline void mic_dma_prep_status_desc(struct mic_dma_desc *desc, u64 data,
+	dma_addr_t dst_phys, bool generate_intr)
+{
+	u64 qw0, qw1;
+
+	qw0 = data;
+	qw1 = (u64) MIC_DMA_STATUS << MIC_DMA_DESC_TYPE_SHIFT | dst_phys;
+	if (generate_intr)
+		qw1 |= (1ULL << MIC_DMA_STAT_INTR_SHIFT);
+	desc->qw0 = qw0;
+	desc->qw1 = qw1;
+}
+
+static void mic_dma_cleanup(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	u32 tail;
+	u32 last_tail;
+
+	spin_lock(&ch->cleanup_lock);
+	tail = mic_dma_read_cmp_cnt(ch);
+	/*
+	 * This is the barrier pair for smp_wmb() in fn.
+	 * mic_dma_tx_submit_unlock. It's required so that we read the
+	 * updated cookie value from tx->cookie.
+	 */
+	smp_rmb();
+	for (last_tail = ch->last_tail; tail != last_tail;) {
+		tx = &ch->tx_array[last_tail];
+		if (tx->cookie) {
+			dma_cookie_complete(tx);
+			if (tx->callback) {
+				tx->callback(tx->callback_param);
+				tx->callback = NULL;
+			}
+		}
+		last_tail = mic_dma_hw_ring_inc(last_tail);
+	}
+	/* finish all completion callbacks before incrementing tail */
+	smp_mb();
+	ch->last_tail = last_tail;
+	spin_unlock(&ch->cleanup_lock);
+}
+
+static u32 mic_dma_ring_count(u32 head, u32 tail)
+{
+	u32 count;
+
+	if (head >= tail)
+		count = (tail - 0) + (MIC_DMA_DESC_RX_SIZE - head);
+	else
+		count = tail - head;
+	return count - 1;
+}
+
+/* Returns the num. of free descriptors on success, -ENOMEM on failure */
+static int mic_dma_avail_desc_ring_space(struct mic_dma_chan *ch, int required)
+{
+	struct device *dev = mic_dma_ch_to_device(ch);
+	u32 count;
+
+	count = mic_dma_ring_count(ch->head, ch->last_tail);
+	if (count < required) {
+		mic_dma_cleanup(ch);
+		count = mic_dma_ring_count(ch->head, ch->last_tail);
+	}
+
+	if (count < required) {
+		dev_dbg(dev, "Not enough desc space\n");
+		dev_dbg(dev, "%s %d required=%u, avail=%u\n",
+			__func__, __LINE__, required, count);
+		return -ENOMEM;
+	} else {
+		return count;
+	}
+}
+
+/* Program memcpy descriptors into the descriptor ring and update s/w head ptr*/
+static int mic_dma_prog_memcpy_desc(struct mic_dma_chan *ch, dma_addr_t src,
+				    dma_addr_t dst, size_t len)
+{
+	size_t current_transfer_len;
+	size_t max_xfer_size = to_mic_dma_dev(ch)->max_xfer_size;
+	/* 3 is added to make sure we have enough space for status desc */
+	int num_desc = len / max_xfer_size + 3;
+	int ret;
+
+	if (len % max_xfer_size)
+		num_desc++;
+
+	ret = mic_dma_avail_desc_ring_space(ch, num_desc);
+	if (ret < 0)
+		return ret;
+	do {
+		current_transfer_len = min(len, max_xfer_size);
+		mic_dma_memcpy_desc(&ch->desc_ring[ch->head],
+				    src, dst, current_transfer_len);
+		mic_dma_hw_ring_inc_head(ch);
+		len -= current_transfer_len;
+		dst = dst + current_transfer_len;
+		src = src + current_transfer_len;
+	} while (len > 0);
+	return 0;
+}
+
+/* It's a h/w quirk and h/w needs 2 status descriptors for every status desc */
+static void mic_dma_prog_intr(struct mic_dma_chan *ch)
+{
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, false);
+	mic_dma_hw_ring_inc_head(ch);
+	mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+				 ch->status_dest_micpa, true);
+	mic_dma_hw_ring_inc_head(ch);
+}
+
+/* Wrapper function to program memcpy descriptors/status descriptors */
+static int mic_dma_do_dma(struct mic_dma_chan *ch, int flags, dma_addr_t src,
+			  dma_addr_t dst, size_t len)
+{
+	if (-ENOMEM == mic_dma_prog_memcpy_desc(ch, src, dst, len))
+		return -ENOMEM;
+	/* Above mic_dma_prog_memcpy_desc() makes sure we have enough space */
+	if (flags & DMA_PREP_FENCE) {
+		mic_dma_prep_status_desc(&ch->desc_ring[ch->head], 0,
+					 ch->status_dest_micpa, false);
+		mic_dma_hw_ring_inc_head(ch);
+	}
+
+	if (flags & DMA_PREP_INTERRUPT)
+		mic_dma_prog_intr(ch);
+
+	return 0;
+}
+
+static inline void mic_dma_issue_pending(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	spin_lock(&mic_ch->issue_lock);
+	/*
+	 * Write to head triggers h/w to act on the descriptors.
+	 * On MIC, writing the same head value twice causes
+	 * a h/w error. On second write, h/w assumes we filled
+	 * the entire ring & overwrote some of the descriptors.
+	 */
+	if (mic_ch->issued == mic_ch->submitted)
+		goto out;
+	mic_ch->issued = mic_ch->submitted;
+	/*
+	 * make descriptor updates visible before advancing head,
+	 * this is purposefully not smp_wmb() since we are also
+	 * publishing the descriptor updates to a dma device
+	 */
+	wmb();
+	mic_dma_write_reg(mic_ch, MIC_DMA_REG_DHPR, mic_ch->issued);
+out:
+	spin_unlock(&mic_ch->issue_lock);
+}
+
+static inline void mic_dma_update_pending(struct mic_dma_chan *ch)
+{
+	if (mic_dma_ring_count(ch->issued, ch->submitted)
+			> mic_dma_pending_level)
+		mic_dma_issue_pending(&ch->api_ch);
+}
+
+static dma_cookie_t mic_dma_tx_submit_unlock(struct dma_async_tx_descriptor *tx)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(tx->chan);
+	dma_cookie_t cookie;
+
+	dma_cookie_assign(tx);
+	cookie = tx->cookie;
+	/*
+	 * We need an smp write barrier here because another CPU might see
+	 * an update to submitted and update h/w head even before we
+	 * assigned a cookie to this tx.
+	 */
+	smp_wmb();
+	mic_ch->submitted = mic_ch->head;
+	spin_unlock(&mic_ch->prep_lock);
+	mic_dma_update_pending(mic_ch);
+	return cookie;
+}
+
+static inline struct dma_async_tx_descriptor *
+allocate_tx(struct mic_dma_chan *ch)
+{
+	u32 idx = mic_dma_hw_ring_dec(ch->head);
+	struct dma_async_tx_descriptor *tx = &ch->tx_array[idx];
+
+	dma_async_tx_descriptor_init(tx, &ch->api_ch);
+	tx->tx_submit = mic_dma_tx_submit_unlock;
+	return tx;
+}
+
+/*
+ * Prepare a memcpy descriptor to be added to the ring.
+ * Note that the temporary descriptor adds an extra overhead of copying the
+ * descriptor to ring. So, we copy directly to the descriptor ring
+ */
+static struct dma_async_tx_descriptor *
+mic_dma_prep_memcpy_lock(struct dma_chan *ch, dma_addr_t dma_dest,
+			 dma_addr_t dma_src, size_t len, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	struct device *dev = mic_dma_ch_to_device(mic_ch);
+	int result;
+
+	if (!len && !flags)
+		return NULL;
+
+	spin_lock(&mic_ch->prep_lock);
+	result = mic_dma_do_dma(mic_ch, flags, dma_src, dma_dest, len);
+	if (result >= 0)
+		return allocate_tx(mic_ch);
+	dev_err(dev, "Error enqueueing dma, error=%d\n", result);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
+mic_dma_prep_interrupt_lock(struct dma_chan *ch, unsigned long flags)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	int ret;
+
+	spin_lock(&mic_ch->prep_lock);
+	ret = mic_dma_do_dma(mic_ch, flags, 0, 0, 0);
+	if (!ret)
+		return allocate_tx(mic_ch);
+	spin_unlock(&mic_ch->prep_lock);
+	return NULL;
+}
+
+/* Return the status of the transaction */
+static enum dma_status
+mic_dma_tx_status(struct dma_chan *ch, dma_cookie_t cookie,
+		  struct dma_tx_state *txstate)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+
+	if (DMA_COMPLETE != dma_cookie_status(ch, cookie, txstate))
+		mic_dma_cleanup(mic_ch);
+
+	return dma_cookie_status(ch, cookie, txstate);
+}
+
+static irqreturn_t mic_dma_thread_fn(int irq, void *data)
+{
+	mic_dma_cleanup((struct mic_dma_chan *)data);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t mic_dma_intr_handler(int irq, void *data)
+{
+	struct mic_dma_chan *ch = ((struct mic_dma_chan *)data);
+
+	mic_dma_ack_interrupt(ch);
+	return IRQ_WAKE_THREAD;
+}
+
+static int mic_dma_alloc_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	ch->desc_ring = kzalloc(desc_ring_size, GFP_KERNEL);
+
+	if (!ch->desc_ring)
+		return -ENOMEM;
+
+	ch->desc_ring_micpa = dma_map_single(dev, ch->desc_ring,
+					     desc_ring_size, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->desc_ring_micpa))
+		goto map_error;
+
+	ch->tx_array = vzalloc(MIC_DMA_DESC_RX_SIZE * sizeof(*ch->tx_array));
+	if (!ch->tx_array)
+		goto tx_error;
+	return 0;
+tx_error:
+	dma_unmap_single(dev, ch->desc_ring_micpa, desc_ring_size,
+			 DMA_BIDIRECTIONAL);
+map_error:
+	kfree(ch->desc_ring);
+	return -ENOMEM;
+}
+
+static void mic_dma_free_desc_ring(struct mic_dma_chan *ch)
+{
+	u64 desc_ring_size = MIC_DMA_DESC_RX_SIZE * sizeof(*ch->desc_ring);
+
+	vfree(ch->tx_array);
+	desc_ring_size = ALIGN(desc_ring_size, MIC_DMA_ALIGN_BYTES);
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->desc_ring_micpa,
+			 desc_ring_size, DMA_BIDIRECTIONAL);
+	kfree(ch->desc_ring);
+	ch->desc_ring = NULL;
+}
+
+static void mic_dma_free_status_dest(struct mic_dma_chan *ch)
+{
+	dma_unmap_single(&to_mbus_device(ch)->dev, ch->status_dest_micpa,
+			 L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	kfree(ch->status_dest);
+}
+
+static int mic_dma_alloc_status_dest(struct mic_dma_chan *ch)
+{
+	struct device *dev = &to_mbus_device(ch)->dev;
+
+	ch->status_dest = kzalloc(L1_CACHE_BYTES, GFP_KERNEL);
+	if (!ch->status_dest)
+		return -ENOMEM;
+	ch->status_dest_micpa = dma_map_single(dev, ch->status_dest,
+					L1_CACHE_BYTES, DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(dev, ch->status_dest_micpa)) {
+		kfree(ch->status_dest);
+		ch->status_dest = NULL;
+		return -ENOMEM;
+	}
+	return 0;
+}
+
+static int mic_dma_check_chan(struct mic_dma_chan *ch)
+{
+	if (mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR) ||
+	    mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) & MIC_DMA_CHAN_QUIESCE) {
+		mic_dma_disable_chan(ch);
+		mic_dma_chan_mask_intr(ch);
+		dev_err(mic_dma_ch_to_device(ch),
+			"%s %d error setting up mic dma chan %d\n",
+			__func__, __LINE__, ch->ch_num);
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static int mic_dma_chan_setup(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		mic_dma_chan_set_owner(ch);
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCHERRMSK, 0);
+	mic_dma_chan_set_desc_ring(ch);
+	ch->last_tail = mic_dma_read_reg(ch, MIC_DMA_REG_DTPR);
+	ch->head = ch->last_tail;
+	ch->issued = 0;
+	mic_dma_chan_unmask_intr(ch);
+	mic_dma_enable_chan(ch);
+	return mic_dma_check_chan(ch);
+}
+
+static void mic_dma_chan_destroy(struct mic_dma_chan *ch)
+{
+	mic_dma_disable_chan(ch);
+	mic_dma_chan_mask_intr(ch);
+}
+
+static void mic_dma_unregister_dma_device(struct mic_dma_device *mic_dma_dev)
+{
+	dma_async_device_unregister(&mic_dma_dev->dma_dev);
+}
+
+static int mic_dma_setup_irq(struct mic_dma_chan *ch)
+{
+	ch->cookie =
+		to_mbus_hw_ops(ch)->request_threaded_irq(to_mbus_device(ch),
+			mic_dma_intr_handler, mic_dma_thread_fn,
+			"mic dma_channel", ch, ch->ch_num);
+	if (IS_ERR(ch->cookie))
+		return PTR_ERR(ch->cookie);
+	return 0;
+}
+
+static inline void mic_dma_free_irq(struct mic_dma_chan *ch)
+{
+	to_mbus_hw_ops(ch)->free_irq(to_mbus_device(ch), ch->cookie, ch);
+}
+
+static int mic_dma_chan_init(struct mic_dma_chan *ch)
+{
+	int ret = mic_dma_alloc_desc_ring(ch);
+
+	if (ret)
+		goto ring_error;
+	ret = mic_dma_alloc_status_dest(ch);
+	if (ret)
+		goto status_error;
+	ret = mic_dma_chan_setup(ch);
+	if (ret)
+		goto chan_error;
+	return ret;
+chan_error:
+	mic_dma_free_status_dest(ch);
+status_error:
+	mic_dma_free_desc_ring(ch);
+ring_error:
+	return ret;
+}
+
+static int mic_dma_drain_chan(struct mic_dma_chan *ch)
+{
+	struct dma_async_tx_descriptor *tx;
+	int err = 0;
+	dma_cookie_t cookie;
+
+	tx = mic_dma_prep_memcpy_lock(&ch->api_ch, 0, 0, 0, DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	cookie = tx->tx_submit(tx);
+	if (dma_submit_error(cookie))
+		err = -ENOMEM;
+	else
+		err = dma_sync_wait(&ch->api_ch, cookie);
+	if (err) {
+		dev_err(mic_dma_ch_to_device(ch), "%s %d TO chan 0x%x\n",
+			__func__, __LINE__, ch->ch_num);
+		err = -EIO;
+	}
+error:
+	mic_dma_cleanup(ch);
+	return err;
+}
+
+static inline void mic_dma_chan_uninit(struct mic_dma_chan *ch)
+{
+	mic_dma_chan_destroy(ch);
+	mic_dma_cleanup(ch);
+	mic_dma_free_status_dest(ch);
+	mic_dma_free_desc_ring(ch);
+}
+
+static int mic_dma_init(struct mic_dma_device *mic_dma_dev,
+			enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+	int ret;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		unsigned long data;
+		ch = &mic_dma_dev->mic_ch[i];
+		data = (unsigned long)ch;
+		ch->ch_num = i;
+		ch->owner = owner;
+		spin_lock_init(&ch->cleanup_lock);
+		spin_lock_init(&ch->prep_lock);
+		spin_lock_init(&ch->issue_lock);
+		ret = mic_dma_setup_irq(ch);
+		if (ret)
+			goto error;
+	}
+	return 0;
+error:
+	for (i = i - 1; i >= first_chan; i--)
+		mic_dma_free_irq(&mic_dma_dev->mic_ch[i]);
+	return ret;
+}
+
+static void mic_dma_uninit(struct mic_dma_device *mic_dma_dev)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		mic_dma_free_irq(ch);
+	}
+}
+
+static int mic_dma_alloc_chan_resources(struct dma_chan *ch)
+{
+	int ret = mic_dma_chan_init(to_mic_dma_chan(ch));
+	if (ret)
+		return ret;
+	return MIC_DMA_DESC_RX_SIZE;
+}
+
+static void mic_dma_free_chan_resources(struct dma_chan *ch)
+{
+	struct mic_dma_chan *mic_ch = to_mic_dma_chan(ch);
+	mic_dma_drain_chan(mic_ch);
+	mic_dma_chan_uninit(mic_ch);
+}
+
+/* Set the fn. handlers and register the dma device with dma api */
+static int mic_dma_register_dma_device(struct mic_dma_device *mic_dma_dev,
+				       enum mic_dma_chan_owner owner)
+{
+	int i, first_chan = mic_dma_dev->start_ch;
+
+	dma_cap_zero(mic_dma_dev->dma_dev.cap_mask);
+	/*
+	 * This dma engine is not capable of host memory to host memory
+	 * transfers
+	 */
+	dma_cap_set(DMA_MEMCPY, mic_dma_dev->dma_dev.cap_mask);
+
+	if (MIC_DMA_CHAN_HOST == owner)
+		dma_cap_set(DMA_PRIVATE, mic_dma_dev->dma_dev.cap_mask);
+	mic_dma_dev->dma_dev.device_alloc_chan_resources =
+		mic_dma_alloc_chan_resources;
+	mic_dma_dev->dma_dev.device_free_chan_resources =
+		mic_dma_free_chan_resources;
+	mic_dma_dev->dma_dev.device_tx_status = mic_dma_tx_status;
+	mic_dma_dev->dma_dev.device_prep_dma_memcpy = mic_dma_prep_memcpy_lock;
+	mic_dma_dev->dma_dev.device_prep_dma_interrupt =
+		mic_dma_prep_interrupt_lock;
+	mic_dma_dev->dma_dev.device_issue_pending = mic_dma_issue_pending;
+	mic_dma_dev->dma_dev.copy_align = MIC_DMA_ALIGN_SHIFT;
+	INIT_LIST_HEAD(&mic_dma_dev->dma_dev.channels);
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		mic_dma_dev->mic_ch[i].api_ch.device = &mic_dma_dev->dma_dev;
+		dma_cookie_init(&mic_dma_dev->mic_ch[i].api_ch);
+		list_add_tail(&mic_dma_dev->mic_ch[i].api_ch.device_node,
+			      &mic_dma_dev->dma_dev.channels);
+	}
+	return dma_async_device_register(&mic_dma_dev->dma_dev);
+}
+
+/*
+ * Initializes dma channels and registers the dma device with the
+ * dma engine api.
+ */
+static struct mic_dma_device *mic_dma_dev_reg(struct mbus_device *mbdev,
+					      enum mic_dma_chan_owner owner)
+{
+	struct mic_dma_device *mic_dma_dev;
+	int ret;
+	struct device *dev = &mbdev->dev;
+
+	mic_dma_dev = kzalloc(sizeof(*mic_dma_dev), GFP_KERNEL);
+	if (!mic_dma_dev) {
+		ret = -ENOMEM;
+		goto alloc_error;
+	}
+	mic_dma_dev->mbdev = mbdev;
+	mic_dma_dev->dma_dev.dev = dev;
+	mic_dma_dev->mmio = mbdev->mmio_va;
+	if (MIC_DMA_CHAN_HOST == owner) {
+		mic_dma_dev->start_ch = 0;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_HOST;
+	} else {
+		mic_dma_dev->start_ch = 4;
+		mic_dma_dev->max_xfer_size = MIC_DMA_MAX_XFER_SIZE_CARD;
+	}
+	ret = mic_dma_init(mic_dma_dev, owner);
+	if (ret)
+		goto init_error;
+	ret = mic_dma_register_dma_device(mic_dma_dev, owner);
+	if (ret)
+		goto reg_error;
+	return mic_dma_dev;
+reg_error:
+	mic_dma_uninit(mic_dma_dev);
+init_error:
+	kfree(mic_dma_dev);
+	mic_dma_dev = NULL;
+alloc_error:
+	dev_err(dev, "Error at %s %d ret=%d\n", __func__, __LINE__, ret);
+	return mic_dma_dev;
+}
+
+static void mic_dma_dev_unreg(struct mic_dma_device *mic_dma_dev)
+{
+	mic_dma_unregister_dma_device(mic_dma_dev);
+	mic_dma_uninit(mic_dma_dev);
+	kfree(mic_dma_dev);
+}
+
+/* DEBUGFS CODE */
+static int mic_dma_reg_seq_show(struct seq_file *s, void *pos)
+{
+	struct mic_dma_device *mic_dma_dev = s->private;
+	int i, chan_num, first_chan = mic_dma_dev->start_ch;
+	struct mic_dma_chan *ch;
+
+	seq_printf(s, "SBOX_DCR: %#x\n",
+		   mic_dma_mmio_read(&mic_dma_dev->mic_ch[first_chan],
+				     MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR));
+	seq_puts(s, "DMA Channel Registers\n");
+	seq_printf(s, "%-10s| %-10s %-10s %-10s %-10s %-10s",
+		   "Channel", "DCAR", "DTPR", "DHPR", "DRAR_HI", "DRAR_LO");
+	seq_printf(s, " %-11s %-14s %-10s\n", "DCHERR", "DCHERRMSK", "DSTAT");
+	for (i = first_chan; i < first_chan + MIC_DMA_NUM_CHAN; i++) {
+		ch = &mic_dma_dev->mic_ch[i];
+		chan_num = ch->ch_num;
+		seq_printf(s, "%-10i| %-#10x %-#10x %-#10x %-#10x",
+			   chan_num,
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCAR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DTPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DHPR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_HI));
+		seq_printf(s, " %-#10x %-#10x %-#14x %-#10x\n",
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DRAR_LO),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERR),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DCHERRMSK),
+			   mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT));
+	}
+	return 0;
+}
+
+static int mic_dma_reg_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_dma_reg_seq_show, inode->i_private);
+}
+
+static int mic_dma_reg_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations mic_dma_reg_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_dma_reg_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_dma_reg_debug_release
+};
+
+/* Debugfs parent dir */
+static struct dentry *mic_dma_dbg;
+
+static int mic_dma_driver_probe(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+	enum mic_dma_chan_owner owner;
+
+	if (MBUS_DEV_DMA_MIC == mbdev->id.device)
+		owner = MIC_DMA_CHAN_MIC;
+	else
+		owner = MIC_DMA_CHAN_HOST;
+
+	mic_dma_dev = mic_dma_dev_reg(mbdev, owner);
+	mbdev->priv = mic_dma_dev;
+
+	if (mic_dma_dbg) {
+		mic_dma_dev->dbg_dir = debugfs_create_dir(dev_name(&mbdev->dev),
+							  mic_dma_dbg);
+		if (mic_dma_dev->dbg_dir)
+			debugfs_create_file("mic_dma_reg", 0444,
+					    mic_dma_dev->dbg_dir, mic_dma_dev,
+					    &mic_dma_reg_ops);
+	}
+	return 0;
+}
+
+static void mic_dma_driver_remove(struct mbus_device *mbdev)
+{
+	struct mic_dma_device *mic_dma_dev;
+
+	mic_dma_dev = (struct mic_dma_device *)mbdev->priv;
+	debugfs_remove_recursive(mic_dma_dev->dbg_dir);
+	mic_dma_dev_unreg(mic_dma_dev);
+}
+
+static struct mbus_device_id id_table[] = {
+	{MBUS_DEV_DMA_MIC, MBUS_DEV_ANY_ID},
+	{MBUS_DEV_DMA_HOST, MBUS_DEV_ANY_ID},
+	{0},
+};
+
+static struct mbus_driver mic_dma_driver = {
+	.driver.name =	KBUILD_MODNAME,
+	.driver.owner =	THIS_MODULE,
+	.id_table = id_table,
+	.probe = mic_dma_driver_probe,
+	.remove = mic_dma_driver_remove,
+};
+
+static int __init mic_x100_dma_init(void)
+{
+	int rc = register_mbus_driver(&mic_dma_driver);
+	if (rc)
+		return rc;
+	mic_dma_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	return 0;
+}
+
+static void __exit mic_x100_dma_exit(void)
+{
+	debugfs_remove_recursive(mic_dma_dbg);
+	unregister_mbus_driver(&mic_dma_driver);
+}
+
+module_init(mic_x100_dma_init);
+module_exit(mic_x100_dma_exit);
+
+MODULE_DEVICE_TABLE(mbus, id_table);
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC X100 DMA Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/dma/mic_x100_dma.h b/drivers/dma/mic_x100_dma.h
new file mode 100644
index 0000000..f663b0b
--- /dev/null
+++ b/drivers/dma/mic_x100_dma.h
@@ -0,0 +1,286 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC X100 DMA Driver.
+ *
+ * Adapted from IOAT dma driver.
+ */
+#ifndef _MIC_X100_DMA_H_
+#define _MIC_X100_DMA_H_
+
+#include <linux/kernel.h>
+#include <linux/delay.h>
+#include <linux/sched.h>
+#include <linux/debugfs.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/mic_bus.h>
+
+#include "dmaengine.h"
+
+/*
+ * MIC has a total of 8 dma channels.
+ * Four channels are assigned for host SW use & the remaining for MIC SW.
+ * MIC DMA transfer size & addresses need to be 64 byte aligned.
+ */
+#define MIC_DMA_MAX_NUM_CHAN	8
+#define MIC_DMA_NUM_CHAN	4
+#define MIC_DMA_ALIGN_SHIFT	6
+#define MIC_DMA_ALIGN_BYTES	(1 << MIC_DMA_ALIGN_SHIFT)
+#define MIC_DMA_DESC_RX_SIZE	(128 * 1024 - 4)
+
+/*
+ * Register descriptions
+ * All the registers are 32 bit registers.
+ * DCR is a global register and all others are per-channel.
+ * DCR - bits 0, 2, 4, 6, 8, 10, 12, 14 - enable bits for channels 0 to 7
+ *	 bits 1, 3, 5, 7, 9, 11, 13, 15 - owner bits for channels 0 to 7
+ * DCAR - bit 24 & 25 interrupt masks for mic owned & host owned channels
+ * DHPR - head of the descriptor ring updated by s/w
+ * DTPR - tail of the descriptor ring updated by h/w
+ * DRAR_LO - lower 32 bits of descriptor ring's mic address
+ * DRAR_HI - 3:0 - remaining 4 bits of descriptor ring's mic address
+ *	     20:4 descriptor ring size
+ *	     25:21 mic smpt entry number
+ * DSTAT - 16:0 h/w completion count; 31:28 dma engine status
+ * DCHERR - this register is non-zero on error
+ * DCHERRMSK - interrupt mask register
+ */
+#define MIC_DMA_HW_CMP_CNT_MASK		0x1ffff
+#define MIC_DMA_CHAN_QUIESCE		0x20000000
+#define MIC_DMA_SBOX_BASE		0x00010000
+#define MIC_DMA_SBOX_DCR		0x0000A280
+#define MIC_DMA_SBOX_CH_BASE		0x0001A000
+#define MIC_DMA_SBOX_CHAN_OFF		0x40
+#define MIC_DMA_SBOX_DCAR_IM0		(0x1 << 24)
+#define MIC_DMA_SBOX_DCAR_IM1		(0x1 << 25)
+#define MIC_DMA_SBOX_DRARHI_SYS_MASK	(0x1 << 26)
+#define MIC_DMA_REG_DCAR		0
+#define MIC_DMA_REG_DHPR		4
+#define MIC_DMA_REG_DTPR		8
+#define MIC_DMA_REG_DRAR_LO		20
+#define MIC_DMA_REG_DRAR_HI		24
+#define MIC_DMA_REG_DSTAT		32
+#define MIC_DMA_REG_DCHERR		44
+#define MIC_DMA_REG_DCHERRMSK		48
+
+/* HW dma desc */
+struct mic_dma_desc {
+	u64 qw0;
+	u64 qw1;
+};
+
+enum mic_dma_chan_owner {
+	MIC_DMA_CHAN_MIC = 0,
+	MIC_DMA_CHAN_HOST
+};
+
+/*
+ * mic_dma_chan - channel specific information
+ * @ch_num: channel number
+ * @owner: owner of this channel
+ * @last_tail: cached value of descriptor ring tail
+ * @head: index of next descriptor in desc_ring
+ * @issued: hardware notification point
+ * @submitted: index that will be used to submit descriptors to h/w
+ * @api_ch: dma engine api channel
+ * @desc_ring: dma descriptor ring
+ * @desc_ring_micpa: mic physical address of desc_ring
+ * @status_dest: destination for status (fence) descriptor
+ * @status_dest_micpa: mic address for status_dest,
+ *		       DMA controller uses this address
+ * @tx_array: array of async_tx
+ * @cleanup_lock: lock held when processing completed tx
+ * @prep_lock: lock held in prep_memcpy & released in tx_submit
+ * @issue_lock: lock used to synchronize writes to head
+ * @cookie: mic_irq cookie used with mic irq request
+ */
+struct mic_dma_chan {
+	int ch_num;
+	enum mic_dma_chan_owner owner;
+	u32 last_tail;
+	u32 head;
+	u32 issued;
+	u32 submitted;
+	struct dma_chan api_ch;
+	struct mic_dma_desc *desc_ring;
+	dma_addr_t desc_ring_micpa;
+	u64 *status_dest;
+	dma_addr_t status_dest_micpa;
+	struct dma_async_tx_descriptor *tx_array;
+	spinlock_t cleanup_lock;
+	spinlock_t prep_lock;
+	spinlock_t issue_lock;
+	struct mic_irq *cookie;
+};
+
+/*
+ * struct mic_dma_device - per mic device
+ * @mic_ch: dma channels
+ * @dma_dev: underlying dma device
+ * @mbdev: mic bus dma device
+ * @mmio: virtual address of the mmio space
+ * @dbg_dir: debugfs directory
+ * @start_ch: first channel number that can be used
+ * @max_xfer_size: maximum transfer size per dma descriptor
+ */
+struct mic_dma_device {
+	struct mic_dma_chan mic_ch[MIC_DMA_MAX_NUM_CHAN];
+	struct dma_device dma_dev;
+	struct mbus_device *mbdev;
+	void __iomem *mmio;
+	struct dentry *dbg_dir;
+	int start_ch;
+	size_t max_xfer_size;
+};
+
+static inline struct mic_dma_chan *to_mic_dma_chan(struct dma_chan *ch)
+{
+	return container_of(ch, struct mic_dma_chan, api_ch);
+}
+
+static inline struct mic_dma_device *to_mic_dma_dev(struct mic_dma_chan *ch)
+{
+	return
+	container_of((const typeof(((struct mic_dma_device *)0)->mic_ch)*)
+		     (ch - ch->ch_num), struct mic_dma_device, mic_ch);
+}
+
+static inline struct mbus_device *to_mbus_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mbdev;
+}
+
+static inline struct mbus_hw_ops *to_mbus_hw_ops(struct mic_dma_chan *ch)
+{
+	return to_mbus_device(ch)->hw_ops;
+}
+
+static inline struct device *mic_dma_ch_to_device(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->dma_dev.dev;
+}
+
+static inline void __iomem *mic_dma_chan_to_mmio(struct mic_dma_chan *ch)
+{
+	return to_mic_dma_dev(ch)->mmio;
+}
+
+static inline u32 mic_dma_read_reg(struct mic_dma_chan *ch, u32 reg)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+			ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline void mic_dma_write_reg(struct mic_dma_chan *ch, u32 reg, u32 val)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + MIC_DMA_SBOX_CH_BASE +
+		  ch->ch_num * MIC_DMA_SBOX_CHAN_OFF + reg);
+}
+
+static inline u32 mic_dma_mmio_read(struct mic_dma_chan *ch, u32 offset)
+{
+	return ioread32(mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline void mic_dma_mmio_write(struct mic_dma_chan *ch, u32 val,
+				      u32 offset)
+{
+	iowrite32(val, mic_dma_chan_to_mmio(ch) + offset);
+}
+
+static inline u32 mic_dma_read_cmp_cnt(struct mic_dma_chan *ch)
+{
+	return mic_dma_read_reg(ch, MIC_DMA_REG_DSTAT) &
+	       MIC_DMA_HW_CMP_CNT_MASK;
+}
+
+static inline void mic_dma_chan_set_owner(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+	u32 chan_num = ch->ch_num;
+
+	dcr = (dcr & ~(0x1 << (chan_num * 2))) | (ch->owner << (chan_num * 2));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_enable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr |= 2 << (ch->ch_num << 1);
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static inline void mic_dma_disable_chan(struct mic_dma_chan *ch)
+{
+	u32 dcr = mic_dma_mmio_read(ch, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+
+	dcr &= ~(2 << (ch->ch_num << 1));
+	mic_dma_mmio_write(ch, dcr, MIC_DMA_SBOX_BASE + MIC_DMA_SBOX_DCR);
+}
+
+static void mic_dma_chan_set_desc_ring(struct mic_dma_chan *ch)
+{
+	u32 drar_hi;
+	dma_addr_t desc_ring_micpa = ch->desc_ring_micpa;
+
+	drar_hi = (MIC_DMA_DESC_RX_SIZE & 0x1ffff) << 4;
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		drar_hi |= (desc_ring_micpa >> 32) & 0xf;
+	} else {
+		drar_hi |= MIC_DMA_SBOX_DRARHI_SYS_MASK;
+		drar_hi |= ((desc_ring_micpa >> 34)
+			    & 0x1f) << 21;
+		drar_hi |= (desc_ring_micpa >> 32) & 0x3;
+	}
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_LO, (u32) desc_ring_micpa);
+	mic_dma_write_reg(ch, MIC_DMA_REG_DRAR_HI, drar_hi);
+}
+
+static inline void mic_dma_chan_mask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar |= MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar |= MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static inline void mic_dma_chan_unmask_intr(struct mic_dma_chan *ch)
+{
+	u32 dcar = mic_dma_read_reg(ch, MIC_DMA_REG_DCAR);
+
+	if (MIC_DMA_CHAN_MIC == ch->owner)
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM0;
+	else
+		dcar &= ~MIC_DMA_SBOX_DCAR_IM1;
+	mic_dma_write_reg(ch, MIC_DMA_REG_DCAR, dcar);
+}
+
+static void mic_dma_ack_interrupt(struct mic_dma_chan *ch)
+{
+	if (MIC_DMA_CHAN_MIC == ch->owner) {
+		/* HW errata */
+		mic_dma_chan_mask_intr(ch);
+		mic_dma_chan_unmask_intr(ch);
+	}
+	to_mbus_hw_ops(ch)->ack_interrupt(to_mbus_device(ch), ch->ch_num);
+}
+#endif
-- 
1.8.2.1
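
For anyone checking the register comments in mic_x100_dma.h, the DRAR_HI
packing done by mic_dma_chan_set_desc_ring() can be reproduced in user
space. This is only a sketch under stated assumptions: pack_drar_hi() and
enum owner are hypothetical names invented for illustration, the two
constants are copied from the header in this patch, and nothing here
touches real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from mic_x100_dma.h in this patch. */
#define MIC_DMA_DESC_RX_SIZE		(128 * 1024 - 4)
#define MIC_DMA_SBOX_DRARHI_SYS_MASK	(0x1u << 26)

/* Hypothetical stand-in for enum mic_dma_chan_owner. */
enum owner { OWNER_MIC = 0, OWNER_HOST };

/*
 * Pack DRAR_HI following the same steps as mic_dma_chan_set_desc_ring():
 *   bits 20:4  descriptor ring size
 *   MIC owner:  bits 3:0 carry address bits 35:32
 *   host owner: bit 26 marks a system address, bits 25:21 carry address
 *               bits 38:34 (the smpt entry), bits 1:0 carry bits 33:32
 */
static uint32_t pack_drar_hi(enum owner owner, uint64_t ring_pa)
{
	uint32_t drar_hi = (uint32_t)(MIC_DMA_DESC_RX_SIZE & 0x1ffff) << 4;

	assert(owner == OWNER_MIC || owner == OWNER_HOST);
	if (owner == OWNER_MIC) {
		drar_hi |= (uint32_t)((ring_pa >> 32) & 0xf);
	} else {
		drar_hi |= MIC_DMA_SBOX_DRARHI_SYS_MASK;
		drar_hi |= (uint32_t)((ring_pa >> 34) & 0x1f) << 21;
		drar_hi |= (uint32_t)((ring_pa >> 32) & 0x3);
	}
	return drar_hi;
}
```

With a ring address of 0x500001000, pack_drar_hi(OWNER_MIC, ...) yields
0x1FFFC5 and pack_drar_hi(OWNER_HOST, ...) yields 0x43FFFC1. The low 32
bits of the address go to DRAR_LO unchanged and are not modelled here.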



* [PATCH char-misc-next 6/8] misc: mic: add threaded irq support in card driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
                   ` (4 preceding siblings ...)
  2014-05-28  2:36 54% ` [PATCH char-misc-next 5/8] misc: mic: add dma " Sudeep Dutt
@ 2014-05-28  2:36 86% ` Sudeep Dutt
  2014-05-28  2:36 87% ` [PATCH char-misc-next 7/8] misc: mic: add dma " Sudeep Dutt
  2014-05-28  2:36 99% ` [PATCH char-misc-next 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Add threaded irq support to mic_request_card_irq(), which will be used
by virtual devices added on the mic bus.
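
As a rough illustration of the handler/thread_fn split, here is a user
space model. It is an assumption-laden sketch, not kernel code: every
model_* name is hypothetical, and the "thread" runs synchronously here,
whereas the kernel runs thread_fn in a separate irq thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum model_irqreturn {
	MODEL_IRQ_NONE,
	MODEL_IRQ_HANDLED,
	MODEL_IRQ_WAKE_THREAD
};

typedef enum model_irqreturn (*model_irq_handler_t)(int irq, void *data);

/* One registered interrupt: a hard handler plus an optional thread fn. */
struct model_irq_action {
	model_irq_handler_t handler;
	model_irq_handler_t thread_fn;	/* NULL for non-threaded users */
	void *data;
};

struct model_stats {
	int hard;	/* times the hard handler ran */
	int threaded;	/* times the thread function ran */
};

/* Quick check in "hard irq" context, deferring the real work. */
static enum model_irqreturn quick_check(int irq, void *data)
{
	struct model_stats *s = data;

	(void)irq;
	s->hard++;
	return MODEL_IRQ_WAKE_THREAD;
}

/* The deferred work that would run in the irq thread. */
static enum model_irqreturn slow_work(int irq, void *data)
{
	struct model_stats *s = data;

	(void)irq;
	s->threaded++;
	return MODEL_IRQ_HANDLED;
}

/* A handler that does all its work itself, as existing callers do. */
static enum model_irqreturn legacy_handler(int irq, void *data)
{
	struct model_stats *s = data;

	(void)irq;
	s->hard++;
	return MODEL_IRQ_HANDLED;
}

/* Deliver one interrupt: run the handler, then the thread fn if asked. */
static enum model_irqreturn
model_deliver(const struct model_irq_action *act, int irq)
{
	enum model_irqreturn ret;

	assert(act && act->handler);
	ret = act->handler(irq, act->data);
	if (ret == MODEL_IRQ_WAKE_THREAD && act->thread_fn)
		ret = act->thread_fn(irq, act->data);
	return ret;
}
```

Callers that pass NULL for thread_fn behave like legacy_handler above,
which is why the existing users only need the extra NULL argument.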

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/card/mic_device.c | 21 +++++++++++----------
 drivers/misc/mic/card/mic_device.h |  5 +++--
 drivers/misc/mic/card/mic_virtio.c |  4 ++--
 3 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index d0980ff..ff485b7 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -83,8 +83,8 @@ static int mic_shutdown_init(void)
 	int shutdown_db;
 
 	shutdown_db = mic_next_card_db();
-	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr,
-			"Shutdown", mdrv, shutdown_db);
+	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr, NULL,
+					       "Shutdown", mdrv, shutdown_db);
 	if (IS_ERR(shutdown_cookie))
 		rc = PTR_ERR(shutdown_cookie);
 	else
@@ -136,7 +136,8 @@ static void mic_dp_uninit(void)
 /**
  * mic_request_card_irq - request an irq.
  *
- * @func: The callback function that handles the interrupt.
+ * @handler: interrupt handler passed to request_threaded_irq.
+ * @thread_fn: thread fn. passed to request_threaded_irq.
  * @name: The ASCII name of the callee requesting the irq.
  * @data: private data that is returned back when calling the
  * function handler.
@@ -149,17 +150,17 @@ static void mic_dp_uninit(void)
  * error code.
  *
  */
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int index)
+struct mic_irq *mic_request_card_irq(irq_handler_t handler,
+	irq_handler_t thread_fn, const char *name, void *data, int index)
 {
 	int rc = 0;
 	unsigned long cookie;
 	struct mic_driver *mdrv = g_drv;
 
-	rc  = request_irq(mic_db_to_irq(mdrv, index), func,
-		0, name, data);
+	rc  = request_threaded_irq(mic_db_to_irq(mdrv, index), handler,
+				   thread_fn, 0, name, data);
 	if (rc) {
-		dev_err(mdrv->dev, "request_irq failed rc = %d\n", rc);
+		dev_err(mdrv->dev, "request_threaded_irq failed rc = %d\n", rc);
 		goto err;
 	}
 	mdrv->irq_info.irq_usage_count[index]++;
@@ -172,9 +173,9 @@ err:
 /**
  * mic_free_card_irq - free irq.
  *
- * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @cookie: cookie obtained during a successful call to mic_request_card_irq
  * @data: private data specified by the calling function during the
- * mic_request_irq
+ * mic_request_card_irq
  *
  * returns: none.
  */
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index 306f502..e12a0c2 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -30,6 +30,7 @@
 #include <linux/workqueue.h>
 #include <linux/io.h>
 #include <linux/irqreturn.h>
+#include <linux/interrupt.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -116,8 +117,8 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 int mic_driver_init(struct mic_driver *mdrv);
 void mic_driver_uninit(struct mic_driver *mdrv);
 int mic_next_card_db(void);
-struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
-	const char *name, void *data, int intr_src);
+struct mic_irq *mic_request_card_irq(irq_handler_t handler,
+	irq_handler_t thread_fn, const char *name, void *data, int intr_src);
 void mic_free_card_irq(struct mic_irq *cookie, void *data);
 u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
 void mic_send_intr(struct mic_device *mdev, int doorbell);
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 653799b..8cdbc68 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -417,7 +417,7 @@ static int mic_add_device(struct mic_device_desc __iomem *d,
 
 	virtio_db = mic_next_card_db();
 	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
-			"virtio intr", mvdev, virtio_db);
+			NULL, "virtio intr", mvdev, virtio_db);
 	if (IS_ERR(mvdev->virtio_cookie)) {
 		ret = PTR_ERR(mvdev->virtio_cookie);
 		goto kfree;
@@ -606,7 +606,7 @@ int mic_devices_init(struct mic_driver *mdrv)
 	mic_scan_devices(mdrv, !REMOVE_DEVICES);
 
 	config_db = mic_next_card_db();
-	virtio_config_cookie = mic_request_card_irq(mic_extint_handler,
+	virtio_config_cookie = mic_request_card_irq(mic_extint_handler, NULL,
 			"virtio_config_intr", mdrv, config_db);
 	if (IS_ERR(virtio_config_cookie)) {
 		rc = PTR_ERR(virtio_config_cookie);
-- 
1.8.2.1
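
The callers touched by this patch check the returned cookie with IS_ERR()
and extract the errno with PTR_ERR(). A user space model of that
convention is sketched below. All model_* names are hypothetical, the
helpers only imitate the idea behind include/linux/err.h, and storing an
errno in a pointer is assumed to work as it does on the usual Linux
targets.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *model_err_ptr(long err)
{
	return (void *)err;
}

/* Recover the negative errno from an error pointer. */
static inline long model_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* True only for pointers in the top MODEL_MAX_ERRNO addresses. */
static inline int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical cookie type and a single static slot to hand out. */
struct model_cookie {
	int index;
};

static struct model_cookie cookie_slot;

/* Returns a valid cookie, or an encoded errno for a bad index. */
static struct model_cookie *model_request_irq(int index)
{
	if (index < 0)
		return model_err_ptr(-EINVAL);
	cookie_slot.index = index;
	return &cookie_slot;
}

/*
 * model_is_err() is only a yes/no answer; the errno itself has to come
 * from model_ptr_err(). Returning the boolean instead would turn every
 * failure into the value 1.
 */
static int model_setup(int index)
{
	struct model_cookie *cookie = model_request_irq(index);

	if (model_is_err(cookie))
		return (int)model_ptr_err(cookie);
	return 0;
}
```

Here model_setup(3) returns 0 and model_setup(-1) returns -EINVAL.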



* [PATCH char-misc-next 5/8] misc: mic: add dma support in host driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
                   ` (3 preceding siblings ...)
  2014-05-28  2:36 63% ` [PATCH char-misc-next 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
@ 2014-05-28  2:36 54% ` Sudeep Dutt
  2014-05-28  2:36 86% ` [PATCH char-misc-next 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Add a dma device on the mic virtual bus and use its dmaengine channel
to transfer data for virtio devices.
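
Because the MIC DMA engine needs aligned addresses and lengths, an
unaligned card address has to be widened to an aligned window before it
is handed to the engine. The arithmetic can be sketched in user space as
below. This is a simplified model, not the driver's copy loop: the
model_* names are hypothetical, and the shift of 6 (64 bytes) is taken
from MIC_DMA_ALIGN_SHIFT in the DMA driver patch of this series.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ALIGN_SHIFT 6	/* 64-byte alignment, as MIC_DMA_ALIGN_SHIFT */

/* Round x down to a multiple of a (a must be a power of two). */
static uint64_t model_round_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
static uint64_t model_align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* An aligned DMA window covering an arbitrary byte range. */
struct model_xfer {
	uint64_t start;	/* aligned start address for the DMA */
	uint64_t len;	/* aligned DMA length */
	uint64_t skip;	/* bytes to skip in the bounce buffer */
};

/* Widen [daddr, daddr + len) to the enclosing aligned window. */
static struct model_xfer model_widen(uint64_t daddr, uint64_t len,
				     unsigned int shift)
{
	uint64_t align = 1ULL << shift;
	struct model_xfer x;

	assert(shift < 64);
	x.start = model_round_down(daddr, align);
	x.skip = daddr - x.start;
	x.len = model_align_up(len + x.skip, align);
	return x;
}
```

For a 100-byte read at card address 0x1010, model_widen() gives an
aligned start of 0x1000, a DMA length of 128 and a skip of 16 bytes; the
caller would then copy 100 bytes starting 16 bytes into the intermediate
buffer.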

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig           |   2 +-
 drivers/misc/mic/host/mic_boot.c   |  78 +++++++++++++++-
 drivers/misc/mic/host/mic_device.h |  24 +++++
 drivers/misc/mic/host/mic_intr.h   |   3 +-
 drivers/misc/mic/host/mic_virtio.c | 179 +++++++++++++++++++++++++++++--------
 drivers/misc/mic/host/mic_virtio.h |  21 ++++-
 drivers/misc/mic/host/mic_x100.c   |   8 ++
 7 files changed, 274 insertions(+), 41 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index ee1d2ac..bf76313 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -19,7 +19,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI && X86
+	depends on 64BIT && PCI && X86 && INTEL_MIC_BUS
 	select VHOST_RING
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index b75c6b5..b462177 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -23,11 +23,66 @@
 #include <linux/pci.h>
 
 #include <linux/mic_common.h>
+#include <linux/mic_bus.h>
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
+static inline struct mic_device *mbdev_to_mdev(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static dma_addr_t mic_dma_map_page(struct device *dev, struct page *page,
+	unsigned long offset, size_t size, enum dma_data_direction dir,
+	struct dma_attrs *attrs)
+{
+	void *va = phys_to_virt(page_to_phys(page)) + offset;
+	struct mic_device *mdev = mbdev_to_mdev(dev_get_drvdata(dev));
+
+	return mic_map_single(mdev, va, size);
+}
+
+static void mic_dma_unmap_page(struct device *dev, dma_addr_t dma_addr,
+	size_t size, enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct mic_device *mdev = mbdev_to_mdev(dev_get_drvdata(dev));
+	mic_unmap_single(mdev, dma_addr, size);
+}
+
+static struct dma_map_ops mic_dma_ops = {
+	.map_page = mic_dma_map_page,
+	.unmap_page = mic_dma_unmap_page,
+};
+
+static struct mic_irq *_mic_request_threaded_irq(struct mbus_device *mbdev,
+		irq_handler_t handler, irq_handler_t thread_fn,
+		const char *name, void *data, int intr_src)
+{
+	return mic_request_threaded_irq(mbdev_to_mdev(mbdev), handler,
+					thread_fn, name, data,
+					intr_src, MIC_INTR_DMA);
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+		struct mic_irq *cookie, void *data)
+{
+	mic_free_irq(mbdev_to_mdev(mbdev), cookie, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	struct mic_device *mdev = mbdev_to_mdev(mbdev);
+	mdev->ops->intr_workarounds(mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 /**
  * mic_reset - Reset the MIC device.
  * @mdev: pointer to mic_device instance
@@ -95,9 +150,20 @@ retry:
 		 */
 		goto retry;
 	}
-	rc = mdev->ops->load_mic_fw(mdev, buf);
+	rc = mbus_add_device(&mdev->dma_mbdev, mdev->sdev->parent,
+			     MBUS_DEV_DMA_HOST, &mic_dma_ops, &mbus_hw_ops,
+			     mdev->mmio.va);
 	if (rc)
 		goto unlock_ret;
+
+	mdev->dma_ch = mic_request_dma_chan(mdev);
+	if (!mdev->dma_ch) {
+		rc = -ENXIO;
+		goto dma_remove;
+	}
+	rc = mdev->ops->load_mic_fw(mdev, buf);
+	if (rc)
+		goto dma_release;
 	mic_smpt_restore(mdev);
 	mic_intr_restore(mdev);
 	mdev->intr_ops->enable_interrupts(mdev);
@@ -105,6 +171,11 @@ retry:
 	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
 	mdev->ops->send_firmware_intr(mdev);
 	mic_set_state(mdev, MIC_ONLINE);
+	goto unlock_ret;
+dma_release:
+	dma_release_channel(mdev->dma_ch);
+dma_remove:
+	mbus_remove_device(&mdev->dma_mbdev);
 unlock_ret:
 	mutex_unlock(&mdev->mic_mutex);
 	return rc;
@@ -122,6 +193,11 @@ void mic_stop(struct mic_device *mdev, bool force)
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
 		mic_virtio_reset_devices(mdev);
+		if (mdev->dma_ch) {
+			dma_release_channel(mdev->dma_ch);
+			mdev->dma_ch = NULL;
+		}
+		mbus_remove_device(&mdev->dma_mbdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
 		if (MIC_RESET_FAILED == mdev->state)
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 0398c69..7583a80 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -25,6 +25,8 @@
 #include <linux/idr.h>
 #include <linux/notifier.h>
 #include <linux/irqreturn.h>
+#include <linux/dmaengine.h>
+#include <linux/mic_bus.h>
 
 #include "mic_intr.h"
 
@@ -87,6 +89,8 @@ enum mic_stepping {
  * @cdev: Character device for MIC.
  * @vdev_list: list of virtio devices.
  * @pm_notifier: Handles PM notifications from the OS.
+ * @dma_mbdev: MIC BUS DMA device.
+ * @dma_ch: DMA channel reserved by this driver for use by virtio devices.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -124,6 +128,8 @@ struct mic_device {
 	struct cdev cdev;
 	struct list_head vdev_list;
 	struct notifier_block pm_notifier;
+	struct mbus_device dma_mbdev;
+	struct dma_chan *dma_ch;
 };
 
 /**
@@ -144,6 +150,7 @@ struct mic_device {
  * @load_mic_fw: Load firmware segments required to boot the card
  * into card memory. This includes the kernel, command line, ramdisk etc.
  * @get_postcode: Get post code status from firmware.
+ * @dma_filter: DMA filter function to be used.
  */
 struct mic_hw_ops {
 	u8 aper_bar;
@@ -159,6 +166,7 @@ struct mic_hw_ops {
 	void (*send_firmware_intr)(struct mic_device *mdev);
 	int (*load_mic_fw)(struct mic_device *mdev, const char *buf);
 	u32 (*get_postcode)(struct mic_device *mdev);
+	bool (*dma_filter)(struct dma_chan *chan, void *param);
 };
 
 /**
@@ -187,6 +195,22 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 	iowrite32(val, mw->va + offset);
 }
 
+static inline struct dma_chan *mic_request_dma_chan(struct mic_device *mdev)
+{
+	dma_cap_mask_t mask;
+	struct dma_chan *chan;
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+	chan = dma_request_channel(mask, mdev->ops->dma_filter,
+				   mdev->sdev->parent);
+	if (chan)
+		return chan;
+	dev_err(mdev->sdev->parent, "%s %d unable to acquire channel\n",
+		__func__, __LINE__);
+	return NULL;
+}
+
 void mic_sysfs_init(struct mic_device *mdev);
 int mic_start(struct mic_device *mdev, const char *buf);
 void mic_stop(struct mic_device *mdev, bool force);
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
index 6b03f84..dd354ff 100644
--- a/drivers/misc/mic/host/mic_intr.h
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -27,8 +27,9 @@
  * The minimum number of msix vectors required for normal operation.
  * 3 for virtio network, console and block devices.
  * 1 for card shutdown notifications.
+ * 4 for host owned DMA channels.
  */
-#define MIC_MIN_MSIX 4
+#define MIC_MIN_MSIX 8
 #define MIC_NUM_OFFSETS 32
 
 /**
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index aba3e83..aa8e4d6 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -21,60 +21,155 @@
 #include <linux/pci.h>
 #include <linux/sched.h>
 #include <linux/uaccess.h>
-
+#include <linux/dmaengine.h>
 #include <linux/mic_common.h>
+
 #include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
 
 /*
- * Initiates the copies across the PCIe bus from card memory to
- * a user space buffer.
+ * Size of the internal buffer used during DMAs as an intermediate buffer
+ * for copy to/from user.
  */
-static int mic_virtio_copy_to_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+#define MIC_INT_DMA_BUF_SIZE PAGE_ALIGN(64 * 1024ULL)
+
+static int mic_sync_dma(struct mic_device *mdev, dma_addr_t dst,
+			dma_addr_t src, size_t len)
 {
-	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
-	/*
-	 * We are copying from IO below an should ideally use something
-	 * like copy_to_user_fromio(..) if it existed.
-	 */
-	if (copy_to_user(ubuf, (void __force *)dbuf, len)) {
-		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+	int err = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *mic_ch = mdev->dma_ch;
+
+	if (!mic_ch) {
+		err = -EBUSY;
+		goto error;
+	}
+
+	tx = mic_ch->device->device_prep_dma_memcpy(mic_ch, dst, src, len,
+						    DMA_PREP_FENCE);
+	if (!tx) {
+		err = -ENOMEM;
+		goto error;
+	} else {
+		dma_cookie_t cookie = tx->tx_submit(tx);
+
+		err = dma_submit_error(cookie);
+		if (err)
+			goto error;
+		err = dma_sync_wait(mic_ch, cookie);
+	}
+error:
+	if (err)
+		dev_err(mdev->sdev->parent, "%s %d err %d\n",
 			__func__, __LINE__, err);
-		goto err;
+	return err;
+}
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to a user
+ * space buffer. When transfers are done using DMA, source/destination
+ * addresses and transfer length must follow the alignment requirements of
+ * the MIC DMA engine.
+ */
+static int mic_virtio_copy_to_user(struct mic_vdev *mvdev, void __user *ubuf,
+				size_t len, u64 daddr, size_t dlen, int vr_idx)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t dma_offset;
+	size_t partlen;
+	int err;
+
+	dma_offset = daddr - round_down(daddr, dma_alignment);
+	daddr -= dma_offset;
+	len += dma_offset;
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		err = mic_sync_dma(mdev, mvr->buf_da, daddr,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+
+		if (copy_to_user(ubuf, mvr->buf + dma_offset,
+				 partlen - dma_offset)) {
+			err = -EFAULT;
+			goto err;
+		}
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->in_bytes_dma += partlen;
+		mvdev->in_bytes += partlen;
+		len -= partlen;
+		dma_offset = 0;
 	}
-	mvdev->in_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
 /*
- * Initiates copies across the PCIe bus from a user space
- * buffer to card memory.
+ * Initiates copies across the PCIe bus from a user space buffer to card
+ * memory. When transfers are done using DMA, source/destination addresses
+ * and transfer length must follow the alignment requirements of the MIC
+ * DMA engine.
  */
-static int mic_virtio_copy_from_user(struct mic_vdev *mvdev,
-		void __user *ubuf, size_t len, u64 addr)
+static int mic_virtio_copy_from_user(struct mic_vdev *mvdev, void __user *ubuf,
+				size_t len, u64 daddr, size_t dlen, int vr_idx)
 {
+	struct mic_device *mdev = mvdev->mdev;
+	void __iomem *dbuf = mdev->aper.va + daddr;
+	struct mic_vringh *mvr = &mvdev->mvr[vr_idx];
+	size_t dma_alignment = 1 << mdev->dma_ch->device->copy_align;
+	size_t partlen;
 	int err;
-	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
+
+	if (daddr & (dma_alignment - 1)) {
+		mvdev->tx_dst_unaligned += len;
+		goto memcpy;
+	} else if (ALIGN(len, dma_alignment) > dlen) {
+		mvdev->tx_len_unaligned += len;
+		goto memcpy;
+	}
+
+	while (len) {
+		partlen = min_t(size_t, len, MIC_INT_DMA_BUF_SIZE);
+
+		if (copy_from_user(mvr->buf, ubuf, partlen)) {
+			err = -EFAULT;
+			goto err;
+		}
+		err = mic_sync_dma(mdev, daddr, mvr->buf_da,
+				   ALIGN(partlen, dma_alignment));
+		if (err)
+			goto err;
+		daddr += partlen;
+		ubuf += partlen;
+		dbuf += partlen;
+		mvdev->out_bytes_dma += partlen;
+		mvdev->out_bytes += partlen;
+		len -= partlen;
+	}
+memcpy:
 	/*
 	 * We are copying to IO below and should ideally use something
 	 * like copy_from_user_toio(..) if it existed.
 	 */
 	if (copy_from_user((void __force *)dbuf, ubuf, len)) {
 		err = -EFAULT;
-		dev_err(mic_dev(mvdev), "%s %d err %d\n",
-			__func__, __LINE__, err);
 		goto err;
 	}
 	mvdev->out_bytes += len;
-	err = 0;
+	return 0;
 err:
+	dev_err(mic_dev(mvdev), "%s %d err %d\n", __func__, __LINE__, err);
 	return err;
 }
 
@@ -110,7 +205,8 @@ static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
  * way to override the VRINGH xfer(..) routines as of v3.10.
  */
 static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
-	void __user *ubuf, size_t len, bool read, size_t *out_len)
+			void __user *ubuf, size_t len, bool read, int vr_idx,
+			size_t *out_len)
 {
 	int ret = 0;
 	size_t partlen, tot_len = 0;
@@ -118,13 +214,15 @@ static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
 	while (len && iov->i < iov->used) {
 		partlen = min(iov->iov[iov->i].iov_len, len);
 		if (read)
-			ret = mic_virtio_copy_to_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_to_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		else
-			ret = mic_virtio_copy_from_user(mvdev,
-				ubuf, partlen,
-				(u64)iov->iov[iov->i].iov_base);
+			ret = mic_virtio_copy_from_user(mvdev, ubuf, partlen,
+						(u64)iov->iov[iov->i].iov_base,
+						iov->iov[iov->i].iov_len,
+						vr_idx);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -192,8 +290,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			ubuf = iov.iov_base;
 		}
 		/* Issue all the read descriptors first */
-		ret = mic_vringh_copy(mvdev, riov, ubuf, len,
-			MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, riov, ubuf, len, MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -203,8 +301,8 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 		ubuf += out_len;
 		copy->out_len += out_len;
 		/* Issue the write descriptors next */
-		ret = mic_vringh_copy(mvdev, wiov, ubuf, len,
-			!MIC_VRINGH_READ, &out_len);
+		ret = mic_vringh_copy(mvdev, wiov, ubuf, len, !MIC_VRINGH_READ,
+				      copy->vr_idx, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -589,6 +687,10 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		dev_dbg(mdev->sdev->parent,
 			"%s %d index %d va %p info %p vr_size 0x%x\n",
 			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+		mvr->buf = (void *)__get_free_pages(GFP_KERNEL,
+					get_order(MIC_INT_DMA_BUF_SIZE));
+		mvr->buf_da = mic_map_single(mvdev->mdev, mvr->buf,
+					  MIC_INT_DMA_BUF_SIZE);
 	}
 
 	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
@@ -673,6 +775,11 @@ skip_hot_remove:
 	vqconfig = mic_vq_config(mvdev->dd);
 	for (i = 0; i < mvdev->dd->num_vq; i++) {
 		struct mic_vringh *mvr = &mvdev->mvr[i];
+
+		mic_unmap_single(mvdev->mdev, mvr->buf_da,
+				 MIC_INT_DMA_BUF_SIZE);
+		free_pages((unsigned long)mvr->buf,
+			   get_order(MIC_INT_DMA_BUF_SIZE));
 		vringh_kiov_cleanup(&mvr->riov);
 		vringh_kiov_cleanup(&mvr->wiov);
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
index 184f3c8..d574efb 100644
--- a/drivers/misc/mic/host/mic_virtio.h
+++ b/drivers/misc/mic/host/mic_virtio.h
@@ -46,18 +46,23 @@
  * @vrh: The host VRINGH used for accessing the card vrings.
  * @riov: The VRINGH read kernel IOV.
  * @wiov: The VRINGH write kernel IOV.
- * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @buf: Temporary kernel buffer used to copy in/out data
+ * from/to the card via DMA.
+ * @buf_da: dma address of buf.
  * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
  */
 struct mic_vringh {
 	struct mic_vring vring;
 	struct vringh vrh;
 	struct vringh_kiov riov;
 	struct vringh_kiov wiov;
-	u16 head;
 	struct mutex vr_mutex;
+	void *buf;
+	dma_addr_t buf_da;
 	struct mic_vdev *mvdev;
+	u16 head;
 };
 
 /**
@@ -69,6 +74,14 @@ struct mic_vringh {
  * @poll_wake - Used for waking up threads blocked in poll.
  * @out_bytes - Debug stats for number of bytes copied from host to card.
  * @in_bytes - Debug stats for number of bytes copied from card to host.
+ * @out_bytes_dma - Debug stats for number of bytes copied from host to card
+ * using DMA.
+ * @in_bytes_dma - Debug stats for number of bytes copied from card to host
+ * using DMA.
+ * @tx_len_unaligned - Debug stats for number of bytes copied to the card where
+ * the transfer length did not have the required DMA alignment.
+ * @tx_dst_unaligned - Debug stats for number of bytes copied where the
+ * destination address on the card did not have the required DMA alignment.
  * @mvr - Store per VRING data structures.
  * @virtio_bh_work - Work struct used to schedule virtio bottom half handling.
  * @dd - Virtio device descriptor.
@@ -84,6 +97,10 @@ struct mic_vdev {
 	int poll_wake;
 	unsigned long out_bytes;
 	unsigned long in_bytes;
+	unsigned long out_bytes_dma;
+	unsigned long in_bytes_dma;
+	unsigned long tx_len_unaligned;
+	unsigned long tx_dst_unaligned;
 	struct mic_vringh mvr[MIC_MAX_VRINGS];
 	struct work_struct virtio_bh_work;
 	struct mic_device_desc *dd;
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index 5562fdd..b7a21e1 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -549,6 +549,13 @@ struct mic_smpt_ops mic_x100_smpt_ops = {
 	.set = mic_x100_smpt_set,
 };
 
+static bool mic_x100_dma_filter(struct dma_chan *chan, void *param)
+{
+	if (chan->device->dev->parent == (struct device *)param)
+		return true;
+	return false;
+}
+
 struct mic_hw_ops mic_x100_ops = {
 	.aper_bar = MIC_X100_APER_BAR,
 	.mmio_bar = MIC_X100_MMIO_BAR,
@@ -563,6 +570,7 @@ struct mic_hw_ops mic_x100_ops = {
 	.send_firmware_intr = mic_x100_send_firmware_intr,
 	.load_mic_fw = mic_x100_load_firmware,
 	.get_postcode = mic_x100_get_postcode,
+	.dma_filter = mic_x100_dma_filter,
 };
 
 struct mic_hw_intr_ops mic_x100_intr_ops = {
-- 
1.8.2.1
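
The comments on mic_virtio_copy_to_user() and mic_virtio_copy_from_user()
above say that DMA addresses and lengths must follow the alignment
requirements of the MIC DMA engine. The arithmetic behind that can be
sketched in user-space C as below. This is only an illustration under stated
assumptions, not the driver's code: struct dma_span and dma_span_for() are
hypothetical names, and the alignment is assumed to be a power of two, as it
is when derived from 1 << copy_align.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one alignment-safe DMA request. */
struct dma_span {
	uint64_t start;  /* requested address rounded down to the alignment */
	size_t offset;   /* bytes between start and the requested address */
	size_t dma_len;  /* length to transfer, rounded up to the alignment */
};

/*
 * Compute the aligned window that covers [addr, addr + len).
 * align must be a non-zero power of two (assumption of this sketch).
 */
static struct dma_span dma_span_for(uint64_t addr, size_t len, size_t align)
{
	struct dma_span s;

	assert(align != 0 && (align & (align - 1)) == 0);
	s.start = addr & ~(uint64_t)(align - 1);
	s.offset = (size_t)(addr - s.start);
	s.dma_len = (len + s.offset + align - 1) & ~(align - 1);
	return s;
}
```

After the transfer into a bounce buffer, only the len bytes starting at
offset are the ones the caller asked for; the bytes before and after them
are padding added to satisfy the engine.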



* [PATCH char-misc-next 8/8] misc: mic: add support for loading/unloading dma driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
                   ` (6 preceding siblings ...)
  2014-05-28  2:36 87% ` [PATCH char-misc-next 7/8] misc: mic: add dma " Sudeep Dutt
@ 2014-05-28  2:36 99% ` Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

Load the DMA driver with modprobe on start and remove it on unload.

Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 Documentation/mic/mpssd/mpss | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
index 3136c68..cacbdb0 100755
--- a/Documentation/mic/mpssd/mpss
+++ b/Documentation/mic/mpssd/mpss
@@ -48,18 +48,18 @@ start()
 	fi
 
 	echo -e $"Starting MPSS Stack"
-	echo -e $"Loading MIC_HOST Module"
+	echo -e $"Loading MIC_X100_DMA & MIC_HOST Modules"
 
-	# Ensure the driver is loaded
-	if [ ! -d "$sysfs" ]; then
-		modprobe mic_host
+	for f in "mic_host" "mic_x100_dma"
+	do
+		modprobe $f
 		RETVAL=$?
 		if [ $RETVAL -ne 0 ]; then
 			failure
 			echo
 			return $RETVAL
 		fi
-	fi
+	done
 
 	# Start the daemon
 	echo -n $"Starting MPSSD "
@@ -170,8 +170,8 @@ unload()
 	stop
 
 	sleep 5
-	echo -n $"Removing MIC_HOST Module: "
-	modprobe -r mic_host
+	echo -n $"Removing MIC_HOST & MIC_X100_DMA Modules: "
+	modprobe -r mic_host mic_x100_dma
 	RETVAL=$?
 	[ $RETVAL -ne 0 ] && failure || success
 	echo
-- 
1.8.2.1



* [PATCH char-misc-next 7/8] misc: mic: add dma support in card driver
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
                   ` (5 preceding siblings ...)
  2014-05-28  2:36 86% ` [PATCH char-misc-next 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
@ 2014-05-28  2:36 87% ` Sudeep Dutt
  2014-05-28  2:36 99% ` [PATCH char-misc-next 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This patch adds a DMA device on the MIC virtual bus.

Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
---
 drivers/misc/mic/Kconfig           |  2 +-
 drivers/misc/mic/card/mic_device.h |  3 +++
 drivers/misc/mic/card/mic_x100.c   | 52 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index bf76313..cc4eef0 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -39,7 +39,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT && X86
+	depends on 64BIT && X86 && INTEL_MIC_BUS
 	select VIRTIO
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
index e12a0c2..8d735ba 100644
--- a/drivers/misc/mic/card/mic_device.h
+++ b/drivers/misc/mic/card/mic_device.h
@@ -31,6 +31,7 @@
 #include <linux/io.h>
 #include <linux/irqreturn.h>
 #include <linux/interrupt.h>
+#include <linux/mic_bus.h>
 
 /**
  * struct mic_intr_info - Contains h/w specific interrupt sources info
@@ -71,6 +72,7 @@ struct mic_device {
  * @hotplug_work: Hot plug work for adding/removing virtio devices.
  * @irq_info: The OS specific irq information
  * @intr_info: H/W specific interrupt information.
+ * @dma_mbdev: dma device on the MIC virtual bus.
  */
 struct mic_driver {
 	char name[20];
@@ -81,6 +83,7 @@ struct mic_driver {
 	struct work_struct hotplug_work;
 	struct mic_irq_info irq_info;
 	struct mic_intr_info intr_info;
+	struct mbus_device dma_mbdev;
 };
 
 /**
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 2868945..85066cf 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -148,6 +148,46 @@ void mic_card_unmap(struct mic_device *mdev, void __iomem *addr)
 	iounmap(addr);
 }
 
+static inline struct mic_driver *mbdev_to_mdrv(struct mbus_device *mbdev)
+{
+	return dev_get_drvdata(mbdev->dev.parent);
+}
+
+static struct mic_irq *_mic_request_threaded_irq(struct mbus_device *mbdev,
+		irq_handler_t handler, irq_handler_t thread_fn,
+		const char *name, void *data, int intr_src)
+{
+	int rc = 0;
+	unsigned int irq = intr_src;
+	unsigned long cookie = irq;
+
+	rc  = request_threaded_irq(irq, handler, thread_fn, 0, name, data);
+	if (rc) {
+		dev_err(mbdev_to_mdrv(mbdev)->dev,
+			"request_threaded_irq failed rc = %d\n", rc);
+		return ERR_PTR(rc);
+	}
+	return (struct mic_irq *)cookie;
+}
+
+static void _mic_free_irq(struct mbus_device *mbdev,
+		struct mic_irq *cookie, void *data)
+{
+	unsigned long irq = (unsigned long)cookie;
+	free_irq(irq, data);
+}
+
+static void _mic_ack_interrupt(struct mbus_device *mbdev, int num)
+{
+	mic_ack_interrupt(&mbdev_to_mdrv(mbdev)->mdev);
+}
+
+static struct mbus_hw_ops mbus_hw_ops = {
+	.request_threaded_irq = _mic_request_threaded_irq,
+	.free_irq = _mic_free_irq,
+	.ack_interrupt = _mic_ack_interrupt,
+};
+
 static int __init mic_probe(struct platform_device *pdev)
 {
 	struct mic_driver *mdrv = &g_drv;
@@ -166,13 +206,22 @@ static int __init mic_probe(struct platform_device *pdev)
 		goto done;
 	}
 	mic_hw_intr_init(mdrv);
+	platform_set_drvdata(pdev, mdrv);
+	rc = mbus_add_device(&mdrv->dma_mbdev, mdrv->dev, MBUS_DEV_DMA_MIC,
+			     NULL, &mbus_hw_ops, mdrv->mdev.mmio.va);
+	if (rc) {
+		dev_err(&pdev->dev, "mbus_add_device failed rc %d\n", rc);
+		goto iounmap;
+	}
 	rc = mic_driver_init(mdrv);
 	if (rc) {
 		dev_err(&pdev->dev, "mic_driver_init failed rc %d\n", rc);
-		goto iounmap;
+		goto remove_dma;
 	}
 done:
 	return rc;
+remove_dma:
+	mbus_remove_device(&mdrv->dma_mbdev);
 iounmap:
 	iounmap(mdev->mmio.va);
 	return rc;
@@ -184,6 +233,7 @@ static int mic_remove(struct platform_device *pdev)
 	struct mic_device *mdev = &mdrv->mdev;
 
 	mic_driver_uninit(mdrv);
+	mbus_remove_device(&mdrv->dma_mbdev);
 	iounmap(mdev->mmio.va);
 	return 0;
 }
-- 
1.8.2.1



* [PATCH char-misc-next 2/8] misc: mic: add a bus driver for virtual MIC devices
  2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
  2014-05-28  2:36 91% ` [PATCH char-misc-next 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
@ 2014-05-28  2:36 63% ` Sudeep Dutt
    2014-05-28  2:36 41% ` [PATCH char-misc-next 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

From: Siva Yerramreddy <yshivakrishna@gmail.com>

This MIC virtual bus driver takes the responsibility of creating all
the virtual devices connected to the PCIe device on the host and the
platform device on the card. The MIC bus hardware operations provide
a way to abstract certain hardware details from the base physical devices.
Examples of devices added on the MIC virtual bus include host DMA and card DMA.
This abstraction enables using a common DMA driver on host and card.
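
As a rough illustration of how a hardware-ops table lets one driver serve
both sides, here is a minimal user-space sketch. All names in it (demo_dev,
demo_hw_ops, host_ack, card_ack, demo_dma_handle_irq) are hypothetical and
much simpler than struct mbus_hw_ops; the counters exist only so the effect
can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dev;

/* Hypothetical, simplified stand-in for a hardware-ops table. */
struct demo_hw_ops {
	void (*ack_interrupt)(struct demo_dev *dev, int num);
};

struct demo_dev {
	const struct demo_hw_ops *hw_ops;
	int host_acks;  /* incremented only by the host implementation */
	int card_acks;  /* incremented only by the card implementation */
};

/* Platform-specific implementations supplied by the parent driver. */
static void host_ack(struct demo_dev *dev, int num)
{
	(void)num;
	dev->host_acks++;
}

static void card_ack(struct demo_dev *dev, int num)
{
	(void)num;
	dev->card_acks++;
}

static const struct demo_hw_ops host_ops = { .ack_interrupt = host_ack };
static const struct demo_hw_ops card_ops = { .ack_interrupt = card_ack };

/* Common driver code: it sees only the ops table, never the platform. */
static void demo_dma_handle_irq(struct demo_dev *dev, int num)
{
	assert(dev != NULL && dev->hw_ops != NULL);
	dev->hw_ops->ack_interrupt(dev, num);
}
```

The common code path is identical for both devices; which implementation
runs depends only on the table the parent driver attached at registration.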

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Siva Yerramreddy <yshivakrishna@gmail.com>
---
 drivers/misc/mic/Kconfig       |  17 ++++
 drivers/misc/mic/Makefile      |   1 +
 drivers/misc/mic/bus/Makefile  |   5 ++
 drivers/misc/mic/bus/mic_bus.c | 188 +++++++++++++++++++++++++++++++++++++++++
 include/linux/mic_bus.h        | 148 ++++++++++++++++++++++++++++++++
 5 files changed, 359 insertions(+)
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 462a5b1..ee1d2ac 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -1,3 +1,20 @@
+comment "Intel MIC Bus Driver"
+
+config INTEL_MIC_BUS
+	tristate "Intel MIC Bus Driver"
+	depends on 64BIT && PCI && X86 && X86_DEV_DMA_OPS
+	help
+	  This option is selected by any driver which registers a
+	  device or driver on the MIC Bus, such as CONFIG_INTEL_MIC_HOST,
+	  CONFIG_INTEL_MIC_CARD, CONFIG_INTEL_MIC_X100_DMA etc.
+
+	  If you are building a host/card kernel with an Intel MIC device
+	  then say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
+
 comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 05b34d6..e9bf148 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -4,3 +4,4 @@
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
 obj-$(CONFIG_INTEL_MIC_CARD) += card/
+obj-$(CONFIG_INTEL_MIC_BUS) += bus/
diff --git a/drivers/misc/mic/bus/Makefile b/drivers/misc/mic/bus/Makefile
new file mode 100644
index 0000000..d85c7f2
--- /dev/null
+++ b/drivers/misc/mic/bus/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2014, Intel Corporation.
+#
+obj-$(CONFIG_INTEL_MIC_BUS) += mic_bus.o
diff --git a/drivers/misc/mic/bus/mic_bus.c b/drivers/misc/mic/bus/mic_bus.c
new file mode 100644
index 0000000..39253b5
--- /dev/null
+++ b/drivers/misc/mic/bus/mic_bus.c
@@ -0,0 +1,188 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ drivers/virtio/virtio.c
+ */
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/mic_bus.h>
+
+/* Unique numbering for mbus devices. */
+static DEFINE_IDA(mbus_index_ida);
+
+static ssize_t device_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.device);
+}
+static DEVICE_ATTR_RO(device);
+
+static ssize_t vendor_show(struct device *d,
+			   struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "0x%04x\n", dev->id.vendor);
+}
+static DEVICE_ATTR_RO(vendor);
+
+static ssize_t modalias_show(struct device *d,
+			     struct device_attribute *attr, char *buf)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	return sprintf(buf, "mbus:d%08Xv%08X\n",
+		       dev->id.device, dev->id.vendor);
+}
+static DEVICE_ATTR_RO(modalias);
+
+static struct attribute *mbus_dev_attrs[] = {
+	&dev_attr_device.attr,
+	&dev_attr_vendor.attr,
+	&dev_attr_modalias.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(mbus_dev);
+
+static inline int mbus_id_match(const struct mbus_device *dev,
+				  const struct mbus_device_id *id)
+{
+	if (id->device != dev->id.device && id->device != MBUS_DEV_ANY_ID)
+		return 0;
+
+	return id->vendor == MBUS_DEV_ANY_ID || id->vendor == dev->id.vendor;
+}
+
+/*
+ * This looks through all the IDs a driver claims to support.  If any of them
+ * match, we return 1 and the kernel will call mbus_dev_probe().
+ */
+static int mbus_dev_match(struct device *dv, struct device_driver *dr)
+{
+	unsigned int i;
+	struct mbus_device *dev = dev_to_mbus(dv);
+	const struct mbus_device_id *ids;
+
+	ids = drv_to_mbus(dr)->id_table;
+	for (i = 0; ids[i].device; i++)
+		if (mbus_id_match(dev, &ids[i]))
+			return 1;
+	return 0;
+}
+
+static int mbus_uevent(struct device *dv, struct kobj_uevent_env *env)
+{
+	struct mbus_device *dev = dev_to_mbus(dv);
+
+	return add_uevent_var(env, "MODALIAS=mbus:d%08Xv%08X",
+			      dev->id.device, dev->id.vendor);
+}
+
+static int mbus_dev_probe(struct device *d)
+{
+	int err;
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	err = drv->probe(dev);
+	if (!err)
+		if (drv->scan)
+			drv->scan(dev);
+	return err;
+}
+
+static int mbus_dev_remove(struct device *d)
+{
+	struct mbus_device *dev = dev_to_mbus(d);
+	struct mbus_driver *drv = drv_to_mbus(dev->dev.driver);
+
+	drv->remove(dev);
+	return 0;
+}
+
+static struct bus_type mic_bus = {
+	.name  = "mic_bus",
+	.match = mbus_dev_match,
+	.dev_groups = mbus_dev_groups,
+	.uevent = mbus_uevent,
+	.probe = mbus_dev_probe,
+	.remove = mbus_dev_remove,
+};
+
+int register_mbus_driver(struct mbus_driver *driver)
+{
+	driver->driver.bus = &mic_bus;
+	return driver_register(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(register_mbus_driver);
+
+void unregister_mbus_driver(struct mbus_driver *driver)
+{
+	driver_unregister(&driver->driver);
+}
+EXPORT_SYMBOL_GPL(unregister_mbus_driver);
+
+int register_mbus_device(struct mbus_device *dev)
+{
+	int err;
+
+	dev->dev.bus = &mic_bus;
+
+	/* Assign a unique device index and hence name. */
+	err = ida_simple_get(&mbus_index_ida, 0, 0, GFP_KERNEL);
+	if (err < 0)
+		return err;
+
+	dev->index = err;
+	dev_set_name(&dev->dev, "mbus-dev%u", dev->index);
+	/*
+	 * device_register() causes the bus infrastructure to look for a
+	 * matching driver.
+	 */
+	err = device_register(&dev->dev);
+	return err;
+}
+EXPORT_SYMBOL_GPL(register_mbus_device);
+
+void unregister_mbus_device(struct mbus_device *dev)
+{
+	int index = dev->index; /* save for after device release */
+
+	device_unregister(&dev->dev);
+	ida_simple_remove(&mbus_index_ida, index);
+}
+EXPORT_SYMBOL_GPL(unregister_mbus_device);
+
+static int __init mbus_init(void)
+{
+	return bus_register(&mic_bus);
+}
+
+static void __exit mbus_exit(void)
+{
+	bus_unregister(&mic_bus);
+}
+
+core_initcall(mbus_init);
+module_exit(mbus_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC Bus driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/mic_bus.h b/include/linux/mic_bus.h
new file mode 100644
index 0000000..8297573
--- /dev/null
+++ b/include/linux/mic_bus.h
@@ -0,0 +1,148 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Bus driver.
+ *
+ * This implementation is very similar to the virtio bus driver
+ * implementation @ include/linux/virtio.h.
+ */
+#ifndef _MIC_BUS_H_
+#define _MIC_BUS_H_
+/*
+ * Everything a mbus driver needs to work with any particular mbus
+ * implementation.
+ */
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/mod_devicetable.h>
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+
+struct mbus_device_id {
+	__u32 device;
+	__u32 vendor;
+};
+
+#define MBUS_DEV_DMA_HOST 2
+#define MBUS_DEV_DMA_MIC 3
+#define MBUS_DEV_ANY_ID 0xffffffff
+
+/**
+ * mbus_device - representation of a device using mbus
+ * @priv: private pointer for the driver's use.
+ * @mmio_va: virtual address of mmio space
+ * @hw_ops: the hardware ops supported by this device.
+ * @id: the device type identification (used to match it with a driver).
+ * @dev: underlying device.
+ * be used to communicate with.
+ * @index: unique position on the mbus bus
+ */
+struct mbus_device {
+	void *priv;
+	void __iomem *mmio_va;
+	struct mbus_hw_ops *hw_ops;
+	struct mbus_device_id id;
+	struct device dev;
+	int index;
+};
+
+/**
+ * mbus_driver - operations for a mbus I/O driver
+ * @driver: underlying device driver (populate name and owner).
+ * @id_table: the ids serviced by this driver.
+ * @probe: the function to call when a device is found.  Returns 0 or -errno.
+ * @remove: the function to call when a device is removed.
+ */
+struct mbus_driver {
+	struct device_driver driver;
+	const struct mbus_device_id *id_table;
+	int (*probe)(struct mbus_device *dev);
+	void (*scan)(struct mbus_device *dev);
+	void (*remove)(struct mbus_device *dev);
+};
+
+/**
+ * struct mic_irq - opaque pointer used as cookie
+ */
+struct mic_irq;
+
+/**
+ * mbus_hw_ops - Hardware operations for accessing a MIC device on the MIC bus.
+ */
+struct mbus_hw_ops {
+	struct mic_irq* (*request_threaded_irq)(struct mbus_device *mbdev,
+			irq_handler_t handler, irq_handler_t thread_fn,
+			const char *name, void *data, int intr_src);
+	void (*free_irq)(struct mbus_device *mbdev,
+			struct mic_irq *cookie, void *data);
+	void (*ack_interrupt)(struct mbus_device *mbdev, int num);
+};
+
+int register_mbus_device(struct mbus_device *dev);
+void unregister_mbus_device(struct mbus_device *dev);
+
+int register_mbus_driver(struct mbus_driver *drv);
+void unregister_mbus_driver(struct mbus_driver *drv);
+
+static inline struct mbus_device *dev_to_mbus(struct device *_dev)
+{
+	return container_of(_dev, struct mbus_device, dev);
+}
+
+static inline struct mbus_driver *drv_to_mbus(struct device_driver *drv)
+{
+	return container_of(drv, struct mbus_driver, driver);
+}
+
+static inline void mbus_release_dev(struct device *d)
+{
+}
+
+static inline int
+mbus_add_device(struct mbus_device *mbdev, struct device *pdev, int id,
+		struct dma_map_ops *dma_ops, struct mbus_hw_ops *hw_ops,
+		void __iomem *mmio_va)
+{
+	int ret;
+
+	mbdev->mmio_va = mmio_va;
+	mbdev->dev.parent = pdev;
+	mbdev->id.device = id;
+	mbdev->id.vendor = MBUS_DEV_ANY_ID;
+	mbdev->dev.archdata.dma_ops = dma_ops;
+	mbdev->dev.dma_mask = &mbdev->dev.coherent_dma_mask;
+	dma_set_mask(&mbdev->dev, DMA_BIT_MASK(64));
+	mbdev->dev.release = mbus_release_dev;
+	mbdev->hw_ops = hw_ops;
+	dev_set_drvdata(&mbdev->dev, mbdev);
+
+	ret = register_mbus_device(mbdev);
+	if (ret) {
+		dev_err(mbdev->dev.parent,
+			"Failed to register mbus device type %u\n", id);
+		return ret;
+	}
+	return 0;
+}
+
+static inline void mbus_remove_device(struct mbus_device *mbdev)
+{
+	unregister_mbus_device(mbdev);
+	memset(mbdev, 0x0, sizeof(*mbdev));
+}
+
+#endif /* _MIC_BUS_H_ */
-- 
1.8.2.1



* [PATCH char-misc-next 0/8] Enable dma driver for MIC X100 Coprocessors
@ 2014-05-28  2:36 83% Sudeep Dutt
  2014-05-28  2:36 91% ` [PATCH char-misc-next 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Sudeep Dutt @ 2014-05-28  2:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Dan Williams
  Cc: Siva Yerramreddy, Ashutosh Dixit, Nikhil Rao, Sudeep Dutt,
	linux-kernel, dmaengine

These patches are being sent to char-misc because there is a dependency
between the patches for dma driver and other drivers.

Description:
============
This set of patches adds support for the MIC X100 DMA driver.
The MIC PCIe card has a DMA controller with 8 channels. These channels are
shared between the host s/w and the card s/w: channels 0 to 3 are used by
the host and 4 to 7 by the card. As the DMA device doesn't show up as a PCIe
device, a virtual bus called the MIC bus is created, and virtual DMA devices
are created on it by the host/card drivers. On the host the channels are
private and used only by the host driver to transfer data for the virtio
devices.
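
The channel split described above can be written down as a small lookup.
This is a sketch for illustration only: the demo_* names are hypothetical
and do not come from the driver, and the only facts taken from the
description are the channel count (8) and the ranges (0 to 3 for the host,
4 to 7 for the card).

```c
#include <assert.h>

enum demo_owner { DEMO_OWNER_HOST, DEMO_OWNER_CARD };

enum {
	DEMO_NUM_CHANNELS = 8,        /* channels on the controller */
	DEMO_FIRST_CARD_CHANNEL = 4   /* channels below this belong to the host */
};

/* Return which side owns DMA channel ch (valid range 0 to 7). */
static enum demo_owner demo_chan_owner(int ch)
{
	assert(ch >= 0 && ch < DEMO_NUM_CHANNELS);
	return ch < DEMO_FIRST_CARD_CHANNEL ? DEMO_OWNER_HOST : DEMO_OWNER_CARD;
}
```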

Here is a high-level block diagram:
                                      |
               +----------+           |             +----------+
               | Card OS  |           |             | Host OS  |
               +----------+           |             +----------+
                                      |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
        | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
        | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
        | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
        +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
            |         |         |     |      |            |         |
            |         |         |     |User  |            |         |
            |         |         |     |------|------------|---------|-------
            +-------------------+     |Kernel +--------------------------+
                      |               |       | Virtio over PCIe IOCTLs  |
                      |               |       +--------------------------+
+-----------+         |               |                   |  +-----------+
| MIC DMA   |         |               |                   |  | MIC DMA   |
| Driver    |         |               |                   |  | Driver    |
+-----------+         |               |                   |  +-----------+
      |               |               |                   |        |
+---------------+     |               |                   |  +----------------+
|MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
+---------------+     |               |                   |  +----------------+
      |               |               |                   |              |
      |   +--------------+            |            +---------------+     |
      |   |Intel MIC     |            |            |Intel MIC      |     |
      +---|Card Driver   |            |            |Host Driver    |     |
          +--------------+            |            +---------------+-----+
                      |               |                   |
             +-------------------------------------------------------------+
             |                                                             |
             |                    PCIe Bus                                 |
             +-------------------------------------------------------------+

The following series of patches are partitioned as follows:

Patch 1: Add mic bus and dma driver documentation.
	 Author: Siva Yerramreddy
Patch 2: Add a bus driver for virtual MIC devices.
	 Authors: Siva Yerramreddy, Sudeep Dutt
Patch 3: MIC X100 DMA Driver.
	 Author: Siva Yerramreddy
Patch 4: Add threaded irq support in host driver.
	 This is needed as the dma driver uses threaded irq.
	 Author: Siva Yerramreddy
Patch 5: Use dma to transfer data between MIC and host.
	 Authors: Siva Yerramreddy, Ashutosh Dixit
Patch 6: Add threaded irq support in mic_request_card_irq.
	 This is needed as the dma driver uses threaded irq.
	 Author: Siva Yerramreddy
Patch 7: Add dma device on mic bus.
	 Author: Siva Yerramreddy
Patch 8: Modify the mpss script to load/unload mic_x100_dma.ko.
	 Author: Siva Yerramreddy

The patches have been compiled/validated against v3.15-rc3. Tested using
dmatest module with module parameter "threads_per_chan=60". These patches
have also been scanned by Fengguang Wu's 0-day infrastructure and no
issues have been reported.

Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
review.

Siva Yerramreddy (8):
  misc: mic: Add mic bus and dma driver documentation
  misc: mic: add a bus driver for virtual MIC devices
  dma: MIC X100 DMA Driver
  misc: mic: add threaded irq support in host driver
  misc: mic: add dma support in host driver
  misc: mic: add threaded irq support in card driver
  misc: mic: add dma support in card driver
  misc: mic: add support for loading/unloading dma driver

 Documentation/mic/mic_overview.txt |  67 ++--
 Documentation/mic/mpssd/mpss       |  14 +-
 drivers/dma/Kconfig                |  19 +
 drivers/dma/Makefile               |   1 +
 drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
 drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
 drivers/misc/mic/Kconfig           |  21 +-
 drivers/misc/mic/Makefile          |   1 +
 drivers/misc/mic/bus/Makefile      |   5 +
 drivers/misc/mic/bus/mic_bus.c     | 188 +++++++++
 drivers/misc/mic/card/mic_device.c |  21 +-
 drivers/misc/mic/card/mic_device.h |   8 +-
 drivers/misc/mic/card/mic_virtio.c |   4 +-
 drivers/misc/mic/card/mic_x100.c   |  52 ++-
 drivers/misc/mic/host/mic_boot.c   |  78 +++-
 drivers/misc/mic/host/mic_device.h |  24 ++
 drivers/misc/mic/host/mic_intr.c   | 116 +++---
 drivers/misc/mic/host/mic_intr.h   |  21 +-
 drivers/misc/mic/host/mic_main.c   |   5 +-
 drivers/misc/mic/host/mic_virtio.c | 185 +++++++--
 drivers/misc/mic/host/mic_virtio.h |  21 +-
 drivers/misc/mic/host/mic_x100.c   |   8 +
 include/linux/mic_bus.h            | 148 +++++++
 23 files changed, 1922 insertions(+), 145 deletions(-)
 create mode 100644 drivers/dma/mic_x100_dma.c
 create mode 100644 drivers/dma/mic_x100_dma.h
 create mode 100644 drivers/misc/mic/bus/Makefile
 create mode 100644 drivers/misc/mic/bus/mic_bus.c
 create mode 100644 include/linux/mic_bus.h

-- 
1.8.2.1


^ permalink raw reply	[relevance 83%]

* Re: [PATCH 0/8] Enable dma driver for MIC X100 Coprocessors.
       [not found]     ` <1398363008.6542.1.camel@siva-rh63@ra.intel.com>
@ 2014-05-08  3:10 99%   ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2014-05-08  3:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dan Williams, Nikhil Rao, Ashutosh Dixit, linux-kernel,
	dmaengine, Sudeep Dutt, Siva Yerramreddy

On Thu, 2014-04-24 at 11:10 -0700, Siva Krishna Yerramreddy wrote:
> On Mon, 2014-04-14 at 13:14 -0700, Siva Yerramreddy wrote:
> > I am sending all these patches to char-misc because there is a dependency
> > between the patches for dma driver and other drivers.
> > 
> Greg, any feedback on the patches?

Hi Greg,
The primary author of this patch series, Siva, is no longer with Intel, so
we will be taking ownership of addressing review feedback.

The patches have been applied to the MIC GitHub tree which is registered
with Fengguang Wu's 0-day infrastructure and no issues have been
reported.

We have not received any feedback on the patches yet and were wondering
if you had a chance to review them?

Thanks,
Sudeep Dutt

> > Description:
> > ============
> > This set of patches adds support for MIC X100 dma driver.
> > MIC PCIe card has a dma controller with 8 channels. These channels are
> > shared between the host s/w and the card s/w. 0 to 3 are used by host
> > and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
> > a virtual bus called mic bus is created and virtual dma devices are
> > created on it by the host/card drivers. On host the channels are private
> > and used only by the host driver to transfer data for the virtio devices.
> > 
> > Here is a higher level block diagram.
> >                                       |
> >                +----------+           |             +----------+
> >                | Card OS  |           |             | Host OS  |
> >                +----------+           |             +----------+
> >                                       |
> >         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> >         | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
> >         | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
> >         | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
> >         +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> >             |         |         |     |      |            |         |
> >             |         |         |     |User  |            |         |
> >             |         |         |     |------|------------|---------|-------
> >             +-------------------+     |Kernel +--------------------------+
> >                       |               |       | Virtio over PCIe IOCTLs  |
> >                       |               |       +--------------------------+
> > +-----------+         |               |                   |  +-----------+
> > | MIC DMA   |         |               |                   |  | MIC DMA   |
> > | Driver    |         |               |                   |  | Driver    |
> > +-----------+         |               |                   |  +-----------+
> >       |               |               |                   |        |
> > +---------------+     |               |                   |  +----------------+
> > |MIC virtual Bus|     |               |                   |  |MIC virtual Bus |
> > +---------------+     |               |                   |  +----------------+
> >       |               |               |                   |              |
> >       |   +--------------+            |            +---------------+     |
> >       |   |Intel MIC     |            |            |Intel MIC      |     |
> >       +---|Card Driver   |            |            |Host Driver    |     |
> >           +--------------+            |            +---------------+-----+
> >                       |               |                   |
> >              +-------------------------------------------------------------+
> >              |                                                             |
> >              |                    PCIe Bus                                 |
> >              +-------------------------------------------------------------+
> > 
> > The following series of patches are partitioned as follows:
> > 
> > Patch 1: Add mic bus and dma driver documentation.
> > 	 Author: Siva Yerramreddy
> > Patch 2: Add a bus driver for virtual MIC devices.
> > 	 Authors: Siva Yerramreddy, Sudeep Dutt
> > Patch 3: MIC X100 DMA Driver.
> > 	 Author: Siva Yerramreddy
> > Patch 4: Add threaded irq support in host driver.
> > 	 This is needed as the dma driver uses threaded irq.
> > 	 Author: Siva Yerramreddy
> > Patch 5: Add dma support in host driver.
> > 	 Authors: Siva Yerramreddy, Ashutosh Dixit, Sudeep Dutt
> > Patch 6: Add threaded irq support in card driver.
> > 	 This is needed as the dma driver uses threaded irq.
> > 	 Author: Siva Yerramreddy
> > Patch 7: Add dma support in card driver.
> > 	 Author: Siva Yerramreddy
> > Patch 8: Add support for loading/unloading dma driver.
> > 	 Author: Siva Yerramreddy
> > 
> > The patches have been compiled/validated against v3.14.
> > Tested using dmatest module with module parameter "threads_per_chan=60".
> > 
> > Thanks to Dan Williams, Vinod Koul, Jon Mason, Dave Jiang for the initial
> > review.
> > 
> > Siva Yerramreddy (8):
> >   misc: mic: Add mic bus and dma driver documentation
> >   misc: mic: add a bus driver for virtual MIC devices
> >   dma: MIC X100 DMA Driver
> >   misc: mic: add threaded irq support in host driver
> >   misc: mic: add dma support in host driver
> >   misc: mic: add threaded irq support in card driver
> >   misc: mic: add dma support in card driver
> >   misc: mic: add support for loading/unloading dma driver
> > 
> >  Documentation/mic/mic_overview.txt |  67 ++--
> >  Documentation/mic/mpssd/mpss       |  14 +-
> >  drivers/dma/Kconfig                |  19 +
> >  drivers/dma/Makefile               |   1 +
> >  drivers/dma/mic_x100_dma.c         | 774 +++++++++++++++++++++++++++++++++++++
> >  drivers/dma/mic_x100_dma.h         | 286 ++++++++++++++
> >  drivers/misc/mic/Kconfig           |  21 +-
> >  drivers/misc/mic/Makefile          |   1 +
> >  drivers/misc/mic/bus/Makefile      |   5 +
> >  drivers/misc/mic/bus/mic_bus.c     | 188 +++++++++
> >  drivers/misc/mic/card/mic_device.c |  21 +-
> >  drivers/misc/mic/card/mic_device.h |   8 +-
> >  drivers/misc/mic/card/mic_virtio.c |   4 +-
> >  drivers/misc/mic/card/mic_x100.c   |  52 ++-
> >  drivers/misc/mic/host/mic_boot.c   |  78 +++-
> >  drivers/misc/mic/host/mic_device.h |  24 ++
> >  drivers/misc/mic/host/mic_intr.c   | 116 +++---
> >  drivers/misc/mic/host/mic_intr.h   |  21 +-
> >  drivers/misc/mic/host/mic_main.c   |   5 +-
> >  drivers/misc/mic/host/mic_virtio.c | 185 +++++++--
> >  drivers/misc/mic/host/mic_virtio.h |  21 +-
> >  drivers/misc/mic/host/mic_x100.c   |   8 +
> >  include/linux/mic_bus.h            | 148 +++++++
> >  23 files changed, 1922 insertions(+), 145 deletions(-)
> >  create mode 100644 drivers/dma/mic_x100_dma.c
> >  create mode 100644 drivers/dma/mic_x100_dma.h
> >  create mode 100644 drivers/misc/mic/bus/Makefile
> >  create mode 100644 drivers/misc/mic/bus/mic_bus.c
> >  create mode 100644 include/linux/mic_bus.h
> > 
> 
> 



^ permalink raw reply	[relevance 99%]

* Re: [PATCH v2 1/4] misc: mic: Add include of <linux/irqreturn.h>
  @ 2014-02-19 17:09 99%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-02-19 17:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, e1000-devel, Joerg Roedel, linux-kernel,
	Stephen Hemminger, iommu, Tomas Winkler, Aaron F Brown,
	Arjan van de Ven, Sudeep Dutt, Ashutosh Dixit

On Tue, 2014-02-18 at 13:59 -0700, Bjorn Helgaas wrote:
> We currently include <linux/irqreturn.h> in <linux/pci.h>, but I'm about to
> remove that from linux/pci.h, so add explicit includes where needed.
> 
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/misc/mic/card/mic_device.h |    1 +
>  drivers/misc/mic/host/mic_device.h |    1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
> index 347b9b3b7916..306f502be95e 100644
> --- a/drivers/misc/mic/card/mic_device.h
> +++ b/drivers/misc/mic/card/mic_device.h
> @@ -29,6 +29,7 @@
>  
>  #include <linux/workqueue.h>
>  #include <linux/io.h>
> +#include <linux/irqreturn.h>
>  
>  /**
>   * struct mic_intr_info - Contains h/w specific interrupt sources info
> diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
> index 1a6edce2ecde..0398c696d257 100644
> --- a/drivers/misc/mic/host/mic_device.h
> +++ b/drivers/misc/mic/host/mic_device.h
> @@ -24,6 +24,7 @@
>  #include <linux/cdev.h>
>  #include <linux/idr.h>
>  #include <linux/notifier.h>
> +#include <linux/irqreturn.h>
>  
>  #include "mic_intr.h"
>  
> 

Acked-by: Sudeep Dutt <sudeep.dutt@intel.com>

Thanks,




^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-linus] misc: mic: fix possible signed underflow (undefined behavior) in userspace API
@ 2014-02-03 22:53 96% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-02-03 22:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mathieu Desnoyers, linux-kernel, Ashutosh Dixit, Nikhil Rao,
	Sudeep Dutt, Dasaratharaman Chandramouli

iovcnt is declared as a signed integer in both the userspace API and
as a local variable in mic_virtio.c. The while() loop in mic_virtio.c
iterates until the local variable iovcnt reaches the value 0. If
userspace passes e.g. INT_MIN as iovcnt field, this loop then appears
to depend on undefined behavior (signed underflow) to complete.
The fix is to use unsigned integers in both the userspace API and
the local variable.
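
The difference between the two counter types can be illustrated in user
space. The following is only a minimal sketch under stated assumptions, not
the driver code: signed_count_is_valid(), drain() and example_checks() are
hypothetical helpers, and max_iter exists only to keep the demo loop bounded.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not the driver code.
 * With a signed count, negative values such as INT_MIN have to be
 * rejected explicitly before the count is used as a loop bound.
 */
static bool signed_count_is_valid(int iovcnt)
{
	return iovcnt >= 0;
}

/*
 * With an unsigned counter every representable value is a valid count
 * and decrementing towards 0 is well defined. Returns the number of
 * iterations performed, capped at max_iter to keep the demo bounded.
 */
static uint32_t drain(uint32_t iovcnt, uint32_t max_iter)
{
	uint32_t done = 0;

	while (iovcnt && done < max_iter) {
		iovcnt--;
		done++;
	}
	return done;
}

/* Usage example for the two helpers above. */
static void example_checks(void)
{
	assert(!signed_count_is_valid(INT_MIN));
	assert(drain(3, 10) == 3);
	assert(drain(UINT32_MAX, 5) == 5);
}
```

With the unsigned type no separate range check on the sign is needed, which
is the property the fix relies on.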

This issue was reported @ https://lkml.org/lkml/2014/1/10/10

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/host/mic_virtio.c | 3 ++-
 include/uapi/linux/mic_ioctl.h     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 752ff87..7e1ef0e 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -156,7 +156,8 @@ static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
 static int _mic_virtio_copy(struct mic_vdev *mvdev,
 	struct mic_copy_desc *copy)
 {
-	int ret = 0, iovcnt = copy->iovcnt;
+	int ret = 0;
+	u32 iovcnt = copy->iovcnt;
 	struct iovec iov;
 	struct iovec __user *u_iov = copy->iov;
 	void __user *ubuf = NULL;
diff --git a/include/uapi/linux/mic_ioctl.h b/include/uapi/linux/mic_ioctl.h
index 7fabba5..feb0b4c 100644
--- a/include/uapi/linux/mic_ioctl.h
+++ b/include/uapi/linux/mic_ioctl.h
@@ -39,7 +39,7 @@ struct mic_copy_desc {
 #else
 	struct iovec *iov;
 #endif
-	int iovcnt;
+	__u32 iovcnt;
 	__u8 vr_idx;
 	__u8 update_used;
 	__u32 out_len;
-- 
1.8.2.1


^ permalink raw reply related	[relevance 96%]

* Re: Intel MIC host driver: possible signed underflow (undefined behavior) in userspace API
  @ 2014-01-10 18:22 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2014-01-10 18:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mathieu Desnoyers, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Nikhil Rao, Harshavardhan R Kharche,
	Peter P Waskiewicz Jr, Linux Kernel Mailing List, Sudeep Dutt

On Fri, 2014-01-10 at 06:21 -0800, Greg Kroah-Hartman wrote:
> On Fri, Jan 10, 2014 at 05:56:25AM +0000, Mathieu Desnoyers wrote:
> > Hi,
> > 
> > Looking at this commit:
> > 
> > commit f69bcbf3b4c4b333dcd7a48eaf868bf0c88edab5
> > Author: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Date:   Thu Sep 5 16:42:18 2013 -0700
> > 
> >     Intel MIC Host Driver Changes for Virtio Devices.
> > 
> > Especially at:
> > 
> > +struct mic_copy_desc {
> > +#ifdef __KERNEL__
> > +       struct iovec __user *iov;
> > +#else
> > +       struct iovec *iov;
> > +#endif
> > +       int iovcnt;
> > +       __u8 vr_idx;
> > +       __u8 update_used;
> > +       __u32 out_len;
> > +};
> > 
> > Seeing iovcnt being declared as a signed integer seems strange. The
> > first question would be: why is it signed rather than unsigned ?
> > 
> > Then, looking further into 
> > 
> > drivers/misc/mic/host/mic_virtio.c:_mic_virtio_copy()
> > 
> > We can see that the while() loop iterates until the local variable
> > iovcnt reaches the value 0 (and iovcnt is also a signed integer). If
> > user-space passes e.g. INT_MIN as iovcnt field, this loop then appears
> > to depend on undefined behavior (signed underflow) to complete.
> > Wouldn't it be better to use unsigned integers both in the
> > userspace API and for the local variable ?
> 
> Better yet, it should be a "__" type variable, as "int" doesn't mean
> much when crossing the user/kernel boundary...
> 

We had designed the interface to be similar to readv(..)/writev(..)
which take an integer iovcnt. However, the suggestion to use __u32
works nicely since it avoids adding the missing integer parameter
validation in the driver. We will post a patch incorporating this
feedback for the next kernel release.

Thanks to Mathieu for reporting this issue.

Sudeep Dutt


^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-next] misc: mic: Fix build issues in sample daemon.
@ 2013-10-02 23:09 85% Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-10-02 23:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, Rob Landley,
	linux-kernel, linux-doc
  Cc: Nikhil Rao, Caz Yokoyama, Ashutosh Dixit,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

Specifying the gcc format function attribute for mpsslog(..) and
building on 32-bit systems exposed a few build issues in the
sample MIC daemon which are fixed by this patch. Some of these
changes were authored by Joe Perches @
https://lkml.org/lkml/2013/9/27/419
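
For readers unfamiliar with the attribute, here is a minimal user-space
sketch of the two kinds of fixes involved. It is an illustration only:
demo_log(), log_len(), is_aligned_8() and example_checks() are hypothetical
names, and the alignment check uses uintptr_t where the patch itself casts
to unsigned long.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical logger, loosely modelled on the mpsslog() prototype.
 * The attribute asks gcc/clang to check the variadic arguments against
 * the format string: argument 1 is the format, varargs start at 2.
 */
static int demo_log(const char *format, ...)
	__attribute__((format(printf, 1, 2)));

static int demo_log(const char *format, ...)
{
	va_list ap;
	int n;

	va_start(ap, format);
	n = vfprintf(stderr, format, ap);
	va_end(ap);
	return n;	/* number of characters written, or negative on error */
}

/* %zx matches size_t on both 32-bit and 64-bit builds. */
static int log_len(size_t len)
{
	return demo_log("len 0x%zx\n", len);
}

/* Pointer alignment test through an integer type of pointer width. */
static int is_aligned_8(const void *p)
{
	return ((uintptr_t)p % 8) == 0;
}

/* Usage example for the helpers above. */
static void example_checks(void)
{
	assert(log_len(0x10) == 9);	/* "len 0x10\n" is 9 characters */
	assert(is_aligned_8((const void *)(uintptr_t)16));
	assert(!is_aligned_8((const void *)(uintptr_t)12));
}
```

Once the attribute is in place, passing a size_t to a "%x" or "%lx"
conversion is reported by -Wformat on builds where the widths differ, which
is how mismatches of this kind become visible.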

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpssd.c | 22 +++++++++++-----------
 Documentation/mic/mpssd/mpssd.h |  1 +
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
index f9327a2..82c6bc2 100644
--- a/Documentation/mic/mpssd/mpssd.c
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -362,9 +362,9 @@ static inline void verify_out_len(struct mic_info *mic,
 	struct mic_copy_desc *copy)
 {
 	if (copy->out_len != sum_iovec_len(copy)) {
-		mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%x\n",
+		mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%zx\n",
 			mic->name, __func__, __LINE__,
-				copy->out_len, sum_iovec_len(copy));
+			copy->out_len, sum_iovec_len(copy));
 		assert(copy->out_len == sum_iovec_len(copy));
 	}
 }
@@ -377,7 +377,7 @@ disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy,
 	int i;
 
 	for (i = 0; i < copy->iovcnt; i++)
-		mpsslog("%s %s %d copy->iov[%d] addr %p len 0x%lx\n",
+		mpsslog("%s %s %d copy->iov[%d] addr %p len 0x%zx\n",
 			mic->name, s, line, i,
 			copy->iov[i].iov_base, copy->iov[i].iov_len);
 }
@@ -630,7 +630,7 @@ virtio_net(void *arg)
 				disp_iovec(mic, &copy, __func__, __LINE__);
 				mpsslog("%s %s %d read failed %s ", mic->name,
 					__func__, __LINE__, strerror(errno));
-				mpsslog("cnt %d sum %d\n",
+				mpsslog("cnt %d sum %zd\n",
 					copy.iovcnt, sum_iovec_len(&copy));
 			}
 		}
@@ -680,8 +680,8 @@ virtio_net(void *arg)
 					if (len != sum_iovec_len(&copy)) {
 						mpsslog("Tun write failed %s ",
 							strerror(errno));
-						mpsslog("len 0x%x ", len);
-						mpsslog("read_len 0x%x\n",
+						mpsslog("len 0x%zx ", len);
+						mpsslog("read_len 0x%zx\n",
 							sum_iovec_len(&copy));
 					} else {
 #ifdef DEBUG
@@ -827,7 +827,7 @@ virtio_console(void *arg)
 				mpsslog("%s %s %d read failed %s ",
 					mic->name, __func__, __LINE__,
 					strerror(errno));
-				mpsslog("cnt %d sum %d\n",
+				mpsslog("cnt %d sum %zd\n",
 					copy.iovcnt, sum_iovec_len(&copy));
 			}
 		}
@@ -859,8 +859,8 @@ virtio_console(void *arg)
 					if (len != sum_iovec_len(&copy)) {
 						mpsslog("Tun write failed %s ",
 							strerror(errno));
-						mpsslog("len 0x%x ", len);
-						mpsslog("read_len 0x%x\n",
+						mpsslog("len 0x%zx ", len);
+						mpsslog("read_len 0x%zx\n",
 							sum_iovec_len(&copy));
 					} else {
 #ifdef DEBUG
@@ -953,7 +953,7 @@ set_backend_file(struct mic_info *mic)
 		return false;
 	mic->mic_virtblk.backend_file = malloc(strlen(evv) + 1);
 	if (mic->mic_virtblk.backend_file == NULL) {
-		mpsslog("can't allocate memory\n", mic->name, mic->id);
+		mpsslog("%s %d can't allocate memory\n", mic->name, mic->id);
 		return false;
 	}
 	strcpy(mic->mic_virtblk.backend_file, evv + 1);
@@ -1026,7 +1026,7 @@ close_backend(struct mic_info *mic)
 static bool
 start_virtblk(struct mic_info *mic, struct mic_vring *vring)
 {
-	if (((__u64)&virtblk_dev_page.blk_config % 8) != 0) {
+	if (((unsigned long)&virtblk_dev_page.blk_config % 8) != 0) {
 		mpsslog("%s: blk_config is not 8 byte aligned.\n",
 			mic->name);
 		return false;
diff --git a/Documentation/mic/mpssd/mpssd.h b/Documentation/mic/mpssd/mpssd.h
index b6dee38..ccd589f 100644
--- a/Documentation/mic/mpssd/mpssd.h
+++ b/Documentation/mic/mpssd/mpssd.h
@@ -94,6 +94,7 @@ struct mic_info {
 	struct mic_info *next;
 };
 
+__attribute__((format(printf, 1, 2)))
 void mpsslog(char *format, ...);
 char *readsysfs(char *dir, char *entry);
 int setsysfs(char *dir, char *entry, char *value);
-- 
1.8.2.1


^ permalink raw reply related	[relevance 85%]

* Re: [PATCH CHAR-MISC-NEXT 4/4] Intel MIC Header File warning fix.
  @ 2013-09-27 19:37 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 19:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Joe Perches, linux-kernel, linux-doc, Nikhil Rao, Ashutosh Dixit,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-26 at 20:17 -0700, Greg Kroah-Hartman wrote:
> On Thu, Sep 26, 2013 at 06:10:23PM -0700, Sudeep Dutt wrote:
> > usr/include/linux/mic_ioctl.h:37:
> > found __[us]{8,16,32,64} type without #include <linux/types.h>
> 
> What is that from?  What does it mean?
> 

This warning was displayed during "make headers_check" via
scripts/headers_check.pl. The patch description has been fixed up to
read as follows in the patches resent this morning @
https://lkml.org/lkml/2013/9/27/376 

The following warning from mic_ioctl.h is fixed via this patch:
found __[us]{8,16,32,64} type without #include <linux/types.h>

Sorry for the confusion.

Thanks,
Sudeep Dutt

> confused,
> 
> greg k-h



^ permalink raw reply	[relevance 99%]

* Re: [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix.
  @ 2013-09-27 18:14 99%   ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 18:14 UTC (permalink / raw)
  To: Joe Perches
  Cc: Greg Kroah-Hartman, Arnd Bergmann, linux-kernel, linux-doc,
	Fengguang Wu, Stephen Rothwell, Nikhil Rao, Ashutosh Dixit,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

On Fri, 2013-09-27 at 10:29 -0700, Joe Perches wrote:
> On Fri, 2013-09-27 at 09:49 -0700, Sudeep Dutt wrote:
> > These patches address code review feedback received on the
> > patch series @ https://lkml.org/lkml/2013/9/5/561 ,from:
> > a) Greg Kroah-Hartman on sysfs and header file cleanups.
> > b) Joe Perches on issues found by "--strict" checkpatch.
> > 
> > It should also fix the build failures on certain non X86
> > architectures, reported in linux-next and 0-DAY builds.
> > 
> > Ashutosh Dixit (1):
> >   misc: mic: cleanups for "--strict" checkpatch.
> > 
> > Sudeep Dutt (4):
> >   misc: mic: host driver sysfs cleanups.
> >   misc: mic: header file cleanups.
> >   misc: mic: fix a warning in the IOCTL header file.
> >   misc: mic: depend on X86 for both host and card drivers.
> > 
> >  Documentation/mic/mpssd/mpssd.c                    | 65 +++++++--------
> >  Documentation/mic/mpssd/sysfs.c                    |  4 +-
> 
> Why are these in Documentation/ at all?
> Shouldn't the directory be moved to tools/ ?
> 

Currently, this is just sample working code for configuring MIC devices.
The longer term plan is to move this code to tools/ but not with this
patch series.

> It doesn't build normally btw.
> 
> Also, there are some defects/shortcomings here:
> 

We will post a follow-up patch incorporating the changes below next
week. It would be great if this cleanup series is applied to
char-misc-next as is since patch 5 should fix a linux-next build failure
with powerpc.

Thanks for the review!
Sudeep Dutt

>  Documentation/mic/mpssd/mpssd.c | 7 ++++---
>  Documentation/mic/mpssd/mpssd.h | 1 +
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
> index 8064804..cb62663 100644
> --- a/Documentation/mic/mpssd/mpssd.c
> +++ b/Documentation/mic/mpssd/mpssd.c
> @@ -379,7 +379,7 @@ disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy,
>  	for (i = 0; i < copy->iovcnt; i++)
>  		mpsslog("%s %s %d copy->iov[%d] addr %p len 0x%lx\n",
>  			mic->name, s, line, i,
> -			copy->iov[i].iov_base, copy->iov[i].iov_len);
> +			copy->iov[i].iov_base, (long)copy->iov[i].iov_len);
>  }
>  
>  static inline __u16 read_avail_idx(struct mic_vring *vr)
> @@ -953,7 +953,8 @@ set_backend_file(struct mic_info *mic)
>  		return false;
>  	mic->mic_virtblk.backend_file = malloc(strlen(evv) + 1);
>  	if (mic->mic_virtblk.backend_file == NULL) {
> -		mpsslog("can't allocate memory\n", mic->name, mic->id);
> +		mpsslog("%s id %d - can't allocate memory\n",
> +			mic->name, mic->id);
>  		return false;
>  	}
>  	strcpy(mic->mic_virtblk.backend_file, evv + 1);
> @@ -1026,7 +1027,7 @@ close_backend(struct mic_info *mic)
>  static bool
>  start_virtblk(struct mic_info *mic, struct mic_vring *vring)
>  {
> -	if (((__u64)&virtblk_dev_page.blk_config % 8) != 0) {
> +	if (((__u64)(unsigned long)&virtblk_dev_page.blk_config % 8) != 0) {
>  		mpsslog("%s: blk_config is not 8 byte aligned.\n",
>  			mic->name);
>  		return false;
> diff --git a/Documentation/mic/mpssd/mpssd.h b/Documentation/mic/mpssd/mpssd.h
> index b6dee38..ccd589f 100644
> --- a/Documentation/mic/mpssd/mpssd.h
> +++ b/Documentation/mic/mpssd/mpssd.h
> @@ -94,6 +94,7 @@ struct mic_info {
>  	struct mic_info *next;
>  };
>  
> +__attribute__((format(printf, 1, 2)))
>  void mpsslog(char *format, ...);
>  char *readsysfs(char *dir, char *entry);
>  int setsysfs(char *dir, char *entry, char *value);
> 
> 



^ permalink raw reply	[relevance 99%]

* [PATCH char-misc-next 5/5] misc: mic: depend on X86 for both host and card drivers.
  2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
                   ` (3 preceding siblings ...)
  2013-09-27 16:50 99% ` [PATCH char-misc-next 4/5] misc: mic: fix a warning in the IOCTL header file Sudeep Dutt
@ 2013-09-27 16:50 95% ` Sudeep Dutt
    5 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

This fixes build failures seen on certain non X86 architectures.
The card driver should correctly always depend on X86. The host
driver can potentially work on non X86 architectures although
it has never been built or validated in such configurations.
The host driver dependency on X86 can be removed at some point
in the future but this workaround is required for now.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index d453768..e42b331 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -2,7 +2,7 @@ comment "Intel MIC Host Driver"
 
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
-	depends on 64BIT && PCI
+	depends on 64BIT && PCI && X86
 	select VHOST_RING
 	default N
 	help
@@ -23,7 +23,7 @@ comment "Intel MIC Card Driver"
 
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
-	depends on 64BIT
+	depends on 64BIT && X86
 	select VIRTIO
 	default N
 	help
-- 
1.8.2.1


^ permalink raw reply related	[relevance 95%]

* [PATCH char-misc-next 4/5] misc: mic: fix a warning in the IOCTL header file.
  2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
                   ` (2 preceding siblings ...)
  2013-09-27 16:49 35% ` [PATCH char-misc-next 3/5] misc: mic: cleanups for "--strict" checkpatch Sudeep Dutt
@ 2013-09-27 16:50 99% ` Sudeep Dutt
  2013-09-27 16:50 95% ` [PATCH char-misc-next 5/5] misc: mic: depend on X86 for both host and card drivers Sudeep Dutt
    5 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

The following warning from mic_ioctl.h is fixed via this patch:
found __[us]{8,16,32,64} type without #include <linux/types.h>
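
As a minimal sketch of what the check expects, a header that uses the
__u8/__u32 style types pulls in <linux/types.h> first. struct demo_desc
below is a hypothetical descriptor used only for illustration, not the real
struct mic_copy_desc, and the example assumes the Linux UAPI headers are
installed.

```c
#include <assert.h>
#include <linux/types.h>	/* provides __u8, __u16, __u32, __u64 */

/*
 * Hypothetical descriptor for illustration only. Every __uN field below
 * relies on <linux/types.h> having been included by this file rather
 * than on some other header happening to include it first.
 */
struct demo_desc {
	__u32 count;
	__u8 idx;
	__u8 flag;
	__u32 out_len;
};

/* The fixed-width types have the same size for kernel and userspace. */
_Static_assert(sizeof(__u8) == 1, "__u8 must be one byte");
_Static_assert(sizeof(__u32) == 4, "__u32 must be four bytes");

/* Usage example: field widths do not depend on the width of "int". */
static void example_checks(void)
{
	struct demo_desc d = { .count = 2, .idx = 1, .flag = 0, .out_len = 0 };

	assert(sizeof(d.count) == 4);
	assert(sizeof(d.idx) == 1);
	assert(d.count == 2);
}
```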

Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 include/uapi/linux/mic_ioctl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/mic_ioctl.h b/include/uapi/linux/mic_ioctl.h
index 0e6cbf3..7fabba5 100644
--- a/include/uapi/linux/mic_ioctl.h
+++ b/include/uapi/linux/mic_ioctl.h
@@ -21,6 +21,8 @@
 #ifndef _MIC_IOCTL_H_
 #define _MIC_IOCTL_H_
 
+#include <linux/types.h>
+
 /*
  * mic_copy - MIC virtio descriptor copy.
  *
-- 
1.8.2.1


^ permalink raw reply related	[relevance 99%]

* [PATCH char-misc-next 3/5] misc: mic: cleanups for "--strict" checkpatch.
  2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
  2013-09-27 16:49 74% ` [PATCH char-misc-next 1/5] misc: mic: host driver sysfs cleanups Sudeep Dutt
  2013-09-27 16:49 75% ` [PATCH char-misc-next 2/5] misc: mic: header file cleanups Sudeep Dutt
@ 2013-09-27 16:49 35% ` Sudeep Dutt
  2013-09-27 16:50 99% ` [PATCH char-misc-next 4/5] misc: mic: fix a warning in the IOCTL header file Sudeep Dutt
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

These changes were mostly authored by Joe Perches <joe@perches.com>
@ https://lkml.org/lkml/2013/9/5/602

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpssd.c     | 65 ++++++++++++++-------------
 Documentation/mic/mpssd/sysfs.c     |  4 +-
 drivers/misc/mic/card/mic_device.c  |  1 -
 drivers/misc/mic/card/mic_virtio.c  | 17 ++++---
 drivers/misc/mic/card/mic_x100.c    |  4 +-
 drivers/misc/mic/host/mic_debugfs.c | 90 ++++++++++++++++++-------------------
 drivers/misc/mic/host/mic_fops.c    |  6 +--
 drivers/misc/mic/host/mic_intr.c    | 34 +++++++-------
 drivers/misc/mic/host/mic_smpt.c    | 14 +++---
 drivers/misc/mic/host/mic_sysfs.c   | 20 ++++-----
 drivers/misc/mic/host/mic_virtio.c  | 37 +++++++--------
 drivers/misc/mic/host/mic_x100.c    | 33 +++++++-------
 12 files changed, 158 insertions(+), 167 deletions(-)
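
Editorial note on the kmalloc() to kmalloc_array() conversions in
mic_intr.c and mic_smpt.c: besides satisfying checkpatch, kmalloc_array()
returns NULL when count * size would overflow, and sizeof(*ptr) keeps the
element size tied to the pointer's type. The following is only a userspace
sketch of that idea, not the kernel implementation. alloc_array(),
alloc_entries() and struct smpt_entry are hypothetical names made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-entry driver structure. */
struct smpt_entry {
	uint64_t dma_addr;
	int64_t ref_count;
};

/*
 * Userspace analogue of an overflow-checked array allocation: refuse
 * the request when n * size would wrap past SIZE_MAX instead of
 * silently returning a buffer that is too small.
 */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

/*
 * Allocate n entries. sizeof(*entry) follows the pointer's type, so
 * the call stays correct if the element type is changed later.
 */
static struct smpt_entry *alloc_entries(size_t n)
{
	struct smpt_entry *entry;

	entry = alloc_array(n, sizeof(*entry));
	return entry;
}
```

With the open-coded form (sizeof(struct foo) * n), a large enough n wraps
the product and the allocation succeeds with too few bytes. The checked
form fails the allocation instead, and the callers in this patch already
handle that case through their existing -ENOMEM paths.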

diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
index 8064804..f9327a2 100644
--- a/Documentation/mic/mpssd/mpssd.c
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -246,7 +246,7 @@ static int tun_alloc(struct mic_info *mic, char *dev)
 	if (*dev)
 		strncpy(ifr.ifr_name, dev, IFNAMSIZ);
 
-	err = ioctl(fd, TUNSETIFF, (void *) &ifr);
+	err = ioctl(fd, TUNSETIFF, (void *)&ifr);
 	if (err < 0) {
 		mpsslog("%s %s %d TUNSETIFF failed %s\n",
 			mic->name, __func__, __LINE__, strerror(errno));
@@ -363,7 +363,7 @@ static inline void verify_out_len(struct mic_info *mic,
 {
 	if (copy->out_len != sum_iovec_len(copy)) {
 		mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%x\n",
-				mic->name, __func__, __LINE__,
+			mic->name, __func__, __LINE__,
 				copy->out_len, sum_iovec_len(copy));
 		assert(copy->out_len == sum_iovec_len(copy));
 	}
@@ -372,7 +372,7 @@ static inline void verify_out_len(struct mic_info *mic,
 /* Display an iovec */
 static void
 disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy,
-	const char *s, int line)
+	   const char *s, int line)
 {
 	int i;
 
@@ -401,7 +401,7 @@ static inline void txrx_prepare(int type, bool tx, struct mic_vring *vr,
 /* Central API which triggers the copies */
 static int
 mic_virtio_copy(struct mic_info *mic, int fd,
-	struct mic_vring *vr, struct mic_copy_desc *copy)
+		struct mic_vring *vr, struct mic_copy_desc *copy)
 {
 	int ret;
 
@@ -440,7 +440,7 @@ init_vr(struct mic_info *mic, int fd, int type,
 	vr0->info = vr0->va +
 		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN);
 	vring_init(&vr0->vr,
-		MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN);
+		   MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN);
 	mpsslog("%s %s vr0 %p vr0->info %p vr_size 0x%x vring 0x%x ",
 		__func__, mic->name, vr0->va, vr0->info, vr_size,
 		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
@@ -453,7 +453,7 @@ init_vr(struct mic_info *mic, int fd, int type,
 		vr1->info = vr1->va + vring_size(MIC_VRING_ENTRIES,
 			MIC_VIRTIO_RING_ALIGN);
 		vring_init(&vr1->vr,
-			MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN);
+			   MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN);
 		mpsslog("%s %s vr1 %p vr1->info %p vr_size 0x%x vring 0x%x ",
 			__func__, mic->name, vr1->va, vr1->info, vr_size,
 			vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
@@ -551,7 +551,7 @@ virtio_net(void *arg)
 	net_poll[NET_FD_TUN].events = POLLIN;
 
 	if (MAP_FAILED == init_vr(mic, mic->mic_net.virtio_net_fd,
-		VIRTIO_ID_NET, &tx_vr, &rx_vr,
+				  VIRTIO_ID_NET, &tx_vr, &rx_vr,
 		virtnet_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
@@ -576,7 +576,7 @@ virtio_net(void *arg)
 		}
 		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
 			wait_for_card_driver(mic, mic->mic_net.virtio_net_fd,
-					VIRTIO_ID_NET);
+					     VIRTIO_ID_NET);
 		/*
 		 * Check if there is data to be read from TUN and write to
 		 * virtio net fd if there is.
@@ -587,7 +587,7 @@ virtio_net(void *arg)
 				copy.iov, copy.iovcnt);
 			if (len > 0) {
 				struct virtio_net_hdr *hdr
-					= (struct virtio_net_hdr *) vnet_hdr[0];
+					= (struct virtio_net_hdr *)vnet_hdr[0];
 
 				/* Disable checksums on the card since we are on
 				   a reliable PCIe link */
@@ -606,7 +606,7 @@ virtio_net(void *arg)
 #endif
 				spin_for_descriptors(mic, &tx_vr);
 				txrx_prepare(VIRTIO_ID_NET, 1, &tx_vr, &copy,
-					len);
+					     len);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_net.virtio_net_fd, &tx_vr,
@@ -644,7 +644,7 @@ virtio_net(void *arg)
 				le16toh(rx_vr.vr.avail->idx)) {
 				copy.iov = iov1;
 				txrx_prepare(VIRTIO_ID_NET, 0, &rx_vr, &copy,
-					MAX_NET_PKT_SIZE
+					     MAX_NET_PKT_SIZE
 					+ sizeof(struct virtio_net_hdr));
 
 				err = mic_virtio_copy(mic,
@@ -669,7 +669,7 @@ virtio_net(void *arg)
 					verify_out_len(mic, &copy);
 #ifdef DEBUG
 					disp_iovec(mic, copy, __func__,
-						__LINE__);
+						   __LINE__);
 					mpsslog("%s %s %d ",
 						mic->name, __func__, __LINE__);
 					mpsslog("read from net 0x%lx\n",
@@ -686,7 +686,7 @@ virtio_net(void *arg)
 					} else {
 #ifdef DEBUG
 						disp_iovec(mic, &copy, __func__,
-							__LINE__);
+							   __LINE__);
 						mpsslog("%s %s %d ",
 							mic->name, __func__,
 							__LINE__);
@@ -750,13 +750,13 @@ virtio_console(void *arg)
 	err = grantpt(pty_fd);
 	if (err < 0) {
 		mpsslog("can't grant access: %s %s\n",
-				pts_name, strerror(errno));
+			pts_name, strerror(errno));
 		goto _close_pty;
 	}
 	err = unlockpt(pty_fd);
 	if (err < 0) {
 		mpsslog("can't unlock a pseudoterminal: %s %s\n",
-				pts_name, strerror(errno));
+			pts_name, strerror(errno));
 		goto _close_pty;
 	}
 	console_poll[MONITOR_FD].fd = pty_fd;
@@ -766,7 +766,7 @@ virtio_console(void *arg)
 	console_poll[VIRTIO_CONSOLE_FD].events = POLLIN;
 
 	if (MAP_FAILED == init_vr(mic, mic->mic_console.virtio_console_fd,
-		VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr,
+				  VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr,
 		virtcons_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
@@ -787,7 +787,7 @@ virtio_console(void *arg)
 		}
 		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
 			wait_for_card_driver(mic,
-				mic->mic_console.virtio_console_fd,
+					     mic->mic_console.virtio_console_fd,
 				VIRTIO_ID_CONSOLE);
 
 		if (console_poll[MONITOR_FD].revents & POLLIN) {
@@ -802,7 +802,7 @@ virtio_console(void *arg)
 #endif
 				spin_for_descriptors(mic, &tx_vr);
 				txrx_prepare(VIRTIO_ID_CONSOLE, 1, &tx_vr,
-					&copy, len);
+					     &copy, len);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_console.virtio_console_fd,
@@ -837,7 +837,7 @@ virtio_console(void *arg)
 				le16toh(rx_vr.vr.avail->idx)) {
 				copy.iov = iov1;
 				txrx_prepare(VIRTIO_ID_CONSOLE, 0, &rx_vr,
-					&copy, PAGE_SIZE);
+					     &copy, PAGE_SIZE);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_console.virtio_console_fd,
@@ -848,7 +848,7 @@ virtio_console(void *arg)
 					verify_out_len(mic, &copy);
 #ifdef DEBUG
 					disp_iovec(mic, copy, __func__,
-						__LINE__);
+						   __LINE__);
 					mpsslog("%s %s %d ",
 						mic->name, __func__, __LINE__);
 					mpsslog("read from net 0x%lx\n",
@@ -865,7 +865,7 @@ virtio_console(void *arg)
 					} else {
 #ifdef DEBUG
 						disp_iovec(mic, copy, __func__,
-							__LINE__);
+							   __LINE__);
 						mpsslog("%s %s %d ",
 							mic->name, __func__,
 							__LINE__);
@@ -1033,7 +1033,8 @@ start_virtblk(struct mic_info *mic, struct mic_vring *vring)
 	}
 	add_virtio_device(mic, &virtblk_dev_page.dd);
 	if (MAP_FAILED == init_vr(mic, mic->mic_virtblk.virtio_block_fd,
-		VIRTIO_ID_BLOCK, vring, NULL, virtblk_dev_page.dd.num_vq)) {
+				  VIRTIO_ID_BLOCK, vring, NULL,
+				  virtblk_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
 		return false;
@@ -1060,7 +1061,7 @@ header_error_check(struct vring_desc *desc)
 {
 	if (le32toh(desc->len) != sizeof(struct virtio_blk_outhdr)) {
 		mpsslog("%s() %d: length is not sizeof(virtio_blk_outhd)\n",
-				__func__, __LINE__);
+			__func__, __LINE__);
 		return -EIO;
 	}
 	if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT)) {
@@ -1132,7 +1133,7 @@ write_status(int fd, __u8 *status)
 static void *
 virtio_block(void *arg)
 {
-	struct mic_info *mic = (struct mic_info *) arg;
+	struct mic_info *mic = (struct mic_info *)arg;
 	int ret;
 	struct pollfd block_poll;
 	struct mic_vring vring;
@@ -1219,8 +1220,8 @@ virtio_block(void *arg)
 				status = 0;
 				fos = mic->mic_virtblk.backend_addr +
 					(hdr.sector * SECTOR_SIZE);
-				buffer_desc_idx = desc_idx =
-					next_desc(desc);
+				buffer_desc_idx = next_desc(desc);
+				desc_idx = buffer_desc_idx;
 				for (desc = &vring.vr.desc[buffer_desc_idx];
 				     desc->flags & VRING_DESC_F_NEXT;
 				     desc_idx = next_desc(desc),
@@ -1255,7 +1256,7 @@ virtio_block(void *arg)
 						iovec,
 						piov - iovec);
 					if (ret < 0 &&
-						status != 0)
+					    status != 0)
 						status = ret;
 				}
 				/* write status and update used pointer */
@@ -1391,7 +1392,7 @@ mic_config(void *arg)
 	char value[4096];
 
 	snprintf(pathname, PATH_MAX - 1, "%s/%s/%s",
-		MICSYSFSDIR, mic->name, "state");
+		 MICSYSFSDIR, mic->name, "state");
 
 	fd = open(pathname, O_RDONLY);
 	if (fd < 0) {
@@ -1560,22 +1561,22 @@ init_mic(struct mic_info *mic)
 			virtio_console, mic);
 		if (err)
 			mpsslog("%s virtcons pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		err = pthread_create(&mic->mic_net.net_thread, NULL,
 			virtio_net, mic);
 		if (err)
 			mpsslog("%s virtnet pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		err = pthread_create(&mic->mic_virtblk.block_thread, NULL,
 			virtio_block, mic);
 		if (err)
 			mpsslog("%s virtblk pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		sigemptyset(&act.sa_mask);
 		err = sigaction(SIGUSR1, &act, NULL);
 		if (err)
 			mpsslog("%s sigaction SIGUSR1 failed %s\n",
-			mic->name, strerror(errno));
+				mic->name, strerror(errno));
 		while (1)
 			sleep(60);
 	case -1:
diff --git a/Documentation/mic/mpssd/sysfs.c b/Documentation/mic/mpssd/sysfs.c
index 11de72b..8dd3269 100644
--- a/Documentation/mic/mpssd/sysfs.c
+++ b/Documentation/mic/mpssd/sysfs.c
@@ -35,7 +35,7 @@ readsysfs(char *dir, char *entry)
 		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
 	else
 		snprintf(filename, PATH_MAX,
-			"%s/%s/%s", MICSYSFSDIR, dir, entry);
+			 "%s/%s/%s", MICSYSFSDIR, dir, entry);
 
 	fd = open(filename, O_RDONLY);
 	if (fd < 0) {
@@ -75,7 +75,7 @@ setsysfs(char *dir, char *entry, char *value)
 		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
 	else
 		snprintf(filename, PATH_MAX, "%s/%s/%s",
-			MICSYSFSDIR, dir, entry);
+			 MICSYSFSDIR, dir, entry);
 
 	oldvalue = readsysfs(dir, entry);
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 175d114..d0980ff 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -167,7 +167,6 @@ struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
 	return (struct mic_irq *)cookie;
 err:
 	return ERR_PTR(rc);
-
 }
 
 /**
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 7420c33..914cc9b 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -103,7 +103,7 @@ static void mic_finalize_features(struct virtio_device *vdev)
 	for (i = 0; i < bits; i++) {
 		if (test_bit(i, vdev->features))
 			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
-				&out_features[i / 8]);
+				 &out_features[i / 8]);
 	}
 }
 
@@ -197,10 +197,9 @@ static void mic_notify(struct virtqueue *vq)
 static void mic_del_vq(struct virtqueue *vq, int n)
 {
 	struct mic_vdev *mvdev = to_micvdev(vq->vdev);
-	struct vring *vr = (struct vring *) (vq + 1);
+	struct vring *vr = (struct vring *)(vq + 1);
 
-	free_pages((unsigned long) vr->used,
-		get_order(mvdev->used_size[n]));
+	free_pages((unsigned long) vr->used, get_order(mvdev->used_size[n]));
 	vring_del_virtqueue(vq);
 	mic_card_unmap(mvdev->mdev, mvdev->vr[n]);
 	mvdev->vr[n] = NULL;
@@ -274,8 +273,8 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
 	/* Allocate and reassign used ring now */
 	mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
 			sizeof(struct vring_used_elem) * config.num);
-	used = (void *) __get_free_pages(GFP_KERNEL | __GFP_ZERO,
-				get_order(mvdev->used_size[index]));
+	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(mvdev->used_size[index]));
 	if (!used) {
 		err = -ENOMEM;
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
@@ -291,7 +290,7 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
 	 * vring_new_virtqueue() would ensure that
 	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
 	 */
-	vr = (struct vring *) (vq + 1);
+	vr = (struct vring *)(vq + 1);
 	vr->used = used;
 
 	vq->priv = mvdev;
@@ -544,7 +543,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
 		if (dev) {
 			if (remove)
 				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
-					&dc->config_change);
+					 &dc->config_change);
 			put_device(dev);
 			mic_handle_config_change(d, i, mdrv);
 			ret = mic_remove_device(d, i, mdrv);
@@ -559,7 +558,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
 
 		/* new device */
 		dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n",
-				__func__, __LINE__, d);
+			__func__, __LINE__, d);
 		if (!remove)
 			mic_add_device(d, i, mdrv);
 	}
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index f491c91..2868945 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -66,8 +66,8 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 	/* Ensure that the interrupt is ordered w.r.t previous stores. */
 	wmb();
 	mic_mmio_write(mw, MIC_X100_SBOX_SDBIC0_DBREQ_BIT,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		(MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
 /**
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 98a61b4..028ba5d6 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -103,7 +103,7 @@ static int mic_smpt_show(struct seq_file *s, void *pos)
 	unsigned long flags;
 
 	seq_printf(s, "MIC %-2d |%-10s| %-14s %-10s\n",
-		mdev->id, "SMPT entry", "SW DMA addr", "RefCount");
+		   mdev->id, "SMPT entry", "SW DMA addr", "RefCount");
 	seq_puts(s, "====================================================\n");
 
 	if (mdev->smpt) {
@@ -111,8 +111,8 @@ static int mic_smpt_show(struct seq_file *s, void *pos)
 		spin_lock_irqsave(&smpt_info->smpt_lock, flags);
 		for (i = 0; i < smpt_info->info.num_reg; i++) {
 			seq_printf(s, "%9s|%-10d| %-#14llx %-10lld\n",
-				" ",  i, smpt_info->entry[i].dma_addr,
-				smpt_info->entry[i].ref_count);
+				   " ",  i, smpt_info->entry[i].dma_addr,
+				   smpt_info->entry[i].ref_count);
 		}
 		spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
 	}
@@ -203,17 +203,17 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 	int i, j;
 
 	seq_printf(s, "Bootparam: magic 0x%x\n",
-		bootparam->magic);
+		   bootparam->magic);
 	seq_printf(s, "Bootparam: h2c_shutdown_db %d\n",
-		bootparam->h2c_shutdown_db);
+		   bootparam->h2c_shutdown_db);
 	seq_printf(s, "Bootparam: h2c_config_db %d\n",
-		bootparam->h2c_config_db);
+		   bootparam->h2c_config_db);
 	seq_printf(s, "Bootparam: c2h_shutdown_db %d\n",
-		bootparam->c2h_shutdown_db);
+		   bootparam->c2h_shutdown_db);
 	seq_printf(s, "Bootparam: shutdown_status %d\n",
-		bootparam->shutdown_status);
+		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
-		bootparam->shutdown_card);
+		   bootparam->shutdown_card);
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
@@ -239,10 +239,10 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 			seq_printf(s, "address 0x%llx ", vqconfig->address);
 			seq_printf(s, "num %d ", vqconfig->num);
 			seq_printf(s, "used address 0x%llx\n",
-				vqconfig->used_address);
+				   vqconfig->used_address);
 		}
 
-		features = (__u32 *) mic_vq_features(d);
+		features = (__u32 *)mic_vq_features(d);
 		seq_printf(s, "Features: Host 0x%x ", features[0]);
 		seq_printf(s, "Guest 0x%x\n", features[1]);
 
@@ -256,7 +256,7 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
 		seq_printf(s, "Host ack %d\n", dc->host_ack);
 		seq_printf(s, "Used address updated %d ",
-			dc->used_address_updated);
+			   dc->used_address_updated);
 		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
 		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
 		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
@@ -294,10 +294,10 @@ static int mic_vdev_info_show(struct seq_file *s, void *unused)
 	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
 		mvdev = list_entry(pos, struct mic_vdev, list);
 		seq_printf(s, "VDEV type %d state %s in %ld out %ld\n",
-			mvdev->virtio_id,
-			mic_vdevup(mvdev) ? "UP" : "DOWN",
-			mvdev->in_bytes,
-			mvdev->out_bytes);
+			   mvdev->virtio_id,
+			   mic_vdevup(mvdev) ? "UP" : "DOWN",
+			   mvdev->in_bytes,
+			   mvdev->out_bytes);
 		for (i = 0; i < MIC_MAX_VRINGS; i++) {
 			struct vring_desc *desc;
 			struct vring_avail *avail;
@@ -309,38 +309,38 @@ static int mic_vdev_info_show(struct seq_file *s, void *unused)
 				continue;
 			desc = vrh->vring.desc;
 			seq_printf(s, "vring i %d avail_idx %d",
-				i, mvr->vring.info->avail_idx & (num - 1));
+				   i, mvr->vring.info->avail_idx & (num - 1));
 			seq_printf(s, " vring i %d avail_idx %d\n",
-				i, mvr->vring.info->avail_idx);
+				   i, mvr->vring.info->avail_idx);
 			seq_printf(s, "vrh i %d weak_barriers %d",
-				i, vrh->weak_barriers);
+				   i, vrh->weak_barriers);
 			seq_printf(s, " last_avail_idx %d last_used_idx %d",
-				vrh->last_avail_idx, vrh->last_used_idx);
+				   vrh->last_avail_idx, vrh->last_used_idx);
 			seq_printf(s, " completed %d\n", vrh->completed);
 			for (j = 0; j < num; j++) {
 				seq_printf(s, "desc[%d] addr 0x%llx len %d",
-					j, desc->addr, desc->len);
+					   j, desc->addr, desc->len);
 				seq_printf(s, " flags 0x%x next %d\n",
-					desc->flags,
-					desc->next);
+					   desc->flags, desc->next);
 				desc++;
 			}
 			avail = vrh->vring.avail;
 			seq_printf(s, "avail flags 0x%x idx %d\n",
-				avail->flags, avail->idx & (num - 1));
+				   avail->flags, avail->idx & (num - 1));
 			seq_printf(s, "avail flags 0x%x idx %d\n",
-				avail->flags, avail->idx);
+				   avail->flags, avail->idx);
 			for (j = 0; j < num; j++)
 				seq_printf(s, "avail ring[%d] %d\n",
-					j, avail->ring[j]);
+					   j, avail->ring[j]);
 			used = vrh->vring.used;
 			seq_printf(s, "used flags 0x%x idx %d\n",
-				used->flags, used->idx & (num - 1));
+				   used->flags, used->idx & (num - 1));
 			seq_printf(s, "used flags 0x%x idx %d\n",
-				used->flags, used->idx);
+				   used->flags, used->idx);
 			for (j = 0; j < num; j++)
 				seq_printf(s, "used ring[%d] id %d len %d\n",
-					j, used->ring[j].id, used->ring[j].len);
+					   j, used->ring[j].id,
+					   used->ring[j].len);
 		}
 	}
 	mutex_unlock(&mdev->mic_mutex);
@@ -389,7 +389,7 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 			reg = mdev->intr_ops->read_msi_to_src_map(mdev, entry);
 
 			seq_printf(s, "%s %-10d %s %-10d MXAR[%d]: %08X\n",
-				"IRQ:", vector, "Entry:", entry, i, reg);
+				   "IRQ:", vector, "Entry:", entry, i, reg);
 
 			seq_printf(s, "%-10s", "offset:");
 			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
@@ -400,8 +400,8 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 			seq_printf(s, "%-10s", "count:");
 			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
 				seq_printf(s, "%4d ",
-				(mdev->irq_info.mic_msi_map[i] & BIT(j)) ?
-					1 : 0);
+					   (mdev->irq_info.mic_msi_map[i] &
+					   BIT(j)) ? 1 : 0);
 			seq_puts(s, "\n\n");
 		}
 	} else {
@@ -409,7 +409,6 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 	}
 
 	return 0;
-
 }
 
 static int mic_msi_irq_info_debug_open(struct inode *inode, struct file *file)
@@ -443,26 +442,23 @@ void mic_create_debug_dir(struct mic_device *mdev)
 	if (!mdev->dbg_dir)
 		return;
 
-	debugfs_create_file("log_buf", 0444, mdev->dbg_dir,
-		mdev, &log_buf_ops);
+	debugfs_create_file("log_buf", 0444, mdev->dbg_dir, mdev, &log_buf_ops);
 
-	debugfs_create_file("smpt", 0444, mdev->dbg_dir,
-		mdev, &smpt_file_ops);
+	debugfs_create_file("smpt", 0444, mdev->dbg_dir, mdev, &smpt_file_ops);
 
-	debugfs_create_file("soft_reset", 0444, mdev->dbg_dir,
-		mdev, &soft_reset_ops);
+	debugfs_create_file("soft_reset", 0444, mdev->dbg_dir, mdev,
+			    &soft_reset_ops);
 
-	debugfs_create_file("post_code", 0444, mdev->dbg_dir,
-		mdev, &post_code_ops);
+	debugfs_create_file("post_code", 0444, mdev->dbg_dir, mdev,
+			    &post_code_ops);
 
-	debugfs_create_file("dp", 0444, mdev->dbg_dir,
-		mdev, &dp_ops);
+	debugfs_create_file("dp", 0444, mdev->dbg_dir, mdev, &dp_ops);
 
-	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir,
-		mdev, &vdev_info_ops);
+	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir, mdev,
+			    &vdev_info_ops);
 
-	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir,
-		mdev, &msi_irq_info_ops);
+	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir, mdev,
+			    &msi_irq_info_ops);
 }
 
 /**
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
index e699c80..8dc6ff1 100644
--- a/drivers/misc/mic/host/mic_fops.c
+++ b/drivers/misc/mic/host/mic_fops.c
@@ -140,9 +140,9 @@ unsigned int mic_poll(struct file *f, poll_table *wait)
 
 	poll_wait(f, &mvdev->waitq, wait);
 
-	if (mic_vdev_inited(mvdev))
+	if (mic_vdev_inited(mvdev)) {
 		mask = POLLERR;
-	else if (mvdev->poll_wake) {
+	} else if (mvdev->poll_wake) {
 		mvdev->poll_wake = 0;
 		mask = POLLIN | POLLOUT;
 	}
@@ -152,7 +152,7 @@ unsigned int mic_poll(struct file *f, poll_table *wait)
 
 static inline int
 mic_query_offset(struct mic_vdev *mvdev, unsigned long offset,
-	unsigned long *size, unsigned long *pa)
+		 unsigned long *size, unsigned long *pa)
 {
 	struct mic_device *mdev = mvdev->mdev;
 	unsigned long start = MIC_DP_SIZE;
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index 71a7521..f9c29bc 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -71,8 +71,8 @@ static irqreturn_t mic_interrupt(int irq, void *dev)
 
 /* Return the interrupt offset from the index. Index is 0 based. */
 static u16 mic_map_src_to_offset(struct mic_device *mdev,
-		int intr_src, enum mic_intr_type type) {
-
+		int intr_src, enum mic_intr_type type)
+{
 	if (type >= MIC_NUM_INTR_TYPES)
 		return MIC_NUM_OFFSETS;
 	if (intr_src >= mdev->intr_info->intr_len[type])
@@ -112,7 +112,7 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 	struct mic_intr_cb *intr_cb;
 	unsigned long flags;
 	int rc;
-	intr_cb = kmalloc(sizeof(struct mic_intr_cb), GFP_KERNEL);
+	intr_cb = kmalloc(sizeof(*intr_cb), GFP_KERNEL);
 
 	if (!intr_cb)
 		return ERR_PTR(-ENOMEM);
@@ -159,7 +159,7 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 			if (intr_cb->cb_id == idx) {
 				list_del(pos);
 				ida_simple_remove(&mdev->irq_info.cb_ida,
-					intr_cb->cb_id);
+						  intr_cb->cb_id);
 				kfree(intr_cb);
 				spin_unlock_irqrestore(
 					&mdev->irq_info.mic_intr_lock, flags);
@@ -182,9 +182,10 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 static int mic_setup_msix(struct mic_device *mdev, struct pci_dev *pdev)
 {
 	int rc, i;
+	int entry_size = sizeof(*mdev->irq_info.msix_entries);
 
-	mdev->irq_info.msix_entries = kmalloc(sizeof(struct msix_entry) *
-			MIC_MIN_MSIX, GFP_KERNEL);
+	mdev->irq_info.msix_entries = kmalloc_array(MIC_MIN_MSIX,
+						    entry_size, GFP_KERNEL);
 	if (!mdev->irq_info.msix_entries) {
 		rc = -ENOMEM;
 		goto err_nomem1;
@@ -231,8 +232,9 @@ static int mic_setup_callbacks(struct mic_device *mdev)
 {
 	int i;
 
-	mdev->irq_info.cb_list = kmalloc(sizeof(struct list_head) *
-		MIC_NUM_OFFSETS, GFP_KERNEL);
+	mdev->irq_info.cb_list = kmalloc_array(MIC_NUM_OFFSETS,
+					       sizeof(*mdev->irq_info.cb_list),
+					       GFP_KERNEL);
 	if (!mdev->irq_info.cb_list)
 		return -ENOMEM;
 
@@ -261,7 +263,7 @@ static void mic_release_callbacks(struct mic_device *mdev)
 
 		if (list_empty(&mdev->irq_info.cb_list[i])) {
 			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
-				flags);
+					       flags);
 			break;
 		}
 
@@ -269,7 +271,7 @@ static void mic_release_callbacks(struct mic_device *mdev)
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
 			list_del(pos);
 			ida_simple_remove(&mdev->irq_info.cb_ida,
-				intr_cb->cb_id);
+					  intr_cb->cb_id);
 			kfree(intr_cb);
 		}
 		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
@@ -427,8 +429,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 	offset = mic_map_src_to_offset(mdev, intr_src, type);
 	if (offset >= MIC_NUM_OFFSETS) {
 		dev_err(mdev->sdev->parent,
-				"Error mapping index %d to a valid source id.\n",
-				intr_src);
+			"Error mapping index %d to a valid source id.\n",
+			intr_src);
 		rc = -EINVAL;
 		goto err;
 	}
@@ -437,7 +439,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		msix = mic_get_available_vector(mdev);
 		if (!msix) {
 			dev_err(mdev->sdev->parent,
-			"No MSIx vectors available for use.\n");
+				"No MSIx vectors available for use.\n");
 			rc = -ENOSPC;
 			goto err;
 		}
@@ -460,7 +462,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 				offset, func, data);
 		if (IS_ERR(intr_cb)) {
 			dev_err(mdev->sdev->parent,
-			"No available callback entries for use\n");
+				"No available callback entries for use\n");
 			rc = PTR_ERR(intr_cb);
 			goto err;
 		}
@@ -506,7 +508,7 @@ void mic_free_irq(struct mic_device *mdev,
 	if (mdev->irq_info.num_vectors > 1) {
 		if (entry >= mdev->irq_info.num_vectors) {
 			dev_warn(mdev->sdev->parent,
-				"entry %d should be < num_irq %d\n",
+				 "entry %d should be < num_irq %d\n",
 				entry, mdev->irq_info.num_vectors);
 			return;
 		}
@@ -581,7 +583,7 @@ void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
 		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
 			if (mdev->irq_info.mic_msi_map[i])
 				dev_warn(&pdev->dev, "irq %d may still be in use.\n",
-					mdev->irq_info.msix_entries[i].vector);
+					 mdev->irq_info.msix_entries[i].vector);
 		}
 		kfree(mdev->irq_info.mic_msi_map);
 		kfree(mdev->irq_info.msix_entries);
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index 003d02b..fae474c 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -84,7 +84,7 @@ static void mic_add_smpt_entry(int spt, s64 *ref, u64 addr,
 	for (i = spt; i < spt + entries; i++,
 		addr += smpt_info->info.page_size) {
 		if (!smpt_info->entry[i].ref_count &&
-			(smpt_info->entry[i].dma_addr != addr)) {
+		    (smpt_info->entry[i].dma_addr != addr)) {
 			mdev->smpt_ops->set(mdev, addr, i);
 			smpt_info->entry[i].dma_addr = addr;
 		}
@@ -183,7 +183,7 @@ mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 
 	if (!mic_is_system_addr(mdev, mic_addr)) {
 		dev_err(mdev->sdev->parent,
-		"mic_addr is invalid. mic_addr = 0x%llx\n", mic_addr);
+			"mic_addr is invalid. mic_addr = 0x%llx\n", mic_addr);
 		return -EINVAL;
 	}
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
@@ -286,7 +286,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 		smpt_info->entry[i].ref_count -= ref[i - spt];
 		if (smpt_info->entry[i].ref_count < 0)
 			dev_warn(mdev->sdev->parent,
-				"ref count for entry %d is negative\n", i);
+				 "ref count for entry %d is negative\n", i);
 	}
 	spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
 	kfree(ref);
@@ -320,7 +320,7 @@ dma_addr_t mic_map_single(struct mic_device *mdev, void *va, size_t size)
 				"mic_map failed dma_addr 0x%llx size 0x%lx\n",
 				dma_addr, size);
 			pci_unmap_single(pdev, dma_addr,
-				size, PCI_DMA_BIDIRECTIONAL);
+					 size, PCI_DMA_BIDIRECTIONAL);
 		}
 	}
 	return mic_addr;
@@ -366,8 +366,8 @@ int mic_smpt_init(struct mic_device *mdev)
 
 	smpt_info = mdev->smpt;
 	mdev->smpt_ops->init(mdev);
-	smpt_info->entry = kmalloc(sizeof(struct mic_smpt)
-			* smpt_info->info.num_reg, GFP_KERNEL);
+	smpt_info->entry = kmalloc_array(smpt_info->info.num_reg,
+					 sizeof(*smpt_info->entry), GFP_KERNEL);
 	if (!smpt_info->entry) {
 		err = -ENOMEM;
 		goto free_smpt;
@@ -412,7 +412,7 @@ void mic_smpt_uninit(struct mic_device *mdev)
 			smpt_info->entry[i].ref_count);
 		if (smpt_info->entry[i].ref_count)
 			dev_warn(mdev->sdev->parent,
-			"ref count for entry %d is not zero\n", i);
+				 "ref count for entry %d is not zero\n", i);
 	}
 	kfree(smpt_info->entry);
 	kfree(smpt_info);
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index 029a4f3..75746ad 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -130,7 +130,7 @@ state_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 state_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	    const char *buf, size_t count)
 {
 	int rc = 0;
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -163,7 +163,7 @@ done:
 static DEVICE_ATTR_RW(state);
 
 static ssize_t shutdown_status_show(struct device *dev,
-	struct device_attribute *attr, char *buf)
+				    struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -193,7 +193,7 @@ cmdline_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 cmdline_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	      const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -239,7 +239,7 @@ firmware_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 firmware_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	       const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -284,7 +284,7 @@ ramdisk_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 ramdisk_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	      const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -330,7 +330,7 @@ bootmode_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 bootmode_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	       const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -363,7 +363,7 @@ static DEVICE_ATTR_RW(bootmode);
 
 static ssize_t
 log_buf_addr_show(struct device *dev, struct device_attribute *attr,
-	char *buf)
+		  char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -375,7 +375,7 @@ log_buf_addr_show(struct device *dev, struct device_attribute *attr,
 
 static ssize_t
 log_buf_addr_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+		   const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	int ret;
@@ -397,7 +397,7 @@ static DEVICE_ATTR_RW(log_buf_addr);
 
 static ssize_t
 log_buf_len_show(struct device *dev, struct device_attribute *attr,
-	char *buf)
+		 char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -409,7 +409,7 @@ log_buf_len_show(struct device *dev, struct device_attribute *attr,
 
 static ssize_t
 log_buf_len_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+		  const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	int ret;
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 9e0456f..0c883cd 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -195,7 +195,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			MIC_VRINGH_READ, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
+				__func__, __LINE__, ret);
 			break;
 		}
 		len -= out_len;
@@ -206,7 +206,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			!MIC_VRINGH_READ, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
+				__func__, __LINE__, ret);
 			break;
 		}
 		len -= out_len;
@@ -225,8 +225,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 	 * Update the used ring if a descriptor was available and some data was
 	 * copied in/out and the user asked for a used ring update.
 	 */
-	if (*head != USHRT_MAX && copy->out_len &&
-		copy->update_used) {
+	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
 		u32 total = 0;
 
 		/* Determine the total data consumed */
@@ -367,7 +366,6 @@ void mic_bh_handler(struct work_struct *work)
 
 static irqreturn_t mic_virtio_intr_handler(int irq, void *data)
 {
-
 	struct mic_vdev *mvdev = data;
 	struct mic_device *mdev = mvdev->mdev;
 
@@ -394,7 +392,7 @@ int mic_virtio_config_change(struct mic_vdev *mvdev,
 	}
 
 	if (copy_from_user(mic_vq_configspace(mvdev->dd),
-				argp, mvdev->dd->config_len)) {
+			   argp, mvdev->dd->config_len)) {
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
 			__func__, __LINE__, -EFAULT);
 		ret = -EFAULT;
@@ -438,8 +436,8 @@ static int mic_copy_dp_entry(struct mic_vdev *mvdev,
 		return -EFAULT;
 	}
 
-	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE
-		|| dd.num_vq > MIC_MAX_VRINGS) {
+	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
+	    dd.num_vq > MIC_MAX_VRINGS) {
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
 			__func__, __LINE__, -EINVAL);
 		return -EINVAL;
@@ -503,7 +501,7 @@ static void mic_init_device_ctrl(struct mic_vdev *mvdev,
 {
 	struct mic_device_ctrl *dc;
 
-	dc = mvdev->dc = (void *)devpage + mic_aligned_desc_size(devpage);
+	dc = (void *)devpage + mic_aligned_desc_size(devpage);
 
 	dc->config_change = 0;
 	dc->guest_ack = 0;
@@ -512,6 +510,7 @@ static void mic_init_device_ctrl(struct mic_vdev *mvdev,
 	dc->used_address_updated = 0;
 	dc->c2h_vdev_db = -1;
 	dc->h2c_vdev_db = -1;
+	mvdev->dc = dc;
 }
 
 int mic_virtio_add_device(struct mic_vdev *mvdev,
@@ -551,7 +550,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 			sizeof(struct _mic_vring_info));
 		vr->va = (void *)
 			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-			get_order(vr_size));
+					 get_order(vr_size));
 		if (!vr->va) {
 			ret = -ENOMEM;
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
@@ -564,8 +563,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		vqconfig[i].address = mic_map_single(mdev,
 			vr->va, vr_size);
 		if (mic_map_error(vqconfig[i].address)) {
-			free_pages((unsigned long)vr->va,
-				get_order(vr_size));
+			free_pages((unsigned long)vr->va, get_order(vr_size));
 			ret = -ENOMEM;
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -573,8 +571,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		}
 		vqconfig[i].address = cpu_to_le64(vqconfig[i].address);
 
-		vring_init(&vr->vr, num,
-			vr->va, MIC_VIRTIO_RING_ALIGN);
+		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
 		ret = vringh_init_kern(&mvr->vrh,
 			*(u32 *)mic_vq_features(mvdev->dd), num, false,
 			vr->vr.desc, vr->vr.avail, vr->vr.used);
@@ -593,8 +590,8 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 			__func__, __LINE__, i, vr->va, vr->info, vr_size);
 	}
 
-	snprintf(irqname, sizeof(irqname),
-		"mic%dvirtio%d", mdev->id, mvdev->virtio_id);
+	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
+		 mvdev->virtio_id);
 	mvdev->virtio_db = mic_next_db(mdev);
 	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
 			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
@@ -628,9 +625,9 @@ err:
 	for (j = 0; j < i; j++) {
 		struct mic_vringh *mvr = &mvdev->mvr[j];
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[j].address),
-				mvr->vring.len);
+				 mvr->vring.len);
 		free_pages((unsigned long)mvr->vring.va,
-			get_order(mvr->vring.len));
+			   get_order(mvr->vring.len));
 	}
 	mutex_unlock(&mdev->mic_mutex);
 	return ret;
@@ -676,9 +673,9 @@ skip_hot_remove:
 		vringh_kiov_cleanup(&mvr->riov);
 		vringh_kiov_cleanup(&mvr->wiov);
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
-				mvr->vring.len);
+				 mvr->vring.len);
 		free_pages((unsigned long)mvr->vring.va,
-			get_order(mvr->vring.len));
+			   get_order(mvr->vring.len));
 	}
 
 	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index 3a0d660..81e9541 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -46,8 +46,8 @@ mic_x100_write_spad(struct mic_device *mdev, unsigned int idx, u32 val)
 	dev_dbg(mdev->sdev->parent, "Writing 0x%x to scratch pad index %d\n",
 		val, idx);
 	mic_mmio_write(&mdev->mmio, val,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		MIC_X100_SBOX_SPAD0 + idx * 4);
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       MIC_X100_SBOX_SPAD0 + idx * 4);
 }
 
 /**
@@ -130,8 +130,8 @@ static void mic_x100_send_sbox_intr(struct mic_device *mdev,
 {
 	struct mic_mw *mw = &mdev->mmio;
 	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
-	u32 apicicr_low = mic_mmio_read(mw,
-			MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
 
 	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
 	apicicr_low = (apicicr_low | (1 << 13));
@@ -139,7 +139,7 @@ static void mic_x100_send_sbox_intr(struct mic_device *mdev,
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(mw, apicicr_low,
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
 }
 
 /**
@@ -153,7 +153,7 @@ static void mic_x100_send_rdmasr_intr(struct mic_device *mdev,
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(&mdev->mmio, 0,
-		MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
 }
 
 /**
@@ -212,7 +212,7 @@ done:
  */
 static void mic_x100_hw_intr_init(struct mic_device *mdev)
 {
-	mdev->intr_info = (struct mic_intr_info *) mic_x100_intr_init;
+	mdev->intr_info = (struct mic_intr_info *)mic_x100_intr_init;
 }
 
 /**
@@ -244,7 +244,7 @@ mic_x100_read_msi_to_src_map(struct mic_device *mdev, int idx)
  */
 static void
 mic_x100_program_msi_to_src_map(struct mic_device *mdev,
-			int idx, int offset, bool set)
+				int idx, int offset, bool set)
 {
 	unsigned long reg;
 	struct mic_mw *mw = &mdev->mmio;
@@ -308,12 +308,12 @@ static void mic_x100_send_firmware_intr(struct mic_device *mdev)
 	apicicr_low = (vector | (1 << 13));
 
 	mic_mmio_write(mw, mic_x100_get_apic_id(mdev),
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset + 4);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset + 4);
 
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(mw, apicicr_low,
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
 }
 
 /**
@@ -365,8 +365,7 @@ mic_x100_load_command_line(struct mic_device *mdev, const struct firmware *fw)
 	len += snprintf(buf, CMDLINE_SIZE - len,
 		" mem=%dM", boot_mem);
 	if (mdev->cmdline)
-		snprintf(buf + len, CMDLINE_SIZE - len,
-				" %s", mdev->cmdline);
+		snprintf(buf + len, CMDLINE_SIZE - len, " %s", mdev->cmdline);
 	memcpy_toio(cmd_line_va, buf, strlen(buf) + 1);
 	kfree(buf);
 	return 0;
@@ -397,8 +396,7 @@ mic_x100_load_ramdisk(struct mic_device *mdev)
 	 * Typically the bootaddr for card OS is 64M
 	 * so copy over the ramdisk @ 128M.
 	 */
-	memcpy_toio(mdev->aper.va + (mdev->bootaddr << 1),
-		fw->data, fw->size);
+	memcpy_toio(mdev->aper.va + (mdev->bootaddr << 1), fw->data, fw->size);
 	iowrite32(cpu_to_le32(mdev->bootaddr << 1), &bp->hdr.ramdisk_image);
 	iowrite32(cpu_to_le32(fw->size), &bp->hdr.ramdisk_size);
 	release_firmware(fw);
@@ -484,8 +482,7 @@ mic_x100_load_firmware(struct mic_device *mdev, const char *buf)
 	if (mdev->ramdisk)
 		rc = mic_x100_load_ramdisk(mdev);
 error:
-	dev_dbg(mdev->sdev->parent, "%s %d rc %d\n",
-			__func__, __LINE__, rc);
+	dev_dbg(mdev->sdev->parent, "%s %d rc %d\n", __func__, __LINE__, rc);
 done:
 	return rc;
 }
@@ -524,8 +521,8 @@ mic_x100_smpt_set(struct mic_device *mdev, dma_addr_t dma_addr, u8 index)
 	uint32_t smpt_reg_val = BUILD_SMPT(SNOOP_ON,
 			dma_addr >> mdev->smpt->info.page_shift);
 	mic_mmio_write(&mdev->mmio, smpt_reg_val,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		MIC_X100_SBOX_SMPT00 + (4 * index));
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       MIC_X100_SBOX_SMPT00 + (4 * index));
 }
 
 /**
-- 
1.8.2.1



* [PATCH char-misc-next 2/5] misc: mic: header file cleanups.
  2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
  2013-09-27 16:49 74% ` [PATCH char-misc-next 1/5] misc: mic: host driver sysfs cleanups Sudeep Dutt
@ 2013-09-27 16:49 75% ` Sudeep Dutt
  2013-09-27 16:49 35% ` [PATCH char-misc-next 3/5] misc: mic: cleanups for "--strict" checkpatch Sudeep Dutt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

Don't use the same name for header files in different folders.

These changes were suggested by Greg Kroah-Hartman during the
code review @ https://lkml.org/lkml/2013/9/6/18

Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/card/mic_debugfs.c                 | 2 +-
 drivers/misc/mic/card/mic_device.c                  | 2 +-
 drivers/misc/mic/card/mic_virtio.c                  | 2 +-
 drivers/misc/mic/card/mic_x100.c                    | 2 +-
 drivers/misc/mic/common/{mic_device.h => mic_dev.h} | 4 ++--
 drivers/misc/mic/host/mic_boot.c                    | 2 +-
 drivers/misc/mic/host/mic_debugfs.c                 | 2 +-
 drivers/misc/mic/host/mic_fops.c                    | 2 +-
 drivers/misc/mic/host/mic_intr.c                    | 2 +-
 drivers/misc/mic/host/mic_main.c                    | 2 +-
 drivers/misc/mic/host/mic_smpt.c                    | 2 +-
 drivers/misc/mic/host/mic_sysfs.c                   | 2 +-
 drivers/misc/mic/host/mic_virtio.c                  | 2 +-
 drivers/misc/mic/host/mic_x100.c                    | 2 +-
 14 files changed, 15 insertions(+), 15 deletions(-)
 rename drivers/misc/mic/common/{mic_device.h => mic_dev.h} (95%)

diff --git a/drivers/misc/mic/card/mic_debugfs.c b/drivers/misc/mic/card/mic_debugfs.c
index 95cf186..421b3d7 100644
--- a/drivers/misc/mic/card/mic_debugfs.c
+++ b/drivers/misc/mic/card/mic_debugfs.c
@@ -30,7 +30,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /* Debugfs parent dir */
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 4125217..175d114 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -30,7 +30,7 @@
 #include <linux/reboot.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_virtio.h"
 
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 38275c1..7420c33 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -40,7 +40,7 @@
 #include <linux/slab.h>
 #include <linux/virtio_config.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_virtio.h"
 
 #define VIRTIO_SUBCODE_64 0x0D00
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 7cb3469..f491c91 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -28,7 +28,7 @@
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 
diff --git a/drivers/misc/mic/common/mic_device.h b/drivers/misc/mic/common/mic_dev.h
similarity index 95%
rename from drivers/misc/mic/common/mic_device.h
rename to drivers/misc/mic/common/mic_dev.h
index 01eb74f..92999c2 100644
--- a/drivers/misc/mic/common/mic_device.h
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -18,8 +18,8 @@
  * Intel MIC driver.
  *
  */
-#ifndef __MIC_COMMON_DEVICE_H_
-#define __MIC_COMMON_DEVICE_H_
+#ifndef __MIC_DEV_H__
+#define __MIC_DEV_H__
 
 /**
  * struct mic_mw - MIC memory window
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index fd9ff6d..60c54d5 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -22,7 +22,7 @@
 #include <linux/firmware.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index e22fb7b..98a61b4 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -23,7 +23,7 @@
 #include <linux/seq_file.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
index 661469a..e699c80 100644
--- a/drivers/misc/mic/host/mic_fops.c
+++ b/drivers/misc/mic/host/mic_fops.c
@@ -21,7 +21,7 @@
 #include <linux/poll.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_fops.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index cdae314..71a7521 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -21,7 +21,7 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /*
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index a8965d4..ca06aa9 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -28,7 +28,7 @@
 #include <linux/poll.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index a346207..003d02b 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -20,7 +20,7 @@
  */
 #include <linux/pci.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index bebbae2..029a4f3 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -21,7 +21,7 @@
 #include <linux/pci.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /*
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index be2a1f0..9e0456f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -23,7 +23,7 @@
 #include <linux/uaccess.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index a12ae5c..3a0d660 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -24,7 +24,7 @@
 #include <linux/firmware.h>
 #include <linux/delay.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
-- 
1.8.2.1



* [PATCH char-misc-next 1/5] misc: mic: host driver sysfs cleanups.
  2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
@ 2013-09-27 16:49 74% ` Sudeep Dutt
  2013-09-27 16:49 75% ` [PATCH char-misc-next 2/5] misc: mic: header file cleanups Sudeep Dutt
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

+ Use DEVICE_ATTR_RO/RW instead of DEVICE_ATTR
+ Use ATTRIBUTE_GROUPS

These changes were suggested by Greg Kroah-Hartman during the
code review @ https://lkml.org/lkml/2013/9/6/13
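
For illustration only, a minimal userspace sketch of what the two
conversions rely on. The DEMO_* macros and the cut-down structs are
simplified stand-ins written for this example and are assumptions, not
the kernel's definitions; the point is that the _RO/_RW variants derive
the callback names (<name>_show, <name>_store) and the mode from the
attribute name, and that the groups macro generates <name>_group and
<name>_groups from an existing <name>_attrs array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Cut-down stand-ins for the kernel types (assumption: demo only). */
struct device;
struct attribute { const char *name; unsigned short mode; };
struct device_attribute {
	struct attribute attr;
	long (*show)(struct device *dev, struct device_attribute *attr,
		     char *buf);
	long (*store)(struct device *dev, struct device_attribute *attr,
		      const char *buf, size_t count);
};
struct attribute_group { struct attribute **attrs; };

/* Hypothetical macros: the callback names are derived from _name. */
#define DEMO_ATTR_RO(_name)						\
	struct device_attribute dev_attr_##_name = {			\
		.attr = { .name = #_name, .mode = 0444 },		\
		.show = _name##_show,					\
	}
#define DEMO_ATTR_RW(_name)						\
	struct device_attribute dev_attr_##_name = {			\
		.attr = { .name = #_name, .mode = 0644 },		\
		.show = _name##_show,					\
		.store = _name##_store,					\
	}
/* Generates <name>_group and <name>_groups from <name>_attrs. */
#define DEMO_ATTRIBUTE_GROUPS(_name)					\
	static const struct attribute_group _name##_group = {		\
		.attrs = _name##_attrs,					\
	};								\
	static const struct attribute_group *_name##_groups[] = {	\
		&_name##_group,						\
		NULL,							\
	}

#define DEMO_BUF_SIZE 64

static long family_show(struct device *dev, struct device_attribute *attr,
			char *buf)
{
	(void)dev; (void)attr;
	return snprintf(buf, DEMO_BUF_SIZE, "%s\n", "x100");
}
static DEMO_ATTR_RO(family);

static long state_show(struct device *dev, struct device_attribute *attr,
		       char *buf)
{
	(void)dev; (void)attr;
	return snprintf(buf, DEMO_BUF_SIZE, "%s\n", "offline");
}

static long state_store(struct device *dev, struct device_attribute *attr,
			const char *buf, size_t count)
{
	(void)dev; (void)attr; (void)buf;
	return (long)count;	/* pretend the whole write was consumed */
}
static DEMO_ATTR_RW(state);

static struct attribute *mic_default_attrs[] = {
	&dev_attr_family.attr,
	&dev_attr_state.attr,
	NULL
};
DEMO_ATTRIBUTE_GROUPS(mic_default);

/* Count the attributes reachable through a NULL-terminated group list. */
static int demo_count_attrs(const struct attribute_group **groups)
{
	struct attribute **a;
	int n = 0;

	for (; *groups; groups++)
		for (a = (*groups)->attrs; *a; a++)
			n++;
	return n;
}

static void demo_self_check(void)
{
	char buf[DEMO_BUF_SIZE];

	assert(dev_attr_family.attr.mode == 0444);
	assert(dev_attr_family.store == NULL);
	assert(dev_attr_state.attr.mode == 0644);
	assert(dev_attr_family.show(NULL, &dev_attr_family, buf) == 5);
	assert(strcmp(buf, "x100\n") == 0);
	assert(demo_count_attrs(mic_default_groups) == 2);
}
```

This naming convention is why the diff below renames every
mic_show_<name>/mic_store_<name> callback to <name>_show/<name>_store.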

Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/host/mic_sysfs.c | 72 ++++++++++++++++-----------------------
 1 file changed, 29 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index aaf8499..bebbae2 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -65,7 +65,7 @@ void mic_set_state(struct mic_device *mdev, u8 state)
 }
 
 static ssize_t
-mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
+family_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	static const char x100[] = "x100";
 	static const char unknown[] = "Unknown";
@@ -85,10 +85,10 @@ mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
 	}
 	return scnprintf(buf, PAGE_SIZE, "%s\n", card);
 }
-static DEVICE_ATTR(family, S_IRUGO, mic_show_family, NULL);
+static DEVICE_ATTR_RO(family);
 
 static ssize_t
-mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
+stepping_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *string = "??";
@@ -114,10 +114,10 @@ mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
 	}
 	return scnprintf(buf, PAGE_SIZE, "%s\n", string);
 }
-static DEVICE_ATTR(stepping, S_IRUGO, mic_show_stepping, NULL);
+static DEVICE_ATTR_RO(stepping);
 
 static ssize_t
-mic_show_state(struct device *dev, struct device_attribute *attr, char *buf)
+state_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -129,7 +129,7 @@ mic_show_state(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_state(struct device *dev, struct device_attribute *attr,
+state_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	int rc = 0;
@@ -160,9 +160,9 @@ mic_store_state(struct device *dev, struct device_attribute *attr,
 done:
 	return count;
 }
-static DEVICE_ATTR(state, S_IRUGO|S_IWUSR, mic_show_state, mic_store_state);
+static DEVICE_ATTR_RW(state);
 
-static ssize_t mic_show_shutdown_status(struct device *dev,
+static ssize_t shutdown_status_show(struct device *dev,
 	struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -173,11 +173,10 @@ static ssize_t mic_show_shutdown_status(struct device *dev,
 	return scnprintf(buf, PAGE_SIZE, "%s\n",
 		mic_shutdown_status_string[mdev->shutdown_status]);
 }
-static DEVICE_ATTR(shutdown_status, S_IRUGO|S_IWUSR,
-	mic_show_shutdown_status, NULL);
+static DEVICE_ATTR_RO(shutdown_status);
 
 static ssize_t
-mic_show_cmdline(struct device *dev, struct device_attribute *attr, char *buf)
+cmdline_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *cmdline;
@@ -193,7 +192,7 @@ mic_show_cmdline(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_cmdline(struct device *dev, struct device_attribute *attr,
+cmdline_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -220,11 +219,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(cmdline, S_IRUGO | S_IWUSR,
-	mic_show_cmdline, mic_store_cmdline);
+static DEVICE_ATTR_RW(cmdline);
 
 static ssize_t
-mic_show_firmware(struct device *dev, struct device_attribute *attr, char *buf)
+firmware_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *firmware;
@@ -240,7 +238,7 @@ mic_show_firmware(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_firmware(struct device *dev, struct device_attribute *attr,
+firmware_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -266,11 +264,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(firmware, S_IRUGO | S_IWUSR,
-	mic_show_firmware, mic_store_firmware);
+static DEVICE_ATTR_RW(firmware);
 
 static ssize_t
-mic_show_ramdisk(struct device *dev, struct device_attribute *attr, char *buf)
+ramdisk_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *ramdisk;
@@ -286,7 +283,7 @@ mic_show_ramdisk(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_ramdisk(struct device *dev, struct device_attribute *attr,
+ramdisk_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -313,11 +310,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(ramdisk, S_IRUGO | S_IWUSR,
-	mic_show_ramdisk, mic_store_ramdisk);
+static DEVICE_ATTR_RW(ramdisk);
 
 static ssize_t
-mic_show_bootmode(struct device *dev, struct device_attribute *attr, char *buf)
+bootmode_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *bootmode;
@@ -333,7 +329,7 @@ mic_show_bootmode(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_bootmode(struct device *dev, struct device_attribute *attr,
+bootmode_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -363,11 +359,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(bootmode, S_IRUGO | S_IWUSR,
-	mic_show_bootmode, mic_store_bootmode);
+static DEVICE_ATTR_RW(bootmode);
 
 static ssize_t
-mic_show_log_buf_addr(struct device *dev, struct device_attribute *attr,
+log_buf_addr_show(struct device *dev, struct device_attribute *attr,
 	char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -379,7 +374,7 @@ mic_show_log_buf_addr(struct device *dev, struct device_attribute *attr,
 }
 
 static ssize_t
-mic_store_log_buf_addr(struct device *dev, struct device_attribute *attr,
+log_buf_addr_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -398,11 +393,10 @@ mic_store_log_buf_addr(struct device *dev, struct device_attribute *attr,
 exit:
 	return ret;
 }
-static DEVICE_ATTR(log_buf_addr, S_IRUGO | S_IWUSR,
-	mic_show_log_buf_addr, mic_store_log_buf_addr);
+static DEVICE_ATTR_RW(log_buf_addr);
 
 static ssize_t
-mic_show_log_buf_len(struct device *dev, struct device_attribute *attr,
+log_buf_len_show(struct device *dev, struct device_attribute *attr,
 	char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -414,7 +408,7 @@ mic_show_log_buf_len(struct device *dev, struct device_attribute *attr,
 }
 
 static ssize_t
-mic_store_log_buf_len(struct device *dev, struct device_attribute *attr,
+log_buf_len_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -433,8 +427,7 @@ mic_store_log_buf_len(struct device *dev, struct device_attribute *attr,
 exit:
 	return ret;
 }
-static DEVICE_ATTR(log_buf_len, S_IRUGO | S_IWUSR,
-	mic_show_log_buf_len, mic_store_log_buf_len);
+static DEVICE_ATTR_RW(log_buf_len);
 
 static struct attribute *mic_default_attrs[] = {
 	&dev_attr_family.attr,
@@ -451,16 +444,9 @@ static struct attribute *mic_default_attrs[] = {
 	NULL
 };
 
-static struct attribute_group mic_attr_group = {
-	.attrs = mic_default_attrs,
-};
-
-static const struct attribute_group *__mic_attr_group[] = {
-	&mic_attr_group,
-	NULL
-};
+ATTRIBUTE_GROUPS(mic_default);
 
 void mic_sysfs_init(struct mic_device *mdev)
 {
-	mdev->attr_group = __mic_attr_group;
+	mdev->attr_group = mic_default_groups;
 }
-- 
1.8.2.1



* [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix.
@ 2013-09-27 16:49 94% Sudeep Dutt
  2013-09-27 16:49 74% ` [PATCH char-misc-next 1/5] misc: mic: host driver sysfs cleanups Sudeep Dutt
                   ` (5 more replies)
  0 siblings, 6 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27 16:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Joe Perches, linux-kernel,
	linux-doc, Fengguang Wu, Stephen Rothwell
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

These patches address code review feedback received on the
patch series @ https://lkml.org/lkml/2013/9/5/561, from:
a) Greg Kroah-Hartman on sysfs and header file cleanups.
b) Joe Perches on issues found by "--strict" checkpatch.

It should also fix the build failures on certain non-X86
architectures, reported in linux-next and 0-DAY builds.

Ashutosh Dixit (1):
  misc: mic: cleanups for "--strict" checkpatch.

Sudeep Dutt (4):
  misc: mic: host driver sysfs cleanups.
  misc: mic: header file cleanups.
  misc: mic: fix a warning in the IOCTL header file.
  misc: mic: depend on X86 for both host and card drivers.

 Documentation/mic/mpssd/mpssd.c                    | 65 +++++++--------
 Documentation/mic/mpssd/sysfs.c                    |  4 +-
 drivers/misc/mic/Kconfig                           |  4 +-
 drivers/misc/mic/card/mic_debugfs.c                |  2 +-
 drivers/misc/mic/card/mic_device.c                 |  3 +-
 drivers/misc/mic/card/mic_virtio.c                 | 19 +++--
 drivers/misc/mic/card/mic_x100.c                   |  6 +-
 .../misc/mic/common/{mic_device.h => mic_dev.h}    |  4 +-
 drivers/misc/mic/host/mic_boot.c                   |  2 +-
 drivers/misc/mic/host/mic_debugfs.c                | 92 ++++++++++-----------
 drivers/misc/mic/host/mic_fops.c                   |  8 +-
 drivers/misc/mic/host/mic_intr.c                   | 36 +++++----
 drivers/misc/mic/host/mic_main.c                   |  2 +-
 drivers/misc/mic/host/mic_smpt.c                   | 16 ++--
 drivers/misc/mic/host/mic_sysfs.c                  | 94 +++++++++-------------
 drivers/misc/mic/host/mic_virtio.c                 | 39 +++++----
 drivers/misc/mic/host/mic_x100.c                   | 35 ++++----
 include/uapi/linux/mic_ioctl.h                     |  2 +
 18 files changed, 206 insertions(+), 227 deletions(-)
 rename drivers/misc/mic/common/{mic_device.h => mic_dev.h} (95%)

-- 
1.8.2.1



* [PATCH CHAR-MISC-NEXT 4/4] Intel MIC Header File warning fix.
  2013-09-27  1:09 96% [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups Sudeep Dutt
                   ` (2 preceding siblings ...)
  2013-09-27  1:10 35% ` [PATCH CHAR-MISC-NEXT 3/4] Intel MIC Driver Cleanups for "--strict" checkpatch Sudeep Dutt
@ 2013-09-27  1:10 99% ` Sudeep Dutt
    3 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2013-09-27  1:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joe Perches, linux-kernel, linux-doc
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

usr/include/linux/mic_ioctl.h:37:
found __[us]{8,16,32,64} type without #include <linux/types.h>
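
For illustration only, a minimal sketch of the pattern the header check
asks for: a header that uses the __u8/__u16/__u32/__u64 types includes
<linux/types.h> itself instead of relying on its includers. The struct
demo_copy_desc and its fields are hypothetical names for this example,
not the layout in mic_ioctl.h, and the sketch assumes the Linux UAPI
headers are installed.

```c
#include <assert.h>
#include <linux/types.h>	/* defines __u8, __u16, __u32, __u64 */

/* Hypothetical ioctl argument; the field names are illustrative only. */
struct demo_copy_desc {
	__u32 iovcnt;
	__u8 vr_idx;
	__u8 update_used;
	__u32 out_len;
};

/* The fixed-width types have the same size on every architecture. */
static void demo_check_widths(void)
{
	assert(sizeof(__u8) == 1);
	assert(sizeof(__u16) == 2);
	assert(sizeof(__u32) == 4);
	assert(sizeof(__u64) == 8);
}
```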

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 include/uapi/linux/mic_ioctl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/mic_ioctl.h b/include/uapi/linux/mic_ioctl.h
index 0e6cbf3..7fabba5 100644
--- a/include/uapi/linux/mic_ioctl.h
+++ b/include/uapi/linux/mic_ioctl.h
@@ -21,6 +21,8 @@
 #ifndef _MIC_IOCTL_H_
 #define _MIC_IOCTL_H_
 
+#include <linux/types.h>
+
 /*
  * mic_copy - MIC virtio descriptor copy.
  *
-- 
1.8.2.1



* [PATCH CHAR-MISC-NEXT 2/4] Intel MIC Driver Header File Cleanups.
  2013-09-27  1:09 96% [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups Sudeep Dutt
  2013-09-27  1:09 75% ` [PATCH CHAR-MISC-NEXT 1/4] Intel MIC Host Driver Sysfs Cleanups Sudeep Dutt
@ 2013-09-27  1:10 67% ` Sudeep Dutt
  2013-09-27  1:10 35% ` [PATCH CHAR-MISC-NEXT 3/4] Intel MIC Driver Cleanups for "--strict" checkpatch Sudeep Dutt
  2013-09-27  1:10 99% ` [PATCH CHAR-MISC-NEXT 4/4] Intel MIC Header File warning fix Sudeep Dutt
  3 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27  1:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joe Perches, linux-kernel, linux-doc
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

Don't use the same name for header files in different folders.

These changes were suggested by Greg Kroah-Hartman during the
code review @ https://lkml.org/lkml/2013/9/6/18

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/card/mic_debugfs.c  |  2 +-
 drivers/misc/mic/card/mic_device.c   |  2 +-
 drivers/misc/mic/card/mic_virtio.c   |  2 +-
 drivers/misc/mic/card/mic_x100.c     |  2 +-
 drivers/misc/mic/common/mic_dev.h    | 51 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/common/mic_device.h | 51 ------------------------------------
 drivers/misc/mic/host/mic_boot.c     |  2 +-
 drivers/misc/mic/host/mic_debugfs.c  |  2 +-
 drivers/misc/mic/host/mic_fops.c     |  2 +-
 drivers/misc/mic/host/mic_intr.c     |  2 +-
 drivers/misc/mic/host/mic_main.c     |  2 +-
 drivers/misc/mic/host/mic_smpt.c     |  2 +-
 drivers/misc/mic/host/mic_sysfs.c    |  2 +-
 drivers/misc/mic/host/mic_virtio.c   |  2 +-
 drivers/misc/mic/host/mic_x100.c     |  2 +-
 15 files changed, 64 insertions(+), 64 deletions(-)
 create mode 100644 drivers/misc/mic/common/mic_dev.h
 delete mode 100644 drivers/misc/mic/common/mic_device.h

diff --git a/drivers/misc/mic/card/mic_debugfs.c b/drivers/misc/mic/card/mic_debugfs.c
index 95cf186..421b3d7 100644
--- a/drivers/misc/mic/card/mic_debugfs.c
+++ b/drivers/misc/mic/card/mic_debugfs.c
@@ -30,7 +30,7 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /* Debugfs parent dir */
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 4125217..175d114 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -30,7 +30,7 @@
 #include <linux/reboot.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_virtio.h"
 
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 38275c1..7420c33 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -40,7 +40,7 @@
 #include <linux/slab.h>
 #include <linux/virtio_config.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_virtio.h"
 
 #define VIRTIO_SUBCODE_64 0x0D00
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index 7cb3469..f491c91 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -28,7 +28,7 @@
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 
diff --git a/drivers/misc/mic/common/mic_dev.h b/drivers/misc/mic/common/mic_dev.h
new file mode 100644
index 0000000..92999c2
--- /dev/null
+++ b/drivers/misc/mic/common/mic_dev.h
@@ -0,0 +1,51 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC driver.
+ *
+ */
+#ifndef __MIC_DEV_H__
+#define __MIC_DEV_H__
+
+/**
+ * struct mic_mw - MIC memory window
+ *
+ * @pa: Base physical address.
+ * @va: Base ioremap'd virtual address.
+ * @len: Size of the memory window.
+ */
+struct mic_mw {
+	phys_addr_t pa;
+	void __iomem *va;
+	resource_size_t len;
+};
+
+/*
+ * Scratch pad register offsets used by the host to communicate
+ * device page DMA address to the card.
+ */
+#define MIC_DPLO_SPAD 14
+#define MIC_DPHI_SPAD 15
+
+/*
+ * These values are supposed to be in the config_change field of the
+ * device page when the host sends a config change interrupt to the card.
+ */
+#define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
+#define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
+
+#endif
diff --git a/drivers/misc/mic/common/mic_device.h b/drivers/misc/mic/common/mic_device.h
deleted file mode 100644
index 01eb74f..0000000
--- a/drivers/misc/mic/common/mic_device.h
+++ /dev/null
@@ -1,51 +0,0 @@
-/*
- * Intel MIC Platform Software Stack (MPSS)
- *
- * Copyright(c) 2013 Intel Corporation.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License, version 2, as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * The full GNU General Public License is included in this distribution in
- * the file called "COPYING".
- *
- * Intel MIC driver.
- *
- */
-#ifndef __MIC_COMMON_DEVICE_H_
-#define __MIC_COMMON_DEVICE_H_
-
-/**
- * struct mic_mw - MIC memory window
- *
- * @pa: Base physical address.
- * @va: Base ioremap'd virtual address.
- * @len: Size of the memory window.
- */
-struct mic_mw {
-	phys_addr_t pa;
-	void __iomem *va;
-	resource_size_t len;
-};
-
-/*
- * Scratch pad register offsets used by the host to communicate
- * device page DMA address to the card.
- */
-#define MIC_DPLO_SPAD 14
-#define MIC_DPHI_SPAD 15
-
-/*
- * These values are supposed to be in the config_change field of the
- * device page when the host sends a config change interrupt to the card.
- */
-#define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
-#define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
-
-#endif
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index fd9ff6d..60c54d5 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -22,7 +22,7 @@
 #include <linux/firmware.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index e22fb7b..98a61b4 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -23,7 +23,7 @@
 #include <linux/seq_file.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
index 661469a..e699c80 100644
--- a/drivers/misc/mic/host/mic_fops.c
+++ b/drivers/misc/mic/host/mic_fops.c
@@ -21,7 +21,7 @@
 #include <linux/poll.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_fops.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index cdae314..71a7521 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -21,7 +21,7 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /*
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index a8965d4..ca06aa9 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -28,7 +28,7 @@
 #include <linux/poll.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index a346207..003d02b 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -20,7 +20,7 @@
  */
 #include <linux/pci.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index bebbae2..029a4f3 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -21,7 +21,7 @@
 #include <linux/pci.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 
 /*
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index be2a1f0..9e0456f 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -23,7 +23,7 @@
 #include <linux/uaccess.h>
 
 #include <linux/mic_common.h>
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
 #include "mic_virtio.h"
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index a12ae5c..3a0d660 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -24,7 +24,7 @@
 #include <linux/firmware.h>
 #include <linux/delay.h>
 
-#include "../common/mic_device.h"
+#include "../common/mic_dev.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
-- 
1.8.2.1



* [PATCH CHAR-MISC-NEXT 3/4] Intel MIC Driver Cleanups for "--strict" checkpatch.
  2013-09-27  1:09 96% [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups Sudeep Dutt
  2013-09-27  1:09 75% ` [PATCH CHAR-MISC-NEXT 1/4] Intel MIC Host Driver Sysfs Cleanups Sudeep Dutt
  2013-09-27  1:10 67% ` [PATCH CHAR-MISC-NEXT 2/4] Intel MIC Driver Header File Cleanups Sudeep Dutt
@ 2013-09-27  1:10 35% ` Sudeep Dutt
  2013-09-27  1:10 99% ` [PATCH CHAR-MISC-NEXT 4/4] Intel MIC Header File warning fix Sudeep Dutt
  3 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27  1:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joe Perches, linux-kernel, linux-doc
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

These changes were mostly authored by Joe Perches <joe@perches.com>
@ https://lkml.org/lkml/2013/9/5/602
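
For readers unfamiliar with the "--strict" checks, the sketch below
illustrates the kinds of style changes made in this patch: no space
after a cast, continuation lines aligned with the open parenthesis,
sizeof(*ptr) instead of sizeof(struct type), and one assignment per
statement. It is a userspace illustration only; struct demo_cb and both
functions are hypothetical names and are not part of the MIC driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type, for illustration only; not taken from the driver. */
struct demo_cb {
	int cb_id;
	void *data;
};

/* Continuation lines are aligned with the open parenthesis. */
struct demo_cb *demo_cb_alloc(int cb_id,
			      void *data)
{
	/* sizeof(*cb) stays correct even if the type of cb changes. */
	struct demo_cb *cb = malloc(sizeof(*cb));

	if (!cb)
		return NULL;
	cb->cb_id = cb_id;
	cb->data = data;
	return cb;
}

/* Assumes cb->data points at an int. */
int demo_cb_value(const struct demo_cb *cb)
{
	int first, second;

	/* No space between the cast and its operand. */
	second = *(const int *)cb->data;
	/* One assignment per statement, not "first = second = ...". */
	first = second;
	assert(first == second);
	return first;
}
```

A caller would allocate with demo_cb_alloc(7, (void *)&value), read the
value back with demo_cb_value(), and release the object with free().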

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 Documentation/mic/mpssd/mpssd.c     | 65 ++++++++++++++-------------
 Documentation/mic/mpssd/sysfs.c     |  4 +-
 drivers/misc/mic/card/mic_device.c  |  1 -
 drivers/misc/mic/card/mic_virtio.c  | 17 ++++---
 drivers/misc/mic/card/mic_x100.c    |  4 +-
 drivers/misc/mic/host/mic_debugfs.c | 90 ++++++++++++++++++-------------------
 drivers/misc/mic/host/mic_fops.c    |  6 +--
 drivers/misc/mic/host/mic_intr.c    | 34 +++++++-------
 drivers/misc/mic/host/mic_smpt.c    | 14 +++---
 drivers/misc/mic/host/mic_sysfs.c   | 20 ++++-----
 drivers/misc/mic/host/mic_virtio.c  | 37 +++++++--------
 drivers/misc/mic/host/mic_x100.c    | 33 +++++++-------
 12 files changed, 158 insertions(+), 167 deletions(-)

diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
index 8064804..f9327a2 100644
--- a/Documentation/mic/mpssd/mpssd.c
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -246,7 +246,7 @@ static int tun_alloc(struct mic_info *mic, char *dev)
 	if (*dev)
 		strncpy(ifr.ifr_name, dev, IFNAMSIZ);
 
-	err = ioctl(fd, TUNSETIFF, (void *) &ifr);
+	err = ioctl(fd, TUNSETIFF, (void *)&ifr);
 	if (err < 0) {
 		mpsslog("%s %s %d TUNSETIFF failed %s\n",
 			mic->name, __func__, __LINE__, strerror(errno));
@@ -363,7 +363,7 @@ static inline void verify_out_len(struct mic_info *mic,
 {
 	if (copy->out_len != sum_iovec_len(copy)) {
 		mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%x\n",
-				mic->name, __func__, __LINE__,
+			mic->name, __func__, __LINE__,
 				copy->out_len, sum_iovec_len(copy));
 		assert(copy->out_len == sum_iovec_len(copy));
 	}
@@ -372,7 +372,7 @@ static inline void verify_out_len(struct mic_info *mic,
 /* Display an iovec */
 static void
 disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy,
-	const char *s, int line)
+	   const char *s, int line)
 {
 	int i;
 
@@ -401,7 +401,7 @@ static inline void txrx_prepare(int type, bool tx, struct mic_vring *vr,
 /* Central API which triggers the copies */
 static int
 mic_virtio_copy(struct mic_info *mic, int fd,
-	struct mic_vring *vr, struct mic_copy_desc *copy)
+		struct mic_vring *vr, struct mic_copy_desc *copy)
 {
 	int ret;
 
@@ -440,7 +440,7 @@ init_vr(struct mic_info *mic, int fd, int type,
 	vr0->info = vr0->va +
 		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN);
 	vring_init(&vr0->vr,
-		MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN);
+		   MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN);
 	mpsslog("%s %s vr0 %p vr0->info %p vr_size 0x%x vring 0x%x ",
 		__func__, mic->name, vr0->va, vr0->info, vr_size,
 		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
@@ -453,7 +453,7 @@ init_vr(struct mic_info *mic, int fd, int type,
 		vr1->info = vr1->va + vring_size(MIC_VRING_ENTRIES,
 			MIC_VIRTIO_RING_ALIGN);
 		vring_init(&vr1->vr,
-			MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN);
+			   MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN);
 		mpsslog("%s %s vr1 %p vr1->info %p vr_size 0x%x vring 0x%x ",
 			__func__, mic->name, vr1->va, vr1->info, vr_size,
 			vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
@@ -551,7 +551,7 @@ virtio_net(void *arg)
 	net_poll[NET_FD_TUN].events = POLLIN;
 
 	if (MAP_FAILED == init_vr(mic, mic->mic_net.virtio_net_fd,
-		VIRTIO_ID_NET, &tx_vr, &rx_vr,
+				  VIRTIO_ID_NET, &tx_vr, &rx_vr,
 		virtnet_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
@@ -576,7 +576,7 @@ virtio_net(void *arg)
 		}
 		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
 			wait_for_card_driver(mic, mic->mic_net.virtio_net_fd,
-					VIRTIO_ID_NET);
+					     VIRTIO_ID_NET);
 		/*
 		 * Check if there is data to be read from TUN and write to
 		 * virtio net fd if there is.
@@ -587,7 +587,7 @@ virtio_net(void *arg)
 				copy.iov, copy.iovcnt);
 			if (len > 0) {
 				struct virtio_net_hdr *hdr
-					= (struct virtio_net_hdr *) vnet_hdr[0];
+					= (struct virtio_net_hdr *)vnet_hdr[0];
 
 				/* Disable checksums on the card since we are on
 				   a reliable PCIe link */
@@ -606,7 +606,7 @@ virtio_net(void *arg)
 #endif
 				spin_for_descriptors(mic, &tx_vr);
 				txrx_prepare(VIRTIO_ID_NET, 1, &tx_vr, &copy,
-					len);
+					     len);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_net.virtio_net_fd, &tx_vr,
@@ -644,7 +644,7 @@ virtio_net(void *arg)
 				le16toh(rx_vr.vr.avail->idx)) {
 				copy.iov = iov1;
 				txrx_prepare(VIRTIO_ID_NET, 0, &rx_vr, &copy,
-					MAX_NET_PKT_SIZE
+					     MAX_NET_PKT_SIZE
 					+ sizeof(struct virtio_net_hdr));
 
 				err = mic_virtio_copy(mic,
@@ -669,7 +669,7 @@ virtio_net(void *arg)
 					verify_out_len(mic, &copy);
 #ifdef DEBUG
 					disp_iovec(mic, copy, __func__,
-						__LINE__);
+						   __LINE__);
 					mpsslog("%s %s %d ",
 						mic->name, __func__, __LINE__);
 					mpsslog("read from net 0x%lx\n",
@@ -686,7 +686,7 @@ virtio_net(void *arg)
 					} else {
 #ifdef DEBUG
 						disp_iovec(mic, &copy, __func__,
-							__LINE__);
+							   __LINE__);
 						mpsslog("%s %s %d ",
 							mic->name, __func__,
 							__LINE__);
@@ -750,13 +750,13 @@ virtio_console(void *arg)
 	err = grantpt(pty_fd);
 	if (err < 0) {
 		mpsslog("can't grant access: %s %s\n",
-				pts_name, strerror(errno));
+			pts_name, strerror(errno));
 		goto _close_pty;
 	}
 	err = unlockpt(pty_fd);
 	if (err < 0) {
 		mpsslog("can't unlock a pseudoterminal: %s %s\n",
-				pts_name, strerror(errno));
+			pts_name, strerror(errno));
 		goto _close_pty;
 	}
 	console_poll[MONITOR_FD].fd = pty_fd;
@@ -766,7 +766,7 @@ virtio_console(void *arg)
 	console_poll[VIRTIO_CONSOLE_FD].events = POLLIN;
 
 	if (MAP_FAILED == init_vr(mic, mic->mic_console.virtio_console_fd,
-		VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr,
+				  VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr,
 		virtcons_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
@@ -787,7 +787,7 @@ virtio_console(void *arg)
 		}
 		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
 			wait_for_card_driver(mic,
-				mic->mic_console.virtio_console_fd,
+					     mic->mic_console.virtio_console_fd,
 				VIRTIO_ID_CONSOLE);
 
 		if (console_poll[MONITOR_FD].revents & POLLIN) {
@@ -802,7 +802,7 @@ virtio_console(void *arg)
 #endif
 				spin_for_descriptors(mic, &tx_vr);
 				txrx_prepare(VIRTIO_ID_CONSOLE, 1, &tx_vr,
-					&copy, len);
+					     &copy, len);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_console.virtio_console_fd,
@@ -837,7 +837,7 @@ virtio_console(void *arg)
 				le16toh(rx_vr.vr.avail->idx)) {
 				copy.iov = iov1;
 				txrx_prepare(VIRTIO_ID_CONSOLE, 0, &rx_vr,
-					&copy, PAGE_SIZE);
+					     &copy, PAGE_SIZE);
 
 				err = mic_virtio_copy(mic,
 					mic->mic_console.virtio_console_fd,
@@ -848,7 +848,7 @@ virtio_console(void *arg)
 					verify_out_len(mic, &copy);
 #ifdef DEBUG
 					disp_iovec(mic, copy, __func__,
-						__LINE__);
+						   __LINE__);
 					mpsslog("%s %s %d ",
 						mic->name, __func__, __LINE__);
 					mpsslog("read from net 0x%lx\n",
@@ -865,7 +865,7 @@ virtio_console(void *arg)
 					} else {
 #ifdef DEBUG
 						disp_iovec(mic, copy, __func__,
-							__LINE__);
+							   __LINE__);
 						mpsslog("%s %s %d ",
 							mic->name, __func__,
 							__LINE__);
@@ -1033,7 +1033,8 @@ start_virtblk(struct mic_info *mic, struct mic_vring *vring)
 	}
 	add_virtio_device(mic, &virtblk_dev_page.dd);
 	if (MAP_FAILED == init_vr(mic, mic->mic_virtblk.virtio_block_fd,
-		VIRTIO_ID_BLOCK, vring, NULL, virtblk_dev_page.dd.num_vq)) {
+				  VIRTIO_ID_BLOCK, vring, NULL,
+				  virtblk_dev_page.dd.num_vq)) {
 		mpsslog("%s init_vr failed %s\n",
 			mic->name, strerror(errno));
 		return false;
@@ -1060,7 +1061,7 @@ header_error_check(struct vring_desc *desc)
 {
 	if (le32toh(desc->len) != sizeof(struct virtio_blk_outhdr)) {
 		mpsslog("%s() %d: length is not sizeof(virtio_blk_outhd)\n",
-				__func__, __LINE__);
+			__func__, __LINE__);
 		return -EIO;
 	}
 	if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT)) {
@@ -1132,7 +1133,7 @@ write_status(int fd, __u8 *status)
 static void *
 virtio_block(void *arg)
 {
-	struct mic_info *mic = (struct mic_info *) arg;
+	struct mic_info *mic = (struct mic_info *)arg;
 	int ret;
 	struct pollfd block_poll;
 	struct mic_vring vring;
@@ -1219,8 +1220,8 @@ virtio_block(void *arg)
 				status = 0;
 				fos = mic->mic_virtblk.backend_addr +
 					(hdr.sector * SECTOR_SIZE);
-				buffer_desc_idx = desc_idx =
-					next_desc(desc);
+				buffer_desc_idx = next_desc(desc);
+				desc_idx = buffer_desc_idx;
 				for (desc = &vring.vr.desc[buffer_desc_idx];
 				     desc->flags & VRING_DESC_F_NEXT;
 				     desc_idx = next_desc(desc),
@@ -1255,7 +1256,7 @@ virtio_block(void *arg)
 						iovec,
 						piov - iovec);
 					if (ret < 0 &&
-						status != 0)
+					    status != 0)
 						status = ret;
 				}
 				/* write status and update used pointer */
@@ -1391,7 +1392,7 @@ mic_config(void *arg)
 	char value[4096];
 
 	snprintf(pathname, PATH_MAX - 1, "%s/%s/%s",
-		MICSYSFSDIR, mic->name, "state");
+		 MICSYSFSDIR, mic->name, "state");
 
 	fd = open(pathname, O_RDONLY);
 	if (fd < 0) {
@@ -1560,22 +1561,22 @@ init_mic(struct mic_info *mic)
 			virtio_console, mic);
 		if (err)
 			mpsslog("%s virtcons pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		err = pthread_create(&mic->mic_net.net_thread, NULL,
 			virtio_net, mic);
 		if (err)
 			mpsslog("%s virtnet pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		err = pthread_create(&mic->mic_virtblk.block_thread, NULL,
 			virtio_block, mic);
 		if (err)
 			mpsslog("%s virtblk pthread_create failed %s\n",
-			mic->name, strerror(err));
+				mic->name, strerror(err));
 		sigemptyset(&act.sa_mask);
 		err = sigaction(SIGUSR1, &act, NULL);
 		if (err)
 			mpsslog("%s sigaction SIGUSR1 failed %s\n",
-			mic->name, strerror(errno));
+				mic->name, strerror(errno));
 		while (1)
 			sleep(60);
 	case -1:
diff --git a/Documentation/mic/mpssd/sysfs.c b/Documentation/mic/mpssd/sysfs.c
index 11de72b..8dd3269 100644
--- a/Documentation/mic/mpssd/sysfs.c
+++ b/Documentation/mic/mpssd/sysfs.c
@@ -35,7 +35,7 @@ readsysfs(char *dir, char *entry)
 		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
 	else
 		snprintf(filename, PATH_MAX,
-			"%s/%s/%s", MICSYSFSDIR, dir, entry);
+			 "%s/%s/%s", MICSYSFSDIR, dir, entry);
 
 	fd = open(filename, O_RDONLY);
 	if (fd < 0) {
@@ -75,7 +75,7 @@ setsysfs(char *dir, char *entry, char *value)
 		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
 	else
 		snprintf(filename, PATH_MAX, "%s/%s/%s",
-			MICSYSFSDIR, dir, entry);
+			 MICSYSFSDIR, dir, entry);
 
 	oldvalue = readsysfs(dir, entry);
 
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 175d114..d0980ff 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -167,7 +167,6 @@ struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
 	return (struct mic_irq *)cookie;
 err:
 	return ERR_PTR(rc);
-
 }
 
 /**
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
index 7420c33..914cc9b 100644
--- a/drivers/misc/mic/card/mic_virtio.c
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -103,7 +103,7 @@ static void mic_finalize_features(struct virtio_device *vdev)
 	for (i = 0; i < bits; i++) {
 		if (test_bit(i, vdev->features))
 			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
-				&out_features[i / 8]);
+				 &out_features[i / 8]);
 	}
 }
 
@@ -197,10 +197,9 @@ static void mic_notify(struct virtqueue *vq)
 static void mic_del_vq(struct virtqueue *vq, int n)
 {
 	struct mic_vdev *mvdev = to_micvdev(vq->vdev);
-	struct vring *vr = (struct vring *) (vq + 1);
+	struct vring *vr = (struct vring *)(vq + 1);
 
-	free_pages((unsigned long) vr->used,
-		get_order(mvdev->used_size[n]));
+	free_pages((unsigned long) vr->used, get_order(mvdev->used_size[n]));
 	vring_del_virtqueue(vq);
 	mic_card_unmap(mvdev->mdev, mvdev->vr[n]);
 	mvdev->vr[n] = NULL;
@@ -274,8 +273,8 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
 	/* Allocate and reassign used ring now */
 	mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
 			sizeof(struct vring_used_elem) * config.num);
-	used = (void *) __get_free_pages(GFP_KERNEL | __GFP_ZERO,
-				get_order(mvdev->used_size[index]));
+	used = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(mvdev->used_size[index]));
 	if (!used) {
 		err = -ENOMEM;
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
@@ -291,7 +290,7 @@ static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
 	 * vring_new_virtqueue() would ensure that
 	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
 	 */
-	vr = (struct vring *) (vq + 1);
+	vr = (struct vring *)(vq + 1);
 	vr->used = used;
 
 	vq->priv = mvdev;
@@ -544,7 +543,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
 		if (dev) {
 			if (remove)
 				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
-					&dc->config_change);
+					 &dc->config_change);
 			put_device(dev);
 			mic_handle_config_change(d, i, mdrv);
 			ret = mic_remove_device(d, i, mdrv);
@@ -559,7 +558,7 @@ static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
 
 		/* new device */
 		dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n",
-				__func__, __LINE__, d);
+			__func__, __LINE__, d);
 		if (!remove)
 			mic_add_device(d, i, mdrv);
 	}
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
index f491c91..2868945 100644
--- a/drivers/misc/mic/card/mic_x100.c
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -66,8 +66,8 @@ void mic_send_intr(struct mic_device *mdev, int doorbell)
 	/* Ensure that the interrupt is ordered w.r.t previous stores. */
 	wmb();
 	mic_mmio_write(mw, MIC_X100_SBOX_SDBIC0_DBREQ_BIT,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		(MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       (MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
 }
 
 /**
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 98a61b4..028ba5d6 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -103,7 +103,7 @@ static int mic_smpt_show(struct seq_file *s, void *pos)
 	unsigned long flags;
 
 	seq_printf(s, "MIC %-2d |%-10s| %-14s %-10s\n",
-		mdev->id, "SMPT entry", "SW DMA addr", "RefCount");
+		   mdev->id, "SMPT entry", "SW DMA addr", "RefCount");
 	seq_puts(s, "====================================================\n");
 
 	if (mdev->smpt) {
@@ -111,8 +111,8 @@ static int mic_smpt_show(struct seq_file *s, void *pos)
 		spin_lock_irqsave(&smpt_info->smpt_lock, flags);
 		for (i = 0; i < smpt_info->info.num_reg; i++) {
 			seq_printf(s, "%9s|%-10d| %-#14llx %-10lld\n",
-				" ",  i, smpt_info->entry[i].dma_addr,
-				smpt_info->entry[i].ref_count);
+				   " ",  i, smpt_info->entry[i].dma_addr,
+				   smpt_info->entry[i].ref_count);
 		}
 		spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
 	}
@@ -203,17 +203,17 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 	int i, j;
 
 	seq_printf(s, "Bootparam: magic 0x%x\n",
-		bootparam->magic);
+		   bootparam->magic);
 	seq_printf(s, "Bootparam: h2c_shutdown_db %d\n",
-		bootparam->h2c_shutdown_db);
+		   bootparam->h2c_shutdown_db);
 	seq_printf(s, "Bootparam: h2c_config_db %d\n",
-		bootparam->h2c_config_db);
+		   bootparam->h2c_config_db);
 	seq_printf(s, "Bootparam: c2h_shutdown_db %d\n",
-		bootparam->c2h_shutdown_db);
+		   bootparam->c2h_shutdown_db);
 	seq_printf(s, "Bootparam: shutdown_status %d\n",
-		bootparam->shutdown_status);
+		   bootparam->shutdown_status);
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
-		bootparam->shutdown_card);
+		   bootparam->shutdown_card);
 
 	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
 	     i += mic_total_desc_size(d)) {
@@ -239,10 +239,10 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 			seq_printf(s, "address 0x%llx ", vqconfig->address);
 			seq_printf(s, "num %d ", vqconfig->num);
 			seq_printf(s, "used address 0x%llx\n",
-				vqconfig->used_address);
+				   vqconfig->used_address);
 		}
 
-		features = (__u32 *) mic_vq_features(d);
+		features = (__u32 *)mic_vq_features(d);
 		seq_printf(s, "Features: Host 0x%x ", features[0]);
 		seq_printf(s, "Guest 0x%x\n", features[1]);
 
@@ -256,7 +256,7 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
 		seq_printf(s, "Host ack %d\n", dc->host_ack);
 		seq_printf(s, "Used address updated %d ",
-			dc->used_address_updated);
+			   dc->used_address_updated);
 		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
 		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
 		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
@@ -294,10 +294,10 @@ static int mic_vdev_info_show(struct seq_file *s, void *unused)
 	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
 		mvdev = list_entry(pos, struct mic_vdev, list);
 		seq_printf(s, "VDEV type %d state %s in %ld out %ld\n",
-			mvdev->virtio_id,
-			mic_vdevup(mvdev) ? "UP" : "DOWN",
-			mvdev->in_bytes,
-			mvdev->out_bytes);
+			   mvdev->virtio_id,
+			   mic_vdevup(mvdev) ? "UP" : "DOWN",
+			   mvdev->in_bytes,
+			   mvdev->out_bytes);
 		for (i = 0; i < MIC_MAX_VRINGS; i++) {
 			struct vring_desc *desc;
 			struct vring_avail *avail;
@@ -309,38 +309,38 @@ static int mic_vdev_info_show(struct seq_file *s, void *unused)
 				continue;
 			desc = vrh->vring.desc;
 			seq_printf(s, "vring i %d avail_idx %d",
-				i, mvr->vring.info->avail_idx & (num - 1));
+				   i, mvr->vring.info->avail_idx & (num - 1));
 			seq_printf(s, " vring i %d avail_idx %d\n",
-				i, mvr->vring.info->avail_idx);
+				   i, mvr->vring.info->avail_idx);
 			seq_printf(s, "vrh i %d weak_barriers %d",
-				i, vrh->weak_barriers);
+				   i, vrh->weak_barriers);
 			seq_printf(s, " last_avail_idx %d last_used_idx %d",
-				vrh->last_avail_idx, vrh->last_used_idx);
+				   vrh->last_avail_idx, vrh->last_used_idx);
 			seq_printf(s, " completed %d\n", vrh->completed);
 			for (j = 0; j < num; j++) {
 				seq_printf(s, "desc[%d] addr 0x%llx len %d",
-					j, desc->addr, desc->len);
+					   j, desc->addr, desc->len);
 				seq_printf(s, " flags 0x%x next %d\n",
-					desc->flags,
-					desc->next);
+					   desc->flags, desc->next);
 				desc++;
 			}
 			avail = vrh->vring.avail;
 			seq_printf(s, "avail flags 0x%x idx %d\n",
-				avail->flags, avail->idx & (num - 1));
+				   avail->flags, avail->idx & (num - 1));
 			seq_printf(s, "avail flags 0x%x idx %d\n",
-				avail->flags, avail->idx);
+				   avail->flags, avail->idx);
 			for (j = 0; j < num; j++)
 				seq_printf(s, "avail ring[%d] %d\n",
-					j, avail->ring[j]);
+					   j, avail->ring[j]);
 			used = vrh->vring.used;
 			seq_printf(s, "used flags 0x%x idx %d\n",
-				used->flags, used->idx & (num - 1));
+				   used->flags, used->idx & (num - 1));
 			seq_printf(s, "used flags 0x%x idx %d\n",
-				used->flags, used->idx);
+				   used->flags, used->idx);
 			for (j = 0; j < num; j++)
 				seq_printf(s, "used ring[%d] id %d len %d\n",
-					j, used->ring[j].id, used->ring[j].len);
+					   j, used->ring[j].id,
+					   used->ring[j].len);
 		}
 	}
 	mutex_unlock(&mdev->mic_mutex);
@@ -389,7 +389,7 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 			reg = mdev->intr_ops->read_msi_to_src_map(mdev, entry);
 
 			seq_printf(s, "%s %-10d %s %-10d MXAR[%d]: %08X\n",
-				"IRQ:", vector, "Entry:", entry, i, reg);
+				   "IRQ:", vector, "Entry:", entry, i, reg);
 
 			seq_printf(s, "%-10s", "offset:");
 			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
@@ -400,8 +400,8 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 			seq_printf(s, "%-10s", "count:");
 			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
 				seq_printf(s, "%4d ",
-				(mdev->irq_info.mic_msi_map[i] & BIT(j)) ?
-					1 : 0);
+					   (mdev->irq_info.mic_msi_map[i] &
+					   BIT(j)) ? 1 : 0);
 			seq_puts(s, "\n\n");
 		}
 	} else {
@@ -409,7 +409,6 @@ static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 	}
 
 	return 0;
-
 }
 
 static int mic_msi_irq_info_debug_open(struct inode *inode, struct file *file)
@@ -443,26 +442,23 @@ void mic_create_debug_dir(struct mic_device *mdev)
 	if (!mdev->dbg_dir)
 		return;
 
-	debugfs_create_file("log_buf", 0444, mdev->dbg_dir,
-		mdev, &log_buf_ops);
+	debugfs_create_file("log_buf", 0444, mdev->dbg_dir, mdev, &log_buf_ops);
 
-	debugfs_create_file("smpt", 0444, mdev->dbg_dir,
-		mdev, &smpt_file_ops);
+	debugfs_create_file("smpt", 0444, mdev->dbg_dir, mdev, &smpt_file_ops);
 
-	debugfs_create_file("soft_reset", 0444, mdev->dbg_dir,
-		mdev, &soft_reset_ops);
+	debugfs_create_file("soft_reset", 0444, mdev->dbg_dir, mdev,
+			    &soft_reset_ops);
 
-	debugfs_create_file("post_code", 0444, mdev->dbg_dir,
-		mdev, &post_code_ops);
+	debugfs_create_file("post_code", 0444, mdev->dbg_dir, mdev,
+			    &post_code_ops);
 
-	debugfs_create_file("dp", 0444, mdev->dbg_dir,
-		mdev, &dp_ops);
+	debugfs_create_file("dp", 0444, mdev->dbg_dir, mdev, &dp_ops);
 
-	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir,
-		mdev, &vdev_info_ops);
+	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir, mdev,
+			    &vdev_info_ops);
 
-	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir,
-		mdev, &msi_irq_info_ops);
+	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir, mdev,
+			    &msi_irq_info_ops);
 }
 
 /**
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
index e699c80..8dc6ff1 100644
--- a/drivers/misc/mic/host/mic_fops.c
+++ b/drivers/misc/mic/host/mic_fops.c
@@ -140,9 +140,9 @@ unsigned int mic_poll(struct file *f, poll_table *wait)
 
 	poll_wait(f, &mvdev->waitq, wait);
 
-	if (mic_vdev_inited(mvdev))
+	if (mic_vdev_inited(mvdev)) {
 		mask = POLLERR;
-	else if (mvdev->poll_wake) {
+	} else if (mvdev->poll_wake) {
 		mvdev->poll_wake = 0;
 		mask = POLLIN | POLLOUT;
 	}
@@ -152,7 +152,7 @@ unsigned int mic_poll(struct file *f, poll_table *wait)
 
 static inline int
 mic_query_offset(struct mic_vdev *mvdev, unsigned long offset,
-	unsigned long *size, unsigned long *pa)
+		 unsigned long *size, unsigned long *pa)
 {
 	struct mic_device *mdev = mvdev->mdev;
 	unsigned long start = MIC_DP_SIZE;
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
index 71a7521..f9c29bc 100644
--- a/drivers/misc/mic/host/mic_intr.c
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -71,8 +71,8 @@ static irqreturn_t mic_interrupt(int irq, void *dev)
 
 /* Return the interrupt offset from the index. Index is 0 based. */
 static u16 mic_map_src_to_offset(struct mic_device *mdev,
-		int intr_src, enum mic_intr_type type) {
-
+		int intr_src, enum mic_intr_type type)
+{
 	if (type >= MIC_NUM_INTR_TYPES)
 		return MIC_NUM_OFFSETS;
 	if (intr_src >= mdev->intr_info->intr_len[type])
@@ -112,7 +112,7 @@ static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
 	struct mic_intr_cb *intr_cb;
 	unsigned long flags;
 	int rc;
-	intr_cb = kmalloc(sizeof(struct mic_intr_cb), GFP_KERNEL);
+	intr_cb = kmalloc(sizeof(*intr_cb), GFP_KERNEL);
 
 	if (!intr_cb)
 		return ERR_PTR(-ENOMEM);
@@ -159,7 +159,7 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 			if (intr_cb->cb_id == idx) {
 				list_del(pos);
 				ida_simple_remove(&mdev->irq_info.cb_ida,
-					intr_cb->cb_id);
+						  intr_cb->cb_id);
 				kfree(intr_cb);
 				spin_unlock_irqrestore(
 					&mdev->irq_info.mic_intr_lock, flags);
@@ -182,9 +182,10 @@ static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
 static int mic_setup_msix(struct mic_device *mdev, struct pci_dev *pdev)
 {
 	int rc, i;
+	int entry_size = sizeof(*mdev->irq_info.msix_entries);
 
-	mdev->irq_info.msix_entries = kmalloc(sizeof(struct msix_entry) *
-			MIC_MIN_MSIX, GFP_KERNEL);
+	mdev->irq_info.msix_entries = kmalloc_array(MIC_MIN_MSIX,
+						    entry_size, GFP_KERNEL);
 	if (!mdev->irq_info.msix_entries) {
 		rc = -ENOMEM;
 		goto err_nomem1;
@@ -231,8 +232,9 @@ static int mic_setup_callbacks(struct mic_device *mdev)
 {
 	int i;
 
-	mdev->irq_info.cb_list = kmalloc(sizeof(struct list_head) *
-		MIC_NUM_OFFSETS, GFP_KERNEL);
+	mdev->irq_info.cb_list = kmalloc_array(MIC_NUM_OFFSETS,
+					       sizeof(*mdev->irq_info.cb_list),
+					       GFP_KERNEL);
 	if (!mdev->irq_info.cb_list)
 		return -ENOMEM;
 
@@ -261,7 +263,7 @@ static void mic_release_callbacks(struct mic_device *mdev)
 
 		if (list_empty(&mdev->irq_info.cb_list[i])) {
 			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
-				flags);
+					       flags);
 			break;
 		}
 
@@ -269,7 +271,7 @@ static void mic_release_callbacks(struct mic_device *mdev)
 			intr_cb = list_entry(pos, struct mic_intr_cb, list);
 			list_del(pos);
 			ida_simple_remove(&mdev->irq_info.cb_ida,
-				intr_cb->cb_id);
+					  intr_cb->cb_id);
 			kfree(intr_cb);
 		}
 		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
@@ -427,8 +429,8 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 	offset = mic_map_src_to_offset(mdev, intr_src, type);
 	if (offset >= MIC_NUM_OFFSETS) {
 		dev_err(mdev->sdev->parent,
-				"Error mapping index %d to a valid source id.\n",
-				intr_src);
+			"Error mapping index %d to a valid source id.\n",
+			intr_src);
 		rc = -EINVAL;
 		goto err;
 	}
@@ -437,7 +439,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 		msix = mic_get_available_vector(mdev);
 		if (!msix) {
 			dev_err(mdev->sdev->parent,
-			"No MSIx vectors available for use.\n");
+				"No MSIx vectors available for use.\n");
 			rc = -ENOSPC;
 			goto err;
 		}
@@ -460,7 +462,7 @@ struct mic_irq *mic_request_irq(struct mic_device *mdev,
 				offset, func, data);
 		if (IS_ERR(intr_cb)) {
 			dev_err(mdev->sdev->parent,
-			"No available callback entries for use\n");
+				"No available callback entries for use\n");
 			rc = PTR_ERR(intr_cb);
 			goto err;
 		}
@@ -506,7 +508,7 @@ void mic_free_irq(struct mic_device *mdev,
 	if (mdev->irq_info.num_vectors > 1) {
 		if (entry >= mdev->irq_info.num_vectors) {
 			dev_warn(mdev->sdev->parent,
-				"entry %d should be < num_irq %d\n",
+				 "entry %d should be < num_irq %d\n",
 				entry, mdev->irq_info.num_vectors);
 			return;
 		}
@@ -581,7 +583,7 @@ void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
 		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
 			if (mdev->irq_info.mic_msi_map[i])
 				dev_warn(&pdev->dev, "irq %d may still be in use.\n",
-					mdev->irq_info.msix_entries[i].vector);
+					 mdev->irq_info.msix_entries[i].vector);
 		}
 		kfree(mdev->irq_info.mic_msi_map);
 		kfree(mdev->irq_info.msix_entries);
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
index 003d02b..fae474c 100644
--- a/drivers/misc/mic/host/mic_smpt.c
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -84,7 +84,7 @@ static void mic_add_smpt_entry(int spt, s64 *ref, u64 addr,
 	for (i = spt; i < spt + entries; i++,
 		addr += smpt_info->info.page_size) {
 		if (!smpt_info->entry[i].ref_count &&
-			(smpt_info->entry[i].dma_addr != addr)) {
+		    (smpt_info->entry[i].dma_addr != addr)) {
 			mdev->smpt_ops->set(mdev, addr, i);
 			smpt_info->entry[i].dma_addr = addr;
 		}
@@ -183,7 +183,7 @@ mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
 
 	if (!mic_is_system_addr(mdev, mic_addr)) {
 		dev_err(mdev->sdev->parent,
-		"mic_addr is invalid. mic_addr = 0x%llx\n", mic_addr);
+			"mic_addr is invalid. mic_addr = 0x%llx\n", mic_addr);
 		return -EINVAL;
 	}
 	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
@@ -286,7 +286,7 @@ void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
 		smpt_info->entry[i].ref_count -= ref[i - spt];
 		if (smpt_info->entry[i].ref_count < 0)
 			dev_warn(mdev->sdev->parent,
-				"ref count for entry %d is negative\n", i);
+				 "ref count for entry %d is negative\n", i);
 	}
 	spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
 	kfree(ref);
@@ -320,7 +320,7 @@ dma_addr_t mic_map_single(struct mic_device *mdev, void *va, size_t size)
 				"mic_map failed dma_addr 0x%llx size 0x%lx\n",
 				dma_addr, size);
 			pci_unmap_single(pdev, dma_addr,
-				size, PCI_DMA_BIDIRECTIONAL);
+					 size, PCI_DMA_BIDIRECTIONAL);
 		}
 	}
 	return mic_addr;
@@ -366,8 +366,8 @@ int mic_smpt_init(struct mic_device *mdev)
 
 	smpt_info = mdev->smpt;
 	mdev->smpt_ops->init(mdev);
-	smpt_info->entry = kmalloc(sizeof(struct mic_smpt)
-			* smpt_info->info.num_reg, GFP_KERNEL);
+	smpt_info->entry = kmalloc_array(smpt_info->info.num_reg,
+					 sizeof(*smpt_info->entry), GFP_KERNEL);
 	if (!smpt_info->entry) {
 		err = -ENOMEM;
 		goto free_smpt;
@@ -412,7 +412,7 @@ void mic_smpt_uninit(struct mic_device *mdev)
 			smpt_info->entry[i].ref_count);
 		if (smpt_info->entry[i].ref_count)
 			dev_warn(mdev->sdev->parent,
-			"ref count for entry %d is not zero\n", i);
+				 "ref count for entry %d is not zero\n", i);
 	}
 	kfree(smpt_info->entry);
 	kfree(smpt_info);
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index 029a4f3..75746ad 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -130,7 +130,7 @@ state_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 state_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	    const char *buf, size_t count)
 {
 	int rc = 0;
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -163,7 +163,7 @@ done:
 static DEVICE_ATTR_RW(state);
 
 static ssize_t shutdown_status_show(struct device *dev,
-	struct device_attribute *attr, char *buf)
+				    struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -193,7 +193,7 @@ cmdline_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 cmdline_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	      const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -239,7 +239,7 @@ firmware_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 firmware_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	       const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -284,7 +284,7 @@ ramdisk_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 ramdisk_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	      const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -330,7 +330,7 @@ bootmode_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 static ssize_t
 bootmode_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+	       const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -363,7 +363,7 @@ static DEVICE_ATTR_RW(bootmode);
 
 static ssize_t
 log_buf_addr_show(struct device *dev, struct device_attribute *attr,
-	char *buf)
+		  char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -375,7 +375,7 @@ log_buf_addr_show(struct device *dev, struct device_attribute *attr,
 
 static ssize_t
 log_buf_addr_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+		   const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	int ret;
@@ -397,7 +397,7 @@ static DEVICE_ATTR_RW(log_buf_addr);
 
 static ssize_t
 log_buf_len_show(struct device *dev, struct device_attribute *attr,
-	char *buf)
+		 char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -409,7 +409,7 @@ log_buf_len_show(struct device *dev, struct device_attribute *attr,
 
 static ssize_t
 log_buf_len_store(struct device *dev, struct device_attribute *attr,
-	const char *buf, size_t count)
+		  const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	int ret;
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
index 9e0456f..0c883cd 100644
--- a/drivers/misc/mic/host/mic_virtio.c
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -195,7 +195,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			MIC_VRINGH_READ, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
+				__func__, __LINE__, ret);
 			break;
 		}
 		len -= out_len;
@@ -206,7 +206,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 			!MIC_VRINGH_READ, &out_len);
 		if (ret) {
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
-					__func__, __LINE__, ret);
+				__func__, __LINE__, ret);
 			break;
 		}
 		len -= out_len;
@@ -225,8 +225,7 @@ static int _mic_virtio_copy(struct mic_vdev *mvdev,
 	 * Update the used ring if a descriptor was available and some data was
 	 * copied in/out and the user asked for a used ring update.
 	 */
-	if (*head != USHRT_MAX && copy->out_len &&
-		copy->update_used) {
+	if (*head != USHRT_MAX && copy->out_len && copy->update_used) {
 		u32 total = 0;
 
 		/* Determine the total data consumed */
@@ -367,7 +366,6 @@ void mic_bh_handler(struct work_struct *work)
 
 static irqreturn_t mic_virtio_intr_handler(int irq, void *data)
 {
-
 	struct mic_vdev *mvdev = data;
 	struct mic_device *mdev = mvdev->mdev;
 
@@ -394,7 +392,7 @@ int mic_virtio_config_change(struct mic_vdev *mvdev,
 	}
 
 	if (copy_from_user(mic_vq_configspace(mvdev->dd),
-				argp, mvdev->dd->config_len)) {
+			   argp, mvdev->dd->config_len)) {
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
 			__func__, __LINE__, -EFAULT);
 		ret = -EFAULT;
@@ -438,8 +436,8 @@ static int mic_copy_dp_entry(struct mic_vdev *mvdev,
 		return -EFAULT;
 	}
 
-	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE
-		|| dd.num_vq > MIC_MAX_VRINGS) {
+	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE ||
+	    dd.num_vq > MIC_MAX_VRINGS) {
 		dev_err(mic_dev(mvdev), "%s %d err %d\n",
 			__func__, __LINE__, -EINVAL);
 		return -EINVAL;
@@ -503,7 +501,7 @@ static void mic_init_device_ctrl(struct mic_vdev *mvdev,
 {
 	struct mic_device_ctrl *dc;
 
-	dc = mvdev->dc = (void *)devpage + mic_aligned_desc_size(devpage);
+	dc = (void *)devpage + mic_aligned_desc_size(devpage);
 
 	dc->config_change = 0;
 	dc->guest_ack = 0;
@@ -512,6 +510,7 @@ static void mic_init_device_ctrl(struct mic_vdev *mvdev,
 	dc->used_address_updated = 0;
 	dc->c2h_vdev_db = -1;
 	dc->h2c_vdev_db = -1;
+	mvdev->dc = dc;
 }
 
 int mic_virtio_add_device(struct mic_vdev *mvdev,
@@ -551,7 +550,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 			sizeof(struct _mic_vring_info));
 		vr->va = (void *)
 			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
-			get_order(vr_size));
+					 get_order(vr_size));
 		if (!vr->va) {
 			ret = -ENOMEM;
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
@@ -564,8 +563,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		vqconfig[i].address = mic_map_single(mdev,
 			vr->va, vr_size);
 		if (mic_map_error(vqconfig[i].address)) {
-			free_pages((unsigned long)vr->va,
-				get_order(vr_size));
+			free_pages((unsigned long)vr->va, get_order(vr_size));
 			ret = -ENOMEM;
 			dev_err(mic_dev(mvdev), "%s %d err %d\n",
 				__func__, __LINE__, ret);
@@ -573,8 +571,7 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 		}
 		vqconfig[i].address = cpu_to_le64(vqconfig[i].address);
 
-		vring_init(&vr->vr, num,
-			vr->va, MIC_VIRTIO_RING_ALIGN);
+		vring_init(&vr->vr, num, vr->va, MIC_VIRTIO_RING_ALIGN);
 		ret = vringh_init_kern(&mvr->vrh,
 			*(u32 *)mic_vq_features(mvdev->dd), num, false,
 			vr->vr.desc, vr->vr.avail, vr->vr.used);
@@ -593,8 +590,8 @@ int mic_virtio_add_device(struct mic_vdev *mvdev,
 			__func__, __LINE__, i, vr->va, vr->info, vr_size);
 	}
 
-	snprintf(irqname, sizeof(irqname),
-		"mic%dvirtio%d", mdev->id, mvdev->virtio_id);
+	snprintf(irqname, sizeof(irqname), "mic%dvirtio%d", mdev->id,
+		 mvdev->virtio_id);
 	mvdev->virtio_db = mic_next_db(mdev);
 	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
 			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
@@ -628,9 +625,9 @@ err:
 	for (j = 0; j < i; j++) {
 		struct mic_vringh *mvr = &mvdev->mvr[j];
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[j].address),
-				mvr->vring.len);
+				 mvr->vring.len);
 		free_pages((unsigned long)mvr->vring.va,
-			get_order(mvr->vring.len));
+			   get_order(mvr->vring.len));
 	}
 	mutex_unlock(&mdev->mic_mutex);
 	return ret;
@@ -676,9 +673,9 @@ skip_hot_remove:
 		vringh_kiov_cleanup(&mvr->riov);
 		vringh_kiov_cleanup(&mvr->wiov);
 		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
-				mvr->vring.len);
+				 mvr->vring.len);
 		free_pages((unsigned long)mvr->vring.va,
-			get_order(mvr->vring.len));
+			   get_order(mvr->vring.len));
 	}
 
 	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index 3a0d660..81e9541 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -46,8 +46,8 @@ mic_x100_write_spad(struct mic_device *mdev, unsigned int idx, u32 val)
 	dev_dbg(mdev->sdev->parent, "Writing 0x%x to scratch pad index %d\n",
 		val, idx);
 	mic_mmio_write(&mdev->mmio, val,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		MIC_X100_SBOX_SPAD0 + idx * 4);
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       MIC_X100_SBOX_SPAD0 + idx * 4);
 }
 
 /**
@@ -130,8 +130,8 @@ static void mic_x100_send_sbox_intr(struct mic_device *mdev,
 {
 	struct mic_mw *mw = &mdev->mmio;
 	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
-	u32 apicicr_low = mic_mmio_read(mw,
-			MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+	u32 apicicr_low = mic_mmio_read(mw, MIC_X100_SBOX_BASE_ADDRESS +
+					apic_icr_offset);
 
 	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
 	apicicr_low = (apicicr_low | (1 << 13));
@@ -139,7 +139,7 @@ static void mic_x100_send_sbox_intr(struct mic_device *mdev,
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(mw, apicicr_low,
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
 }
 
 /**
@@ -153,7 +153,7 @@ static void mic_x100_send_rdmasr_intr(struct mic_device *mdev,
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(&mdev->mmio, 0,
-		MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
 }
 
 /**
@@ -212,7 +212,7 @@ done:
  */
 static void mic_x100_hw_intr_init(struct mic_device *mdev)
 {
-	mdev->intr_info = (struct mic_intr_info *) mic_x100_intr_init;
+	mdev->intr_info = (struct mic_intr_info *)mic_x100_intr_init;
 }
 
 /**
@@ -244,7 +244,7 @@ mic_x100_read_msi_to_src_map(struct mic_device *mdev, int idx)
  */
 static void
 mic_x100_program_msi_to_src_map(struct mic_device *mdev,
-			int idx, int offset, bool set)
+				int idx, int offset, bool set)
 {
 	unsigned long reg;
 	struct mic_mw *mw = &mdev->mmio;
@@ -308,12 +308,12 @@ static void mic_x100_send_firmware_intr(struct mic_device *mdev)
 	apicicr_low = (vector | (1 << 13));
 
 	mic_mmio_write(mw, mic_x100_get_apic_id(mdev),
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset + 4);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset + 4);
 
 	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
 	wmb();
 	mic_mmio_write(mw, apicicr_low,
-		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+		       MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
 }
 
 /**
@@ -365,8 +365,7 @@ mic_x100_load_command_line(struct mic_device *mdev, const struct firmware *fw)
 	len += snprintf(buf, CMDLINE_SIZE - len,
 		" mem=%dM", boot_mem);
 	if (mdev->cmdline)
-		snprintf(buf + len, CMDLINE_SIZE - len,
-				" %s", mdev->cmdline);
+		snprintf(buf + len, CMDLINE_SIZE - len, " %s", mdev->cmdline);
 	memcpy_toio(cmd_line_va, buf, strlen(buf) + 1);
 	kfree(buf);
 	return 0;
@@ -397,8 +396,7 @@ mic_x100_load_ramdisk(struct mic_device *mdev)
 	 * Typically the bootaddr for card OS is 64M
 	 * so copy over the ramdisk @ 128M.
 	 */
-	memcpy_toio(mdev->aper.va + (mdev->bootaddr << 1),
-		fw->data, fw->size);
+	memcpy_toio(mdev->aper.va + (mdev->bootaddr << 1), fw->data, fw->size);
 	iowrite32(cpu_to_le32(mdev->bootaddr << 1), &bp->hdr.ramdisk_image);
 	iowrite32(cpu_to_le32(fw->size), &bp->hdr.ramdisk_size);
 	release_firmware(fw);
@@ -484,8 +482,7 @@ mic_x100_load_firmware(struct mic_device *mdev, const char *buf)
 	if (mdev->ramdisk)
 		rc = mic_x100_load_ramdisk(mdev);
 error:
-	dev_dbg(mdev->sdev->parent, "%s %d rc %d\n",
-			__func__, __LINE__, rc);
+	dev_dbg(mdev->sdev->parent, "%s %d rc %d\n", __func__, __LINE__, rc);
 done:
 	return rc;
 }
@@ -524,8 +521,8 @@ mic_x100_smpt_set(struct mic_device *mdev, dma_addr_t dma_addr, u8 index)
 	uint32_t smpt_reg_val = BUILD_SMPT(SNOOP_ON,
 			dma_addr >> mdev->smpt->info.page_shift);
 	mic_mmio_write(&mdev->mmio, smpt_reg_val,
-		MIC_X100_SBOX_BASE_ADDRESS +
-		MIC_X100_SBOX_SMPT00 + (4 * index));
+		       MIC_X100_SBOX_BASE_ADDRESS +
+		       MIC_X100_SBOX_SMPT00 + (4 * index));
 }
 
 /**
-- 
1.8.2.1



* [PATCH CHAR-MISC-NEXT 1/4] Intel MIC Host Driver Sysfs Cleanups.
  2013-09-27  1:09 96% [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups Sudeep Dutt
@ 2013-09-27  1:09 75% ` Sudeep Dutt
  2013-09-27  1:10 67% ` [PATCH CHAR-MISC-NEXT 2/4] Intel MIC Driver Header File Cleanups Sudeep Dutt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27  1:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joe Perches, linux-kernel, linux-doc
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

+ Use DEVICE_ATTR_RO/RW instead of DEVICE_ATTR
+ Use ATTRIBUTE_GROUPS

These changes were suggested by Greg Kroah-Hartman during the
code review @ https://lkml.org/lkml/2013/9/6/13
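
As a rough illustration of why the show/store handlers are renamed in this
patch: DEVICE_ATTR_RO(family) builds the handler name from the attribute
name, so it expects a function called family_show. The userspace sketch
below mimics that token pasting and the ATTRIBUTE_GROUPS pattern. It is not
kernel code: struct attr, struct attr_group, ATTR_RO and ATTR_GROUPS are
hypothetical stand-ins, and the show() signature is simplified.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct device_attribute. */
struct attr {
	const char *name;
	unsigned int mode;
	int (*show)(char *buf, size_t len);
};

/* Hypothetical stand-in for struct attribute_group. */
struct attr_group {
	struct attr **attrs;
};

/*
 * Read-only attribute: the name string, the 0444 mode and the
 * <name>_show handler are all derived from the single macro argument.
 */
#define ATTR_RO(_name) \
	struct attr attr_##_name = { \
		.name = #_name, .mode = 0444, .show = _name##_show }

/*
 * Builds <name>_group and the NULL-terminated <name>_groups array from
 * an existing <name>_attrs array.
 */
#define ATTR_GROUPS(_name) \
	static const struct attr_group _name##_group = { \
		.attrs = _name##_attrs, \
	}; \
	static const struct attr_group *_name##_groups[] = { \
		&_name##_group, \
		NULL, \
	}

/* Must be named family_show for ATTR_RO(family) to find it. */
static int family_show(char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", "x100");
}
static ATTR_RO(family);

static struct attr *mic_default_attrs[] = {
	&attr_family,
	NULL
};
ATTR_GROUPS(mic_default);
```

With these definitions, mic_default_groups is what a caller would assign to
its attribute-group pointer, much as mic_sysfs_init() does in the diff
below.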

Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/misc/mic/host/mic_sysfs.c | 72 ++++++++++++++++-----------------------
 1 file changed, 29 insertions(+), 43 deletions(-)

diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index aaf8499..bebbae2 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -65,7 +65,7 @@ void mic_set_state(struct mic_device *mdev, u8 state)
 }
 
 static ssize_t
-mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
+family_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	static const char x100[] = "x100";
 	static const char unknown[] = "Unknown";
@@ -85,10 +85,10 @@ mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
 	}
 	return scnprintf(buf, PAGE_SIZE, "%s\n", card);
 }
-static DEVICE_ATTR(family, S_IRUGO, mic_show_family, NULL);
+static DEVICE_ATTR_RO(family);
 
 static ssize_t
-mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
+stepping_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *string = "??";
@@ -114,10 +114,10 @@ mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
 	}
 	return scnprintf(buf, PAGE_SIZE, "%s\n", string);
 }
-static DEVICE_ATTR(stepping, S_IRUGO, mic_show_stepping, NULL);
+static DEVICE_ATTR_RO(stepping);
 
 static ssize_t
-mic_show_state(struct device *dev, struct device_attribute *attr, char *buf)
+state_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 
@@ -129,7 +129,7 @@ mic_show_state(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_state(struct device *dev, struct device_attribute *attr,
+state_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	int rc = 0;
@@ -160,9 +160,9 @@ mic_store_state(struct device *dev, struct device_attribute *attr,
 done:
 	return count;
 }
-static DEVICE_ATTR(state, S_IRUGO|S_IWUSR, mic_show_state, mic_store_state);
+static DEVICE_ATTR_RW(state);
 
-static ssize_t mic_show_shutdown_status(struct device *dev,
+static ssize_t shutdown_status_show(struct device *dev,
 	struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -173,11 +173,10 @@ static ssize_t mic_show_shutdown_status(struct device *dev,
 	return scnprintf(buf, PAGE_SIZE, "%s\n",
 		mic_shutdown_status_string[mdev->shutdown_status]);
 }
-static DEVICE_ATTR(shutdown_status, S_IRUGO|S_IWUSR,
-	mic_show_shutdown_status, NULL);
+static DEVICE_ATTR_RO(shutdown_status);
 
 static ssize_t
-mic_show_cmdline(struct device *dev, struct device_attribute *attr, char *buf)
+cmdline_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *cmdline;
@@ -193,7 +192,7 @@ mic_show_cmdline(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_cmdline(struct device *dev, struct device_attribute *attr,
+cmdline_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -220,11 +219,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(cmdline, S_IRUGO | S_IWUSR,
-	mic_show_cmdline, mic_store_cmdline);
+static DEVICE_ATTR_RW(cmdline);
 
 static ssize_t
-mic_show_firmware(struct device *dev, struct device_attribute *attr, char *buf)
+firmware_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *firmware;
@@ -240,7 +238,7 @@ mic_show_firmware(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_firmware(struct device *dev, struct device_attribute *attr,
+firmware_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -266,11 +264,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(firmware, S_IRUGO | S_IWUSR,
-	mic_show_firmware, mic_store_firmware);
+static DEVICE_ATTR_RW(firmware);
 
 static ssize_t
-mic_show_ramdisk(struct device *dev, struct device_attribute *attr, char *buf)
+ramdisk_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *ramdisk;
@@ -286,7 +283,7 @@ mic_show_ramdisk(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_ramdisk(struct device *dev, struct device_attribute *attr,
+ramdisk_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -313,11 +310,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(ramdisk, S_IRUGO | S_IWUSR,
-	mic_show_ramdisk, mic_store_ramdisk);
+static DEVICE_ATTR_RW(ramdisk);
 
 static ssize_t
-mic_show_bootmode(struct device *dev, struct device_attribute *attr, char *buf)
+bootmode_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
 	char *bootmode;
@@ -333,7 +329,7 @@ mic_show_bootmode(struct device *dev, struct device_attribute *attr, char *buf)
 }
 
 static ssize_t
-mic_store_bootmode(struct device *dev, struct device_attribute *attr,
+bootmode_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -363,11 +359,10 @@ unlock:
 	mutex_unlock(&mdev->mic_mutex);
 	return count;
 }
-static DEVICE_ATTR(bootmode, S_IRUGO | S_IWUSR,
-	mic_show_bootmode, mic_store_bootmode);
+static DEVICE_ATTR_RW(bootmode);
 
 static ssize_t
-mic_show_log_buf_addr(struct device *dev, struct device_attribute *attr,
+log_buf_addr_show(struct device *dev, struct device_attribute *attr,
 	char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -379,7 +374,7 @@ mic_show_log_buf_addr(struct device *dev, struct device_attribute *attr,
 }
 
 static ssize_t
-mic_store_log_buf_addr(struct device *dev, struct device_attribute *attr,
+log_buf_addr_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -398,11 +393,10 @@ mic_store_log_buf_addr(struct device *dev, struct device_attribute *attr,
 exit:
 	return ret;
 }
-static DEVICE_ATTR(log_buf_addr, S_IRUGO | S_IWUSR,
-	mic_show_log_buf_addr, mic_store_log_buf_addr);
+static DEVICE_ATTR_RW(log_buf_addr);
 
 static ssize_t
-mic_show_log_buf_len(struct device *dev, struct device_attribute *attr,
+log_buf_len_show(struct device *dev, struct device_attribute *attr,
 	char *buf)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -414,7 +408,7 @@ mic_show_log_buf_len(struct device *dev, struct device_attribute *attr,
 }
 
 static ssize_t
-mic_store_log_buf_len(struct device *dev, struct device_attribute *attr,
+log_buf_len_store(struct device *dev, struct device_attribute *attr,
 	const char *buf, size_t count)
 {
 	struct mic_device *mdev = dev_get_drvdata(dev->parent);
@@ -433,8 +427,7 @@ mic_store_log_buf_len(struct device *dev, struct device_attribute *attr,
 exit:
 	return ret;
 }
-static DEVICE_ATTR(log_buf_len, S_IRUGO | S_IWUSR,
-	mic_show_log_buf_len, mic_store_log_buf_len);
+static DEVICE_ATTR_RW(log_buf_len);
 
 static struct attribute *mic_default_attrs[] = {
 	&dev_attr_family.attr,
@@ -451,16 +444,9 @@ static struct attribute *mic_default_attrs[] = {
 	NULL
 };
 
-static struct attribute_group mic_attr_group = {
-	.attrs = mic_default_attrs,
-};
-
-static const struct attribute_group *__mic_attr_group[] = {
-	&mic_attr_group,
-	NULL
-};
+ATTRIBUTE_GROUPS(mic_default);
 
 void mic_sysfs_init(struct mic_device *mdev)
 {
-	mdev->attr_group = __mic_attr_group;
+	mdev->attr_group = mic_default_groups;
 }
-- 
1.8.2.1



* [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups.
@ 2013-09-27  1:09 96% Sudeep Dutt
  2013-09-27  1:09 75% ` [PATCH CHAR-MISC-NEXT 1/4] Intel MIC Host Driver Sysfs Cleanups Sudeep Dutt
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Sudeep Dutt @ 2013-09-27  1:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joe Perches, linux-kernel, linux-doc
  Cc: Nikhil Rao, Ashutosh Dixit, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

These patches address code review feedback on the patch series posted
at https://lkml.org/lkml/2013/9/5/561 from:
a) Greg Kroah-Hartman on sysfs and header file cleanups.
b) Joe Perches on issues found by "--strict" checkpatch.
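
For readers unfamiliar with the kmalloc_array() conversions that are part
of the "--strict" cleanup: the point of passing the element count and the
element size separately is that the multiplication can be checked for
overflow before anything is allocated. A minimal userspace sketch of that
idea follows. alloc_array() is a hypothetical name and malloc() stands in
for the kernel allocator, so this shows the technique only and is not the
kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace analogue of an overflow-checked array allocation: refuse the
 * request when n * size would not fit in size_t, instead of silently
 * allocating a buffer that is too small.
 */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}
```

An open-coded malloc(n * size) with the same oversized arguments would wrap
around and could return a short buffer; the checked helper returns NULL
instead, which the existing -ENOMEM paths in the driver already handle.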

Ashutosh Dixit (1):
  Intel MIC Driver Cleanups for "--strict" checkpatch.

Sudeep Dutt (3):
  Intel MIC Host Driver Sysfs Cleanups.
  Intel MIC Driver Header File Cleanups.
  Intel MIC Header File warning fix.

 Documentation/mic/mpssd/mpssd.c      | 65 +++++++++++++------------
 Documentation/mic/mpssd/sysfs.c      |  4 +-
 drivers/misc/mic/card/mic_debugfs.c  |  2 +-
 drivers/misc/mic/card/mic_device.c   |  3 +-
 drivers/misc/mic/card/mic_virtio.c   | 19 ++++----
 drivers/misc/mic/card/mic_x100.c     |  6 +--
 drivers/misc/mic/common/mic_dev.h    | 51 +++++++++++++++++++
 drivers/misc/mic/common/mic_device.h | 51 -------------------
 drivers/misc/mic/host/mic_boot.c     |  2 +-
 drivers/misc/mic/host/mic_debugfs.c  | 92 +++++++++++++++++------------------
 drivers/misc/mic/host/mic_fops.c     |  8 +--
 drivers/misc/mic/host/mic_intr.c     | 36 +++++++-------
 drivers/misc/mic/host/mic_main.c     |  2 +-
 drivers/misc/mic/host/mic_smpt.c     | 16 +++---
 drivers/misc/mic/host/mic_sysfs.c    | 94 +++++++++++++++---------------------
 drivers/misc/mic/host/mic_virtio.c   | 39 +++++++--------
 drivers/misc/mic/host/mic_x100.c     | 35 ++++++--------
 include/uapi/linux/mic_ioctl.h       |  2 +
 18 files changed, 253 insertions(+), 274 deletions(-)
 create mode 100644 drivers/misc/mic/common/mic_dev.h
 delete mode 100644 drivers/misc/mic/common/mic_device.h

-- 
1.8.2.1



* Re: [PATCH RESEND v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
  @ 2013-09-26 21:40 99%       ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-26 21:40 UTC (permalink / raw)
  To: Joe Perches
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He, Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-26 at 14:33 -0700, Joe Perches wrote:
> On Thu, 2013-09-26 at 13:54 -0700, Greg Kroah-Hartman wrote:
> > On Thu, Sep 05, 2013 at 06:36:08PM -0700, Joe Perches wrote:
> > > Whitespace neatening...
> > > 
> > > Multiline statement argument alignment.
> > > Argument wrapping.
> > > Use kmalloc_array instead of kmalloc.
> > > 
> > > ---
> > 
> > -ENOSIGNEDOFFBY :(
> 
> These were just suggestions for Sudeep.
> 
> Sudeep said he'd incorporate and resend the series.
> 
> Sudeep?  Were you ever going to do that?
> 

We were waiting for Greg to apply the base patches to the char-misc-next
tree, which happened today. We will soon send out the cleanup patch
series, which addresses all feedback.

Thanks,
Sudeep Dutt



* Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
  @ 2013-09-06 22:00 99%         ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-06 22:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

On Fri, 2013-09-06 at 12:04 -0700, Greg Kroah-Hartman wrote:
> On Fri, Sep 06, 2013 at 11:41:03AM -0700, Sudeep Dutt wrote:
> > On Thu, 2013-09-05 at 22:01 -0700, Greg Kroah-Hartman wrote:
> > > On Thu, Sep 05, 2013 at 04:41:55PM -0700, Sudeep Dutt wrote:
> > > > +What:		/sys/class/mic/mic(x)/firmware
> > > > +Date:		August 2013
> > > > +KernelVersion:	3.11
> > > > +Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
> > > > +Description:
> > > > +		When read, this sysfs entry provides the path name under
> > > > +		/lib/firmware/ where the firmware image to be booted on the
> > > > +		card can be found. The entry can be written to change the
> > > > +		firmware image location under /lib/firmware/.
> > > 
> > > I don't understand, is the path under the HOST device, or the Client
> > > device's disk?  Why do you need to change the path on the HOST?  What's
> > > wrong with the existing firmware path selection we have in the kernel?
> > > 
> > 
> > The path is on the host. The card does not have a physical persistent
> > disk device. Our customers like the flexibility of changing the card
> > firmware/ramdisk contents and file names for individual MIC cards. This
> > flexibility is not possible with a static set of firmware file names in
> > the kernel for all cards.
> > 
> > Once the firmware/ramdisk path under /lib/firmware/ is set up via sysfs,
> > card boot is initiated via the "state" sysfs entry. The host driver then
> > obtains the contents of the firmware and ramdisk via the standard
> > request_firmware(..) interface, copies the contents to card memory and
> > interrupts the card BIOS to initiate boot.
> 
> So this is really a "filename" that might contain some directories as
> well, right?  The fact you used "path" confused me, as that doesn't
> usually imply a filename.
> 

Yes, it is a filename that might contain some directories. We will fix
up the documentation here to say "filename" in future patches.

> And is the "firmware" just the initramfs image for the kernel to boot?
> 

The firmware is usually a Linux kernel. The ramdisk is usually an
initramfs image. We have separate sysfs entries for firmware and ramdisk
filenames.

Thanks,
Sudeep Dutt

> thanks,
> 
> greg k-h




* Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
  @ 2013-09-06 18:41 98%     ` Sudeep Dutt
    0 siblings, 1 reply; 200+ results
From: Sudeep Dutt @ 2013-09-06 18:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-05 at 22:01 -0700, Greg Kroah-Hartman wrote:
> On Thu, Sep 05, 2013 at 04:41:55PM -0700, Sudeep Dutt wrote:
> > +What:		/sys/class/mic/mic(x)/firmware
> > +Date:		August 2013
> > +KernelVersion:	3.11
> > +Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
> > +Description:
> > +		When read, this sysfs entry provides the path name under
> > +		/lib/firmware/ where the firmware image to be booted on the
> > +		card can be found. The entry can be written to change the
> > +		firmware image location under /lib/firmware/.
> 
> I don't understand, is the path under the HOST device, or the Client
> device's disk?  Why do you need to change the path on the HOST?  What's
> wrong with the existing firmware path selection we have in the kernel?
> 

The path is on the host. The card does not have a physical persistent
disk device. Our customers like the flexibility of changing the card
firmware/ramdisk contents and file names for individual MIC cards. This
flexibility is not possible with a static set of firmware file names in
the kernel for all cards.

Once the firmware/ramdisk path under /lib/firmware/ is set up via sysfs,
card boot is initiated via the "state" sysfs entry. The host driver then
obtains the contents of the firmware and ramdisk via the standard
request_firmware(..) interface, copies the contents to card memory and
interrupts the card BIOS to initiate boot.
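
[Editorial sketch, not part of the original message.] The sequence
described above (set the firmware and ramdisk file names, then trigger
boot through "state") could be driven from user space roughly as
follows. The attribute names "firmware", "ramdisk" and "state" and the
value "boot" come from the ABI documentation in this patch; the helper
names, the 512-byte path buffer and the example file names are
assumptions made for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Write value to <dir>/<attr>. Returns 0 on success, -1 on failure. */
int mic_write_attr(const char *dir, const char *attr, const char *value)
{
	char path[512];	/* assumed upper bound for this sketch */
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the first line of <dir>/<attr> into buf, without the newline. */
int mic_read_attr(const char *dir, const char *attr, char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Set the image names (relative to /lib/firmware/) and then request
 * boot. The order matters: "boot" uses whatever names are set when
 * it is written.
 */
int mic_boot_card(const char *dir, const char *firmware, const char *ramdisk)
{
	if (mic_write_attr(dir, "firmware", firmware))
		return -1;
	if (mic_write_attr(dir, "ramdisk", ramdisk))
		return -1;
	return mic_write_attr(dir, "state", "boot");
}
```

A caller would pass the per-card directory, for example
mic_boot_card("/sys/class/mic/mic0", "mic/uos.img", "mic/initramfs.img"),
where both image names are hypothetical.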

Thanks,
Sudeep Dutt




* Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
  @ 2013-09-06 18:30 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-06 18:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-05 at 22:00 -0700, Greg Kroah-Hartman wrote:
> On Thu, Sep 05, 2013 at 04:41:55PM -0700, Sudeep Dutt wrote:
> > +What:		/sys/class/mic/mic(x)/cmdline
> > +Date:		August 2013
> > +KernelVersion:	3.11
> > +Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
> > +Description:
> > +		An Intel MIC device runs a Linux OS during its operation. Before
> > +		booting this card OS, it is possible to pass kernel command line
> > +		options to configure various features in it, similar to
> > +		self-bootable machines. When read, this entry provides
> > +		information about the current kernel command line options set to
> > +		boot the card OS. This entry can be written to change the
> > +		existing kernel command line options. Typically, the user would
> > +		want to read the current command line options, append new ones
> > +		or modify existing ones and then write the whole kernel command
> > +		line back to this entry.
> 
> Is a PAGE_SIZE value going to be big enough for your command line?  I
> know some embedded systems have horribly long command lines, hopefully
> this will be enough for you.
> 

Yes, PAGE_SIZE is more than sufficient for our command line.
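
[Editorial sketch, not part of the original message.] The
read-modify-write cycle described in the quoted ABI text can be kept
within a fixed limit by checking the length before appending. The
function name cmdline_append and the capacity parameter are
assumptions; a caller would typically pass the page size (commonly
4096 bytes) as the capacity.

```c
#include <stddef.h>
#include <string.h>

/*
 * Append one option to a NUL-terminated command line held in a buffer
 * of cap bytes. A trailing newline (as returned by a sysfs read) is
 * dropped first. Returns 0 on success, or -1 if the result would not
 * fit, in which case the buffer is left unchanged.
 */
int cmdline_append(char *cmdline, size_t cap, const char *opt)
{
	size_t cur = strlen(cmdline);
	size_t add = strlen(opt);
	size_t sep;

	while (cur && cmdline[cur - 1] == '\n')
		cur--;
	sep = cur ? 1 : 0;	/* a separating space only if non-empty */

	if (cur + sep + add + 1 > cap)
		return -1;
	if (sep)
		cmdline[cur++] = ' ';
	memcpy(cmdline + cur, opt, add + 1);	/* copies the NUL too */
	return 0;
}
```

The whole resulting string is then written back to the cmdline entry
in a single write, as the documentation suggests.
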
Thanks,
Sudeep Dutt



* Re: [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
  @ 2013-09-06 18:29 99%     ` Sudeep Dutt
  0 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-06 18:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-05 at 21:58 -0700, Greg Kroah-Hartman wrote:
> Again, very minor fixups for later (I can even do them...)
> 
> > +static DEVICE_ATTR(state, S_IRUGO|S_IWUSR, mic_show_state, mic_store_state);
> 
> DEVICE_ATTR_RW() please.
> 
> Same for the other attributes you create in this patch.
> 

Sure, we will incorporate these changes along with your other feedback
in patch 1 and post the next revision of this patch series.
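
[Editorial sketch, not part of the original message.] Switching to
DEVICE_ATTR_RW() also implies renaming the callbacks, because the
kernel macro derives them from the attribute name: DEVICE_ATTR_RW(state)
expects state_show() and state_store() and uses mode 0644. The
user-space mock below only illustrates that naming convention. The
struct mock_attr type, the MOCK_DEVICE_ATTR_RW macro and the simplified
callback signatures are assumptions and are not the kernel's
definitions.

```c
#include <stdio.h>
#include <string.h>

/* User-space stand-in for the kernel's struct device_attribute. */
struct mock_attr {
	const char *name;
	unsigned int mode;
	int (*show)(char *buf, size_t len);
	int (*store)(const char *buf);
};

/* Same naming rule as the kernel macro: <name>_show / <name>_store. */
#define MOCK_DEVICE_ATTR_RW(_name)				\
	struct mock_attr mock_attr_##_name = {			\
		.name = #_name, .mode = 0644,			\
		.show = _name##_show, .store = _name##_store }

static char state_buf[32] = "offline";

/* Formerly named in the mic_show_state style; renamed to fit the macro. */
static int state_show(char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", state_buf);
}

static int state_store(const char *buf)
{
	size_t n = strcspn(buf, "\n");	/* ignore a trailing newline */

	if (n >= sizeof(state_buf))
		return -1;
	memcpy(state_buf, buf, n);
	state_buf[n] = '\0';
	return 0;
}

MOCK_DEVICE_ATTR_RW(state);
```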

Thanks for the review!
Sudeep Dutt





* Re: [PATCH RESEND v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
  @ 2013-09-06 18:27 99%   ` Sudeep Dutt
    1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-06 18:27 UTC (permalink / raw)
  To: Joe Perches
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He, Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

On Thu, 2013-09-05 at 18:36 -0700, Joe Perches wrote:
> Whitespace neatening...
> 
> Multiline statement argument alignment.
> Argument wrapping.
> Use kmalloc_array instead of kmalloc.
> 
> ---

Hi Joe,

We will incorporate your changes in the next version of this patch
series and run all future patches via the "--strict" checkpatch option.
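
[Editorial sketch, not part of the original message.] The
kmalloc_array suggestion matters because kmalloc_array() refuses a
request whose element count times element size would overflow, whereas
an open-coded kmalloc(n * size, ...) silently allocates too little.
The user-space analogue below shows the same check. The name
checked_array_alloc is an assumption, and malloc() stands in for the
kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of size bytes each, returning NULL
 * instead of wrapping around when n * size does not fit in size_t.
 */
void *checked_array_alloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* multiplication would overflow */
	return malloc(n * size);
}
```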

Thanks a lot for the patch!
Sudeep Dutt



* [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
  2013-09-05 23:41 44% ` [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family Sudeep Dutt
  2013-09-05 23:41 28% ` [PATCH RESEND v3 2/7] Intel MIC Host Driver Interrupt/SMPT support Sudeep Dutt
@ 2013-09-05 23:41 28% ` Sudeep Dutt
                       ` (2 more replies)
  2013-09-05 23:42 42% ` [PATCH RESEND v3 4/7] Intel MIC Card Driver for X100 family Sudeep Dutt
                   ` (4 subsequent siblings)
  7 siblings, 3 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

This patch enables the following features:
a) Boots and shuts down the card via sysfs entries.
b) Allocates and maps a device page for communication with the
   card driver and updates the device page address via scratchpad
   registers.
c) Provides sysfs entries for shutdown status, kernel command line,
   ramdisk and log buffer information.

Co-author: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
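[Editorial sketch, not part of the original patch.] Feature (b) passes
the device page DMA address to the card through two scratchpad
registers (MIC_DPLO_SPAD and MIC_DPHI_SPAD in this patch). Assuming
each scratchpad register holds 32 bits and the address is a 64-bit
value, the split and the reassembly on the receiving side can be
sketched as below. The helper names are hypothetical, and the
reassembly step is an assumption about the card side, which this patch
does not show.

```c
#include <stdint.h>

/* Low 32 bits, as would be written to the "LO" scratchpad register. */
static inline uint32_t dp_addr_lo(uint64_t addr)
{
	return (uint32_t)addr;
}

/* High 32 bits, as would be written to the "HI" scratchpad register. */
static inline uint32_t dp_addr_hi(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/* Rebuild the 64-bit address from the two register values. */
static inline uint64_t dp_addr_join(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

The shift by 32 is only well defined because the operand is 64 bits
wide; with a 32-bit address type the high half would have to be
written as zero instead.
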
 Documentation/ABI/testing/sysfs-class-mic.txt | 113 ++++++++
 drivers/misc/mic/common/mic_device.h          |   7 +
 drivers/misc/mic/host/Makefile                |   2 +
 drivers/misc/mic/host/mic_boot.c              | 184 +++++++++++++
 drivers/misc/mic/host/mic_debugfs.c           | 355 +++++++++++++++++++++++++
 drivers/misc/mic/host/mic_device.h            |  60 +++++
 drivers/misc/mic/host/mic_main.c              | 129 ++++++++-
 drivers/misc/mic/host/mic_sysfs.c             | 369 ++++++++++++++++++++++++++
 drivers/misc/mic/host/mic_x100.c              | 251 ++++++++++++++++++
 drivers/misc/mic/host/mic_x100.h              |  12 +
 include/uapi/linux/Kbuild                     |   1 +
 include/uapi/linux/mic_common.h               |  74 ++++++
 12 files changed, 1553 insertions(+), 4 deletions(-)
 create mode 100644 drivers/misc/mic/host/mic_boot.c
 create mode 100644 drivers/misc/mic/host/mic_debugfs.c
 create mode 100644 include/uapi/linux/mic_common.h

diff --git a/Documentation/ABI/testing/sysfs-class-mic.txt b/Documentation/ABI/testing/sysfs-class-mic.txt
index 09eb3c6..82cdad3 100644
--- a/Documentation/ABI/testing/sysfs-class-mic.txt
+++ b/Documentation/ABI/testing/sysfs-class-mic.txt
@@ -32,3 +32,116 @@ Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
 Description:
 		Provides information about the silicon stepping for an Intel
 		MIC device. For example - "A0" or "B0"
+
+What:		/sys/class/mic/mic(x)/state
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		When read, this entry provides the current state of an Intel
+		MIC device in the context of the card OS. Possible values that
+		will be read are:
+		"offline" - The MIC device is ready to boot the card OS.
+		"online" - The MIC device has initiated booting a card OS.
+		"shutting_down" - The card OS is shutting down.
+		"reset_failed" - The MIC device has failed to reset.
+
+		When written, this sysfs entry triggers different state change
+		operations depending upon the current state of the card OS.
+		Acceptable values are:
+		"boot" - Boot the card OS image specified by the combination
+			of firmware, ramdisk, cmdline and bootmode
+			sysfs entries.
+		"reset" - Initiates device reset.
+		"shutdown" - Initiates card OS shutdown.
+
+What:		/sys/class/mic/mic(x)/shutdown_status
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		An Intel MIC device runs a Linux OS during its operation. This
+		OS can shut down for various reasons. When read, this entry
+		provides the reason why the card OS was shut down.
+		Possible values are:
+		"nop" - Shutdown status is not applicable, when the card OS is
+			"online"
+		"crashed" - Shutdown because of a HW or SW crash.
+		"halted" - Shutdown because of a halt command.
+		"poweroff" - Shutdown because of a poweroff command.
+		"restart" - Shutdown because of a restart command.
+
+What:		/sys/class/mic/mic(x)/cmdline
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		An Intel MIC device runs a Linux OS during its operation. Before
+		booting this card OS, it is possible to pass kernel command line
+		options to configure various features in it, similar to
+		self-bootable machines. When read, this entry provides
+		information about the current kernel command line options set to
+		boot the card OS. This entry can be written to change the
+		existing kernel command line options. Typically, the user would
+		want to read the current command line options, append new ones
+		or modify existing ones and then write the whole kernel command
+		line back to this entry.
+
+What:		/sys/class/mic/mic(x)/firmware
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		When read, this sysfs entry provides the path name under
+		/lib/firmware/ where the firmware image to be booted on the
+		card can be found. The entry can be written to change the
+		firmware image location under /lib/firmware/.
+
+What:		/sys/class/mic/mic(x)/ramdisk
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		When read, this sysfs entry provides the path name under
+		/lib/firmware/ where the ramdisk image to be used during card
+		OS boot can be found. The entry can be written to change
+		the ramdisk image location under /lib/firmware/.
+
+What:		/sys/class/mic/mic(x)/bootmode
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		When read, this sysfs entry provides the current bootmode for
+		the card. This sysfs entry can be written with the following
+		valid strings:
+		a) linux - Boot a Linux image.
+		b) elf - Boot an elf image for flash updates.
+
+What:		/sys/class/mic/mic(x)/log_buf_addr
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		An Intel MIC device runs a Linux OS during its operation. For
+		debugging purposes and early kernel boot messages, the user can
+		access the card OS log buffer via debugfs. When read, this entry
+		provides the kernel virtual address of the buffer where the card
+		OS log buffer can be read. This entry is written by the host
+		configuration daemon to set the log buffer address. The correct
+		log buffer address to be written can be found in the System.map
+		file of the card OS.
+
+What:		/sys/class/mic/mic(x)/log_buf_len
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		An Intel MIC device runs a Linux OS during its operation. For
+		debugging purposes and early kernel boot messages, the user can
+		access the card OS log buffer via debugfs. When read, this entry
+		provides the kernel virtual address where the card OS log buffer
+		length can be read. This entry is written by host configuration
+		daemon to set the log buffer length address. The correct log
+		buffer length address to be written can be found in the
+		System.map file of the card OS.
diff --git a/drivers/misc/mic/common/mic_device.h b/drivers/misc/mic/common/mic_device.h
index f02262e..6440e9d 100644
--- a/drivers/misc/mic/common/mic_device.h
+++ b/drivers/misc/mic/common/mic_device.h
@@ -34,4 +34,11 @@ struct mic_mw {
 	resource_size_t len;
 };
 
+/*
+ * Scratch pad register offsets used by the host to communicate
+ * device page DMA address to the card.
+ */
+#define MIC_DPLO_SPAD 14
+#define MIC_DPHI_SPAD 15
+
 #endif
diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
index 6ff5550..a375dd3 100644
--- a/drivers/misc/mic/host/Makefile
+++ b/drivers/misc/mic/host/Makefile
@@ -8,3 +8,5 @@ mic_host-objs += mic_x100.o
 mic_host-objs += mic_sysfs.o
 mic_host-objs += mic_smpt.o
 mic_host-objs += mic_intr.o
+mic_host-objs += mic_boot.o
+mic_host-objs += mic_debugfs.o
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
new file mode 100644
index 0000000..936fc58
--- /dev/null
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -0,0 +1,184 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/delay.h>
+#include <linux/firmware.h>
+#include <linux/interrupt.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_smpt.h"
+
+/**
+ * mic_reset - Reset the MIC device.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_reset(struct mic_device *mdev)
+{
+	int i;
+
+#define MIC_RESET_TO (45)
+
+	mdev->ops->reset_fw_ready(mdev);
+	mdev->ops->reset(mdev);
+
+	for (i = 0; i < MIC_RESET_TO; i++) {
+		if (mdev->ops->is_fw_ready(mdev))
+			return;
+		/*
+		 * Resets typically take tens of seconds to complete.
+		 * Since an MMIO read is required to check if the
+		 * firmware is ready or not, a 1 second delay works nicely.
+		 */
+		msleep(1000);
+	}
+	mic_set_state(mdev, MIC_RESET_FAILED);
+}
+
+/* Initialize the MIC bootparams */
+void mic_bootparam_init(struct mic_device *mdev)
+{
+	struct mic_bootparam *bootparam = mdev->dp;
+
+	bootparam->magic = MIC_MAGIC;
+	bootparam->c2h_shutdown_db = mdev->shutdown_db;
+	bootparam->h2c_shutdown_db = -1;
+	bootparam->h2c_config_db = -1;
+	bootparam->shutdown_status = 0;
+	bootparam->shutdown_card = 0;
+}
+
+/**
+ * mic_start - Start the MIC.
+ * @mdev: pointer to mic_device instance
+ * @buf: buffer containing boot string including firmware/ramdisk path.
+ *
+ * This function prepares an MIC for boot and initiates boot.
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+int mic_start(struct mic_device *mdev, const char *buf)
+{
+	int rc;
+	mutex_lock(&mdev->mic_mutex);
+retry:
+	if (MIC_OFFLINE != mdev->state) {
+		rc = -EINVAL;
+		goto unlock_ret;
+	}
+	if (!mdev->ops->is_fw_ready(mdev)) {
+		mic_reset(mdev);
+		/*
+		 * The state will either be MIC_OFFLINE if the reset succeeded
+		 * or MIC_RESET_FAILED if the firmware reset failed.
+		 */
+		goto retry;
+	}
+	rc = mdev->ops->load_mic_fw(mdev, buf);
+	if (rc)
+		goto unlock_ret;
+	mic_smpt_restore(mdev);
+	mic_intr_restore(mdev);
+	mdev->intr_ops->enable_interrupts(mdev);
+	mdev->ops->write_spad(mdev, MIC_DPLO_SPAD, mdev->dp_dma_addr);
+	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
+	mdev->ops->send_firmware_intr(mdev);
+	mic_set_state(mdev, MIC_ONLINE);
+unlock_ret:
+	mutex_unlock(&mdev->mic_mutex);
+	return rc;
+}
+
+/**
+ * mic_stop - Prepare the MIC for reset and trigger reset.
+ * @mdev: pointer to mic_device instance
+ * @force: force a MIC to reset even if it is already offline.
+ *
+ * RETURNS: None.
+ */
+void mic_stop(struct mic_device *mdev, bool force)
+{
+	mutex_lock(&mdev->mic_mutex);
+	if (MIC_OFFLINE != mdev->state || force) {
+		mic_bootparam_init(mdev);
+		mic_reset(mdev);
+		if (MIC_RESET_FAILED == mdev->state)
+			goto unlock;
+		mic_set_shutdown_status(mdev, MIC_NOP);
+		mic_set_state(mdev, MIC_OFFLINE);
+	}
+unlock:
+	mutex_unlock(&mdev->mic_mutex);
+}
+
+/**
+ * mic_shutdown - Initiate MIC shutdown.
+ * @mdev: pointer to mic_device instance
+ *
+ * RETURNS: None.
+ */
+void mic_shutdown(struct mic_device *mdev)
+{
+	struct mic_bootparam *bootparam = mdev->dp;
+	s8 db = bootparam->h2c_shutdown_db;
+
+	mutex_lock(&mdev->mic_mutex);
+	if (MIC_ONLINE == mdev->state && db != -1) {
+		bootparam->shutdown_card = 1;
+		mdev->ops->send_intr(mdev, db);
+		mic_set_state(mdev, MIC_SHUTTING_DOWN);
+	}
+	mutex_unlock(&mdev->mic_mutex);
+}
+
+/**
+ * mic_shutdown_work - Handle shutdown interrupt from MIC.
+ * @work: The work structure.
+ *
+ * This work is scheduled whenever the host has received a shutdown
+ * interrupt from the MIC.
+ */
+void mic_shutdown_work(struct work_struct *work)
+{
+	struct mic_device *mdev = container_of(work, struct mic_device,
+			shutdown_work);
+	struct mic_bootparam *bootparam = mdev->dp;
+
+	mutex_lock(&mdev->mic_mutex);
+	mic_set_shutdown_status(mdev, bootparam->shutdown_status);
+	bootparam->shutdown_status = 0;
+	if (MIC_SHUTTING_DOWN != mdev->state)
+		mic_set_state(mdev, MIC_SHUTTING_DOWN);
+	mutex_unlock(&mdev->mic_mutex);
+}
+
+/**
+ * mic_reset_trigger_work - Trigger MIC reset.
+ * @work: The work structure.
+ *
+ * This work is scheduled whenever the host wants to reset the MIC.
+ */
+void mic_reset_trigger_work(struct work_struct *work)
+{
+	struct mic_device *mdev = container_of(work, struct mic_device,
+			reset_trigger_work);
+
+	mic_stop(mdev, false);
+}
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
new file mode 100644
index 0000000..78541d4
--- /dev/null
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -0,0 +1,355 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/pci.h>
+#include <linux/seq_file.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_smpt.h"
+
+/* Debugfs parent dir */
+static struct dentry *mic_dbg;
+
+/**
+ * mic_log_buf_show - Display MIC kernel log buffer.
+ *
+ * log_buf addr/len is read from System.map by user space
+ * and populated in sysfs entries.
+ */
+static int mic_log_buf_show(struct seq_file *s, void *unused)
+{
+	void __iomem *log_buf_va;
+	int __iomem *log_buf_len_va;
+	struct mic_device *mdev = s->private;
+	void *kva;
+	int size;
+	unsigned long aper_offset;
+
+	if (!mdev || !mdev->log_buf_addr || !mdev->log_buf_len)
+		goto done;
+	/*
+	 * Card kernel will never be relocated and any kernel text/data mapping
+	 * can be translated to phys address by subtracting __START_KERNEL_map.
+	 */
+	aper_offset = (unsigned long)mdev->log_buf_len - __START_KERNEL_map;
+	log_buf_len_va = mdev->aper.va + aper_offset;
+	aper_offset = (unsigned long)mdev->log_buf_addr - __START_KERNEL_map;
+	log_buf_va = mdev->aper.va + aper_offset;
+	size = ioread32(log_buf_len_va);
+
+	kva = kmalloc(size, GFP_KERNEL);
+	if (!kva)
+		goto done;
+	mutex_lock(&mdev->mic_mutex);
+	memcpy_fromio(kva, log_buf_va, size);
+	switch (mdev->state) {
+	case MIC_ONLINE:
+		/* Fall through */
+	case MIC_SHUTTING_DOWN:
+		seq_write(s, kva, size);
+		break;
+	default:
+		break;
+	}
+	mutex_unlock(&mdev->mic_mutex);
+	kfree(kva);
+done:
+	return 0;
+}
+
+static int mic_log_buf_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_log_buf_show, inode->i_private);
+}
+
+static int mic_log_buf_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations log_buf_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_log_buf_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_log_buf_release
+};
+
+static int mic_smpt_show(struct seq_file *s, void *pos)
+{
+	int i;
+	struct mic_device *mdev = s->private;
+	unsigned long flags;
+
+	seq_printf(s, "MIC %-2d |%-10s| %-14s %-10s\n",
+		mdev->id, "SMPT entry", "SW DMA addr", "RefCount");
+	seq_puts(s, "====================================================\n");
+
+	if (mdev->smpt) {
+		struct mic_smpt_info *smpt_info = mdev->smpt;
+		spin_lock_irqsave(&smpt_info->smpt_lock, flags);
+		for (i = 0; i < smpt_info->info.num_reg; i++) {
+			seq_printf(s, "%9s|%-10d| %-#14llx %-10lld\n",
+				" ",  i, smpt_info->entry[i].dma_addr,
+				smpt_info->entry[i].ref_count);
+		}
+		spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
+	}
+	seq_puts(s, "====================================================\n");
+	return 0;
+}
+
+static int mic_smpt_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_smpt_show, inode->i_private);
+}
+
+static int mic_smpt_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations smpt_file_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_smpt_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_smpt_debug_release
+};
+
+static int mic_soft_reset_show(struct seq_file *s, void *pos)
+{
+	struct mic_device *mdev = s->private;
+
+	mic_stop(mdev, true);
+	return 0;
+}
+
+static int mic_soft_reset_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_soft_reset_show, inode->i_private);
+}
+
+static int mic_soft_reset_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations soft_reset_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_soft_reset_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_soft_reset_debug_release
+};
+
+static int mic_post_code_show(struct seq_file *s, void *pos)
+{
+	struct mic_device *mdev = s->private;
+	u32 reg = mdev->ops->get_postcode(mdev);
+
+	seq_printf(s, "%c%c", reg & 0xff, (reg >> 8) & 0xff);
+	return 0;
+}
+
+static int mic_post_code_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_post_code_show, inode->i_private);
+}
+
+static int mic_post_code_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations post_code_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_post_code_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_post_code_debug_release
+};
+
+static int mic_dp_show(struct seq_file *s, void *pos)
+{
+	struct mic_device *mdev = s->private;
+	struct mic_bootparam *bootparam = mdev->dp;
+
+	seq_printf(s, "Bootparam: magic 0x%x\n",
+		bootparam->magic);
+	seq_printf(s, "Bootparam: h2c_shutdown_db %d\n",
+		bootparam->h2c_shutdown_db);
+	seq_printf(s, "Bootparam: h2c_config_db %d\n",
+		bootparam->h2c_config_db);
+	seq_printf(s, "Bootparam: c2h_shutdown_db %d\n",
+		bootparam->c2h_shutdown_db);
+	seq_printf(s, "Bootparam: shutdown_status %d\n",
+		bootparam->shutdown_status);
+	seq_printf(s, "Bootparam: shutdown_card %d\n",
+		bootparam->shutdown_card);
+
+	return 0;
+}
+
+static int mic_dp_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_dp_show, inode->i_private);
+}
+
+static int mic_dp_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations dp_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_dp_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_dp_debug_release
+};
+
+static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
+{
+	struct mic_device *mdev  = s->private;
+	int reg;
+	int i, j;
+	u16 entry;
+	u16 vector;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+
+	if (pci_dev_msi_enabled(pdev)) {
+		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
+			if (pdev->msix_enabled) {
+				entry = mdev->irq_info.msix_entries[i].entry;
+				vector = mdev->irq_info.msix_entries[i].vector;
+			} else {
+				entry = 0;
+				vector = pdev->irq;
+			}
+
+			reg = mdev->intr_ops->read_msi_to_src_map(mdev, entry);
+
+			seq_printf(s, "%s %-10d %s %-10d MXAR[%d]: %08X\n",
+				"IRQ:", vector, "Entry:", entry, i, reg);
+
+			seq_printf(s, "%-10s", "offset:");
+			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
+				seq_printf(s, "%4d ", j);
+			seq_puts(s, "\n");
+
+
+			seq_printf(s, "%-10s", "count:");
+			for (j = (MIC_NUM_OFFSETS - 1); j >= 0; j--)
+				seq_printf(s, "%4d ",
+				(mdev->irq_info.mic_msi_map[i] & BIT(j)) ?
+					1 : 0);
+			seq_puts(s, "\n\n");
+		}
+	} else {
+		seq_puts(s, "MSI/MSIx interrupts not enabled\n");
+	}
+
+	return 0;
+
+}
+
+static int mic_msi_irq_info_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_msi_irq_info_show, inode->i_private);
+}
+
+static int
+mic_msi_irq_info_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations msi_irq_info_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_msi_irq_info_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_msi_irq_info_debug_release
+};
+
+/**
+ * mic_create_debug_dir - Initialize MIC debugfs entries.
+ */
+void mic_create_debug_dir(struct mic_device *mdev)
+{
+	if (!mic_dbg)
+		return;
+
+	mdev->dbg_dir = debugfs_create_dir(dev_name(mdev->sdev), mic_dbg);
+	if (!mdev->dbg_dir)
+		return;
+
+	debugfs_create_file("log_buf", 0444, mdev->dbg_dir,
+		mdev, &log_buf_ops);
+
+	debugfs_create_file("smpt", 0444, mdev->dbg_dir,
+		mdev, &smpt_file_ops);
+
+	debugfs_create_file("soft_reset", 0444, mdev->dbg_dir,
+		mdev, &soft_reset_ops);
+
+	debugfs_create_file("post_code", 0444, mdev->dbg_dir,
+		mdev, &post_code_ops);
+
+	debugfs_create_file("dp", 0444, mdev->dbg_dir,
+		mdev, &dp_ops);
+
+	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir,
+		mdev, &msi_irq_info_ops);
+}
+
+/**
+ * mic_delete_debug_dir - Uninitialize MIC debugfs entries.
+ */
+void mic_delete_debug_dir(struct mic_device *mdev)
+{
+	if (!mdev->dbg_dir)
+		return;
+
+	debugfs_remove_recursive(mdev->dbg_dir);
+}
+
+/**
+ * mic_init_debugfs - Initialize global debugfs entry.
+ */
+void __init mic_init_debugfs(void)
+{
+	mic_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!mic_dbg)
+		pr_err("can't create debugfs dir\n");
+}
+
+/**
+ * mic_exit_debugfs - Uninitialize global debugfs entry
+ */
+void mic_exit_debugfs(void)
+{
+	debugfs_remove(mic_dbg);
+}
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index dd34b65..50b8b88 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -63,6 +63,23 @@ enum mic_stepping {
  * @smpt: MIC SMPT information.
  * @intr_info: H/W specific interrupt information.
  * @irq_info: The OS specific irq information
+ * @dbg_dir: debugfs directory of this MIC device.
+ * @cmdline: Kernel command line.
+ * @firmware: Firmware file name.
+ * @ramdisk: Ramdisk file name.
+ * @bootmode: Boot mode i.e. "linux" or "elf" for flash updates.
+ * @bootaddr: MIC boot address.
+ * @reset_trigger_work: Work for triggering reset requests.
+ * @shutdown_work: Work for handling shutdown interrupts.
+ * @state: MIC state.
+ * @shutdown_status: MIC status reported by card for shutdown/crashes.
+ * @state_sysfs: Sysfs dirent for notifying ring 3 about MIC state changes.
+ * @log_buf_addr: Log buffer address for MIC.
+ * @log_buf_len: Log buffer length address for MIC.
+ * @dp: virtio device page
+ * @dp_dma_addr: virtio device page DMA address.
+ * @shutdown_db: shutdown doorbell.
+ * @shutdown_cookie: shutdown cookie.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -79,6 +96,23 @@ struct mic_device {
 	struct mic_smpt_info *smpt;
 	struct mic_intr_info *intr_info;
 	struct mic_irq_info irq_info;
+	struct dentry *dbg_dir;
+	char *cmdline;
+	char *firmware;
+	char *ramdisk;
+	char *bootmode;
+	u32 bootaddr;
+	struct work_struct reset_trigger_work;
+	struct work_struct shutdown_work;
+	u8 state;
+	u8 shutdown_status;
+	struct sysfs_dirent *state_sysfs;
+	void *log_buf_addr;
+	int *log_buf_len;
+	void *dp;
+	dma_addr_t dp_dma_addr;
+	int shutdown_db;
+	struct mic_irq *shutdown_cookie;
 };
 
 /**
@@ -90,6 +124,13 @@ struct mic_device {
  * @send_intr: Send an interrupt for a particular doorbell on the card.
  * @ack_interrupt: Hardware specific operations to ack the h/w on
  * receipt of an interrupt.
+ * @reset: Reset the remote processor.
+ * @reset_fw_ready: Reset firmware ready field.
+ * @is_fw_ready: Check if firmware is ready for OS download.
+ * @send_firmware_intr: Send an interrupt to the card firmware.
+ * @load_mic_fw: Load firmware segments required to boot the card
+ * into card memory. This includes the kernel, command line, ramdisk etc.
+ * @get_postcode: Get post code status from firmware.
  */
 struct mic_hw_ops {
 	u8 aper_bar;
@@ -98,6 +139,12 @@ struct mic_hw_ops {
 	void (*write_spad)(struct mic_device *mdev, unsigned int idx, u32 val);
 	void (*send_intr)(struct mic_device *mdev, int doorbell);
 	u32 (*ack_interrupt)(struct mic_device *mdev);
+	void (*reset)(struct mic_device *mdev);
+	void (*reset_fw_ready)(struct mic_device *mdev);
+	bool (*is_fw_ready)(struct mic_device *mdev);
+	void (*send_firmware_intr)(struct mic_device *mdev);
+	int (*load_mic_fw)(struct mic_device *mdev, const char *buf);
+	u32 (*get_postcode)(struct mic_device *mdev);
 };
 
 /**
@@ -127,4 +174,17 @@ mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
 }
 
 void mic_sysfs_init(struct mic_device *mdev);
+int mic_start(struct mic_device *mdev, const char *buf);
+void mic_stop(struct mic_device *mdev, bool force);
+void mic_shutdown(struct mic_device *mdev);
+void mic_reset_delayed_work(struct work_struct *work);
+void mic_reset_trigger_work(struct work_struct *work);
+void mic_shutdown_work(struct work_struct *work);
+void mic_bootparam_init(struct mic_device *mdev);
+void mic_set_state(struct mic_device *mdev, u8 state);
+void mic_set_shutdown_status(struct mic_device *mdev, u8 status);
+void mic_create_debug_dir(struct mic_device *dev);
+void mic_delete_debug_dir(struct mic_device *dev);
+void __init mic_init_debugfs(void);
+void mic_exit_debugfs(void);
 #endif
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 332a15e..998a20a 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/pci.h>
 
+#include <linux/mic_common.h>
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_x100.h"
@@ -63,6 +64,60 @@ static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
 
+/* Initialize the device page */
+static int mic_dp_init(struct mic_device *mdev)
+{
+	mdev->dp = kzalloc(MIC_DP_SIZE, GFP_KERNEL);
+	if (!mdev->dp) {
+		dev_err(mdev->sdev->parent, "%s %d err %d\n",
+			__func__, __LINE__, -ENOMEM);
+		return -ENOMEM;
+	}
+
+	mdev->dp_dma_addr = mic_map_single(mdev,
+		mdev->dp, MIC_DP_SIZE);
+	if (mic_map_error(mdev->dp_dma_addr)) {
+		kfree(mdev->dp);
+		dev_err(mdev->sdev->parent, "%s %d err %d\n",
+			__func__, __LINE__, -ENOMEM);
+		return -ENOMEM;
+	}
+	mdev->ops->write_spad(mdev, MIC_DPLO_SPAD, mdev->dp_dma_addr);
+	mdev->ops->write_spad(mdev, MIC_DPHI_SPAD, mdev->dp_dma_addr >> 32);
+	return 0;
+}
+
+/* Uninitialize the device page */
+static void mic_dp_uninit(struct mic_device *mdev)
+{
+	mic_unmap_single(mdev, mdev->dp_dma_addr, MIC_DP_SIZE);
+	kfree(mdev->dp);
+}
+
+/**
+ * mic_shutdown_db - Shutdown doorbell interrupt handler.
+ */
+static irqreturn_t mic_shutdown_db(int irq, void *data)
+{
+	struct mic_device *mdev = data;
+	struct mic_bootparam *bootparam = mdev->dp;
+
+	mdev->ops->ack_interrupt(mdev);
+
+	switch (bootparam->shutdown_status) {
+	case MIC_HALTED:
+	case MIC_POWER_OFF:
+	case MIC_RESTART:
+		/* Fall through */
+	case MIC_CRASHED:
+		schedule_work(&mdev->shutdown_work);
+		break;
+	default:
+		break;
+	}
+	return IRQ_HANDLED;
+}
+
 /**
  * mic_ops_init: Initialize HW specific operation tables.
  *
@@ -136,6 +191,26 @@ mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
 	mic_sysfs_init(mdev);
 	mutex_init(&mdev->mic_mutex);
 	mdev->irq_info.next_avail_src = 0;
+	INIT_WORK(&mdev->reset_trigger_work, mic_reset_trigger_work);
+	INIT_WORK(&mdev->shutdown_work, mic_shutdown_work);
+}
+
+/**
+ * mic_device_uninit - Frees resources allocated during mic_device_init(..)
+ *
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none
+ */
+static void mic_device_uninit(struct mic_device *mdev)
+{
+	/* The cmdline sysfs entry might have allocated cmdline */
+	kfree(mdev->cmdline);
+	kfree(mdev->firmware);
+	kfree(mdev->ramdisk);
+	kfree(mdev->bootmode);
+	flush_work(&mdev->reset_trigger_work);
+	flush_work(&mdev->shutdown_work);
 }
 
 /**
@@ -170,7 +245,7 @@ static int mic_probe(struct pci_dev *pdev,
 	rc = pci_enable_device(pdev);
 	if (rc) {
 		dev_err(&pdev->dev, "failed to enable pci device.\n");
-		goto ida_remove;
+		goto uninit_device;
 	}
 
 	pci_set_master(pdev);
@@ -228,7 +303,40 @@ static int mic_probe(struct pci_dev *pdev,
 			"device_create_with_groups failed rc %d\n", rc);
 		goto smpt_uninit;
 	}
+	mdev->state_sysfs = sysfs_get_dirent(mdev->sdev->kobj.sd,
+		NULL, "state");
+	if (!mdev->state_sysfs) {
+		rc = -ENODEV;
+		dev_err(&pdev->dev, "sysfs_get_dirent failed rc %d\n", rc);
+		goto destroy_device;
+	}
+
+	rc = mic_dp_init(mdev);
+	if (rc) {
+		dev_err(&pdev->dev, "mic_dp_init failed rc %d\n", rc);
+		goto sysfs_put;
+	}
+	mutex_lock(&mdev->mic_mutex);
+
+	mdev->shutdown_db = mic_next_db(mdev);
+	mdev->shutdown_cookie = mic_request_irq(mdev, mic_shutdown_db,
+		"shutdown-interrupt", mdev, mdev->shutdown_db, MIC_INTR_DB);
+	if (IS_ERR(mdev->shutdown_cookie)) {
+		rc = PTR_ERR(mdev->shutdown_cookie);
+		mutex_unlock(&mdev->mic_mutex);
+		goto dp_uninit;
+	}
+	mutex_unlock(&mdev->mic_mutex);
+	mic_bootparam_init(mdev);
+
+	mic_create_debug_dir(mdev);
 	return 0;
+dp_uninit:
+	mic_dp_uninit(mdev);
+sysfs_put:
+	sysfs_put(mdev->state_sysfs);
+destroy_device:
+	device_destroy(g_mic_class, MKDEV(MAJOR(g_mic_devno), mdev->id));
 smpt_uninit:
 	mic_smpt_uninit(mdev);
 free_interrupts:
@@ -241,7 +349,8 @@ release_regions:
 	pci_release_regions(pdev);
 disable_device:
 	pci_disable_device(pdev);
-ida_remove:
+uninit_device:
+	mic_device_uninit(mdev);
 	ida_simple_remove(&g_mic_ida, mdev->id);
 ida_fail:
 	kfree(mdev);
@@ -265,11 +374,20 @@ static void mic_remove(struct pci_dev *pdev)
 	if (!mdev)
 		return;
 
+	mic_stop(mdev, false);
+	mic_delete_debug_dir(mdev);
+	mutex_lock(&mdev->mic_mutex);
+	mic_free_irq(mdev, mdev->shutdown_cookie, mdev);
+	mutex_unlock(&mdev->mic_mutex);
+	flush_work(&mdev->shutdown_work);
+	mic_dp_uninit(mdev);
+	sysfs_put(mdev->state_sysfs);
 	device_destroy(g_mic_class, MKDEV(MAJOR(g_mic_devno), mdev->id));
 	mic_smpt_uninit(mdev);
 	mic_free_interrupts(mdev, pdev);
 	iounmap(mdev->mmio.va);
 	iounmap(mdev->aper.va);
+	mic_device_uninit(mdev);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 	ida_simple_remove(&g_mic_ida, mdev->id);
@@ -300,14 +418,16 @@ static int __init mic_init(void)
 		goto cleanup_chrdev;
 	}
 
+	mic_init_debugfs();
 	ida_init(&g_mic_ida);
 	ret = pci_register_driver(&mic_driver);
 	if (ret) {
 		pr_err("pci_register_driver failed ret %d\n", ret);
-		goto class_destroy;
+		goto cleanup_debugfs;
 	}
 	return ret;
-class_destroy:
+cleanup_debugfs:
+	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 cleanup_chrdev:
 	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
@@ -319,6 +439,7 @@ static void __exit mic_exit(void)
 {
 	pci_unregister_driver(&mic_driver);
 	ida_destroy(&g_mic_ida);
+	mic_exit_debugfs();
 	class_destroy(g_mic_class);
 	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
 }
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
index 972c182..aaf8499 100644
--- a/drivers/misc/mic/host/mic_sysfs.c
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -20,9 +20,50 @@
  */
 #include <linux/pci.h>
 
+#include <linux/mic_common.h>
 #include "../common/mic_device.h"
 #include "mic_device.h"
 
+/*
+ * A state-to-string lookup table, for exposing a human readable state
+ * via sysfs. Always keep in sync with enum mic_states
+ */
+static const char * const mic_state_string[] = {
+	[MIC_OFFLINE] = "offline",
+	[MIC_ONLINE] = "online",
+	[MIC_SHUTTING_DOWN] = "shutting_down",
+	[MIC_RESET_FAILED] = "reset_failed",
+};
+
+/*
+ * A shutdown-status-to-string lookup table, for exposing a human
+ * readable state via sysfs. Always keep in sync with enum mic_status
+ */
+static const char * const mic_shutdown_status_string[] = {
+	[MIC_NOP] = "nop",
+	[MIC_CRASHED] = "crashed",
+	[MIC_HALTED] = "halted",
+	[MIC_POWER_OFF] = "poweroff",
+	[MIC_RESTART] = "restart",
+};
+
+void mic_set_shutdown_status(struct mic_device *mdev, u8 shutdown_status)
+{
+	dev_dbg(mdev->sdev->parent, "Shutdown Status %s -> %s\n",
+		mic_shutdown_status_string[mdev->shutdown_status],
+		mic_shutdown_status_string[shutdown_status]);
+	mdev->shutdown_status = shutdown_status;
+}
+
+void mic_set_state(struct mic_device *mdev, u8 state)
+{
+	dev_dbg(mdev->sdev->parent, "State %s -> %s\n",
+		mic_state_string[mdev->state],
+		mic_state_string[state]);
+	mdev->state = state;
+	sysfs_notify_dirent(mdev->state_sysfs);
+}
+
 static ssize_t
 mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
 {
@@ -75,9 +116,337 @@ mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
 }
 static DEVICE_ATTR(stepping, S_IRUGO, mic_show_stepping, NULL);
 
+static ssize_t
+mic_show_state(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev || mdev->state >= MIC_LAST)
+		return -EINVAL;
+
+	return scnprintf(buf, PAGE_SIZE, "%s\n",
+		mic_state_string[mdev->state]);
+}
+
+static ssize_t
+mic_store_state(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	int rc = 0;
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	if (!mdev)
+		return -EINVAL;
+	if (sysfs_streq(buf, "boot")) {
+		rc = mic_start(mdev, buf);
+		if (rc) {
+			dev_err(mdev->sdev->parent,
+				"mic_start failed rc %d\n", rc);
+			count = rc;
+		}
+		goto done;
+	}
+
+	if (sysfs_streq(buf, "reset")) {
+		schedule_work(&mdev->reset_trigger_work);
+		goto done;
+	}
+
+	if (sysfs_streq(buf, "shutdown")) {
+		mic_shutdown(mdev);
+		goto done;
+	}
+
+	count = -EINVAL;
+done:
+	return count;
+}
+static DEVICE_ATTR(state, S_IRUGO|S_IWUSR, mic_show_state, mic_store_state);
+
+static ssize_t mic_show_shutdown_status(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev || mdev->shutdown_status >= MIC_STATUS_LAST)
+		return -EINVAL;
+
+	return scnprintf(buf, PAGE_SIZE, "%s\n",
+		mic_shutdown_status_string[mdev->shutdown_status]);
+}
+static DEVICE_ATTR(shutdown_status, S_IRUGO,
+	mic_show_shutdown_status, NULL);
+
+static ssize_t
+mic_show_cmdline(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	char *cmdline;
+
+	if (!mdev)
+		return -EINVAL;
+
+	cmdline = mdev->cmdline;
+
+	if (cmdline)
+		return scnprintf(buf, PAGE_SIZE, "%s\n", cmdline);
+	return 0;
+}
+
+static ssize_t
+mic_store_cmdline(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	mutex_lock(&mdev->mic_mutex);
+	kfree(mdev->cmdline);
+
+	mdev->cmdline = kmalloc(count + 1, GFP_KERNEL);
+	if (!mdev->cmdline) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+
+	strncpy(mdev->cmdline, buf, count);
+
+	if (mdev->cmdline[count - 1] == '\n')
+		mdev->cmdline[count - 1] = '\0';
+	else
+		mdev->cmdline[count] = '\0';
+unlock:
+	mutex_unlock(&mdev->mic_mutex);
+	return count;
+}
+static DEVICE_ATTR(cmdline, S_IRUGO | S_IWUSR,
+	mic_show_cmdline, mic_store_cmdline);
+
+static ssize_t
+mic_show_firmware(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	char *firmware;
+
+	if (!mdev)
+		return -EINVAL;
+
+	firmware = mdev->firmware;
+
+	if (firmware)
+		return scnprintf(buf, PAGE_SIZE, "%s\n", firmware);
+	return 0;
+}
+
+static ssize_t
+mic_store_firmware(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	mutex_lock(&mdev->mic_mutex);
+	kfree(mdev->firmware);
+
+	mdev->firmware = kmalloc(count + 1, GFP_KERNEL);
+	if (!mdev->firmware) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+	strncpy(mdev->firmware, buf, count);
+
+	if (mdev->firmware[count - 1] == '\n')
+		mdev->firmware[count - 1] = '\0';
+	else
+		mdev->firmware[count] = '\0';
+unlock:
+	mutex_unlock(&mdev->mic_mutex);
+	return count;
+}
+static DEVICE_ATTR(firmware, S_IRUGO | S_IWUSR,
+	mic_show_firmware, mic_store_firmware);
+
+static ssize_t
+mic_show_ramdisk(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	char *ramdisk;
+
+	if (!mdev)
+		return -EINVAL;
+
+	ramdisk = mdev->ramdisk;
+
+	if (ramdisk)
+		return scnprintf(buf, PAGE_SIZE, "%s\n", ramdisk);
+	return 0;
+}
+
+static ssize_t
+mic_store_ramdisk(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	mutex_lock(&mdev->mic_mutex);
+	kfree(mdev->ramdisk);
+
+	mdev->ramdisk = kmalloc(count + 1, GFP_KERNEL);
+	if (!mdev->ramdisk) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+
+	strncpy(mdev->ramdisk, buf, count);
+
+	if (mdev->ramdisk[count - 1] == '\n')
+		mdev->ramdisk[count - 1] = '\0';
+	else
+		mdev->ramdisk[count] = '\0';
+unlock:
+	mutex_unlock(&mdev->mic_mutex);
+	return count;
+}
+static DEVICE_ATTR(ramdisk, S_IRUGO | S_IWUSR,
+	mic_show_ramdisk, mic_store_ramdisk);
+
+static ssize_t
+mic_show_bootmode(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	char *bootmode;
+
+	if (!mdev)
+		return -EINVAL;
+
+	bootmode = mdev->bootmode;
+
+	if (bootmode)
+		return scnprintf(buf, PAGE_SIZE, "%s\n", bootmode);
+	return 0;
+}
+
+static ssize_t
+mic_store_bootmode(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	if (!sysfs_streq(buf, "linux") && !sysfs_streq(buf, "elf"))
+		return -EINVAL;
+
+	mutex_lock(&mdev->mic_mutex);
+	kfree(mdev->bootmode);
+
+	mdev->bootmode = kmalloc(count + 1, GFP_KERNEL);
+	if (!mdev->bootmode) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+
+	strncpy(mdev->bootmode, buf, count);
+
+	if (mdev->bootmode[count - 1] == '\n')
+		mdev->bootmode[count - 1] = '\0';
+	else
+		mdev->bootmode[count] = '\0';
+unlock:
+	mutex_unlock(&mdev->mic_mutex);
+	return count;
+}
+static DEVICE_ATTR(bootmode, S_IRUGO | S_IWUSR,
+	mic_show_bootmode, mic_store_bootmode);
+
+static ssize_t
+mic_show_log_buf_addr(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	return scnprintf(buf, PAGE_SIZE, "%p\n", mdev->log_buf_addr);
+}
+
+static ssize_t
+mic_store_log_buf_addr(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	int ret;
+	unsigned long addr;
+
+	if (!mdev)
+		return -EINVAL;
+
+	ret = kstrtoul(buf, 16, &addr);
+	if (ret)
+		goto exit;
+
+	mdev->log_buf_addr = (void *)addr;
+	ret = count;
+exit:
+	return ret;
+}
+static DEVICE_ATTR(log_buf_addr, S_IRUGO | S_IWUSR,
+	mic_show_log_buf_addr, mic_store_log_buf_addr);
+
+static ssize_t
+mic_show_log_buf_len(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	return scnprintf(buf, PAGE_SIZE, "%p\n", mdev->log_buf_len);
+}
+
+static ssize_t
+mic_store_log_buf_len(struct device *dev, struct device_attribute *attr,
+	const char *buf, size_t count)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	int ret;
+	unsigned long addr;
+
+	if (!mdev)
+		return -EINVAL;
+
+	ret = kstrtoul(buf, 16, &addr);
+	if (ret)
+		goto exit;
+
+	mdev->log_buf_len = (int *)addr;
+	ret = count;
+exit:
+	return ret;
+}
+static DEVICE_ATTR(log_buf_len, S_IRUGO | S_IWUSR,
+	mic_show_log_buf_len, mic_store_log_buf_len);
+
 static struct attribute *mic_default_attrs[] = {
 	&dev_attr_family.attr,
 	&dev_attr_stepping.attr,
+	&dev_attr_state.attr,
+	&dev_attr_shutdown_status.attr,
+	&dev_attr_cmdline.attr,
+	&dev_attr_firmware.attr,
+	&dev_attr_ramdisk.attr,
+	&dev_attr_bootmode.attr,
+	&dev_attr_log_buf_addr.attr,
+	&dev_attr_log_buf_len.attr,
 
 	NULL
 };
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index b637316..a12ae5c 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -20,6 +20,9 @@
  */
 #include <linux/fs.h>
 #include <linux/pci.h>
+#include <linux/sched.h>
+#include <linux/firmware.h>
+#include <linux/delay.h>
 
 #include "../common/mic_device.h"
 #include "mic_device.h"
@@ -256,6 +259,248 @@ mic_x100_program_msi_to_src_map(struct mic_device *mdev,
 	mic_mmio_write(mw, reg, mxar);
 }
 
+/*
+ * mic_x100_reset_fw_ready - Reset Firmware ready status field.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_reset_fw_ready(struct mic_device *mdev)
+{
+	mdev->ops->write_spad(mdev, MIC_X100_DOWNLOAD_INFO, 0);
+}
+
+/*
+ * mic_x100_is_fw_ready - Check if firmware is ready.
+ * @mdev: pointer to mic_device instance
+ */
+static bool mic_x100_is_fw_ready(struct mic_device *mdev)
+{
+	u32 scratch2 = mdev->ops->read_spad(mdev, MIC_X100_DOWNLOAD_INFO);
+	return MIC_X100_SPAD2_DOWNLOAD_STATUS(scratch2);
+}
+
+/**
+ * mic_x100_get_apic_id - Get bootstrap APIC ID.
+ * @mdev: pointer to mic_device instance
+ */
+static u32 mic_x100_get_apic_id(struct mic_device *mdev)
+{
+	u32 scratch2 = 0;
+
+	scratch2 = mdev->ops->read_spad(mdev, MIC_X100_DOWNLOAD_INFO);
+	return MIC_X100_SPAD2_APIC_ID(scratch2);
+}
+
+/**
+ * mic_x100_send_firmware_intr - Send an interrupt to the firmware on MIC.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_send_firmware_intr(struct mic_device *mdev)
+{
+	u32 apicicr_low;
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR7;
+	int vector = MIC_X100_BSP_INTERRUPT_VECTOR;
+	struct mic_mw *mw = &mdev->mmio;
+
+	/*
+	 * For MIC we need to make sure we "hit"
+	 * the send_icr bit (13).
+	 */
+	apicicr_low = (vector | (1 << 13));
+
+	mic_mmio_write(mw, mic_x100_get_apic_id(mdev),
+		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset + 4);
+
+	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+/**
+ * mic_x100_hw_reset - Reset the MIC device.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_hw_reset(struct mic_device *mdev)
+{
+	u32 reset_reg;
+	u32 rgcr = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_RGCR;
+	struct mic_mw *mw = &mdev->mmio;
+
+	/* Ensure that the reset is ordered w.r.t. previous loads and stores */
+	mb();
+	/* Trigger reset */
+	reset_reg = mic_mmio_read(mw, rgcr);
+	reset_reg |= 0x1;
+	mic_mmio_write(mw, reset_reg, rgcr);
+	/*
+	 * It seems we really want to delay at least 1 second
+	 * after touching reset to prevent a lot of problems.
+	 */
+	msleep(1000);
+}
+
+/**
+ * mic_x100_load_command_line - Load command line to MIC.
+ * @mdev: pointer to mic_device instance
+ * @fw: the firmware image
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int
+mic_x100_load_command_line(struct mic_device *mdev, const struct firmware *fw)
+{
+	u32 len = 0;
+	u32 boot_mem;
+	char *buf;
+	void __iomem *cmd_line_va = mdev->aper.va + mdev->bootaddr + fw->size;
+#define CMDLINE_SIZE 2048
+
+	boot_mem = mdev->aper.len >> 20;
+	buf = kzalloc(CMDLINE_SIZE, GFP_KERNEL);
+	if (!buf) {
+		dev_err(mdev->sdev->parent,
+			"%s %d allocation failed\n", __func__, __LINE__);
+		return -ENOMEM;
+	}
+	len += snprintf(buf, CMDLINE_SIZE - len,
+		" mem=%dM", boot_mem);
+	if (mdev->cmdline)
+		snprintf(buf + len, CMDLINE_SIZE - len,
+				" %s", mdev->cmdline);
+	memcpy_toio(cmd_line_va, buf, strlen(buf) + 1);
+	kfree(buf);
+	return 0;
+}
+
+/**
+ * mic_x100_load_ramdisk - Load ramdisk to MIC.
+ * @mdev: pointer to mic_device instance
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int
+mic_x100_load_ramdisk(struct mic_device *mdev)
+{
+	const struct firmware *fw;
+	int rc;
+	struct boot_params __iomem *bp = mdev->aper.va + mdev->bootaddr;
+
+	rc = request_firmware(&fw,
+			mdev->ramdisk, mdev->sdev->parent);
+	if (rc < 0) {
+		dev_err(mdev->sdev->parent,
+			"ramdisk request_firmware failed: %d %s\n",
+			rc, mdev->ramdisk);
+		goto error;
+	}
+	/*
+	 * Typically the bootaddr for card OS is 64M
+	 * so copy over the ramdisk @ 128M.
+	 */
+	memcpy_toio(mdev->aper.va + (mdev->bootaddr << 1),
+		fw->data, fw->size);
+	iowrite32(cpu_to_le32(mdev->bootaddr << 1), &bp->hdr.ramdisk_image);
+	iowrite32(cpu_to_le32(fw->size), &bp->hdr.ramdisk_size);
+	release_firmware(fw);
+error:
+	return rc;
+}
+
+/**
+ * mic_x100_get_boot_addr - Get MIC boot address.
+ * @mdev: pointer to mic_device instance
+ *
+ * This function is called during firmware load to determine
+ * the address at which the OS should be downloaded in card
+ * memory i.e. GDDR.
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int
+mic_x100_get_boot_addr(struct mic_device *mdev)
+{
+	u32 scratch2, boot_addr;
+	int rc = 0;
+
+	scratch2 = mdev->ops->read_spad(mdev, MIC_X100_DOWNLOAD_INFO);
+	boot_addr = MIC_X100_SPAD2_DOWNLOAD_ADDR(scratch2);
+	dev_dbg(mdev->sdev->parent, "%s %d boot_addr 0x%x\n",
+		__func__, __LINE__, boot_addr);
+	if (boot_addr > (1U << 31)) {
+		dev_err(mdev->sdev->parent,
+			"incorrect bootaddr 0x%x\n",
+			boot_addr);
+		rc = -EINVAL;
+		goto error;
+	}
+	mdev->bootaddr = boot_addr;
+error:
+	return rc;
+}
+
+/**
+ * mic_x100_load_firmware - Load firmware to MIC.
+ * @mdev: pointer to mic_device instance
+ * @buf: buffer containing boot string including firmware/ramdisk path.
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int
+mic_x100_load_firmware(struct mic_device *mdev, const char *buf)
+{
+	int rc;
+	const struct firmware *fw;
+
+	rc = mic_x100_get_boot_addr(mdev);
+	if (rc)
+		goto error;
+	/* load OS */
+	rc = request_firmware(&fw, mdev->firmware, mdev->sdev->parent);
+	if (rc < 0) {
+		dev_err(mdev->sdev->parent,
+			"firmware request_firmware failed: %d %s\n",
+			rc, mdev->firmware);
+		goto error;
+	}
+	if (mdev->bootaddr > mdev->aper.len - fw->size) {
+		rc = -EINVAL;
+		dev_err(mdev->sdev->parent, "%s %d rc %d bootaddr 0x%x\n",
+			__func__, __LINE__, rc, mdev->bootaddr);
+		release_firmware(fw);
+		goto error;
+	}
+	memcpy_toio(mdev->aper.va + mdev->bootaddr, fw->data, fw->size);
+	mdev->ops->write_spad(mdev, MIC_X100_FW_SIZE, fw->size);
+	if (!strcmp(mdev->bootmode, "elf"))
+		goto done;
+	/* load command line */
+	rc = mic_x100_load_command_line(mdev, fw);
+	if (rc) {
+		dev_err(mdev->sdev->parent, "%s %d rc %d\n",
+			__func__, __LINE__, rc);
+		goto error;
+	}
+	release_firmware(fw);
+	/* load ramdisk */
+	if (mdev->ramdisk)
+		rc = mic_x100_load_ramdisk(mdev);
+error:
+	dev_dbg(mdev->sdev->parent, "%s %d rc %d\n",
+			__func__, __LINE__, rc);
+done:
+	return rc;
+}
+
+/**
+ * mic_x100_get_postcode - Get postcode status from firmware.
+ * @mdev: pointer to mic_device instance
+ *
+ * RETURNS: postcode.
+ */
+static u32 mic_x100_get_postcode(struct mic_device *mdev)
+{
+	return mic_mmio_read(&mdev->mmio, MIC_X100_POSTCODE);
+}
+
 /**
  * mic_x100_smpt_set - Update an SMPT entry with a DMA address.
  * @mdev: pointer to mic_device instance
@@ -311,6 +556,12 @@ struct mic_hw_ops mic_x100_ops = {
 	.write_spad = mic_x100_write_spad,
 	.send_intr = mic_x100_send_intr,
 	.ack_interrupt = mic_x100_ack_interrupt,
+	.reset = mic_x100_hw_reset,
+	.reset_fw_ready = mic_x100_reset_fw_ready,
+	.is_fw_ready = mic_x100_is_fw_ready,
+	.send_firmware_intr = mic_x100_send_firmware_intr,
+	.load_mic_fw = mic_x100_load_firmware,
+	.get_postcode = mic_x100_get_postcode,
 };
 
 struct mic_hw_intr_ops mic_x100_intr_ops = {
diff --git a/drivers/misc/mic/host/mic_x100.h b/drivers/misc/mic/host/mic_x100.h
index 642cae9..8b7daa1 100644
--- a/drivers/misc/mic/host/mic_x100.h
+++ b/drivers/misc/mic/host/mic_x100.h
@@ -69,6 +69,15 @@
 #define MIC_X100_NUM_SBOX_IRQ 8
 #define MIC_X100_NUM_RDMASR_IRQ 8
 #define MIC_X100_RDMASR_IRQ_BASE 17
+#define MIC_X100_SPAD2_DOWNLOAD_STATUS(x) ((x) & 0x1)
+#define MIC_X100_SPAD2_APIC_ID(x)	(((x) >> 1) & 0x1ff)
+#define MIC_X100_SPAD2_DOWNLOAD_ADDR(x) ((x) & 0xfffff000)
+#define MIC_X100_SBOX_APICICR7 0x0000AA08
+#define MIC_X100_SBOX_RGCR 0x00004010
+#define MIC_X100_SBOX_SDBIC0 0x0000CC90
+#define MIC_X100_DOWNLOAD_INFO 2
+#define MIC_X100_FW_SIZE 5
+#define MIC_X100_POSTCODE 0x242c
 
 static const u16 mic_x100_intr_init[] = {
 		MIC_X100_DOORBELL_IDX_START,
@@ -79,6 +88,9 @@ static const u16 mic_x100_intr_init[] = {
 		MIC_X100_NUM_ERR,
 };
 
+/* Host->Card(bootstrap) Interrupt Vector */
+#define MIC_X100_BSP_INTERRUPT_VECTOR 229
+
 extern struct mic_hw_ops mic_x100_ops;
 extern struct mic_smpt_ops mic_x100_smpt_ops;
 extern struct mic_hw_intr_ops mic_x100_intr_ops;
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 997f9f2..e3f7b37 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -240,6 +240,7 @@ header-y += media.h
 header-y += mei.h
 header-y += mempolicy.h
 header-y += meye.h
+header-y += mic_common.h
 header-y += mii.h
 header-y += minix_fs.h
 header-y += mman.h
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
new file mode 100644
index 0000000..a9091e5
--- /dev/null
+++ b/include/uapi/linux/mic_common.h
@@ -0,0 +1,74 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC driver.
+ *
+ */
+#ifndef __MIC_COMMON_H_
+#define __MIC_COMMON_H_
+
+#include <linux/types.h>
+
+/**
+ * struct mic_bootparam: Virtio device independent information in device page
+ *
+ * @magic: A magic value used by the card to ensure it can see the host
+ * @c2h_shutdown_db: Card to Host shutdown doorbell set by host
+ * @h2c_shutdown_db: Host to Card shutdown doorbell set by card
+ * @h2c_config_db: Host to Card Virtio config doorbell set by card
+ * @shutdown_status: Card shutdown status set by card
+ * @shutdown_card: Set to 1 by the host when a card shutdown is initiated
+ */
+struct mic_bootparam {
+	__u32 magic;
+	__s8 c2h_shutdown_db;
+	__s8 h2c_shutdown_db;
+	__s8 h2c_config_db;
+	__u8 shutdown_status;
+	__u8 shutdown_card;
+} __aligned(8);
+
+/* Device page size */
+#define MIC_DP_SIZE 4096
+
+#define MIC_MAGIC 0xc0ffee00
+
+/**
+ * enum mic_states - MIC states.
+ */
+enum mic_states {
+	MIC_OFFLINE = 0,
+	MIC_ONLINE,
+	MIC_SHUTTING_DOWN,
+	MIC_RESET_FAILED,
+	MIC_LAST
+};
+
+/**
+ * enum mic_status - MIC status reported by card after
+ * a host or card initiated shutdown or a card crash.
+ */
+enum mic_status {
+	MIC_NOP = 0,
+	MIC_CRASHED,
+	MIC_HALTED,
+	MIC_POWER_OFF,
+	MIC_RESTART,
+	MIC_STATUS_LAST
+};
+
+#endif
-- 
1.8.2.1



* [PATCH RESEND v3 7/7] Sample Implementation of Intel MIC User Space Daemon.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
                   ` (5 preceding siblings ...)
  2013-09-05 23:42 46% ` [PATCH RESEND v3 6/7] Intel MIC Card " Sudeep Dutt
@ 2013-09-05 23:42 24% ` Sudeep Dutt
    7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

From: Caz Yokoyama <Caz.Yokoyama@intel.com>

This patch introduces a sample user space daemon which
implements the virtio device backends on the host. The daemon
creates/removes/configures virtio device backends by communicating with
the Intel MIC Host Driver. The virtio devices currently supported are
virtio net, virtio console and virtio block. Virtio net supports TSO/GSO.
The daemon also monitors card shutdown status and takes appropriate actions
like killing the virtio backends and resetting the card upon card shutdown
and crashes.
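
As a rough illustration of the monitoring step, here is a minimal sketch in C of how a daemon might turn the text read from the shutdown_status sysfs attribute into a command for the state attribute. The status strings and the "reset" command match the host driver's sysfs code earlier in this series; the helper names parse_shutdown_status() and action_for_status(), and the policy of resetting on every non-nop status, are assumptions made for this example and are not taken from mpssd.c.

```c
#include <stddef.h>
#include <string.h>

/* Mirrors enum mic_status from include/uapi/linux/mic_common.h. */
enum mic_status {
	MIC_NOP = 0,
	MIC_CRASHED,
	MIC_HALTED,
	MIC_POWER_OFF,
	MIC_RESTART,
	MIC_STATUS_LAST
};

/* Strings as reported by the shutdown_status sysfs attribute. */
static const char *const status_names[MIC_STATUS_LAST] = {
	[MIC_NOP] = "nop",
	[MIC_CRASHED] = "crashed",
	[MIC_HALTED] = "halted",
	[MIC_POWER_OFF] = "poweroff",
	[MIC_RESTART] = "restart",
};

/*
 * Hypothetical helper: parse one line read from sysfs, ignoring a
 * trailing newline. Returns MIC_STATUS_LAST for unknown input.
 */
static enum mic_status parse_shutdown_status(const char *line)
{
	size_t len = strcspn(line, "\n");

	for (int i = 0; i < MIC_STATUS_LAST; i++)
		if (strlen(status_names[i]) == len &&
		    !strncmp(line, status_names[i], len))
			return (enum mic_status)i;
	return MIC_STATUS_LAST;
}

/*
 * Hypothetical policy: the string to write to the "state" attribute,
 * or NULL when no action is needed. Resetting on every shutdown or
 * crash status is an assumption of this sketch.
 */
static const char *action_for_status(enum mic_status st)
{
	switch (st) {
	case MIC_CRASHED:
	case MIC_HALTED:
	case MIC_POWER_OFF:
	case MIC_RESTART:
		return "reset";
	default:
		return NULL;
	}
}
```

A daemon built along these lines would read shutdown_status after being woken by a change on the state attribute, call parse_shutdown_status() on the result, and write the string returned by action_for_status() back to state when it is not NULL.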

Co-author: Ashutosh Dixit <ashutosh.dixit@intel.com>
Co-author: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
---
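
The shutdown monitoring mentioned in the changelog comes down to reading
per-card sysfs attributes. The micctrl script in this patch reads "state" and
"shutdown_status" under /sys/class/mic/<card>. Below is a minimal C sketch of
such a read, assuming that same directory layout. mic_read_attr() is a
hypothetical helper name used only for illustration; it is not a function
from mpssd.c or sysfs.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one sysfs attribute (for example "state" or "shutdown_status")
 * of one card into buf, stripping the trailing newline.
 *
 * base is the class directory; micctrl uses /sys/class/mic.
 * Returns 0 on success and -1 on any failure.
 * Hypothetical helper, not part of the patch.
 */
static int mic_read_attr(const char *base, const char *card,
			 const char *attr, char *buf, size_t len)
{
	char path[256];
	FILE *fp;
	int n;

	assert(buf != NULL && len > 0);

	/* Build <base>/<card>/<attr> and refuse silently truncated paths. */
	n = snprintf(path, sizeof(path), "%s/%s/%s", base, card, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	if (!fgets(buf, (int)len, fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);

	/* sysfs values end in '\n'; cut the string there. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would pass "/sys/class/mic", a card name such as "mic0" and an
attribute name, then compare the returned string against the states the
scripts test for ("offline", "online").
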
 Documentation/mic/mic_overview.txt |   49 ++
 Documentation/mic/mpssd/.gitignore |    1 +
 Documentation/mic/mpssd/Makefile   |   19 +
 Documentation/mic/mpssd/micctrl    |  173 ++++
 Documentation/mic/mpssd/mpss       |  202 +++++
 Documentation/mic/mpssd/mpssd.c    | 1701 ++++++++++++++++++++++++++++++++++++
 Documentation/mic/mpssd/mpssd.h    |  100 +++
 Documentation/mic/mpssd/sysfs.c    |  102 +++
 8 files changed, 2347 insertions(+)
 create mode 100644 Documentation/mic/mic_overview.txt
 create mode 100644 Documentation/mic/mpssd/.gitignore
 create mode 100644 Documentation/mic/mpssd/Makefile
 create mode 100755 Documentation/mic/mpssd/micctrl
 create mode 100755 Documentation/mic/mpssd/mpss
 create mode 100644 Documentation/mic/mpssd/mpssd.c
 create mode 100644 Documentation/mic/mpssd/mpssd.h
 create mode 100644 Documentation/mic/mpssd/sysfs.c
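
mpssd.c sizes its buffers and its mmap() window with power-of-two alignment
macros (_ALIGN_DOWN, _ALIGN_UP, PAGE_ALIGN). The sketch below restates that
arithmetic as plain functions so the rounding is easy to check by hand. The
function names, and map_len() in particular, are illustrative assumptions and
do not appear in the patch. map_len() mirrors the length expression that
init_vr() passes to mmap(): one device page followed by num_vq page-aligned
ring regions.

```c
#include <assert.h>
#include <stddef.h>

/* Round addr down to a multiple of size; size must be a power of two. */
static size_t align_down(size_t addr, size_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return addr & ~(size - 1);
}

/* Round addr up to a multiple of size; size must be a power of two. */
static size_t align_up(size_t addr, size_t size)
{
	return align_down(addr + size - 1, size);
}

/*
 * Hypothetical helper: bytes to map for one virtio device, given the
 * end offset of the device page, the unaligned size of one ring region
 * (vring plus its info block), the page size and the ring count.
 */
static size_t map_len(size_t dev_page_end, size_t ring_bytes,
		      size_t page, unsigned int num_vq)
{
	return dev_page_end + align_up(ring_bytes, page) * num_vq;
}
```

With these definitions, MAX_NET_PKT_SIZE works out as
align_up(64 * 1024 + 14, 64): 65550 rounded up to the next multiple of 64,
which is 65600.
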

diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt
new file mode 100644
index 0000000..c4424ed
--- /dev/null
+++ b/Documentation/mic/mic_overview.txt
@@ -0,0 +1,49 @@
+An Intel MIC X100 device is a PCIe form factor add-in coprocessor
+card based on the Intel Many Integrated Core (MIC) architecture
+that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
+implements the three required standard address spaces i.e. configuration,
+memory and I/O. The host OS loads a device driver as is typical for
+PCIe devices. The card itself runs a bootstrap after reset that
+transfers control to the card OS downloaded from the host driver.
+The card OS as shipped by Intel is a Linux kernel with modifications
+for the X100 devices.
+
+Since it is a PCIe card, it does not have the ability to host hardware
+devices for networking, storage and console. We provide these devices
+on X100 coprocessors thus enabling a self-bootable equivalent environment
+for applications. A key benefit of our solution is that it leverages
+the standard virtio framework for network, disk and console devices,
+though in our case the virtio framework is used across a PCIe bus.
+
+Here is a block diagram of the various components described above. The
+virtio backends are situated on the host rather than the card given better
+single-threaded performance for the host compared to MIC, the ability of
+the host to initiate DMAs to/from the card using the MIC DMA engine and
+the fact that the virtio block storage backend can only be on the host.
+
+                              |
+       +----------+           |             +----------+
+       | Card OS  |           |             | Host OS  |
+       +----------+           |             +----------+
+                              |
++-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+| Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
+| Net   | |Console | |Block | | |Net      |  |Console | |Block   |
+| Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
++-------+ +--------+ +------+ | +---------+  +--------+ +--------+
+    |         |         |     |      |            |         |
+    |         |         |     |User  |            |         |
+    |         |         |     |------|------------|---------|-------
+    +-------------------+     |Kernel +--------------------------+
+              |               |       | Virtio over PCIe IOCTLs  |
+              |               |       +--------------------------+
+      +--------------+        |                   |
+      |Intel MIC     |        |            +---------------+
+      |Card Driver   |        |            |Intel MIC      |
+      +--------------+        |            |Host Driver    |
+              |               |            +---------------+
+              |               |                   |
+     +-------------------------------------------------------------+
+     |                                                             |
+     |                    PCIe Bus                                 |
+     +-------------------------------------------------------------+
diff --git a/Documentation/mic/mpssd/.gitignore b/Documentation/mic/mpssd/.gitignore
new file mode 100644
index 0000000..8b7c72f
--- /dev/null
+++ b/Documentation/mic/mpssd/.gitignore
@@ -0,0 +1 @@
+mpssd
diff --git a/Documentation/mic/mpssd/Makefile b/Documentation/mic/mpssd/Makefile
new file mode 100644
index 0000000..eb860a7
--- /dev/null
+++ b/Documentation/mic/mpssd/Makefile
@@ -0,0 +1,19 @@
+#
+# Makefile - Intel MIC User Space Tools.
+# Copyright(c) 2013, Intel Corporation.
+#
+ifdef DEBUG
+CFLAGS += $(USERWARNFLAGS) -I. -g -Wall -DDEBUG=$(DEBUG)
+else
+CFLAGS += $(USERWARNFLAGS) -I. -g -Wall
+endif
+
+mpssd: mpssd.o sysfs.o
+	$(CC) $(CFLAGS) -o $@ $^ -lpthread
+
+install:
+	install mpssd /usr/sbin/mpssd
+	install micctrl /usr/sbin/micctrl
+
+clean:
+	rm -f mpssd *.o
diff --git a/Documentation/mic/mpssd/micctrl b/Documentation/mic/mpssd/micctrl
new file mode 100755
index 0000000..8f2629b
--- /dev/null
+++ b/Documentation/mic/mpssd/micctrl
@@ -0,0 +1,173 @@
+#!/bin/bash
+# Intel MIC Platform Software Stack (MPSS)
+#
+# Copyright(c) 2013 Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License, version 2, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+# Intel MIC User Space Tools.
+#
+# micctrl - Controls MIC boot/start/stop.
+#
+# chkconfig: 2345 95 05
+# description: start MPSS stack processing.
+#
+### BEGIN INIT INFO
+# Provides: micctrl
+### END INIT INFO
+
+# Source function library.
+. /etc/init.d/functions
+
+sysfs="/sys/class/mic"
+
+_status()
+{
+	f=$sysfs/$1
+	echo -e $1 state: "`cat $f/state`" shutdown_status: "`cat $f/shutdown_status`"
+}
+
+status()
+{
+	if [ "`echo $1 | head -c3`" == "mic" ]; then
+		_status $1
+		return $?
+	fi
+	for f in $sysfs/*
+	do
+		_status `basename $f`
+		RETVAL=$?
+		[ $RETVAL -ne 0 ] && return $RETVAL
+	done
+	return 0
+}
+
+_reset()
+{
+	f=$sysfs/$1
+	echo reset > $f/state
+}
+
+reset()
+{
+	if [ "`echo $1 | head -c3`" == "mic" ]; then
+		_reset $1
+		return $?
+	fi
+	for f in $sysfs/*
+	do
+		_reset `basename $f`
+		RETVAL=$?
+		[ $RETVAL -ne 0 ] && return $RETVAL
+	done
+	return 0
+}
+
+_boot()
+{
+	f=$sysfs/$1
+	echo "linux" > $f/bootmode
+	echo "mic/uos.img" > $f/firmware
+	echo "mic/$1.image" > $f/ramdisk
+	echo "boot" > $f/state
+}
+
+boot()
+{
+	if [ "`echo $1 | head -c3`" == "mic" ]; then
+		_boot $1
+		return $?
+	fi
+	for f in $sysfs/*
+	do
+		_boot `basename $f`
+		RETVAL=$?
+		[ $RETVAL -ne 0 ] && return $RETVAL
+	done
+	return 0
+}
+
+_shutdown()
+{
+	f=$sysfs/$1
+	echo shutdown > $f/state
+}
+
+shutdown()
+{
+	if [ "`echo $1 | head -c3`" == "mic" ]; then
+		_shutdown $1
+		return $?
+	fi
+	for f in $sysfs/*
+	do
+		_shutdown `basename $f`
+		RETVAL=$?
+		[ $RETVAL -ne 0 ] && return $RETVAL
+	done
+	return 0
+}
+
+_wait()
+{
+	f=$sysfs/$1
+	while [ "`cat $f/state`" != "offline" -a "`cat $f/state`" != "online" ]
+	do
+		sleep 1
+		echo -e "Waiting for $1 to go offline or online"
+	done
+}
+
+wait()
+{
+	if [ "`echo $1 | head -c3`" == "mic" ]; then
+		_wait $1
+		return $?
+	fi
+	# Wait for the cards to go offline
+	for f in $sysfs/*
+	do
+		_wait `basename $f`
+		RETVAL=$?
+		[ $RETVAL -ne 0 ] && return $RETVAL
+	done
+	return 0
+}
+
+if [ ! -d "$sysfs" ]; then
+	echo -e $"Module unloaded "
+	exit 3
+fi
+
+case $1 in
+	-s)
+		status $2
+		;;
+	-r)
+		reset $2
+		;;
+	-b)
+		boot $2
+		;;
+	-S)
+		shutdown $2
+		;;
+	-w)
+		wait $2
+		;;
+	*)
+		echo $"Usage: $0 {-s (status) |-r (reset) |-b (boot) |-S (shutdown) |-w (wait)}"
+		exit 2
+esac
+
+exit $?
diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss
new file mode 100755
index 0000000..3136c68
--- /dev/null
+++ b/Documentation/mic/mpssd/mpss
@@ -0,0 +1,202 @@
+#!/bin/bash
+# Intel MIC Platform Software Stack (MPSS)
+#
+# Copyright(c) 2013 Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License, version 2, as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+# Intel MIC User Space Tools.
+#
+# mpss	Start mpssd.
+#
+# chkconfig: 2345 95 05
+# description: start MPSS stack processing.
+#
+### BEGIN INIT INFO
+# Provides: mpss
+# Required-Start:
+# Required-Stop:
+# Short-Description: MPSS stack control
+# Description: MPSS stack control
+### END INIT INFO
+
+# Source function library.
+. /etc/init.d/functions
+
+exec=/usr/sbin/mpssd
+sysfs="/sys/class/mic"
+
+start()
+{
+	[ -x $exec ] || exit 5
+
+	if [ "`ps -e | awk '{print $4}' | grep mpssd | head -n 1`" = "mpssd" ]; then
+		echo -e $"MPSSD already running! "
+		success
+		echo
+		return 0
+	fi
+
+	echo -e $"Starting MPSS Stack"
+	echo -e $"Loading MIC_HOST Module"
+
+	# Ensure the driver is loaded
+	if [ ! -d "$sysfs" ]; then
+		modprobe mic_host
+		RETVAL=$?
+		if [ $RETVAL -ne 0 ]; then
+			failure
+			echo
+			return $RETVAL
+		fi
+	fi
+
+	# Start the daemon
+	echo -n $"Starting MPSSD "
+	$exec
+	RETVAL=$?
+	if [ $RETVAL -ne 0 ]; then
+		failure
+		echo
+		return $RETVAL
+	fi
+	success
+	echo
+
+	sleep 5
+
+	# Boot the cards
+	micctrl -b
+
+	# Wait till ping works
+	for f in $sysfs/*
+	do
+		count=100
+		ipaddr=`cat $f/cmdline`
+		ipaddr=${ipaddr#*address,}
+		ipaddr=`echo $ipaddr | cut -d, -f1 | cut -d\; -f1`
+		while [ $count -ge 0 ]
+		do
+			echo -e "Pinging "`basename $f`" "
+			ping -c 1 $ipaddr &> /dev/null
+			RETVAL=$?
+			if [ $RETVAL -eq 0 ]; then
+				success
+				break
+			fi
+			sleep 1
+			count=`expr $count - 1`
+		done
+		[ $RETVAL -ne 0 ] && failure || success
+		echo
+	done
+	return $RETVAL
+}
+
+stop()
+{
+	echo -e $"Shutting down MPSS Stack: "
+
+	# Bail out if module is unloaded
+	if [ ! -d "$sysfs" ]; then
+		echo -n $"Module unloaded "
+		success
+		echo
+		return 0
+	fi
+
+	# Shut down the cards.
+	micctrl -S
+
+	# Wait for the cards to go offline
+	for f in $sysfs/*
+	do
+		while [ "`cat $f/state`" != "offline" ]
+		do
+			sleep 1
+			echo -e "Waiting for "`basename $f`" to go offline"
+		done
+	done
+
+	# Display the status of the cards
+	micctrl -s
+
+	# Kill MPSSD now
+	echo -n $"Killing MPSSD"
+	killall -9 mpssd 2>/dev/null
+	RETVAL=$?
+	[ $RETVAL -ne 0 ] && failure || success
+	echo
+	return $RETVAL
+}
+
+restart()
+{
+	stop
+	sleep 5
+	start
+}
+
+status()
+{
+	micctrl -s
+	if [ "`ps -e | awk '{print $4}' | grep mpssd | head -n 1`" = "mpssd" ]; then
+		echo "mpssd is running"
+	else
+		echo "mpssd is stopped"
+	fi
+	return 0
+}
+
+unload()
+{
+	if [ ! -d "$sysfs" ]; then
+		echo -n $"No MIC_HOST Module: "
+		success
+		echo
+		return
+	fi
+
+	stop
+
+	sleep 5
+	echo -n $"Removing MIC_HOST Module: "
+	modprobe -r mic_host
+	RETVAL=$?
+	[ $RETVAL -ne 0 ] && failure || success
+	echo
+	return $RETVAL
+}
+
+case $1 in
+	start)
+		start
+		;;
+	stop)
+		stop
+		;;
+	restart)
+		restart
+		;;
+	status)
+		status
+		;;
+	unload)
+		unload
+		;;
+	*)
+		echo $"Usage: $0 {start|stop|restart|status|unload}"
+		exit 2
+esac
+
+exit $?
diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c
new file mode 100644
index 0000000..8064804
--- /dev/null
+++ b/Documentation/mic/mpssd/mpssd.c
@@ -0,0 +1,1701 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC User Space Tools.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <assert.h>
+#include <unistd.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <poll.h>
+#include <features.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_console.h>
+#include <linux/virtio_blk.h>
+#include <linux/version.h>
+#include "mpssd.h"
+#include <linux/mic_ioctl.h>
+#include <linux/mic_common.h>
+
+static void init_mic(struct mic_info *mic);
+
+static FILE *logfp;
+static struct mic_info mic_list;
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+#define min_t(type, x, y) ({				\
+		type __min1 = (x);                      \
+		type __min2 = (y);                      \
+		__min1 < __min2 ? __min1 : __min2; })
+
+/* align addr on a size boundary - adjust address up/down if needed */
+#define _ALIGN_DOWN(addr, size)  ((addr)&(~((size)-1)))
+#define _ALIGN_UP(addr, size)    _ALIGN_DOWN(addr + size - 1, size)
+
+/* align addr on a size boundary - adjust address up if needed */
+#define _ALIGN(addr, size)     _ALIGN_UP(addr, size)
+
+/* to align the pointer to the (next) page boundary */
+#define PAGE_ALIGN(addr)        _ALIGN(addr, PAGE_SIZE)
+
+#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
+
+#define GSO_ENABLED		1
+#define MAX_GSO_SIZE		(64 * 1024)
+#define ETH_H_LEN		14
+#define MAX_NET_PKT_SIZE	(_ALIGN_UP(MAX_GSO_SIZE + ETH_H_LEN, 64))
+#define MIC_DEVICE_PAGE_END	0x1000
+
+#ifndef VIRTIO_NET_HDR_F_DATA_VALID
+#define VIRTIO_NET_HDR_F_DATA_VALID	2	/* Csum is valid */
+#endif
+
+static struct {
+	struct mic_device_desc dd;
+	struct mic_vqconfig vqconfig[2];
+	__u32 host_features, guest_acknowledgements;
+	struct virtio_console_config cons_config;
+} virtcons_dev_page = {
+	.dd = {
+		.type = VIRTIO_ID_CONSOLE,
+		.num_vq = ARRAY_SIZE(virtcons_dev_page.vqconfig),
+		.feature_len = sizeof(virtcons_dev_page.host_features),
+		.config_len = sizeof(virtcons_dev_page.cons_config),
+	},
+	.vqconfig[0] = {
+		.num = htole16(MIC_VRING_ENTRIES),
+	},
+	.vqconfig[1] = {
+		.num = htole16(MIC_VRING_ENTRIES),
+	},
+};
+
+static struct {
+	struct mic_device_desc dd;
+	struct mic_vqconfig vqconfig[2];
+	__u32 host_features, guest_acknowledgements;
+	struct virtio_net_config net_config;
+} virtnet_dev_page = {
+	.dd = {
+		.type = VIRTIO_ID_NET,
+		.num_vq = ARRAY_SIZE(virtnet_dev_page.vqconfig),
+		.feature_len = sizeof(virtnet_dev_page.host_features),
+		.config_len = sizeof(virtnet_dev_page.net_config),
+	},
+	.vqconfig[0] = {
+		.num = htole16(MIC_VRING_ENTRIES),
+	},
+	.vqconfig[1] = {
+		.num = htole16(MIC_VRING_ENTRIES),
+	},
+#if GSO_ENABLED
+		.host_features = htole32(
+		1 << VIRTIO_NET_F_CSUM |
+		1 << VIRTIO_NET_F_GSO |
+		1 << VIRTIO_NET_F_GUEST_TSO4 |
+		1 << VIRTIO_NET_F_GUEST_TSO6 |
+		1 << VIRTIO_NET_F_GUEST_ECN |
+		1 << VIRTIO_NET_F_GUEST_UFO),
+#else
+		.host_features = 0,
+#endif
+};
+
+static const char *mic_config_dir = "/etc/sysconfig/mic";
+static const char *virtblk_backend = "VIRTBLK_BACKEND";
+static struct {
+	struct mic_device_desc dd;
+	struct mic_vqconfig vqconfig[1];
+	__u32 host_features, guest_acknowledgements;
+	struct virtio_blk_config blk_config;
+} virtblk_dev_page = {
+	.dd = {
+		.type = VIRTIO_ID_BLOCK,
+		.num_vq = ARRAY_SIZE(virtblk_dev_page.vqconfig),
+		.feature_len = sizeof(virtblk_dev_page.host_features),
+		.config_len = sizeof(virtblk_dev_page.blk_config),
+	},
+	.vqconfig[0] = {
+		.num = htole16(MIC_VRING_ENTRIES),
+	},
+	.host_features =
+		htole32(1<<VIRTIO_BLK_F_SEG_MAX),
+	.blk_config = {
+		.seg_max = htole32(MIC_VRING_ENTRIES - 2),
+		.capacity = htole64(0),
+	 }
+};
+
+static char *myname;
+
+static int
+tap_configure(struct mic_info *mic, char *dev)
+{
+	pid_t pid;
+	char *ifargv[7];
+	char ipaddr[32];	/* room for "172.31.<id>.254/24" */
+	int ret = 0;
+
+	pid = fork();
+	if (pid == 0) {
+		ifargv[0] = "ip";
+		ifargv[1] = "link";
+		ifargv[2] = "set";
+		ifargv[3] = dev;
+		ifargv[4] = "up";
+		ifargv[5] = NULL;
+		mpsslog("Configuring %s\n", dev);
+		ret = execvp("ip", ifargv);
+		if (ret < 0) {
+			mpsslog("%s execvp failed errno %s\n",
+				mic->name, strerror(errno));
+			return ret;
+		}
+	}
+	if (pid < 0) {
+		mpsslog("%s fork failed errno %s\n",
+			mic->name, strerror(errno));
+		return pid;	/* ret is still 0 here */
+	}
+
+	ret = waitpid(pid, NULL, 0);
+	if (ret < 0) {
+		mpsslog("%s waitpid failed errno %s\n",
+			mic->name, strerror(errno));
+		return ret;
+	}
+
+	snprintf(ipaddr, sizeof(ipaddr), "172.31.%d.254/24", mic->id);
+
+	pid = fork();
+	if (pid == 0) {
+		ifargv[0] = "ip";
+		ifargv[1] = "addr";
+		ifargv[2] = "add";
+		ifargv[3] = ipaddr;
+		ifargv[4] = "dev";
+		ifargv[5] = dev;
+		ifargv[6] = NULL;
+		mpsslog("Configuring %s ipaddr %s\n", dev, ipaddr);
+		ret = execvp("ip", ifargv);
+		if (ret < 0) {
+			mpsslog("%s execvp failed errno %s\n",
+				mic->name, strerror(errno));
+			return ret;
+		}
+	}
+	if (pid < 0) {
+		mpsslog("%s fork failed errno %s\n",
+			mic->name, strerror(errno));
+		return pid;
+	}
+
+	ret = waitpid(pid, NULL, 0);
+	if (ret < 0) {
+		mpsslog("%s waitpid failed errno %s\n",
+			mic->name, strerror(errno));
+		return ret;
+	}
+	mpsslog("MIC name %s %s %d DONE!\n",
+		mic->name, __func__, __LINE__);
+	return 0;
+}
+
+static int tun_alloc(struct mic_info *mic, char *dev)
+{
+	struct ifreq ifr;
+	int fd, err;
+#if GSO_ENABLED
+	unsigned offload;
+#endif
+	fd = open("/dev/net/tun", O_RDWR);
+	if (fd < 0) {
+		mpsslog("Could not open /dev/net/tun %s\n", strerror(errno));
+		goto done;
+	}
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR;
+	if (*dev)
+		strncpy(ifr.ifr_name, dev, IFNAMSIZ);
+
+	err = ioctl(fd, TUNSETIFF, (void *) &ifr);
+	if (err < 0) {
+		mpsslog("%s %s %d TUNSETIFF failed %s\n",
+			mic->name, __func__, __LINE__, strerror(errno));
+		close(fd);
+		return err;
+	}
+#if GSO_ENABLED
+	offload = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
+		TUN_F_TSO_ECN | TUN_F_UFO;
+
+	err = ioctl(fd, TUNSETOFFLOAD, offload);
+	if (err < 0) {
+		mpsslog("%s %s %d TUNSETOFFLOAD failed %s\n",
+			mic->name, __func__, __LINE__, strerror(errno));
+		close(fd);
+		return err;
+	}
+#endif
+	strcpy(dev, ifr.ifr_name);
+	mpsslog("Created TAP %s\n", dev);
+done:
+	return fd;
+}
+
+#define NET_FD_VIRTIO_NET 0
+#define NET_FD_TUN 1
+#define MAX_NET_FD 2
+
+static void set_dp(struct mic_info *mic, int type, void *dp)
+{
+	switch (type) {
+	case VIRTIO_ID_CONSOLE:
+		mic->mic_console.console_dp = dp;
+		return;
+	case VIRTIO_ID_NET:
+		mic->mic_net.net_dp = dp;
+		return;
+	case VIRTIO_ID_BLOCK:
+		mic->mic_virtblk.block_dp = dp;
+		return;
+	}
+	mpsslog("%s %s %d not found\n", mic->name, __func__, type);
+	assert(0);
+}
+
+static void *get_dp(struct mic_info *mic, int type)
+{
+	switch (type) {
+	case VIRTIO_ID_CONSOLE:
+		return mic->mic_console.console_dp;
+	case VIRTIO_ID_NET:
+		return mic->mic_net.net_dp;
+	case VIRTIO_ID_BLOCK:
+		return mic->mic_virtblk.block_dp;
+	}
+	mpsslog("%s %s %d not found\n", mic->name, __func__, type);
+	assert(0);
+	return NULL;
+}
+
+static struct mic_device_desc *get_device_desc(struct mic_info *mic, int type)
+{
+	struct mic_device_desc *d;
+	int i;
+	void *dp = get_dp(mic, type);
+
+	for (i = mic_aligned_size(struct mic_bootparam); i < PAGE_SIZE;
+		i += mic_total_desc_size(d)) {
+		d = dp + i;
+
+		/* End of list */
+		if (d->type == 0)
+			break;
+
+		if (d->type == -1)
+			continue;
+
+		mpsslog("%s %s d-> type %d d %p\n",
+			mic->name, __func__, d->type, d);
+
+		if (d->type == (__u8)type)
+			return d;
+	}
+	mpsslog("%s %s %d not found\n", mic->name, __func__, type);
+	assert(0);
+	return NULL;
+}
+
+/* See comments in vhost.c for explanation of next_desc() */
+static unsigned next_desc(struct vring_desc *desc)
+{
+	unsigned int next;
+
+	if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT))
+		return -1U;
+	next = le16toh(desc->next);
+	return next;
+}
+
+/* Sum up all the iovec lengths */
+static ssize_t
+sum_iovec_len(struct mic_copy_desc *copy)
+{
+	ssize_t sum = 0;
+	int i;
+
+	for (i = 0; i < copy->iovcnt; i++)
+		sum += copy->iov[i].iov_len;
+	return sum;
+}
+
+static inline void verify_out_len(struct mic_info *mic,
+	struct mic_copy_desc *copy)
+{
+	if (copy->out_len != sum_iovec_len(copy)) {
+		mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%x\n",
+				mic->name, __func__, __LINE__,
+				copy->out_len, sum_iovec_len(copy));
+		assert(copy->out_len == sum_iovec_len(copy));
+	}
+}
+
+/* Display an iovec */
+static void
+disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy,
+	const char *s, int line)
+{
+	int i;
+
+	for (i = 0; i < copy->iovcnt; i++)
+		mpsslog("%s %s %d copy->iov[%d] addr %p len 0x%lx\n",
+			mic->name, s, line, i,
+			copy->iov[i].iov_base, copy->iov[i].iov_len);
+}
+
+static inline __u16 read_avail_idx(struct mic_vring *vr)
+{
+	return ACCESS_ONCE(vr->info->avail_idx);
+}
+
+static inline void txrx_prepare(int type, bool tx, struct mic_vring *vr,
+				struct mic_copy_desc *copy, ssize_t len)
+{
+	copy->vr_idx = tx ? 0 : 1;
+	copy->update_used = true;
+	if (type == VIRTIO_ID_NET)
+		copy->iov[1].iov_len = len - sizeof(struct virtio_net_hdr);
+	else
+		copy->iov[0].iov_len = len;
+}
+
+/* Central API which triggers the copies */
+static int
+mic_virtio_copy(struct mic_info *mic, int fd,
+	struct mic_vring *vr, struct mic_copy_desc *copy)
+{
+	int ret;
+
+	ret = ioctl(fd, MIC_VIRTIO_COPY_DESC, copy);
+	if (ret) {
+		mpsslog("%s %s %d errno %s ret %d\n",
+			mic->name, __func__, __LINE__,
+			strerror(errno), ret);
+	}
+	return ret;
+}
+
+/*
+ * This initialization routine requires at least one
+ * vring i.e. vr0. vr1 is optional.
+ */
+static void *
+init_vr(struct mic_info *mic, int fd, int type,
+	struct mic_vring *vr0, struct mic_vring *vr1, int num_vq)
+{
+	int vr_size;
+	char *va;
+
+	vr_size = PAGE_ALIGN(vring_size(MIC_VRING_ENTRIES,
+		MIC_VIRTIO_RING_ALIGN) + sizeof(struct _mic_vring_info));
+	va = mmap(NULL, MIC_DEVICE_PAGE_END + vr_size * num_vq,
+		PROT_READ, MAP_SHARED, fd, 0);
+	if (MAP_FAILED == va) {
+		mpsslog("%s %s %d mmap failed errno %s\n",
+			mic->name, __func__, __LINE__,
+			strerror(errno));
+		goto done;
+	}
+	set_dp(mic, type, va);
+	vr0->va = (struct mic_vring *)&va[MIC_DEVICE_PAGE_END];
+	vr0->info = vr0->va +
+		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN);
+	vring_init(&vr0->vr,
+		MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN);
+	mpsslog("%s %s vr0 %p vr0->info %p vr_size 0x%x vring 0x%x ",
+		__func__, mic->name, vr0->va, vr0->info, vr_size,
+		vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
+	mpsslog("magic 0x%x expected 0x%x\n",
+		vr0->info->magic, MIC_MAGIC + type);
+	assert(vr0->info->magic == MIC_MAGIC + type);
+	if (vr1) {
+		vr1->va = (struct mic_vring *)
+			&va[MIC_DEVICE_PAGE_END + vr_size];
+		vr1->info = vr1->va + vring_size(MIC_VRING_ENTRIES,
+			MIC_VIRTIO_RING_ALIGN);
+		vring_init(&vr1->vr,
+			MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN);
+		mpsslog("%s %s vr1 %p vr1->info %p vr_size 0x%x vring 0x%x ",
+			__func__, mic->name, vr1->va, vr1->info, vr_size,
+			vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN));
+		mpsslog("magic 0x%x expected 0x%x\n",
+			vr1->info->magic, MIC_MAGIC + type + 1);
+		assert(vr1->info->magic == MIC_MAGIC + type + 1);
+	}
+done:
+	return va;
+}
+
+static void
+wait_for_card_driver(struct mic_info *mic, int fd, int type)
+{
+	struct pollfd pollfd;
+	int err;
+	struct mic_device_desc *desc = get_device_desc(mic, type);
+
+	pollfd.fd = fd;
+	mpsslog("%s %s Waiting .... desc-> type %d status 0x%x\n",
+		mic->name, __func__, type, desc->status);
+	while (1) {
+		pollfd.events = POLLIN;
+		pollfd.revents = 0;
+		err = poll(&pollfd, 1, -1);
+		if (err < 0) {
+			mpsslog("%s %s poll failed %s\n",
+				mic->name, __func__, strerror(errno));
+			continue;
+		}
+
+		if (pollfd.revents) {
+			mpsslog("%s %s Waiting... desc-> type %d status 0x%x\n",
+				mic->name, __func__, type, desc->status);
+			if (desc->status & VIRTIO_CONFIG_S_DRIVER_OK) {
+				mpsslog("%s %s poll.revents %d\n",
+					mic->name, __func__, pollfd.revents);
+				mpsslog("%s %s desc-> type %d status 0x%x\n",
+					mic->name, __func__, type,
+					desc->status);
+				break;
+			}
+		}
+	}
+}
+
+/* Spin till we have some descriptors */
+static void
+spin_for_descriptors(struct mic_info *mic, struct mic_vring *vr)
+{
+	__u16 avail_idx = read_avail_idx(vr);
+
+	while (avail_idx == le16toh(ACCESS_ONCE(vr->vr.avail->idx))) {
+#ifdef DEBUG
+		mpsslog("%s %s waiting for desc avail %d info_avail %d\n",
+			mic->name, __func__,
+			le16toh(vr->vr.avail->idx), vr->info->avail_idx);
+#endif
+		sched_yield();
+	}
+}
+
+static void *
+virtio_net(void *arg)
+{
+	static __u8 vnet_hdr[2][sizeof(struct virtio_net_hdr)];
+	static __u8 vnet_buf[2][MAX_NET_PKT_SIZE] __aligned(64);
+	struct iovec vnet_iov[2][2] = {
+		{ { .iov_base = vnet_hdr[0], .iov_len = sizeof(vnet_hdr[0]) },
+		  { .iov_base = vnet_buf[0], .iov_len = sizeof(vnet_buf[0]) } },
+		{ { .iov_base = vnet_hdr[1], .iov_len = sizeof(vnet_hdr[1]) },
+		  { .iov_base = vnet_buf[1], .iov_len = sizeof(vnet_buf[1]) } },
+	};
+	struct iovec *iov0 = vnet_iov[0], *iov1 = vnet_iov[1];
+	struct mic_info *mic = (struct mic_info *)arg;
+	char if_name[IFNAMSIZ];
+	struct pollfd net_poll[MAX_NET_FD];
+	struct mic_vring tx_vr, rx_vr;
+	struct mic_copy_desc copy;
+	struct mic_device_desc *desc;
+	int err;
+
+	snprintf(if_name, IFNAMSIZ, "mic%d", mic->id);
+	mic->mic_net.tap_fd = tun_alloc(mic, if_name);
+	if (mic->mic_net.tap_fd < 0)
+		goto done;
+
+	if (tap_configure(mic, if_name))
+		goto done;
+	mpsslog("MIC name %s id %d\n", mic->name, mic->id);
+
+	net_poll[NET_FD_VIRTIO_NET].fd = mic->mic_net.virtio_net_fd;
+	net_poll[NET_FD_VIRTIO_NET].events = POLLIN;
+	net_poll[NET_FD_TUN].fd = mic->mic_net.tap_fd;
+	net_poll[NET_FD_TUN].events = POLLIN;
+
+	if (MAP_FAILED == init_vr(mic, mic->mic_net.virtio_net_fd,
+		VIRTIO_ID_NET, &tx_vr, &rx_vr,
+		virtnet_dev_page.dd.num_vq)) {
+		mpsslog("%s init_vr failed %s\n",
+			mic->name, strerror(errno));
+		goto done;
+	}
+
+	copy.iovcnt = 2;
+	desc = get_device_desc(mic, VIRTIO_ID_NET);
+
+	while (1) {
+		ssize_t len;
+
+		net_poll[NET_FD_VIRTIO_NET].revents = 0;
+		net_poll[NET_FD_TUN].revents = 0;
+
+		/* Start polling for data from tap and virtio net */
+		err = poll(net_poll, 2, -1);
+		if (err < 0) {
+			mpsslog("%s poll failed %s\n",
+				__func__, strerror(errno));
+			continue;
+		}
+		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
+			wait_for_card_driver(mic, mic->mic_net.virtio_net_fd,
+					VIRTIO_ID_NET);
+		/*
+		 * Check if there is data to be read from TUN and write to
+		 * virtio net fd if there is.
+		 */
+		if (net_poll[NET_FD_TUN].revents & POLLIN) {
+			copy.iov = iov0;
+			len = readv(net_poll[NET_FD_TUN].fd,
+				copy.iov, copy.iovcnt);
+			if (len > 0) {
+				struct virtio_net_hdr *hdr
+					= (struct virtio_net_hdr *) vnet_hdr[0];
+
+				/* Disable checksums on the card since we are on
+				   a reliable PCIe link */
+				hdr->flags |= VIRTIO_NET_HDR_F_DATA_VALID;
+#ifdef DEBUG
+				mpsslog("%s %s %d hdr->flags 0x%x ", mic->name,
+					__func__, __LINE__, hdr->flags);
+				mpsslog("copy.out_len %d hdr->gso_type 0x%x\n",
+					copy.out_len, hdr->gso_type);
+#endif
+#ifdef DEBUG
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d read from tap 0x%lx\n",
+					mic->name, __func__, __LINE__,
+					len);
+#endif
+				spin_for_descriptors(mic, &tx_vr);
+				txrx_prepare(VIRTIO_ID_NET, 1, &tx_vr, &copy,
+					len);
+
+				err = mic_virtio_copy(mic,
+					mic->mic_net.virtio_net_fd, &tx_vr,
+					&copy);
+				if (err < 0) {
+					mpsslog("%s %s %d mic_virtio_copy %s\n",
+						mic->name, __func__, __LINE__,
+						strerror(errno));
+				}
+				if (!err)
+					verify_out_len(mic, &copy);
+#ifdef DEBUG
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d wrote to net 0x%lx\n",
+					mic->name, __func__, __LINE__,
+					sum_iovec_len(&copy));
+#endif
+				/* Reinitialize IOV for next run */
+				iov0[1].iov_len = MAX_NET_PKT_SIZE;
+			} else if (len < 0) {
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d read failed %s ", mic->name,
+					__func__, __LINE__, strerror(errno));
+				mpsslog("cnt %d sum %d\n",
+					copy.iovcnt, sum_iovec_len(&copy));
+			}
+		}
+
+		/*
+		 * Check if there is data to be read from virtio net and
+		 * write to TUN if there is.
+		 */
+		if (net_poll[NET_FD_VIRTIO_NET].revents & POLLIN) {
+			while (rx_vr.info->avail_idx !=
+				le16toh(rx_vr.vr.avail->idx)) {
+				copy.iov = iov1;
+				txrx_prepare(VIRTIO_ID_NET, 0, &rx_vr, &copy,
+					MAX_NET_PKT_SIZE
+					+ sizeof(struct virtio_net_hdr));
+
+				err = mic_virtio_copy(mic,
+					mic->mic_net.virtio_net_fd, &rx_vr,
+					&copy);
+				if (!err) {
+#ifdef DEBUG
+					struct virtio_net_hdr *hdr
+						= (struct virtio_net_hdr *)
+							vnet_hdr[1];
+
+					mpsslog("%s %s %d hdr->flags 0x%x, ",
+						mic->name, __func__, __LINE__,
+						hdr->flags);
+					mpsslog("out_len %d gso_type 0x%x\n",
+						copy.out_len,
+						hdr->gso_type);
+#endif
+					/* Set the correct output iov_len */
+					iov1[1].iov_len = copy.out_len -
+						sizeof(struct virtio_net_hdr);
+					verify_out_len(mic, &copy);
+#ifdef DEBUG
+					disp_iovec(mic, &copy, __func__,
+						__LINE__);
+					mpsslog("%s %s %d ",
+						mic->name, __func__, __LINE__);
+					mpsslog("read from net 0x%lx\n",
+						sum_iovec_len(&copy));
+#endif
+					len = writev(net_poll[NET_FD_TUN].fd,
+						copy.iov, copy.iovcnt);
+					if (len != sum_iovec_len(&copy)) {
+						mpsslog("Tun write failed %s ",
+							strerror(errno));
+						mpsslog("len 0x%x ", len);
+						mpsslog("read_len 0x%x\n",
+							sum_iovec_len(&copy));
+					} else {
+#ifdef DEBUG
+						disp_iovec(mic, &copy, __func__,
+							__LINE__);
+						mpsslog("%s %s %d ",
+							mic->name, __func__,
+							__LINE__);
+						mpsslog("wrote to tap 0x%lx\n",
+							len);
+#endif
+					}
+				} else {
+					mpsslog("%s %s %d mic_virtio_copy %s\n",
+						mic->name, __func__, __LINE__,
+						strerror(errno));
+					break;
+				}
+			}
+		}
+		if (net_poll[NET_FD_VIRTIO_NET].revents & POLLERR)
+			mpsslog("%s: %s: POLLERR\n", __func__, mic->name);
+	}
+done:
+	pthread_exit(NULL);
+}
+
+/* virtio_console */
+#define VIRTIO_CONSOLE_FD 0
+#define MONITOR_FD (VIRTIO_CONSOLE_FD + 1)
+#define MAX_CONSOLE_FD (MONITOR_FD + 1)  /* must be the last one + 1 */
+#define MAX_BUFFER_SIZE PAGE_SIZE
+
+static void *
+virtio_console(void *arg)
+{
+	static __u8 vcons_buf[2][PAGE_SIZE];
+	struct iovec vcons_iov[2] = {
+		{ .iov_base = vcons_buf[0], .iov_len = sizeof(vcons_buf[0]) },
+		{ .iov_base = vcons_buf[1], .iov_len = sizeof(vcons_buf[1]) },
+	};
+	struct iovec *iov0 = &vcons_iov[0], *iov1 = &vcons_iov[1];
+	struct mic_info *mic = (struct mic_info *)arg;
+	int err;
+	struct pollfd console_poll[MAX_CONSOLE_FD];
+	int pty_fd;
+	char *pts_name;
+	ssize_t len;
+	struct mic_vring tx_vr, rx_vr;
+	struct mic_copy_desc copy;
+	struct mic_device_desc *desc;
+
+	pty_fd = posix_openpt(O_RDWR);
+	if (pty_fd < 0) {
+		mpsslog("can't open a pseudoterminal master device: %s\n",
+			strerror(errno));
+		goto _return;
+	}
+	pts_name = ptsname(pty_fd);
+	if (pts_name == NULL) {
+		mpsslog("can't get pts name\n");
+		goto _close_pty;
+	}
+	printf("%s console message goes to %s\n", mic->name, pts_name);
+	mpsslog("%s console message goes to %s\n", mic->name, pts_name);
+	err = grantpt(pty_fd);
+	if (err < 0) {
+		mpsslog("can't grant access: %s %s\n",
+				pts_name, strerror(errno));
+		goto _close_pty;
+	}
+	err = unlockpt(pty_fd);
+	if (err < 0) {
+		mpsslog("can't unlock a pseudoterminal: %s %s\n",
+				pts_name, strerror(errno));
+		goto _close_pty;
+	}
+	console_poll[MONITOR_FD].fd = pty_fd;
+	console_poll[MONITOR_FD].events = POLLIN;
+
+	console_poll[VIRTIO_CONSOLE_FD].fd = mic->mic_console.virtio_console_fd;
+	console_poll[VIRTIO_CONSOLE_FD].events = POLLIN;
+
+	if (MAP_FAILED == init_vr(mic, mic->mic_console.virtio_console_fd,
+		VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr,
+		virtcons_dev_page.dd.num_vq)) {
+		mpsslog("%s init_vr failed %s\n",
+			mic->name, strerror(errno));
+		goto _close_pty;
+	}
+
+	copy.iovcnt = 1;
+	desc = get_device_desc(mic, VIRTIO_ID_CONSOLE);
+
+	for (;;) {
+		console_poll[MONITOR_FD].revents = 0;
+		console_poll[VIRTIO_CONSOLE_FD].revents = 0;
+		err = poll(console_poll, MAX_CONSOLE_FD, -1);
+		if (err < 0) {
+			mpsslog("%s %d: poll failed: %s\n", __func__, __LINE__,
+				strerror(errno));
+			continue;
+		}
+		if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK))
+			wait_for_card_driver(mic,
+				mic->mic_console.virtio_console_fd,
+				VIRTIO_ID_CONSOLE);
+
+		if (console_poll[MONITOR_FD].revents & POLLIN) {
+			copy.iov = iov0;
+			len = readv(pty_fd, copy.iov, copy.iovcnt);
+			if (len > 0) {
+#ifdef DEBUG
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d read from pty 0x%lx\n",
+					mic->name, __func__, __LINE__,
+					len);
+#endif
+				spin_for_descriptors(mic, &tx_vr);
+				txrx_prepare(VIRTIO_ID_CONSOLE, 1, &tx_vr,
+					&copy, len);
+
+				err = mic_virtio_copy(mic,
+					mic->mic_console.virtio_console_fd,
+					&tx_vr, &copy);
+				if (err < 0) {
+					mpsslog("%s %s %d mic_virtio_copy %s\n",
+						mic->name, __func__, __LINE__,
+						strerror(errno));
+				}
+				if (!err)
+					verify_out_len(mic, &copy);
+#ifdef DEBUG
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d wrote to card 0x%lx\n",
+					mic->name, __func__, __LINE__,
+					sum_iovec_len(&copy));
+#endif
+				/* Reinitialize IOV for next run */
+				iov0->iov_len = PAGE_SIZE;
+			} else if (len < 0) {
+				disp_iovec(mic, &copy, __func__, __LINE__);
+				mpsslog("%s %s %d read failed %s ",
+					mic->name, __func__, __LINE__,
+					strerror(errno));
+				mpsslog("cnt %d sum %d\n",
+					copy.iovcnt, sum_iovec_len(&copy));
+			}
+		}
+
+		if (console_poll[VIRTIO_CONSOLE_FD].revents & POLLIN) {
+			while (rx_vr.info->avail_idx !=
+				le16toh(rx_vr.vr.avail->idx)) {
+				copy.iov = iov1;
+				txrx_prepare(VIRTIO_ID_CONSOLE, 0, &rx_vr,
+					&copy, PAGE_SIZE);
+
+				err = mic_virtio_copy(mic,
+					mic->mic_console.virtio_console_fd,
+					&rx_vr, &copy);
+				if (!err) {
+					/* Set the correct output iov_len */
+					iov1->iov_len = copy.out_len;
+					verify_out_len(mic, &copy);
+#ifdef DEBUG
+					disp_iovec(mic, &copy, __func__,
+						__LINE__);
+					mpsslog("%s %s %d ",
+						mic->name, __func__, __LINE__);
+					mpsslog("read from card 0x%lx\n",
+						sum_iovec_len(&copy));
+#endif
+					len = writev(pty_fd,
+						copy.iov, copy.iovcnt);
+					if (len != sum_iovec_len(&copy)) {
+						mpsslog("pty write failed %s ",
+							strerror(errno));
+						mpsslog("len 0x%zx ", len);
+						mpsslog("read_len 0x%x\n",
+							sum_iovec_len(&copy));
+					} else {
+#ifdef DEBUG
+						disp_iovec(mic, &copy, __func__,
+							__LINE__);
+						mpsslog("%s %s %d ",
+							mic->name, __func__,
+							__LINE__);
+						mpsslog("wrote to pty 0x%lx\n",
+							len);
+#endif
+					}
+				} else {
+					mpsslog("%s %s %d mic_virtio_copy %s\n",
+						mic->name, __func__, __LINE__,
+						strerror(errno));
+					break;
+				}
+			}
+		}
+		if (console_poll[VIRTIO_CONSOLE_FD].revents & POLLERR)
+			mpsslog("%s: %s: POLLERR\n", __func__, mic->name);
+	}
+_close_pty:
+	close(pty_fd);
+_return:
+	pthread_exit(NULL);
+}
+
+static void
+add_virtio_device(struct mic_info *mic, struct mic_device_desc *dd)
+{
+	char path[PATH_MAX];
+	int fd, err;
+
+	snprintf(path, PATH_MAX, "/dev/mic%d", mic->id);
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		mpsslog("Could not open %s %s\n", path, strerror(errno));
+		return;
+	}
+
+	err = ioctl(fd, MIC_VIRTIO_ADD_DEVICE, dd);
+	if (err < 0) {
+		mpsslog("Could not add %d %s\n", dd->type, strerror(errno));
+		close(fd);
+		return;
+	}
+	switch (dd->type) {
+	case VIRTIO_ID_NET:
+		mic->mic_net.virtio_net_fd = fd;
+		mpsslog("Added VIRTIO_ID_NET for %s\n", mic->name);
+		break;
+	case VIRTIO_ID_CONSOLE:
+		mic->mic_console.virtio_console_fd = fd;
+		mpsslog("Added VIRTIO_ID_CONSOLE for %s\n", mic->name);
+		break;
+	case VIRTIO_ID_BLOCK:
+		mic->mic_virtblk.virtio_block_fd = fd;
+		mpsslog("Added VIRTIO_ID_BLOCK for %s\n", mic->name);
+		break;
+	}
+}
+
+static bool
+set_backend_file(struct mic_info *mic)
+{
+	FILE *config;
+	char buff[PATH_MAX], *line, *evv, *p;
+
+	snprintf(buff, PATH_MAX, "%s/mpssd%03d.conf", mic_config_dir, mic->id);
+	config = fopen(buff, "r");
+	if (config == NULL)
+		return false;
+	do {  /* look for "virtblk_backend=XXXX" */
+		line = fgets(buff, PATH_MAX, config);
+		if (line == NULL)
+			break;
+		if (*line == '#')
+			continue;
+		p = strchr(line, '\n');
+		if (p)
+			*p = '\0';
+	} while (strncmp(line, virtblk_backend, strlen(virtblk_backend)) != 0);
+	fclose(config);
+	if (line == NULL)
+		return false;
+	evv = strchr(line, '=');
+	if (evv == NULL)
+		return false;
+	mic->mic_virtblk.backend_file = malloc(strlen(evv) + 1);
+	if (mic->mic_virtblk.backend_file == NULL) {
+		mpsslog("%s: can't allocate memory\n", mic->name);
+		return false;
+	}
+	strcpy(mic->mic_virtblk.backend_file, evv + 1);
+	return true;
+}
+
+#define SECTOR_SIZE 512
+static bool
+set_backend_size(struct mic_info *mic)
+{
+	mic->mic_virtblk.backend_size = lseek(mic->mic_virtblk.backend, 0,
+		SEEK_END);
+	if (mic->mic_virtblk.backend_size < 0) {
+		mpsslog("%s: can't seek: %s\n",
+			mic->name, mic->mic_virtblk.backend_file);
+		return false;
+	}
+	virtblk_dev_page.blk_config.capacity =
+		mic->mic_virtblk.backend_size / SECTOR_SIZE;
+	if ((mic->mic_virtblk.backend_size % SECTOR_SIZE) != 0)
+		virtblk_dev_page.blk_config.capacity++;
+
+	virtblk_dev_page.blk_config.capacity =
+		htole64(virtblk_dev_page.blk_config.capacity);
+
+	return true;
+}
+
+static bool
+open_backend(struct mic_info *mic)
+{
+	if (!set_backend_file(mic))
+		goto _error_exit;
+	mic->mic_virtblk.backend = open(mic->mic_virtblk.backend_file, O_RDWR);
+	if (mic->mic_virtblk.backend < 0) {
+		mpsslog("%s: can't open: %s\n", mic->name,
+			mic->mic_virtblk.backend_file);
+		goto _error_free;
+	}
+	if (!set_backend_size(mic))
+		goto _error_close;
+	mic->mic_virtblk.backend_addr = mmap(NULL,
+		mic->mic_virtblk.backend_size,
+		PROT_READ|PROT_WRITE, MAP_SHARED,
+		mic->mic_virtblk.backend, 0L);
+	if (mic->mic_virtblk.backend_addr == MAP_FAILED) {
+		mpsslog("%s: can't map: %s %s\n",
+			mic->name, mic->mic_virtblk.backend_file,
+			strerror(errno));
+		goto _error_close;
+	}
+	return true;
+
+ _error_close:
+	close(mic->mic_virtblk.backend);
+ _error_free:
+	free(mic->mic_virtblk.backend_file);
+ _error_exit:
+	return false;
+}
+
+static void
+close_backend(struct mic_info *mic)
+{
+	munmap(mic->mic_virtblk.backend_addr, mic->mic_virtblk.backend_size);
+	close(mic->mic_virtblk.backend);
+	free(mic->mic_virtblk.backend_file);
+}
+
+static bool
+start_virtblk(struct mic_info *mic, struct mic_vring *vring)
+{
+	if (((__u64)&virtblk_dev_page.blk_config % 8) != 0) {
+		mpsslog("%s: blk_config is not 8 byte aligned.\n",
+			mic->name);
+		return false;
+	}
+	add_virtio_device(mic, &virtblk_dev_page.dd);
+	if (MAP_FAILED == init_vr(mic, mic->mic_virtblk.virtio_block_fd,
+		VIRTIO_ID_BLOCK, vring, NULL, virtblk_dev_page.dd.num_vq)) {
+		mpsslog("%s init_vr failed %s\n",
+			mic->name, strerror(errno));
+		return false;
+	}
+	return true;
+}
+
+static void
+stop_virtblk(struct mic_info *mic)
+{
+	int vr_size, ret;
+
+	vr_size = PAGE_ALIGN(vring_size(MIC_VRING_ENTRIES,
+		MIC_VIRTIO_RING_ALIGN) + sizeof(struct _mic_vring_info));
+	ret = munmap(mic->mic_virtblk.block_dp,
+		MIC_DEVICE_PAGE_END + vr_size * virtblk_dev_page.dd.num_vq);
+	if (ret < 0)
+		mpsslog("%s munmap errno %d\n", mic->name, errno);
+	close(mic->mic_virtblk.virtio_block_fd);
+}
+
+static __u8
+header_error_check(struct vring_desc *desc)
+{
+	if (le32toh(desc->len) != sizeof(struct virtio_blk_outhdr)) {
+		mpsslog("%s() %d: length is not sizeof(virtio_blk_outhdr)\n",
+				__func__, __LINE__);
+		return -EIO;
+	}
+	if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT)) {
+		mpsslog("%s() %d: alone\n",
+			__func__, __LINE__);
+		return -EIO;
+	}
+	if (le16toh(desc->flags) & VRING_DESC_F_WRITE) {
+		mpsslog("%s() %d: not read\n",
+			__func__, __LINE__);
+		return -EIO;
+	}
+	return 0;
+}
+
+static int
+read_header(int fd, struct virtio_blk_outhdr *hdr, __u32 desc_idx)
+{
+	struct iovec iovec;
+	struct mic_copy_desc copy;
+
+	iovec.iov_len = sizeof(*hdr);
+	iovec.iov_base = hdr;
+	copy.iov = &iovec;
+	copy.iovcnt = 1;
+	copy.vr_idx = 0;  /* only one vring on virtio_block */
+	copy.update_used = false;  /* do not update used index */
+	return ioctl(fd, MIC_VIRTIO_COPY_DESC, &copy);
+}
+
+static int
+transfer_blocks(int fd, struct iovec *iovec, __u32 iovcnt)
+{
+	struct mic_copy_desc copy;
+
+	copy.iov = iovec;
+	copy.iovcnt = iovcnt;
+	copy.vr_idx = 0;  /* only one vring on virtio_block */
+	copy.update_used = false;  /* do not update used index */
+	return ioctl(fd, MIC_VIRTIO_COPY_DESC, &copy);
+}
+
+static __u8
+status_error_check(struct vring_desc *desc)
+{
+	if (le32toh(desc->len) != sizeof(__u8)) {
+		mpsslog("%s() %d: length is not sizeof(status)\n",
+			__func__, __LINE__);
+		return -EIO;
+	}
+	return 0;
+}
+
+static int
+write_status(int fd, __u8 *status)
+{
+	struct iovec iovec;
+	struct mic_copy_desc copy;
+
+	iovec.iov_base = status;
+	iovec.iov_len = sizeof(*status);
+	copy.iov = &iovec;
+	copy.iovcnt = 1;
+	copy.vr_idx = 0;  /* only one vring on virtio_block */
+	copy.update_used = true; /* Update used index */
+	return ioctl(fd, MIC_VIRTIO_COPY_DESC, &copy);
+}
+
+static void *
+virtio_block(void *arg)
+{
+	struct mic_info *mic = (struct mic_info *) arg;
+	int ret;
+	struct pollfd block_poll;
+	struct mic_vring vring;
+	__u16 avail_idx;
+	__u32 desc_idx;
+	struct vring_desc *desc;
+	struct iovec *iovec, *piov;
+	__u8 status;
+	__u32 buffer_desc_idx;
+	struct virtio_blk_outhdr hdr;
+	void *fos;
+
+	for (;;) {  /* forever */
+		if (!open_backend(mic)) { /* No virtblk */
+			for (mic->mic_virtblk.signaled = 0;
+				!mic->mic_virtblk.signaled;)
+				sleep(1);
+			continue;
+		}
+
+		/* backend file is specified. */
+		if (!start_virtblk(mic, &vring))
+			goto _close_backend;
+		iovec = malloc(sizeof(*iovec) *
+			le32toh(virtblk_dev_page.blk_config.seg_max));
+		if (!iovec) {
+			mpsslog("%s: can't alloc iovec: %s\n",
+				mic->name, strerror(ENOMEM));
+			goto _stop_virtblk;
+		}
+
+		block_poll.fd = mic->mic_virtblk.virtio_block_fd;
+		block_poll.events = POLLIN;
+		for (mic->mic_virtblk.signaled = 0;
+		     !mic->mic_virtblk.signaled;) {
+			block_poll.revents = 0;
+					/* timeout in 1 sec to see signaled */
+			ret = poll(&block_poll, 1, 1000);
+			if (ret < 0) {
+				mpsslog("%s %d: poll failed: %s\n",
+					__func__, __LINE__,
+					strerror(errno));
+				continue;
+			}
+
+			if (!(block_poll.revents & POLLIN)) {
+#ifdef DEBUG
+				mpsslog("%s %d: block_poll.revents=0x%x\n",
+					__func__, __LINE__, block_poll.revents);
+#endif
+				continue;
+			}
+
+			/* POLLIN */
+			while (vring.info->avail_idx !=
+				le16toh(vring.vr.avail->idx)) {
+				/* read header element */
+				avail_idx =
+					vring.info->avail_idx &
+					(vring.vr.num - 1);
+				desc_idx = le16toh(
+					vring.vr.avail->ring[avail_idx]);
+				desc = &vring.vr.desc[desc_idx];
+#ifdef DEBUG
+				mpsslog("%s() %d: avail_idx=%d ",
+					__func__, __LINE__,
+					vring.info->avail_idx);
+				mpsslog("vring.vr.num=%d desc=%p\n",
+					vring.vr.num, desc);
+#endif
+				status = header_error_check(desc);
+				ret = read_header(
+					mic->mic_virtblk.virtio_block_fd,
+					&hdr, desc_idx);
+				if (ret < 0) {
+					mpsslog("%s() %d %s: ret=%d %s\n",
+						__func__, __LINE__,
+						mic->name, ret,
+						strerror(errno));
+					break;
+				}
+				/* buffer element */
+				piov = iovec;
+				status = 0;
+				fos = mic->mic_virtblk.backend_addr +
+					(hdr.sector * SECTOR_SIZE);
+				buffer_desc_idx = desc_idx =
+					next_desc(desc);
+				for (desc = &vring.vr.desc[buffer_desc_idx];
+				     desc->flags & VRING_DESC_F_NEXT;
+				     desc_idx = next_desc(desc),
+					     desc = &vring.vr.desc[desc_idx]) {
+					piov->iov_len = desc->len;
+					piov->iov_base = fos;
+					piov++;
+					fos += desc->len;
+				}
+				/* Returning NULLs for VIRTIO_BLK_T_GET_ID. */
+				if (hdr.type & ~(VIRTIO_BLK_T_OUT |
+					VIRTIO_BLK_T_GET_ID)) {
+					/*
+					  VIRTIO_BLK_T_IN - does not do
+					  anything. Probably for documenting.
+					  VIRTIO_BLK_T_SCSI_CMD - for
+					  virtio_scsi.
+					  VIRTIO_BLK_T_FLUSH - turned off in
+					  config space.
+					  VIRTIO_BLK_T_BARRIER - defined but not
+					  used in anywhere.
+					*/
+					mpsslog("%s() %d: type %x ",
+						__func__, __LINE__,
+						hdr.type);
+					mpsslog("is not supported\n");
+					status = -ENOTSUP;
+
+				} else {
+					ret = transfer_blocks(
+					mic->mic_virtblk.virtio_block_fd,
+						iovec,
+						piov - iovec);
+					if (ret < 0 &&
+						status == 0)
+						status = ret;
+				}
+				/* write status and update used pointer */
+				if (status == 0)
+					status = status_error_check(desc);
+				ret = write_status(
+					mic->mic_virtblk.virtio_block_fd,
+					&status);
+#ifdef DEBUG
+				mpsslog("%s() %d: write status=%d on desc=%p\n",
+					__func__, __LINE__,
+					status, desc);
+#endif
+			}
+		}
+		free(iovec);
+_stop_virtblk:
+		stop_virtblk(mic);
+_close_backend:
+		close_backend(mic);
+	}  /* forever */
+
+	pthread_exit(NULL);
+}
+
+static void
+reset(struct mic_info *mic)
+{
+#define RESET_TIMEOUT 120
+	int i = RESET_TIMEOUT;
+	setsysfs(mic->name, "state", "reset");
+	while (i) {
+		char *state;
+		state = readsysfs(mic->name, "state");
+		if (!state)
+			goto retry;
+		mpsslog("%s: %s %d state %s\n",
+			mic->name, __func__, __LINE__, state);
+		if ((!strcmp(state, "offline"))) {
+			free(state);
+			break;
+		}
+		free(state);
+retry:
+		sleep(1);
+		i--;
+	}
+}
+
+static int
+get_mic_shutdown_status(struct mic_info *mic, char *shutdown_status)
+{
+	if (!strcmp(shutdown_status, "nop"))
+		return MIC_NOP;
+	if (!strcmp(shutdown_status, "crashed"))
+		return MIC_CRASHED;
+	if (!strcmp(shutdown_status, "halted"))
+		return MIC_HALTED;
+	if (!strcmp(shutdown_status, "poweroff"))
+		return MIC_POWER_OFF;
+	if (!strcmp(shutdown_status, "restart"))
+		return MIC_RESTART;
+	mpsslog("%s: BUG invalid status %s\n", mic->name, shutdown_status);
+	/* Invalid state */
+	assert(0);
+}
+
+static int get_mic_state(struct mic_info *mic, char *state)
+{
+	if (!strcmp(state, "offline"))
+		return MIC_OFFLINE;
+	if (!strcmp(state, "online"))
+		return MIC_ONLINE;
+	if (!strcmp(state, "shutting_down"))
+		return MIC_SHUTTING_DOWN;
+	if (!strcmp(state, "reset_failed"))
+		return MIC_RESET_FAILED;
+	mpsslog("%s: BUG invalid state %s\n", mic->name, state);
+	/* Invalid state */
+	assert(0);
+}
+
+static void mic_handle_shutdown(struct mic_info *mic)
+{
+#define SHUTDOWN_TIMEOUT 60
+	int i = SHUTDOWN_TIMEOUT, ret, stat = 0;
+	char *shutdown_status;
+	while (i) {
+		shutdown_status = readsysfs(mic->name, "shutdown_status");
+		if (!shutdown_status)
+			continue;
+		mpsslog("%s: %s %d shutdown_status %s\n",
+			mic->name, __func__, __LINE__, shutdown_status);
+		switch (get_mic_shutdown_status(mic, shutdown_status)) {
+		case MIC_RESTART:
+			mic->restart = 1;	/* fall through */
+		case MIC_HALTED:
+		case MIC_POWER_OFF:
+		case MIC_CRASHED:
+			free(shutdown_status);
+			goto reset;
+		default:
+			break;
+		}
+		free(shutdown_status);
+		sleep(1);
+		i--;
+	}
+reset:
+	ret = kill(mic->pid, SIGTERM);
+	mpsslog("%s: %s %d kill pid %d ret %d\n",
+		mic->name, __func__, __LINE__,
+		mic->pid, ret);
+	if (!ret) {
+		ret = waitpid(mic->pid, &stat,
+			0 /* no options: block until the child exits */);
+		mpsslog("%s: %s %d waitpid ret %d pid %d\n",
+			mic->name, __func__, __LINE__,
+			ret, mic->pid);
+	}
+	if (ret == mic->pid)
+		reset(mic);
+}
+
+static void *
+mic_config(void *arg)
+{
+	struct mic_info *mic = (struct mic_info *)arg;
+	char *state = NULL;
+	char pathname[PATH_MAX];
+	int fd, ret;
+	struct pollfd ufds[1];
+	char value[4096];
+
+	snprintf(pathname, PATH_MAX - 1, "%s/%s/%s",
+		MICSYSFSDIR, mic->name, "state");
+
+	fd = open(pathname, O_RDONLY);
+	if (fd < 0) {
+		mpsslog("%s: opening file %s failed %s\n",
+			mic->name, pathname, strerror(errno));
+		goto error;
+	}
+
+	do {
+		ret = read(fd, value, sizeof(value));
+		if (ret < 0) {
+			mpsslog("%s: Failed to read sysfs entry '%s': %s\n",
+				mic->name, pathname, strerror(errno));
+			goto close_error1;
+		}
+retry:
+		state = readsysfs(mic->name, "state");
+		if (!state)
+			goto retry;
+		mpsslog("%s: %s %d state %s\n",
+			mic->name, __func__, __LINE__, state);
+		switch (get_mic_state(mic, state)) {
+		case MIC_SHUTTING_DOWN:
+			mic_handle_shutdown(mic);
+			goto close_error;
+		default:
+			break;
+		}
+		free(state);
+
+		ufds[0].fd = fd;
+		ufds[0].events = POLLERR | POLLPRI;
+		ret = poll(ufds, 1, -1);
+		if (ret < 0) {
+			mpsslog("%s: poll failed %s\n",
+				mic->name, strerror(errno));
+			goto close_error1;
+		}
+	} while (1);
+close_error:
+	free(state);
+close_error1:
+	close(fd);
+error:
+	init_mic(mic);
+	pthread_exit(NULL);
+}
+
+static void
+set_cmdline(struct mic_info *mic)
+{
+	char buffer[PATH_MAX];
+	int len;
+
+	len = snprintf(buffer, PATH_MAX,
+		"clocksource=tsc highres=off nohz=off ");
+	len += snprintf(buffer + len, PATH_MAX - len,
+		"cpufreq_on;corec6_off;pc3_off;pc6_off ");
+	len += snprintf(buffer + len, PATH_MAX - len,
+		"ifcfg=static;address,172.31.%d.1;netmask,255.255.255.0",
+		mic->id);
+
+	setsysfs(mic->name, "cmdline", buffer);
+	mpsslog("%s: Command line: \"%s\"\n", mic->name, buffer);
+	snprintf(buffer, PATH_MAX, "172.31.%d.1", mic->id);
+	mpsslog("%s: IPADDR: \"%s\"\n", mic->name, buffer);
+}
+
+static void
+set_log_buf_info(struct mic_info *mic)
+{
+	int fd;
+	off_t len;
+	char system_map[] = "/lib/firmware/mic/System.map";
+	char *map, *temp, log_buf[17] = {'\0'};
+
+	fd = open(system_map, O_RDONLY);
+	if (fd < 0) {
+		mpsslog("%s: Opening System.map failed: %d\n",
+			mic->name, errno);
+		return;
+	}
+	len = lseek(fd, 0, SEEK_END);
+	if (len < 0) {
+		mpsslog("%s: Reading System.map size failed: %d\n",
+			mic->name, errno);
+		close(fd);
+		return;
+	}
+	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED) {
+		mpsslog("%s: mmap of System.map failed: %d\n",
+			mic->name, errno);
+		close(fd);
+		return;
+	}
+	temp = strstr(map, "__log_buf");
+	if (!temp) {
+		mpsslog("%s: __log_buf not found: %d\n", mic->name, errno);
+		munmap(map, len);
+		close(fd);
+		return;
+	}
+	strncpy(log_buf, temp - 19, 16);
+	setsysfs(mic->name, "log_buf_addr", log_buf);
+	mpsslog("%s: log_buf_addr: %s\n", mic->name, log_buf);
+	temp = strstr(map, "log_buf_len");
+	if (!temp) {
+		mpsslog("%s: log_buf_len not found: %d\n", mic->name, errno);
+		munmap(map, len);
+		close(fd);
+		return;
+	}
+	strncpy(log_buf, temp - 19, 16);
+	setsysfs(mic->name, "log_buf_len", log_buf);
+	mpsslog("%s: log_buf_len: %s\n", mic->name, log_buf);
+	munmap(map, len);
+	close(fd);
+}
+
+static void init_mic(struct mic_info *mic);
+
+static void
+change_virtblk_backend(int x, siginfo_t *siginfo, void *p)
+{
+	struct mic_info *mic;
+
+	for (mic = mic_list.next; mic != NULL; mic = mic->next)
+		mic->mic_virtblk.signaled = 1/* true */;
+}
+
+static void
+init_mic(struct mic_info *mic)
+{
+	struct sigaction ignore = {
+		.sa_flags = 0,
+		.sa_handler = SIG_IGN
+	};
+	struct sigaction act = {
+		.sa_flags = SA_SIGINFO,
+		.sa_sigaction = change_virtblk_backend,
+	};
+	char buffer[PATH_MAX];
+	int err;
+
+	/*
+	 * Currently, one virtio block device is supported for each MIC card
+	 * at a time. Any user (or test) can send a SIGUSR1 to the MIC daemon.
+	 * The signal informs the virtio block backend about a change in the
+	 * configuration file which specifies the virtio backend file name on
+	 * the host. Virtio block backend then re-reads the configuration file
+	 * and switches to the new block device. This signalling mechanism may
+	 * not be required once multiple virtio block devices are supported by
+	 * the MIC daemon.
+	 */
+	sigaction(SIGUSR1, &ignore, NULL);
+
+	mic->pid = fork();
+	switch (mic->pid) {
+	case 0:
+		set_log_buf_info(mic);
+		set_cmdline(mic);
+		add_virtio_device(mic, &virtcons_dev_page.dd);
+		add_virtio_device(mic, &virtnet_dev_page.dd);
+		err = pthread_create(&mic->mic_console.console_thread, NULL,
+			virtio_console, mic);
+		if (err)
+			mpsslog("%s virtcons pthread_create failed %s\n",
+			mic->name, strerror(err));
+		err = pthread_create(&mic->mic_net.net_thread, NULL,
+			virtio_net, mic);
+		if (err)
+			mpsslog("%s virtnet pthread_create failed %s\n",
+			mic->name, strerror(err));
+		err = pthread_create(&mic->mic_virtblk.block_thread, NULL,
+			virtio_block, mic);
+		if (err)
+			mpsslog("%s virtblk pthread_create failed %s\n",
+			mic->name, strerror(err));
+		sigemptyset(&act.sa_mask);
+		err = sigaction(SIGUSR1, &act, NULL);
+		if (err)
+			mpsslog("%s sigaction SIGUSR1 failed %s\n",
+			mic->name, strerror(errno));
+		while (1)
+			sleep(60);
+	case -1:
+		mpsslog("fork failed MIC name %s id %d errno %d\n",
+			mic->name, mic->id, errno);
+		break;
+	default:
+		if (mic->restart) {
+			snprintf(buffer, PATH_MAX, "boot");
+			setsysfs(mic->name, "state", buffer);
+			mpsslog("%s restarting mic %d\n",
+				mic->name, mic->restart);
+			mic->restart = 0;
+		}
+		pthread_create(&mic->config_thread, NULL, mic_config, mic);
+	}
+}
+
+static void
+start_daemon(void)
+{
+	struct mic_info *mic;
+
+	for (mic = mic_list.next; mic != NULL; mic = mic->next)
+		init_mic(mic);
+
+	while (1)
+		sleep(60);
+}
+
+static int
+init_mic_list(void)
+{
+	struct mic_info *mic = &mic_list;
+	struct dirent *file;
+	DIR *dp;
+	int cnt = 0;
+
+	dp = opendir(MICSYSFSDIR);
+	if (!dp)
+		return 0;
+
+	while ((file = readdir(dp)) != NULL) {
+		if (!strncmp(file->d_name, "mic", 3)) {
+			mic->next = malloc(sizeof(struct mic_info));
+			if (mic->next) {
+				mic = mic->next;
+				mic->next = NULL;
+				memset(mic, 0, sizeof(struct mic_info));
+				mic->id = atoi(&file->d_name[3]);
+				mic->name = malloc(strlen(file->d_name) + 16);
+				if (mic->name)
+					strcpy(mic->name, file->d_name);
+				mpsslog("MIC name %s id %d\n", mic->name,
+					mic->id);
+				cnt++;
+			}
+		}
+	}
+
+	closedir(dp);
+	return cnt;
+}
+
+void
+mpsslog(char *format, ...)
+{
+	va_list args;
+	char buffer[4096];
+	char ts[52], *ts1;
+	time_t t;
+
+	if (logfp == NULL)
+		return;
+
+	va_start(args, format);
+	vsnprintf(buffer, sizeof(buffer), format, args);
+	va_end(args);
+
+	time(&t);
+	ts1 = ctime_r(&t, ts);
+	ts1[strlen(ts1) - 1] = '\0';
+	fprintf(logfp, "%s: %s", ts1, buffer);
+
+	fflush(logfp);
+}
+
+int
+main(int argc, char *argv[])
+{
+	int cnt;
+	pid_t pid;
+
+	myname = argv[0];
+
+	logfp = fopen(LOGFILE_NAME, "a+");
+	if (!logfp) {
+		fprintf(stderr, "cannot open logfile '%s'\n", LOGFILE_NAME);
+		exit(1);
+	}
+	pid = fork();
+	switch (pid) {
+	case 0:
+		break;
+	case -1:
+		exit(2);
+	default:
+		exit(0);
+	}
+
+	mpsslog("MIC Daemon start\n");
+
+	cnt = init_mic_list();
+	if (cnt == 0) {
+		mpsslog("MIC module not loaded\n");
+		exit(3);
+	}
+	mpsslog("MIC found %d devices\n", cnt);
+
+	start_daemon();
+
+	exit(0);
+}
diff --git a/Documentation/mic/mpssd/mpssd.h b/Documentation/mic/mpssd/mpssd.h
new file mode 100644
index 0000000..b6dee38
--- /dev/null
+++ b/Documentation/mic/mpssd/mpssd.h
@@ -0,0 +1,100 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC User Space Tools.
+ */
+#ifndef _MPSSD_H_
+#define _MPSSD_H_
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <dirent.h>
+#include <libgen.h>
+#include <pthread.h>
+#include <stdarg.h>
+#include <time.h>
+#include <errno.h>
+#include <sys/dir.h>
+#include <sys/ioctl.h>
+#include <sys/poll.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <sys/utsname.h>
+#include <sys/wait.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <pthread.h>
+#include <signal.h>
+#include <limits.h>
+#include <syslog.h>
+#include <getopt.h>
+#include <net/if.h>
+#include <linux/if_tun.h>
+#include <linux/if_tun.h>
+#include <linux/virtio_ids.h>
+
+#define MICSYSFSDIR "/sys/class/mic"
+#define LOGFILE_NAME "/var/log/mpssd"
+#define PAGE_SIZE 4096
+
+struct mic_console_info {
+	pthread_t       console_thread;
+	int		virtio_console_fd;
+	void		*console_dp;
+};
+
+struct mic_net_info {
+	pthread_t       net_thread;
+	int		virtio_net_fd;
+	int		tap_fd;
+	void		*net_dp;
+};
+
+struct mic_virtblk_info {
+	pthread_t       block_thread;
+	int		virtio_block_fd;
+	void		*block_dp;
+	volatile sig_atomic_t	signaled;
+	char		*backend_file;
+	int		backend;
+	void		*backend_addr;
+	long		backend_size;
+};
+
+struct mic_info {
+	int		id;
+	char		*name;
+	pthread_t       config_thread;
+	pid_t		pid;
+	struct mic_console_info	mic_console;
+	struct mic_net_info	mic_net;
+	struct mic_virtblk_info	mic_virtblk;
+	int		restart;
+	struct mic_info *next;
+};
+
+void mpsslog(char *format, ...);
+char *readsysfs(char *dir, char *entry);
+int setsysfs(char *dir, char *entry, char *value);
+#endif
diff --git a/Documentation/mic/mpssd/sysfs.c b/Documentation/mic/mpssd/sysfs.c
new file mode 100644
index 0000000..11de72b
--- /dev/null
+++ b/Documentation/mic/mpssd/sysfs.c
@@ -0,0 +1,102 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC User Space Tools.
+ */
+
+#include "mpssd.h"
+
+#define PAGE_SIZE 4096
+
+char *
+readsysfs(char *dir, char *entry)
+{
+	char filename[PATH_MAX];
+	char value[PAGE_SIZE];
+	char *string = NULL;
+	int fd;
+	int len;
+
+	if (dir == NULL)
+		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
+	else
+		snprintf(filename, PATH_MAX,
+			"%s/%s/%s", MICSYSFSDIR, dir, entry);
+
+	fd = open(filename, O_RDONLY);
+	if (fd < 0) {
+		mpsslog("Failed to open sysfs entry '%s': %s\n",
+			filename, strerror(errno));
+		return NULL;
+	}
+
+	len = read(fd, value, sizeof(value));
+	if (len < 0) {
+		mpsslog("Failed to read sysfs entry '%s': %s\n",
+			filename, strerror(errno));
+		goto readsys_ret;
+	}
+	if (len == 0)
+		goto readsys_ret;
+
+	value[len - 1] = '\0';
+
+	string = malloc(strlen(value) + 1);
+	if (string)
+		strcpy(string, value);
+
+readsys_ret:
+	close(fd);
+	return string;
+}
+
+int
+setsysfs(char *dir, char *entry, char *value)
+{
+	char filename[PATH_MAX];
+	char *oldvalue;
+	int fd, ret = 0;
+
+	if (dir == NULL)
+		snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry);
+	else
+		snprintf(filename, PATH_MAX, "%s/%s/%s",
+			MICSYSFSDIR, dir, entry);
+
+	oldvalue = readsysfs(dir, entry);
+
+	fd = open(filename, O_RDWR);
+	if (fd < 0) {
+		ret = errno;
+		mpsslog("Failed to open sysfs entry '%s': %s\n",
+			filename, strerror(errno));
+		goto done;
+	}
+
+	if (!oldvalue || strcmp(value, oldvalue)) {
+		if (write(fd, value, strlen(value)) < 0) {
+			ret = errno;
+			mpsslog("Failed to write new sysfs entry '%s': %s\n",
+				filename, strerror(errno));
+		}
+	}
+	close(fd);
+done:
+	if (oldvalue)
+		free(oldvalue);
+	return ret;
+}
-- 
1.8.2.1
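
A note on the SIGUSR1 handling described in the init_mic() comment in
mpssd.c: the handler only sets mic_virtblk.signaled, and the virtio_block
thread notices the flag on its next one-second poll timeout. The same
pattern in isolation might look like the sketch below. It is an
illustration only, not code from the patch; reload_requested, on_sigusr1(),
install_reload_handler() and take_reload_request() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flag, playing the role of mic_virtblk.signaled. */
static volatile sig_atomic_t reload_requested;

/* Async-signal-safe handler: it only stores to a sig_atomic_t. */
static void on_sigusr1(int signo)
{
	(void)signo;
	reload_requested = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on failure. */
static int install_reload_handler(void)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = on_sigusr1;
	sigemptyset(&act.sa_mask);
	return sigaction(SIGUSR1, &act, NULL);
}

/* Consume a pending request: returns 1 if one was pending, else 0. */
static int take_reload_request(void)
{
	if (!reload_requested)
		return 0;
	reload_requested = 0;
	return 1;
}
```

A worker loop would call take_reload_request() after each poll() timeout
and reopen its backend file when it returns 1. Keeping the handler down to
a single store is what makes it safe to run at any point in the thread.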

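A note on set_cmdline() in mpssd.c: it assembles the card command line
from several snprintf() calls into one buffer. When appending this way,
the size passed to each call has to shrink by what has already been
written, and the return value has to be checked for truncation. A minimal
sketch of such a helper follows; append_fmt() is a hypothetical name and
is not part of the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Append formatted text at offset *len of buf, whose capacity is cap.
 * Returns 0 on success. Returns -1 if the text did not fit; in that
 * case the truncated tail is dropped and *len is left unchanged.
 */
static int append_fmt(char *buf, size_t cap, size_t *len,
		      const char *fmt, ...)
{
	va_list args;
	int n;

	assert(*len < cap);
	va_start(args, fmt);
	/* Only the remaining space is offered to vsnprintf(). */
	n = vsnprintf(buf + *len, cap - *len, fmt, args);
	va_end(args);
	if (n < 0 || (size_t)n >= cap - *len) {
		buf[*len] = '\0';
		return -1;
	}
	*len += (size_t)n;
	return 0;
}
```

With a helper like this, a caller can chain appends and stop at the first
-1 instead of silently handing a truncated string to setsysfs().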

^ permalink raw reply related	[relevance 24%]

* [PATCH RESEND v3 6/7] Intel MIC Card Driver Changes for Virtio Devices.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
                   ` (4 preceding siblings ...)
  2013-09-05 23:42 26% ` [PATCH RESEND v3 5/7] Intel MIC Host Driver Changes for Virtio Devices Sudeep Dutt
@ 2013-09-05 23:42 46% ` Sudeep Dutt
  2013-09-05 23:42 24% ` [PATCH RESEND v3 7/7] Sample Implementation of Intel MIC User Space Daemon Sudeep Dutt
    7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

This patch introduces the card "Virtio over PCIe" interface for
Intel MIC. It allows virtio drivers on the card to communicate with their
user space backends on the host via a device page. Ring 3 apps on the host
can add, remove and configure virtio devices. A thin MIC-specific
virtio_config_ops is implemented, borrowing heavily from the earlier
lguest and s390 implementations in:
drivers/lguest/lguest_device.c
drivers/s390/kvm/kvm_virtio.c

Co-author: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
 drivers/misc/mic/Kconfig           |   1 +
 drivers/misc/mic/card/Makefile     |   1 +
 drivers/misc/mic/card/mic_device.c |   7 +
 drivers/misc/mic/card/mic_virtio.c | 631 +++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/card/mic_virtio.h |  77 +++++
 5 files changed, 717 insertions(+)
 create mode 100644 drivers/misc/mic/card/mic_virtio.c
 create mode 100644 drivers/misc/mic/card/mic_virtio.h

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 01f1a4a..d453768 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -24,6 +24,7 @@ comment "Intel MIC Card Driver"
 config INTEL_MIC_CARD
 	tristate "Intel MIC Card Driver"
 	depends on 64BIT
+	select VIRTIO
 	default N
 	help
 	  This enables card driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/card/Makefile b/drivers/misc/mic/card/Makefile
index 6e9675e..69d58be 100644
--- a/drivers/misc/mic/card/Makefile
+++ b/drivers/misc/mic/card/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_INTEL_MIC_CARD) += mic_card.o
 mic_card-y += mic_x100.o
 mic_card-y += mic_device.o
 mic_card-y += mic_debugfs.o
+mic_card-y += mic_virtio.o
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
index 3c5c302..4125217 100644
--- a/drivers/misc/mic/card/mic_device.c
+++ b/drivers/misc/mic/card/mic_device.c
@@ -32,6 +32,7 @@
 #include <linux/mic_common.h>
 #include "../common/mic_device.h"
 #include "mic_device.h"
+#include "mic_virtio.h"
 
 static struct mic_driver *g_drv;
 static struct mic_irq *shutdown_cookie;
@@ -265,10 +266,15 @@ int __init mic_driver_init(struct mic_driver *mdrv)
 	rc = mic_shutdown_init();
 	if (rc)
 		goto irq_uninit;
+	rc = mic_devices_init(mdrv);
+	if (rc)
+		goto shutdown_uninit;
 	mic_create_card_debug_dir(mdrv);
 	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
 done:
 	return rc;
+shutdown_uninit:
+	mic_shutdown_uninit();
 irq_uninit:
 	mic_uninit_irq();
 dp_uninit:
@@ -286,6 +292,7 @@ put:
 void mic_driver_uninit(struct mic_driver *mdrv)
 {
 	mic_delete_card_debug_dir(mdrv);
+	mic_devices_uninit(mdrv);
 	/*
 	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
 	 * The module cannot be unloaded so the only code path to call
diff --git a/drivers/misc/mic/card/mic_virtio.c b/drivers/misc/mic/card/mic_virtio.c
new file mode 100644
index 0000000..38275c1
--- /dev/null
+++ b/drivers/misc/mic/card/mic_virtio.c
@@ -0,0 +1,631 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Adapted from:
+ *
+ * virtio for kvm on s390
+ *
+ * Copyright IBM Corp. 2008
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License (version 2 only)
+ * as published by the Free Software Foundation.
+ *
+ *    Author(s): Christian Borntraeger <borntraeger@de.ibm.com>
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#include <linux/delay.h>
+#include <linux/slab.h>
+#include <linux/virtio_config.h>
+
+#include "../common/mic_device.h"
+#include "mic_virtio.h"
+
+#define VIRTIO_SUBCODE_64 0x0D00
+
+#define MIC_MAX_VRINGS                4
+struct mic_vdev {
+	struct virtio_device vdev;
+	struct mic_device_desc __iomem *desc;
+	struct mic_device_ctrl __iomem *dc;
+	struct mic_device *mdev;
+	void __iomem *vr[MIC_MAX_VRINGS];
+	int used_size[MIC_MAX_VRINGS];
+	struct completion reset_done;
+	struct mic_irq *virtio_cookie;
+	int c2h_vdev_db;
+};
+
+static struct mic_irq *virtio_config_cookie;
+#define to_micvdev(vd) container_of(vd, struct mic_vdev, vdev)
+
+/* Helper API to obtain the parent of the virtio device */
+static inline struct device *mic_dev(struct mic_vdev *mvdev)
+{
+	return mvdev->vdev.dev.parent;
+}
+
+/* This gets the device's feature bits. */
+static u32 mic_get_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	u32 features = 0;
+	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
+	u8 __iomem *in_features = mic_vq_features(desc);
+	int feature_len = ioread8(&desc->feature_len);
+
+	bits = min_t(unsigned, feature_len,
+		sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++)
+		if (ioread8(&in_features[i / 8]) & (BIT(i % 8)))
+			features |= BIT(i);
+
+	return features;
+}
+
+static void mic_finalize_features(struct virtio_device *vdev)
+{
+	unsigned int i, bits;
+	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
+	u8 feature_len = ioread8(&desc->feature_len);
+	/* Second half of bitmap is features we accept. */
+	u8 __iomem *out_features =
+		mic_vq_features(desc) + feature_len;
+
+	/* Give virtio_ring a chance to accept features. */
+	vring_transport_features(vdev);
+
+	memset_io(out_features, 0, feature_len);
+	bits = min_t(unsigned, feature_len,
+		sizeof(vdev->features)) * 8;
+	for (i = 0; i < bits; i++) {
+		if (test_bit(i, vdev->features))
+			iowrite8(ioread8(&out_features[i / 8]) | (1 << (i % 8)),
+				&out_features[i / 8]);
+	}
+}
+
+/*
+ * Reading and writing elements in config space
+ */
+static void mic_get(struct virtio_device *vdev, unsigned int offset,
+		   void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_fromio(buf, mic_vq_configspace(desc) + offset, len);
+}
+
+static void mic_set(struct virtio_device *vdev, unsigned int offset,
+		   const void *buf, unsigned len)
+{
+	struct mic_device_desc __iomem *desc = to_micvdev(vdev)->desc;
+
+	if (offset + len > ioread8(&desc->config_len))
+		return;
+	memcpy_toio(mic_vq_configspace(desc) + offset, buf, len);
+}
+
+/*
+ * The operations to get and set the status word just access the status
+ * field of the device descriptor. set_status also interrupts the host
+ * to tell about status changes.
+ */
+static u8 mic_get_status(struct virtio_device *vdev)
+{
+	return ioread8(&to_micvdev(vdev)->desc->status);
+}
+
+static void mic_set_status(struct virtio_device *vdev, u8 status)
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+	if (!status)
+		return;
+	iowrite8(status, &mvdev->desc->status);
+	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
+}
+
+/* Inform host on a virtio device reset and wait for ack from host */
+static void mic_reset_inform_host(struct virtio_device *vdev)
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+	struct mic_device_ctrl __iomem *dc = mvdev->dc;
+	int retry = 100, i;
+
+	iowrite8(0, &dc->host_ack);
+	iowrite8(1, &dc->vdev_reset);
+	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
+
+	/* Wait till host completes all card accesses and acks the reset */
+	for (i = retry; i--;) {
+		if (ioread8(&dc->host_ack))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, i);
+
+	/* Reset status to 0 in case we timed out */
+	iowrite8(0, &mvdev->desc->status);
+}
+
+static void mic_reset(struct virtio_device *vdev)
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+
+	dev_dbg(mic_dev(mvdev), "%s: virtio id %d\n",
+		__func__, vdev->id.device);
+
+	mic_reset_inform_host(vdev);
+	complete_all(&mvdev->reset_done);
+}
+
+/*
+ * The virtio_ring code calls this API when it wants to notify the Host.
+ */
+static void mic_notify(struct virtqueue *vq)
+{
+	struct mic_vdev *mvdev = vq->priv;
+
+	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
+}
+
+static void mic_del_vq(struct virtqueue *vq, int n)
+{
+	struct mic_vdev *mvdev = to_micvdev(vq->vdev);
+	struct vring *vr = (struct vring *) (vq + 1);
+
+	free_pages((unsigned long) vr->used,
+		get_order(mvdev->used_size[n]));
+	vring_del_virtqueue(vq);
+	mic_card_unmap(mvdev->mdev, mvdev->vr[n]);
+	mvdev->vr[n] = NULL;
+}
+
+static void mic_del_vqs(struct virtio_device *vdev)
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+	struct virtqueue *vq, *n;
+	int idx = 0;
+
+	dev_dbg(mic_dev(mvdev), "%s\n", __func__);
+
+	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
+		mic_del_vq(vq, idx++);
+}
+
+/*
+ * This routine will assign vrings allocated in host/io memory. Code in
+ * virtio_ring.c however continues to access this io memory as if it were local
+ * memory without io accessors.
+ */
+static struct virtqueue *mic_find_vq(struct virtio_device *vdev,
+				     unsigned index,
+				     void (*callback)(struct virtqueue *vq),
+				     const char *name)
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+	struct mic_vqconfig __iomem *vqconfig;
+	struct mic_vqconfig config;
+	struct virtqueue *vq;
+	void __iomem *va;
+	struct _mic_vring_info __iomem *info;
+	void *used;
+	int vr_size, _vr_size, err, magic;
+	struct vring *vr;
+	u8 type = ioread8(&mvdev->desc->type);
+
+	if (index >= ioread8(&mvdev->desc->num_vq))
+		return ERR_PTR(-ENOENT);
+
+	if (!name)
+		return ERR_PTR(-ENOENT);
+
+	/* First assign the vrings allocated in host memory */
+	vqconfig = mic_vq_config(mvdev->desc) + index;
+	memcpy_fromio(&config, vqconfig, sizeof(config));
+	_vr_size = vring_size(config.num, MIC_VIRTIO_RING_ALIGN);
+	vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info));
+	va = mic_card_map(mvdev->mdev, config.address, vr_size);
+	if (!va)
+		return ERR_PTR(-ENOMEM);
+	mvdev->vr[index] = va;
+	memset_io(va, 0x0, _vr_size);
+	vq = vring_new_virtqueue(index,
+				config.num, MIC_VIRTIO_RING_ALIGN, vdev,
+				false,
+				va, mic_notify, callback, name);
+	if (!vq) {
+		err = -ENOMEM;
+		goto unmap;
+	}
+	info = va + _vr_size;
+	magic = ioread32(&info->magic);
+
+	if (WARN(magic != MIC_MAGIC + type + index, "magic mismatch")) {
+		err = -EIO;
+		goto del_vq;
+	}
+
+	/* Allocate and reassign used ring now */
+	mvdev->used_size[index] = PAGE_ALIGN(sizeof(__u16) * 3 +
+			sizeof(struct vring_used_elem) * config.num);
+	used = (void *) __get_free_pages(GFP_KERNEL | __GFP_ZERO,
+				get_order(mvdev->used_size[index]));
+	if (!used) {
+		err = -ENOMEM;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto del_vq;
+	}
+	iowrite64(virt_to_phys(used), &vqconfig->used_address);
+
+	/*
+	 * To reassign the used ring here we are directly accessing
+	 * struct vring_virtqueue which is a private data structure
+	 * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in
+	 * vring_new_virtqueue() would ensure that
+	 *  (&vq->vring == (struct vring *) (&vq->vq + 1));
+	 */
+	vr = (struct vring *) (vq + 1);
+	vr->used = used;
+
+	vq->priv = mvdev;
+	return vq;
+del_vq:
+	vring_del_virtqueue(vq);
+unmap:
+	mic_card_unmap(mvdev->mdev, mvdev->vr[index]);
+	return ERR_PTR(err);
+}
+
+static int mic_find_vqs(struct virtio_device *vdev, unsigned nvqs,
+			struct virtqueue *vqs[],
+			vq_callback_t *callbacks[],
+			const char *names[])
+{
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+	struct mic_device_ctrl __iomem *dc = mvdev->dc;
+	int i, err, retry = 100;
+
+	/* We must have this many virtqueues. */
+	if (nvqs > ioread8(&mvdev->desc->num_vq))
+		return -ENOENT;
+
+	for (i = 0; i < nvqs; ++i) {
+		dev_dbg(mic_dev(mvdev), "%s: %d: %s\n",
+			__func__, i, names[i]);
+		vqs[i] = mic_find_vq(vdev, i, callbacks[i], names[i]);
+		if (IS_ERR(vqs[i])) {
+			err = PTR_ERR(vqs[i]);
+			goto error;
+		}
+	}
+
+	iowrite8(1, &dc->used_address_updated);
+	/*
+	 * Send an interrupt to the host to inform it that used
+	 * rings have been re-assigned.
+	 */
+	mic_send_intr(mvdev->mdev, mvdev->c2h_vdev_db);
+	for (i = retry; i--;) {
+		if (!ioread8(&dc->used_address_updated))
+			break;
+		msleep(100);
+	}
+
+	dev_dbg(mic_dev(mvdev), "%s: retry: %d\n", __func__, i);
+	if (i < 0) {
+		err = -ENODEV;
+		goto error;
+	}
+
+	return 0;
+error:
+	mic_del_vqs(vdev);
+	return err;
+}
+
+/*
+ * The config ops structure as defined by virtio config
+ */
+static struct virtio_config_ops mic_vq_config_ops = {
+	.get_features = mic_get_features,
+	.finalize_features = mic_finalize_features,
+	.get = mic_get,
+	.set = mic_set,
+	.get_status = mic_get_status,
+	.set_status = mic_set_status,
+	.reset = mic_reset,
+	.find_vqs = mic_find_vqs,
+	.del_vqs = mic_del_vqs,
+};
+
+static irqreturn_t
+mic_virtio_intr_handler(int irq, void *data)
+{
+	struct mic_vdev *mvdev = data;
+	struct virtqueue *vq;
+
+	mic_ack_interrupt(mvdev->mdev);
+	list_for_each_entry(vq, &mvdev->vdev.vqs, list)
+		vring_interrupt(0, vq);
+
+	return IRQ_HANDLED;
+}
+
+static void mic_virtio_release_dev(struct device *_d)
+{
+	/*
+	 * No need for a release method similar to virtio PCI.
+	 * Provide an empty one to avoid getting a warning from core.
+	 */
+}
+
+/*
+ * Adds a new device and registers it with virtio.
+ * Appropriate drivers are loaded by the device model.
+ */
+static int mic_add_device(struct mic_device_desc __iomem *d,
+	unsigned int offset, struct mic_driver *mdrv)
+{
+	struct mic_vdev *mvdev;
+	int ret;
+	int virtio_db;
+	u8 type = ioread8(&d->type);
+
+	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
+	if (!mvdev) {
+		dev_err(mdrv->dev, "Cannot allocate mic dev %u type %u\n",
+			offset, type);
+		return -ENOMEM;
+	}
+
+	mvdev->mdev = &mdrv->mdev;
+	mvdev->vdev.dev.parent = mdrv->dev;
+	mvdev->vdev.dev.release = mic_virtio_release_dev;
+	mvdev->vdev.id.device = type;
+	mvdev->vdev.config = &mic_vq_config_ops;
+	mvdev->desc = d;
+	mvdev->dc = (void __iomem *)d + mic_aligned_desc_size(d);
+	init_completion(&mvdev->reset_done);
+
+	virtio_db = mic_next_card_db();
+	mvdev->virtio_cookie = mic_request_card_irq(mic_virtio_intr_handler,
+			"virtio intr", mvdev, virtio_db);
+	if (IS_ERR(mvdev->virtio_cookie)) {
+		ret = PTR_ERR(mvdev->virtio_cookie);
+		goto kfree;
+	}
+	iowrite8((u8)virtio_db, &mvdev->dc->h2c_vdev_db);
+	mvdev->c2h_vdev_db = ioread8(&mvdev->dc->c2h_vdev_db);
+
+	ret = register_virtio_device(&mvdev->vdev);
+	if (ret) {
+		dev_err(mic_dev(mvdev),
+			"Failed to register mic device %u type %u\n",
+			offset, type);
+		goto free_irq;
+	}
+	iowrite64((u64)mvdev, &mvdev->dc->vdev);
+	dev_dbg(mic_dev(mvdev), "%s: registered mic device %u type %u mvdev %p\n",
+		__func__, offset, type, mvdev);
+
+	return 0;
+
+free_irq:
+	mic_free_card_irq(mvdev->virtio_cookie, mvdev);
+kfree:
+	kfree(mvdev);
+	return ret;
+}
+
+/*
+ * match for a mic device with a specific desc pointer
+ */
+static int mic_match_desc(struct device *dev, void *data)
+{
+	struct virtio_device *vdev = dev_to_virtio(dev);
+	struct mic_vdev *mvdev = to_micvdev(vdev);
+
+	return mvdev->desc == (void __iomem *)data;
+}
+
+static void mic_handle_config_change(struct mic_device_desc __iomem *d,
+	unsigned int offset, struct mic_driver *mdrv)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + mic_aligned_desc_size(d);
+	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
+	struct virtio_driver *drv;
+
+	if (ioread8(&dc->config_change) != MIC_VIRTIO_PARAM_CONFIG_CHANGED)
+		return;
+
+	dev_dbg(mdrv->dev, "%s %d\n", __func__, __LINE__);
+	drv = container_of(mvdev->vdev.dev.driver,
+				struct virtio_driver, driver);
+	if (drv->config_changed)
+		drv->config_changed(&mvdev->vdev);
+	iowrite8(1, &dc->guest_ack);
+}
+
+/*
+ * removes a virtio device if a hot remove event has been
+ * requested by the host.
+ */
+static int mic_remove_device(struct mic_device_desc __iomem *d,
+	unsigned int offset, struct mic_driver *mdrv)
+{
+	struct mic_device_ctrl __iomem *dc
+		= (void __iomem *)d + mic_aligned_desc_size(d);
+	struct mic_vdev *mvdev = (struct mic_vdev *)ioread64(&dc->vdev);
+	u8 status;
+	int ret = -1;
+
+	if (ioread8(&dc->config_change) == MIC_VIRTIO_PARAM_DEV_REMOVE) {
+		dev_dbg(mdrv->dev,
+			"%s %d config_change %d type %d mvdev %p\n",
+			__func__, __LINE__,
+			ioread8(&dc->config_change), ioread8(&d->type), mvdev);
+
+		status = ioread8(&d->status);
+		INIT_COMPLETION(mvdev->reset_done);
+		unregister_virtio_device(&mvdev->vdev);
+		mic_free_card_irq(mvdev->virtio_cookie, mvdev);
+		if (status & VIRTIO_CONFIG_S_DRIVER_OK)
+			wait_for_completion(&mvdev->reset_done);
+		kfree(mvdev);
+		iowrite8(1, &dc->guest_ack);
+		dev_dbg(mdrv->dev, "%s %d guest_ack %d\n",
+			__func__, __LINE__, ioread8(&dc->guest_ack));
+		ret = 0;
+	}
+
+	return ret;
+}
+
+#define REMOVE_DEVICES true
+
+static void mic_scan_devices(struct mic_driver *mdrv, bool remove)
+{
+	s8 type;
+	unsigned int i;
+	struct mic_device_desc __iomem *d;
+	struct mic_device_ctrl __iomem *dc;
+	struct device *dev;
+	int ret;
+
+	for (i = mic_aligned_size(struct mic_bootparam);
+		i < MIC_DP_SIZE; i += mic_total_desc_size(d)) {
+		d = mdrv->dp + i;
+		dc = (void __iomem *)d + mic_aligned_desc_size(d);
+		/*
+		 * This read barrier is paired with the corresponding write
+		 * barrier on the host which is inserted before adding or
+		 * removing a virtio device descriptor, by updating the type.
+		 */
+		rmb();
+		type = ioread8(&d->type);
+
+		/* end of list */
+		if (type == 0)
+			break;
+
+		if (type == -1)
+			continue;
+
+		/* device already exists */
+		dev = device_find_child(mdrv->dev, d, mic_match_desc);
+		if (dev) {
+			if (remove)
+				iowrite8(MIC_VIRTIO_PARAM_DEV_REMOVE,
+					&dc->config_change);
+			put_device(dev);
+			mic_handle_config_change(d, i, mdrv);
+			ret = mic_remove_device(d, i, mdrv);
+			if (!ret && !remove)
+				iowrite8(-1, &d->type);
+			if (remove) {
+				iowrite8(0, &dc->config_change);
+				iowrite8(0, &dc->guest_ack);
+			}
+			continue;
+		}
+
+		/* new device */
+		dev_dbg(mdrv->dev, "%s %d Adding new virtio device %p\n",
+				__func__, __LINE__, d);
+		if (!remove)
+			mic_add_device(d, i, mdrv);
+	}
+}
+
+/*
+ * mic_hotplug_devices tries to find changes in the device page.
+ */
+static void mic_hotplug_devices(struct work_struct *work)
+{
+	struct mic_driver *mdrv = container_of(work,
+		struct mic_driver, hotplug_work);
+
+	mic_scan_devices(mdrv, !REMOVE_DEVICES);
+}
+
+/*
+ * Interrupt handler for hot plug/config changes etc.
+ */
+static irqreturn_t
+mic_extint_handler(int irq, void *data)
+{
+	struct mic_driver *mdrv = (struct mic_driver *)data;
+
+	dev_dbg(mdrv->dev, "%s %d hotplug work\n",
+		__func__, __LINE__);
+	mic_ack_interrupt(&mdrv->mdev);
+	schedule_work(&mdrv->hotplug_work);
+	return IRQ_HANDLED;
+}
+
+/*
+ * Init function for virtio
+ */
+int mic_devices_init(struct mic_driver *mdrv)
+{
+	int rc;
+	struct mic_bootparam __iomem *bootparam;
+	int config_db;
+
+	INIT_WORK(&mdrv->hotplug_work, mic_hotplug_devices);
+	mic_scan_devices(mdrv, !REMOVE_DEVICES);
+
+	config_db = mic_next_card_db();
+	virtio_config_cookie = mic_request_card_irq(mic_extint_handler,
+			"virtio_config_intr", mdrv, config_db);
+	if (IS_ERR(virtio_config_cookie)) {
+		rc = PTR_ERR(virtio_config_cookie);
+		goto exit;
+	}
+
+	bootparam = mdrv->dp;
+	iowrite8(config_db, &bootparam->h2c_config_db);
+	return 0;
+exit:
+	return rc;
+}
+
+/*
+ * Uninit function for virtio
+ */
+void mic_devices_uninit(struct mic_driver *mdrv)
+{
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+	iowrite8(-1, &bootparam->h2c_config_db);
+	mic_free_card_irq(virtio_config_cookie, mdrv);
+	flush_work(&mdrv->hotplug_work);
+	mic_scan_devices(mdrv, REMOVE_DEVICES);
+}
diff --git a/drivers/misc/mic/card/mic_virtio.h b/drivers/misc/mic/card/mic_virtio.h
new file mode 100644
index 0000000..2c5c22c
--- /dev/null
+++ b/drivers/misc/mic/card/mic_virtio.h
@@ -0,0 +1,77 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#ifndef __MIC_CARD_VIRTIO_H
+#define __MIC_CARD_VIRTIO_H
+
+#include <linux/mic_common.h>
+#include "mic_device.h"
+
+/*
+ * 64 bit I/O access
+ */
+#ifndef ioread64
+#define ioread64 readq
+#endif
+#ifndef iowrite64
+#define iowrite64 writeq
+#endif
+
+static inline unsigned mic_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return mic_aligned_size(*desc)
+		+ ioread8(&desc->num_vq) * mic_aligned_size(struct mic_vqconfig)
+		+ ioread8(&desc->feature_len) * 2
+		+ ioread8(&desc->config_len);
+}
+
+static inline struct mic_vqconfig __iomem *
+mic_vq_config(struct mic_device_desc __iomem *desc)
+{
+	return (struct mic_vqconfig __iomem *)(desc + 1);
+}
+
+static inline __u8 __iomem *
+mic_vq_features(struct mic_device_desc __iomem *desc)
+{
+	return (__u8 __iomem *)(mic_vq_config(desc) + ioread8(&desc->num_vq));
+}
+
+static inline __u8 __iomem *
+mic_vq_configspace(struct mic_device_desc __iomem *desc)
+{
+	return mic_vq_features(desc) + ioread8(&desc->feature_len) * 2;
+}
+static inline unsigned mic_total_desc_size(struct mic_device_desc __iomem *desc)
+{
+	return mic_aligned_desc_size(desc) +
+		mic_aligned_size(struct mic_device_ctrl);
+}
+
+int mic_devices_init(struct mic_driver *mdrv);
+void mic_devices_uninit(struct mic_driver *mdrv);
+
+#endif
-- 
1.8.2.1



* [PATCH RESEND v3 5/7] Intel MIC Host Driver Changes for Virtio Devices.
From: Sudeep Dutt @ 2013-09-05 23:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

From: Ashutosh Dixit <ashutosh.dixit@intel.com>

This patch introduces the host "Virtio over PCIe" interface for
Intel MIC. It allows creating user space backends on the host and instantiating
virtio devices for them on the Intel MIC card. It uses the existing VRINGH
infrastructure in the kernel to access virtio rings from the host. A character
device per MIC is exposed with IOCTL, mmap and poll callbacks. This allows the
user space backend to:
(a) add/remove a virtio device via a device page.
(b) map (R/O) virtio rings and device page to user space.
(c) poll for availability of data.
(d) copy a descriptor or entire descriptor chain to/from the card.
(e) modify virtio configuration.
(f) handle virtio device reset.
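
To make (a) concrete, here is a minimal userspace sketch of walking the
device page, loosely following the loops in mic_scan_devices() and
mic_dp_show() from this series: a type of 0 ends the list and a type of
-1 marks a removed slot that is skipped. The model_* names, the byte
offsets of the header fields, the HDR_SIZE, VQCONFIG_SIZE and CTRL_SIZE
constants and the 8-byte alignment are assumptions made for
illustration; the real layout is defined in mic_common.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGN8(x)     (((x) + 7u) & ~7u)
#define HDR_SIZE      8u   /* assumed size of the descriptor header */
#define VQCONFIG_SIZE 24u  /* assumed size of one vring config entry */
#define CTRL_SIZE     16u  /* assumed size of the device control block */

/*
 * Assumed header byte offsets: [0] type, [1] num_vq, [2] feature_len,
 * [3] config_len. The size is header + vring configs + two feature
 * bitmaps + config space.
 */
static unsigned model_desc_size(const uint8_t *d)
{
	return ALIGN8(HDR_SIZE) + d[1] * ALIGN8(VQCONFIG_SIZE)
		+ d[2] * 2u + d[3];
}

/* Each descriptor is followed by its control block. */
static unsigned model_total_desc_size(const uint8_t *d)
{
	return ALIGN8(model_desc_size(d)) + ALIGN8(CTRL_SIZE);
}

/* Count live devices, starting at byte offset 'start' in the page. */
static unsigned model_count_devices(const uint8_t *page, size_t page_size,
				    size_t start)
{
	size_t off = start;
	unsigned live = 0;

	assert(page != NULL);
	while (off + HDR_SIZE <= page_size) {
		const uint8_t *d = page + off;
		int8_t type = (int8_t)d[0];

		if (type == 0)		/* end of list */
			break;
		if (type != -1)		/* -1 marks a removed slot */
			live++;
		off += model_total_desc_size(d);
	}
	return live;
}
```

A removed slot keeps its size fields, so the walk can still step over
it to reach later descriptors.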
This initial patch copies buffers with the CPU; host-initiated MIC DMA
support is planned for future patches.
The avail and desc virtio rings are in host memory and the used ring
is in card memory, so that ring updates crossing PCIe are writes rather
than reads, which improves performance.
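
The size of the card-resident used ring can be sketched as follows. This
mirrors the PAGE_ALIGN(sizeof(__u16) * 3 + sizeof(struct vring_used_elem)
* num) computation in the card driver of this series; the 4096-byte page
size and the model_* names are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size */

/* Same shape as a used ring element: descriptor id plus bytes written. */
struct model_used_elem {
	uint32_t id;
	uint32_t len;
};

/* Round n up to the next multiple of the (assumed) page size. */
static size_t model_page_align(size_t n)
{
	return (n + MODEL_PAGE_SIZE - 1) & ~((size_t)MODEL_PAGE_SIZE - 1);
}

/*
 * Bytes to allocate for a used ring with 'num' entries: three 16-bit
 * fields plus the element array, rounded up to whole pages.
 */
static size_t model_used_ring_bytes(unsigned num)
{
	/* this sketch assumes a non-zero power-of-two ring size */
	assert(num != 0 && (num & (num - 1)) == 0);
	return model_page_align(sizeof(uint16_t) * 3 +
				sizeof(struct model_used_elem) * num);
}
```

Because of the three 16-bit fields, a ring whose element array exactly
fills a whole number of pages still needs one extra page.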

Co-author: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
 drivers/misc/mic/Kconfig             |   1 +
 drivers/misc/mic/common/mic_device.h |   7 +
 drivers/misc/mic/host/Makefile       |   2 +
 drivers/misc/mic/host/mic_boot.c     |   3 +-
 drivers/misc/mic/host/mic_debugfs.c  | 140 +++++++
 drivers/misc/mic/host/mic_device.h   |   5 +
 drivers/misc/mic/host/mic_fops.c     | 221 +++++++++++
 drivers/misc/mic/host/mic_fops.h     |  32 ++
 drivers/misc/mic/host/mic_main.c     |  26 ++
 drivers/misc/mic/host/mic_virtio.c   | 703 +++++++++++++++++++++++++++++++++++
 drivers/misc/mic/host/mic_virtio.h   | 138 +++++++
 include/uapi/linux/Kbuild            |   1 +
 include/uapi/linux/mic_common.h      | 166 ++++++++-
 include/uapi/linux/mic_ioctl.h       |  74 ++++
 14 files changed, 1517 insertions(+), 2 deletions(-)
 create mode 100644 drivers/misc/mic/host/mic_fops.c
 create mode 100644 drivers/misc/mic/host/mic_fops.h
 create mode 100644 drivers/misc/mic/host/mic_virtio.c
 create mode 100644 drivers/misc/mic/host/mic_virtio.h
 create mode 100644 include/uapi/linux/mic_ioctl.h

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index 279a2e6..01f1a4a 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -3,6 +3,7 @@ comment "Intel MIC Host Driver"
 config INTEL_MIC_HOST
 	tristate "Intel MIC Host Driver"
 	depends on 64BIT && PCI
+	select VHOST_RING
 	default N
 	help
 	  This enables Host Driver support for the Intel Many Integrated
diff --git a/drivers/misc/mic/common/mic_device.h b/drivers/misc/mic/common/mic_device.h
index 6440e9d..01eb74f 100644
--- a/drivers/misc/mic/common/mic_device.h
+++ b/drivers/misc/mic/common/mic_device.h
@@ -41,4 +41,11 @@ struct mic_mw {
 #define MIC_DPLO_SPAD 14
 #define MIC_DPHI_SPAD 15
 
+/*
+ * These values are supposed to be in the config_change field of the
+ * device page when the host sends a config change interrupt to the card.
+ */
+#define MIC_VIRTIO_PARAM_DEV_REMOVE 0x1
+#define MIC_VIRTIO_PARAM_CONFIG_CHANGED 0x2
+
 #endif
diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
index a375dd3..c2197f9 100644
--- a/drivers/misc/mic/host/Makefile
+++ b/drivers/misc/mic/host/Makefile
@@ -10,3 +10,5 @@ mic_host-objs += mic_smpt.o
 mic_host-objs += mic_intr.o
 mic_host-objs += mic_boot.o
 mic_host-objs += mic_debugfs.o
+mic_host-objs += mic_fops.o
+mic_host-objs += mic_virtio.o
diff --git a/drivers/misc/mic/host/mic_boot.c b/drivers/misc/mic/host/mic_boot.c
index 936fc58..fd9ff6d 100644
--- a/drivers/misc/mic/host/mic_boot.c
+++ b/drivers/misc/mic/host/mic_boot.c
@@ -20,12 +20,12 @@
  */
 #include <linux/delay.h>
 #include <linux/firmware.h>
-#include <linux/interrupt.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
+#include "mic_virtio.h"
 
 /**
  * mic_reset - Reset the MIC device.
@@ -117,6 +117,7 @@ void mic_stop(struct mic_device *mdev, bool force)
 {
 	mutex_lock(&mdev->mic_mutex);
 	if (MIC_OFFLINE != mdev->state || force) {
+		mic_virtio_reset_devices(mdev);
 		mic_bootparam_init(mdev);
 		mic_reset(mdev);
 		if (MIC_RESET_FAILED == mdev->state)
diff --git a/drivers/misc/mic/host/mic_debugfs.c b/drivers/misc/mic/host/mic_debugfs.c
index 78541d4..e22fb7b 100644
--- a/drivers/misc/mic/host/mic_debugfs.c
+++ b/drivers/misc/mic/host/mic_debugfs.c
@@ -26,6 +26,7 @@
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_smpt.h"
+#include "mic_virtio.h"
 
 /* Debugfs parent dir */
 static struct dentry *mic_dbg;
@@ -193,7 +194,13 @@ static const struct file_operations post_code_ops = {
 static int mic_dp_show(struct seq_file *s, void *pos)
 {
 	struct mic_device *mdev = s->private;
+	struct mic_device_desc *d;
+	struct mic_device_ctrl *dc;
+	struct mic_vqconfig *vqconfig;
+	__u32 *features;
+	__u8 *config;
 	struct mic_bootparam *bootparam = mdev->dp;
+	int i, j;
 
 	seq_printf(s, "Bootparam: magic 0x%x\n",
 		bootparam->magic);
@@ -208,6 +215,53 @@ static int mic_dp_show(struct seq_file *s, void *pos)
 	seq_printf(s, "Bootparam: shutdown_card %d\n",
 		bootparam->shutdown_card);
 
+	for (i = sizeof(*bootparam); i < MIC_DP_SIZE;
+	     i += mic_total_desc_size(d)) {
+		d = mdev->dp + i;
+		dc = (void *)d + mic_aligned_desc_size(d);
+
+		/* end of list */
+		if (d->type == 0)
+			break;
+
+		if (d->type == -1)
+			continue;
+
+		seq_printf(s, "Type %d ", d->type);
+		seq_printf(s, "Num VQ %d ", d->num_vq);
+		seq_printf(s, "Feature Len %d\n", d->feature_len);
+		seq_printf(s, "Config Len %d ", d->config_len);
+		seq_printf(s, "Shutdown Status %d\n", d->status);
+
+		for (j = 0; j < d->num_vq; j++) {
+			vqconfig = mic_vq_config(d) + j;
+			seq_printf(s, "vqconfig[%d]: ", j);
+			seq_printf(s, "address 0x%llx ", vqconfig->address);
+			seq_printf(s, "num %d ", vqconfig->num);
+			seq_printf(s, "used address 0x%llx\n",
+				vqconfig->used_address);
+		}
+
+		features = (__u32 *) mic_vq_features(d);
+		seq_printf(s, "Features: Host 0x%x ", features[0]);
+		seq_printf(s, "Guest 0x%x\n", features[1]);
+
+		config = mic_vq_configspace(d);
+		for (j = 0; j < d->config_len; j++)
+			seq_printf(s, "config[%d]=%d\n", j, config[j]);
+
+		seq_puts(s, "Device control:\n");
+		seq_printf(s, "Config Change %d ", dc->config_change);
+		seq_printf(s, "Vdev reset %d\n", dc->vdev_reset);
+		seq_printf(s, "Guest Ack %d ", dc->guest_ack);
+		seq_printf(s, "Host ack %d\n", dc->host_ack);
+		seq_printf(s, "Used address updated %d ",
+			dc->used_address_updated);
+		seq_printf(s, "Vdev 0x%llx\n", dc->vdev);
+		seq_printf(s, "c2h doorbell %d ", dc->c2h_vdev_db);
+		seq_printf(s, "h2c doorbell %d\n", dc->h2c_vdev_db);
+	}
+
 	return 0;
 }
 
@@ -229,6 +283,89 @@ static const struct file_operations dp_ops = {
 	.release = mic_dp_debug_release
 };
 
+static int mic_vdev_info_show(struct seq_file *s, void *unused)
+{
+	struct mic_device *mdev = s->private;
+	struct list_head *pos, *tmp;
+	struct mic_vdev *mvdev;
+	int i, j;
+
+	mutex_lock(&mdev->mic_mutex);
+	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
+		mvdev = list_entry(pos, struct mic_vdev, list);
+		seq_printf(s, "VDEV type %d state %s in %ld out %ld\n",
+			mvdev->virtio_id,
+			mic_vdevup(mvdev) ? "UP" : "DOWN",
+			mvdev->in_bytes,
+			mvdev->out_bytes);
+		for (i = 0; i < MIC_MAX_VRINGS; i++) {
+			struct vring_desc *desc;
+			struct vring_avail *avail;
+			struct vring_used *used;
+			struct mic_vringh *mvr = &mvdev->mvr[i];
+			struct vringh *vrh = &mvr->vrh;
+			int num = vrh->vring.num;
+			if (!num)
+				continue;
+			desc = vrh->vring.desc;
+			seq_printf(s, "vring i %d avail_idx %d",
+				i, mvr->vring.info->avail_idx & (num - 1));
+			seq_printf(s, " vring i %d avail_idx %d\n",
+				i, mvr->vring.info->avail_idx);
+			seq_printf(s, "vrh i %d weak_barriers %d",
+				i, vrh->weak_barriers);
+			seq_printf(s, " last_avail_idx %d last_used_idx %d",
+				vrh->last_avail_idx, vrh->last_used_idx);
+			seq_printf(s, " completed %d\n", vrh->completed);
+			for (j = 0; j < num; j++) {
+				seq_printf(s, "desc[%d] addr 0x%llx len %d",
+					j, desc->addr, desc->len);
+				seq_printf(s, " flags 0x%x next %d\n",
+					desc->flags,
+					desc->next);
+				desc++;
+			}
+			avail = vrh->vring.avail;
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				avail->flags, avail->idx & (num - 1));
+			seq_printf(s, "avail flags 0x%x idx %d\n",
+				avail->flags, avail->idx);
+			for (j = 0; j < num; j++)
+				seq_printf(s, "avail ring[%d] %d\n",
+					j, avail->ring[j]);
+			used = vrh->vring.used;
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				used->flags, used->idx & (num - 1));
+			seq_printf(s, "used flags 0x%x idx %d\n",
+				used->flags, used->idx);
+			for (j = 0; j < num; j++)
+				seq_printf(s, "used ring[%d] id %d len %d\n",
+					j, used->ring[j].id, used->ring[j].len);
+		}
+	}
+	mutex_unlock(&mdev->mic_mutex);
+
+	return 0;
+}
+
+static int mic_vdev_info_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_vdev_info_show, inode->i_private);
+}
+
+static int mic_vdev_info_debug_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations vdev_info_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_vdev_info_debug_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_vdev_info_debug_release
+};
+
 static int mic_msi_irq_info_show(struct seq_file *s, void *pos)
 {
 	struct mic_device *mdev  = s->private;
@@ -321,6 +458,9 @@ void mic_create_debug_dir(struct mic_device *mdev)
 	debugfs_create_file("dp", 0444, mdev->dbg_dir,
 		mdev, &dp_ops);
 
+	debugfs_create_file("vdev_info", 0444, mdev->dbg_dir,
+		mdev, &vdev_info_ops);
+
 	debugfs_create_file("msi_irq_info", 0444, mdev->dbg_dir,
 		mdev, &msi_irq_info_ops);
 }
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 50b8b88..dcba2a5 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -21,6 +21,7 @@
 #ifndef _MIC_DEVICE_H_
 #define _MIC_DEVICE_H_
 
+#include <linux/cdev.h>
 #include <linux/idr.h>
 
 #include "mic_intr.h"
@@ -80,6 +81,8 @@ enum mic_stepping {
  * @dp_dma_addr: virtio device page DMA address.
  * @shutdown_db: shutdown doorbell.
  * @shutdown_cookie: shutdown cookie.
+ * @cdev: Character device for MIC.
+ * @vdev_list: list of virtio devices.
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -113,6 +116,8 @@ struct mic_device {
 	dma_addr_t dp_dma_addr;
 	int shutdown_db;
 	struct mic_irq *shutdown_cookie;
+	struct cdev cdev;
+	struct list_head vdev_list;
 };
 
 /**
diff --git a/drivers/misc/mic/host/mic_fops.c b/drivers/misc/mic/host/mic_fops.c
new file mode 100644
index 0000000..661469a
--- /dev/null
+++ b/drivers/misc/mic/host/mic_fops.c
@@ -0,0 +1,221 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/poll.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_fops.h"
+#include "mic_virtio.h"
+
+int mic_open(struct inode *inode, struct file *f)
+{
+	struct mic_vdev *mvdev;
+	struct mic_device *mdev = container_of(inode->i_cdev,
+		struct mic_device, cdev);
+
+	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
+	if (!mvdev)
+		return -ENOMEM;
+
+	init_waitqueue_head(&mvdev->waitq);
+	INIT_LIST_HEAD(&mvdev->list);
+	mvdev->mdev = mdev;
+	mvdev->virtio_id = -1;
+
+	f->private_data = mvdev;
+	return 0;
+}
+
+int mic_release(struct inode *inode, struct file *f)
+{
+	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
+
+	if (-1 != mvdev->virtio_id)
+		mic_virtio_del_device(mvdev);
+	f->private_data = NULL;
+	kfree(mvdev);
+	return 0;
+}
+
+long mic_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
+	void __user *argp = (void __user *)arg;
+	int ret;
+
+	switch (cmd) {
+	case MIC_VIRTIO_ADD_DEVICE:
+	{
+		ret = mic_virtio_add_device(mvdev, argp);
+		if (ret < 0) {
+			dev_err(mic_dev(mvdev),
+				"%s %d errno ret %d\n",
+				__func__, __LINE__, ret);
+			return ret;
+		}
+		break;
+	}
+	case MIC_VIRTIO_COPY_DESC:
+	{
+		struct mic_copy_desc copy;
+
+		ret = mic_vdev_inited(mvdev);
+		if (ret)
+			return ret;
+
+		if (copy_from_user(&copy, argp, sizeof(copy)))
+			return -EFAULT;
+
+		dev_dbg(mic_dev(mvdev),
+			"%s %d === iovcnt 0x%x vr_idx 0x%x update_used %d\n",
+			__func__, __LINE__, copy.iovcnt, copy.vr_idx,
+			copy.update_used);
+
+		ret = mic_virtio_copy_desc(mvdev, &copy);
+		if (ret < 0) {
+			dev_err(mic_dev(mvdev),
+				"%s %d errno ret %d\n",
+				__func__, __LINE__, ret);
+			return ret;
+		}
+		if (copy_to_user(
+			&((struct mic_copy_desc __user *)argp)->out_len,
+			&copy.out_len, sizeof(copy.out_len))) {
+			dev_err(mic_dev(mvdev), "%s %d errno ret %d\n",
+				__func__, __LINE__, -EFAULT);
+			return -EFAULT;
+		}
+		break;
+	}
+	case MIC_VIRTIO_CONFIG_CHANGE:
+	{
+		ret = mic_vdev_inited(mvdev);
+		if (ret)
+			return ret;
+
+		ret = mic_virtio_config_change(mvdev, argp);
+		if (ret < 0) {
+			dev_err(mic_dev(mvdev),
+				"%s %d errno ret %d\n",
+				__func__, __LINE__, ret);
+			return ret;
+		}
+		break;
+	}
+	default:
+		return -ENOIOCTLCMD;
+	}
+	return 0;
+}
+
+/*
+ * We return POLLIN | POLLOUT from poll when new buffers are enqueued, and
+ * not when previously enqueued buffers may be available. This means that
+ * in the card->host (TX) path, when userspace is unblocked by poll it
+ * must drain all available descriptors or it can stall.
+ */
+unsigned int mic_poll(struct file *f, poll_table *wait)
+{
+	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
+	int mask = 0;
+
+	poll_wait(f, &mvdev->waitq, wait);
+
+	if (mic_vdev_inited(mvdev))
+		mask = POLLERR;
+	else if (mvdev->poll_wake) {
+		mvdev->poll_wake = 0;
+		mask = POLLIN | POLLOUT;
+	}
+
+	return mask;
+}
+
+static inline int
+mic_query_offset(struct mic_vdev *mvdev, unsigned long offset,
+	unsigned long *size, unsigned long *pa)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	unsigned long start = MIC_DP_SIZE;
+	int i;
+
+	/*
+	 * MMAP interface is as follows:
+	 * offset				region
+	 * 0x0					virtio device_page
+	 * 0x1000				first vring
+	 * 0x1000 + size of 1st vring		second vring
+	 * ....
+	 */
+	if (!offset) {
+		*pa = virt_to_phys(mdev->dp);
+		*size = MIC_DP_SIZE;
+		return 0;
+	}
+
+	for (i = 0; i < mvdev->dd->num_vq; i++) {
+		struct mic_vringh *mvr = &mvdev->mvr[i];
+		if (offset == start) {
+			*pa = virt_to_phys(mvr->vring.va);
+			*size = mvr->vring.len;
+			return 0;
+		}
+		start += mvr->vring.len;
+	}
+	return -1;
+}
+
+/*
+ * Maps the device page and virtio rings to user space for read-only access.
+ */
+int
+mic_mmap(struct file *f, struct vm_area_struct *vma)
+{
+	struct mic_vdev *mvdev = (struct mic_vdev *)f->private_data;
+	unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+	unsigned long pa, size = vma->vm_end - vma->vm_start, size_rem = size;
+	int i, err;
+
+	err = mic_vdev_inited(mvdev);
+	if (err)
+		return err;
+
+	if (vma->vm_flags & VM_WRITE)
+		return -EACCES;
+
+	while (size_rem) {
+		i = mic_query_offset(mvdev, offset, &size, &pa);
+		if (i < 0)
+			return -EINVAL;
+		err = remap_pfn_range(vma, vma->vm_start + offset,
+			pa >> PAGE_SHIFT, size, vma->vm_page_prot);
+		if (err)
+			return err;
+		dev_dbg(mic_dev(mvdev),
+			"%s %d type %d size 0x%lx off 0x%lx pa 0x%lx vma 0x%lx\n",
+			__func__, __LINE__, mvdev->virtio_id, size, offset,
+			pa, vma->vm_start + offset);
+		size_rem -= size;
+		offset += size;
+	}
+	return 0;
+}
diff --git a/drivers/misc/mic/host/mic_fops.h b/drivers/misc/mic/host/mic_fops.h
new file mode 100644
index 0000000..dc3893d
--- /dev/null
+++ b/drivers/misc/mic/host/mic_fops.h
@@ -0,0 +1,32 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef _MIC_FOPS_H_
+#define _MIC_FOPS_H_
+
+int mic_open(struct inode *inode, struct file *filp);
+int mic_release(struct inode *inode, struct file *filp);
+ssize_t mic_read(struct file *filp, char __user *buf,
+			size_t count, loff_t *pos);
+long mic_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+int mic_mmap(struct file *f, struct vm_area_struct *vma);
+unsigned int mic_poll(struct file *f, poll_table *wait);
+
+#endif
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 998a20a..a8965d4 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -25,12 +25,15 @@
 #include <linux/fs.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/poll.h>
 
 #include <linux/mic_common.h>
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_x100.h"
 #include "mic_smpt.h"
+#include "mic_fops.h"
+#include "mic_virtio.h"
 
 static const char mic_driver_name[] = "mic";
 
@@ -64,6 +67,15 @@ static struct class *g_mic_class;
 /* Base device node number for MIC devices */
 static dev_t g_mic_devno;
 
+static const struct file_operations mic_fops = {
+	.open = mic_open,
+	.release = mic_release,
+	.unlocked_ioctl = mic_ioctl,
+	.poll = mic_poll,
+	.mmap = mic_mmap,
+	.owner = THIS_MODULE,
+};
+
 /* Initialize the device page */
 static int mic_dp_init(struct mic_device *mdev)
 {
@@ -193,6 +205,7 @@ mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
 	mdev->irq_info.next_avail_src = 0;
 	INIT_WORK(&mdev->reset_trigger_work, mic_reset_trigger_work);
 	INIT_WORK(&mdev->shutdown_work, mic_shutdown_work);
+	INIT_LIST_HEAD(&mdev->vdev_list);
 }
 
 /**
@@ -330,7 +343,19 @@ static int mic_probe(struct pci_dev *pdev,
 	mic_bootparam_init(mdev);
 
 	mic_create_debug_dir(mdev);
+	cdev_init(&mdev->cdev, &mic_fops);
+	mdev->cdev.owner = THIS_MODULE;
+	rc = cdev_add(&mdev->cdev, MKDEV(MAJOR(g_mic_devno), mdev->id), 1);
+	if (rc) {
+		dev_err(&pdev->dev, "cdev_add err id %d rc %d\n", mdev->id, rc);
+		goto cleanup_debug_dir;
+	}
 	return 0;
+cleanup_debug_dir:
+	mic_delete_debug_dir(mdev);
+	mutex_lock(&mdev->mic_mutex);
+	mic_free_irq(mdev, mdev->shutdown_cookie, mdev);
+	mutex_unlock(&mdev->mic_mutex);
 dp_uninit:
 	mic_dp_uninit(mdev);
 sysfs_put:
@@ -375,6 +400,7 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	mic_stop(mdev, false);
+	cdev_del(&mdev->cdev);
 	mic_delete_debug_dir(mdev);
 	mutex_lock(&mdev->mic_mutex);
 	mic_free_irq(mdev, mdev->shutdown_cookie, mdev);
diff --git a/drivers/misc/mic/host/mic_virtio.c b/drivers/misc/mic/host/mic_virtio.c
new file mode 100644
index 0000000..be2a1f0
--- /dev/null
+++ b/drivers/misc/mic/host/mic_virtio.c
@@ -0,0 +1,703 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/pci.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_smpt.h"
+#include "mic_virtio.h"
+
+/*
+ * Initiates the copies across the PCIe bus from card memory to
+ * a user space buffer.
+ */
+static int mic_virtio_copy_to_user(struct mic_vdev *mvdev,
+		void __user *ubuf, size_t len, u64 addr)
+{
+	int err;
+	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
+	/*
+	 * We are copying from IO below and should ideally use something
+	 * like copy_to_user_fromio(..) if it existed.
+	 */
+	if (copy_to_user(ubuf, dbuf, len)) {
+		err = -EFAULT;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	mvdev->in_bytes += len;
+	err = 0;
+err:
+	return err;
+}
+
+/*
+ * Initiates copies across the PCIe bus from a user space
+ * buffer to card memory.
+ */
+static int mic_virtio_copy_from_user(struct mic_vdev *mvdev,
+		void __user *ubuf, size_t len, u64 addr)
+{
+	int err;
+	void __iomem *dbuf = mvdev->mdev->aper.va + addr;
+	/*
+	 * We are copying to IO below and should ideally use something
+	 * like copy_from_user_toio(..) if it existed.
+	 */
+	if (copy_from_user(dbuf, ubuf, len)) {
+		err = -EFAULT;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	mvdev->out_bytes += len;
+	err = 0;
+err:
+	return err;
+}
+
+#define MIC_VRINGH_READ true
+
+/* The function to call to notify the card about added buffers */
+static void mic_notify(struct vringh *vrh)
+{
+	struct mic_vringh *mvrh = container_of(vrh, struct mic_vringh, vrh);
+	struct mic_vdev *mvdev = mvrh->mvdev;
+	s8 db = mvdev->dc->h2c_vdev_db;
+
+	if (db != -1)
+		mvdev->mdev->ops->send_intr(mvdev->mdev, db);
+}
+
+/* Determine the total number of bytes consumed in a VRINGH KIOV */
+static inline u32 mic_vringh_iov_consumed(struct vringh_kiov *iov)
+{
+	int i;
+	u32 total = iov->consumed;
+
+	for (i = 0; i < iov->i; i++)
+		total += iov->iov[i].iov_len;
+	return total;
+}
+
+/*
+ * Traverse the VRINGH KIOV and issue the APIs to trigger the copies.
+ * This API is heavily based on the vringh_iov_xfer(..) implementation
+ * in vringh.c. We cannot reuse vringh_iov_pull_kern(..) and
+ * vringh_iov_push_kern(..) directly because there is no way to
+ * override the VRINGH xfer(..) routines as of v3.10.
+ */
+static int mic_vringh_copy(struct mic_vdev *mvdev, struct vringh_kiov *iov,
+	void __user *ubuf, size_t len, bool read, size_t *out_len)
+{
+	int ret = 0;
+	size_t partlen, tot_len = 0;
+
+	while (len && iov->i < iov->used) {
+		partlen = min(iov->iov[iov->i].iov_len, len);
+		if (read)
+			ret = mic_virtio_copy_to_user(mvdev,
+				ubuf, partlen,
+				(u64)iov->iov[iov->i].iov_base);
+		else
+			ret = mic_virtio_copy_from_user(mvdev,
+				ubuf, partlen,
+				(u64)iov->iov[iov->i].iov_base);
+		if (ret) {
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			break;
+		}
+		len -= partlen;
+		ubuf += partlen;
+		tot_len += partlen;
+		iov->consumed += partlen;
+		iov->iov[iov->i].iov_len -= partlen;
+		iov->iov[iov->i].iov_base += partlen;
+		if (!iov->iov[iov->i].iov_len) {
+			/* Fix up old iov element then increment. */
+			iov->iov[iov->i].iov_len = iov->consumed;
+			iov->iov[iov->i].iov_base -= iov->consumed;
+
+			iov->consumed = 0;
+			iov->i++;
+		}
+	}
+	*out_len = tot_len;
+	return ret;
+}
+
+/*
+ * Use the standard VRINGH infrastructure in the kernel to fetch new
+ * descriptors, initiate the copies and update the used ring.
+ */
+static int _mic_virtio_copy(struct mic_vdev *mvdev,
+	struct mic_copy_desc *copy)
+{
+	int ret = 0, iovcnt = copy->iovcnt;
+	struct iovec iov;
+	struct iovec __user *u_iov = copy->iov;
+	void __user *ubuf = NULL;
+	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
+	struct vringh_kiov *riov = &mvr->riov;
+	struct vringh_kiov *wiov = &mvr->wiov;
+	struct vringh *vrh = &mvr->vrh;
+	u16 *head = &mvr->head;
+	struct mic_vring *vr = &mvr->vring;
+	size_t len = 0, out_len;
+
+	copy->out_len = 0;
+	/* Fetch a new IOVEC if all previous elements have been processed */
+	if (riov->i == riov->used && wiov->i == wiov->used) {
+		ret = vringh_getdesc_kern(vrh, riov, wiov,
+				head, GFP_KERNEL);
+		/* Check if there are available descriptors */
+		if (ret <= 0)
+			return ret;
+	}
+	while (iovcnt) {
+		if (!len) {
+			/* Copy over a new iovec from user space. */
+			ret = copy_from_user(&iov, u_iov, sizeof(*u_iov));
+			if (ret) {
+				ret = -EINVAL;
+				dev_err(mic_dev(mvdev), "%s %d err %d\n",
+					__func__, __LINE__, ret);
+				break;
+			}
+			len = iov.iov_len;
+			ubuf = iov.iov_base;
+		}
+		/* Issue all the read descriptors first */
+		ret = mic_vringh_copy(mvdev, riov, ubuf, len,
+			MIC_VRINGH_READ, &out_len);
+		if (ret) {
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+					__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		/* Issue the write descriptors next */
+		ret = mic_vringh_copy(mvdev, wiov, ubuf, len,
+			!MIC_VRINGH_READ, &out_len);
+		if (ret) {
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+					__func__, __LINE__, ret);
+			break;
+		}
+		len -= out_len;
+		ubuf += out_len;
+		copy->out_len += out_len;
+		if (!len) {
+			/* One user space iovec is now completed */
+			iovcnt--;
+			u_iov++;
+		}
+		/* Exit loop if all elements in KIOVs have been processed. */
+		if (riov->i == riov->used && wiov->i == wiov->used)
+			break;
+	}
+	/*
+	 * Update the used ring if a descriptor was available and some data was
+	 * copied in/out and the user asked for a used ring update.
+	 */
+	if (*head != USHRT_MAX && copy->out_len &&
+		copy->update_used) {
+		u32 total = 0;
+
+		/* Determine the total data consumed */
+		total += mic_vringh_iov_consumed(riov);
+		total += mic_vringh_iov_consumed(wiov);
+		vringh_complete_kern(vrh, *head, total);
+		*head = USHRT_MAX;
+		if (vringh_need_notify_kern(vrh) > 0)
+			vringh_notify(vrh);
+		vringh_kiov_cleanup(riov);
+		vringh_kiov_cleanup(wiov);
+		/* Update avail idx for user space */
+		vr->info->avail_idx = vrh->last_avail_idx;
+	}
+	return ret;
+}
+
+static inline int mic_verify_copy_args(struct mic_vdev *mvdev,
+		struct mic_copy_desc *copy)
+{
+	if (copy->vr_idx >= mvdev->dd->num_vq) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -EINVAL);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+/* Copy a specified number of virtio descriptors in a chain */
+int mic_virtio_copy_desc(struct mic_vdev *mvdev,
+		struct mic_copy_desc *copy)
+{
+	int err;
+	struct mic_vringh *mvr = &mvdev->mvr[copy->vr_idx];
+
+	err = mic_verify_copy_args(mvdev, copy);
+	if (err)
+		return err;
+
+	mutex_lock(&mvr->vr_mutex);
+	if (!mic_vdevup(mvdev)) {
+		err = -ENODEV;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+		goto err;
+	}
+	err = _mic_virtio_copy(mvdev, copy);
+	if (err) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, err);
+	}
+err:
+	mutex_unlock(&mvr->vr_mutex);
+	return err;
+}
+
+static void mic_virtio_init_post(struct mic_vdev *mvdev)
+{
+	struct mic_vqconfig *vqconfig = mic_vq_config(mvdev->dd);
+	int i;
+
+	for (i = 0; i < mvdev->dd->num_vq; i++) {
+		if (!le64_to_cpu(vqconfig[i].used_address)) {
+			dev_warn(mic_dev(mvdev), "used_address zero??\n");
+			continue;
+		}
+		mvdev->mvr[i].vrh.vring.used =
+			mvdev->mdev->aper.va +
+			le64_to_cpu(vqconfig[i].used_address);
+	}
+
+	mvdev->dc->used_address_updated = 0;
+
+	dev_dbg(mic_dev(mvdev), "%s: device type %d LINKUP\n",
+		__func__, mvdev->virtio_id);
+}
+
+static inline void mic_virtio_device_reset(struct mic_vdev *mvdev)
+{
+	int i;
+
+	dev_dbg(mic_dev(mvdev), "%s: status %d device type %d RESET\n",
+		__func__, mvdev->dd->status, mvdev->virtio_id);
+
+	for (i = 0; i < mvdev->dd->num_vq; i++)
+		/*
+		 * Avoid lockdep false positive. The + 1 is for the mic
+		 * mutex which is held in the reset devices code path.
+		 */
+		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
+
+	/* 0 status means "reset" */
+	mvdev->dd->status = 0;
+	mvdev->dc->vdev_reset = 0;
+	mvdev->dc->host_ack = 1;
+
+	for (i = 0; i < mvdev->dd->num_vq; i++) {
+		struct vringh *vrh = &mvdev->mvr[i].vrh;
+		mvdev->mvr[i].vring.info->avail_idx = 0;
+		vrh->completed = 0;
+		vrh->last_avail_idx = 0;
+		vrh->last_used_idx = 0;
+	}
+
+	for (i = 0; i < mvdev->dd->num_vq; i++)
+		mutex_unlock(&mvdev->mvr[i].vr_mutex);
+}
+
+void mic_virtio_reset_devices(struct mic_device *mdev)
+{
+	struct list_head *pos, *tmp;
+	struct mic_vdev *mvdev;
+
+	dev_dbg(mdev->sdev->parent, "%s\n",  __func__);
+
+	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
+		mvdev = list_entry(pos, struct mic_vdev, list);
+		mic_virtio_device_reset(mvdev);
+		mvdev->poll_wake = 1;
+		wake_up(&mvdev->waitq);
+	}
+}
+
+void mic_bh_handler(struct work_struct *work)
+{
+	struct mic_vdev *mvdev = container_of(work, struct mic_vdev,
+			virtio_bh_work);
+
+	if (mvdev->dc->used_address_updated)
+		mic_virtio_init_post(mvdev);
+
+	if (mvdev->dc->vdev_reset)
+		mic_virtio_device_reset(mvdev);
+
+	mvdev->poll_wake = 1;
+	wake_up(&mvdev->waitq);
+}
+
+static irqreturn_t mic_virtio_intr_handler(int irq, void *data)
+{
+
+	struct mic_vdev *mvdev = data;
+	struct mic_device *mdev = mvdev->mdev;
+
+	mdev->ops->ack_interrupt(mdev);
+	schedule_work(&mvdev->virtio_bh_work);
+	return IRQ_HANDLED;
+}
+
+int mic_virtio_config_change(struct mic_vdev *mvdev,
+			void __user *argp)
+{
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+	int ret = 0, retry = 100, i;
+	struct mic_bootparam *bootparam = mvdev->mdev->dp;
+	s8 db = bootparam->h2c_config_db;
+
+	mutex_lock(&mvdev->mdev->mic_mutex);
+	for (i = 0; i < mvdev->dd->num_vq; i++)
+		mutex_lock_nested(&mvdev->mvr[i].vr_mutex, i + 1);
+
+	if (db == -1 || mvdev->dd->type == -1) {
+		ret = -EIO;
+		goto exit;
+	}
+
+	if (copy_from_user(mic_vq_configspace(mvdev->dd),
+				argp, mvdev->dd->config_len)) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -EFAULT);
+		ret = -EFAULT;
+		goto exit;
+	}
+	mvdev->dc->config_change = MIC_VIRTIO_PARAM_CONFIG_CHANGED;
+	mvdev->mdev->ops->send_intr(mvdev->mdev, db);
+
+	for (i = retry; i--;) {
+		ret = wait_event_timeout(wake,
+			mvdev->dc->guest_ack, msecs_to_jiffies(100));
+		if (ret)
+			break;
+	}
+
+	dev_dbg(mic_dev(mvdev),
+		"%s %d retry: %d\n", __func__, __LINE__, retry);
+	mvdev->dc->config_change = 0;
+	mvdev->dc->guest_ack = 0;
+exit:
+	for (i = 0; i < mvdev->dd->num_vq; i++)
+		mutex_unlock(&mvdev->mvr[i].vr_mutex);
+	mutex_unlock(&mvdev->mdev->mic_mutex);
+	return ret;
+}
+
+static int mic_copy_dp_entry(struct mic_vdev *mvdev,
+					void __user *argp,
+					__u8 *type,
+					struct mic_device_desc **devpage)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	struct mic_device_desc dd, *dd_config, *devp;
+	struct mic_vqconfig *vqconfig;
+	int ret = 0, i;
+	bool slot_found = false;
+
+	if (copy_from_user(&dd, argp, sizeof(dd))) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -EFAULT);
+		return -EFAULT;
+	}
+
+	if (mic_aligned_desc_size(&dd) > MIC_MAX_DESC_BLK_SIZE
+		|| dd.num_vq > MIC_MAX_VRINGS) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -EINVAL);
+		return -EINVAL;
+	}
+
+	dd_config = kmalloc(mic_desc_size(&dd), GFP_KERNEL);
+	if (dd_config == NULL) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -ENOMEM);
+		return -ENOMEM;
+	}
+	if (copy_from_user(dd_config, argp, mic_desc_size(&dd))) {
+		ret = -EFAULT;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		goto exit;
+	}
+
+	vqconfig = mic_vq_config(dd_config);
+	for (i = 0; i < dd.num_vq; i++) {
+		if (le16_to_cpu(vqconfig[i].num) > MIC_MAX_VRING_ENTRIES) {
+			ret =  -EINVAL;
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto exit;
+		}
+	}
+
+	/* Find the first free device page entry */
+	for (i = mic_aligned_size(struct mic_bootparam);
+		i < MIC_DP_SIZE - mic_total_desc_size(dd_config);
+		i += mic_total_desc_size(devp)) {
+		devp = mdev->dp + i;
+		if (devp->type == 0 || devp->type == -1) {
+			slot_found = true;
+			break;
+		}
+	}
+	if (!slot_found) {
+		ret =  -EINVAL;
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, ret);
+		goto exit;
+	}
+	/*
+	 * Save off the type before doing the memcpy. Type will be set in the
+	 * end after completing all initialization for the new device.
+	 */
+	*type = dd_config->type;
+	dd_config->type = 0;
+	memcpy(devp, dd_config, mic_desc_size(dd_config));
+
+	*devpage = devp;
+exit:
+	kfree(dd_config);
+	return ret;
+}
+
+static void mic_init_device_ctrl(struct mic_vdev *mvdev,
+				struct mic_device_desc *devpage)
+{
+	struct mic_device_ctrl *dc;
+
+	dc = mvdev->dc = (void *)devpage + mic_aligned_desc_size(devpage);
+
+	dc->config_change = 0;
+	dc->guest_ack = 0;
+	dc->vdev_reset = 0;
+	dc->host_ack = 0;
+	dc->used_address_updated = 0;
+	dc->c2h_vdev_db = -1;
+	dc->h2c_vdev_db = -1;
+}
+
+int mic_virtio_add_device(struct mic_vdev *mvdev,
+			void __user *argp)
+{
+	struct mic_device *mdev = mvdev->mdev;
+	struct mic_device_desc *dd;
+	struct mic_vqconfig *vqconfig;
+	int vr_size, i, j, ret;
+	u8 type;
+	s8 db;
+	char irqname[10];
+	struct mic_bootparam *bootparam = mdev->dp;
+	u16 num;
+
+	mutex_lock(&mdev->mic_mutex);
+
+	ret = mic_copy_dp_entry(mvdev, argp, &type, &dd);
+	if (ret) {
+		mutex_unlock(&mdev->mic_mutex);
+		return ret;
+	}
+
+	mic_init_device_ctrl(mvdev, dd);
+
+	mvdev->dd = dd;
+	mvdev->virtio_id = type;
+	vqconfig = mic_vq_config(dd);
+	INIT_WORK(&mvdev->virtio_bh_work, mic_bh_handler);
+
+	for (i = 0; i < dd->num_vq; i++) {
+		struct mic_vringh *mvr = &mvdev->mvr[i];
+		struct mic_vring *vr = &mvdev->mvr[i].vring;
+		num = le16_to_cpu(vqconfig[i].num);
+		mutex_init(&mvr->vr_mutex);
+		vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) +
+			sizeof(struct _mic_vring_info));
+		vr->va = (void *)
+			__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+			get_order(vr_size));
+		if (!vr->va) {
+			ret = -ENOMEM;
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vr->len = vr_size;
+		vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN);
+		vr->info->magic = MIC_MAGIC + mvdev->virtio_id + i;
+		vqconfig[i].address = mic_map_single(mdev,
+			vr->va, vr_size);
+		if (mic_map_error(vqconfig[i].address)) {
+			free_pages((unsigned long)vr->va,
+				get_order(vr_size));
+			ret = -ENOMEM;
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vqconfig[i].address = cpu_to_le64(vqconfig[i].address);
+
+		vring_init(&vr->vr, num,
+			vr->va, MIC_VIRTIO_RING_ALIGN);
+		ret = vringh_init_kern(&mvr->vrh,
+			*(u32 *)mic_vq_features(mvdev->dd), num, false,
+			vr->vr.desc, vr->vr.avail, vr->vr.used);
+		if (ret) {
+			dev_err(mic_dev(mvdev), "%s %d err %d\n",
+				__func__, __LINE__, ret);
+			goto err;
+		}
+		vringh_kiov_init(&mvr->riov, NULL, 0);
+		vringh_kiov_init(&mvr->wiov, NULL, 0);
+		mvr->head = USHRT_MAX;
+		mvr->mvdev = mvdev;
+		mvr->vrh.notify = mic_notify;
+		dev_dbg(mdev->sdev->parent,
+			"%s %d index %d va %p info %p vr_size 0x%x\n",
+			__func__, __LINE__, i, vr->va, vr->info, vr_size);
+	}
+
+	snprintf(irqname, sizeof(irqname),
+		"mic%dvirtio%d", mdev->id, mvdev->virtio_id);
+	mvdev->virtio_db = mic_next_db(mdev);
+	mvdev->virtio_cookie = mic_request_irq(mdev, mic_virtio_intr_handler,
+			irqname, mvdev, mvdev->virtio_db, MIC_INTR_DB);
+	if (IS_ERR(mvdev->virtio_cookie)) {
+		ret = PTR_ERR(mvdev->virtio_cookie);
+		dev_dbg(mdev->sdev->parent, "request irq failed\n");
+		goto err;
+	}
+
+	mvdev->dc->c2h_vdev_db = mvdev->virtio_db;
+
+	list_add_tail(&mvdev->list, &mdev->vdev_list);
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	dd->type = type;
+
+	dev_dbg(mdev->sdev->parent, "Added virtio device id %d\n", dd->type);
+
+	db = bootparam->h2c_config_db;
+	if (db != -1)
+		mdev->ops->send_intr(mdev, db);
+	mutex_unlock(&mdev->mic_mutex);
+	return 0;
+err:
+	vqconfig = mic_vq_config(dd);
+	for (j = 0; j < i; j++) {
+		struct mic_vringh *mvr = &mvdev->mvr[j];
+		mic_unmap_single(mdev, le64_to_cpu(vqconfig[j].address),
+				mvr->vring.len);
+		free_pages((unsigned long)mvr->vring.va,
+			get_order(mvr->vring.len));
+	}
+	mutex_unlock(&mdev->mic_mutex);
+	return ret;
+}
+
+void mic_virtio_del_device(struct mic_vdev *mvdev)
+{
+	struct list_head *pos, *tmp;
+	struct mic_vdev *tmp_mvdev;
+	struct mic_device *mdev = mvdev->mdev;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wake);
+	int i, ret, retry = 100;
+	struct mic_vqconfig *vqconfig;
+	struct mic_bootparam *bootparam = mdev->dp;
+	s8 db;
+
+	mutex_lock(&mdev->mic_mutex);
+	db = bootparam->h2c_config_db;
+	if (db == -1)
+		goto skip_hot_remove;
+	dev_dbg(mdev->sdev->parent,
+		"Requesting hot remove id %d\n", mvdev->virtio_id);
+	mvdev->dc->config_change = MIC_VIRTIO_PARAM_DEV_REMOVE;
+	mdev->ops->send_intr(mdev, db);
+	for (i = retry; i--;) {
+		ret = wait_event_timeout(wake,
+			mvdev->dc->guest_ack, msecs_to_jiffies(100));
+		if (ret)
+			break;
+	}
+	dev_dbg(mdev->sdev->parent,
+		"Device id %d config_change %d guest_ack %d\n",
+		mvdev->virtio_id, mvdev->dc->config_change,
+		mvdev->dc->guest_ack);
+	mvdev->dc->config_change = 0;
+	mvdev->dc->guest_ack = 0;
+skip_hot_remove:
+	mic_free_irq(mdev, mvdev->virtio_cookie, mvdev);
+	flush_work(&mvdev->virtio_bh_work);
+	vqconfig = mic_vq_config(mvdev->dd);
+	for (i = 0; i < mvdev->dd->num_vq; i++) {
+		struct mic_vringh *mvr = &mvdev->mvr[i];
+		vringh_kiov_cleanup(&mvr->riov);
+		vringh_kiov_cleanup(&mvr->wiov);
+		mic_unmap_single(mdev, le64_to_cpu(vqconfig[i].address),
+				mvr->vring.len);
+		free_pages((unsigned long)mvr->vring.va,
+			get_order(mvr->vring.len));
+	}
+
+	list_for_each_safe(pos, tmp, &mdev->vdev_list) {
+		tmp_mvdev = list_entry(pos, struct mic_vdev, list);
+		if (tmp_mvdev == mvdev) {
+			list_del(pos);
+			dev_dbg(mdev->sdev->parent,
+				"Removing virtio device id %d\n",
+				mvdev->virtio_id);
+			break;
+		}
+	}
+	/*
+	 * Order the type update with previous stores. This write barrier
+	 * is paired with the corresponding read barrier before the uncached
+	 * system memory read of the type, on the card while scanning the
+	 * device page.
+	 */
+	smp_wmb();
+	mvdev->dd->type = -1;
+	mutex_unlock(&mdev->mic_mutex);
+}
diff --git a/drivers/misc/mic/host/mic_virtio.h b/drivers/misc/mic/host/mic_virtio.h
new file mode 100644
index 0000000..184f3c8
--- /dev/null
+++ b/drivers/misc/mic/host/mic_virtio.h
@@ -0,0 +1,138 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef MIC_VIRTIO_H
+#define MIC_VIRTIO_H
+
+#include <linux/virtio_config.h>
+#include <linux/mic_ioctl.h>
+
+/*
+ * Note on endianness.
+ * 1. Host can be either BE or LE
+ * 2. Guest/card is LE. Host uses le_to_cpu to access desc/avail
+ *    rings and ioreadXX/iowriteXX to access used ring.
+ * 3. Device page exposed by host to guest contains LE values. Guest
+ *    accesses these using ioreadXX/iowriteXX etc. This way in general we
+ *    obey the virtio spec according to which guest works with native
+ *    endianness and host is aware of guest endianness and does all
+ *    required endianness conversion.
+ * 4. Data provided from user space to guest (in ADD_DEVICE and
+ *    CONFIG_CHANGE ioctl's) is not interpreted by the driver and should be
+ *    in guest endianness.
+ */
+
+/**
+ * struct mic_vringh - Virtio ring host information.
+ *
+ * @vring: The MIC vring used for setting up user space mappings.
+ * @vrh: The host VRINGH used for accessing the card vrings.
+ * @riov: The VRINGH read kernel IOV.
+ * @wiov: The VRINGH write kernel IOV.
+ * @head: The VRINGH head index address passed to vringh_getdesc_kern(..).
+ * @vr_mutex: Mutex for synchronizing access to the VRING.
+ * @mvdev: Back pointer to MIC virtio device for vringh_notify(..).
+ */
+struct mic_vringh {
+	struct mic_vring vring;
+	struct vringh vrh;
+	struct vringh_kiov riov;
+	struct vringh_kiov wiov;
+	u16 head;
+	struct mutex vr_mutex;
+	struct mic_vdev *mvdev;
+};
+
+/**
+ * struct mic_vdev - Host information for a card Virtio device.
+ *
+ * @virtio_id: Virtio device id.
+ * @waitq: Waitqueue to allow ring3 apps to poll.
+ * @mdev: Back pointer to host MIC device.
+ * @poll_wake: Used for waking up threads blocked in poll.
+ * @out_bytes: Debug stats for number of bytes copied from host to card.
+ * @in_bytes: Debug stats for number of bytes copied from card to host.
+ * @mvr: Store per VRING data structures.
+ * @virtio_bh_work: Work struct used to schedule virtio bottom half handling.
+ * @dd: Virtio device descriptor.
+ * @dc: Virtio device control fields.
+ * @list: List of Virtio devices.
+ * @virtio_db: The doorbell used by the card to interrupt the host.
+ * @virtio_cookie: The cookie returned while requesting interrupts.
+ */
+struct mic_vdev {
+	int virtio_id;
+	wait_queue_head_t waitq;
+	struct mic_device *mdev;
+	int poll_wake;
+	unsigned long out_bytes;
+	unsigned long in_bytes;
+	struct mic_vringh mvr[MIC_MAX_VRINGS];
+	struct work_struct virtio_bh_work;
+	struct mic_device_desc *dd;
+	struct mic_device_ctrl *dc;
+	struct list_head list;
+	int virtio_db;
+	struct mic_irq *virtio_cookie;
+};
+
+void mic_virtio_uninit(struct mic_device *mdev);
+int mic_virtio_add_device(struct mic_vdev *mvdev,
+			void __user *argp);
+void mic_virtio_del_device(struct mic_vdev *mvdev);
+int mic_virtio_config_change(struct mic_vdev *mvdev,
+			void __user *argp);
+int mic_virtio_copy_desc(struct mic_vdev *mvdev,
+	struct mic_copy_desc *request);
+void mic_virtio_reset_devices(struct mic_device *mdev);
+void mic_bh_handler(struct work_struct *work);
+
+/* Helper API to obtain the MIC PCIe device */
+static inline struct device *mic_dev(struct mic_vdev *mvdev)
+{
+	return mvdev->mdev->sdev->parent;
+}
+
+/* Helper API to check if a virtio device is initialized */
+static inline int mic_vdev_inited(struct mic_vdev *mvdev)
+{
+	/* Device has not been created yet */
+	if (!mvdev->dd || !mvdev->dd->type) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -EINVAL);
+		return -EINVAL;
+	}
+
+	/* Device has been removed/deleted */
+	if (mvdev->dd->type == -1) {
+		dev_err(mic_dev(mvdev), "%s %d err %d\n",
+			__func__, __LINE__, -ENODEV);
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+/* Helper API to check if a virtio device is running */
+static inline bool mic_vdevup(struct mic_vdev *mvdev)
+{
+	return !!mvdev->dd->status;
+}
+#endif
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index e3f7b37..c73dbd0 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -241,6 +241,7 @@ header-y += mei.h
 header-y += mempolicy.h
 header-y += meye.h
 header-y += mic_common.h
+header-y += mic_ioctl.h
 header-y += mii.h
 header-y += minix_fs.h
 header-y += mman.h
diff --git a/include/uapi/linux/mic_common.h b/include/uapi/linux/mic_common.h
index a9091e5..364ac75 100644
--- a/include/uapi/linux/mic_common.h
+++ b/include/uapi/linux/mic_common.h
@@ -21,7 +21,60 @@
 #ifndef __MIC_COMMON_H_
 #define __MIC_COMMON_H_
 
-#include <linux/types.h>
+#include <linux/virtio_ring.h>
+
+#ifndef __KERNEL__
+#define ALIGN(a, x)	(((a) + (x) - 1) & ~((x) - 1))
+#define __aligned(x)	__attribute__ ((aligned(x)))
+#endif
+
+#define mic_aligned_size(x) ALIGN(sizeof(x), 8)
+
+/**
+ * struct mic_device_desc: Virtio device information shared between the
+ * virtio driver and userspace backend
+ *
+ * @type: Device type: console/network/disk etc.  Type 0/-1 terminates.
+ * @num_vq: Number of virtqueues.
+ * @feature_len: Number of bytes of feature bits.  Multiply by 2: one for
+ * host features and one for guest acknowledgements.
+ * @config_len: Number of bytes of the config array after virtqueues.
+ * @status: A status byte, written by the Guest.
+ * @config: Start of the following variable length config.
+ */
+struct mic_device_desc {
+	__s8 type;
+	__u8 num_vq;
+	__u8 feature_len;
+	__u8 config_len;
+	__u8 status;
+	__u64 config[0];
+} __aligned(8);
+
+/**
+ * struct mic_device_ctrl: Per virtio device information in the device page
+ * used internally by the host and card side drivers.
+ *
+ * @vdev: Used for storing MIC vdev information by the guest.
+ * @config_change: Set to 1 by host when a config change is requested.
+ * @vdev_reset: Set to 1 by guest to indicate virtio device has been reset.
+ * @guest_ack: Set to 1 by guest to ack a command.
+ * @host_ack: Set to 1 by host to ack a command.
+ * @used_address_updated: Set to 1 by guest when the used address should be
+ * updated.
+ * @c2h_vdev_db: The doorbell number to be used by guest. Set by host.
+ * @h2c_vdev_db: The doorbell number to be used by host. Set by guest.
+ */
+struct mic_device_ctrl {
+	__u64 vdev;
+	__u8 config_change;
+	__u8 vdev_reset;
+	__u8 guest_ack;
+	__u8 host_ack;
+	__u8 used_address_updated;
+	__s8 c2h_vdev_db;
+	__s8 h2c_vdev_db;
+} __aligned(8);
 
 /**
  * struct mic_bootparam: Virtio device independent information in device page
@@ -42,6 +95,117 @@ struct mic_bootparam {
 	__u8 shutdown_card;
 } __aligned(8);
 
+/**
+ * struct mic_device_page: High level representation of the device page
+ *
+ * @bootparam: The bootparam structure is used for sharing information and
+ * status updates between MIC host and card drivers.
+ * @desc: Array of MIC virtio device descriptors.
+ */
+struct mic_device_page {
+	struct mic_bootparam bootparam;
+	struct mic_device_desc desc[0];
+};
+/**
+ * struct mic_vqconfig: This is how we expect the device configuration field
+ * for a virtqueue to be laid out in config space.
+ *
+ * @address: Guest/MIC physical address of the virtio ring
+ * (avail and desc rings)
+ * @used_address: Guest/MIC physical address of the used ring
+ * @num: The number of entries in the virtio_ring
+ */
+struct mic_vqconfig {
+	__u64 address;
+	__u64 used_address;
+	__u16 num;
+} __aligned(8);
+
+/*
+ * The alignment to use between consumer and producer parts of vring.
+ * This is pagesize for historical reasons.
+ */
+#define MIC_VIRTIO_RING_ALIGN		4096
+
+#define MIC_MAX_VRINGS			4
+#define MIC_VRING_ENTRIES		128
+
+/*
+ * Max vring entries (power of 2) to ensure desc and avail rings
+ * fit in a single page
+ */
+#define MIC_MAX_VRING_ENTRIES		128
+
+/**
+ * Max size of the desc block in bytes: includes:
+ *	- struct mic_device_desc
+ *	- struct mic_vqconfig (num_vq of these)
+ *	- host and guest features
+ *	- virtio device config space
+ */
+#define MIC_MAX_DESC_BLK_SIZE		256
+
+/**
+ * struct _mic_vring_info - Host vring info exposed to userspace backend
+ * for the avail index and magic for the card.
+ *
+ * @avail_idx: host avail idx
+ * @magic: A magic debug cookie.
+ */
+struct _mic_vring_info {
+	__u16 avail_idx;
+	int magic;
+};
+
+/**
+ * struct mic_vring - Vring information.
+ *
+ * @vr: The virtio ring.
+ * @info: Host vring information exposed to the userspace backend for the
+ * avail index and magic for the card.
+ * @va: The va for the buffer allocated for vr and info.
+ * @len: The length of the buffer required for allocating vr and info.
+ */
+struct mic_vring {
+	struct vring vr;
+	struct _mic_vring_info *info;
+	void *va;
+	int len;
+};
+
+#define mic_aligned_desc_size(d) ALIGN(mic_desc_size(d), 8)
+
+#ifndef INTEL_MIC_CARD
+static inline unsigned mic_desc_size(const struct mic_device_desc *desc)
+{
+	return mic_aligned_size(*desc)
+		+ desc->num_vq * mic_aligned_size(struct mic_vqconfig)
+		+ desc->feature_len * 2
+		+ desc->config_len;
+}
+
+static inline struct mic_vqconfig *
+mic_vq_config(const struct mic_device_desc *desc)
+{
+	return (struct mic_vqconfig *)(desc + 1);
+}
+
+static inline __u8 *mic_vq_features(const struct mic_device_desc *desc)
+{
+	return (__u8 *)(mic_vq_config(desc) + desc->num_vq);
+}
+
+static inline __u8 *mic_vq_configspace(const struct mic_device_desc *desc)
+{
+	return mic_vq_features(desc) + desc->feature_len * 2;
+}
+static inline unsigned mic_total_desc_size(struct mic_device_desc *desc)
+{
+	return mic_aligned_desc_size(desc) +
+		mic_aligned_size(struct mic_device_ctrl);
+}
+#endif
+
 /* Device page size */
 #define MIC_DP_SIZE 4096
 
diff --git a/include/uapi/linux/mic_ioctl.h b/include/uapi/linux/mic_ioctl.h
new file mode 100644
index 0000000..0e6cbf3
--- /dev/null
+++ b/include/uapi/linux/mic_ioctl.h
@@ -0,0 +1,74 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef _MIC_IOCTL_H_
+#define _MIC_IOCTL_H_
+
+/*
+ * struct mic_copy_desc - MIC virtio descriptor copy.
+ *
+ * @iov: An array of IOVEC structures containing user space buffers.
+ * @iovcnt: Number of IOVEC structures in iov.
+ * @vr_idx: The vring index.
+ * @update_used: A non zero value results in used index being updated.
+ * @out_len: The aggregate of the total length written to or read from
+ *	the virtio device.
+ */
+struct mic_copy_desc {
+#ifdef __KERNEL__
+	struct iovec __user *iov;
+#else
+	struct iovec *iov;
+#endif
+	int iovcnt;
+	__u8 vr_idx;
+	__u8 update_used;
+	__u32 out_len;
+};
+
+/*
+ * Add a new virtio device
+ * The (struct mic_device_desc *) pointer points to a device page entry
+ *	for the virtio device consisting of:
+ *	- struct mic_device_desc
+ *	- struct mic_vqconfig (num_vq of these)
+ *	- host and guest features
+ *	- virtio device config space
+ * The total size referenced by the pointer should equal the size returned
+ * by mic_desc_size() in mic_common.h.
+ */
+#define MIC_VIRTIO_ADD_DEVICE _IOWR('s', 1, struct mic_device_desc *)
+
+/*
+ * Copy the number of entries in the iovec and update the used index
+ * if requested by the user.
+ */
+#define MIC_VIRTIO_COPY_DESC	_IOWR('s', 2, struct mic_copy_desc *)
+
+/*
+ * Notify virtio device of a config change
+ * The (__u8 *) pointer points to config space values for the device
+ * as they should be written into the device page. The total size
+ * referenced by the pointer should equal the config_len field of struct
+ * mic_device_desc.
+ */
+#define MIC_VIRTIO_CONFIG_CHANGE _IOWR('s', 5, __u8 *)
+
+#endif
-- 
1.8.2.1
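
A note on the descriptor block layout used by MIC_VIRTIO_ADD_DEVICE: the
block is a struct mic_device_desc, followed by num_vq struct mic_vqconfig
entries, then 2 * feature_len feature bytes, then config_len bytes of config
space. The user-space sketch below mirrors the arithmetic of mic_desc_size()
from mic_common.h. It is an illustration only: the demo_* names and the
mirrored struct definitions are hypothetical stand-ins for the uapi header,
and the struct attribute assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a up to the next multiple of x (x must be a power of two). */
#define DEMO_ALIGN(a, x) (((a) + (x) - 1) & ~((size_t)(x) - 1))

/* Same limit as MIC_MAX_DESC_BLK_SIZE in the patch. */
#define DEMO_MAX_DESC_BLK_SIZE 256

/* Hypothetical mirror of struct mic_device_desc, for illustration only. */
struct demo_device_desc {
	int8_t type;
	uint8_t num_vq;
	uint8_t feature_len;
	uint8_t config_len;
	uint8_t status;
	uint64_t config[];
} __attribute__((aligned(8)));

/* Hypothetical mirror of struct mic_vqconfig, for illustration only. */
struct demo_vqconfig {
	uint64_t address;
	uint64_t used_address;
	uint16_t num;
} __attribute__((aligned(8)));

/* Unaligned size: header + per-vq config + host/guest features + config. */
static size_t demo_desc_size(const struct demo_device_desc *d)
{
	assert(d != NULL);
	return DEMO_ALIGN(sizeof(*d), 8)
		+ d->num_vq * DEMO_ALIGN(sizeof(struct demo_vqconfig), 8)
		+ d->feature_len * 2
		+ d->config_len;
}

/* Size rounded up to 8 bytes, as mic_aligned_desc_size() does. */
static size_t demo_aligned_desc_size(const struct demo_device_desc *d)
{
	return DEMO_ALIGN(demo_desc_size(d), 8);
}
```

For a device with num_vq = 2, feature_len = 4 and config_len = 4 this gives
8 + 2 * 24 + 8 + 4 = 68 bytes, or 72 bytes after rounding up to 8, which is
well within the 256 byte MIC_MAX_DESC_BLK_SIZE limit.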


^ permalink raw reply related	[relevance 26%]

* [PATCH RESEND v3 4/7] Intel MIC Card Driver for X100 family.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
                   ` (2 preceding siblings ...)
  2013-09-05 23:41 28% ` [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management Sudeep Dutt
@ 2013-09-05 23:42 42% ` Sudeep Dutt
  2013-09-05 23:42 26% ` [PATCH RESEND v3 5/7] Intel MIC Host Driver Changes for Virtio Devices Sudeep Dutt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

This patch does the following:
a) Initializes the Intel MIC X100 platform device and driver.
b) Sets up support to handle shutdown requests from the host.
c) Maps the device page after obtaining the device page address
from the scratchpad registers updated by the host.
d) Informs the host upon a card crash by registering a panic notifier.
e) Informs the host upon a poweroff/halt event.

Co-author: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
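
Note on item c) above: the host publishes the 64-bit device page address as
two 32-bit scratchpad values, and mic_dp_init() recombines them before
mapping the page. A minimal user-space sketch of that recombination follows.
The demo_* names and the scratchpad indices are hypothetical; the real
MIC_DPLO_SPAD and MIC_DPHI_SPAD values are defined elsewhere in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scratchpad indices, not the real MIC_DP{LO,HI}_SPAD values. */
enum { DEMO_DPLO_SPAD = 0, DEMO_DPHI_SPAD = 1, DEMO_NUM_SPAD = 2 };

/*
 * Combine the low and high scratchpad words into one 64-bit address.
 * Both halves are widened to 64 bits before the shift; shifting a
 * 32-bit value left by 32 would be undefined behaviour in C.
 */
static uint64_t demo_dp_addr(const uint32_t *spad)
{
	uint64_t lo, hi;

	assert(spad != NULL);
	lo = spad[DEMO_DPLO_SPAD];
	hi = spad[DEMO_DPHI_SPAD];
	return lo | (hi << 32);
}
```

The patch gets the same effect by declaring lo and hi as u64 before
computing lo | (hi << 32).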
 drivers/misc/mic/Kconfig            |  18 +++
 drivers/misc/mic/Makefile           |   1 +
 drivers/misc/mic/card/Makefile      |  10 ++
 drivers/misc/mic/card/mic_debugfs.c | 130 ++++++++++++++++
 drivers/misc/mic/card/mic_device.c  | 299 ++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/card/mic_device.h  | 133 ++++++++++++++++
 drivers/misc/mic/card/mic_x100.c    | 256 ++++++++++++++++++++++++++++++
 drivers/misc/mic/card/mic_x100.h    |  48 ++++++
 8 files changed, 895 insertions(+)
 create mode 100644 drivers/misc/mic/card/Makefile
 create mode 100644 drivers/misc/mic/card/mic_debugfs.c
 create mode 100644 drivers/misc/mic/card/mic_device.c
 create mode 100644 drivers/misc/mic/card/mic_device.h
 create mode 100644 drivers/misc/mic/card/mic_x100.c
 create mode 100644 drivers/misc/mic/card/mic_x100.h
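
Note on doorbell allocation: mic_next_card_db() in mic_device.c returns the
doorbell index with the lowest usage count, so that interrupt sources are
spread across the available doorbells. The stand-alone sketch below shows
the same selection rule. The function name and the plain array argument are
hypothetical; the driver reads the counts from mdrv->irq_info instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the smallest usage count. The comparison is
 * strict, so on a tie the lowest index wins, as in the driver loop.
 */
static int demo_next_db(const int *usage, size_t num)
{
	size_t i, index = 0;

	assert(usage != NULL && num > 0);
	for (i = 1; i < num; i++) {
		if (usage[i] < usage[index])
			index = i;
	}
	return (int)index;
}
```

With all counts at zero, as they are right after mic_init_irq(), this
selects doorbell 0.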

diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
index aaefd0c..279a2e6 100644
--- a/drivers/misc/mic/Kconfig
+++ b/drivers/misc/mic/Kconfig
@@ -17,3 +17,21 @@ config INTEL_MIC_HOST
 	  More information about the Intel MIC family as well as the Linux
 	  OS and tools for MIC to use with this driver are available from
 	  <http://software.intel.com/en-us/mic-developer>.
+
+comment "Intel MIC Card Driver"
+
+config INTEL_MIC_CARD
+	tristate "Intel MIC Card Driver"
+	depends on 64BIT
+	default n
+	help
+	  This enables card driver support for the Intel Many Integrated
+	  Core (MIC) device family. The card driver communicates shutdown/
+	  crash events to the host and allows registration/configuration of
+	  virtio devices. Intel MIC X100 devices are currently supported.
+
+	  If you are building a card kernel for an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  For more information see
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
index 8e72421..05b34d6 100644
--- a/drivers/misc/mic/Makefile
+++ b/drivers/misc/mic/Makefile
@@ -3,3 +3,4 @@
 # Copyright(c) 2013, Intel Corporation.
 #
 obj-$(CONFIG_INTEL_MIC_HOST) += host/
+obj-$(CONFIG_INTEL_MIC_CARD) += card/
diff --git a/drivers/misc/mic/card/Makefile b/drivers/misc/mic/card/Makefile
new file mode 100644
index 0000000..6e9675e
--- /dev/null
+++ b/drivers/misc/mic/card/Makefile
@@ -0,0 +1,10 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2013, Intel Corporation.
+#
+ccflags-y += -DINTEL_MIC_CARD
+
+obj-$(CONFIG_INTEL_MIC_CARD) += mic_card.o
+mic_card-y += mic_x100.o
+mic_card-y += mic_device.o
+mic_card-y += mic_debugfs.o
diff --git a/drivers/misc/mic/card/mic_debugfs.c b/drivers/misc/mic/card/mic_debugfs.c
new file mode 100644
index 0000000..95cf186
--- /dev/null
+++ b/drivers/misc/mic/card/mic_debugfs.c
@@ -0,0 +1,130 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#include <linux/debugfs.h>
+#include <linux/delay.h>
+#include <linux/seq_file.h>
+#include <linux/interrupt.h>
+#include <linux/device.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+
+/* Debugfs parent dir */
+static struct dentry *mic_dbg;
+
+/**
+ * mic_intr_test - Send interrupts to host.
+ */
+static int mic_intr_test(struct seq_file *s, void *unused)
+{
+	struct mic_driver *mdrv = s->private;
+	struct mic_device *mdev = &mdrv->mdev;
+
+	mic_send_intr(mdev, 0);
+	msleep(1000);
+	mic_send_intr(mdev, 1);
+	msleep(1000);
+	mic_send_intr(mdev, 2);
+	msleep(1000);
+	mic_send_intr(mdev, 3);
+	msleep(1000);
+
+	return 0;
+}
+
+static int mic_intr_test_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, mic_intr_test, inode->i_private);
+}
+
+static int mic_intr_test_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations intr_test_ops = {
+	.owner   = THIS_MODULE,
+	.open    = mic_intr_test_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = mic_intr_test_release
+};
+
+/**
+ * mic_create_card_debug_dir - Initialize MIC debugfs entries.
+ */
+void __init mic_create_card_debug_dir(struct mic_driver *mdrv)
+{
+	struct dentry *d;
+
+	if (!mic_dbg)
+		return;
+
+	mdrv->dbg_dir = debugfs_create_dir(mdrv->name, mic_dbg);
+	if (!mdrv->dbg_dir) {
+		dev_err(mdrv->dev, "Can't create dbg_dir %s\n", mdrv->name);
+		return;
+	}
+
+	d = debugfs_create_file("intr_test", 0444, mdrv->dbg_dir,
+		mdrv, &intr_test_ops);
+
+	if (!d) {
+		dev_err(mdrv->dev,
+			"Can't create dbg intr_test %s\n", mdrv->name);
+		return;
+	}
+}
+
+/**
+ * mic_delete_card_debug_dir - Uninitialize MIC debugfs entries.
+ */
+void mic_delete_card_debug_dir(struct mic_driver *mdrv)
+{
+	if (!mdrv->dbg_dir)
+		return;
+
+	debugfs_remove_recursive(mdrv->dbg_dir);
+}
+
+/**
+ * mic_init_card_debugfs - Initialize global debugfs entry.
+ */
+void __init mic_init_card_debugfs(void)
+{
+	mic_dbg = debugfs_create_dir(KBUILD_MODNAME, NULL);
+	if (!mic_dbg)
+		pr_err("can't create debugfs dir\n");
+}
+
+/**
+ * mic_exit_card_debugfs - Uninitialize global debugfs entry
+ */
+void mic_exit_card_debugfs(void)
+{
+	debugfs_remove(mic_dbg);
+}
diff --git a/drivers/misc/mic/card/mic_device.c b/drivers/misc/mic/card/mic_device.c
new file mode 100644
index 0000000..3c5c302
--- /dev/null
+++ b/drivers/misc/mic/card/mic_device.c
@@ -0,0 +1,299 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+#include <linux/reboot.h>
+
+#include <linux/mic_common.h>
+#include "../common/mic_device.h"
+#include "mic_device.h"
+
+static struct mic_driver *g_drv;
+static struct mic_irq *shutdown_cookie;
+
+static void mic_notify_host(u8 state)
+{
+	struct mic_driver *mdrv = g_drv;
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+
+	iowrite8(state, &bootparam->shutdown_status);
+	dev_dbg(mdrv->dev, "%s %d system_state %d\n",
+		__func__, __LINE__, state);
+	mic_send_intr(&mdrv->mdev, ioread8(&bootparam->c2h_shutdown_db));
+}
+
+static int mic_panic_event(struct notifier_block *this, unsigned long event,
+		void *ptr)
+{
+	struct mic_driver *mdrv = g_drv;
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+
+	iowrite8(-1, &bootparam->h2c_config_db);
+	iowrite8(-1, &bootparam->h2c_shutdown_db);
+	mic_notify_host(MIC_CRASHED);
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block mic_panic = {
+	.notifier_call  = mic_panic_event,
+};
+
+static irqreturn_t mic_shutdown_isr(int irq, void *data)
+{
+	struct mic_driver *mdrv = g_drv;
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+
+	mic_ack_interrupt(&g_drv->mdev);
+	if (ioread8(&bootparam->shutdown_card))
+		orderly_poweroff(true);
+	return IRQ_HANDLED;
+}
+
+static int mic_shutdown_init(void)
+{
+	int rc = 0;
+	struct mic_driver *mdrv = g_drv;
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+	int shutdown_db;
+
+	shutdown_db = mic_next_card_db();
+	shutdown_cookie = mic_request_card_irq(mic_shutdown_isr,
+			"Shutdown", mdrv, shutdown_db);
+	if (IS_ERR(shutdown_cookie))
+		rc = PTR_ERR(shutdown_cookie);
+	else
+		iowrite8(shutdown_db, &bootparam->h2c_shutdown_db);
+	return rc;
+}
+
+static void mic_shutdown_uninit(void)
+{
+	struct mic_driver *mdrv = g_drv;
+	struct mic_bootparam __iomem *bootparam = mdrv->dp;
+
+	iowrite8(-1, &bootparam->h2c_shutdown_db);
+	mic_free_card_irq(shutdown_cookie, mdrv);
+}
+
+static int __init mic_dp_init(void)
+{
+	struct mic_driver *mdrv = g_drv;
+	struct mic_device *mdev = &mdrv->mdev;
+	struct mic_bootparam __iomem *bootparam;
+	u64 lo, hi, dp_dma_addr;
+	u32 magic;
+
+	lo = mic_read_spad(&mdrv->mdev, MIC_DPLO_SPAD);
+	hi = mic_read_spad(&mdrv->mdev, MIC_DPHI_SPAD);
+
+	dp_dma_addr = lo | (hi << 32);
+	mdrv->dp = mic_card_map(mdev, dp_dma_addr, MIC_DP_SIZE);
+	if (!mdrv->dp) {
+		dev_err(mdrv->dev, "Cannot remap Aperture BAR\n");
+		return -ENOMEM;
+	}
+	bootparam = mdrv->dp;
+	magic = ioread32(&bootparam->magic);
+	if (MIC_MAGIC != magic) {
+		dev_err(mdrv->dev, "bootparam magic mismatch 0x%x\n", magic);
+		return -EIO;
+	}
+	return 0;
+}
+
+/* Uninitialize the device page */
+static void mic_dp_uninit(void)
+{
+	mic_card_unmap(&g_drv->mdev, g_drv->dp);
+}
+
+/**
+ * mic_request_card_irq - request an irq.
+ *
+ * @func: The callback function that handles the interrupt.
+ * @name: The ASCII name of the caller requesting the irq.
+ * @data: private data that is passed back when calling the
+ * function handler.
+ * @index: The doorbell index of the requester.
+ *
+ * returns: The cookie that is transparent to the caller. Passed
+ * back when calling mic_free_card_irq. An appropriate error code
+ * is returned on failure. Caller needs to use IS_ERR(return_val)
+ * to check for failure and PTR_ERR(return_val) to obtain the
+ * error code.
+ *
+ */
+struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
+	const char *name, void *data, int index)
+{
+	int rc = 0;
+	unsigned long cookie;
+	struct mic_driver *mdrv = g_drv;
+
+	rc  = request_irq(mic_db_to_irq(mdrv, index), func,
+		0, name, data);
+	if (rc) {
+		dev_err(mdrv->dev, "request_irq failed rc = %d\n", rc);
+		goto err;
+	}
+	mdrv->irq_info.irq_usage_count[index]++;
+	cookie = index;
+	return (struct mic_irq *)cookie;
+err:
+	return ERR_PTR(rc);
+
+}
+
+/**
+ * mic_free_card_irq - free irq.
+ *
+ * @cookie: cookie obtained during a successful call to mic_request_card_irq
+ * @data: private data specified by the calling function during the
+ * mic_request_card_irq call
+ *
+ * returns: none.
+ */
+void mic_free_card_irq(struct mic_irq *cookie, void *data)
+{
+	int index;
+	struct mic_driver *mdrv = g_drv;
+
+	index = (unsigned long)cookie & 0xFFFFU;
+	free_irq(mic_db_to_irq(mdrv, index), data);
+	mdrv->irq_info.irq_usage_count[index]--;
+}
+
+/**
+ * mic_next_card_db - Get the doorbell with minimum usage count.
+ *
+ * Returns the irq index.
+ */
+int mic_next_card_db(void)
+{
+	int i;
+	int index = 0;
+	struct mic_driver *mdrv = g_drv;
+
+	for (i = 0; i < mdrv->intr_info.num_intr; i++) {
+		if (mdrv->irq_info.irq_usage_count[i] <
+			mdrv->irq_info.irq_usage_count[index])
+			index = i;
+	}
+
+	return index;
+}
+
+/**
+ * mic_init_irq - Initialize irq information.
+ *
+ * Returns 0 on success. Appropriate error code on failure.
+ */
+static int mic_init_irq(void)
+{
+	struct mic_driver *mdrv = g_drv;
+
+	mdrv->irq_info.irq_usage_count = kzalloc((sizeof(u32) *
+			mdrv->intr_info.num_intr),
+			GFP_KERNEL);
+	if (!mdrv->irq_info.irq_usage_count)
+		return -ENOMEM;
+	return 0;
+}
+
+/**
+ * mic_uninit_irq - Uninitialize irq information.
+ *
+ * None.
+ */
+static void mic_uninit_irq(void)
+{
+	struct mic_driver *mdrv = g_drv;
+
+	kfree(mdrv->irq_info.irq_usage_count);
+}
+
+/*
+ * mic_driver_init - MIC driver initialization tasks.
+ *
+ * Returns 0 on success. Appropriate error code on failure.
+ */
+int __init mic_driver_init(struct mic_driver *mdrv)
+{
+	int rc;
+
+	g_drv = mdrv;
+	/*
+	 * Unloading the card module is not supported. The MIC card module
+	 * handles fundamental operations like host/card initiated shutdowns
+	 * and informing the host about card crashes and cannot be unloaded.
+	 */
+	if (!try_module_get(mdrv->dev->driver->owner)) {
+		rc = -ENODEV;
+		goto done;
+	}
+	rc = mic_dp_init();
+	if (rc)
+		goto put;
+	rc = mic_init_irq();
+	if (rc)
+		goto dp_uninit;
+	rc = mic_shutdown_init();
+	if (rc)
+		goto irq_uninit;
+	mic_create_card_debug_dir(mdrv);
+	atomic_notifier_chain_register(&panic_notifier_list, &mic_panic);
+done:
+	return rc;
+irq_uninit:
+	mic_uninit_irq();
+dp_uninit:
+	mic_dp_uninit();
+put:
+	module_put(mdrv->dev->driver->owner);
+	return rc;
+}
+
+/*
+ * mic_driver_uninit - MIC driver uninitialization tasks.
+ *
+ * Returns None
+ */
+void mic_driver_uninit(struct mic_driver *mdrv)
+{
+	mic_delete_card_debug_dir(mdrv);
+	/*
+	 * Inform the host about the shutdown status i.e. poweroff/restart etc.
+	 * The module cannot be unloaded so the only code path to call
+	 * mic_driver_uninit(..) is the shutdown callback.
+	 */
+	mic_notify_host(system_state);
+	mic_shutdown_uninit();
+	mic_uninit_irq();
+	mic_dp_uninit();
+	module_put(mdrv->dev->driver->owner);
+}
diff --git a/drivers/misc/mic/card/mic_device.h b/drivers/misc/mic/card/mic_device.h
new file mode 100644
index 0000000..347b9b3
--- /dev/null
+++ b/drivers/misc/mic/card/mic_device.h
@@ -0,0 +1,133 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#ifndef _MIC_CARD_DEVICE_H_
+#define _MIC_CARD_DEVICE_H_
+
+#include <linux/workqueue.h>
+#include <linux/io.h>
+
+/**
+ * struct mic_intr_info - Contains h/w specific interrupt sources info
+ *
+ * @num_intr: The number of irqs available
+ */
+struct mic_intr_info {
+	u32 num_intr;
+};
+
+/**
+ * struct mic_irq_info - OS specific irq information
+ *
+ * @irq_usage_count: usage count array tracking the number of sources
+ * assigned for each irq.
+ */
+struct mic_irq_info {
+	int *irq_usage_count;
+};
+
+/**
+ * struct mic_device -  MIC device information.
+ *
+ * @mmio: MMIO bar information.
+ */
+struct mic_device {
+	struct mic_mw mmio;
+};
+
+/**
+ * struct mic_driver - MIC card driver information.
+ *
+ * @name: Name for MIC driver.
+ * @dbg_dir: debugfs directory of this MIC device.
+ * @dev: The device backing this MIC.
+ * @dp: The pointer to the virtio device page.
+ * @mdev: MIC device information for the host.
+ * @hotplug_work: Hot plug work for adding/removing virtio devices.
+ * @irq_info: The OS specific irq information
+ * @intr_info: H/W specific interrupt information.
+ */
+struct mic_driver {
+	char name[20];
+	struct dentry *dbg_dir;
+	struct device *dev;
+	void __iomem *dp;
+	struct mic_device mdev;
+	struct work_struct hotplug_work;
+	struct mic_irq_info irq_info;
+	struct mic_intr_info intr_info;
+};
+
+/**
+ * struct mic_irq - opaque pointer used as cookie
+ */
+struct mic_irq;
+
+/**
+ * mic_mmio_read - read from an MMIO register.
+ * @mw: MMIO register base virtual address.
+ * @offset: register offset.
+ *
+ * RETURNS: register value.
+ */
+static inline u32 mic_mmio_read(struct mic_mw *mw, u32 offset)
+{
+	return ioread32(mw->va + offset);
+}
+
+/**
+ * mic_mmio_write - write to an MMIO register.
+ * @mw: MMIO register base virtual address.
+ * @val: the data value to put into the register
+ * @offset: register offset.
+ *
+ * RETURNS: none.
+ */
+static inline void
+mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
+{
+	iowrite32(val, mw->va + offset);
+}
+
+int mic_driver_init(struct mic_driver *mdrv);
+void mic_driver_uninit(struct mic_driver *mdrv);
+int mic_next_card_db(void);
+struct mic_irq *mic_request_card_irq(irqreturn_t (*func)(int irq, void *data),
+	const char *name, void *data, int intr_src);
+void mic_free_card_irq(struct mic_irq *cookie, void *data);
+u32 mic_read_spad(struct mic_device *mdev, unsigned int idx);
+void mic_send_intr(struct mic_device *mdev, int doorbell);
+int mic_db_to_irq(struct mic_driver *mdrv, int db);
+u32 mic_ack_interrupt(struct mic_device *mdev);
+void mic_hw_intr_init(struct mic_driver *mdrv);
+void __iomem *
+mic_card_map(struct mic_device *mdev, dma_addr_t addr, size_t size);
+void mic_card_unmap(struct mic_device *mdev, void __iomem *addr);
+void __init mic_create_card_debug_dir(struct mic_driver *mdrv);
+void mic_delete_card_debug_dir(struct mic_driver *mdrv);
+void __init mic_init_card_debugfs(void);
+void mic_exit_card_debugfs(void);
+#endif
diff --git a/drivers/misc/mic/card/mic_x100.c b/drivers/misc/mic/card/mic_x100.c
new file mode 100644
index 0000000..7cb3469
--- /dev/null
+++ b/drivers/misc/mic/card/mic_x100.c
@@ -0,0 +1,256 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_x100.h"
+
+static const char mic_driver_name[] = "mic";
+
+static struct mic_driver g_drv;
+
+/**
+ * mic_read_spad - read from the scratchpad register
+ * @mdev: pointer to mic_device instance
+ * @idx: index to scratchpad register, 0 based
+ *
+ * This function allows reading of the 32-bit scratchpad register.
+ *
+ * RETURNS: The 32-bit value read from the scratchpad register.
+ */
+u32 mic_read_spad(struct mic_device *mdev, unsigned int idx)
+{
+	return mic_mmio_read(&mdev->mmio,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_SPAD0 + idx * 4);
+}
+
+/**
+ * mic_send_intr - Send interrupt to Host.
+ * @mdev: pointer to mic_device instance
+ * @doorbell: Doorbell number.
+ */
+void mic_send_intr(struct mic_device *mdev, int doorbell)
+{
+	struct mic_mw *mw = &mdev->mmio;
+
+	if (doorbell > MIC_X100_MAX_DOORBELL_IDX)
+		return;
+	/* Ensure that the interrupt is ordered w.r.t previous stores. */
+	wmb();
+	mic_mmio_write(mw, MIC_X100_SBOX_SDBIC0_DBREQ_BIT,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		(MIC_X100_SBOX_SDBIC0 + (4 * doorbell)));
+}
+
+/**
+ * mic_ack_interrupt - Device specific interrupt handling.
+ * @mdev: pointer to mic_device instance
+ *
+ * Returns: bitmask of doorbell events triggered.
+ */
+u32 mic_ack_interrupt(struct mic_device *mdev)
+{
+	return 0;
+}
+
+static inline int mic_get_sbox_irq(int db)
+{
+	return MIC_X100_IRQ_BASE + db;
+}
+
+static inline int mic_get_rdmasr_irq(int index)
+{
+	return MIC_X100_RDMASR_IRQ_BASE + index;
+}
+
+/**
+ * mic_hw_intr_init - Initialize h/w specific interrupt
+ * information.
+ * @mdrv: pointer to mic_driver
+ */
+void mic_hw_intr_init(struct mic_driver *mdrv)
+{
+	mdrv->intr_info.num_intr = MIC_X100_NUM_SBOX_IRQ +
+				MIC_X100_NUM_RDMASR_IRQ;
+}
+
+/**
+ * mic_db_to_irq - Retrieve irq number corresponding to a doorbell.
+ * @mdrv: pointer to mic_driver
+ * @db: The doorbell obtained for which the irq is needed. Doorbell
+ * may correspond to an sbox doorbell or an rdmasr index.
+ *
+ * Returns the irq corresponding to the doorbell.
+ */
+int mic_db_to_irq(struct mic_driver *mdrv, int db)
+{
+	int rdmasr_index;
+	if (db < MIC_X100_NUM_SBOX_IRQ) {
+		return mic_get_sbox_irq(db);
+	} else {
+		/* mic_get_rdmasr_irq() adds MIC_X100_RDMASR_IRQ_BASE itself. */
+		rdmasr_index = db - MIC_X100_NUM_SBOX_IRQ;
+		return mic_get_rdmasr_irq(rdmasr_index);
+	}
+}
+
+/*
+ * mic_card_map - Allocate virtual address for a remote memory region.
+ * @mdev: pointer to mic_device instance.
+ * @addr: Remote DMA address.
+ * @size: Size of the region.
+ *
+ * Returns: Virtual address backing the remote memory region.
+ */
+void __iomem *
+mic_card_map(struct mic_device *mdev, dma_addr_t addr, size_t size)
+{
+	return ioremap(addr, size);
+}
+
+/*
+ * mic_card_unmap - Unmap the virtual address for a remote memory region.
+ * @mdev: pointer to mic_device instance.
+ * @addr: Virtual address for remote memory region.
+ *
+ * Returns: None.
+ */
+void mic_card_unmap(struct mic_device *mdev, void __iomem *addr)
+{
+	iounmap(addr);
+}
+
+static int __init mic_probe(struct platform_device *pdev)
+{
+	struct mic_driver *mdrv = &g_drv;
+	struct mic_device *mdev = &mdrv->mdev;
+	int rc = 0;
+
+	mdrv->dev = &pdev->dev;
+	snprintf(mdrv->name, sizeof(mic_driver_name), "%s", mic_driver_name);
+
+	mdev->mmio.pa = MIC_X100_MMIO_BASE;
+	mdev->mmio.len = MIC_X100_MMIO_LEN;
+	mdev->mmio.va = ioremap(MIC_X100_MMIO_BASE, MIC_X100_MMIO_LEN);
+	if (!mdev->mmio.va) {
+		dev_err(&pdev->dev, "Cannot remap MMIO BAR\n");
+		rc = -EIO;
+		goto done;
+	}
+	mic_hw_intr_init(mdrv);
+	rc = mic_driver_init(mdrv);
+	if (rc) {
+		dev_err(&pdev->dev, "mic_driver_init failed rc %d\n", rc);
+		goto iounmap;
+	}
+done:
+	return rc;
+iounmap:
+	iounmap(mdev->mmio.va);
+	return rc;
+}
+
+static int mic_remove(struct platform_device *pdev)
+{
+	struct mic_driver *mdrv = &g_drv;
+	struct mic_device *mdev = &mdrv->mdev;
+
+	mic_driver_uninit(mdrv);
+	iounmap(mdev->mmio.va);
+	return 0;
+}
+
+static void mic_platform_shutdown(struct platform_device *pdev)
+{
+	mic_remove(pdev);
+}
+
+static struct platform_device mic_platform_dev = {
+	.name = mic_driver_name,
+	.id   = 0,
+	.num_resources = 0,
+};
+
+static struct platform_driver __refdata mic_platform_driver = {
+	.probe = mic_probe,
+	.remove = mic_remove,
+	.shutdown = mic_platform_shutdown,
+	.driver         = {
+		.name   = mic_driver_name,
+		.owner	= THIS_MODULE,
+	},
+};
+
+static int __init mic_init(void)
+{
+	int ret;
+	struct cpuinfo_x86 *c = &cpu_data(0);
+
+	if (!(c->x86 == 11 && c->x86_model == 1)) {
+		ret = -ENODEV;
+		pr_err("%s not running on X100 ret %d\n", __func__, ret);
+		goto done;
+	}
+
+	mic_init_card_debugfs();
+	ret = platform_device_register(&mic_platform_dev);
+	if (ret) {
+		pr_err("platform_device_register ret %d\n", ret);
+		goto cleanup_debugfs;
+	}
+	ret = platform_driver_register(&mic_platform_driver);
+	if (ret) {
+		pr_err("platform_driver_register ret %d\n", ret);
+		goto device_unregister;
+	}
+	return ret;
+
+device_unregister:
+	platform_device_unregister(&mic_platform_dev);
+cleanup_debugfs:
+	mic_exit_card_debugfs();
+done:
+	return ret;
+}
+
+static void __exit mic_exit(void)
+{
+	platform_driver_unregister(&mic_platform_driver);
+	platform_device_unregister(&mic_platform_dev);
+	mic_exit_card_debugfs();
+}
+
+module_init(mic_init);
+module_exit(mic_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC X100 Card driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/misc/mic/card/mic_x100.h b/drivers/misc/mic/card/mic_x100.h
new file mode 100644
index 0000000..d66ea55
--- /dev/null
+++ b/drivers/misc/mic/card/mic_x100.h
@@ -0,0 +1,48 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Disclaimer: The codes contained in these modules may be specific to
+ * the Intel Software Development Platform codenamed: Knights Ferry, and
+ * the Intel product codenamed: Knights Corner, and are not backward
+ * compatible with other Intel products. Additionally, Intel will NOT
+ * support the codes or instruction set in future products.
+ *
+ * Intel MIC Card driver.
+ *
+ */
+#ifndef _MIC_X100_CARD_H_
+#define _MIC_X100_CARD_H_
+
+#define MIC_X100_MMIO_BASE 0x08007C0000ULL
+#define MIC_X100_MMIO_LEN 0x00020000ULL
+#define MIC_X100_SBOX_BASE_ADDRESS 0x00010000ULL
+
+#define MIC_X100_SBOX_SPAD0 0x0000AB20
+#define MIC_X100_SBOX_SDBIC0 0x0000CC90
+#define MIC_X100_SBOX_SDBIC0_DBREQ_BIT 0x80000000
+#define MIC_X100_SBOX_RDMASR0	0x0000B180
+
+#define MIC_X100_MAX_DOORBELL_IDX 8
+
+#define MIC_X100_NUM_SBOX_IRQ 8
+#define MIC_X100_NUM_RDMASR_IRQ 8
+#define MIC_X100_SBOX_IRQ_BASE 0
+#define MIC_X100_RDMASR_IRQ_BASE 17
+
+#define MIC_X100_IRQ_BASE 26
+
+#endif
-- 
1.8.2.1



* [PATCH RESEND v3 2/7] Intel MIC Host Driver Interrupt/SMPT support.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
  2013-09-05 23:41 44% ` [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family Sudeep Dutt
@ 2013-09-05 23:41 28% ` Sudeep Dutt
  2013-09-05 23:41 28% ` [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management Sudeep Dutt
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>

This patch enables the following features:
a) MSIx, MSI and legacy interrupt support.
b) System Memory Page Table (SMPT) support. SMPT enables system memory
   access from the card. On X100 devices the host can program 32 SMPT
   registers, each mapping 16GB of system memory address space. Together
   the registers let the card access up to 512GB (32 x 16GB) of system
   memory address space at any point in time.
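
The SMPT arithmetic above can be sketched in plain C. This is only an
illustration under stated assumptions: the macro and function names
below are hypothetical and are not taken from mic_smpt.h, and the
sketch assumes the 32 registers map consecutive 16GB slots of one
512GB card-visible window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants derived from the description, not from mic_smpt.h. */
#define SMPT_NUM_REGS   32u                       /* SMPT registers on X100 */
#define SMPT_PAGE_SHIFT 34u                       /* 16GB = 2^34 bytes */
#define SMPT_PAGE_SIZE  (1ULL << SMPT_PAGE_SHIFT)
#define SMPT_WINDOW     (SMPT_NUM_REGS * SMPT_PAGE_SIZE) /* 512GB in total */

/* Index of the SMPT slot that covers an offset into the 512GB window. */
static inline unsigned int smpt_slot(uint64_t window_off)
{
	assert(window_off < SMPT_WINDOW);
	return (unsigned int)(window_off >> SMPT_PAGE_SHIFT);
}

/* Byte offset inside that slot's 16GB page. */
static inline uint64_t smpt_page_off(uint64_t window_off)
{
	return window_off & (SMPT_PAGE_SIZE - 1);
}

/* 16GB-aligned system address a slot would need to hold to reach sys_addr. */
static inline uint64_t smpt_page_base(uint64_t sys_addr)
{
	return sys_addr & ~(SMPT_PAGE_SIZE - 1);
}
```

With these helpers, an offset of 16GB + 5 bytes into the window falls
in slot 1 at page offset 5, and the last byte of the window falls in
slot 31.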

Co-author: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
 drivers/misc/mic/host/Makefile     |   2 +
 drivers/misc/mic/host/mic_device.h |  21 ++
 drivers/misc/mic/host/mic_intr.c   | 628 +++++++++++++++++++++++++++++++++++++
 drivers/misc/mic/host/mic_intr.h   | 137 ++++++++
 drivers/misc/mic/host/mic_main.c   |  26 +-
 drivers/misc/mic/host/mic_smpt.c   | 442 ++++++++++++++++++++++++++
 drivers/misc/mic/host/mic_smpt.h   |  98 ++++++
 drivers/misc/mic/host/mic_x100.c   | 247 +++++++++++++++
 drivers/misc/mic/host/mic_x100.h   |  39 +++
 9 files changed, 1638 insertions(+), 2 deletions(-)
 create mode 100644 drivers/misc/mic/host/mic_intr.c
 create mode 100644 drivers/misc/mic/host/mic_intr.h
 create mode 100644 drivers/misc/mic/host/mic_smpt.c
 create mode 100644 drivers/misc/mic/host/mic_smpt.h
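
Note on the irq cookie used by mic_request_irq() and mic_free_irq() in
mic_intr.c: the cookie packs two 16-bit fields into one integer. The
low half holds the MSI-X entry; the high half holds the source offset
on the MSI-X path, or the callback id on the MSI/INTx path. A minimal
user-space sketch of that packing follows. The cookie_*() helpers are
hypothetical and exist only for illustration, and MK_COOKIE is written
here with explicit parentheses around the shift.

```c
#include <assert.h>
#include <stdint.h>

#define COOKIE_ID_SHIFT 16
#define GET_ENTRY(cookie)  ((cookie) & 0xFFFF)
#define GET_OFFSET(cookie) ((cookie) >> COOKIE_ID_SHIFT)
#define MK_COOKIE(x, y)    ((x) | ((y) << COOKIE_ID_SHIFT))

/* Hypothetical helper: pack an entry and an offset (or callback id). */
static inline unsigned long cookie_pack(uint16_t entry, uint16_t offset)
{
	return MK_COOKIE((unsigned long)entry, (unsigned long)offset);
}

/* Hypothetical helper: recover the low 16-bit field (MSI-X entry). */
static inline uint16_t cookie_entry(unsigned long cookie)
{
	return (uint16_t)GET_ENTRY(cookie);
}

/* Hypothetical helper: recover the high 16-bit field. */
static inline uint16_t cookie_offset(unsigned long cookie)
{
	return (uint16_t)GET_OFFSET(cookie);
}
```

Because both fields are 16 bits wide, the packed value fits in 32 bits
and round-trips through a pointer-sized cookie on 32-bit and 64-bit
builds alike.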

diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
index 93b9d25..6ff5550 100644
--- a/drivers/misc/mic/host/Makefile
+++ b/drivers/misc/mic/host/Makefile
@@ -6,3 +6,5 @@ obj-$(CONFIG_INTEL_MIC_HOST) += mic_host.o
 mic_host-objs := mic_main.o
 mic_host-objs += mic_x100.o
 mic_host-objs += mic_sysfs.o
+mic_host-objs += mic_smpt.o
+mic_host-objs += mic_intr.o
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
index 6cd904c..dd34b65 100644
--- a/drivers/misc/mic/host/mic_device.h
+++ b/drivers/misc/mic/host/mic_device.h
@@ -21,6 +21,10 @@
 #ifndef _MIC_DEVICE_H_
 #define _MIC_DEVICE_H_
 
+#include <linux/idr.h>
+
+#include "mic_intr.h"
+
 /* The maximum number of MIC devices supported in a single host system. */
 #define MIC_MAX_NUM_DEVS 256
 
@@ -53,6 +57,12 @@ enum mic_stepping {
  * @stepping: Stepping ID.
  * @attr_group: Pointer to list of sysfs attribute groups.
  * @sdev: Device for sysfs entries.
+ * @mic_mutex: Mutex for synchronizing access to mic_device.
+ * @intr_ops: HW specific interrupt operations.
+ * @smpt_ops: Hardware specific SMPT operations.
+ * @smpt: MIC SMPT information.
+ * @intr_info: H/W specific interrupt information.
+ * @irq_info: The OS specific irq information
  */
 struct mic_device {
 	struct mic_mw mmio;
@@ -63,6 +73,12 @@ struct mic_device {
 	enum mic_stepping stepping;
 	const struct attribute_group **attr_group;
 	struct device *sdev;
+	struct mutex mic_mutex;
+	struct mic_hw_intr_ops *intr_ops;
+	struct mic_smpt_ops *smpt_ops;
+	struct mic_smpt_info *smpt;
+	struct mic_intr_info *intr_info;
+	struct mic_irq_info irq_info;
 };
 
 /**
@@ -71,12 +87,17 @@ struct mic_device {
  * @mmio_bar: MMIO bar resource number.
  * @read_spad: Read from scratch pad register.
  * @write_spad: Write to scratch pad register.
+ * @send_intr: Send an interrupt for a particular doorbell on the card.
+ * @ack_interrupt: Hardware specific operations to ack the h/w on
+ * receipt of an interrupt.
  */
 struct mic_hw_ops {
 	u8 aper_bar;
 	u8 mmio_bar;
 	u32 (*read_spad)(struct mic_device *mdev, unsigned int idx);
 	void (*write_spad)(struct mic_device *mdev, unsigned int idx, u32 val);
+	void (*send_intr)(struct mic_device *mdev, int doorbell);
+	u32 (*ack_interrupt)(struct mic_device *mdev);
 };
 
 /**
diff --git a/drivers/misc/mic/host/mic_intr.c b/drivers/misc/mic/host/mic_intr.c
new file mode 100644
index 0000000..cdae314
--- /dev/null
+++ b/drivers/misc/mic/host/mic_intr.c
@@ -0,0 +1,628 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+
+/*
+ * mic_invoke_callback - Invoke callback functions registered for
+ * the corresponding source id.
+ *
+ * @mdev: pointer to the mic_device instance
+ * @idx: The interrupt source id.
+ *
+ * Returns none.
+ */
+static inline void mic_invoke_callback(struct mic_device *mdev, int idx)
+{
+	struct mic_intr_cb *intr_cb;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+
+	spin_lock(&mdev->irq_info.mic_intr_lock);
+	list_for_each_entry(intr_cb, &mdev->irq_info.cb_list[idx], list)
+		if (intr_cb->func)
+			intr_cb->func(pdev->irq, intr_cb->data);
+	spin_unlock(&mdev->irq_info.mic_intr_lock);
+}
+
+/**
+ * mic_interrupt - Generic interrupt handler for
+ * MSI and INTx based interrupts.
+ */
+static irqreturn_t mic_interrupt(int irq, void *dev)
+{
+	struct mic_device *mdev = dev;
+	struct mic_intr_info *info = mdev->intr_info;
+	u32 mask;
+	int i;
+
+	mask = mdev->ops->ack_interrupt(mdev);
+	if (!mask)
+		return IRQ_NONE;
+
+	for (i = info->intr_start_idx[MIC_INTR_DB];
+			i < info->intr_len[MIC_INTR_DB]; i++)
+		if (mask & BIT(i))
+			mic_invoke_callback(mdev, i);
+
+	return IRQ_HANDLED;
+}
+
+/* Return the interrupt offset from the index. Index is 0 based. */
+static u16 mic_map_src_to_offset(struct mic_device *mdev,
+		int intr_src, enum mic_intr_type type) {
+
+	if (type >= MIC_NUM_INTR_TYPES)
+		return MIC_NUM_OFFSETS;
+	if (intr_src >= mdev->intr_info->intr_len[type])
+		return MIC_NUM_OFFSETS;
+
+	return mdev->intr_info->intr_start_idx[type] + intr_src;
+}
+
+/* Return next available msix_entry. */
+static struct msix_entry *mic_get_available_vector(struct mic_device *mdev)
+{
+	int i;
+	struct mic_irq_info *info = &mdev->irq_info;
+
+	for (i = 0; i < info->num_vectors; i++)
+		if (!info->mic_msi_map[i])
+			return &info->msix_entries[i];
+	return NULL;
+}
+
+/**
+ * mic_register_intr_callback - Register a callback handler for the
+ * given source id.
+ *
+ * @mdev: pointer to the mic_device instance
+ * @idx: The source id to be registered.
+ * @func: The function to be called when the source id receives
+ * the interrupt.
+ * @data: Private data of the requester.
+ * Return the callback structure that was registered or an
+ * appropriate error on failure.
+ */
+static struct mic_intr_cb *mic_register_intr_callback(struct mic_device *mdev,
+			u8 idx, irqreturn_t (*func) (int irq, void *dev),
+			void *data)
+{
+	struct mic_intr_cb *intr_cb;
+	unsigned long flags;
+	int rc;
+	intr_cb = kmalloc(sizeof(struct mic_intr_cb), GFP_KERNEL);
+
+	if (!intr_cb)
+		return ERR_PTR(-ENOMEM);
+
+	intr_cb->func = func;
+	intr_cb->data = data;
+	intr_cb->cb_id = ida_simple_get(&mdev->irq_info.cb_ida,
+		0, 0, GFP_KERNEL);
+	if (intr_cb->cb_id < 0) {
+		rc = intr_cb->cb_id;
+		goto ida_fail;
+	}
+
+	spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
+	list_add_tail(&intr_cb->list, &mdev->irq_info.cb_list[idx]);
+	spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+
+	return intr_cb;
+ida_fail:
+	kfree(intr_cb);
+	return ERR_PTR(rc);
+}
+
+/**
+ * mic_unregister_intr_callback - Unregister the callback handler
+ * identified by its callback id.
+ *
+ * @mdev: pointer to the mic_device instance
+ * @idx: The callback structure id to be unregistered.
+ * Return the source id that was unregistered or MIC_NUM_OFFSETS if no
+ * such callback handler was found.
+ */
+static u8 mic_unregister_intr_callback(struct mic_device *mdev, u32 idx)
+{
+	struct list_head *pos, *tmp;
+	struct mic_intr_cb *intr_cb;
+	unsigned long flags;
+	int i;
+
+	for (i = 0;  i < MIC_NUM_OFFSETS; i++) {
+		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
+		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
+			intr_cb = list_entry(pos, struct mic_intr_cb, list);
+			if (intr_cb->cb_id == idx) {
+				list_del(pos);
+				ida_simple_remove(&mdev->irq_info.cb_ida,
+					intr_cb->cb_id);
+				kfree(intr_cb);
+				spin_unlock_irqrestore(
+					&mdev->irq_info.mic_intr_lock, flags);
+				return i;
+			}
+		}
+		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	}
+	return MIC_NUM_OFFSETS;
+}
+
+/**
+ * mic_setup_msix - Initializes MSIx interrupts.
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: PCI device structure
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int mic_setup_msix(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	int rc, i;
+
+	mdev->irq_info.msix_entries = kmalloc(sizeof(struct msix_entry) *
+			MIC_MIN_MSIX, GFP_KERNEL);
+	if (!mdev->irq_info.msix_entries) {
+		rc = -ENOMEM;
+		goto err_nomem1;
+	}
+
+	for (i = 0; i < MIC_MIN_MSIX; i++)
+		mdev->irq_info.msix_entries[i].entry = i;
+
+	rc = pci_enable_msix(pdev, mdev->irq_info.msix_entries,
+		MIC_MIN_MSIX);
+	if (rc) {
+		dev_dbg(&pdev->dev, "Error enabling MSIx. rc = %d\n", rc);
+		goto err_enable_msix;
+	}
+
+	mdev->irq_info.num_vectors = MIC_MIN_MSIX;
+	mdev->irq_info.mic_msi_map = kzalloc((sizeof(u32) *
+		mdev->irq_info.num_vectors), GFP_KERNEL);
+
+	if (!mdev->irq_info.mic_msi_map) {
+		rc = -ENOMEM;
+		goto err_nomem2;
+	}
+
+	dev_dbg(mdev->sdev->parent,
+		"%d MSIx irqs setup\n", mdev->irq_info.num_vectors);
+	return 0;
+err_nomem2:
+	pci_disable_msix(pdev);
+err_enable_msix:
+	kfree(mdev->irq_info.msix_entries);
+err_nomem1:
+	mdev->irq_info.num_vectors = 0;
+	return rc;
+}
+
+/**
+ * mic_setup_callbacks - Initialize data structures needed
+ * to handle callbacks.
+ *
+ * @mdev: pointer to mic_device instance
+ */
+static int mic_setup_callbacks(struct mic_device *mdev)
+{
+	int i;
+
+	mdev->irq_info.cb_list = kmalloc(sizeof(struct list_head) *
+		MIC_NUM_OFFSETS, GFP_KERNEL);
+	if (!mdev->irq_info.cb_list)
+		return -ENOMEM;
+
+	for (i = 0; i < MIC_NUM_OFFSETS; i++)
+		INIT_LIST_HEAD(&mdev->irq_info.cb_list[i]);
+	ida_init(&mdev->irq_info.cb_ida);
+	spin_lock_init(&mdev->irq_info.mic_intr_lock);
+	return 0;
+}
+
+/**
+ * mic_release_callbacks - Uninitialize data structures needed
+ * to handle callbacks.
+ *
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_release_callbacks(struct mic_device *mdev)
+{
+	unsigned long flags;
+	struct list_head *pos, *tmp;
+	struct mic_intr_cb *intr_cb;
+	int i;
+
+	for (i = 0; i < MIC_NUM_OFFSETS; i++) {
+		spin_lock_irqsave(&mdev->irq_info.mic_intr_lock, flags);
+
+		if (list_empty(&mdev->irq_info.cb_list[i])) {
+			spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock,
+				flags);
+			continue;
+		}
+
+		list_for_each_safe(pos, tmp, &mdev->irq_info.cb_list[i]) {
+			intr_cb = list_entry(pos, struct mic_intr_cb, list);
+			list_del(pos);
+			ida_simple_remove(&mdev->irq_info.cb_ida,
+				intr_cb->cb_id);
+			kfree(intr_cb);
+		}
+		spin_unlock_irqrestore(&mdev->irq_info.mic_intr_lock, flags);
+	}
+	ida_destroy(&mdev->irq_info.cb_ida);
+	kfree(mdev->irq_info.cb_list);
+}
+
+/**
+ * mic_setup_msi - Initializes MSI interrupts.
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: PCI device structure
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int mic_setup_msi(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	int rc;
+
+	rc = pci_enable_msi(pdev);
+	if (rc) {
+		dev_dbg(&pdev->dev, "Error enabling MSI. rc = %d\n", rc);
+		return rc;
+	}
+
+	mdev->irq_info.num_vectors = 1;
+	mdev->irq_info.mic_msi_map = kzalloc((sizeof(u32) *
+		mdev->irq_info.num_vectors), GFP_KERNEL);
+
+	if (!mdev->irq_info.mic_msi_map) {
+		rc = -ENOMEM;
+		goto err_nomem1;
+	}
+
+	rc = mic_setup_callbacks(mdev);
+	if (rc) {
+		dev_err(&pdev->dev, "Error setting up callbacks\n");
+		goto err_nomem2;
+	}
+
+	rc = request_irq(pdev->irq, mic_interrupt, 0, "mic-msi", mdev);
+	if (rc) {
+		dev_err(&pdev->dev, "Error allocating MSI interrupt\n");
+		goto err_irq_req_fail;
+	}
+
+	dev_dbg(&pdev->dev, "%d MSI irqs setup\n", mdev->irq_info.num_vectors);
+	return 0;
+err_irq_req_fail:
+	mic_release_callbacks(mdev);
+err_nomem2:
+	kfree(mdev->irq_info.mic_msi_map);
+err_nomem1:
+	pci_disable_msi(pdev);
+	mdev->irq_info.num_vectors = 0;
+	return rc;
+}
+
+/**
+ * mic_setup_intx - Initializes legacy interrupts.
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: PCI device structure
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+static int mic_setup_intx(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	int rc;
+
+	pci_msi_off(pdev);
+
+	/* Enable intx */
+	pci_intx(pdev, 1);
+	rc = mic_setup_callbacks(mdev);
+	if (rc) {
+		dev_err(&pdev->dev, "Error setting up callbacks\n");
+		goto err_nomem;
+	}
+
+	rc = request_irq(pdev->irq, mic_interrupt,
+		IRQF_SHARED, "mic-intx", mdev);
+	if (rc)
+		goto err;
+
+	dev_dbg(&pdev->dev, "intx irq setup\n");
+	return 0;
+err:
+	mic_release_callbacks(mdev);
+err_nomem:
+	return rc;
+}
+
+/**
+ * mic_next_db - Retrieve the next doorbell interrupt source id.
+ * The id is picked sequentially from the available pool of
+ * doorbell ids.
+ *
+ * @mdev: pointer to the mic_device instance.
+ *
+ * Returns the next doorbell interrupt source.
+ */
+int mic_next_db(struct mic_device *mdev)
+{
+	int next_db;
+
+	next_db = mdev->irq_info.next_avail_src %
+		mdev->intr_info->intr_len[MIC_INTR_DB];
+	mdev->irq_info.next_avail_src++;
+	return next_db;
+}
+
+#define COOKIE_ID_SHIFT 16
+#define GET_ENTRY(cookie) ((cookie) & 0xFFFF)
+#define GET_OFFSET(cookie) ((cookie) >> COOKIE_ID_SHIFT)
+#define MK_COOKIE(x, y) ((x) | ((y) << COOKIE_ID_SHIFT))
+
+/**
+ * mic_request_irq - request an irq. mic_mutex needs
+ * to be held before calling this function.
+ *
+ * @mdev: pointer to mic_device instance
+ * @func: The callback function that handles the interrupt.
+ * The function needs to call ack_interrupts
+ * (mdev->ops->ack_interrupt(mdev)) when handling the interrupts.
+ * @name: The ASCII name of the caller requesting the irq.
+ * @data: private data that is returned back when calling the
+ * function handler.
+ * @intr_src: The source id of the requester. It is the doorbell id
+ * for Doorbell interrupts and DMA channel id for DMA interrupts.
+ * @type: The type of interrupt. Values defined in mic_intr_type
+ *
+ * returns: The cookie that is transparent to the caller. Passed
+ * back when calling mic_free_irq. An appropriate error code
+ * is returned on failure. Caller needs to use IS_ERR(return_val)
+ * to check for failure and PTR_ERR(return_val) to obtain the
+ * error code.
+ *
+ */
+struct mic_irq *mic_request_irq(struct mic_device *mdev,
+	irqreturn_t (*func)(int irq, void *dev),
+	const char *name, void *data, int intr_src,
+	enum mic_intr_type type)
+{
+	u16 offset;
+	int rc = 0;
+	struct msix_entry *msix = NULL;
+	unsigned long cookie = 0;
+	u16 entry;
+	struct mic_intr_cb *intr_cb;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+
+	offset = mic_map_src_to_offset(mdev, intr_src, type);
+	if (offset >= MIC_NUM_OFFSETS) {
+		dev_err(mdev->sdev->parent,
+				"Error mapping index %d to a valid source id.\n",
+				intr_src);
+		rc = -EINVAL;
+		goto err;
+	}
+
+	if (mdev->irq_info.num_vectors > 1) {
+		msix = mic_get_available_vector(mdev);
+		if (!msix) {
+			dev_err(mdev->sdev->parent,
+			"No MSIx vectors available for use.\n");
+			rc = -ENOSPC;
+			goto err;
+		}
+
+		rc = request_irq(msix->vector, func, 0, name, data);
+		if (rc) {
+			dev_dbg(mdev->sdev->parent,
+				"request irq failed rc = %d\n", rc);
+			goto err;
+		}
+		entry = msix->entry;
+		mdev->irq_info.mic_msi_map[entry] |= BIT(offset);
+		mdev->intr_ops->program_msi_to_src_map(mdev,
+				entry, offset, true);
+		cookie = MK_COOKIE(entry, offset);
+		dev_dbg(mdev->sdev->parent, "irq: %d assigned for src: %d\n",
+			msix->vector, intr_src);
+	} else {
+		intr_cb = mic_register_intr_callback(mdev,
+				offset, func, data);
+		if (IS_ERR(intr_cb)) {
+			dev_err(mdev->sdev->parent,
+			"No available callback entries for use\n");
+			rc = PTR_ERR(intr_cb);
+			goto err;
+		}
+
+		entry = 0;
+		if (pci_dev_msi_enabled(pdev)) {
+			mdev->irq_info.mic_msi_map[entry] |= BIT(offset);
+			mdev->intr_ops->program_msi_to_src_map(mdev,
+				entry, offset, true);
+		}
+		cookie = MK_COOKIE(entry, intr_cb->cb_id);
+		dev_dbg(mdev->sdev->parent, "callback %d registered for src: %d\n",
+			intr_cb->cb_id, intr_src);
+	}
+	return (struct mic_irq *)cookie;
+err:
+	return ERR_PTR(rc);
+}
+
+/**
+ * mic_free_irq - free irq. mic_mutex needs to be held
+ * before calling this function.
+ *
+ * @mdev: pointer to mic_device instance
+ * @cookie: cookie obtained during a successful call to mic_request_irq
+ * @data: private data specified by the calling function during the
+ * mic_request_irq
+ *
+ * returns: none.
+ */
+void mic_free_irq(struct mic_device *mdev,
+	struct mic_irq *cookie, void *data)
+{
+	u32 offset;
+	u32 entry;
+	u8 src_id;
+	unsigned int irq;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+
+	entry = GET_ENTRY((unsigned long)cookie);
+	offset = GET_OFFSET((unsigned long)cookie);
+	if (mdev->irq_info.num_vectors > 1) {
+		if (entry >= mdev->irq_info.num_vectors) {
+			dev_warn(mdev->sdev->parent,
+				"entry %d should be < num_irq %d\n",
+				entry, mdev->irq_info.num_vectors);
+			return;
+		}
+		irq = mdev->irq_info.msix_entries[entry].vector;
+		free_irq(irq, data);
+		mdev->irq_info.mic_msi_map[entry] &= ~(BIT(offset));
+		mdev->intr_ops->program_msi_to_src_map(mdev,
+			entry, offset, false);
+
+		dev_dbg(mdev->sdev->parent, "irq: %d freed\n", irq);
+	} else {
+		irq = pdev->irq;
+		src_id = mic_unregister_intr_callback(mdev, offset);
+		if (src_id >= MIC_NUM_OFFSETS) {
+			dev_warn(mdev->sdev->parent, "Error unregistering callback\n");
+			return;
+		}
+		if (pci_dev_msi_enabled(pdev)) {
+			mdev->irq_info.mic_msi_map[entry] &= ~(BIT(src_id));
+			mdev->intr_ops->program_msi_to_src_map(mdev,
+				entry, src_id, false);
+		}
+		dev_dbg(mdev->sdev->parent, "callback %d unregistered for src: %d\n",
+			offset, src_id);
+	}
+}
+
+/**
+ * mic_setup_interrupts - Initializes interrupts.
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: PCI device structure
+ *
+ * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
+ */
+int mic_setup_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	int rc;
+
+	rc = mic_setup_msix(mdev, pdev);
+	if (!rc)
+		goto done;
+
+	rc = mic_setup_msi(mdev, pdev);
+	if (!rc)
+		goto done;
+
+	rc = mic_setup_intx(mdev, pdev);
+	if (rc) {
+		dev_err(mdev->sdev->parent, "no usable interrupts\n");
+		return rc;
+	}
+done:
+	mdev->intr_ops->enable_interrupts(mdev);
+	return 0;
+}
+
+/**
+ * mic_free_interrupts - Frees interrupts setup by mic_setup_interrupts
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: PCI device structure
+ *
+ * returns none.
+ */
+void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	int i;
+
+	mdev->intr_ops->disable_interrupts(mdev);
+	if (mdev->irq_info.num_vectors > 1) {
+		for (i = 0; i < mdev->irq_info.num_vectors; i++) {
+			if (mdev->irq_info.mic_msi_map[i])
+				dev_warn(&pdev->dev, "irq %d may still be in use.\n",
+					mdev->irq_info.msix_entries[i].vector);
+		}
+		kfree(mdev->irq_info.mic_msi_map);
+		kfree(mdev->irq_info.msix_entries);
+		pci_disable_msix(pdev);
+	} else {
+		if (pci_dev_msi_enabled(pdev)) {
+			free_irq(pdev->irq, mdev);
+			kfree(mdev->irq_info.mic_msi_map);
+			pci_disable_msi(pdev);
+		} else {
+			free_irq(pdev->irq, mdev);
+		}
+		mic_release_callbacks(mdev);
+	}
+}
+
+/**
+ * mic_intr_restore - Restore MIC interrupt registers.
+ *
+ * @mdev: pointer to mic_device instance.
+ *
+ * Restore the interrupt registers to values previously
+ * stored in the SW data structures. mic_mutex needs to
+ * be held before calling this function.
+ *
+ * returns None.
+ */
+void mic_intr_restore(struct mic_device *mdev)
+{
+	int entry, offset;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+
+	if (!pci_dev_msi_enabled(pdev))
+		return;
+
+	for (entry = 0; entry < mdev->irq_info.num_vectors; entry++) {
+		for (offset = 0; offset < MIC_NUM_OFFSETS; offset++) {
+			if (mdev->irq_info.mic_msi_map[entry] & BIT(offset))
+				mdev->intr_ops->program_msi_to_src_map(mdev,
+					entry, offset, true);
+		}
+	}
+}
diff --git a/drivers/misc/mic/host/mic_intr.h b/drivers/misc/mic/host/mic_intr.h
new file mode 100644
index 0000000..6091aa9
--- /dev/null
+++ b/drivers/misc/mic/host/mic_intr.h
@@ -0,0 +1,137 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef _MIC_INTR_H_
+#define _MIC_INTR_H_
+
+/*
+ * The minimum number of msix vectors required for normal operation.
+ * 3 for virtio network, console and block devices.
+ * 1 for card shutdown notifications.
+ */
+#define MIC_MIN_MSIX 4
+#define MIC_NUM_OFFSETS 32
+
+/**
+ * enum mic_intr_type - The type of source that will generate
+ * the interrupt. The number of types needs to be in sync with
+ * MIC_NUM_INTR_TYPES
+ *
+ * MIC_INTR_DB: The source is a doorbell
+ * MIC_INTR_DMA: The source is a DMA channel
+ * MIC_INTR_ERR: The source is an error interrupt e.g. SBOX ERR
+ * MIC_NUM_INTR_TYPES: Total number of interrupt sources.
+ */
+enum mic_intr_type {
+	MIC_INTR_DB = 0,
+	MIC_INTR_DMA,
+	MIC_INTR_ERR,
+	MIC_NUM_INTR_TYPES
+};
+
+/**
+ * struct mic_intr_info - Contains h/w specific interrupt sources
+ * information.
+ *
+ * @intr_start_idx: Contains the starting indexes of the
+ * interrupt types.
+ * @intr_len: Contains the length of the interrupt types.
+ */
+struct mic_intr_info {
+	u16 intr_start_idx[MIC_NUM_INTR_TYPES];
+	u16 intr_len[MIC_NUM_INTR_TYPES];
+};
+
+/**
+ * struct mic_irq_info - OS specific irq information
+ *
+ * @next_avail_src: next available doorbell that can be assigned.
+ * @msix_entries: msix entries allocated while setting up MSI-x
+ * @mic_msi_map: The MSI/MSI-x mapping information.
+ * @num_vectors: The number of MSI/MSI-x vectors that have been allocated.
+ * @cb_ida: callback ID allocator to track the callbacks registered.
+ * @mic_intr_lock: spinlock to protect the interrupt callback list.
+ * @cb_list: Array of callback lists one for each source.
+ */
+struct mic_irq_info {
+	int next_avail_src;
+	struct msix_entry *msix_entries;
+	u32 *mic_msi_map;
+	u16 num_vectors;
+	struct ida cb_ida;
+	spinlock_t mic_intr_lock;
+	struct list_head *cb_list;
+};
+
+/**
+ * struct mic_intr_cb - Interrupt callback structure.
+ *
+ * @func: The callback function
+ * @data: Private data of the requester.
+ * @cb_id: The callback id. Identifies this callback.
+ * @list: list head pointing to the next callback structure.
+ */
+struct mic_intr_cb {
+	irqreturn_t (*func) (int irq, void *data);
+	void *data;
+	int cb_id;
+	struct list_head list;
+};
+
+/**
+ * struct mic_irq - opaque pointer used as cookie
+ */
+struct mic_irq;
+
+/* Forward declaration */
+struct mic_device;
+
+/**
+ * struct mic_hw_intr_ops - MIC HW specific interrupt operations
+ * @intr_init: Initialize H/W specific interrupt information.
+ * @enable_interrupts: Enable interrupts from the hardware.
+ * @disable_interrupts: Disable interrupts from the hardware.
+ * @program_msi_to_src_map: Update MSI mapping registers with
+ * irq information.
+ * @read_msi_to_src_map: Read MSI mapping registers containing
+ * irq information.
+ */
+struct mic_hw_intr_ops {
+	void (*intr_init)(struct mic_device *mdev);
+	void (*enable_interrupts)(struct mic_device *mdev);
+	void (*disable_interrupts)(struct mic_device *mdev);
+	void (*program_msi_to_src_map) (struct mic_device *mdev,
+			int idx, int intr_src, bool set);
+	u32 (*read_msi_to_src_map) (struct mic_device *mdev,
+			int idx);
+};
+
+int mic_next_db(struct mic_device *mdev);
+struct mic_irq *mic_request_irq(struct mic_device *mdev,
+	irqreturn_t (*func)(int irq, void *data),
+	const char *name, void *data, int intr_src,
+	enum mic_intr_type type);
+
+void mic_free_irq(struct mic_device *mdev,
+		struct mic_irq *cookie, void *data);
+int mic_setup_interrupts(struct mic_device *mdev, struct pci_dev *pdev);
+void mic_free_interrupts(struct mic_device *mdev, struct pci_dev *pdev);
+void mic_intr_restore(struct mic_device *mdev);
+#endif
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
index 228c96c..332a15e 100644
--- a/drivers/misc/mic/host/mic_main.c
+++ b/drivers/misc/mic/host/mic_main.c
@@ -23,13 +23,13 @@
  * 2) Enable per vring interrupt support.
  */
 #include <linux/fs.h>
-#include <linux/idr.h>
 #include <linux/module.h>
 #include <linux/pci.h>
 
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_x100.h"
+#include "mic_smpt.h"
 
 static const char mic_driver_name[] = "mic";
 
@@ -75,6 +75,8 @@ static void mic_ops_init(struct mic_device *mdev)
 	switch (mdev->family) {
 	case MIC_FAMILY_X100:
 		mdev->ops = &mic_x100_ops;
+		mdev->intr_ops = &mic_x100_intr_ops;
+		mdev->smpt_ops = &mic_x100_smpt_ops;
 		break;
 	default:
 		break;
@@ -132,6 +134,8 @@ mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
 	mdev->stepping = pdev->revision;
 	mic_ops_init(mdev);
 	mic_sysfs_init(mdev);
+	mutex_init(&mdev->mic_mutex);
+	mdev->irq_info.next_avail_src = 0;
 }
 
 /**
@@ -201,6 +205,18 @@ static int mic_probe(struct pci_dev *pdev,
 		goto unmap_mmio;
 	}
 
+	mdev->intr_ops->intr_init(mdev);
+	rc = mic_setup_interrupts(mdev, pdev);
+	if (rc) {
+		dev_err(&pdev->dev, "mic_setup_interrupts failed %d\n", rc);
+		goto unmap_aper;
+	}
+	rc = mic_smpt_init(mdev);
+	if (rc) {
+		dev_err(&pdev->dev, "smpt_init failed %d\n", rc);
+		goto free_interrupts;
+	}
+
 	pci_set_drvdata(pdev, mdev);
 
 	mdev->sdev = device_create_with_groups(g_mic_class, &pdev->dev,
@@ -210,9 +226,13 @@ static int mic_probe(struct pci_dev *pdev,
 		rc = PTR_ERR(mdev->sdev);
 		dev_err(&pdev->dev,
 			"device_create_with_groups failed rc %d\n", rc);
-		goto unmap_aper;
+		goto smpt_uninit;
 	}
 	return 0;
+smpt_uninit:
+	mic_smpt_uninit(mdev);
+free_interrupts:
+	mic_free_interrupts(mdev, pdev);
 unmap_aper:
 	iounmap(mdev->aper.va);
 unmap_mmio:
@@ -246,6 +266,8 @@ static void mic_remove(struct pci_dev *pdev)
 		return;
 
 	device_destroy(g_mic_class, MKDEV(MAJOR(g_mic_devno), mdev->id));
+	mic_smpt_uninit(mdev);
+	mic_free_interrupts(mdev, pdev);
 	iounmap(mdev->mmio.va);
 	iounmap(mdev->aper.va);
 	pci_release_regions(pdev);
diff --git a/drivers/misc/mic/host/mic_smpt.c b/drivers/misc/mic/host/mic_smpt.c
new file mode 100644
index 0000000..a346207
--- /dev/null
+++ b/drivers/misc/mic/host/mic_smpt.c
@@ -0,0 +1,442 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/pci.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_smpt.h"
+
+static inline u64 mic_system_page_mask(struct mic_device *mdev)
+{
+	return (1ULL << mdev->smpt->info.page_shift) - 1ULL;
+}
+
+static inline u8 mic_sys_addr_to_smpt(struct mic_device *mdev, dma_addr_t pa)
+{
+	return (pa - mdev->smpt->info.base) >> mdev->smpt->info.page_shift;
+}
+
+static inline u64 mic_smpt_to_pa(struct mic_device *mdev, u8 index)
+{
+	return mdev->smpt->info.base + (index * mdev->smpt->info.page_size);
+}
+
+static inline u64 mic_smpt_offset(struct mic_device *mdev, dma_addr_t pa)
+{
+	return pa & mic_system_page_mask(mdev);
+}
+
+static inline u64 mic_smpt_align_low(struct mic_device *mdev, dma_addr_t pa)
+{
+	return ALIGN(pa - mic_system_page_mask(mdev),
+		mdev->smpt->info.page_size);
+}
+
+static inline u64 mic_smpt_align_high(struct mic_device *mdev, dma_addr_t pa)
+{
+	return ALIGN(pa, mdev->smpt->info.page_size);
+}
+
+/* Total Cumulative system memory accessible by MIC across all SMPT entries */
+static inline u64 mic_max_system_memory(struct mic_device *mdev)
+{
+	return mdev->smpt->info.num_reg * mdev->smpt->info.page_size;
+}
+
+/* Maximum system memory address accessible by MIC */
+static inline u64 mic_max_system_addr(struct mic_device *mdev)
+{
+	return mdev->smpt->info.base + mic_max_system_memory(mdev) - 1ULL;
+}
+
+/* Check if the DMA address is a MIC system memory address */
+static inline bool
+mic_is_system_addr(struct mic_device *mdev, dma_addr_t pa)
+{
+	return pa >= mdev->smpt->info.base && pa <= mic_max_system_addr(mdev);
+}
+
+/* Populate an SMPT entry and update the reference counts. */
+static void mic_add_smpt_entry(int spt, s64 *ref, u64 addr,
+		int entries, struct mic_device *mdev)
+{
+	struct mic_smpt_info *smpt_info = mdev->smpt;
+	int i;
+
+	for (i = spt; i < spt + entries; i++,
+		addr += smpt_info->info.page_size) {
+		if (!smpt_info->entry[i].ref_count &&
+			(smpt_info->entry[i].dma_addr != addr)) {
+			mdev->smpt_ops->set(mdev, addr, i);
+			smpt_info->entry[i].dma_addr = addr;
+		}
+		smpt_info->entry[i].ref_count += ref[i - spt];
+	}
+}
+
+/*
+ * Find an available MIC address in MIC SMPT address space
+ * for a given DMA address and size.
+ */
+static dma_addr_t mic_smpt_op(struct mic_device *mdev, u64 dma_addr,
+				int entries, s64 *ref, size_t size)
+{
+	int spt;
+	int ae = 0;
+	int i;
+	unsigned long flags;
+	dma_addr_t mic_addr = 0;
+	dma_addr_t addr = dma_addr;
+	struct mic_smpt_info *smpt_info = mdev->smpt;
+
+	spin_lock_irqsave(&smpt_info->smpt_lock, flags);
+
+	/* find existing entries */
+	for (i = 0; i < smpt_info->info.num_reg; i++) {
+		if (smpt_info->entry[i].dma_addr == addr) {
+			ae++;
+			addr += smpt_info->info.page_size;
+		} else if (ae) /* cannot find contiguous entries */
+			goto not_found;
+
+		if (ae == entries)
+			goto found;
+	}
+
+	/* find free entry */
+	for (ae = 0, i = 0; i < smpt_info->info.num_reg; i++) {
+		ae = (smpt_info->entry[i].ref_count == 0) ? ae + 1 : 0;
+		if (ae == entries)
+			goto found;
+	}
+
+not_found:
+	spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
+	return mic_addr;
+
+found:
+	spt = i - entries + 1;
+	mic_addr = mic_smpt_to_pa(mdev, spt);
+	mic_add_smpt_entry(spt, ref, dma_addr, entries, mdev);
+	smpt_info->map_count++;
+	smpt_info->ref_count += (s64)size;
+	spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
+	return mic_addr;
+}
+
+/*
+ * Returns the number of SMPT entries needed to cover dma_addr to
+ * dma_addr + size. Also fills in the reference count array for each
+ * of those entries and the starting SMPT address.
+ */
+static int mic_get_smpt_ref_count(struct mic_device *mdev, dma_addr_t dma_addr,
+				size_t size, s64 *ref,  u64 *smpt_start)
+{
+	u64 start =  dma_addr;
+	u64 end = dma_addr + size;
+	int i = 0;
+
+	while (start < end) {
+		ref[i++] = min(mic_smpt_align_high(mdev, start + 1),
+			end) - start;
+		start = mic_smpt_align_high(mdev, start + 1);
+	}
+
+	if (smpt_start)
+		*smpt_start = mic_smpt_align_low(mdev, dma_addr);
+
+	return i;
+}
+
+/*
+ * mic_to_dma_addr - Converts a MIC address to a DMA address.
+ *
+ * @mdev: pointer to mic_device instance.
+ * @mic_addr: MIC address.
+ *
+ * returns a DMA address.
+ */
+static dma_addr_t
+mic_to_dma_addr(struct mic_device *mdev, dma_addr_t mic_addr)
+{
+	struct mic_smpt_info *smpt_info = mdev->smpt;
+	int spt;
+	dma_addr_t dma_addr;
+
+	if (!mic_is_system_addr(mdev, mic_addr)) {
+		dev_err(mdev->sdev->parent,
+		"mic_addr is invalid. mic_addr = 0x%llx\n", mic_addr);
+		return -EINVAL;
+	}
+	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
+	dma_addr = smpt_info->entry[spt].dma_addr +
+		mic_smpt_offset(mdev, mic_addr);
+	return dma_addr;
+}
+
+/**
+ * mic_map - Maps a DMA address to a MIC physical address.
+ *
+ * @mdev: pointer to mic_device instance.
+ * @dma_addr: DMA address.
+ * @size: Size of the region to be mapped.
+ *
+ * This API converts the DMA address provided to a DMA address understood
+ * by MIC. Caller should check for errors by calling mic_map_error(..).
+ *
+ * returns DMA address as required by MIC.
+ */
+dma_addr_t mic_map(struct mic_device *mdev, dma_addr_t dma_addr, size_t size)
+{
+	dma_addr_t mic_addr = 0;
+	int num_entries;
+	s64 *ref;
+	u64 smpt_start;
+
+	if (!size || size > mic_max_system_memory(mdev))
+		return mic_addr;
+
+	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	if (!ref)
+		return mic_addr;
+
+	num_entries = mic_get_smpt_ref_count(mdev, dma_addr, size,
+		ref, &smpt_start);
+
+	/* Set the smpt table appropriately and get 16G aligned mic address */
+	mic_addr = mic_smpt_op(mdev, smpt_start, num_entries, ref, size);
+
+	kfree(ref);
+
+	/*
+	 * If mic_addr is zero then it is an error case
+	 * since mic_addr can never be zero.
+	 * Otherwise generate mic_addr by adding the 16G offset in dma_addr.
+	 */
+	if (!mic_addr && MIC_FAMILY_X100 == mdev->family) {
+		dev_err(mdev->sdev->parent,
+			"mic_map failed dma_addr 0x%llx size 0x%zx\n",
+			dma_addr, size);
+		return mic_addr;
+	} else {
+		return mic_addr + mic_smpt_offset(mdev, dma_addr);
+	}
+}
+
+/**
+ * mic_unmap - Unmaps a MIC physical address.
+ *
+ * @mdev: pointer to mic_device instance.
+ * @mic_addr: MIC physical address.
+ * @size: Size of the region to be unmapped.
+ *
+ * This API unmaps the mappings created by mic_map(..).
+ *
+ * returns None.
+ */
+void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
+{
+	struct mic_smpt_info *smpt_info = mdev->smpt;
+	s64 *ref;
+	int num_smpt;
+	int spt;
+	int i;
+	unsigned long flags;
+
+	if (!size)
+		return;
+
+	if (!mic_is_system_addr(mdev, mic_addr)) {
+		dev_err(mdev->sdev->parent,
+			"invalid address: 0x%llx\n", mic_addr);
+		return;
+	}
+
+	spt = mic_sys_addr_to_smpt(mdev, mic_addr);
+	ref = kmalloc(mdev->smpt->info.num_reg * sizeof(s64), GFP_KERNEL);
+	if (!ref)
+		return;
+
+	/* Get number of smpt entries to be unmapped, ref count array */
+	num_smpt = mic_get_smpt_ref_count(mdev, mic_addr, size, ref, NULL);
+
+	spin_lock_irqsave(&smpt_info->smpt_lock, flags);
+	smpt_info->unmap_count++;
+	smpt_info->ref_count -= (s64)size;
+
+	for (i = spt; i < spt + num_smpt; i++) {
+		smpt_info->entry[i].ref_count -= ref[i - spt];
+		if (smpt_info->entry[i].ref_count < 0)
+			dev_warn(mdev->sdev->parent,
+				"ref count for entry %d is negative\n", i);
+	}
+	spin_unlock_irqrestore(&smpt_info->smpt_lock, flags);
+	kfree(ref);
+}
+
+/**
+ * mic_map_single - Maps a virtual address to a MIC physical address.
+ *
+ * @mdev: pointer to mic_device instance.
+ * @va: Kernel direct mapped virtual address.
+ * @size: Size of the region to be mapped.
+ *
+ * This API calls pci_map_single(..) for the direct mapped virtual address
+ * and then converts the DMA address provided to a DMA address understood
+ * by MIC. Caller should check for errors by calling mic_map_error(..).
+ *
+ * returns DMA address as required by MIC.
+ */
+dma_addr_t mic_map_single(struct mic_device *mdev, void *va, size_t size)
+{
+	dma_addr_t mic_addr = 0;
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+	dma_addr_t dma_addr =
+		pci_map_single(pdev, va, size, PCI_DMA_BIDIRECTIONAL);
+
+	if (!pci_dma_mapping_error(pdev, dma_addr)) {
+		mic_addr = mic_map(mdev, dma_addr, size);
+		if (!mic_addr) {
+			dev_err(mdev->sdev->parent,
+				"mic_map failed dma_addr 0x%llx size 0x%zx\n",
+				dma_addr, size);
+			pci_unmap_single(pdev, dma_addr,
+				size, PCI_DMA_BIDIRECTIONAL);
+		}
+	}
+	return mic_addr;
+}
+
+/**
+ * mic_unmap_single - Unmaps a MIC physical address.
+ *
+ * @mdev: pointer to mic_device instance.
+ * @mic_addr: MIC physical address.
+ * @size: Size of the region to be unmapped.
+ *
+ * This API unmaps the mappings created by mic_map_single(..).
+ *
+ * returns None.
+ */
+void
+mic_unmap_single(struct mic_device *mdev, dma_addr_t mic_addr, size_t size)
+{
+	struct pci_dev *pdev = container_of(mdev->sdev->parent,
+		struct pci_dev, dev);
+	dma_addr_t dma_addr = mic_to_dma_addr(mdev, mic_addr);
+	mic_unmap(mdev, mic_addr, size);
+	pci_unmap_single(pdev, dma_addr, size, PCI_DMA_BIDIRECTIONAL);
+}
+
+/**
+ * mic_smpt_init - Initialize MIC System Memory Page Tables.
+ *
+ * @mdev: pointer to mic_device instance.
+ *
+ * returns 0 for success and -errno for error.
+ */
+int mic_smpt_init(struct mic_device *mdev)
+{
+	int i, err = 0;
+	dma_addr_t dma_addr;
+	struct mic_smpt_info *smpt_info;
+
+	mdev->smpt = kmalloc(sizeof(*mdev->smpt), GFP_KERNEL);
+	if (!mdev->smpt)
+		return -ENOMEM;
+
+	smpt_info = mdev->smpt;
+	mdev->smpt_ops->init(mdev);
+	smpt_info->entry = kmalloc(sizeof(struct mic_smpt)
+			* smpt_info->info.num_reg, GFP_KERNEL);
+	if (!smpt_info->entry) {
+		err = -ENOMEM;
+		goto free_smpt;
+	}
+	spin_lock_init(&smpt_info->smpt_lock);
+	for (i = 0; i < smpt_info->info.num_reg; i++) {
+		dma_addr = i * smpt_info->info.page_size;
+		smpt_info->entry[i].dma_addr = dma_addr;
+		smpt_info->entry[i].ref_count = 0;
+		mdev->smpt_ops->set(mdev, dma_addr, i);
+	}
+	smpt_info->ref_count = 0;
+	smpt_info->map_count = 0;
+	smpt_info->unmap_count = 0;
+	return 0;
+free_smpt:
+	kfree(smpt_info);
+	return err;
+}
+
+/**
+ * mic_smpt_uninit - UnInitialize MIC System Memory Page Tables.
+ *
+ * @mdev: pointer to mic_device instance.
+ *
+ * returns None.
+ */
+void mic_smpt_uninit(struct mic_device *mdev)
+{
+	struct mic_smpt_info *smpt_info = mdev->smpt;
+	int i;
+
+	dev_dbg(mdev->sdev->parent,
+		"nodeid %d SMPT ref count %lld map %lld unmap %lld\n",
+		mdev->id, smpt_info->ref_count,
+		smpt_info->map_count, smpt_info->unmap_count);
+
+	for (i = 0; i < smpt_info->info.num_reg; i++) {
+		dev_dbg(mdev->sdev->parent,
+			"SMPT entry[%d] dma_addr = 0x%llx ref_count = %lld\n",
+			i, smpt_info->entry[i].dma_addr,
+			smpt_info->entry[i].ref_count);
+		if (smpt_info->entry[i].ref_count)
+			dev_warn(mdev->sdev->parent,
+			"ref count for entry %d is not zero\n", i);
+	}
+	kfree(smpt_info->entry);
+	kfree(smpt_info);
+}
+
+/**
+ * mic_smpt_restore - Restore MIC System Memory Page Tables.
+ *
+ * @mdev: pointer to mic_device instance.
+ *
+ * Restore the SMPT registers to values previously stored in the
+ * SW data structures. Some MIC steppings lose register state
+ * across resets and this API should be called for performing
+ * a restore operation if required.
+ *
+ * returns None.
+ */
+void mic_smpt_restore(struct mic_device *mdev)
+{
+	int i;
+	dma_addr_t dma_addr;
+
+	for (i = 0; i < mdev->smpt->info.num_reg; i++) {
+		dma_addr = mdev->smpt->entry[i].dma_addr;
+		mdev->smpt_ops->set(mdev, dma_addr, i);
+	}
+}
diff --git a/drivers/misc/mic/host/mic_smpt.h b/drivers/misc/mic/host/mic_smpt.h
new file mode 100644
index 0000000..51970ab
--- /dev/null
+++ b/drivers/misc/mic/host/mic_smpt.h
@@ -0,0 +1,98 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef MIC_SMPT_H
+#define MIC_SMPT_H
+/**
+ * struct mic_smpt_ops - MIC HW specific SMPT operations.
+ * @init: Initialize hardware specific SMPT information in mic_smpt_hw_info.
+ * @set: Set the value for a particular SMPT entry.
+ */
+struct mic_smpt_ops {
+	void (*init)(struct mic_device *mdev);
+	void (*set)(struct mic_device *mdev, dma_addr_t dma_addr, u8 index);
+};
+
+/**
+ * struct mic_smpt - MIC SMPT entry information.
+ * @dma_addr: Base DMA address for this SMPT entry.
+ * @ref_count: Number of active mappings for this SMPT entry in bytes.
+ */
+struct mic_smpt {
+	dma_addr_t dma_addr;
+	s64 ref_count;
+};
+
+/**
+ * struct mic_smpt_hw_info - MIC SMPT hardware specific information.
+ * @num_reg: Number of SMPT registers.
+ * @page_shift: System memory page shift.
+ * @page_size: System memory page size.
+ * @base: System address base.
+ */
+struct mic_smpt_hw_info {
+	u8 num_reg;
+	u8 page_shift;
+	u64 page_size;
+	u64 base;
+};
+
+/**
+ * struct mic_smpt_info - MIC SMPT information.
+ * @entry: Array of SMPT entries.
+ * @smpt_lock: Spin lock protecting access to SMPT data structures.
+ * @info: Hardware specific SMPT information.
+ * @ref_count: Number of active SMPT mappings (for debug).
+ * @map_count: Number of SMPT mappings created (for debug).
+ * @unmap_count: Number of SMPT mappings destroyed (for debug).
+ */
+struct mic_smpt_info {
+	struct mic_smpt *entry;
+	spinlock_t smpt_lock;
+	struct mic_smpt_hw_info info;
+	s64 ref_count;
+	s64 map_count;
+	s64 unmap_count;
+};
+
+dma_addr_t mic_map_single(struct mic_device *mdev, void *va, size_t size);
+void mic_unmap_single(struct mic_device *mdev,
+	dma_addr_t mic_addr, size_t size);
+dma_addr_t mic_map(struct mic_device *mdev,
+	dma_addr_t dma_addr, size_t size);
+void mic_unmap(struct mic_device *mdev, dma_addr_t mic_addr, size_t size);
+
+/**
+ * mic_map_error - Check a MIC address for errors.
+ *
+ * @mic_addr: MIC address to be checked.
+ *
+ * returns Whether there was an error during mic_map..(..) APIs.
+ */
+static inline bool mic_map_error(dma_addr_t mic_addr)
+{
+	return !mic_addr;
+}
+
+int mic_smpt_init(struct mic_device *mdev);
+void mic_smpt_uninit(struct mic_device *mdev);
+void mic_smpt_restore(struct mic_device *mdev);
+
+#endif
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
index da481b1..b637316 100644
--- a/drivers/misc/mic/host/mic_x100.c
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -24,6 +24,7 @@
 #include "../common/mic_device.h"
 #include "mic_device.h"
 #include "mic_x100.h"
+#include "mic_smpt.h"
 
 /**
  * mic_x100_write_spad - write to the scratchpad register
@@ -67,9 +68,255 @@ mic_x100_read_spad(struct mic_device *mdev, unsigned int idx)
 	return val;
 }
 
+/**
+ * mic_x100_enable_interrupts - Enable interrupts.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_enable_interrupts(struct mic_device *mdev)
+{
+	u32 reg;
+	struct mic_mw *mw = &mdev->mmio;
+	u32 sice0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SICE0;
+	u32 siac0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SIAC0;
+
+	reg = mic_mmio_read(mw, sice0);
+	reg |= MIC_X100_SBOX_DBR_BITS(0xf) | MIC_X100_SBOX_DMA_BITS(0xff);
+	mic_mmio_write(mw, reg, sice0);
+
+	/*
+	 * Enable auto-clear when enabling interrupts. Applicable only for
+	 * MSI-x. Legacy and MSI mode cannot have auto-clear enabled.
+	 */
+	if (mdev->irq_info.num_vectors > 1) {
+		reg = mic_mmio_read(mw, siac0);
+		reg |= MIC_X100_SBOX_DBR_BITS(0xf) |
+			MIC_X100_SBOX_DMA_BITS(0xff);
+		mic_mmio_write(mw, reg, siac0);
+	}
+}
+
+/**
+ * mic_x100_disable_interrupts - Disable interrupts.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_disable_interrupts(struct mic_device *mdev)
+{
+	u32 reg;
+	struct mic_mw *mw = &mdev->mmio;
+	u32 sice0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SICE0;
+	u32 siac0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SIAC0;
+	u32 sicc0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SICC0;
+
+	reg = mic_mmio_read(mw, sice0);
+	mic_mmio_write(mw, reg, sicc0);
+
+	if (mdev->irq_info.num_vectors > 1) {
+		reg = mic_mmio_read(mw, siac0);
+		reg &= ~(MIC_X100_SBOX_DBR_BITS(0xf) |
+			MIC_X100_SBOX_DMA_BITS(0xff));
+		mic_mmio_write(mw, reg, siac0);
+	}
+}
+
+/**
+ * mic_x100_send_sbox_intr - Send an MIC_X100_SBOX interrupt to MIC.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_send_sbox_intr(struct mic_device *mdev,
+			int doorbell)
+{
+	struct mic_mw *mw = &mdev->mmio;
+	u64 apic_icr_offset = MIC_X100_SBOX_APICICR0 + doorbell * 8;
+	u32 apicicr_low = mic_mmio_read(mw,
+			MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+
+	/* for MIC we need to make sure we "hit" the send_icr bit (13) */
+	apicicr_low = (apicicr_low | (1 << 13));
+
+	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
+	wmb();
+	mic_mmio_write(mw, apicicr_low,
+		MIC_X100_SBOX_BASE_ADDRESS + apic_icr_offset);
+}
+
+/**
+ * mic_x100_send_rdmasr_intr - Send an RDMASR interrupt to MIC.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_send_rdmasr_intr(struct mic_device *mdev,
+			int doorbell)
+{
+	int rdmasr_offset = MIC_X100_SBOX_RDMASR0 + (doorbell << 2);
+	/* Ensure that the interrupt is ordered w.r.t. previous stores. */
+	wmb();
+	mic_mmio_write(&mdev->mmio, 0,
+		MIC_X100_SBOX_BASE_ADDRESS + rdmasr_offset);
+}
+
+/**
+ * mic_x100_send_intr - Send interrupt to MIC.
+ * @mdev: pointer to mic_device instance
+ * @doorbell: doorbell number.
+ */
+static void mic_x100_send_intr(struct mic_device *mdev, int doorbell)
+{
+	int rdmasr_db;
+	if (doorbell < MIC_X100_NUM_SBOX_IRQ) {
+		mic_x100_send_sbox_intr(mdev, doorbell);
+	} else {
+		rdmasr_db = doorbell - MIC_X100_NUM_SBOX_IRQ +
+			MIC_X100_RDMASR_IRQ_BASE;
+		mic_x100_send_rdmasr_intr(mdev, rdmasr_db);
+	}
+}
+
+/**
+ * mic_x100_ack_interrupt - Device specific interrupt handling.
+ * @mdev: pointer to mic_device instance
+ *
+ * Returns: bitmask of doorbell events triggered.
+ */
+static u32 mic_x100_ack_interrupt(struct mic_device *mdev)
+{
+	u32 reg = 0;
+	struct mic_mw *mw = &mdev->mmio;
+	u32 sicr0 = MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SICR0;
+
+	/* Clear pending bit array. */
+	if (MIC_A0_STEP == mdev->stepping)
+		mic_mmio_write(mw, 1, MIC_X100_SBOX_BASE_ADDRESS +
+			MIC_X100_SBOX_MSIXPBACR);
+
+	if (mdev->irq_info.num_vectors <= 1) {
+		reg = mic_mmio_read(mw, sicr0);
+
+		if (unlikely(!reg))
+			goto done;
+
+		mic_mmio_write(mw, reg, sicr0);
+	}
+
+	if (mdev->stepping >= MIC_B0_STEP)
+		mdev->intr_ops->enable_interrupts(mdev);
+done:
+	return reg;
+}
+
+/**
+ * mic_x100_hw_intr_init - Initialize h/w specific interrupt
+ * information.
+ * @mdev: pointer to mic_device instance
+ */
+static void mic_x100_hw_intr_init(struct mic_device *mdev)
+{
+	mdev->intr_info = (struct mic_intr_info *) mic_x100_intr_init;
+}
+
+/**
+ * mic_x100_read_msi_to_src_map - read from the MSI mapping registers
+ * @mdev: pointer to mic_device instance
+ * @idx: index to the mapping register, 0 based
+ *
+ * This function allows reading of the 32bit MSI mapping register.
+ *
+ * RETURNS: The value in the register.
+ */
+static u32
+mic_x100_read_msi_to_src_map(struct mic_device *mdev, int idx)
+{
+	return mic_mmio_read(&mdev->mmio,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_MXAR0 + idx * 4);
+}
+
+/**
+ * mic_x100_program_msi_to_src_map - program the MSI mapping registers
+ * @mdev: pointer to mic_device instance
+ * @idx: index to the mapping register, 0 based
+ * @offset: The bit offset in the register that needs to be updated.
+ * @set: boolean specifying if the bit in the specified offset needs
+ * to be set or cleared.
+ *
+ * RETURNS: None.
+ */
+static void
+mic_x100_program_msi_to_src_map(struct mic_device *mdev,
+			int idx, int offset, bool set)
+{
+	unsigned long reg;
+	struct mic_mw *mw = &mdev->mmio;
+	u32 mxar = MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_MXAR0 + idx * 4;
+
+	reg = mic_mmio_read(mw, mxar);
+	if (set)
+		__set_bit(offset, &reg);
+	else
+		__clear_bit(offset, &reg);
+	mic_mmio_write(mw, reg, mxar);
+}
+
+/**
+ * mic_x100_smpt_set - Update an SMPT entry with a DMA address.
+ * @mdev: pointer to mic_device instance
+ *
+ * RETURNS: none.
+ */
+static void
+mic_x100_smpt_set(struct mic_device *mdev, dma_addr_t dma_addr, u8 index)
+{
+#define SNOOP_ON	(0 << 0)
+#define SNOOP_OFF	(1 << 0)
+/*
+ * Sbox Smpt Reg Bits:
+ * Bits	31:2	Host address
+ * Bits	1	RSVD
+ * Bits	0	No snoop
+ */
+#define BUILD_SMPT(NO_SNOOP, HOST_ADDR)  \
+	(u32)(((HOST_ADDR) << 2) | ((NO_SNOOP) & 0x01))
+
+	uint32_t smpt_reg_val = BUILD_SMPT(SNOOP_ON,
+			dma_addr >> mdev->smpt->info.page_shift);
+	mic_mmio_write(&mdev->mmio, smpt_reg_val,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_SMPT00 + (4 * index));
+}
+
+/**
+ * mic_x100_smpt_hw_init - Initialize SMPT X100 specific fields.
+ * @mdev: pointer to mic_device instance
+ *
+ * RETURNS: none.
+ */
+static void mic_x100_smpt_hw_init(struct mic_device *mdev)
+{
+	struct mic_smpt_hw_info *info = &mdev->smpt->info;
+
+	info->num_reg = 32;
+	info->page_shift = 34;
+	info->page_size = (1ULL << info->page_shift);
+	info->base = 0x8000000000ULL;
+}
+
+struct mic_smpt_ops mic_x100_smpt_ops = {
+	.init = mic_x100_smpt_hw_init,
+	.set = mic_x100_smpt_set,
+};
+
 struct mic_hw_ops mic_x100_ops = {
 	.aper_bar = MIC_X100_APER_BAR,
 	.mmio_bar = MIC_X100_MMIO_BAR,
 	.read_spad = mic_x100_read_spad,
 	.write_spad = mic_x100_write_spad,
+	.send_intr = mic_x100_send_intr,
+	.ack_interrupt = mic_x100_ack_interrupt,
+};
+
+struct mic_hw_intr_ops mic_x100_intr_ops = {
+	.intr_init = mic_x100_hw_intr_init,
+	.enable_interrupts = mic_x100_enable_interrupts,
+	.disable_interrupts = mic_x100_disable_interrupts,
+	.program_msi_to_src_map = mic_x100_program_msi_to_src_map,
+	.read_msi_to_src_map = mic_x100_read_msi_to_src_map,
 };
diff --git a/drivers/misc/mic/host/mic_x100.h b/drivers/misc/mic/host/mic_x100.h
index 1f4e630..642cae9 100644
--- a/drivers/misc/mic/host/mic_x100.h
+++ b/drivers/misc/mic/host/mic_x100.h
@@ -42,6 +42,45 @@
 
 #define MIC_X100_SBOX_BASE_ADDRESS 0x00010000
 #define MIC_X100_SBOX_SPAD0 0x0000AB20
+#define MIC_X100_SBOX_SICR0_DBR(x) ((x) & 0xf)
+#define MIC_X100_SBOX_SICR0_DMA(x) (((x) >> 8) & 0xff)
+#define MIC_X100_SBOX_SICE0_DBR(x) ((x) & 0xf)
+#define MIC_X100_SBOX_DBR_BITS(x) ((x) & 0xf)
+#define MIC_X100_SBOX_SICE0_DMA(x) (((x) >> 8) & 0xff)
+#define MIC_X100_SBOX_DMA_BITS(x) (((x) & 0xff) << 8)
+
+#define MIC_X100_SBOX_APICICR0 0x0000A9D0
+#define MIC_X100_SBOX_SICR0 0x00009004
+#define MIC_X100_SBOX_SICE0 0x0000900C
+#define MIC_X100_SBOX_SICC0 0x00009010
+#define MIC_X100_SBOX_SIAC0 0x00009014
+#define MIC_X100_SBOX_MSIXPBACR 0x00009084
+#define MIC_X100_SBOX_MXAR0 0x00009044
+#define MIC_X100_SBOX_SMPT00 0x00003100
+#define MIC_X100_SBOX_RDMASR0 0x0000B180
+
+#define MIC_X100_DOORBELL_IDX_START 0
+#define MIC_X100_NUM_DOORBELL 4
+#define MIC_X100_DMA_IDX_START 8
+#define MIC_X100_NUM_DMA 8
+#define MIC_X100_ERR_IDX_START 30
+#define MIC_X100_NUM_ERR 1
+
+#define MIC_X100_NUM_SBOX_IRQ 8
+#define MIC_X100_NUM_RDMASR_IRQ 8
+#define MIC_X100_RDMASR_IRQ_BASE 17
+
+static const u16 mic_x100_intr_init[] = {
+		MIC_X100_DOORBELL_IDX_START,
+		MIC_X100_DMA_IDX_START,
+		MIC_X100_ERR_IDX_START,
+		MIC_X100_NUM_DOORBELL,
+		MIC_X100_NUM_DMA,
+		MIC_X100_NUM_ERR,
+};
+
 extern struct mic_hw_ops mic_x100_ops;
+extern struct mic_smpt_ops mic_x100_smpt_ops;
+extern struct mic_hw_intr_ops mic_x100_intr_ops;
 
 #endif
-- 
1.8.2.1
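
The per-entry reference counting done by mic_get_smpt_ref_count() in
mic_smpt.c can be easier to follow outside the driver. What follows is a
minimal user-space sketch, not the driver code: it assumes the X100 values
set in mic_x100_smpt_hw_init() (page_shift = 34, num_reg = 32), and the
names split_ref_counts(), align_high() and align_low() are hypothetical.
The "i < SMPT_NUM_REG" bound is a defensive addition of the sketch and is
not present in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SMPT_PAGE_SHIFT 34                          /* X100: 16 GiB SMPT pages */
#define SMPT_PAGE_SIZE  (1ULL << SMPT_PAGE_SHIFT)
#define SMPT_NUM_REG    32                          /* X100: 32 SMPT registers */

/* Round addr up to the next page boundary, like the kernel's ALIGN(). */
static uint64_t align_high(uint64_t addr)
{
	return (addr + SMPT_PAGE_SIZE - 1) & ~(SMPT_PAGE_SIZE - 1);
}

/* Round addr down to a page boundary. */
static uint64_t align_low(uint64_t addr)
{
	return addr & ~(SMPT_PAGE_SIZE - 1);
}

/*
 * Hypothetical analogue of mic_get_smpt_ref_count(): store in ref[] the
 * number of bytes of [dma_addr, dma_addr + size) that fall into each
 * SMPT page and return the number of pages touched. ref[] must have room
 * for SMPT_NUM_REG entries.
 */
int split_ref_counts(uint64_t dma_addr, uint64_t size,
		     int64_t *ref, uint64_t *smpt_start)
{
	uint64_t start = dma_addr;
	uint64_t end = dma_addr + size;
	int i = 0;

	while (start < end && i < SMPT_NUM_REG) {
		uint64_t next = align_high(start + 1);

		/* Bytes of the region that live in the current page. */
		ref[i++] = (int64_t)((next < end ? next : end) - start);
		start = next;
	}

	if (smpt_start)
		*smpt_start = align_low(dma_addr);

	return i;
}
```

For example, an 8 KiB region that starts 4 KiB below the first 16 GiB
boundary touches two SMPT pages and contributes 4096 bytes to the
reference count of each, which is why ref_count in struct mic_smpt is
kept in bytes rather than as a number of mappings.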


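The address arithmetic behind mic_sys_addr_to_smpt(), mic_smpt_to_pa(),
mic_smpt_offset() and the BUILD_SMPT() macro can be sketched the same way.
This is again only an illustration under stated assumptions: the constants
are the X100 values from mic_x100_smpt_hw_init() (base = 0x8000000000,
16 GiB pages, 32 registers), and the helper names below are hypothetical
stand-ins rather than driver functions.

```c
#include <stdint.h>

#define SMPT_PAGE_SHIFT 34
#define SMPT_PAGE_SIZE  (1ULL << SMPT_PAGE_SHIFT)
#define SMPT_NUM_REG    32
#define SMPT_BASE       0x8000000000ULL

/* MIC system address -> index of the SMPT entry that covers it. */
static inline unsigned int mic_addr_to_index(uint64_t mic_addr)
{
	return (unsigned int)((mic_addr - SMPT_BASE) >> SMPT_PAGE_SHIFT);
}

/* SMPT index -> MIC address of the start of that 16 GiB window. */
static inline uint64_t index_to_mic_addr(unsigned int index)
{
	return SMPT_BASE + (uint64_t)index * SMPT_PAGE_SIZE;
}

/* Offset of an address within its 16 GiB page. */
static inline uint64_t page_offset(uint64_t addr)
{
	return addr & (SMPT_PAGE_SIZE - 1);
}

/* True if mic_addr lies inside the window covered by the SMPT entries. */
static inline int is_system_addr(uint64_t mic_addr)
{
	return mic_addr >= SMPT_BASE &&
	       mic_addr <= SMPT_BASE + SMPT_NUM_REG * SMPT_PAGE_SIZE - 1;
}

/*
 * Register value as laid out in the patch comment: bits 31:2 hold the
 * host page number, bit 1 is reserved, bit 0 is the no-snoop flag.
 */
static inline uint32_t build_smpt(uint64_t dma_addr, int no_snoop)
{
	return (uint32_t)(((dma_addr >> SMPT_PAGE_SHIFT) << 2) |
			  (uint64_t)(no_snoop & 0x01));
}
```

With these values, SMPT entry 3 starts at MIC address 0x8C00000000, and a
host DMA address in the fourth 16 GiB page (3 << 34) is programmed as the
register value 12 with snooping on.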
^ permalink raw reply related	[relevance 28%]

* [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family.
  2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
@ 2013-09-05 23:41 44% ` Sudeep Dutt
  2013-09-05 23:41 28% ` [PATCH RESEND v3 2/7] Intel MIC Host Driver Interrupt/SMPT support Sudeep Dutt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

This patch enables the following:
a) Initializes the Intel MIC X100 PCIe devices.
b) Provides sysfs entries for family and stepping information.

Co-author: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
---
 Documentation/ABI/testing/sysfs-class-mic.txt |  34 +++
 drivers/misc/Kconfig                          |   1 +
 drivers/misc/Makefile                         |   1 +
 drivers/misc/mic/Kconfig                      |  19 ++
 drivers/misc/mic/Makefile                     |   5 +
 drivers/misc/mic/common/mic_device.h          |  37 +++
 drivers/misc/mic/host/Makefile                |   8 +
 drivers/misc/mic/host/mic_device.h            | 109 +++++++++
 drivers/misc/mic/host/mic_main.c              | 309 ++++++++++++++++++++++++++
 drivers/misc/mic/host/mic_sysfs.c             |  97 ++++++++
 drivers/misc/mic/host/mic_x100.c              |  75 +++++++
 drivers/misc/mic/host/mic_x100.h              |  47 ++++
 12 files changed, 742 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-mic.txt
 create mode 100644 drivers/misc/mic/Kconfig
 create mode 100644 drivers/misc/mic/Makefile
 create mode 100644 drivers/misc/mic/common/mic_device.h
 create mode 100644 drivers/misc/mic/host/Makefile
 create mode 100644 drivers/misc/mic/host/mic_device.h
 create mode 100644 drivers/misc/mic/host/mic_main.c
 create mode 100644 drivers/misc/mic/host/mic_sysfs.c
 create mode 100644 drivers/misc/mic/host/mic_x100.c
 create mode 100644 drivers/misc/mic/host/mic_x100.h

diff --git a/Documentation/ABI/testing/sysfs-class-mic.txt b/Documentation/ABI/testing/sysfs-class-mic.txt
new file mode 100644
index 0000000..09eb3c6
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-mic.txt
@@ -0,0 +1,34 @@
+What:		/sys/class/mic/
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		The mic class directory belongs to Intel MIC devices and
+		provides information per MIC device. An Intel MIC device is a
+		PCIe form factor add-in Coprocessor card based on the Intel Many
+		Integrated Core (MIC) architecture that runs a Linux OS.
+
+What:		/sys/class/mic/mic(x)
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		The directories /sys/class/mic/mic0, /sys/class/mic/mic1, etc.
+		represent MIC devices (0, 1, etc.). Each directory has
+		information specific to that MIC device.
+
+What:		/sys/class/mic/mic(x)/family
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		Provides information about the Coprocessor family for an Intel
+		MIC device. For example - "x100"
+
+What:		/sys/class/mic/mic(x)/stepping
+Date:		August 2013
+KernelVersion:	3.11
+Contact:	Sudeep Dutt <sudeep.dutt@intel.com>
+Description:
+		Provides information about the silicon stepping for an Intel
+		MIC device. For example - "A0" or "B0"
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 8dacd4c..e760715 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -537,4 +537,5 @@ source "drivers/misc/carma/Kconfig"
 source "drivers/misc/altera-stapl/Kconfig"
 source "drivers/misc/mei/Kconfig"
 source "drivers/misc/vmw_vmci/Kconfig"
+source "drivers/misc/mic/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c235d5b..0b7ea3e 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -53,3 +53,4 @@ obj-$(CONFIG_INTEL_MEI)		+= mei/
 obj-$(CONFIG_VMWARE_VMCI)	+= vmw_vmci/
 obj-$(CONFIG_LATTICE_ECP3_CONFIG)	+= lattice-ecp3-config.o
 obj-$(CONFIG_SRAM)		+= sram.o
+obj-y				+= mic/
diff --git a/drivers/misc/mic/Kconfig b/drivers/misc/mic/Kconfig
new file mode 100644
index 0000000..aaefd0c
--- /dev/null
+++ b/drivers/misc/mic/Kconfig
@@ -0,0 +1,19 @@
+comment "Intel MIC Host Driver"
+
+config INTEL_MIC_HOST
+	tristate "Intel MIC Host Driver"
+	depends on 64BIT && PCI
+	default n
+	help
+	  This enables Host Driver support for the Intel Many Integrated
+	  Core (MIC) family of PCIe form factor coprocessor devices that
+	  run a 64 bit Linux OS. The driver manages card OS state and
+	  enables communication between host and card. Intel MIC X100
+	  devices are currently supported.
+
+	  If you are building a host kernel with an Intel MIC device then
+	  say M (recommended) or Y, else say N. If unsure say N.
+
+	  More information about the Intel MIC family as well as the Linux
+	  OS and tools for MIC to use with this driver are available from
+	  <http://software.intel.com/en-us/mic-developer>.
diff --git a/drivers/misc/mic/Makefile b/drivers/misc/mic/Makefile
new file mode 100644
index 0000000..8e72421
--- /dev/null
+++ b/drivers/misc/mic/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2013, Intel Corporation.
+#
+obj-$(CONFIG_INTEL_MIC_HOST) += host/
diff --git a/drivers/misc/mic/common/mic_device.h b/drivers/misc/mic/common/mic_device.h
new file mode 100644
index 0000000..f02262e
--- /dev/null
+++ b/drivers/misc/mic/common/mic_device.h
@@ -0,0 +1,37 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC driver.
+ *
+ */
+#ifndef __MIC_COMMON_DEVICE_H_
+#define __MIC_COMMON_DEVICE_H_
+
+/**
+ * struct mic_mw - MIC memory window
+ *
+ * @pa: Base physical address.
+ * @va: Base ioremap'd virtual address.
+ * @len: Size of the memory window.
+ */
+struct mic_mw {
+	phys_addr_t pa;
+	void __iomem *va;
+	resource_size_t len;
+};
+
+#endif
diff --git a/drivers/misc/mic/host/Makefile b/drivers/misc/mic/host/Makefile
new file mode 100644
index 0000000..93b9d25
--- /dev/null
+++ b/drivers/misc/mic/host/Makefile
@@ -0,0 +1,8 @@
+#
+# Makefile - Intel MIC Linux driver.
+# Copyright(c) 2013, Intel Corporation.
+#
+obj-$(CONFIG_INTEL_MIC_HOST) += mic_host.o
+mic_host-objs := mic_main.o
+mic_host-objs += mic_x100.o
+mic_host-objs += mic_sysfs.o
diff --git a/drivers/misc/mic/host/mic_device.h b/drivers/misc/mic/host/mic_device.h
new file mode 100644
index 0000000..6cd904c
--- /dev/null
+++ b/drivers/misc/mic/host/mic_device.h
@@ -0,0 +1,109 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef _MIC_DEVICE_H_
+#define _MIC_DEVICE_H_
+
+/* The maximum number of MIC devices supported in a single host system. */
+#define MIC_MAX_NUM_DEVS 256
+
+/**
+ * enum mic_hw_family - The hardware family to which a device belongs.
+ */
+enum mic_hw_family {
+	MIC_FAMILY_X100 = 0,
+	MIC_FAMILY_UNKNOWN
+};
+
+/**
+ * enum mic_stepping - MIC stepping ids.
+ */
+enum mic_stepping {
+	MIC_A0_STEP = 0x0,
+	MIC_B0_STEP = 0x10,
+	MIC_B1_STEP = 0x11,
+	MIC_C0_STEP = 0x20,
+};
+
+/**
+ * struct mic_device -  MIC device information for each card.
+ *
+ * @mmio: MMIO bar information.
+ * @aper: Aperture bar information.
+ * @family: The MIC family to which this device belongs.
+ * @ops: MIC HW specific operations.
+ * @id: The unique device id for this MIC device.
+ * @stepping: Stepping ID.
+ * @attr_group: Pointer to list of sysfs attribute groups.
+ * @sdev: Device for sysfs entries.
+ */
+struct mic_device {
+	struct mic_mw mmio;
+	struct mic_mw aper;
+	enum mic_hw_family family;
+	struct mic_hw_ops *ops;
+	int id;
+	enum mic_stepping stepping;
+	const struct attribute_group **attr_group;
+	struct device *sdev;
+};
+
+/**
+ * struct mic_hw_ops - MIC HW specific operations.
+ * @aper_bar: Aperture bar resource number.
+ * @mmio_bar: MMIO bar resource number.
+ * @read_spad: Read from scratch pad register.
+ * @write_spad: Write to scratch pad register.
+ */
+struct mic_hw_ops {
+	u8 aper_bar;
+	u8 mmio_bar;
+	u32 (*read_spad)(struct mic_device *mdev, unsigned int idx);
+	void (*write_spad)(struct mic_device *mdev, unsigned int idx, u32 val);
+};
+
+/**
+ * mic_mmio_read - read from an MMIO register.
+ * @mw: MIC memory window whose va field is the ioremap'd MMIO base.
+ * @offset: register offset.
+ *
+ * RETURNS: register value.
+ */
+static inline u32 mic_mmio_read(struct mic_mw *mw, u32 offset)
+{
+	return ioread32(mw->va + offset);
+}
+
+/**
+ * mic_mmio_write - write to an MMIO register.
+ * @mw: MIC memory window whose va field is the ioremap'd MMIO base.
+ * @val: the data value to put into the register
+ * @offset: register offset.
+ *
+ * RETURNS: none.
+ */
+static inline void
+mic_mmio_write(struct mic_mw *mw, u32 val, u32 offset)
+{
+	iowrite32(val, mw->va + offset);
+}
+
+void mic_sysfs_init(struct mic_device *mdev);
+#endif
diff --git a/drivers/misc/mic/host/mic_main.c b/drivers/misc/mic/host/mic_main.c
new file mode 100644
index 0000000..228c96c
--- /dev/null
+++ b/drivers/misc/mic/host/mic_main.c
@@ -0,0 +1,309 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ * Global TODO's across the driver to be added after initial base
+ * patches are accepted upstream:
+ * 1) Enable DMA support.
+ * 2) Enable per vring interrupt support.
+ */
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_x100.h"
+
+static const char mic_driver_name[] = "mic";
+
+static DEFINE_PCI_DEVICE_TABLE(mic_pci_tbl) = {
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2250)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2251)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2252)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2253)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2254)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2255)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2256)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2257)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2258)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_2259)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_225a)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_225b)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_225c)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_225d)},
+	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, MIC_X100_PCI_DEVICE_225e)},
+
+	/* required last entry */
+	{ 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, mic_pci_tbl);
+
+/* ID allocator for MIC devices */
+static struct ida g_mic_ida;
+/* Class of MIC devices for sysfs accessibility. */
+static struct class *g_mic_class;
+/* Base device node number for MIC devices */
+static dev_t g_mic_devno;
+
+/**
+ * mic_ops_init - Initialize HW specific operation tables.
+ *
+ * @mdev: pointer to mic_device instance
+ *
+ * returns none.
+ */
+static void mic_ops_init(struct mic_device *mdev)
+{
+	switch (mdev->family) {
+	case MIC_FAMILY_X100:
+		mdev->ops = &mic_x100_ops;
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * mic_get_family - Determine hardware family to which this MIC belongs.
+ *
+ * @pdev: The pci device structure
+ *
+ * returns family.
+ */
+static enum mic_hw_family mic_get_family(struct pci_dev *pdev)
+{
+	enum mic_hw_family family;
+
+	switch (pdev->device) {
+	case MIC_X100_PCI_DEVICE_2250:
+	case MIC_X100_PCI_DEVICE_2251:
+	case MIC_X100_PCI_DEVICE_2252:
+	case MIC_X100_PCI_DEVICE_2253:
+	case MIC_X100_PCI_DEVICE_2254:
+	case MIC_X100_PCI_DEVICE_2255:
+	case MIC_X100_PCI_DEVICE_2256:
+	case MIC_X100_PCI_DEVICE_2257:
+	case MIC_X100_PCI_DEVICE_2258:
+	case MIC_X100_PCI_DEVICE_2259:
+	case MIC_X100_PCI_DEVICE_225a:
+	case MIC_X100_PCI_DEVICE_225b:
+	case MIC_X100_PCI_DEVICE_225c:
+	case MIC_X100_PCI_DEVICE_225d:
+	case MIC_X100_PCI_DEVICE_225e:
+		family = MIC_FAMILY_X100;
+		break;
+	default:
+		family = MIC_FAMILY_UNKNOWN;
+		break;
+	}
+	return family;
+}
+
+/**
+ * mic_device_init - Initializes the MIC device structure
+ *
+ * @mdev: pointer to mic_device instance
+ * @pdev: The pci device structure
+ *
+ * returns none.
+ */
+static void
+mic_device_init(struct mic_device *mdev, struct pci_dev *pdev)
+{
+	mdev->family = mic_get_family(pdev);
+	mdev->stepping = pdev->revision;
+	mic_ops_init(mdev);
+	mic_sysfs_init(mdev);
+}
+
+/**
+ * mic_probe - Device Initialization Routine
+ *
+ * @pdev: PCI device structure
+ * @ent: entry in mic_pci_tbl
+ *
+ * returns 0 on success, < 0 on failure.
+ */
+static int mic_probe(struct pci_dev *pdev,
+		const struct pci_device_id *ent)
+{
+	int rc;
+	struct mic_device *mdev;
+
+	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
+	if (!mdev) {
+		rc = -ENOMEM;
+		dev_err(&pdev->dev, "mdev kzalloc failed rc %d\n", rc);
+		goto mdev_alloc_fail;
+	}
+	mdev->id = ida_simple_get(&g_mic_ida, 0, MIC_MAX_NUM_DEVS, GFP_KERNEL);
+	if (mdev->id < 0) {
+		rc = mdev->id;
+		dev_err(&pdev->dev, "ida_simple_get failed rc %d\n", rc);
+		goto ida_fail;
+	}
+
+	mic_device_init(mdev, pdev);
+
+	rc = pci_enable_device(pdev);
+	if (rc) {
+		dev_err(&pdev->dev, "failed to enable pci device.\n");
+		goto ida_remove;
+	}
+
+	pci_set_master(pdev);
+
+	rc = pci_request_regions(pdev, mic_driver_name);
+	if (rc) {
+		dev_err(&pdev->dev, "failed to get pci regions.\n");
+		goto disable_device;
+	}
+
+	rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+	if (rc) {
+		dev_err(&pdev->dev, "Cannot set DMA mask\n");
+		goto release_regions;
+	}
+
+	mdev->mmio.pa = pci_resource_start(pdev, mdev->ops->mmio_bar);
+	mdev->mmio.len = pci_resource_len(pdev, mdev->ops->mmio_bar);
+	mdev->mmio.va = pci_ioremap_bar(pdev, mdev->ops->mmio_bar);
+	if (!mdev->mmio.va) {
+		dev_err(&pdev->dev, "Cannot remap MMIO BAR\n");
+		rc = -EIO;
+		goto release_regions;
+	}
+
+	mdev->aper.pa = pci_resource_start(pdev, mdev->ops->aper_bar);
+	mdev->aper.len = pci_resource_len(pdev, mdev->ops->aper_bar);
+	mdev->aper.va = ioremap_wc(mdev->aper.pa, mdev->aper.len);
+	if (!mdev->aper.va) {
+		dev_err(&pdev->dev, "Cannot remap Aperture BAR\n");
+		rc = -EIO;
+		goto unmap_mmio;
+	}
+
+	pci_set_drvdata(pdev, mdev);
+
+	mdev->sdev = device_create_with_groups(g_mic_class, &pdev->dev,
+		MKDEV(MAJOR(g_mic_devno), mdev->id), NULL,
+		mdev->attr_group, "mic%d", mdev->id);
+	if (IS_ERR(mdev->sdev)) {
+		rc = PTR_ERR(mdev->sdev);
+		dev_err(&pdev->dev,
+			"device_create_with_groups failed rc %d\n", rc);
+		goto unmap_aper;
+	}
+	return 0;
+unmap_aper:
+	iounmap(mdev->aper.va);
+unmap_mmio:
+	iounmap(mdev->mmio.va);
+release_regions:
+	pci_release_regions(pdev);
+disable_device:
+	pci_disable_device(pdev);
+ida_remove:
+	ida_simple_remove(&g_mic_ida, mdev->id);
+ida_fail:
+	kfree(mdev);
+mdev_alloc_fail:
+	dev_err(&pdev->dev, "Probe failed rc %d\n", rc);
+	return rc;
+}
+
+/**
+ * mic_remove - Device Removal Routine
+ * mic_remove is called by the PCI subsystem to alert the driver
+ * that it should release a PCI device.
+ *
+ * @pdev: PCI device structure
+ */
+static void mic_remove(struct pci_dev *pdev)
+{
+	struct mic_device *mdev;
+
+	mdev = pci_get_drvdata(pdev);
+	if (!mdev)
+		return;
+
+	device_destroy(g_mic_class, MKDEV(MAJOR(g_mic_devno), mdev->id));
+	iounmap(mdev->mmio.va);
+	iounmap(mdev->aper.va);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	ida_simple_remove(&g_mic_ida, mdev->id);
+	kfree(mdev);
+}
+static struct pci_driver mic_driver = {
+	.name = mic_driver_name,
+	.id_table = mic_pci_tbl,
+	.probe = mic_probe,
+	.remove = mic_remove
+};
+
+static int __init mic_init(void)
+{
+	int ret;
+
+	ret = alloc_chrdev_region(&g_mic_devno, 0,
+		MIC_MAX_NUM_DEVS, mic_driver_name);
+	if (ret) {
+		pr_err("alloc_chrdev_region failed ret %d\n", ret);
+		goto error;
+	}
+
+	g_mic_class = class_create(THIS_MODULE, mic_driver_name);
+	if (IS_ERR(g_mic_class)) {
+		ret = PTR_ERR(g_mic_class);
+		pr_err("class_create failed ret %d\n", ret);
+		goto cleanup_chrdev;
+	}
+
+	ida_init(&g_mic_ida);
+	ret = pci_register_driver(&mic_driver);
+	if (ret) {
+		pr_err("pci_register_driver failed ret %d\n", ret);
+		goto class_destroy;
+	}
+	return ret;
+class_destroy:
+	class_destroy(g_mic_class);
+cleanup_chrdev:
+	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
+error:
+	return ret;
+}
+
+static void __exit mic_exit(void)
+{
+	pci_unregister_driver(&mic_driver);
+	ida_destroy(&g_mic_ida);
+	class_destroy(g_mic_class);
+	unregister_chrdev_region(g_mic_devno, MIC_MAX_NUM_DEVS);
+}
+
+module_init(mic_init);
+module_exit(mic_exit);
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Intel(R) MIC X100 Host driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/misc/mic/host/mic_sysfs.c b/drivers/misc/mic/host/mic_sysfs.c
new file mode 100644
index 0000000..972c182
--- /dev/null
+++ b/drivers/misc/mic/host/mic_sysfs.c
@@ -0,0 +1,97 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/pci.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+
+static ssize_t
+mic_show_family(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	static const char x100[] = "x100";
+	static const char unknown[] = "Unknown";
+	const char *card = NULL;
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+
+	if (!mdev)
+		return -EINVAL;
+
+	switch (mdev->family) {
+	case MIC_FAMILY_X100:
+		card = x100;
+		break;
+	default:
+		card = unknown;
+		break;
+	}
+	return scnprintf(buf, PAGE_SIZE, "%s\n", card);
+}
+static DEVICE_ATTR(family, S_IRUGO, mic_show_family, NULL);
+
+static ssize_t
+mic_show_stepping(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct mic_device *mdev = dev_get_drvdata(dev->parent);
+	char *string = "??";
+
+	if (!mdev)
+		return -EINVAL;
+
+	switch (mdev->stepping) {
+	case MIC_A0_STEP:
+		string = "A0";
+		break;
+	case MIC_B0_STEP:
+		string = "B0";
+		break;
+	case MIC_B1_STEP:
+		string = "B1";
+		break;
+	case MIC_C0_STEP:
+		string = "C0";
+		break;
+	default:
+		break;
+	}
+	return scnprintf(buf, PAGE_SIZE, "%s\n", string);
+}
+static DEVICE_ATTR(stepping, S_IRUGO, mic_show_stepping, NULL);
+
+static struct attribute *mic_default_attrs[] = {
+	&dev_attr_family.attr,
+	&dev_attr_stepping.attr,
+
+	NULL
+};
+
+static struct attribute_group mic_attr_group = {
+	.attrs = mic_default_attrs,
+};
+
+static const struct attribute_group *__mic_attr_group[] = {
+	&mic_attr_group,
+	NULL
+};
+
+void mic_sysfs_init(struct mic_device *mdev)
+{
+	mdev->attr_group = __mic_attr_group;
+}
diff --git a/drivers/misc/mic/host/mic_x100.c b/drivers/misc/mic/host/mic_x100.c
new file mode 100644
index 0000000..da481b1
--- /dev/null
+++ b/drivers/misc/mic/host/mic_x100.c
@@ -0,0 +1,75 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#include <linux/fs.h>
+#include <linux/pci.h>
+
+#include "../common/mic_device.h"
+#include "mic_device.h"
+#include "mic_x100.h"
+
+/**
+ * mic_x100_write_spad - write to the scratchpad register
+ * @mdev: pointer to mic_device instance
+ * @idx: index to the scratchpad register, 0 based
+ * @val: the data value to put into the register
+ *
+ * This function allows writing of a 32bit value to the indexed scratchpad
+ * register.
+ *
+ * RETURNS: none.
+ */
+static void
+mic_x100_write_spad(struct mic_device *mdev, unsigned int idx, u32 val)
+{
+	dev_dbg(mdev->sdev->parent, "Writing 0x%x to scratch pad index %d\n",
+		val, idx);
+	mic_mmio_write(&mdev->mmio, val,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_SPAD0 + idx * 4);
+}
+
+/**
+ * mic_x100_read_spad - read from the scratchpad register
+ * @mdev: pointer to mic_device instance
+ * @idx: index to scratchpad register, 0 based
+ *
+ * This function allows reading of the 32bit scratchpad register.
+ *
+ * RETURNS: The 32bit value read from the scratchpad register.
+ */
+static u32
+mic_x100_read_spad(struct mic_device *mdev, unsigned int idx)
+{
+	u32 val = mic_mmio_read(&mdev->mmio,
+		MIC_X100_SBOX_BASE_ADDRESS +
+		MIC_X100_SBOX_SPAD0 + idx * 4);
+
+	dev_dbg(mdev->sdev->parent,
+		"Reading 0x%x from scratch pad index %d\n", val, idx);
+	return val;
+}
+
+struct mic_hw_ops mic_x100_ops = {
+	.aper_bar = MIC_X100_APER_BAR,
+	.mmio_bar = MIC_X100_MMIO_BAR,
+	.read_spad = mic_x100_read_spad,
+	.write_spad = mic_x100_write_spad,
+};
diff --git a/drivers/misc/mic/host/mic_x100.h b/drivers/misc/mic/host/mic_x100.h
new file mode 100644
index 0000000..1f4e630
--- /dev/null
+++ b/drivers/misc/mic/host/mic_x100.h
@@ -0,0 +1,47 @@
+/*
+ * Intel MIC Platform Software Stack (MPSS)
+ *
+ * Copyright(c) 2013 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ * Intel MIC Host driver.
+ *
+ */
+#ifndef _MIC_X100_HW_H_
+#define _MIC_X100_HW_H_
+
+#define MIC_X100_PCI_DEVICE_2250 0x2250
+#define MIC_X100_PCI_DEVICE_2251 0x2251
+#define MIC_X100_PCI_DEVICE_2252 0x2252
+#define MIC_X100_PCI_DEVICE_2253 0x2253
+#define MIC_X100_PCI_DEVICE_2254 0x2254
+#define MIC_X100_PCI_DEVICE_2255 0x2255
+#define MIC_X100_PCI_DEVICE_2256 0x2256
+#define MIC_X100_PCI_DEVICE_2257 0x2257
+#define MIC_X100_PCI_DEVICE_2258 0x2258
+#define MIC_X100_PCI_DEVICE_2259 0x2259
+#define MIC_X100_PCI_DEVICE_225a 0x225a
+#define MIC_X100_PCI_DEVICE_225b 0x225b
+#define MIC_X100_PCI_DEVICE_225c 0x225c
+#define MIC_X100_PCI_DEVICE_225d 0x225d
+#define MIC_X100_PCI_DEVICE_225e 0x225e
+
+#define MIC_X100_APER_BAR 0
+#define MIC_X100_MMIO_BAR 4
+
+#define MIC_X100_SBOX_BASE_ADDRESS 0x00010000
+#define MIC_X100_SBOX_SPAD0 0x0000AB20
+extern struct mic_hw_ops mic_x100_ops;
+
+#endif
-- 
1.8.2.1



* [PATCH RESEND v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
@ 2013-09-05 23:41 60% Sudeep Dutt
  2013-09-05 23:41 44% ` [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family Sudeep Dutt
                   ` (7 more replies)
  0 siblings, 8 replies; 200+ results
From: Sudeep Dutt @ 2013-09-05 23:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arnd Bergmann, Rusty Russell,
	Michael S. Tsirkin, Rob Landley, linux-kernel, virtualization,
	linux-doc, Asias He
  Cc: Nikhil Rao, Ashutosh Dixit, Caz Yokoyama,
	Dasaratharaman Chandramouli, Harshavardhan R Kharche,
	Yaozu (Eddie) Dong, Peter P Waskiewicz Jr, Sudeep Dutt

ChangeLog:
=========

v2 => v3:
a) Patch 1 data structure cleanups, header file include cleanups,
   IDA interface reuse and switching to device_create_with_groups(..)
   as per feedback from Greg Kroah-Hartman.
b) Patch 7 signal documentation, sleep workaround removal and sysfs
   access API cleanups as per feedback from Michael S. Tsirkin.

v1 => v2: @ http://lwn.net/Articles/563131/
a) License wording cleanup, sysfs ABI documentation, patch 1 refactoring
   into 3 smaller patches and function renames, as per feedback from
   Greg Kroah-Hartman.
b) Use VRINGH infrastructure for accessing virtio rings from the host
   in patch 5, as per feedback from Michael S. Tsirkin.

v1: Initial post @ https://lwn.net/Articles/561314/

Description:
============

An Intel MIC X100 device is a PCIe form factor add-in coprocessor
card based on the Intel Many Integrated Core (MIC) architecture
that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
implements the three required standard address spaces i.e. configuration,
memory and I/O. The host OS loads a device driver as is typical for
PCIe devices. The card itself runs a bootstrap after reset that
transfers control to the card OS downloaded from the host driver.
The card OS as shipped by Intel is a Linux kernel with modifications
for the X100 devices.

Since it is a PCIe card, it does not have the ability to host hardware
devices for networking, storage and console. We provide these devices
on X100 coprocessors thus enabling a self-bootable equivalent environment
for applications. A key benefit of our solution is that it leverages
the standard virtio framework for network, disk and console devices,
though in our case the virtio framework is used across a PCIe bus.

Here is a block diagram of the various components described above. The
virtio backends are situated on the host rather than the card for three
reasons: the host has better single threaded performance than MIC, the
host can initiate DMAs to/from the card using the MIC DMA engine, and
the virtio block storage backend can only be on the host.

       +----------+           |             +----------+
       | Card OS  |           |             | Host OS  |
       +----------+           |             +----------+
                              |
+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
| Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
| Net   | |Console | |Block | | |Net      |  |Console | |Block   |
| Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
+-------+ +--------+ +------+ | +---------+  +--------+ +--------+
    |         |         |     |      |            |         |
    |         |         |     |User  |            |         |
    |         |         |     |------|------------|---------|-------
    +-------------------+     |Kernel +--------------------------+
              |               |       | Virtio over PCIe IOCTLs  |
              |               |       +--------------------------+
      +--------------+        |                   |
      |Intel MIC     |        |            +---------------+
      |Card Driver   |        |            |Intel MIC      |
      +--------------+        |            |Host Driver    |
              |               |            +---------------+
              |               |                   |
     +-------------------------------------------------------------+
     |                                                             |
     |                    PCIe Bus                                 |
     +-------------------------------------------------------------+

The following series of patches are partitioned as follows:

Patch 1: This patch introduces the "Intel MIC Host Driver" in the block
diagram which does the following:
a) Initializes the Intel MIC X100 PCIe devices.
b) Provides sysfs entries for family and stepping information.
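
For reviewers who want to see what the two sysfs attributes report
without reading the diff, here is a minimal user space sketch of the
decoding. The ID range and stepping values are copied from mic_x100.h
and mic_device.h in patch 1. The helper names mic_family_name() and
mic_stepping_name() are hypothetical, and the range check is a
shorthand for the explicit case list in mic_get_family().

```c
/* Values copied from patch 1 (mic_x100.h and mic_device.h). */
#define MIC_X100_PCI_DEVICE_FIRST 0x2250
#define MIC_X100_PCI_DEVICE_LAST  0x225e

enum mic_stepping {
	MIC_A0_STEP = 0x0,
	MIC_B0_STEP = 0x10,
	MIC_B1_STEP = 0x11,
	MIC_C0_STEP = 0x20,
};

/*
 * Hypothetical helper: text reported by the "family" attribute for a
 * given PCI device ID. The 15 X100 IDs in patch 1 are contiguous, so a
 * range check stands in for the driver's switch statement.
 */
const char *mic_family_name(unsigned int pci_device_id)
{
	if (pci_device_id >= MIC_X100_PCI_DEVICE_FIRST &&
	    pci_device_id <= MIC_X100_PCI_DEVICE_LAST)
		return "x100";
	return "Unknown";
}

/*
 * Hypothetical helper: text reported by the "stepping" attribute for a
 * given PCI revision ID. Unrecognized revisions map to "??", as in
 * mic_show_stepping().
 */
const char *mic_stepping_name(unsigned int pci_revision)
{
	switch (pci_revision) {
	case MIC_A0_STEP:
		return "A0";
	case MIC_B0_STEP:
		return "B0";
	case MIC_B1_STEP:
		return "B1";
	case MIC_C0_STEP:
		return "C0";
	default:
		return "??";
	}
}
```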

Patch 2: This patch enables the following features in the
"Intel MIC Host Driver" in the block diagram:
a) MSI-X, MSI and legacy interrupt support.
b) System Memory Page Table(SMPT) support. SMPT enables system memory
   access from the card. On X100 devices the host can program 32 SMPT
   registers each capable of accessing 16GB of system memory
   address space from X100 devices. The registers can thereby be used
   to access a cumulative 512GB of system memory address space from
   X100 devices at any point in time.
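
The SMPT arithmetic above can be checked with a short user space
sketch. It only illustrates how 32 windows of 16GB add up to 512GB and
how an offset into that range splits into a window index and an offset
within the window. It assumes GB means 2^30 bytes. The names
smpt_slot() and smpt_slot_offset() are hypothetical, and this is not
how the driver in patch 2 necessarily assigns SMPT entries.

```c
#include <stdint.h>

#define SMPT_NUM_REGS   32u                       /* from the text above */
#define SMPT_REG_SHIFT  34u                       /* 16GB == 1 << 34 bytes */
#define SMPT_REG_SIZE   (UINT64_C(1) << SMPT_REG_SHIFT)
#define SMPT_TOTAL_SIZE (SMPT_NUM_REGS * SMPT_REG_SIZE)   /* 512GB */

/*
 * Hypothetical helper: index of the 16GB window that covers the given
 * offset into the 512GB range, or -1 if the offset is out of range.
 */
int smpt_slot(uint64_t offset)
{
	if (offset >= SMPT_TOTAL_SIZE)
		return -1;
	return (int)(offset >> SMPT_REG_SHIFT);
}

/* Hypothetical helper: offset within the covering 16GB window. */
uint64_t smpt_slot_offset(uint64_t offset)
{
	return offset & (SMPT_REG_SIZE - 1);
}
```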

Patch 3: This patch enables the following features in the
"Intel MIC Host Driver" in the block diagram:
a) Boots and shuts down the card via sysfs entries.
b) Allocates and maps a device page for communication with the
   card driver and updates the device page address via scratchpad
   registers.
c) Provides sysfs entries for shutdown status, kernel command line,
   firmware, ramdisk, bootmode and log buffer information.
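
As a rough illustration of the scratchpad addressing, the sketch below
computes the MMIO offset of a scratchpad register the same way
mic_x100_read_spad() and mic_x100_write_spad() do in patch 1. The two
constants are copied from mic_x100.h. The helper name
mic_x100_spad_offset() is hypothetical. Which index carries the device
page address, and how many scratchpad registers exist, is not shown
here.

```c
#include <stdint.h>

/* Constants copied from drivers/misc/mic/host/mic_x100.h in patch 1. */
#define MIC_X100_SBOX_BASE_ADDRESS 0x00010000
#define MIC_X100_SBOX_SPAD0        0x0000AB20

/*
 * Hypothetical helper: byte offset, from the start of the MMIO BAR, of
 * the 32bit scratchpad register with the given 0 based index.
 */
uint32_t mic_x100_spad_offset(unsigned int idx)
{
	return MIC_X100_SBOX_BASE_ADDRESS + MIC_X100_SBOX_SPAD0 + idx * 4;
}
```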

Patch 4: This patch introduces the "Intel MIC Card Driver" in the block
diagram which does the following:
a) Initializes the Intel MIC X100 platform device and driver.
b) Sets up support to handle shutdown requests from the host.
c) Maps the device page after obtaining the device page address
   from the scratchpad registers updated by the host.
d) Informs the host upon a card crash by registering a panic notifier.
e) Informs the host upon a poweroff/halt event.

Patch 5: This patch introduces the host "Virtio over PCIe" interface for
Intel MIC. It allows creating user space backends on the host and instantiating
virtio devices for them on the Intel MIC card. It uses the existing VRINGH
infrastructure in the kernel to access virtio rings from the host. A character
device per MIC is exposed with IOCTL, mmap and poll callbacks. This allows the
user space backend to:
a) add/remove a virtio device via a device page.
b) map (R/O) virtio rings and device page to user space.
c) poll for availability of data.
d) copy a descriptor or entire descriptor chain to/from the card.
e) modify virtio configuration.
f) handle virtio device reset.
The buffers are copied over using CPU copies for this initial patch
and host initiated MIC DMA support is planned for future patches.
The avail and desc virtio rings are in host memory and the used ring
is in card memory, so that ring updates cross PCIe as writes rather
than reads, for performance.
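
The "poll for availability of data" step can be sketched with plain
poll(2). This is only an assumption about how a user space backend
might wait on the per MIC character device: the helper name
wait_for_data() is hypothetical, POLLIN is assumed to be the event of
interest, and any readable file descriptor (for example a pipe) can
stand in for the device when trying it out.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait until fd has data to read.
 * Returns 1 if readable, 0 on timeout, -1 on error or on an event
 * other than POLLIN (for example POLLERR or POLLHUP).
 */
int wait_for_data(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc < 0)
		return -1;
	if (rc == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```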

Patch 6: This patch introduces the card "Virtio over PCIe" interface for
Intel MIC. It allows virtio drivers on the card to communicate with their
user space backends on the host via a device page. Ring 3 apps on the host
can add, remove and configure virtio devices. A thin MIC specific
virtio_config_ops is implemented which is borrowed heavily from previous
similar implementations in lguest and s390 @
drivers/lguest/lguest_device.c
drivers/s390/kvm/kvm_virtio.c

Patch 7: This patch introduces a sample user space daemon which
implements the virtio device backends on the host. The daemon
creates/removes/configures virtio device backends by communicating with
the Intel MIC Host Driver. The virtio devices currently supported are
virtio net, virtio console and virtio block. Virtio net supports TSO/GSO.
The daemon also monitors card shutdown status and takes appropriate actions
like killing the virtio backends and resetting the card upon card shutdown
and crashes.
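
A user space tool can read the per device sysfs attributes with
ordinary file I/O. The sketch below is not taken from the daemon in
patch 7. It is a minimal example, with a hypothetical helper name
mic_read_attr(), of reading one attribute such as
/sys/class/mic/mic0/family and stripping the trailing newline.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the attribute file <dev_dir>/<attr> into
 * buf and strip the trailing newline.
 * Returns 0 on success, -1 on failure.
 */
int mic_read_attr(const char *dev_dir, const char *attr,
		  char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int n;

	if (!buf || len < 2)
		return -1;

	n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* sysfs show() output ends in '\n'; drop it for the caller. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

For example, mic_read_attr("/sys/class/mic/mic0", "family", buf,
sizeof(buf)) would be expected to leave "x100" in buf on a host with an
X100 card and the driver loaded.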

The patches have been compiled/validated against v3.11.

Ashutosh Dixit (2):
  Intel MIC Host Driver Changes for Virtio Devices.
  Intel MIC Card Driver Changes for Virtio Devices.

Caz Yokoyama (1):
  Sample Implementation of Intel MIC User Space Daemon.

Dasaratharaman Chandramouli (1):
  Intel MIC Host Driver Interrupt/SMPT support.

Sudeep Dutt (3):
  Intel MIC Host Driver for X100 family.
  Intel MIC Host Driver, card OS state management.
  Intel MIC Card Driver for X100 family.

 Documentation/ABI/testing/sysfs-class-mic.txt |  147 +++
 Documentation/mic/mic_overview.txt            |   49 +
 Documentation/mic/mpssd/.gitignore            |    1 +
 Documentation/mic/mpssd/Makefile              |   19 +
 Documentation/mic/mpssd/micctrl               |  173 +++
 Documentation/mic/mpssd/mpss                  |  202 +++
 Documentation/mic/mpssd/mpssd.c               | 1701 +++++++++++++++++++++++++
 Documentation/mic/mpssd/mpssd.h               |  100 ++
 Documentation/mic/mpssd/sysfs.c               |  102 ++
 drivers/misc/Kconfig                          |    1 +
 drivers/misc/Makefile                         |    1 +
 drivers/misc/mic/Kconfig                      |   39 +
 drivers/misc/mic/Makefile                     |    6 +
 drivers/misc/mic/card/Makefile                |   11 +
 drivers/misc/mic/card/mic_debugfs.c           |  130 ++
 drivers/misc/mic/card/mic_device.c            |  306 +++++
 drivers/misc/mic/card/mic_device.h            |  133 ++
 drivers/misc/mic/card/mic_virtio.c            |  631 +++++++++
 drivers/misc/mic/card/mic_virtio.h            |   77 ++
 drivers/misc/mic/card/mic_x100.c              |  256 ++++
 drivers/misc/mic/card/mic_x100.h              |   48 +
 drivers/misc/mic/common/mic_device.h          |   51 +
 drivers/misc/mic/host/Makefile                |   14 +
 drivers/misc/mic/host/mic_boot.c              |  185 +++
 drivers/misc/mic/host/mic_debugfs.c           |  495 +++++++
 drivers/misc/mic/host/mic_device.h            |  195 +++
 drivers/misc/mic/host/mic_fops.c              |  221 ++++
 drivers/misc/mic/host/mic_fops.h              |   32 +
 drivers/misc/mic/host/mic_intr.c              |  628 +++++++++
 drivers/misc/mic/host/mic_intr.h              |  137 ++
 drivers/misc/mic/host/mic_main.c              |  478 +++++++
 drivers/misc/mic/host/mic_smpt.c              |  442 +++++++
 drivers/misc/mic/host/mic_smpt.h              |   98 ++
 drivers/misc/mic/host/mic_sysfs.c             |  466 +++++++
 drivers/misc/mic/host/mic_virtio.c            |  703 ++++++++++
 drivers/misc/mic/host/mic_virtio.h            |  138 ++
 drivers/misc/mic/host/mic_x100.c              |  573 +++++++++
 drivers/misc/mic/host/mic_x100.h              |   98 ++
 include/uapi/linux/Kbuild                     |    2 +
 include/uapi/linux/mic_common.h               |  238 ++++
 include/uapi/linux/mic_ioctl.h                |   74 ++
 41 files changed, 9401 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-mic.txt
 create mode 100644 Documentation/mic/mic_overview.txt
 create mode 100644 Documentation/mic/mpssd/.gitignore
 create mode 100644 Documentation/mic/mpssd/Makefile
 create mode 100755 Documentation/mic/mpssd/micctrl
 create mode 100755 Documentation/mic/mpssd/mpss
 create mode 100644 Documentation/mic/mpssd/mpssd.c
 create mode 100644 Documentation/mic/mpssd/mpssd.h
 create mode 100644 Documentation/mic/mpssd/sysfs.c
 create mode 100644 drivers/misc/mic/Kconfig
 create mode 100644 drivers/misc/mic/Makefile
 create mode 100644 drivers/misc/mic/card/Makefile
 create mode 100644 drivers/misc/mic/card/mic_debugfs.c
 create mode 100644 drivers/misc/mic/card/mic_device.c
 create mode 100644 drivers/misc/mic/card/mic_device.h
 create mode 100644 drivers/misc/mic/card/mic_virtio.c
 create mode 100644 drivers/misc/mic/card/mic_virtio.h
 create mode 100644 drivers/misc/mic/card/mic_x100.c
 create mode 100644 drivers/misc/mic/card/mic_x100.h
 create mode 100644 drivers/misc/mic/common/mic_device.h
 create mode 100644 drivers/misc/mic/host/Makefile
 create mode 100644 drivers/misc/mic/host/mic_boot.c
 create mode 100644 drivers/misc/mic/host/mic_debugfs.c
 create mode 100644 drivers/misc/mic/host/mic_device.h
 create mode 100644 drivers/misc/mic/host/mic_fops.c
 create mode 100644 drivers/misc/mic/host/mic_fops.h
 create mode 100644 drivers/misc/mic/host/mic_intr.c
 create mode 100644 drivers/misc/mic/host/mic_intr.h
 create mode 100644 drivers/misc/mic/host/mic_main.c
 create mode 100644 drivers/misc/mic/host/mic_smpt.c
 create mode 100644 drivers/misc/mic/host/mic_smpt.h
 create mode 100644 drivers/misc/mic/host/mic_sysfs.c
 create mode 100644 drivers/misc/mic/host/mic_virtio.c
 create mode 100644 drivers/misc/mic/host/mic_virtio.h
 create mode 100644 drivers/misc/mic/host/mic_x100.c
 create mode 100644 drivers/misc/mic/host/mic_x100.h
 create mode 100644 include/uapi/linux/mic_common.h
 create mode 100644 include/uapi/linux/mic_ioctl.h

-- 
1.8.2.1



* Re: [PATCH v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
    2013-08-21 17:55 99% ` Sudeep Dutt
@ 2013-09-04 16:18 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-09-04 16:18 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr, Sudeep Dutt

On Wed, 2013-08-21 at 10:51 -0700, Sudeep Dutt wrote:
> ChangeLog:
> =========
> 
> v2 => v3:
> a) Patch 1 data structure cleanups, header file include cleanups,
>    IDA interface reuse and switching to device_create_with_groups(..)
>    as per feedback from Greg Kroah-Hartman.
> b) Patch 7 signal documentation, sleep workaround removal and sysfs
>    access API cleanups as per feedback from Michael S. Tsirkin.
> 
> v1 => v2: @ http://lwn.net/Articles/563131/
> a) License wording cleanup, sysfs ABI documentation, patch 1 refactoring
>    into 3 smaller patches and function renames, as per feedback from
>    Greg Kroah-Hartman.
> b) Use VRINGH infrastructure for accessing virtio rings from the host
>    in patch 5, as per feedback from Michael S. Tsirkin.
> 
> v1: Initial post @ https://lwn.net/Articles/561314/
> 

Hi Greg,

It has been a couple of weeks since we posted v3 @
https://lwn.net/Articles/564795 which addresses all feedback received so
far. Have you had a chance to review this version? Please let us know
your thoughts.

Thanks in advance,
Sudeep Dutt

> Description:
> ============
> 
> An Intel MIC X100 device is a PCIe form factor add-in coprocessor
> card based on the Intel Many Integrated Core (MIC) architecture
> that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
> implements the three required standard address spaces i.e. configuration,
> memory and I/O. The host OS loads a device driver as is typical for
> PCIe devices. The card itself runs a bootstrap after reset that
> transfers control to the card OS downloaded from the host driver.
> The card OS as shipped by Intel is a Linux kernel with modifications
> for the X100 devices.
> 
> Since it is a PCIe card, it does not have the ability to host hardware
> devices for networking, storage and console. We provide these devices
> on X100 coprocessors thus enabling a self-bootable equivalent environment
> for applications. A key benefit of our solution is that it leverages
> the standard virtio framework for network, disk and console devices,
> though in our case the virtio framework is used across a PCIe bus.
> 
> Here is a block diagram of the various components described above. The
> virtio backends are situated on the host rather than the card given better
> single threaded performance for the host compared to MIC, the ability of
> the host to initiate DMAs to/from the card using the MIC DMA engine and
> the fact that the virtio block storage backend can only be on the host.
> 
>        +----------+           |             +----------+
>        | Card OS  |           |             | Host OS  |
>        +----------+           |             +----------+
>                               |
> +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
> | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
> | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
> +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
>     |         |         |     |      |            |         |
>     |         |         |     |User  |            |         |
>     |         |         |     |------|------------|---------|-------
>     +-------------------+     |Kernel +--------------------------+
>               |               |       | Virtio over PCIe IOCTLs  |
>               |               |       +--------------------------+
>       +--------------+        |                   |
>       |Intel MIC     |        |            +---------------+
>       |Card Driver   |        |            |Intel MIC      |
>       +--------------+        |            |Host Driver    |
>               |               |            +---------------+
>               |               |                   |
>      +-------------------------------------------------------------+
>      |                                                             |
>      |                    PCIe Bus                                 |
>      +-------------------------------------------------------------+
> 
> The following series of patches are partitioned as follows:
> 
> Patch 1: This patch introduces the "Intel MIC Host Driver" in the block
> diagram which does the following:
> a) Initializes the Intel MIC X100 PCIe devices.
> b) Provides sysfs entries for family and stepping information.
> 
> Patch 2: This patch enables the following features in the
> "Intel MIC Host Driver" in the block diagram:
> a) MSI-X, MSI and legacy interrupt support.
> b) System Memory Page Table (SMPT) support. SMPT enables system memory
>    access from the card. On X100 devices the host can program 32 SMPT
>    registers each capable of accessing 16GB of system memory
>    address space from X100 devices. The registers can thereby be used
>    to access a cumulative 512GB of system memory address space from
>    X100 devices at any point in time.
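
The SMPT arithmetic quoted above can be checked with a short sketch: 32 registers, each covering 16GB (2^34 bytes), give a 2^39 byte (512GB) window. The macro and function names below are hypothetical and are not taken from mic_smpt.c, and the index calculation assumes the window is split into equal, contiguous 16GB slots.

```c
#include <stdint.h>

#define SMPT_NUM_REGS  32u  /* number of SMPT registers on X100 */
#define SMPT_REG_SHIFT 34u  /* 16GB = 2^34 bytes per register */

/* Total system memory address space reachable through all registers. */
static uint64_t smpt_window_bytes(void)
{
	return (uint64_t)SMPT_NUM_REGS << SMPT_REG_SHIFT;
}

/* Assumed mapping: which register covers a given offset in the window. */
static unsigned int smpt_index(uint64_t offset)
{
	return (unsigned int)(offset >> SMPT_REG_SHIFT);
}
```

Under these assumptions smpt_window_bytes() returns 549755813888 (512GB), and an offset of exactly 16GB falls in register 1.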
> 
> Patch 3: This patch enables the following features in the
> "Intel MIC Host Driver" in the block diagram:
> a) Boots and shuts down the card via sysfs entries.
> b) Allocates and maps a device page for communication with the
>    card driver and updates the device page address via scratchpad
>    registers.
> c) Provides sysfs entries for shutdown status, kernel command line,
>    firmware, ramdisk, bootmode and log buffer information.
> 
> Patch 4: This patch introduces the "Intel MIC Card Driver" in the block
> diagram which does the following:
> a) Initializes the Intel MIC X100 platform device and driver.
> b) Sets up support to handle shutdown requests from the host.
> c) Maps the device page after obtaining the device page address
>    from the scratchpad registers updated by the host.
> d) Informs the host upon a card crash by registering a panic notifier.
> e) Informs the host upon a poweroff/halt event.
> 
> Patch 5: This patch introduces the host "Virtio over PCIe" interface for
> Intel MIC. It allows creating user space backends on the host and instantiating
> virtio devices for them on the Intel MIC card. It uses the existing VRINGH
> infrastructure in the kernel to access virtio rings from the host. A character
> device per MIC is exposed with IOCTL, mmap and poll callbacks. This allows the
> user space backend to:
> (a) add/remove a virtio device via a device page.
> (b) map (R/O) virtio rings and device page to user space.
> (c) poll for availability of data.
> (d) copy a descriptor or entire descriptor chain to/from the card.
> (e) modify virtio configuration.
> (f) handle virtio device reset.
> The buffers are copied over using CPU copies for this initial patch
> and host initiated MIC DMA support is planned for future patches.
> The avail and desc virtio rings are in host memory and the used ring
> is in card memory to maximize writes across PCIe for performance.
> 
> Patch 6: This patch introduces the card "Virtio over PCIe" interface for
> Intel MIC. It allows virtio drivers on the card to communicate with their
> user space backends on the host via a device page. Ring 3 apps on the host
> can add, remove and configure virtio devices. A thin MIC specific
> virtio_config_ops is implemented, borrowing heavily from previous
> similar implementations in lguest and s390 @
> drivers/lguest/lguest_device.c
> drivers/s390/kvm/kvm_virtio.c
> 
> Patch 7: This patch introduces a sample user space daemon which
> implements the virtio device backends on the host. The daemon
> creates/removes/configures virtio device backends by communicating with
> the Intel MIC Host Driver. The virtio devices currently supported are
> virtio net, virtio console and virtio block. Virtio net supports TSO/GSO.
> The daemon also monitors card shutdown status and takes appropriate actions
> like killing the virtio backends and resetting the card upon card shutdown
> and crashes.
> 
> The patches have been compiled/validated against v3.11-rc5.
> 
> Ashutosh Dixit (2):
>   Intel MIC Host Driver Changes for Virtio Devices.
>   Intel MIC Card Driver Changes for Virtio Devices.
> 
> Caz Yokoyama (1):
>   Sample Implementation of Intel MIC User Space Daemon.
> 
> Dasaratharaman Chandramouli (1):
>   Intel MIC Host Driver Interrupt/SMPT support.
> 
> Sudeep Dutt (3):
>   Intel MIC Host Driver for X100 family.
>   Intel MIC Host Driver, card OS state management.
>   Intel MIC Card Driver for X100 family.
> 
>  Documentation/ABI/testing/sysfs-class-mic.txt |  147 +++
>  Documentation/mic/mic_overview.txt            |   49 +
>  Documentation/mic/mpssd/.gitignore            |    1 +
>  Documentation/mic/mpssd/Makefile              |   19 +
>  Documentation/mic/mpssd/micctrl               |  173 +++
>  Documentation/mic/mpssd/mpss                  |  202 +++
>  Documentation/mic/mpssd/mpssd.c               | 1701 +++++++++++++++++++++++++
>  Documentation/mic/mpssd/mpssd.h               |  100 ++
>  Documentation/mic/mpssd/sysfs.c               |  102 ++
>  drivers/misc/Kconfig                          |    1 +
>  drivers/misc/Makefile                         |    1 +
>  drivers/misc/mic/Kconfig                      |   39 +
>  drivers/misc/mic/Makefile                     |    6 +
>  drivers/misc/mic/card/Makefile                |   11 +
>  drivers/misc/mic/card/mic_debugfs.c           |  130 ++
>  drivers/misc/mic/card/mic_device.c            |  306 +++++
>  drivers/misc/mic/card/mic_device.h            |  133 ++
>  drivers/misc/mic/card/mic_virtio.c            |  631 +++++++++
>  drivers/misc/mic/card/mic_virtio.h            |   77 ++
>  drivers/misc/mic/card/mic_x100.c              |  256 ++++
>  drivers/misc/mic/card/mic_x100.h              |   48 +
>  drivers/misc/mic/common/mic_device.h          |   51 +
>  drivers/misc/mic/host/Makefile                |   13 +
>  drivers/misc/mic/host/mic_boot.c              |  185 +++
>  drivers/misc/mic/host/mic_debugfs.c           |  496 +++++++
>  drivers/misc/mic/host/mic_device.h            |  300 +++++
>  drivers/misc/mic/host/mic_fops.c              |  221 ++++
>  drivers/misc/mic/host/mic_fops.h              |   32 +
>  drivers/misc/mic/host/mic_main.c              | 1098 ++++++++++++++++
>  drivers/misc/mic/host/mic_smpt.c              |  440 +++++++
>  drivers/misc/mic/host/mic_smpt.h              |   98 ++
>  drivers/misc/mic/host/mic_sysfs.c             |  468 +++++++
>  drivers/misc/mic/host/mic_virtio.c            |  704 ++++++++++
>  drivers/misc/mic/host/mic_virtio.h            |  138 ++
>  drivers/misc/mic/host/mic_x100.c              |  573 +++++++++
>  drivers/misc/mic/host/mic_x100.h              |   99 ++
>  include/uapi/linux/Kbuild                     |    2 +
>  include/uapi/linux/mic_common.h               |  238 ++++
>  include/uapi/linux/mic_ioctl.h                |   74 ++
>  39 files changed, 9363 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-mic.txt
>  create mode 100644 Documentation/mic/mic_overview.txt
>  create mode 100644 Documentation/mic/mpssd/.gitignore
>  create mode 100644 Documentation/mic/mpssd/Makefile
>  create mode 100755 Documentation/mic/mpssd/micctrl
>  create mode 100755 Documentation/mic/mpssd/mpss
>  create mode 100644 Documentation/mic/mpssd/mpssd.c
>  create mode 100644 Documentation/mic/mpssd/mpssd.h
>  create mode 100644 Documentation/mic/mpssd/sysfs.c
>  create mode 100644 drivers/misc/mic/Kconfig
>  create mode 100644 drivers/misc/mic/Makefile
>  create mode 100644 drivers/misc/mic/card/Makefile
>  create mode 100644 drivers/misc/mic/card/mic_debugfs.c
>  create mode 100644 drivers/misc/mic/card/mic_device.c
>  create mode 100644 drivers/misc/mic/card/mic_device.h
>  create mode 100644 drivers/misc/mic/card/mic_virtio.c
>  create mode 100644 drivers/misc/mic/card/mic_virtio.h
>  create mode 100644 drivers/misc/mic/card/mic_x100.c
>  create mode 100644 drivers/misc/mic/card/mic_x100.h
>  create mode 100644 drivers/misc/mic/common/mic_device.h
>  create mode 100644 drivers/misc/mic/host/Makefile
>  create mode 100644 drivers/misc/mic/host/mic_boot.c
>  create mode 100644 drivers/misc/mic/host/mic_debugfs.c
>  create mode 100644 drivers/misc/mic/host/mic_device.h
>  create mode 100644 drivers/misc/mic/host/mic_fops.c
>  create mode 100644 drivers/misc/mic/host/mic_fops.h
>  create mode 100644 drivers/misc/mic/host/mic_main.c
>  create mode 100644 drivers/misc/mic/host/mic_smpt.c
>  create mode 100644 drivers/misc/mic/host/mic_smpt.h
>  create mode 100644 drivers/misc/mic/host/mic_sysfs.c
>  create mode 100644 drivers/misc/mic/host/mic_virtio.c
>  create mode 100644 drivers/misc/mic/host/mic_virtio.h
>  create mode 100644 drivers/misc/mic/host/mic_x100.c
>  create mode 100644 drivers/misc/mic/host/mic_x100.h
>  create mode 100644 include/uapi/linux/mic_common.h
>  create mode 100644 include/uapi/linux/mic_ioctl.h
> 




* Re: [PATCH v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors.
  @ 2013-08-21 17:55 99% ` Sudeep Dutt
  2013-09-04 16:18 99% ` Sudeep Dutt
  1 sibling, 0 replies; 200+ results
From: Sudeep Dutt @ 2013-08-21 17:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Rusty Russell, Michael S. Tsirkin, Rob Landley,
	linux-kernel, virtualization, linux-doc, Asias He, Nikhil Rao,
	Ashutosh Dixit, Caz Yokoyama, Dasaratharaman Chandramouli,
	Harshavardhan R Kharche, Yaozu (Eddie) Dong,
	Peter P Waskiewicz Jr

Sincere apologies for the spam. This patch series got sent out 7 times
by mistake due to a script bug at my end. I will ensure this never happens
again.

Apologies again.
Sudeep Dutt

On Wed, 2013-08-21 at 10:52 -0700, Sudeep Dutt wrote:
> ChangeLog:
> =========
> 
> v2 => v3:
> a) Patch 1 data structure cleanups, header file include cleanups,
>    IDA interface reuse and switching to device_create_with_groups(..)
>    as per feedback from Greg Kroah-Hartman.
> b) Patch 7 signal documentation, sleep workaround removal and sysfs
>    access API cleanups as per feedback from Michael S. Tsirkin.
> 
> v1 => v2: @ http://lwn.net/Articles/563131/
> a) License wording cleanup, sysfs ABI documentation, patch 1 refactoring
>    into 3 smaller patches and function renames, as per feedback from
>    Greg Kroah-Hartman.
> b) Use VRINGH infrastructure for accessing virtio rings from the host
>    in patch 5, as per feedback from Michael S. Tsirkin.
> 
> v1: Initial post @ https://lwn.net/Articles/561314/
> 
> Description:
> ============
> 
> An Intel MIC X100 device is a PCIe form factor add-in coprocessor
> card based on the Intel Many Integrated Core (MIC) architecture
> that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
> implements the three required standard address spaces i.e. configuration,
> memory and I/O. The host OS loads a device driver as is typical for
> PCIe devices. The card itself runs a bootstrap after reset that
> transfers control to the card OS downloaded from the host driver.
> The card OS as shipped by Intel is a Linux kernel with modifications
> for the X100 devices.
> 
> Since it is a PCIe card, it does not have the ability to host hardware
> devices for networking, storage and console. We provide these devices
> on X100 coprocessors thus enabling a self-bootable equivalent environment
> for applications. A key benefit of our solution is that it leverages
> the standard virtio framework for network, disk and console devices,
> though in our case the virtio framework is used across a PCIe bus.
> 
> Here is a block diagram of the various components described above. The
> virtio backends are situated on the host rather than the card given better
> single threaded performance for the host compared to MIC, the ability of
> the host to initiate DMAs to/from the card using the MIC DMA engine and
> the fact that the virtio block storage backend can only be on the host.
> 
>        +----------+           |             +----------+
>        | Card OS  |           |             | Host OS  |
>        +----------+           |             +----------+
>                               |
> +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
> | Virtio| |Virtio  | |Virtio| | |Virtio   |  |Virtio  | |Virtio  |
> | Net   | |Console | |Block | | |Net      |  |Console | |Block   |
> | Driver| |Driver  | |Driver| | |backend  |  |backend | |backend |
> +-------+ +--------+ +------+ | +---------+  +--------+ +--------+
>     |         |         |     |      |            |         |
>     |         |         |     |User  |            |         |
>     |         |         |     |------|------------|---------|-------
>     +-------------------+     |Kernel +--------------------------+
>               |               |       | Virtio over PCIe IOCTLs  |
>               |               |       +--------------------------+
>       +--------------+        |                   |
>       |Intel MIC     |        |            +---------------+
>       |Card Driver   |        |            |Intel MIC      |
>       +--------------+        |            |Host Driver    |
>               |               |            +---------------+
>               |               |                   |
>      +-------------------------------------------------------------+
>      |                                                             |
>      |                    PCIe Bus                                 |
>      +-------------------------------------------------------------+
> 
> The following series of patches are partitioned as follows:
> 
> Patch 1: This patch introduces the "Intel MIC Host Driver" in the block
> diagram which does the following:
> a) Initializes the Intel MIC X100 PCIe devices.
> b) Provides sysfs entries for family and stepping information.
> 
> Patch 2: This patch enables the following features in the
> "Intel MIC Host Driver" in the block diagram:
> a) MSI-X, MSI and legacy interrupt support.
> b) System Memory Page Table (SMPT) support. SMPT enables system memory
>    access from the card. On X100 devices the host can program 32 SMPT
>    registers each capable of accessing 16GB of system memory
>    address space from X100 devices. The registers can thereby be used
>    to access a cumulative 512GB of system memory address space from
>    X100 devices at any point in time.
> 
> Patch 3: This patch enables the following features in the
> "Intel MIC Host Driver" in the block diagram:
> a) Boots and shuts down the card via sysfs entries.
> b) Allocates and maps a device page for communication with the
>    card driver and updates the device page address via scratchpad
>    registers.
> c) Provides sysfs entries for shutdown status, kernel command line,
>    firmware, ramdisk, bootmode and log buffer information.
> 
> Patch 4: This patch introduces the "Intel MIC Card Driver" in the block
> diagram which does the following:
> a) Initializes the Intel MIC X100 platform device and driver.
> b) Sets up support to handle shutdown requests from the host.
> c) Maps the device page after obtaining the device page address
>    from the scratchpad registers updated by the host.
> d) Informs the host upon a card crash by registering a panic notifier.
> e) Informs the host upon a poweroff/halt event.
> 
> Patch 5: This patch introduces the host "Virtio over PCIe" interface for
> Intel MIC. It allows creating user space backends on the host and instantiating
> virtio devices for them on the Intel MIC card. It uses the existing VRINGH
> infrastructure in the kernel to access virtio rings from the host. A character
> device per MIC is exposed with IOCTL, mmap and poll callbacks. This allows the
> user space backend to:
> (a) add/remove a virtio device via a device page.
> (b) map (R/O) virtio rings and device page to user space.
> (c) poll for availability of data.
> (d) copy a descriptor or entire descriptor chain to/from the card.
> (e) modify virtio configuration.
> (f) handle virtio device reset.
> The buffers are copied over using CPU copies for this initial patch
> and host initiated MIC DMA support is planned for future patches.
> The avail and desc virtio rings are in host memory and the used ring
> is in card memory to maximize writes across PCIe for performance.
> 
> Patch 6: This patch introduces the card "Virtio over PCIe" interface for
> Intel MIC. It allows virtio drivers on the card to communicate with their
> user space backends on the host via a device page. Ring 3 apps on the host
> can add, remove and configure virtio devices. A thin MIC specific
> virtio_config_ops is implemented, borrowing heavily from previous
> similar implementations in lguest and s390 @
> drivers/lguest/lguest_device.c
> drivers/s390/kvm/kvm_virtio.c
> 
> Patch 7: This patch introduces a sample user space daemon which
> implements the virtio device backends on the host. The daemon
> creates/removes/configures virtio device backends by communicating with
> the Intel MIC Host Driver. The virtio devices currently supported are
> virtio net, virtio console and virtio block. Virtio net supports TSO/GSO.
> The daemon also monitors card shutdown status and takes appropriate actions
> like killing the virtio backends and resetting the card upon card shutdown
> and crashes.
> 
> The patches have been compiled/validated against v3.11-rc5.
> 
> Ashutosh Dixit (2):
>   Intel MIC Host Driver Changes for Virtio Devices.
>   Intel MIC Card Driver Changes for Virtio Devices.
> 
> Caz Yokoyama (1):
>   Sample Implementation of Intel MIC User Space Daemon.
> 
> Dasaratharaman Chandramouli (1):
>   Intel MIC Host Driver Interrupt/SMPT support.
> 
> Sudeep Dutt (3):
>   Intel MIC Host Driver for X100 family.
>   Intel MIC Host Driver, card OS state management.
>   Intel MIC Card Driver for X100 family.
> 
>  Documentation/ABI/testing/sysfs-class-mic.txt |  147 +++
>  Documentation/mic/mic_overview.txt            |   49 +
>  Documentation/mic/mpssd/.gitignore            |    1 +
>  Documentation/mic/mpssd/Makefile              |   19 +
>  Documentation/mic/mpssd/micctrl               |  173 +++
>  Documentation/mic/mpssd/mpss                  |  202 +++
>  Documentation/mic/mpssd/mpssd.c               | 1701 +++++++++++++++++++++++++
>  Documentation/mic/mpssd/mpssd.h               |  100 ++
>  Documentation/mic/mpssd/sysfs.c               |  102 ++
>  drivers/misc/Kconfig                          |    1 +
>  drivers/misc/Makefile                         |    1 +
>  drivers/misc/mic/Kconfig                      |   39 +
>  drivers/misc/mic/Makefile                     |    6 +
>  drivers/misc/mic/card/Makefile                |   11 +
>  drivers/misc/mic/card/mic_debugfs.c           |  130 ++
>  drivers/misc/mic/card/mic_device.c            |  306 +++++
>  drivers/misc/mic/card/mic_device.h            |  133 ++
>  drivers/misc/mic/card/mic_virtio.c            |  631 +++++++++
>  drivers/misc/mic/card/mic_virtio.h            |   77 ++
>  drivers/misc/mic/card/mic_x100.c              |  256 ++++
>  drivers/misc/mic/card/mic_x100.h              |   48 +
>  drivers/misc/mic/common/mic_device.h          |   51 +
>  drivers/misc/mic/host/Makefile                |   13 +
>  drivers/misc/mic/host/mic_boot.c              |  185 +++
>  drivers/misc/mic/host/mic_debugfs.c           |  496 +++++++
>  drivers/misc/mic/host/mic_device.h            |  300 +++++
>  drivers/misc/mic/host/mic_fops.c              |  221 ++++
>  drivers/misc/mic/host/mic_fops.h              |   32 +
>  drivers/misc/mic/host/mic_main.c              | 1098 ++++++++++++++++
>  drivers/misc/mic/host/mic_smpt.c              |  440 +++++++
>  drivers/misc/mic/host/mic_smpt.h              |   98 ++
>  drivers/misc/mic/host/mic_sysfs.c             |  468 +++++++
>  drivers/misc/mic/host/mic_virtio.c            |  704 ++++++++++
>  drivers/misc/mic/host/mic_virtio.h            |  138 ++
>  drivers/misc/mic/host/mic_x100.c              |  573 +++++++++
>  drivers/misc/mic/host/mic_x100.h              |   99 ++
>  include/uapi/linux/Kbuild                     |    2 +
>  include/uapi/linux/mic_common.h               |  238 ++++
>  include/uapi/linux/mic_ioctl.h                |   74 ++
>  39 files changed, 9363 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-mic.txt
>  create mode 100644 Documentation/mic/mic_overview.txt
>  create mode 100644 Documentation/mic/mpssd/.gitignore
>  create mode 100644 Documentation/mic/mpssd/Makefile
>  create mode 100755 Documentation/mic/mpssd/micctrl
>  create mode 100755 Documentation/mic/mpssd/mpss
>  create mode 100644 Documentation/mic/mpssd/mpssd.c
>  create mode 100644 Documentation/mic/mpssd/mpssd.h
>  create mode 100644 Documentation/mic/mpssd/sysfs.c
>  create mode 100644 drivers/misc/mic/Kconfig
>  create mode 100644 drivers/misc/mic/Makefile
>  create mode 100644 drivers/misc/mic/card/Makefile
>  create mode 100644 drivers/misc/mic/card/mic_debugfs.c
>  create mode 100644 drivers/misc/mic/card/mic_device.c
>  create mode 100644 drivers/misc/mic/card/mic_device.h
>  create mode 100644 drivers/misc/mic/card/mic_virtio.c
>  create mode 100644 drivers/misc/mic/card/mic_virtio.h
>  create mode 100644 drivers/misc/mic/card/mic_x100.c
>  create mode 100644 drivers/misc/mic/card/mic_x100.h
>  create mode 100644 drivers/misc/mic/common/mic_device.h
>  create mode 100644 drivers/misc/mic/host/Makefile
>  create mode 100644 drivers/misc/mic/host/mic_boot.c
>  create mode 100644 drivers/misc/mic/host/mic_debugfs.c
>  create mode 100644 drivers/misc/mic/host/mic_device.h
>  create mode 100644 drivers/misc/mic/host/mic_fops.c
>  create mode 100644 drivers/misc/mic/host/mic_fops.h
>  create mode 100644 drivers/misc/mic/host/mic_main.c
>  create mode 100644 drivers/misc/mic/host/mic_smpt.c
>  create mode 100644 drivers/misc/mic/host/mic_smpt.h
>  create mode 100644 drivers/misc/mic/host/mic_sysfs.c
>  create mode 100644 drivers/misc/mic/host/mic_virtio.c
>  create mode 100644 drivers/misc/mic/host/mic_virtio.h
>  create mode 100644 drivers/misc/mic/host/mic_x100.c
>  create mode 100644 drivers/misc/mic/host/mic_x100.h
>  create mode 100644 include/uapi/linux/mic_common.h
>  create mode 100644 include/uapi/linux/mic_ioctl.h
> 




2013-08-21 17:50     [PATCH v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors Sudeep Dutt
2013-08-21 17:55 99% ` Sudeep Dutt
2013-09-04 16:18 99% ` Sudeep Dutt
2013-09-05 23:41 60% [PATCH RESEND " Sudeep Dutt
2013-09-05 23:41 44% ` [PATCH RESEND v3 1/7] Intel MIC Host Driver for X100 family Sudeep Dutt
2013-09-05 23:41 28% ` [PATCH RESEND v3 2/7] Intel MIC Host Driver Interrupt/SMPT support Sudeep Dutt
2013-09-05 23:41 28% ` [PATCH RESEND v3 3/7] Intel MIC Host Driver, card OS state management Sudeep Dutt
2013-09-06  4:58       ` Greg Kroah-Hartman
2013-09-06 18:29 99%     ` Sudeep Dutt
2013-09-06  5:00       ` Greg Kroah-Hartman
2013-09-06 18:30 99%     ` Sudeep Dutt
2013-09-06  5:01       ` Greg Kroah-Hartman
2013-09-06 18:41 98%     ` Sudeep Dutt
2013-09-06 19:04           ` Greg Kroah-Hartman
2013-09-06 22:00 99%         ` Sudeep Dutt
2013-09-05 23:42 42% ` [PATCH RESEND v3 4/7] Intel MIC Card Driver for X100 family Sudeep Dutt
2013-09-05 23:42 26% ` [PATCH RESEND v3 5/7] Intel MIC Host Driver Changes for Virtio Devices Sudeep Dutt
2013-09-05 23:42 46% ` [PATCH RESEND v3 6/7] Intel MIC Card " Sudeep Dutt
2013-09-05 23:42 24% ` [PATCH RESEND v3 7/7] Sample Implementation of Intel MIC User Space Daemon Sudeep Dutt
2013-09-06  1:36     ` [PATCH RESEND v3 0/7] Enable Drivers for Intel MIC X100 Coprocessors Joe Perches
2013-09-06 18:27 99%   ` Sudeep Dutt
2013-09-26 20:54       ` Greg Kroah-Hartman
2013-09-26 21:33         ` Joe Perches
2013-09-26 21:40 99%       ` Sudeep Dutt
2013-09-27  1:09 96% [PATCH CHAR-MISC-NEXT 0/4] Intel MIC Driver Cleanups Sudeep Dutt
2013-09-27  1:09 75% ` [PATCH CHAR-MISC-NEXT 1/4] Intel MIC Host Driver Sysfs Cleanups Sudeep Dutt
2013-09-27  1:10 67% ` [PATCH CHAR-MISC-NEXT 2/4] Intel MIC Driver Header File Cleanups Sudeep Dutt
2013-09-27  1:10 35% ` [PATCH CHAR-MISC-NEXT 3/4] Intel MIC Driver Cleanups for "--strict" checkpatch Sudeep Dutt
2013-09-27  1:10 99% ` [PATCH CHAR-MISC-NEXT 4/4] Intel MIC Header File warning fix Sudeep Dutt
2013-09-27  3:17       ` Greg Kroah-Hartman
2013-09-27 19:37 99%     ` Sudeep Dutt
2013-09-27 16:49 94% [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Sudeep Dutt
2013-09-27 16:49 74% ` [PATCH char-misc-next 1/5] misc: mic: host driver sysfs cleanups Sudeep Dutt
2013-09-27 16:49 75% ` [PATCH char-misc-next 2/5] misc: mic: header file cleanups Sudeep Dutt
2013-09-27 16:49 35% ` [PATCH char-misc-next 3/5] misc: mic: cleanups for "--strict" checkpatch Sudeep Dutt
2013-09-27 16:50 99% ` [PATCH char-misc-next 4/5] misc: mic: fix a warning in the IOCTL header file Sudeep Dutt
2013-09-27 16:50 95% ` [PATCH char-misc-next 5/5] misc: mic: depend on X86 for both host and card drivers Sudeep Dutt
2013-09-27 17:29     ` [PATCH char-misc-next 0/5] misc: mic: driver cleanups and a build fix Joe Perches
2013-09-27 18:14 99%   ` Sudeep Dutt
2013-10-02 23:09 85% [PATCH char-misc-next] misc: mic: Fix build issues in sample daemon Sudeep Dutt
     [not found]     <1151498255.5788.1389332631491.JavaMail.zimbra@efficios.com>
2014-01-10  5:56     ` Intel MIC host driver: possible signed underflow (undefined behavior) in userspace API Mathieu Desnoyers
2014-01-10 14:21       ` Greg Kroah-Hartman
2014-01-10 18:22 99%     ` Sudeep Dutt
2014-02-03 22:53 96% [PATCH char-misc-linus] misc: mic: fix " Sudeep Dutt
2014-02-18 20:59     [PATCH v2 0/4] Remove dead code Bjorn Helgaas
2014-02-18 20:59     ` [PATCH v2 1/4] misc: mic: Add include of <linux/irqreturn.h> Bjorn Helgaas
2014-02-19 17:09 99%   ` Sudeep Dutt
2014-04-14 20:14     [PATCH 0/8] Enable dma driver for MIC X100 Coprocessors Siva Yerramreddy
     [not found]     ` <1398363008.6542.1.camel@siva-rh63@ra.intel.com>
2014-05-08  3:10 99%   ` Sudeep Dutt
2014-05-27 21:14         ` Greg Kroah-Hartman
2014-05-28  2:34 99%       ` Sudeep Dutt
2014-05-28  2:36 83% [PATCH char-misc-next " Sudeep Dutt
2014-05-28  2:36 91% ` [PATCH char-misc-next 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
2014-05-28  2:36 63% ` [PATCH char-misc-next 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
2014-05-28 20:50       ` Greg Kroah-Hartman
2014-05-29  2:56 96%     ` Sudeep Dutt
2014-05-28  2:36 41% ` [PATCH char-misc-next 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
2014-05-28  2:36 63% ` [PATCH char-misc-next 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
2014-05-28  2:36 54% ` [PATCH char-misc-next 5/8] misc: mic: add dma " Sudeep Dutt
2014-05-28  2:36 86% ` [PATCH char-misc-next 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
2014-05-28  2:36 87% ` [PATCH char-misc-next 7/8] misc: mic: add dma " Sudeep Dutt
2014-05-28  2:36 99% ` [PATCH char-misc-next 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
2014-05-29 21:23 82% [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
2014-05-29 21:23 91% ` [PATCH char-misc-next v2 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
2014-05-29 21:23 63% ` [PATCH char-misc-next v2 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
2014-07-10  0:42       ` Greg Kroah-Hartman
2014-07-10 16:43 99%     ` Sudeep Dutt
2014-05-29 21:24 41% ` [PATCH char-misc-next v2 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
2014-05-29 21:24 62% ` [PATCH char-misc-next v2 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
2014-05-29 21:24 54% ` [PATCH char-misc-next v2 5/8] misc: mic: add dma " Sudeep Dutt
2014-05-29 21:24 85% ` [PATCH char-misc-next v2 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
2014-05-29 21:24 85% ` [PATCH char-misc-next v2 7/8] misc: mic: add dma " Sudeep Dutt
2014-05-29 21:24 99% ` [PATCH char-misc-next v2 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
2014-06-16 15:58 99% ` [PATCH char-misc-next v2 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
2014-06-30 15:33 99%   ` Sudeep Dutt
2014-06-30 16:27         ` Greg Kroah-Hartman
2014-06-30 19:04 99%       ` Sudeep Dutt
2014-06-01 18:54     [PATCH] misc: mic: Introduce the managed version of ioremap Himangi Saraogi
2014-06-02 16:30 99% ` Sudeep Dutt
2014-07-11 21:04 82% [PATCH char-misc-next v3 0/8] Enable dma driver for MIC X100 Coprocessors Sudeep Dutt
2014-07-11 21:04 91% ` [PATCH char-misc-next v3 1/8] misc: mic: Add mic bus and dma driver documentation Sudeep Dutt
2014-07-11 21:04 63% ` [PATCH char-misc-next v3 2/8] misc: mic: add a bus driver for virtual MIC devices Sudeep Dutt
2014-07-12  1:27       ` Greg Kroah-Hartman
2014-07-12  1:31         ` Greg Kroah-Hartman
2014-07-12  6:49 99%       ` Sudeep Dutt
2014-07-11 21:04 41% ` [PATCH char-misc-next v3 3/8] dma: MIC X100 DMA Driver Sudeep Dutt
2014-07-11 21:04 62% ` [PATCH char-misc-next v3 4/8] misc: mic: add threaded irq support in host driver Sudeep Dutt
2014-07-11 21:04 54% ` [PATCH char-misc-next v3 5/8] misc: mic: add dma " Sudeep Dutt
2014-07-11 21:04 85% ` [PATCH char-misc-next v3 6/8] misc: mic: add threaded irq support in card driver Sudeep Dutt
2014-07-11 21:04 85% ` [PATCH char-misc-next v3 7/8] misc: mic: add dma " Sudeep Dutt
2014-07-11 21:04 99% ` [PATCH char-misc-next v3 8/8] misc: mic: add support for loading/unloading dma driver Sudeep Dutt
2014-07-14 21:23 99% [PATCH char-misc-next 0/2] Fix minor issues reported in MIC drivers Sudeep Dutt
2014-07-14 21:23 96% ` [PATCH char-misc-next 1/2] misc: mic: Introduce the managed version of ioremap Sudeep Dutt
2014-07-14 21:23 99% ` [PATCH char-misc-next 2/2] dma: Fix MIC X100 DMA Driver Kconfig option Sudeep Dutt
2014-12-10 19:47 65% [PATCH char-misc-next 00/13] misc: mic: SCIF driver Sudeep Dutt
2014-12-10 19:47 21% ` [PATCH char-misc-next 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
2015-01-09 23:04       ` Greg Kroah-Hartman
2015-01-09 23:57 99%     ` Sudeep Dutt
2014-12-10 19:47 58% ` [PATCH char-misc-next 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
2015-01-09 23:05       ` Greg Kroah-Hartman
2015-01-10  0:28 99%     ` Sudeep Dutt
2014-12-10 19:47 61% ` [PATCH char-misc-next 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
2014-12-10 19:47 78% ` [PATCH char-misc-next 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
2015-01-09 23:07       ` Greg Kroah-Hartman
2015-01-09 23:58 99%     ` Sudeep Dutt
2014-12-10 19:47 97% ` [PATCH char-misc-next 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
2014-12-10 19:47 47% ` [PATCH char-misc-next 06/13] misc: mic: SCIF module initialization Sudeep Dutt
2014-12-10 19:47 28% ` [PATCH char-misc-next 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
2014-12-10 19:47 41% ` [PATCH char-misc-next 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
2014-12-10 19:47 41% ` [PATCH char-misc-next 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
2014-12-10 19:47 52% ` [PATCH char-misc-next 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
2014-12-10 19:47 57% ` [PATCH char-misc-next 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
2014-12-10 19:47 70% ` [PATCH char-misc-next 12/13] misc: mic: MIC card " Sudeep Dutt
2014-12-10 19:47 98% ` [PATCH char-misc-next 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
2015-01-12 23:09 64% [PATCH char-misc-next v2 00/13] misc: mic: " Sudeep Dutt
2015-01-12 23:09 22% ` [PATCH char-misc-next v2 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
2015-01-25 12:39       ` Greg Kroah-Hartman
2015-01-26 18:07 99%     ` Sudeep Dutt
2015-01-25 12:41       ` Greg Kroah-Hartman
2015-01-26 18:07 99%     ` Sudeep Dutt
2015-01-12 23:09 58% ` [PATCH char-misc-next v2 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
2015-01-12 23:09 61% ` [PATCH char-misc-next v2 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
2015-01-12 23:09 78% ` [PATCH char-misc-next v2 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
2015-01-12 23:09 96% ` [PATCH char-misc-next v2 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
2015-01-12 23:10 47% ` [PATCH char-misc-next v2 06/13] misc: mic: SCIF module initialization Sudeep Dutt
2015-01-12 23:10 28% ` [PATCH char-misc-next v2 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
2015-01-12 23:10 41% ` [PATCH char-misc-next v2 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
2015-01-12 23:10 41% ` [PATCH char-misc-next v2 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
2015-01-12 23:10 52% ` [PATCH char-misc-next v2 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
2015-01-12 23:10 57% ` [PATCH char-misc-next v2 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
2015-01-12 23:10 70% ` [PATCH char-misc-next v2 12/13] misc: mic: MIC card " Sudeep Dutt
2015-01-12 23:10 98% ` [PATCH char-misc-next v2 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
2015-01-25 12:44     ` [PATCH char-misc-next v2 00/13] misc: mic: " Greg Kroah-Hartman
2015-01-26 18:07 97%   ` Sudeep Dutt
2015-01-27  0:32 63% [PATCH char-misc-next v3 " Sudeep Dutt
2015-01-27  0:32 22% ` [PATCH char-misc-next v3 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
2015-03-25 11:04       ` Greg Kroah-Hartman
2015-03-26 17:53 99%     ` Sudeep Dutt
2015-01-27  0:32 58% ` [PATCH char-misc-next v3 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
2015-01-27  0:32 61% ` [PATCH char-misc-next v3 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
2015-01-27  0:32 78% ` [PATCH char-misc-next v3 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
2015-01-27  0:32 96% ` [PATCH char-misc-next v3 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
2015-01-27  0:32 47% ` [PATCH char-misc-next v3 06/13] misc: mic: SCIF module initialization Sudeep Dutt
2015-01-27  0:32 28% ` [PATCH char-misc-next v3 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
2015-01-27  0:32 41% ` [PATCH char-misc-next v3 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
2015-01-27  0:32 41% ` [PATCH char-misc-next v3 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
2015-01-27  0:32 52% ` [PATCH char-misc-next v3 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
2015-01-27  0:32 57% ` [PATCH char-misc-next v3 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
2015-01-27  0:32 70% ` [PATCH char-misc-next v3 12/13] misc: mic: MIC card " Sudeep Dutt
2015-01-27  0:32 98% ` [PATCH char-misc-next v3 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
2015-02-24  0:45 99% ` [PATCH char-misc-next v3 00/13] misc: mic: " Sudeep Dutt
2015-03-09 16:13 99%   ` Sudeep Dutt
2015-03-23 16:44 99%     ` Sudeep Dutt
2015-02-09 19:09     [PATCH] misc: mic: fixup return type of wait_for_completion_timeout Nicholas Mc Guire
2015-02-09 22:00 99% ` Sudeep Dutt
2015-03-31  1:36 63% [PATCH char-misc-next v4 00/13] misc: mic: SCIF driver Sudeep Dutt
2015-03-31  1:36 22% ` [PATCH char-misc-next v4 01/13] misc: mic: SCIF header file and IOCTL interface Sudeep Dutt
2015-03-31  1:36 58% ` [PATCH char-misc-next v4 02/13] misc: mic: SCIF ring buffer infrastructure Sudeep Dutt
2015-03-31  1:36 61% ` [PATCH char-misc-next v4 03/13] misc: mic: SCIF Hardware Bus Sudeep Dutt
2015-03-31  1:36 78% ` [PATCH char-misc-next v4 04/13] misc: mic: SCIF Peer Bus Sudeep Dutt
2015-03-31  1:36 96% ` [PATCH char-misc-next v4 05/13] misc: mic: Common MIC header file changes in preparation for SCIF Sudeep Dutt
2015-03-31  1:36 47% ` [PATCH char-misc-next v4 06/13] misc: mic: SCIF module initialization Sudeep Dutt
2015-03-31  1:36 28% ` [PATCH char-misc-next v4 07/13] misc: mic: SCIF node queue pair setup management Sudeep Dutt
2015-03-31  1:36 41% ` [PATCH char-misc-next v4 08/13] misc: mic: SCIF open close bind and listen APIs Sudeep Dutt
2015-03-31  8:39       ` Paul Bolle
2015-03-31 21:13 99%     ` Sudeep Dutt
2015-03-31  1:36 41% ` [PATCH char-misc-next v4 09/13] misc: mic: SCIF connections APIs i.e. accept and connect Sudeep Dutt
2015-03-31  1:36 52% ` [PATCH char-misc-next v4 10/13] misc: mic: SCIF messaging and node enumeration APIs Sudeep Dutt
2015-03-31  1:36 57% ` [PATCH char-misc-next v4 11/13] misc: mic: MIC host driver specific changes to enable SCIF Sudeep Dutt
2015-03-31  1:36 70% ` [PATCH char-misc-next v4 12/13] misc: mic: MIC card " Sudeep Dutt
2015-03-31  1:36 98% ` [PATCH char-misc-next v4 13/13] misc: mic: add support for loading/unloading SCIF driver Sudeep Dutt
2015-04-28 20:45 99% ` [PATCH char-misc-next v4 00/13] misc: mic: " Sudeep Dutt
2015-04-29 11:27       ` Greg Kroah-Hartman
2015-04-29 12:42 99%     ` Sudeep Dutt
2015-06-09 17:48     randconfig build error with next-20150609, in drivers/misc/mic/scif/scif_rb.c Jim Davis
2015-06-09 18:47 99% ` Sudeep Dutt
2015-06-10  3:58 99% [PATCH char-misc-next 0/2] misc: mic: bug fixes Sudeep Dutt
2015-06-10  3:58 99% ` [PATCH char-misc-next 1/2] misc: mic: Fix randconfig build error by including errno.h Sudeep Dutt
2015-06-10  3:58 99% ` [PATCH char-misc-next 2/2] misc: mic: Fix reported static checker warning Sudeep Dutt
2015-07-27 23:52     [PATCH char-misc-next 00/19] misc: mic: Enable COSM and remaining SCIF functionality Ashutosh Dixit
2015-07-27 23:57     ` [PATCH char-misc-next 10/19] lib: convert iova.c into a library Ashutosh Dixit
2015-07-28 10:03       ` Joerg Roedel
2015-07-28 10:41         ` Robin Murphy
2015-07-28 14:38           ` David Woodhouse
2015-07-28 17:01 97%         ` Sudeep Dutt
2015-07-27 23:58     ` [PATCH char-misc-next 11/19] dma: Add support to program MIC x100 status descriptiors Ashutosh Dixit
2015-08-20  5:20       ` Vinod Koul
2015-08-20 17:30 98%     ` Sudeep Dutt
2015-08-21  8:29           ` Vinod Koul
2015-08-21 16:14 99%         ` Sudeep Dutt
2015-09-30  1:06     [PATCH char-misc-next v2 00/22] misc: mic: Enable COSM and remaining SCIF functionality Ashutosh Dixit
2015-09-30  1:09     ` [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT Ashutosh Dixit
2015-10-05 10:50       ` Woodhouse, David
2015-10-05 17:38 99%     ` Sudeep Dutt
2015-10-06  5:12           ` gregkh
2015-10-06  5:20             ` gregkh
2015-10-06  5:23 99%           ` Sudeep Dutt
2015-10-06  7:56                 ` gregkh
2015-10-06  8:05 99%               ` Sudeep Dutt
2015-10-06  8:33                     ` gregkh
2015-10-06 12:04 99%                   ` Sudeep Dutt
2015-10-04 20:38 99% [PATCH char-misc-next 1/1] misc: mic: Fix SCIF build failure with IOMMU_SUPPORT disabled Sudeep Dutt
2015-10-05 17:37 99% [PATCH char-misc-testing 1/1] Revert "iommu: Allow iova to be used without requiring IOMMU_SUPPORT" Sudeep Dutt
2015-10-09  6:40     [patch 2/2] misc: mic/scif: fix wrap around tests Dan Carpenter
2015-10-11  9:14 99% ` Sudeep Dutt
2015-10-14  3:21 99% ` Sudeep Dutt
2015-10-09  6:41     [patch 1/2] misc: mic/scif: fix error code in scif_create_remote_lookup() Dan Carpenter
2015-10-11  9:14 99% ` Sudeep Dutt
2015-10-13  9:08     [PATCH] misc: mic: fix memory leak Sudip Mukherjee
2015-10-13 16:34 99% ` Sudeep Dutt
2015-10-13 17:16       ` Sudip Mukherjee
2015-10-13 17:25 99%     ` Sudeep Dutt
2015-10-13 12:52     [patch] misc: mic/scif: re-take a lock on error path Dan Carpenter
2015-10-13 16:20 99% ` Sudeep Dutt
2015-10-13 17:06     [PATCH char-misc-next v3] misc: mic: fix memory leak Sudip Mukherjee
2015-10-13 17:08 99% ` Sudeep Dutt
2015-10-13 17:58 99% [PATCH char-misc-next 1/1] MAINTAINERS: Add maintainership for MIC drivers Sudeep Dutt
2015-10-13 18:21 99% [PATCH char-misc-next v2 " Sudeep Dutt
2015-11-16 13:46     [PATCH 1/3] misc: mic/scif: use list_next_entry instead of list_entry_next Geliang Tang
2015-11-16 17:08 99% ` Sudeep Dutt
2015-11-18 12:10     [PATCH 1/3] misc:scif: do not define list_entry_next Sergey Senozhatsky
2015-11-18 16:17 99% ` Sudeep Dutt
2015-11-23 11:54     [PATCH char-misc-next 1/4] misc: mic: remove unneeded debug message Sudip Mukherjee
2015-11-23 11:54     ` [PATCH char-misc-next 4/4] misc: mic: use common error path Sudip Mukherjee
2015-12-12  2:46 99%   ` Sudeep Dutt
2015-12-12  2:09     [PATCH] misc: mic: fix incorrect use of error codes in SCIF DMA driver Eric Biggers
2015-12-12  2:50 99% ` Sudeep Dutt
2016-02-02  4:23 86% [PATCH char-misc-next 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
2016-02-02  4:23 29% ` [PATCH char-misc-next 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
2016-02-02  4:23 49% ` [PATCH char-misc-next 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
2016-02-02  4:23 61% ` [PATCH char-misc-next 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
2016-02-08  6:57       ` Greg Kroah-Hartman
2016-02-08 17:26 99%     ` Sudeep Dutt
2016-02-02  4:23 75% ` [PATCH char-misc-next 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
2016-02-02  4:23 39% ` [PATCH char-misc-next 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
2016-02-02  4:23 49% ` [PATCH char-misc-next 6/8] misc: mic: Enable VOP card " Sudeep Dutt
2016-02-02  4:23 67% ` [PATCH char-misc-next 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
2016-02-02  4:23 58% ` [PATCH char-misc-next 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
2016-02-08 23:48 84% [PATCH char-misc-testing v2 0/8] Enable Virtio Over PCIe (VOP) driver Sudeep Dutt
2016-02-08 23:48 29% ` [PATCH char-misc-testing v2 1/8] misc: mic: Remove MIC X100 host virtio functionality Sudeep Dutt
2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 2/8] misc: mic: Remove MIC X100 card " Sudeep Dutt
2016-02-08 23:48 61% ` [PATCH char-misc-testing v2 3/8] misc: mic: MIC VOP Bus Sudeep Dutt
2016-02-08 23:48 75% ` [PATCH char-misc-testing v2 4/8] misc: mic: Add data structures for the VOP driver Sudeep Dutt
2016-02-08 23:48 39% ` [PATCH char-misc-testing v2 5/8] misc: mic: Enable VOP host side functionality Sudeep Dutt
2016-02-08 23:48 49% ` [PATCH char-misc-testing v2 6/8] misc: mic: Enable VOP card " Sudeep Dutt
2016-02-08 23:48 67% ` [PATCH char-misc-testing v2 7/8] misc: mic: Enable VOP debugfs and driver build Sudeep Dutt
2016-02-08 23:48 58% ` [PATCH char-misc-testing v2 8/8] misc: mic: MIC host and card driver changes to enable VOP Sudeep Dutt
2016-04-05  4:32 99% [PATCH char-misc-linus] misc: mic: Fix randconfig build error Sudeep Dutt
2016-04-05  4:41     ` Greg Kroah-Hartman
2016-04-05  5:00 99%   ` Sudeep Dutt
2016-04-18 15:05     ERROR: "vring_del_virtqueue" [drivers/misc/mic/vop/vop.ko] undefined! kbuild test robot
2016-04-30 16:49     ` Randy Dunlap
2016-04-30 18:56 99%   ` Sudeep Dutt
2018-08-02  8:42     [PATCH] misc: mic: SCIF Fix scif_get_new_port() error handling Dan Carpenter
2018-08-08 23:13 99% ` Sudeep Dutt
2018-10-17  8:54     rdma-core doesn't install driver.h, broke libibscif Christoph Hellwig
2018-10-18 15:10 99% ` Sudeep Dutt
2018-10-18 19:46     [PATCH] misc: mic: fix a DMA pool free failure Wenwen Wang
2018-11-05  2:23 99% ` Sudeep Dutt
2018-12-04 15:16     [PATCH v2] " Wenwen Wang
2019-01-03  1:50 99% ` Sudeep Dutt
2019-01-29 10:22     [PATCH] mic: vop: Fix broken virtqueues Vincent Whitchurch
2019-01-30 16:29 99% ` Sudeep Dutt
2019-03-11  6:45     scif_insert_vma() Jarkko Sakkinen
2019-03-18  5:47 99% ` scif_insert_vma() Sudeep Dutt
2019-06-11 18:43     [PATCH] mic: no need to check return value of debugfs_create functions Greg Kroah-Hartman
2019-06-12 15:41 99% ` Sudeep Dutt
2019-06-12 12:25     [PATCH 1/6] dma: amba-pl08x: no need to cast away call to debugfs_create_file() Greg Kroah-Hartman
2019-06-12 12:25     ` [PATCH 5/6] dma: mic_x100_dma: no need to check return value of debugfs_create functions Greg Kroah-Hartman
2019-06-12 15:44 99%   ` Sudeep Dutt
2020-10-28  2:03     [PATCH V5 0/2] Change vring space from nomal memory to dma coherent memory Sherry Sun
2020-10-28  5:58     ` Greg KH
2020-10-28  6:05       ` Sherry Sun
2020-10-28  7:07         ` Greg KH
2020-10-28 10:17           ` [EXT] " Andy Duan
2020-10-28 11:13             ` Greg KH
2020-10-28 15:11               ` Andy Duan
2020-10-28 15:42                 ` Greg KH
2020-10-29  1:51                   ` Sherry Sun
2020-10-29  2:03 99%                 ` Dutt, Sudeep
     [not found]     <8c1443136563de34699d2c084df478181c205db4.1603854416.git.sudeep.dutt@intel.com>
2020-10-28  5:54     ` [PATCH char-misc-next 1/1] misc: mic: remove the MIC drivers Greg Kroah-Hartman
2020-10-28 17:22 99%   ` Dutt, Sudeep
