* [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
From: sonal.santan @ 2019-03-19 21:53 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk

From: Sonal Santan <sonal.santan@xilinx.com>

Hello,

This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
These drivers are part of Xilinx Runtime (XRT) open source stack and
have been deployed by leading FaaS vendors and many enterprise customers.

PLATFORM ARCHITECTURE

Alveo PCIe platforms have a static shell and a reconfigurable (dynamic)
region. The shell is automatically loaded from PROM when the host is booted
and PCIe is enumerated by the BIOS. The shell cannot be changed until the
next cold reboot. The shell exposes two physical functions: the management
physical function and the user physical function.

Users compile their high-level design in C/C++/OpenCL or RTL into an FPGA
image using the SDx compiler. The FPGA image, packaged as an xclbin file,
can be loaded onto the reconfigurable region. The image may contain one or
more compute units. Users can dynamically swap the full image running on the
reconfigurable region in order to switch between different workloads.
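
For illustration, here is a minimal sketch (not part of this series) of how
an xclbin could be pushed to the card through the management function using
the XCLMGMT_IOCICAPDOWNLOAD_AXLF ioctl defined in patch 1/6. The device node
name and the helper name are assumptions; struct axlf itself is defined in
xclbin.h (patch 2/6):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include "xmgmt_drm.h"		/* XCLMGMT_IOCICAPDOWNLOAD_AXLF */

struct axlf;			/* opaque here; full layout in xclbin.h (patch 2/6) */

static int load_xclbin(const char *mgmt_node, const char *path)
{
	struct xclmgmt_ioc_bitstream_axlf req;
	FILE *fp = fopen(path, "rb");
	void *buf;
	long sz;
	int fd, ret = -1;

	if (!fp)
		return -1;
	fseek(fp, 0, SEEK_END);
	sz = ftell(fp);
	rewind(fp);
	buf = malloc(sz);
	if (buf && fread(buf, 1, sz, fp) == (size_t)sz) {
		fd = open(mgmt_node, O_RDWR);	/* e.g. "/dev/xclmgmt0", assumed name */
		if (fd >= 0) {
			req.xclbin = (struct axlf *)buf;
			ret = ioctl(fd, XCLMGMT_IOCICAPDOWNLOAD_AXLF, &req);
			close(fd);
		}
	}
	free(buf);
	fclose(fp);
	return ret;
}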

XRT DRIVERS

The XRT Linux kernel driver xmgmt binds to the mgmt pf. The driver is modular
and organized into several platform drivers which primarily handle the
following functionality (a short ioctl usage sketch follows the list):
1.  ICAP programming (FPGA bitstream download with FPGA Mgr integration)
2.  Clock scaling
3.  Loading firmware container also called dsabin (embedded Microblaze
    firmware for ERT and XMC, optional clearing bitstream)
4.  In-band sensors: temp, voltage, power, etc.
5.  AXI Firewall management
6.  Device reset and rescan
7.  Hardware mailbox for communication between two physical functions
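
As a usage sketch for item 2 above (again an illustration, not part of the
series), clock scaling is requested through XCLMGMT_IOCFREQSCALE from
xmgmt_drm.h; the node name and frequency values below are assumptions, and
per the structure's documentation a zero entry leaves that clock untouched:

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include "xmgmt_drm.h"

static int set_clocks(const char *mgmt_node, unsigned short data_mhz,
		      unsigned short kernel_mhz)
{
	struct xclmgmt_ioc_freqscaling req;
	int fd, ret;

	memset(&req, 0, sizeof(req));
	req.ocl_region = 0;			/* only region 0 is supported */
	req.ocl_target_freq[0] = data_mhz;
	req.ocl_target_freq[1] = kernel_mhz;	/* remaining entries stay 0 => untouched */

	fd = open(mgmt_node, O_RDWR);		/* e.g. "/dev/xclmgmt0", assumed name */
	if (fd < 0)
		return -1;
	ret = ioctl(fd, XCLMGMT_IOCFREQSCALE, &req);
	close(fd);
	return ret;
}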

The XRT Linux kernel driver xocl binds to the user pf. Like its peer, this
driver is also modular and organized into several platform drivers which
handle the following functionality (a buffer object usage sketch follows
the list):
1.  Device memory topology discovery and memory management
2.  Buffer object abstraction and management for client process
3.  XDMA MM PCIe DMA engine programming
4.  Multi-process aware context management
5.  Compute unit execution management (optionally with help of ERT) for
    client processes
6.  Hardware mailbox for communication between two physical functions
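
The sketch below (illustrative only) shows the buffer object flow from items
1 and 2 above: allocate a BO in bank 0, mmap it, fill it on the host and DMA
it to the card with the ioctls from xocl_drm.h. The caller is assumed to have
opened the xocl DRM render node (for example /dev/dri/renderD128); the bank
choice and default type of 0 are assumptions:

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/drm.h>
#include "xocl_drm.h"

static int copy_to_card(int drm_fd, const void *data, size_t len)
{
	struct drm_xocl_create_bo create = { .size = len, .flags = DRM_XOCL_BO_BANK0 };
	struct drm_xocl_map_bo map = { 0 };
	struct drm_xocl_sync_bo sync = { 0 };
	void *va;

	if (ioctl(drm_fd, DRM_IOCTL_XOCL_CREATE_BO, &create))
		return -1;
	map.handle = create.handle;
	if (ioctl(drm_fd, DRM_IOCTL_XOCL_MAP_BO, &map))
		return -1;
	va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, drm_fd, map.offset);
	if (va == MAP_FAILED)
		return -1;
	memcpy(va, data, len);			/* fill host-side backing pages */
	munmap(va, len);
	sync.handle = create.handle;
	sync.size = len;
	sync.dir = DRM_XOCL_SYNC_BO_TO_DEVICE;	/* DMA host -> device */
	return ioctl(drm_fd, DRM_IOCTL_XOCL_SYNC_BO, &sync);
}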

The drivers export ioctls and sysfs nodes for various services. The xocl
driver makes heavy use of DRM GEM features for device memory management,
reference counting, mmap support and export/import. xocl also includes a
simple scheduler called KDS which schedules compute units and interacts
with the hardware scheduler running in ERT firmware. The scheduler understands
custom opcodes packaged into command objects and provides asynchronous
command completion notification via POSIX poll.
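
A minimal sketch of that submit-and-wait flow, again illustrative only:
the command BO is assumed to have been allocated with DRM_XOCL_BO_EXECBUF
and filled with an ERT command packet (ert.h, patch 2/6); the one-second
poll timeout and the wait loop are assumptions:

#include <poll.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
#include "xocl_drm.h"

static int run_and_wait(int drm_fd, uint32_t exec_bo_handle)
{
	struct drm_xocl_execbuf exec = { .ctx_id = 0, .exec_bo_handle = exec_bo_handle };
	struct pollfd pfd = { .fd = drm_fd, .events = POLLIN };
	int rc;

	if (ioctl(drm_fd, DRM_IOCTL_XOCL_EXECBUF, &exec))
		return -1;
	/*
	 * Command completion is signaled via POSIX poll on the DRM fd; the
	 * client then inspects the state field of the packet in the exec BO.
	 */
	do {
		rc = poll(&pfd, 1, 1000);
	} while (rc == 0);			/* keep waiting on timeout */
	return rc > 0 ? 0 : -1;
}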

More details on the architecture, software APIs, ioctl definitions, execution
model, etc. are available as Sphinx documentation:

https://xilinx.github.io/XRT/2018.3/html/index.html

The complete runtime software stack (XRT), which includes out-of-tree
kernel drivers, user space libraries, board utilities and firmware for
the hardware scheduler, is open source and available at
https://github.com/Xilinx/XRT

Thanks,
-Sonal

Sonal Santan (6):
  Add skeleton code: ioctl definitions and build hooks
  Global data structures shared between xocl and xmgmt drivers
  Add platform drivers for various IPs and frameworks
  Add core of XDMA driver
  Add management driver
  Add user physical function driver

 drivers/gpu/drm/Kconfig                    |    2 +
 drivers/gpu/drm/Makefile                   |    1 +
 drivers/gpu/drm/xocl/Kconfig               |   22 +
 drivers/gpu/drm/xocl/Makefile              |    3 +
 drivers/gpu/drm/xocl/devices.h             |  954 +++++
 drivers/gpu/drm/xocl/ert.h                 |  385 ++
 drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
 drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
 drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
 drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
 drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
 drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
 drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
 drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
 drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
 drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
 drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
 drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
 drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
 drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
 drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
 drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
 drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
 drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
 drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
 drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
 drivers/gpu/drm/xocl/userpf/common.h       |  157 +
 drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
 drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
 drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
 drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
 drivers/gpu/drm/xocl/version.h             |   22 +
 drivers/gpu/drm/xocl/xclbin.h              |  314 ++
 drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
 drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
 drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
 drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
 drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
 drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
 include/uapi/drm/xmgmt_drm.h               |  204 +
 include/uapi/drm/xocl_drm.h                |  483 +++
 50 files changed, 28252 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/Kconfig
 create mode 100644 drivers/gpu/drm/xocl/Makefile
 create mode 100644 drivers/gpu/drm/xocl/devices.h
 create mode 100644 drivers/gpu/drm/xocl/ert.h
 create mode 100644 drivers/gpu/drm/xocl/lib/Makefile.in
 create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xvc.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/Makefile
 create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
 create mode 100644 drivers/gpu/drm/xocl/version.h
 create mode 100644 drivers/gpu/drm/xocl/xclbin.h
 create mode 100644 drivers/gpu/drm/xocl/xclfeatures.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c
 create mode 100644 drivers/gpu/drm/xocl/xocl_drm.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_drv.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_subdev.c
 create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
 create mode 100644 include/uapi/drm/xmgmt_drm.h
 create mode 100644 include/uapi/drm/xocl_drm.h

--
2.17.0


* [RFC PATCH Xilinx Alveo 1/6] Add skeleton code: ioctl definitions and build hooks
From: sonal.santan @ 2019-03-19 21:53 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/Kconfig              |   2 +
 drivers/gpu/drm/Makefile             |   1 +
 drivers/gpu/drm/xocl/Kconfig         |  22 ++
 drivers/gpu/drm/xocl/Makefile        |   3 +
 drivers/gpu/drm/xocl/mgmtpf/Makefile |  29 ++
 drivers/gpu/drm/xocl/userpf/Makefile |  27 ++
 include/uapi/drm/xmgmt_drm.h         | 204 +++++++++++
 include/uapi/drm/xocl_drm.h          | 483 +++++++++++++++++++++++++++
 8 files changed, 771 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/Kconfig
 create mode 100644 drivers/gpu/drm/xocl/Makefile
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
 create mode 100644 drivers/gpu/drm/xocl/userpf/Makefile
 create mode 100644 include/uapi/drm/xmgmt_drm.h
 create mode 100644 include/uapi/drm/xocl_drm.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index bd943a71756c..cc3785b1ae3d 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -329,6 +329,8 @@ source "drivers/gpu/drm/tve200/Kconfig"
 
 source "drivers/gpu/drm/xen/Kconfig"
 
+source "drivers/gpu/drm/xocl/Kconfig"
+
 # Keep legacy drivers last
 
 menuconfig DRM_LEGACY
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 1ac55c65eac0..ebebaba2bf3d 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -109,3 +109,4 @@ obj-$(CONFIG_DRM_TINYDRM) += tinydrm/
 obj-$(CONFIG_DRM_PL111) += pl111/
 obj-$(CONFIG_DRM_TVE200) += tve200/
 obj-$(CONFIG_DRM_XEN) += xen/
+obj-$(CONFIG_DRM_XOCL) += xocl/
diff --git a/drivers/gpu/drm/xocl/Kconfig b/drivers/gpu/drm/xocl/Kconfig
new file mode 100644
index 000000000000..197d36250b7c
--- /dev/null
+++ b/drivers/gpu/drm/xocl/Kconfig
@@ -0,0 +1,22 @@
+#
+# Xilinx Alveo and FaaS platform drivers
+#
+
+config DRM_XOCL
+	tristate "DRM Support for Xilinx PCIe Accelerator Alveo and FaaS platforms (EXPERIMENTAL)"
+	depends on DRM
+	depends on PCI
+	default n
+	help
+	  Choose this option if you have a Xilinx PCIe Accelerator
+	  card like Alveo or FaaS environments like AWS F1
+
+
+config DRM_XMGMT
+	tristate "DRM Support for Xilinx PCIe Accelerator Alveo and FaaS platforms (EXPERIMENTAL)"
+	depends on PCI
+	depends on FPGA
+	default n
+	help
+	  Choose this option if you have a Xilinx PCIe Accelerator
+	  card like Alveo
diff --git a/drivers/gpu/drm/xocl/Makefile b/drivers/gpu/drm/xocl/Makefile
new file mode 100644
index 000000000000..605459ab7de6
--- /dev/null
+++ b/drivers/gpu/drm/xocl/Makefile
@@ -0,0 +1,3 @@
+
+obj-$(CONFIG_DRM_XOCL) += userpf/
+obj-$(CONFIG_DRM_XMGMT) += mgmtpf/
diff --git a/drivers/gpu/drm/xocl/mgmtpf/Makefile b/drivers/gpu/drm/xocl/mgmtpf/Makefile
new file mode 100644
index 000000000000..569b7dc01866
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/Makefile
@@ -0,0 +1,29 @@
+obj-m	+= xmgmt.o
+
+xmgmt-y := \
+	../xocl_subdev.o \
+	../xocl_ctx.o \
+	../xocl_thread.o \
+	../subdev/sysmon.o \
+	../subdev/feature_rom.o \
+	../subdev/microblaze.o \
+	../subdev/firewall.o \
+	../subdev/xvc.o \
+	../subdev/mailbox.o \
+	../subdev/icap.o \
+	../subdev/mig.o \
+	../subdev/xmc.o \
+	../subdev/dna.o \
+	../subdev/fmgr.o \
+	mgmt-core.o \
+	mgmt-cw.o \
+	mgmt-utils.o \
+	mgmt-ioctl.o \
+	mgmt-sysfs.o
+
+
+
+ccflags-y += -DSUBDEV_SUFFIX=MGMT_SUFFIX
+ifeq ($(DEBUG),1)
+ccflags-y += -DDEBUG
+endif
diff --git a/drivers/gpu/drm/xocl/userpf/Makefile b/drivers/gpu/drm/xocl/userpf/Makefile
new file mode 100644
index 000000000000..ff895d2b2e68
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/Makefile
@@ -0,0 +1,27 @@
+obj-$(CONFIG_DRM_XOCL)	+= xocl.o
+
+include $(src)/../lib/Makefile.in
+
+xocl-y := \
+	../xocl_subdev.o \
+	../xocl_ctx.o \
+	../xocl_thread.o \
+	../subdev/xdma.o \
+	../subdev/feature_rom.o \
+	../subdev/mb_scheduler.o \
+	../subdev/mailbox.o \
+	../subdev/xvc.o \
+	../subdev/icap.o \
+	../subdev/xmc.o \
+	$(xocl_lib-y)	\
+	xocl_drv.o	\
+	xocl_bo.o	\
+	xocl_drm.o	\
+	xocl_ioctl.o	\
+	xocl_sysfs.o
+
+
+ccflags-y += -DSUBDEV_SUFFIX=USER_SUFFIX
+ifeq ($(DEBUG),1)
+ccflags-y += -DDEBUG
+endif
diff --git a/include/uapi/drm/xmgmt_drm.h b/include/uapi/drm/xmgmt_drm.h
new file mode 100644
index 000000000000..a0c23cf2ae82
--- /dev/null
+++ b/include/uapi/drm/xmgmt_drm.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/**
+ * DOC: PCIe Kernel Driver for Management Physical Function
+ * Interfaces exposed by the *xclmgmt* driver are defined in the file *mgmt-ioctl.h*.
+ * Core functionality provided by the *xclmgmt* driver is described in the following table:
+ *
+ * ==== ====================================== ============================== ==================================
+ * #    Functionality                          ioctl request code             data format
+ * ==== ====================================== ============================== ==================================
+ * 1    FPGA image download                    XCLMGMT_IOCICAPDOWNLOAD_AXLF   xclmgmt_ioc_bitstream_axlf
+ * 2    CL frequency scaling                   XCLMGMT_IOCFREQSCALE           xclmgmt_ioc_freqscaling
+ * 3    PCIe hot reset                         XCLMGMT_IOCHOTRESET            NA
+ * 4    CL reset                               XCLMGMT_IOCOCLRESET            NA
+ * 5    Live boot FPGA from PROM               XCLMGMT_IOCREBOOT              NA
+ * 6    Device sensors (current, voltage and   NA                             *hwmon* (xclmgmt_microblaze and
+ *      temperature)                                                          xclmgmt_sysmon) interface on sysfs
+ * 7    Querying device errors                 XCLMGMT_IOCERRINFO             xclErrorStatus
+ * ==== ====================================== ============================== ==================================
+ *
+ */
+
+#ifndef _XCLMGMT_IOCALLS_POSIX_H_
+#define _XCLMGMT_IOCALLS_POSIX_H_
+
+#include <linux/ioctl.h>
+
+/**
+ * enum xclFirewallID - AXI Firewall IDs used to identify individual AXI Firewalls
+ *
+ * @XCL_FW_MGMT_CONTROL:  MGMT BAR AXI-Lite BAR access protection
+ * @XCL_FW_USER_CONTROL:  USER BAR AXI-Lite BAR access protection
+ * @XCL_FW_DATAPATH:	  DMA data path protection
+ */
+enum xclFirewallID {
+	XCL_FW_MGMT_CONTROL = 0,
+	XCL_FW_USER_CONTROL,
+	XCL_FW_DATAPATH,
+	XCL_FW_MAX_LEVEL // always the last one
+};
+
+/**
+ * struct xclAXIErrorStatus - Record used to capture specific error
+ *
+ * @mErrFirewallTime:	 Timestamp of when Firewall tripped
+ * @mErrFirewallStatus:	 Error code obtained from the Firewall
+ * @mErrFirewallID:	 Firewall ID
+ */
+struct xclAXIErrorStatus {
+	unsigned int long   mErrFirewallTime;
+	unsigned int	    mErrFirewallStatus;
+	enum xclFirewallID  mErrFirewallID;
+};
+
+struct xclPCIErrorStatus {
+	unsigned int mDeviceStatus;
+	unsigned int mUncorrErrStatus;
+	unsigned int mCorrErrStatus;
+	unsigned int rsvd1;
+	unsigned int rsvd2;
+};
+
+/**
+ * struct xclErrorStatus - Container for all error records
+ *
+ * @mNumFirewalls:    Count of Firewalls in the record (max is 8)
+ * @mAXIErrorStatus:  Records holding Firewall information
+ * @mPCIErrorStatus:  Unused
+ */
+struct xclErrorStatus {
+	unsigned int mNumFirewalls;
+	struct xclAXIErrorStatus mAXIErrorStatus[8];
+	struct xclPCIErrorStatus mPCIErrorStatus;
+	unsigned int mFirewallLevel;
+};
+
+#define XCLMGMT_IOC_MAGIC	'X'
+#define XCLMGMT_NUM_SUPPORTED_CLOCKS 4
+#define XCLMGMT_NUM_ACTUAL_CLOCKS 2
+#define XCLMGMT_NUM_FIREWALL_IPS 3
+#define AWS_SHELL14		69605400
+
+#define AXI_FIREWALL
+
+enum XCLMGMT_IOC_TYPES {
+	XCLMGMT_IOC_INFO,
+	XCLMGMT_IOC_ICAP_DOWNLOAD,
+	XCLMGMT_IOC_FREQ_SCALE,
+	XCLMGMT_IOC_OCL_RESET,
+	XCLMGMT_IOC_HOT_RESET,
+	XCLMGMT_IOC_REBOOT,
+	XCLMGMT_IOC_ICAP_DOWNLOAD_AXLF,
+	XCLMGMT_IOC_ERR_INFO,
+	XCLMGMT_IOC_MAX
+};
+
+/**
+ * struct xclmgmt_ioc_info - Obtain information from the device
+ * used with XCLMGMT_IOCINFO ioctl
+ *
+ * Note that this structure will be obsoleted in future and the same functionality will be exposed via sysfs nodes
+ */
+struct xclmgmt_ioc_info {
+	unsigned short vendor;
+	unsigned short device;
+	unsigned short subsystem_vendor;
+	unsigned short subsystem_device;
+	unsigned int driver_version;
+	unsigned int device_version;
+	unsigned long long feature_id;
+	unsigned long long time_stamp;
+	unsigned short ddr_channel_num;
+	unsigned short ddr_channel_size;
+	unsigned short pcie_link_width;
+	unsigned short pcie_link_speed;
+	char vbnv[64];
+	char fpga[64];
+	unsigned short onchip_temp;
+	unsigned short fan_temp;
+	unsigned short fan_speed;
+	unsigned short vcc_int;
+	unsigned short vcc_aux;
+	unsigned short vcc_bram;
+	unsigned short ocl_frequency[XCLMGMT_NUM_SUPPORTED_CLOCKS];
+	bool mig_calibration[4];
+	unsigned short num_clocks;
+	bool isXPR;
+	unsigned int pci_slot;
+	unsigned long long xmc_version;
+	unsigned short twelve_vol_pex;
+	unsigned short twelve_vol_aux;
+	unsigned long long pex_curr;
+	unsigned long long aux_curr;
+	unsigned short three_vol_three_pex;
+	unsigned short three_vol_three_aux;
+	unsigned short ddr_vpp_btm;
+	unsigned short sys_5v5;
+	unsigned short one_vol_two_top;
+	unsigned short one_vol_eight_top;
+	unsigned short zero_vol_eight;
+	unsigned short ddr_vpp_top;
+	unsigned short mgt0v9avcc;
+	unsigned short twelve_vol_sw;
+	unsigned short mgtavtt;
+	unsigned short vcc1v2_btm;
+	short se98_temp[4];
+	short dimm_temp[4];
+};
+
+struct xclmgmt_ioc_bitstream {
+	struct xclBin *xclbin;
+};
+
+
+/*
+ * struct xclmgmt_err_info - Obtain Error information from the device
+ * used with XCLMGMT_IOCERRINFO ioctl
+ *
+ * Note that this structure will be obsoleted in future and the same functionality will be exposed via sysfs nodes
+ */
+struct xclmgmt_err_info {
+	unsigned int mNumFirewalls;
+	struct xclAXIErrorStatus mAXIErrorStatus[8];
+	struct xclPCIErrorStatus mPCIErrorStatus;
+};
+
+/**
+ * struct xclmgmt_ioc_bitstream_axlf - load xclbin (AXLF) device image
+ * used with XCLMGMT_IOCICAPDOWNLOAD_AXLF ioctl
+ *
+ * @xclbin:	Pointer to user's xclbin structure in memory
+ */
+struct xclmgmt_ioc_bitstream_axlf {
+	struct axlf *xclbin;
+};
+
+/**
+ * struct xclmgmt_ioc_freqscaling - scale frequencies on the board using Xilinx clock wizard
+ * used with XCLMGMT_IOCFREQSCALE ioctl
+ *
+ * @ocl_region:		PR region (currently only 0 is supported)
+ * @ocl_target_freq:	Array of requested frequencies, a value of zero in the array indicates leave untouched
+ */
+struct xclmgmt_ioc_freqscaling {
+	unsigned int ocl_region;
+	unsigned short ocl_target_freq[XCLMGMT_NUM_SUPPORTED_CLOCKS];
+};
+
+#define XCLMGMT_IOCINFO			 _IOR(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_INFO, \
+					      struct xclmgmt_ioc_info)
+#define XCLMGMT_IOCICAPDOWNLOAD		 _IOW(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_ICAP_DOWNLOAD, \
+					      struct xclmgmt_ioc_bitstream)
+#define XCLMGMT_IOCICAPDOWNLOAD_AXLF	 _IOW(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_ICAP_DOWNLOAD_AXLF, \
+					      struct xclmgmt_ioc_bitstream_axlf)
+#define XCLMGMT_IOCFREQSCALE		 _IOW(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_FREQ_SCALE, \
+					      struct xclmgmt_ioc_freqscaling)
+#define XCLMGMT_IOCHOTRESET		 _IO(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_HOT_RESET)
+#define XCLMGMT_IOCOCLRESET		 _IO(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_OCL_RESET)
+#define XCLMGMT_IOCREBOOT		 _IO(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_REBOOT)
+#define XCLMGMT_IOCERRINFO		 _IOR(XCLMGMT_IOC_MAGIC, XCLMGMT_IOC_ERR_INFO, struct xclErrorStatus)
+
+#define	XCLMGMT_MB_HWMON_NAME	    "xclmgmt_microblaze"
+#define XCLMGMT_SYSMON_HWMON_NAME   "xclmgmt_sysmon"
+#endif
diff --git a/include/uapi/drm/xocl_drm.h b/include/uapi/drm/xocl_drm.h
new file mode 100644
index 000000000000..259e30b159ca
--- /dev/null
+++ b/include/uapi/drm/xocl_drm.h
@@ -0,0 +1,483 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/**
+ * DOC: A GEM style driver for Xilinx PCIe based accelerators
+ * This file defines ioctl command codes and associated structures for interacting with
+ * *xocl* PCI driver for Xilinx FPGA platforms.
+ *
+ * Device memory allocation is modeled as buffer objects (bo). For each bo the driver tracks the host pointer
+ * backed by a scatter gather list -- which provides backing storage on the host -- and the corresponding device
+ * side allocation of a contiguous buffer in one of the memory mapped DDRs/BRAMs, etc.
+ *
+ * Execution model is asynchronous: execute commands are submitted using command buffers and POSIX poll
+ * is used to wait for finished commands. Commands for a compute unit can only be submitted after an explicit
+ * context has been opened by the client.
+ *
+ * *xocl* driver functionality is described in the following table. All the APIs are thread-safe and
+ * multi-process safe.
+ *
+ * ==== ====================================== ============================== ==================================
+ * #    Functionality                          ioctl request code             data format
+ * ==== ====================================== ============================== ==================================
+ * 1    Allocate buffer on device              DRM_IOCTL_XOCL_CREATE_BO       drm_xocl_create_bo
+ * 2    Allocate buffer on device with         DRM_IOCTL_XOCL_USERPTR_BO      drm_xocl_userptr_bo
+ *      userptr
+ * 3    Prepare bo for mapping into user's     DRM_IOCTL_XOCL_MAP_BO          drm_xocl_map_bo
+ *      address space
+ * 4    Synchronize (DMA) buffer contents in   DRM_IOCTL_XOCL_SYNC_BO         drm_xocl_sync_bo
+ *      requested direction
+ * 5    Obtain information about buffer        DRM_IOCTL_XOCL_INFO_BO         drm_xocl_info_bo
+ *      object
+ * 6    Update bo backing storage with user's  DRM_IOCTL_XOCL_PWRITE_BO       drm_xocl_pwrite_bo
+ *      data
+ * 7    Read back data in bo backing storage   DRM_IOCTL_XOCL_PREAD_BO        drm_xocl_pread_bo
+ * 8    Open/close a context on a compute unit DRM_XOCL_CTX                   drm_xocl_ctx
+ *      on the device
+ * 9    Unprotected write to device memory     DRM_IOCTL_XOCL_PWRITE_UNMGD    drm_xocl_pwrite_unmgd
+ * 10   Unprotected read from device memory    DRM_IOCTL_XOCL_PREAD_UNMGD     drm_xocl_pread_unmgd
+ * 11   Send an execute job to a compute unit  DRM_IOCTL_XOCL_EXECBUF         drm_xocl_execbuf
+ * 12   Register eventfd handle for MSIX       DRM_IOCTL_XOCL_USER_INTR       drm_xocl_user_intr
+ *      interrupt
+ * 13   Update device view with a specific     DRM_XOCL_READ_AXLF             drm_xocl_axlf
+ *      xclbin image
+ * 14   Write buffer from device to peer FPGA  DRM_IOCTL_XOCL_COPY_BO         drm_xocl_copy_bo
+ *      buffer
+ * ==== ====================================== ============================== ==================================
+ */
+
+#ifndef _XCL_XOCL_IOCTL_H_
+#define _XCL_XOCL_IOCTL_H_
+
+#if defined(__KERNEL__)
+#include <linux/types.h>
+#include <linux/uuid.h>
+#include <linux/version.h>
+#elif defined(__cplusplus)
+#include <cstdlib>
+#include <cstdint>
+#include <uuid/uuid.h>
+#else
+#include <stdlib.h>
+#include <stdint.h>
+#include <uuid/uuid.h>
+#endif
+
+/*
+ * enum drm_xocl_ops - ioctl command code enumerations
+ */
+enum drm_xocl_ops {
+	/* Buffer creation */
+	DRM_XOCL_CREATE_BO = 0,
+	/* Buffer creation from user provided pointer */
+	DRM_XOCL_USERPTR_BO,
+	/* Map buffer into application user space (no DMA is performed) */
+	DRM_XOCL_MAP_BO,
+	/* Sync buffer (like fsync) in the desired direction by using DMA */
+	DRM_XOCL_SYNC_BO,
+	/* Get information about the buffer such as physical address in the device, etc */
+	DRM_XOCL_INFO_BO,
+	/* Update host cached copy of buffer with user's data */
+	DRM_XOCL_PWRITE_BO,
+	/* Update user's data with host cached copy of buffer */
+	DRM_XOCL_PREAD_BO,
+	/* Other ioctls */
+	DRM_XOCL_OCL_RESET,
+	/* Open/close a context */
+	DRM_XOCL_CTX,
+	/* Get information from device */
+	DRM_XOCL_INFO,
+	/* Unmanaged DMA from/to device */
+	DRM_XOCL_PREAD_UNMGD,
+	DRM_XOCL_PWRITE_UNMGD,
+	/* Various usage metrics */
+	DRM_XOCL_USAGE_STAT,
+	/* Hardware debug command */
+	DRM_XOCL_DEBUG,
+	/* Command to run on one or more CUs */
+	DRM_XOCL_EXECBUF,
+	/* Register eventfd for user interrupts */
+	DRM_XOCL_USER_INTR,
+	/* Read xclbin/axlf */
+	DRM_XOCL_READ_AXLF,
+	/* Copy buffer to Destination buffer by using DMA */
+	DRM_XOCL_COPY_BO,
+	/* Hot reset request */
+	DRM_XOCL_HOT_RESET,
+	/* Reclock through userpf*/
+	DRM_XOCL_RECLOCK,
+
+	DRM_XOCL_NUM_IOCTLS
+};
+
+enum drm_xocl_sync_bo_dir {
+	DRM_XOCL_SYNC_BO_TO_DEVICE = 0,
+	DRM_XOCL_SYNC_BO_FROM_DEVICE
+};
+
+/*
+ * Higher 4 bits are for DDR, one for each DDR
+ * LSB bit for execbuf
+ */
+#define DRM_XOCL_BO_BANK0   (0x1)
+#define DRM_XOCL_BO_BANK1   (0x1 << 1)
+#define DRM_XOCL_BO_BANK2   (0x1 << 2)
+#define DRM_XOCL_BO_BANK3   (0x1 << 3)
+
+#define DRM_XOCL_BO_CMA     (0x1 << 29)
+#define DRM_XOCL_BO_P2P     (0x1 << 30)
+#define DRM_XOCL_BO_EXECBUF (0x1 << 31)
+
+#define DRM_XOCL_CTX_FLAG_EXCLUSIVE (0x1)
+
+
+#define DRM_XOCL_NUM_SUPPORTED_CLOCKS 4
+/**
+ * struct drm_xocl_create_bo - Create buffer object
+ * used with DRM_IOCTL_XOCL_CREATE_BO ioctl
+ *
+ * @size:       Requested size of the buffer object
+ * @handle:     bo handle returned by the driver
+ * @flags:      DRM_XOCL_BO_XXX flags
+ * @type:       The type of bo
+ */
+struct drm_xocl_create_bo {
+	uint64_t size;
+	uint32_t handle;
+	uint32_t flags;
+	uint32_t type;
+};
+
+/**
+ * struct drm_xocl_userptr_bo - Create buffer object with user's pointer
+ * used with DRM_IOCTL_XOCL_USERPTR_BO ioctl
+ *
+ * @addr:       Address of buffer allocated by user
+ * @size:       Requested size of the buffer object
+ * @handle:     bo handle returned by the driver
+ * @flags:      DRM_XOCL_BO_XXX flags
+ * @type:       The type of bo
+ */
+struct drm_xocl_userptr_bo {
+	uint64_t addr;
+	uint64_t size;
+	uint32_t handle;
+	uint32_t flags;
+	uint32_t type;
+};
+
+/**
+ * struct drm_xocl_map_bo - Prepare a buffer object for mmap
+ * used with DRM_IOCTL_XOCL_MAP_BO ioctl
+ *
+ * @handle:     bo handle
+ * @pad:        Unused
+ * @offset:     'Fake' offset returned by the driver which can be used with POSIX mmap
+ */
+struct drm_xocl_map_bo {
+	uint32_t handle;
+	uint32_t pad;
+	uint64_t offset;
+};
+
+/**
+ * struct drm_xocl_sync_bo - Synchronize the buffer in the requested direction
+ * between device and host
+ * used with DRM_IOCTL_XOCL_SYNC_BO ioctl
+ *
+ * @handle:	bo handle
+ * @flags:	Unused
+ * @size:	Number of bytes to synchronize
+ * @offset:	Offset into the object to synchronize
+ * @dir:	DRM_XOCL_SYNC_DIR_XXX
+ */
+struct drm_xocl_sync_bo {
+	uint32_t handle;
+	uint32_t flags;
+	uint64_t size;
+	uint64_t offset;
+	enum drm_xocl_sync_bo_dir dir;
+};
+
+/**
+ * struct drm_xocl_info_bo - Obtain information about an allocated buffer object
+ * used with DRM_IOCTL_XOCL_INFO_BO IOCTL
+ *
+ * @handle:	bo handle
+ * @flags:      Unused
+ * @size:	Size of buffer object (out)
+ * @paddr:	Physical address (out)
+ */
+struct drm_xocl_info_bo {
+	uint32_t handle;
+	uint32_t flags;
+	uint64_t size;
+	uint64_t paddr;
+};
+
+/**
+ * struct drm_xocl_copy_bo - copy source buffer to destination buffer
+ * between device and device
+ * used with DRM_IOCTL_XOCL_COPY_BO ioctl
+ *
+ * @dst_handle: destination bo handle
+ * @src_handle: source bo handle
+ * @flags:  Unused
+ * @size: Number of bytes to copy
+ * @dst_offset: Offset into the destination buffer object
+ * @src_offset: Offset into the source buffer object
+ */
+struct drm_xocl_copy_bo {
+	uint32_t dst_handle;
+	uint32_t src_handle;
+	uint32_t flags;
+	uint64_t size;
+	uint64_t dst_offset;
+	uint64_t src_offset;
+};
+/**
+ * struct drm_xocl_axlf - load xclbin (AXLF) device image
+ * used with DRM_IOCTL_XOCL_READ_AXLF ioctl
+ * NOTE: This ioctl will be removed in next release
+ *
+ * @xclbin:	Pointer to user's xclbin structure in memory
+ */
+struct drm_xocl_axlf {
+	struct axlf *xclbin;
+};
+
+/**
+ * struct drm_xocl_pwrite_bo - Update bo with user's data
+ * used with DRM_IOCTL_XOCL_PWRITE_BO ioctl
+ *
+ * @handle:	bo handle
+ * @pad:	Unused
+ * @offset:	Offset into the buffer object to write to
+ * @size:	Length of data to write
+ * @data_ptr:	User's pointer to read the data from
+ */
+struct drm_xocl_pwrite_bo {
+	uint32_t handle;
+	uint32_t pad;
+	uint64_t offset;
+	uint64_t size;
+	uint64_t data_ptr;
+};
+
+/**
+ * struct drm_xocl_pread_bo - Read data from bo
+ * used with DRM_IOCTL_XOCL_PREAD_BO ioctl
+ *
+ * @handle:	bo handle
+ * @pad:	Unused
+ * @offset:	Offset into the buffer object to read from
+ * @size:	Length of data to read
+ * @data_ptr:	User's pointer to write the data into
+ */
+struct drm_xocl_pread_bo {
+	uint32_t handle;
+	uint32_t pad;
+	uint64_t offset;
+	uint64_t size;
+	uint64_t data_ptr;
+};
+
+enum drm_xocl_ctx_code {
+	XOCL_CTX_OP_ALLOC_CTX = 0,
+	XOCL_CTX_OP_FREE_CTX
+};
+
+#define XOCL_CTX_SHARED    0x0
+#define XOCL_CTX_EXCLUSIVE 0x1
+
+/**
+ * struct drm_xocl_ctx - Open or close a context on a compute unit on device
+ * used with DRM_XOCL_CTX ioctl
+ *
+ * @op:            Alloc or free a context (XOCL_CTX_OP_ALLOC_CTX/XOCL_CTX_OP_FREE_CTX)
+ * @xclbin_id:	   UUID of the device image (xclbin)
+ * @cu_index:	   Index of the compute unit in the device image for which
+ *                 the request is being made
+ * @flags:	   Shared or exclusive context (XOCL_CTX_SHARED/XOCL_CTX_EXCLUSIVE)
+ * @handle:	   Unused
+ */
+struct drm_xocl_ctx {
+	enum drm_xocl_ctx_code op;
+	uuid_t	 xclbin_id;
+	uint32_t cu_index;
+	uint32_t flags;
+	// unused, in future it would return context id
+	uint32_t handle;
+};
+
+struct drm_xocl_info {
+	unsigned short vendor;
+	unsigned short device;
+	unsigned short subsystem_vendor;
+	unsigned short subsystem_device;
+	unsigned int dma_engine_version;
+	unsigned int driver_version;
+	unsigned int pci_slot;
+	char reserved[64];
+};
+
+
+/**
+ * struct drm_xocl_pwrite_unmgd - unprotected write to device memory
+ * used with DRM_IOCTL_XOCL_PWRITE_UNMGD ioctl
+ *
+ * @address_space: Address space in the DSA; currently only 0 is supported
+ * @pad:	   Unused
+ * @paddr:	   Physical address in the specified address space
+ * @size:	   Length of data to write
+ * @data_ptr:	   User's pointer to read the data from
+ */
+struct drm_xocl_pwrite_unmgd {
+	uint32_t address_space;
+	uint32_t pad;
+	uint64_t paddr;
+	uint64_t size;
+	uint64_t data_ptr;
+};
+
+/**
+ * struct drm_xocl_pread_unmgd - unprotected read from device memory
+ * used with DRM_IOCTL_XOCL_PREAD_UNMGD ioctl
+ *
+ * @address_space: Address space in the DSA; currently only 0 is valid
+ * @pad:	   Unused
+ * @paddr:	   Physical address in the specified address space
+ * @size:	   Length of data to read
+ * @data_ptr:	   User's pointer to write the data to
+ */
+struct drm_xocl_pread_unmgd {
+	uint32_t address_space;
+	uint32_t pad;
+	uint64_t paddr;
+	uint64_t size;
+	uint64_t data_ptr;
+};
+
+
+struct drm_xocl_mm_stat {
+	size_t memory_usage;
+	unsigned int bo_count;
+};
+
+/**
+ * struct drm_xocl_stats - obtain device memory usage and DMA statistics
+ * used with DRM_IOCTL_XOCL_USAGE_STAT ioctl
+ *
+ * @dma_channel_count: How many DMA channels are present
+ * @mm_channel_count:  How many storage banks (DDR) are present
+ * @h2c:	       Total data transferred from host to device by a DMA channel
+ * @c2h:	       Total data transferred from device to host by a DMA channel
+ * @mm:	               BO statistics for a storage bank (DDR)
+ */
+struct drm_xocl_usage_stat {
+	unsigned int dma_channel_count;
+	unsigned int mm_channel_count;
+	uint64_t h2c[8];
+	uint64_t c2h[8];
+	struct drm_xocl_mm_stat mm[8];
+};
+
+enum drm_xocl_debug_code {
+	DRM_XOCL_DEBUG_ACQUIRE_CU = 0,
+	DRM_XOCL_DEBUG_RELEASE_CU,
+	DRM_XOCL_DEBUG_NIFD_RD,
+	DRM_XOCL_DEBUG_NIFD_WR,
+};
+
+struct drm_xocl_debug {
+	uint32_t ctx_id;
+	enum drm_xocl_debug_code code;
+	unsigned int code_size;
+	uint64_t code_ptr;
+};
+
+enum drm_xocl_execbuf_state {
+	DRM_XOCL_EXECBUF_STATE_COMPLETE = 0,
+	DRM_XOCL_EXECBUF_STATE_RUNNING,
+	DRM_XOCL_EXECBUF_STATE_SUBMITTED,
+	DRM_XOCL_EXECBUF_STATE_QUEUED,
+	DRM_XOCL_EXECBUF_STATE_ERROR,
+	DRM_XOCL_EXECBUF_STATE_ABORT,
+};
+
+
+/**
+ * struct drm_xocl_execbuf - Submit a command buffer for execution on a compute unit
+ * used with DRM_IOCTL_XOCL_EXECBUF ioctl
+ *
+ * @ctx_id:         Pass 0
+ * @exec_bo_handle: BO handle of command buffer formatted as ERT command
+ * @deps:	    Up to 8 dependency command BO handles this command is dependent on
+ *                  for automatic event dependency handling by ERT
+ */
+struct drm_xocl_execbuf {
+	uint32_t ctx_id;
+	uint32_t exec_bo_handle;
+	uint32_t deps[8];
+};
+
+/**
+ * struct drm_xocl_user_intr - Register user's eventfd for MSIX interrupt
+ * used with DRM_IOCTL_XOCL_USER_INTR ioctl
+ *
+ * @ctx_id:        Pass 0
+ * @fd:	           File descriptor created with eventfd system call
+ * @msix:	   User interrupt number (0 to 15)
+ */
+struct drm_xocl_user_intr {
+	uint32_t ctx_id;
+	int fd;
+	int msix;
+};
+
+struct drm_xocl_reclock_info {
+	unsigned int region;
+	unsigned short ocl_target_freq[DRM_XOCL_NUM_SUPPORTED_CLOCKS];
+};
+
+/*
+ * Core ioctls numbers
+ */
+
+#define DRM_IOCTL_XOCL_CREATE_BO      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_CREATE_BO, struct drm_xocl_create_bo)
+#define DRM_IOCTL_XOCL_USERPTR_BO     DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_USERPTR_BO, struct drm_xocl_userptr_bo)
+#define DRM_IOCTL_XOCL_MAP_BO	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_MAP_BO, struct drm_xocl_map_bo)
+#define DRM_IOCTL_XOCL_SYNC_BO	      DRM_IOW(DRM_COMMAND_BASE +       \
+					       DRM_XOCL_SYNC_BO, struct drm_xocl_sync_bo)
+#define DRM_IOCTL_XOCL_COPY_BO	      DRM_IOW(DRM_COMMAND_BASE +       \
+					       DRM_XOCL_COPY_BO, struct drm_xocl_copy_bo)
+#define DRM_IOCTL_XOCL_INFO_BO	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_INFO_BO, struct drm_xocl_info_bo)
+#define DRM_IOCTL_XOCL_PWRITE_BO      DRM_IOW(DRM_COMMAND_BASE +       \
+					      DRM_XOCL_PWRITE_BO, struct drm_xocl_pwrite_bo)
+#define DRM_IOCTL_XOCL_PREAD_BO	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_PREAD_BO, struct drm_xocl_pread_bo)
+#define DRM_IOCTL_XOCL_CTX	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_CTX, struct drm_xocl_ctx)
+#define DRM_IOCTL_XOCL_INFO	      DRM_IOR(DRM_COMMAND_BASE +	\
+					      DRM_XOCL_INFO, struct drm_xocl_info)
+#define DRM_IOCTL_XOCL_READ_AXLF      DRM_IOW(DRM_COMMAND_BASE +	\
+					      DRM_XOCL_READ_AXLF, struct drm_xocl_axlf)
+#define DRM_IOCTL_XOCL_PWRITE_UNMGD   DRM_IOW(DRM_COMMAND_BASE +	\
+					      DRM_XOCL_PWRITE_UNMGD, struct drm_xocl_pwrite_unmgd)
+#define DRM_IOCTL_XOCL_PREAD_UNMGD    DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_PREAD_UNMGD, struct drm_xocl_pread_unmgd)
+#define DRM_IOCTL_XOCL_USAGE_STAT     DRM_IOR(DRM_COMMAND_BASE +	\
+					      DRM_XOCL_USAGE_STAT, struct drm_xocl_usage_stat)
+#define DRM_IOCTL_XOCL_DEBUG	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_DEBUG, struct drm_xocl_debug)
+#define DRM_IOCTL_XOCL_EXECBUF	      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_EXECBUF, struct drm_xocl_execbuf)
+#define DRM_IOCTL_XOCL_USER_INTR      DRM_IOWR(DRM_COMMAND_BASE +	\
+					       DRM_XOCL_USER_INTR, struct drm_xocl_user_intr)
+#define DRM_IOCTL_XOCL_HOT_RESET      DRM_IO(DRM_COMMAND_BASE +	DRM_XOCL_HOT_RESET)
+#define DRM_IOCTL_XOCL_RECLOCK	      DRM_IOWR(DRM_COMMAND_BASE + \
+					    DRM_XOCL_RECLOCK, struct drm_xocl_reclock_info)
+#endif
-- 
2.17.0


* [RFC PATCH Xilinx Alveo 2/6] Global data structures shared between xocl and xmgmt drivers
From: sonal.santan @ 2019-03-19 21:53 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/xocl/devices.h     | 954 +++++++++++++++++++++++++++++
 drivers/gpu/drm/xocl/ert.h         | 385 ++++++++++++
 drivers/gpu/drm/xocl/version.h     |  22 +
 drivers/gpu/drm/xocl/xclbin.h      | 314 ++++++++++
 drivers/gpu/drm/xocl/xclfeatures.h | 107 ++++
 drivers/gpu/drm/xocl/xocl_ctx.c    | 196 ++++++
 drivers/gpu/drm/xocl/xocl_drm.h    |  91 +++
 drivers/gpu/drm/xocl/xocl_drv.h    | 783 +++++++++++++++++++++++
 drivers/gpu/drm/xocl/xocl_subdev.c | 540 ++++++++++++++++
 drivers/gpu/drm/xocl/xocl_thread.c |  64 ++
 10 files changed, 3456 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/devices.h
 create mode 100644 drivers/gpu/drm/xocl/ert.h
 create mode 100644 drivers/gpu/drm/xocl/version.h
 create mode 100644 drivers/gpu/drm/xocl/xclbin.h
 create mode 100644 drivers/gpu/drm/xocl/xclfeatures.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c
 create mode 100644 drivers/gpu/drm/xocl/xocl_drm.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_drv.h
 create mode 100644 drivers/gpu/drm/xocl/xocl_subdev.c
 create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c

diff --git a/drivers/gpu/drm/xocl/devices.h b/drivers/gpu/drm/xocl/devices.h
new file mode 100644
index 000000000000..3fc6f8ea6c9b
--- /dev/null
+++ b/drivers/gpu/drm/xocl/devices.h
@@ -0,0 +1,954 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+
+/*
+ *  Copyright (C) 2018-2019, Xilinx Inc
+ *
+ */
+
+
+#ifndef	_XCL_DEVICES_H_
+#define	_XCL_DEVICES_H_
+
+/* board flags */
+enum {
+	XOCL_DSAFLAG_PCI_RESET_OFF =		0x01,
+	XOCL_DSAFLAG_MB_SCHE_OFF =		0x02,
+	XOCL_DSAFLAG_AXILITE_FLUSH =		0x04,
+	XOCL_DSAFLAG_SET_DSA_VER =		0x08,
+	XOCL_DSAFLAG_SET_XPR =			0x10,
+	XOCL_DSAFLAG_MFG =			0x20,
+};
+
+#define	FLASH_TYPE_SPI	"spi"
+#define	FLASH_TYPE_QSPIPS	"qspi_ps"
+
+struct xocl_subdev_info {
+	uint32_t		id;
+	char			*name;
+	struct resource		*res;
+	int			num_res;
+	void			*priv_data;
+	int			data_len;
+};
+
+struct xocl_board_private {
+	uint64_t		flags;
+	struct xocl_subdev_info	*subdev_info;
+	uint32_t		subdev_num;
+	uint32_t		dsa_ver;
+	bool			xpr;
+	char			*flash_type; /* used by xbflash */
+	char			*board_name; /* used by xbflash */
+	bool			mpsoc;
+};
+
+#ifdef __KERNEL__
+#define XOCL_PCI_DEVID(ven, dev, subsysid, priv)		\
+	.vendor = ven, .device = dev, .subvendor = PCI_ANY_ID,	\
+	.subdevice = subsysid, .driver_data =			\
+	(kernel_ulong_t) &XOCL_BOARD_##priv
+
+struct xocl_dsa_vbnv_map {
+	uint16_t		vendor;
+	uint16_t		device;
+	uint16_t		subdevice;
+	char			*vbnv;
+	struct xocl_board_private	*priv_data;
+};
+
+#else
+struct xocl_board_info {
+	uint16_t		vendor;
+	uint16_t		device;
+	uint16_t		subdevice;
+	struct xocl_board_private	*priv_data;
+};
+
+#define XOCL_PCI_DEVID(ven, dev, subsysid, priv)        \
+	.vendor = ven, .device = dev,			\
+	.subdevice = subsysid, .priv_data = &XOCL_BOARD_##priv
+
+struct resource {
+	size_t		start;
+	size_t		end;
+	unsigned long	flags;
+};
+
+enum {
+	IORESOURCE_MEM,
+	IORESOURCE_IRQ,
+};
+
+#define	PCI_ANY_ID	-1
+#define SUBDEV_SUFFIX
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
+
+#endif
+
+#define	MGMT_SUFFIX		".m"
+#define	USER_SUFFIX		".u"
+
+#define	XOCL_FEATURE_ROM_USER	"rom" USER_SUFFIX
+#define XOCL_FEATURE_ROM	"rom" SUBDEV_SUFFIX
+#define XOCL_XDMA		"xdma" SUBDEV_SUFFIX
+#define XOCL_QDMA		"qdma" SUBDEV_SUFFIX
+#define XOCL_MB_SCHEDULER	"mb_scheduler" SUBDEV_SUFFIX
+#define XOCL_XVC_PUB		"xvc_pub" SUBDEV_SUFFIX
+#define XOCL_XVC_PRI		"xvc_pri" SUBDEV_SUFFIX
+#define XOCL_SYSMON		"sysmon" SUBDEV_SUFFIX
+#define XOCL_FIREWALL		"firewall" SUBDEV_SUFFIX
+#define	XOCL_MB			"microblaze" SUBDEV_SUFFIX
+#define	XOCL_XIIC		"xiic" SUBDEV_SUFFIX
+#define	XOCL_MAILBOX		"mailbox" SUBDEV_SUFFIX
+#define	XOCL_ICAP		"icap" SUBDEV_SUFFIX
+#define	XOCL_MIG		"mig" SUBDEV_SUFFIX
+#define	XOCL_XMC		"xmc" SUBDEV_SUFFIX
+#define	XOCL_DNA		"dna" SUBDEV_SUFFIX
+#define	XOCL_FMGR		"fmgr" SUBDEV_SUFFIX
+
+enum subdev_id {
+	XOCL_SUBDEV_FEATURE_ROM,
+	XOCL_SUBDEV_DMA,
+	XOCL_SUBDEV_MB_SCHEDULER,
+	XOCL_SUBDEV_XVC_PUB,
+	XOCL_SUBDEV_XVC_PRI,
+	XOCL_SUBDEV_SYSMON,
+	XOCL_SUBDEV_AF,
+	XOCL_SUBDEV_MIG,
+	XOCL_SUBDEV_MB,
+	XOCL_SUBDEV_XIIC,
+	XOCL_SUBDEV_MAILBOX,
+	XOCL_SUBDEV_ICAP,
+	XOCL_SUBDEV_XMC,
+	XOCL_SUBDEV_DNA,
+	XOCL_SUBDEV_FMGR,
+	XOCL_SUBDEV_NUM
+};
+
+#define	XOCL_RES_FEATURE_ROM				\
+		((struct resource []) {			\
+			{				\
+			.start	= 0xB0000,		\
+			.end	= 0xB0FFF,		\
+			.flags	= IORESOURCE_MEM,	\
+			}				\
+		})
+
+
+#define	XOCL_DEVINFO_FEATURE_ROM			\
+	{						\
+		XOCL_SUBDEV_FEATURE_ROM,		\
+		XOCL_FEATURE_ROM,			\
+		XOCL_RES_FEATURE_ROM,			\
+		ARRAY_SIZE(XOCL_RES_FEATURE_ROM),	\
+	}
+
+#define	XOCL_RES_SYSMON					\
+		((struct resource []) {			\
+			{				\
+			.start	= 0xA0000,		\
+			.end	= 0xAFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			}				\
+		})
+
+#define	XOCL_DEVINFO_SYSMON				\
+	{						\
+		XOCL_SUBDEV_SYSMON,			\
+		XOCL_SYSMON,				\
+		XOCL_RES_SYSMON,			\
+		ARRAY_SIZE(XOCL_RES_SYSMON),		\
+	}
+
+/* Will be populated dynamically */
+#define	XOCL_RES_MIG					\
+		((struct resource []) {			\
+			{				\
+			.start	= 0x0,			\
+			.end	= 0x3FF,		\
+			.flags  = IORESOURCE_MEM,	\
+			}				\
+		})
+
+#define	XOCL_DEVINFO_MIG				\
+	{						\
+		XOCL_SUBDEV_MIG,			\
+		XOCL_MIG,				\
+		XOCL_RES_MIG,				\
+		ARRAY_SIZE(XOCL_RES_MIG),		\
+	}
+
+
+#define	XOCL_RES_AF					\
+		((struct resource []) {			\
+			{				\
+			.start	= 0xD0000,		\
+			.end	= 0xDFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0xE0000,		\
+			.end	= 0xEFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0xF0000,		\
+			.end	= 0xFFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x330000,		\
+			.end	= 0x330FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+		})
+
+#define	XOCL_DEVINFO_AF					\
+	{						\
+		XOCL_SUBDEV_AF,				\
+		XOCL_FIREWALL,				\
+		XOCL_RES_AF,				\
+		ARRAY_SIZE(XOCL_RES_AF),		\
+	}
+
+#define	XOCL_RES_AF_DSA52				\
+		((struct resource []) {			\
+			{				\
+			.start	= 0xD0000,		\
+			.end	= 0xDFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0xE0000,		\
+			.end	= 0xE0FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0xE1000,		\
+			.end	= 0xE1FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0xF0000,		\
+			.end	= 0xFFFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x330000,		\
+			.end	= 0x330FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+		})
+
+#define	XOCL_DEVINFO_AF_DSA52				\
+	{						\
+		XOCL_SUBDEV_AF,				\
+		XOCL_FIREWALL,				\
+		XOCL_RES_AF_DSA52,			\
+		ARRAY_SIZE(XOCL_RES_AF_DSA52),		\
+	}
+
+#define	XOCL_RES_XVC_PUB				\
+	((struct resource []) {				\
+		{					\
+			.start	= 0xC0000,		\
+			.end	= 0xCFFFF,		\
+			.flags	= IORESOURCE_MEM,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_XVC_PUB				\
+	{						\
+		XOCL_SUBDEV_XVC_PUB,			\
+		XOCL_XVC_PUB,				\
+		XOCL_RES_XVC_PUB,			\
+		ARRAY_SIZE(XOCL_RES_XVC_PUB),		\
+	}
+
+#define	XOCL_RES_XVC_PRI				\
+	((struct resource []) {				\
+		{					\
+			.start	= 0x1C0000,		\
+			.end	= 0x1CFFFF,		\
+			.flags	= IORESOURCE_MEM,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_XVC_PRI				\
+	{						\
+		XOCL_SUBDEV_XVC_PRI,			\
+		XOCL_XVC_PRI,				\
+		XOCL_RES_XVC_PRI,			\
+		ARRAY_SIZE(XOCL_RES_XVC_PRI),		\
+	}
+
+#define	XOCL_RES_XIIC					\
+	((struct resource []) {				\
+		{					\
+			.start	= 0x41000,		\
+			.end	= 0x41FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_XIIC				\
+	{						\
+		XOCL_SUBDEV_XIIC,			\
+		XOCL_XIIC,				\
+		XOCL_RES_XIIC,				\
+		ARRAY_SIZE(XOCL_RES_XIIC),		\
+	}
+
+
+/* Will be populated dynamically */
+#define	XOCL_RES_DNA					\
+	((struct resource []) {				\
+		{					\
+			.start	= 0x0,			\
+			.end	= 0xFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+		}					\
+	})
+
+#define	XOCL_DEVINFO_DNA				\
+	{						\
+		XOCL_SUBDEV_DNA,			\
+		XOCL_DNA,				\
+		XOCL_RES_DNA,				\
+		ARRAY_SIZE(XOCL_RES_DNA),		\
+	}
+
+#define	XOCL_MAILBOX_OFFSET_MGMT	0x210000
+#define	XOCL_RES_MAILBOX_MGMT				\
+	((struct resource []) {				\
+		{					\
+			.start	= XOCL_MAILBOX_OFFSET_MGMT, \
+			.end	= 0x21002F,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		{					\
+			.start	= 11,			\
+			.end	= 11,			\
+			.flags  = IORESOURCE_IRQ,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_MAILBOX_MGMT			\
+	{						\
+		XOCL_SUBDEV_MAILBOX,			\
+		XOCL_MAILBOX,				\
+		XOCL_RES_MAILBOX_MGMT,			\
+		ARRAY_SIZE(XOCL_RES_MAILBOX_MGMT),	\
+	}
+
+#define	XOCL_MAILBOX_OFFSET_USER	0x200000
+#define	XOCL_RES_MAILBOX_USER				\
+	((struct resource []) {				\
+		{					\
+			.start	= XOCL_MAILBOX_OFFSET_USER, \
+			.end	= 0x20002F,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		{					\
+			.start	= 4,			\
+			.end	= 4,			\
+			.flags  = IORESOURCE_IRQ,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_MAILBOX_USER			\
+	{						\
+		XOCL_SUBDEV_MAILBOX,			\
+		XOCL_MAILBOX,				\
+		XOCL_RES_MAILBOX_USER,			\
+		ARRAY_SIZE(XOCL_RES_MAILBOX_USER),	\
+	}
+
+#define	XOCL_RES_ICAP_MGMT				\
+	((struct resource []) {				\
+		/* HWICAP registers */			\
+		{					\
+			.start	= 0x020000,		\
+			.end	= 0x020119,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		/* GENERAL_STATUS_BASE */		\
+		{					\
+			.start	= 0x032000,		\
+			.end	= 0x032003,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		/* AXI Gate registers */		\
+		{					\
+			.start	= 0x030000,		\
+			.end	= 0x03000b,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		/* OCL_CLKWIZ0_BASE */			\
+		{					\
+			.start	= 0x050000,		\
+			.end	= 0x050fff,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		/* OCL_CLKWIZ1_BASE */			\
+		{					\
+			.start	= 0x051000,		\
+			.end	= 0x051fff,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+		/* OCL_CLKFREQ_BASE */			\
+		{					\
+			.start	= 0x052000,		\
+			.end	= 0x052fff,		\
+			.flags  = IORESOURCE_MEM,	\
+		},					\
+	})
+
+#define	XOCL_DEVINFO_ICAP_MGMT				\
+	{						\
+		XOCL_SUBDEV_ICAP,			\
+		XOCL_ICAP,				\
+		XOCL_RES_ICAP_MGMT,			\
+		ARRAY_SIZE(XOCL_RES_ICAP_MGMT),		\
+	}
+
+#define	XOCL_DEVINFO_ICAP_USER				\
+	{						\
+		XOCL_SUBDEV_ICAP,			\
+		XOCL_ICAP,				\
+		NULL,					\
+		0,					\
+	}
+
+#define	XOCL_RES_XMC					\
+		((struct resource []) {			\
+			{				\
+			.start	= 0x120000,		\
+			.end	= 0x121FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x131000,		\
+			.end	= 0x131FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x140000,		\
+			.end	= 0x15FFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x160000,		\
+			.end	= 0x17FFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x190000,		\
+			.end	= 0x19FFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+		})
+
+#define	XOCL_DEVINFO_XMC					\
+	{						\
+		XOCL_SUBDEV_XMC,				\
+		XOCL_XMC,				\
+		XOCL_RES_XMC,				\
+		ARRAY_SIZE(XOCL_RES_XMC),		\
+	}
+
+#define	XOCL_DEVINFO_XMC_USER			\
+	{						\
+		XOCL_SUBDEV_XMC,				\
+		XOCL_XMC,				\
+		NULL,					\
+		0,					\
+	}
+
+#define	XOCL_RES_MB					\
+		((struct resource []) {			\
+			{				\
+			.start	= 0x120000,		\
+			.end	= 0x121FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x131000,		\
+			.end	= 0x131FFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x140000,		\
+			.end	= 0x15FFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+			{				\
+			.start	= 0x160000,		\
+			.end	= 0x17FFFF,		\
+			.flags  = IORESOURCE_MEM,	\
+			},				\
+		})
+
+#define	XOCL_DEVINFO_MB					\
+	{						\
+		XOCL_SUBDEV_MB,				\
+		XOCL_MB,				\
+		XOCL_RES_MB,				\
+		ARRAY_SIZE(XOCL_RES_MB),		\
+	}
+
+#define	XOCL_DEVINFO_QDMA				\
+	{						\
+		XOCL_SUBDEV_DMA,			\
+		XOCL_QDMA,				\
+		NULL,					\
+		0,					\
+	}
+
+#define	XOCL_DEVINFO_XDMA				\
+	{						\
+		XOCL_SUBDEV_DMA,			\
+		XOCL_XDMA,				\
+		NULL,					\
+		0,					\
+	}
+
+#define XOCL_RES_SCHEDULER				\
+	((struct resource []) {				\
+		{					\
+			.start  = 0,			\
+			.end    = 3,			\
+			.flags  = IORESOURCE_IRQ,	\
+		}					\
+	})
+
+
+#define	XOCL_DEVINFO_SCHEDULER				\
+	{						\
+		XOCL_SUBDEV_MB_SCHEDULER,		\
+		XOCL_MB_SCHEDULER,			\
+		XOCL_RES_SCHEDULER,			\
+		ARRAY_SIZE(XOCL_RES_SCHEDULER),		\
+	}
+
+#define	XOCL_DEVINFO_FMGR				\
+	{						\
+		XOCL_SUBDEV_FMGR,			\
+		XOCL_FMGR,				\
+		NULL,					\
+		0,					\
+	}
+
+
+/* user pf defines */
+#define	USER_RES_QDMA							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_QDMA,				\
+			XOCL_DEVINFO_SCHEDULER,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_ICAP_USER,				\
+		})
+
+#define	XOCL_BOARD_USER_QDMA						\
+	(struct xocl_board_private){					\
+		.flags		= XOCL_DSAFLAG_MB_SCHE_OFF,		\
+		.subdev_info	= USER_RES_QDMA,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_QDMA),		\
+	}
+
+#define	USER_RES_XDMA_DSA50						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_XDMA,				\
+			XOCL_DEVINFO_SCHEDULER,				\
+			XOCL_DEVINFO_ICAP_USER,				\
+		})
+
+#define	USER_RES_XDMA							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_XDMA,				\
+			XOCL_DEVINFO_SCHEDULER,				\
+			XOCL_DEVINFO_MAILBOX_USER,			\
+			XOCL_DEVINFO_ICAP_USER,				\
+		})
+
+#define USER_RES_AWS							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_XDMA,				\
+			XOCL_DEVINFO_SCHEDULER,				\
+			XOCL_DEVINFO_ICAP_USER,				\
+		})
+
+#define	USER_RES_DSA52							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_XDMA,				\
+			XOCL_DEVINFO_SCHEDULER,				\
+			XOCL_DEVINFO_MAILBOX_USER,			\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_ICAP_USER,				\
+			XOCL_DEVINFO_XMC_USER,				\
+		})
+
+#define	XOCL_BOARD_USER_XDMA_DSA50					\
+	(struct xocl_board_private){					\
+		.flags		= XOCL_DSAFLAG_MB_SCHE_OFF,		\
+		.subdev_info	= USER_RES_XDMA_DSA50,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_XDMA_DSA50),		\
+	}
+
+#define	XOCL_BOARD_USER_XDMA						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= USER_RES_XDMA,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_XDMA),		\
+	}
+
+#define	XOCL_BOARD_USER_XDMA_ERT_OFF					\
+	(struct xocl_board_private){					\
+		.flags		= XOCL_DSAFLAG_MB_SCHE_OFF,		\
+		.subdev_info	= USER_RES_XDMA,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_XDMA),		\
+	}
+
+#define XOCL_BOARD_USER_AWS						\
+	(struct xocl_board_private){					\
+		.flags      = 0,					\
+		.subdev_info    = USER_RES_AWS,				\
+		.subdev_num = ARRAY_SIZE(USER_RES_AWS),			\
+	}
+
+#define	XOCL_BOARD_USER_DSA52						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= USER_RES_DSA52,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_DSA52),		\
+	}
+
+/* mgmt pf defines */
+#define	MGMT_RES_DEFAULT						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_XIIC,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	MGMT_RES_DSA50							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_XIIC,				\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_DEFAULT						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_DEFAULT,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_DEFAULT),		\
+	}
+
+#define	XOCL_BOARD_MGMT_DSA50						\
+	(struct xocl_board_private){					\
+		.flags		= XOCL_DSAFLAG_PCI_RESET_OFF |		\
+			XOCL_DSAFLAG_AXILITE_FLUSH |			\
+			XOCL_DSAFLAG_MB_SCHE_OFF,			\
+		.subdev_info	= MGMT_RES_DSA50,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_DSA50),		\
+	}
+
+#define	MGMT_RES_6A8F							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	MGMT_RES_6A8F_DSA50						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	MGMT_RES_XBB_DSA51						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_XMC,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_6A8F						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_6A8F,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_6A8F),		\
+	}
+
+#define	XOCL_BOARD_MGMT_XBB_DSA51						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_XBB_DSA51,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_XBB_DSA51),		\
+		.flash_type = FLASH_TYPE_SPI,				\
+	}
+
+
+#define	XOCL_BOARD_MGMT_888F	XOCL_BOARD_MGMT_6A8F
+#define	XOCL_BOARD_MGMT_898F	XOCL_BOARD_MGMT_6A8F
+
+#define	XOCL_BOARD_MGMT_6A8F_DSA50					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_6A8F_DSA50,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_6A8F_DSA50),		\
+	}
+
+#define	MGMT_RES_QDMA							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PRI,				\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+
+#define	XOCL_BOARD_MGMT_QDMA					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_QDMA,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_QDMA),		\
+		.flash_type = FLASH_TYPE_SPI				\
+	}
+
+#define MGMT_RES_XBB_QDMA                                               \
+	((struct xocl_subdev_info []) {                         \
+		XOCL_DEVINFO_FEATURE_ROM,                       \
+		XOCL_DEVINFO_SYSMON,                            \
+		XOCL_DEVINFO_AF_DSA52,                          \
+		XOCL_DEVINFO_XMC,                               \
+		XOCL_DEVINFO_XVC_PRI,                           \
+		XOCL_DEVINFO_ICAP_MGMT,                         \
+	})
+
+#define XOCL_BOARD_MGMT_XBB_QDMA                                        \
+	(struct xocl_board_private){                                    \
+		.flags          = 0,                                    \
+		.subdev_info    = MGMT_RES_XBB_QDMA,                    \
+		.subdev_num = ARRAY_SIZE(MGMT_RES_XBB_QDMA),            \
+		.flash_type = FLASH_TYPE_SPI				\
+	}
+
+#define	XOCL_BOARD_MGMT_6B0F		XOCL_BOARD_MGMT_6A8F
+
+#define	MGMT_RES_6A8F_DSA52						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF_DSA52,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PRI,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_6A8F_DSA52					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_6A8F_DSA52,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_6A8F_DSA52),		\
+	}
+
+#define	MGMT_RES_XBB_DSA52						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF_DSA52,				\
+			XOCL_DEVINFO_XMC,				\
+			XOCL_DEVINFO_XVC_PRI,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_XBB_DSA52					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_XBB_DSA52,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_XBB_DSA52),		\
+		.flash_type = FLASH_TYPE_SPI,				\
+	}
+
+#define	MGMT_RES_6E8F_DSA52						\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_AF,				\
+			XOCL_DEVINFO_MB,				\
+			XOCL_DEVINFO_XVC_PRI,				\
+			XOCL_DEVINFO_XIIC,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_6E8F_DSA52					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_6E8F_DSA52,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_6E8F_DSA52),		\
+	}
+
+#define MGMT_RES_MPSOC							\
+		((struct xocl_subdev_info []) {				\
+			XOCL_DEVINFO_FEATURE_ROM,			\
+			XOCL_DEVINFO_SYSMON,				\
+			XOCL_DEVINFO_XVC_PUB,				\
+			XOCL_DEVINFO_MAILBOX_MGMT,			\
+			XOCL_DEVINFO_ICAP_MGMT,				\
+			XOCL_DEVINFO_FMGR,				\
+		})
+
+#define	XOCL_BOARD_MGMT_MPSOC						\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= MGMT_RES_MPSOC,			\
+		.subdev_num = ARRAY_SIZE(MGMT_RES_MPSOC),		\
+		.mpsoc = true,						\
+		.board_name = "samsung",				\
+		.flash_type = FLASH_TYPE_QSPIPS,			\
+	}
+
+#define	XOCL_BOARD_USER_XDMA_MPSOC					\
+	(struct xocl_board_private){					\
+		.flags		= 0,					\
+		.subdev_info	= USER_RES_XDMA,			\
+		.subdev_num = ARRAY_SIZE(USER_RES_XDMA),		\
+		.mpsoc = true,						\
+	}
+
+#define	XOCL_BOARD_XBB_MFG(board)					\
+	(struct xocl_board_private){					\
+		.flags = XOCL_DSAFLAG_MFG,				\
+		.board_name = board,					\
+		.flash_type = FLASH_TYPE_SPI,				\
+	}
+
+#define	XOCL_MGMT_PCI_IDS	{					\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4A47, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4A87, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B47, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B87, 0x4350, MGMT_DSA50) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B87, 0x4351, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x684F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0xA883, 0x1351, MGMT_MPSOC) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0xA983, 0x1351, MGMT_MPSOC) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x688F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x694F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x698F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A4F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A8F, 0x4350, MGMT_6A8F_DSA50) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A8F, 0x4351, MGMT_6A8F) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A8F, 0x4352, MGMT_6A8F_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A9F, 0x4360, MGMT_QDMA) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5010, PCI_ANY_ID, MGMT_XBB_QDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A9F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6E4F, PCI_ANY_ID, MGMT_DEFAULT) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6B0F, PCI_ANY_ID, MGMT_6B0F) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6E8F, 0x4352, MGMT_6E8F_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x888F, PCI_ANY_ID, MGMT_888F) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x898F, PCI_ANY_ID, MGMT_898F) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x788F, 0x4351, MGMT_XBB_DSA51) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x788F, 0x4352, MGMT_XBB_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x798F, 0x4352, MGMT_XBB_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A8F, 0x4353, MGMT_6A8F_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5000, PCI_ANY_ID, MGMT_XBB_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5004, PCI_ANY_ID, MGMT_XBB_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5008, PCI_ANY_ID, MGMT_XBB_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x13FE, 0x006C, PCI_ANY_ID, MGMT_6A8F) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0xD000, PCI_ANY_ID, XBB_MFG("u200")) }, \
+	{ XOCL_PCI_DEVID(0x10EE, 0xD004, PCI_ANY_ID, XBB_MFG("u250")) }, \
+	{ XOCL_PCI_DEVID(0x10EE, 0xD008, PCI_ANY_ID, XBB_MFG("u280-es1")) }, \
+	{ XOCL_PCI_DEVID(0x10EE, 0xD00C, PCI_ANY_ID, XBB_MFG("u280")) }, \
+	{ XOCL_PCI_DEVID(0x10EE, 0xEB10, PCI_ANY_ID, XBB_MFG("twitch")) }, \
+	{ 0, }								\
+}
+
+#define	XOCL_USER_PCI_IDS	{					\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4A48, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4A88, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B48, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B88, 0x4350, USER_XDMA_DSA50) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x4B88, 0x4351, USER_XDMA) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6850, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6890, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6950, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0xA884, 0x1351, USER_XDMA_MPSOC) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0xA984, 0x1351, USER_XDMA_MPSOC) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6990, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A50, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A90, 0x4350, USER_XDMA_DSA50) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A90, 0x4351, USER_XDMA) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A90, 0x4352, USER_DSA52) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6A90, 0x4353, USER_DSA52) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6E50, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6B10, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6E90, 0x4352, USER_DSA52) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x8890, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x8990, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x7890, 0x4351, USER_XDMA) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x7890, 0x4352, USER_DSA52) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x7990, 0x4352, USER_DSA52) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5001, PCI_ANY_ID, USER_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5005, PCI_ANY_ID, USER_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5009, PCI_ANY_ID, USER_DSA52) },	\
+	{ XOCL_PCI_DEVID(0x13FE, 0x0065, PCI_ANY_ID, USER_XDMA) },	\
+	{ XOCL_PCI_DEVID(0x1D0F, 0x1042, PCI_ANY_ID, USER_AWS) },	\
+	{ XOCL_PCI_DEVID(0x1D0F, 0xF000, PCI_ANY_ID, USER_AWS) },	\
+	{ XOCL_PCI_DEVID(0x1D0F, 0xF010, PCI_ANY_ID, USER_AWS) },	\
+	{ XOCL_PCI_DEVID(0x10EE, 0x6AA0, 0x4360, USER_QDMA) },		\
+	{ XOCL_PCI_DEVID(0x10EE, 0x5011, PCI_ANY_ID, USER_QDMA) },	\
+	{ 0, }								\
+}
+
+#define XOCL_DSA_VBNV_MAP	{					\
+	{ 0x10EE, 0x5001, PCI_ANY_ID, "xilinx_u200_xdma_201820_1",	\
+		&XOCL_BOARD_USER_XDMA },				\
+	{ 0x10EE, 0x5000, PCI_ANY_ID, "xilinx_u200_xdma_201820_1",	\
+		&XOCL_BOARD_MGMT_XBB_DSA51 }				\
+}
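
For illustration (not part of this patch), and assuming the XOCL_PCI_DEVID macro defined elsewhere stores a pointer to the matching xocl_board_private in the PCI id table's driver_data field, a probe routine could recover the per-board configuration roughly as follows:

/* Illustrative sketch only: fetch the board description selected by the
 * PCI id tables above. The driver_data encoding is an assumption here.
 */
static int example_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct xocl_board_private *priv =
		(struct xocl_board_private *)id->driver_data;

	return (priv && priv->subdev_num) ? 0 : -EINVAL;
}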
+
+#endif
diff --git a/drivers/gpu/drm/xocl/ert.h b/drivers/gpu/drm/xocl/ert.h
new file mode 100644
index 000000000000..2e94c5a63877
--- /dev/null
+++ b/drivers/gpu/drm/xocl/ert.h
@@ -0,0 +1,385 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/**
+ * Copyright (C) 2017-2019, Xilinx Inc
+ * DOC: Xilinx SDAccel Embedded Runtime definition
+ *
+ * Header file *ert.h* defines data structures used by the Embedded Runtime (ERT)
+ * and the XRT xclExecBuf() API.
+ */
+
+#ifndef _ERT_H_
+#define _ERT_H_
+
+#if defined(__KERNEL__)
+# include <linux/types.h>
+#else
+# include <stdint.h>
+#endif
+
+/**
+ * struct ert_packet: ERT generic packet format
+ *
+ * @state:   [3-0] current state of a command
+ * @custom:  [11-4] custom per specific commands
+ * @count:   [22-12] number of words in payload (data)
+ * @opcode:  [27-23] opcode identifying specific command
+ * @type:    [31-28] type of command (currently 0)
+ * @data:    count number of words representing packet payload
+ */
+struct ert_packet {
+  union {
+    struct {
+      uint32_t state:4;   /* [3-0]   */
+      uint32_t custom:8;  /* [11-4]  */
+      uint32_t count:11;  /* [22-12] */
+      uint32_t opcode:5;  /* [27-23] */
+      uint32_t type:4;    /* [31-28] */
+    };
+    uint32_t header;
+  };
+  uint32_t data[1];   /* count number of words */
+};
+
+/**
+ * struct ert_start_kernel_cmd: ERT start kernel command format
+ *
+ * @state:           [3-0] current state of a command
+ * @extra_cu_masks:  [11-10] extra CU masks in addition to mandatory mask
+ * @count:           [22-12] number of words following header
+ * @opcode:          [27-23] 0, opcode for start_kernel
+ * @type:            [31-28] 0, type of start_kernel
+ *
+ * @cu_mask:         first mandatory CU mask
+ * @data:            count-1 number of words representing interpreted payload
+ *
+ * The packet payload comprises a reserved id field, a mandatory CU mask, and
+ * extra_cu_masks per the header field, followed by a CU register map of size
+ * (count - (1 + extra_cu_masks)) uint32_t words.
+ */
+struct ert_start_kernel_cmd {
+  union {
+    struct {
+      uint32_t state:4;          /* [3-0]   */
+      uint32_t unused:6;         /* [9-4]  */
+      uint32_t extra_cu_masks:2; /* [11-10]  */
+      uint32_t count:11;         /* [22-12] */
+      uint32_t opcode:5;         /* [27-23] */
+      uint32_t type:4;           /* [31-28] */
+    };
+    uint32_t header;
+  };
+
+  /* payload */
+  uint32_t cu_mask;          /* mandatory cu mask */
+  uint32_t data[1];          /* count-1 number of words */
+};
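
For illustration (not part of this patch), a minimal sketch of how a host-side client could populate this packet; the CU index, register-map size and memcpy-based fill are assumptions:

/* Illustrative sketch only: start CU 2 with a caller-supplied register map.
 * count is the number of words after the header: the mandatory cu_mask plus
 * the regmap words (extra_cu_masks is left at 0 here).
 */
static void example_fill_start_cu(struct ert_start_kernel_cmd *cmd,
				  const uint32_t *regmap, uint32_t regmap_words)
{
	cmd->state = ERT_CMD_STATE_NEW;
	cmd->extra_cu_masks = 0;
	cmd->opcode = ERT_START_CU;
	cmd->type = ERT_DEFAULT;
	cmd->count = 1 + regmap_words;
	cmd->cu_mask = 1 << 2;		/* candidate CUs: CU 2 only */
	memcpy(cmd->data, regmap, regmap_words * sizeof(uint32_t));
}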
+
+#define COPYBO_UNIT   64     /* Limited by KDMA CU */
+struct ert_start_copybo_cmd {
+  uint32_t state:4;          /* [3-0], must be ERT_CMD_STATE_NEW */
+  uint32_t unused:6;         /* [9-4] */
+  uint32_t extra_cu_masks:2; /* [11-10], = 3 */
+  uint32_t count:11;         /* [22-12], = sizeof(ert_start_copybo_cmd) / 4 - 1 */
+  uint32_t opcode:5;         /* [27-23], = ERT_START_COPYBO */
+  uint32_t type:4;           /* [31-28], = ERT_DEFAULT */
+  uint32_t cu_mask[4];       /* mandatory cu masks */
+  uint32_t reserved[4];      /* for scheduler use */
+  uint32_t src_addr_lo;      /* low 32 bit of src addr */
+  uint32_t src_addr_hi;      /* high 32 bit of src addr */
+  uint32_t src_bo_hdl;       /* src bo handle, cleared by driver */
+  uint32_t dst_addr_lo;      /* low 32 bit of dst addr */
+  uint32_t dst_addr_hi;      /* high 32 bit of dst addr */
+  uint32_t dst_bo_hdl;       /* dst bo handle, cleared by driver */
+  uint32_t size;             /* size in units of COPYBO_UNIT bytes */
+};
+
+/**
+ * struct ert_configure_cmd: ERT configure command format
+ *
+ * @state:           [3-0] current state of a command
+ * @count:           [22-12] number of words in payload (5 + num_cus)
+ * @opcode:          [27-23] 2, opcode for configure
+ * @type:            [31-28] 0, type of configure
+ *
+ * @slot_size:       command queue slot size
+ * @num_cus:         number of compute units in program
+ * @cu_shift:        shift value to convert CU idx to CU addr
+ * @cu_base_addr:    base address to add to CU addr for actual physical address
+ *
+ * @ert:1            enable embedded HW scheduler
+ * @polling:1        poll for command completion
+ * @cu_dma:1         enable CUDMA custom module for HW scheduler
+ * @cu_isr:1         enable CUISR custom module for HW scheduler
+ * @cq_int:1         enable interrupt from host to HW scheduler
+ * @cdma:1           enable CDMA kernel
+ * @unusedf:25
+ * @dsa52:1          reserved for internal use
+ *
+ * @data:            addresses of @num_cus CUs
+ */
+struct ert_configure_cmd {
+  union {
+    struct {
+      uint32_t state:4;          /* [3-0]   */
+      uint32_t unused:8;         /* [11-4]  */
+      uint32_t count:11;         /* [22-12] */
+      uint32_t opcode:5;         /* [27-23] */
+      uint32_t type:4;           /* [31-28] */
+    };
+    uint32_t header;
+  };
+
+  /* payload */
+  uint32_t slot_size;
+  uint32_t num_cus;
+  uint32_t cu_shift;
+  uint32_t cu_base_addr;
+
+  /* features */
+  uint32_t ert:1;
+  uint32_t polling:1;
+  uint32_t cu_dma:1;
+  uint32_t cu_isr:1;
+  uint32_t cq_int:1;
+  uint32_t cdma:1;
+  uint32_t unusedf:25;
+  uint32_t dsa52:1;
+
+  /* cu address map size is num_cus */
+  uint32_t data[1];
+};
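
A sketch of how a configure command could be assembled (illustrative only; the slot size, CU count and addresses are placeholders, and the remaining feature bits are assumed to have been zeroed by the caller):

/* Illustrative sketch only: configure the scheduler for two CUs with 64K
 * register maps. Payload is 5 words (slot_size through the features word)
 * plus one CU address per CU, matching the count documented above.
 */
static void example_fill_configure(struct ert_configure_cmd *cmd)
{
	cmd->state = ERT_CMD_STATE_NEW;
	cmd->opcode = ERT_CONFIGURE;
	cmd->type = ERT_DEFAULT;
	cmd->slot_size = 4096;		/* placeholder */
	cmd->num_cus = 2;
	cmd->cu_shift = 16;		/* 64K = 2^16 per CU */
	cmd->cu_base_addr = 0x1800000;	/* placeholder */
	cmd->ert = 1;			/* use the embedded scheduler */
	cmd->cu_dma = 1;
	cmd->count = 5 + cmd->num_cus;
	cmd->data[0] = 0x1800000;	/* CU 0 address, placeholder */
	cmd->data[1] = 0x1810000;	/* CU 1 address, placeholder */
}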
+
+/**
+ * struct ert_abort_cmd: ERT abort command format.
+ *
+ * @idx: The slot index of command to abort
+ */
+struct ert_abort_cmd {
+  union {
+    struct {
+      uint32_t state:4;          /* [3-0]   */
+      uint32_t unused:11;        /* [14-4]  */
+      uint32_t idx:8;            /* [22-15] */
+      uint32_t opcode:5;         /* [27-23] */
+      uint32_t type:4;           /* [31-28] */
+    };
+    uint32_t header;
+  };
+};
+
+/**
+ * ERT command state
+ *
+ * @ERT_CMD_STATE_NEW:      Set by host before submitting a command to scheduler
+ * @ERT_CMD_STATE_QUEUED:   Internal scheduler state
+ * @ERT_CMD_STATE_SUBMITTED:Internal scheduler state
+ * @ERT_CMD_STATE_RUNNING:  Internal scheduler state
+ * @ERT_CMD_STATE_COMPLETED: Set by scheduler when command completes
+ * @ERT_CMD_STATE_ERROR:    Set by scheduler if command failed
+ * @ERT_CMD_STATE_ABORT:    Set by scheduler if command was aborted
+ */
+enum ert_cmd_state {
+  ERT_CMD_STATE_NEW = 1,
+  ERT_CMD_STATE_QUEUED = 2,
+  ERT_CMD_STATE_RUNNING = 3,
+  ERT_CMD_STATE_COMPLETED = 4,
+  ERT_CMD_STATE_ERROR = 5,
+  ERT_CMD_STATE_ABORT = 6,
+  ERT_CMD_STATE_SUBMITTED = 7,
+};
+
+/**
+ * Opcode types for commands
+ *
+ * @ERT_START_CU:       start a workgroup on a CU
+ * @ERT_START_KERNEL:   currently aliased to ERT_START_CU
+ * @ERT_CONFIGURE:      configure command scheduler
+ * @ERT_WRITE:          write pairs of addr and value
+ * @ERT_CU_STAT:        get stats about CU execution
+ * @ERT_START_COPYBO:   start KDMA CU or P2P, may be converted to ERT_START_CU
+ *                      before the command reaches the scheduler; short-term hack
+ */
+enum ert_cmd_opcode {
+  ERT_START_CU     = 0,
+  ERT_START_KERNEL = 0,
+  ERT_CONFIGURE    = 2,
+  ERT_STOP         = 3,
+  ERT_ABORT        = 4,
+  ERT_WRITE        = 5,
+  ERT_CU_STAT      = 6,
+  ERT_START_COPYBO = 7,
+};
+
+/**
+ * Command types
+ *
+ * @ERT_DEFAULT:        default command type
+ * @ERT_KDS_LOCAL:      command processed by KDS locally
+ * @ERT_CTRL:           control command uses reserved command queue slot
+ */
+enum ert_cmd_type {
+  ERT_DEFAULT = 0,
+  ERT_KDS_LOCAL = 1,
+  ERT_CTRL = 2,
+};
+
+/**
+ * Address constants per spec
+ */
+#define ERT_WORD_SIZE                     4          /* 4 bytes */
+#define ERT_CQ_SIZE                       0x10000    /* 64K */
+#define ERT_CQ_BASE_ADDR                  0x190000
+#define ERT_CSR_ADDR                      0x180000
+
+/**
+ * The STATUS REGISTER is for communicating completed CQ slot indices
+ * MicroBlaze writes, host reads.  MB(W) / HOST(COR)
+ */
+#define ERT_STATUS_REGISTER_ADDR          (ERT_CSR_ADDR)
+#define ERT_STATUS_REGISTER_ADDR0         (ERT_CSR_ADDR)
+#define ERT_STATUS_REGISTER_ADDR1         (ERT_CSR_ADDR + 0x4)
+#define ERT_STATUS_REGISTER_ADDR2         (ERT_CSR_ADDR + 0x8)
+#define ERT_STATUS_REGISTER_ADDR3         (ERT_CSR_ADDR + 0xC)
+
+/**
+ * The CU DMA REGISTER is for communicating which CQ slot is to be started
+ * on a specific CU.  MB selects a free CU on which the command can
+ * run, then writes the 1<<CU back to the command slot CU mask and
+ * writes the slot index to the CU DMA REGISTER.  HW is notified when
+ * the register is written and then does the DMA transfer of the CU
+ * regmap from command to CU, while MB continues its work. MB(W) / HW(R)
+ */
+#define ERT_CU_DMA_ENABLE_ADDR            (ERT_CSR_ADDR + 0x18)
+#define ERT_CU_DMA_REGISTER_ADDR          (ERT_CSR_ADDR + 0x1C)
+#define ERT_CU_DMA_REGISTER_ADDR0         (ERT_CSR_ADDR + 0x1C)
+#define ERT_CU_DMA_REGISTER_ADDR1         (ERT_CSR_ADDR + 0x20)
+#define ERT_CU_DMA_REGISTER_ADDR2         (ERT_CSR_ADDR + 0x24)
+#define ERT_CU_DMA_REGISTER_ADDR3         (ERT_CSR_ADDR + 0x28)
+
+/**
+ * The SLOT SIZE is the size of the slots in the command queue; it is
+ * configurable per xclbin. MB(W) / HW(R)
+ */
+#define ERT_CQ_SLOT_SIZE_ADDR             (ERT_CSR_ADDR + 0x2C)
+
+/**
+ * The CU_OFFSET is the size of a CU's address map as a power of 2.  For
+ * example, a 64K regmap is 2^16, so 16 is written to the CU_OFFSET_ADDR.
+ * MB(W) / HW(R)
+ */
+#define ERT_CU_OFFSET_ADDR                (ERT_CSR_ADDR + 0x30)
+
+/**
+ * The number of slots is command_queue_size / slot_size.
+ * MB(W) / HW(R)
+ */
+#define ERT_CQ_NUMBER_OF_SLOTS_ADDR       (ERT_CSR_ADDR + 0x34)
+
+/**
+ * All CUs are placed in the same address space, separated by CU_OFFSET. The
+ * CU_BASE_ADDRESS is the address of the first CU. MB(W) / HW(R)
+ */
+#define ERT_CU_BASE_ADDRESS_ADDR          (ERT_CSR_ADDR + 0x38)
+
+/**
+ * The CQ_BASE_ADDRESS is the base address of the command queue.
+ * MB(W) / HW(R)
+ */
+#define ERT_CQ_BASE_ADDRESS_ADDR          (ERT_CSR_ADDR + 0x3C)
+
+/**
+ * The CU_ISR_HANDLER_ENABLE (MB(W)/HW(R)) enables the HW handling of
+ * CU interrupts.  When a CU interrupts (when done), hardware handles
+ * the interrupt and writes the index of the CU that completed into
+ * the CU_STATUS_REGISTER (HW(W)/MB(COR)) as a bitmask
+ */
+#define ERT_CU_ISR_HANDLER_ENABLE_ADDR    (ERT_CSR_ADDR + 0x40)
+#define ERT_CU_STATUS_REGISTER_ADDR       (ERT_CSR_ADDR + 0x44)
+#define ERT_CU_STATUS_REGISTER_ADDR0      (ERT_CSR_ADDR + 0x44)
+#define ERT_CU_STATUS_REGISTER_ADDR1      (ERT_CSR_ADDR + 0x48)
+#define ERT_CU_STATUS_REGISTER_ADDR2      (ERT_CSR_ADDR + 0x4C)
+#define ERT_CU_STATUS_REGISTER_ADDR3      (ERT_CSR_ADDR + 0x50)
+
+/**
+ * The CQ_STATUS_ENABLE (MB(W)/HW(R)) enables interrupts from HOST to
+ * MB to indicate the presence of a new command in some slot.  The
+ * slot index is written to the CQ_STATUS_REGISTER (HOST(W)/MB(R))
+ */
+#define ERT_CQ_STATUS_ENABLE_ADDR         (ERT_CSR_ADDR + 0x54)
+#define ERT_CQ_STATUS_REGISTER_ADDR       (ERT_CSR_ADDR + 0x58)
+#define ERT_CQ_STATUS_REGISTER_ADDR0      (ERT_CSR_ADDR + 0x58)
+#define ERT_CQ_STATUS_REGISTER_ADDR1      (ERT_CSR_ADDR + 0x5C)
+#define ERT_CQ_STATUS_REGISTER_ADDR2      (ERT_CSR_ADDR + 0x60)
+#define ERT_CQ_STATUS_REGISTER_ADDR3      (ERT_CSR_ADDR + 0x64)
+
+/**
+ * The NUMBER_OF_CU (MB(W)/HW(R)) is the number of CUs per current
+ * xclbin.  This is an optimization that allows HW to only check CU
+ * completion on actual CUs.
+ */
+#define ERT_NUMBER_OF_CU_ADDR             (ERT_CSR_ADDR + 0x68)
+
+/**
+ * Enable global interrupts from MB to HOST on command completion.
+ * When enabled, writing to the STATUS_REGISTER causes an interrupt in HOST.
+ * MB(W)
+ */
+#define ERT_HOST_INTERRUPT_ENABLE_ADDR    (ERT_CSR_ADDR + 0x100)
+
+/**
+ * Interrupt controller base address
+ * This value is per hardware BSP (XPAR_INTC_SINGLE_BASEADDR)
+ */
+#define ERT_INTC_ADDR                     0x41200000
+
+/**
+ * Look up table for CUISR for CU addresses
+ */
+#define ERT_CUISR_LUT_ADDR                (ERT_CSR_ADDR + 0x400)
+
+/**
+ * ERT stop command/ack
+ */
+#define	ERT_STOP_CMD			  ((ERT_STOP << 23) | ERT_CMD_STATE_NEW)
+#define	ERT_STOP_ACK			  (ERT_CMD_STATE_COMPLETED)
+
+/**
+ * State machine for both CUDMA and CUISR modules
+ */
+#define ERT_HLS_MODULE_IDLE               0x1
+#define ERT_CUDMA_STATE                   (ERT_CSR_ADDR + 0x318)
+#define ERT_CUISR_STATE                   (ERT_CSR_ADDR + 0x328)
+
+/**
+ * Interrupt address masks written by MB when interrupts from
+ * CU are enabled
+ */
+#define ERT_INTC_IPR_ADDR                 (ERT_INTC_ADDR + 0x4)  /* pending */
+#define ERT_INTC_IER_ADDR                 (ERT_INTC_ADDR + 0x8)  /* enable */
+#define ERT_INTC_IAR_ADDR                 (ERT_INTC_ADDR + 0x0C) /* acknowledge */
+#define ERT_INTC_MER_ADDR                 (ERT_INTC_ADDR + 0x1C) /* master enable */
+
+static inline void
+ert_fill_copybo_cmd(struct ert_start_copybo_cmd *pkt, uint32_t src_bo,
+  uint32_t dst_bo, uint64_t src_offset, uint64_t dst_offset, uint64_t size)
+{
+  pkt->state = ERT_CMD_STATE_NEW;
+  pkt->extra_cu_masks = 3;
+  pkt->count = sizeof (struct ert_start_copybo_cmd) / 4 - 1;
+  pkt->opcode = ERT_START_COPYBO;
+  pkt->type = ERT_DEFAULT;
+  pkt->cu_mask[0] = 0;
+  pkt->cu_mask[1] = 0;
+  pkt->cu_mask[2] = 0;
+  pkt->cu_mask[3] = 0;
+  pkt->src_addr_lo = src_offset;
+  pkt->src_addr_hi = (src_offset >> 32) & 0xFFFFFFFF;
+  pkt->src_bo_hdl = src_bo;
+  pkt->dst_addr_lo = dst_offset;
+  pkt->dst_addr_hi = (dst_offset >> 32) & 0xFFFFFFFF;
+  pkt->dst_bo_hdl = dst_bo;
+  pkt->size = size / COPYBO_UNIT;
+}
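
A usage sketch for the helper above (illustrative only; the BO handles and offsets are placeholders, and how the packet is then written to the command queue is outside this header):

/* Illustrative sketch only: prepare a 4 KiB copy between two BOs. */
static void example_prepare_copy(struct ert_start_copybo_cmd *pkt)
{
	uint32_t src_bo = 3, dst_bo = 7;		/* placeholder handles */
	uint64_t src_off = 0, dst_off = 0x10000;	/* byte offsets */

	ert_fill_copybo_cmd(pkt, src_bo, dst_bo, src_off, dst_off, 4096);
	/* pkt->size is now 4096 / COPYBO_UNIT = 64 units */
}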
+
+#endif
diff --git a/drivers/gpu/drm/xocl/version.h b/drivers/gpu/drm/xocl/version.h
new file mode 100644
index 000000000000..bdf3d2c6655a
--- /dev/null
+++ b/drivers/gpu/drm/xocl/version.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+#ifndef _XRT_VERSION_H_
+#define _XRT_VERSION_H_
+
+static const char xrt_build_version[] = "2.2.0";
+
+static const char xrt_build_version_branch[] = "master";
+
+static const char xrt_build_version_hash[] = "2de0f3707ba3b3f1a853006bfd8f75a118907021";
+
+static const char xrt_build_version_hash_date[] = "Mon, 4 Mar 2019 13:26:04 -0800";
+
+static const char xrt_build_version_date_rfc[] = "Mon, 04 Mar 2019 19:33:17 -0800";
+
+static const char xrt_build_version_date[] = "2019-03-04 19:33:17";
+
+static const char xrt_modified_files[] = "";
+
+#define XRT_DRIVER_VERSION "2.2.0,2de0f3707ba3b3f1a853006bfd8f75a118907021"
+
+#endif
diff --git a/drivers/gpu/drm/xocl/xclbin.h b/drivers/gpu/drm/xocl/xclbin.h
new file mode 100644
index 000000000000..4a40e9ab03e7
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xclbin.h
@@ -0,0 +1,314 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/* Copyright (C) 2015-2019, Xilinx Inc */
+
+#ifndef _XCLBIN_H_
+#define _XCLBIN_H_
+
+#if defined(__KERNEL__)
+#include <linux/types.h>
+#include <linux/uuid.h>
+#include <linux/version.h>
+#elif defined(__cplusplus)
+#include <cstdlib>
+#include <cstdint>
+#include <algorithm>
+#include <uuid/uuid.h>
+#else
+#include <stdlib.h>
+#include <stdint.h>
+#include <uuid/uuid.h>
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+    /**
+     * Container format for Xilinx bitstreams, metadata and other
+     * binary blobs.
+     * Every segment must be aligned on an 8 byte boundary, with null byte padding
+     * between adjacent segments if required.
+     * For segments that are not present, both offset and length must be 0 in
+     * the header.
+     * Currently only xclbin2\0 is recognized as file magic. In future, if/when the
+     * file format is updated, the magic string will be changed to xclbin3\0 and so on.
+     */
+    enum XCLBIN_MODE {
+        XCLBIN_FLAT,
+        XCLBIN_PR,
+        XCLBIN_TANDEM_STAGE2,
+        XCLBIN_TANDEM_STAGE2_WITH_PR,
+        XCLBIN_HW_EMU,
+        XCLBIN_SW_EMU,
+        XCLBIN_MODE_MAX
+    };
+
+    /*
+     *  AXLF LAYOUT
+     *  -----------
+     *
+     *  -----------------------------------------
+     *  | Magic                                 |
+     *  -----------------------------------------
+     *  | Header                                |
+     *  -----------------------------------------
+     *  | One or more section headers           |
+     *  -----------------------------------------
+     *  | Matching number of sections with data |
+     *  -----------------------------------------
+     *
+     */
+
+    enum axlf_section_kind {
+        BITSTREAM = 0,
+        CLEARING_BITSTREAM,
+        EMBEDDED_METADATA,
+        FIRMWARE,
+        DEBUG_DATA,
+        SCHED_FIRMWARE,
+        MEM_TOPOLOGY,
+        CONNECTIVITY,
+        IP_LAYOUT,
+        DEBUG_IP_LAYOUT,
+        DESIGN_CHECK_POINT,
+        CLOCK_FREQ_TOPOLOGY,
+        MCS,
+        BMC,
+        BUILD_METADATA,
+        KEYVALUE_METADATA,
+        USER_METADATA,
+        DNA_CERTIFICATE,
+        PDI,
+        BITSTREAM_PARTIAL_PDI
+    };
+
+    enum MEM_TYPE {
+        MEM_DDR3,
+        MEM_DDR4,
+        MEM_DRAM,
+        MEM_STREAMING,
+        MEM_PREALLOCATED_GLOB,
+        MEM_ARE, //Aurora
+        MEM_HBM,
+        MEM_BRAM,
+        MEM_URAM,
+        MEM_STREAMING_CONNECTION
+    };
+
+    enum IP_TYPE {
+        IP_MB = 0,
+        IP_KERNEL, //kernel instance
+        IP_DNASC,
+        IP_DDR4_CONTROLLER
+    };
+
+    struct axlf_section_header {
+        uint32_t m_sectionKind;             /* Section type */
+        char m_sectionName[16];             /* Examples: "stage2", "clear1", "clear2", "ocl1", "ocl2", "ublaze", "sched" */
+        uint64_t m_sectionOffset;           /* File offset of section data */
+        uint64_t m_sectionSize;             /* Size of section data */
+    };
+
+    struct axlf_header {
+        uint64_t m_length;                  /* Total size of the xclbin file */
+        uint64_t m_timeStamp;               /* Number of seconds since epoch when xclbin was created */
+        uint64_t m_featureRomTimeStamp;     /* TimeSinceEpoch of the featureRom */
+        uint16_t m_versionPatch;            /* Patch Version */
+        uint8_t m_versionMajor;             /* Major Version - Version: 2.1.0*/
+        uint8_t m_versionMinor;             /* Minor Version */
+        uint32_t m_mode;                    /* XCLBIN_MODE */
+	union {
+	    struct {
+		uint64_t m_platformId;      /* 64 bit platform ID: vendor-device-subvendor-subdev */
+		uint64_t m_featureId;       /* 64 bit feature id */
+	    } rom;
+	    unsigned char rom_uuid[16];     /* feature ROM UUID for which this xclbin was generated */
+	};
+        unsigned char m_platformVBNV[64];   /* e.g. xilinx:xil-accel-rd-ku115:4ddr-xpr:3.4: null terminated */
+	union {
+	    char m_next_axlf[16];           /* Name of next xclbin file in the daisy chain */
+	    uuid_t uuid;                    /* uuid of this xclbin*/
+	};
+        char m_debug_bin[16];               /* Name of binary with debug information */
+        uint32_t m_numSections;             /* Number of section headers */
+    };
+
+    struct axlf {
+        char m_magic[8];                            /* Should be "xclbin2\0"  */
+        unsigned char m_cipher[32];                 /* Hmac output digest */
+        unsigned char m_keyBlock[256];              /* Signature for validation of binary */
+        uint64_t m_uniqueId;                        /* axlf's uniqueId, use it to skip redownload etc */
+        struct axlf_header m_header;                /* Inline header */
+        struct axlf_section_header m_sections[1];   /* One or more section headers follow */
+    };
+
+    typedef struct axlf xclBin;
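
A kernel-side section lookup analogous to the C++ helper near the end of this header might look like the sketch below (illustrative only; range checking of m_sectionOffset/m_sectionSize against m_length is omitted):

/* Illustrative sketch only: find a section header by kind. */
static const struct axlf_section_header *
example_get_section(const struct axlf *top, enum axlf_section_kind kind)
{
	uint32_t i;

	for (i = 0; i < top->m_header.m_numSections; i++)
		if (top->m_sections[i].m_sectionKind == kind)
			return &top->m_sections[i];
	return NULL;
}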
+
+    /**** BEGIN : Xilinx internal section *****/
+
+    /* bitstream information */
+    struct xlnx_bitstream {
+        uint8_t m_freq[8];
+        char bits[1];
+    };
+
+    /****   MEMORY TOPOLOGY SECTION ****/
+    struct mem_data {
+	uint8_t m_type; //enum corresponding to mem_type.
+	uint8_t m_used; //if 0 this bank is not present
+	union {
+	    uint64_t m_size; //if mem_type DDR, then size in KB;
+	    uint64_t route_id; //if streaming then "route_id"
+	};
+	union {
+	    uint64_t m_base_address;//if DDR then the base address;
+	    uint64_t flow_id; //if streaming then "flow id"
+	};
+	unsigned char m_tag[16]; //DDR: BANK0,1,2,3, has to be null terminated; if streaming then stream0, 1 etc
+    };
+
+    struct mem_topology {
+        int32_t m_count; //Number of mem_data
+        struct mem_data m_mem_data[1]; //Should be sorted on mem_type
+    };
+
+    /****   CONNECTIVITY SECTION ****/
+    /* Connectivity of each argument of a kernel, expressed in terms of the
+     * associated argument index. To associate kernel instances with arguments
+     * and banks, start at the connectivity section. Using m_ip_layout_index,
+     * access ip_data.m_name. The kernel instance can then be associated with
+     * its original kernel name and its connectivity, which makes it possible
+     * to form related groups of kernel instances (see the sketch after the
+     * ip_layout section below).
+     */
+
+    struct connection {
+        int32_t arg_index; //From 0 to n, may not be contiguous as scalars are skipped
+        int32_t m_ip_layout_index; //index into the ip_layout section. ip_layout.m_ip_data[index].m_type == IP_KERNEL
+        int32_t mem_data_index; //index into m_mem_data. Flag an error if m_used is false.
+    };
+
+    struct connectivity {
+        int32_t m_count;
+        struct connection m_connection[1];
+    };
+
+    /****   IP_LAYOUT SECTION ****/
+    /* IPs on AXI lite - their types, names, and base addresses.*/
+    struct ip_data {
+        uint32_t m_type; //map to IP_TYPE enum
+        uint32_t properties; //32 bits to indicate ip specific property. eg if m_type == IP_KERNEL then bit 0 is for interrupt.
+        uint64_t m_base_address;
+        uint8_t m_name[64]; //eg Kernel name corresponding to KERNEL instance, can embed CU name in future.
+    };
+
+    struct ip_layout {
+        int32_t m_count;
+        struct ip_data m_ip_data[1]; //All the ip_data needs to be sorted by m_base_address.
+    };
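
Tying the three sections together, a sketch of the association described in the connectivity comment above (illustrative only; all three pointers are assumed to have been located and validated already):

/* Illustrative sketch only: count connections that target a used bank,
 * resolving each connection through ip_layout and mem_topology.
 */
static int example_count_used_connections(const struct connectivity *conn,
					  const struct ip_layout *ips,
					  const struct mem_topology *topo)
{
	int32_t i, used = 0;

	for (i = 0; i < conn->m_count; i++) {
		const struct connection *c = &conn->m_connection[i];
		const struct ip_data *ip = &ips->m_ip_data[c->m_ip_layout_index];
		const struct mem_data *mem = &topo->m_mem_data[c->mem_data_index];

		if (ip->m_type == IP_KERNEL && mem->m_used)
			used++;
	}
	return used;
}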
+
+    /*** Debug IP section layout ****/
+    enum DEBUG_IP_TYPE {
+        UNDEFINED = 0,
+        LAPC,
+        ILA,
+        AXI_MM_MONITOR,
+        AXI_TRACE_FUNNEL,
+        AXI_MONITOR_FIFO_LITE,
+        AXI_MONITOR_FIFO_FULL,
+        ACCEL_MONITOR,
+        AXI_STREAM_MONITOR
+    };
+
+    struct debug_ip_data {
+        uint8_t m_type; // type of enum DEBUG_IP_TYPE
+        uint8_t m_index;
+        uint8_t m_properties;
+        uint8_t m_major;
+        uint8_t m_minor;
+        uint8_t m_reserved[3];
+        uint64_t m_base_address;
+        char    m_name[128];
+    };
+
+    struct debug_ip_layout {
+        uint16_t m_count;
+        struct debug_ip_data m_debug_ip_data[1];
+    };
+
+    enum CLOCK_TYPE {                      /* Supported clock frequency types */
+        CT_UNUSED = 0,                     /* Initialized value */
+        CT_DATA   = 1,                     /* Data clock */
+        CT_KERNEL = 2,                     /* Kernel clock */
+        CT_SYSTEM = 3                      /* System Clock */
+    };
+
+    struct clock_freq {                    /* Clock Frequency Entry */
+        u_int16_t m_freq_Mhz;              /* Frequency in MHz */
+        u_int8_t m_type;                   /* Clock type (enum CLOCK_TYPE) */
+        u_int8_t m_unused[5];              /* Not used - padding */
+        char m_name[128];                  /* Clock Name */
+    };
+
+    struct clock_freq_topology {           /* Clock frequency section */
+        int16_t m_count;                   /* Number of entries */
+        struct clock_freq m_clock_freq[1]; /* Clock array */
+    };
+
+    enum MCS_TYPE {                        /* Supported MCS file types */
+        MCS_UNKNOWN = 0,                   /* Initialized value */
+        MCS_PRIMARY = 1,                   /* The primary mcs file data */
+        MCS_SECONDARY = 2,                 /* The secondary mcs file data */
+    };
+
+    struct mcs_chunk {                     /* One chunk of MCS data */
+        uint8_t m_type;                    /* MCS data type */
+        uint8_t m_unused[7];               /* padding */
+        uint64_t m_offset;                 /* data offset from the start of the section */
+        uint64_t m_size;                   /* data size */
+    };
+
+    struct mcs {                           /* MCS data section */
+        int8_t m_count;                    /* Number of chunks */
+        int8_t m_unused[7];                /* padding */
+        struct mcs_chunk m_chunk[1];       /* MCS chunks followed by data */
+    };
+
+    struct bmc {                           /* bmc data section  */
+        uint64_t m_offset;                 /* data offset from the start of the section */
+        uint64_t m_size;                   /* data size (bytes)*/
+        char m_image_name[64];             /* Name of the image (e.g., MSP432P401R) */
+        char m_device_name[64];            /* Device ID         (e.g., VCU1525)  */
+        char m_version[64];
+        char m_md5value[33];               /* MD5 Expected Value(e.g., 56027182079c0bd621761b7dab5a27ca)*/
+        char m_padding[7];                 /* Padding */
+    };
+
+    enum CHECKSUM_TYPE
+    {
+        CST_UNKNOWN = 0,
+        CST_SDBM = 1,
+        CST_LAST
+    };
+
+    /**** END : Xilinx internal section *****/
+
+# ifdef __cplusplus
+    namespace xclbin {
+      inline const axlf_section_header*
+      get_axlf_section(const axlf* top, axlf_section_kind kind)
+      {
+        auto begin = top->m_sections;
+        auto end = begin + top->m_header.m_numSections;
+        auto itr = std::find_if(begin,end,[kind](const axlf_section_header& sec) { return sec.m_sectionKind==kind; });
+        return (itr!=end) ? &(*itr) : nullptr;
+      }
+    }
+# endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/drivers/gpu/drm/xocl/xclfeatures.h b/drivers/gpu/drm/xocl/xclfeatures.h
new file mode 100644
index 000000000000..3ef2616e061f
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xclfeatures.h
@@ -0,0 +1,107 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/*
+ *  Xilinx SDAccel FPGA BIOS definition
+ *  Copyright (C) 2016-2019, Xilinx Inc
+ */
+
+#ifndef xclfeatures_h_
+#define xclfeatures_h_
+
+#define FEATURE_ROM_MAJOR_VERSION 10
+#define FEATURE_ROM_MINOR_VERSION 1
+
+//Layout: At address 0xB0000 we have the FeatureRomHeader, which comprises:
+//
+//1. The FeatureRomHeader itself: 152 bytes of information, followed by
+//2. The PRRegion struct(s), which are part of the FeatureRomHeader.
+//	The number of such structs is the same as OCLRegionCount.
+//3. After this the freq scaling table is laid out.
+//
+
+//#include <stdint.h>
+
+struct PartialRegion {
+	uint16_t clk[4];
+	uint8_t XPR; //0: non-xpr, 1: xpr
+};
+
+// Each entry represents one row in freq scaling table.
+struct FreqScalingTableRow {
+	short config0;
+	short freq;
+	short config2;
+};
+
+enum PROMType  {
+	BPI    = 0,
+	SPI   = 1
+   //room for 6 more types of flash devices.
+};
+
+enum DebugType	{
+	DT_NIFD	 = 0x01,
+	DT_FIREWALL  = 0x02
+  //There is room for future expansion up to 8 IPs
+};
+
+// This bit mask is used with the FeatureBitMap to calculate 64 bool features
+//
+// To test if a feature is provided:
+//   FeatureRomHeader header;
+//   if (FeatureBitMask::FBM_IS_UNIFIED & header.FeatureBitMap)
+//     // it is supported
+//   else
+//     // it is not supported
+//
+// To set if a feature is provided:
+//   header.FeatureBitMap = 0;
+//   header.FeatureBitMap |= FeatureBitMask::FBM_IS_UNIFIED;
+//
+enum FeatureBitMask {
+	UNIFIED_PLATFORM	 =   0x0000000000000001	      /* bit 1 : Unified platform */
+	, XARE_ENBLD		 =   0x0000000000000002	      /* bit 2 : Aurora link enabled DSA */
+	, BOARD_MGMT_ENBLD	 =   0x0000000000000004	      /* bit 3 : Has MB based power monitoring */
+	, MB_SCHEDULER		 =   0x0000000000000008	      /* bit 4:	 Has MB based scheduler */
+	, PROM_MASK		 =   0x0000000000000070	      /* bits 5,6 &7  : 3 bits for PROMType */
+	/**	------ Bit 8 unused **/
+	, DEBUG_MASK		 =   0x000000000000FF00	      /* bits 9 through 16  : 8 bits for DebugType */
+	, PEER_TO_PEER		 =   0x0000000000010000	      /* bit 17 : Bar 2 is a peer to peer bar */
+	, UUID			 =   0x0000000000020000	      /* bit 18 : UUID enabled. uuid[16] field is valid */
+	, HBM			 =   0x0000000000040000	      /* bit 19 : Device has HBMs */
+	, CDMA			 =   0x0000000000080000	      /* bit 20 : Device has CDMA */
+	, QDMA			 =   0x0000000000100000	      /* bit 21 : Device has QDMA */
+
+	//....more
+};
+
+// In the following data structures, the EntryPointString, MajorVersion, and MinorVersion
+// values are all used in the Runtime to identify if the ROM is producing valid data, and
+// to pick the schema to read the rest of the data; Ergo, these values shall not change.
+
+/*
+ * Struct used for >  2017.2_sdx
+ * This struct should be used for version (==) 10.0 (Major: 10, Minor: 0)
+ */
+struct FeatureRomHeader {
+	unsigned char EntryPointString[4];  // This is "xlnx"
+	uint8_t MajorVersion;		    // Feature ROM's major version eg 1
+	uint8_t MinorVersion;		    // minor version eg 2.
+	// -- DO NOT CHANGE THE TYPES ABOVE THIS LINE --
+	uint32_t VivadoBuildID;		    // Vivado Software Build (e.g., 1761098 ). From ./vivado --version
+	uint32_t IPBuildID;		    // IP Build (e.g., 1759159 from above)
+	uint64_t TimeSinceEpoch;	    // linux time(NULL) call, at write_dsa_rom invocation
+	unsigned char FPGAPartName[64];	    // The hardware FPGA part. Null terminated
+	unsigned char VBNVName[64];	    // eg : xilinx:xil-accel-rd-ku115:4ddr-xpr:3.4: null terminated
+	uint8_t DDRChannelCount;	    // 4 for TUL
+	uint8_t DDRChannelSize;		    // 4 (in GB)
+	uint64_t DRBaseAddress;	            // The Dynamic Range's (AppPF/CL/Userspace) Base Address
+	uint64_t FeatureBitMap;		    // Feature Bit Map, 64 different bool features, maps to enum FeatureBitMask
+	unsigned char uuid[16];		    // UUID of the DSA.
+	uint8_t HBMCount;		    // Number of HBMs
+	uint8_t HBMSize;		    // Size of (each) HBM in GB
+	uint32_t CDMABaseAddress[4];	    // CDMA base addresses
+};
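
A sketch of the kind of sanity check a ROM consumer might perform, combining the entry-point string, the version field and FeatureBitMap (illustrative only; the accepted-version policy is an assumption):

/* Illustrative sketch only: basic validation of a FeatureRomHeader. */
static bool example_rom_is_usable(const struct FeatureRomHeader *hdr)
{
	/* EntryPointString is not null terminated; compare exactly 4 bytes */
	if (memcmp(hdr->EntryPointString, "xlnx", 4) != 0)
		return false;
	/* Only accept the schema this sketch understands */
	if (hdr->MajorVersion != FEATURE_ROM_MAJOR_VERSION)
		return false;
	/* Feature test as shown in the FeatureBitMask comment above */
	return (hdr->FeatureBitMap & UNIFIED_PLATFORM) != 0;
}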
+
+#endif // xclfeatures_h_
diff --git a/drivers/gpu/drm/xocl/xocl_ctx.c b/drivers/gpu/drm/xocl/xocl_ctx.c
new file mode 100644
index 000000000000..4a6c6045c827
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xocl_ctx.c
@@ -0,0 +1,196 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2018-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Lizhi.Hou@xilinx.com
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include "xocl_drv.h"
+
+/*
+ * helper functions to protect driver private data
+ */
+static DEFINE_MUTEX(xocl_drvinst_lock);
+static struct xocl_drvinst *xocl_drvinst_array[XOCL_MAX_DEVICES * 10];
+
+void *xocl_drvinst_alloc(struct device *dev, u32 size)
+{
+	struct xocl_drvinst	*drvinstp;
+	int		inst;
+
+	mutex_lock(&xocl_drvinst_lock);
+	for (inst = 0; inst < ARRAY_SIZE(xocl_drvinst_array); inst++)
+		if (!xocl_drvinst_array[inst])
+			break;
+
+	if (inst == ARRAY_SIZE(xocl_drvinst_array))
+		goto failed;
+
+	drvinstp = kzalloc(size + sizeof(struct xocl_drvinst), GFP_KERNEL);
+	if (!drvinstp)
+		goto failed;
+
+	drvinstp->dev = dev;
+	drvinstp->size = size;
+	init_completion(&drvinstp->comp);
+	atomic_set(&drvinstp->ref, 1);
+	INIT_LIST_HEAD(&drvinstp->open_procs);
+
+	xocl_drvinst_array[inst] = drvinstp;
+
+	mutex_unlock(&xocl_drvinst_lock);
+
+	return drvinstp->data;
+
+failed:
+	mutex_unlock(&xocl_drvinst_lock);
+
+	kfree(drvinstp);
+	return NULL;
+}
+
+void xocl_drvinst_free(void *data)
+{
+	struct xocl_drvinst	*drvinstp;
+	struct xocl_drvinst_proc *proc, *temp;
+	struct pid		*p;
+	int		inst;
+	int		ret;
+
+	mutex_lock(&xocl_drvinst_lock);
+	drvinstp = container_of(data, struct xocl_drvinst, data);
+	for (inst = 0; inst < ARRAY_SIZE(xocl_drvinst_array); inst++) {
+		if (drvinstp == xocl_drvinst_array[inst])
+			break;
+	}
+
+	/* it must be created before */
+	BUG_ON(inst == ARRAY_SIZE(xocl_drvinst_array));
+
+	xocl_drvinst_array[inst] = NULL;
+	mutex_unlock(&xocl_drvinst_lock);
+
+	/* wait all opened instances to close */
+	if (!atomic_dec_and_test(&drvinstp->ref)) {
+		xocl_info(drvinstp->dev, "Wait for close %p\n",
+				&drvinstp->comp);
+		ret = wait_for_completion_killable(&drvinstp->comp);
+		if (ret == -ERESTARTSYS) {
+			list_for_each_entry_safe(proc, temp,
+				&drvinstp->open_procs, link) {
+				p = find_get_pid(proc->pid);
+				if (!p)
+					continue;
+				ret = kill_pid(p, SIGBUS, 1);
+				if (ret)
+					xocl_err(drvinstp->dev,
+						"kill %d failed",
+						proc->pid);
+				put_pid(p);
+			}
+			wait_for_completion(&drvinstp->comp);
+		}
+	}
+
+	kfree(drvinstp);
+}
+
+void xocl_drvinst_set_filedev(void *data, void *file_dev)
+{
+	struct xocl_drvinst	*drvinstp;
+	int		inst;
+
+	mutex_lock(&xocl_drvinst_lock);
+	drvinstp = container_of(data, struct xocl_drvinst, data);
+	for (inst = 0; inst < ARRAY_SIZE(xocl_drvinst_array); inst++) {
+		if (drvinstp == xocl_drvinst_array[inst])
+			break;
+	}
+
+	BUG_ON(inst == ARRAY_SIZE(xocl_drvinst_array));
+
+	drvinstp->file_dev = file_dev;
+	mutex_unlock(&xocl_drvinst_lock);
+}
+
+void *xocl_drvinst_open(void *file_dev)
+{
+	struct xocl_drvinst	*drvinstp;
+	struct xocl_drvinst_proc	*proc;
+	int		inst;
+	u32		pid;
+
+	mutex_lock(&xocl_drvinst_lock);
+	for (inst = 0; inst < ARRAY_SIZE(xocl_drvinst_array); inst++) {
+		drvinstp = xocl_drvinst_array[inst];
+		if (drvinstp && file_dev == drvinstp->file_dev)
+			break;
+	}
+
+	if (inst == ARRAY_SIZE(xocl_drvinst_array)) {
+		mutex_unlock(&xocl_drvinst_lock);
+		return NULL;
+	}
+
+	pid = pid_nr(task_tgid(current));
+	list_for_each_entry(proc, &drvinstp->open_procs, link) {
+		if (proc->pid == pid)
+			break;
+	}
+	if (&proc->link == &drvinstp->open_procs) {
+		proc = kzalloc(sizeof(*proc), GFP_KERNEL);
+		if (!proc) {
+			mutex_unlock(&xocl_drvinst_lock);
+			return NULL;
+		}
+		proc->pid = pid;
+		proc->count = 1;	/* first open by this process */
+		list_add(&proc->link, &drvinstp->open_procs);
+	} else {
+		proc->count++;
+	}
+	xocl_info(drvinstp->dev, "OPEN %d\n", atomic_read(&drvinstp->ref));
+
+	if (atomic_inc_return(&drvinstp->ref) == 2)
+		reinit_completion(&drvinstp->comp);
+
+	mutex_unlock(&xocl_drvinst_lock);
+
+	return drvinstp->data;
+}
+
+void xocl_drvinst_close(void *data)
+{
+	struct xocl_drvinst	*drvinstp;
+	struct xocl_drvinst_proc *proc;
+	u32	pid;
+
+	mutex_lock(&xocl_drvinst_lock);
+	drvinstp = container_of(data, struct xocl_drvinst, data);
+
+	xocl_info(drvinstp->dev, "CLOSE %d\n", atomic_read(&drvinstp->ref));
+
+	pid = pid_nr(task_tgid(current));
+	list_for_each_entry(proc, &drvinstp->open_procs, link) {
+		if (proc->pid == pid)
+			break;
+	}
+
+	if (&proc->link != &drvinstp->open_procs) {
+		proc->count--;
+		if (!proc->count) {
+			list_del(&proc->link);
+			kfree(proc);
+		}
+	}
+
+	if (atomic_dec_return(&drvinstp->ref) == 1) {
+		xocl_info(drvinstp->dev, "NOTIFY %p\n", &drvinstp->comp);
+		complete(&drvinstp->comp);
+	}
+
+	mutex_unlock(&xocl_drvinst_lock);
+}
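
The intended call sequence for these helpers, as implied by the reference counting above, is roughly the following (illustrative only; the subdevice names and the 128-byte private area are placeholders):

/* Illustrative sketch only: lifecycle of a driver-instance allocation. */
static void *example_subdev_probe(struct device *dev, void *cdev)
{
	void *priv = xocl_drvinst_alloc(dev, 128);	/* private data area */

	if (priv)
		xocl_drvinst_set_filedev(priv, cdev);	/* key used by open() */
	return priv;
}
/*
 * open():    priv = xocl_drvinst_open(cdev);  takes a per-process reference
 * release(): xocl_drvinst_close(priv);        drops that reference
 * remove():  xocl_drvinst_free(priv);         blocks until all opens closed
 */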
diff --git a/drivers/gpu/drm/xocl/xocl_drm.h b/drivers/gpu/drm/xocl/xocl_drm.h
new file mode 100644
index 000000000000..de362ee062f6
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xocl_drm.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ */
+
+#ifndef _XOCL_DRM_H
+#define	_XOCL_DRM_H
+
+#include <linux/hashtable.h>
+
+/**
+ * struct drm_xocl_exec_metadata - Meta data for exec bo
+ *
+ * @state: State of exec buffer object
+ * @active: Reverse mapping to kds command object managed exclusively by kds
+ */
+struct drm_xocl_exec_metadata {
+	enum drm_xocl_execbuf_state state;
+	struct xocl_cmd            *active;
+};
+
+struct xocl_drm {
+	xdev_handle_t		xdev;
+	/* memory management */
+	struct drm_device       *ddev;
+	/* Memory manager array, one per DDR channel */
+	struct drm_mm           **mm;
+	struct mutex            mm_lock;
+	struct drm_xocl_mm_stat **mm_usage_stat;
+	u64                     *mm_p2p_off;
+	DECLARE_HASHTABLE(mm_range, 6);
+};
+
+struct drm_xocl_bo {
+	/* drm base object */
+	struct drm_gem_object base;
+	struct drm_mm_node   *mm_node;
+	struct drm_xocl_exec_metadata metadata;
+	struct page         **pages;
+	struct sg_table      *sgt;
+	void                 *vmapping;
+	void                 *bar_vmapping;
+	struct dma_buf                  *dmabuf;
+	const struct vm_operations_struct *dmabuf_vm_ops;
+	unsigned int          dma_nsg;
+	unsigned int          flags;
+	unsigned int          type;
+};
+
+struct drm_xocl_unmgd {
+	struct page         **pages;
+	struct sg_table      *sgt;
+	unsigned int          npages;
+	unsigned int          flags;
+};
+
+struct drm_xocl_bo *xocl_drm_create_bo(struct xocl_drm *drm_p, uint64_t unaligned_size,
+				       unsigned int user_flags, unsigned int user_type);
+void xocl_drm_free_bo(struct drm_gem_object *obj);
+
+void xocl_mm_get_usage_stat(struct xocl_drm *drm_p, u32 ddr,
+			    struct drm_xocl_mm_stat *pstat);
+void xocl_mm_update_usage_stat(struct xocl_drm *drm_p, u32 ddr,
+			       u64 size, int count);
+int xocl_mm_insert_node(struct xocl_drm *drm_p, u32 ddr,
+			struct drm_mm_node *node, u64 size);
+void *xocl_drm_init(xdev_handle_t xdev);
+void xocl_drm_fini(struct xocl_drm *drm_p);
+uint32_t xocl_get_shared_ddr(struct xocl_drm *drm_p, struct mem_data *m_data);
+int xocl_init_mem(struct xocl_drm *drm_p);
+void xocl_cleanup_mem(struct xocl_drm *drm_p);
+int xocl_check_topology(struct xocl_drm *drm_p);
+
+int xocl_gem_fault(struct vm_fault *vmf);
+
+static inline struct drm_xocl_bo *to_xocl_bo(struct drm_gem_object *bo)
+{
+	return (struct drm_xocl_bo *)bo;
+}
+
+int xocl_init_unmgd(struct drm_xocl_unmgd *unmgd, uint64_t data_ptr,
+		    uint64_t size, u32 write);
+void xocl_finish_unmgd(struct drm_xocl_unmgd *unmgd);
+
+#endif
diff --git a/drivers/gpu/drm/xocl/xocl_drv.h b/drivers/gpu/drm/xocl/xocl_drv.h
new file mode 100644
index 000000000000..c67f3f03feae
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xocl_drv.h
@@ -0,0 +1,783 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Lizhi.Hou@Xilinx.com
+ */
+
+#ifndef	_XOCL_DRV_H_
+#define	_XOCL_DRV_H_
+
+#include <linux/version.h>
+#include <drm/drmP.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_mm.h>
+#include "xclbin.h"
+#include "devices.h"
+#include <drm/xocl_drm.h>
+#include <drm/xmgmt_drm.h>
+
+static inline void xocl_memcpy_fromio(void *buf, void *iomem, u32 size)
+{
+	int i;
+
+	BUG_ON(size & 0x3);
+
+	for (i = 0; i < size / 4; i++)
+		((u32 *)buf)[i] = ioread32((char *)(iomem) + sizeof(u32) * i);
+}
+
+static inline void xocl_memcpy_toio(void *iomem, void *buf, u32 size)
+{
+	int i;
+
+	BUG_ON(size & 0x3);
+
+	for (i = 0; i < size / 4; i++)
+		iowrite32(((u32 *)buf)[i], ((char *)(iomem) + sizeof(u32) * i));
+}
+
+#define	XOCL_MODULE_NAME	"xocl"
+#define	XCLMGMT_MODULE_NAME	"xclmgmt"
+#define	ICAP_XCLBIN_V2			"xclbin2"
+
+#define XOCL_MAX_DEVICES	16
+#define XOCL_EBUF_LEN           512
+#define xocl_sysfs_error(xdev, fmt, args...)	 \
+	snprintf(((struct xocl_dev_core *)xdev)->ebuf, XOCL_EBUF_LEN,	\
+		 fmt, ##args)
+#define MAX_M_COUNT      64
+
+#define	XDEV2DEV(xdev)		(&XDEV(xdev)->pdev->dev)
+
+#define xocl_err(dev, fmt, args...)			\
+	dev_err(dev, "%s: "fmt, __func__, ##args)
+#define xocl_info(dev, fmt, args...)			\
+	dev_info(dev, "%s: "fmt, __func__, ##args)
+#define xocl_dbg(dev, fmt, args...)			\
+	dev_dbg(dev, "%s: "fmt, __func__, ##args)
+
+#define xocl_xdev_info(xdev, fmt, args...)		\
+	xocl_info(XDEV2DEV(xdev), fmt, ##args)
+#define xocl_xdev_err(xdev, fmt, args...)		\
+	xocl_err(XDEV2DEV(xdev), fmt, ##args)
+#define xocl_xdev_dbg(xdev, fmt, args...)		\
+	xocl_dbg(XDEV2DEV(xdev), fmt, ##args)
+
+#define	XOCL_DRV_VER_NUM(ma, mi, p)		\
+	((ma) * 1000 + (mi) * 100 + (p))
+
+#define	XOCL_READ_REG32(addr)		\
+	ioread32(addr)
+#define	XOCL_WRITE_REG32(val, addr)	\
+	iowrite32(val, addr)
+
+/* xclbin helpers */
+#define sizeof_sect(sect, data) \
+({ \
+	size_t ret; \
+	size_t data_size; \
+	data_size = (sect) ? sect->m_count * sizeof(typeof(sect->data)) : 0; \
+	ret = (sect) ? offsetof(typeof(*sect), data) + data_size : 0; \
+	(ret); \
+})
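
For example, applied to a parsed mem_topology section the macro yields the number of bytes the section actually occupies (the sketch below is illustrative only):

/* Illustrative sketch only: offsetof(struct mem_topology, m_mem_data)
 * plus m_count entries; evaluates to 0 when topo is NULL.
 */
static inline size_t example_topology_bytes(struct mem_topology *topo)
{
	return sizeof_sect(topo, m_mem_data);
}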
+
+#define	XOCL_PL_TO_PCI_DEV(pldev)		\
+	to_pci_dev(pldev->dev.parent)
+
+#define XOCL_PL_DEV_TO_XDEV(pldev) \
+	pci_get_drvdata(XOCL_PL_TO_PCI_DEV(pldev))
+
+#define	XOCL_QDMA_USER_BAR	2
+#define	XOCL_DSA_VERSION(xdev)			\
+	(XDEV(xdev)->priv.dsa_ver)
+
+#define XOCL_DSA_IS_MPSOC(xdev)                \
+	(XDEV(xdev)->priv.mpsoc)
+
+#define	XOCL_DEV_ID(pdev)			\
+	((pci_domain_nr(pdev->bus) << 16) |	\
+	PCI_DEVID(pdev->bus->number, pdev->devfn))
+
+#define XOCL_ARE_HOP 0x400000000ull
+
+#define	XOCL_XILINX_VEN		0x10EE
+#define	XOCL_CHARDEV_REG_COUNT	16
+
+#define INVALID_SUBDEVICE ~0U
+
+#define XOCL_INVALID_MINOR -1
+
+extern struct class *xrt_class;
+
+struct drm_xocl_bo;
+struct client_ctx;
+
+struct xocl_subdev {
+	struct platform_device *pldev;
+	void                   *ops;
+};
+
+struct xocl_subdev_private {
+	int		id;
+	bool		is_multi;
+	char		priv_data[1];
+};
+
+#define	XOCL_GET_SUBDEV_PRIV(dev)				\
+	(((struct xocl_subdev_private *)dev_get_platdata(dev))->priv_data)
+
+typedef	void *xdev_handle_t;
+
+struct xocl_pci_funcs {
+	int (*intr_config)(xdev_handle_t xdev, u32 intr, bool enable);
+	int (*intr_register)(xdev_handle_t xdev, u32 intr,
+		irq_handler_t handler, void *arg);
+	int (*reset)(xdev_handle_t xdev);
+};
+
+#define	XDEV(dev)	((struct xocl_dev_core *)(dev))
+#define	XDEV_PCIOPS(xdev)	(XDEV(xdev)->pci_ops)
+
+#define	xocl_user_interrupt_config(xdev, intr, en)	\
+	XDEV_PCIOPS(xdev)->intr_config(xdev, intr, en)
+#define	xocl_user_interrupt_reg(xdev, intr, handler, arg)	\
+	XDEV_PCIOPS(xdev)->intr_register(xdev, intr, handler, arg)
+#define xocl_reset(xdev)			\
+	(XDEV_PCIOPS(xdev)->reset ? XDEV_PCIOPS(xdev)->reset(xdev) : \
+	-ENODEV)
+
+struct xocl_health_thread_arg {
+	int (*health_cb)(void *arg);
+	void		*arg;
+	u32		interval;    /* ms */
+	struct device	*dev;
+};
+
+struct xocl_drvinst_proc {
+	struct list_head	link;
+	u32			pid;
+	u32			count;
+};
+
+struct xocl_drvinst {
+	struct device		*dev;
+	u32			size;
+	atomic_t		ref;
+	struct completion	comp;
+	struct list_head	open_procs;
+	void			*file_dev;
+	char			data[1];
+};
+
+struct xocl_dev_core {
+	struct pci_dev         *pdev;
+	int			dev_minor;
+	struct xocl_subdev	subdevs[XOCL_SUBDEV_NUM];
+	u32			subdev_num;
+	struct xocl_pci_funcs  *pci_ops;
+
+	u32			bar_idx;
+	void			* __iomem bar_addr;
+	resource_size_t		bar_size;
+	resource_size_t		feature_rom_offset;
+
+	u32			intr_bar_idx;
+	void			* __iomem intr_bar_addr;
+	resource_size_t		intr_bar_size;
+
+	struct task_struct     *health_thread;
+	struct xocl_health_thread_arg thread_arg;
+
+	struct xocl_board_private priv;
+
+	char			ebuf[XOCL_EBUF_LEN + 1];
+
+	bool			offline;
+};
+
+enum data_kind {
+	MIG_CALIB,
+	DIMM0_TEMP,
+	DIMM1_TEMP,
+	DIMM2_TEMP,
+	DIMM3_TEMP,
+	FPGA_TEMP,
+	VCC_BRAM,
+	CLOCK_FREQ_0,
+	CLOCK_FREQ_1,
+	FREQ_COUNTER_0,
+	FREQ_COUNTER_1,
+	VOL_12V_PEX,
+	VOL_12V_AUX,
+	CUR_12V_PEX,
+	CUR_12V_AUX,
+	SE98_TEMP0,
+	SE98_TEMP1,
+	SE98_TEMP2,
+	FAN_TEMP,
+	FAN_RPM,
+	VOL_3V3_PEX,
+	VOL_3V3_AUX,
+	VPP_BTM,
+	VPP_TOP,
+	VOL_5V5_SYS,
+	VOL_1V2_TOP,
+	VOL_1V2_BTM,
+	VOL_1V8,
+	VCC_0V9A,
+	VOL_12V_SW,
+	VTT_MGTA,
+	VOL_VCC_INT,
+	CUR_VCC_INT,
+	IDCODE,
+	IPLAYOUT_AXLF,
+	MEMTOPO_AXLF,
+	CONNECTIVITY_AXLF,
+	DEBUG_IPLAYOUT_AXLF,
+	PEER_CONN,
+	XCLBIN_UUID,
+};
+
+#define	XOCL_DSA_PCI_RESET_OFF(xdev_hdl)			\
+	(((struct xocl_dev_core *)xdev_hdl)->priv.flags &	\
+	XOCL_DSAFLAG_PCI_RESET_OFF)
+#define	XOCL_DSA_MB_SCHE_OFF(xdev_hdl)			\
+	(((struct xocl_dev_core *)xdev_hdl)->priv.flags &	\
+	XOCL_DSAFLAG_MB_SCHE_OFF)
+#define	XOCL_DSA_AXILITE_FLUSH_REQUIRED(xdev_hdl)			\
+	(((struct xocl_dev_core *)xdev_hdl)->priv.flags &	\
+	XOCL_DSAFLAG_AXILITE_FLUSH)
+
+#define	XOCL_DSA_XPR_ON(xdev_hdl)		\
+	(((struct xocl_dev_core *)xdev_hdl)->priv.xpr)
+
+#define	SUBDEV(xdev, id)	\
+	(XDEV(xdev)->subdevs[id])
+
+/* rom callbacks */
+struct xocl_rom_funcs {
+	unsigned int (*dsa_version)(struct platform_device *pdev);
+	bool (*is_unified)(struct platform_device *pdev);
+	bool (*mb_mgmt_on)(struct platform_device *pdev);
+	bool (*mb_sched_on)(struct platform_device *pdev);
+	uint32_t* (*cdma_addr)(struct platform_device *pdev);
+	u16 (*get_ddr_channel_count)(struct platform_device *pdev);
+	u64 (*get_ddr_channel_size)(struct platform_device *pdev);
+	bool (*is_are)(struct platform_device *pdev);
+	bool (*is_aws)(struct platform_device *pdev);
+	bool (*verify_timestamp)(struct platform_device *pdev, u64 timestamp);
+	u64 (*get_timestamp)(struct platform_device *pdev);
+	void (*get_raw_header)(struct platform_device *pdev, void *header);
+};
+#define ROM_DEV(xdev)	\
+	SUBDEV(xdev, XOCL_SUBDEV_FEATURE_ROM).pldev
+#define	ROM_OPS(xdev)	\
+	((struct xocl_rom_funcs *)SUBDEV(xdev, XOCL_SUBDEV_FEATURE_ROM).ops)
+#define	xocl_dsa_version(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->dsa_version(ROM_DEV(xdev)) : 0)
+#define	xocl_is_unified(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->is_unified(ROM_DEV(xdev)) : true)
+#define	xocl_mb_mgmt_on(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->mb_mgmt_on(ROM_DEV(xdev)) : false)
+#define	xocl_mb_sched_on(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->mb_sched_on(ROM_DEV(xdev)) : false)
+#define	xocl_cdma_addr(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->cdma_addr(ROM_DEV(xdev)) : 0)
+#define	xocl_get_ddr_channel_count(xdev) \
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->get_ddr_channel_count(ROM_DEV(xdev)) :\
+	0)
+#define	xocl_get_ddr_channel_size(xdev) \
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->get_ddr_channel_size(ROM_DEV(xdev)) : 0)
+#define	xocl_is_are(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->is_are(ROM_DEV(xdev)) : false)
+#define	xocl_is_aws(xdev)		\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->is_aws(ROM_DEV(xdev)) : false)
+#define	xocl_verify_timestamp(xdev, ts)	\
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->verify_timestamp(ROM_DEV(xdev), ts) : \
+	false)
+#define	xocl_get_timestamp(xdev) \
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->get_timestamp(ROM_DEV(xdev)) : 0)
+#define	xocl_get_raw_header(xdev, header) \
+	(ROM_DEV(xdev) ? ROM_OPS(xdev)->get_raw_header(ROM_DEV(xdev), header) :\
+	NULL)
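
The ops-table pattern above lets core code call into a subdevice without checking for its presence at every call site; a small sketch (illustrative only, units as reported by the feature ROM):

/* Illustrative sketch only: both helpers fall back to 0 when the feature
 * ROM subdevice is absent, so the product is simply 0 in that case.
 */
static inline u64 example_ddr_capacity(xdev_handle_t xdev)
{
	return (u64)xocl_get_ddr_channel_count(xdev) *
	       xocl_get_ddr_channel_size(xdev);
}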
+
+/* dma callbacks */
+struct xocl_dma_funcs {
+	ssize_t (*migrate_bo)(struct platform_device *pdev,
+			      struct sg_table *sgt, u32 dir, u64 paddr, u32 channel, u64 sz);
+	int (*ac_chan)(struct platform_device *pdev, u32 dir);
+	void (*rel_chan)(struct platform_device *pdev, u32 dir, u32 channel);
+	u32 (*get_chan_count)(struct platform_device *pdev);
+	u64 (*get_chan_stat)(struct platform_device *pdev, u32 channel,
+			     u32 write);
+	u64 (*get_str_stat)(struct platform_device *pdev, u32 q_idx);
+	int (*user_intr_config)(struct platform_device *pdev, u32 intr, bool en);
+	int (*user_intr_register)(struct platform_device *pdev, u32 intr,
+				  irq_handler_t handler, void *arg, int event_fd);
+	int (*user_intr_unreg)(struct platform_device *pdev, u32 intr);
+	void *(*get_drm_handle)(struct platform_device *pdev);
+};
+
+#define DMA_DEV(xdev)	\
+	SUBDEV(xdev, XOCL_SUBDEV_DMA).pldev
+#define	DMA_OPS(xdev)	\
+	((struct xocl_dma_funcs *)SUBDEV(xdev, XOCL_SUBDEV_DMA).ops)
+#define	xocl_migrate_bo(xdev, sgt, write, paddr, chan, len)	\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->migrate_bo(DMA_DEV(xdev), \
+	sgt, write, paddr, chan, len) : 0)
+#define	xocl_acquire_channel(xdev, dir)		\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->ac_chan(DMA_DEV(xdev), dir) : \
+	-ENODEV)
+#define	xocl_release_channel(xdev, dir, chan)	\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->rel_chan(DMA_DEV(xdev), dir, \
+	chan) : NULL)
+#define	xocl_get_chan_count(xdev)		\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->get_chan_count(DMA_DEV(xdev)) \
+	: 0)
+#define	xocl_get_chan_stat(xdev, chan, write)		\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->get_chan_stat(DMA_DEV(xdev), \
+	chan, write) : 0)
+#define xocl_dma_intr_config(xdev, irq, en)			\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->user_intr_config(DMA_DEV(xdev), \
+	irq, en) : -ENODEV)
+#define xocl_dma_intr_register(xdev, irq, handler, arg, event_fd)	\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->user_intr_register(DMA_DEV(xdev), \
+	irq, handler, arg, event_fd) : -ENODEV)
+#define xocl_dma_intr_unreg(xdev, irq)				\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->user_intr_unreg(DMA_DEV(xdev),	\
+	irq) : -ENODEV)
+#define	xocl_dma_get_drm_handle(xdev)				\
+	(DMA_DEV(xdev) ? DMA_OPS(xdev)->get_drm_handle(DMA_DEV(xdev)) : \
+	NULL)
+
+/* mb_scheduler callbacks */
+struct xocl_mb_scheduler_funcs {
+	int (*create_client)(struct platform_device *pdev, void **priv);
+	void (*destroy_client)(struct platform_device *pdev, void **priv);
+	uint (*poll_client)(struct platform_device *pdev, struct file *filp,
+		poll_table *wait, void *priv);
+	int (*client_ioctl)(struct platform_device *pdev, int op,
+		void *data, void *drm_filp);
+	int (*stop)(struct platform_device *pdev);
+	int (*reset)(struct platform_device *pdev);
+};
+#define	MB_SCHEDULER_DEV(xdev)	\
+	SUBDEV(xdev, XOCL_SUBDEV_MB_SCHEDULER).pldev
+#define	MB_SCHEDULER_OPS(xdev)	\
+	((struct xocl_mb_scheduler_funcs *)SUBDEV(xdev,	\
+		XOCL_SUBDEV_MB_SCHEDULER).ops)
+#define	xocl_exec_create_client(xdev, priv)		\
+	(MB_SCHEDULER_DEV(xdev) ?			\
+	MB_SCHEDULER_OPS(xdev)->create_client(MB_SCHEDULER_DEV(xdev), priv) : \
+	-ENODEV)
+#define	xocl_exec_destroy_client(xdev, priv)		\
+	(MB_SCHEDULER_DEV(xdev) ?			\
+	MB_SCHEDULER_OPS(xdev)->destroy_client(MB_SCHEDULER_DEV(xdev), priv) : \
+	NULL)
+#define	xocl_exec_poll_client(xdev, filp, wait, priv)		\
+	(MB_SCHEDULER_DEV(xdev) ?					\
+	MB_SCHEDULER_OPS(xdev)->poll_client(MB_SCHEDULER_DEV(xdev), filp, \
+	wait, priv) : 0)
+#define	xocl_exec_client_ioctl(xdev, op, data, drm_filp)		\
+	(MB_SCHEDULER_DEV(xdev) ?				\
+	MB_SCHEDULER_OPS(xdev)->client_ioctl(MB_SCHEDULER_DEV(xdev),	\
+	op, data, drm_filp) : -ENODEV)
+#define	xocl_exec_stop(xdev)		\
+	(MB_SCHEDULER_DEV(xdev) ?					\
+	 MB_SCHEDULER_OPS(xdev)->stop(MB_SCHEDULER_DEV(xdev)) : \
+	 -ENODEV)
+#define	xocl_exec_reset(xdev)		\
+	(MB_SCHEDULER_DEV(xdev) ?					\
+	 MB_SCHEDULER_OPS(xdev)->reset(MB_SCHEDULER_DEV(xdev)) : \
+	 -ENODEV)
+
+#define XOCL_MEM_TOPOLOGY(xdev)						\
+	((struct mem_topology *)					\
+	 xocl_icap_get_data(xdev, MEMTOPO_AXLF))
+#define XOCL_IP_LAYOUT(xdev)						\
+	((struct ip_layout *)						\
+	 xocl_icap_get_data(xdev, IPLAYOUT_AXLF))
+
+#define	XOCL_IS_DDR_USED(xdev, ddr)					\
+	(XOCL_MEM_TOPOLOGY(xdev)->m_mem_data[ddr].m_used == 1)
+#define	XOCL_DDR_COUNT_UNIFIED(xdev)		\
+	(XOCL_MEM_TOPOLOGY(xdev) ? XOCL_MEM_TOPOLOGY(xdev)->m_count : 0)
+#define	XOCL_DDR_COUNT(xdev)			\
+	((xocl_is_unified(xdev) ? XOCL_DDR_COUNT_UNIFIED(xdev) :	\
+	xocl_get_ddr_channel_count(xdev)))
+
+/* sysmon callbacks */
+enum {
+	XOCL_SYSMON_PROP_TEMP,
+	XOCL_SYSMON_PROP_TEMP_MAX,
+	XOCL_SYSMON_PROP_TEMP_MIN,
+	XOCL_SYSMON_PROP_VCC_INT,
+	XOCL_SYSMON_PROP_VCC_INT_MAX,
+	XOCL_SYSMON_PROP_VCC_INT_MIN,
+	XOCL_SYSMON_PROP_VCC_AUX,
+	XOCL_SYSMON_PROP_VCC_AUX_MAX,
+	XOCL_SYSMON_PROP_VCC_AUX_MIN,
+	XOCL_SYSMON_PROP_VCC_BRAM,
+	XOCL_SYSMON_PROP_VCC_BRAM_MAX,
+	XOCL_SYSMON_PROP_VCC_BRAM_MIN,
+};
+struct xocl_sysmon_funcs {
+	int (*get_prop)(struct platform_device *pdev, u32 prop, void *val);
+};
+#define	SYSMON_DEV(xdev)	\
+	SUBDEV(xdev, XOCL_SUBDEV_SYSMON).pldev
+#define	SYSMON_OPS(xdev)	\
+	((struct xocl_sysmon_funcs *)SUBDEV(xdev,			\
+		XOCL_SUBDEV_SYSMON).ops)
+#define	xocl_sysmon_get_prop(xdev, prop, val)		\
+	(SYSMON_DEV(xdev) ? SYSMON_OPS(xdev)->get_prop(SYSMON_DEV(xdev), \
+	prop, val) : -ENODEV)
+
+/* firewall callbacks */
+enum {
+	XOCL_AF_PROP_TOTAL_LEVEL,
+	XOCL_AF_PROP_STATUS,
+	XOCL_AF_PROP_LEVEL,
+	XOCL_AF_PROP_DETECTED_STATUS,
+	XOCL_AF_PROP_DETECTED_LEVEL,
+	XOCL_AF_PROP_DETECTED_TIME,
+};
+struct xocl_firewall_funcs {
+	int (*get_prop)(struct platform_device *pdev, u32 prop, void *val);
+	int (*clear_firewall)(struct platform_device *pdev);
+	u32 (*check_firewall)(struct platform_device *pdev, int *level);
+};
+#define AF_DEV(xdev)	\
+	SUBDEV(xdev, XOCL_SUBDEV_AF).pldev
+#define	AF_OPS(xdev)	\
+	((struct xocl_firewall_funcs *)SUBDEV(xdev,	\
+	XOCL_SUBDEV_AF).ops)
+#define	xocl_af_get_prop(xdev, prop, val)		\
+	(AF_DEV(xdev) ? AF_OPS(xdev)->get_prop(AF_DEV(xdev), prop, val) : \
+	-ENODEV)
+#define	xocl_af_check(xdev, level)			\
+	(AF_DEV(xdev) ? AF_OPS(xdev)->check_firewall(AF_DEV(xdev), level) : 0)
+#define	xocl_af_clear(xdev)				\
+	(AF_DEV(xdev) ? AF_OPS(xdev)->clear_firewall(AF_DEV(xdev)) : -ENODEV)
+
+/* microblaze callbacks */
+struct xocl_mb_funcs {
+	void (*reset)(struct platform_device *pdev);
+	int (*stop)(struct platform_device *pdev);
+	int (*load_mgmt_image)(struct platform_device *pdev, const char *buf,
+		u32 len);
+	int (*load_sche_image)(struct platform_device *pdev, const char *buf,
+		u32 len);
+	int (*get_data)(struct platform_device *pdev, enum data_kind kind);
+};
+
+struct xocl_dna_funcs {
+	u32 (*status)(struct platform_device *pdev);
+	u32 (*capability)(struct platform_device *pdev);
+	void (*write_cert)(struct platform_device *pdev, const uint32_t *buf, u32 len);
+};
+
+#define	XMC_DEV(xdev)		\
+	SUBDEV(xdev, XOCL_SUBDEV_XMC).pldev
+#define	XMC_OPS(xdev)		\
+	((struct xocl_mb_funcs *)SUBDEV(xdev,	\
+	XOCL_SUBDEV_XMC).ops)
+
+#define	DNA_DEV(xdev)		\
+	SUBDEV(xdev, XOCL_SUBDEV_DNA).pldev
+#define	DNA_OPS(xdev)		\
+	((struct xocl_dna_funcs *)SUBDEV(xdev,	\
+	XOCL_SUBDEV_DNA).ops)
+#define	xocl_dna_status(xdev)			\
+	(DNA_DEV(xdev) ? DNA_OPS(xdev)->status(DNA_DEV(xdev)) : 0)
+#define	xocl_dna_capability(xdev)			\
+	(DNA_DEV(xdev) ? DNA_OPS(xdev)->capability(DNA_DEV(xdev)) : 2)
+#define xocl_dna_write_cert(xdev, data, len)  \
+	(DNA_DEV(xdev) ? DNA_OPS(xdev)->write_cert(DNA_DEV(xdev), data, len) : 0)
+
+#define	MB_DEV(xdev)		\
+	SUBDEV(xdev, XOCL_SUBDEV_MB).pldev
+#define	MB_OPS(xdev)		\
+	((struct xocl_mb_funcs *)SUBDEV(xdev,	\
+	XOCL_SUBDEV_MB).ops)
+#define	xocl_mb_reset(xdev)			\
+	(XMC_DEV(xdev) ? XMC_OPS(xdev)->reset(XMC_DEV(xdev)) : \
+	(MB_DEV(xdev) ? MB_OPS(xdev)->reset(MB_DEV(xdev)) : NULL))
+
+#define	xocl_mb_stop(xdev)			\
+	(XMC_DEV(xdev) ? XMC_OPS(xdev)->stop(XMC_DEV(xdev)) : \
+	(MB_DEV(xdev) ? MB_OPS(xdev)->stop(MB_DEV(xdev)) : -ENODEV))
+
+#define xocl_mb_load_mgmt_image(xdev, buf, len)		\
+	(XMC_DEV(xdev) ? XMC_OPS(xdev)->load_mgmt_image(XMC_DEV(xdev), buf, len) :\
+	(MB_DEV(xdev) ? MB_OPS(xdev)->load_mgmt_image(MB_DEV(xdev), buf, len) :\
+	-ENODEV))
+#define xocl_mb_load_sche_image(xdev, buf, len)		\
+	(XMC_DEV(xdev) ? XMC_OPS(xdev)->load_sche_image(XMC_DEV(xdev), buf, len) :\
+	(MB_DEV(xdev) ? MB_OPS(xdev)->load_sche_image(MB_DEV(xdev), buf, len) :\
+	-ENODEV))
+
+#define xocl_xmc_get_data(xdev, cmd)			\
+	(XMC_DEV(xdev) ? XMC_OPS(xdev)->get_data(XMC_DEV(xdev), cmd) : -ENODEV)
+
+/*
+ * mailbox callbacks
+ */
+enum mailbox_request {
+	MAILBOX_REQ_UNKNOWN = 0,
+	MAILBOX_REQ_TEST_READY,
+	MAILBOX_REQ_TEST_READ,
+	MAILBOX_REQ_LOCK_BITSTREAM,
+	MAILBOX_REQ_UNLOCK_BITSTREAM,
+	MAILBOX_REQ_HOT_RESET,
+	MAILBOX_REQ_FIREWALL,
+	MAILBOX_REQ_GPCTL,
+	MAILBOX_REQ_LOAD_XCLBIN_KADDR,
+	MAILBOX_REQ_LOAD_XCLBIN,
+	MAILBOX_REQ_RECLOCK,
+	MAILBOX_REQ_PEER_DATA,
+	MAILBOX_REQ_CONN_EXPL,
+};
+
+enum mb_cmd_type {
+	MB_CMD_DEFAULT = 0,
+	MB_CMD_LOAD_XCLBIN,
+	MB_CMD_RECLOCK,
+	MB_CMD_CONN_EXPL,
+	MB_CMD_LOAD_XCLBIN_KADDR,
+	MB_CMD_READ_FROM_PEER,
+};
+struct mailbox_req_bitstream_lock {
+	pid_t pid;
+	uuid_t uuid;
+};
+
+struct mailbox_subdev_peer {
+	enum data_kind kind;
+};
+
+struct mailbox_bitstream_kaddr {
+	uint64_t addr;
+};
+
+struct mailbox_gpctl {
+	enum mb_cmd_type cmd_type;
+	uint32_t data_total_len;
+	uint64_t priv_data;
+	void *data_ptr;
+};
+
+
+struct mailbox_req {
+	enum mailbox_request req;
+	uint32_t data_total_len;
+	uint64_t flags;
+	char data[0];
+};
+
+#define MB_PROT_VER_MAJOR 0
+#define MB_PROT_VER_MINOR 5
+#define MB_PROTOCOL_VER   ((MB_PROT_VER_MAJOR<<8) + MB_PROT_VER_MINOR)
+
+#define MB_PEER_CONNECTED 0x1
+#define MB_PEER_SAME_DOM  0x2
+#define MB_PEER_SAMEDOM_CONNECTED (MB_PEER_CONNECTED | MB_PEER_SAME_DOM)
+
+typedef	void (*mailbox_msg_cb_t)(void *arg, void *data, size_t len,
+	u64 msgid, int err);
+struct xocl_mailbox_funcs {
+	int (*request)(struct platform_device *pdev, void *req,
+		size_t reqlen, void *resp, size_t *resplen,
+		mailbox_msg_cb_t cb, void *cbarg);
+	int (*post)(struct platform_device *pdev, u64 req_id,
+		void *resp, size_t len);
+	int (*listen)(struct platform_device *pdev,
+		mailbox_msg_cb_t cb, void *cbarg);
+	int (*reset)(struct platform_device *pdev, bool end_of_reset);
+	int (*get_data)(struct platform_device *pdev, enum data_kind kind);
+};
+#define	MAILBOX_DEV(xdev)	SUBDEV(xdev, XOCL_SUBDEV_MAILBOX).pldev
+#define	MAILBOX_OPS(xdev)	\
+	((struct xocl_mailbox_funcs *)SUBDEV(xdev, XOCL_SUBDEV_MAILBOX).ops)
+#define MAILBOX_READY(xdev)	(MAILBOX_DEV(xdev) && MAILBOX_OPS(xdev))
+#define	xocl_peer_request(xdev, req, reqlen, resp, resplen, cb, cbarg)		\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->request(MAILBOX_DEV(xdev), \
+	req, reqlen, resp, resplen, cb, cbarg) : -ENODEV)
+#define	xocl_peer_response(xdev, reqid, buf, len)			\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->post(MAILBOX_DEV(xdev), \
+	reqid, buf, len) : -ENODEV)
+#define	xocl_peer_notify(xdev, req, reqlen)					\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->post(MAILBOX_DEV(xdev), 0, \
+	req, reqlen) : -ENODEV)
+#define	xocl_peer_listen(xdev, cb, cbarg)				\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->listen(MAILBOX_DEV(xdev), \
+	cb, cbarg) : -ENODEV)
+#define	xocl_mailbox_reset(xdev, end)				\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->reset(MAILBOX_DEV(xdev), \
+	end) : -ENODEV)
+#define	xocl_mailbox_get_data(xdev, kind)				\
+	(MAILBOX_READY(xdev) ? MAILBOX_OPS(xdev)->get_data(MAILBOX_DEV(xdev), kind) \
+		: -ENODEV)
+
+struct xocl_icap_funcs {
+	void (*reset_axi_gate)(struct platform_device *pdev);
+	int (*reset_bitstream)(struct platform_device *pdev);
+	int (*download_bitstream_axlf)(struct platform_device *pdev,
+		const void __user *arg);
+	int (*download_boot_firmware)(struct platform_device *pdev);
+	int (*ocl_set_freq)(struct platform_device *pdev,
+		unsigned int region, unsigned short *freqs, int num_freqs);
+	int (*ocl_get_freq)(struct platform_device *pdev,
+		unsigned int region, unsigned short *freqs, int num_freqs);
+	int (*ocl_update_clock_freq_topology)(struct platform_device *pdev, struct xclmgmt_ioc_freqscaling *freqs);
+	int (*ocl_lock_bitstream)(struct platform_device *pdev,
+		const uuid_t *uuid, pid_t pid);
+	int (*ocl_unlock_bitstream)(struct platform_device *pdev,
+		const uuid_t *uuid, pid_t pid);
+	uint64_t (*get_data)(struct platform_device *pdev,
+		enum data_kind kind);
+};
+#define	ICAP_DEV(xdev)	SUBDEV(xdev, XOCL_SUBDEV_ICAP).pldev
+#define	ICAP_OPS(xdev)							\
+	((struct xocl_icap_funcs *)SUBDEV(xdev, XOCL_SUBDEV_ICAP).ops)
+#define	xocl_icap_reset_axi_gate(xdev)					\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->reset_axi_gate(ICAP_DEV(xdev)) :		\
+	NULL)
+#define	xocl_icap_reset_bitstream(xdev)					\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->reset_bitstream(ICAP_DEV(xdev)) :		\
+	-ENODEV)
+#define	xocl_icap_download_axlf(xdev, xclbin)				\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->download_bitstream_axlf(ICAP_DEV(xdev), xclbin) : \
+	-ENODEV)
+#define	xocl_icap_download_boot_firmware(xdev)				\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->download_boot_firmware(ICAP_DEV(xdev)) :	\
+	-ENODEV)
+#define	xocl_icap_ocl_get_freq(xdev, region, freqs, num)		\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->ocl_get_freq(ICAP_DEV(xdev), region, freqs, num) : \
+	-ENODEV)
+#define	xocl_icap_ocl_update_clock_freq_topology(xdev, freqs)		\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->ocl_update_clock_freq_topology(ICAP_DEV(xdev), freqs) : \
+	-ENODEV)
+#define	xocl_icap_ocl_set_freq(xdev, region, freqs, num)		\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->ocl_set_freq(ICAP_DEV(xdev), region, freqs, num) : \
+	-ENODEV)
+#define	xocl_icap_lock_bitstream(xdev, uuid, pid)			\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->ocl_lock_bitstream(ICAP_DEV(xdev), uuid, pid) :	\
+	-ENODEV)
+#define	xocl_icap_unlock_bitstream(xdev, uuid, pid)			\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->ocl_unlock_bitstream(ICAP_DEV(xdev), uuid, pid) : \
+	-ENODEV)
+#define	xocl_icap_get_data(xdev, kind)				\
+	(ICAP_OPS(xdev) ?						\
+	ICAP_OPS(xdev)->get_data(ICAP_DEV(xdev), kind) : \
+	0)
+
+/* helper functions */
+xdev_handle_t xocl_get_xdev(struct platform_device *pdev);
+void xocl_init_dsa_priv(xdev_handle_t xdev_hdl);
+
+/* subdev functions */
+int xocl_subdev_create_multi_inst(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info);
+int xocl_subdev_create_one(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info);
+int xocl_subdev_create_by_id(xdev_handle_t xdev_hdl, int id);
+int xocl_subdev_create_all(xdev_handle_t xdev_hdl,
+			   struct xocl_subdev_info *sdev_info, u32 subdev_num);
+void xocl_subdev_destroy_one(xdev_handle_t xdev_hdl, u32 subdev_id);
+void xocl_subdev_destroy_all(xdev_handle_t xdev_hdl);
+void xocl_subdev_destroy_by_id(xdev_handle_t xdev_hdl, int id);
+
+int xocl_subdev_create_by_name(xdev_handle_t xdev_hdl, char *name);
+int xocl_subdev_destroy_by_name(xdev_handle_t xdev_hdl, char *name);
+
+int xocl_subdev_get_devinfo(uint32_t subdev_id,
+			    struct xocl_subdev_info *subdev_info, struct resource *res);
+
+void xocl_subdev_register(struct platform_device *pldev, u32 id,
+			  void *cb_funcs);
+void xocl_fill_dsa_priv(xdev_handle_t xdev_hdl, struct xocl_board_private *in);
+int xocl_xrt_version_check(xdev_handle_t xdev_hdl,
+			   struct axlf *bin_obj, bool major_only);
+int xocl_alloc_dev_minor(xdev_handle_t xdev_hdl);
+void xocl_free_dev_minor(xdev_handle_t xdev_hdl);
+
+/* context helpers */
+extern struct mutex xocl_drvinst_mutex;
+extern struct xocl_drvinst *xocl_drvinst_array[XOCL_MAX_DEVICES * 10];
+
+void *xocl_drvinst_alloc(struct device *dev, u32 size);
+void xocl_drvinst_free(void *data);
+void *xocl_drvinst_open(void *file_dev);
+void xocl_drvinst_close(void *data);
+void xocl_drvinst_set_filedev(void *data, void *file_dev);
+
+/* health thread functions */
+int health_thread_start(xdev_handle_t xdev);
+int health_thread_stop(xdev_handle_t xdev);
+
+/* init functions */
+int __init xocl_init_userpf(void);
+void xocl_fini_fini_userpf(void);
+
+int __init xocl_init_drv_user_qdma(void);
+void xocl_fini_drv_user_qdma(void);
+
+int __init xocl_init_feature_rom(void);
+void xocl_fini_feature_rom(void);
+
+int __init xocl_init_xdma(void);
+void xocl_fini_xdma(void);
+
+int __init xocl_init_qdma(void);
+void xocl_fini_qdma(void);
+
+int __init xocl_init_mb_scheduler(void);
+void xocl_fini_mb_scheduler(void);
+
+int __init xocl_init_xvc(void);
+void xocl_fini_xvc(void);
+
+int __init xocl_init_firewall(void);
+void xocl_fini_firewall(void);
+
+int __init xocl_init_sysmon(void);
+void xocl_fini_sysmon(void);
+
+int __init xocl_init_mb(void);
+void xocl_fini_mb(void);
+
+int __init xocl_init_xiic(void);
+void xocl_fini_xiic(void);
+
+int __init xocl_init_mailbox(void);
+void xocl_fini_mailbox(void);
+
+int __init xocl_init_icap(void);
+void xocl_fini_icap(void);
+
+int __init xocl_init_mig(void);
+void xocl_fini_mig(void);
+
+int __init xocl_init_xmc(void);
+void xocl_fini_xmc(void);
+
+int __init xocl_init_dna(void);
+void xocl_fini_dna(void);
+
+int __init xocl_init_fmgr(void);
+void xocl_fini_fmgr(void);
+#endif
diff --git a/drivers/gpu/drm/xocl/xocl_subdev.c b/drivers/gpu/drm/xocl/xocl_subdev.c
new file mode 100644
index 000000000000..0ade2af180b0
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xocl_subdev.c
@@ -0,0 +1,540 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2018-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include "xclfeatures.h"
+#include "xocl_drv.h"
+#include "version.h"
+
+struct xocl_subdev_array {
+	xdev_handle_t xdev_hdl;
+	int id;
+	struct platform_device **pldevs;
+	int count;
+};
+
+static DEFINE_IDA(xocl_dev_minor_ida);
+
+static DEFINE_IDA(subdev_multi_inst_ida);
+static struct xocl_dsa_vbnv_map dsa_vbnv_map[] = XOCL_DSA_VBNV_MAP;
+
+static struct platform_device *xocl_register_subdev(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info, bool multi_inst)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct platform_device *pldev;
+	struct xocl_subdev_private *priv;
+	resource_size_t iostart;
+	struct resource *res;
+	int sdev_id;
+	int i, retval;
+
+	if (multi_inst) {
+		sdev_id = ida_simple_get(&subdev_multi_inst_ida,
+			0, 0, GFP_KERNEL);
+		if (sdev_id < 0)
+			return NULL;
+	} else {
+		sdev_id = XOCL_DEV_ID(core->pdev);
+	}
+
+	xocl_info(&core->pdev->dev, "creating subdev %s", sdev_info->name);
+	pldev = platform_device_alloc(sdev_info->name, sdev_id);
+	if (!pldev) {
+		xocl_err(&core->pdev->dev, "failed to alloc device %s",
+			sdev_info->name);
+		retval = -ENOMEM;
+		goto error;
+	}
+
+	/* user bar is determined dynamically */
+	iostart = pci_resource_start(core->pdev, core->bar_idx);
+
+	if (sdev_info->num_res > 0) {
+		res = devm_kzalloc(&pldev->dev, sizeof(*res) *
+			sdev_info->num_res, GFP_KERNEL);
+		if (!res) {
+			xocl_err(&pldev->dev, "out of memory");
+			retval = -ENOMEM;
+			goto error;
+		}
+		memcpy(res, sdev_info->res, sizeof(*res) * sdev_info->num_res);
+
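+		/* Convert BAR-relative offsets into absolute PCI bus addresses */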
+		for (i = 0; i < sdev_info->num_res; i++) {
+			if (sdev_info->res[i].flags & IORESOURCE_MEM) {
+				res[i].start += iostart;
+				res[i].end += iostart;
+			}
+		}
+
+		retval = platform_device_add_resources(pldev,
+			res, sdev_info->num_res);
+		devm_kfree(&pldev->dev, res);
+		if (retval) {
+			xocl_err(&pldev->dev, "failed to add res");
+			goto error;
+		}
+
+		priv = vzalloc(sizeof(*priv) + sdev_info->data_len);
+		if (!priv) {
+			retval = -ENOMEM;
+			goto error;
+		}
+		if (sdev_info->data_len > 0 && sdev_info->priv_data) {
+			memcpy(priv->priv_data, sdev_info->priv_data,
+				sdev_info->data_len);
+		}
+		priv->id = sdev_info->id;
+		priv->is_multi = multi_inst;
+		retval = platform_device_add_data(pldev,
+			priv, sizeof(*priv) + sdev_info->data_len);
+		vfree(priv);
+		if (retval) {
+			xocl_err(&pldev->dev, "failed to add data");
+			goto error;
+		}
+	}
+
+	pldev->dev.parent = &core->pdev->dev;
+
+	retval = platform_device_add(pldev);
+	if (retval) {
+		xocl_err(&pldev->dev, "failed to add device");
+		goto error;
+	}
+
+	return pldev;
+
+error:
+	platform_device_put(pldev);
+	return NULL;
+}
+
+int xocl_subdev_get_devinfo(uint32_t subdev_id,
+	struct xocl_subdev_info *info, struct resource *res)
+{
+	switch (subdev_id) {
+	case XOCL_SUBDEV_DNA:
+		*info = (struct xocl_subdev_info)XOCL_DEVINFO_DNA;
+		break;
+	case XOCL_SUBDEV_MIG:
+		*info = (struct xocl_subdev_info)XOCL_DEVINFO_MIG;
+		break;
+	default:
+		return -ENODEV;
+	}
+	/* Only support retrieving subdev info with 1 base address and no irq */
+	if (info->num_res > 1)
+		return -EINVAL;
+	*res = *info->res;
+	info->res = res;
+	return 0;
+}
+
+/*
+ * Instantiate a subdevice that supports more than one instance.
+ * Restrictions:
+ * 1. it cannot expose interfaces for other parts of the driver to call
+ * 2. a given subdevice type can be created either as a single instance or
+ *    as multiple instances, but not both.
+ */
+int xocl_subdev_create_multi_inst(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info)
+{
+	int ret = 0;
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct platform_device *pldev;
+
+	device_lock(&core->pdev->dev);
+	pldev = xocl_register_subdev(core, sdev_info, true);
+	if (!pldev) {
+		xocl_err(&core->pdev->dev,
+			"failed to reg multi instance subdev %s",
+			sdev_info->name);
+		ret = -ENOMEM;
+	}
+	device_unlock(&core->pdev->dev);
+
+	return ret;
+}
+
+int xocl_subdev_create_one(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct pci_dev *pdev = core->pdev;
+	u32	id = sdev_info->id;
+	int	ret = 0;
+
+	if (core->subdevs[id].pldev)
+		return 0;
+
+	core->subdevs[id].pldev = xocl_register_subdev(core, sdev_info, false);
+	if (!core->subdevs[id].pldev) {
+		xocl_err(&pdev->dev, "failed to register subdev %s",
+			sdev_info->name);
+		ret = -EINVAL;
+		goto failed;
+	}
+	/*
+	 * Force probe to avoid dependency issues. If probing fails, the
+	 * device may not be present on the board, so delete it.
+	 */
+	ret = device_attach(&core->subdevs[id].pldev->dev);
+	if (ret != 1) {
+		xocl_err(&pdev->dev, "failed to probe subdev %s, ret %d",
+			sdev_info->name, ret);
+		ret = -ENODEV;
+		goto failed;
+	}
+	xocl_info(&pdev->dev, "Created subdev %s", sdev_info->name);
+
+	return 0;
+
+failed:
+	return ret;
+}
+
+int xocl_subdev_create_by_name(xdev_handle_t xdev_hdl, char *name)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	int i, n;
+
+	for (i = 0; i < core->priv.subdev_num; i++) {
+		n = strlen(name);
+		if (name[n - 1] == '\n')
+			n--;
+		if (!strncmp(core->priv.subdev_info[i].name, name, n))
+			break;
+	}
+	if (i == core->priv.subdev_num)
+		return -ENODEV;
+
+	return xocl_subdev_create_one(xdev_hdl,
+			&core->priv.subdev_info[i]);
+}
+
+int xocl_subdev_destroy_by_name(xdev_handle_t xdev_hdl, char *name)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	int i, n;
+
+	for (i = 0; i < core->priv.subdev_num; i++) {
+		n = strlen(name);
+		if (name[n - 1] == '\n')
+			n--;
+		if (!strncmp(core->priv.subdev_info[i].name, name, n))
+			break;
+	}
+	if (i == core->priv.subdev_num)
+		return -ENODEV;
+
+	xocl_subdev_destroy_one(xdev_hdl, core->priv.subdev_info[i].id);
+
+	return 0;
+}
+
+int xocl_subdev_create_by_id(xdev_handle_t xdev_hdl, int id)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	int i;
+
+	for (i = 0; i < core->priv.subdev_num; i++)
+		if (core->priv.subdev_info[i].id == id)
+			break;
+	if (i == core->priv.subdev_num)
+		return -ENOENT;
+
+	return xocl_subdev_create_one(xdev_hdl,
+			&core->priv.subdev_info[i]);
+}
+
+int xocl_subdev_create_all(xdev_handle_t xdev_hdl,
+	struct xocl_subdev_info *sdev_info, u32 subdev_num)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct FeatureRomHeader rom;
+	u32	id;
+	int	i, ret = 0;
+
+	/* lookup update table */
+	ret = xocl_subdev_create_one(xdev_hdl,
+		&(struct xocl_subdev_info)XOCL_DEVINFO_FEATURE_ROM);
+	if (ret)
+		goto failed;
+
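+	/*
+	 * Match the VBNV string reported by the feature ROM against the
+	 * static DSA VBNV map; on a hit, use that entry's subdev table and
+	 * private data instead of the defaults passed in by the caller.
+	 */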
+	xocl_get_raw_header(core, &rom);
+	for (i = 0; i < ARRAY_SIZE(dsa_vbnv_map); i++) {
+		if ((core->pdev->vendor == dsa_vbnv_map[i].vendor ||
+			dsa_vbnv_map[i].vendor == (u16)PCI_ANY_ID) &&
+			(core->pdev->device == dsa_vbnv_map[i].device ||
+			dsa_vbnv_map[i].device == (u16)PCI_ANY_ID) &&
+			(core->pdev->subsystem_device ==
+			dsa_vbnv_map[i].subdevice ||
+			dsa_vbnv_map[i].subdevice == (u16)PCI_ANY_ID) &&
+			!strncmp(rom.VBNVName, dsa_vbnv_map[i].vbnv,
+			sizeof(rom.VBNVName))) {
+			sdev_info = dsa_vbnv_map[i].priv_data->subdev_info;
+			subdev_num = dsa_vbnv_map[i].priv_data->subdev_num;
+			xocl_fill_dsa_priv(xdev_hdl, dsa_vbnv_map[i].priv_data);
+			break;
+		}
+	}
+
+	core->subdev_num = subdev_num;
+
+	/* create subdevices */
+	for (i = 0; i < core->subdev_num; i++) {
+		id = sdev_info[i].id;
+		if (core->subdevs[id].pldev)
+			continue;
+
+		ret = xocl_subdev_create_one(xdev_hdl, &sdev_info[i]);
+		if (ret)
+			goto failed;
+	}
+
+	return 0;
+
+failed:
+	xocl_subdev_destroy_all(xdev_hdl);
+	return ret;
+}
+
+void xocl_subdev_destroy_one(xdev_handle_t xdev_hdl, uint32_t subdev_id)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+
+	if (subdev_id == INVALID_SUBDEVICE)
+		return;
+	if (core->subdevs[subdev_id].pldev) {
+		device_release_driver(&core->subdevs[subdev_id].pldev->dev);
+		platform_device_unregister(core->subdevs[subdev_id].pldev);
+		core->subdevs[subdev_id].pldev = NULL;
+	}
+}
+
+static int match_multi_inst_subdevs(struct device *dev, void *data)
+{
+	struct xocl_subdev_array *subdevs = (struct xocl_subdev_array *)data;
+	struct xocl_dev_core *core = (struct xocl_dev_core *)subdevs->xdev_hdl;
+	struct platform_device *pldev = to_platform_device(dev);
+	struct xocl_subdev_private *priv = dev_get_platdata(dev);
+
+	if (dev->parent == &core->pdev->dev &&
+		priv && priv->is_multi) {
+		if (subdevs->pldevs != NULL)
+			subdevs->pldevs[subdevs->count] = pldev;
+		subdevs->count++;
+	}
+
+	return 0;
+}
+
+static int match_subdev_by_id(struct device *dev, void *data)
+{
+	struct xocl_subdev_array *subdevs = (struct xocl_subdev_array *)data;
+	struct xocl_dev_core *core = (struct xocl_dev_core *)subdevs->xdev_hdl;
+	struct xocl_subdev_private *priv = dev_get_platdata(dev);
+
+	if (dev->parent == &core->pdev->dev &&
+		priv && priv->id == subdevs->id) {
+		if (subdevs->pldevs != NULL)
+			subdevs->pldevs[subdevs->count] =
+				to_platform_device(dev);
+		subdevs->count++;
+	}
+
+	return 0;
+}
+
+static void xocl_subdev_destroy_common(xdev_handle_t xdev_hdl,
+	int (*match)(struct device *dev, void *data),
+	struct xocl_subdev_array *subdevs)
+{
+	int i;
+	struct xocl_subdev_private *priv;
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+
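+	/*
+	 * Two passes over the platform bus: the first counts matching
+	 * subdevices (pldevs is still NULL), the second, after allocating
+	 * pldevs, collects them for teardown.
+	 */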
+	bus_for_each_dev(&platform_bus_type, NULL, subdevs,
+		match);
+	if (subdevs->count == 0)
+		return;
+
+	subdevs->pldevs = vzalloc(sizeof(*subdevs->pldevs) * subdevs->count);
+	if (!subdevs->pldevs)
+		return;
+	subdevs->count = 0;
+
+	bus_for_each_dev(&platform_bus_type, NULL, subdevs,
+		match);
+
+	for (i = 0; i < subdevs->count; i++) {
+		priv = dev_get_platdata(&subdevs->pldevs[i]->dev);
+		if (priv->is_multi)
+			ida_simple_remove(&subdev_multi_inst_ida,
+				subdevs->pldevs[i]->id);
+		else
+			core->subdevs[subdevs->id].pldev = NULL;
+		device_release_driver(&subdevs->pldevs[i]->dev);
+		platform_device_unregister(subdevs->pldevs[i]);
+	}
+
+	vfree(subdevs->pldevs);
+}
+
+void xocl_subdev_destroy_by_id(xdev_handle_t xdev_hdl, int id)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct xocl_subdev_array subdevs;
+
+	memset(&subdevs, 0, sizeof(subdevs));
+	subdevs.xdev_hdl = xdev_hdl;
+	subdevs.id = id;
+
+	device_lock(&core->pdev->dev);
+	xocl_subdev_destroy_common(xdev_hdl,
+		match_subdev_by_id, &subdevs);
+	device_unlock(&core->pdev->dev);
+}
+
+void xocl_subdev_destroy_all(xdev_handle_t xdev_hdl)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct xocl_subdev_array subdevs;
+	int	i;
+
+	memset(&subdevs, 0, sizeof(subdevs));
+	subdevs.xdev_hdl = xdev_hdl;
+
+	xocl_subdev_destroy_common(xdev_hdl,
+		match_multi_inst_subdevs, &subdevs);
+
+	for (i = ARRAY_SIZE(core->subdevs) - 1; i >= 0; i--)
+		xocl_subdev_destroy_one(xdev_hdl, i);
+
+	core->subdev_num = 0;
+}
+
+void xocl_subdev_register(struct platform_device *pldev, u32 id,
+	void *cb_funcs)
+{
+	struct xocl_dev_core		*core;
+
+	BUG_ON(id >= XOCL_SUBDEV_NUM);
+	core = xocl_get_xdev(pldev);
+	BUG_ON(!core);
+
+	core->subdevs[id].ops = cb_funcs;
+}
+
+xdev_handle_t xocl_get_xdev(struct platform_device *pdev)
+{
+	struct device *dev;
+
+	dev = pdev->dev.parent;
+
+	return dev ? pci_get_drvdata(to_pci_dev(dev)) : NULL;
+}
+
+void xocl_fill_dsa_priv(xdev_handle_t xdev_hdl, struct xocl_board_private *in)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+	struct pci_dev *pdev = core->pdev;
+	unsigned int i;
+
+	memset(&core->priv, 0, sizeof(core->priv));
+	/*
+	 * Follow the Xilinx device ID and subsystem ID encoding rules to set
+	 * the DSA private data. These values can be overridden in the subdev
+	 * header file.
+	 */
+	if ((pdev->device >> 5) & 0x1)
+		core->priv.xpr = true;
+
+	core->priv.dsa_ver = pdev->subsystem_device & 0xff;
+
+	/* data defined in subdev header */
+	core->priv.subdev_info = in->subdev_info;
+	core->priv.subdev_num = in->subdev_num;
+	core->priv.flags = in->flags;
+	core->priv.flash_type = in->flash_type;
+	core->priv.board_name = in->board_name;
+	core->priv.mpsoc = in->mpsoc;
+	if (in->flags & XOCL_DSAFLAG_SET_DSA_VER)
+		core->priv.dsa_ver = in->dsa_ver;
+	if (in->flags & XOCL_DSAFLAG_SET_XPR)
+		core->priv.xpr = in->xpr;
+
+	for (i = 0; i < in->subdev_num; i++) {
+		if (in->subdev_info[i].id == XOCL_SUBDEV_FEATURE_ROM) {
+			core->feature_rom_offset =
+				in->subdev_info[i].res[0].start;
+			break;
+		}
+	}
+}
+
+int xocl_xrt_version_check(xdev_handle_t xdev_hdl,
+	struct axlf *bin_obj, bool major_only)
+{
+	u32 major, minor, patch;
+	/* Check the runtime version:
+	 *    1. if it is 0.0.xxxx, this implies an old xclbin, so the
+	 *       check passes anyway.
+	 *    2. otherwise compare major and minor; return an error on mismatch.
+	 */
+	if (sscanf(xrt_build_version, "%d.%d.%d", &major, &minor, &patch) != 3)
+		return -ENODEV;
+
+	if (major != bin_obj->m_header.m_versionMajor &&
+		bin_obj->m_header.m_versionMajor != 0)
+		goto err;
+
+	if (major_only)
+		return 0;
+
+	if ((major != bin_obj->m_header.m_versionMajor ||
+		minor != bin_obj->m_header.m_versionMinor) &&
+		!(bin_obj->m_header.m_versionMajor == 0 &&
+		bin_obj->m_header.m_versionMinor == 0))
+		goto err;
+
+	return 0;
+
+err:
+	xocl_err(&XDEV(xdev_hdl)->pdev->dev,
+		"Mismatch xrt version, xrt %s, xclbin %d.%d.%d", xrt_build_version,
+		bin_obj->m_header.m_versionMajor,
+		bin_obj->m_header.m_versionMinor,
+		bin_obj->m_header.m_versionPatch);
+
+	return -EINVAL;
+}
+
+int xocl_alloc_dev_minor(xdev_handle_t xdev_hdl)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+
+	core->dev_minor = ida_simple_get(&xocl_dev_minor_ida,
+		0, 0, GFP_KERNEL);
+
+	if (core->dev_minor < 0) {
+		xocl_err(&core->pdev->dev, "Failed to alloc dev minor");
+		core->dev_minor = XOCL_INVALID_MINOR;
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+void xocl_free_dev_minor(xdev_handle_t xdev_hdl)
+{
+	struct xocl_dev_core *core = (struct xocl_dev_core *)xdev_hdl;
+
+	if (core->dev_minor != XOCL_INVALID_MINOR) {
+		ida_simple_remove(&xocl_dev_minor_ida, core->dev_minor);
+		core->dev_minor = XOCL_INVALID_MINOR;
+	}
+}
diff --git a/drivers/gpu/drm/xocl/xocl_thread.c b/drivers/gpu/drm/xocl/xocl_thread.c
new file mode 100644
index 000000000000..07cce3f5921b
--- /dev/null
+++ b/drivers/gpu/drm/xocl/xocl_thread.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ *  Copyright (C) 2017 Xilinx, Inc. All rights reserved.
+ *
+ *  Thread to check sysmon/firewall status for errors/issues
+ *  Author: Lizhi.Hou@Xilinx.com
+ *
+ */
+
+#include <linux/kthread.h>
+#include "xocl_drv.h"
+
+int health_thread(void *data)
+{
+	struct xocl_health_thread_arg *thread_arg = data;
+
+	while (!kthread_should_stop()) {
+		msleep_interruptible(thread_arg->interval);
+
+		thread_arg->health_cb(thread_arg->arg);
+	}
+	xocl_info(thread_arg->dev, "The health thread has terminated.");
+	return 0;
+}
+
+int health_thread_start(xdev_handle_t xdev)
+{
+	struct xocl_dev_core *core = XDEV(xdev);
+
+	xocl_info(&core->pdev->dev, "init_health_thread");
+	/* Fill in the thread argument before the thread starts running */
+	core->thread_arg.dev = &core->pdev->dev;
+
+	core->health_thread = kthread_run(health_thread, &core->thread_arg,
+		"xocl_health_thread");
+
+	if (IS_ERR(core->health_thread)) {
+		xocl_err(&core->pdev->dev, "ERROR! health thread init");
+		core->health_thread = NULL;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+int health_thread_stop(xdev_handle_t xdev)
+{
+	struct xocl_dev_core *core = XDEV(xdev);
+	int ret;
+
+	if (!core->health_thread)
+		return 0;
+
+	ret = kthread_stop(core->health_thread);
+	core->health_thread = NULL;
+
+	xocl_info(&core->pdev->dev, "fini_health_thread. ret = %d\n", ret);
+	if (ret != -EINTR) {
+		xocl_err(&core->pdev->dev, "The health thread has terminated");
+		ret = 0;
+	}
+
+	return ret;
+}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC PATCH Xilinx Alveo 3/6] Add platform drivers for various IPs and frameworks
  2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
  2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 1/6] Add skeleton code: ioctl definitions and build hooks sonal.santan
  2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 2/6] Global data structures shared between xocl and xmgmt drivers sonal.santan
@ 2019-03-19 21:53 ` sonal.santan
  2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 4/6] Add core of XDMA driver sonal.santan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: sonal.santan @ 2019-03-19 21:53 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/xocl/subdev/dna.c          |  356 +++
 drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 +++
 drivers/gpu/drm/xocl/subdev/firewall.c     |  389 +++
 drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 ++
 drivers/gpu/drm/xocl/subdev/icap.c         | 2859 ++++++++++++++++++
 drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 ++++++++++++
 drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++++++++
 drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 +++++
 drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
 drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 +++
 drivers/gpu/drm/xocl/subdev/xdma.c         |  510 ++++
 drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 ++++++++++
 drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
 13 files changed, 12955 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c
 create mode 100644 drivers/gpu/drm/xocl/subdev/xvc.c

diff --git a/drivers/gpu/drm/xocl/subdev/dna.c b/drivers/gpu/drm/xocl/subdev/dna.c
new file mode 100644
index 000000000000..991d98e5b9aa
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/dna.c
@@ -0,0 +1,356 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Chien-Wei Lan <chienwei@xilinx.com>
+ *
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/vmalloc.h>
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+/* Registers are defined in pg150-ultrascale-memory-ip.pdf:
+ * AXI4-Lite Slave Control/Status Register Map
+ */
+#define XLNX_DNA_MEMORY_MAP_MAGIC_IS_DEFINED                        (0x3E4D7732)
+#define XLNX_DNA_MAJOR_MINOR_VERSION_REGISTER_OFFSET                0x00          //  RO
+#define XLNX_DNA_REVISION_REGISTER_OFFSET                           0x04          //  RO
+#define XLNX_DNA_CAPABILITY_REGISTER_OFFSET                         0x08          //  RO
+//#define XLNX_DNA_SCRATCHPAD_REGISTER_OFFSET                         (0x0C)          //  RO (31-1) + RW (0)
+#define XLNX_DNA_STATUS_REGISTER_OFFSET                             0x10            //  RO
+#define XLNX_DNA_FSM_DNA_WORD_WRITE_COUNT_REGISTER_OFFSET           (0x14)          //  RO
+#define XLNX_DNA_FSM_CERTIFICATE_WORD_WRITE_COUNT_REGISTER_OFFSET   (0x18)          //  RO
+#define XLNX_DNA_MESSAGE_START_AXI_ONLY_REGISTER_OFFSET             (0x20)          //  RO (31-1) + RW (0)
+#define XLNX_DNA_READBACK_REGISTER_2_OFFSET                         0x40            //  RO XLNX_DNA_BOARD_DNA_95_64
+#define XLNX_DNA_READBACK_REGISTER_1_OFFSET                         0x44            //  RO XLNX_DNA_BOARD_DNA_63_32
+#define XLNX_DNA_READBACK_REGISTER_0_OFFSET                         0x48            //  RO XLNX_DNA_BOARD_DNA_31_0
+#define XLNX_DNA_DATA_AXI_ONLY_REGISTER_OFFSET                      (0x80)          //  WO
+#define XLNX_DNA_CERTIFICATE_DATA_AXI_ONLY_REGISTER_OFFSET          (0xC0)          //  WO - 512 bit aligned.
+#define XLNX_DNA_MAX_ADDRESS_WORDS                                  (0xC4)
+
+struct xocl_xlnx_dna {
+	void __iomem		*base;
+	struct device		*xlnx_dna_dev;
+	struct mutex		xlnx_dna_lock;
+};
+
+static ssize_t status_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xlnx_dna *xlnx_dna = dev_get_drvdata(dev);
+	u32 status;
+
+	status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+
+	return sprintf(buf, "0x%x\n", status);
+}
+static DEVICE_ATTR_RO(status);
+
+static ssize_t dna_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xlnx_dna *xlnx_dna = dev_get_drvdata(dev);
+	uint32_t dna96_64, dna63_32, dna31_0;
+
+	dna96_64 = ioread32(xlnx_dna->base+XLNX_DNA_READBACK_REGISTER_2_OFFSET);
+	dna63_32 = ioread32(xlnx_dna->base+XLNX_DNA_READBACK_REGISTER_1_OFFSET);
+	dna31_0  = ioread32(xlnx_dna->base+XLNX_DNA_READBACK_REGISTER_0_OFFSET);
+
+	return sprintf(buf, "%08x%08x%08x\n", dna96_64, dna63_32, dna31_0);
+}
+static DEVICE_ATTR_RO(dna);
+
+static ssize_t capability_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xlnx_dna *xlnx_dna = dev_get_drvdata(dev);
+	u32 capability;
+
+	capability = ioread32(xlnx_dna->base+XLNX_DNA_CAPABILITY_REGISTER_OFFSET);
+
+	return sprintf(buf, "0x%x\n", capability);
+}
+static DEVICE_ATTR_RO(capability);
+
+
+static ssize_t dna_version_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xlnx_dna *xlnx_dna = dev_get_drvdata(dev);
+	u32 version;
+
+	version = ioread32(xlnx_dna->base+XLNX_DNA_MAJOR_MINOR_VERSION_REGISTER_OFFSET);
+
+	return sprintf(buf, "%d.%d\n", version>>16, version & 0xffff);
+}
+static DEVICE_ATTR_RO(dna_version);
+
+static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xlnx_dna *xlnx_dna = dev_get_drvdata(dev);
+	u32 revision;
+
+	revision = ioread32(xlnx_dna->base+XLNX_DNA_REVISION_REGISTER_OFFSET);
+
+	return sprintf(buf, "%d\n", revision);
+}
+static DEVICE_ATTR_RO(revision);
+
+static struct attribute *xlnx_dna_attributes[] = {
+	&dev_attr_status.attr,
+	&dev_attr_dna.attr,
+	&dev_attr_capability.attr,
+	&dev_attr_dna_version.attr,
+	&dev_attr_revision.attr,
+	NULL
+};
+
+static const struct attribute_group xlnx_dna_attrgroup = {
+	.attrs = xlnx_dna_attributes,
+};
+
+static uint32_t dna_status(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna *xlnx_dna = platform_get_drvdata(pdev);
+	uint32_t status = 0;
+	uint8_t retries = 10;
+	bool rsa4096done = false;
+
+	if (!xlnx_dna)
+		return status;
+
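+	/*
+	 * Poll until the RSA-4096 done bit (status bit 8) is set or the
+	 * retries run out.
+	 */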
+	while (!rsa4096done && retries) {
+		status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+		if (status>>8 & 0x1) {
+			rsa4096done = true;
+			break;
+		}
+		msleep(1);
+		retries--;
+	}
+
+	if (retries == 0)
+		return -EBUSY;
+
+	status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+
+	return status;
+}
+
+static uint32_t dna_capability(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna *xlnx_dna = platform_get_drvdata(pdev);
+	u32 capability = 0;
+
+	if (!xlnx_dna)
+		return capability;
+
+	capability = ioread32(xlnx_dna->base+XLNX_DNA_CAPABILITY_REGISTER_OFFSET);
+
+	return capability;
+}
+
+static void dna_write_cert(struct platform_device *pdev, const uint32_t *cert, uint32_t len)
+{
+	struct xocl_xlnx_dna *xlnx_dna = platform_get_drvdata(pdev);
+	int i, j, k;
+	u32 status = 0, words;
+	uint8_t retries = 100;
+	bool sha256done = false;
+	uint32_t convert;
+	uint32_t sign_start, message_words = (len-512)>>2;
+
+	sign_start = message_words;
+
+	if (!xlnx_dna)
+		return;
+
+	iowrite32(0x1, xlnx_dna->base+XLNX_DNA_MESSAGE_START_AXI_ONLY_REGISTER_OFFSET);
+	status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+	xocl_info(&pdev->dev, "Start: status %08x", status);
+
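+	/*
+	 * Stream the message portion of the certificate in 512-bit (16-word)
+	 * bursts, byte-swapping each 32-bit word, and wait for the SHA256
+	 * engine to be ready (status bit 4 clear) before each burst.
+	 */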
+	for (i = 0; i < message_words; i += 16) {
+
+		retries = 100;
+		sha256done = false;
+
+		while (!sha256done && retries) {
+			status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+			if (!(status >> 4 & 0x1)) {
+				sha256done = true;
+				break;
+			}
+			msleep(10);
+			retries--;
+		}
+		for (j = 0; j < 16; ++j) {
+			convert = (*(cert+i+j) >> 24 & 0xff) | (*(cert+i+j) >> 8 & 0xff00) |
+				(*(cert+i+j) << 8 & 0xff0000) | ((*(cert+i+j) & 0xff) << 24);
+			iowrite32(convert, xlnx_dna->base+XLNX_DNA_DATA_AXI_ONLY_REGISTER_OFFSET+j*4);
+		}
+	}
+	retries = 100;
+	sha256done = false;
+	while (!sha256done && retries) {
+		status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+		if (!(status >> 4 & 0x1)) {
+			sha256done = true;
+			break;
+		}
+		msleep(10);
+		retries--;
+	}
+
+	status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+	words  = ioread32(xlnx_dna->base+XLNX_DNA_FSM_DNA_WORD_WRITE_COUNT_REGISTER_OFFSET);
+	xocl_info(&pdev->dev, "Message: status %08x dna words %d", status, words);
+
+	for (k = 0; k < 128; k += 16) {
+		for (i = 0; i < 16; i++) {
+			j = k+i+sign_start;
+			convert = (*(cert + j) >> 24 & 0xff) | (*(cert + j) >> 8 & 0xff00) |
+				(*(cert + j) << 8 & 0xff0000) | ((*(cert + j) & 0xff) << 24);
+			iowrite32(convert, xlnx_dna->base+XLNX_DNA_CERTIFICATE_DATA_AXI_ONLY_REGISTER_OFFSET + i * 4);
+		}
+	}
+
+	status = ioread32(xlnx_dna->base+XLNX_DNA_STATUS_REGISTER_OFFSET);
+	words  = ioread32(xlnx_dna->base+XLNX_DNA_FSM_CERTIFICATE_WORD_WRITE_COUNT_REGISTER_OFFSET);
+	xocl_info(&pdev->dev, "Signature: status %08x certificate words %d", status, words);
+}
+
+static struct xocl_dna_funcs dna_ops = {
+	.status = dna_status,
+	.capability = dna_capability,
+	.write_cert = dna_write_cert,
+};
+
+
+static void mgmt_sysfs_destroy_xlnx_dna(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna *xlnx_dna;
+
+	xlnx_dna = platform_get_drvdata(pdev);
+
+	sysfs_remove_group(&pdev->dev.kobj, &xlnx_dna_attrgroup);
+
+}
+
+static int mgmt_sysfs_create_xlnx_dna(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna *xlnx_dna;
+	struct xocl_dev_core *core;
+	int err;
+
+	xlnx_dna = platform_get_drvdata(pdev);
+	core = XDEV(xocl_get_xdev(pdev));
+
+	err = sysfs_create_group(&pdev->dev.kobj, &xlnx_dna_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create xlnx_dna sysfs group failed: %d", err);
+		goto create_grp_failed;
+	}
+
+	return 0;
+
+create_grp_failed:
+	return err;
+}
+
+static int xlnx_dna_probe(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna *xlnx_dna;
+	struct resource *res;
+	int err;
+
+	xlnx_dna = devm_kzalloc(&pdev->dev, sizeof(*xlnx_dna), GFP_KERNEL);
+	if (!xlnx_dna)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		xocl_err(&pdev->dev, "resource is NULL");
+		return -EINVAL;
+	}
+	xocl_info(&pdev->dev, "IO start: 0x%llx, end: 0x%llx",
+		res->start, res->end);
+
+	xlnx_dna->base = ioremap_nocache(res->start, res->end - res->start + 1);
+	if (!xlnx_dna->base) {
+		err = -EIO;
+		xocl_err(&pdev->dev, "Map iomem failed");
+		goto failed;
+	}
+
+	platform_set_drvdata(pdev, xlnx_dna);
+
+	err = mgmt_sysfs_create_xlnx_dna(pdev);
+	if (err)
+		goto create_xlnx_dna_failed;
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_DNA, &dna_ops);
+
+	return 0;
+
+create_xlnx_dna_failed:
+	iounmap(xlnx_dna->base);
+	platform_set_drvdata(pdev, NULL);
+failed:
+	return err;
+}
+
+
+static int xlnx_dna_remove(struct platform_device *pdev)
+{
+	struct xocl_xlnx_dna	*xlnx_dna;
+
+	xlnx_dna = platform_get_drvdata(pdev);
+	if (!xlnx_dna) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+
+	mgmt_sysfs_destroy_xlnx_dna(pdev);
+
+	if (xlnx_dna->base)
+		iounmap(xlnx_dna->base);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, xlnx_dna);
+
+	return 0;
+}
+
+struct platform_device_id xlnx_dna_id_table[] = {
+	{ XOCL_DNA, 0 },
+	{ },
+};
+
+static struct platform_driver	xlnx_dna_driver = {
+	.probe		= xlnx_dna_probe,
+	.remove		= xlnx_dna_remove,
+	.driver		= {
+		.name = "xocl_dna",
+	},
+	.id_table = xlnx_dna_id_table,
+};
+
+int __init xocl_init_dna(void)
+{
+	return platform_driver_register(&xlnx_dna_driver);
+}
+
+void xocl_fini_dna(void)
+{
+	platform_driver_unregister(&xlnx_dna_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/feature_rom.c b/drivers/gpu/drm/xocl/subdev/feature_rom.c
new file mode 100644
index 000000000000..f898af6844aa
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/feature_rom.c
@@ -0,0 +1,412 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include "../xclfeatures.h"
+#include "../xocl_drv.h"
+
+#define	MAGIC_NUM	0x786e6c78
+struct feature_rom {
+	void __iomem		*base;
+
+	struct FeatureRomHeader	header;
+	unsigned int            dsa_version;
+	bool			unified;
+	bool			mb_mgmt_enabled;
+	bool			mb_sche_enabled;
+	bool			are_dev;
+	bool			aws_dev;
+};
+
+static ssize_t VBNV_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	return sprintf(buf, "%s\n", rom->header.VBNVName);
+}
+static DEVICE_ATTR_RO(VBNV);
+
+static ssize_t dr_base_addr_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	/* TODO: DRBaseAddress is no longer required in the feature ROM */
+	if (rom->header.MajorVersion >= 10)
+		return sprintf(buf, "%llu\n", rom->header.DRBaseAddress);
+	else
+		return sprintf(buf, "%u\n", 0);
+}
+static DEVICE_ATTR_RO(dr_base_addr);
+
+static ssize_t ddr_bank_count_max_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	return sprintf(buf, "%d\n", rom->header.DDRChannelCount);
+}
+static DEVICE_ATTR_RO(ddr_bank_count_max);
+
+static ssize_t ddr_bank_size_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	return sprintf(buf, "%d\n", rom->header.DDRChannelSize);
+}
+static DEVICE_ATTR_RO(ddr_bank_size);
+
+static ssize_t timestamp_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	return sprintf(buf, "%llu\n", rom->header.TimeSinceEpoch);
+}
+static DEVICE_ATTR_RO(timestamp);
+
+static ssize_t FPGA_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct feature_rom *rom = platform_get_drvdata(to_platform_device(dev));
+
+	return sprintf(buf, "%s\n", rom->header.FPGAPartName);
+}
+static DEVICE_ATTR_RO(FPGA);
+
+static struct attribute *rom_attrs[] = {
+	&dev_attr_VBNV.attr,
+	&dev_attr_dr_base_addr.attr,
+	&dev_attr_ddr_bank_count_max.attr,
+	&dev_attr_ddr_bank_size.attr,
+	&dev_attr_timestamp.attr,
+	&dev_attr_FPGA.attr,
+	NULL,
+};
+
+static struct attribute_group rom_attr_group = {
+	.attrs = rom_attrs,
+};
+
+static unsigned int dsa_version(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->dsa_version;
+}
+
+static bool is_unified(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->unified;
+}
+
+static bool mb_mgmt_on(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->mb_mgmt_enabled;
+}
+
+static bool mb_sched_on(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->mb_sche_enabled && !XOCL_DSA_MB_SCHE_OFF(xocl_get_xdev(pdev));
+}
+
+static uint32_t *get_cdma_base_addresses(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return (rom->header.FeatureBitMap & CDMA) ? rom->header.CDMABaseAddress : 0;
+}
+
+static u16 get_ddr_channel_count(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->header.DDRChannelCount;
+}
+
+static u64 get_ddr_channel_size(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->header.DDRChannelSize;
+}
+
+static u64 get_timestamp(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->header.TimeSinceEpoch;
+}
+
+static bool is_are(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->are_dev;
+}
+
+static bool is_aws(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	return rom->aws_dev;
+}
+
+static bool verify_timestamp(struct platform_device *pdev, u64 timestamp)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	xocl_info(&pdev->dev, "DSA timestamp: 0x%llx",
+		rom->header.TimeSinceEpoch);
+	xocl_info(&pdev->dev, "Verify timestamp: 0x%llx", timestamp);
+	return (rom->header.TimeSinceEpoch == timestamp);
+}
+
+static void get_raw_header(struct platform_device *pdev, void *header)
+{
+	struct feature_rom *rom;
+
+	rom = platform_get_drvdata(pdev);
+	BUG_ON(!rom);
+
+	memcpy(header, &rom->header, sizeof(rom->header));
+}
+
+static struct xocl_rom_funcs rom_ops = {
+	.dsa_version = dsa_version,
+	.is_unified = is_unified,
+	.mb_mgmt_on = mb_mgmt_on,
+	.mb_sched_on = mb_sched_on,
+	.cdma_addr = get_cdma_base_addresses,
+	.get_ddr_channel_count = get_ddr_channel_count,
+	.get_ddr_channel_size = get_ddr_channel_size,
+	.is_are = is_are,
+	.is_aws = is_aws,
+	.verify_timestamp = verify_timestamp,
+	.get_timestamp = get_timestamp,
+	.get_raw_header = get_raw_header,
+};
+
+static int feature_rom_probe(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+	struct resource *res;
+	u32	val;
+	u16	vendor, did;
+	char	*tmp;
+	int	ret;
+
+	rom = devm_kzalloc(&pdev->dev, sizeof(*rom), GFP_KERNEL);
+	if (!rom)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		xocl_err(&pdev->dev, "resource is NULL");
+		ret = -EINVAL;
+		goto failed;
+	}
+
+	rom->base = ioremap_nocache(res->start, res->end - res->start + 1);
+	if (!rom->base) {
+		ret = -EIO;
+		xocl_err(&pdev->dev, "Map iomem failed");
+		goto failed;
+	}
+
+	val = ioread32(rom->base);
+	if (val != MAGIC_NUM) {
+		vendor = XOCL_PL_TO_PCI_DEV(pdev)->vendor;
+		did = XOCL_PL_TO_PCI_DEV(pdev)->device;
+		/* TODO: magic AWS device IDs, define these elsewhere */
+		if (vendor == 0x1d0f && (did == 0x1042 || did == 0xf010)) {
+			xocl_info(&pdev->dev,
+				"Found AWS VU9P Device without featureROM");
+			/*
+			 * This is an AWS device which currently does not
+			 * carry a feature ROM, so fill in the FeatureRomHeader
+			 * struct by hand.
+			 */
+			memset(rom->header.EntryPointString, 0,
+				sizeof(rom->header.EntryPointString));
+			strncpy(rom->header.EntryPointString, "xlnx", 4);
+			memset(rom->header.FPGAPartName, 0,
+				sizeof(rom->header.FPGAPartName));
+			strncpy(rom->header.FPGAPartName, "AWS VU9P", 8);
+			memset(rom->header.VBNVName, 0,
+				sizeof(rom->header.VBNVName));
+			strncpy(rom->header.VBNVName,
+				"xilinx_aws-vu9p-f1_dynamic_5_0", 35);
+			rom->header.MajorVersion = 4;
+			rom->header.MinorVersion = 0;
+			rom->header.VivadoBuildID = 0xabcd;
+			rom->header.IPBuildID = 0xabcd;
+			rom->header.TimeSinceEpoch = 0xabcd;
+			rom->header.DDRChannelCount = 4;
+			rom->header.DDRChannelSize = 16;
+			rom->header.FeatureBitMap = 0x0;
+			rom->header.FeatureBitMap = UNIFIED_PLATFORM;
+			rom->unified = true;
+			rom->aws_dev = true;
+
+			xocl_info(&pdev->dev, "Enabling AWS dynamic 5.0 DSA");
+		} else {
+			xocl_err(&pdev->dev, "Magic number does not match, actual 0x%x, expected 0x%x",
+					val, MAGIC_NUM);
+			ret = -ENODEV;
+			goto failed;
+		}
+	}
+
+	if (val == MAGIC_NUM)
+		xocl_memcpy_fromio(&rom->header, rom->base, sizeof(rom->header));
+
+	if (strstr(rom->header.VBNVName, "-xare")) {
+		/*
+		 * This is an ARE device. The ARE is mapped like another DDR
+		 * inside the FPGA and connects as M04_AXI.
+		 */
+		rom->header.DDRChannelCount = rom->header.DDRChannelCount - 1;
+		rom->are_dev = true;
+	}
+
+	rom->dsa_version = 0;
+	if (strstr(rom->header.VBNVName, "5_0"))
+		rom->dsa_version = 50;
+	else if (strstr(rom->header.VBNVName, "5_1")
+		 || strstr(rom->header.VBNVName, "u200_xdma_201820_1"))
+		rom->dsa_version = 51;
+	else if (strstr(rom->header.VBNVName, "5_2")
+		 || strstr(rom->header.VBNVName, "u200_xdma_201820_2")
+		 || strstr(rom->header.VBNVName, "u250_xdma_201820_1")
+		 || strstr(rom->header.VBNVName, "201830"))
+		rom->dsa_version = 52;
+	else if (strstr(rom->header.VBNVName, "5_3"))
+		rom->dsa_version = 53;
+
+	if (rom->header.FeatureBitMap & UNIFIED_PLATFORM)
+		rom->unified = true;
+
+	if (rom->header.FeatureBitMap & BOARD_MGMT_ENBLD)
+		rom->mb_mgmt_enabled = true;
+
+	if (rom->header.FeatureBitMap & MB_SCHEDULER)
+		rom->mb_sche_enabled = true;
+
+	ret = sysfs_create_group(&pdev->dev.kobj, &rom_attr_group);
+	if (ret) {
+		xocl_err(&pdev->dev, "create sysfs failed");
+		goto failed;
+	}
+
+	tmp = rom->header.EntryPointString;
+	xocl_info(&pdev->dev, "ROM magic : %c%c%c%c",
+		tmp[0], tmp[1], tmp[2], tmp[3]);
+	xocl_info(&pdev->dev, "VBNV: %s", rom->header.VBNVName);
+	xocl_info(&pdev->dev, "DDR channel count : %d",
+		rom->header.DDRChannelCount);
+	xocl_info(&pdev->dev, "DDR channel size: %d GB",
+		rom->header.DDRChannelSize);
+	xocl_info(&pdev->dev, "Major Version: %d", rom->header.MajorVersion);
+	xocl_info(&pdev->dev, "Minor Version: %d", rom->header.MinorVersion);
+	xocl_info(&pdev->dev, "IPBuildID: %u", rom->header.IPBuildID);
+	xocl_info(&pdev->dev, "TimeSinceEpoch: %llx",
+		rom->header.TimeSinceEpoch);
+	xocl_info(&pdev->dev, "FeatureBitMap: %llx", rom->header.FeatureBitMap);
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_FEATURE_ROM, &rom_ops);
+	platform_set_drvdata(pdev, rom);
+
+	return 0;
+
+failed:
+	if (rom->base)
+		iounmap(rom->base);
+	devm_kfree(&pdev->dev, rom);
+	return ret;
+}
+
+static int feature_rom_remove(struct platform_device *pdev)
+{
+	struct feature_rom *rom;
+
+	xocl_info(&pdev->dev, "Remove feature rom");
+	rom = platform_get_drvdata(pdev);
+	if (!rom) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+	if (rom->base)
+		iounmap(rom->base);
+
+	sysfs_remove_group(&pdev->dev.kobj, &rom_attr_group);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, rom);
+	return 0;
+}
+
+struct platform_device_id rom_id_table[] =  {
+	{ XOCL_FEATURE_ROM, 0 },
+	{ },
+};
+
+static struct platform_driver	feature_rom_driver = {
+	.probe		= feature_rom_probe,
+	.remove		= feature_rom_remove,
+	.driver		= {
+		.name = XOCL_FEATURE_ROM,
+	},
+	.id_table = rom_id_table,
+};
+
+int __init xocl_init_feature_rom(void)
+{
+	return platform_driver_register(&feature_rom_driver);
+}
+
+void xocl_fini_feature_rom(void)
+{
+	platform_driver_unregister(&feature_rom_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/firewall.c b/drivers/gpu/drm/xocl/subdev/firewall.c
new file mode 100644
index 000000000000..a32766507ae0
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/firewall.c
@@ -0,0 +1,389 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ *  Copyright (C) 2017-2019 Xilinx, Inc. All rights reserved.
+ *
+ *  Utility Functions for AXI firewall IP.
+ *  Author: Lizhi.Hou@Xilinx.com
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/ktime.h>
+#include <linux/rtc.h>
+#include "../xocl_drv.h"
+
+/* Firewall registers */
+#define	FAULT_STATUS				0x0
+#define	SOFT_CTRL				0x4
+#define	UNBLOCK_CTRL				0x8
+/* Firewall error bits */
+#define READ_RESPONSE_BUSY                        BIT(0)
+#define RECS_ARREADY_MAX_WAIT                     BIT(1)
+#define RECS_CONTINUOUS_RTRANSFERS_MAX_WAIT       BIT(2)
+#define ERRS_RDATA_NUM                            BIT(3)
+#define ERRS_RID                                  BIT(4)
+#define WRITE_RESPONSE_BUSY                       BIT(16)
+#define RECS_AWREADY_MAX_WAIT                     BIT(17)
+#define RECS_WREADY_MAX_WAIT                      BIT(18)
+#define RECS_WRITE_TO_BVALID_MAX_WAIT             BIT(19)
+#define ERRS_BRESP                                BIT(20)
+
+#define	FIREWALL_STATUS_BUSY	(READ_RESPONSE_BUSY | WRITE_RESPONSE_BUSY)
+#define	CLEAR_RESET_GPIO		0
+
+#define	READ_STATUS(fw, id)			\
+	XOCL_READ_REG32(fw->base_addrs[id] + FAULT_STATUS)
+#define	WRITE_UNBLOCK_CTRL(fw, id, val)			\
+	XOCL_WRITE_REG32(val, fw->base_addrs[id] + UNBLOCK_CTRL)
+
+#define	IS_FIRED(fw, id) (READ_STATUS(fw, id) & ~FIREWALL_STATUS_BUSY)
+
+#define	BUSY_RETRY_COUNT		20
+#define	BUSY_RETRY_INTERVAL		100		/* ms */
+#define	CLEAR_RETRY_COUNT		4
+#define	CLEAR_RETRY_INTERVAL		2		/* ms */
+
+#define	MAX_LEVEL		16
+
+struct firewall {
+	void __iomem		*base_addrs[MAX_LEVEL];
+	u32			max_level;
+	void __iomem		*gpio_addr;
+
+	u32			curr_status;
+	int			curr_level;
+
+	u32			err_detected_status;
+	u32			err_detected_level;
+	u64			err_detected_time;
+
+	bool			inject_firewall;
+};
+
+static int clear_firewall(struct platform_device *pdev);
+static u32 check_firewall(struct platform_device *pdev, int *level);
+
+static int get_prop(struct platform_device *pdev, u32 prop, void *val)
+{
+	struct firewall *fw;
+
+	fw = platform_get_drvdata(pdev);
+	BUG_ON(!fw);
+
+	check_firewall(pdev, NULL);
+	switch (prop) {
+	case XOCL_AF_PROP_TOTAL_LEVEL:
+		*(u32 *)val = fw->max_level;
+		break;
+	case XOCL_AF_PROP_STATUS:
+		*(u32 *)val = fw->curr_status;
+		break;
+	case XOCL_AF_PROP_LEVEL:
+		*(int *)val = fw->curr_level;
+		break;
+	case XOCL_AF_PROP_DETECTED_STATUS:
+		*(u32 *)val = fw->err_detected_status;
+		break;
+	case XOCL_AF_PROP_DETECTED_LEVEL:
+		*(u32 *)val = fw->err_detected_level;
+		break;
+	case XOCL_AF_PROP_DETECTED_TIME:
+		*(u64 *)val = fw->err_detected_time;
+		break;
+	default:
+		xocl_err(&pdev->dev, "Invalid prop %d", prop);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/* sysfs support */
+static ssize_t show_firewall(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct platform_device *pdev = to_platform_device(dev);
+	struct firewall *fw;
+	u64 t;
+	u32 val;
+	int ret;
+
+	fw = platform_get_drvdata(pdev);
+	BUG_ON(!fw);
+
+	if (attr->index == XOCL_AF_PROP_DETECTED_TIME) {
+		get_prop(pdev,  attr->index, &t);
+		return sprintf(buf, "%llu\n", t);
+	}
+
+	ret = get_prop(pdev, attr->index, &val);
+	if (ret)
+		return 0;
+
+	return sprintf(buf, "%u\n", val);
+}
+
+static SENSOR_DEVICE_ATTR(status, 0444, show_firewall, NULL,
+	XOCL_AF_PROP_STATUS);
+static SENSOR_DEVICE_ATTR(level, 0444, show_firewall, NULL,
+	XOCL_AF_PROP_LEVEL);
+static SENSOR_DEVICE_ATTR(detected_status, 0444, show_firewall, NULL,
+	XOCL_AF_PROP_DETECTED_STATUS);
+static SENSOR_DEVICE_ATTR(detected_level, 0444, show_firewall, NULL,
+	XOCL_AF_PROP_DETECTED_LEVEL);
+static SENSOR_DEVICE_ATTR(detected_time, 0444, show_firewall, NULL,
+	XOCL_AF_PROP_DETECTED_TIME);
+
+static ssize_t clear_store(struct device *dev, struct device_attribute *da,
+	const char *buf, size_t count)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	u32 val = 0;
+
+	if (kstrtou32(buf, 10, &val) || val != 1)
+		return -EINVAL;
+
+	clear_firewall(pdev);
+
+	return count;
+}
+static DEVICE_ATTR_WO(clear);
+
+static ssize_t inject_store(struct device *dev, struct device_attribute *da,
+	const char *buf, size_t count)
+{
+	struct firewall *fw = platform_get_drvdata(to_platform_device(dev));
+
+	fw->inject_firewall = true;
+	return count;
+}
+static DEVICE_ATTR_WO(inject);
+
+static struct attribute *firewall_attributes[] = {
+	&sensor_dev_attr_status.dev_attr.attr,
+	&sensor_dev_attr_level.dev_attr.attr,
+	&sensor_dev_attr_detected_status.dev_attr.attr,
+	&sensor_dev_attr_detected_level.dev_attr.attr,
+	&sensor_dev_attr_detected_time.dev_attr.attr,
+	&dev_attr_clear.attr,
+	&dev_attr_inject.attr,
+	NULL
+};
+
+static const struct attribute_group firewall_attrgroup = {
+	.attrs = firewall_attributes,
+};
+
+static u32 check_firewall(struct platform_device *pdev, int *level)
+{
+	struct firewall	*fw;
+	struct timespec64 now;
+	int	i;
+	u32	val = 0;
+
+	fw = platform_get_drvdata(pdev);
+	BUG_ON(!fw);
+
+	for (i = 0; i < fw->max_level; i++) {
+		val = IS_FIRED(fw, i);
+		if (val) {
+			xocl_info(&pdev->dev, "AXI Firewall %d tripped, "
+				  "status: 0x%x", i, val);
+			if (!fw->curr_status) {
+				fw->err_detected_status = val;
+				fw->err_detected_level = i;
+				ktime_get_ts64(&now);
+				fw->err_detected_time = (u64)(now.tv_sec -
+					(sys_tz.tz_minuteswest * 60));
+			}
+			fw->curr_level = i;
+
+			if (level)
+				*level = i;
+			break;
+		}
+	}
+
+	fw->curr_status = val;
+	fw->curr_level = i >= fw->max_level ? -1 : i;
+
+	/* Inject firewall for testing. */
+	if (fw->curr_level == -1 && fw->inject_firewall) {
+		fw->inject_firewall = false;
+		fw->curr_level = 0;
+		fw->curr_status = 0x1;
+	}
+
+	return fw->curr_status;
+}
+
+static int clear_firewall(struct platform_device *pdev)
+{
+	struct firewall	*fw;
+	int	i, retry = 0, clear_retry = 0;
+	u32	val;
+	int	ret = 0;
+
+	fw = platform_get_drvdata(pdev);
+	BUG_ON(!fw);
+
+	if (!check_firewall(pdev, NULL)) {
+		/* firewall is not tripped */
+		return 0;
+	}
+
+retry_level1:
+	for (i = 0; i < fw->max_level; i++) {
+		for (val = READ_STATUS(fw, i);
+			(val & FIREWALL_STATUS_BUSY) &&
+			retry++ < BUSY_RETRY_COUNT;
+			val = READ_STATUS(fw, i)) {
+			msleep(BUSY_RETRY_INTERVAL);
+		}
+		if (val & FIREWALL_STATUS_BUSY) {
+			xocl_err(&pdev->dev, "firewall %d busy", i);
+			ret = -EBUSY;
+			goto failed;
+		}
+		WRITE_UNBLOCK_CTRL(fw, i, 1);
+	}
+
+	if (check_firewall(pdev, NULL) && clear_retry++ < CLEAR_RETRY_COUNT) {
+		msleep(CLEAR_RETRY_INTERVAL);
+		goto retry_level1;
+	}
+
+	if (!check_firewall(pdev, NULL)) {
+		xocl_info(&pdev->dev, "firewall cleared level 1");
+		return 0;
+	}
+
+	clear_retry = 0;
+
+retry_level2:
+	XOCL_WRITE_REG32(CLEAR_RESET_GPIO, fw->gpio_addr);
+
+	if (check_firewall(pdev, NULL) && clear_retry++ < CLEAR_RETRY_COUNT) {
+		msleep(CLEAR_RETRY_INTERVAL);
+		goto retry_level2;
+	}
+
+	if (!check_firewall(pdev, NULL)) {
+		xocl_info(&pdev->dev, "firewall cleared level 2");
+		return 0;
+	}
+
+	xocl_info(&pdev->dev, "failed clear firewall, level %d, status 0x%x",
+		fw->curr_level, fw->curr_status);
+
+	ret = -EIO;
+
+failed:
+	return ret;
+}
+
+static struct xocl_firewall_funcs fw_ops = {
+	.clear_firewall	= clear_firewall,
+	.check_firewall = check_firewall,
+	.get_prop = get_prop,
+};
+
+static int firewall_remove(struct platform_device *pdev)
+{
+	struct firewall *fw;
+	int     i;
+
+	fw = platform_get_drvdata(pdev);
+	if (!fw) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+
+	sysfs_remove_group(&pdev->dev.kobj, &firewall_attrgroup);
+
+	for (i = 0; i <= fw->max_level; i++) {
+		if (fw->base_addrs[i])
+			iounmap(fw->base_addrs[i]);
+	}
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, fw);
+
+	return 0;
+}
+
+static int firewall_probe(struct platform_device *pdev)
+{
+	struct firewall	*fw;
+	struct resource	*res;
+	int	i, ret = 0;
+
+	xocl_info(&pdev->dev, "probe");
+
+	fw = devm_kzalloc(&pdev->dev, sizeof(*fw), GFP_KERNEL);
+	if (!fw)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, fw);
+
+	fw->curr_level = -1;
+
+	for (i = 0; i < MAX_LEVEL; i++) {
+		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+		if (!res) {
+			if (i == 0) {
+				xocl_err(&pdev->dev, "no iomem resource found");
+				ret = -EINVAL;
+				goto failed;
+			}
+			fw->max_level = i - 1;
+			fw->gpio_addr = fw->base_addrs[i - 1];
+			break;
+		}
+		fw->base_addrs[i] =
+			ioremap_nocache(res->start, resource_size(res));
+		if (!fw->base_addrs[i]) {
+			ret = -EIO;
+			xocl_err(&pdev->dev, "Map iomem failed");
+			goto failed;
+		}
+	}
+
+	ret = sysfs_create_group(&pdev->dev.kobj, &firewall_attrgroup);
+	if (ret) {
+		xocl_err(&pdev->dev, "create attr group failed: %d", ret);
+		goto failed;
+	}
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_AF, &fw_ops);
+
+	return 0;
+
+failed:
+	firewall_remove(pdev);
+	return ret;
+}
+
+struct platform_device_id firewall_id_table[] = {
+	{ XOCL_FIREWALL, 0 },
+	{ },
+};
+
+static struct platform_driver	firewall_driver = {
+	.probe		= firewall_probe,
+	.remove		= firewall_remove,
+	.driver		= {
+		.name = "xocl_firewall",
+	},
+	.id_table = firewall_id_table,
+};
+
+int __init xocl_init_firewall(void)
+{
+	return platform_driver_register(&firewall_driver);
+}
+
+void xocl_fini_firewall(void)
+{
+	platform_driver_unregister(&firewall_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/fmgr.c b/drivers/gpu/drm/xocl/subdev/fmgr.c
new file mode 100644
index 000000000000..99efd86ccd1b
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/fmgr.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * FPGA Manager bindings for XRT driver
+ *
+ * Copyright (C) 2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Sonal Santan
+ *
+ */
+
+#include <linux/fpga/fpga-mgr.h>
+
+#include "../xocl_drv.h"
+#include "../xclbin.h"
+
+/*
+ * Container to capture and cache the full xclbin as it is passed in blocks
+ * by the FPGA Manager. xocl needs access to the complete xclbin to walk
+ * through its sections, but the FPGA Manager's .write() backend sends
+ * incremental blocks without any knowledge of the xclbin format, forcing us
+ * to collect the blocks and stitch them together here. See the call-flow
+ * sketch below.
+ * TODO:
+ * 1. Add a variant of API, icap_download_bitstream_axlf() which works off kernel buffer
+ * 2. Call this new API from FPGA Manager's write complete hook, xocl_pr_write_complete()
+ */
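+
+/*
+ * Rough call sequence from the FPGA Manager core (a sketch, not a statement
+ * of the exact core behavior): fpga_mgr_load() ends up invoking our
+ * .write_init() with the leading bytes of the image (the
+ * .initial_header_size hint below asks for a full struct axlf), then
+ * .write() one or more times with further data, and finally
+ * .write_complete(). The xfpga_klass container below stitches those blocks
+ * back into one contiguous xclbin of m_header.m_length bytes before handing
+ * it to icap.
+ */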
+
+struct xfpga_klass {
+	struct xocl_dev *xdev;
+	struct axlf *blob;
+	char name[64];
+	size_t count;
+	enum fpga_mgr_states state;
+};
+
+static int xocl_pr_write_init(struct fpga_manager *mgr,
+			      struct fpga_image_info *info, const char *buf, size_t count)
+{
+	struct xfpga_klass *obj = mgr->priv;
+	const struct axlf *bin = (const struct axlf *)buf;
+
+	if (count < sizeof(struct axlf)) {
+		obj->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
+		return -EINVAL;
+	}
+
+	if (count > bin->m_header.m_length) {
+		obj->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
+		return -EINVAL;
+	}
+
+	/* Free up the previous blob */
+	vfree(obj->blob);
+	obj->blob = vmalloc(bin->m_header.m_length);
+	if (!obj->blob) {
+		obj->state = FPGA_MGR_STATE_WRITE_INIT_ERR;
+		return -ENOMEM;
+	}
+
+	memcpy(obj->blob, buf, count);
+	xocl_info(&mgr->dev, "Begin download of xclbin %pUb of length %lld B", &obj->blob->m_header.uuid,
+		  obj->blob->m_header.m_length);
+	obj->count = count;
+	obj->state = FPGA_MGR_STATE_WRITE_INIT;
+	return 0;
+}
+
+static int xocl_pr_write(struct fpga_manager *mgr,
+			 const char *buf, size_t count)
+{
+	struct xfpga_klass *obj = mgr->priv;
+	char *curr = (char *)obj->blob;
+
+	if ((obj->state != FPGA_MGR_STATE_WRITE_INIT) && (obj->state != FPGA_MGR_STATE_WRITE)) {
+		obj->state = FPGA_MGR_STATE_WRITE_ERR;
+		return -EINVAL;
+	}
+
+	curr += obj->count;
+	obj->count += count;
+	/* Make sure the data received so far does not exceed the advertised xclbin length */
+	if (obj->blob->m_header.m_length < obj->count) {
+		obj->state = FPGA_MGR_STATE_WRITE_ERR;
+		return -EINVAL;
+	}
+	memcpy(curr, buf, count);
+	xocl_info(&mgr->dev, "Next block of %zu B of xclbin %pUb", count, &obj->blob->m_header.uuid);
+	obj->state = FPGA_MGR_STATE_WRITE;
+	return 0;
+}
+
+
+static int xocl_pr_write_complete(struct fpga_manager *mgr,
+				  struct fpga_image_info *info)
+{
+	int result;
+	struct xfpga_klass *obj = mgr->priv;
+
+	if (obj->state != FPGA_MGR_STATE_WRITE) {
+		obj->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
+		return -EINVAL;
+	}
+
+	/* Check if we got the complete xclbin */
+	if (obj->blob->m_header.m_length != obj->count) {
+		obj->state = FPGA_MGR_STATE_WRITE_COMPLETE_ERR;
+		return -EINVAL;
+	}
+	/* Send the xclbin blob to actual download framework in icap */
+	result = xocl_icap_download_axlf(obj->xdev, obj->blob);
+	obj->state = result ? FPGA_MGR_STATE_WRITE_COMPLETE_ERR : FPGA_MGR_STATE_WRITE_COMPLETE;
+	xocl_info(&mgr->dev, "Finish download of xclbin %pUb of size %zu B", &obj->blob->m_header.uuid, obj->count);
+	vfree(obj->blob);
+	obj->blob = NULL;
+	obj->count = 0;
+	return result;
+}
+
+static enum fpga_mgr_states xocl_pr_state(struct fpga_manager *mgr)
+{
+	struct xfpga_klass *obj = mgr->priv;
+
+	return obj->state;
+}
+
+static const struct fpga_manager_ops xocl_pr_ops = {
+	.initial_header_size = sizeof(struct axlf),
+	.write_init = xocl_pr_write_init,
+	.write = xocl_pr_write,
+	.write_complete = xocl_pr_write_complete,
+	.state = xocl_pr_state,
+};
+
+
+struct platform_device_id fmgr_id_table[] = {
+	{ XOCL_FMGR, 0 },
+	{ },
+};
+
+static int fmgr_probe(struct platform_device *pdev)
+{
+	struct fpga_manager *mgr;
+	int ret = 0;
+	struct xfpga_klass *obj = kzalloc(sizeof(struct xfpga_klass), GFP_KERNEL);
+
+	if (!obj)
+		return -ENOMEM;
+
+	obj->xdev = xocl_get_xdev(pdev);
+	snprintf(obj->name, sizeof(obj->name), "Xilinx PCIe FPGA Manager");
+
+	obj->state = FPGA_MGR_STATE_UNKNOWN;
+	mgr = fpga_mgr_create(&pdev->dev, obj->name, &xocl_pr_ops, obj);
+	if (!mgr) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	platform_set_drvdata(pdev, mgr);
+	ret = fpga_mgr_register(mgr);
+	if (ret) {
+		fpga_mgr_free(mgr);
+		goto out;
+	}
+
+	return 0;
+out:
+	kfree(obj);
+	return ret;
+}
+
+static int fmgr_remove(struct platform_device *pdev)
+{
+	struct fpga_manager *mgr = platform_get_drvdata(pdev);
+	struct xfpga_klass *obj = mgr->priv;
+
+	obj->state = FPGA_MGR_STATE_UNKNOWN;
+	fpga_mgr_unregister(mgr);
+
+	platform_set_drvdata(pdev, NULL);
+	vfree(obj->blob);
+	kfree(obj);
+	return 0;
+}
+
+static struct platform_driver	fmgr_driver = {
+	.probe		= fmgr_probe,
+	.remove		= fmgr_remove,
+	.driver		= {
+		.name = "xocl_fmgr",
+	},
+	.id_table = fmgr_id_table,
+};
+
+int __init xocl_init_fmgr(void)
+{
+	return platform_driver_register(&fmgr_driver);
+}
+
+void xocl_fini_fmgr(void)
+{
+	platform_driver_unregister(&fmgr_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/icap.c b/drivers/gpu/drm/xocl/subdev/icap.c
new file mode 100644
index 000000000000..93eb6265a9c4
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/icap.c
@@ -0,0 +1,2859 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ *  Copyright (C) 2017 Xilinx, Inc. All rights reserved.
+ *  Author: Sonal Santan
+ *  Code copied verbatim from SDAccel xcldma kernel mode driver
+ *
+ */
+
+/*
+ * TODO: Currently, locking / unlocking bitstream is implemented w/ pid as
+ * identification of bitstream users. We assume that, on bare metal, an app
+ * has only one process and will open both user and mgmt pfs. In this model,
+ * xclmgmt has enough information to handle locking/unlocking alone, but we
+ * still involve user pf and mailbox here so that it'll be easier to support
+ * cloud env later. We'll replace pid with a token that is more appropriate
+ * to identify a user later as well.
+ */
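+
+/*
+ * A rough sketch of the flow as implemented below (names refer to functions
+ * in this file): the mailbox request to mgmt is only sent while the local
+ * reference count icap_bitstream_ref is zero; __icap_lock_peer() sends
+ * MAILBOX_REQ_LOCK_BITSTREAM carrying the xclbin uuid and
+ * __icap_unlock_peer() sends MAILBOX_REQ_UNLOCK_BITSTREAM. Each local user
+ * is tracked by pid via add_user()/del_user() on the icap_bitstream_users
+ * list.
+ */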
+
+#include <linux/firmware.h>
+#include <linux/vmalloc.h>
+#include <linux/string.h>
+#include <linux/version.h>
+#include <linux/uuid.h>
+#include <linux/pid.h>
+#include "../xclbin.h"
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+#if defined(XOCL_UUID)
+static uuid_t uuid_null = NULL_UUID_LE;
+#endif
+
+#define	ICAP_ERR(icap, fmt, arg...)	\
+	xocl_err(&(icap)->icap_pdev->dev, fmt "\n", ##arg)
+#define	ICAP_INFO(icap, fmt, arg...)	\
+	xocl_info(&(icap)->icap_pdev->dev, fmt "\n", ##arg)
+#define	ICAP_DBG(icap, fmt, arg...)	\
+	xocl_dbg(&(icap)->icap_pdev->dev, fmt "\n", ##arg)
+
+#define	ICAP_PRIVILEGED(icap)	((icap)->icap_regs != NULL)
+#define DMA_HWICAP_BITFILE_BUFFER_SIZE 1024
+#define	ICAP_MAX_REG_GROUPS		ARRAY_SIZE(XOCL_RES_ICAP_MGMT)
+
+#define	ICAP_MAX_NUM_CLOCKS		2
+#define OCL_CLKWIZ_STATUS_OFFSET	0x4
+#define OCL_CLKWIZ_CONFIG_OFFSET(n)	(0x200 + 4 * (n))
+#define OCL_CLK_FREQ_COUNTER_OFFSET	0x8
+
+/*
+ * Bitstream header information.
+ */
+struct XHwIcap_Bit_Header {
+	unsigned int HeaderLength;     /* Length of header in 32 bit words */
+	unsigned int BitstreamLength;  /* Length of bitstream to read in bytes*/
+	unsigned char *DesignName;     /* Design name read from bitstream header */
+	unsigned char *PartName;       /* Part name read from bitstream header */
+	unsigned char *Date;           /* Date read from bitstream header */
+	unsigned char *Time;           /* Bitstream creation time read from header */
+	unsigned int MagicLength;      /* Length of the magic numbers in header */
+};
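+
+/*
+ * On-disk layout of the .bit header, as parsed by bitstream_parse_header()
+ * below (a summary of that parser, not an independent format reference):
+ * a 16-bit big-endian magic length followed by that many magic/filler bytes,
+ * a 16-bit 0x0001 marker, then a sequence of tagged fields: 'a' design name,
+ * 'b' part name, 'c' date, 'd' time (each a 16-bit length plus a
+ * NUL-terminated string), and finally 'e' followed by the 32-bit big-endian
+ * bitstream length in bytes and the raw configuration data.
+ */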
+
+#define XHI_BIT_HEADER_FAILURE	-1
+/* Used for parsing bitstream header */
+#define XHI_EVEN_MAGIC_BYTE	0x0f
+#define XHI_ODD_MAGIC_BYTE	0xf0
+/* Extra mode for IDLE */
+#define XHI_OP_IDLE		-1
+/* The imaginary module length register */
+#define XHI_MLR			15
+
+#define	GATE_FREEZE_USER	0x0c
+#define GATE_FREEZE_SHELL	0x00
+
+static u32 gate_free_user[] = {0xe, 0xc, 0xe, 0xf};
+static u32 gate_free_shell[] = {0x8, 0xc, 0xe, 0xf};
+
+/*
+ * AXI-HWICAP IP register layout
+ */
+struct icap_reg {
+	u32			ir_rsvd1[7];
+	u32			ir_gier;
+	u32			ir_isr;
+	u32			ir_rsvd2;
+	u32			ir_ier;
+	u32			ir_rsvd3[53];
+	u32			ir_wf;
+	u32			ir_rf;
+	u32			ir_sz;
+	u32			ir_cr;
+	u32			ir_sr;
+	u32			ir_wfv;
+	u32			ir_rfo;
+	u32			ir_asr;
+} __attribute__((packed));
+
+struct icap_generic_state {
+	u32			igs_state;
+} __attribute__((packed));
+
+struct icap_axi_gate {
+	u32			iag_wr;
+	u32			iag_rvsd;
+	u32			iag_rd;
+} __attribute__((packed));
+
+struct icap_bitstream_user {
+	struct list_head	ibu_list;
+	pid_t			ibu_pid;
+};
+
+struct icap {
+	struct platform_device	*icap_pdev;
+	struct mutex		icap_lock;
+	struct icap_reg		*icap_regs;
+	struct icap_generic_state *icap_state;
+	unsigned int            idcode;
+	bool			icap_axi_gate_frozen;
+	bool			icap_axi_gate_shell_frozen;
+	struct icap_axi_gate	*icap_axi_gate;
+
+	u64			icap_bitstream_id;
+	uuid_t			icap_bitstream_uuid;
+	int			icap_bitstream_ref;
+	struct list_head	icap_bitstream_users;
+
+	char			*icap_clear_bitstream;
+	unsigned long		icap_clear_bitstream_length;
+
+	char			*icap_clock_bases[ICAP_MAX_NUM_CLOCKS];
+	unsigned short		icap_ocl_frequency[ICAP_MAX_NUM_CLOCKS];
+
+	char                    *icap_clock_freq_topology;
+	unsigned long		icap_clock_freq_topology_length;
+	char                    *icap_clock_freq_counter;
+	struct mem_topology      *mem_topo;
+	struct ip_layout         *ip_layout;
+	struct debug_ip_layout   *debug_layout;
+	struct connectivity      *connectivity;
+
+	char			*bit_buffer;
+	unsigned long		bit_length;
+};
+
+static inline u32 reg_rd(void __iomem *reg)
+{
+	return XOCL_READ_REG32(reg);
+}
+
+static inline void reg_wr(void __iomem *reg, u32 val)
+{
+	iowrite32(val, reg);
+}
+
+/*
+ * Precomputed table with config0 and config2 register values together with
+ * target frequency. The steps are approximately 5 MHz apart. Table is
+ * generated by wiz.pl.
+ */
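+/*
+ * Example decode (a sketch based on icap_get_ocl_frequency() below): with
+ * the 100 MHz reference input, the entry {100, 0x0a01, 0x000a} yields
+ * div0 = 0x01 and mul0 = 0x0a from config0, and div1 = 0x0a from config2,
+ * so fout = 100 * 10 / (1 * 10) = 100 MHz. The number in each row comment
+ * appears to be the intermediate VCO frequency, input * mul0 / div0
+ * (1000 MHz here).
+ */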
+static const struct xclmgmt_ocl_clockwiz {
+	/* target frequency */
+	unsigned short ocl;
+	/* config0 register */
+	unsigned long config0;
+	/* config2 register */
+	unsigned short config2;
+} frequency_table[] = {
+	{/* 600*/   60, 0x0601, 0x000a},
+	{/* 600*/   66, 0x0601, 0x0009},
+	{/* 600*/   75, 0x0601, 0x0008},
+	{/* 800*/   80, 0x0801, 0x000a},
+	{/* 600*/   85, 0x0601, 0x0007},
+	{/* 900*/   90, 0x0901, 0x000a},
+	{/*1000*/  100, 0x0a01, 0x000a},
+	{/*1100*/  110, 0x0b01, 0x000a},
+	{/* 700*/  116, 0x0701, 0x0006},
+	{/*1100*/  122, 0x0b01, 0x0009},
+	{/* 900*/  128, 0x0901, 0x0007},
+	{/*1200*/  133, 0x0c01, 0x0009},
+	{/*1400*/  140, 0x0e01, 0x000a},
+	{/*1200*/  150, 0x0c01, 0x0008},
+	{/*1400*/  155, 0x0e01, 0x0009},
+	{/* 800*/  160, 0x0801, 0x0005},
+	{/*1000*/  166, 0x0a01, 0x0006},
+	{/*1200*/  171, 0x0c01, 0x0007},
+	{/* 900*/  180, 0x0901, 0x0005},
+	{/*1300*/  185, 0x0d01, 0x0007},
+	{/*1400*/  200, 0x0e01, 0x0007},
+	{/*1300*/  216, 0x0d01, 0x0006},
+	{/* 900*/  225, 0x0901, 0x0004},
+	{/*1400*/  233, 0x0e01, 0x0006},
+	{/*1200*/  240, 0x0c01, 0x0005},
+	{/*1000*/  250, 0x0a01, 0x0004},
+	{/*1300*/  260, 0x0d01, 0x0005},
+	{/* 800*/  266, 0x0801, 0x0003},
+	{/*1100*/  275, 0x0b01, 0x0004},
+	{/*1400*/  280, 0x0e01, 0x0005},
+	{/*1200*/  300, 0x0c01, 0x0004},
+	{/*1300*/  325, 0x0d01, 0x0004},
+	{/*1000*/  333, 0x0a01, 0x0003},
+	{/*1400*/  350, 0x0e01, 0x0004},
+	{/*1100*/  366, 0x0b01, 0x0003},
+	{/*1200*/  400, 0x0c01, 0x0003},
+	{/*1300*/  433, 0x0d01, 0x0003},
+	{/* 900*/  450, 0x0901, 0x0002},
+	{/*1400*/  466, 0x0e01, 0x0003},
+	{/*1000*/  500, 0x0a01, 0x0002}
+};
+
+static int icap_verify_bitstream_axlf(struct platform_device *pdev,
+	struct axlf *xclbin);
+static int icap_parse_bitstream_axlf_section(struct platform_device *pdev,
+	const struct axlf *xclbin, enum axlf_section_kind kind);
+
+static struct icap_bitstream_user *alloc_user(pid_t pid)
+{
+	struct icap_bitstream_user *u =
+		kzalloc(sizeof(struct icap_bitstream_user), GFP_KERNEL);
+
+	if (u) {
+		INIT_LIST_HEAD(&u->ibu_list);
+		u->ibu_pid = pid;
+	}
+	return u;
+}
+
+static void free_user(struct icap_bitstream_user *u)
+{
+	kfree(u);
+}
+
+static struct icap_bitstream_user *obtain_user(struct icap *icap, pid_t pid)
+{
+	struct list_head *pos, *n;
+
+	list_for_each_safe(pos, n, &icap->icap_bitstream_users) {
+		struct icap_bitstream_user *u = list_entry(pos, struct icap_bitstream_user, ibu_list);
+
+		if (u->ibu_pid == pid)
+			return u;
+	}
+
+	return NULL;
+}
+
+static void icap_read_from_peer(struct platform_device *pdev, enum data_kind kind, void *resp, size_t resplen)
+{
+	struct mailbox_subdev_peer subdev_peer = {0};
+	size_t data_len = sizeof(struct mailbox_subdev_peer);
+	struct mailbox_req *mb_req = NULL;
+	size_t reqlen = sizeof(struct mailbox_req) + data_len;
+
+	mb_req = vmalloc(reqlen);
+	if (!mb_req)
+		return;
+
+	mb_req->req = MAILBOX_REQ_PEER_DATA;
+
+	subdev_peer.kind = kind;
+	memcpy(mb_req->data, &subdev_peer, data_len);
+
+	(void) xocl_peer_request(XOCL_PL_DEV_TO_XDEV(pdev),
+		mb_req, reqlen, resp, &resplen, NULL, NULL);
+
+	vfree(mb_req);
+}
+
+
+static int add_user(struct icap *icap, pid_t pid)
+{
+	struct icap_bitstream_user *u;
+
+	u = obtain_user(icap, pid);
+	if (u)
+		return 0;
+
+	u = alloc_user(pid);
+	if (!u)
+		return -ENOMEM;
+
+	list_add_tail(&u->ibu_list, &icap->icap_bitstream_users);
+	icap->icap_bitstream_ref++;
+	return 0;
+}
+
+static int del_user(struct icap *icap, pid_t pid)
+{
+	struct icap_bitstream_user *u = NULL;
+
+	u = obtain_user(icap, pid);
+	if (!u)
+		return -EINVAL;
+
+	list_del(&u->ibu_list);
+	free_user(u);
+	icap->icap_bitstream_ref--;
+	return 0;
+}
+
+static void del_all_users(struct icap *icap)
+{
+	struct icap_bitstream_user *u = NULL;
+	struct list_head *pos, *n;
+
+	if (icap->icap_bitstream_ref == 0)
+		return;
+
+	list_for_each_safe(pos, n, &icap->icap_bitstream_users) {
+		u = list_entry(pos, struct icap_bitstream_user, ibu_list);
+		list_del(&u->ibu_list);
+		free_user(u);
+	}
+
+	ICAP_INFO(icap, "removed %d users", icap->icap_bitstream_ref);
+	icap->icap_bitstream_ref = 0;
+}
+
+static unsigned int find_matching_freq_config(unsigned int freq)
+{
+	unsigned int start = 0;
+	unsigned int end = ARRAY_SIZE(frequency_table) - 1;
+	unsigned int idx = ARRAY_SIZE(frequency_table) - 1;
+
+	if (freq < frequency_table[0].ocl)
+		return 0;
+
+	if (freq > frequency_table[ARRAY_SIZE(frequency_table) - 1].ocl)
+		return ARRAY_SIZE(frequency_table) - 1;
+
+	while (start < end) {
+		if (freq == frequency_table[idx].ocl)
+			break;
+		if (freq < frequency_table[idx].ocl)
+			end = idx;
+		else
+			start = idx + 1;
+		idx = start + (end - start) / 2;
+	}
+	if (freq < frequency_table[idx].ocl)
+		idx--;
+
+	return idx;
+}
+
+static unsigned short icap_get_ocl_frequency(const struct icap *icap, int idx)
+{
+#define XCL_INPUT_FREQ 100
+	const u64 input = XCL_INPUT_FREQ;
+	u32 val;
+	u32 mul0, div0;
+	u32 mul_frac0 = 0;
+	u32 div1;
+	u32 div_frac1 = 0;
+	u64 freq = 0;
+	char *base = NULL;
+
+	if (ICAP_PRIVILEGED(icap)) {
+		base = icap->icap_clock_bases[idx];
+		val = reg_rd(base + OCL_CLKWIZ_STATUS_OFFSET);
+		if ((val & 1) == 0)
+			return 0;
+
+		val = reg_rd(base + OCL_CLKWIZ_CONFIG_OFFSET(0));
+
+		div0 = val & 0xff;
+		mul0 = (val & 0xff00) >> 8;
+		if (val & BIT(26)) {
+			mul_frac0 = val >> 16;
+			mul_frac0 &= 0x3ff;
+		}
+
+		/*
+		 * Multiply both numerator (mul0) and the denominator (div0) with 1000
+		 * to account for fractional portion of multiplier
+		 */
+		mul0 *= 1000;
+		mul0 += mul_frac0;
+		div0 *= 1000;
+
+		val = reg_rd(base + OCL_CLKWIZ_CONFIG_OFFSET(2));
+
+		div1 = val & 0xff;
+		if (val & BIT(18)) {
+			div_frac1 = val >> 8;
+			div_frac1 &= 0x3ff;
+		}
+
+		/*
+		 * Multiply both numerator (mul0) and the denominator (div1) with 1000 to
+		 * account for fractional portion of divider
+		 */
+
+		div1 *= 1000;
+		div1 += div_frac1;
+		div0 *= div1;
+		mul0 *= 1000;
+		if (div0 == 0) {
+			ICAP_ERR(icap, "clockwiz 0 divider");
+			return 0;
+		}
+		freq = (input * mul0) / div0;
+	} else {
+		icap_read_from_peer(icap->icap_pdev, CLOCK_FREQ_0, (u32 *)&freq, sizeof(u32));
+	}
+	return freq;
+}
+
+static unsigned int icap_get_clock_frequency_counter_khz(const struct icap *icap, int idx)
+{
+	u32 freq, status;
+	char *base = icap->icap_clock_freq_counter;
+	int times;
+
+	times = 10;
+	freq = 0;
+	/*
+	 * reset and wait until done
+	 */
+	if (ICAP_PRIVILEGED(icap)) {
+		if (uuid_is_null(&icap->icap_bitstream_uuid)) {
+			ICAP_ERR(icap, "ERROR: There isn't a xclbin loaded in the dynamic "
+				 "region, frequencies counter cannot be determined");
+			return freq;
+		}
+		reg_wr(base, 0x1);
+
+		while (times != 0) {
+			status = reg_rd(base);
+			if (status == 0x2)
+				break;
+			mdelay(1);
+			times--;
+		};
+
+		freq = reg_rd(base + OCL_CLK_FREQ_COUNTER_OFFSET + idx * sizeof(u32));
+	} else {
+		icap_read_from_peer(icap->icap_pdev, FREQ_COUNTER_0, (u32 *)&freq, sizeof(u32));
+	}
+	return freq;
+}
+/*
+ * Based on Clocking Wizard v5.1, section Dynamic Reconfiguration
+ * through AXI4-Lite
+ */
+static int icap_ocl_freqscaling(struct icap *icap, bool force)
+{
+	unsigned int curr_freq;
+	u32 config;
+	int i;
+	int j = 0;
+	u32 val = 0;
+	unsigned int idx = 0;
+	long err = 0;
+
+	for (i = 0; i < ICAP_MAX_NUM_CLOCKS; ++i) {
+		// A value of zero means skip scaling for this clock index
+		if (!icap->icap_ocl_frequency[i])
+			continue;
+
+		idx = find_matching_freq_config(icap->icap_ocl_frequency[i]);
+		curr_freq = icap_get_ocl_frequency(icap, i);
+		ICAP_INFO(icap, "Clock %d, Current %d Mhz, New %d Mhz ",
+				i, curr_freq, icap->icap_ocl_frequency[i]);
+
+		/*
+		 * If current frequency is in the same step as the
+		 * requested frequency then nothing to do.
+		 */
+		if (!force && (find_matching_freq_config(curr_freq) == idx))
+			continue;
+
+		val = reg_rd(icap->icap_clock_bases[i] +
+			OCL_CLKWIZ_STATUS_OFFSET);
+		if (val != 1) {
+			ICAP_ERR(icap, "clockwiz %d is busy", i);
+			err = -EBUSY;
+			break;
+		}
+
+		config = frequency_table[idx].config0;
+		reg_wr(icap->icap_clock_bases[i] + OCL_CLKWIZ_CONFIG_OFFSET(0),
+			config);
+		config = frequency_table[idx].config2;
+		reg_wr(icap->icap_clock_bases[i] + OCL_CLKWIZ_CONFIG_OFFSET(2),
+			config);
+		msleep(10);
+		reg_wr(icap->icap_clock_bases[i] + OCL_CLKWIZ_CONFIG_OFFSET(23),
+			0x00000007);
+		msleep(1);
+		reg_wr(icap->icap_clock_bases[i] + OCL_CLKWIZ_CONFIG_OFFSET(23),
+			0x00000002);
+
+		ICAP_INFO(icap, "clockwiz waiting for locked signal");
+		msleep(100);
+		for (j = 0; j < 100; j++) {
+			val = reg_rd(icap->icap_clock_bases[i] +
+				OCL_CLKWIZ_STATUS_OFFSET);
+			if (val == 1)
+				break;
+			msleep(100);
+		}
+		if (val != 1) {
+			ICAP_ERR(icap, "clockwiz MMCM/PLL did not lock after %d ms, "
+				"restoring the original configuration", 100 * 100);
+			/* restore the original clock configuration */
+			reg_wr(icap->icap_clock_bases[i] +
+				OCL_CLKWIZ_CONFIG_OFFSET(23), 0x00000004);
+			msleep(10);
+			reg_wr(icap->icap_clock_bases[i] +
+				OCL_CLKWIZ_CONFIG_OFFSET(23), 0x00000000);
+			err = -ETIMEDOUT;
+			break;
+		}
+		val = reg_rd(icap->icap_clock_bases[i] +
+			OCL_CLKWIZ_CONFIG_OFFSET(0));
+		ICAP_INFO(icap, "clockwiz CONFIG(0) 0x%x", val);
+		val = reg_rd(icap->icap_clock_bases[i] +
+			OCL_CLKWIZ_CONFIG_OFFSET(2));
+		ICAP_INFO(icap, "clockwiz CONFIG(2) 0x%x", val);
+	}
+
+	return err;
+}
+
+static bool icap_bitstream_in_use(struct icap *icap, pid_t pid)
+{
+	BUG_ON(icap->icap_bitstream_ref < 0);
+
+	/* Any user counts if pid isn't specified. */
+	if (pid == 0)
+		return icap->icap_bitstream_ref != 0;
+
+	if (icap->icap_bitstream_ref == 0)
+		return false;
+	if ((icap->icap_bitstream_ref == 1) && obtain_user(icap, pid))
+		return false;
+	return true;
+}
+
+static int icap_freeze_axi_gate_shell(struct icap *icap)
+{
+	xdev_handle_t xdev = xocl_get_xdev(icap->icap_pdev);
+
+	ICAP_INFO(icap, "freezing Shell AXI gate");
+	BUG_ON(icap->icap_axi_gate_shell_frozen);
+
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+	reg_wr(&icap->icap_axi_gate->iag_wr, GATE_FREEZE_SHELL);
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+
+	if (!xocl_is_unified(xdev)) {
+		reg_wr(&icap->icap_regs->ir_cr, 0xc);
+		ndelay(20);
+	} else {
+		/* New ICAP reset sequence applicable only to unified dsa. */
+		reg_wr(&icap->icap_regs->ir_cr, 0x8);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x0);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x4);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x0);
+		ndelay(2000);
+	}
+
+	icap->icap_axi_gate_shell_frozen = true;
+
+	return 0;
+}
+
+static int icap_free_axi_gate_shell(struct icap *icap)
+{
+	int i;
+
+	ICAP_INFO(icap, "freeing Shell AXI gate");
+	/*
+	 * First pulse the OCL RESET. This is important for PR with multiple
+	 * clocks as it resets the edge triggered clock converter FIFO
+	 */
+
+	if (!icap->icap_axi_gate_shell_frozen)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(gate_free_shell); i++) {
+		(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+		reg_wr(&icap->icap_axi_gate->iag_wr, gate_free_shell[i]);
+		mdelay(50);
+	}
+
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+
+	icap->icap_axi_gate_shell_frozen = false;
+
+	return 0;
+}
+
+static int icap_freeze_axi_gate(struct icap *icap)
+{
+	xdev_handle_t xdev = xocl_get_xdev(icap->icap_pdev);
+
+	ICAP_INFO(icap, "freezing CL AXI gate");
+	BUG_ON(icap->icap_axi_gate_frozen);
+
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+	reg_wr(&icap->icap_axi_gate->iag_wr, GATE_FREEZE_USER);
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+
+	if (!xocl_is_unified(xdev)) {
+		reg_wr(&icap->icap_regs->ir_cr, 0xc);
+		ndelay(20);
+	} else {
+		/* New ICAP reset sequence applicable only to unified dsa. */
+		reg_wr(&icap->icap_regs->ir_cr, 0x8);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x0);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x4);
+		ndelay(2000);
+		reg_wr(&icap->icap_regs->ir_cr, 0x0);
+		ndelay(2000);
+	}
+
+	icap->icap_axi_gate_frozen = true;
+
+	return 0;
+}
+
+static int icap_free_axi_gate(struct icap *icap)
+{
+	int i;
+
+	ICAP_INFO(icap, "freeing CL AXI gate");
+	/*
+	 * First pulse the OCL RESET. This is important for PR with multiple
+	 * clocks as it resets the edge triggered clock converter FIFO
+	 */
+
+	if (!icap->icap_axi_gate_frozen)
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(gate_free_user); i++) {
+		(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+		reg_wr(&icap->icap_axi_gate->iag_wr, gate_free_user[i]);
+		ndelay(500);
+	}
+
+	(void) reg_rd(&icap->icap_axi_gate->iag_rd);
+
+	icap->icap_axi_gate_frozen = false;
+
+	return 0;
+}
+
+static void platform_reset_axi_gate(struct platform_device *pdev)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+
+	/* Can only be done from mgmt pf. */
+	if (!ICAP_PRIVILEGED(icap))
+		return;
+
+	mutex_lock(&icap->icap_lock);
+	if (!icap_bitstream_in_use(icap, 0)) {
+		(void) icap_freeze_axi_gate(platform_get_drvdata(pdev));
+		msleep(500);
+		(void) icap_free_axi_gate(platform_get_drvdata(pdev));
+		msleep(500);
+	}
+	mutex_unlock(&icap->icap_lock);
+}
+
+static int set_freqs(struct icap *icap, unsigned short *freqs, int num_freqs)
+{
+	int i;
+	int err;
+	u32 val;
+
+	for (i = 0; i < min(ICAP_MAX_NUM_CLOCKS, num_freqs); ++i) {
+		if (freqs[i] == 0)
+			continue;
+
+		val = reg_rd(icap->icap_clock_bases[i] +
+			OCL_CLKWIZ_STATUS_OFFSET);
+		if ((val & 0x1) == 0) {
+			ICAP_ERR(icap, "clockwiz %d is busy", i);
+			err = -EBUSY;
+			goto done;
+		}
+	}
+
+	memcpy(icap->icap_ocl_frequency, freqs,
+		sizeof(*freqs) * min(ICAP_MAX_NUM_CLOCKS, num_freqs));
+
+	icap_freeze_axi_gate(icap);
+	err = icap_ocl_freqscaling(icap, false);
+	icap_free_axi_gate(icap);
+
+done:
+	return err;
+
+}
+
+static int set_and_verify_freqs(struct icap *icap, unsigned short *freqs, int num_freqs)
+{
+	int i;
+	int err;
+	u32 clock_freq_counter, request_in_khz, tolerance;
+
+	err = set_freqs(icap, freqs, num_freqs);
+	if (err)
+		return err;
+
+	for (i = 0; i < min(ICAP_MAX_NUM_CLOCKS, num_freqs); ++i) {
+		if (!freqs[i])
+			continue;
+		clock_freq_counter = icap_get_clock_frequency_counter_khz(icap, i);
+		if (clock_freq_counter == 0) {
+			err = -EDOM;
+			break;
+		}
+		request_in_khz = freqs[i]*1000;
+		tolerance = freqs[i]*50;
+		if (tolerance < abs(clock_freq_counter-request_in_khz)) {
+			ICAP_ERR(icap, "Frequency is higher than tolerance value, request %u khz, "
+				 "actual %u khz", request_in_khz, clock_freq_counter);
+			err = -EDOM;
+			break;
+		}
+	}
+
+	return err;
+}
+
+static int icap_ocl_set_freqscaling(struct platform_device *pdev,
+	unsigned int region, unsigned short *freqs, int num_freqs)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	int err = 0;
+
+	/* Can only be done from mgmt pf. */
+	if (!ICAP_PRIVILEGED(icap))
+		return -EPERM;
+
+	/* For now, only PR region 0 is supported. */
+	if (region != 0)
+		return -EINVAL;
+
+	mutex_lock(&icap->icap_lock);
+
+	err = set_freqs(icap, freqs, num_freqs);
+
+	mutex_unlock(&icap->icap_lock);
+
+	return err;
+}
+
+static int icap_ocl_update_clock_freq_topology(struct platform_device *pdev, struct xclmgmt_ioc_freqscaling *freq_obj)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	struct clock_freq_topology *topology = NULL;
+	int num_clocks = 0;
+	int i = 0;
+	int err = 0;
+
+	mutex_lock(&icap->icap_lock);
+	if (icap->icap_clock_freq_topology) {
+		topology = (struct clock_freq_topology *)icap->icap_clock_freq_topology;
+		num_clocks = topology->m_count;
+		ICAP_INFO(icap, "Num clocks is %d", num_clocks);
+		for (i = 0; i < ARRAY_SIZE(freq_obj->ocl_target_freq); i++) {
+			ICAP_INFO(icap, "requested frequency is : "
+				"%d xclbin freq is: %d",
+				freq_obj->ocl_target_freq[i],
+				topology->m_clock_freq[i].m_freq_Mhz);
+			if (freq_obj->ocl_target_freq[i] >
+				topology->m_clock_freq[i].m_freq_Mhz) {
+				ICAP_ERR(icap, "Unable to set frequency as "
+					"requested frequency %d is greater "
+					"than set by xclbin %d",
+					freq_obj->ocl_target_freq[i],
+					topology->m_clock_freq[i].m_freq_Mhz);
+				err = -EDOM;
+				goto done;
+			}
+		}
+	} else {
+		ICAP_ERR(icap, "ERROR: There isn't a hardware accelerator loaded in the dynamic region."
+			" Validation of accelerator frequencies cannot be determine");
+		err = -EDOM;
+		goto done;
+	}
+
+	err = set_and_verify_freqs(icap, freq_obj->ocl_target_freq, ARRAY_SIZE(freq_obj->ocl_target_freq));
+
+done:
+	mutex_unlock(&icap->icap_lock);
+	return err;
+}
+
+static int icap_ocl_get_freqscaling(struct platform_device *pdev,
+	unsigned int region, unsigned short *freqs, int num_freqs)
+{
+	int i;
+	struct icap *icap = platform_get_drvdata(pdev);
+
+	/* For now, only PR region 0 is supported. */
+	if (region != 0)
+		return -EINVAL;
+
+	mutex_lock(&icap->icap_lock);
+	for (i = 0; i < min(ICAP_MAX_NUM_CLOCKS, num_freqs); i++)
+		freqs[i] = icap_get_ocl_frequency(icap, i);
+	mutex_unlock(&icap->icap_lock);
+
+	return 0;
+}
+
+static inline bool mig_calibration_done(struct icap *icap)
+{
+	return (reg_rd(&icap->icap_state->igs_state) & BIT(0)) != 0;
+}
+
+/* Check for MIG calibration. */
+static int calibrate_mig(struct icap *icap)
+{
+	int i;
+
+	for (i = 0; i < 10 && !mig_calibration_done(icap); ++i)
+		msleep(500);
+
+	if (!mig_calibration_done(icap)) {
+		ICAP_ERR(icap,
+			"MIG calibration timeout after bitstream download");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static inline void free_clock_freq_topology(struct icap *icap)
+{
+	vfree(icap->icap_clock_freq_topology);
+	icap->icap_clock_freq_topology = NULL;
+	icap->icap_clock_freq_topology_length = 0;
+}
+
+static int icap_setup_clock_freq_topology(struct icap *icap,
+	const char *buffer, unsigned long length)
+{
+	if (length == 0)
+		return 0;
+
+	free_clock_freq_topology(icap);
+
+	icap->icap_clock_freq_topology = vmalloc(length);
+	if (!icap->icap_clock_freq_topology)
+		return -ENOMEM;
+
+	memcpy(icap->icap_clock_freq_topology, buffer, length);
+	icap->icap_clock_freq_topology_length = length;
+
+	return 0;
+}
+
+static inline void free_clear_bitstream(struct icap *icap)
+{
+	vfree(icap->icap_clear_bitstream);
+	icap->icap_clear_bitstream = NULL;
+	icap->icap_clear_bitstream_length = 0;
+}
+
+static int icap_setup_clear_bitstream(struct icap *icap,
+	const char *buffer, unsigned long length)
+{
+	if (length == 0)
+		return 0;
+
+	free_clear_bitstream(icap);
+
+	icap->icap_clear_bitstream = vmalloc(length);
+	if (!icap->icap_clear_bitstream)
+		return -ENOMEM;
+
+	memcpy(icap->icap_clear_bitstream, buffer, length);
+	icap->icap_clear_bitstream_length = length;
+
+	return 0;
+}
+
+static int wait_for_done(struct icap *icap)
+{
+	u32 w;
+	int i = 0;
+
+	for (i = 0; i < 10; i++) {
+		udelay(5);
+		w = reg_rd(&icap->icap_regs->ir_sr);
+		ICAP_INFO(icap, "XHWICAP_SR: %x", w);
+		if (w & 0x5)
+			return 0;
+	}
+
+	ICAP_ERR(icap, "bitstream download timeout");
+	return -ETIMEDOUT;
+}
+
+static int icap_write(struct icap *icap, const u32 *word_buf, int size)
+{
+	int i;
+	u32 value = 0;
+
+	for (i = 0; i < size; i++) {
+		value = be32_to_cpu(word_buf[i]);
+		reg_wr(&icap->icap_regs->ir_wf, value);
+	}
+
+	reg_wr(&icap->icap_regs->ir_cr, 0x1);
+
+	for (i = 0; i < 20; i++) {
+		value = reg_rd(&icap->icap_regs->ir_cr);
+		if ((value & 0x1) == 0)
+			return 0;
+		ndelay(50);
+	}
+
+	ICAP_ERR(icap, "writing %d dwords timeout", size);
+	return -EIO;
+}
+
+static uint64_t icap_get_section_size(struct icap *icap, enum axlf_section_kind kind)
+{
+	uint64_t size = 0;
+
+	switch (kind) {
+	case IP_LAYOUT:
+		size = sizeof_sect(icap->ip_layout, m_ip_data);
+		break;
+	case MEM_TOPOLOGY:
+		size = sizeof_sect(icap->mem_topo, m_mem_data);
+		break;
+	case DEBUG_IP_LAYOUT:
+		size = sizeof_sect(icap->debug_layout, m_debug_ip_data);
+		break;
+	case CONNECTIVITY:
+		size = sizeof_sect(icap->connectivity, m_connection);
+		break;
+	default:
+		break;
+	}
+
+	return size;
+}
+
+static int bitstream_parse_header(struct icap *icap, const unsigned char *Data,
+	unsigned int Size, struct XHwIcap_Bit_Header *Header)
+{
+	unsigned int I;
+	unsigned int Len;
+	unsigned int Tmp;
+	unsigned int Index;
+
+	/* Start Index at start of bitstream */
+	Index = 0;
+
+	/* Initialize HeaderLength. If the parser returns early, this value
+	 * indicates failure.
+	 */
+	Header->HeaderLength = XHI_BIT_HEADER_FAILURE;
+
+	/* Get "Magic" length */
+	Header->MagicLength = Data[Index++];
+	Header->MagicLength = (Header->MagicLength << 8) | Data[Index++];
+
+	/* Read in "magic" */
+	for (I = 0; I < Header->MagicLength - 1; I++) {
+		Tmp = Data[Index++];
+		if (I%2 == 0 && Tmp != XHI_EVEN_MAGIC_BYTE)
+			return -1;   /* INVALID_FILE_HEADER_ERROR */
+
+		if (I%2 == 1 && Tmp != XHI_ODD_MAGIC_BYTE)
+			return -1;   /* INVALID_FILE_HEADER_ERROR */
+
+	}
+
+	/* Read null end of magic data. */
+	Tmp = Data[Index++];
+
+	/* Read 0x01 (short) */
+	Tmp = Data[Index++];
+	Tmp = (Tmp << 8) | Data[Index++];
+
+	/* Check the "0x01" half word */
+	if (Tmp != 0x01)
+		return -1;	 /* INVALID_FILE_HEADER_ERROR */
+
+	/* Read 'a' */
+	Tmp = Data[Index++];
+	if (Tmp != 'a')
+		return -1;	  /* INVALID_FILE_HEADER_ERROR	*/
+
+	/* Get Design Name length */
+	Len = Data[Index++];
+	Len = (Len << 8) | Data[Index++];
+
+	/* allocate space for design name and final null character. */
+	Header->DesignName = kmalloc(Len, GFP_KERNEL);
+	if (!Header->DesignName)
+		return -1;
+
+	/* Read in Design Name */
+	for (I = 0; I < Len; I++)
+		Header->DesignName[I] = Data[Index++];
+
+
+	if (Header->DesignName[Len-1] != '\0')
+		return -1;
+
+	/* Read 'b' */
+	Tmp = Data[Index++];
+	if (Tmp != 'b')
+		return -1;	/* INVALID_FILE_HEADER_ERROR */
+
+	/* Get Part Name length */
+	Len = Data[Index++];
+	Len = (Len << 8) | Data[Index++];
+
+	/* allocate space for part name and final null character. */
+	Header->PartName = kmalloc(Len, GFP_KERNEL);
+	if (!Header->PartName)
+		return -1;
+
+	/* Read in part name */
+	for (I = 0; I < Len; I++)
+		Header->PartName[I] = Data[Index++];
+
+	if (Header->PartName[Len-1] != '\0')
+		return -1;
+
+	/* Read 'c' */
+	Tmp = Data[Index++];
+	if (Tmp != 'c')
+		return -1;	/* INVALID_FILE_HEADER_ERROR */
+
+	/* Get date length */
+	Len = Data[Index++];
+	Len = (Len << 8) | Data[Index++];
+
+	/* allocate space for date and final null character. */
+	Header->Date = kmalloc(Len, GFP_KERNEL);
+	if (!Header->Date)
+		return -1;
+
+	/* Read in date name */
+	for (I = 0; I < Len; I++)
+		Header->Date[I] = Data[Index++];
+
+	if (Header->Date[Len - 1] != '\0')
+		return -1;
+
+	/* Read 'd' */
+	Tmp = Data[Index++];
+	if (Tmp != 'd')
+		return -1;	/* INVALID_FILE_HEADER_ERROR  */
+
+	/* Get time length */
+	Len = Data[Index++];
+	Len = (Len << 8) | Data[Index++];
+
+	/* allocate space for time and final null character. */
+	Header->Time = kmalloc(Len, GFP_KERNEL);
+	if (!Header->Time)
+		return -1;
+
+	/* Read in time name */
+	for (I = 0; I < Len; I++)
+		Header->Time[I] = Data[Index++];
+
+	if (Header->Time[Len - 1] != '\0')
+		return -1;
+
+	/* Read 'e' */
+	Tmp = Data[Index++];
+	if (Tmp != 'e')
+		return -1;	/* INVALID_FILE_HEADER_ERROR */
+
+	/* Get byte length of bitstream */
+	Header->BitstreamLength = Data[Index++];
+	Header->BitstreamLength = (Header->BitstreamLength << 8) | Data[Index++];
+	Header->BitstreamLength = (Header->BitstreamLength << 8) | Data[Index++];
+	Header->BitstreamLength = (Header->BitstreamLength << 8) | Data[Index++];
+	Header->HeaderLength = Index;
+
+	ICAP_INFO(icap, "Design \"%s\"", Header->DesignName);
+	ICAP_INFO(icap, "Part \"%s\"", Header->PartName);
+	ICAP_INFO(icap, "Timestamp \"%s %s\"", Header->Time, Header->Date);
+	ICAP_INFO(icap, "Raw data size 0x%x", Header->BitstreamLength);
+	return 0;
+}
+
+static int bitstream_helper(struct icap *icap, const u32 *word_buffer,
+			    unsigned int word_count)
+{
+	unsigned int remain_word;
+	unsigned int word_written = 0;
+	int wr_fifo_vacancy = 0;
+	int err = 0;
+
+	for (remain_word = word_count; remain_word > 0;
+		remain_word -= word_written, word_buffer += word_written) {
+		wr_fifo_vacancy = reg_rd(&icap->icap_regs->ir_wfv);
+		if (wr_fifo_vacancy <= 0) {
+			ICAP_ERR(icap, "no vacancy: %d", wr_fifo_vacancy);
+			err = -EIO;
+			break;
+		}
+		word_written = (wr_fifo_vacancy < remain_word) ?
+			wr_fifo_vacancy : remain_word;
+		if (icap_write(icap, word_buffer, word_written) != 0) {
+			err = -EIO;
+			break;
+		}
+	}
+
+	return err;
+}
+
+static long icap_download(struct icap *icap, const char *buffer,
+	unsigned long length)
+{
+	long err = 0;
+	struct XHwIcap_Bit_Header bit_header = { 0 };
+	unsigned int numCharsRead = DMA_HWICAP_BITFILE_BUFFER_SIZE;
+	unsigned int byte_read;
+
+	BUG_ON(!buffer);
+	BUG_ON(!length);
+
+	if (bitstream_parse_header(icap, buffer,
+		DMA_HWICAP_BITFILE_BUFFER_SIZE, &bit_header)) {
+		err = -EINVAL;
+		goto free_buffers;
+	}
+
+	if ((bit_header.HeaderLength + bit_header.BitstreamLength) > length) {
+		err = -EINVAL;
+		goto free_buffers;
+	}
+
+	buffer += bit_header.HeaderLength;
+
+	for (byte_read = 0; byte_read < bit_header.BitstreamLength;
+		byte_read += numCharsRead) {
+		numCharsRead = bit_header.BitstreamLength - byte_read;
+		if (numCharsRead > DMA_HWICAP_BITFILE_BUFFER_SIZE)
+			numCharsRead = DMA_HWICAP_BITFILE_BUFFER_SIZE;
+
+		err = bitstream_helper(icap, (u32 *)buffer,
+				       numCharsRead / sizeof(u32));
+		if (err)
+			goto free_buffers;
+		buffer += numCharsRead;
+	}
+
+	err = wait_for_done(icap);
+
+free_buffers:
+	kfree(bit_header.DesignName);
+	kfree(bit_header.PartName);
+	kfree(bit_header.Date);
+	kfree(bit_header.Time);
+	return err;
+}
+
+static const struct axlf_section_header *get_axlf_section_hdr(
+	struct icap *icap, const struct axlf *top, enum axlf_section_kind kind)
+{
+	int i;
+	const struct axlf_section_header *hdr = NULL;
+
+	ICAP_INFO(icap,
+		"trying to find section header for axlf section %d", kind);
+
+	for (i = 0; i < top->m_header.m_numSections; i++) {
+		ICAP_INFO(icap, "saw section header: %d",
+			top->m_sections[i].m_sectionKind);
+		if (top->m_sections[i].m_sectionKind == kind) {
+			hdr = &top->m_sections[i];
+			break;
+		}
+	}
+
+	if (hdr) {
+		if ((hdr->m_sectionOffset + hdr->m_sectionSize) >
+			top->m_header.m_length) {
+			ICAP_INFO(icap, "found section is invalid");
+			hdr = NULL;
+		} else {
+			ICAP_INFO(icap, "header offset: %llu, size: %llu",
+				hdr->m_sectionOffset, hdr->m_sectionSize);
+		}
+	} else {
+		ICAP_INFO(icap, "could not find section header %d", kind);
+	}
+
+	return hdr;
+}
+
+static int alloc_and_get_axlf_section(struct icap *icap,
+	const struct axlf *top, enum axlf_section_kind kind,
+	void **addr, uint64_t *size)
+{
+	void *section = NULL;
+	const struct axlf_section_header *hdr =
+		get_axlf_section_hdr(icap, top, kind);
+
+	if (hdr == NULL)
+		return -EINVAL;
+
+	section = vmalloc(hdr->m_sectionSize);
+	if (section == NULL)
+		return -ENOMEM;
+
+	memcpy(section, ((const char *)top) + hdr->m_sectionOffset,
+		hdr->m_sectionSize);
+
+	*addr = section;
+	*size = hdr->m_sectionSize;
+	return 0;
+}
+
+static int icap_download_boot_firmware(struct platform_device *pdev)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	struct pci_dev *pcidev = XOCL_PL_TO_PCI_DEV(pdev);
+	struct pci_dev *pcidev_user = NULL;
+	xdev_handle_t xdev = xocl_get_xdev(pdev);
+	int funcid = PCI_FUNC(pcidev->devfn);
+	int slotid = PCI_SLOT(pcidev->devfn);
+	unsigned short deviceid = pcidev->device;
+	struct axlf *bin_obj_axlf;
+	const struct firmware *fw;
+	char fw_name[128];
+	struct XHwIcap_Bit_Header bit_header = { 0 };
+	long err = 0;
+	uint64_t length = 0;
+	uint64_t primaryFirmwareOffset = 0;
+	uint64_t primaryFirmwareLength = 0;
+	uint64_t secondaryFirmwareOffset = 0;
+	uint64_t secondaryFirmwareLength = 0;
+	uint64_t mbBinaryOffset = 0;
+	uint64_t mbBinaryLength = 0;
+	const struct axlf_section_header *primaryHeader = NULL;
+	const struct axlf_section_header *secondaryHeader = NULL;
+	const struct axlf_section_header *mbHeader = NULL;
+	bool load_mbs = false;
+
+	/* Can only be done from mgmt pf. */
+	if (!ICAP_PRIVILEGED(icap))
+		return -EPERM;
+
+	/* Read dsabin from file system. */
+
+	if (funcid != 0) {
+		pcidev_user = pci_get_slot(pcidev->bus,
+			PCI_DEVFN(slotid, funcid - 1));
+		if (!pcidev_user) {
+			pcidev_user = pci_get_device(pcidev->vendor,
+				pcidev->device + 1, NULL);
+		}
+		if (pcidev_user)
+			deviceid = pcidev_user->device;
+	}
+
+	snprintf(fw_name, sizeof(fw_name),
+		"xilinx/%04x-%04x-%04x-%016llx.dsabin",
+		le16_to_cpu(pcidev->vendor),
+		le16_to_cpu(deviceid),
+		le16_to_cpu(pcidev->subsystem_device),
+		le64_to_cpu(xocl_get_timestamp(xdev)));
+	ICAP_INFO(icap, "try load dsabin %s", fw_name);
+	err = request_firmware(&fw, fw_name, &pcidev->dev);
+	if (err) {
+		snprintf(fw_name, sizeof(fw_name),
+			"xilinx/%04x-%04x-%04x-%016llx.dsabin",
+			le16_to_cpu(pcidev->vendor),
+			le16_to_cpu(deviceid + 1),
+			le16_to_cpu(pcidev->subsystem_device),
+			le64_to_cpu(xocl_get_timestamp(xdev)));
+		ICAP_INFO(icap, "try load dsabin %s", fw_name);
+		err = request_firmware(&fw, fw_name, &pcidev->dev);
+	}
+	/* Retry with the legacy dsabin. */
+	if (err) {
+		snprintf(fw_name, sizeof(fw_name),
+			"xilinx/%04x-%04x-%04x-%016llx.dsabin",
+			le16_to_cpu(pcidev->vendor),
+			le16_to_cpu(pcidev->device + 1),
+			le16_to_cpu(pcidev->subsystem_device),
+			le64_to_cpu(0x0000000000000000));
+		ICAP_INFO(icap, "try load dsabin %s", fw_name);
+		err = request_firmware(&fw, fw_name, &pcidev->dev);
+	}
+	if (err) {
+		/* Give up on finding .dsabin. */
+		ICAP_ERR(icap, "unable to find firmware, giving up");
+		return err;
+	}
+
+	/* Grab lock and touch hardware. */
+	mutex_lock(&icap->icap_lock);
+
+	if (xocl_mb_sched_on(xdev)) {
+		/* Try locating the microblaze binary. */
+		bin_obj_axlf = (struct axlf *)fw->data;
+		mbHeader = get_axlf_section_hdr(icap, bin_obj_axlf, SCHED_FIRMWARE);
+		if (mbHeader) {
+			mbBinaryOffset = mbHeader->m_sectionOffset;
+			mbBinaryLength = mbHeader->m_sectionSize;
+			length = bin_obj_axlf->m_header.m_length;
+			xocl_mb_load_sche_image(xdev, fw->data + mbBinaryOffset,
+				mbBinaryLength);
+			ICAP_INFO(icap, "stashed mb sche binary");
+			load_mbs = true;
+		}
+	}
+
+	if (xocl_mb_mgmt_on(xdev)) {
+		/* Try locating the board mgmt binary. */
+		bin_obj_axlf = (struct axlf *)fw->data;
+		mbHeader = get_axlf_section_hdr(icap, bin_obj_axlf, FIRMWARE);
+		if (mbHeader) {
+			mbBinaryOffset = mbHeader->m_sectionOffset;
+			mbBinaryLength = mbHeader->m_sectionSize;
+			length = bin_obj_axlf->m_header.m_length;
+			xocl_mb_load_mgmt_image(xdev, fw->data + mbBinaryOffset,
+				mbBinaryLength);
+			ICAP_INFO(icap, "stashed mb mgmt binary");
+			load_mbs = true;
+		}
+	}
+
+	if (load_mbs)
+		xocl_mb_reset(xdev);
+
+
+	if (memcmp(fw->data, ICAP_XCLBIN_V2, sizeof(ICAP_XCLBIN_V2)) != 0) {
+		ICAP_ERR(icap, "invalid firmware %s", fw_name);
+		err = -EINVAL;
+		goto done;
+	}
+
+	ICAP_INFO(icap, "boot_firmware in axlf format");
+	bin_obj_axlf = (struct axlf *)fw->data;
+	length = bin_obj_axlf->m_header.m_length;
+	/* Match the xclbin with the hardware. */
+	if (!xocl_verify_timestamp(xdev,
+		bin_obj_axlf->m_header.m_featureRomTimeStamp)) {
+		ICAP_ERR(icap, "timestamp of ROM did not match xclbin");
+		err = -EINVAL;
+		goto done;
+	}
+	ICAP_INFO(icap, "VBNV and timestamps matched");
+
+	if (xocl_xrt_version_check(xdev, bin_obj_axlf, true)) {
+		ICAP_ERR(icap, "Major version does not match xrt");
+		err = -EINVAL;
+		goto done;
+	}
+	ICAP_INFO(icap, "runtime version matched");
+
+	primaryHeader = get_axlf_section_hdr(icap, bin_obj_axlf, BITSTREAM);
+	secondaryHeader = get_axlf_section_hdr(icap, bin_obj_axlf,
+		CLEARING_BITSTREAM);
+	if (primaryHeader) {
+		primaryFirmwareOffset = primaryHeader->m_sectionOffset;
+		primaryFirmwareLength = primaryHeader->m_sectionSize;
+	}
+	if (secondaryHeader) {
+		secondaryFirmwareOffset = secondaryHeader->m_sectionOffset;
+		secondaryFirmwareLength = secondaryHeader->m_sectionSize;
+	}
+
+	if (length > fw->size) {
+		err = -EINVAL;
+		goto done;
+	}
+
+	if ((primaryFirmwareOffset + primaryFirmwareLength) > length) {
+		err = -EINVAL;
+		goto done;
+	}
+
+	if ((secondaryFirmwareOffset + secondaryFirmwareLength) > length) {
+		err = -EINVAL;
+		goto done;
+	}
+
+	if (primaryFirmwareLength) {
+		ICAP_INFO(icap,
+			"found second stage bitstream of size 0x%llx in %s",
+			primaryFirmwareLength, fw_name);
+		err = icap_download(icap, fw->data + primaryFirmwareOffset,
+			primaryFirmwareLength);
+		/*
+		 * If we loaded a new second stage, we do not need the
+		 * previously stashed clearing bitstream if any.
+		 */
+		free_clear_bitstream(icap);
+		if (err) {
+			ICAP_ERR(icap,
+				"failed to download second stage bitstream");
+			goto done;
+		}
+		ICAP_INFO(icap, "downloaded second stage bitstream");
+	}
+
+	/*
+	 * If both primary and secondary bitstreams have been provided then
+	 * ignore the previously stashed bitstream if any. If only secondary
+	 * bitstream was provided, but we found a previously stashed bitstream
+	 * we should use the latter since it is more appropriate for the
+	 * current state of the device
+	 */
+	if (secondaryFirmwareLength && (primaryFirmwareLength ||
+		!icap->icap_clear_bitstream)) {
+		free_clear_bitstream(icap);
+		icap->icap_clear_bitstream = vmalloc(secondaryFirmwareLength);
+		if (!icap->icap_clear_bitstream) {
+			err = -ENOMEM;
+			goto done;
+		}
+		icap->icap_clear_bitstream_length = secondaryFirmwareLength;
+		memcpy(icap->icap_clear_bitstream,
+			fw->data + secondaryFirmwareOffset,
+			icap->icap_clear_bitstream_length);
+		ICAP_INFO(icap, "found clearing bitstream of size 0x%lx in %s",
+			icap->icap_clear_bitstream_length, fw_name);
+	} else if (icap->icap_clear_bitstream) {
+		ICAP_INFO(icap,
+			"using existing clearing bitstream of size 0x%lx",
+		       icap->icap_clear_bitstream_length);
+	}
+
+	if (icap->icap_clear_bitstream &&
+		bitstream_parse_header(icap, icap->icap_clear_bitstream,
+		DMA_HWICAP_BITFILE_BUFFER_SIZE, &bit_header)) {
+		err = -EINVAL;
+		free_clear_bitstream(icap);
+	}
+
+done:
+	mutex_unlock(&icap->icap_lock);
+	release_firmware(fw);
+	kfree(bit_header.DesignName);
+	kfree(bit_header.PartName);
+	kfree(bit_header.Date);
+	kfree(bit_header.Time);
+	ICAP_INFO(icap, "%s err: %ld", __func__, err);
+	return err;
+}
+
+
+static long icap_download_clear_bitstream(struct icap *icap)
+{
+	long err = 0;
+	const char *buffer = icap->icap_clear_bitstream;
+	unsigned long length = icap->icap_clear_bitstream_length;
+
+	ICAP_INFO(icap, "downloading clear bitstream of length 0x%lx", length);
+
+	if (!buffer)
+		return 0;
+
+	err = icap_download(icap, buffer, length);
+
+	free_clear_bitstream(icap);
+	return err;
+}
+
+/*
+ * This function must be called with icap->icap_lock held.
+ */
+static long axlf_set_freqscaling(struct icap *icap, struct platform_device *pdev,
+	const char *clk_buf, unsigned long length)
+{
+	struct clock_freq_topology *freqs = NULL;
+	int clock_type_count = 0;
+	int i = 0;
+	struct clock_freq *freq = NULL;
+	int data_clk_count = 0;
+	int kernel_clk_count = 0;
+	int system_clk_count = 0;
+	unsigned short target_freqs[4] = {0};
+
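+	/*
+	 * Optional sanity check (assumes icap_lock is the lock referred to in
+	 * the comment above): make the locking requirement visible to lockdep.
+	 */
+	lockdep_assert_held(&icap->icap_lock);
+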
+	freqs = (struct clock_freq_topology *)clk_buf;
+	if (freqs->m_count > 4) {
+		ICAP_ERR(icap, "More than 4 clocks found in clock topology");
+		return -EDOM;
+	}
+
+	/*
+	 * Error checks: we support one data clock (required), one kernel
+	 * clock (required) and at most two system clocks (optional, required
+	 * for aws). The data clock must be the first entry, followed by the
+	 * kernel clock and then the system clocks.
+	 */
+
+	for (i = 0; i < freqs->m_count; i++) {
+		freq = &(freqs->m_clock_freq[i]);
+		if (freq->m_type == CT_DATA)
+			data_clk_count++;
+
+		if (freq->m_type == CT_KERNEL)
+			kernel_clk_count++;
+
+		if (freq->m_type == CT_SYSTEM)
+			system_clk_count++;
+
+	}
+
+	if (data_clk_count != 1) {
+		ICAP_ERR(icap, "Data clock not found in clock topology");
+		return -EDOM;
+	}
+	if (kernel_clk_count != 1) {
+		ICAP_ERR(icap, "Kernel clock not found in clock topology");
+		return -EDOM;
+	}
+	if (system_clk_count > 2) {
+		ICAP_ERR(icap,
+			"More than 2 system clocks found in clock topology");
+		return -EDOM;
+	}
+
+	for (i = 0; i < freqs->m_count; i++) {
+		freq = &(freqs->m_clock_freq[i]);
+		if (freq->m_type == CT_DATA)
+			target_freqs[0] = freq->m_freq_Mhz;
+	}
+
+	for (i = 0; i < freqs->m_count; i++) {
+		freq = &(freqs->m_clock_freq[i]);
+		if (freq->m_type == CT_KERNEL)
+			target_freqs[1] = freq->m_freq_Mhz;
+	}
+
+	clock_type_count = 2;
+	for (i = 0; i < freqs->m_count; i++) {
+		freq = &(freqs->m_clock_freq[i]);
+		if (freq->m_type == CT_SYSTEM)
+			target_freqs[clock_type_count++] = freq->m_freq_Mhz;
+	}
+
+
+	ICAP_INFO(icap, "setting clock freq, "
+		"num: %lu, data_freq: %d , clk_freq: %d, "
+		"sys_freq[0]: %d, sys_freq[1]: %d",
+		ARRAY_SIZE(target_freqs), target_freqs[0], target_freqs[1],
+		target_freqs[2], target_freqs[3]);
+	return set_freqs(icap, target_freqs, 4);
+}
+
+
+static int icap_download_user(struct icap *icap, const char *bit_buf,
+	unsigned long length)
+{
+	long err = 0;
+	struct XHwIcap_Bit_Header bit_header = { 0 };
+	unsigned int numCharsRead = DMA_HWICAP_BITFILE_BUFFER_SIZE;
+	unsigned int byte_read;
+
+	ICAP_INFO(icap, "downloading bitstream, length: %lu", length);
+
+	icap_freeze_axi_gate(icap);
+
+	err = icap_download_clear_bitstream(icap);
+	if (err)
+		goto free_buffers;
+
+	if (bitstream_parse_header(icap, bit_buf,
+		DMA_HWICAP_BITFILE_BUFFER_SIZE, &bit_header)) {
+		err = -EINVAL;
+		goto free_buffers;
+	}
+	if ((bit_header.HeaderLength + bit_header.BitstreamLength) > length) {
+		err = -EINVAL;
+		goto free_buffers;
+	}
+
+	bit_buf += bit_header.HeaderLength;
+	for (byte_read = 0; byte_read < bit_header.BitstreamLength;
+		byte_read += numCharsRead) {
+		numCharsRead = bit_header.BitstreamLength - byte_read;
+		if (numCharsRead > DMA_HWICAP_BITFILE_BUFFER_SIZE)
+			numCharsRead = DMA_HWICAP_BITFILE_BUFFER_SIZE;
+
+		err = bitstream_helper(icap, (u32 *)bit_buf,
+				       numCharsRead / sizeof(u32));
+		if (err)
+			goto free_buffers;
+
+		bit_buf += numCharsRead;
+	}
+
+	err = wait_for_done(icap);
+	if (err)
+		goto free_buffers;
+
+	/*
+	 * Perform frequency scaling since PR download can silently overwrite
+	 * MMCM settings in the static region, changing the clock frequencies,
+	 * although the ClockWiz CONFIG registers will misleadingly report the
+	 * older configuration from before the bitstream download as if nothing
+	 * had changed.
+	 */
+	err = icap_ocl_freqscaling(icap, true);
+
+free_buffers:
+	icap_free_axi_gate(icap);
+	kfree(bit_header.DesignName);
+	kfree(bit_header.PartName);
+	kfree(bit_header.Date);
+	kfree(bit_header.Time);
+	return err;
+}
+
+
+static int __icap_lock_peer(struct platform_device *pdev, const uuid_t *id)
+{
+	int err = 0;
+	struct icap *icap = platform_get_drvdata(pdev);
+	int resp = 0;
+	size_t resplen = sizeof(resp);
+	struct mailbox_req_bitstream_lock bitstream_lock = {0};
+	size_t data_len = sizeof(struct mailbox_req_bitstream_lock);
+	struct mailbox_req *mb_req = NULL;
+	size_t reqlen = sizeof(struct mailbox_req) + data_len;
+
+	/*
+	 * If there are no local users yet, ask mgmt to lock the bitstream
+	 * on our behalf.
+	 */
+	if (icap->icap_bitstream_ref == 0) {
+		mb_req = vmalloc(reqlen);
+		if (!mb_req) {
+			err = -ENOMEM;
+			goto done;
+		}
+
+		mb_req->req = MAILBOX_REQ_LOCK_BITSTREAM;
+		uuid_copy(&bitstream_lock.uuid, id);
+
+		memcpy(mb_req->data, &bitstream_lock, data_len);
+
+		err = xocl_peer_request(XOCL_PL_DEV_TO_XDEV(pdev),
+			mb_req, reqlen, &resp, &resplen, NULL, NULL);
+
+		if (err) {
+			err = -ENODEV;
+			goto done;
+		}
+
+		if (resp < 0) {
+			err = resp;
+			goto done;
+		}
+	}
+
+done:
+	vfree(mb_req);
+	return err;
+}
+
+static int __icap_unlock_peer(struct platform_device *pdev, const uuid_t *id)
+{
+	int err = 0;
+	struct icap *icap = platform_get_drvdata(pdev);
+	struct mailbox_req_bitstream_lock bitstream_lock = {0};
+	size_t data_len = sizeof(struct mailbox_req_bitstream_lock);
+	struct mailbox_req *mb_req = NULL;
+	size_t reqlen = sizeof(struct mailbox_req) + data_len;
+
+	/*
+	 * If there are no local users left, ask mgmt to unlock the bitstream
+	 * on our behalf.
+	 */
+	if (icap->icap_bitstream_ref == 0) {
+		mb_req = vmalloc(reqlen);
+		if (!mb_req) {
+			err = -ENOMEM;
+			goto done;
+		}
+
+		mb_req->req = MAILBOX_REQ_UNLOCK_BITSTREAM;
+		memcpy(mb_req->data, &bitstream_lock, data_len);
+
+		err = xocl_peer_notify(XOCL_PL_DEV_TO_XDEV(pdev), mb_req, reqlen);
+		if (err) {
+			err = -ENODEV;
+			goto done;
+		}
+	}
+done:
+	vfree(mb_req);
+	return err;
+}
+
+
+static int icap_download_bitstream_axlf(struct platform_device *pdev,
+	const void *u_xclbin)
+{
+	/*
+	 * Decouple into: 1. download xclbin, 2. parse xclbin, 3. verify xclbin.
+	 */
+	struct icap *icap = platform_get_drvdata(pdev);
+	long err = 0;
+	uint64_t primaryFirmwareOffset = 0;
+	uint64_t primaryFirmwareLength = 0;
+	uint64_t secondaryFirmwareOffset = 0;
+	uint64_t secondaryFirmwareLength = 0;
+	const struct axlf_section_header *primaryHeader = NULL;
+	const struct axlf_section_header *clockHeader = NULL;
+	const struct axlf_section_header *secondaryHeader = NULL;
+	struct axlf *xclbin = (struct axlf *)u_xclbin;
+	char *buffer;
+	xdev_handle_t xdev = xocl_get_xdev(pdev);
+	bool need_download;
+	int msg = -ETIMEDOUT;
+	size_t resplen = sizeof(msg);
+	int pid = pid_nr(task_tgid(current));
+	uint32_t data_len = 0;
+	int peer_connected;
+	struct mailbox_req *mb_req = NULL;
+	struct mailbox_bitstream_kaddr mb_addr = {0};
+	uuid_t peer_uuid;
+
+	if (memcmp(xclbin->m_magic, ICAP_XCLBIN_V2, sizeof(ICAP_XCLBIN_V2)))
+		return -EINVAL;
+
+	if (ICAP_PRIVILEGED(icap)) {
+		if (xocl_xrt_version_check(xdev, xclbin, true)) {
+			ICAP_ERR(icap, "XRT version does not match");
+			return -EINVAL;
+		}
+
+		/* Match the xclbin with the hardware. */
+		if (!xocl_verify_timestamp(xdev,
+			xclbin->m_header.m_featureRomTimeStamp)) {
+			ICAP_ERR(icap, "timestamp of ROM not match Xclbin");
+			xocl_sysfs_error(xdev, "timestamp of ROM not match Xclbin");
+			return -EINVAL;
+		}
+
+		mutex_lock(&icap->icap_lock);
+
+		ICAP_INFO(icap,
+			"incoming xclbin: %016llx, on device xclbin: %016llx",
+			xclbin->m_uniqueId, icap->icap_bitstream_id);
+
+		need_download = (icap->icap_bitstream_id != xclbin->m_uniqueId);
+
+		if (!need_download) {
+			/*
+			 * No need to download if the xclbin already exists,
+			 * but the CUs still need to be reset.
+			 */
+			if (!icap_bitstream_in_use(icap, 0)) {
+				icap_freeze_axi_gate(icap);
+				msleep(50);
+				icap_free_axi_gate(icap);
+				msleep(50);
+			}
+			ICAP_INFO(icap, "bitstream already exists, skip downloading");
+		}
+
+		mutex_unlock(&icap->icap_lock);
+
+		if (!need_download)
+			return 0;
+
+		/*
+		 * Find sections in xclbin.
+		 */
+		ICAP_INFO(icap, "finding CLOCK_FREQ_TOPOLOGY section");
+		/* Read the CLOCK section but defer changing clocks to later */
+		clockHeader = get_axlf_section_hdr(icap, xclbin,
+			CLOCK_FREQ_TOPOLOGY);
+
+		ICAP_INFO(icap, "finding bitstream sections");
+		primaryHeader = get_axlf_section_hdr(icap, xclbin, BITSTREAM);
+		if (primaryHeader == NULL) {
+			err = -EINVAL;
+			goto done;
+		}
+		primaryFirmwareOffset = primaryHeader->m_sectionOffset;
+		primaryFirmwareLength = primaryHeader->m_sectionSize;
+
+		secondaryHeader = get_axlf_section_hdr(icap, xclbin,
+			CLEARING_BITSTREAM);
+		if (secondaryHeader) {
+			if (XOCL_PL_TO_PCI_DEV(pdev)->device == 0x7138) {
+				err = -EINVAL;
+				goto done;
+			} else {
+				secondaryFirmwareOffset =
+					secondaryHeader->m_sectionOffset;
+				secondaryFirmwareLength =
+					secondaryHeader->m_sectionSize;
+			}
+		}
+
+		mutex_lock(&icap->icap_lock);
+
+		if (icap_bitstream_in_use(icap, 0)) {
+			ICAP_ERR(icap, "bitstream is locked, can't download new one");
+			err = -EBUSY;
+			goto done;
+		}
+
+		/* All clear, go ahead and start fiddling with hardware */
+
+		if (clockHeader != NULL) {
+			uint64_t clockFirmwareOffset = clockHeader->m_sectionOffset;
+			uint64_t clockFirmwareLength = clockHeader->m_sectionSize;
+
+			buffer = (char *)xclbin;
+			buffer += clockFirmwareOffset;
+			err = axlf_set_freqscaling(icap, pdev, buffer, clockFirmwareLength);
+			if (err)
+				goto done;
+			err = icap_setup_clock_freq_topology(icap, buffer, clockFirmwareLength);
+			if (err)
+				goto done;
+		}
+
+		icap->icap_bitstream_id = 0;
+		uuid_copy(&icap->icap_bitstream_uuid, &uuid_null);
+
+		buffer = (char *)xclbin;
+		buffer += primaryFirmwareOffset;
+		err = icap_download_user(icap, buffer, primaryFirmwareLength);
+		if (err)
+			goto done;
+
+		buffer = (char *)u_xclbin;
+		buffer += secondaryFirmwareOffset;
+		err = icap_setup_clear_bitstream(icap, buffer, secondaryFirmwareLength);
+		if (err)
+			goto done;
+
+		if ((xocl_is_unified(xdev) || XOCL_DSA_XPR_ON(xdev)))
+			err = calibrate_mig(icap);
+		if (err)
+			goto done;
+
+		/* Remember "this" bitstream to avoid re-downloading it next time. */
+		icap->icap_bitstream_id = xclbin->m_uniqueId;
+		if (!uuid_is_null(&xclbin->m_header.uuid)) {
+			uuid_copy(&icap->icap_bitstream_uuid, &xclbin->m_header.uuid);
+		} else {
+			/* Legacy xclbin, convert legacy id to new id. */
+			memcpy(&icap->icap_bitstream_uuid,
+				&xclbin->m_header.m_timeStamp, 8);
+		}
+	} else {
+
+		mutex_lock(&icap->icap_lock);
+
+		if (icap_bitstream_in_use(icap, pid)) {
+			if (!uuid_equal(&xclbin->m_header.uuid, &icap->icap_bitstream_uuid)) {
+				err = -EBUSY;
+				goto done;
+			}
+		}
+
+		icap_read_from_peer(pdev, XCLBIN_UUID, &peer_uuid, sizeof(uuid_t));
+
+		if (!uuid_equal(&peer_uuid, &xclbin->m_header.uuid)) {
+			/*
+			 *  should replace with userpf download flow
+			 */
+			peer_connected = xocl_mailbox_get_data(xdev, PEER_CONN);
+			ICAP_INFO(icap, "%s peer_connected 0x%x", __func__,
+				peer_connected);
+			if (peer_connected < 0) {
+				err = -ENODEV;
+				goto done;
+			}
+
+			if (!(peer_connected & MB_PEER_CONNECTED)) {
+				ICAP_ERR(icap, "%s fail to find peer, abort!",
+					__func__);
+				err = -EFAULT;
+				goto done;
+			}
+
+			if ((peer_connected & 0xF) == MB_PEER_SAMEDOM_CONNECTED) {
+				data_len = sizeof(struct mailbox_req) + sizeof(struct mailbox_bitstream_kaddr);
+				mb_req = vmalloc(data_len);
+				if (!mb_req) {
+					ICAP_ERR(icap, "Unable to create mb_req\n");
+					err = -ENOMEM;
+					goto done;
+				}
+				mb_req->req = MAILBOX_REQ_LOAD_XCLBIN_KADDR;
+				mb_addr.addr = (uint64_t)xclbin;
+				memcpy(mb_req->data, &mb_addr, sizeof(struct mailbox_bitstream_kaddr));
+
+			} else if ((peer_connected & 0xF) == MB_PEER_CONNECTED) {
+				data_len = sizeof(struct mailbox_req) +
+					xclbin->m_header.m_length;
+				mb_req = vmalloc(data_len);
+				if (!mb_req) {
+					ICAP_ERR(icap, "Unable to create mb_req\n");
+					err = -ENOMEM;
+					goto done;
+				}
+				memcpy(mb_req->data, u_xclbin, xclbin->m_header.m_length);
+				mb_req->req = MAILBOX_REQ_LOAD_XCLBIN;
+			}
+
+			mb_req->data_total_len = data_len;
+			(void) xocl_peer_request(xdev,
+				mb_req, data_len, &msg, &resplen, NULL, NULL);
+
+			if (msg != 0) {
+				ICAP_ERR(icap,
+					"%s peer failed to download xclbin",
+					__func__);
+				err = -EFAULT;
+				goto done;
+			}
+		} else
+			ICAP_INFO(icap, "Already downloaded xclbin ID: %016llx",
+				xclbin->m_uniqueId);
+
+		icap->icap_bitstream_id = xclbin->m_uniqueId;
+		if (!uuid_is_null(&xclbin->m_header.uuid)) {
+			uuid_copy(&icap->icap_bitstream_uuid, &xclbin->m_header.uuid);
+		} else {
+			/* Legacy xclbin, convert legacy id to new id. */
+			memcpy(&icap->icap_bitstream_uuid,
+				&xclbin->m_header.m_timeStamp, 8);
+		}
+
+	}
+
+	if (ICAP_PRIVILEGED(icap)) {
+		icap_parse_bitstream_axlf_section(pdev, xclbin, MEM_TOPOLOGY);
+		icap_parse_bitstream_axlf_section(pdev, xclbin, IP_LAYOUT);
+	} else {
+		icap_parse_bitstream_axlf_section(pdev, xclbin, IP_LAYOUT);
+		icap_parse_bitstream_axlf_section(pdev, xclbin, MEM_TOPOLOGY);
+		icap_parse_bitstream_axlf_section(pdev, xclbin, CONNECTIVITY);
+		icap_parse_bitstream_axlf_section(pdev, xclbin, DEBUG_IP_LAYOUT);
+	}
+
+	if (ICAP_PRIVILEGED(icap))
+		err = icap_verify_bitstream_axlf(pdev, xclbin);
+
+done:
+	mutex_unlock(&icap->icap_lock);
+	vfree(mb_req);
+	ICAP_INFO(icap, "%s err: %ld", __func__, err);
+	return err;
+}
+
+static int icap_verify_bitstream_axlf(struct platform_device *pdev,
+	struct axlf *xclbin)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	int err = 0, i;
+	xdev_handle_t xdev = xocl_get_xdev(pdev);
+	bool dna_check = false;
+	uint64_t section_size = 0;
+
+	/* Destroy all dynamically added sub-devices. */
+	xocl_subdev_destroy_by_id(xdev, XOCL_SUBDEV_DNA);
+	xocl_subdev_destroy_by_id(xdev, XOCL_SUBDEV_MIG);
+
+	/*
+	 * Add sub-devices dynamically.
+	 * Each dynamically added sub-device is restricted to one base address
+	 * and has a pre-defined length.
+	 *  Ex:    "ip_data": {
+	 *         "m_type": "IP_DNASC",
+	 *         "properties": "0x0",
+	 *         "m_base_address": "0x1100000", <--  base address
+	 *         "m_name": "slr0\/dna_self_check_0"
+	 *         }
+	 */
+
+	if (!icap->ip_layout) {
+		err = -EFAULT;
+		goto done;
+	}
+	for (i = 0; i < icap->ip_layout->m_count; ++i) {
+		struct xocl_subdev_info subdev_info = { 0 };
+		struct resource res = { 0 };
+		struct ip_data *ip = &icap->ip_layout->m_ip_data[i];
+
+		if (ip->m_type == IP_KERNEL)
+			continue;
+
+		if (ip->m_type == IP_DDR4_CONTROLLER) {
+			uint32_t memidx = ip->properties;
+
+			if (!icap->mem_topo || ip->properties >= icap->mem_topo->m_count ||
+				icap->mem_topo->m_mem_data[memidx].m_type !=
+				MEM_DDR4) {
+				ICAP_ERR(icap, "bad ECC controller index: %u",
+					ip->properties);
+				continue;
+			}
+			if (!icap->mem_topo->m_mem_data[memidx].m_used) {
+				ICAP_INFO(icap,
+					"ignore ECC controller for: %s",
+					icap->mem_topo->m_mem_data[memidx].m_tag);
+				continue;
+			}
+			err = xocl_subdev_get_devinfo(XOCL_SUBDEV_MIG,
+				&subdev_info, &res);
+			if (err) {
+				ICAP_ERR(icap, "can't get MIG subdev info");
+				goto done;
+			}
+			res.start += ip->m_base_address;
+			res.end += ip->m_base_address;
+			subdev_info.priv_data =
+				icap->mem_topo->m_mem_data[memidx].m_tag;
+			subdev_info.data_len =
+				sizeof(icap->mem_topo->m_mem_data[memidx].m_tag);
+			err = xocl_subdev_create_multi_inst(xdev, &subdev_info);
+			if (err) {
+				ICAP_ERR(icap, "can't create MIG subdev");
+				goto done;
+			}
+		}
+		if (ip->m_type == IP_DNASC) {
+			dna_check = true;
+			err = xocl_subdev_get_devinfo(XOCL_SUBDEV_DNA,
+				&subdev_info, &res);
+			if (err) {
+				ICAP_ERR(icap, "can't get DNA subdev info");
+				goto done;
+			}
+			res.start += ip->m_base_address;
+			res.end += ip->m_base_address;
+			err = xocl_subdev_create_one(xdev, &subdev_info);
+			if (err) {
+				ICAP_ERR(icap, "can't create DNA subdev");
+				goto done;
+			}
+		}
+	}
+
+	if (dna_check) {
+		bool is_axi = ((xocl_dna_capability(xdev) & 0x1) != 0);
+
+		/*
+		 * Any error that occurs here should return -EACCES so the app
+		 * knows that DNA validation has failed.
+		 */
+		err = -EACCES;
+
+		ICAP_INFO(icap, "DNA version: %s", is_axi ? "AXI" : "BRAM");
+
+		if (is_axi) {
+			uint32_t *cert = NULL;
+
+			if (alloc_and_get_axlf_section(icap, xclbin,
+				DNA_CERTIFICATE,
+				(void **)&cert, &section_size) != 0) {
+
+				/* Keep the DNA sub-device if IP_DNASC is present. */
+				ICAP_ERR(icap, "Can't get certificate section");
+				goto dna_cert_fail;
+			}
+
+			ICAP_INFO(icap, "DNA Certificate Size 0x%llx", section_size);
+			if (section_size % 64 || section_size < 576)
+				ICAP_ERR(icap, "Invalid certificate size");
+			else
+				xocl_dna_write_cert(xdev, cert, section_size);
+			vfree(cert);
+		}
+
+		/* Check DNA validation result. */
+		if (0x1 & xocl_dna_status(xdev)) {
+			err = 0; /* xclbin is valid */
+		} else {
+			ICAP_ERR(icap, "DNA inside xclbin is invalid");
+			goto dna_cert_fail;
+		}
+	}
+
+done:
+	if (err) {
+		vfree(icap->connectivity);
+		icap->connectivity = NULL;
+		vfree(icap->ip_layout);
+		icap->ip_layout = NULL;
+		vfree(icap->mem_topo);
+		icap->mem_topo = NULL;
+		xocl_subdev_destroy_by_id(xdev, XOCL_SUBDEV_DNA);
+		xocl_subdev_destroy_by_id(xdev, XOCL_SUBDEV_MIG);
+	}
+dna_cert_fail:
+	return err;
+}
+
+/*
+ * On x86_64, reset hwicap by loading a special bitstream sequence which
+ * forces the FPGA to reload from PROM.
+ */
+static int icap_reset_bitstream(struct platform_device *pdev)
+{
+/*
+ * Booting FPGA from PROM
+ * http://www.xilinx.com/support/documentation/user_guides/ug470_7Series_Config.pdf
+ * Table 7.1
+ */
+#define DUMMY_WORD         0xFFFFFFFF
+#define SYNC_WORD          0xAA995566
+#define TYPE1_NOOP         0x20000000
+#define TYPE1_WRITE_WBSTAR 0x30020001
+#define WBSTAR_ADD10       0x00000000
+#define WBSTAR_ADD11       0x01000000
+#define TYPE1_WRITE_CMD    0x30008001
+#define IPROG_CMD          0x0000000F
+#define SWAP_ENDIAN_32(x)						\
+	(unsigned int)((((x) & 0xFF000000) >> 24) | (((x) & 0x00FF0000) >> 8) | \
+		       (((x) & 0x0000FF00) << 8)  | (((x) & 0x000000FF) << 24))
+	/*
+	 * The bitstream is expected in big endian format
+	 */
+	const unsigned int fpga_boot_seq[] = {SWAP_ENDIAN_32(DUMMY_WORD),
+			SWAP_ENDIAN_32(SYNC_WORD),
+			SWAP_ENDIAN_32(TYPE1_NOOP),
+			SWAP_ENDIAN_32(TYPE1_WRITE_CMD),
+			SWAP_ENDIAN_32(IPROG_CMD),
+			SWAP_ENDIAN_32(TYPE1_NOOP),
+			SWAP_ENDIAN_32(TYPE1_NOOP)};
+
+	struct icap *icap = platform_get_drvdata(pdev);
+	int i;
+
+	/* Can only be done from mgmt pf. */
+	if (!ICAP_PRIVILEGED(icap))
+		return -EPERM;
+
+	mutex_lock(&icap->icap_lock);
+
+	if (icap_bitstream_in_use(icap, 0)) {
+		mutex_unlock(&icap->icap_lock);
+		ICAP_ERR(icap, "bitstream is locked, can't reset");
+		return -EBUSY;
+	}
+
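+	/*
+	 * Note: on a little-endian host the SWAP_ENDIAN_32() applied to the
+	 * boot words above and the be32_to_cpu() below cancel out, so the
+	 * words are pushed into the write FIFO in their natural order.
+	 */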
+	for (i = 0; i < ARRAY_SIZE(fpga_boot_seq); i++) {
+		unsigned int value = be32_to_cpu(fpga_boot_seq[i]);
+
+		reg_wr(&icap->icap_regs->ir_wfv, value);
+	}
+	reg_wr(&icap->icap_regs->ir_cr, 0x1);
+
+	msleep(4000);
+
+	mutex_unlock(&icap->icap_lock);
+
+	ICAP_INFO(icap, "reset bitstream is done");
+	return 0;
+}
+
+static int icap_lock_bitstream(struct platform_device *pdev, const uuid_t *id,
+	pid_t pid)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	int err = 0;
+
+	if (uuid_is_null(id)) {
+		ICAP_ERR(icap, "proc %d invalid UUID", pid);
+		return -EINVAL;
+	}
+
+	mutex_lock(&icap->icap_lock);
+
+	if (!ICAP_PRIVILEGED(icap)) {
+		err = __icap_lock_peer(pdev, id);
+		if (err < 0)
+			goto done;
+	}
+
+	if (uuid_equal(id, &icap->icap_bitstream_uuid))
+		err = add_user(icap, pid);
+	else
+		err = -EBUSY;
+
+	if (err >= 0)
+		err = icap->icap_bitstream_ref;
+
+	ICAP_INFO(icap, "proc %d try to lock bitstream %pUb, ref=%d, err=%d",
+		  pid, id, icap->icap_bitstream_ref, err);
+done:
+	mutex_unlock(&icap->icap_lock);
+
+	if (!ICAP_PRIVILEGED(icap) && err == 1) /* reset on first reference */
+		xocl_exec_reset(xocl_get_xdev(pdev));
+
+	if (err >= 0)
+		err = 0;
+
+	return err;
+}
+
+static int icap_unlock_bitstream(struct platform_device *pdev, const uuid_t *id,
+	pid_t pid)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	int err = 0;
+
+	if (id == NULL)
+		id = &uuid_null;
+
+	mutex_lock(&icap->icap_lock);
+
+	/* Force unlock. */
+	if (uuid_is_null(id))
+		del_all_users(icap);
+	else if (uuid_equal(id, &icap->icap_bitstream_uuid))
+		err = del_user(icap, pid);
+	else
+		err = -EINVAL;
+
+	if (!ICAP_PRIVILEGED(icap))
+		__icap_unlock_peer(pdev, id);
+
+	if (err >= 0)
+		err = icap->icap_bitstream_ref;
+
+	if (!ICAP_PRIVILEGED(icap)) {
+		if (err == 0)
+			xocl_exec_stop(xocl_get_xdev(pdev));
+	}
+
+	ICAP_INFO(icap, "proc %d try to unlock bitstream %pUb, ref=%d, err=%d",
+		  pid, id, icap->icap_bitstream_ref, err);
+
+	mutex_unlock(&icap->icap_lock);
+	if (err >= 0)
+		err = 0;
+	return err;
+}
+
+static int icap_parse_bitstream_axlf_section(struct platform_device *pdev,
+	const struct axlf *xclbin, enum axlf_section_kind kind)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	long err = 0;
+	uint64_t section_size = 0, sect_sz = 0;
+	void **target = NULL;
+
+	if (memcmp(xclbin->m_magic, ICAP_XCLBIN_V2, sizeof(ICAP_XCLBIN_V2)))
+		return -EINVAL;
+
+	switch (kind) {
+	case IP_LAYOUT:
+		target = (void **)&icap->ip_layout;
+		break;
+	case MEM_TOPOLOGY:
+		target = (void **)&icap->mem_topo;
+		break;
+	case DEBUG_IP_LAYOUT:
+		target = (void **)&icap->debug_layout;
+		break;
+	case CONNECTIVITY:
+		target = (void **)&icap->connectivity;
+		break;
+	default:
+		break;
+	}
+	if (target) {
+		vfree(*target);
+		*target = NULL;
+	}
+	err = alloc_and_get_axlf_section(icap, xclbin, kind,
+		target, &section_size);
+	if (err != 0)
+		goto done;
+	sect_sz = icap_get_section_size(icap, kind);
+	if (sect_sz > section_size) {
+		err = -EINVAL;
+		goto done;
+	}
+done:
+	if (err) {
+		vfree(*target);
+		*target = NULL;
+	}
+	ICAP_INFO(icap, "%s kind %d, err: %ld", __func__, kind, err);
+	return err;
+}
+
+static uint64_t icap_get_data(struct platform_device *pdev,
+	enum data_kind kind)
+{
+
+	struct icap *icap = platform_get_drvdata(pdev);
+	uint64_t target = 0;
+
+	mutex_lock(&icap->icap_lock);
+	switch (kind) {
+	case IPLAYOUT_AXLF:
+		target = (uint64_t)icap->ip_layout;
+		break;
+	case MEMTOPO_AXLF:
+		target = (uint64_t)icap->mem_topo;
+		break;
+	case DEBUG_IPLAYOUT_AXLF:
+		target = (uint64_t)icap->debug_layout;
+		break;
+	case CONNECTIVITY_AXLF:
+		target = (uint64_t)icap->connectivity;
+		break;
+	case IDCODE:
+		target = icap->idcode;
+		break;
+	case XCLBIN_UUID:
+		target = (uint64_t)&icap->icap_bitstream_uuid;
+		break;
+	default:
+		break;
+	}
+	mutex_unlock(&icap->icap_lock);
+	return target;
+}
+
+/* Kernel APIs exported from this sub-device driver. */
+static struct xocl_icap_funcs icap_ops = {
+	.reset_axi_gate = platform_reset_axi_gate,
+	.reset_bitstream = icap_reset_bitstream,
+	.download_boot_firmware = icap_download_boot_firmware,
+	.download_bitstream_axlf = icap_download_bitstream_axlf,
+	.ocl_set_freq = icap_ocl_set_freqscaling,
+	.ocl_get_freq = icap_ocl_get_freqscaling,
+	.ocl_update_clock_freq_topology = icap_ocl_update_clock_freq_topology,
+	.ocl_lock_bitstream = icap_lock_bitstream,
+	.ocl_unlock_bitstream = icap_unlock_bitstream,
+	.get_data = icap_get_data,
+};
+
+static ssize_t clock_freq_topology_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct icap *icap = platform_get_drvdata(to_platform_device(dev));
+	ssize_t cnt = 0;
+
+	mutex_lock(&icap->icap_lock);
+	if (ICAP_PRIVILEGED(icap)) {
+		memcpy(buf, icap->icap_clock_freq_topology, icap->icap_clock_freq_topology_length);
+		cnt = icap->icap_clock_freq_topology_length;
+	}
+	mutex_unlock(&icap->icap_lock);
+
+	return cnt;
+
+}
+
+static DEVICE_ATTR_RO(clock_freq_topology);
+
+static ssize_t clock_freqs_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct icap *icap = platform_get_drvdata(to_platform_device(dev));
+	ssize_t cnt = 0;
+	int i;
+	u32 freq_counter, freq, request_in_khz, tolerance;
+
+	mutex_lock(&icap->icap_lock);
+
+	for (i = 0; i < ICAP_MAX_NUM_CLOCKS; i++) {
+		freq = icap_get_ocl_frequency(icap, i);
+		if (!uuid_is_null(&icap->icap_bitstream_uuid)) {
+			freq_counter = icap_get_clock_frequency_counter_khz(icap, i);
+
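+			/* freq is in MHz, freq_counter in kHz; allow 5% tolerance. */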
+			request_in_khz = freq * 1000;
+			tolerance = freq * 50;
+
+			if (abs(freq_counter-request_in_khz) > tolerance)
+				ICAP_INFO(icap, "Frequency mismatch, Should be %u khz, Now is %ukhz",
+					  request_in_khz, freq_counter);
+			cnt += sprintf(buf + cnt, "%d\n", DIV_ROUND_CLOSEST(freq_counter, 1000));
+		} else {
+			cnt += sprintf(buf + cnt, "%d\n", freq);
+		}
+	}
+
+	mutex_unlock(&icap->icap_lock);
+
+	return cnt;
+}
+static DEVICE_ATTR_RO(clock_freqs);
+
+static ssize_t icap_rl_program(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t off, size_t count)
+{
+	struct XHwIcap_Bit_Header bit_header = { 0 };
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct icap *icap = platform_get_drvdata(to_platform_device(dev));
+	ssize_t ret = count;
+
+	if (off == 0) {
+		if (count < DMA_HWICAP_BITFILE_BUFFER_SIZE) {
+			ICAP_ERR(icap, "count is too small %ld", count);
+			return -EINVAL;
+		}
+
+		if (bitstream_parse_header(icap, buffer,
+			DMA_HWICAP_BITFILE_BUFFER_SIZE, &bit_header)) {
+			ICAP_ERR(icap, "parse header failed");
+			return -EINVAL;
+		}
+
+		icap->bit_length = bit_header.HeaderLength +
+			bit_header.BitstreamLength;
+		icap->bit_buffer = vmalloc(icap->bit_length);
+		if (!icap->bit_buffer)
+			return -ENOMEM;
+	}
+
+	if (off + count >= icap->bit_length) {
+		/*
+		 * assumes all subdevices are removed at this time
+		 */
+		memcpy(icap->bit_buffer + off, buffer, icap->bit_length - off);
+		icap_freeze_axi_gate_shell(icap);
+		ret = icap_download(icap, icap->bit_buffer, icap->bit_length);
+		if (ret) {
+			ICAP_ERR(icap, "bitstream download failed");
+			ret = -EIO;
+		} else {
+			ret = count;
+		}
+		icap_free_axi_gate_shell(icap);
+		/* Have to reset PCIe, otherwise the firewall trips. */
+		xocl_reset(xocl_get_xdev(icap->icap_pdev));
+		icap->icap_bitstream_id = 0;
+		memset(&icap->icap_bitstream_uuid, 0, sizeof(uuid_t));
+		vfree(icap->bit_buffer);
+		icap->bit_buffer = NULL;
+	} else {
+		memcpy(icap->bit_buffer + off, buffer, count);
+	}
+
+	return ret;
+}
+
+static struct bin_attribute shell_program_attr = {
+	.attr = {
+		.name = "shell_program",
+		.mode = 0200
+	},
+	.read = NULL,
+	.write = icap_rl_program,
+	.size = 0
+};
+
+static struct bin_attribute *icap_mgmt_bin_attrs[] = {
+	&shell_program_attr,
+	NULL,
+};
+
+static struct attribute_group icap_mgmt_bin_attr_group = {
+	.bin_attrs = icap_mgmt_bin_attrs,
+};
+
+static ssize_t idcode_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct icap *icap = platform_get_drvdata(to_platform_device(dev));
+	ssize_t cnt = 0;
+	uint32_t val;
+
+	mutex_lock(&icap->icap_lock);
+	if (ICAP_PRIVILEGED(icap)) {
+		cnt = sprintf(buf, "0x%x\n", icap->idcode);
+	} else {
+		icap_read_from_peer(to_platform_device(dev), IDCODE, &val, sizeof(unsigned int));
+		cnt = sprintf(buf, "0x%x\n", val);
+	}
+	mutex_unlock(&icap->icap_lock);
+
+	return cnt;
+}
+
+static DEVICE_ATTR_RO(idcode);
+
+static struct attribute *icap_attrs[] = {
+	&dev_attr_clock_freq_topology.attr,
+	&dev_attr_clock_freqs.attr,
+	&dev_attr_idcode.attr,
+	NULL,
+};
+
+/* Debug IP layout */
+static ssize_t icap_read_debug_ip_layout(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t offset, size_t count)
+{
+	struct icap *icap;
+	u32 nread = 0;
+	size_t size = 0;
+
+	icap = (struct icap *)dev_get_drvdata(container_of(kobj, struct device, kobj));
+
+	if (!icap || !icap->debug_layout)
+		return 0;
+
+	mutex_lock(&icap->icap_lock);
+
+	size = sizeof_sect(icap->debug_layout, m_debug_ip_data);
+	if (offset >= size)
+		goto unlock;
+
+	if (count < size - offset)
+		nread = count;
+	else
+		nread = size - offset;
+
+	memcpy(buffer, ((char *)icap->debug_layout) + offset, nread);
+
+unlock:
+	mutex_unlock(&icap->icap_lock);
+	return nread;
+}
+static struct bin_attribute debug_ip_layout_attr = {
+	.attr = {
+		.name = "debug_ip_layout",
+		.mode = 0444
+	},
+	.read = icap_read_debug_ip_layout,
+	.write = NULL,
+	.size = 0
+};
+
+/* IP layout */
+static ssize_t icap_read_ip_layout(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t offset, size_t count)
+{
+	struct icap *icap;
+	u32 nread = 0;
+	size_t size = 0;
+
+	icap = (struct icap *)dev_get_drvdata(container_of(kobj, struct device, kobj));
+
+	if (!icap || !icap->ip_layout)
+		return 0;
+
+	mutex_lock(&icap->icap_lock);
+
+	size = sizeof_sect(icap->ip_layout, m_ip_data);
+	if (offset >= size)
+		goto unlock;
+
+	if (count < size - offset)
+		nread = count;
+	else
+		nread = size - offset;
+
+	memcpy(buffer, ((char *)icap->ip_layout) + offset, nread);
+
+unlock:
+	mutex_unlock(&icap->icap_lock);
+	return nread;
+}
+
+static struct bin_attribute ip_layout_attr = {
+	.attr = {
+		.name = "ip_layout",
+		.mode = 0444
+	},
+	.read = icap_read_ip_layout,
+	.write = NULL,
+	.size = 0
+};
+
+/* Connectivity */
+static ssize_t icap_read_connectivity(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t offset, size_t count)
+{
+	struct icap *icap;
+	u32 nread = 0;
+	size_t size = 0;
+
+	icap = (struct icap *)dev_get_drvdata(container_of(kobj, struct device, kobj));
+
+	if (!icap || !icap->connectivity)
+		return 0;
+
+	mutex_lock(&icap->icap_lock);
+
+	size = sizeof_sect(icap->connectivity, m_connection);
+	if (offset >= size)
+		goto unlock;
+
+	if (count < size - offset)
+		nread = count;
+	else
+		nread = size - offset;
+
+	memcpy(buffer, ((char *)icap->connectivity) + offset, nread);
+
+unlock:
+	mutex_unlock(&icap->icap_lock);
+	return nread;
+}
+
+static struct bin_attribute connectivity_attr = {
+	.attr = {
+		.name = "connectivity",
+		.mode = 0444
+	},
+	.read = icap_read_connectivity,
+	.write = NULL,
+	.size = 0
+};
+
+
+/* Mem topology */
+static ssize_t icap_read_mem_topology(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t offset, size_t count)
+{
+	struct icap *icap;
+	u32 nread = 0;
+	size_t size = 0;
+
+	icap = (struct icap *)dev_get_drvdata(container_of(kobj, struct device, kobj));
+
+	if (!icap || !icap->mem_topo)
+		return 0;
+
+	mutex_lock(&icap->icap_lock);
+
+	size = sizeof_sect(icap->mem_topo, m_mem_data);
+	if (offset >= size)
+		goto unlock;
+
+	if (count < size - offset)
+		nread = count;
+	else
+		nread = size - offset;
+
+	memcpy(buffer, ((char *)icap->mem_topo) + offset, nread);
+unlock:
+	mutex_unlock(&icap->icap_lock);
+	return nread;
+}
+
+
+static struct bin_attribute mem_topology_attr = {
+	.attr = {
+		.name = "mem_topology",
+		.mode = 0444
+	},
+	.read = icap_read_mem_topology,
+	.write = NULL,
+	.size = 0
+};
+
+static struct bin_attribute *icap_bin_attrs[] = {
+	&debug_ip_layout_attr,
+	&ip_layout_attr,
+	&connectivity_attr,
+	&mem_topology_attr,
+	NULL,
+};
+
+static struct attribute_group icap_attr_group = {
+	.attrs = icap_attrs,
+	.bin_attrs = icap_bin_attrs,
+};
+
+static int icap_remove(struct platform_device *pdev)
+{
+	struct icap *icap = platform_get_drvdata(pdev);
+	int i;
+
+	BUG_ON(icap == NULL);
+
+	del_all_users(icap);
+	xocl_subdev_register(pdev, XOCL_SUBDEV_ICAP, NULL);
+
+	if (ICAP_PRIVILEGED(icap))
+		sysfs_remove_group(&pdev->dev.kobj, &icap_mgmt_bin_attr_group);
+
+	vfree(icap->bit_buffer);
+
+	iounmap(icap->icap_regs);
+	iounmap(icap->icap_state);
+	iounmap(icap->icap_axi_gate);
+	for (i = 0; i < ICAP_MAX_NUM_CLOCKS; i++)
+		iounmap(icap->icap_clock_bases[i]);
+	free_clear_bitstream(icap);
+	free_clock_freq_topology(icap);
+
+	sysfs_remove_group(&pdev->dev.kobj, &icap_attr_group);
+
+	ICAP_INFO(icap, "cleaned up successfully");
+	platform_set_drvdata(pdev, NULL);
+	vfree(icap->mem_topo);
+	vfree(icap->ip_layout);
+	vfree(icap->debug_layout);
+	vfree(icap->connectivity);
+	kfree(icap);
+	return 0;
+}
+
+/*
+ * Run the following sequence of canned commands to obtain IDCODE of the FPGA
+ */
+static void icap_probe_chip(struct icap *icap)
+{
+	u32 w;
+
+	if (!ICAP_PRIVILEGED(icap))
+		return;
+
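+	/*
+	 * The canned words below follow the 7-series configuration packet
+	 * format described in UG470 (also referenced for the reset sequence
+	 * above): dummy word 0xffffffff, sync word 0xaa995566, NOOPs
+	 * (0x20000000) and 0x28018001 - a type-1 read of one word from the
+	 * IDCODE register - after which the IDCODE is read back from the
+	 * read FIFO.
+	 */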
+	w = reg_rd(&icap->icap_regs->ir_sr);
+	w = reg_rd(&icap->icap_regs->ir_sr);
+	reg_wr(&icap->icap_regs->ir_gier, 0x0);
+	w = reg_rd(&icap->icap_regs->ir_wfv);
+	reg_wr(&icap->icap_regs->ir_wf, 0xffffffff);
+	reg_wr(&icap->icap_regs->ir_wf, 0xaa995566);
+	reg_wr(&icap->icap_regs->ir_wf, 0x20000000);
+	reg_wr(&icap->icap_regs->ir_wf, 0x20000000);
+	reg_wr(&icap->icap_regs->ir_wf, 0x28018001);
+	reg_wr(&icap->icap_regs->ir_wf, 0x20000000);
+	reg_wr(&icap->icap_regs->ir_wf, 0x20000000);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+	reg_wr(&icap->icap_regs->ir_cr, 0x1);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+	w = reg_rd(&icap->icap_regs->ir_sr);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+	w = reg_rd(&icap->icap_regs->ir_sr);
+	reg_wr(&icap->icap_regs->ir_sz, 0x1);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+	reg_wr(&icap->icap_regs->ir_cr, 0x2);
+	w = reg_rd(&icap->icap_regs->ir_rfo);
+	icap->idcode = reg_rd(&icap->icap_regs->ir_rf);
+	w = reg_rd(&icap->icap_regs->ir_cr);
+}
+
+static int icap_probe(struct platform_device *pdev)
+{
+	struct icap *icap = NULL;
+	struct resource *res;
+	int ret;
+	int reg_grp;
+	void **regs;
+
+	icap = kzalloc(sizeof(struct icap), GFP_KERNEL);
+	if (!icap)
+		return -ENOMEM;
+	platform_set_drvdata(pdev, icap);
+	icap->icap_pdev = pdev;
+	mutex_init(&icap->icap_lock);
+	INIT_LIST_HEAD(&icap->icap_bitstream_users);
+
+	for (reg_grp = 0; reg_grp < ICAP_MAX_REG_GROUPS; reg_grp++) {
+		switch (reg_grp) {
+		case 0:
+			regs = (void **)&icap->icap_regs;
+			break;
+		case 1:
+			regs = (void **)&icap->icap_state;
+			break;
+		case 2:
+			regs = (void **)&icap->icap_axi_gate;
+			break;
+		case 3:
+			regs = (void **)&icap->icap_clock_bases[0];
+			break;
+		case 4:
+			regs = (void **)&icap->icap_clock_bases[1];
+			break;
+		case 5:
+			regs = (void **)&icap->icap_clock_freq_counter;
+			break;
+		default:
+			BUG();
+			break;
+		}
+		res = platform_get_resource(pdev, IORESOURCE_MEM, reg_grp);
+		if (res != NULL) {
+			*regs = ioremap_nocache(res->start,
+				res->end - res->start + 1);
+			if (*regs == NULL) {
+				ICAP_ERR(icap,
+					"failed to map in register group: %d",
+					reg_grp);
+				ret = -EIO;
+				goto failed;
+			} else {
+				ICAP_INFO(icap,
+					"mapped in register group %d @ 0x%p",
+					reg_grp, *regs);
+			}
+		} else {
+			if (reg_grp != 0) {
+				ICAP_ERR(icap,
+					"failed to find register group: %d",
+					reg_grp);
+				ret = -EIO;
+				goto failed;
+			}
+			break;
+		}
+	}
+
+	ret = sysfs_create_group(&pdev->dev.kobj, &icap_attr_group);
+	if (ret) {
+		ICAP_ERR(icap, "create icap attrs failed: %d", ret);
+		goto failed;
+	}
+
+	if (ICAP_PRIVILEGED(icap)) {
+		ret = sysfs_create_group(&pdev->dev.kobj,
+			&icap_mgmt_bin_attr_group);
+		if (ret) {
+			ICAP_ERR(icap, "create icap attrs failed: %d", ret);
+			goto failed;
+		}
+	}
+
+	icap_probe_chip(icap);
+	if (!ICAP_PRIVILEGED(icap))
+		icap_unlock_bitstream(pdev, NULL, 0);
+	ICAP_INFO(icap, "successfully initialized FPGA IDCODE 0x%x",
+			icap->idcode);
+	xocl_subdev_register(pdev, XOCL_SUBDEV_ICAP, &icap_ops);
+	return 0;
+
+failed:
+	(void) icap_remove(pdev);
+	return ret;
+}
+
+
+struct platform_device_id icap_id_table[] = {
+	{ XOCL_ICAP, 0 },
+	{ },
+};
+
+static struct platform_driver icap_driver = {
+	.probe		= icap_probe,
+	.remove		= icap_remove,
+	.driver		= {
+		.name	= XOCL_ICAP,
+	},
+	.id_table = icap_id_table,
+};
+
+int __init xocl_init_icap(void)
+{
+	return platform_driver_register(&icap_driver);
+}
+
+void xocl_fini_icap(void)
+{
+	platform_driver_unregister(&icap_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/mailbox.c b/drivers/gpu/drm/xocl/subdev/mailbox.c
new file mode 100644
index 000000000000..dc4736c9100a
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/mailbox.c
@@ -0,0 +1,1868 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Max Zhen <maxz@xilinx.com>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Statement of Theory
+ *
+ * This is the mailbox sub-device driver added into the existing xclmgmt / xocl
+ * drivers so that user pf and mgmt pf can send and receive messages of
+ * arbitrary length to / from the peer. The driver is written based on the spec
+ * in the pg114 document (https://www.xilinx.com/support/documentation/
+ * ip_documentation/mailbox/v2_1/pg114-mailbox.pdf). The HW provides one TX
+ * channel and one RX channel, which operate completely independently of each
+ * other. Data can be pushed into or read from a channel in DWORD units as a
+ * FIFO.
+ *
+ *
+ * Packet layer
+ *
+ * The driver implements two transport layers - a packet layer and a message
+ * layer (see below). A packet is a fixed-size chunk of data that can be sent
+ * through the TX channel or retrieved from the RX channel. The TX and RX
+ * interrupts fire at packet boundaries, not DWORD boundaries. The driver will
+ * not attempt to send the next packet until the previous one has been read by
+ * the peer. Similarly, the driver will not attempt to read data from HW until
+ * a full packet has been written to HW by the peer. Data transfer is normally
+ * entirely interrupt driven (a timer-driven polling mode can be enabled with
+ * the mailbox_no_intr module parameter), so interrupts need to be working and
+ * enabled on both mgmt and user pf for the mailbox driver to function
+ * properly.
+ *
+ * A TX packet is considered timed out after sitting in the TX channel of the
+ * mailbox HW for two packet ticks (1 packet tick = 1 second, for now) without
+ * being read by the peer. Currently, the driver does not try to re-transmit
+ * the packet after a timeout; it simply propagates the error to the upper
+ * layer. A retry at the packet layer can be implemented later if appropriate.
+ *
+ *
+ * Message layer
+ *
+ * A message is a data buffer of arbitrary length. The driver breaks a message
+ * into multiple packets and transmits them to the peer, which, in turn,
+ * assembles them into a full message before delivering it to the upper layer
+ * for further processing. One message requires at least one packet to be
+ * transferred to the peer.
+ *
+ * Each message has a unique temporary u64 ID (see communication model below
+ * for more detail). The ID shows up in each packet's header. So, at the packet
+ * layer, there is no assumption that adjacent packets belong to the same
+ * message. However, for the sake of simplicity, at the message layer, the
+ * driver will not attempt to send the next message until the sending of the
+ * current one is finished. That is, we implement a FIFO for the message TX
+ * channel. All messages are sent by the driver in the order received from the
+ * upper layer. We can implement messages of different priority later, if
+ * needed. There is no guaranteed order for receiving messages; it's up to the
+ * peer side to decide which message gets enqueued into its own TX queue first,
+ * and that message will be received first on the other side.
+ *
+ * A message is considered timed out when its transmission (send or receive)
+ * does not finish within 10 packet ticks. This applies to all messages queued
+ * up on both RX and TX channels. Again, no retry for a timed-out message is
+ * implemented; the error is simply passed to the upper layer. Also, a TX
+ * message may time out earlier if one of its packets times out while it is
+ * being transmitted. During normal operation, a timeout should never happen.
+ *
+ * The upper layer can choose to queue a message for TX or RX asynchronously
+ * by providing a callback, or to wait synchronously when no callback is
+ * provided.
+ *
+ *
+ * Communication model
+ *
+ * At the highest layer, the driver implements a request-response communication
+ * model. A request may or may not require a response, but a response must
+ * match a request, or it will be silently dropped. The driver provides a few
+ * kernel APIs for mgmt and user pf to talk to each other in this model (see
+ * the kernel APIs section below for details). Each request or response is a
+ * message by itself. A request message is automatically assigned a message ID
+ * when it's enqueued into the TX channel for sending. If the request requires
+ * a response, the buffer provided by the caller for receiving the response is
+ * enqueued into the RX channel as well. The enqueued response message has the
+ * same message ID as the corresponding request message. The response message,
+ * if provided, is always enqueued before the request message to avoid a race
+ * condition.
+ *
+ * After initialization, the driver automatically enqueues a special message
+ * into the RX channel for receiving new requests. This request RX message has
+ * a special message ID (id=0) and never times out. When a new request comes
+ * from the peer, it is copied into the request RX message and then passed to
+ * the callback registered by the upper layer through the xocl_peer_listen()
+ * API for further processing. Currently, the driver implements only one kernel
+ * thread for the RX channel and one for the TX channel, so all message
+ * callbacks happen in the context of that channel thread. The user of the
+ * mailbox driver therefore needs to be careful when calling
+ * xocl_peer_request() synchronously in this context: a deadlock may occur if
+ * both ends try to call xocl_peer_request() synchronously at the same time.
+ *
+ *
+ * +------------------+            +------------------+
+ * | Request/Response | <--------> | Request/Response |
+ * +------------------+            +------------------+
+ * | Message          | <--------> | Message          |
+ * +------------------+            +------------------+
+ * | Packet           | <--------> | Packet           |
+ * +------------------+            +------------------+
+ * | RX/TX Channel    | <<======>> | RX/TX Channel    |
+ * +------------------+            +------------------+
+ *   mgmt pf                         user pf
+ */
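+
+/*
+ * Illustrative usage sketch (not part of this driver): how an upper layer
+ * exercises the request/response model described above, modeled on the
+ * bitstream lock flow in icap.c. The xocl_peer_request() arguments follow
+ * its use elsewhere in this series; the exact prototype of xocl_peer_listen()
+ * and the names payload, data_len, request_cb and cb_arg are placeholders.
+ *
+ *	// user pf: send a request and wait synchronously for the response.
+ *	size_t reqlen = sizeof(struct mailbox_req) + data_len;
+ *	struct mailbox_req *mb_req = vmalloc(reqlen);
+ *	int resp = 0;
+ *	size_t resplen = sizeof(resp);
+ *
+ *	mb_req->req = MAILBOX_REQ_LOCK_BITSTREAM;
+ *	memcpy(mb_req->data, &payload, data_len);
+ *	err = xocl_peer_request(xdev, mb_req, reqlen, &resp, &resplen,
+ *		NULL, NULL);	// pass a callback/arg instead of NULLs for async
+ *	vfree(mb_req);
+ *
+ *	// mgmt pf: register a callback for incoming requests. The callback
+ *	// runs in the RX channel thread, so it must not call
+ *	// xocl_peer_request() synchronously (see the deadlock note above).
+ *	xocl_peer_listen(xdev, request_cb, cb_arg);
+ */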
+
+#include <linux/mutex.h>
+#include <linux/completion.h>
+#include <linux/list.h>
+#include <linux/device.h>
+#include <linux/crc32c.h>
+#include <linux/random.h>
+#include "../xocl_drv.h"
+
+int mailbox_no_intr;
+module_param(mailbox_no_intr, int, (S_IRUGO|S_IWUSR));
+MODULE_PARM_DESC(mailbox_no_intr,
+	"Disable mailbox interrupt and do timer-driven msg passing");
+
+#define	PACKET_SIZE	16 /* Number of DWORD. */
+
+#define	FLAG_STI	(1 << 0)
+#define	FLAG_RTI	(1 << 1)
+
+#define	STATUS_EMPTY	(1 << 0)
+#define	STATUS_FULL	(1 << 1)
+#define	STATUS_STA	(1 << 2)
+#define	STATUS_RTA	(1 << 3)
+
+#define	MBX_ERR(mbx, fmt, arg...)	\
+	xocl_err(&mbx->mbx_pdev->dev, fmt "\n", ##arg)
+#define	MBX_INFO(mbx, fmt, arg...)	\
+	xocl_info(&mbx->mbx_pdev->dev, fmt "\n", ##arg)
+#define	MBX_DBG(mbx, fmt, arg...)	\
+	xocl_dbg(&mbx->mbx_pdev->dev, fmt "\n", ##arg)
+
+#define	MAILBOX_TIMER	HZ	/* in jiffies */
+#define	MSG_TTL		10	/* in MAILBOX_TIMER */
+#define	TEST_MSG_LEN	128
+
+#define	INVALID_MSG_ID	((u64)-1)
+#define	MSG_FLAG_RESPONSE	(1 << 0)
+#define	MSG_FLAG_REQUEST (1 << 1)
+
+#define MAX_MSG_QUEUE_SZ  (PAGE_SIZE << 16)
+#define MAX_MSG_QUEUE_LEN 5
+
+#define MB_CONN_INIT	(0x1<<0)
+#define MB_CONN_SYN	(0x1<<1)
+#define MB_CONN_ACK	(0x1<<2)
+#define MB_CONN_FIN	(0x1<<3)
+
+/*
+ * Mailbox IP register layout
+ */
+struct mailbox_reg {
+	u32			mbr_wrdata;
+	u32			mbr_resv1;
+	u32			mbr_rddata;
+	u32			mbr_resv2;
+	u32			mbr_status;
+	u32			mbr_error;
+	u32			mbr_sit;
+	u32			mbr_rit;
+	u32			mbr_is;
+	u32			mbr_ie;
+	u32			mbr_ip;
+	u32			mbr_ctrl;
+} __attribute__((packed));
+
+/*
+ * A message transported by the mailbox.
+ */
+struct mailbox_msg {
+	struct list_head	mbm_list;
+	struct mailbox_channel	*mbm_ch;
+	u64			mbm_req_id;
+	char			*mbm_data;
+	size_t			mbm_len;
+	int			mbm_error;
+	struct completion	mbm_complete;
+	mailbox_msg_cb_t	mbm_cb;
+	void			*mbm_cb_arg;
+	u32			mbm_flags;
+	int			mbm_ttl;
+	bool			mbm_timer_on;
+};
+
+/*
+ * A packet transported by the mailbox.
+ * When extending, only add new data structures to the body. Add a new flag
+ * if the new feature can be safely ignored by the peer; otherwise, add a
+ * new type.
+ */
+enum packet_type {
+	PKT_INVALID = 0,
+	PKT_TEST,
+	PKT_MSG_START,
+	PKT_MSG_BODY
+};
+
+
+enum conn_state {
+	CONN_START = 0,
+	CONN_SYN_SENT,
+	CONN_SYN_RECV,
+	CONN_ESTABLISH,
+};
+
+/* Lower 8 bits for type, the rest for flags. */
+#define	PKT_TYPE_MASK		0xff
+#define	PKT_TYPE_MSG_END	(1 << 31)
+struct mailbox_pkt {
+	struct {
+		u32		type;
+		u32		payload_size;
+	} hdr;
+	union {
+		u32		data[PACKET_SIZE - 2];
+		struct {
+			u64	msg_req_id;
+			u32	msg_flags;
+			u32	msg_size;
+			u32	payload[0];
+		} msg_start;
+		struct {
+			u32	payload[0];
+		} msg_body;
+	} body;
+} __attribute__((packed));
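+
+/*
+ * Worked example of packet accounting (derived from the layout above): each
+ * packet is PACKET_SIZE (16) DWORDs = 64 bytes, of which the 8-byte hdr is
+ * overhead. A msg_start packet therefore carries 64 - 8 - 16 = 40 payload
+ * bytes and each msg_body packet carries 64 - 8 = 56 payload bytes. A
+ * 128-byte message needs three packets: one msg_start (40 bytes) plus two
+ * msg_body packets (56 + 32 bytes), the last one flagged with
+ * PKT_TYPE_MSG_END.
+ */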
+
+/*
+ * Mailbox communication channel.
+ */
+#define MBXCS_BIT_READY		0
+#define MBXCS_BIT_STOP		1
+#define MBXCS_BIT_TICK		2
+#define MBXCS_BIT_CHK_STALL	3
+#define MBXCS_BIT_POLL_MODE	4
+
+struct mailbox_channel;
+typedef	void (*chan_func_t)(struct mailbox_channel *ch);
+struct mailbox_channel {
+	struct mailbox		*mbc_parent;
+	char			*mbc_name;
+
+	struct workqueue_struct	*mbc_wq;
+	struct work_struct	mbc_work;
+	struct completion	mbc_worker;
+	chan_func_t		mbc_tran;
+	unsigned long		mbc_state;
+
+	struct mutex		mbc_mutex;
+	struct list_head	mbc_msgs;
+
+	struct mailbox_msg	*mbc_cur_msg;
+	int			mbc_bytes_done;
+	struct mailbox_pkt	mbc_packet;
+
+	struct timer_list	mbc_timer;
+	bool			mbc_timer_on;
+};
+
+/*
+ * The mailbox softstate.
+ */
+struct mailbox {
+	struct platform_device	*mbx_pdev;
+	struct mailbox_reg	*mbx_regs;
+	u32			mbx_irq;
+
+	struct mailbox_channel	mbx_rx;
+	struct mailbox_channel	mbx_tx;
+
+	/* For listening to peer's request. */
+	mailbox_msg_cb_t	mbx_listen_cb;
+	void			*mbx_listen_cb_arg;
+	struct workqueue_struct	*mbx_listen_wq;
+	struct work_struct	mbx_listen_worker;
+
+	int			mbx_paired;
+	/*
+	 * For testing basic intr and mailbox comm functionality via sysfs.
+	 * No locking protection, use with care.
+	 */
+	struct mailbox_pkt	mbx_tst_pkt;
+	char			mbx_tst_tx_msg[TEST_MSG_LEN];
+	char			mbx_tst_rx_msg[TEST_MSG_LEN];
+	size_t			mbx_tst_tx_msg_len;
+
+	/* Req list for all incoming request message */
+	struct completion mbx_comp;
+	struct mutex mbx_lock;
+	struct list_head mbx_req_list;
+	uint8_t mbx_req_cnt;
+	size_t mbx_req_sz;
+
+	struct mutex mbx_conn_lock;
+	uint64_t mbx_conn_id;
+	enum conn_state mbx_state;
+	bool mbx_established;
+	uint32_t mbx_prot_ver;
+
+	void *mbx_kaddr;
+};
+
+static inline const char *reg2name(struct mailbox *mbx, u32 *reg)
+{
+	static const char *reg_names[] = {
+		"wrdata",
+		"reserved1",
+		"rddata",
+		"reserved2",
+		"status",
+		"error",
+		"sit",
+		"rit",
+		"is",
+		"ie",
+		"ip",
+		"ctrl"
+	};
+
+	return reg_names[((uintptr_t)reg -
+		(uintptr_t)mbx->mbx_regs) / sizeof(u32)];
+}
+
+struct mailbox_conn {
+	uint64_t flag;
+	void *kaddr;
+	phys_addr_t paddr;
+	uint32_t crc32;
+	uint32_t ver;
+	uint64_t sec_id;
+};
+
+int mailbox_request(struct platform_device *pdev, void *req, size_t reqlen,
+		    void *resp, size_t *resplen, mailbox_msg_cb_t cb, void *cbarg);
+int mailbox_post(struct platform_device *pdev, u64 reqid, void *buf, size_t len);
+static int mailbox_connect_status(struct platform_device *pdev);
+static void connect_state_handler(struct mailbox *mbx, struct mailbox_conn *conn);
+
+static void connect_state_touch(struct mailbox *mbx, uint64_t flag)
+{
+	struct mailbox_conn conn = {0};
+	if (!mbx)
+		return;
+	conn.flag = flag;
+	connect_state_handler(mbx, &conn);
+}
+
+
+static inline u32 mailbox_reg_rd(struct mailbox *mbx, u32 *reg)
+{
+	u32 val = ioread32(reg);
+
+#ifdef	MAILBOX_REG_DEBUG
+	MBX_DBG(mbx, "REG_RD(%s)=0x%x", reg2name(mbx, reg), val);
+#endif
+	return val;
+}
+
+static inline void mailbox_reg_wr(struct mailbox *mbx, u32 *reg, u32 val)
+{
+#ifdef	MAILBOX_REG_DEBUG
+	MBX_DBG(mbx, "REG_WR(%s, 0x%x)", reg2name(mbx, reg), val);
+#endif
+	iowrite32(val, reg);
+}
+
+static inline void reset_pkt(struct mailbox_pkt *pkt)
+{
+	pkt->hdr.type = PKT_INVALID;
+}
+
+static inline bool valid_pkt(struct mailbox_pkt *pkt)
+{
+	return (pkt->hdr.type != PKT_INVALID);
+}
+
+irqreturn_t mailbox_isr(int irq, void *arg)
+{
+	struct mailbox *mbx = (struct mailbox *)arg;
+	u32 is = mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_is);
+
+	while (is) {
+		MBX_DBG(mbx, "intr status: 0x%x", is);
+
+		if ((is & FLAG_STI) != 0) {
+			/* A packet has been sent successfully. */
+			complete(&mbx->mbx_tx.mbc_worker);
+		}
+		if ((is & FLAG_RTI) != 0) {
+			/* A packet is waiting to be received from mailbox. */
+			complete(&mbx->mbx_rx.mbc_worker);
+		}
+		/* Anything else is not expected. */
+		if ((is & (FLAG_STI | FLAG_RTI)) == 0) {
+			MBX_ERR(mbx, "spurious mailbox irq %d, is=0x%x",
+				irq, is);
+		}
+
+		/* Clear intr state for receiving next one. */
+		mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_is, is);
+
+		is = mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_is);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static void chan_timer(struct timer_list *t)
+{
+	struct mailbox_channel *ch = from_timer(ch, t, mbc_timer);
+
+	MBX_DBG(ch->mbc_parent, "%s tick", ch->mbc_name);
+
+	set_bit(MBXCS_BIT_TICK, &ch->mbc_state);
+	complete(&ch->mbc_worker);
+
+	/* We're a periodic timer. */
+	mod_timer(&ch->mbc_timer, jiffies + MAILBOX_TIMER);
+}
+
+static void chan_config_timer(struct mailbox_channel *ch)
+{
+	struct list_head *pos, *n;
+	struct mailbox_msg *msg = NULL;
+	bool on = false;
+
+	mutex_lock(&ch->mbc_mutex);
+
+	if (test_bit(MBXCS_BIT_POLL_MODE, &ch->mbc_state)) {
+		on = true;
+	} else {
+		list_for_each_safe(pos, n, &ch->mbc_msgs) {
+			msg = list_entry(pos, struct mailbox_msg, mbm_list);
+			if (msg->mbm_req_id == 0)
+			       continue;
+			on = true;
+			break;
+		}
+	}
+
+	if (on != ch->mbc_timer_on) {
+		ch->mbc_timer_on = on;
+		if (on)
+			mod_timer(&ch->mbc_timer, jiffies + MAILBOX_TIMER);
+		else
+			del_timer_sync(&ch->mbc_timer);
+	}
+
+	mutex_unlock(&ch->mbc_mutex);
+}
+
+static void free_msg(struct mailbox_msg *msg)
+{
+	vfree(msg);
+}
+
+static void msg_done(struct mailbox_msg *msg, int err)
+{
+	struct mailbox_channel *ch = msg->mbm_ch;
+	struct mailbox *mbx = ch->mbc_parent;
+
+	MBX_DBG(ch->mbc_parent, "%s finishing msg id=0x%llx err=%d",
+		ch->mbc_name, msg->mbm_req_id, err);
+
+	msg->mbm_error = err;
+	if (msg->mbm_cb) {
+		msg->mbm_cb(msg->mbm_cb_arg, msg->mbm_data, msg->mbm_len,
+			msg->mbm_req_id, msg->mbm_error);
+		free_msg(msg);
+	} else {
+		if (msg->mbm_flags & MSG_FLAG_REQUEST) {
+			if ((mbx->mbx_req_sz+msg->mbm_len) >= MAX_MSG_QUEUE_SZ ||
+				  mbx->mbx_req_cnt >= MAX_MSG_QUEUE_LEN) {
+				goto done;
+			}
+			mutex_lock(&ch->mbc_parent->mbx_lock);
+			list_add_tail(&msg->mbm_list, &ch->mbc_parent->mbx_req_list);
+			mbx->mbx_req_cnt++;
+			mbx->mbx_req_sz += msg->mbm_len;
+			mutex_unlock(&ch->mbc_parent->mbx_lock);
+
+			complete(&ch->mbc_parent->mbx_comp);
+		} else {
+			complete(&msg->mbm_complete);
+		}
+	}
+done:
+	chan_config_timer(ch);
+}
+
+static void chan_msg_done(struct mailbox_channel *ch, int err)
+{
+	if (!ch->mbc_cur_msg)
+		return;
+
+	msg_done(ch->mbc_cur_msg, err);
+	ch->mbc_cur_msg = NULL;
+	ch->mbc_bytes_done = 0;
+}
+
+void timeout_msg(struct mailbox_channel *ch)
+{
+	struct mailbox *mbx = ch->mbc_parent;
+	struct mailbox_msg *msg = NULL;
+	struct list_head *pos, *n;
+	struct list_head l = LIST_HEAD_INIT(l);
+	bool reschedule = false;
+
+	/* Check active msg first. */
+	msg = ch->mbc_cur_msg;
+	if (msg) {
+
+		if (msg->mbm_ttl == 0) {
+			MBX_ERR(mbx, "found active msg time'd out");
+			chan_msg_done(ch, -ETIME);
+		} else {
+			if (msg->mbm_timer_on) {
+				msg->mbm_ttl--;
+				/* Need to come back again for this one. */
+				reschedule = true;
+			}
+		}
+	}
+
+	mutex_lock(&ch->mbc_mutex);
+
+	list_for_each_safe(pos, n, &ch->mbc_msgs) {
+		msg = list_entry(pos, struct mailbox_msg, mbm_list);
+		if (!msg->mbm_timer_on)
+			continue;
+		if (msg->mbm_req_id == 0)
+		       continue;
+		if (msg->mbm_ttl == 0) {
+			list_del(&msg->mbm_list);
+			list_add_tail(&msg->mbm_list, &l);
+		} else {
+			msg->mbm_ttl--;
+			/* Need to come back again for this one. */
+			reschedule = true;
+		}
+	}
+
+	mutex_unlock(&ch->mbc_mutex);
+
+	if (!list_empty(&l))
+		MBX_ERR(mbx, "found waiting msg time'd out");
+
+	list_for_each_safe(pos, n, &l) {
+		msg = list_entry(pos, struct mailbox_msg, mbm_list);
+		list_del(&msg->mbm_list);
+		msg_done(msg, -ETIME);
+	}
+}
+
+static void chann_worker(struct work_struct *work)
+{
+	struct mailbox_channel *ch =
+		container_of(work, struct mailbox_channel, mbc_work);
+	struct mailbox *mbx = ch->mbc_parent;
+
+	while (!test_bit(MBXCS_BIT_STOP, &ch->mbc_state)) {
+		MBX_DBG(mbx, "%s worker start", ch->mbc_name);
+		ch->mbc_tran(ch);
+		wait_for_completion_interruptible(&ch->mbc_worker);
+	}
+}
+
+static inline u32 mailbox_chk_err(struct mailbox *mbx)
+{
+	u32 val = mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_error);
+
+	/* Ignore bad register value after firewall is tripped. */
+	if (val == 0xffffffff)
+		val = 0;
+
+	/* Error should not be seen, shout when found. */
+	if (val)
+		MBX_ERR(mbx, "mailbox error detected, error=0x%x\n", val);
+	return val;
+}
+
+static int chan_msg_enqueue(struct mailbox_channel *ch, struct mailbox_msg *msg)
+{
+	int rv = 0;
+
+	MBX_DBG(ch->mbc_parent, "%s enqueuing msg, id=0x%llx\n",
+		ch->mbc_name, msg->mbm_req_id);
+
+	BUG_ON(msg->mbm_req_id == INVALID_MSG_ID);
+
+	mutex_lock(&ch->mbc_mutex);
+	if (test_bit(MBXCS_BIT_STOP, &ch->mbc_state)) {
+		rv = -ESHUTDOWN;
+	} else {
+		list_add_tail(&msg->mbm_list, &ch->mbc_msgs);
+		msg->mbm_ch = ch;
+//		msg->mbm_ttl = MSG_TTL;
+	}
+	mutex_unlock(&ch->mbc_mutex);
+
+	chan_config_timer(ch);
+
+	return rv;
+}
+
+static struct mailbox_msg *chan_msg_dequeue(struct mailbox_channel *ch,
+	u64 req_id)
+{
+	struct mailbox_msg *msg = NULL;
+	struct list_head *pos;
+
+	mutex_lock(&ch->mbc_mutex);
+
+	/* Take the first msg. */
+	if (req_id == INVALID_MSG_ID) {
+		msg = list_first_entry_or_null(&ch->mbc_msgs,
+			struct mailbox_msg, mbm_list);
+	/* Take the msg w/ specified ID. */
+	} else {
+		list_for_each(pos, &ch->mbc_msgs) {
+			msg = list_entry(pos, struct mailbox_msg, mbm_list);
+			if (msg->mbm_req_id == req_id)
+				break;
+		}
+	}
+
+	if (msg) {
+		MBX_DBG(ch->mbc_parent, "%s dequeued msg, id=0x%llx\n",
+			ch->mbc_name, msg->mbm_req_id);
+		list_del(&msg->mbm_list);
+	}
+
+	mutex_unlock(&ch->mbc_mutex);
+
+	return msg;
+}
+
+static struct mailbox_msg *alloc_msg(void *buf, size_t len)
+{
+	char *newbuf = NULL;
+	struct mailbox_msg *msg = NULL;
+	/*
+	 * Scale the time-to-live with message size: roughly 2 seconds per MB
+	 * (len >> 19 ticks), with a floor of MSG_TTL ticks. E.g. a 16 MB
+	 * transfer gets 32 ticks (~32 seconds).
+	 */
+	int calculated_ttl = (len >> 19) < MSG_TTL ? MSG_TTL : (len >> 19);
+
+	if (!buf) {
+		msg = vzalloc(sizeof(struct mailbox_msg) + len);
+		if (!msg)
+			return NULL;
+		newbuf = ((char *)msg) + sizeof(struct mailbox_msg);
+	} else {
+		msg = vzalloc(sizeof(struct mailbox_msg));
+		if (!msg)
+			return NULL;
+		newbuf = buf;
+	}
+
+	INIT_LIST_HEAD(&msg->mbm_list);
+	msg->mbm_data = newbuf;
+	msg->mbm_len = len;
+	msg->mbm_ttl = calculated_ttl;
+	msg->mbm_timer_on = false;
+	init_completion(&msg->mbm_complete);
+
+	return msg;
+}
+
+static int chan_init(struct mailbox *mbx, char *nm,
+	struct mailbox_channel *ch, chan_func_t fn)
+{
+	ch->mbc_parent = mbx;
+	ch->mbc_name = nm;
+	ch->mbc_tran = fn;
+	INIT_LIST_HEAD(&ch->mbc_msgs);
+	init_completion(&ch->mbc_worker);
+	mutex_init(&ch->mbc_mutex);
+
+	ch->mbc_cur_msg = NULL;
+	ch->mbc_bytes_done = 0;
+
+	reset_pkt(&ch->mbc_packet);
+	set_bit(MBXCS_BIT_READY, &ch->mbc_state);
+
+	/* One thread for one channel. */
+	ch->mbc_wq =
+		create_singlethread_workqueue(dev_name(&mbx->mbx_pdev->dev));
+	if (!ch->mbc_wq) {
+		ch->mbc_parent = NULL;
+		return -ENOMEM;
+	}
+
+	INIT_WORK(&ch->mbc_work, chann_worker);
+	queue_work(ch->mbc_wq, &ch->mbc_work);
+
+	/* One timer for one channel. */
+	timer_setup(&ch->mbc_timer, chan_timer, 0);
+
+	return 0;
+}
+
+static void chan_fini(struct mailbox_channel *ch)
+{
+	struct mailbox_msg *msg;
+
+	if (!ch->mbc_parent)
+		return;
+
+	/*
+	 * Holding mutex to ensure no new msg is enqueued after
+	 * flag is set.
+	 */
+	mutex_lock(&ch->mbc_mutex);
+	set_bit(MBXCS_BIT_STOP, &ch->mbc_state);
+	mutex_unlock(&ch->mbc_mutex);
+
+	complete(&ch->mbc_worker);
+	cancel_work_sync(&ch->mbc_work);
+	destroy_workqueue(ch->mbc_wq);
+
+	msg = ch->mbc_cur_msg;
+	if (msg)
+		chan_msg_done(ch, -ESHUTDOWN);
+
+	while ((msg = chan_msg_dequeue(ch, INVALID_MSG_ID)) != NULL)
+		msg_done(msg, -ESHUTDOWN);
+
+	del_timer_sync(&ch->mbc_timer);
+}
+
+static void listen_wq_fini(struct mailbox *mbx)
+{
+	BUG_ON(mbx == NULL);
+
+	if (mbx->mbx_listen_wq != NULL) {
+		complete(&mbx->mbx_comp);
+		cancel_work_sync(&mbx->mbx_listen_worker);
+		destroy_workqueue(mbx->mbx_listen_wq);
+	}
+
+}
+
+static void chan_recv_pkt(struct mailbox_channel *ch)
+{
+	int i, retry = 10;
+	struct mailbox *mbx = ch->mbc_parent;
+	struct mailbox_pkt *pkt = &ch->mbc_packet;
+
+	BUG_ON(valid_pkt(pkt));
+
+	/* Picking up a packet from HW. */
+	for (i = 0; i < PACKET_SIZE; i++) {
+		while ((mailbox_reg_rd(mbx,
+			&mbx->mbx_regs->mbr_status) & STATUS_EMPTY) &&
+			(retry-- > 0))
+			msleep(100);
+
+		*(((u32 *)pkt) + i) =
+			mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_rddata);
+	}
+
+	if ((mailbox_chk_err(mbx) & STATUS_EMPTY) != 0)
+		reset_pkt(pkt);
+	else
+		MBX_DBG(mbx, "received pkt: type=0x%x", pkt->hdr.type);
+}
+
+static void chan_send_pkt(struct mailbox_channel *ch)
+{
+	int i;
+	struct mailbox *mbx = ch->mbc_parent;
+	struct mailbox_pkt *pkt = &ch->mbc_packet;
+
+	BUG_ON(!valid_pkt(pkt));
+
+	MBX_DBG(mbx, "sending pkt: type=0x%x", pkt->hdr.type);
+
+	/* Pushing a packet into HW. */
+	for (i = 0; i < PACKET_SIZE; i++) {
+		mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_wrdata,
+			*(((u32 *)pkt) + i));
+	}
+
+	reset_pkt(pkt);
+	if (ch->mbc_cur_msg)
+		ch->mbc_bytes_done += ch->mbc_packet.hdr.payload_size;
+
+	BUG_ON((mailbox_chk_err(mbx) & STATUS_FULL) != 0);
+}
+
+static int chan_pkt2msg(struct mailbox_channel *ch)
+{
+	struct mailbox *mbx = ch->mbc_parent;
+	void *msg_data, *pkt_data;
+	struct mailbox_msg *msg = ch->mbc_cur_msg;
+	struct mailbox_pkt *pkt = &ch->mbc_packet;
+	size_t cnt = pkt->hdr.payload_size;
+	u32 type = (pkt->hdr.type & PKT_TYPE_MASK);
+
+	BUG_ON(((type != PKT_MSG_START) && (type != PKT_MSG_BODY)) || !msg);
+
+	if (type == PKT_MSG_START) {
+		msg->mbm_req_id = pkt->body.msg_start.msg_req_id;
+		BUG_ON(msg->mbm_len < pkt->body.msg_start.msg_size);
+		msg->mbm_len = pkt->body.msg_start.msg_size;
+		pkt_data = pkt->body.msg_start.payload;
+	} else {
+		pkt_data = pkt->body.msg_body.payload;
+	}
+
+	if (cnt > msg->mbm_len - ch->mbc_bytes_done) {
+		MBX_ERR(mbx, "invalid mailbox packet size\n");
+		return -EBADMSG;
+	}
+
+	msg_data = msg->mbm_data + ch->mbc_bytes_done;
+	(void) memcpy(msg_data, pkt_data, cnt);
+	ch->mbc_bytes_done += cnt;
+
+	reset_pkt(pkt);
+	return 0;
+}
+
+/*
+ * Worker for RX channel.
+ */
+static void chan_do_rx(struct mailbox_channel *ch)
+{
+	struct mailbox *mbx = ch->mbc_parent;
+	struct mailbox_pkt *pkt = &ch->mbc_packet;
+	struct mailbox_msg *msg = NULL;
+	bool needs_read = false;
+	u64 id = 0;
+	bool eom;
+	int err;
+	u32 type;
+	u32 st = mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_status);
+
+	/* Check if a packet is ready for reading. */
+	if (st == 0xffffffff) {
+		/* Device is still being reset. */
+		needs_read = false;
+	} else if (test_bit(MBXCS_BIT_POLL_MODE, &ch->mbc_state)) {
+		needs_read = ((st & STATUS_EMPTY) == 0);
+	} else {
+		needs_read = ((st & STATUS_RTA) != 0);
+	}
+
+	if (needs_read) {
+		chan_recv_pkt(ch);
+		type = pkt->hdr.type & PKT_TYPE_MASK;
+		eom = ((pkt->hdr.type & PKT_TYPE_MSG_END) != 0);
+
+		switch (type) {
+		case PKT_TEST:
+			(void) memcpy(&mbx->mbx_tst_pkt, &ch->mbc_packet,
+				sizeof(struct mailbox_pkt));
+			reset_pkt(pkt);
+			return;
+		case PKT_MSG_START:
+			if (ch->mbc_cur_msg) {
+				MBX_ERR(mbx, "received partial msg\n");
+				chan_msg_done(ch, -EBADMSG);
+			}
+
+			/* Get a new active msg. */
+			id = 0;
+			if (pkt->body.msg_start.msg_flags & MSG_FLAG_RESPONSE)
+				id = pkt->body.msg_start.msg_req_id;
+			ch->mbc_cur_msg = chan_msg_dequeue(ch, id);
+
+			if (!ch->mbc_cur_msg) {
+				/* No queued msg matches; allocate one dynamically. */
+				msg = alloc_msg(NULL, pkt->body.msg_start.msg_size);
+				if (!msg) {
+					MBX_ERR(mbx, "failed to alloc msg for request");
+					reset_pkt(pkt);
+					break;
+				}
+
+				msg->mbm_ch = ch;
+				msg->mbm_flags |= MSG_FLAG_REQUEST;
+				ch->mbc_cur_msg = msg;
+			} else if (pkt->body.msg_start.msg_size >
+				ch->mbc_cur_msg->mbm_len) {
+				chan_msg_done(ch, -EMSGSIZE);
+				MBX_ERR(mbx, "received msg is too big");
+				reset_pkt(pkt);
+			}
+			break;
+		case PKT_MSG_BODY:
+			if (!ch->mbc_cur_msg) {
+				MBX_ERR(mbx, "got unexpected msg body pkt\n");
+				reset_pkt(pkt);
+			}
+			break;
+		default:
+			MBX_ERR(mbx, "invalid mailbox pkt type\n");
+			reset_pkt(pkt);
+			return;
+		}
+
+		if (valid_pkt(pkt)) {
+			err = chan_pkt2msg(ch);
+			if (err || eom)
+				chan_msg_done(ch, err);
+		}
+	}
+
+	/* Handle timer event. */
+	if (test_bit(MBXCS_BIT_TICK, &ch->mbc_state)) {
+		timeout_msg(ch);
+		clear_bit(MBXCS_BIT_TICK, &ch->mbc_state);
+	}
+}
+
+static void chan_msg2pkt(struct mailbox_channel *ch)
+{
+	size_t cnt = 0;
+	size_t payload_off = 0;
+	void *msg_data, *pkt_data;
+	struct mailbox_msg *msg = ch->mbc_cur_msg;
+	struct mailbox_pkt *pkt = &ch->mbc_packet;
+	bool is_start = (ch->mbc_bytes_done == 0);
+	bool is_eom = false;
+
+	if (is_start) {
+		payload_off = offsetof(struct mailbox_pkt,
+			body.msg_start.payload);
+	} else {
+		payload_off = offsetof(struct mailbox_pkt,
+			body.msg_body.payload);
+	}
+	cnt = PACKET_SIZE * sizeof(u32) - payload_off;
+	if (cnt >= msg->mbm_len - ch->mbc_bytes_done) {
+		cnt = msg->mbm_len - ch->mbc_bytes_done;
+		is_eom = true;
+	}
+
+	pkt->hdr.type = is_start ? PKT_MSG_START : PKT_MSG_BODY;
+	pkt->hdr.type |= is_eom ? PKT_TYPE_MSG_END : 0;
+	pkt->hdr.payload_size = cnt;
+
+	if (is_start) {
+		pkt->body.msg_start.msg_req_id = msg->mbm_req_id;
+		pkt->body.msg_start.msg_size = msg->mbm_len;
+		pkt->body.msg_start.msg_flags = msg->mbm_flags;
+		pkt_data = pkt->body.msg_start.payload;
+	} else {
+		pkt_data = pkt->body.msg_body.payload;
+	}
+	msg_data = msg->mbm_data + ch->mbc_bytes_done;
+	(void) memcpy(pkt_data, msg_data, cnt);
+}
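+
+/*
+ * Illustrative note: a message longer than one packet payload goes out as a
+ * PKT_MSG_START packet followed by PKT_MSG_BODY packets, with
+ * PKT_TYPE_MSG_END OR'ed into the type of the final packet.
+ */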
+
+static void check_tx_stall(struct mailbox_channel *ch)
+{
+	struct mailbox *mbx = ch->mbc_parent;
+	struct mailbox_msg *msg = ch->mbc_cur_msg;
+
+	/*
+	 * No stall checking in polling mode. Don't know how often peer will
+	 * check the channel.
+	 */
+	if ((msg == NULL) || test_bit(MBXCS_BIT_POLL_MODE, &ch->mbc_state))
+		return;
+
+	/*
+	 * No tx intr has come since last check.
+	 * The TX channel is stalled, reset it.
+	 */
+	if (test_bit(MBXCS_BIT_CHK_STALL, &ch->mbc_state)) {
+		MBX_ERR(mbx, "TX channel stall detected, reset...\n");
+		mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_ctrl, 0x1);
+		chan_msg_done(ch, -ETIME);
+		connect_state_touch(mbx, MB_CONN_FIN);
+	/* Mark it for next check. */
+	} else {
+		set_bit(MBXCS_BIT_CHK_STALL, &ch->mbc_state);
+	}
+}
+
+static void rx_enqueued_msg_timer_on(struct mailbox *mbx, uint64_t req_id)
+{
+	struct list_head *pos, *n;
+	struct mailbox_msg *msg = NULL;
+	struct mailbox_channel *ch = NULL;
+	ch = &mbx->mbx_rx;
+	MBX_DBG(mbx, "try to set ch rx, req_id %llu\n", req_id);
+	mutex_lock(&ch->mbc_mutex);
+
+	list_for_each_safe(pos, n, &ch->mbc_msgs) {
+		msg = list_entry(pos, struct mailbox_msg, mbm_list);
+		if (msg->mbm_req_id == req_id) {
+			msg->mbm_timer_on = true;
+			MBX_DBG(mbx, "set ch rx, req_id %llu\n", req_id);
+			break;
+		}
+	}
+
+	mutex_unlock(&ch->mbc_mutex);
+
+}
+
+/*
+ * Worker for TX channel.
+ */
+static void chan_do_tx(struct mailbox_channel *ch)
+{
+	struct mailbox *mbx = ch->mbc_parent;
+	u32 st = mailbox_reg_rd(mbx, &mbx->mbx_regs->mbr_status);
+
+	/* Check if a packet has been read by peer. */
+	if ((st != 0xffffffff) && ((st & STATUS_STA) != 0)) {
+		clear_bit(MBXCS_BIT_CHK_STALL, &ch->mbc_state);
+
+		/*
+		 * The mailbox is free for sending new pkt now. See if we
+		 * have something to send.
+		 */
+
+		/* Finished sending a whole msg, call it done. */
+		if (ch->mbc_cur_msg &&
+			(ch->mbc_cur_msg->mbm_len == ch->mbc_bytes_done)) {
+			rx_enqueued_msg_timer_on(mbx, ch->mbc_cur_msg->mbm_req_id);
+			chan_msg_done(ch, 0);
+		}
+
+		if (!ch->mbc_cur_msg) {
+			ch->mbc_cur_msg = chan_msg_dequeue(ch, INVALID_MSG_ID);
+			if (ch->mbc_cur_msg)
+				ch->mbc_cur_msg->mbm_timer_on = true;
+		}
+
+		if (ch->mbc_cur_msg) {
+			chan_msg2pkt(ch);
+		} else if (valid_pkt(&mbx->mbx_tst_pkt)) {
+			(void) memcpy(&ch->mbc_packet, &mbx->mbx_tst_pkt,
+				sizeof(struct mailbox_pkt));
+			reset_pkt(&mbx->mbx_tst_pkt);
+		} else {
+			return; /* Nothing to send. */
+		}
+
+		chan_send_pkt(ch);
+	}
+
+	/* Handle timer event. */
+	if (test_bit(MBXCS_BIT_TICK, &ch->mbc_state)) {
+		timeout_msg(ch);
+		check_tx_stall(ch);
+		clear_bit(MBXCS_BIT_TICK, &ch->mbc_state);
+	}
+}
+
+static int mailbox_connect_status(struct platform_device *pdev)
+{
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	int ret = 0;
+	mutex_lock(&mbx->mbx_lock);
+	ret = mbx->mbx_paired;
+	mutex_unlock(&mbx->mbx_lock);
+	return ret;
+}
+
+static ssize_t mailbox_ctl_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	u32 *reg = (u32 *)mbx->mbx_regs;
+	int r, n;
+	int nreg = sizeof (struct mailbox_reg) / sizeof (u32);
+
+	for (r = 0, n = 0; r < nreg; r++, reg++) {
+		/* Non-status registers. */
+		if ((reg == &mbx->mbx_regs->mbr_resv1)		||
+			(reg == &mbx->mbx_regs->mbr_wrdata)	||
+			(reg == &mbx->mbx_regs->mbr_rddata)	||
+			(reg == &mbx->mbx_regs->mbr_resv2))
+			continue;
+		/* Write-only status register. */
+		if (reg == &mbx->mbx_regs->mbr_ctrl) {
+			n += sprintf(buf + n, "%02ld %10s = --\n",
+				r * sizeof (u32), reg2name(mbx, reg));
+		/* Read-able status register. */
+		} else {
+			n += sprintf(buf + n, "%02ld %10s = 0x%08x\n",
+				r * sizeof (u32), reg2name(mbx, reg),
+				mailbox_reg_rd(mbx, reg));
+		}
+	}
+
+	return n;
+}
+
+static ssize_t mailbox_ctl_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	u32 off, val;
+	int nreg = sizeof (struct mailbox_reg) / sizeof (u32);
+	u32 *reg = (u32 *)mbx->mbx_regs;
+
+	if (sscanf(buf, "%d:%d", &off, &val) != 2 || (off % sizeof (u32)) ||
+		!(off >= 0 && off < nreg * sizeof (u32))) {
+		MBX_ERR(mbx, "input should be <reg_offset:reg_val>");
+		return -EINVAL;
+	}
+	reg += off / sizeof (u32);
+
+	mailbox_reg_wr(mbx, reg, val);
+	return count;
+}
+/* HW register level debugging i/f. */
+static DEVICE_ATTR_RW(mailbox_ctl);
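+/*
+ * Example (illustrative): "echo 4:1 > mailbox_ctl" writes value 1 to the
+ * register at byte offset 4, while reading the node dumps all readable
+ * registers.
+ */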
+
+static ssize_t mailbox_pkt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	int ret = 0;
+
+	if (valid_pkt(&mbx->mbx_tst_pkt)) {
+		(void) memcpy(buf, mbx->mbx_tst_pkt.body.data,
+			mbx->mbx_tst_pkt.hdr.payload_size);
+		ret = mbx->mbx_tst_pkt.hdr.payload_size;
+	}
+
+	return ret;
+}
+
+static ssize_t mailbox_pkt_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	size_t maxlen = sizeof (mbx->mbx_tst_pkt.body.data);
+
+	if (count > maxlen) {
+		MBX_ERR(mbx, "max input length is %ld", maxlen);
+		return 0;
+	}
+
+	(void) memcpy(mbx->mbx_tst_pkt.body.data, buf, count);
+	mbx->mbx_tst_pkt.hdr.payload_size = count;
+	mbx->mbx_tst_pkt.hdr.type = PKT_TEST;
+	complete(&mbx->mbx_tx.mbc_worker);
+	return count;
+}
+
+/* Packet test i/f. */
+static DEVICE_ATTR_RW(mailbox_pkt);
+
+static ssize_t mailbox_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	struct mailbox_req req;
+	size_t respsz = sizeof (mbx->mbx_tst_rx_msg);
+	int ret = 0;
+
+	req.req = MAILBOX_REQ_TEST_READ;
+	ret = mailbox_request(to_platform_device(dev), &req, sizeof (req),
+		mbx->mbx_tst_rx_msg, &respsz, NULL, NULL);
+	if (ret) {
+		MBX_ERR(mbx, "failed to read test msg from peer: %d", ret);
+	} else if (respsz > 0) {
+		(void) memcpy(buf, mbx->mbx_tst_rx_msg, respsz);
+		ret = respsz;
+	}
+
+	return ret;
+}
+
+static ssize_t mailbox_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	size_t maxlen = sizeof (mbx->mbx_tst_tx_msg);
+	struct mailbox_req req = { 0 };
+
+	if (count > maxlen) {
+		MBX_ERR(mbx, "max input length is %ld", maxlen);
+		return 0;
+	}
+
+	(void) memcpy(mbx->mbx_tst_tx_msg, buf, count);
+	mbx->mbx_tst_tx_msg_len = count;
+	req.req = MAILBOX_REQ_TEST_READY;
+	(void) mailbox_post(mbx->mbx_pdev, 0, &req, sizeof (req));
+
+	return count;
+}
+
+/* Msg test i/f. */
+static DEVICE_ATTR_RW(mailbox);
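+/*
+ * Example (illustrative): writing a string to this node stores it as the
+ * local test message and posts MAILBOX_REQ_TEST_READY to the peer; reading
+ * the node issues MAILBOX_REQ_TEST_READ and returns the peer's test message.
+ */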
+
+static ssize_t connection_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	int ret;
+	ret = mailbox_connect_status(pdev);
+	return sprintf(buf, "0x%x\n", ret);
+}
+static DEVICE_ATTR_RO(connection);
+
+static struct attribute *mailbox_attrs[] = {
+	&dev_attr_mailbox.attr,
+	&dev_attr_mailbox_ctl.attr,
+	&dev_attr_mailbox_pkt.attr,
+	&dev_attr_connection.attr,
+	NULL,
+};
+
+static const struct attribute_group mailbox_attrgroup = {
+	.attrs = mailbox_attrs,
+};
+
+static void dft_req_msg_cb(void *arg, void *data, size_t len, u64 id, int err)
+{
+	struct mailbox_msg *respmsg;
+	struct mailbox_msg *reqmsg = (struct mailbox_msg *)arg;
+	struct mailbox *mbx = reqmsg->mbm_ch->mbc_parent;
+
+	/*
+	 * The request msg could not be sent out.
+	 * Remove the corresponding response msg from queue and return an error.
+	 */
+	if (err) {
+		respmsg = chan_msg_dequeue(&mbx->mbx_rx, reqmsg->mbm_req_id);
+		if (respmsg)
+			msg_done(respmsg, err);
+	}
+}
+
+static void dft_post_msg_cb(void *arg, void *buf, size_t len, u64 id, int err)
+{
+	struct mailbox_msg *msg = (struct mailbox_msg *)arg;
+
+	if (err) {
+		MBX_ERR(msg->mbm_ch->mbc_parent,
+			"failed to post msg, err=%d", err);
+	}
+}
+
+/*
+ * Msg will be sent to peer and reply will be received.
+ */
+int mailbox_request(struct platform_device *pdev, void *req, size_t reqlen,
+	void *resp, size_t *resplen, mailbox_msg_cb_t cb, void *cbarg)
+{
+	int rv = -ENOMEM;
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	struct mailbox_msg *reqmsg = NULL, *respmsg = NULL;
+
+	MBX_INFO(mbx, "sending request: %d", ((struct mailbox_req *)req)->req);
+
+	if (cb) {
+		reqmsg = alloc_msg(NULL, reqlen);
+		if (reqmsg)
+			(void) memcpy(reqmsg->mbm_data, req, reqlen);
+	} else {
+		reqmsg = alloc_msg(req, reqlen);
+	}
+	if (!reqmsg)
+		goto fail;
+	reqmsg->mbm_cb = dft_req_msg_cb;
+	reqmsg->mbm_cb_arg = reqmsg;
+	reqmsg->mbm_req_id = (uintptr_t)reqmsg->mbm_data;
+
+	respmsg = alloc_msg(resp, *resplen);
+	if (!respmsg)
+		goto fail;
+	respmsg->mbm_cb = cb;
+	respmsg->mbm_cb_arg = cbarg;
+	/* Only interested in response w/ same ID. */
+	respmsg->mbm_req_id = reqmsg->mbm_req_id;
+
+	/* Always enqueue RX msg before TX one to avoid race. */
+	rv = chan_msg_enqueue(&mbx->mbx_rx, respmsg);
+	if (rv)
+		goto fail;
+	rv = chan_msg_enqueue(&mbx->mbx_tx, reqmsg);
+	if (rv) {
+		respmsg = chan_msg_dequeue(&mbx->mbx_rx, reqmsg->mbm_req_id);
+		goto fail;
+	}
+
+	/* Kick TX channel to try to send out msg. */
+	complete(&mbx->mbx_tx.mbc_worker);
+
+	if (cb)
+		return 0;
+
+	wait_for_completion(&respmsg->mbm_complete);
+	rv = respmsg->mbm_error;
+	if (rv == 0)
+		*resplen = respmsg->mbm_len;
+
+	free_msg(respmsg);
+	return rv;
+
+fail:
+	if (reqmsg)
+		free_msg(reqmsg);
+	if (respmsg)
+		free_msg(respmsg);
+	return rv;
+}
+
+/*
+ * Msg will be posted, no wait for reply.
+ */
+int mailbox_post(struct platform_device *pdev, u64 reqid, void *buf, size_t len)
+{
+	int rv = 0;
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	struct mailbox_msg *msg = alloc_msg(NULL, len);
+
+	if (reqid == 0) {
+		MBX_DBG(mbx, "posting request: %d",
+			((struct mailbox_req *)buf)->req);
+	} else {
+		MBX_DBG(mbx, "posting response...");
+	}
+
+	if (!msg)
+		return -ENOMEM;
+
+	(void) memcpy(msg->mbm_data, buf, len);
+	msg->mbm_cb = dft_post_msg_cb;
+	msg->mbm_cb_arg = msg;
+	if (reqid) {
+		msg->mbm_req_id = reqid;
+		msg->mbm_flags |= MSG_FLAG_RESPONSE;
+	} else {
+		msg->mbm_req_id = (uintptr_t)msg->mbm_data;
+	}
+
+	rv = chan_msg_enqueue(&mbx->mbx_tx, msg);
+	if (rv)
+		free_msg(msg);
+
+	/* Kick TX channel to try to send out msg. */
+	complete(&mbx->mbx_tx.mbc_worker);
+
+	return rv;
+}
+/*
+ * Must only be called from connect_state_handler().
+ */
+static int mailbox_connection_notify(struct platform_device *pdev, uint64_t sec_id, uint64_t flag)
+{
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	struct mailbox_req *mb_req = NULL;
+	struct mailbox_conn mb_conn = { 0 };
+	int  ret = 0;
+	size_t data_len = 0, reqlen = 0;
+	data_len = sizeof(struct mailbox_conn);
+	reqlen = sizeof(struct mailbox_req) + data_len;
+
+	mb_req = (struct mailbox_req *)vmalloc(reqlen);
+	if (!mb_req) {
+		ret = -ENOMEM;
+		goto done;
+	}
+	mb_req->req = MAILBOX_REQ_CONN_EXPL;
+	if (!mbx->mbx_kaddr) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	mb_conn.kaddr = mbx->mbx_kaddr;
+	mb_conn.paddr = virt_to_phys(mbx->mbx_kaddr);
+	mb_conn.crc32 = crc32c_le(~0, mbx->mbx_kaddr, PAGE_SIZE);
+	mb_conn.flag = flag;
+	mb_conn.ver = mbx->mbx_prot_ver;
+
+	if (sec_id != 0) {
+		mb_conn.sec_id = sec_id;
+	} else {
+		mb_conn.sec_id = (uint64_t)mbx->mbx_kaddr;
+		mbx->mbx_conn_id = (uint64_t)mbx->mbx_kaddr;
+	}
+
+	memcpy(mb_req->data, &mb_conn, data_len);
+
+	ret = mailbox_post(pdev, 0, mb_req, reqlen);
+
+done:
+	vfree(mb_req);
+	return ret;
+}
+
+static int mailbox_connection_explore(struct platform_device *pdev, struct mailbox_conn *mb_conn)
+{
+	int ret = 0;
+	uint32_t crc_chk;
+	phys_addr_t paddr;
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	if (!mb_conn) {
+		ret = -EFAULT;
+		goto done;
+	}
+
+	paddr = virt_to_phys(mb_conn->kaddr);
+	if (paddr != mb_conn->paddr) {
+		MBX_INFO(mbx, "mb_conn->paddr %llx paddr: %llx\n", mb_conn->paddr, paddr);
+		MBX_INFO(mbx, "Failed to get the same physical addr, running in VMs?\n");
+		ret = -EFAULT;
+		goto done;
+	}
+	crc_chk = crc32c_le(~0, mb_conn->kaddr, PAGE_SIZE);
+
+	if (crc_chk != mb_conn->crc32) {
+		MBX_INFO(mbx, "crc32  : %x, %x\n",  mb_conn->crc32, crc_chk);
+		MBX_INFO(mbx, "failed to get the same CRC\n");
+		ret = -EFAULT;
+		goto done;
+	}
+done:
+	return ret;
+}
+
+static int mailbox_get_data(struct platform_device *pdev, enum data_kind kind)
+{
+	int ret = 0;
+	switch (kind) {
+	case PEER_CONN:
+		ret = mailbox_connect_status(pdev);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+
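+/*
+ * Handshake overview (illustrative): MB_CONN_INIT allocates a page of random
+ * bytes and sends MB_CONN_SYN; the receiver explores the page described in
+ * the SYN (a matching physical address and CRC implies both ends run in the
+ * same host kernel) and answers with MB_CONN_ACK; an ACK carrying our own
+ * conn_id marks the link as paired.  MB_CONN_FIN tears the connection down.
+ */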
+static void connect_state_handler(struct mailbox *mbx, struct mailbox_conn *conn)
+{
+		int ret = 0;
+
+		if (!mbx || !conn)
+			return;
+
+		mutex_lock(&mbx->mbx_lock);
+
+		switch (conn->flag) {
+		case MB_CONN_INIT:
+			/* clean up all cached data, */
+			mbx->mbx_paired = 0;
+			mbx->mbx_established = false;
+			kfree(mbx->mbx_kaddr);
+
+			mbx->mbx_kaddr = kzalloc(PAGE_SIZE, GFP_KERNEL);
+			get_random_bytes(mbx->mbx_kaddr, PAGE_SIZE);
+			ret = mailbox_connection_notify(mbx->mbx_pdev, 0, MB_CONN_SYN);
+			if (ret)
+				goto done;
+			mbx->mbx_state = CONN_SYN_SENT;
+			break;
+		case MB_CONN_SYN:
+			if (mbx->mbx_state == CONN_SYN_SENT) {
+				if (!mailbox_connection_explore(mbx->mbx_pdev, conn)) {
+					mbx->mbx_paired |= 0x2;
+					MBX_INFO(mbx, "mailbox mbx_prot_ver %x", mbx->mbx_prot_ver);
+				}
+				ret = mailbox_connection_notify(mbx->mbx_pdev, conn->sec_id, MB_CONN_ACK);
+				if (ret)
+					goto done;
+				mbx->mbx_state = CONN_SYN_RECV;
+			} else
+				mbx->mbx_state = CONN_START;
+			break;
+		case MB_CONN_ACK:
+			if (mbx->mbx_state & (CONN_SYN_SENT | CONN_SYN_RECV)) {
+				if (mbx->mbx_conn_id == (uint64_t)conn->sec_id) {
+					mbx->mbx_paired |= 0x1;
+					mbx->mbx_established = true;
+					mbx->mbx_state = CONN_ESTABLISH;
+					kfree(mbx->mbx_kaddr);
+					mbx->mbx_kaddr = NULL;
+				} else
+					mbx->mbx_state = CONN_START;
+			}
+			break;
+		case MB_CONN_FIN:
+			mbx->mbx_paired = 0;
+			mbx->mbx_established = false;
+			kfree(mbx->mbx_kaddr);
+			mbx->mbx_kaddr = NULL;
+			mbx->mbx_state = CONN_START;
+			break;
+		default:
+			break;
+		}
+done:
+	if (ret) {
+		kfree(mbx->mbx_kaddr);
+		mbx->mbx_kaddr = NULL;
+		mbx->mbx_paired = 0;
+		mbx->mbx_state = CONN_START;
+	}
+	mutex_unlock(&mbx->mbx_lock);
+	MBX_INFO(mbx, "mailbox connection state %d", mbx->mbx_paired);
+}
+
+static void process_request(struct mailbox *mbx, struct mailbox_msg *msg)
+{
+	struct mailbox_req *req = (struct mailbox_req *)msg->mbm_data;
+	struct mailbox_conn *conn = (struct mailbox_conn *)req->data;
+	int rc;
+	const char *recvstr = "received request from peer";
+	const char *sendstr = "sending test msg to peer";
+
+	if (req->req == MAILBOX_REQ_TEST_READ) {
+		MBX_INFO(mbx, "%s: %d", recvstr, req->req);
+		if (mbx->mbx_tst_tx_msg_len) {
+			MBX_INFO(mbx, "%s", sendstr);
+			rc = mailbox_post(mbx->mbx_pdev, msg->mbm_req_id,
+				mbx->mbx_tst_tx_msg, mbx->mbx_tst_tx_msg_len);
+			if (rc)
+				MBX_ERR(mbx, "%s failed: %d", sendstr, rc);
+			else
+				mbx->mbx_tst_tx_msg_len = 0;
+
+		}
+	} else if (req->req == MAILBOX_REQ_TEST_READY) {
+		MBX_INFO(mbx, "%s: %d", recvstr, req->req);
+	} else if (req->req == MAILBOX_REQ_CONN_EXPL) {
+		MBX_INFO(mbx, "%s: %d", recvstr, req->req);
+		if (mbx->mbx_state != CONN_SYN_SENT) {
+			/* If the peer dropped without notice,
+			 * re-initiate the connection handshake
+			 * from our side as well.
+			 */
+			if (conn->flag == MB_CONN_SYN) {
+				connect_state_touch(mbx, MB_CONN_INIT);
+			}
+		}
+		connect_state_handler(mbx, conn);
+	} else if (mbx->mbx_listen_cb) {
+		/* Call client's registered callback to process request. */
+		MBX_DBG(mbx, "%s: %d, passed on", recvstr, req->req);
+		mbx->mbx_listen_cb(mbx->mbx_listen_cb_arg, msg->mbm_data,
+			msg->mbm_len, msg->mbm_req_id, msg->mbm_error);
+	} else {
+		MBX_INFO(mbx, "%s: %d, dropped", recvstr, req->req);
+	}
+}
+
+/*
+ * Wait for request from peer.
+ */
+static void mailbox_recv_request(struct work_struct *work)
+{
+	int rv = 0;
+	struct mailbox_msg *msg = NULL;
+	struct mailbox *mbx =
+		container_of(work, struct mailbox, mbx_listen_worker);
+
+	for (;;) {
+		/* Only interested in request msg. */
+
+		rv = wait_for_completion_interruptible(&mbx->mbx_comp);
+		if (rv)
+			break;
+		mutex_lock(&mbx->mbx_lock);
+		msg = list_first_entry_or_null(&mbx->mbx_req_list,
+			struct mailbox_msg, mbm_list);
+
+		if (msg) {
+			list_del(&msg->mbm_list);
+			mbx->mbx_req_cnt--;
+			mbx->mbx_req_sz -= msg->mbm_len;
+			mutex_unlock(&mbx->mbx_lock);
+		} else {
+			mutex_unlock(&mbx->mbx_lock);
+			break;
+		}
+
+		process_request(mbx, msg);
+		free_msg(msg);
+	}
+
+	if (rv == -ESHUTDOWN)
+		MBX_INFO(mbx, "channel is closed, no longer listening to peer");
+	else if (rv != 0)
+		MBX_ERR(mbx, "failed to receive request from peer, err=%d", rv);
+
+	if (msg)
+		free_msg(msg);
+}
+
+int mailbox_listen(struct platform_device *pdev,
+	mailbox_msg_cb_t cb, void *cbarg)
+{
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+
+	mbx->mbx_listen_cb_arg = cbarg;
+	/* mbx->mbx_listen_cb is used in another thread as a condition to
+	 * call the function. Ensure the argument is visible before the
+	 * function pointer is set.
+	 */
+	wmb();
+	mbx->mbx_listen_cb = cb;
+
+	return 0;
+}
+
+static int mailbox_enable_intr_mode(struct mailbox *mbx)
+{
+	struct resource *res;
+	int ret;
+	struct platform_device *pdev = mbx->mbx_pdev;
+	struct xocl_dev *xdev = xocl_get_xdev(pdev);
+
+	if (mbx->mbx_irq != -1)
+		return 0;
+
+	res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+	if (res == NULL) {
+		MBX_ERR(mbx, "failed to acquire intr resource");
+		return -EINVAL;
+	}
+
+	ret = xocl_user_interrupt_reg(xdev, res->start, mailbox_isr, mbx);
+	if (ret) {
+		MBX_ERR(mbx, "failed to add intr handler");
+		return ret;
+	}
+	ret = xocl_user_interrupt_config(xdev, res->start, true);
+	BUG_ON(ret != 0);
+
+	/* Only see intr when we have full packet sent or received. */
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_rit, PACKET_SIZE - 1);
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_sit, 0);
+
+	/* Finally, enable TX / RX intr. */
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_ie, 0x3);
+
+	clear_bit(MBXCS_BIT_POLL_MODE, &mbx->mbx_rx.mbc_state);
+	chan_config_timer(&mbx->mbx_rx);
+
+	clear_bit(MBXCS_BIT_POLL_MODE, &mbx->mbx_tx.mbc_state);
+	chan_config_timer(&mbx->mbx_tx);
+
+	mbx->mbx_irq = res->start;
+	return 0;
+}
+
+static void mailbox_disable_intr_mode(struct mailbox *mbx)
+{
+	struct platform_device *pdev = mbx->mbx_pdev;
+	struct xocl_dev *xdev = xocl_get_xdev(pdev);
+
+	/*
+	 * No need to turn on polling mode for TX, which has
+	 * a channel stall checking timer always on when there is an
+	 * outstanding TX packet.
+	 */
+	set_bit(MBXCS_BIT_POLL_MODE, &mbx->mbx_rx.mbc_state);
+	chan_config_timer(&mbx->mbx_rx);
+
+	/* Disable both TX / RX intrs. */
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_ie, 0x0);
+
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_rit, 0x0);
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_sit, 0x0);
+
+	if (mbx->mbx_irq == -1)
+		return;
+
+	(void) xocl_user_interrupt_config(xdev, mbx->mbx_irq, false);
+	(void) xocl_user_interrupt_reg(xdev, mbx->mbx_irq, NULL, mbx);
+
+	mbx->mbx_irq = -1;
+}
+
+int mailbox_reset(struct platform_device *pdev, bool end_of_reset)
+{
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+	int ret = 0;
+
+	if (mailbox_no_intr)
+		return 0;
+
+	if (end_of_reset) {
+		MBX_INFO(mbx, "enable intr mode");
+		if (mailbox_enable_intr_mode(mbx) != 0)
+			MBX_ERR(mbx, "failed to enable intr after reset");
+	} else {
+		MBX_INFO(mbx, "enable polling mode");
+		mailbox_disable_intr_mode(mbx);
+	}
+	return ret;
+}
+
+/* Kernel APIs exported from this sub-device driver. */
+static struct xocl_mailbox_funcs mailbox_ops = {
+	.request = mailbox_request,
+	.post = mailbox_post,
+	.listen = mailbox_listen,
+	.reset = mailbox_reset,
+	.get_data = mailbox_get_data,
+};
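+
+/*
+ * Note: other sub-device drivers reach these entry points through the
+ * XOCL_SUBDEV_MAILBOX registration performed in mailbox_probe() below.
+ */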
+
+static int mailbox_remove(struct platform_device *pdev)
+{
+	struct mailbox *mbx = platform_get_drvdata(pdev);
+
+	BUG_ON(mbx == NULL);
+
+	connect_state_touch(mbx, MB_CONN_FIN);
+
+	mailbox_disable_intr_mode(mbx);
+
+	sysfs_remove_group(&pdev->dev.kobj, &mailbox_attrgroup);
+
+	chan_fini(&mbx->mbx_rx);
+	chan_fini(&mbx->mbx_tx);
+	listen_wq_fini(mbx);
+
+	BUG_ON(!(list_empty(&mbx->mbx_req_list)));
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_MAILBOX, NULL);
+
+	if (mbx->mbx_regs)
+		iounmap(mbx->mbx_regs);
+
+	MBX_INFO(mbx, "mailbox cleaned up successfully");
+	platform_set_drvdata(pdev, NULL);
+	kfree(mbx);
+	return 0;
+}
+
+static int mailbox_probe(struct platform_device *pdev)
+{
+	struct mailbox *mbx = NULL;
+	struct resource *res;
+	int ret;
+
+	mbx = kzalloc(sizeof(struct mailbox), GFP_KERNEL);
+	if (!mbx)
+		return -ENOMEM;
+	platform_set_drvdata(pdev, mbx);
+	mbx->mbx_pdev = pdev;
+	mbx->mbx_irq = (u32)-1;
+
+	init_completion(&mbx->mbx_comp);
+	mutex_init(&mbx->mbx_lock);
+	INIT_LIST_HEAD(&mbx->mbx_req_list);
+	mbx->mbx_req_cnt = 0;
+	mbx->mbx_req_sz = 0;
+
+	mutex_init(&mbx->mbx_conn_lock);
+	mbx->mbx_established = false;
+	mbx->mbx_conn_id = 0;
+	mbx->mbx_kaddr = NULL;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mbx->mbx_regs = ioremap_nocache(res->start, res->end - res->start + 1);
+	if (!mbx->mbx_regs) {
+		MBX_ERR(mbx, "failed to map in registers");
+		ret = -EIO;
+		goto failed;
+	}
+	/* Reset TX channel, RX channel is managed by peer as its TX. */
+	mailbox_reg_wr(mbx, &mbx->mbx_regs->mbr_ctrl, 0x1);
+
+	/* Set up software communication channels. */
+	ret = chan_init(mbx, "RX", &mbx->mbx_rx, chan_do_rx);
+	if (ret != 0) {
+		MBX_ERR(mbx, "failed to init rx channel");
+		goto failed;
+	}
+	ret = chan_init(mbx, "TX", &mbx->mbx_tx, chan_do_tx);
+	if (ret != 0) {
+		MBX_ERR(mbx, "failed to init tx channel");
+		goto failed;
+	}
+	/* Dedicated thread for listening to peer request. */
+	mbx->mbx_listen_wq =
+		create_singlethread_workqueue(dev_name(&mbx->mbx_pdev->dev));
+	if (!mbx->mbx_listen_wq) {
+		MBX_ERR(mbx, "failed to create request-listen work queue");
+		goto failed;
+	}
+	INIT_WORK(&mbx->mbx_listen_worker, mailbox_recv_request);
+	queue_work(mbx->mbx_listen_wq, &mbx->mbx_listen_worker);
+
+	ret = sysfs_create_group(&pdev->dev.kobj, &mailbox_attrgroup);
+	if (ret != 0) {
+		MBX_ERR(mbx, "failed to init sysfs");
+		goto failed;
+	}
+
+	if (mailbox_no_intr) {
+		MBX_INFO(mbx, "Enabled timer-driven mode");
+		mailbox_disable_intr_mode(mbx);
+	} else {
+		ret = mailbox_enable_intr_mode(mbx);
+		if (ret != 0)
+			goto failed;
+	}
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_MAILBOX, &mailbox_ops);
+
+	connect_state_touch(mbx, MB_CONN_INIT);
+	mbx->mbx_prot_ver = MB_PROTOCOL_VER;
+
+	MBX_INFO(mbx, "successfully initialized");
+	return 0;
+
+failed:
+	mailbox_remove(pdev);
+	return ret;
+}
+
+struct platform_device_id mailbox_id_table[] = {
+	{ XOCL_MAILBOX, 0 },
+	{ },
+};
+
+static struct platform_driver mailbox_driver = {
+	.probe		= mailbox_probe,
+	.remove		= mailbox_remove,
+	.driver		= {
+		.name	= XOCL_MAILBOX,
+	},
+	.id_table = mailbox_id_table,
+};
+
+int __init xocl_init_mailbox(void)
+{
+	BUILD_BUG_ON(sizeof(struct mailbox_pkt) != sizeof(u32) * PACKET_SIZE);
+	return platform_driver_register(&mailbox_driver);
+}
+
+void xocl_fini_mailbox(void)
+{
+	platform_driver_unregister(&mailbox_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/mb_scheduler.c b/drivers/gpu/drm/xocl/subdev/mb_scheduler.c
new file mode 100644
index 000000000000..b3ed3ae0b41a
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/mb_scheduler.c
@@ -0,0 +1,3059 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2018-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *    Soren Soe <soren.soe@xilinx.com>
+ */
+
+/*
+ * Kernel Driver Scheduler (KDS) for XRT
+ *
+ * struct xocl_cmd
+ *  - wraps exec BOs create from user space
+ *  - transitions through a number of states
+ *  - initially added to pending command queue
+ *  - consumed by scheduler which manages its execution (state transition)
+ * struct xocl_cu
+ *  - compute unit for executing commands
+ *  - used only without embedded scheduler (ert)
+ *  - talks to HW compute units
+ * struct xocl_ert
+ *  - embedded scheduler for executing commands on ert
+ *  - talks to HW ERT
+ * struct exec_core
+ *  - execution core managing execution on one device
+ * struct xocl_scheduler
+ *  - manages execution of cmds on one or more exec cores
+ *  - executed in a separate kernel thread
+ *  - loops repeatedly when there is work to do
+ *  - moves pending commands into a scheduler command queue
+ *
+ * [new -> pending]. The xocl API adds exec BOs to KDS.	 The exec BOs are
+ * wrapped in a xocl_cmd object and added to a pending command queue.
+ *
+ * [pending -> queued]. Scheduler loops repeatedly and copies pending commands
+ * to its own command queue, then manages command execution on one or more
+ * execution cores.
+ *
+ * [queued -> submitted]. Commands are submitted for execution on execution
+ * core when the core has room for new commands.
+ *
+ * [submitted -> running]. Once submitted, a command is transitioned by the
+ * scheduler into running state when there is an available compute unit (no
+ * ert) or if ERT is used, then when ERT has room.
+ *
+ * [running -> complete]. Commands running on ERT complete by sending an
+ * interrupt to scheduler.  When ERT is not used, commands are running on a
+ * compute unit and are polled for completion.
+ */
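+
+/*
+ * Lifecycle at a glance (illustrative):
+ *   exec BO from user space -> xocl_cmd (pending) -> scheduler command queue
+ *   -> submitted to an exec_core slot -> running on a CU (penguin) or in an
+ *   ERT slot -> complete -> recycled on the free_cmds list.
+ */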
+
+#include <linux/bitmap.h>
+#include <linux/list.h>
+#include <linux/eventfd.h>
+#include <linux/kthread.h>
+#include "../ert.h"
+#include "../xocl_drv.h"
+#include "../userpf/common.h"
+
+//#define SCHED_VERBOSE
+
+#if defined(__GNUC__)
+#define SCHED_UNUSED __attribute__((unused))
+#endif
+
+#define sched_debug_packet(packet, size)				\
+({									\
+	int i;								\
+	u32 *data = (u32 *)packet;					\
+	for (i = 0; i < size; ++i)					    \
+		DRM_INFO("packet(0x%p) data[%d] = 0x%x\n", data, i, data[i]); \
+})
+
+#ifdef SCHED_VERBOSE
+# define SCHED_DEBUG(msg) DRM_INFO(msg)
+# define SCHED_DEBUGF(format, ...) DRM_INFO(format, ##__VA_ARGS__)
+# define SCHED_PRINTF(format, ...) DRM_INFO(format, ##__VA_ARGS__)
+# define SCHED_DEBUG_PACKET(packet, size) sched_debug_packet(packet, size)
+#else
+# define SCHED_DEBUG(msg)
+# define SCHED_DEBUGF(format, ...)
+# define SCHED_PRINTF(format, ...) DRM_INFO(format, ##__VA_ARGS__)
+# define SCHED_DEBUG_PACKET(packet, size)
+#endif
+
+/* constants */
+static const unsigned int no_index = -1;
+
+/* FFA	handling */
+static const u32 AP_START    = 0x1;
+static const u32 AP_DONE     = 0x2;
+static const u32 AP_IDLE     = 0x4;
+static const u32 AP_READY    = 0x8;
+static const u32 AP_CONTINUE = 0x10;
+
+/* Forward declaration */
+struct exec_core;
+struct exec_ops;
+struct xocl_scheduler;
+
+static int validate(struct platform_device *pdev, struct client_ctx *client,
+		    const struct drm_xocl_bo *bo);
+static bool exec_is_flush(struct exec_core *exec);
+static void scheduler_wake_up(struct xocl_scheduler *xs);
+static void scheduler_intr(struct xocl_scheduler *xs);
+static void scheduler_decr_poll(struct xocl_scheduler *xs);
+
+/*
+ */
+static void
+xocl_bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits)
+{
+	unsigned int i, halfwords;
+
+	halfwords = DIV_ROUND_UP(nbits, 32);
+	for (i = 0; i < halfwords; i++) {
+		buf[i] = (u32) (bitmap[i/2] & UINT_MAX);
+		if (++i < halfwords)
+			buf[i] = (u32) (bitmap[i/2] >> 32);
+	}
+
+	/* Clear tail bits in last element of array beyond nbits. */
+	if (nbits % BITS_PER_LONG)
+		buf[halfwords - 1] &= (u32) (UINT_MAX >> ((-nbits) & 31));
+}
+
+static void
+xocl_bitmap_from_arr32(unsigned long *bitmap, const u32 *buf, unsigned int nbits)
+{
+	unsigned int i, halfwords;
+
+	halfwords = DIV_ROUND_UP(nbits, 32);
+	for (i = 0; i < halfwords; i++) {
+		bitmap[i/2] = (unsigned long) buf[i];
+		if (++i < halfwords)
+			bitmap[i/2] |= ((unsigned long) buf[i]) << 32;
+	}
+
+	/* Clear tail bits in last word beyond nbits. */
+	if (nbits % BITS_PER_LONG)
+		bitmap[(halfwords - 1) / 2] &= BITMAP_LAST_WORD_MASK(nbits);
+}
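+
+/*
+ * Illustrative note: the two helpers above are local equivalents of
+ * bitmap_to_arr32()/bitmap_from_arr32(); on a 64-bit kernel each unsigned
+ * long holds two u32 words, so buf[0] maps to the low half of bitmap[0]
+ * and buf[1] to its high half.
+ */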
+
+
+/**
+ * slot_mask_idx() - Slot mask index for a given slot_idx
+ *
+ * @slot_idx: Global [0..127] index of a CQ slot
+ * Return: Index of the slot mask containing the slot_idx
+ */
+static inline unsigned int
+slot_mask_idx(unsigned int slot_idx)
+{
+	return slot_idx >> 5;
+}
+
+/**
+ * slot_idx_in_mask() - Index of command queue slot within the mask that contains it
+ *
+ * @slot_idx: Global [0..127] index of a CQ slot
+ * Return: Index of slot within the mask that contains it
+ */
+static inline unsigned int
+slot_idx_in_mask(unsigned int slot_idx)
+{
+	return slot_idx - (slot_mask_idx(slot_idx) << 5);
+}
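+
+/*
+ * Example (illustrative): slot_idx 37 belongs to slot mask 1 (37 >> 5) and
+ * occupies bit position 5 within that mask (37 - 32).
+ */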
+
+/**
+ * Command data used by scheduler
+ *
+ * @cq_list: command object moves from pending to scheduler command queue list
+ * @rq_list: command object is added to a CU running queue when running (penguin only)
+ *
+ * @bo: underlying drm buffer object
+ * @exec: execution device associated with this command
+ * @client: client (user process) context that created this command
+ * @xs: command scheduler responsible for scheduling this command
+ * @state: state of command object per scheduling
+ * @uid: unique id for an active command object
+ * @cu_idx: index of CU executing this cmd object; used in penguin mode only
+ * @slot_idx: command queue index of this command object
+ * @wait_count: number of commands that must trigger this command before it can start
+ * @chain_count: number of commands that this command must trigger when it completes
+ * @chain: list of commands to trigger upon completion; maximum chain depth is 8
+ * @deps: list of commands this object depends on, converted to chain when command is queued
+ * @ecmd: mapped ert packet object from user space
+ */
+struct xocl_cmd {
+	struct list_head cq_list; // scheduler command queue
+	struct list_head rq_list; // exec core running queue
+
+	/* command packet */
+	struct drm_xocl_bo *bo;
+	union {
+		struct ert_packet	    *ecmd;
+		struct ert_start_kernel_cmd *kcmd;
+	};
+
+	DECLARE_BITMAP(cu_bitmap, MAX_CUS);
+
+	struct xocl_dev	  *xdev;
+	struct exec_core  *exec;
+	struct client_ctx *client;
+	struct xocl_scheduler *xs;
+	enum ert_cmd_state state;
+
+	/* dependency handling */
+	unsigned int chain_count;
+	unsigned int wait_count;
+	union {
+		struct xocl_cmd *chain[8];
+		struct drm_xocl_bo *deps[8];
+	};
+
+	unsigned long uid;     // unique id for this command
+	unsigned int cu_idx;   // index of CU running this cmd (penguin mode)
+	unsigned int slot_idx; // index in exec core submit queue
+};
+
+/*
+ * List of free xocl_cmd objects.
+ *
+ * @free_cmds: populated with recycled xocl_cmd objects
+ * @cmd_mutex: mutex lock for cmd_list
+ *
+ * Command objects are recycled for later use and only freed when kernel
+ * module is unloaded.
+ */
+static LIST_HEAD(free_cmds);
+static DEFINE_MUTEX(free_cmds_mutex);
+
+/**
+ * cmd_list_delete() - reclaim memory for all allocated command objects
+ */
+static void
+cmd_list_delete(void)
+{
+	struct xocl_cmd *xcmd;
+	struct list_head *pos, *next;
+
+	mutex_lock(&free_cmds_mutex);
+	list_for_each_safe(pos, next, &free_cmds) {
+		xcmd = list_entry(pos, struct xocl_cmd, cq_list);
+		list_del(pos);
+		kfree(xcmd);
+	}
+	mutex_unlock(&free_cmds_mutex);
+}
+
+/*
+ * opcode() - Command opcode
+ *
+ * @cmd: Command object
+ * Return: Opcode per command packet
+ */
+static inline u32
+cmd_opcode(struct xocl_cmd *xcmd)
+{
+	return xcmd->ecmd->opcode;
+}
+
+/*
+ * type() - Command type
+ *
+ * @cmd: Command object
+ * Return: Type of command
+ */
+static inline u32
+cmd_type(struct xocl_cmd *xcmd)
+{
+	return xcmd->ecmd->type;
+}
+
+/*
+ * exec() - Get execution core
+ */
+static inline struct exec_core *
+cmd_exec(struct xocl_cmd *xcmd)
+{
+	return xcmd->exec;
+}
+
+/*
+ * uid() - Get unique id of command
+ */
+static inline unsigned long
+cmd_uid(struct xocl_cmd *xcmd)
+{
+	return xcmd->uid;
+}
+
+/*
+ */
+static inline unsigned int
+cmd_wait_count(struct xocl_cmd *xcmd)
+{
+	return xcmd->wait_count;
+}
+
+/**
+ * cmd_payload_size() - Command payload size
+ *
+ * @xcmd: Command object
+ * Return: Size in number of words of command packet payload
+ */
+static inline unsigned int
+cmd_payload_size(struct xocl_cmd *xcmd)
+{
+	return xcmd->ecmd->count;
+}
+
+/**
+ * cmd_packet_size() - Command packet size
+ *
+ * @xcmd: Command object
+ * Return: Size in number of words of command packet
+ */
+static inline unsigned int
+cmd_packet_size(struct xocl_cmd *xcmd)
+{
+	return cmd_payload_size(xcmd) + 1;
+}
+
+/**
+ * cmd_cumasks() - Number of command packet cu_masks
+ *
+ * @xcmd: Command object
+ * Return: Total number of CU masks in command packet
+ */
+static inline unsigned int
+cmd_cumasks(struct xocl_cmd *xcmd)
+{
+	return 1 + xcmd->kcmd->extra_cu_masks;
+}
+
+/**
+ * cmd_regmap_size() - Size of regmap is payload size (n) minus the number of cu_masks
+ *
+ * @xcmd: Command object
+ * Return: Size of register map in number of words
+ */
+static inline unsigned int
+cmd_regmap_size(struct xocl_cmd *xcmd)
+{
+	return cmd_payload_size(xcmd) - cmd_cumasks(xcmd);
+}
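+
+/*
+ * Example (illustrative): a start-kernel command with count 9 and one extra
+ * cu_mask has a 9-word payload, a 10-word packet, 2 cu_mask words and a
+ * 7-word register map.
+ */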
+
+/*
+ */
+static inline struct ert_packet*
+cmd_packet(struct xocl_cmd *xcmd)
+{
+	return xcmd->ecmd;
+}
+
+/*
+ */
+static inline u32*
+cmd_regmap(struct xocl_cmd *xcmd)
+{
+	return xcmd->kcmd->data + xcmd->kcmd->extra_cu_masks;
+}
+
+/**
+ * cmd_set_int_state() - Set internal command state used by scheduler only
+ *
+ * @xcmd: command to change internal state on
+ * @state: new command state per ert.h
+ */
+static inline void
+cmd_set_int_state(struct xocl_cmd *xcmd, enum ert_cmd_state state)
+{
+	SCHED_DEBUGF("-> %s(%lu,%d)\n", __func__, xcmd->uid, state);
+	xcmd->state = state;
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/**
+ * cmd_set_state() - Set both internal and external state of a command
+ *
+ * The state is reflected externally through the command packet
+ * as well as being captured in internal state variable
+ *
+ * @xcmd: command object
+ * @state: new state
+ */
+static inline void
+cmd_set_state(struct xocl_cmd *xcmd, enum ert_cmd_state state)
+{
+	SCHED_DEBUGF("->%s(%lu,%d)\n", __func__, xcmd->uid, state);
+	xcmd->state = state;
+	xcmd->ecmd->state = state;
+	SCHED_DEBUGF("<-%s\n", __func__);
+}
+
+/*
+ * update_state() - Update command state if client has aborted
+ */
+static enum ert_cmd_state
+cmd_update_state(struct xocl_cmd *xcmd)
+{
+	if (xcmd->state != ERT_CMD_STATE_RUNNING && xcmd->client->abort) {
+		userpf_info(xcmd->xdev, "aborting stale client cmd(%lu)", xcmd->uid);
+		cmd_set_state(xcmd, ERT_CMD_STATE_ABORT);
+	}
+	if (exec_is_flush(xcmd->exec)) {
+		userpf_info(xcmd->xdev, "aborting stale exec cmd(%lu)", xcmd->uid);
+		cmd_set_state(xcmd, ERT_CMD_STATE_ABORT);
+	}
+	return xcmd->state;
+}
+
+/*
+ * release_gem_object_reference() -
+ */
+static inline void
+cmd_release_gem_object_reference(struct xocl_cmd *xcmd)
+{
+	if (xcmd->bo)
+		drm_gem_object_put_unlocked(&xcmd->bo->base);
+//PORT4_20
+//		drm_gem_object_unreference_unlocked(&xcmd->bo->base);
+}
+
+/*
+ */
+static inline void
+cmd_mark_active(struct xocl_cmd *xcmd)
+{
+	if (xcmd->bo)
+		xcmd->bo->metadata.active = xcmd;
+}
+
+/*
+ */
+static inline void
+cmd_mark_deactive(struct xocl_cmd *xcmd)
+{
+	if (xcmd->bo)
+		xcmd->bo->metadata.active = NULL;
+}
+
+/**
+ * chain_dependencies() - Chain this command to its dependencies
+ *
+ * @xcmd: Command to chain to its dependencies
+ *
+ * This function looks at all incoming explicit BO dependencies, checks if a
+ * corresponding xocl_cmd object exists (is active) in which case that command
+ * object must chain argument xcmd so that it (xcmd) can be triggered when
+ * the dependency completes.  The chained command has a wait count corresponding to
+ * the number of dependencies that are active.
+ */
+static int
+cmd_chain_dependencies(struct xocl_cmd *xcmd)
+{
+	int didx;
+	int dcount = xcmd->wait_count;
+
+	SCHED_DEBUGF("-> chain_dependencies of xcmd(%lu)\n", xcmd->uid);
+	for (didx = 0; didx < dcount; ++didx) {
+		struct drm_xocl_bo *dbo = xcmd->deps[didx];
+		struct xocl_cmd *chain_to = dbo->metadata.active;
+		// release reference created in ioctl call when dependency was looked up
+		// see comments in xocl_ioctl.c:xocl_execbuf_ioctl()
+//PORT4_20
+//		drm_gem_object_unreference_unlocked(&dbo->base);
+		drm_gem_object_put_unlocked(&dbo->base);
+		xcmd->deps[didx] = NULL;
+		if (!chain_to) { /* command may have completed already */
+			--xcmd->wait_count;
+			continue;
+		}
+		if (chain_to->chain_count >= MAX_DEPS) {
+			DRM_INFO("chain count exceeded");
+			return 1;
+		}
+		SCHED_DEBUGF("+ xcmd(%lu)->chain[%d]=xcmd(%lu)", chain_to->uid, chain_to->chain_count, xcmd->uid);
+		chain_to->chain[chain_to->chain_count++] = xcmd;
+	}
+	SCHED_DEBUG("<- chain_dependencies\n");
+	return 0;
+}
+
+/**
+ * trigger_chain() - Trigger the execution of any commands chained to argument command
+ *
+ * @xcmd: Completed command that must trigger its chained (waiting) commands
+ *
+ * The argument command has completed and must trigger the execution of all
+ * chained commands whose wait_count is 0.
+ */
+static void
+cmd_trigger_chain(struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("-> trigger_chain xcmd(%lu)\n", xcmd->uid);
+	while (xcmd->chain_count) {
+		struct xocl_cmd *trigger = xcmd->chain[--xcmd->chain_count];
+
+		SCHED_DEBUGF("+ cmd(%lu) triggers cmd(%lu) with wait_count(%d)\n",
+			     xcmd->uid, trigger->uid, trigger->wait_count);
+		// decrement trigger wait count
+		// scheduler will submit when wait count reaches zero
+		--trigger->wait_count;
+	}
+	SCHED_DEBUG("<- trigger_chain\n");
+}
+
+
+/**
+ * cmd_get() - Get a free command object
+ *
+ * Get from free/recycled list or allocate a new command if necessary.
+ *
+ * Return: Free command object
+ */
+static struct xocl_cmd*
+cmd_get(struct xocl_scheduler *xs, struct exec_core *exec, struct client_ctx *client)
+{
+	struct xocl_cmd *xcmd;
+	static unsigned long count;
+
+	mutex_lock(&free_cmds_mutex);
+	xcmd = list_first_entry_or_null(&free_cmds, struct xocl_cmd, cq_list);
+	if (xcmd)
+		list_del(&xcmd->cq_list);
+	mutex_unlock(&free_cmds_mutex);
+	if (!xcmd)
+		xcmd = kmalloc(sizeof(struct xocl_cmd), GFP_KERNEL);
+	if (!xcmd)
+		return ERR_PTR(-ENOMEM);
+	xcmd->uid = count++;
+	xcmd->exec = exec;
+	xcmd->cu_idx = no_index;
+	xcmd->slot_idx = no_index;
+	xcmd->xs = xs;
+	xcmd->xdev = client->xdev;
+	xcmd->client = client;
+	xcmd->bo = NULL;
+	xcmd->ecmd = NULL;
+	atomic_inc(&client->outstanding_execs);
+	SCHED_DEBUGF("xcmd(%lu) xcmd(%p) [-> new ]\n", xcmd->uid, xcmd);
+	return xcmd;
+}
+
+/**
+ * cmd_free() - free a command object
+ *
+ * @xcmd: command object to free (move to freelist)
+ *
+ * The command *is* in some current list (scheduler command queue)
+ */
+static void
+cmd_free(struct xocl_cmd *xcmd)
+{
+	cmd_release_gem_object_reference(xcmd);
+
+	mutex_lock(&free_cmds_mutex);
+	list_move_tail(&xcmd->cq_list, &free_cmds);
+	mutex_unlock(&free_cmds_mutex);
+
+	atomic_dec(&xcmd->xdev->outstanding_execs);
+	atomic_dec(&xcmd->client->outstanding_execs);
+	SCHED_DEBUGF("xcmd(%lu) [-> free]\n", xcmd->uid);
+}
+
+/**
+ * abort_cmd() - abort command object before it becomes pending
+ *
+ * @xcmd: command object to abort (move to freelist)
+ *
+ * Command object is *not* in any current list
+ *
+ * Return: 0
+ */
+static void
+cmd_abort(struct xocl_cmd *xcmd)
+{
+	mutex_lock(&free_cmds_mutex);
+	list_add_tail(&xcmd->cq_list, &free_cmds);
+	mutex_unlock(&free_cmds_mutex);
+	SCHED_DEBUGF("xcmd(%lu) [-> abort]\n", xcmd->uid);
+}
+
+/*
+ * cmd_bo_init() - Initialize a command object with an exec BO
+ *
+ * In penguin mode, the command object caches the CUs available
+ * to execute the command.  When ERT is enabled, the CU info
+ * is not used.
+ */
+static void
+cmd_bo_init(struct xocl_cmd *xcmd, struct drm_xocl_bo *bo,
+	    int numdeps, struct drm_xocl_bo **deps, int penguin)
+{
+	SCHED_DEBUGF("%s(%lu,bo,%d,deps,%d)\n", __func__, xcmd->uid, numdeps, penguin);
+	xcmd->bo = bo;
+	xcmd->ecmd = (struct ert_packet *)bo->vmapping;
+
+	if (penguin && cmd_opcode(xcmd) == ERT_START_KERNEL) {
+		unsigned int i = 0;
+		u32 cumasks[4] = {0};
+
+		cumasks[0] = xcmd->kcmd->cu_mask;
+		SCHED_DEBUGF("+ xcmd(%lu) cumask[0]=0x%x\n", xcmd->uid, cumasks[0]);
+		for (i = 0; i < xcmd->kcmd->extra_cu_masks; ++i) {
+			cumasks[i+1] = xcmd->kcmd->data[i];
+			SCHED_DEBUGF("+ xcmd(%lu) cumask[%d]=0x%x\n", xcmd->uid, i+1, cumasks[i+1]);
+		}
+		xocl_bitmap_from_arr32(xcmd->cu_bitmap, cumasks, MAX_CUS);
+		SCHED_DEBUGF("cu_bitmap[0] = %lu\n", xcmd->cu_bitmap[0]);
+	}
+
+	// dependencies are copied here, the anticipated wait_count is number
+	// of specified dependencies.  The wait_count is adjusted when the
+	// command is queued in the scheduler based on whether or not a
+	// dependency is active (managed by scheduler)
+	memcpy(xcmd->deps, deps, numdeps*sizeof(struct drm_xocl_bo *));
+	xcmd->wait_count = numdeps;
+	xcmd->chain_count = 0;
+}
+
+/*
+ */
+static void
+cmd_packet_init(struct xocl_cmd *xcmd, struct ert_packet *packet)
+{
+	SCHED_DEBUGF("%s(%lu,packet)\n", __func__, xcmd->uid);
+	xcmd->ecmd = packet;
+}
+
+/*
+ * cmd_has_cu() - Check if this command object can execute on CU
+ *
+ * @cuidx: the index of the CU.  Note that CU indices start from 0.
+ */
+static int
+cmd_has_cu(struct xocl_cmd *xcmd, unsigned int cuidx)
+{
+	SCHED_DEBUGF("%s(%lu,%d) = %d\n", __func__, xcmd->uid, cuidx, test_bit(cuidx, xcmd->cu_bitmap));
+	return test_bit(cuidx, xcmd->cu_bitmap);
+}
+
+/*
+ * struct xocl_cu: Represents a compute unit in penguin mode
+ *
+ * @running_queue: a fifo representing commands running on this CU
+ * @xdev: the xrt device with this CU
+ * @idx: index of this CU
+ * @base: exec base address of this CU
+ * @addr: base address of this CU
+ * @ctrlreg: state of the CU (value of AXI-lite control register)
+ * @done_cnt: number of commands that have completed (<= running_queue.size())
+ *
+ */
+struct xocl_cu {
+	struct list_head   running_queue;
+	unsigned int idx;
+	void __iomem *base;
+	u32 addr;
+
+	u32 ctrlreg;
+	unsigned int done_cnt;
+	unsigned int run_cnt;
+	unsigned int uid;
+};
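+
+/*
+ * Illustrative note on the AXI-lite handshake used below: cu_start() raises
+ * AP_START, cu_poll() looks for AP_DONE and acknowledges it with AP_CONTINUE,
+ * and cu_ready() reports the CU free for a new command once AP_START has
+ * dropped.
+ */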
+
+/*
+ */
+void
+cu_reset(struct xocl_cu *xcu, unsigned int idx, void __iomem *base, u32 addr)
+{
+	xcu->idx = idx;
+	xcu->base = base;
+	xcu->addr = addr;
+	xcu->ctrlreg = 0;
+	xcu->done_cnt = 0;
+	xcu->run_cnt = 0;
+	SCHED_DEBUGF("%s(uid:%d,idx:%d) @ 0x%x\n", __func__, xcu->uid, xcu->idx, xcu->addr);
+}
+
+/*
+ */
+struct xocl_cu *
+cu_create(void)
+{
+	struct xocl_cu *xcu = kmalloc(sizeof(struct xocl_cu), GFP_KERNEL);
+	static unsigned int uid;
+
+	INIT_LIST_HEAD(&xcu->running_queue);
+	xcu->uid = uid++;
+	SCHED_DEBUGF("%s(uid:%d)\n", __func__, xcu->uid);
+	return xcu;
+}
+
+static inline u32
+cu_base_addr(struct xocl_cu *xcu)
+{
+	return xcu->addr;
+}
+
+/*
+ */
+void
+cu_destroy(struct xocl_cu *xcu)
+{
+	SCHED_DEBUGF("%s(uid:%d)\n", __func__, xcu->uid);
+	kfree(xcu);
+}
+
+/*
+ */
+void
+cu_poll(struct xocl_cu *xcu)
+{
+	// assert !list_empty(&running_queue)
+	xcu->ctrlreg = ioread32(xcu->base + xcu->addr);
+	SCHED_DEBUGF("%s(%d) 0x%x done(%d) run(%d)\n", __func__, xcu->idx, xcu->ctrlreg, xcu->done_cnt, xcu->run_cnt);
+	if (xcu->ctrlreg & AP_DONE) {
+		++xcu->done_cnt; // assert done_cnt <= |running_queue|
+		--xcu->run_cnt;
+		// acknowledge done
+		iowrite32(AP_CONTINUE, xcu->base + xcu->addr);
+	}
+}
+
+/*
+ * cu_ready() - Check if CU is ready to start another command
+ *
+ * The CU is ready when AP_START is low
+ */
+static int
+cu_ready(struct xocl_cu *xcu)
+{
+	if (xcu->ctrlreg & AP_START)
+		cu_poll(xcu);
+
+	SCHED_DEBUGF("%s(%d) returns %d\n", __func__, xcu->idx, !(xcu->ctrlreg & AP_START));
+	return !(xcu->ctrlreg & AP_START);
+}
+
+/*
+ * cu_first_done() - Get the first completed command from the running queue
+ *
+ * Return: The first command that has completed or NULL if none
+ */
+static struct xocl_cmd*
+cu_first_done(struct xocl_cu *xcu)
+{
+	if (!xcu->done_cnt)
+		cu_poll(xcu);
+
+	SCHED_DEBUGF("%s(%d) has done_cnt %d\n", __func__, xcu->idx, xcu->done_cnt);
+
+	return xcu->done_cnt
+		? list_first_entry(&xcu->running_queue, struct xocl_cmd, rq_list)
+		: NULL;
+}
+
+/*
+ * cu_pop_done() - Remove first element from running queue
+ */
+static void
+cu_pop_done(struct xocl_cu *xcu)
+{
+	struct xocl_cmd *xcmd;
+
+	if (!xcu->done_cnt)
+		return;
+	xcmd = list_first_entry(&xcu->running_queue, struct xocl_cmd, rq_list);
+	list_del(&xcmd->rq_list);
+	--xcu->done_cnt;
+	SCHED_DEBUGF("%s(%d) xcmd(%lu) done(%d) run(%d)\n", __func__, xcu->idx, xcmd->uid, xcu->done_cnt, xcu->run_cnt);
+}
+
+/*
+ * cu_start() - Start the CU with a new command.
+ *
+ * The command is pushed onto the running queue
+ */
+static int
+cu_start(struct xocl_cu *xcu, struct xocl_cmd *xcmd)
+{
+	// assert(!(ctrlreg & AP_START), "cu not ready");
+
+	// data past header and cu_masks
+	unsigned int size = cmd_regmap_size(xcmd);
+	u32 *regmap = cmd_regmap(xcmd);
+	unsigned int i;
+
+	// past header, past cumasks
+	SCHED_DEBUG_PACKET(regmap, size);
+
+	// write register map, starting at base + 0xC
+	// 0x4, 0x8 are used for the interrupt, which is initialized during setup
+	for (i = 1; i < size; ++i)
+		iowrite32(*(regmap + i), xcu->base + xcu->addr + (i << 2));
+
+	// start cu.  update local state as we may not be polling prior
+	// to next ready check.
+	xcu->ctrlreg |= AP_START;
+	iowrite32(AP_START, xcu->base + xcu->addr);
+
+	// add cmd to end of running queue
+	list_add_tail(&xcmd->rq_list, &xcu->running_queue);
+	++xcu->run_cnt;
+
+	SCHED_DEBUGF("%s(%d) started xcmd(%lu) done(%d) run(%d)\n",
+		     __func__, xcu->idx, xcmd->uid, xcu->done_cnt, xcu->run_cnt);
+
+	return true;
+}
+
+
+/*
+ * struct xocl_ert: Represents embedded scheduler in ert mode
+ */
+struct xocl_ert {
+	void __iomem *base;
+	u32	     cq_addr;
+	unsigned int uid;
+
+	unsigned int slot_size;
+	unsigned int cq_intr;
+};
+
+/*
+ */
+struct xocl_ert *
+ert_create(void __iomem *base, u32 cq_addr)
+{
+	struct xocl_ert *xert = kmalloc(sizeof(struct xocl_ert), GFP_KERNEL);
+	static unsigned int uid;
+
+	xert->base = base;
+	xert->cq_addr = cq_addr;
+	xert->uid = uid++;
+	xert->slot_size = 0;
+	xert->cq_intr = false;
+	SCHED_DEBUGF("%s(%d,0x%x)\n", __func__, xert->uid, xert->cq_addr);
+	return xert;
+}
+
+/*
+ */
+static void
+ert_destroy(struct xocl_ert *xert)
+{
+	SCHED_DEBUGF("%s(%d)\n", __func__, xert->uid);
+	kfree(xert);
+}
+
+/*
+ */
+static void
+ert_cfg(struct xocl_ert *xert, unsigned int slot_size, unsigned int cq_intr)
+{
+	SCHED_DEBUGF("%s(%d) slot_size(%d) cq_intr(%d)\n", __func__, xert->uid, slot_size, cq_intr);
+	xert->slot_size = slot_size;
+	xert->cq_intr = cq_intr;
+}
+
+/*
+ */
+static bool
+ert_start_cmd(struct xocl_ert *xert, struct xocl_cmd *xcmd)
+{
+	u32 slot_addr = xert->cq_addr + xcmd->slot_idx * xert->slot_size;
+	struct ert_packet *ecmd = cmd_packet(xcmd);
+
+	SCHED_DEBUG_PACKET(ecmd, cmd_packet_size(xcmd));
+
+	SCHED_DEBUGF("-> %s(%d,%lu)\n", __func__, xert->uid, xcmd->uid);
+
+	// write packet minus header
+	SCHED_DEBUGF("++ slot_idx=%d, slot_addr=0x%x\n", xcmd->slot_idx, slot_addr);
+	memcpy_toio(xert->base + slot_addr + 4, ecmd->data, (cmd_packet_size(xcmd) - 1) * sizeof(u32));
+
+	// write header
+	iowrite32(ecmd->header, xert->base + slot_addr);
+
+	// trigger interrupt to embedded scheduler if feature is enabled
+	if (xert->cq_intr) {
+		u32 cq_int_addr = ERT_CQ_STATUS_REGISTER_ADDR + (slot_mask_idx(xcmd->slot_idx) << 2);
+		u32 mask = 1 << slot_idx_in_mask(xcmd->slot_idx);
+
+		SCHED_DEBUGF("++ mb_submit writes slot mask 0x%x to CQ_INT register at addr 0x%x\n",
+			     mask, cq_int_addr);
+		iowrite32(mask, xert->base + cq_int_addr);
+	}
+	SCHED_DEBUGF("<- %s returns true\n", __func__);
+	return true;
+}
+
+/*
+ */
+static void
+ert_read_custat(struct xocl_ert *xert, unsigned int num_cus, u32 *cu_usage, struct xocl_cmd *xcmd)
+{
+	u32 slot_addr = xert->cq_addr + xcmd->slot_idx*xert->slot_size;
+
+	memcpy_fromio(cu_usage, xert->base + slot_addr + 4, num_cus * sizeof(u32));
+}
+
+/**
+ * struct exec_ops: scheduler specific operations
+ *
+ * Scheduler can operate in MicroBlaze mode (mb/ert) or in penguin mode. This
+ * struct differentiates specific operations.  The struct is per device node,
+ * meaning that one device can operate in ert mode while another can operate
+ * in penguin mode.
+ */
+struct exec_ops {
+	bool (*start)(struct exec_core *exec, struct xocl_cmd *xcmd);
+	void (*query)(struct exec_core *exec, struct xocl_cmd *xcmd);
+};
+
+static struct exec_ops ert_ops;
+static struct exec_ops penguin_ops;
+
+/**
+ * struct exec_core: Core data structure for command execution on a device
+ *
+ * @ctx_list: Context list populated with device context
+ * @exec_lock: Lock for synchronizing external access
+ * @poll_wait_queue: Wait queue for device polling
+ * @scheduler: Command queue scheduler
+ * @submitted_cmds: Tracking of command submitted for execution on this device
+ * @num_slots: Number of command queue slots
+ * @num_cus: Number of CUs in loaded program
+ * @num_cdma: Number of CDMAs in hardware
+ * @polling_mode: If set then poll for command completion
+ * @cq_interrupt: If set then trigger interrupt to MB on new commands
+ * @configured: Flag to indicate that the core data structure has been initialized
+ * @stopped: Flag to indicate that the core data structure cannot be used
+ * @flush: Flag to indicate that commands for this device should be flushed
+ * @cu_usage: Usage count since last reset
+ * @slot_status: Bitmap to track status (busy(1)/free(0)) slots in command queue
+ * @ctrl_busy: Flag to indicate that slot 0 (ctrl commands) is busy
+ * @cu_status: Bitmap to track status (busy(1)/free(0)) of CUs. Unused in ERT mode.
+ * @sr0: If set, then status register [0..31] is pending with completed commands (ERT only).
+ * @sr1: If set, then status register [32..63] is pending with completed commands (ERT only).
+ * @sr2: If set, then status register [64..95] is pending with completed commands (ERT only).
+ * @sr3: If set, then status register [96..127] is pending with completed commands (ERT only).
+ * @ops: Scheduler operations vtable
+ */
+struct exec_core {
+	struct platform_device	   *pdev;
+
+	struct mutex		   exec_lock;
+
+	void __iomem		   *base;
+	u32			   intr_base;
+	u32			   intr_num;
+
+	wait_queue_head_t	   poll_wait_queue;
+
+	struct xocl_scheduler	   *scheduler;
+
+	uuid_t			   xclbin_id;
+
+	unsigned int		   num_slots;
+	unsigned int		   num_cus;
+	unsigned int		   num_cdma;
+	unsigned int		   polling_mode;
+	unsigned int		   cq_interrupt;
+	unsigned int		   configured;
+	unsigned int		   stopped;
+	unsigned int		   flush;
+
+	struct xocl_cu		   *cus[MAX_CUS];
+	struct xocl_ert		   *ert;
+
+	u32			   cu_usage[MAX_CUS];
+
+	// Bitmap tracks busy(1)/free(0) slots in cmd_slots
+	struct xocl_cmd		   *submitted_cmds[MAX_SLOTS];
+	DECLARE_BITMAP(slot_status, MAX_SLOTS);
+	unsigned int		   ctrl_busy;
+
+	// Status register pending complete.  Written by ISR,
+	// cleared by scheduler
+	atomic_t		   sr0;
+	atomic_t		   sr1;
+	atomic_t		   sr2;
+	atomic_t		   sr3;
+
+	// Operations for dynamic indirection dependent on MB
+	// or kernel scheduler
+	struct exec_ops		   *ops;
+
+	unsigned int		   uid;
+	unsigned int		   ip_reference[MAX_CUS];
+};
+
+/**
+ * exec_get_pdev() -
+ */
+static inline struct platform_device *
+exec_get_pdev(struct exec_core *exec)
+{
+	return exec->pdev;
+}
+
+/**
+ * exec_get_xdev() -
+ */
+static inline struct xocl_dev *
+exec_get_xdev(struct exec_core *exec)
+{
+	return xocl_get_xdev(exec->pdev);
+}
+
+/*
+ */
+static inline bool
+exec_is_ert(struct exec_core *exec)
+{
+	return exec->ops == &ert_ops;
+}
+
+/*
+ */
+static inline bool
+exec_is_polling(struct exec_core *exec)
+{
+	return exec->polling_mode;
+}
+
+/*
+ */
+static inline bool
+exec_is_flush(struct exec_core *exec)
+{
+	return exec->flush;
+}
+
+/*
+ */
+static inline u32
+exec_cu_base_addr(struct exec_core *exec, unsigned int cuidx)
+{
+	return cu_base_addr(exec->cus[cuidx]);
+}
+
+/*
+ */
+static inline u32
+exec_cu_usage(struct exec_core *exec, unsigned int cuidx)
+{
+	return exec->cu_usage[cuidx];
+}
+
+/*
+ */
+static void
+exec_cfg(struct exec_core *exec)
+{
+}
+
+/*
+ * to be automated
+ */
+static int
+exec_cfg_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	struct xocl_dev *xdev = exec_get_xdev(exec);
+	struct client_ctx *client = xcmd->client;
+	bool ert = xocl_mb_sched_on(xdev);
+	uint32_t *cdma = xocl_cdma_addr(xdev);
+	unsigned int dsa = xocl_dsa_version(xdev);
+	struct ert_configure_cmd *cfg;
+	int cuidx = 0;
+
+	/* Only allow configuration with one live ctx */
+	if (exec->configured) {
+		DRM_INFO("command scheduler is already configured for this device\n");
+		return 1;
+	}
+
+	DRM_INFO("ert per feature rom = %d\n", ert);
+	DRM_INFO("dsa per feature rom = %d\n", dsa);
+
+	cfg = (struct ert_configure_cmd *)(xcmd->ecmd);
+
+	/* Mark command as control command to force slot 0 execution */
+	cfg->type = ERT_CTRL;
+
+	if (cfg->count != 5 + cfg->num_cus) {
+		DRM_INFO("invalid configure command, count=%d expected 5+num_cus(%d)\n", cfg->count, cfg->num_cus);
+		return 1;
+	}
+
+	SCHED_DEBUG("configuring scheduler\n");
+	exec->num_slots = ERT_CQ_SIZE / cfg->slot_size;
+	exec->num_cus = cfg->num_cus;
+	exec->num_cdma = 0;
+
+	// skip this in polling mode
+	for (cuidx = 0; cuidx < exec->num_cus; ++cuidx) {
+		struct xocl_cu *xcu = exec->cus[cuidx];
+
+		if (!xcu)
+			xcu = exec->cus[cuidx] = cu_create();
+		cu_reset(xcu, cuidx, exec->base, cfg->data[cuidx]);
+		userpf_info(xdev, "%s cu(%d) at 0x%x\n", __func__, xcu->idx, xcu->addr);
+	}
+
+	if (cdma) {
+		uint32_t *addr = 0;
+
+		mutex_lock(&client->lock); /* for modification to client cu_bitmap */
+		for (addr = cdma; addr < cdma+4; ++addr) { /* 4 is from xclfeatures.h */
+			if (*addr) {
+				struct xocl_cu *xcu = exec->cus[cuidx];
+
+				if (!xcu)
+					xcu = exec->cus[cuidx] = cu_create();
+				cu_reset(xcu, cuidx, exec->base, *addr);
+				++exec->num_cus;
+				++exec->num_cdma;
+				++cfg->num_cus;
+				++cfg->count;
+				cfg->data[cuidx] = *addr;
+				set_bit(cuidx, client->cu_bitmap); /* cdma is shared */
+				userpf_info(xdev, "configure cdma as cu(%d) at 0x%x\n", cuidx, *addr);
+				++cuidx;
+			}
+		}
+		mutex_unlock(&client->lock);
+	}
+
+	if (ert && cfg->ert) {
+		SCHED_DEBUG("++ configuring embedded scheduler mode\n");
+		if (!exec->ert)
+			exec->ert = ert_create(exec->base, ERT_CQ_BASE_ADDR);
+		ert_cfg(exec->ert, cfg->slot_size, cfg->cq_int);
+		exec->ops = &ert_ops;
+		exec->polling_mode = cfg->polling;
+		exec->cq_interrupt = cfg->cq_int;
+		cfg->dsa52 = (dsa >= 52) ? 1 : 0;
+		cfg->cdma = cdma ? 1 : 0;
+	} else {
+		SCHED_DEBUG("++ configuring penguin scheduler mode\n");
+		exec->ops = &penguin_ops;
+		exec->polling_mode = 1;
+	}
+
+	// reserve slot 0 for control commands
+	set_bit(0, exec->slot_status);
+
+	DRM_INFO("scheduler config ert(%d) slots(%d), cudma(%d), cuisr(%d), cdma(%d), cus(%d)\n"
+		 , exec_is_ert(exec)
+		 , exec->num_slots
+		 , cfg->cu_dma ? 1 : 0
+		 , cfg->cu_isr ? 1 : 0
+		 , exec->num_cdma
+		 , exec->num_cus);
+
+	exec->configured = true;
+	return 0;
+}
+
+/**
+ * exec_reset() - Reset the scheduler
+ *
+ * @exec: Execution core (device) to reset
+ *
+ * TODO: Perform scheduler configuration based on current xclbin
+ *      rather than relying on the cfg command
+ */
+static void
+exec_reset(struct exec_core *exec)
+{
+	struct xocl_dev *xdev = exec_get_xdev(exec);
+	uuid_t *xclbin_id;
+
+	mutex_lock(&exec->exec_lock);
+
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+
+	userpf_info(xdev, "%s(%d) cfg(%d)\n", __func__, exec->uid, exec->configured);
+
+	// only reconfigure the scheduler on new xclbin
+	if (!xclbin_id || (uuid_equal(&exec->xclbin_id, xclbin_id) && exec->configured)) {
+		exec->stopped = false;
+		exec->configured = false;  // TODO: remove, but hangs ERT because of in between AXI resets
+		goto out;
+	}
+
+	userpf_info(xdev, "exec->xclbin(%pUb),xclbin(%pUb)\n", &exec->xclbin_id, xclbin_id);
+	userpf_info(xdev, "%s resets for new xclbin", __func__);
+	memset(exec->cu_usage, 0, MAX_CUS * sizeof(u32));
+	uuid_copy(&exec->xclbin_id, xclbin_id);
+	exec->num_cus = 0;
+	exec->num_cdma = 0;
+
+	exec->num_slots = 16;
+	exec->polling_mode = 1;
+	exec->cq_interrupt = 0;
+	exec->configured = false;
+	exec->stopped = false;
+	exec->flush = false;
+	exec->ops = &penguin_ops;
+
+	bitmap_zero(exec->slot_status, MAX_SLOTS);
+	set_bit(0, exec->slot_status); // reserve for control command
+	exec->ctrl_busy = false;
+
+	atomic_set(&exec->sr0, 0);
+	atomic_set(&exec->sr1, 0);
+	atomic_set(&exec->sr2, 0);
+	atomic_set(&exec->sr3, 0);
+
+	exec_cfg(exec);
+
+out:
+	mutex_unlock(&exec->exec_lock);
+}
+
+/**
+ * exec_stop() - Stop the scheduler from scheduling commands on this core
+ *
+ * @exec:  Execution core (device) to stop
+ *
+ * Block access to the current exec_core (device).  This API must be called
+ * prior to performing an AXI reset and downloading a new xclbin.  Calling
+ * this API flushes the commands running on the current device and prevents
+ * new commands from being scheduled on the device, effectively preventing
+ * any further commands from running on it.
+ */
+static void
+exec_stop(struct exec_core *exec)
+{
+	int idx;
+	struct xocl_dev *xdev = exec_get_xdev(exec);
+	unsigned int outstanding = 0;
+	unsigned int wait_ms = 100;
+	unsigned int retry = 20;  // 2 sec
+
+	mutex_lock(&exec->exec_lock);
+	userpf_info(xdev, "%s(%p)\n", __func__, exec);
+	exec->stopped = true;
+	mutex_unlock(&exec->exec_lock);
+
+	// Wait for commands to drain if any
+	outstanding = atomic_read(&xdev->outstanding_execs);
+	while (--retry && outstanding) {
+		userpf_info(xdev, "Waiting for %d outstanding commands to finish", outstanding);
+		msleep(wait_ms);
+		outstanding = atomic_read(&xdev->outstanding_execs);
+	}
+
+	// Last gasp, flush any remaining commands for this device exec core
+	// This is an abnormal case.  All exec clients have been destroyed
+	// prior to exec_stop being called (per contract), which implies that
+	// all regular client commands have been flushed.
+	if (outstanding) {
+		// Wake up the scheduler to force one iteration flushing stale
+		// commands for this device
+		exec->flush = 1;
+		scheduler_intr(exec->scheduler);
+
+		// Wait a second
+		msleep(1000);
+	}
+
+	outstanding = atomic_read(&xdev->outstanding_execs);
+	if (outstanding)
+		userpf_err(xdev, "unexpected outstanding commands %d after flush", outstanding);
+
+	// Stale commands were flushed, reset submitted command state
+	for (idx = 0; idx < MAX_SLOTS; ++idx)
+		exec->submitted_cmds[idx] = NULL;
+
+	bitmap_zero(exec->slot_status, MAX_SLOTS);
+	set_bit(0, exec->slot_status); // reserve for control command
+	exec->ctrl_busy = false;
+}
+
+/*
+ * exec_isr() - Interrupt handler for ERT command completion.  Records which
+ * status register fired and wakes the scheduler.
+ */
+static irqreturn_t
+exec_isr(int irq, void *arg)
+{
+	struct exec_core *exec = (struct exec_core *)arg;
+
+	SCHED_DEBUGF("-> xocl_user_event %d\n", irq);
+	if (exec_is_ert(exec) && !exec->polling_mode) {
+
+		if (irq == 0)
+			atomic_set(&exec->sr0, 1);
+		else if (irq == 1)
+			atomic_set(&exec->sr1, 1);
+		else if (irq == 2)
+			atomic_set(&exec->sr2, 1);
+		else if (irq == 3)
+			atomic_set(&exec->sr3, 1);
+
+		/* wake up all scheduler ... currently one only */
+		scheduler_intr(exec->scheduler);
+	} else {
+		userpf_err(exec_get_xdev(exec), "Unhandled isr irq %d, is_ert %d, polling %d",
+			   irq, exec_is_ert(exec), exec->polling_mode);
+	}
+	SCHED_DEBUGF("<- xocl_user_event\n");
+	return IRQ_HANDLED;
+}
+
+/*
+ * exec_create() - Create an exec core for @pdev and register its
+ * command-completion interrupts.
+ */
+struct exec_core *
+exec_create(struct platform_device *pdev, struct xocl_scheduler *xs)
+{
+	struct exec_core *exec = devm_kzalloc(&pdev->dev, sizeof(struct exec_core), GFP_KERNEL);
+	struct xocl_dev *xdev = xocl_get_xdev(pdev);
+	struct resource *res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+	static unsigned int count;
+	unsigned int i;
+
+	if (!exec)
+		return NULL;
+
+	mutex_init(&exec->exec_lock);
+	exec->base = xdev->core.bar_addr;
+
+	exec->intr_base = res->start;
+	exec->intr_num = res->end - res->start + 1;
+	exec->pdev = pdev;
+
+	init_waitqueue_head(&exec->poll_wait_queue);
+	exec->scheduler = xs;
+	exec->uid = count++;
+
+	for (i = 0; i < exec->intr_num; i++) {
+		xocl_user_interrupt_reg(xdev, i+exec->intr_base, exec_isr, exec);
+		xocl_user_interrupt_config(xdev, i + exec->intr_base, true);
+	}
+
+	exec_reset(exec);
+	platform_set_drvdata(pdev, exec);
+
+	SCHED_DEBUGF("%s(%d)\n", __func__, exec->uid);
+
+	return exec;
+}
+
+/*
+ * exec_destroy() - Tear down an exec core along with its CUs and ERT state.
+ */
+static void
+exec_destroy(struct exec_core *exec)
+{
+	int idx;
+
+	SCHED_DEBUGF("%s(%d)\n", __func__, exec->uid);
+	for (idx = 0; idx < exec->num_cus; ++idx)
+		cu_destroy(exec->cus[idx]);
+	if (exec->ert)
+		ert_destroy(exec->ert);
+	devm_kfree(&exec->pdev->dev, exec);
+}
+
+/*
+ * exec_scheduler() - Return the scheduler associated with this exec core.
+ */
+static inline struct xocl_scheduler *
+exec_scheduler(struct exec_core *exec)
+{
+	return exec->scheduler;
+}
+
+/*
+ * acquire_slot_idx() - First available slot index
+ */
+static unsigned int
+exec_acquire_slot_idx(struct exec_core *exec)
+{
+	unsigned int idx = find_first_zero_bit(exec->slot_status, MAX_SLOTS);
+
+	SCHED_DEBUGF("%s(%d) returns %d\n", __func__, exec->uid, idx < exec->num_slots ? idx : no_index);
+	if (idx < exec->num_slots) {
+		set_bit(idx, exec->slot_status);
+		return idx;
+	}
+	return no_index;
+}
+
+
+/**
+ * acquire_slot() - Acquire a slot index for a command
+ *
+ * This function makes a special case for control commands which
+ * must always dispatch to slot 0, otherwise normal acquisition
+ */
+static int
+exec_acquire_slot(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	// slot 0 is reserved for ctrl commands
+	if (cmd_type(xcmd) == ERT_CTRL) {
+		SCHED_DEBUGF("%s(%d,%lu) ctrl cmd\n", __func__, exec->uid, xcmd->uid);
+		if (exec->ctrl_busy)
+			return -1;
+		exec->ctrl_busy = true;
+		return (xcmd->slot_idx = 0);
+	}
+
+	return (xcmd->slot_idx = exec_acquire_slot_idx(exec));
+}
+
+/*
+ * release_slot_idx() - Release specified slot idx
+ */
+static void
+exec_release_slot_idx(struct exec_core *exec, unsigned int slot_idx)
+{
+	clear_bit(slot_idx, exec->slot_status);
+}
+
+/**
+ * release_slot() - Release a slot index for a command
+ *
+ * Special case for control commands that execute in slot 0.  This
+ * slot cannot be marked free ever.
+ */
+static void
+exec_release_slot(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	if (xcmd->slot_idx == no_index)
+		return; // already released
+
+	SCHED_DEBUGF("%s(%d) xcmd(%lu) slotidx(%d)\n",
+		     __func__, exec->uid, xcmd->uid, xcmd->slot_idx);
+	if (cmd_type(xcmd) == ERT_CTRL) {
+		SCHED_DEBUG("+ ctrl cmd\n");
+		exec->ctrl_busy = false;
+	} else {
+		exec_release_slot_idx(exec, xcmd->slot_idx);
+	}
+	xcmd->slot_idx = no_index;
+}
+
+/*
+ * submit_cmd() - Submit command for execution on this core
+ *
+ * Return: true on success, false if command could not be submitted
+ */
+static bool
+exec_submit_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	unsigned int slotidx = exec_acquire_slot(exec, xcmd);
+
+	if (slotidx == no_index)
+		return false;
+	SCHED_DEBUGF("%s(%d,%lu) slotidx(%d)\n", __func__, exec->uid, xcmd->uid, slotidx);
+	exec->submitted_cmds[slotidx] = xcmd;
+	cmd_set_int_state(xcmd, ERT_CMD_STATE_SUBMITTED);
+	return true;
+}
+
+/*
+ * finish_cmd() - Special post processing of commands after execution
+ */
+static int
+exec_finish_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	if (cmd_opcode(xcmd) == ERT_CU_STAT && exec_is_ert(exec))
+		ert_read_custat(exec->ert, exec->num_cus, exec->cu_usage, xcmd);
+	return 0;
+}
+
+/*
+ * execute_write_cmd() - Execute ERT_WRITE commands
+ */
+static int
+exec_execute_write_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	struct ert_packet *ecmd = xcmd->ecmd;
+	unsigned int idx = 0;
+
+	SCHED_DEBUGF("-> %s(%d,%lu)\n", __func__, exec->uid, xcmd->uid);
+	for (idx = 0; idx < ecmd->count - 1; idx += 2) {
+		u32 addr = ecmd->data[idx];
+		u32 val = ecmd->data[idx+1];
+
+		SCHED_DEBUGF("+ exec_write_cmd base[0x%x] = 0x%x\n", addr, val);
+		iowrite32(val, exec->base + addr);
+	}
+	SCHED_DEBUG("<- exec_write\n");
+	return 0;
+}
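+
+/*
+ * Illustrative ERT_WRITE payload layout (not part of the driver): the loop
+ * above consumes data[] as (register offset, value) pairs relative to
+ * exec->base, so a packet with count = 4 would carry two writes:
+ *
+ *	ecmd->data[0] = 0x10;        // offset of first register (example)
+ *	ecmd->data[1] = 0xdeadbeef;  // value written to base + 0x10
+ *	ecmd->data[2] = 0x14;        // offset of second register (example)
+ *	ecmd->data[3] = 0x1;         // value written to base + 0x14
+ */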
+
+/*
+ * notify_host() - Notify user space that a command is complete.
+ */
+static void
+exec_notify_host(struct exec_core *exec)
+{
+	struct list_head *ptr;
+	struct client_ctx *entry;
+	struct xocl_dev *xdev = exec_get_xdev(exec);
+
+	SCHED_DEBUGF("-> %s(%d)\n", __func__, exec->uid);
+
+	/* now for each client update the trigger counter in the context */
+	mutex_lock(&xdev->ctx_list_lock);
+	list_for_each(ptr, &xdev->ctx_list) {
+		entry = list_entry(ptr, struct client_ctx, link);
+		atomic_inc(&entry->trigger);
+	}
+	mutex_unlock(&xdev->ctx_list_lock);
+	/* wake up all the clients */
+	wake_up_interruptible(&exec->poll_wait_queue);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/*
+ * exec_mark_cmd_complete() - Move a command to complete state
+ *
+ * Commands are marked complete in two ways
+ *  1. Through polling of CUs or polling of MB status register
+ *  2. Through interrupts from MB
+ *
+ * @xcmd: Command to mark complete
+ *
+ * The external command state is changed to complete and the host
+ * is notified that some command has completed.
+ */
+static void
+exec_mark_cmd_complete(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("-> %s(%d,%lu)\n", __func__, exec->uid, xcmd->uid);
+	if (cmd_type(xcmd) == ERT_CTRL)
+		exec_finish_cmd(exec, xcmd);
+
+	cmd_set_state(xcmd, ERT_CMD_STATE_COMPLETED);
+
+	if (exec->polling_mode)
+		scheduler_decr_poll(exec->scheduler);
+
+	exec_release_slot(exec, xcmd);
+	exec_notify_host(exec);
+
+	// Deactivate command and trigger chain of waiting commands
+	cmd_mark_deactive(xcmd);
+	cmd_trigger_chain(xcmd);
+
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/**
+ * mark_mask_complete() - Move all commands in mask to complete state
+ *
+ * @mask: Bitmask with queried statuses of commands
+ * @mask_idx: Index of the command mask. Used to offset the actual cmd slot index
+ *
+ * Used in ERT mode only.  Currently ERT submitted commands remain in exec
+ * submitted queue as ERT doesn't support data flow
+ */
+static void
+exec_mark_mask_complete(struct exec_core *exec, u32 mask, unsigned int mask_idx)
+{
+	int bit_idx = 0, cmd_idx = 0;
+
+	SCHED_DEBUGF("-> %s(0x%x,%d)\n", __func__, mask, mask_idx);
+	if (!mask)
+		return;
+
+	for (bit_idx = 0, cmd_idx = mask_idx<<5; bit_idx < 32; mask >>= 1, ++bit_idx, ++cmd_idx) {
+		// mask could be -1 when firewall trips, double check
+		// exec->submitted_cmds[cmd_idx] to make sure it's not NULL
+		if ((mask & 0x1) && exec->submitted_cmds[cmd_idx])
+			exec_mark_cmd_complete(exec, exec->submitted_cmds[cmd_idx]);
+	}
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/*
+ * penguin_start_cmd() - Start a command in penguin mode
+ */
+static bool
+exec_penguin_start_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	unsigned int cuidx;
+	u32 opcode = cmd_opcode(xcmd);
+
+	SCHED_DEBUGF("-> %s (%d,%lu) opcode(%d)\n", __func__, exec->uid, xcmd->uid, opcode);
+
+	if (opcode == ERT_WRITE && exec_execute_write_cmd(exec, xcmd)) {
+		cmd_set_state(xcmd, ERT_CMD_STATE_ERROR);
+		return false;
+	}
+
+	if (opcode != ERT_START_CU) {
+		SCHED_DEBUGF("<- %s -> true\n", __func__);
+		return true;
+	}
+
+	// Find a ready CU
+	for (cuidx = 0; cuidx < exec->num_cus; ++cuidx) {
+		struct xocl_cu *xcu = exec->cus[cuidx];
+
+		if (cmd_has_cu(xcmd, cuidx) && cu_ready(xcu) && cu_start(xcu, xcmd)) {
+			exec->submitted_cmds[xcmd->slot_idx] = NULL;
+			++exec->cu_usage[cuidx];
+			exec_release_slot(exec, xcmd);
+			xcmd->cu_idx = cuidx;
+			SCHED_DEBUGF("<- %s -> true\n", __func__);
+			return true;
+		}
+	}
+	SCHED_DEBUGF("<- %s -> false\n", __func__);
+	return false;
+}
+
+/**
+ * penguin_query() - Check command status of argument command
+ *
+ * @xcmd: Command to check
+ *
+ * Function is called in penguin mode (no embedded scheduler).
+ */
+static void
+exec_penguin_query_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	u32 cmdopcode = cmd_opcode(xcmd);
+	u32 cmdtype = cmd_type(xcmd);
+
+	SCHED_DEBUGF("-> %s(%lu) opcode(%d) type(%d) slot_idx=%d\n",
+		     __func__, xcmd->uid, cmdopcode, cmdtype, xcmd->slot_idx);
+
+	if (cmdtype == ERT_KDS_LOCAL || cmdtype == ERT_CTRL)
+		exec_mark_cmd_complete(exec, xcmd);
+	else if (cmdopcode == ERT_START_CU) {
+		struct xocl_cu *xcu = exec->cus[xcmd->cu_idx];
+
+		if (cu_first_done(xcu) == xcmd) {
+			cu_pop_done(xcu);
+			exec_mark_cmd_complete(exec, xcmd);
+		}
+	}
+
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+
+/*
+ * ert_start_cmd() - Start a command on ERT
+ */
+static bool
+exec_ert_start_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	// if (cmd_type(xcmd) == ERT_DATAFLOW)
+	//   exec_penguin_start_cmd(exec,xcmd);
+	return ert_start_cmd(exec->ert, xcmd);
+}
+
+/*
+ * ert_query_cmd() - Check command completion in ERT
+ *
+ * @xcmd: Command to check
+ *
+ * This function is for ERT mode.  In polling mode, check the command status
+ * register containing the slot assigned to the command.  In interrupt mode
+ * check the interrupting status register.  The function checks all commands
+ * in the same command status register as argument command so more than one
+ * command may be marked complete by this function.
+ */
+static void
+exec_ert_query_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	unsigned int cmd_mask_idx = slot_mask_idx(xcmd->slot_idx);
+
+	SCHED_DEBUGF("-> %s(%lu) slot_idx(%d), cmd_mask_idx(%d)\n", __func__, xcmd->uid, xcmd->slot_idx, cmd_mask_idx);
+
+	if (cmd_type(xcmd) == ERT_KDS_LOCAL) {
+		exec_mark_cmd_complete(exec, xcmd);
+		SCHED_DEBUGF("<- %s local command\n", __func__);
+		return;
+	}
+
+	if (exec->polling_mode
+	    || (cmd_mask_idx == 0 && atomic_xchg(&exec->sr0, 0))
+	    || (cmd_mask_idx == 1 && atomic_xchg(&exec->sr1, 0))
+	    || (cmd_mask_idx == 2 && atomic_xchg(&exec->sr2, 0))
+	    || (cmd_mask_idx == 3 && atomic_xchg(&exec->sr3, 0))) {
+		u32 csr_addr = ERT_STATUS_REGISTER_ADDR + (cmd_mask_idx<<2);
+		u32 mask = ioread32(xcmd->exec->base + csr_addr);
+
+		SCHED_DEBUGF("++ %s csr_addr=0x%x mask=0x%x\n", __func__, csr_addr, mask);
+		if (mask)
+			exec_mark_mask_complete(xcmd->exec, mask, cmd_mask_idx);
+	}
+
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
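+
+/*
+ * Illustrative example (not part of the driver): assuming slot_mask_idx()
+ * is slot_idx >> 5, consistent with cmd_idx = mask_idx << 5 in
+ * exec_mark_mask_complete(), a command in slot 37 maps to:
+ *
+ *	cmd_mask_idx = 37 >> 5 = 1;                          // second 32-bit mask
+ *	csr_addr     = ERT_STATUS_REGISTER_ADDR + (1 << 2);  // second status word
+ *	bit          = 37 & 0x1f = 5;                        // bit 5 in that mask
+ */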
+
+/*
+ * start_cmd() - Start execution of a command
+ *
+ * Return: true if successfully started, false otherwise
+ *
+ * Function dispatches based on penguin vs ert mode
+ */
+static bool
+exec_start_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	// assert cmd had been submitted
+	SCHED_DEBUGF("%s(%d,%lu) opcode(%d)\n", __func__, exec->uid, xcmd->uid, cmd_opcode(xcmd));
+
+	if (exec->ops->start(exec, xcmd)) {
+		cmd_set_int_state(xcmd, ERT_CMD_STATE_RUNNING);
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * query_cmd() - Check status of command
+ *
+ * Function dispatches based on penguin vs ert mode.  In ERT mode
+ * multiple commands can be marked complete by this function.
+ */
+static void
+exec_query_cmd(struct exec_core *exec, struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("%s(%d,%lu)\n", __func__, exec->uid, xcmd->uid);
+	exec->ops->query(exec, xcmd);
+}
+
+
+
+/**
+ * ert_ops: operations for ERT scheduling
+ */
+static struct exec_ops ert_ops = {
+	.start = exec_ert_start_cmd,
+	.query = exec_ert_query_cmd,
+};
+
+/**
+ * penguin_ops: operations for kernel mode scheduling
+ */
+static struct exec_ops penguin_ops = {
+	.start = exec_penguin_start_cmd,
+	.query = exec_penguin_query_cmd,
+};
+
+/*
+ */
+static inline struct exec_core *
+pdev_get_exec(struct platform_device *pdev)
+{
+	return platform_get_drvdata(pdev);
+}
+
+/*
+ */
+static inline struct exec_core *
+dev_get_exec(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+
+	return pdev ? pdev_get_exec(pdev) : NULL;
+}
+
+/*
+ */
+static inline struct xocl_dev *
+dev_get_xdev(struct device *dev)
+{
+	struct exec_core *exec = dev_get_exec(dev);
+
+	return exec ? exec_get_xdev(exec) : NULL;
+}
+
+/**
+ * List of new pending xocl_cmd objects
+ *
+ * @pending_cmds: populated from user space with new commands for buffer objects
+ * @num_pending: number of pending commands
+ *
+ * Scheduler copies pending commands to its private queue when necessary
+ */
+static LIST_HEAD(pending_cmds);
+static DEFINE_MUTEX(pending_cmds_mutex);
+static atomic_t num_pending = ATOMIC_INIT(0);
+
+static void
+pending_cmds_reset(void)
+{
+	/* clear stale command objects if any */
+	while (!list_empty(&pending_cmds)) {
+		struct xocl_cmd *xcmd = list_first_entry(&pending_cmds, struct xocl_cmd, cq_list);
+
+		DRM_INFO("deleting stale pending cmd\n");
+		cmd_free(xcmd);
+	}
+	atomic_set(&num_pending, 0);
+}
+
+/**
+ * struct xocl_scheduler: scheduler for xocl_cmd objects
+ *
+ * @scheduler_thread: thread associated with this scheduler
+ * @use_count: use count for this scheduler
+ * @wait_queue: conditional wait queue for scheduler thread
+ * @error: set to 1 to indicate scheduler error
+ * @stop: set to 1 to indicate scheduler should stop
+ * @reset: set to 1 to reset the scheduler
+ * @command_queue: list of command objects managed by scheduler
+ * @intc: boolean flag set when there is a pending interrupt for command completion
+ * @poll: number of running commands in polling mode
+ */
+struct xocl_scheduler {
+	struct task_struct	  *scheduler_thread;
+	unsigned int		   use_count;
+
+	wait_queue_head_t	   wait_queue;
+	unsigned int		   error;
+	unsigned int		   stop;
+	unsigned int		   reset;
+
+	struct list_head	   command_queue;
+
+	unsigned int		   intc; /* pending intr shared with isr, word aligned atomic */
+	unsigned int		   poll; /* number of cmds to poll */
+};
+
+static struct xocl_scheduler scheduler0;
+
+static void
+scheduler_reset(struct xocl_scheduler *xs)
+{
+	xs->error = 0;
+	xs->stop = 0;
+	xs->poll = 0;
+	xs->reset = false;
+	xs->intc = 0;
+}
+
+static void
+scheduler_cq_reset(struct xocl_scheduler *xs)
+{
+	while (!list_empty(&xs->command_queue)) {
+		struct xocl_cmd *xcmd = list_first_entry(&xs->command_queue, struct xocl_cmd, cq_list);
+
+		DRM_INFO("deleting stale scheduler cmd\n");
+		cmd_free(xcmd);
+	}
+}
+
+static void
+scheduler_wake_up(struct xocl_scheduler *xs)
+{
+	wake_up_interruptible(&xs->wait_queue);
+}
+
+static void
+scheduler_intr(struct xocl_scheduler *xs)
+{
+	xs->intc = 1;
+	scheduler_wake_up(xs);
+}
+
+static inline void
+scheduler_decr_poll(struct xocl_scheduler *xs)
+{
+	--xs->poll;
+}
+
+
+/**
+ * scheduler_queue_cmds() - Queue any pending commands
+ *
+ * The scheduler copies pending commands to its internal command queue where
+ * they are now in the queued state.
+ */
+static void
+scheduler_queue_cmds(struct xocl_scheduler *xs)
+{
+	struct xocl_cmd *xcmd;
+	struct list_head *pos, *next;
+
+	SCHED_DEBUGF("-> %s\n", __func__);
+	mutex_lock(&pending_cmds_mutex);
+	list_for_each_safe(pos, next, &pending_cmds) {
+		xcmd = list_entry(pos, struct xocl_cmd, cq_list);
+		if (xcmd->xs != xs)
+			continue;
+		SCHED_DEBUGF("+ queueing cmd(%lu)\n", xcmd->uid);
+		list_del(&xcmd->cq_list);
+		list_add_tail(&xcmd->cq_list, &xs->command_queue);
+
+		/* chain active dependencies if any to this command object */
+		if (cmd_wait_count(xcmd) && cmd_chain_dependencies(xcmd))
+			cmd_set_state(xcmd, ERT_CMD_STATE_ERROR);
+		else
+			cmd_set_int_state(xcmd, ERT_CMD_STATE_QUEUED);
+
+		/* this command is now active and can chain other commands */
+		cmd_mark_active(xcmd);
+		atomic_dec(&num_pending);
+	}
+	mutex_unlock(&pending_cmds_mutex);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/**
+ * scheduler_queued_to_submitted() - Move a command from queued to submitted state if possible
+ *
+ * @xcmd: Command to start
+ *
+ * Upon success, the command is not necessarily running. In ert mode the
+ * command will have been submitted to the embedded scheduler, whereas in
+ * penguin mode the command has been started on a CU.
+ *
+ * Return: %true if command was submitted to device, %false otherwise
+ */
+static bool
+scheduler_queued_to_submitted(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	struct exec_core *exec = cmd_exec(xcmd);
+	bool retval = false;
+
+	if (cmd_wait_count(xcmd))
+		return false;
+
+	SCHED_DEBUGF("-> %s(%lu) opcode(%d)\n", __func__, xcmd->uid, cmd_opcode(xcmd));
+
+	// configure prior to using the core
+	if (cmd_opcode(xcmd) == ERT_CONFIGURE && exec_cfg_cmd(exec, xcmd)) {
+		cmd_set_state(xcmd, ERT_CMD_STATE_ERROR);
+		return false;
+	}
+
+	// submit the command
+	if (exec_submit_cmd(exec, xcmd)) {
+		if (exec->polling_mode)
+			++xs->poll;
+		retval = true;
+	}
+
+	SCHED_DEBUGF("<- queued_to_submitted returns %d\n", retval);
+
+	return retval;
+}
+
+static bool
+scheduler_submitted_to_running(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	return exec_start_cmd(cmd_exec(xcmd), xcmd);
+}
+
+/**
+ * running_to_complete() - Check status of running commands
+ *
+ * @xcmd: Command is in running state
+ *
+ * When ERT is enabled this function may mark more than just argument
+ * command as complete based on content of command completion register.
+ * Without ERT, only argument command is checked for completion.
+ */
+static void
+scheduler_running_to_complete(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	exec_query_cmd(cmd_exec(xcmd), xcmd);
+}
+
+/**
+ * scheduler_complete_to_free() - Recycle a completed command object
+ *
+ * @xcmd: Command is in complete state
+ */
+static void
+scheduler_complete_to_free(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("-> %s(%lu)\n", __func__, xcmd->uid);
+	cmd_free(xcmd);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+static void
+scheduler_error_to_free(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("-> %s(%lu)\n", __func__, xcmd->uid);
+	exec_notify_host(cmd_exec(xcmd));
+	scheduler_complete_to_free(xs, xcmd);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+static void
+scheduler_abort_to_free(struct xocl_scheduler *xs, struct xocl_cmd *xcmd)
+{
+	SCHED_DEBUGF("-> %s(%lu)\n", __func__, xcmd->uid);
+	scheduler_error_to_free(xs, xcmd);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
+
+/**
+ * scheduler_iterate_cmds() - Iterate over all commands in the scheduler command queue
+ */
+static void
+scheduler_iterate_cmds(struct xocl_scheduler *xs)
+{
+	struct list_head *pos, *next;
+
+	SCHED_DEBUGF("-> %s\n", __func__);
+	list_for_each_safe(pos, next, &xs->command_queue) {
+		struct xocl_cmd *xcmd = list_entry(pos, struct xocl_cmd, cq_list);
+
+		cmd_update_state(xcmd);
+		SCHED_DEBUGF("+ processing cmd(%lu)\n", xcmd->uid);
+
+		/* check each state in order; a queued command may be waiting for a cmd slot */
+		if (xcmd->state == ERT_CMD_STATE_QUEUED)
+			scheduler_queued_to_submitted(xs, xcmd);
+		if (xcmd->state == ERT_CMD_STATE_SUBMITTED)
+			scheduler_submitted_to_running(xs, xcmd);
+		if (xcmd->state == ERT_CMD_STATE_RUNNING)
+			scheduler_running_to_complete(xs, xcmd);
+		if (xcmd->state == ERT_CMD_STATE_COMPLETED)
+			scheduler_complete_to_free(xs, xcmd);
+		if (xcmd->state == ERT_CMD_STATE_ERROR)
+			scheduler_error_to_free(xs, xcmd);
+		if (xcmd->state == ERT_CMD_STATE_ABORT)
+			scheduler_abort_to_free(xs, xcmd);
+	}
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
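+
+/*
+ * Illustrative summary (derived from the sequential checks above): a command
+ * object can advance through several states within a single scheduler pass.
+ *
+ *	NEW -> QUEUED -> SUBMITTED -> RUNNING -> COMPLETED -> freed
+ *	          \           \           \
+ *	           \-----------\-----------\--> ERROR/ABORT -> freed
+ */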
+
+/**
+ * scheduler_wait_condition() - Check status of scheduler wait condition
+ *
+ * Scheduler must wait (sleep) if
+ *   1. there are no pending commands
+ *   2. no pending interrupt from embedded scheduler
+ *   3. no pending complete commands in polling mode
+ *
+ * Return: 1 if scheduler must wait, 0 otherwise
+ */
+static int
+scheduler_wait_condition(struct xocl_scheduler *xs)
+{
+	if (kthread_should_stop()) {
+		xs->stop = 1;
+		SCHED_DEBUG("scheduler wakes kthread_should_stop\n");
+		return 0;
+	}
+
+	if (atomic_read(&num_pending)) {
+		SCHED_DEBUG("scheduler wakes to copy new pending commands\n");
+		return 0;
+	}
+
+	if (xs->intc) {
+		SCHED_DEBUG("scheduler wakes on interrupt\n");
+		xs->intc = 0;
+		return 0;
+	}
+
+	if (xs->poll) {
+		SCHED_DEBUG("scheduler wakes to poll\n");
+		return 0;
+	}
+
+	SCHED_DEBUG("scheduler waits ...\n");
+	return 1;
+}
+
+/**
+ * scheduler_wait() - check if scheduler should wait
+ *
+ * See scheduler_wait_condition().
+ */
+static void
+scheduler_wait(struct xocl_scheduler *xs)
+{
+	wait_event_interruptible(xs->wait_queue, scheduler_wait_condition(xs) == 0);
+}
+
+/**
+ * scheduler_loop() - Run one loop of the scheduler
+ */
+static void
+scheduler_loop(struct xocl_scheduler *xs)
+{
+	static unsigned int loop_cnt;
+
+	SCHED_DEBUGF("%s\n", __func__);
+	scheduler_wait(xs);
+
+	if (xs->error)
+		DRM_INFO("scheduler encountered unexpected error\n");
+
+	if (xs->stop)
+		return;
+
+	if (xs->reset) {
+		SCHED_DEBUG("scheduler is resetting after timeout\n");
+		scheduler_reset(xs);
+	}
+
+	/* queue new pending commands */
+	scheduler_queue_cmds(xs);
+
+	/* iterate all commands */
+	scheduler_iterate_cmds(xs);
+
+	// loop 8 times before explicitly yielding
+	if (++loop_cnt == 8) {
+		loop_cnt = 0;
+		schedule();
+	}
+}
+
+/**
+ * scheduler() - Command scheduler thread routine
+ */
+static int
+scheduler(void *data)
+{
+	struct xocl_scheduler *xs = (struct xocl_scheduler *)data;
+
+	while (!xs->stop)
+		scheduler_loop(xs);
+	DRM_INFO("%s:%d %s thread exits with value %d\n", __FILE__, __LINE__, __func__, xs->error);
+	return xs->error;
+}
+
+
+
+/**
+ * add_xcmd() - Add initialized xcmd object to pending command list
+ *
+ * @xcmd: Command to add
+ *
+ * Scheduler copies pending commands to its internal command queue.
+ *
+ * Return: 0 on success
+ */
+static int
+add_xcmd(struct xocl_cmd *xcmd)
+{
+	struct exec_core *exec = xcmd->exec;
+	struct xocl_dev *xdev = xocl_get_xdev(exec->pdev);
+
+	// Prevent stop and reset
+	mutex_lock(&exec->exec_lock);
+
+	SCHED_DEBUGF("-> %s(%lu) pid(%d)\n", __func__, xcmd->uid, pid_nr(task_tgid(current)));
+	SCHED_DEBUGF("+ exec stopped(%d) configured(%d)\n", exec->stopped, exec->configured);
+
+	if (exec->stopped || (!exec->configured && cmd_opcode(xcmd) != ERT_CONFIGURE))
+		goto err;
+
+	cmd_set_state(xcmd, ERT_CMD_STATE_NEW);
+	mutex_lock(&pending_cmds_mutex);
+	list_add_tail(&xcmd->cq_list, &pending_cmds);
+	atomic_inc(&num_pending);
+	mutex_unlock(&pending_cmds_mutex);
+
+	/* wake scheduler */
+	atomic_inc(&xdev->outstanding_execs);
+	atomic64_inc(&xdev->total_execs);
+	scheduler_wake_up(xcmd->xs);
+
+	SCHED_DEBUGF("<- %s ret(0) opcode(%d) type(%d) num_pending(%d)\n",
+		     __func__, cmd_opcode(xcmd), cmd_type(xcmd), atomic_read(&num_pending));
+	mutex_unlock(&exec->exec_lock);
+	return 0;
+
+err:
+	SCHED_DEBUGF("<- %s ret(1) opcode(%d) type(%d) num_pending(%d)\n",
+		     __func__, cmd_opcode(xcmd), cmd_type(xcmd), atomic_read(&num_pending));
+	mutex_unlock(&exec->exec_lock);
+	return 1;
+}
+
+
+/**
+ * add_bo_cmd() - Add a new buffer object command to pending list
+ *
+ * @exec: Targeted device
+ * @client: Client context
+ * @bo: Buffer objects from user space from which new command is created
+ * @numdeps: Number of dependencies for this command
+ * @deps: List of @numdeps dependencies
+ *
+ * Scheduler copies pending commands to its internal command queue.
+ *
+ * Return: 0 on success, 1 on failure
+ */
+static int
+add_bo_cmd(struct exec_core *exec, struct client_ctx *client, struct drm_xocl_bo *bo,
+	   int numdeps, struct drm_xocl_bo **deps)
+{
+	struct xocl_cmd *xcmd = cmd_get(exec_scheduler(exec), exec, client);
+
+	if (!xcmd)
+		return 1;
+
+	SCHED_DEBUGF("-> %s(%lu)\n", __func__, xcmd->uid);
+
+	cmd_bo_init(xcmd, bo, numdeps, deps, !exec_is_ert(exec));
+
+	if (add_xcmd(xcmd))
+		goto err;
+
+	SCHED_DEBUGF("<- %s ret(0) opcode(%d) type(%d)\n", __func__, cmd_opcode(xcmd), cmd_type(xcmd));
+	return 0;
+err:
+	cmd_abort(xcmd);
+	SCHED_DEBUGF("<- %s ret(1) opcode(%d) type(%d)\n", __func__, cmd_opcode(xcmd), cmd_type(xcmd));
+	return 1;
+}
+
+static int
+add_ctrl_cmd(struct exec_core *exec, struct client_ctx *client, struct ert_packet *packet)
+{
+	struct xocl_cmd *xcmd = cmd_get(exec_scheduler(exec), exec, client);
+
+	if (!xcmd)
+		return 1;
+
+	SCHED_DEBUGF("-> %s(%lu)\n", __func__, xcmd->uid);
+
+	cmd_packet_init(xcmd, packet);
+
+	if (add_xcmd(xcmd))
+		goto err;
+
+	SCHED_DEBUGF("<- %s ret(0) opcode(%d) type(%d)\n", __func__, cmd_opcode(xcmd), cmd_type(xcmd));
+	return 0;
+err:
+	cmd_abort(xcmd);
+	SCHED_DEBUGF("<- %s ret(1) opcode(%d) type(%d)\n", __func__, cmd_opcode(xcmd), cmd_type(xcmd));
+	return 1;
+}
+
+
+/**
+ * init_scheduler_thread() - Initialize scheduler thread if necessary
+ *
+ * Return: 0 on success, -errno otherwise
+ */
+static int
+init_scheduler_thread(struct xocl_scheduler *xs)
+{
+	SCHED_DEBUGF("%s use_count=%d\n", __func__, xs->use_count);
+	if (xs->use_count++)
+		return 0;
+
+	init_waitqueue_head(&xs->wait_queue);
+	INIT_LIST_HEAD(&xs->command_queue);
+	scheduler_reset(xs);
+
+	xs->scheduler_thread = kthread_run(scheduler, (void *)xs, "xocl-scheduler-thread0");
+	if (IS_ERR(xs->scheduler_thread)) {
+		int ret = PTR_ERR(xs->scheduler_thread);
+
+		DRM_ERROR("failed to create scheduler thread: %d\n", ret);
+		return ret;
+	}
+	return 0;
+}
+
+/**
+ * fini_scheduler_thread() - Finalize scheduler thread if unused
+ *
+ * Return: 0 on success, -errno otherwise
+ */
+static int
+fini_scheduler_thread(struct xocl_scheduler *xs)
+{
+	int retval = 0;
+
+	SCHED_DEBUGF("%s use_count=%d\n", __func__, xs->use_count);
+	if (--xs->use_count)
+		return 0;
+
+	retval = kthread_stop(xs->scheduler_thread);
+
+	/* clear stale command objects if any */
+	pending_cmds_reset();
+	scheduler_cq_reset(xs);
+
+	/* reclaim memory for allocated command objects */
+	cmd_list_delete();
+
+	return retval;
+}
+
+/**
+ * Entry point for exec buffer.
+ *
+ * Function adds exec buffer to the pending list of commands
+ */
+int
+add_exec_buffer(struct platform_device *pdev, struct client_ctx *client, void *buf,
+		int numdeps, struct drm_xocl_bo **deps)
+{
+	struct exec_core *exec = platform_get_drvdata(pdev);
+	// Add the command to pending list
+	return add_bo_cmd(exec, client, buf, numdeps, deps);
+}
+
+static int
+xocl_client_lock_bitstream_nolock(struct xocl_dev *xdev, struct client_ctx *client)
+{
+	int pid = pid_nr(task_tgid(current));
+	uuid_t *xclbin_id;
+
+	if (client->xclbin_locked)
+		return 0;
+
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	if (!xclbin_id || !uuid_equal(xclbin_id, &client->xclbin_id)) {
+		userpf_err(xdev,
+			   "device xclbin does not match context xclbin, cannot obtain lock for process %d",
+			   pid);
+		return 1;
+	}
+
+	if (xocl_icap_lock_bitstream(xdev, &client->xclbin_id, pid) < 0) {
+		userpf_err(xdev, "could not lock bitstream for process %d", pid);
+		return 1;
+	}
+
+	client->xclbin_locked = true;
+	userpf_info(xdev, "process %d successfully locked xclbin", pid);
+	return 0;
+}
+
+static int
+xocl_client_lock_bitstream(struct xocl_dev *xdev, struct client_ctx *client)
+{
+	int ret = 0;
+
+	mutex_lock(&client->lock);	   // protect current client
+	mutex_lock(&xdev->ctx_list_lock);  // protect xdev->xclbin_id
+	ret = xocl_client_lock_bitstream_nolock(xdev, client);
+	mutex_unlock(&xdev->ctx_list_lock);
+	mutex_unlock(&client->lock);
+	return ret;
+}
+
+
+static int
+create_client(struct platform_device *pdev, void **priv)
+{
+	struct client_ctx	*client;
+	struct xocl_dev		*xdev = xocl_get_xdev(pdev);
+	int			ret = 0;
+
+	client = devm_kzalloc(&pdev->dev, sizeof(*client), GFP_KERNEL);
+	if (!client)
+		return -ENOMEM;
+
+	mutex_lock(&xdev->ctx_list_lock);
+
+	if (!xdev->offline) {
+		client->pid = task_tgid(current);
+		mutex_init(&client->lock);
+		client->xclbin_locked = false;
+		client->abort = false;
+		atomic_set(&client->trigger, 0);
+		atomic_set(&client->outstanding_execs, 0);
+		client->num_cus = 0;
+		client->xdev = xocl_get_xdev(pdev);
+		list_add_tail(&client->link, &xdev->ctx_list);
+		*priv =	 client;
+	} else {
+		/* Do not allow new client to come in while being offline. */
+		devm_kfree(&pdev->dev, client);
+		ret = -EBUSY;
+	}
+
+	mutex_unlock(&xdev->ctx_list_lock);
+
+	DRM_INFO("creating scheduler client for pid(%d), ret: %d\n",
+		 pid_nr(task_tgid(current)), ret);
+
+	return ret;
+}
+
+static void destroy_client(struct platform_device *pdev, void **priv)
+{
+	struct client_ctx *client = (struct client_ctx *)(*priv);
+	struct exec_core *exec = platform_get_drvdata(pdev);
+	struct xocl_scheduler *xs = exec_scheduler(exec);
+	struct xocl_dev	*xdev = xocl_get_xdev(pdev);
+	unsigned int	outstanding = atomic_read(&client->outstanding_execs);
+	unsigned int	timeout_loops = 20;
+	unsigned int	loops = 0;
+	int pid = pid_nr(task_tgid(current));
+	unsigned int bit;
+	struct ip_layout *layout = XOCL_IP_LAYOUT(xdev);
+
+	bit = layout
+	  ? find_first_bit(client->cu_bitmap, layout->m_count)
+	  : MAX_CUS;
+
+	/*
+	 * This happens when an application exits without formally releasing the
+	 * contexts on CUs.  Give up its contexts on CUs and its lock on the xclbin.
+	 * Note that implicit CUs (such as CDMA) do not add to ip_reference.
+	 */
+	while (layout && (bit < layout->m_count)) {
+		if (exec->ip_reference[bit]) {
+			userpf_info(xdev, "CTX reclaim (%pUb, %d, %u)",
+				&client->xclbin_id, pid, bit);
+			exec->ip_reference[bit]--;
+		}
+		bit = find_next_bit(client->cu_bitmap, layout->m_count, bit + 1);
+	}
+	bitmap_zero(client->cu_bitmap, MAX_CUS);
+
+	// force scheduler to abort execs for this client
+	client->abort = true;
+
+	// wait for outstanding execs to finish
+	while (outstanding) {
+		unsigned int new;
+
+		userpf_info(xdev, "waiting for %d outstanding execs to finish", outstanding);
+		msleep(500);
+		new = atomic_read(&client->outstanding_execs);
+		loops = (new == outstanding ? (loops + 1) : 0);
+		if (loops == timeout_loops) {
+			userpf_err(xdev,
+				   "Giving up with %d outstanding execs, please reset device with 'xbutil reset'\n",
+				   outstanding);
+			xdev->needs_reset = true;
+			// reset the scheduler loop
+			xs->reset = true;
+			break;
+		}
+		outstanding = new;
+	}
+
+	DRM_INFO("client exits pid(%d)\n", pid);
+
+	mutex_lock(&xdev->ctx_list_lock);
+	list_del(&client->link);
+	mutex_unlock(&xdev->ctx_list_lock);
+
+	if (client->xclbin_locked)
+		xocl_icap_unlock_bitstream(xdev, &client->xclbin_id, pid);
+	mutex_destroy(&client->lock);
+	devm_kfree(&pdev->dev, client);
+	*priv = NULL;
+}
+
+static uint poll_client(struct platform_device *pdev, struct file *filp,
+	poll_table *wait, void *priv)
+{
+	struct client_ctx	*client = (struct client_ctx *)priv;
+	struct exec_core	*exec;
+	int			counter;
+	uint			ret = 0;
+
+	exec = platform_get_drvdata(pdev);
+
+	poll_wait(filp, &exec->poll_wait_queue, wait);
+
+	/*
+	 * The mutex protects against two threads from the same application
+	 * calling poll concurrently on the same file handle.
+	 */
+	mutex_lock(&client->lock);
+	counter = atomic_read(&client->trigger);
+	if (counter > 0) {
+		/*
+		 * Use atomic here since the trigger may be incremented by
+		 * interrupt handler running concurrently.
+		 */
+		atomic_dec(&client->trigger);
+		ret = POLLIN;
+	}
+	mutex_unlock(&client->lock);
+
+	return ret;
+}
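+
+/*
+ * Illustrative user-space sketch (not part of the driver): a client that has
+ * submitted exec buffers can block on its DRM file handle; each call into
+ * poll_client() consumes one unit of the per-client trigger counter that
+ * exec_notify_host() increments.  'drm_fd' is an assumed, already-open
+ * handle to the user PF device node.
+ *
+ *	struct pollfd pfd = { .fd = drm_fd, .events = POLLIN };
+ *
+ *	if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN))
+ *		; // at least one command has completed
+ */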
+
+static int client_ioctl_ctx(struct platform_device *pdev,
+			    struct client_ctx *client, void *data)
+{
+	bool acquire_lock = false;
+	struct drm_xocl_ctx *args = data;
+	int ret = 0;
+	int pid = pid_nr(task_tgid(current));
+	struct xocl_dev	*xdev = xocl_get_xdev(pdev);
+	struct exec_core *exec = platform_get_drvdata(pdev);
+	uuid_t *xclbin_id;
+
+	mutex_lock(&client->lock);
+	mutex_lock(&xdev->ctx_list_lock);
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	if (!xclbin_id || !uuid_equal(xclbin_id, &args->xclbin_id)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	if (args->cu_index >= XOCL_IP_LAYOUT(xdev)->m_count) {
+		userpf_err(xdev, "cuidx(%d) >= numcus(%d)\n",
+			   args->cu_index, XOCL_IP_LAYOUT(xdev)->m_count);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (args->op == XOCL_CTX_OP_FREE_CTX) {
+		ret = test_and_clear_bit(args->cu_index, client->cu_bitmap) ? 0 : -EINVAL;
+		if (ret) // No context was previously allocated for this CU
+			goto out;
+
+		// CU unlocked explicitly
+		--exec->ip_reference[args->cu_index];
+		if (!--client->num_cus) {
+			// We just gave up the last context, unlock the xclbin
+			ret = xocl_icap_unlock_bitstream(xdev, xclbin_id, pid);
+			client->xclbin_locked = false;
+		}
+		userpf_info(xdev, "CTX del(%pUb, %d, %u)",
+			    xclbin_id, pid, args->cu_index);
+		goto out;
+	}
+
+	if (args->op != XOCL_CTX_OP_ALLOC_CTX) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (args->flags != XOCL_CTX_SHARED) {
+		userpf_err(xdev, "Only shared contexts are supported in this release");
+		ret = -EPERM;
+		goto out;
+	}
+
+	if (!client->num_cus && !client->xclbin_locked)
+		// Process has no other context on any CU yet, hence we need to
+		// lock the xclbin. A process uses just one lock for all its ctxs.
+		acquire_lock = true;
+
+	if (test_and_set_bit(args->cu_index, client->cu_bitmap)) {
+		userpf_info(xdev, "CTX already allocated by this process");
+		// Context was previously allocated for the same CU,
+		// cannot allocate again
+		ret = 0;
+		goto out;
+	}
+
+	if (acquire_lock) {
+		// This is the first context on any CU for this process,
+		// lock the xclbin
+		ret = xocl_client_lock_bitstream_nolock(xdev, client);
+		if (ret) {
+			// Locking of xclbin failed, give up our context
+			clear_bit(args->cu_index, client->cu_bitmap);
+			goto out;
+		} else {
+			uuid_copy(&client->xclbin_id, xclbin_id);
+		}
+	}
+
+	// Everything is good so far, hence increment the CU reference count
+	++client->num_cus; // explicitly acquired
+	++exec->ip_reference[args->cu_index];
+	xocl_info(&pdev->dev, "CTX add(%pUb, %d, %u, %d)",
+		  xclbin_id, pid, args->cu_index, acquire_lock);
+out:
+	mutex_unlock(&xdev->ctx_list_lock);
+	mutex_unlock(&client->lock);
+	return ret;
+}
+
+static int
+get_bo_paddr(struct xocl_dev *xdev, struct drm_file *filp,
+	     uint32_t bo_hdl, size_t off, size_t size, uint64_t *paddrp)
+{
+	struct drm_device *ddev = filp->minor->dev;
+	struct drm_gem_object *obj;
+	struct drm_xocl_bo *xobj;
+
+	obj = xocl_gem_object_lookup(ddev, filp, bo_hdl);
+	if (!obj) {
+		userpf_err(xdev, "Failed to look up GEM BO 0x%x\n", bo_hdl);
+		return -ENOENT;
+	}
+	xobj = to_xocl_bo(obj);
+
+	if (obj->size <= off || obj->size < off + size || !xobj->mm_node) {
+		userpf_err(xdev, "Failed to get paddr for BO 0x%x\n", bo_hdl);
+		drm_gem_object_put_unlocked(obj);
+		return -EINVAL;
+	}
+
+	*paddrp = xobj->mm_node->start + off;
+	drm_gem_object_put_unlocked(obj);
+	return 0;
+}
+
+static int
+convert_execbuf(struct xocl_dev *xdev, struct drm_file *filp,
+		struct exec_core *exec, struct drm_xocl_bo *xobj)
+{
+	int i;
+	int ret;
+	size_t src_off;
+	size_t dst_off;
+	size_t sz;
+	uint64_t src_addr;
+	uint64_t dst_addr;
+	struct ert_start_copybo_cmd *scmd = (struct ert_start_copybo_cmd *)xobj->vmapping;
+
+	/* Only convert COPYBO cmd for now. */
+	if (scmd->opcode != ERT_START_COPYBO)
+		return 0;
+
+	sz = scmd->size * COPYBO_UNIT;
+
+	src_off = scmd->src_addr_hi;
+	src_off <<= 32;
+	src_off |= scmd->src_addr_lo;
+	ret = get_bo_paddr(xdev, filp, scmd->src_bo_hdl, src_off, sz, &src_addr);
+	if (ret != 0)
+		return ret;
+
+	dst_off = scmd->dst_addr_hi;
+	dst_off <<= 32;
+	dst_off |= scmd->dst_addr_lo;
+	ret = get_bo_paddr(xdev, filp, scmd->dst_bo_hdl, dst_off, sz, &dst_addr);
+	if (ret != 0)
+		return ret;
+
+	ert_fill_copybo_cmd(scmd, 0, 0, src_addr, dst_addr, sz);
+
+	for (i = exec->num_cus - exec->num_cdma; i < exec->num_cus; i++)
+		scmd->cu_mask[i / 32] |= 1 << (i % 32);
+
+	scmd->opcode = ERT_START_CU;
+
+	return 0;
+}
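+
+/*
+ * Illustrative example (not part of the driver): with num_cus = 5 and
+ * num_cdma = 1, the loop above targets CU index 4 only, so the converted
+ * ERT_START_CU command gets:
+ *
+ *	scmd->cu_mask[4 / 32] |= 1 << (4 % 32);   // i.e. cu_mask[0] |= 0x10
+ */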
+
+static int
+client_ioctl_execbuf(struct platform_device *pdev,
+		     struct client_ctx *client, void *data, struct drm_file *filp)
+{
+	struct drm_xocl_execbuf *args = data;
+	struct drm_xocl_bo *xobj;
+	struct drm_gem_object *obj;
+	struct drm_xocl_bo *deps[8] = {0};
+	int numdeps = -1;
+	int ret = 0;
+	struct xocl_dev	*xdev = xocl_get_xdev(pdev);
+	struct drm_device *ddev = filp->minor->dev;
+
+	if (xdev->needs_reset) {
+		userpf_err(xdev, "device needs reset, use 'xbutil reset -h'");
+		return -EBUSY;
+	}
+
+	/* Look up the gem object corresponding to the BO handle.
+	 * This adds a reference to the gem object.  The reference is
+	 * passed to kds or released here if errors occur.
+	 */
+	obj = xocl_gem_object_lookup(ddev, filp, args->exec_bo_handle);
+	if (!obj) {
+		userpf_err(xdev, "Failed to look up GEM BO %d\n",
+		args->exec_bo_handle);
+		return -ENOENT;
+	}
+
+	/* Convert gem object to xocl_bo extension */
+	xobj = to_xocl_bo(obj);
+	if (!xocl_bo_execbuf(xobj) || convert_execbuf(xdev, filp,
+		platform_get_drvdata(pdev), xobj) != 0) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ret = validate(pdev, client, xobj);
+	if (ret) {
+		userpf_err(xdev, "Exec buffer validation failed\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Copy dependencies from user.  It is an error if a BO handle specified
+	 * as a dependency does not exist.  Look up the gem object corresponding
+	 * to each BO handle and convert it to its xocl_bo extension.  Note that
+	 * the gem lookup acquires a reference to the drm object; this reference
+	 * is passed on to the scheduler via add_exec_buffer.
+	 */
+	for (numdeps = 0; numdeps < 8 && args->deps[numdeps]; ++numdeps) {
+		struct drm_gem_object *gobj =
+		  xocl_gem_object_lookup(ddev, filp, args->deps[numdeps]);
+		struct drm_xocl_bo *xbo = gobj ? to_xocl_bo(gobj) : NULL;
+
+		if (!gobj)
+			userpf_err(xdev, "Failed to look up GEM BO %d\n",
+				   args->deps[numdeps]);
+		if (!xbo) {
+			ret = -EINVAL;
+			goto out;
+		}
+		deps[numdeps] = xbo;
+	}
+
+	/* acquire lock on xclbin if necessary */
+	ret = xocl_client_lock_bitstream(xdev, client);
+	if (ret) {
+		userpf_err(xdev, "Failed to lock xclbin\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Add exec buffer to scheduler (kds).	The scheduler manages the
+	 * drm object references acquired by xobj and deps.  It is vital
+	 * that the references are released properly.
+	 */
+	ret = add_exec_buffer(pdev, client, xobj, numdeps, deps);
+	if (ret) {
+		userpf_err(xdev, "Failed to add exec buffer to scheduler\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Return here, noting that the gem objects passed to kds have
+	 * references that must be released by kds itself.  User manages
+	 * a regular reference to all BOs returned as file handles.  These
+	 * references are released when the BOs are freed.
+	 */
+	return ret;
+
+out:
+	for (--numdeps; numdeps >= 0; numdeps--)
+		drm_gem_object_put_unlocked(&deps[numdeps]->base);
+	drm_gem_object_put_unlocked(&xobj->base);
+	return ret;
+}
+
+int
+client_ioctl(struct platform_device *pdev, int op, void *data, void *drm_filp)
+{
+	struct drm_file *filp = drm_filp;
+	struct client_ctx *client = filp->driver_priv;
+	int ret;
+
+	switch (op) {
+	case DRM_XOCL_CTX:
+		ret = client_ioctl_ctx(pdev, client, data);
+		break;
+	case DRM_XOCL_EXECBUF:
+		ret = client_ioctl_execbuf(pdev, client, data, drm_filp);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+/**
+ * reset() - Reset device exec data structure
+ *
+ * @pdev: platform device to reset
+ *
+ * [Current 2018.3 situation:]
 * This function is currently called from mgmt icap on every AXI
+ * freeze/unfreeze.  It ensures that the device exec_core state is reset to
+ * the same state it was in when the scheduler was originally probed for the
+ * device.  The callback from icap ensures that the scheduler resets the exec
+ * core when multiple processes are already attached to the device but AXI is
+ * reset.
+ *
+ * Even though the very first client created for this device also resets the
+ * exec core, it is possible that further resets are necessary.  For example,
+ * in the multi-process case there can be 'n' processes that attach to the
+ * device.  On first client attach the exec core is reset correctly, but now
+ * assume that 'm' of these processes finish completely before any of the
+ * remaining (n-m) processes start using the scheduler.  In this case, the
+ * n-m clients have already been created, but icap resets AXI because the
+ * xclbin has no references (arguably this AXI reset is wrong).
+ *
+ * [Work-in-progress:]
+ * Proper contract:
+ *  Pre-condition: xocl_exec_stop has been called before xocl_exec_reset.
+ *  Pre-condition: new bitstream has been downloaded and AXI has been reset
+ */
+static int
+reset(struct platform_device *pdev)
+{
+	struct exec_core *exec = platform_get_drvdata(pdev);
+
+	exec_stop(exec);   // remove when upstream explicitly calls stop()
+	exec_reset(exec);
+	return 0;
+}
+
+/**
+ * stop() - Stop command scheduling on this device
+ *
+ * This API must be called prior to performing an AXI reset and downloading a
+ * new xclbin.  Calling this API flushes the commands running on the current
+ * device and prevents new commands from being scheduled on the device.  This
+ * effectively prevents 'xbutil top' from issuing CU_STAT commands while
+ * programming is performed.
+ *
+ * Pre-condition: xocl_client_release has been called, i.e. there are no
+ *		  current clients using the bitstream
+ */
+static int
+stop(struct platform_device *pdev)
+{
+	struct exec_core *exec = platform_get_drvdata(pdev);
+
+	exec_stop(exec);
+	return 0;
+}
+
+/**
+ * validate() - Check if requested cmd is valid in the current context
+ */
+static int
+validate(struct platform_device *pdev, struct client_ctx *client, const struct drm_xocl_bo *bo)
+{
+	struct ert_packet *ecmd = (struct ert_packet *)bo->vmapping;
+	struct ert_start_kernel_cmd *scmd = (struct ert_start_kernel_cmd *)bo->vmapping;
+	unsigned int i = 0;
+	u32 ctx_cus[4] = {0};
+	u32 cumasks = 0;
+	int err = 0;
+
+	SCHED_DEBUGF("-> %s(%d)\n", __func__, ecmd->opcode);
+
+	/* cus for start kernel commands only */
+	if (ecmd->opcode != ERT_START_CU)
+		return 0; /* ok */
+
+	/* client context cu bitmap may not change while validating */
+	mutex_lock(&client->lock);
+
+	/* no specific CUs selected, maybe ctx is not used by client */
+	if (bitmap_empty(client->cu_bitmap, MAX_CUS)) {
+		userpf_err(xocl_get_xdev(pdev), "%s found no CUs in ctx\n", __func__);
+		goto out; /* ok */
+	}
+
+	/* Check CUs in cmd BO against CUs in context */
+	cumasks = 1 + scmd->extra_cu_masks;
+	xocl_bitmap_to_arr32(ctx_cus, client->cu_bitmap, cumasks * 32);
+
+	for (i = 0; i < cumasks; ++i) {
+		uint32_t cmd_cus = ecmd->data[i];
+		/* cmd_cus must be subset of ctx_cus */
+		if (cmd_cus & ~ctx_cus[i]) {
+			SCHED_DEBUGF("<- %s(1), CU mismatch in mask(%d) cmd(0x%x) ctx(0x%x)\n",
+				     __func__, i, cmd_cus, ctx_cus[i]);
+			err = 1;
+			goto out; /* error */
+		}
+	}
+
+
+out:
+	mutex_unlock(&client->lock);
+	SCHED_DEBUGF("<- %s(%d)\n", __func__, err);
+	return err;
+
+}
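+
+/*
+ * Illustrative example (not part of the driver): if the client context holds
+ * CUs 0 and 2, ctx_cus[0] = 0x5 after xocl_bitmap_to_arr32().  A command with
+ * cu mask 0x4 (CU 2 only) passes the subset check above, while 0x8 (CU 3)
+ * fails because 0x8 & ~0x5 is non-zero.
+ */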
+
+struct xocl_mb_scheduler_funcs sche_ops = {
+	.create_client = create_client,
+	.destroy_client = destroy_client,
+	.poll_client = poll_client,
+	.client_ioctl = client_ioctl,
+	.stop = stop,
+	.reset = reset,
+};
+
+/* sysfs */
+static ssize_t
+kds_numcus_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct exec_core *exec = dev_get_exec(dev);
+	unsigned int cus = exec ? exec->num_cus - exec->num_cdma : 0;
+
+	return sprintf(buf, "%d\n", cus);
+}
+static DEVICE_ATTR_RO(kds_numcus);
+
+static ssize_t
+kds_numcdmas_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_xdev(dev);
+	uint32_t *cdma = xocl_cdma_addr(xdev);
+	unsigned int cdmas = cdma ? 1 : 0; //TBD
+
+	return sprintf(buf, "%d\n", cdmas);
+}
+static DEVICE_ATTR_RO(kds_numcdmas);
+
+static ssize_t
+kds_custat_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct exec_core *exec = dev_get_exec(dev);
+	struct xocl_dev *xdev = exec_get_xdev(exec);
+	struct client_ctx client;
+	struct ert_packet packet;
+	unsigned int count = 0;
+	ssize_t sz = 0;
+
+	// minimum required initialization of client
+	client.abort = false;
+	client.xdev = xdev;
+	atomic_set(&client.trigger, 0);
+	atomic_set(&client.outstanding_execs, 0);
+
+	packet.opcode = ERT_CU_STAT;
+	packet.type = ERT_CTRL;
+	packet.count = 1;  // data[1]
+
+	if (add_ctrl_cmd(exec, &client, &packet) == 0) {
+		int retry = 5;
+
+		SCHED_DEBUGF("-> custat waiting for command to finish\n");
+		// wait for command completion
+		while (--retry && atomic_read(&client.outstanding_execs))
+			msleep(100);
+		if (retry == 0 && atomic_read(&client.outstanding_execs))
+			userpf_info(xdev, "custat unexpected timeout\n");
+		SCHED_DEBUGF("<- custat retry(%d)\n", retry);
+	}
+
+	for (count = 0; count < exec->num_cus; ++count)
+		sz += sprintf(buf+sz, "CU[@0x%x] : %d\n",
+			      exec_cu_base_addr(exec, count),
+			      exec_cu_usage(exec, count));
+	if (sz)
+		buf[sz++] = 0;
+
+	return sz;
+}
+static DEVICE_ATTR_RO(kds_custat);
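+
+/*
+ * Example kds_custat output (illustrative values): one line per CU with its
+ * base address and usage count, as formatted above, e.g.
+ *
+ *	CU[@0x1800000] : 4
+ *	CU[@0x1810000] : 0
+ */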
+
+static struct attribute *kds_sysfs_attrs[] = {
+	&dev_attr_kds_numcus.attr,
+	&dev_attr_kds_numcdmas.attr,
+	&dev_attr_kds_custat.attr,
+	NULL
+};
+
+static const struct attribute_group kds_sysfs_attr_group = {
+	.attrs = kds_sysfs_attrs,
+};
+
+static void
+user_sysfs_destroy_kds(struct platform_device *pdev)
+{
+	sysfs_remove_group(&pdev->dev.kobj, &kds_sysfs_attr_group);
+}
+
+static int
+user_sysfs_create_kds(struct platform_device *pdev)
+{
+	int err = sysfs_create_group(&pdev->dev.kobj, &kds_sysfs_attr_group);
+
+	if (err)
+		userpf_err(xocl_get_xdev(pdev), "create kds attr failed: 0x%x", err);
+	return err;
+}
+
+/**
+ * Init scheduler
+ */
+static int mb_scheduler_probe(struct platform_device *pdev)
+{
+	struct exec_core *exec = exec_create(pdev, &scheduler0);
+
+	if (!exec)
+		return -ENOMEM;
+
+	if (user_sysfs_create_kds(pdev))
+		goto err;
+
+	init_scheduler_thread(&scheduler0);
+	xocl_subdev_register(pdev, XOCL_SUBDEV_MB_SCHEDULER, &sche_ops);
+	platform_set_drvdata(pdev, exec);
+
+	DRM_INFO("command scheduler started\n");
+
+	return 0;
+
+err:
+	devm_kfree(&pdev->dev, exec);
+	return 1;
+}
+
+/**
+ * Fini scheduler
+ */
+static int mb_scheduler_remove(struct platform_device *pdev)
+{
+	struct xocl_dev *xdev;
+	int i;
+	struct exec_core *exec = platform_get_drvdata(pdev);
+
+	SCHED_DEBUGF("-> %s\n", __func__);
+	fini_scheduler_thread(exec_scheduler(exec));
+
+	xdev = xocl_get_xdev(pdev);
+	for (i = 0; i < exec->intr_num; i++) {
+		xocl_user_interrupt_config(xdev, i + exec->intr_base, false);
+		xocl_user_interrupt_reg(xdev, i + exec->intr_base,
+			NULL, NULL);
+	}
+	mutex_destroy(&exec->exec_lock);
+
+	user_sysfs_destroy_kds(pdev);
+	exec_destroy(exec);
+	platform_set_drvdata(pdev, NULL);
+
+	SCHED_DEBUGF("<- %s\n", __func__);
+	DRM_INFO("command scheduler removed\n");
+	return 0;
+}
+
+static struct platform_device_id mb_sche_id_table[] = {
+	{ XOCL_MB_SCHEDULER, 0 },
+	{ },
+};
+
+static struct platform_driver	mb_scheduler_driver = {
+	.probe		= mb_scheduler_probe,
+	.remove		= mb_scheduler_remove,
+	.driver		= {
+		.name = "xocl_mb_sche",
+	},
+	.id_table	= mb_sche_id_table,
+};
+
+int __init xocl_init_mb_scheduler(void)
+{
+	return platform_driver_register(&mb_scheduler_driver);
+}
+
+void xocl_fini_mb_scheduler(void)
+{
+	SCHED_DEBUGF("-> %s\n", __func__);
+	platform_driver_unregister(&mb_scheduler_driver);
+	SCHED_DEBUGF("<- %s\n", __func__);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/microblaze.c b/drivers/gpu/drm/xocl/subdev/microblaze.c
new file mode 100644
index 000000000000..38cfbdbb39ef
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/microblaze.c
@@ -0,0 +1,722 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Lizhi.HOu@xilinx.com
+ *
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/vmalloc.h>
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+#define MAX_RETRY	50
+#define RETRY_INTERVAL	100	  //ms
+
+#define	MAX_IMAGE_LEN	0x20000
+
+#define	REG_VERSION		0
+#define	REG_ID			0x4
+#define	REG_STATUS		0x8
+#define	REG_ERR			0xC
+#define	REG_CAP			0x10
+#define	REG_CTL			0x18
+#define	REG_STOP_CONFIRM	0x1C
+#define	REG_CURR_BASE		0x20
+#define	REG_POWER_CHECKSUM	0x1A4
+
+#define	VALID_ID		0x74736574
+
+#define	GPIO_RESET		0x0
+#define	GPIO_ENABLED		0x1
+
+#define	SELF_JUMP(ins)		(((ins) & 0xfc00ffff) == 0xb8000000)
+
+enum ctl_mask {
+	CTL_MASK_CLEAR_POW	= 0x1,
+	CTL_MASK_CLEAR_ERR	= 0x2,
+	CTL_MASK_PAUSE		= 0x4,
+	CTL_MASK_STOP		= 0x8,
+};
+
+enum status_mask {
+	STATUS_MASK_INIT_DONE		= 0x1,
+	STATUS_MASK_STOPPED		= 0x2,
+	STATUS_MASK_PAUSE		= 0x4,
+};
+
+enum cap_mask {
+	CAP_MASK_PM			= 0x1,
+};
+
+enum {
+	MB_STATE_INIT = 0,
+	MB_STATE_RUN,
+	MB_STATE_RESET,
+};
+
+enum {
+	IO_REG,
+	IO_GPIO,
+	IO_IMAGE_MGMT,
+	IO_IMAGE_SCHE,
+	NUM_IOADDR
+};
+
+#define	READ_REG32(mb, off)		\
+	XOCL_READ_REG32(mb->base_addrs[IO_REG] + off)
+#define	WRITE_REG32(mb, val, off)	\
+	XOCL_WRITE_REG32(val, mb->base_addrs[IO_REG] + off)
+
+#define	READ_GPIO(mb, off)		\
+	XOCL_READ_REG32(mb->base_addrs[IO_GPIO] + off)
+#define	WRITE_GPIO(mb, val, off)	\
+	XOCL_WRITE_REG32(val, mb->base_addrs[IO_GPIO] + off)
+
+#define	READ_IMAGE_MGMT(mb, off)		\
+	XOCL_READ_REG32(mb->base_addrs[IO_IMAGE_MGMT] + off)
+
+#define	COPY_MGMT(mb, buf, len)		\
+	xocl_memcpy_toio(mb->base_addrs[IO_IMAGE_MGMT], buf, len)
+#define	COPY_SCHE(mb, buf, len)		\
+	xocl_memcpy_toio(mb->base_addrs[IO_IMAGE_SCHE], buf, len)
+
+struct xocl_mb {
+	struct platform_device	*pdev;
+	void __iomem		*base_addrs[NUM_IOADDR];
+
+	struct device		*hwmon_dev;
+	bool			enabled;
+	u32			state;
+	u32			cap;
+	struct mutex		mb_lock;
+
+	char			*sche_binary;
+	u32			sche_binary_length;
+	char			*mgmt_binary;
+	u32			mgmt_binary_length;
+};
+
+static int mb_stop(struct xocl_mb *mb);
+static int mb_start(struct xocl_mb *mb);
+
+/* sysfs support */
+static void safe_read32(struct xocl_mb *mb, u32 reg, u32 *val)
+{
+	mutex_lock(&mb->mb_lock);
+	if (mb->enabled && mb->state == MB_STATE_RUN)
+		*val = READ_REG32(mb, reg);
+	else
+		*val = 0;
+	mutex_unlock(&mb->mb_lock);
+}
+
+static void safe_write32(struct xocl_mb *mb, u32 reg, u32 val)
+{
+	mutex_lock(&mb->mb_lock);
+	if (mb->enabled && mb->state == MB_STATE_RUN)
+		WRITE_REG32(mb, val, reg);
+	mutex_unlock(&mb->mb_lock);
+}
+
+static ssize_t version_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_VERSION, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(version);
+
+static ssize_t id_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_ID, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(id);
+
+static ssize_t status_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_STATUS, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(status);
+
+static ssize_t error_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_ERR, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(error);
+
+static ssize_t capability_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_CAP, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(capability);
+
+static ssize_t power_checksum_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_POWER_CHECKSUM, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(power_checksum);
+
+static ssize_t pause_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(mb, REG_CTL, &val);
+
+	return sprintf(buf, "%d\n", !!(val & CTL_MASK_PAUSE));
+}
+
+static ssize_t pause_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	if (kstrtou32(buf, 10, &val) || val > 1)
+		return -EINVAL;
+
+	val = val ? CTL_MASK_PAUSE : 0;
+	safe_write32(mb, REG_CTL, val);
+
+	return count;
+}
+static DEVICE_ATTR_RW(pause);
+
+static ssize_t reset_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_mb *mb = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	if (kstrtou32(buf, 10, &val) || val > 1)
+		return -EINVAL;
+
+	if (val) {
+		mb_stop(mb);
+		mb_start(mb);
+	}
+
+	return count;
+}
+static DEVICE_ATTR_WO(reset);
+
+static struct attribute *mb_attrs[] = {
+	&dev_attr_version.attr,
+	&dev_attr_id.attr,
+	&dev_attr_status.attr,
+	&dev_attr_error.attr,
+	&dev_attr_capability.attr,
+	&dev_attr_power_checksum.attr,
+	&dev_attr_pause.attr,
+	&dev_attr_reset.attr,
+	NULL,
+};
+static struct attribute_group mb_attr_group = {
+	.attrs = mb_attrs,
+};
+
+static ssize_t show_mb_pw(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct xocl_mb *mb = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(mb, REG_CURR_BASE + attr->index * sizeof(u32), &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+
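+/*
+ * Layout note (inferred from the attr indices below, not from a spec): each
+ * power rail exposes three consecutive 32-bit registers starting at
+ * REG_CURR_BASE, in the order {highest, average, input}, so index 0..17
+ * covers six rails; e.g. curr2_input (index 5) reads
+ * REG_CURR_BASE + 5 * sizeof(u32).
+ */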
+static SENSOR_DEVICE_ATTR(curr1_highest, 0444, show_mb_pw, NULL, 0);
+static SENSOR_DEVICE_ATTR(curr1_average, 0444, show_mb_pw, NULL, 1);
+static SENSOR_DEVICE_ATTR(curr1_input, 0444, show_mb_pw, NULL, 2);
+static SENSOR_DEVICE_ATTR(curr2_highest, 0444, show_mb_pw, NULL, 3);
+static SENSOR_DEVICE_ATTR(curr2_average, 0444, show_mb_pw, NULL, 4);
+static SENSOR_DEVICE_ATTR(curr2_input, 0444, show_mb_pw, NULL, 5);
+static SENSOR_DEVICE_ATTR(curr3_highest, 0444, show_mb_pw, NULL, 6);
+static SENSOR_DEVICE_ATTR(curr3_average, 0444, show_mb_pw, NULL, 7);
+static SENSOR_DEVICE_ATTR(curr3_input, 0444, show_mb_pw, NULL, 8);
+static SENSOR_DEVICE_ATTR(curr4_highest, 0444, show_mb_pw, NULL, 9);
+static SENSOR_DEVICE_ATTR(curr4_average, 0444, show_mb_pw, NULL, 10);
+static SENSOR_DEVICE_ATTR(curr4_input, 0444, show_mb_pw, NULL, 11);
+static SENSOR_DEVICE_ATTR(curr5_highest, 0444, show_mb_pw, NULL, 12);
+static SENSOR_DEVICE_ATTR(curr5_average, 0444, show_mb_pw, NULL, 13);
+static SENSOR_DEVICE_ATTR(curr5_input, 0444, show_mb_pw, NULL, 14);
+static SENSOR_DEVICE_ATTR(curr6_highest, 0444, show_mb_pw, NULL, 15);
+static SENSOR_DEVICE_ATTR(curr6_average, 0444, show_mb_pw, NULL, 16);
+static SENSOR_DEVICE_ATTR(curr6_input, 0444, show_mb_pw, NULL, 17);
+
+static struct attribute *hwmon_mb_attributes[] = {
+	&sensor_dev_attr_curr1_highest.dev_attr.attr,
+	&sensor_dev_attr_curr1_average.dev_attr.attr,
+	&sensor_dev_attr_curr1_input.dev_attr.attr,
+	&sensor_dev_attr_curr2_highest.dev_attr.attr,
+	&sensor_dev_attr_curr2_average.dev_attr.attr,
+	&sensor_dev_attr_curr2_input.dev_attr.attr,
+	&sensor_dev_attr_curr3_highest.dev_attr.attr,
+	&sensor_dev_attr_curr3_average.dev_attr.attr,
+	&sensor_dev_attr_curr3_input.dev_attr.attr,
+	&sensor_dev_attr_curr4_highest.dev_attr.attr,
+	&sensor_dev_attr_curr4_average.dev_attr.attr,
+	&sensor_dev_attr_curr4_input.dev_attr.attr,
+	&sensor_dev_attr_curr5_highest.dev_attr.attr,
+	&sensor_dev_attr_curr5_average.dev_attr.attr,
+	&sensor_dev_attr_curr5_input.dev_attr.attr,
+	&sensor_dev_attr_curr6_highest.dev_attr.attr,
+	&sensor_dev_attr_curr6_average.dev_attr.attr,
+	&sensor_dev_attr_curr6_input.dev_attr.attr,
+	NULL
+};
+
+static const struct attribute_group hwmon_mb_attrgroup = {
+	.attrs = hwmon_mb_attributes,
+};
+
+static ssize_t show_name(struct device *dev, struct device_attribute *da,
+			 char *buf)
+{
+	return sprintf(buf, "%s\n", XCLMGMT_MB_HWMON_NAME);
+}
+
+static struct sensor_device_attribute name_attr =
+	SENSOR_ATTR(name, 0444, show_name, NULL, 0);
+
+static void mgmt_sysfs_destroy_mb(struct platform_device *pdev)
+{
+	struct xocl_mb *mb;
+
+	mb = platform_get_drvdata(pdev);
+
+	if (!mb->enabled)
+		return;
+
+	if (mb->hwmon_dev) {
+		device_remove_file(mb->hwmon_dev, &name_attr.dev_attr);
+		sysfs_remove_group(&mb->hwmon_dev->kobj,
+			&hwmon_mb_attrgroup);
+		hwmon_device_unregister(mb->hwmon_dev);
+		mb->hwmon_dev = NULL;
+	}
+
+	sysfs_remove_group(&pdev->dev.kobj, &mb_attr_group);
+}
+
+static int mgmt_sysfs_create_mb(struct platform_device *pdev)
+{
+	struct xocl_mb *mb;
+	struct xocl_dev_core *core;
+	int err;
+
+	mb = platform_get_drvdata(pdev);
+	core = XDEV(xocl_get_xdev(pdev));
+
+	if (!mb->enabled)
+		return 0;
+	err = sysfs_create_group(&pdev->dev.kobj, &mb_attr_group);
+	if (err) {
+		xocl_err(&pdev->dev, "create mb attrs failed: 0x%x", err);
+		goto create_attr_failed;
+	}
+	mb->hwmon_dev = hwmon_device_register(&core->pdev->dev);
+	if (IS_ERR(mb->hwmon_dev)) {
+		err = PTR_ERR(mb->hwmon_dev);
+		xocl_err(&pdev->dev, "register mb hwmon failed: 0x%x", err);
+		goto hwmon_reg_failed;
+	}
+
+	dev_set_drvdata(mb->hwmon_dev, mb);
+
+	err = device_create_file(mb->hwmon_dev, &name_attr.dev_attr);
+	if (err) {
+		xocl_err(&pdev->dev, "create attr name failed: 0x%x", err);
+		goto create_name_failed;
+	}
+
+	err = sysfs_create_group(&mb->hwmon_dev->kobj,
+		&hwmon_mb_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create pw group failed: 0x%x", err);
+		goto create_pw_failed;
+	}
+
+	return 0;
+
+create_pw_failed:
+	device_remove_file(mb->hwmon_dev, &name_attr.dev_attr);
+create_name_failed:
+	hwmon_device_unregister(mb->hwmon_dev);
+	mb->hwmon_dev = NULL;
+hwmon_reg_failed:
+	sysfs_remove_group(&pdev->dev.kobj, &mb_attr_group);
+create_attr_failed:
+	return err;
+}
+
+static int mb_stop(struct xocl_mb *mb)
+{
+	int retry = 0;
+	int ret = 0;
+	u32 reg_val = 0;
+
+	if (!mb->enabled)
+		return 0;
+
+	mutex_lock(&mb->mb_lock);
+	reg_val = READ_GPIO(mb, 0);
+	xocl_info(&mb->pdev->dev, "Reset GPIO 0x%x", reg_val);
+	if (reg_val == GPIO_RESET) {
+		/* MB in reset status */
+		mb->state = MB_STATE_RESET;
+		goto out;
+	}
+
+	xocl_info(&mb->pdev->dev,
+		"MGMT Image magic word, 0x%x, status 0x%x, id 0x%x",
+		READ_IMAGE_MGMT(mb, 0),
+		READ_REG32(mb, REG_STATUS),
+		READ_REG32(mb, REG_ID));
+
+	if (!SELF_JUMP(READ_IMAGE_MGMT(mb, 0))) {
+		/* non cold boot */
+		reg_val = READ_REG32(mb, REG_STATUS);
+		if (!(reg_val & STATUS_MASK_STOPPED)) {
+			// need to stop microblaze
+			xocl_info(&mb->pdev->dev, "stopping microblaze...");
+			WRITE_REG32(mb, CTL_MASK_STOP, REG_CTL);
+			WRITE_REG32(mb, 1, REG_STOP_CONFIRM);
+			while (retry++ < MAX_RETRY &&
+				!(READ_REG32(mb, REG_STATUS) &
+				STATUS_MASK_STOPPED)) {
+				msleep(RETRY_INTERVAL);
+			}
+			if (retry >= MAX_RETRY) {
+				xocl_err(&mb->pdev->dev,
+					"Failed to stop microblaze");
+				xocl_err(&mb->pdev->dev,
+					"Error Reg 0x%x",
+					READ_REG32(mb, REG_ERR));
+				ret = -EIO;
+				goto out;
+			}
+		}
+		xocl_info(&mb->pdev->dev, "Microblaze Stopped, retry %d",
+			retry);
+	}
+
+	/* hold reset */
+	WRITE_GPIO(mb, GPIO_RESET, 0);
+	mb->state = MB_STATE_RESET;
+out:
+	mutex_unlock(&mb->mb_lock);
+
+	return ret;
+}
+
+static int mb_start(struct xocl_mb *mb)
+{
+	int retry = 0;
+	u32 reg_val = 0;
+	int ret = 0;
+	void *xdev_hdl;
+
+	if (!mb->enabled)
+		return 0;
+
+	xdev_hdl = xocl_get_xdev(mb->pdev);
+
+	mutex_lock(&mb->mb_lock);
+	reg_val = READ_GPIO(mb, 0);
+	xocl_info(&mb->pdev->dev, "Reset GPIO 0x%x", reg_val);
+	if (reg_val == GPIO_ENABLED)
+		goto out;
+
+	xocl_info(&mb->pdev->dev, "Start Microblaze...");
+	xocl_info(&mb->pdev->dev, "MGMT Image magic word, 0x%x",
+		READ_IMAGE_MGMT(mb, 0));
+
+	if (xocl_mb_mgmt_on(xdev_hdl)) {
+		xocl_info(&mb->pdev->dev, "Copying mgmt image len %d",
+			mb->mgmt_binary_length);
+		COPY_MGMT(mb, mb->mgmt_binary, mb->mgmt_binary_length);
+	}
+
+	if (xocl_mb_sched_on(xdev_hdl)) {
+		xocl_info(&mb->pdev->dev, "Copying scheduler image len %d",
+			mb->sche_binary_length);
+		COPY_SCHE(mb, mb->sche_binary, mb->sche_binary_length);
+	}
+
+	WRITE_GPIO(mb, GPIO_ENABLED, 0);
+	xocl_info(&mb->pdev->dev,
+		"MGMT Image magic word, 0x%x, status 0x%x, id 0x%x",
+		READ_IMAGE_MGMT(mb, 0),
+		READ_REG32(mb, REG_STATUS),
+		READ_REG32(mb, REG_ID));
+	do {
+		msleep(RETRY_INTERVAL);
+	} while (retry++ < MAX_RETRY && (READ_REG32(mb, REG_STATUS) &
+		STATUS_MASK_STOPPED));
+
+	/* Extra pulse needed as workaround for axi interconnect issue in DSA */
+	if (retry >= MAX_RETRY) {
+		retry = 0;
+		WRITE_GPIO(mb, GPIO_RESET, 0);
+		WRITE_GPIO(mb, GPIO_ENABLED, 0);
+		do {
+			msleep(RETRY_INTERVAL);
+		} while (retry++ < MAX_RETRY && (READ_REG32(mb, REG_STATUS) &
+			STATUS_MASK_STOPPED));
+	}
+
+	if (retry >= MAX_RETRY) {
+		xocl_err(&mb->pdev->dev, "Failed to start microblaze");
+		xocl_err(&mb->pdev->dev, "Error Reg 0x%x",
+				READ_REG32(mb, REG_ERR));
+			ret = -EIO;
+	}
+
+	mb->cap = READ_REG32(mb, REG_CAP);
+	mb->state = MB_STATE_RUN;
+out:
+	mutex_unlock(&mb->mb_lock);
+
+	return ret;
+}
+
+static void mb_reset(struct platform_device *pdev)
+{
+	struct xocl_mb *mb;
+
+	xocl_info(&pdev->dev, "Reset Microblaze...");
+	mb = platform_get_drvdata(pdev);
+	if (!mb)
+		return;
+
+	mb_stop(mb);
+	mb_start(mb);
+}
+
+static int load_mgmt_image(struct platform_device *pdev, const char *image,
+	u32 len)
+{
+	struct xocl_mb *mb;
+	char *binary;
+
+	if (len > MAX_IMAGE_LEN)
+		return -EINVAL;
+
+	mb = platform_get_drvdata(pdev);
+	if (!mb)
+		return -EINVAL;
+
+	binary = mb->mgmt_binary;
+	mb->mgmt_binary = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);
+	if (!mb->mgmt_binary)
+		return -ENOMEM;
+
+	if (binary)
+		devm_kfree(&pdev->dev, binary);
+	memcpy(mb->mgmt_binary, image, len);
+	mb->mgmt_binary_length = len;
+
+	return 0;
+}
+
+static int load_sche_image(struct platform_device *pdev, const char *image,
+	u32 len)
+{
+	struct xocl_mb *mb;
+	char *binary = NULL;
+
+	if (len > MAX_IMAGE_LEN)
+		return -EINVAL;
+
+	mb = platform_get_drvdata(pdev);
+	if (!mb)
+		return -EINVAL;
+
+	binary = mb->sche_binary;
+	mb->sche_binary = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);
+	if (!mb->sche_binary)
+		return -ENOMEM;
+
+	if (binary)
+		devm_kfree(&pdev->dev, binary);
+	memcpy(mb->sche_binary, image, len);
+	mb->sche_binary_length = len;
+
+	return 0;
+}
+
+/* Intentional no-op stub for ops that require no action here */
+static int mb_ignore(struct platform_device *pdev)
+{
+	return 0;
+}
+
+static struct xocl_mb_funcs mb_ops = {
+	.load_mgmt_image	= load_mgmt_image,
+	.load_sche_image	= load_sche_image,
+	.reset			= mb_reset,
+	.stop			= mb_ignore,
+};
+
+static int mb_remove(struct platform_device *pdev)
+{
+	struct xocl_mb *mb;
+	int	i;
+
+	mb = platform_get_drvdata(pdev);
+	if (!mb)
+		return 0;
+
+	if (mb->mgmt_binary)
+		devm_kfree(&pdev->dev, mb->mgmt_binary);
+	if (mb->sche_binary)
+		devm_kfree(&pdev->dev, mb->sche_binary);
+
+	/*
+	 * It would be more secure to keep the MB running even after the
+	 * driver is unloaded: even if the user unloads our driver and runs
+	 * their own stack, the MB could still monitor the board unless it is
+	 * stopped explicitly.
+	 */
+	mb_stop(mb);
+
+	mgmt_sysfs_destroy_mb(pdev);
+
+	for (i = 0; i < NUM_IOADDR; i++) {
+		if (mb->base_addrs[i])
+			iounmap(mb->base_addrs[i]);
+	}
+
+	mutex_destroy(&mb->mb_lock);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, mb);
+
+	return 0;
+}
+
+static int mb_probe(struct platform_device *pdev)
+{
+	struct xocl_mb *mb;
+	struct resource *res;
+	void	*xdev_hdl;
+	int i, err;
+
+	mb = devm_kzalloc(&pdev->dev, sizeof(*mb), GFP_KERNEL);
+	if (!mb) {
+		xocl_err(&pdev->dev, "out of memory");
+		return -ENOMEM;
+	}
+
+	mb->pdev = pdev;
+	platform_set_drvdata(pdev, mb);
+	mutex_init(&mb->mb_lock);
+
+	xdev_hdl = xocl_get_xdev(pdev);
+	if (xocl_mb_mgmt_on(xdev_hdl) || xocl_mb_sched_on(xdev_hdl)) {
+		xocl_info(&pdev->dev, "Microblaze is supported.");
+		mb->enabled = true;
+	} else {
+		xocl_info(&pdev->dev, "Microblaze is not supported.");
+		devm_kfree(&pdev->dev, mb);
+		platform_set_drvdata(pdev, NULL);
+		return 0;
+	}
+
+	for (i = 0; i < NUM_IOADDR; i++) {
+		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+		if (!res) {
+			err = -EINVAL;
+			xocl_err(&pdev->dev, "Missing resource %d", i);
+			goto failed;
+		}
+		xocl_info(&pdev->dev, "IO start: 0x%llx, end: 0x%llx",
+			res->start, res->end);
+		mb->base_addrs[i] =
+			ioremap_nocache(res->start, resource_size(res));
+		if (!mb->base_addrs[i]) {
+			err = -EIO;
+			xocl_err(&pdev->dev, "Map iomem failed");
+			goto failed;
+		}
+	}
+
+	err = mgmt_sysfs_create_mb(pdev);
+	if (err) {
+		xocl_err(&pdev->dev, "Create sysfs failed, err %d", err);
+		goto failed;
+	}
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_MB, &mb_ops);
+
+	return 0;
+
+failed:
+	mb_remove(pdev);
+	return err;
+}
+
+struct platform_device_id mb_id_table[] = {
+	{ XOCL_MB, 0 },
+	{ },
+};
+
+static struct platform_driver	mb_driver = {
+	.probe		= mb_probe,
+	.remove		= mb_remove,
+	.driver		= {
+		.name = "xocl_mb",
+	},
+	.id_table = mb_id_table,
+};
+
+int __init xocl_init_mb(void)
+{
+	return platform_driver_register(&mb_driver);
+}
+
+void xocl_fini_mb(void)
+{
+	platform_driver_unregister(&mb_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/mig.c b/drivers/gpu/drm/xocl/subdev/mig.c
new file mode 100644
index 000000000000..5a574f7af796
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/mig.c
@@ -0,0 +1,256 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2018-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Chien-Wei Lan <chienwei@xilinx.com>
+ *
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+/* Registers are defined in pg150-ultrascale-memory-ip.pdf:
+ * AXI4-Lite Slave Control/Status Register Map
+ */
+
+#define MIG_DEBUG
+#define	MIG_DEV2MIG(dev)	\
+	((struct xocl_mig *)platform_get_drvdata(to_platform_device(dev)))
+#define	MIG_DEV2BASE(dev)	(MIG_DEV2MIG(dev)->base)
+
+#define ECC_STATUS	0x0
+#define ECC_ON_OFF	0x8
+#define CE_CNT		0xC
+#define CE_ADDR_LO	0x1C0
+#define CE_ADDR_HI	0x1C4
+#define UE_ADDR_LO	0x2C0
+#define UE_ADDR_HI	0x2C4
+#define INJ_FAULT_REG	0x300
+
+struct xocl_mig {
+	void __iomem	*base;
+	struct device	*mig_dev;
+};
+
+static ssize_t ecc_ue_ffa_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	uint64_t val = ioread32(MIG_DEV2BASE(dev) + UE_ADDR_HI);
+
+	val <<= 32;
+	val |= ioread32(MIG_DEV2BASE(dev) + UE_ADDR_LO);
+	return sprintf(buf, "0x%llx\n", val);
+}
+static DEVICE_ATTR_RO(ecc_ue_ffa);
+
+
+static ssize_t ecc_ce_ffa_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	uint64_t val = ioread32(MIG_DEV2BASE(dev) + CE_ADDR_HI);
+
+	val <<= 32;
+	val |= ioread32(MIG_DEV2BASE(dev) + CE_ADDR_LO);
+	return sprintf(buf, "0x%llx\n", val);
+}
+static DEVICE_ATTR_RO(ecc_ce_ffa);
+
+
+static ssize_t ecc_ce_cnt_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%u\n", ioread32(MIG_DEV2BASE(dev) + CE_CNT));
+}
+static DEVICE_ATTR_RO(ecc_ce_cnt);
+
+
+static ssize_t ecc_status_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%u\n", ioread32(MIG_DEV2BASE(dev) + ECC_STATUS));
+}
+static DEVICE_ATTR_RO(ecc_status);
+
+
+static ssize_t ecc_reset_store(struct device *dev, struct device_attribute *da,
+	const char *buf, size_t count)
+{
+	iowrite32(0x3, MIG_DEV2BASE(dev) + ECC_STATUS);
+	iowrite32(0, MIG_DEV2BASE(dev) + CE_CNT);
+	return count;
+}
+static DEVICE_ATTR_WO(ecc_reset);
+
+
+static ssize_t ecc_enabled_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%u\n", ioread32(MIG_DEV2BASE(dev) + ECC_ON_OFF));
+}
+static ssize_t ecc_enabled_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	uint32_t val;
+
+	if (kstrtou32(buf, 10, &val) || val > 1) {
+		xocl_err(&to_platform_device(dev)->dev,
+			"usage: echo [0|1] > ecc_enabled");
+		return -EINVAL;
+	}
+
+	iowrite32(val, MIG_DEV2BASE(dev) + ECC_ON_OFF);
+	return count;
+}
+static DEVICE_ATTR_RW(ecc_enabled);
+
+
+#ifdef MIG_DEBUG
+static ssize_t ecc_inject_store(struct device *dev, struct device_attribute *da,
+	const char *buf, size_t count)
+{
+	iowrite32(1, MIG_DEV2BASE(dev) + INJ_FAULT_REG);
+	return count;
+}
+static DEVICE_ATTR_WO(ecc_inject);
+#endif
+
+
+/* Standard sysfs entry for all dynamic subdevices. */
+static ssize_t name_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%s\n", XOCL_GET_SUBDEV_PRIV(dev));
+}
+static DEVICE_ATTR_RO(name);
+
+
+static struct attribute *mig_attributes[] = {
+	&dev_attr_name.attr,
+	&dev_attr_ecc_enabled.attr,
+	&dev_attr_ecc_status.attr,
+	&dev_attr_ecc_ce_cnt.attr,
+	&dev_attr_ecc_ce_ffa.attr,
+	&dev_attr_ecc_ue_ffa.attr,
+	&dev_attr_ecc_reset.attr,
+#ifdef MIG_DEBUG
+	&dev_attr_ecc_inject.attr,
+#endif
+	NULL
+};
+
+static const struct attribute_group mig_attrgroup = {
+	.attrs = mig_attributes,
+};
+
+static void mgmt_sysfs_destroy_mig(struct platform_device *pdev)
+{
+	struct xocl_mig *mig;
+
+	mig = platform_get_drvdata(pdev);
+	sysfs_remove_group(&pdev->dev.kobj, &mig_attrgroup);
+}
+
+static int mgmt_sysfs_create_mig(struct platform_device *pdev)
+{
+	struct xocl_mig *mig;
+	int err;
+
+	mig = platform_get_drvdata(pdev);
+	err = sysfs_create_group(&pdev->dev.kobj, &mig_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create pw group failed: 0x%x", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static int mig_probe(struct platform_device *pdev)
+{
+	struct xocl_mig *mig;
+	struct resource *res;
+	int err;
+
+	mig = devm_kzalloc(&pdev->dev, sizeof(*mig), GFP_KERNEL);
+	if (!mig)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		xocl_err(&pdev->dev, "resource is NULL");
+		return -EINVAL;
+	}
+
+	xocl_info(&pdev->dev, "MIG name: %s, IO start: 0x%llx, end: 0x%llx",
+		XOCL_GET_SUBDEV_PRIV(&pdev->dev), res->start, res->end);
+
+	mig->base = ioremap_nocache(res->start, resource_size(res));
+	if (!mig->base) {
+		xocl_err(&pdev->dev, "Map iomem failed");
+		return -EIO;
+	}
+
+	platform_set_drvdata(pdev, mig);
+
+	err = mgmt_sysfs_create_mig(pdev);
+	if (err) {
+		platform_set_drvdata(pdev, NULL);
+		iounmap(mig->base);
+		return err;
+	}
+
+	return 0;
+}
+
+
+static int mig_remove(struct platform_device *pdev)
+{
+	struct xocl_mig	*mig;
+
+	mig = platform_get_drvdata(pdev);
+	if (!mig) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+
+	xocl_info(&pdev->dev, "MIG name: %s", XOCL_GET_SUBDEV_PRIV(&pdev->dev));
+
+	mgmt_sysfs_destroy_mig(pdev);
+
+	if (mig->base)
+		iounmap(mig->base);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, mig);
+
+	return 0;
+}
+
+struct platform_device_id mig_id_table[] = {
+	{ XOCL_MIG, 0 },
+	{ },
+};
+
+static struct platform_driver	mig_driver = {
+	.probe		= mig_probe,
+	.remove		= mig_remove,
+	.driver		= {
+		.name = "xocl_mig",
+	},
+	.id_table = mig_id_table,
+};
+
+int __init xocl_init_mig(void)
+{
+	return platform_driver_register(&mig_driver);
+}
+
+void xocl_fini_mig(void)
+{
+	platform_driver_unregister(&mig_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/sysmon.c b/drivers/gpu/drm/xocl/subdev/sysmon.c
new file mode 100644
index 000000000000..bb5c84485344
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/sysmon.c
@@ -0,0 +1,385 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+#define TEMP		0x400		// TEMPERATURE REGISTER OFFSET
+#define VCCINT		0x404		// VCCINT REGISTER OFFSET
+#define VCCAUX		0x408		// VCCAUX REGISTER OFFSET
+#define VCCBRAM		0x418		// VCCBRAM REGISTER OFFSET
+#define	TEMP_MAX	0x480
+#define	VCCINT_MAX	0x484
+#define	VCCAUX_MAX	0x488
+#define	VCCBRAM_MAX	0x48c
+#define	TEMP_MIN	0x490
+#define	VCCINT_MIN	0x494
+#define	VCCAUX_MIN	0x498
+#define	VCCBRAM_MIN	0x49c
+
+#define	SYSMON_TO_MILLDEGREE(val)		\
+	(((int64_t)(val) * 501374 >> 16) - 273678)
+#define	SYSMON_TO_MILLVOLT(val)			\
+	((val) * 1000 * 3 >> 16)
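+
+/*
+ * Worked example (illustrative only, derived from the two macros above): a
+ * raw SYSMON temperature sample of 0x9C00 (39936) converts to
+ * (39936 * 501374 >> 16) - 273678 = 31846 millidegrees (~31.8 C), and a raw
+ * supply sample of 0x5555 (21845) converts to 21845 * 3000 >> 16 = 999 mV.
+ */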
+
+#define	READ_REG32(sysmon, off)		\
+	XOCL_READ_REG32(sysmon->base + off)
+#define	WRITE_REG32(sysmon, val, off)	\
+	XOCL_WRITE_REG32(val, sysmon->base + off)
+
+struct xocl_sysmon {
+	void __iomem		*base;
+	struct device		*hwmon_dev;
+};
+
+static int get_prop(struct platform_device *pdev, u32 prop, void *val)
+{
+	struct xocl_sysmon	*sysmon;
+	u32			tmp;
+
+	sysmon = platform_get_drvdata(pdev);
+	BUG_ON(!sysmon);
+
+	switch (prop) {
+	case XOCL_SYSMON_PROP_TEMP:
+		tmp = READ_REG32(sysmon, TEMP);
+		*(u32 *)val = SYSMON_TO_MILLDEGREE(tmp)/1000;
+		break;
+	case XOCL_SYSMON_PROP_TEMP_MAX:
+		tmp = READ_REG32(sysmon, TEMP_MAX);
+		*(u32 *)val = SYSMON_TO_MILLDEGREE(tmp);
+		break;
+	case XOCL_SYSMON_PROP_TEMP_MIN:
+		tmp = READ_REG32(sysmon, TEMP_MIN);
+		*(u32 *)val = SYSMON_TO_MILLDEGREE(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_INT:
+		tmp = READ_REG32(sysmon, VCCINT);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_INT_MAX:
+		tmp = READ_REG32(sysmon, VCCINT_MAX);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_INT_MIN:
+		tmp = READ_REG32(sysmon, VCCINT_MIN);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_AUX:
+		tmp = READ_REG32(sysmon, VCCAUX);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_AUX_MAX:
+		tmp = READ_REG32(sysmon, VCCAUX_MAX);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_AUX_MIN:
+		tmp = READ_REG32(sysmon, VCCAUX_MIN);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_BRAM:
+		tmp = READ_REG32(sysmon, VCCBRAM);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_BRAM_MAX:
+		tmp = READ_REG32(sysmon, VCCBRAM_MAX);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	case XOCL_SYSMON_PROP_VCC_BRAM_MIN:
+		tmp = READ_REG32(sysmon, VCCBRAM_MIN);
+		*(u32 *)val = SYSMON_TO_MILLVOLT(tmp);
+		break;
+	default:
+		xocl_err(&pdev->dev, "Invalid prop");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static struct xocl_sysmon_funcs sysmon_ops = {
+	.get_prop	= get_prop,
+};
+
+static ssize_t show_sysmon(struct platform_device *pdev, u32 prop, char *buf)
+{
+	u32 val;
+
+	(void) get_prop(pdev, prop, &val);
+	return sprintf(buf, "%u\n", val);
+}
+
+/* sysfs support */
+static ssize_t show_hwmon(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct platform_device *pdev = dev_get_drvdata(dev);
+
+	return show_sysmon(pdev, attr->index, buf);
+}
+
+static ssize_t show_name(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%s\n", XCLMGMT_SYSMON_HWMON_NAME);
+}
+
+static SENSOR_DEVICE_ATTR(temp1_input, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_TEMP);
+static SENSOR_DEVICE_ATTR(temp1_highest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_TEMP_MAX);
+static SENSOR_DEVICE_ATTR(temp1_lowest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_TEMP_MIN);
+
+static SENSOR_DEVICE_ATTR(in0_input, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_INT);
+static SENSOR_DEVICE_ATTR(in0_highest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_INT_MAX);
+static SENSOR_DEVICE_ATTR(in0_lowest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_INT_MIN);
+
+static SENSOR_DEVICE_ATTR(in1_input, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_AUX);
+static SENSOR_DEVICE_ATTR(in1_highest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_AUX_MAX);
+static SENSOR_DEVICE_ATTR(in1_lowest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_AUX_MIN);
+
+static SENSOR_DEVICE_ATTR(in2_input, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_BRAM);
+static SENSOR_DEVICE_ATTR(in2_highest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_BRAM_MAX);
+static SENSOR_DEVICE_ATTR(in2_lowest, 0444, show_hwmon, NULL,
+	XOCL_SYSMON_PROP_VCC_BRAM_MIN);
+
+static struct attribute *hwmon_sysmon_attributes[] = {
+	&sensor_dev_attr_temp1_input.dev_attr.attr,
+	&sensor_dev_attr_temp1_highest.dev_attr.attr,
+	&sensor_dev_attr_temp1_lowest.dev_attr.attr,
+	&sensor_dev_attr_in0_input.dev_attr.attr,
+	&sensor_dev_attr_in0_highest.dev_attr.attr,
+	&sensor_dev_attr_in0_lowest.dev_attr.attr,
+	&sensor_dev_attr_in1_input.dev_attr.attr,
+	&sensor_dev_attr_in1_highest.dev_attr.attr,
+	&sensor_dev_attr_in1_lowest.dev_attr.attr,
+	&sensor_dev_attr_in2_input.dev_attr.attr,
+	&sensor_dev_attr_in2_highest.dev_attr.attr,
+	&sensor_dev_attr_in2_lowest.dev_attr.attr,
+	NULL
+};
+
+static const struct attribute_group hwmon_sysmon_attrgroup = {
+	.attrs = hwmon_sysmon_attributes,
+};
+
+static struct sensor_device_attribute sysmon_name_attr =
+	SENSOR_ATTR(name, 0444, show_name, NULL, 0);
+
+static ssize_t temp_show(struct device *dev, struct device_attribute *da,
+			 char *buf)
+{
+	return show_sysmon(to_platform_device(dev), XOCL_SYSMON_PROP_TEMP, buf);
+}
+static DEVICE_ATTR_RO(temp);
+
+static ssize_t vcc_int_show(struct device *dev, struct device_attribute *da,
+			    char *buf)
+{
+	return show_sysmon(to_platform_device(dev), XOCL_SYSMON_PROP_VCC_INT, buf);
+}
+static DEVICE_ATTR_RO(vcc_int);
+
+static ssize_t vcc_aux_show(struct device *dev, struct device_attribute *da,
+			    char *buf)
+{
+	return show_sysmon(to_platform_device(dev), XOCL_SYSMON_PROP_VCC_AUX, buf);
+}
+static DEVICE_ATTR_RO(vcc_aux);
+
+static ssize_t vcc_bram_show(struct device *dev, struct device_attribute *da,
+			     char *buf)
+{
+	return show_sysmon(to_platform_device(dev), XOCL_SYSMON_PROP_VCC_BRAM, buf);
+}
+static DEVICE_ATTR_RO(vcc_bram);
+
+static struct attribute *sysmon_attributes[] = {
+	&dev_attr_temp.attr,
+	&dev_attr_vcc_int.attr,
+	&dev_attr_vcc_aux.attr,
+	&dev_attr_vcc_bram.attr,
+	NULL,
+};
+
+static const struct attribute_group sysmon_attrgroup = {
+	.attrs = sysmon_attributes,
+};
+
+static void mgmt_sysfs_destroy_sysmon(struct platform_device *pdev)
+{
+	struct xocl_sysmon *sysmon;
+
+	sysmon = platform_get_drvdata(pdev);
+
+	device_remove_file(sysmon->hwmon_dev, &sysmon_name_attr.dev_attr);
+	sysfs_remove_group(&sysmon->hwmon_dev->kobj, &hwmon_sysmon_attrgroup);
+	hwmon_device_unregister(sysmon->hwmon_dev);
+	sysmon->hwmon_dev = NULL;
+
+	sysfs_remove_group(&pdev->dev.kobj, &sysmon_attrgroup);
+}
+
+static int mgmt_sysfs_create_sysmon(struct platform_device *pdev)
+{
+	struct xocl_sysmon *sysmon;
+	struct xocl_dev_core *core;
+	int err;
+
+	sysmon = platform_get_drvdata(pdev);
+	core = XDEV(xocl_get_xdev(pdev));
+
+	sysmon->hwmon_dev = hwmon_device_register(&core->pdev->dev);
+	if (IS_ERR(sysmon->hwmon_dev)) {
+		err = PTR_ERR(sysmon->hwmon_dev);
+		xocl_err(&pdev->dev, "register sysmon hwmon failed: 0x%x", err);
+		goto hwmon_reg_failed;
+	}
+
+	dev_set_drvdata(sysmon->hwmon_dev, pdev);
+	err = device_create_file(sysmon->hwmon_dev,
+		&sysmon_name_attr.dev_attr);
+	if (err) {
+		xocl_err(&pdev->dev, "create attr name failed: 0x%x", err);
+		goto create_name_failed;
+	}
+
+	err = sysfs_create_group(&sysmon->hwmon_dev->kobj,
+		&hwmon_sysmon_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create hwmon group failed: 0x%x", err);
+		goto create_hwmon_failed;
+	}
+
+	err = sysfs_create_group(&pdev->dev.kobj, &sysmon_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create sysmon group failed: 0x%x", err);
+		goto create_sysmon_failed;
+	}
+
+	return 0;
+
+create_sysmon_failed:
+	sysfs_remove_group(&sysmon->hwmon_dev->kobj, &hwmon_sysmon_attrgroup);
+create_hwmon_failed:
+	device_remove_file(sysmon->hwmon_dev, &sysmon_name_attr.dev_attr);
+create_name_failed:
+	hwmon_device_unregister(sysmon->hwmon_dev);
+	sysmon->hwmon_dev = NULL;
+hwmon_reg_failed:
+	return err;
+}
+
+static int sysmon_probe(struct platform_device *pdev)
+{
+	struct xocl_sysmon *sysmon;
+	struct resource *res;
+	int err;
+
+	sysmon = devm_kzalloc(&pdev->dev, sizeof(*sysmon), GFP_KERNEL);
+	if (!sysmon)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		xocl_err(&pdev->dev, "resource is NULL");
+		return -EINVAL;
+	}
+	xocl_info(&pdev->dev, "IO start: 0x%llx, end: 0x%llx",
+		res->start, res->end);
+	sysmon->base = ioremap_nocache(res->start, resource_size(res));
+	if (!sysmon->base) {
+		err = -EIO;
+		xocl_err(&pdev->dev, "Map iomem failed");
+		goto failed;
+	}
+
+	platform_set_drvdata(pdev, sysmon);
+
+	err = mgmt_sysfs_create_sysmon(pdev);
+	if (err)
+		goto create_sysmon_failed;
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_SYSMON, &sysmon_ops);
+
+	return 0;
+
+create_sysmon_failed:
+	platform_set_drvdata(pdev, NULL);
+failed:
+	return err;
+}
+
+
+static int sysmon_remove(struct platform_device *pdev)
+{
+	struct xocl_sysmon	*sysmon;
+
+	sysmon = platform_get_drvdata(pdev);
+	if (!sysmon) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+
+	mgmt_sysfs_destroy_sysmon(pdev);
+
+	if (sysmon->base)
+		iounmap(sysmon->base);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, sysmon);
+
+	return 0;
+}
+
+struct platform_device_id sysmon_id_table[] = {
+	{ XOCL_SYSMON, 0 },
+	{ },
+};
+
+static struct platform_driver	sysmon_driver = {
+	.probe		= sysmon_probe,
+	.remove		= sysmon_remove,
+	.driver		= {
+		.name = "xocl_sysmon",
+	},
+	.id_table = sysmon_id_table,
+};
+
+int __init xocl_init_sysmon(void)
+{
+	return platform_driver_register(&sysmon_driver);
+}
+
+void xocl_fini_sysmon(void)
+{
+	platform_driver_unregister(&sysmon_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/xdma.c b/drivers/gpu/drm/xocl/subdev/xdma.c
new file mode 100644
index 000000000000..647a69f29a84
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/xdma.c
@@ -0,0 +1,510 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ */
+
+/* XDMA version Memory Mapped DMA */
+
+#include <linux/version.h>
+#include <linux/eventfd.h>
+#include <drm/drmP.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_mm.h>
+#include "../xocl_drv.h"
+#include "../xocl_drm.h"
+#include "../lib/libxdma_api.h"
+
+#define XOCL_FILE_PAGE_OFFSET   0x100000
+#ifndef VM_RESERVED
+#define VM_RESERVED (VM_DONTEXPAND | VM_DONTDUMP)
+#endif
+
+struct xdma_irq {
+	struct eventfd_ctx	*event_ctx;
+	bool			in_use;
+	bool			enabled;
+	irq_handler_t		handler;
+	void			*arg;
+};
+
+struct xocl_xdma {
+	void			*dma_handle;
+	u32			max_user_intr;
+	u32			start_user_intr;
+	struct xdma_irq		*user_msix_table;
+	struct mutex		user_msix_table_lock;
+
+	struct xocl_drm		*drm;
+	/* Number of bidirectional channels */
+	u32			channel;
+	/* Semaphore, one for each direction */
+	struct semaphore	channel_sem[2];
+	/*
+	 * Channel usage bitmasks, one for each direction.
+	 * A set bit means the channel is free; a cleared bit means it is
+	 * in use.
+	 */
+	unsigned long		channel_bitmap[2];
+	unsigned long long	*channel_usage[2];
+
+	struct mutex		stat_lock;
+};
+
+static ssize_t xdma_migrate_bo(struct platform_device *pdev,
+	struct sg_table *sgt, u32 dir, u64 paddr, u32 channel, u64 len)
+{
+	struct xocl_xdma *xdma;
+	struct page *pg;
+	struct scatterlist *sg = sgt->sgl;
+	int nents = sgt->orig_nents;
+	pid_t pid = current->pid;
+	int i = 0;
+	ssize_t ret;
+	unsigned long long pgaddr;
+
+	xdma = platform_get_drvdata(pdev);
+	xocl_dbg(&pdev->dev, "TID %d, Channel:%d, Offset: 0x%llx, Dir: %d",
+		pid, channel, paddr, dir);
+	ret = xdma_xfer_submit(xdma->dma_handle, channel, dir,
+		paddr, sgt, false, 10000);
+	if (ret >= 0) {
+		xdma->channel_usage[dir][channel] += ret;
+		return ret;
+	}
+
+	xocl_err(&pdev->dev, "DMA failed, Dumping SG Page Table");
+	for (i = 0; i < nents; i++, sg = sg_next(sg)) {
+		if (!sg)
+			break;
+		pg = sg_page(sg);
+		if (!pg)
+			continue;
+		pgaddr = page_to_phys(pg);
+		xocl_err(&pdev->dev, "%i, 0x%llx\n", i, pgaddr);
+	}
+	return ret;
+}
+
+static int acquire_channel(struct platform_device *pdev, u32 dir)
+{
+	struct xocl_xdma *xdma;
+	int channel = 0;
+	int result = 0;
+
+	xdma = platform_get_drvdata(pdev);
+	if (down_interruptible(&xdma->channel_sem[dir])) {
+		channel = -ERESTARTSYS;
+		goto out;
+	}
+
+	for (channel = 0; channel < xdma->channel; channel++) {
+		result = test_and_clear_bit(channel,
+			&xdma->channel_bitmap[dir]);
+		if (result)
+			break;
+	}
+	if (!result) {
+		/* Should not happen: the semaphore guarantees a free bit */
+		up(&xdma->channel_sem[dir]);
+		channel = -EIO;
+	}
+
+out:
+	return channel;
+}
+
+static void release_channel(struct platform_device *pdev, u32 dir, u32 channel)
+{
+	struct xocl_xdma *xdma;
+
+
+	xdma = platform_get_drvdata(pdev);
+	set_bit(channel, &xdma->channel_bitmap[dir]);
+	up(&xdma->channel_sem[dir]);
+}
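+
+/*
+ * Illustrative accounting (not new behavior): with xdma->channel == 4 the
+ * per-direction bitmap is initialized to 0xF in set_max_chan() below, i.e.
+ * all channels free; acquire_channel() downs the semaphore and clears one
+ * set bit, release_channel() sets the bit back and ups the semaphore.
+ */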
+
+static u32 get_channel_count(struct platform_device *pdev)
+{
+	struct xocl_xdma *xdma;
+
+	xdma = platform_get_drvdata(pdev);
+	BUG_ON(!xdma);
+
+	return xdma->channel;
+}
+
+static void *get_drm_handle(struct platform_device *pdev)
+{
+	struct xocl_xdma *xdma;
+
+	xdma = platform_get_drvdata(pdev);
+
+	return xdma->drm;
+}
+
+static u64 get_channel_stat(struct platform_device *pdev, u32 channel,
+	u32 write)
+{
+	struct xocl_xdma *xdma;
+
+	xdma = platform_get_drvdata(pdev);
+	BUG_ON(!xdma);
+
+	return xdma->channel_usage[write][channel];
+}
+
+static int user_intr_config(struct platform_device *pdev, u32 intr, bool en)
+{
+	struct xocl_xdma *xdma;
+	const unsigned int mask = 1 << intr;
+	int ret;
+
+	xdma = platform_get_drvdata(pdev);
+
+	if (intr >= xdma->max_user_intr) {
+		xocl_err(&pdev->dev, "Invalid intr %d, user start %d, max %d",
+			intr, xdma->start_user_intr, xdma->max_user_intr);
+		return -EINVAL;
+	}
+
+	mutex_lock(&xdma->user_msix_table_lock);
+	if (xdma->user_msix_table[intr].enabled == en) {
+		ret = 0;
+		goto end;
+	}
+
+	ret = en ? xdma_user_isr_enable(xdma->dma_handle, mask) :
+		xdma_user_isr_disable(xdma->dma_handle, mask);
+	if (!ret)
+		xdma->user_msix_table[intr].enabled = en;
+end:
+	mutex_unlock(&xdma->user_msix_table_lock);
+
+	return ret;
+}
+
+static irqreturn_t xdma_isr(int irq, void *arg)
+{
+	struct xdma_irq *irq_entry = arg;
+	int ret = IRQ_HANDLED;
+
+	if (irq_entry->handler)
+		ret = irq_entry->handler(irq, irq_entry->arg);
+
+	if (!IS_ERR_OR_NULL(irq_entry->event_ctx))
+		eventfd_signal(irq_entry->event_ctx, 1);
+
+	return ret;
+}
+
+static int user_intr_unreg(struct platform_device *pdev, u32 intr)
+{
+	struct xocl_xdma *xdma;
+	const unsigned int mask = 1 << intr;
+	int ret;
+
+	xdma = platform_get_drvdata(pdev);
+
+	if (intr >= xdma->max_user_intr)
+		return -EINVAL;
+
+	mutex_lock(&xdma->user_msix_table_lock);
+	if (!xdma->user_msix_table[intr].in_use) {
+		ret = -EINVAL;
+		goto failed;
+	}
+	xdma->user_msix_table[intr].handler = NULL;
+	xdma->user_msix_table[intr].arg = NULL;
+
+	ret = xdma_user_isr_register(xdma->dma_handle, mask, NULL, NULL);
+	if (ret) {
+		xocl_err(&pdev->dev, "xdma unregister isr failed");
+		goto failed;
+	}
+
+	xdma->user_msix_table[intr].in_use = false;
+
+failed:
+	mutex_unlock(&xdma->user_msix_table_lock);
+	return ret;
+}
+
+static int user_intr_register(struct platform_device *pdev, u32 intr,
+	irq_handler_t handler, void *arg, int event_fd)
+{
+	struct xocl_xdma *xdma;
+	struct eventfd_ctx *trigger = ERR_PTR(-EINVAL);
+	const unsigned int mask = 1 << intr;
+	int ret;
+
+	xdma = platform_get_drvdata(pdev);
+
+	if (intr >= xdma->max_user_intr ||
+			(event_fd >= 0 && intr < xdma->start_user_intr)) {
+		xocl_err(&pdev->dev, "Invalid intr %d, user start %d, max %d",
+			intr, xdma->start_user_intr, xdma->max_user_intr);
+		return -EINVAL;
+	}
+
+	if (event_fd >= 0) {
+		trigger = eventfd_ctx_fdget(event_fd);
+		if (IS_ERR(trigger)) {
+			xocl_err(&pdev->dev, "get event ctx failed");
+			return -EFAULT;
+		}
+	}
+
+	mutex_lock(&xdma->user_msix_table_lock);
+	if (xdma->user_msix_table[intr].in_use) {
+		xocl_err(&pdev->dev, "IRQ %d is in use", intr);
+		ret = -EPERM;
+		goto failed;
+	}
+	xdma->user_msix_table[intr].event_ctx = trigger;
+	xdma->user_msix_table[intr].handler = handler;
+	xdma->user_msix_table[intr].arg = arg;
+
+	ret = xdma_user_isr_register(xdma->dma_handle, mask, xdma_isr,
+			&xdma->user_msix_table[intr]);
+	if (ret) {
+		xocl_err(&pdev->dev, "IRQ register failed");
+		xdma->user_msix_table[intr].handler = NULL;
+		xdma->user_msix_table[intr].arg = NULL;
+		xdma->user_msix_table[intr].event_ctx = NULL;
+		goto failed;
+	}
+
+	xdma->user_msix_table[intr].in_use = true;
+
+	mutex_unlock(&xdma->user_msix_table_lock);
+
+
+	return 0;
+
+failed:
+	mutex_unlock(&xdma->user_msix_table_lock);
+	if (!IS_ERR(trigger))
+		eventfd_ctx_put(trigger);
+
+	return ret;
+}
+
+static struct xocl_dma_funcs xdma_ops = {
+	.migrate_bo = xdma_migrate_bo,
+	.ac_chan = acquire_channel,
+	.rel_chan = release_channel,
+	.get_chan_count = get_channel_count,
+	.get_chan_stat = get_channel_stat,
+	.user_intr_register = user_intr_register,
+	.user_intr_config = user_intr_config,
+	.user_intr_unreg = user_intr_unreg,
+	.get_drm_handle = get_drm_handle,
+};
+
+static ssize_t channel_stat_raw_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	u32 i;
+	ssize_t nbytes = 0;
+	struct platform_device *pdev = to_platform_device(dev);
+	u32 chs = get_channel_count(pdev);
+
+	for (i = 0; i < chs; i++) {
+		nbytes += sprintf(buf + nbytes, "%llu %llu\n",
+			get_channel_stat(pdev, i, 0),
+			get_channel_stat(pdev, i, 1));
+	}
+	return nbytes;
+}
+static DEVICE_ATTR_RO(channel_stat_raw);
+
+static struct attribute *xdma_attrs[] = {
+	&dev_attr_channel_stat_raw.attr,
+	NULL,
+};
+
+static struct attribute_group xdma_attr_group = {
+	.attrs = xdma_attrs,
+};
+
+static int set_max_chan(struct platform_device *pdev,
+		struct xocl_xdma *xdma)
+{
+	xdma->channel_usage[0] = devm_kzalloc(&pdev->dev, sizeof(u64) *
+		xdma->channel, GFP_KERNEL);
+	xdma->channel_usage[1] = devm_kzalloc(&pdev->dev, sizeof(u64) *
+		xdma->channel, GFP_KERNEL);
+	if (!xdma->channel_usage[0] || !xdma->channel_usage[1]) {
+		xocl_err(&pdev->dev, "failed to alloc channel usage");
+		return -ENOMEM;
+	}
+
+	sema_init(&xdma->channel_sem[0], xdma->channel);
+	sema_init(&xdma->channel_sem[1], xdma->channel);
+
+	/* Initialize bit mask to represent individual channels */
+	xdma->channel_bitmap[0] = BIT(xdma->channel);
+	xdma->channel_bitmap[0]--;
+	xdma->channel_bitmap[1] = xdma->channel_bitmap[0];
+
+	return 0;
+}
+
+static int xdma_probe(struct platform_device *pdev)
+{
+	struct xocl_xdma	*xdma = NULL;
+	int	ret = 0;
+	xdev_handle_t		xdev;
+
+	xdev = xocl_get_xdev(pdev);
+	BUG_ON(!xdev);
+
+	xdma = devm_kzalloc(&pdev->dev, sizeof(*xdma), GFP_KERNEL);
+	if (!xdma) {
+		ret = -ENOMEM;
+		goto failed;
+	}
+
+	xdma->dma_handle = xdma_device_open(XOCL_MODULE_NAME, XDEV(xdev)->pdev,
+			&xdma->max_user_intr,
+			&xdma->channel, &xdma->channel);
+	if (xdma->dma_handle == NULL) {
+		xocl_err(&pdev->dev, "XDMA Device Open failed");
+		ret = -EIO;
+		goto failed;
+	}
+
+	xdma->user_msix_table = devm_kzalloc(&pdev->dev,
+			xdma->max_user_intr *
+			sizeof(struct xdma_irq), GFP_KERNEL);
+	if (!xdma->user_msix_table) {
+		xocl_err(&pdev->dev, "alloc user_msix_table failed");
+		ret = -ENOMEM;
+		goto failed;
+	}
+
+	ret = set_max_chan(pdev, xdma);
+	if (ret) {
+		xocl_err(&pdev->dev, "Set max channel failed");
+		goto failed;
+	}
+
+	xdma->drm = xocl_drm_init(xdev);
+	if (!xdma->drm) {
+		ret = -EFAULT;
+		xocl_err(&pdev->dev, "failed to init drm mm");
+		goto failed;
+	}
+
+	ret = sysfs_create_group(&pdev->dev.kobj, &xdma_attr_group);
+	if (ret) {
+		xocl_err(&pdev->dev, "create attrs failed: %d", ret);
+		goto failed;
+	}
+
+	mutex_init(&xdma->stat_lock);
+	mutex_init(&xdma->user_msix_table_lock);
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_DMA, &xdma_ops);
+	platform_set_drvdata(pdev, xdma);
+
+	return 0;
+
+failed:
+	if (xdma) {
+		if (xdma->drm)
+			xocl_drm_fini(xdma->drm);
+		if (xdma->dma_handle)
+			xdma_device_close(XDEV(xdev)->pdev, xdma->dma_handle);
+		if (xdma->channel_usage[0])
+			devm_kfree(&pdev->dev, xdma->channel_usage[0]);
+		if (xdma->channel_usage[1])
+			devm_kfree(&pdev->dev, xdma->channel_usage[1]);
+		if (xdma->user_msix_table)
+			devm_kfree(&pdev->dev, xdma->user_msix_table);
+
+		devm_kfree(&pdev->dev, xdma);
+	}
+
+	platform_set_drvdata(pdev, NULL);
+
+	return ret;
+}
+
+static int xdma_remove(struct platform_device *pdev)
+{
+	struct xocl_xdma *xdma = platform_get_drvdata(pdev);
+	xdev_handle_t xdev;
+	struct xdma_irq *irq_entry;
+	int i;
+
+	if (!xdma) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+
+	xdev = xocl_get_xdev(pdev);
+	BUG_ON(!xdev);
+
+	sysfs_remove_group(&pdev->dev.kobj, &xdma_attr_group);
+
+	if (xdma->drm)
+		xocl_drm_fini(xdma->drm);
+	if (xdma->dma_handle)
+		xdma_device_close(XDEV(xdev)->pdev, xdma->dma_handle);
+
+	for (i = 0; i < xdma->max_user_intr; i++) {
+		irq_entry = &xdma->user_msix_table[i];
+		if (irq_entry->in_use) {
+			if (irq_entry->enabled) {
+				xocl_err(&pdev->dev,
+					"ERROR: Interrupt %d is still on", i);
+			}
+			if (!IS_ERR_OR_NULL(irq_entry->event_ctx))
+				eventfd_ctx_put(irq_entry->event_ctx);
+		}
+	}
+
+	if (xdma->channel_usage[0])
+		devm_kfree(&pdev->dev, xdma->channel_usage[0]);
+	if (xdma->channel_usage[1])
+		devm_kfree(&pdev->dev, xdma->channel_usage[1]);
+
+	mutex_destroy(&xdma->stat_lock);
+	mutex_destroy(&xdma->user_msix_table_lock);
+
+	devm_kfree(&pdev->dev, xdma->user_msix_table);
+	platform_set_drvdata(pdev, NULL);
+
+	devm_kfree(&pdev->dev, xdma);
+
+	return 0;
+}
+
+static struct platform_device_id xdma_id_table[] = {
+	{ XOCL_XDMA, 0 },
+	{ },
+};
+
+static struct platform_driver	xdma_driver = {
+	.probe		= xdma_probe,
+	.remove		= xdma_remove,
+	.driver		= {
+		.name = "xocl_xdma",
+	},
+	.id_table	= xdma_id_table,
+};
+
+int __init xocl_init_xdma(void)
+{
+	return platform_driver_register(&xdma_driver);
+}
+
+void xocl_fini_xdma(void)
+{
+	platform_driver_unregister(&xdma_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/xmc.c b/drivers/gpu/drm/xocl/subdev/xmc.c
new file mode 100644
index 000000000000..d9d620ac09b9
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/xmc.c
@@ -0,0 +1,1480 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: chienwei@xilinx.com
+ *
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/vmalloc.h>
+#include <linux/string.h>
+#include "../ert.h"
+#include "../xocl_drv.h"
+#include <drm/xmgmt_drm.h>
+
+#define MAX_XMC_RETRY		150	/* Retry for up to 15s for XMC */
+#define MAX_ERT_RETRY		10	/* Retry for up to 1s for ERT */
+#define RETRY_INTERVAL		100	/* 100ms between retries */
+
+#define	MAX_IMAGE_LEN	0x20000
+
+#define XMC_MAGIC_REG               0x0
+#define XMC_VERSION_REG             0x4
+#define XMC_STATUS_REG              0x8
+#define XMC_ERROR_REG               0xC
+#define XMC_FEATURE_REG             0x10
+#define XMC_SENSOR_REG              0x14
+#define XMC_CONTROL_REG             0x18
+#define XMC_STOP_CONFIRM_REG        0x1C
+#define XMC_12V_PEX_REG             0x20
+#define XMC_3V3_PEX_REG             0x2C
+#define XMC_3V3_AUX_REG             0x38
+#define XMC_12V_AUX_REG             0x44
+#define XMC_DDR4_VPP_BTM_REG        0x50
+#define XMC_SYS_5V5_REG             0x5C
+#define XMC_VCC1V2_TOP_REG          0x68
+#define XMC_VCC1V8_REG              0x74
+#define XMC_VCC0V85_REG             0x80
+#define XMC_DDR4_VPP_TOP_REG        0x8C
+#define XMC_MGT0V9AVCC_REG          0x98
+#define XMC_12V_SW_REG              0xA4
+#define XMC_MGTAVTT_REG             0xB0
+#define XMC_VCC1V2_BTM_REG          0xBC
+#define XMC_12V_PEX_I_IN_REG        0xC8
+#define XMC_12V_AUX_I_IN_REG        0xD4
+#define XMC_VCCINT_V_REG            0xE0
+#define XMC_VCCINT_I_REG            0xEC
+#define XMC_FPGA_TEMP               0xF8
+#define XMC_FAN_TEMP_REG            0x104
+#define XMC_DIMM_TEMP0_REG          0x110
+#define XMC_DIMM_TEMP1_REG          0x11C
+#define XMC_DIMM_TEMP2_REG          0x128
+#define XMC_DIMM_TEMP3_REG          0x134
+#define XMC_FAN_SPEED_REG           0x164
+#define XMC_SE98_TEMP0_REG          0x140
+#define XMC_SE98_TEMP1_REG          0x14C
+#define XMC_SE98_TEMP2_REG          0x158
+#define XMC_CAGE_TEMP0_REG          0x170
+#define XMC_CAGE_TEMP1_REG          0x17C
+#define XMC_CAGE_TEMP2_REG          0x188
+#define XMC_CAGE_TEMP3_REG          0x194
+#define XMC_SNSR_CHKSUM_REG         0x1A4
+#define XMC_SNSR_FLAGS_REG          0x1A8
+#define XMC_HOST_MSG_OFFSET_REG     0x300
+#define XMC_HOST_MSG_ERROR_REG      0x304
+#define XMC_HOST_MSG_HEADER_REG     0x308
+
+
+#define	VALID_ID		0x74736574
+
+#define	GPIO_RESET		0x0
+#define	GPIO_ENABLED		0x1
+
+#define	SELF_JUMP(ins)		(((ins) & 0xfc00ffff) == 0xb8000000)
+#define	XMC_PRIVILEGED(xmc)	((xmc)->base_addrs[0] != NULL)
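+
+/*
+ * Reading of the two checks above (our interpretation, not authoritative):
+ * SELF_JUMP() matches a MicroBlaze unconditional branch with a zero
+ * immediate ("bri 0", a branch-to-self) at the image base, taken to mean no
+ * firmware has been loaded yet. XMC_PRIVILEGED() is true only when the
+ * register BAR (IO_REG) is mapped, which is presumed to be the case on the
+ * management physical function only.
+ */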
+
+enum ctl_mask {
+	CTL_MASK_CLEAR_POW	= 0x1,
+	CTL_MASK_CLEAR_ERR	= 0x2,
+	CTL_MASK_PAUSE		= 0x4,
+	CTL_MASK_STOP		= 0x8,
+};
+
+enum status_mask {
+	STATUS_MASK_INIT_DONE		= 0x1,
+	STATUS_MASK_STOPPED		= 0x2,
+	STATUS_MASK_PAUSE		= 0x4,
+};
+
+enum cap_mask {
+	CAP_MASK_PM			= 0x1,
+};
+
+enum {
+	XMC_STATE_UNKNOWN,
+	XMC_STATE_ENABLED,
+	XMC_STATE_RESET,
+	XMC_STATE_STOPPED,
+	XMC_STATE_ERROR
+};
+
+enum {
+	IO_REG,
+	IO_GPIO,
+	IO_IMAGE_MGMT,
+	IO_IMAGE_SCHED,
+	IO_CQ,
+	NUM_IOADDR
+};
+
+enum {
+	VOLTAGE_MAX,
+	VOLTAGE_AVG,
+	VOLTAGE_INS,
+};
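+
+/*
+ * Layout note (inferred from the sysfs readers below): each sensor register
+ * above is the base of a three-word block indexed by VOLTAGE_MAX/AVG/INS,
+ * e.g. the instantaneous 12V PEX reading is at
+ * XMC_12V_PEX_REG + sizeof(u32) * VOLTAGE_INS.
+ */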
+
+#define	READ_REG32(xmc, off)		\
+	XOCL_READ_REG32(xmc->base_addrs[IO_REG] + off)
+#define	WRITE_REG32(xmc, val, off)	\
+	XOCL_WRITE_REG32(val, xmc->base_addrs[IO_REG] + off)
+
+#define	READ_GPIO(xmc, off)		\
+	XOCL_READ_REG32(xmc->base_addrs[IO_GPIO] + off)
+#define	WRITE_GPIO(xmc, val, off)	\
+	XOCL_WRITE_REG32(val, xmc->base_addrs[IO_GPIO] + off)
+
+#define	READ_IMAGE_MGMT(xmc, off)		\
+	XOCL_READ_REG32(xmc->base_addrs[IO_IMAGE_MGMT] + off)
+
+#define	READ_IMAGE_SCHED(xmc, off)		\
+	XOCL_READ_REG32(xmc->base_addrs[IO_IMAGE_SCHED] + off)
+
+#define	COPY_MGMT(xmc, buf, len)		\
+	xocl_memcpy_toio(xmc->base_addrs[IO_IMAGE_MGMT], buf, len)
+#define	COPY_SCHE(xmc, buf, len)		\
+	xocl_memcpy_toio(xmc->base_addrs[IO_IMAGE_SCHED], buf, len)
+
+struct xocl_xmc {
+	struct platform_device	*pdev;
+	void __iomem		*base_addrs[NUM_IOADDR];
+
+	struct device		*hwmon_dev;
+	bool			enabled;
+	u32			state;
+	u32			cap;
+	struct mutex		xmc_lock;
+
+	char			*sche_binary;
+	u32			sche_binary_length;
+	char			*mgmt_binary;
+	u32			mgmt_binary_length;
+};
+
+
+static int load_xmc(struct xocl_xmc *xmc);
+static int stop_xmc(struct platform_device *pdev);
+
+static void xmc_read_from_peer(struct platform_device *pdev, enum data_kind kind, void *resp, size_t resplen)
+{
+	struct mailbox_subdev_peer subdev_peer = {0};
+	size_t data_len = sizeof(struct mailbox_subdev_peer);
+	struct mailbox_req *mb_req = NULL;
+	size_t reqlen = sizeof(struct mailbox_req) + data_len;
+
+	mb_req = vmalloc(reqlen);
+	if (!mb_req)
+		return;
+
+	mb_req->req = MAILBOX_REQ_PEER_DATA;
+
+	subdev_peer.kind = kind;
+	memcpy(mb_req->data, &subdev_peer, data_len);
+
+	(void) xocl_peer_request(XOCL_PL_DEV_TO_XDEV(pdev),
+		mb_req, reqlen, resp, &resplen, NULL, NULL);
+	vfree(mb_req);
+}
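+
+/*
+ * Sketch of the exchange above (hedged; the exact wire format belongs to the
+ * mailbox subdev): the request is a struct mailbox_req header with
+ * req = MAILBOX_REQ_PEER_DATA followed by a mailbox_subdev_peer naming the
+ * data_kind; the peer function replies with the raw value, which callers
+ * such as safe_read_from_peer() treat as a u32.
+ */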
+
+/* sysfs support */
+static void safe_read32(struct xocl_xmc *xmc, u32 reg, u32 *val)
+{
+	mutex_lock(&xmc->xmc_lock);
+	if (xmc->enabled && xmc->state == XMC_STATE_ENABLED)
+		*val = READ_REG32(xmc, reg);
+	else
+		*val = 0;
+
+	mutex_unlock(&xmc->xmc_lock);
+}
+
+static void safe_write32(struct xocl_xmc *xmc, u32 reg, u32 val)
+{
+	mutex_lock(&xmc->xmc_lock);
+	if (xmc->enabled && xmc->state == XMC_STATE_ENABLED)
+		WRITE_REG32(xmc, val, reg);
+
+	mutex_unlock(&xmc->xmc_lock);
+}
+
+static void safe_read_from_peer(struct xocl_xmc *xmc, struct platform_device *pdev, enum data_kind kind, u32 *val)
+{
+	mutex_lock(&xmc->xmc_lock);
+	if (xmc->enabled)
+		xmc_read_from_peer(pdev, kind, val, sizeof(u32));
+	else
+		*val = 0;
+
+	mutex_unlock(&xmc->xmc_lock);
+}
+
+static int xmc_get_data(struct platform_device *pdev, enum data_kind kind)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(pdev);
+	u32 val = 0;
+
+	if (XMC_PRIVILEGED(xmc)) {
+		switch (kind) {
+		case VOL_12V_PEX:
+			safe_read32(xmc, XMC_12V_PEX_REG + sizeof(u32)*VOLTAGE_INS, &val);
+			break;
+		default:
+			break;
+		}
+	}
+	return val;
+}
+
+static ssize_t xmc_12v_pex_vol_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 pes_val;
+
+	if (XMC_PRIVILEGED(xmc))
+		safe_read32(xmc, XMC_12V_PEX_REG+sizeof(u32)*VOLTAGE_INS, &pes_val);
+	else
+		safe_read_from_peer(xmc, to_platform_device(dev), VOL_12V_PEX, &pes_val);
+
+	return sprintf(buf, "%d\n", pes_val);
+}
+static DEVICE_ATTR_RO(xmc_12v_pex_vol);
+
+static ssize_t xmc_12v_aux_vol_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_12V_AUX_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_12v_aux_vol);
+
+static ssize_t xmc_12v_pex_curr_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 pes_val;
+
+	safe_read32(xmc, XMC_12V_PEX_I_IN_REG+sizeof(u32)*VOLTAGE_INS, &pes_val);
+
+	return sprintf(buf, "%d\n", pes_val);
+}
+static DEVICE_ATTR_RO(xmc_12v_pex_curr);
+
+static ssize_t xmc_12v_aux_curr_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_12V_AUX_I_IN_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_12v_aux_curr);
+
+static ssize_t xmc_3v3_pex_vol_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_3V3_PEX_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_3v3_pex_vol);
+
+static ssize_t xmc_3v3_aux_vol_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_3V3_AUX_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_3v3_aux_vol);
+
+static ssize_t xmc_ddr_vpp_btm_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DDR4_VPP_BTM_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_ddr_vpp_btm);
+
+static ssize_t xmc_sys_5v5_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_SYS_5V5_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_sys_5v5);
+
+static ssize_t xmc_1v2_top_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCC1V2_TOP_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_1v2_top);
+
+static ssize_t xmc_1v8_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCC1V8_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_1v8);
+
+static ssize_t xmc_0v85_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCC0V85_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_0v85);
+
+static ssize_t xmc_ddr_vpp_top_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DDR4_VPP_TOP_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_ddr_vpp_top);
+
+
+static ssize_t xmc_mgt0v9avcc_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_MGT0V9AVCC_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_mgt0v9avcc);
+
+static ssize_t xmc_12v_sw_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_12V_SW_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_12v_sw);
+
+
+static ssize_t xmc_mgtavtt_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_MGTAVTT_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_mgtavtt);
+
+static ssize_t xmc_vcc1v2_btm_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCC1V2_BTM_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_vcc1v2_btm);
+
+static ssize_t xmc_vccint_vol_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCCINT_V_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_vccint_vol);
+
+static ssize_t xmc_vccint_curr_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_VCCINT_I_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_vccint_curr);
+
+static ssize_t xmc_se98_temp0_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_SE98_TEMP0_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_se98_temp0);
+
+static ssize_t xmc_se98_temp1_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_SE98_TEMP1_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_se98_temp1);
+
+static ssize_t xmc_se98_temp2_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_SE98_TEMP2_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_se98_temp2);
+
+static ssize_t xmc_fpga_temp_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_FPGA_TEMP, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_fpga_temp);
+
+static ssize_t xmc_fan_temp_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_FAN_TEMP_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_fan_temp);
+
+static ssize_t xmc_fan_rpm_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_FAN_SPEED_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_fan_rpm);
+
+
+static ssize_t xmc_dimm_temp0_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DIMM_TEMP0_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_dimm_temp0);
+
+static ssize_t xmc_dimm_temp1_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DIMM_TEMP1_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_dimm_temp1);
+
+static ssize_t xmc_dimm_temp2_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DIMM_TEMP2_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_dimm_temp2);
+
+static ssize_t xmc_dimm_temp3_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_DIMM_TEMP3_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_dimm_temp3);
+
+
+static ssize_t xmc_cage_temp0_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_CAGE_TEMP0_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_cage_temp0);
+
+static ssize_t xmc_cage_temp1_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_CAGE_TEMP1_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_cage_temp1);
+
+static ssize_t xmc_cage_temp2_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_CAGE_TEMP2_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_cage_temp2);
+
+static ssize_t xmc_cage_temp3_show(struct device *dev, struct device_attribute *attr,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_CAGE_TEMP3_REG+sizeof(u32)*VOLTAGE_INS, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(xmc_cage_temp3);
+
+
+static ssize_t version_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_VERSION_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(version);
+
+static ssize_t sensor_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_SENSOR_REG, &val);
+
+	return sprintf(buf, "0x%04x\n", val);
+}
+static DEVICE_ATTR_RO(sensor);
+
+
+static ssize_t id_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_MAGIC_REG, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(id);
+
+static ssize_t status_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_STATUS_REG, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(status);
+
+static ssize_t error_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_ERROR_REG, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(error);
+
+static ssize_t capability_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_FEATURE_REG, &val);
+
+	return sprintf(buf, "%x\n", val);
+}
+static DEVICE_ATTR_RO(capability);
+
+static ssize_t power_checksum_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_SNSR_CHKSUM_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(power_checksum);
+
+static ssize_t pause_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	safe_read32(xmc, XMC_CONTROL_REG, &val);
+
+	return sprintf(buf, "%d\n", !!(val & CTL_MASK_PAUSE));
+}
+
+static ssize_t pause_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	if (kstrtou32(buf, 10, &val) || val > 1)
+		return -EINVAL;
+
+	val = val ? CTL_MASK_PAUSE : 0;
+	safe_write32(xmc, XMC_CONTROL_REG, val);
+
+	return count;
+}
+static DEVICE_ATTR_RW(pause);
+
+static ssize_t reset_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_xmc *xmc = platform_get_drvdata(to_platform_device(dev));
+	u32 val;
+
+	if (kstrtou32(buf, 10, &val) || val > 1)
+		return -EINVAL;
+
+	if (val)
+		load_xmc(xmc);
+
+	return count;
+}
+static DEVICE_ATTR_WO(reset);
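+
+/*
+ * Illustrative sysfs usage for the two control nodes above (a sketch only;
+ * the exact sysfs path depends on how the platform device is enumerated):
+ *
+ *	echo 1 > /sys/bus/platform/devices/<xmc-dev>/pause   (set CTL_MASK_PAUSE)
+ *	echo 0 > /sys/bus/platform/devices/<xmc-dev>/pause   (clear it)
+ *	echo 1 > /sys/bus/platform/devices/<xmc-dev>/reset   (reload XMC via load_xmc())
+ */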
+
+static ssize_t power_flag_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_SNSR_FLAGS_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(power_flag);
+
+static ssize_t host_msg_offset_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_HOST_MSG_OFFSET_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(host_msg_offset);
+
+static ssize_t host_msg_error_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_HOST_MSG_ERROR_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(host_msg_error);
+
+static ssize_t host_msg_header_show(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_HOST_MSG_HEADER_REG, &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+static DEVICE_ATTR_RO(host_msg_header);
+
+
+
+static int get_temp_by_m_tag(struct xocl_xmc *xmc, char *m_tag)
+{
+
+	/*
+	 * The m_tag obtained from the xclbin must follow the format
+	 * "DDR[0]" or "bank1"; the index embedded in m_tag selects which
+	 * temperature to read from the XMC IP base address.
+	 */
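+	/*
+	 * Illustrative examples: "DDR[2]" parses to idx == 2 and reads the
+	 * instantaneous value two sensor strides past XMC_DIMM_TEMP0_REG;
+	 * "bank1" parses to idx == 1.
+	 */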
+	char *start = NULL, *left_parentness = NULL, *right_parentness = NULL;
+	long idx;
+	int ret = 0, digit_len = 0;
+	char temp[4];
+
+	if (!xmc)
+		return -ENODEV;
+
+
+	if (!strncmp(m_tag, "bank", 4)) {
+		start = m_tag;
+		/* "bankN" has no '['; digits start at left_parentness + 1 */
+		left_parentness = m_tag+3;
+		right_parentness = m_tag+strlen(m_tag)+1;
+		digit_len = right_parentness-(2+left_parentness);
+	} else if (!strncmp(m_tag, "DDR", 3)) {
+
+		start = m_tag;
+		left_parentness = strstr(m_tag, "[");
+		right_parentness = strstr(m_tag, "]");
+		digit_len = right_parentness-(1+left_parentness);
+	}
+
+	if (!left_parentness || !right_parentness)
+		return ret;
+
+	if (!strncmp(m_tag, "DDR", left_parentness-start) || !strncmp(m_tag, "bank", left_parentness-start)) {
+
+		strncpy(temp, left_parentness+1, digit_len);
+		//assumption, temperature won't higher than 3 digits, or the temp[digit_len] should be a null character
+		temp[digit_len] = '\0';
+		//convert to signed long, decimal base
+		if (kstrtol(temp, 10, &idx) == 0 && idx < 4 && idx >= 0)
+			safe_read32(xmc, XMC_DIMM_TEMP0_REG + (3*sizeof(int32_t)) * idx +
+				    sizeof(u32)*VOLTAGE_INS, &ret);
+		else
+			ret = 0;
+	}
+
+	return ret;
+
+}
+
+static struct attribute *xmc_attrs[] = {
+	&dev_attr_version.attr,
+	&dev_attr_id.attr,
+	&dev_attr_status.attr,
+	&dev_attr_sensor.attr,
+	&dev_attr_error.attr,
+	&dev_attr_capability.attr,
+	&dev_attr_power_checksum.attr,
+	&dev_attr_xmc_12v_pex_vol.attr,
+	&dev_attr_xmc_12v_aux_vol.attr,
+	&dev_attr_xmc_12v_pex_curr.attr,
+	&dev_attr_xmc_12v_aux_curr.attr,
+	&dev_attr_xmc_3v3_pex_vol.attr,
+	&dev_attr_xmc_3v3_aux_vol.attr,
+	&dev_attr_xmc_ddr_vpp_btm.attr,
+	&dev_attr_xmc_sys_5v5.attr,
+	&dev_attr_xmc_1v2_top.attr,
+	&dev_attr_xmc_1v8.attr,
+	&dev_attr_xmc_0v85.attr,
+	&dev_attr_xmc_ddr_vpp_top.attr,
+	&dev_attr_xmc_mgt0v9avcc.attr,
+	&dev_attr_xmc_12v_sw.attr,
+	&dev_attr_xmc_mgtavtt.attr,
+	&dev_attr_xmc_vcc1v2_btm.attr,
+	&dev_attr_xmc_fpga_temp.attr,
+	&dev_attr_xmc_fan_temp.attr,
+	&dev_attr_xmc_fan_rpm.attr,
+	&dev_attr_xmc_dimm_temp0.attr,
+	&dev_attr_xmc_dimm_temp1.attr,
+	&dev_attr_xmc_dimm_temp2.attr,
+	&dev_attr_xmc_dimm_temp3.attr,
+	&dev_attr_xmc_vccint_vol.attr,
+	&dev_attr_xmc_vccint_curr.attr,
+	&dev_attr_xmc_se98_temp0.attr,
+	&dev_attr_xmc_se98_temp1.attr,
+	&dev_attr_xmc_se98_temp2.attr,
+	&dev_attr_xmc_cage_temp0.attr,
+	&dev_attr_xmc_cage_temp1.attr,
+	&dev_attr_xmc_cage_temp2.attr,
+	&dev_attr_xmc_cage_temp3.attr,
+	&dev_attr_pause.attr,
+	&dev_attr_reset.attr,
+	&dev_attr_power_flag.attr,
+	&dev_attr_host_msg_offset.attr,
+	&dev_attr_host_msg_error.attr,
+	&dev_attr_host_msg_header.attr,
+	NULL,
+};
+
+
+static ssize_t read_temp_by_mem_topology(struct file *filp, struct kobject *kobj,
+	struct bin_attribute *attr, char *buffer, loff_t offset, size_t count)
+{
+	u32 nread = 0;
+	size_t size = 0;
+	u32 i;
+	struct mem_topology *memtopo = NULL;
+	struct xocl_xmc *xmc;
+	uint32_t temp[MAX_M_COUNT] = {0};
+	struct xclmgmt_dev *lro;
+
+	/* xocl_icap_lock_bitstream */
+	lro = (struct xclmgmt_dev *)dev_get_drvdata(container_of(kobj, struct device, kobj)->parent);
+	xmc = (struct xocl_xmc *)dev_get_drvdata(container_of(kobj, struct device, kobj));
+
+	memtopo = (struct mem_topology *)xocl_icap_get_data(lro, MEMTOPO_AXLF);
+
+	if (!memtopo)
+		return 0;
+
+	size = sizeof(u32) * min_t(u32, memtopo->m_count, MAX_M_COUNT);
+
+	if (offset >= size)
+		return 0;
+
+	for (i = 0; i < min_t(u32, memtopo->m_count, MAX_M_COUNT); ++i)
+		*(temp+i) = get_temp_by_m_tag(xmc, memtopo->m_mem_data[i].m_tag);
+
+	if (count < size - offset)
+		nread = count;
+	else
+		nread = size - offset;
+
+	memcpy(buffer, ((char *)temp) + offset, nread);
+	/* xocl_icap_unlock_bitstream */
+	return nread;
+}
+
+static struct bin_attribute bin_dimm_temp_by_mem_topology_attr = {
+	.attr = {
+		.name = "temp_by_mem_topology",
+		.mode = 0444
+	},
+	.read = read_temp_by_mem_topology,
+	.write = NULL,
+	.size = 0
+};
+
+static struct bin_attribute *xmc_bin_attrs[] = {
+	&bin_dimm_temp_by_mem_topology_attr,
+	NULL,
+};
+
+static struct attribute_group xmc_attr_group = {
+	.attrs = xmc_attrs,
+	.bin_attrs = xmc_bin_attrs,
+};
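+
+/*
+ * hwmon current attributes: the sensor attribute index below selects one of
+ * the consecutive 32-bit registers starting at XMC_12V_PEX_REG, three values
+ * (highest/average/input) per rail.
+ */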
+static ssize_t show_mb_pw(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct xocl_xmc *xmc = dev_get_drvdata(dev);
+	u32 val;
+
+	safe_read32(xmc, XMC_12V_PEX_REG + attr->index * sizeof(u32), &val);
+
+	return sprintf(buf, "%d\n", val);
+}
+
+static SENSOR_DEVICE_ATTR(curr1_highest, 0444, show_mb_pw, NULL, 0);
+static SENSOR_DEVICE_ATTR(curr1_average, 0444, show_mb_pw, NULL, 1);
+static SENSOR_DEVICE_ATTR(curr1_input, 0444, show_mb_pw, NULL, 2);
+static SENSOR_DEVICE_ATTR(curr2_highest, 0444, show_mb_pw, NULL, 3);
+static SENSOR_DEVICE_ATTR(curr2_average, 0444, show_mb_pw, NULL, 4);
+static SENSOR_DEVICE_ATTR(curr2_input, 0444, show_mb_pw, NULL, 5);
+static SENSOR_DEVICE_ATTR(curr3_highest, 0444, show_mb_pw, NULL, 6);
+static SENSOR_DEVICE_ATTR(curr3_average, 0444, show_mb_pw, NULL, 7);
+static SENSOR_DEVICE_ATTR(curr3_input, 0444, show_mb_pw, NULL, 8);
+static SENSOR_DEVICE_ATTR(curr4_highest, 0444, show_mb_pw, NULL, 9);
+static SENSOR_DEVICE_ATTR(curr4_average, 0444, show_mb_pw, NULL, 10);
+static SENSOR_DEVICE_ATTR(curr4_input, 0444, show_mb_pw, NULL, 11);
+static SENSOR_DEVICE_ATTR(curr5_highest, 0444, show_mb_pw, NULL, 12);
+static SENSOR_DEVICE_ATTR(curr5_average, 0444, show_mb_pw, NULL, 13);
+static SENSOR_DEVICE_ATTR(curr5_input, 0444, show_mb_pw, NULL, 14);
+static SENSOR_DEVICE_ATTR(curr6_highest, 0444, show_mb_pw, NULL, 15);
+static SENSOR_DEVICE_ATTR(curr6_average, 0444, show_mb_pw, NULL, 16);
+static SENSOR_DEVICE_ATTR(curr6_input, 0444, show_mb_pw, NULL, 17);
+
+static struct attribute *hwmon_xmc_attributes[] = {
+	&sensor_dev_attr_curr1_highest.dev_attr.attr,
+	&sensor_dev_attr_curr1_average.dev_attr.attr,
+	&sensor_dev_attr_curr1_input.dev_attr.attr,
+	&sensor_dev_attr_curr2_highest.dev_attr.attr,
+	&sensor_dev_attr_curr2_average.dev_attr.attr,
+	&sensor_dev_attr_curr2_input.dev_attr.attr,
+	&sensor_dev_attr_curr3_highest.dev_attr.attr,
+	&sensor_dev_attr_curr3_average.dev_attr.attr,
+	&sensor_dev_attr_curr3_input.dev_attr.attr,
+	&sensor_dev_attr_curr4_highest.dev_attr.attr,
+	&sensor_dev_attr_curr4_average.dev_attr.attr,
+	&sensor_dev_attr_curr4_input.dev_attr.attr,
+	&sensor_dev_attr_curr5_highest.dev_attr.attr,
+	&sensor_dev_attr_curr5_average.dev_attr.attr,
+	&sensor_dev_attr_curr5_input.dev_attr.attr,
+	&sensor_dev_attr_curr6_highest.dev_attr.attr,
+	&sensor_dev_attr_curr6_average.dev_attr.attr,
+	&sensor_dev_attr_curr6_input.dev_attr.attr,
+	NULL
+};
+
+static const struct attribute_group hwmon_xmc_attrgroup = {
+	.attrs = hwmon_xmc_attributes,
+};
+
+static ssize_t show_name(struct device *dev, struct device_attribute *da,
+	char *buf)
+{
+	return sprintf(buf, "%s\n", XCLMGMT_MB_HWMON_NAME);
+}
+
+static struct sensor_device_attribute name_attr =
+	SENSOR_ATTR(name, 0444, show_name, NULL, 0);
+
+static void mgmt_sysfs_destroy_xmc(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+
+	xmc = platform_get_drvdata(pdev);
+
+	if (!xmc->enabled)
+		return;
+
+	if (xmc->hwmon_dev) {
+		device_remove_file(xmc->hwmon_dev, &name_attr.dev_attr);
+		sysfs_remove_group(&xmc->hwmon_dev->kobj,
+			&hwmon_xmc_attrgroup);
+		hwmon_device_unregister(xmc->hwmon_dev);
+		xmc->hwmon_dev = NULL;
+	}
+
+	sysfs_remove_group(&pdev->dev.kobj, &xmc_attr_group);
+}
+
+static int mgmt_sysfs_create_xmc(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+	struct xocl_dev_core *core;
+	int err;
+
+	xmc = platform_get_drvdata(pdev);
+	core = XDEV(xocl_get_xdev(pdev));
+
+	if (!xmc->enabled)
+		return 0;
+
+	err = sysfs_create_group(&pdev->dev.kobj, &xmc_attr_group);
+	if (err) {
+		xocl_err(&pdev->dev, "create xmc attrs failed: 0x%x", err);
+		goto create_attr_failed;
+	}
+	xmc->hwmon_dev = hwmon_device_register(&core->pdev->dev);
+	if (IS_ERR(xmc->hwmon_dev)) {
+		err = PTR_ERR(xmc->hwmon_dev);
+		xocl_err(&pdev->dev, "register xmc hwmon failed: 0x%x", err);
+		goto hwmon_reg_failed;
+	}
+
+	dev_set_drvdata(xmc->hwmon_dev, xmc);
+
+	err = device_create_file(xmc->hwmon_dev, &name_attr.dev_attr);
+	if (err) {
+		xocl_err(&pdev->dev, "create attr name failed: 0x%x", err);
+		goto create_name_failed;
+	}
+
+	err = sysfs_create_group(&xmc->hwmon_dev->kobj,
+		&hwmon_xmc_attrgroup);
+	if (err) {
+		xocl_err(&pdev->dev, "create pw group failed: 0x%x", err);
+		goto create_pw_failed;
+	}
+
+	return 0;
+
+create_pw_failed:
+	device_remove_file(xmc->hwmon_dev, &name_attr.dev_attr);
+create_name_failed:
+	hwmon_device_unregister(xmc->hwmon_dev);
+	xmc->hwmon_dev = NULL;
+hwmon_reg_failed:
+	sysfs_remove_group(&pdev->dev.kobj, &xmc_attr_group);
+create_attr_failed:
+	return err;
+}
+
+static int stop_xmc_nolock(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+	int retry = 0;
+	u32 reg_val = 0;
+	void *xdev_hdl;
+
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return -ENODEV;
+	else if (!xmc->enabled)
+		return -ENODEV;
+
+	xdev_hdl = xocl_get_xdev(xmc->pdev);
+
+	reg_val = READ_GPIO(xmc, 0);
+	xocl_info(&xmc->pdev->dev, "MB Reset GPIO 0x%x", reg_val);
+
+	/* Stop XMC and ERT if they are currently running */
+	if (reg_val == GPIO_ENABLED) {
+		xocl_info(&xmc->pdev->dev,
+			"XMC info, version 0x%x, status 0x%x, id 0x%x",
+			READ_REG32(xmc, XMC_VERSION_REG),
+			READ_REG32(xmc, XMC_STATUS_REG),
+			READ_REG32(xmc, XMC_MAGIC_REG));
+
+		reg_val = READ_REG32(xmc, XMC_STATUS_REG);
+		if (!(reg_val & STATUS_MASK_STOPPED)) {
+			xocl_info(&xmc->pdev->dev, "Stopping XMC...");
+			WRITE_REG32(xmc, CTL_MASK_STOP, XMC_CONTROL_REG);
+			WRITE_REG32(xmc, 1, XMC_STOP_CONFIRM_REG);
+		}
+		/* Need to check if ERT is loaded before we attempt to stop it */
+		if (!SELF_JUMP(READ_IMAGE_SCHED(xmc, 0))) {
+			reg_val = XOCL_READ_REG32(xmc->base_addrs[IO_CQ]);
+			if (!(reg_val & ERT_STOP_ACK)) {
+				xocl_info(&xmc->pdev->dev, "Stopping scheduler...");
+				XOCL_WRITE_REG32(ERT_STOP_CMD, xmc->base_addrs[IO_CQ]);
+			}
+		}
+
+		retry = 0;
+		while (retry++ < MAX_XMC_RETRY &&
+			!(READ_REG32(xmc, XMC_STATUS_REG) & STATUS_MASK_STOPPED))
+			msleep(RETRY_INTERVAL);
+
+		/* Wait for XMC to stop and then check that ERT has also finished */
+		if (retry >= MAX_XMC_RETRY) {
+			xocl_err(&xmc->pdev->dev,
+				"Failed to stop XMC");
+			xocl_err(&xmc->pdev->dev,
+				"XMC Error Reg 0x%x",
+				READ_REG32(xmc, XMC_ERROR_REG));
+			xmc->state = XMC_STATE_ERROR;
+			return -ETIMEDOUT;
+		} else if (!SELF_JUMP(READ_IMAGE_SCHED(xmc, 0)) &&
+			 !(XOCL_READ_REG32(xmc->base_addrs[IO_CQ]) & ERT_STOP_ACK)) {
+			while (retry++ < MAX_ERT_RETRY &&
+				!(XOCL_READ_REG32(xmc->base_addrs[IO_CQ]) & ERT_STOP_ACK))
+				msleep(RETRY_INTERVAL);
+			if (retry >= MAX_ERT_RETRY) {
+				xocl_err(&xmc->pdev->dev,
+					"Failed to stop sched");
+				xocl_err(&xmc->pdev->dev,
+					"Scheduler CQ status 0x%x",
+					XOCL_READ_REG32(xmc->base_addrs[IO_CQ]));
+				/*
+				 * Don't exit if ERT fails to stop since it can
+				 * hang due to a bad kernel.
+				 * xmc->state = XMC_STATE_ERROR;
+				 * return -ETIMEDOUT;
+				 */
+			}
+		}
+
+		xocl_info(&xmc->pdev->dev, "XMC/sched Stopped, retry %d",
+			retry);
+	}
+
+	/* Hold XMC in reset now that it is safely stopped */
+	xocl_info(&xmc->pdev->dev,
+		"XMC info, version 0x%x, status 0x%x, id 0x%x",
+		READ_REG32(xmc, XMC_VERSION_REG),
+		READ_REG32(xmc, XMC_STATUS_REG),
+		READ_REG32(xmc, XMC_MAGIC_REG));
+	WRITE_GPIO(xmc, GPIO_RESET, 0);
+	xmc->state = XMC_STATE_RESET;
+	reg_val = READ_GPIO(xmc, 0);
+	xocl_info(&xmc->pdev->dev, "MB Reset GPIO 0x%x", reg_val);
+	if (reg_val != GPIO_RESET) {
+		/* Should not get here, but exit if we do */
+		xmc->state = XMC_STATE_ERROR;
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int stop_xmc(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+	int ret = 0;
+	void *xdev_hdl;
+
+	xocl_info(&pdev->dev, "Stop Microblaze...");
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return -ENODEV;
+	else if (!xmc->enabled)
+		return -ENODEV;
+
+	xdev_hdl = xocl_get_xdev(xmc->pdev);
+
+	mutex_lock(&xmc->xmc_lock);
+	ret = stop_xmc_nolock(pdev);
+	mutex_unlock(&xmc->xmc_lock);
+
+	return ret;
+}
+
+static int load_xmc(struct xocl_xmc *xmc)
+{
+	int retry = 0;
+	u32 reg_val = 0;
+	int ret = 0;
+	void *xdev_hdl;
+
+	if (!xmc->enabled)
+		return -ENODEV;
+
+	mutex_lock(&xmc->xmc_lock);
+
+	/* Stop XMC first */
+	ret = stop_xmc_nolock(xmc->pdev);
+	if (ret != 0)
+		goto out;
+
+	xdev_hdl = xocl_get_xdev(xmc->pdev);
+
+	/* Load XMC and ERT Image */
+	if (xocl_mb_mgmt_on(xdev_hdl)) {
+		xocl_info(&xmc->pdev->dev, "Copying XMC image len %d",
+			xmc->mgmt_binary_length);
+		COPY_MGMT(xmc, xmc->mgmt_binary, xmc->mgmt_binary_length);
+	}
+
+	if (xocl_mb_sched_on(xdev_hdl)) {
+		xocl_info(&xmc->pdev->dev, "Copying scheduler image len %d",
+			xmc->sche_binary_length);
+		COPY_SCHE(xmc, xmc->sche_binary, xmc->sche_binary_length);
+	}
+
+	/* Take XMC and ERT out of reset */
+	WRITE_GPIO(xmc, GPIO_ENABLED, 0);
+	reg_val = READ_GPIO(xmc, 0);
+	xocl_info(&xmc->pdev->dev, "MB Reset GPIO 0x%x", reg_val);
+	if (reg_val != GPIO_ENABLED) {
+		/* Should not get here, but exit if we do */
+		xmc->state = XMC_STATE_ERROR;
+		goto out;
+	}
+
+	/* Wait for XMC to start
+	 * Note that ERT will start long before XMC so we don't check anything
+	 */
+	reg_val = READ_REG32(xmc, XMC_STATUS_REG);
+	if (!(reg_val & STATUS_MASK_INIT_DONE)) {
+		xocl_info(&xmc->pdev->dev, "Waiting for XMC to finish init...");
+		retry = 0;
+		while (retry++ < MAX_XMC_RETRY &&
+			!(READ_REG32(xmc, XMC_STATUS_REG) & STATUS_MASK_INIT_DONE))
+			msleep(RETRY_INTERVAL);
+		if (retry >= MAX_XMC_RETRY) {
+			xocl_err(&xmc->pdev->dev,
+				"XMC did not finish init sequence!");
+			xocl_err(&xmc->pdev->dev,
+				"Error Reg 0x%x",
+				READ_REG32(xmc, XMC_ERROR_REG));
+			xocl_err(&xmc->pdev->dev,
+				"Status Reg 0x%x",
+				READ_REG32(xmc, XMC_STATUS_REG));
+			ret = -ETIMEDOUT;
+			xmc->state = XMC_STATE_ERROR;
+			goto out;
+		}
+	}
+	xocl_info(&xmc->pdev->dev, "XMC and scheduler Enabled, retry %d",
+			retry);
+	xocl_info(&xmc->pdev->dev,
+		"XMC info, version 0x%x, status 0x%x, id 0x%x",
+		READ_REG32(xmc, XMC_VERSION_REG),
+		READ_REG32(xmc, XMC_STATUS_REG),
+		READ_REG32(xmc, XMC_MAGIC_REG));
+	xmc->state = XMC_STATE_ENABLED;
+
+	xmc->cap = READ_REG32(xmc, XMC_FEATURE_REG);
+out:
+	mutex_unlock(&xmc->xmc_lock);
+
+	return ret;
+}
+
+static void xmc_reset(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+
+	xocl_info(&pdev->dev, "Reset Microblaze...");
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return;
+
+	load_xmc(xmc);
+}
+
+static int load_mgmt_image(struct platform_device *pdev, const char *image,
+	u32 len)
+{
+	struct xocl_xmc *xmc;
+	char *binary;
+
+	if (len > MAX_IMAGE_LEN)
+		return -EINVAL;
+
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return -EINVAL;
+
+	binary = xmc->mgmt_binary;
+	xmc->mgmt_binary = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);
+	if (!xmc->mgmt_binary)
+		return -ENOMEM;
+
+	if (binary)
+		devm_kfree(&pdev->dev, binary);
+	memcpy(xmc->mgmt_binary, image, len);
+	xmc->mgmt_binary_length = len;
+
+	return 0;
+}
+
+static int load_sche_image(struct platform_device *pdev, const char *image,
+	u32 len)
+{
+	struct xocl_xmc *xmc;
+	char *binary = NULL;
+
+	if (len > MAX_IMAGE_LEN)
+		return -EINVAL;
+
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return -EINVAL;
+
+	binary = xmc->sche_binary;
+	xmc->sche_binary = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);
+	if (!xmc->sche_binary)
+		return -ENOMEM;
+
+	if (binary)
+		devm_kfree(&pdev->dev, binary);
+	memcpy(xmc->sche_binary, image, len);
+	xmc->sche_binary_length = len;
+
+	return 0;
+}
+
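+/* Callbacks exported to the xocl core through xocl_subdev_register() below */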
+static struct xocl_mb_funcs xmc_ops = {
+	.load_mgmt_image	= load_mgmt_image,
+	.load_sche_image	= load_sche_image,
+	.reset			= xmc_reset,
+	.stop			= stop_xmc,
+	.get_data		= xmc_get_data,
+};
+
+static int xmc_remove(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+	int	i;
+
+	xmc = platform_get_drvdata(pdev);
+	if (!xmc)
+		return 0;
+
+	if (xmc->mgmt_binary)
+		devm_kfree(&pdev->dev, xmc->mgmt_binary);
+	if (xmc->sche_binary)
+		devm_kfree(&pdev->dev, xmc->sche_binary);
+
+	mgmt_sysfs_destroy_xmc(pdev);
+
+	for (i = 0; i < NUM_IOADDR; i++) {
+		if (xmc->base_addrs[i])
+			iounmap(xmc->base_addrs[i]);
+	}
+
+	mutex_destroy(&xmc->xmc_lock);
+
+	platform_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, xmc);
+
+	return 0;
+}
+
+static int xmc_probe(struct platform_device *pdev)
+{
+	struct xocl_xmc *xmc;
+	struct resource *res;
+	void	*xdev_hdl;
+	int i, err;
+
+	xmc = devm_kzalloc(&pdev->dev, sizeof(*xmc), GFP_KERNEL);
+	if (!xmc) {
+		xocl_err(&pdev->dev, "out of memory");
+		return -ENOMEM;
+	}
+
+	xmc->pdev = pdev;
+	platform_set_drvdata(pdev, xmc);
+
+	xdev_hdl = xocl_get_xdev(pdev);
+	if (xocl_mb_mgmt_on(xdev_hdl) || xocl_mb_sched_on(xdev_hdl)) {
+		xocl_info(&pdev->dev, "Microblaze is supported.");
+		xmc->enabled = true;
+	} else {
+		xocl_err(&pdev->dev, "Microblaze is not supported.");
+		devm_kfree(&pdev->dev, xmc);
+		platform_set_drvdata(pdev, NULL);
+		return 0;
+	}
+
+	for (i = 0; i < NUM_IOADDR; i++) {
+		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+		if (res) {
+			xocl_info(&pdev->dev, "IO start: 0x%llx, end: 0x%llx",
+				res->start, res->end);
+			xmc->base_addrs[i] =
+				ioremap_nocache(res->start, res->end - res->start + 1);
+			if (!xmc->base_addrs[i]) {
+				err = -EIO;
+				xocl_err(&pdev->dev, "Map iomem failed");
+				goto failed;
+			}
+		} else
+			break;
+	}
+
+	err = mgmt_sysfs_create_xmc(pdev);
+	if (err) {
+		xocl_err(&pdev->dev, "Create sysfs failed, err %d", err);
+		goto failed;
+	}
+
+	xocl_subdev_register(pdev, XOCL_SUBDEV_XMC, &xmc_ops);
+
+	mutex_init(&xmc->xmc_lock);
+
+	return 0;
+
+failed:
+	xmc_remove(pdev);
+	return err;
+}
+
+struct platform_device_id xmc_id_table[] = {
+	{ XOCL_XMC, 0 },
+	{ },
+};
+
+static struct platform_driver	xmc_driver = {
+	.probe		= xmc_probe,
+	.remove		= xmc_remove,
+	.driver		= {
+		.name = XOCL_XMC,
+	},
+	.id_table = xmc_id_table,
+};
+
+int __init xocl_init_xmc(void)
+{
+	return platform_driver_register(&xmc_driver);
+}
+
+void xocl_fini_xmc(void)
+{
+	platform_driver_unregister(&xmc_driver);
+}
diff --git a/drivers/gpu/drm/xocl/subdev/xvc.c b/drivers/gpu/drm/xocl/subdev/xvc.c
new file mode 100644
index 000000000000..355dbad30b00
--- /dev/null
+++ b/drivers/gpu/drm/xocl/subdev/xvc.c
@@ -0,0 +1,461 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ */
+
+#define pr_fmt(fmt)	KBUILD_MODNAME ":%s: " fmt, __func__
+
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/cdev.h>
+#include <linux/fs.h>
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/ioctl.h>
+
+#include "../xocl_drv.h"
+
+/* IOCTL interfaces */
+#define XIL_XVC_MAGIC 0x58564344  /* "XVCD" */
+#define	MINOR_PUB_HIGH_BIT	0x00000
+#define	MINOR_PRI_HIGH_BIT	0x10000
+#define MINOR_NAME_MASK		0xffffffff
+
+enum xvc_algo_type {
+	XVC_ALGO_NULL,
+	XVC_ALGO_CFG,
+	XVC_ALGO_BAR
+};
+
+struct xil_xvc_ioc {
+	unsigned int opcode;
+	unsigned int length;
+	unsigned char *tms_buf;
+	unsigned char *tdi_buf;
+	unsigned char *tdo_buf;
+};
+
+struct xil_xvc_properties {
+	unsigned int xvc_algo_type;
+	unsigned int config_vsec_id;
+	unsigned int config_vsec_rev;
+	unsigned int bar_index;
+	unsigned int bar_offset;
+};
+
+#define XDMA_IOCXVC	     _IOWR(XIL_XVC_MAGIC, 1, struct xil_xvc_ioc)
+#define XDMA_RDXVC_PROPS _IOR(XIL_XVC_MAGIC, 2, struct xil_xvc_properties)
+
+#define COMPLETION_LOOP_MAX	100
+
+#define XVC_BAR_LENGTH_REG	0x0
+#define XVC_BAR_TMS_REG		0x4
+#define XVC_BAR_TDI_REG		0x8
+#define XVC_BAR_TDO_REG		0xC
+#define XVC_BAR_CTRL_REG	0x10
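+
+/*
+ * Register usage (see xvc_shift_bits() below): LENGTH holds the number of
+ * bits to shift, TMS and TDI hold the outgoing bit vectors, setting bit 0 of
+ * CTRL starts the shift (the bit clears on completion), and TDO returns the
+ * captured bits.
+ */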
+
+#define XVC_DEV_NAME "xvc" SUBDEV_SUFFIX
+
+struct xocl_xvc {
+	void __iomem *base;
+	unsigned int instance;
+	struct cdev *sys_cdev;
+	struct device *sys_device;
+};
+
+static dev_t xvc_dev;
+
+static struct xil_xvc_properties xvc_pci_props;
+
+#ifdef __REG_DEBUG__
+/* SECTION: Function definitions */
+static inline void __write_register(const char *fn, u32 value, void *base,
+				unsigned int off)
+{
+	pr_info("%s: 0x%p, W reg 0x%lx, 0x%x.\n", fn, base, off, value);
+	iowrite32(value, base + off);
+}
+
+static inline u32 __read_register(const char *fn, void *base, unsigned int off)
+{
+	u32 v = ioread32(base + off);
+
+	pr_info("%s: 0x%p, R reg 0x%lx, 0x%x.\n", fn, base, off, v);
+	return v;
+}
+#define write_register(v, base, off) __write_register(__func__, v, base, off)
+#define read_register(base, off) __read_register(__func__, base, off)
+
+#else
+#define write_register(v, base, off) iowrite32(v, (base) + (off))
+#define read_register(base, off) ioread32((base) + (off))
+#endif /* #ifdef __REG_DEBUG__ */
+
+
+static int xvc_shift_bits(void *base, u32 tms_bits, u32 tdi_bits,
+			  u32 *tdo_bits)
+{
+	u32 control;
+	u32 write_reg_data;
+	int count;
+
+	/* set tms bit */
+	write_register(tms_bits, base, XVC_BAR_TMS_REG);
+	/* set tdi bits and shift data out */
+	write_register(tdi_bits, base, XVC_BAR_TDI_REG);
+
+	control = read_register(base, XVC_BAR_CTRL_REG);
+	/* enable shift operation */
+	write_reg_data = control | 0x01;
+	write_register(write_reg_data, base, XVC_BAR_CTRL_REG);
+
+	/* poll for completion */
+	count = COMPLETION_LOOP_MAX;
+	while (count) {
+		/* read control reg to check shift operation completion */
+		control = read_register(base, XVC_BAR_CTRL_REG);
+		if ((control & 0x01) == 0)
+			break;
+
+		count--;
+	}
+
+	if (!count)	{
+		pr_warn("XVC bar transaction timed out (0x%0X)\n", control);
+		return -ETIMEDOUT;
+	}
+
+	/* read tdo bits back out */
+	*tdo_bits = read_register(base, XVC_BAR_TDO_REG);
+
+	return 0;
+}
+
+static long xvc_ioctl_helper(struct xocl_xvc *xvc, const void __user *arg)
+{
+	struct xil_xvc_ioc xvc_obj;
+	unsigned int opcode;
+	unsigned int total_bits;
+	unsigned int total_bytes;
+	unsigned int bits, bits_left;
+	unsigned char *buffer = NULL;
+	unsigned char *tms_buf = NULL;
+	unsigned char *tdi_buf = NULL;
+	unsigned char *tdo_buf = NULL;
+	void __iomem *iobase = xvc->base;
+	u32 control_reg_data;
+	u32 write_reg_data;
+	int rv;
+
+	rv = copy_from_user((void *)&xvc_obj, arg,
+				sizeof(struct xil_xvc_ioc));
+	/* anything not copied ? */
+	if (rv) {
+		pr_info("copy_from_user xvc_obj failed: %d.\n", rv);
+		rv = -EFAULT;
+		goto cleanup;
+	}
+
+	opcode = xvc_obj.opcode;
+
+	/* Invalid operation type, no operation performed */
+	if (opcode != 0x01 && opcode != 0x02) {
+		pr_info("UNKNOWN opcode 0x%x.\n", opcode);
+		return -EINVAL;
+	}
+
+	total_bits = xvc_obj.length;
+	total_bytes = (total_bits + 7) >> 3;
+
+	buffer = kmalloc(total_bytes * 3, GFP_KERNEL);
+	if (!buffer) {
+		pr_info("OOM %u, op 0x%x, len %u bits, %u bytes.\n",
+			3 * total_bytes, opcode, total_bits, total_bytes);
+		rv = -ENOMEM;
+		goto cleanup;
+	}
+	tms_buf = buffer;
+	tdi_buf = tms_buf + total_bytes;
+	tdo_buf = tdi_buf + total_bytes;
+
+	rv = copy_from_user((void *)tms_buf, xvc_obj.tms_buf, total_bytes);
+	if (rv) {
+		pr_info("copy tms_buf failed: %d/%u.\n", rv, total_bytes);
+		rv = -EFAULT;
+		goto cleanup;
+	}
+	rv = copy_from_user((void *)tdi_buf, xvc_obj.tdi_buf, total_bytes);
+	if (rv) {
+		pr_info("copy tdi_buf failed: %d/%u.\n", rv, total_bytes);
+		rv = -EFAULT;
+		goto cleanup;
+	}
+
+	/* If performing loopback test, set loopback bit (0x02) in control reg */
+	if (opcode == 0x02) {
+		control_reg_data = read_register(iobase, XVC_BAR_CTRL_REG);
+		write_reg_data = control_reg_data | 0x02;
+		write_register(write_reg_data, iobase, XVC_BAR_CTRL_REG);
+	}
+
+	/* set length register to 32 initially if more than one
+	 * word-transaction is to be done
+	 */
+	if (total_bits >= 32)
+		write_register(0x20, iobase, XVC_BAR_LENGTH_REG);
+
+	for (bits = 0, bits_left = total_bits; bits < total_bits; bits += 32,
+		bits_left -= 32) {
+		unsigned int bytes = bits >> 3;
+		unsigned int shift_bytes = 4;
+		u32 tms_store = 0;
+		u32 tdi_store = 0;
+		u32 tdo_store = 0;
+
+		if (bits_left < 32) {
+			/* set number of bits to shift out */
+			write_register(bits_left, iobase, XVC_BAR_LENGTH_REG);
+			shift_bytes = (bits_left + 7) >> 3;
+		}
+
+		memcpy(&tms_store, tms_buf + bytes, shift_bytes);
+		memcpy(&tdi_store, tdi_buf + bytes, shift_bytes);
+
+		/* Shift data out and copy to output buffer */
+		rv = xvc_shift_bits(iobase, tms_store, tdi_store, &tdo_store);
+		if (rv < 0)
+			goto cleanup;
+
+		memcpy(tdo_buf + bytes, &tdo_store, shift_bytes);
+	}
+
+	/* If performing loopback test, reset loopback bit in control reg */
+	if (opcode == 0x02) {
+		control_reg_data = read_register(iobase, XVC_BAR_CTRL_REG);
+		write_reg_data = control_reg_data & ~(0x02);
+		write_register(write_reg_data, iobase, XVC_BAR_CTRL_REG);
+	}
+
+	rv = copy_to_user((void *)xvc_obj.tdo_buf, tdo_buf, total_bytes);
+	if (rv) {
+		pr_info("copy back tdo_buf failed: %d/%u.\n", rv, total_bytes);
+		rv = -EFAULT;
+		goto cleanup;
+	}
+
+cleanup:
+	kfree(buffer);
+
+	mmiowb();
+
+	return rv;
+}
+
+static long xvc_read_properties(struct xocl_xvc *xvc, const void __user *arg)
+{
+	int status = 0;
+	struct xil_xvc_properties xvc_props_obj;
+
+	xvc_props_obj.xvc_algo_type   = (unsigned int) xvc_pci_props.xvc_algo_type;
+	xvc_props_obj.config_vsec_id  = xvc_pci_props.config_vsec_id;
+	xvc_props_obj.config_vsec_rev = xvc_pci_props.config_vsec_rev;
+	xvc_props_obj.bar_index		  = xvc_pci_props.bar_index;
+	xvc_props_obj.bar_offset	  = xvc_pci_props.bar_offset;
+
+	if (copy_to_user((void __user *)arg, &xvc_props_obj, sizeof(xvc_props_obj)))
+		status = -EFAULT;
+
+	mmiowb();
+	return status;
+}
+
+static long xvc_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	struct xocl_xvc *xvc = filp->private_data;
+	long status = 0;
+
+	switch (cmd) {
+	case XDMA_IOCXVC:
+		status = xvc_ioctl_helper(xvc, (void __user *)arg);
+		break;
+	case XDMA_RDXVC_PROPS:
+		status = xvc_read_properties(xvc, (void __user *)arg);
+		break;
+	default:
+		status = -ENOIOCTLCMD;
+		break;
+	}
+
+	return status;
+}
+
+static int char_open(struct inode *inode, struct file *file)
+{
+	struct xocl_xvc *xvc = NULL;
+
+	xvc = xocl_drvinst_open(inode->i_cdev);
+	if (!xvc)
+		return -ENXIO;
+
+	/* create a reference to our char device in the opened file */
+	file->private_data = xvc;
+	return 0;
+}
+
+/*
+ * Called when the device goes from used to unused.
+ */
+static int char_close(struct inode *inode, struct file *file)
+{
+	struct xocl_xvc *xvc = file->private_data;
+
+	xocl_drvinst_close(xvc);
+	return 0;
+}
+
+
+/*
+ * character device file operations for the XVC
+ */
+static const struct file_operations xvc_fops = {
+	.owner = THIS_MODULE,
+	.open = char_open,
+	.release = char_close,
+	.unlocked_ioctl = xvc_ioctl,
+};
+
+static int xvc_probe(struct platform_device *pdev)
+{
+	struct xocl_xvc *xvc;
+	struct resource *res;
+	struct xocl_dev_core *core;
+	int err;
+
+	xvc = xocl_drvinst_alloc(&pdev->dev, sizeof(*xvc));
+	if (!xvc)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		err = -EINVAL;
+		xocl_err(&pdev->dev, "Missing IO memory resource");
+		goto failed;
+	}
+
+	xvc->base = ioremap_nocache(res->start, res->end - res->start + 1);
+	if (!xvc->base) {
+		err = -EIO;
+		xocl_err(&pdev->dev, "Map iomem failed");
+		goto failed;
+	}
+
+	core = xocl_get_xdev(pdev);
+
+	xvc->sys_cdev = cdev_alloc();
+	xvc->sys_cdev->ops = &xvc_fops;
+	xvc->sys_cdev->owner = THIS_MODULE;
+	xvc->instance = XOCL_DEV_ID(core->pdev) |
+		platform_get_device_id(pdev)->driver_data;
+	xvc->sys_cdev->dev = MKDEV(MAJOR(xvc_dev), core->dev_minor);
+	err = cdev_add(xvc->sys_cdev, xvc->sys_cdev->dev, 1);
+	if (err) {
+		xocl_err(&pdev->dev, "cdev_add failed, %d", err);
+		goto failed;
+	}
+
+	xvc->sys_device = device_create(xrt_class, &pdev->dev,
+					xvc->sys_cdev->dev,
+					NULL, "%s%d",
+					platform_get_device_id(pdev)->name,
+					xvc->instance & MINOR_NAME_MASK);
+	if (IS_ERR(xvc->sys_device)) {
+		err = PTR_ERR(xvc->sys_device);
+		goto failed;
+	}
+
+	xocl_drvinst_set_filedev(xvc, xvc->sys_cdev);
+
+	platform_set_drvdata(pdev, xvc);
+	xocl_info(&pdev->dev, "XVC device instance %d initialized\n",
+		xvc->instance);
+
+	/* Update PCIe BAR properties in a global structure */
+	xvc_pci_props.xvc_algo_type   = XVC_ALGO_BAR;
+	xvc_pci_props.config_vsec_id  = 0;
+	xvc_pci_props.config_vsec_rev = 0;
+	xvc_pci_props.bar_index	      = core->bar_idx;
+	xvc_pci_props.bar_offset      = (unsigned int) res->start - (unsigned int)
+									pci_resource_start(core->pdev, core->bar_idx);
+
+	return 0;
+failed:
+	if (xvc->sys_device && !IS_ERR(xvc->sys_device))
+		device_destroy(xrt_class, xvc->sys_cdev->dev);
+	if (xvc->sys_cdev)
+		cdev_del(xvc->sys_cdev);
+	if (xvc->base)
+		iounmap(xvc->base);
+	xocl_drvinst_free(xvc);
+
+	return err;
+}
+
+
+static int xvc_remove(struct platform_device *pdev)
+{
+	struct xocl_xvc	*xvc;
+
+	xvc = platform_get_drvdata(pdev);
+	if (!xvc) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return -EINVAL;
+	}
+	device_destroy(xrt_class, xvc->sys_cdev->dev);
+	cdev_del(xvc->sys_cdev);
+	if (xvc->base)
+		iounmap(xvc->base);
+
+	platform_set_drvdata(pdev, NULL);
+	xocl_drvinst_free(xvc);
+
+	return 0;
+}
+
+struct platform_device_id xvc_id_table[] = {
+	{ XOCL_XVC_PUB, MINOR_PUB_HIGH_BIT },
+	{ XOCL_XVC_PRI, MINOR_PRI_HIGH_BIT },
+	{ },
+};
+
+static struct platform_driver	xvc_driver = {
+	.probe		= xvc_probe,
+	.remove		= xvc_remove,
+	.driver		= {
+		.name = XVC_DEV_NAME,
+	},
+	.id_table = xvc_id_table,
+};
+
+int __init xocl_init_xvc(void)
+{
+	int err = 0;
+
+	err = alloc_chrdev_region(&xvc_dev, 0, XOCL_MAX_DEVICES, XVC_DEV_NAME);
+	if (err < 0)
+		goto err_register_chrdev;
+
+	err = platform_driver_register(&xvc_driver);
+	if (err)
+		goto err_driver_reg;
+	return 0;
+
+err_driver_reg:
+	unregister_chrdev_region(xvc_dev, XOCL_MAX_DEVICES);
+err_register_chrdev:
+	return err;
+}
+
+void xocl_fini_xvc(void)
+{
+	unregister_chrdev_region(xvc_dev, XOCL_MAX_DEVICES);
+	platform_driver_unregister(&xvc_driver);
+}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC PATCH Xilinx Alveo 4/6] Add core of XDMA driver
  2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
                   ` (2 preceding siblings ...)
  2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 3/6] Add platform drivers for various IPs and frameworks sonal.santan
@ 2019-03-19 21:53 ` sonal.santan
  2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 5/6] Add management driver sonal.santan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: sonal.santan @ 2019-03-19 21:53 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/xocl/lib/Makefile.in   |   16 +
 drivers/gpu/drm/xocl/lib/cdev_sgdma.h  |   63 +
 drivers/gpu/drm/xocl/lib/libxdma.c     | 4368 ++++++++++++++++++++++++
 drivers/gpu/drm/xocl/lib/libxdma.h     |  596 ++++
 drivers/gpu/drm/xocl/lib/libxdma_api.h |  127 +
 5 files changed, 5170 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/lib/Makefile.in
 create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
 create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h

diff --git a/drivers/gpu/drm/xocl/lib/Makefile.in b/drivers/gpu/drm/xocl/lib/Makefile.in
new file mode 100644
index 000000000000..5e16aefd6aba
--- /dev/null
+++ b/drivers/gpu/drm/xocl/lib/Makefile.in
@@ -0,0 +1,16 @@
+#
+# Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+#
+# Authors:
+#
+# This software is licensed under the terms of the GNU General Public
+# License version 2, as published by the Free Software Foundation, and
+# may be copied, distributed, and modified under those terms.
+#
+# This program is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+
+xocl_lib-y	:= ../lib/libxdma.o
diff --git a/drivers/gpu/drm/xocl/lib/cdev_sgdma.h b/drivers/gpu/drm/xocl/lib/cdev_sgdma.h
new file mode 100644
index 000000000000..db6d3bbfe29b
--- /dev/null
+++ b/drivers/gpu/drm/xocl/lib/cdev_sgdma.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/*******************************************************************************
+ *
+ * Xilinx XDMA IP Core Linux Driver
+ * Copyright(c) 2015 - 2019 Xilinx, Inc.
+ *
+ * Karen Xie <karen.xie@xilinx.com>
+ *
+ ******************************************************************************/
+#ifndef _XDMA_IOCALLS_POSIX_H_
+#define _XDMA_IOCALLS_POSIX_H_
+
+#include <linux/ioctl.h>
+
+
+#define IOCTL_XDMA_PERF_V1 (1)
+#define XDMA_ADDRMODE_MEMORY (0)
+#define XDMA_ADDRMODE_FIXED (1)
+
+/*
+ * S means "Set" through a ptr,
+ * T means "Tell" directly with the argument value
+ * G means "Get": reply by setting through a pointer
+ * Q means "Query": response is on the return value
+ * X means "eXchange": switch G and S atomically
+ * H means "sHift": switch T and Q atomically
+ *
+ * _IO(type,nr)		    no arguments
+ * _IOR(type,nr,datatype)   read data from driver
+ * _IOW(type,nr,datatype)   write data to driver
+ * _IOWR(type,nr,datatype)  read/write data
+ *
+ * _IOC_DIR(nr)		    returns direction
+ * _IOC_TYPE(nr)	    returns magic
+ * _IOC_NR(nr)		    returns number
+ * _IOC_SIZE(nr)	    returns size
+ */
+
+struct xdma_performance_ioctl {
+	/* IOCTL_XDMA_IOCTL_Vx */
+	uint32_t version;
+	uint32_t transfer_size;
+	/* measurement */
+	uint32_t stopped;
+	uint32_t iterations;
+	uint64_t clock_cycle_count;
+	uint64_t data_cycle_count;
+	uint64_t pending_count;
+};
+
+
+
+/* IOCTL codes */
+
+#define IOCTL_XDMA_PERF_START   _IOW('q', 1, struct xdma_performance_ioctl *)
+#define IOCTL_XDMA_PERF_STOP    _IOW('q', 2, struct xdma_performance_ioctl *)
+#define IOCTL_XDMA_PERF_GET     _IOR('q', 3, struct xdma_performance_ioctl *)
+#define IOCTL_XDMA_ADDRMODE_SET _IOW('q', 4, int)
+#define IOCTL_XDMA_ADDRMODE_GET _IOR('q', 5, int)
+#define IOCTL_XDMA_ALIGN_GET    _IOR('q', 6, int)
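+
+/*
+ * Illustrative user-space usage (a sketch only; the device node path is an
+ * assumption and is not defined by this header):
+ *
+ *	struct xdma_performance_ioctl perf = {
+ *		.version = IOCTL_XDMA_PERF_V1,
+ *		.transfer_size = 4096,
+ *	};
+ *	int fd = open("/dev/xdma0_h2c_0", O_RDWR);
+ *
+ *	ioctl(fd, IOCTL_XDMA_PERF_START, &perf);
+ *	... run traffic ...
+ *	ioctl(fd, IOCTL_XDMA_PERF_STOP, &perf);
+ *	ioctl(fd, IOCTL_XDMA_PERF_GET, &perf);
+ */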
+
+#endif /* _XDMA_IOCALLS_POSIX_H_ */
diff --git a/drivers/gpu/drm/xocl/lib/libxdma.c b/drivers/gpu/drm/xocl/lib/libxdma.c
new file mode 100644
index 000000000000..290f2c153395
--- /dev/null
+++ b/drivers/gpu/drm/xocl/lib/libxdma.c
@@ -0,0 +1,4368 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ *  Copyright (C) 2017-2019 Xilinx, Inc. All rights reserved.
+ *  Author: Karen Xie <karen.xie@xilinx.com>
+ *
+ */
+#define pr_fmt(fmt)     KBUILD_MODNAME ":%s: " fmt, __func__
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/mm.h>
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <linux/vmalloc.h>
+
+#include "libxdma.h"
+#include "libxdma_api.h"
+#include "cdev_sgdma.h"
+
+/* SECTION: Module licensing */
+
+#ifdef __LIBXDMA_MOD__
+#include "version.h"
+#define DRV_MODULE_NAME		"libxdma"
+#define DRV_MODULE_DESC		"Xilinx XDMA Base Driver"
+#define DRV_MODULE_RELDATE	"Feb. 2017"
+
+static char version[] =
+	DRV_MODULE_DESC " " DRV_MODULE_NAME " v" DRV_MODULE_VERSION "\n";
+
+MODULE_AUTHOR("Xilinx, Inc.");
+MODULE_DESCRIPTION(DRV_MODULE_DESC);
+MODULE_VERSION(DRV_MODULE_VERSION);
+MODULE_LICENSE("GPL v2");
+#endif
+
+/* Module Parameters */
+static unsigned int poll_mode;
+module_param(poll_mode, uint, 0644);
+MODULE_PARM_DESC(poll_mode, "Set 1 for hw polling, default is 0 (interrupts)");
+
+static unsigned int interrupt_mode;
+module_param(interrupt_mode, uint, 0644);
+MODULE_PARM_DESC(interrupt_mode, "0 - MSI-X, 1 - MSI, 2 - Legacy");
+
+static unsigned int enable_credit_mp;
+module_param(enable_credit_mp, uint, 0644);
+MODULE_PARM_DESC(enable_credit_mp, "Set 1 to enable credit feature, default is 0 (no credit control)");
+
+/*
+ * xdma device management
+ * maintains a list of the xdma devices
+ */
+static LIST_HEAD(xdev_list);
+static DEFINE_MUTEX(xdev_mutex);
+
+static LIST_HEAD(xdev_rcu_list);
+static DEFINE_SPINLOCK(xdev_rcu_lock);
+
+#ifndef list_last_entry
+#define list_last_entry(ptr, type, member) \
+		list_entry((ptr)->prev, type, member)
+#endif
+
+static inline void xdev_list_add(struct xdma_dev *xdev)
+{
+	mutex_lock(&xdev_mutex);
+	if (list_empty(&xdev_list))
+		xdev->idx = 0;
+	else {
+		struct xdma_dev *last;
+
+		last = list_last_entry(&xdev_list, struct xdma_dev, list_head);
+		xdev->idx = last->idx + 1;
+	}
+	list_add_tail(&xdev->list_head, &xdev_list);
+	mutex_unlock(&xdev_mutex);
+
+	dbg_init("dev %s, xdev 0x%p, xdma idx %d.\n",
+		dev_name(&xdev->pdev->dev), xdev, xdev->idx);
+
+	spin_lock(&xdev_rcu_lock);
+	list_add_tail_rcu(&xdev->rcu_node, &xdev_rcu_list);
+	spin_unlock(&xdev_rcu_lock);
+}
+
+#undef list_last_entry
+
+static inline void xdev_list_remove(struct xdma_dev *xdev)
+{
+	mutex_lock(&xdev_mutex);
+	list_del(&xdev->list_head);
+	mutex_unlock(&xdev_mutex);
+
+	spin_lock(&xdev_rcu_lock);
+	list_del_rcu(&xdev->rcu_node);
+	spin_unlock(&xdev_rcu_lock);
+	synchronize_rcu();
+}
+
+static struct xdma_dev *xdev_find_by_pdev(struct pci_dev *pdev)
+{
+	struct xdma_dev *xdev, *tmp;
+
+	mutex_lock(&xdev_mutex);
+	list_for_each_entry_safe(xdev, tmp, &xdev_list, list_head) {
+		if (xdev->pdev == pdev) {
+			mutex_unlock(&xdev_mutex);
+			return xdev;
+		}
+	}
+	mutex_unlock(&xdev_mutex);
+	return NULL;
+}
+
+static inline int debug_check_dev_hndl(const char *fname, struct pci_dev *pdev,
+				 void *hndl)
+{
+	struct xdma_dev *xdev;
+
+	if (!pdev)
+		return -EINVAL;
+
+	xdev = xdev_find_by_pdev(pdev);
+	if (!xdev) {
+		pr_info("%s pdev 0x%p, hndl 0x%p, NO match found!\n",
+			fname, pdev, hndl);
+		return -EINVAL;
+	}
+	if (xdev != hndl) {
+		pr_err("%s pdev 0x%p, hndl 0x%p != 0x%p!\n",
+			fname, pdev, hndl, xdev);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+#ifdef __LIBXDMA_DEBUG__
+/* SECTION: Function definitions */
+inline void __write_register(const char *fn, u32 value, void *iomem, unsigned long off)
+{
+	pr_err("%s: w reg 0x%lx(0x%p), 0x%x.\n", fn, off, iomem, value);
+	iowrite32(value, iomem);
+}
+#define write_register(v, mem, off) __write_register(__func__, v, mem, off)
+#else
+#define write_register(v, mem, off) iowrite32(v, mem)
+#endif
+
+inline u32 read_register(void *iomem)
+{
+	return ioread32(iomem);
+}
+
+static inline u32 build_u32(u32 hi, u32 lo)
+{
+	return ((hi & 0xFFFFUL) << 16) | (lo & 0xFFFFUL);
+}
+
+static inline u64 build_u64(u64 hi, u64 lo)
+{
+	return ((hi & 0xFFFFFFFULL) << 32) | (lo & 0xFFFFFFFFULL);
+}
+
+static void check_nonzero_interrupt_status(struct xdma_dev *xdev)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+	u32 w;
+
+	w = read_register(&reg->user_int_enable);
+	if (w)
+		pr_info("%s xdma%d user_int_enable = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+
+	w = read_register(&reg->channel_int_enable);
+	if (w)
+		pr_info("%s xdma%d channel_int_enable = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+
+	w = read_register(&reg->user_int_request);
+	if (w)
+		pr_info("%s xdma%d user_int_request = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+	w = read_register(&reg->channel_int_request);
+	if (w)
+		pr_info("%s xdma%d channel_int_request = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+
+	w = read_register(&reg->user_int_pending);
+	if (w)
+		pr_info("%s xdma%d user_int_pending = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+	w = read_register(&reg->channel_int_pending);
+	if (w)
+		pr_info("%s xdma%d channel_int_pending = 0x%08x\n",
+			dev_name(&xdev->pdev->dev), xdev->idx, w);
+}
+
+/* channel_interrupts_enable -- Enable interrupts we are interested in */
+static void channel_interrupts_enable(struct xdma_dev *xdev, u32 mask)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+
+	write_register(mask, &reg->channel_int_enable_w1s, XDMA_OFS_INT_CTRL);
+}
+
+/* channel_interrupts_disable -- Disable interrupts we are not interested in */
+static void channel_interrupts_disable(struct xdma_dev *xdev, u32 mask)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+
+	write_register(mask, &reg->channel_int_enable_w1c, XDMA_OFS_INT_CTRL);
+}
+
+/* user_interrupts_enable -- Enable interrupts we are interested in */
+static void user_interrupts_enable(struct xdma_dev *xdev, u32 mask)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+
+	write_register(mask, &reg->user_int_enable_w1s, XDMA_OFS_INT_CTRL);
+}
+
+/* user_interrupts_disable -- Disable interrupts we are not interested in */
+static void user_interrupts_disable(struct xdma_dev *xdev, u32 mask)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+
+	write_register(mask, &reg->user_int_enable_w1c, XDMA_OFS_INT_CTRL);
+}
+
+/* read_interrupts -- Print the interrupt controller status */
+static u32 read_interrupts(struct xdma_dev *xdev)
+{
+	struct interrupt_regs *reg = (struct interrupt_regs *)
+		(xdev->bar[xdev->config_bar_idx] + XDMA_OFS_INT_CTRL);
+	u32 lo;
+	u32 hi;
+
+	/* extra debugging; inspect complete engine set of registers */
+	hi = read_register(&reg->user_int_request);
+	dbg_io("ioread32(0x%p) returned 0x%08x (user_int_request).\n",
+		&reg->user_int_request, hi);
+	lo = read_register(&reg->channel_int_request);
+	dbg_io("ioread32(0x%p) returned 0x%08x (channel_int_request)\n",
+		&reg->channel_int_request, lo);
+
+	/* return interrupts: user in upper 16-bits, channel in lower 16-bits */
+	return build_u32(hi, lo);
+}
+
+void enable_perf(struct xdma_engine *engine)
+{
+	u32 w;
+
+	w = XDMA_PERF_CLEAR;
+	write_register(w, &engine->regs->perf_ctrl,
+			(unsigned long)(&engine->regs->perf_ctrl) -
+			(unsigned long)(&engine->regs));
+	read_register(&engine->regs->identifier);
+	w = XDMA_PERF_AUTO | XDMA_PERF_RUN;
+	write_register(w, &engine->regs->perf_ctrl,
+			(unsigned long)(&engine->regs->perf_ctrl) -
+			(unsigned long)(&engine->regs));
+	read_register(&engine->regs->identifier);
+
+	dbg_perf("IOCTL_XDMA_PERF_START\n");
+
+}
+EXPORT_SYMBOL_GPL(enable_perf);
+
+void get_perf_stats(struct xdma_engine *engine)
+{
+	u32 hi;
+	u32 lo;
+
+	BUG_ON(!engine);
+	BUG_ON(!engine->xdma_perf);
+
+	hi = 0;
+	lo = read_register(&engine->regs->completed_desc_count);
+	engine->xdma_perf->iterations = build_u64(hi, lo);
+
+	hi = read_register(&engine->regs->perf_cyc_hi);
+	lo = read_register(&engine->regs->perf_cyc_lo);
+
+	engine->xdma_perf->clock_cycle_count = build_u64(hi, lo);
+
+	hi = read_register(&engine->regs->perf_dat_hi);
+	lo = read_register(&engine->regs->perf_dat_lo);
+	engine->xdma_perf->data_cycle_count = build_u64(hi, lo);
+
+	hi = read_register(&engine->regs->perf_pnd_hi);
+	lo = read_register(&engine->regs->perf_pnd_lo);
+	engine->xdma_perf->pending_count = build_u64(hi, lo);
+}
+EXPORT_SYMBOL_GPL(get_perf_stats);
+
+static void engine_reg_dump(struct xdma_engine *engine)
+{
+	u32 w;
+
+	BUG_ON(!engine);
+
+	w = read_register(&engine->regs->identifier);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (id).\n",
+		engine->name, &engine->regs->identifier, w);
+	w &= BLOCK_ID_MASK;
+	if (w != BLOCK_ID_HEAD) {
+		pr_info("%s: engine id missing, 0x%08x exp. & 0x%x = 0x%x\n",
+			 engine->name, w, BLOCK_ID_MASK, BLOCK_ID_HEAD);
+		return;
+	}
+	/* extra debugging; inspect complete engine set of registers */
+	w = read_register(&engine->regs->status);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (status).\n",
+		engine->name, &engine->regs->status, w);
+	w = read_register(&engine->regs->control);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (control)\n",
+		engine->name, &engine->regs->control, w);
+	w = read_register(&engine->sgdma_regs->first_desc_lo);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (first_desc_lo)\n",
+		engine->name, &engine->sgdma_regs->first_desc_lo, w);
+	w = read_register(&engine->sgdma_regs->first_desc_hi);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (first_desc_hi)\n",
+		engine->name, &engine->sgdma_regs->first_desc_hi, w);
+	w = read_register(&engine->sgdma_regs->first_desc_adjacent);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (first_desc_adjacent).\n",
+		engine->name, &engine->sgdma_regs->first_desc_adjacent, w);
+	w = read_register(&engine->regs->completed_desc_count);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (completed_desc_count).\n",
+		engine->name, &engine->regs->completed_desc_count, w);
+	w = read_register(&engine->regs->interrupt_enable_mask);
+	pr_info("%s: ioread32(0x%p) = 0x%08x (interrupt_enable_mask)\n",
+		engine->name, &engine->regs->interrupt_enable_mask, w);
+}
+
+/**
+ * engine_status_read() - read status of SG DMA engine (optionally reset)
+ *
+ * Stores status in engine->status.
+ *
+ * @return the engine status register value
+ */
+static void engine_status_dump(struct xdma_engine *engine)
+{
+	u32 v = engine->status;
+	char buffer[256];
+	char *buf = buffer;
+	int len = 0;
+
+	len = sprintf(buf, "SG engine %s status: 0x%08x: ", engine->name, v);
+
+	if ((v & XDMA_STAT_BUSY))
+		len += sprintf(buf + len, "BUSY,");
+	if ((v & XDMA_STAT_DESC_STOPPED))
+		len += sprintf(buf + len, "DESC_STOPPED,");
+	if ((v & XDMA_STAT_DESC_COMPLETED))
+		len += sprintf(buf + len, "DESC_COMPL,");
+
+	/* common H2C & C2H */
+	if ((v & XDMA_STAT_COMMON_ERR_MASK)) {
+		if ((v & XDMA_STAT_ALIGN_MISMATCH))
+			len += sprintf(buf + len, "ALIGN_MISMATCH ");
+		if ((v & XDMA_STAT_MAGIC_STOPPED))
+			len += sprintf(buf + len, "MAGIC_STOPPED ");
+		if ((v & XDMA_STAT_INVALID_LEN))
+			len += sprintf(buf + len, "INVLIAD_LEN ");
+		if ((v & XDMA_STAT_IDLE_STOPPED))
+			len += sprintf(buf + len, "IDLE_STOPPED ");
+		buf[len - 1] = ',';
+	}
+
+	if (engine->dir == DMA_TO_DEVICE) {
+		/* H2C only */
+		if ((v & XDMA_STAT_H2C_R_ERR_MASK)) {
+			len += sprintf(buf + len, "R:");
+			if ((v & XDMA_STAT_H2C_R_UNSUPP_REQ))
+				len += sprintf(buf + len, "UNSUPP_REQ ");
+			if ((v & XDMA_STAT_H2C_R_COMPL_ABORT))
+				len += sprintf(buf + len, "COMPL_ABORT ");
+			if ((v & XDMA_STAT_H2C_R_PARITY_ERR))
+				len += sprintf(buf + len, "PARITY ");
+			if ((v & XDMA_STAT_H2C_R_HEADER_EP))
+				len += sprintf(buf + len, "HEADER_EP ");
+			if ((v & XDMA_STAT_H2C_R_UNEXP_COMPL))
+				len += sprintf(buf + len, "UNEXP_COMPL ");
+			buf[len - 1] = ',';
+		}
+
+		if ((v & XDMA_STAT_H2C_W_ERR_MASK)) {
+			len += sprintf(buf + len, "W:");
+			if ((v & XDMA_STAT_H2C_W_DECODE_ERR))
+				len += sprintf(buf + len, "DECODE_ERR ");
+			if ((v & XDMA_STAT_H2C_W_SLAVE_ERR))
+				len += sprintf(buf + len, "SLAVE_ERR ");
+			buf[len - 1] = ',';
+		}
+
+	} else {
+		/* C2H only */
+		if ((v & XDMA_STAT_C2H_R_ERR_MASK)) {
+			len += sprintf(buf + len, "R:");
+			if ((v & XDMA_STAT_C2H_R_DECODE_ERR))
+				len += sprintf(buf + len, "DECODE_ERR ");
+			if ((v & XDMA_STAT_C2H_R_SLAVE_ERR))
+				len += sprintf(buf + len, "SLAVE_ERR ");
+			buf[len - 1] = ',';
+		}
+	}
+
+	/* common H2C & C2H */
+	if ((v & XDMA_STAT_DESC_ERR_MASK)) {
+		len += sprintf(buf + len, "DESC_ERR:");
+		if ((v & XDMA_STAT_DESC_UNSUPP_REQ))
+			len += sprintf(buf + len, "UNSUPP_REQ ");
+		if ((v & XDMA_STAT_DESC_COMPL_ABORT))
+			len += sprintf(buf + len, "COMPL_ABORT ");
+		if ((v & XDMA_STAT_DESC_PARITY_ERR))
+			len += sprintf(buf + len, "PARITY ");
+		if ((v & XDMA_STAT_DESC_HEADER_EP))
+			len += sprintf(buf + len, "HEADER_EP ");
+		if ((v & XDMA_STAT_DESC_UNEXP_COMPL))
+			len += sprintf(buf + len, "UNEXP_COMPL ");
+		buf[len - 1] = ',';
+	}
+
+	buf[len - 1] = '\0';
+	pr_info("%s\n", buffer);
+}
+
+static u32 engine_status_read(struct xdma_engine *engine, bool clear, bool dump)
+{
+	u32 value;
+
+	BUG_ON(!engine);
+
+	if (dump)
+		engine_reg_dump(engine);
+
+	/* read status register */
+	if (clear)
+		value = engine->status =
+			read_register(&engine->regs->status_rc);
+	else
+		value = engine->status = read_register(&engine->regs->status);
+
+	if (dump)
+		engine_status_dump(engine);
+
+	return value;
+}
+
+/**
+ * xdma_engine_stop() - stop an SG DMA engine
+ *
+ */
+static void xdma_engine_stop(struct xdma_engine *engine)
+{
+	u32 w;
+
+	BUG_ON(!engine);
+	dbg_tfr("(engine=%p)\n", engine);
+
+	w = 0;
+	w |= (u32)XDMA_CTRL_IE_DESC_ALIGN_MISMATCH;
+	w |= (u32)XDMA_CTRL_IE_MAGIC_STOPPED;
+	w |= (u32)XDMA_CTRL_IE_READ_ERROR;
+	w |= (u32)XDMA_CTRL_IE_DESC_ERROR;
+
+	if (poll_mode) {
+		w |= (u32) XDMA_CTRL_POLL_MODE_WB;
+	} else {
+		w |= (u32)XDMA_CTRL_IE_DESC_STOPPED;
+		w |= (u32)XDMA_CTRL_IE_DESC_COMPLETED;
+
+		/* Disable IDLE STOPPED for MM */
+		if ((engine->streaming && (engine->dir == DMA_FROM_DEVICE)) ||
+		    (engine->xdma_perf))
+			w |= (u32)XDMA_CTRL_IE_IDLE_STOPPED;
+	}
+
+	dbg_tfr("Stopping SG DMA %s engine; writing 0x%08x to 0x%p.\n",
+			engine->name, w, (u32 *)&engine->regs->control);
+	write_register(w, &engine->regs->control,
+			(unsigned long)(&engine->regs->control) -
+			(unsigned long)(&engine->regs));
+	/* dummy read of status register to flush all previous writes */
+	dbg_tfr("(%s) done\n", engine->name);
+}
+
+static void engine_start_mode_config(struct xdma_engine *engine)
+{
+	u32 w;
+
+	BUG_ON(!engine);
+
+	/* If a perf test is running, enable the engine interrupts */
+	if (engine->xdma_perf) {
+		w = XDMA_CTRL_IE_DESC_STOPPED;
+		w |= XDMA_CTRL_IE_DESC_COMPLETED;
+		w |= XDMA_CTRL_IE_DESC_ALIGN_MISMATCH;
+		w |= XDMA_CTRL_IE_MAGIC_STOPPED;
+		w |= XDMA_CTRL_IE_IDLE_STOPPED;
+		w |= XDMA_CTRL_IE_READ_ERROR;
+		w |= XDMA_CTRL_IE_DESC_ERROR;
+
+		write_register(w, &engine->regs->interrupt_enable_mask,
+			(unsigned long)(&engine->regs->interrupt_enable_mask) -
+			(unsigned long)(&engine->regs));
+	}
+
+	/* write control register of SG DMA engine */
+	w = (u32)XDMA_CTRL_RUN_STOP;
+	w |= (u32)XDMA_CTRL_IE_READ_ERROR;
+	w |= (u32)XDMA_CTRL_IE_DESC_ERROR;
+	w |= (u32)XDMA_CTRL_IE_DESC_ALIGN_MISMATCH;
+	w |= (u32)XDMA_CTRL_IE_MAGIC_STOPPED;
+
+	if (poll_mode) {
+		w |= (u32)XDMA_CTRL_POLL_MODE_WB;
+	} else {
+		w |= (u32)XDMA_CTRL_IE_DESC_STOPPED;
+		w |= (u32)XDMA_CTRL_IE_DESC_COMPLETED;
+
+		if ((engine->streaming && (engine->dir == DMA_FROM_DEVICE)) ||
+		    (engine->xdma_perf))
+			w |= (u32)XDMA_CTRL_IE_IDLE_STOPPED;
+
+		/* set non-incremental addressing mode */
+		if (engine->non_incr_addr)
+			w |= (u32)XDMA_CTRL_NON_INCR_ADDR;
+	}
+
+	dbg_tfr("iowrite32(0x%08x to 0x%p) (control)\n", w,
+			(void *)&engine->regs->control);
+	/* start the engine */
+	write_register(w, &engine->regs->control,
+			(unsigned long)(&engine->regs->control) -
+			(unsigned long)(&engine->regs));
+
+	/* dummy read of status register to flush all previous writes */
+	w = read_register(&engine->regs->status);
+	dbg_tfr("ioread32(0x%p) = 0x%08x (dummy read flushes writes).\n",
+			&engine->regs->status, w);
+}
+
+/**
+ * engine_start() - start an idle engine with its first transfer on queue
+ *
+ * The engine will run and process all transfers that are queued using
+ * transfer_queue() and thus have their descriptor lists chained.
+ *
+ * During the run, new transfers will be processed if transfer_queue() has
+ * chained the descriptors before the hardware fetches the last descriptor.
+ * A transfer that was chained too late will invoke a new run of the engine
+ * initiated from the engine_service() routine.
+ *
+ * The engine must be idle and at least one transfer must be queued.
+ * This function does not take locks; the engine spinlock must already be
+ * taken.
+ *
+ */
+static struct xdma_transfer *engine_start(struct xdma_engine *engine)
+{
+	struct xdma_transfer *transfer;
+	u32 w;
+	int extra_adj = 0;
+
+	/* engine must be idle */
+	BUG_ON(engine->running);
+	/* engine transfer queue must not be empty */
+	BUG_ON(list_empty(&engine->transfer_list));
+	/* inspect first transfer queued on the engine */
+	transfer = list_entry(engine->transfer_list.next, struct xdma_transfer,
+				entry);
+	BUG_ON(!transfer);
+
+	/* engine is no longer shutdown */
+	engine->shutdown = ENGINE_SHUTDOWN_NONE;
+
+	dbg_tfr("(%s): transfer=0x%p.\n", engine->name, transfer);
+
+	/* initialize number of descriptors of dequeued transfers */
+	engine->desc_dequeued = 0;
+
+	/* write lower 32-bit of bus address of transfer first descriptor */
+	w = cpu_to_le32(PCI_DMA_L(transfer->desc_bus));
+	dbg_tfr("iowrite32(0x%08x to 0x%p) (first_desc_lo)\n", w,
+			(void *)&engine->sgdma_regs->first_desc_lo);
+	write_register(w, &engine->sgdma_regs->first_desc_lo,
+			(unsigned long)(&engine->sgdma_regs->first_desc_lo) -
+			(unsigned long)(&engine->sgdma_regs));
+	/* write upper 32-bit of bus address of transfer first descriptor */
+	w = cpu_to_le32(PCI_DMA_H(transfer->desc_bus));
+	dbg_tfr("iowrite32(0x%08x to 0x%p) (first_desc_hi)\n", w,
+			(void *)&engine->sgdma_regs->first_desc_hi);
+	write_register(w, &engine->sgdma_regs->first_desc_hi,
+			(unsigned long)(&engine->sgdma_regs->first_desc_hi) -
+			(unsigned long)(&engine->sgdma_regs));
+
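+	/* tell the hardware how many descriptors immediately follow the first one */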
+	if (transfer->desc_adjacent > 0) {
+		extra_adj = transfer->desc_adjacent - 1;
+		if (extra_adj > MAX_EXTRA_ADJ)
+			extra_adj = MAX_EXTRA_ADJ;
+	}
+	dbg_tfr("iowrite32(0x%08x to 0x%p) (first_desc_adjacent)\n",
+		extra_adj, (void *)&engine->sgdma_regs->first_desc_adjacent);
+	write_register(extra_adj, &engine->sgdma_regs->first_desc_adjacent,
+			(unsigned long)(&engine->sgdma_regs->first_desc_adjacent) -
+			(unsigned long)(&engine->sgdma_regs));
+
+	dbg_tfr("ioread32(0x%p) (dummy read flushes writes).\n",
+		&engine->regs->status);
+	mmiowb();
+
+	engine_start_mode_config(engine);
+
+	engine_status_read(engine, 0, 0);
+
+	dbg_tfr("%s engine 0x%p now running\n", engine->name, engine);
+	/* remember the engine is running */
+	engine->running = 1;
+	return transfer;
+}
+
+/**
+ * engine_service_shutdown() - stop an engine that has gone idle
+ *
+ * must be called with engine->lock already acquired
+ *
+ * @engine pointer to struct xdma_engine
+ *
+ */
+static void engine_service_shutdown(struct xdma_engine *engine)
+{
+	/* if the engine stopped with RUN still asserted, de-assert RUN now */
+	dbg_tfr("engine just went idle, resetting RUN_STOP.\n");
+	xdma_engine_stop(engine);
+	engine->running = 0;
+
+	/* awake task on engine's shutdown wait queue */
+	wake_up(&engine->shutdown_wq);
+}
+
+struct xdma_transfer *engine_transfer_completion(struct xdma_engine *engine,
+		struct xdma_transfer *transfer)
+{
+	BUG_ON(!engine);
+	BUG_ON(!transfer);
+
+	/* synchronous I/O? */
+	/* awake task on transfer's wait queue */
+	wake_up(&transfer->wq);
+
+	return transfer;
+}
+
+struct xdma_transfer *engine_service_transfer_list(struct xdma_engine *engine,
+		struct xdma_transfer *transfer, u32 *pdesc_completed)
+{
+	BUG_ON(!engine);
+	BUG_ON(!pdesc_completed);
+
+	if (!transfer) {
+		pr_info("%s xfer empty, pdesc completed %u.\n",
+			engine->name, *pdesc_completed);
+		return NULL;
+	}
+
+	/*
+	 * iterate over all the transfers completed by the engine,
+	 * except for the last (i.e. use > instead of >=).
+	 */
+	while (transfer && (!transfer->cyclic) &&
+		(*pdesc_completed > transfer->desc_num)) {
+		/* remove this transfer from pdesc_completed */
+		*pdesc_completed -= transfer->desc_num;
+		dbg_tfr("%s engine completed non-cyclic xfer 0x%p (%d desc)\n",
+			engine->name, transfer, transfer->desc_num);
+		/* remove completed transfer from list */
+		list_del(engine->transfer_list.next);
+		/* add to dequeued number of descriptors during this run */
+		engine->desc_dequeued += transfer->desc_num;
+		/* mark transfer as successfully completed */
+		transfer->state = TRANSFER_STATE_COMPLETED;
+
+		/*
+		 * Complete transfer - sets transfer to NULL if an async
+		 * transfer has completed
+		 */
+		transfer = engine_transfer_completion(engine, transfer);
+
+		/* if exists, get the next transfer on the list */
+		if (!list_empty(&engine->transfer_list)) {
+			transfer = list_entry(engine->transfer_list.next,
+					struct xdma_transfer, entry);
+			dbg_tfr("Non-completed transfer %p\n", transfer);
+		} else {
+			/* no further transfers? */
+			transfer = NULL;
+		}
+	}
+
+	return transfer;
+}
+
+static void engine_err_handle(struct xdma_engine *engine,
+		struct xdma_transfer *transfer, u32 desc_completed)
+{
+	u32 value;
+
+	/*
+	 * The BUSY bit is expected to be clear now but older HW has a race
+	 * condition which could cause it to be still set.  If it's set, re-read
+	 * and check again.  If it's still set, log the issue.
+	 */
+	if (engine->status & XDMA_STAT_BUSY) {
+		value = read_register(&engine->regs->status);
+		if (value & XDMA_STAT_BUSY)
+			pr_info_ratelimited("%s has errors but is still BUSY\n",
+				engine->name);
+	}
+
+	pr_info_ratelimited("%s, s 0x%x, aborted xfer 0x%p, cmpl %d/%d\n",
+		engine->name, engine->status, transfer, desc_completed,
+		transfer->desc_num);
+
+	/* mark transfer as failed */
+	transfer->state = TRANSFER_STATE_FAILED;
+	xdma_engine_stop(engine);
+}
+
+struct xdma_transfer *engine_service_final_transfer(struct xdma_engine *engine,
+			struct xdma_transfer *transfer, u32 *pdesc_completed)
+{
+	BUG_ON(!engine);
+	BUG_ON(!transfer);
+	BUG_ON(!pdesc_completed);
+
+	/* inspect the current transfer */
+	if (transfer) {
+		if (((engine->dir == DMA_FROM_DEVICE) &&
+		     (engine->status & XDMA_STAT_C2H_ERR_MASK)) ||
+		    ((engine->dir == DMA_TO_DEVICE) &&
+		     (engine->status & XDMA_STAT_H2C_ERR_MASK))) {
+			pr_info("engine %s, status error 0x%x.\n",
+				engine->name, engine->status);
+			engine_status_dump(engine);
+			engine_err_handle(engine, transfer, *pdesc_completed);
+			goto transfer_del;
+		}
+
+		if (engine->status & XDMA_STAT_BUSY)
+			dbg_tfr("Engine %s is unexpectedly busy - ignoring\n",
+				engine->name);
+
+		/* the engine stopped on current transfer? */
+		if (*pdesc_completed < transfer->desc_num) {
+			transfer->state = TRANSFER_STATE_FAILED;
+			pr_info("%s, xfer 0x%p, stopped half-way, %d/%d.\n",
+				engine->name, transfer, *pdesc_completed,
+				transfer->desc_num);
+		} else {
+			dbg_tfr("engine %s completed transfer\n", engine->name);
+			dbg_tfr("Completed transfer ID = 0x%p\n", transfer);
+			dbg_tfr("*pdesc_completed=%d, transfer->desc_num=%d",
+				*pdesc_completed, transfer->desc_num);
+
+			if (!transfer->cyclic) {
+				/*
+				 * if the engine stopped on this transfer,
+				 * it should be the last
+				 */
+				WARN_ON(*pdesc_completed > transfer->desc_num);
+			}
+			/* mark transfer as successfully completed */
+			transfer->state = TRANSFER_STATE_COMPLETED;
+		}
+
+transfer_del:
+		/* remove completed transfer from list */
+		list_del(engine->transfer_list.next);
+		/* add to dequeued number of descriptors during this run */
+		engine->desc_dequeued += transfer->desc_num;
+
+		/*
+		 * Complete transfer - sets transfer to NULL if an asynchronous
+		 * transfer has completed
+		 */
+		transfer = engine_transfer_completion(engine, transfer);
+	}
+
+	return transfer;
+}
+
+static void engine_service_perf(struct xdma_engine *engine, u32 desc_completed)
+{
+	BUG_ON(!engine);
+
+	/* performance measurement is running? */
+	if (engine->xdma_perf) {
+		/* a descriptor was completed? */
+		if (engine->status & XDMA_STAT_DESC_COMPLETED) {
+			engine->xdma_perf->iterations = desc_completed;
+			dbg_perf("transfer->xdma_perf->iterations=%d\n",
+				engine->xdma_perf->iterations);
+		}
+
+		/* a descriptor stopped the engine? */
+		if (engine->status & XDMA_STAT_DESC_STOPPED) {
+			engine->xdma_perf->stopped = 1;
+			/*
+			 * wake any XDMA_PERF_IOCTL_STOP waiting for
+			 * the performance run to finish
+			 */
+			wake_up(&engine->xdma_perf_wq);
+			dbg_perf("transfer->xdma_perf stopped\n");
+		}
+	}
+}
+
+static void engine_transfer_dequeue(struct xdma_engine *engine)
+{
+	struct xdma_transfer *transfer;
+
+	BUG_ON(!engine);
+
+	/* pick first transfer on the queue (was submitted to the engine) */
+	transfer = list_entry(engine->transfer_list.next, struct xdma_transfer,
+		entry);
+	BUG_ON(!transfer);
+	BUG_ON(transfer != &engine->cyclic_req->xfer);
+	dbg_tfr("%s engine completed cyclic transfer 0x%p (%d desc).\n",
+		engine->name, transfer, transfer->desc_num);
+	/* remove completed transfer from list */
+	list_del(engine->transfer_list.next);
+}
+
+static int engine_ring_process(struct xdma_engine *engine)
+{
+	struct xdma_result *result;
+	int start;
+	int eop_count = 0;
+
+	BUG_ON(!engine);
+	result = engine->cyclic_result;
+	BUG_ON(!result);
+
+	/* where we start receiving in the ring buffer */
+	start = engine->rx_tail;
+
+	/* iterate through all newly received RX result descriptors */
+	dbg_tfr("%s, result %d, 0x%x, len 0x%x.\n",
+		engine->name, engine->rx_tail, result[engine->rx_tail].status,
+		result[engine->rx_tail].length);
+	while (result[engine->rx_tail].status && !engine->rx_overrun) {
+		/* EOP bit set in result? */
+		if (result[engine->rx_tail].status & RX_STATUS_EOP)
+			eop_count++;
+
+		/* increment tail pointer */
+		engine->rx_tail = (engine->rx_tail + 1) % CYCLIC_RX_PAGES_MAX;
+
+		dbg_tfr("%s, head %d, tail %d, 0x%x, len 0x%x.\n",
+			engine->name, engine->rx_head, engine->rx_tail,
+			result[engine->rx_tail].status,
+			result[engine->rx_tail].length);
+
+		/* overrun? */
+		if (engine->rx_tail == engine->rx_head) {
+			dbg_tfr("%s: overrun\n", engine->name);
+			/* flag to user space that overrun has occurred */
+			engine->rx_overrun = 1;
+		}
+	}
+
+	return eop_count;
+}
+
+static int engine_service_cyclic_polled(struct xdma_engine *engine)
+{
+	int eop_count = 0;
+	int rc = 0;
+	struct xdma_poll_wb *writeback_data;
+	u32 sched_limit = 0;
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	writeback_data = (struct xdma_poll_wb *)engine->poll_mode_addr_virt;
+
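+	/* busy-poll the result ring, yielding the CPU every NUM_POLLS_PER_SCHED polls */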
+	while (eop_count == 0) {
+		if (sched_limit != 0) {
+			if ((sched_limit % NUM_POLLS_PER_SCHED) == 0)
+				schedule();
+		}
+		sched_limit++;
+
+		/* Monitor descriptor writeback address for errors */
+		if ((writeback_data->completed_desc_count) & WB_ERR_MASK) {
+			rc = -EINVAL;
+			break;
+		}
+
+		eop_count = engine_ring_process(engine);
+	}
+
+	if (eop_count == 0) {
+		engine_status_read(engine, 1, 0);
+		if ((engine->running) && !(engine->status & XDMA_STAT_BUSY)) {
+			/* transfers on queue? */
+			if (!list_empty(&engine->transfer_list))
+				engine_transfer_dequeue(engine);
+
+			engine_service_shutdown(engine);
+		}
+	}
+
+	return rc;
+}
+
+static int engine_service_cyclic_interrupt(struct xdma_engine *engine)
+{
+	int eop_count = 0;
+	struct xdma_transfer *xfer;
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	engine_status_read(engine, 1, 0);
+
+	eop_count = engine_ring_process(engine);
+	/*
+	 * wake any reader on EOP, as one or more packets are now in
+	 * the RX buffer
+	 */
+	xfer = &engine->cyclic_req->xfer;
+	if (enable_credit_mp)
+		wake_up(&xfer->wq);
+	else {
+		if (eop_count > 0) {
+			/* awake task on transfer's wait queue */
+			dbg_tfr("wake_up() due to %d EOP's\n", eop_count);
+			engine->eop_found = 1;
+			wake_up(&xfer->wq);
+		}
+	}
+
+	/* engine was running but is no longer busy? */
+	if ((engine->running) && !(engine->status & XDMA_STAT_BUSY)) {
+		/* transfers on queue? */
+		if (!list_empty(&engine->transfer_list))
+			engine_transfer_dequeue(engine);
+
+		engine_service_shutdown(engine);
+	}
+
+	return 0;
+}
+
+/* must be called with engine->lock already acquired */
+static int engine_service_cyclic(struct xdma_engine *engine)
+{
+	int rc = 0;
+
+	dbg_tfr("engine_service_cyclic()\n");
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	if (poll_mode)
+		rc = engine_service_cyclic_polled(engine);
+	else
+		rc = engine_service_cyclic_interrupt(engine);
+
+	return rc;
+}
+
+
+static void engine_service_resume(struct xdma_engine *engine)
+{
+	struct xdma_transfer *transfer_started;
+
+	BUG_ON(!engine);
+
+	/* engine stopped? */
+	if (!engine->running) {
+		/* in the case of shutdown, let it finish what's in the Q */
+		if (!list_empty(&engine->transfer_list)) {
+			/* (re)start engine */
+			transfer_started = engine_start(engine);
+			dbg_tfr("re-started %s engine with pending xfer 0x%p\n",
+				engine->name, transfer_started);
+		/* engine was requested to be shutdown? */
+		} else if (engine->shutdown & ENGINE_SHUTDOWN_REQUEST) {
+			engine->shutdown |= ENGINE_SHUTDOWN_IDLE;
+			/* awake task on engine's shutdown wait queue */
+			wake_up(&engine->shutdown_wq);
+		} else {
+			dbg_tfr("no pending transfers, %s engine stays idle.\n",
+				engine->name);
+		}
+	} else {
+		/* engine is still running? */
+		if (list_empty(&engine->transfer_list)) {
+			pr_warn("no queued transfers but %s engine running!\n",
+				engine->name);
+			WARN_ON(1);
+		}
+	}
+}
+
+/**
+ * engine_service() - service an SG DMA engine
+ *
+ * must be called with engine->lock already acquired
+ *
+ * @engine pointer to struct xdma_engine
+ *
+ */
+static int engine_service(struct xdma_engine *engine, int desc_writeback)
+{
+	struct xdma_transfer *transfer = NULL;
+	u32 desc_count = desc_writeback & WB_COUNT_MASK;
+	u32 err_flag = desc_writeback & WB_ERR_MASK;
+	int rv = 0;
+	struct xdma_poll_wb *wb_data;
+
+	BUG_ON(!engine);
+
+	/* If polling detected an error, signal to the caller */
+	if (err_flag)
+		rv = -EINVAL;
+
+	/* Service the engine */
+	if (!engine->running) {
+		dbg_tfr("Engine was not running!!! Clearing status\n");
+		engine_status_read(engine, 1, 0);
+		return 0;
+	}
+
+	/*
+	 * If called by the ISR or polling detected an error, read and clear
+	 * engine status. For polled mode descriptor completion, this read is
+	 * unnecessary and is skipped to reduce latency
+	 */
+	if ((desc_count == 0) || (err_flag != 0))
+		engine_status_read(engine, 1, 0);
+
+	/*
+	 * engine was running but is no longer busy, or writeback occurred,
+	 * shut down
+	 */
+	if ((engine->running && !(engine->status & XDMA_STAT_BUSY)) ||
+		(desc_count != 0))
+		engine_service_shutdown(engine);
+
+	/*
+	 * If called from the ISR, or if an error occurred, the descriptor
+	 * count will be zero.  In this scenario, read the descriptor count
+	 * from HW.  In polled mode descriptor completion, this read is
+	 * unnecessary and is skipped to reduce latency
+	 */
+	if (!desc_count)
+		desc_count = read_register(&engine->regs->completed_desc_count);
+	dbg_tfr("desc_count = %d\n", desc_count);
+
+	/* transfers on queue? */
+	if (!list_empty(&engine->transfer_list)) {
+		/* pick first transfer on queue (was submitted to the engine) */
+		transfer = list_entry(engine->transfer_list.next,
+				struct xdma_transfer, entry);
+
+		dbg_tfr("head of queue transfer 0x%p has %d descriptors\n",
+			transfer, (int)transfer->desc_num);
+
+		dbg_tfr("Engine completed %d desc, %d not yet dequeued\n",
+			(int)desc_count,
+			(int)desc_count - engine->desc_dequeued);
+
+		engine_service_perf(engine, desc_count);
+	}
+
+	if (transfer) {
+		/*
+		 * account for already dequeued transfers during this engine
+		 * run
+		 */
+		desc_count -= engine->desc_dequeued;
+
+		/* Process all but the last transfer */
+		transfer = engine_service_transfer_list(engine, transfer,
+			&desc_count);
+
+		/*
+		 * Process final transfer - includes checks of number of
+		 * descriptors to detect faulty completion
+		 */
+		transfer = engine_service_final_transfer(engine, transfer,
+			&desc_count);
+	}
+
+	/* Before starting engine again, clear the writeback data */
+	if (poll_mode) {
+		wb_data = (struct xdma_poll_wb *)engine->poll_mode_addr_virt;
+		wb_data->completed_desc_count = 0;
+	}
+
+	/* Restart the engine following the servicing */
+	engine_service_resume(engine);
+
+	return 0;
+}
+
+/* engine_service_work */
+static void engine_service_work(struct work_struct *work)
+{
+	struct xdma_engine *engine;
+	unsigned long flags;
+
+	engine = container_of(work, struct xdma_engine, work);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	/* lock the engine */
+	spin_lock_irqsave(&engine->lock, flags);
+
+	dbg_tfr("engine_service() for %s engine %p\n",
+		engine->name, engine);
+	if (engine->cyclic_req)
+		engine_service_cyclic(engine);
+	else
+		engine_service(engine, 0);
+
+	/* re-enable interrupts for this engine */
+	if (engine->xdev->msix_enabled) {
+		write_register(engine->interrupt_enable_mask_value,
+			       &engine->regs->interrupt_enable_mask_w1s,
+			(unsigned long)(&engine->regs->interrupt_enable_mask_w1s) -
+			(unsigned long)(&engine->regs));
+	} else
+		channel_interrupts_enable(engine->xdev, engine->irq_bitmask);
+
+	/* unlock the engine */
+	spin_unlock_irqrestore(&engine->lock, flags);
+}
+
+static u32 engine_service_wb_monitor(struct xdma_engine *engine,
+	u32 expected_wb)
+{
+	struct xdma_poll_wb *wb_data;
+	u32 desc_wb = 0;
+	u32 sched_limit = 0;
+	unsigned long timeout;
+
+	BUG_ON(!engine);
+	wb_data = (struct xdma_poll_wb *)engine->poll_mode_addr_virt;
+
+	/*
+	 * Poll the writeback location for the expected number of
+	 * descriptors / error events. This loop is skipped for cyclic mode,
+	 * where the expected_desc_count passed in is zero, since it cannot be
+	 * determined before the function is called.
+	 */
+
+	timeout = jiffies + (POLL_TIMEOUT_SECONDS * HZ);
+	while (expected_wb != 0) {
+		desc_wb = wb_data->completed_desc_count;
+
+		if (desc_wb & WB_ERR_MASK)
+			break;
+		else if (desc_wb == expected_wb)
+			break;
+
+		/* RTO - prevent system from hanging in polled mode */
+		if (time_after(jiffies, timeout)) {
+			dbg_tfr("Polling timeout occurred");
+			dbg_tfr("desc_wb = 0x%08x, expected 0x%08x\n", desc_wb,
+				expected_wb);
+			if ((desc_wb & WB_COUNT_MASK) > expected_wb)
+				desc_wb = expected_wb | WB_ERR_MASK;
+
+			break;
+		}
+
+		/*
+		 * Define NUM_POLLS_PER_SCHED to limit how much time is spent
+		 * in the scheduler
+		 */
+
+		if (sched_limit != 0) {
+			if ((sched_limit % NUM_POLLS_PER_SCHED) == 0)
+				schedule();
+		}
+		sched_limit++;
+	}
+
+	return desc_wb;
+}
+
+static int engine_service_poll(struct xdma_engine *engine,
+		u32 expected_desc_count)
+{
+	struct xdma_poll_wb *writeback_data;
+	u32 desc_wb = 0;
+	unsigned long flags;
+	int rv = 0;
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	writeback_data = (struct xdma_poll_wb *)engine->poll_mode_addr_virt;
+
+	if ((expected_desc_count & WB_COUNT_MASK) != expected_desc_count) {
+		dbg_tfr("Queued descriptor count is larger than supported\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Poll the writeback location for the expected number of
+	 * descriptors / error events. This polling is skipped for cyclic mode,
+	 * where the expected_desc_count passed in is zero, since it cannot be
+	 * determined before the function is called.
+	 */
+
+	desc_wb = engine_service_wb_monitor(engine, expected_desc_count);
+
+	spin_lock_irqsave(&engine->lock, flags);
+	dbg_tfr("%s service.\n", engine->name);
+	if (engine->cyclic_req)
+		rv = engine_service_cyclic(engine);
+	else
+		rv = engine_service(engine, desc_wb);
+
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	return rv;
+}
+
+static irqreturn_t user_irq_service(int irq, struct xdma_user_irq *user_irq)
+{
+	unsigned long flags;
+
+	BUG_ON(!user_irq);
+
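+	/* a handler registered for this user IRQ takes precedence over the event wait queue */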
+	if (user_irq->handler)
+		return user_irq->handler(user_irq->user_idx, user_irq->dev);
+
+	spin_lock_irqsave(&(user_irq->events_lock), flags);
+	if (!user_irq->events_irq) {
+		user_irq->events_irq = 1;
+		wake_up(&(user_irq->events_wq));
+	}
+	spin_unlock_irqrestore(&(user_irq->events_lock), flags);
+
+	return IRQ_HANDLED;
+}
+
+/*
+ * xdma_isr() - Interrupt handler
+ *
+ * @dev_id pointer to xdma_dev
+ */
+static irqreturn_t xdma_isr(int irq, void *dev_id)
+{
+	u32 ch_irq;
+	u32 user_irq;
+	u32 mask;
+	struct xdma_dev *xdev;
+	struct interrupt_regs *irq_regs;
+
+	dbg_irq("(irq=%d, dev 0x%p) <<<< ISR.\n", irq, dev_id);
+	BUG_ON(!dev_id);
+	xdev = (struct xdma_dev *)dev_id;
+
+	if (!xdev) {
+		WARN_ON(!xdev);
+		dbg_irq("(irq=%d) xdev=%p ??\n", irq, xdev);
+		return IRQ_NONE;
+	}
+
+	irq_regs = (struct interrupt_regs *)(xdev->bar[xdev->config_bar_idx] +
+			XDMA_OFS_INT_CTRL);
+
+	/* read channel interrupt requests */
+	ch_irq = read_register(&irq_regs->channel_int_request);
+	dbg_irq("ch_irq = 0x%08x\n", ch_irq);
+
+	/*
+	 * disable all interrupts that fired; these are re-enabled individually
+	 * after the causing module has been fully serviced.
+	 */
+	if (ch_irq)
+		channel_interrupts_disable(xdev, ch_irq);
+
+	/* read user interrupts - this read also flushes the above write */
+	user_irq = read_register(&irq_regs->user_int_request);
+	dbg_irq("user_irq = 0x%08x\n", user_irq);
+
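+	/* dispatch each pending user interrupt to its service routine */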
+	if (user_irq) {
+		int user = 0;
+		u32 mask = 1;
+		int max = xdev->user_max;
+
+		for (; user < max && user_irq; user++, mask <<= 1) {
+			if (user_irq & mask) {
+				user_irq &= ~mask;
+				user_irq_service(irq, &xdev->user_irq[user]);
+			}
+		}
+	}
+
+	mask = ch_irq & xdev->mask_irq_h2c;
+	if (mask) {
+		int channel = 0;
+		int max = xdev->h2c_channel_max;
+
+		/* iterate over H2C (PCIe read) */
+		for (channel = 0; channel < max && mask; channel++) {
+			struct xdma_engine *engine = &xdev->engine_h2c[channel];
+
+			/* engine present and its interrupt fired? */
+			if ((engine->irq_bitmask & mask) &&
+			   (engine->magic == MAGIC_ENGINE)) {
+				mask &= ~engine->irq_bitmask;
+				dbg_tfr("schedule_work, %s.\n", engine->name);
+				schedule_work(&engine->work);
+			}
+		}
+	}
+
+	mask = ch_irq & xdev->mask_irq_c2h;
+	if (mask) {
+		int channel = 0;
+		int max = xdev->c2h_channel_max;
+
+		/* iterate over C2H (PCIe write) */
+		for (channel = 0; channel < max && mask; channel++) {
+			struct xdma_engine *engine = &xdev->engine_c2h[channel];
+
+			/* engine present and its interrupt fired? */
+			if ((engine->irq_bitmask & mask) &&
+			   (engine->magic == MAGIC_ENGINE)) {
+				mask &= ~engine->irq_bitmask;
+				dbg_tfr("schedule_work, %s.\n", engine->name);
+				schedule_work(&engine->work);
+			}
+		}
+	}
+
+	xdev->irq_count++;
+	return IRQ_HANDLED;
+}
+
+/*
+ * xdma_user_irq() - Interrupt handler for user interrupts in MSI-X mode
+ *
+ * @dev_id pointer to xdma_dev
+ */
+static irqreturn_t xdma_user_irq(int irq, void *dev_id)
+{
+	struct xdma_user_irq *user_irq;
+
+	dbg_irq("(irq=%d) <<<< INTERRUPT SERVICE ROUTINE\n", irq);
+
+	BUG_ON(!dev_id);
+	user_irq = (struct xdma_user_irq *)dev_id;
+
+	return  user_irq_service(irq, user_irq);
+}
+
+/*
+ * xdma_channel_irq() - Interrupt handler for channel interrupts in MSI-X mode
+ *
+ * @dev_id pointer to xdma_dev
+ */
+static irqreturn_t xdma_channel_irq(int irq, void *dev_id)
+{
+	struct xdma_dev *xdev;
+	struct xdma_engine *engine;
+	struct interrupt_regs *irq_regs;
+
+	dbg_irq("(irq=%d) <<<< INTERRUPT SERVICE ROUTINE\n", irq);
+	BUG_ON(!dev_id);
+
+	engine = (struct xdma_engine *)dev_id;
+	xdev = engine->xdev;
+
+	if (!xdev) {
+		WARN_ON(!xdev);
+		dbg_irq("(irq=%d) xdev=%p ??\n", irq, xdev);
+		return IRQ_NONE;
+	}
+
+	irq_regs = (struct interrupt_regs *)(xdev->bar[xdev->config_bar_idx] +
+			XDMA_OFS_INT_CTRL);
+
+	/* Disable the interrupt for this engine */
+	write_register(engine->interrupt_enable_mask_value,
+			&engine->regs->interrupt_enable_mask_w1c,
+			(unsigned long)
+			(&engine->regs->interrupt_enable_mask_w1c) -
+			(unsigned long)(&engine->regs));
+	/* Dummy read to flush the above write */
+	read_register(&irq_regs->channel_int_pending);
+	/* Schedule the bottom half */
+	schedule_work(&engine->work);
+
+	/*
+	 * RTO - need to protect access here if multiple MSI-X are used for
+	 * user interrupts
+	 */
+	xdev->irq_count++;
+	return IRQ_HANDLED;
+}
+
+/*
+ * Unmap the BAR regions that had been mapped earlier using map_bars()
+ */
+static void unmap_bars(struct xdma_dev *xdev, struct pci_dev *dev)
+{
+	int i;
+
+	for (i = 0; i < XDMA_BAR_NUM; i++) {
+		/* is this BAR mapped? */
+		if (xdev->bar[i]) {
+			/* unmap BAR */
+			iounmap(xdev->bar[i]);
+			/* mark as unmapped */
+			xdev->bar[i] = NULL;
+		}
+	}
+}
+
+static int map_single_bar(struct xdma_dev *xdev, struct pci_dev *dev, int idx)
+{
+	resource_size_t bar_start;
+	resource_size_t bar_len;
+	resource_size_t map_len;
+
+	bar_start = pci_resource_start(dev, idx);
+	bar_len = pci_resource_len(dev, idx);
+	map_len = bar_len;
+
+	xdev->bar[idx] = NULL;
+
+	/*
+	 * do not map
+	 * BARs with length 0. Note that start MAY be 0!
+	 * P2P bar (size >= 256M)
+	 */
+	pr_info("map bar %d, len %lld\n", idx, (u64)bar_len);
+	if (!bar_len || bar_len >= (1 << 28))
+		return 0;
+
+	/*
+	 * bail out if the BAR region is already claimed by another driver;
+	 * the region is only probed for availability here and released below
+	 */
+	if (!request_mem_region(bar_start, bar_len, xdev->mod_name))
+		return 0;
+
+	release_mem_region(bar_start, bar_len);
+	/* BAR size exceeds maximum desired mapping? */
+	if (bar_len > INT_MAX) {
+		pr_info("Limit BAR %d mapping from %llu to %d bytes\n", idx,
+			(u64)bar_len, INT_MAX);
+		map_len = (resource_size_t)INT_MAX;
+	}
+	/*
+	 * map the full device memory or IO region into kernel virtual
+	 * address space
+	 */
+	dbg_init("BAR%d: %llu bytes to be mapped.\n", idx, (u64)map_len);
+	xdev->bar[idx] = ioremap_nocache(pci_resource_start(dev, idx),
+			 map_len);
+
+	if (!xdev->bar[idx]) {
+		pr_info("Could not map BAR %d.\n", idx);
+		return -EINVAL;
+	}
+
+	pr_info("BAR%d at 0x%llx mapped at 0x%p, length=%llu(/%llu)\n", idx,
+		(u64)bar_start, xdev->bar[idx], (u64)map_len, (u64)bar_len);
+
+	return (int)map_len;
+}
+
+static int is_config_bar(struct xdma_dev *xdev, int idx)
+{
+	u32 irq_id = 0;
+	u32 cfg_id = 0;
+	int flag = 0;
+	u32 mask = 0xffff0000; /* compare only the XDMA IDs, not the version number */
+	struct interrupt_regs *irq_regs =
+		(struct interrupt_regs *) (xdev->bar[idx] + XDMA_OFS_INT_CTRL);
+	struct config_regs *cfg_regs =
+		(struct config_regs *)(xdev->bar[idx] + XDMA_OFS_CONFIG);
+
+	if (!xdev->bar[idx])
+		return 0;
+
+	irq_id = read_register(&irq_regs->identifier);
+	cfg_id = read_register(&cfg_regs->identifier);
+
+	if (((irq_id & mask) == IRQ_BLOCK_ID) &&
+	    ((cfg_id & mask) == CONFIG_BLOCK_ID)) {
+		dbg_init("BAR %d is the XDMA config BAR\n", idx);
+		flag = 1;
+	} else {
+		dbg_init("BAR %d is NOT the XDMA config BAR: 0x%x, 0x%x.\n",
+			idx, irq_id, cfg_id);
+		flag = 0;
+	}
+
+	return flag;
+}
+
+static void identify_bars(struct xdma_dev *xdev, int *bar_id_list, int num_bars,
+			int config_bar_pos)
+{
+	/*
+	 * The following logic identifies which BARs contain what functionality
+	 * based on the position of the XDMA config BAR and the number of BARs
+	 * detected. The rules are that the user logic and bypass logic BARs
+	 * are optional.  When both are present, the XDMA config BAR will be the
+	 * 2nd BAR detected (config_bar_pos = 1), with the user logic being
+	 * detected first and the bypass being detected last. When one is
+	 * omitted, the type of BAR present can be identified by whether the
+	 * XDMA config BAR is detected first or last.  When both are omitted,
+	 * only the XDMA config BAR is present.  This somewhat convoluted
+	 * approach is used instead of relying on BAR numbers in order to work
+	 * correctly with both 32-bit and 64-bit BARs.
+	 */
+
+	BUG_ON(!xdev);
+	BUG_ON(!bar_id_list);
+
+	pr_info("xdev 0x%p, bars %d, config at %d.\n",
+		xdev, num_bars, config_bar_pos);
+
+	switch (num_bars) {
+	case 1:
+		/* Only one BAR present - no extra work necessary */
+		break;
+
+	case 2:
+		if (config_bar_pos == 0) {
+			xdev->bypass_bar_idx = bar_id_list[1];
+		} else if (config_bar_pos == 1) {
+			xdev->user_bar_idx = bar_id_list[0];
+		} else {
+			pr_info("2, XDMA config BAR unexpected %d.\n",
+				config_bar_pos);
+		}
+		break;
+
+	case 3:
+	case 4:
+		if ((config_bar_pos == 1) || (config_bar_pos == 2)) {
+			/* user bar at bar #0 */
+			xdev->user_bar_idx = bar_id_list[0];
+			/* bypass bar at the last bar */
+			xdev->bypass_bar_idx = bar_id_list[num_bars - 1];
+		} else {
+			pr_info("3/4, XDMA config BAR unexpected %d.\n",
+				config_bar_pos);
+		}
+		break;
+
+	default:
+		/* Should not occur - warn user but safe to continue */
+		pr_info("Unexpected # BARs (%d), XDMA config BAR only.\n",
+			num_bars);
+		break;
+
+	}
+	pr_info("%d BARs: config %d, user %d, bypass %d.\n",
+		num_bars, config_bar_pos, xdev->user_bar_idx,
+		xdev->bypass_bar_idx);
+}
+
+/* map_bars() -- map device regions into kernel virtual address space
+ *
+ * Map the device memory regions into kernel virtual address space after
+ * verifying their sizes respect the minimum sizes needed
+ */
+static int map_bars(struct xdma_dev *xdev, struct pci_dev *dev)
+{
+	int rv;
+	int i;
+	int bar_id_list[XDMA_BAR_NUM];
+	int bar_id_idx = 0;
+	int config_bar_pos = 0;
+
+	/* iterate through all the BARs */
+	for (i = 0; i < XDMA_BAR_NUM; i++) {
+		int bar_len;
+
+		bar_len = map_single_bar(xdev, dev, i);
+		if (bar_len == 0) {
+			continue;
+		} else if (bar_len < 0) {
+			rv = -EINVAL;
+			goto fail;
+		}
+
+		/* Try to identify BAR as XDMA control BAR */
+		if ((bar_len >= XDMA_BAR_SIZE) && (xdev->config_bar_idx < 0)) {
+
+			if (is_config_bar(xdev, i)) {
+				xdev->config_bar_idx = i;
+				config_bar_pos = bar_id_idx;
+				pr_info("config bar %d, pos %d.\n",
+					xdev->config_bar_idx, config_bar_pos);
+			}
+		}
+
+		bar_id_list[bar_id_idx] = i;
+		bar_id_idx++;
+	}
+
+	/* The XDMA config BAR must always be present */
+	if (xdev->config_bar_idx < 0) {
+		pr_info("Failed to detect XDMA config BAR\n");
+		rv = -EINVAL;
+		goto fail;
+	}
+
+	identify_bars(xdev, bar_id_list, bar_id_idx, config_bar_pos);
+
+	/* successfully mapped all required BAR regions */
+	return 0;
+
+fail:
+	/* unwind; unmap any BARs that we did map */
+	unmap_bars(xdev, dev);
+	return rv;
+}
+
+/*
+ * MSI-X interrupt:
+ *	<h2c+c2h channel_max> vectors, followed by <user_max> vectors
+ */
+
+/*
+ * RTO - code to detect if MSI/MSI-X capability exists is derived
+ * from linux/pci/msi.c - pci_msi_check_device
+ */
+
+#ifndef arch_msi_check_device
+int arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
+{
+	return 0;
+}
+#endif
+
+/* type = PCI_CAP_ID_MSI or PCI_CAP_ID_MSIX */
+static int msi_msix_capable(struct pci_dev *dev, int type)
+{
+	struct pci_bus *bus;
+	int ret;
+
+	if (!dev || dev->no_msi)
+		return 0;
+
+	for (bus = dev->bus; bus; bus = bus->parent)
+		if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI)
+			return 0;
+
+	ret = arch_msi_check_device(dev, 1, type);
+	if (ret)
+		return 0;
+
+	if (!pci_find_capability(dev, type))
+		return 0;
+
+	return 1;
+}
+
+static void disable_msi_msix(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	if (xdev->msix_enabled) {
+		pci_disable_msix(pdev);
+		xdev->msix_enabled = 0;
+	} else if (xdev->msi_enabled) {
+		pci_disable_msi(pdev);
+		xdev->msi_enabled = 0;
+	}
+}
+
+static int enable_msi_msix(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	int rv = 0;
+
+	BUG_ON(!xdev);
+	BUG_ON(!pdev);
+
+	if (!interrupt_mode && msi_msix_capable(pdev, PCI_CAP_ID_MSIX)) {
+		int req_nvec = xdev->c2h_channel_max + xdev->h2c_channel_max +
+				 xdev->user_max;
+
+		dbg_init("Enabling MSI-X\n");
+		rv = pci_alloc_irq_vectors(pdev, req_nvec, req_nvec,
+					PCI_IRQ_MSIX);
+		if (rv < 0)
+			dbg_init("Couldn't enable MSI-X mode: %d\n", rv);
+
+		xdev->msix_enabled = 1;
+
+	} else if (interrupt_mode == 1 &&
+		   msi_msix_capable(pdev, PCI_CAP_ID_MSI)) {
+		/* enable message signalled interrupts */
+		dbg_init("pci_enable_msi()\n");
+		rv = pci_enable_msi(pdev);
+		if (rv < 0)
+			dbg_init("Couldn't enable MSI mode: %d\n", rv);
+		xdev->msi_enabled = 1;
+
+	} else {
+		dbg_init("MSI/MSI-X not detected - using legacy interrupts\n");
+	}
+
+	return rv;
+}
+
+static void pci_check_intr_pend(struct pci_dev *pdev)
+{
+	u16 v;
+
+	pci_read_config_word(pdev, PCI_STATUS, &v);
+	if (v & PCI_STATUS_INTERRUPT) {
+		pr_info("%s PCI STATUS Interrupt pending 0x%x.\n",
+			dev_name(&pdev->dev), v);
+		pci_write_config_word(pdev, PCI_STATUS, PCI_STATUS_INTERRUPT);
+	}
+}
+
+static void pci_keep_intx_enabled(struct pci_dev *pdev)
+{
+	/* workaround for a h/w bug:
+	 * when MSI-X/MSI becomes unavailable the device falls back to legacy
+	 * interrupts, but the legacy (INTx) enable bit was never checked.
+	 * If INTx is disabled the interrupt can never be acknowledged and
+	 * everything gets stuck, so make sure INTx stays enabled here.
+	 */
+	u16 pcmd, pcmd_new;
+
+	pci_read_config_word(pdev, PCI_COMMAND, &pcmd);
+	pcmd_new = pcmd & ~PCI_COMMAND_INTX_DISABLE;
+	if (pcmd_new != pcmd) {
+		pr_info("%s: clear INTX_DISABLE, 0x%x -> 0x%x.\n",
+			dev_name(&pdev->dev), pcmd, pcmd_new);
+		pci_write_config_word(pdev, PCI_COMMAND, pcmd_new);
+	}
+}
+
+static void prog_irq_msix_user(struct xdma_dev *xdev, bool clear)
+{
+	/* user */
+	struct interrupt_regs *int_regs = (struct interrupt_regs *)
+					(xdev->bar[xdev->config_bar_idx] +
+					 XDMA_OFS_INT_CTRL);
+	u32 i = xdev->c2h_channel_max + xdev->h2c_channel_max;
+	u32 max = i + xdev->user_max;
+	int j;
+
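+	/* each 32-bit vector register holds up to four vector numbers, one per byte */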
+	for (j = 0; i < max; j++) {
+		u32 val = 0;
+		int k;
+		int shift = 0;
+
+		if (clear)
+			i += 4;
+		else
+			for (k = 0; k < 4 && i < max; i++, k++, shift += 8)
+				val |= (i & 0x1f) << shift;
+
+		write_register(val, &int_regs->user_msi_vector[j],
+			XDMA_OFS_INT_CTRL +
+			((unsigned long)&int_regs->user_msi_vector[j] -
+			 (unsigned long)int_regs));
+
+		dbg_init("vector %d, 0x%x.\n", j, val);
+	}
+}
+
+static void prog_irq_msix_channel(struct xdma_dev *xdev, bool clear)
+{
+	struct interrupt_regs *int_regs = (struct interrupt_regs *)
+					(xdev->bar[xdev->config_bar_idx] +
+					 XDMA_OFS_INT_CTRL);
+	u32 max = xdev->c2h_channel_max + xdev->h2c_channel_max;
+	u32 i;
+	int j;
+
+	/* engine */
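+	/* channel vectors are likewise packed four to a 32-bit register */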
+	for (i = 0, j = 0; i < max; j++) {
+		u32 val = 0;
+		int k;
+		int shift = 0;
+
+		if (clear)
+			i += 4;
+		else
+			for (k = 0; k < 4 && i < max; i++, k++, shift += 8)
+				val |= (i & 0x1f) << shift;
+
+		write_register(val, &int_regs->channel_msi_vector[j],
+			XDMA_OFS_INT_CTRL +
+			((unsigned long)&int_regs->channel_msi_vector[j] -
+			 (unsigned long)int_regs));
+		dbg_init("vector %d, 0x%x.\n", j, val);
+	}
+}
+
+static void irq_msix_channel_teardown(struct xdma_dev *xdev)
+{
+	struct xdma_engine *engine;
+	int j = 0;
+	int i = 0;
+
+	if (!xdev->msix_enabled)
+		return;
+
+	prog_irq_msix_channel(xdev, 1);
+
+	engine = xdev->engine_h2c;
+	for (i = 0; i < xdev->h2c_channel_max; i++, j++, engine++) {
+		if (!engine->msix_irq_line)
+			break;
+		dbg_sg("Release IRQ#%d for engine %p\n", engine->msix_irq_line,
+			engine);
+		free_irq(engine->msix_irq_line, engine);
+	}
+
+	engine = xdev->engine_c2h;
+	for (i = 0; i < xdev->c2h_channel_max; i++, j++, engine++) {
+		if (!engine->msix_irq_line)
+			break;
+		dbg_sg("Release IRQ#%d for engine %p\n", engine->msix_irq_line,
+			engine);
+		free_irq(engine->msix_irq_line, engine);
+	}
+}
+
+static int irq_msix_channel_setup(struct xdma_dev *xdev)
+{
+	int i;
+	int j = xdev->h2c_channel_max;
+	int rv = 0;
+	u32 vector;
+	struct xdma_engine *engine;
+
+	BUG_ON(!xdev);
+	if (!xdev->msix_enabled)
+		return 0;
+
+	engine = xdev->engine_h2c;
+	for (i = 0; i < xdev->h2c_channel_max; i++, engine++) {
+		vector = pci_irq_vector(xdev->pdev, i);
+		rv = request_irq(vector, xdma_channel_irq, 0, xdev->mod_name,
+				 engine);
+		if (rv) {
+			pr_info("request irq#%d failed %d, engine %s.\n",
+				vector, rv, engine->name);
+			return rv;
+		}
+		pr_info("engine %s, irq#%d.\n", engine->name, vector);
+		engine->msix_irq_line = vector;
+	}
+
+	engine = xdev->engine_c2h;
+	for (i = 0; i < xdev->c2h_channel_max; i++, j++, engine++) {
+		vector = pci_irq_vector(xdev->pdev, j);
+		rv = request_irq(vector, xdma_channel_irq, 0, xdev->mod_name,
+				 engine);
+		if (rv) {
+			pr_info("request irq#%d failed %d, engine %s.\n",
+				vector, rv, engine->name);
+			return rv;
+		}
+		pr_info("engine %s, irq#%d.\n", engine->name, vector);
+		engine->msix_irq_line = vector;
+	}
+
+	return 0;
+}
+
+static void irq_msix_user_teardown(struct xdma_dev *xdev)
+{
+	int i;
+	int j = xdev->h2c_channel_max + xdev->c2h_channel_max;
+
+	BUG_ON(!xdev);
+
+	if (!xdev->msix_enabled)
+		return;
+
+	prog_irq_msix_user(xdev, 1);
+
+	for (i = 0; i < xdev->user_max; i++, j++) {
+		u32 vector = pci_irq_vector(xdev->pdev, j);
+
+		dbg_init("user %d, releasing IRQ#%d\n", i, vector);
+		free_irq(vector, &xdev->user_irq[i]);
+	}
+}
+
+static int irq_msix_user_setup(struct xdma_dev *xdev)
+{
+	int i;
+	int j = xdev->h2c_channel_max + xdev->c2h_channel_max;
+	int rv = 0;
+
+	/* vectors set in probe_scan_for_msi() */
+	for (i = 0; i < xdev->user_max; i++, j++) {
+		u32 vector = pci_irq_vector(xdev->pdev, j);
+
+		rv = request_irq(vector, xdma_user_irq, 0, xdev->mod_name,
+				&xdev->user_irq[i]);
+		if (rv) {
+			pr_info("user %d couldn't use IRQ#%d, %d\n",
+				i, vector, rv);
+			break;
+		}
+		pr_info("%d-USR-%d, IRQ#%d with 0x%p\n", xdev->idx, i, vector,
+			&xdev->user_irq[i]);
+	}
+
+	/* If any errors occur, free IRQs that were successfully requested */
+	if (rv) {
+		for (i--, j--; i >= 0; i--, j--) {
+			u32 vector = pci_irq_vector(xdev->pdev, j);
+
+			free_irq(vector, &xdev->user_irq[i]);
+		}
+	}
+
+	return rv;
+}
+
+static int irq_msi_setup(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	int rv;
+
+	xdev->irq_line = (int)pdev->irq;
+	rv = request_irq(pdev->irq, xdma_isr, 0, xdev->mod_name, xdev);
+	if (rv)
+		dbg_init("Couldn't use IRQ#%d, %d\n", pdev->irq, rv);
+	else
+		dbg_init("Using IRQ#%d with 0x%p\n", pdev->irq, xdev);
+
+	return rv;
+}
+
+static int irq_legacy_setup(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	u32 w;
+	u8 val;
+	void *reg;
+	int rv;
+
+	pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &val);
+	dbg_init("Legacy Interrupt register value = %d\n", val);
+	if (val > 1) {
+		val--;
+		w = (val<<24) | (val<<16) | (val<<8) | val;
+		/* Program the IRQ Block channel vector and IRQ Block user
+		 * vector registers with the legacy interrupt value
+		 */
+		reg = xdev->bar[xdev->config_bar_idx] + 0x2080;	/* user vectors */
+		write_register(w, reg, 0x2080);
+		write_register(w, reg+0x4, 0x2084);
+		write_register(w, reg+0x8, 0x2088);
+		write_register(w, reg+0xC, 0x208C);
+		reg = xdev->bar[xdev->config_bar_idx] + 0x20A0;	/* channel vectors */
+		write_register(w, reg, 0x20A0);
+		write_register(w, reg+0x4, 0x20A4);
+	}
+
+	xdev->irq_line = (int)pdev->irq;
+	rv = request_irq(pdev->irq, xdma_isr, IRQF_SHARED, xdev->mod_name,
+			xdev);
+	if (rv)
+		dbg_init("Couldn't use IRQ#%d, %d\n", pdev->irq, rv);
+	else
+		dbg_init("Using IRQ#%d with 0x%p\n", pdev->irq, xdev);
+
+	return rv;
+}
+
+static void irq_teardown(struct xdma_dev *xdev)
+{
+	if (xdev->msix_enabled) {
+		irq_msix_channel_teardown(xdev);
+		irq_msix_user_teardown(xdev);
+	} else if (xdev->irq_line != -1) {
+		dbg_init("Releasing IRQ#%d\n", xdev->irq_line);
+		free_irq(xdev->irq_line, xdev);
+	}
+}
+
+static int irq_setup(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	pci_keep_intx_enabled(pdev);
+
+	if (xdev->msix_enabled) {
+		int rv = irq_msix_channel_setup(xdev);
+
+		if (rv)
+			return rv;
+		rv = irq_msix_user_setup(xdev);
+		if (rv)
+			return rv;
+		prog_irq_msix_channel(xdev, 0);
+		prog_irq_msix_user(xdev, 0);
+
+		return 0;
+	} else if (xdev->msi_enabled)
+		return irq_msi_setup(xdev, pdev);
+
+	return irq_legacy_setup(xdev, pdev);
+}
+
+#ifdef __LIBXDMA_DEBUG__
+static void dump_desc(struct xdma_desc *desc_virt)
+{
+	int j;
+	u32 *p = (u32 *)desc_virt;
+	static char * const field_name[] = {
+		"magic|extra_adjacent|control", "bytes", "src_addr_lo",
+		"src_addr_hi", "dst_addr_lo", "dst_addr_hi", "next_addr",
+		"next_addr_pad"};
+	const char *dummy;
+
+	/* remove warning about unused variable when debug printing is off */
+	dummy = field_name[0];
+
+	for (j = 0; j < 8; j += 1) {
+		pr_info("0x%08lx/0x%02lx: 0x%08x 0x%08x %s\n",
+			 (uintptr_t)p, (uintptr_t)p & 15, (int)*p,
+			 le32_to_cpu(*p), field_name[j]);
+		p++;
+	}
+	pr_info("\n");
+}
+
+static void transfer_dump(struct xdma_transfer *transfer)
+{
+	int i;
+	struct xdma_desc *desc_virt = transfer->desc_virt;
+
+	pr_info("xfer 0x%p, state 0x%x, f 0x%x, dir %d, len %u, last %d.\n",
+		transfer, transfer->state, transfer->flags, transfer->dir,
+		transfer->len, transfer->last_in_request);
+
+	pr_info("transfer 0x%p, desc %d, bus 0x%llx, adj %d.\n",
+		transfer, transfer->desc_num, (u64)transfer->desc_bus,
+		transfer->desc_adjacent);
+	for (i = 0; i < transfer->desc_num; i += 1)
+		dump_desc(desc_virt + i);
+}
+#endif /* __LIBXDMA_DEBUG__ */
+
+/* transfer_desc_init() - Initialize a transfer's pre-allocated descriptor array
+ *
+ * Chains the first 'count' descriptors as a singly-linked list; each
+ * descriptor's next pointer holds the bus address of the descriptor that
+ * follows it in the array, and the last descriptor's next pointer is zeroed.
+ *
+ * @transfer Pointer to the transfer owning the descriptor array
+ * @count Number of descriptors to chain
+ */
+static void transfer_desc_init(struct xdma_transfer *transfer, int count)
+{
+	struct xdma_desc *desc_virt = transfer->desc_virt;
+	dma_addr_t desc_bus = transfer->desc_bus;
+	int i;
+	int adj = count - 1;
+	int extra_adj;
+	u32 temp_control;
+
+	BUG_ON(count > XDMA_TRANSFER_MAX_DESC);
+
+	/* create singly-linked list for SG DMA controller */
+	for (i = 0; i < count - 1; i++) {
+		/* increment bus address to next in array */
+		desc_bus += sizeof(struct xdma_desc);
+
+		/* singly-linked list uses bus addresses */
+		desc_virt[i].next_lo = cpu_to_le32(PCI_DMA_L(desc_bus));
+		desc_virt[i].next_hi = cpu_to_le32(PCI_DMA_H(desc_bus));
+		desc_virt[i].bytes = cpu_to_le32(0);
+
+		/* any adjacent descriptors? */
+		if (adj > 0) {
+			extra_adj = adj - 1;
+			if (extra_adj > MAX_EXTRA_ADJ)
+				extra_adj = MAX_EXTRA_ADJ;
+
+			adj--;
+		} else {
+			extra_adj = 0;
+		}
+
+		temp_control = DESC_MAGIC | (extra_adj << 8);
+
+		desc_virt[i].control = cpu_to_le32(temp_control);
+	}
+	/* { i = number - 1 } */
+	/* zero the last descriptor next pointer */
+	desc_virt[i].next_lo = cpu_to_le32(0);
+	desc_virt[i].next_hi = cpu_to_le32(0);
+	desc_virt[i].bytes = cpu_to_le32(0);
+
+	temp_control = DESC_MAGIC;
+
+	desc_virt[i].control = cpu_to_le32(temp_control);
+}
+
+/* xdma_desc_link() - Link two descriptors
+ *
+ * Link the first descriptor to a second descriptor, or terminate the first.
+ *
+ * @first first descriptor
+ * @second second descriptor, or NULL if first descriptor must be set as last.
+ * @second_bus bus address of second descriptor
+ */
+static void xdma_desc_link(struct xdma_desc *first, struct xdma_desc *second,
+		dma_addr_t second_bus)
+{
+	/*
+	 * remember reserved control in first descriptor, but zero
+	 * extra_adjacent!
+	 */
+	 /* RTO - what's this about?  Shouldn't it be 0x0000c0ffUL? */
+	u32 control = le32_to_cpu(first->control) & 0x0000f0ffUL;
+	/* second descriptor given? */
+	if (second) {
+		/*
+		 * link last descriptor of 1st array to first descriptor of
+		 * 2nd array
+		 */
+		first->next_lo = cpu_to_le32(PCI_DMA_L(second_bus));
+		first->next_hi = cpu_to_le32(PCI_DMA_H(second_bus));
+		WARN_ON(first->next_hi);
+		/* no second descriptor given */
+	} else {
+		/* first descriptor is the last */
+		first->next_lo = 0;
+		first->next_hi = 0;
+	}
+	/* merge magic, extra_adjacent and control field */
+	control |= DESC_MAGIC;
+
+	/* write bytes and next_num */
+	first->control = cpu_to_le32(control);
+}
+
+/* xdma_desc_adjacent -- Set how many descriptors are adjacent to this one */
+static void xdma_desc_adjacent(struct xdma_desc *desc, int next_adjacent)
+{
+	int extra_adj = 0;
+	/* remember reserved and control bits */
+	u32 control = le32_to_cpu(desc->control) & 0x0000f0ffUL;
+	u32 max_adj_4k = 0;
+
+	if (next_adjacent > 0) {
+		extra_adj =  next_adjacent - 1;
+		if (extra_adj > MAX_EXTRA_ADJ)
+			extra_adj = MAX_EXTRA_ADJ;
+
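+		/* descriptors are 32 bytes each; do not let adjacency cross a 4 KB boundary */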
+		max_adj_4k = (0x1000 -
+			((le32_to_cpu(desc->next_lo)) & 0xFFF)) / 32 - 1;
+		if (extra_adj > max_adj_4k)
+			extra_adj = max_adj_4k;
+
+		if (extra_adj < 0) {
+			pr_info("Warning: extra_adj<0, converting it to 0\n");
+			extra_adj = 0;
+		}
+	}
+	/* merge adjacent and control field */
+	control |= 0xAD4B0000UL | (extra_adj << 8);
+	/* write control and next_adjacent */
+	desc->control = cpu_to_le32(control);
+}
+
+/* xdma_desc_control_set() - Set the complete control field of a descriptor. */
+static void xdma_desc_control_set(struct xdma_desc *first, u32 control_field)
+{
+	/* remember magic and adjacent number */
+	u32 control = le32_to_cpu(first->control) & ~(LS_BYTE_MASK);
+
+	BUG_ON(control_field & ~(LS_BYTE_MASK));
+	/* merge adjacent and control field */
+	control |= control_field;
+	/* write control and next_adjacent */
+	first->control = cpu_to_le32(control);
+}
+
+/* xdma_desc_control_clear() - Clear bits in the control field of a descriptor. */
+static void xdma_desc_control_clear(struct xdma_desc *first, u32 clear_mask)
+{
+	/* remember magic and adjacent number */
+	u32 control = le32_to_cpu(first->control);
+
+	BUG_ON(clear_mask & ~(LS_BYTE_MASK));
+
+	/* merge adjacent and control field */
+	control &= (~clear_mask);
+	/* write control and next_adjacent */
+	first->control = cpu_to_le32(control);
+}
+
+/* xdma_desc_done() - Recycle the cache-coherent descriptor array of a transfer
+ *
+ * Zeroes the full array of XDMA_TRANSFER_MAX_DESC descriptors so it can be
+ * reused for the next transfer.
+ *
+ * @desc_virt Pointer to (i.e. virtual address of) the first descriptor in the list
+ */
+static inline void xdma_desc_done(struct xdma_desc *desc_virt)
+{
+	memset(desc_virt, 0, XDMA_TRANSFER_MAX_DESC * sizeof(struct xdma_desc));
+}
+
+/* xdma_desc_set() - Fill a descriptor with the transfer details
+ *
+ * @desc pointer to descriptor to be filled
+ * @rc_bus_addr root complex (host) bus address
+ * @ep_addr end point (card) address
+ * @len number of bytes, must be a (non-negative) multiple of 4
+ * @dir DMA direction; for DMA_TO_DEVICE the source is host memory and the
+ * destination is the end point address, otherwise vice versa
+ *
+ * Does not modify the next pointer
+ */
+static void xdma_desc_set(struct xdma_desc *desc, dma_addr_t rc_bus_addr,
+		u64 ep_addr, int len, int dir)
+{
+	/* transfer length */
+	desc->bytes = cpu_to_le32(len);
+	if (dir == DMA_TO_DEVICE) {
+		/* read from root complex memory (source address) */
+		desc->src_addr_lo = cpu_to_le32(PCI_DMA_L(rc_bus_addr));
+		desc->src_addr_hi = cpu_to_le32(PCI_DMA_H(rc_bus_addr));
+		/* write to end point address (destination address) */
+		desc->dst_addr_lo = cpu_to_le32(PCI_DMA_L(ep_addr));
+		desc->dst_addr_hi = cpu_to_le32(PCI_DMA_H(ep_addr));
+	} else {
+		/* read from end point address (source address) */
+		desc->src_addr_lo = cpu_to_le32(PCI_DMA_L(ep_addr));
+		desc->src_addr_hi = cpu_to_le32(PCI_DMA_H(ep_addr));
+		/* write to root complex memory (destination address) */
+		desc->dst_addr_lo = cpu_to_le32(PCI_DMA_L(rc_bus_addr));
+		desc->dst_addr_hi = cpu_to_le32(PCI_DMA_H(rc_bus_addr));
+	}
+}
+
+/*
+ * must be called with engine->lock held
+ */
+static void transfer_abort(struct xdma_engine *engine,
+			struct xdma_transfer *transfer)
+{
+	struct xdma_transfer *head;
+
+	BUG_ON(!engine);
+	BUG_ON(!transfer);
+	BUG_ON(transfer->desc_num == 0);
+
+	pr_info("abort transfer 0x%p, desc %d, engine desc queued %d.\n",
+		transfer, transfer->desc_num, engine->desc_dequeued);
+
+	head = list_entry(engine->transfer_list.next, struct xdma_transfer,
+			entry);
+	if (head == transfer)
+		list_del(engine->transfer_list.next);
+	else
+		pr_info("engine %s, transfer 0x%p NOT found, 0x%p.\n",
+			engine->name, transfer, head);
+
+	if (transfer->state == TRANSFER_STATE_SUBMITTED)
+		transfer->state = TRANSFER_STATE_ABORTED;
+}
+
+/* transfer_queue() - Queue a DMA transfer on the engine
+ *
+ * @engine DMA engine doing the transfer
+ * @transfer DMA transfer submitted to the engine
+ *
+ * Takes and releases the engine spinlock
+ */
+static int transfer_queue(struct xdma_engine *engine,
+		struct xdma_transfer *transfer)
+{
+	int rv = 0;
+	struct xdma_transfer *transfer_started;
+	struct xdma_dev *xdev;
+	unsigned long flags;
+
+	BUG_ON(!engine);
+	BUG_ON(!engine->xdev);
+	BUG_ON(!transfer);
+	BUG_ON(transfer->desc_num == 0);
+	dbg_tfr("transfer=0x%p.\n", transfer);
+
+	xdev = engine->xdev;
+	if (xdma_device_flag_check(xdev, XDEV_FLAG_OFFLINE)) {
+		pr_info("dev 0x%p offline, transfer 0x%p not queued.\n",
+			xdev, transfer);
+		return -EBUSY;
+	}
+
+	/* lock the engine state */
+	spin_lock_irqsave(&engine->lock, flags);
+
+	engine->prev_cpu = get_cpu();
+	put_cpu();
+
+	/* engine is being shutdown; do not accept new transfers */
+	if (engine->shutdown & ENGINE_SHUTDOWN_REQUEST) {
+		pr_info("engine %s offline, transfer 0x%p not queued.\n",
+			engine->name, transfer);
+		rv = -EBUSY;
+		goto shutdown;
+	}
+
+	/* mark the transfer as submitted */
+	transfer->state = TRANSFER_STATE_SUBMITTED;
+	/* add transfer to the tail of the engine transfer queue */
+	list_add_tail(&transfer->entry, &engine->transfer_list);
+
+	/* engine is idle? */
+	if (!engine->running) {
+		/* start engine */
+		dbg_tfr("starting %s engine.\n",
+			engine->name);
+		transfer_started = engine_start(engine);
+		dbg_tfr("transfer=0x%p started %s engine with transfer 0x%p.\n",
+			transfer, engine->name, transfer_started);
+	} else {
+		dbg_tfr("transfer=0x%p queued, with %s engine running.\n",
+			transfer, engine->name);
+	}
+
+shutdown:
+	/* unlock the engine state */
+	dbg_tfr("engine->running = %d\n", engine->running);
+	spin_unlock_irqrestore(&engine->lock, flags);
+	return rv;
+}
+
+static void engine_alignments(struct xdma_engine *engine)
+{
+	u32 w;
+	u32 align_bytes;
+	u32 granularity_bytes;
+	u32 address_bits;
+
+	w = read_register(&engine->regs->alignments);
+	dbg_init("engine %p name %s alignments=0x%08x\n", engine,
+		engine->name, (int)w);
+
+	/* RTO  - add some macros to extract these fields */
+	align_bytes = (w & 0x00ff0000U) >> 16;
+	granularity_bytes = (w & 0x0000ff00U) >> 8;
+	address_bits = (w & 0x000000ffU);
+
+	dbg_init("align_bytes = %d\n", align_bytes);
+	dbg_init("granularity_bytes = %d\n", granularity_bytes);
+	dbg_init("address_bits = %d\n", address_bits);
+
+	if (w) {
+		engine->addr_align = align_bytes;
+		engine->len_granularity = granularity_bytes;
+		engine->addr_bits = address_bits;
+	} else {
+		/* Some default values if alignments are unspecified */
+		engine->addr_align = 1;
+		engine->len_granularity = 1;
+		engine->addr_bits = 64;
+	}
+}
+
+static void engine_free_resource(struct xdma_engine *engine)
+{
+	struct xdma_dev *xdev = engine->xdev;
+
+	/* Release memory use for descriptor writebacks */
+	if (engine->poll_mode_addr_virt) {
+		dbg_sg("Releasing memory for descriptor writeback\n");
+		dma_free_coherent(&xdev->pdev->dev,
+				sizeof(struct xdma_poll_wb),
+				engine->poll_mode_addr_virt,
+				engine->poll_mode_bus);
+		dbg_sg("Released memory for descriptor writeback\n");
+		engine->poll_mode_addr_virt = NULL;
+	}
+
+	if (engine->desc) {
+		dbg_init("device %s, engine %s pre-alloc desc 0x%p,0x%llx.\n",
+			dev_name(&xdev->pdev->dev), engine->name,
+			engine->desc, engine->desc_bus);
+		dma_free_coherent(&xdev->pdev->dev,
+			XDMA_TRANSFER_MAX_DESC * sizeof(struct xdma_desc),
+			engine->desc, engine->desc_bus);
+		engine->desc = NULL;
+	}
+
+	if (engine->cyclic_result) {
+		dma_free_coherent(&xdev->pdev->dev,
+			CYCLIC_RX_PAGES_MAX * sizeof(struct xdma_result),
+			engine->cyclic_result, engine->cyclic_result_bus);
+		engine->cyclic_result = NULL;
+	}
+}
+
+static void engine_destroy(struct xdma_dev *xdev, struct xdma_engine *engine)
+{
+	BUG_ON(!xdev);
+	BUG_ON(!engine);
+
+	dbg_sg("Shutting down engine %s%d\n", engine->name, engine->channel);
+
+	/* Disable interrupts to stop processing new events during shutdown */
+	write_register(0x0, &engine->regs->interrupt_enable_mask,
+			(unsigned long)(&engine->regs->interrupt_enable_mask) -
+			(unsigned long)(&engine->regs));
+
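+	/* streaming C2H only: clear this channel's credit mode enable bit (write-1-to-clear) */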
+	if (enable_credit_mp && engine->streaming &&
+		engine->dir == DMA_FROM_DEVICE) {
+		u32 reg_value = (0x1 << engine->channel) << 16;
+		struct sgdma_common_regs *reg = (struct sgdma_common_regs *)
+				(xdev->bar[xdev->config_bar_idx] +
+				 (0x6*TARGET_SPACING));
+		write_register(reg_value, &reg->credit_mode_enable_w1c, 0);
+	}
+
+	/* Release memory use for descriptor writebacks */
+	engine_free_resource(engine);
+
+	memset(engine, 0, sizeof(struct xdma_engine));
+	/* Decrement the number of engines available */
+	xdev->engines_num--;
+}
+
+/**
+ * engine_cyclic_stop() - stop a cyclic transfer running on an SG DMA engine
+ *
+ * engine->lock must be taken
+ */
+struct xdma_transfer *engine_cyclic_stop(struct xdma_engine *engine)
+{
+	struct xdma_transfer *transfer = NULL;
+
+	/* transfers on queue? */
+	if (!list_empty(&engine->transfer_list)) {
+		/* pick first transfer on the queue (was submitted to engine) */
+		transfer = list_entry(engine->transfer_list.next,
+					struct xdma_transfer, entry);
+		BUG_ON(!transfer);
+
+		xdma_engine_stop(engine);
+
+		if (transfer->cyclic) {
+			if (engine->xdma_perf)
+				dbg_perf("Stopping perf transfer on %s\n",
+					engine->name);
+			else
+				dbg_perf("Stopping cyclic transfer on %s\n",
+					engine->name);
+			/* make sure the handler sees correct transfer state */
+			transfer->cyclic = 1;
+			/*
+			 * set STOP flag and interrupt on completion, on the
+			 * last descriptor
+			 */
+			xdma_desc_control_set(
+				transfer->desc_virt + transfer->desc_num - 1,
+				XDMA_DESC_COMPLETED | XDMA_DESC_STOPPED);
+		} else {
+			dbg_sg("(engine=%p) running transfer is not cyclic\n",
+				engine);
+		}
+	} else {
+		dbg_sg("(engine=%p) no running transfer found.\n", engine);
+	}
+	return transfer;
+}
+EXPORT_SYMBOL_GPL(engine_cyclic_stop);
+
+static int engine_writeback_setup(struct xdma_engine *engine)
+{
+	u32 w;
+	struct xdma_dev *xdev;
+	struct xdma_poll_wb *writeback;
+
+	BUG_ON(!engine);
+	xdev = engine->xdev;
+	BUG_ON(!xdev);
+
+	/*
+	 * RTO - doing the allocation per engine is wasteful since a full page
+	 * is allocated each time - better to allocate one page for the whole
+	 * device during probe() and set per-engine offsets here
+	 */
+	writeback = (struct xdma_poll_wb *)engine->poll_mode_addr_virt;
+	writeback->completed_desc_count = 0;
+
+	dbg_init("Setting writeback location to 0x%llx for engine %p",
+		engine->poll_mode_bus, engine);
+	w = cpu_to_le32(PCI_DMA_L(engine->poll_mode_bus));
+	write_register(w, &engine->regs->poll_mode_wb_lo,
+			(unsigned long)(&engine->regs->poll_mode_wb_lo) -
+			(unsigned long)(&engine->regs));
+	w = cpu_to_le32(PCI_DMA_H(engine->poll_mode_bus));
+	write_register(w, &engine->regs->poll_mode_wb_hi,
+			(unsigned long)(&engine->regs->poll_mode_wb_hi) -
+			(unsigned long)(&engine->regs));
+
+	return 0;
+}
+
+
+/* engine_init_regs() - Initialize an SG DMA engine's control registers
+ *
+ * Clears non-incremental address mode, reads the hardware alignment
+ * requirements and programs the default interrupt enable mask: error
+ * interrupts always, completion interrupts only when not in poll mode.
+ *
+ * @engine Pointer to the engine being initialized
+ */
+static int engine_init_regs(struct xdma_engine *engine)
+{
+	u32 reg_value;
+	int rv = 0;
+
+	write_register(XDMA_CTRL_NON_INCR_ADDR, &engine->regs->control_w1c,
+			(unsigned long)(&engine->regs->control_w1c) -
+			(unsigned long)(&engine->regs));
+
+	engine_alignments(engine);
+
+	/* Configure error interrupts by default */
+	reg_value = XDMA_CTRL_IE_DESC_ALIGN_MISMATCH;
+	reg_value |= XDMA_CTRL_IE_MAGIC_STOPPED;
+	reg_value |= XDMA_CTRL_IE_READ_ERROR;
+	reg_value |= XDMA_CTRL_IE_DESC_ERROR;
+
+	/* if using polled mode, configure writeback address */
+	if (poll_mode) {
+		rv = engine_writeback_setup(engine);
+		if (rv) {
+			dbg_init("%s descr writeback setup failed.\n",
+				engine->name);
+			goto fail_wb;
+		}
+	} else {
+		/* enable the relevant completion interrupts */
+		reg_value |= XDMA_CTRL_IE_DESC_STOPPED;
+		reg_value |= XDMA_CTRL_IE_DESC_COMPLETED;
+
+		if (engine->streaming && engine->dir == DMA_FROM_DEVICE)
+			reg_value |= XDMA_CTRL_IE_IDLE_STOPPED;
+	}
+
+	/* Apply engine configurations */
+	write_register(reg_value, &engine->regs->interrupt_enable_mask,
+			(unsigned long)(&engine->regs->interrupt_enable_mask) -
+			(unsigned long)(&engine->regs));
+
+	engine->interrupt_enable_mask_value = reg_value;
+
+	/* only enable credit mode for AXI-ST C2H */
+	if (enable_credit_mp && engine->streaming &&
+		engine->dir == DMA_FROM_DEVICE) {
+
+		struct xdma_dev *xdev = engine->xdev;
+		u32 credit_enable = (0x1 << engine->channel) << 16;
+		struct sgdma_common_regs *reg = (struct sgdma_common_regs *)
+				(xdev->bar[xdev->config_bar_idx] +
+				 (0x6 * TARGET_SPACING));
+
+		write_register(credit_enable, &reg->credit_mode_enable_w1s, 0);
+	}
+
+	return 0;
+
+fail_wb:
+	return rv;
+}
+
+static int engine_alloc_resource(struct xdma_engine *engine)
+{
+	struct xdma_dev *xdev = engine->xdev;
+
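+	/*
+	 * a single coherent descriptor ring per engine, sized for the largest
+	 * transfer (XDMA_TRANSFER_MAX_DESC descriptors); it is reused for
+	 * every transfer built on this engine
+	 */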
+	engine->desc = dma_alloc_coherent(&xdev->pdev->dev,
+			XDMA_TRANSFER_MAX_DESC * sizeof(struct xdma_desc),
+			&engine->desc_bus, GFP_KERNEL);
+	if (!engine->desc) {
+		pr_warn("dev %s, %s pre-alloc desc OOM.\n",
+			dev_name(&xdev->pdev->dev), engine->name);
+		goto err_out;
+	}
+
+	if (poll_mode) {
+		engine->poll_mode_addr_virt = dma_alloc_coherent(
+					&xdev->pdev->dev,
+					sizeof(struct xdma_poll_wb),
+					&engine->poll_mode_bus, GFP_KERNEL);
+		if (!engine->poll_mode_addr_virt) {
+			pr_warn("%s, %s poll pre-alloc writeback OOM.\n",
+				dev_name(&xdev->pdev->dev), engine->name);
+			goto err_out;
+		}
+	}
+
+	if (engine->streaming && engine->dir == DMA_FROM_DEVICE) {
+		engine->cyclic_result = dma_alloc_coherent(&xdev->pdev->dev,
+			CYCLIC_RX_PAGES_MAX * sizeof(struct xdma_result),
+			&engine->cyclic_result_bus, GFP_KERNEL);
+
+		if (!engine->cyclic_result) {
+			pr_warn("%s, %s pre-alloc result OOM.\n",
+				dev_name(&xdev->pdev->dev), engine->name);
+			goto err_out;
+		}
+	}
+
+	return 0;
+
+err_out:
+	engine_free_resource(engine);
+	return -ENOMEM;
+}
+
+static int engine_init(struct xdma_engine *engine, struct xdma_dev *xdev,
+			int offset, enum dma_data_direction dir, int channel)
+{
+	int rv;
+	u32 val;
+
+	dbg_init("channel %d, offset 0x%x, dir %d.\n", channel, offset, dir);
+
+	/* set magic */
+	engine->magic = MAGIC_ENGINE;
+
+	engine->channel = channel;
+
+	/*
+	 * each engine owns XDMA_ENG_IRQ_NUM consecutive interrupt request
+	 * bits, placed after the bits of the engines initialized before it
+	 */
+	engine->irq_bitmask = (1 << XDMA_ENG_IRQ_NUM) - 1;
+	engine->irq_bitmask <<= (xdev->engines_num * XDMA_ENG_IRQ_NUM);
+	engine->bypass_offset = xdev->engines_num * BYPASS_MODE_SPACING;
+
+	/* parent */
+	engine->xdev = xdev;
+	/* register address */
+	engine->regs = (xdev->bar[xdev->config_bar_idx] + offset);
+	engine->sgdma_regs = xdev->bar[xdev->config_bar_idx] + offset +
+				SGDMA_OFFSET_FROM_CHANNEL;
+	val = read_register(&engine->regs->identifier);
+	if (val & 0x8000U)
+		engine->streaming = 1;
+
+	/* remember SG DMA direction */
+	engine->dir = dir;
+	snprintf(engine->name, sizeof(engine->name), "%d-%s%d-%s", xdev->idx,
+		(dir == DMA_TO_DEVICE) ? "H2C" : "C2H", channel,
+		engine->streaming ? "ST" : "MM");
+
+	dbg_init("engine %p name %s irq_bitmask=0x%08x\n", engine, engine->name,
+		(int)engine->irq_bitmask);
+
+	/* initialize the deferred work for transfer completion */
+	INIT_WORK(&engine->work, engine_service_work);
+
+	if (dir == DMA_TO_DEVICE)
+		xdev->mask_irq_h2c |= engine->irq_bitmask;
+	else
+		xdev->mask_irq_c2h |= engine->irq_bitmask;
+	xdev->engines_num++;
+
+	rv = engine_alloc_resource(engine);
+	if (rv)
+		return rv;
+
+	rv = engine_init_regs(engine);
+	if (rv)
+		return rv;
+
+	return 0;
+}
+
+/* transfer_destroy() - free transfer */
+static void transfer_destroy(struct xdma_dev *xdev, struct xdma_transfer *xfer)
+{
+	/* free descriptors */
+	xdma_desc_done(xfer->desc_virt);
+
+	if (xfer->last_in_request && (xfer->flags & XFER_FLAG_NEED_UNMAP)) {
+		struct sg_table *sgt = xfer->sgt;
+
+		if (sgt->nents) {
+			pci_unmap_sg(xdev->pdev, sgt->sgl, sgt->nents,
+				xfer->dir);
+			sgt->nents = 0;
+		}
+	}
+}
+
+static int transfer_build(struct xdma_engine *engine,
+			struct xdma_request_cb *req, unsigned int desc_max)
+{
+	struct xdma_transfer *xfer = &req->xfer;
+	struct sw_desc *sdesc = &(req->sdesc[req->sw_desc_idx]);
+	int i = 0;
+	int j = 0;
+
+	for (; i < desc_max; i++, j++, sdesc++) {
+		dbg_desc("sw desc %d/%u: 0x%llx, 0x%x, ep 0x%llx.\n",
+			i + req->sw_desc_idx, req->sw_desc_cnt,
+			sdesc->addr, sdesc->len, req->ep_addr);
+
+		/* fill in descriptor entry j with transfer details */
+		xdma_desc_set(xfer->desc_virt + j, sdesc->addr, req->ep_addr,
+				 sdesc->len, xfer->dir);
+		xfer->len += sdesc->len;
+
+		/* for non-inc-add mode don't increment ep_addr */
+		if (!engine->non_incr_addr)
+			req->ep_addr += sdesc->len;
+	}
+	req->sw_desc_idx += desc_max;
+	return 0;
+}
+
+static int transfer_init(struct xdma_engine *engine, struct xdma_request_cb *req)
+{
+	struct xdma_transfer *xfer = &req->xfer;
+	unsigned int desc_max = min_t(unsigned int,
+				req->sw_desc_cnt - req->sw_desc_idx,
+				XDMA_TRANSFER_MAX_DESC);
+	int i = 0;
+	int last = 0;
+	u32 control;
+
+	memset(xfer, 0, sizeof(*xfer));
+
+	/* initialize wait queue */
+	init_waitqueue_head(&xfer->wq);
+
+	/* remember direction of transfer */
+	xfer->dir = engine->dir;
+
+	xfer->desc_virt = engine->desc;
+	xfer->desc_bus = engine->desc_bus;
+
+	transfer_desc_init(xfer, desc_max);
+
+	dbg_sg("transfer->desc_bus = 0x%llx.\n", (u64)xfer->desc_bus);
+
+	transfer_build(engine, req, desc_max);
+
+	/* terminate last descriptor */
+	last = desc_max - 1;
+	xdma_desc_link(xfer->desc_virt + last, 0, 0);
+	/* stop engine, EOP for AXI ST, req IRQ on last descriptor */
+	control = XDMA_DESC_STOPPED;
+	control |= XDMA_DESC_EOP;
+	control |= XDMA_DESC_COMPLETED;
+	xdma_desc_control_set(xfer->desc_virt + last, control);
+
+	xfer->desc_num = xfer->desc_adjacent = desc_max;
+
+	dbg_sg("transfer 0x%p has %d descriptors\n", xfer, xfer->desc_num);
+	/* fill in adjacent counts: how many descriptors follow contiguously */
+	for (i = 0; i < xfer->desc_num; i++)
+		xdma_desc_adjacent(xfer->desc_virt + i, xfer->desc_num - i - 1);
+
+	return 0;
+}
+
+#ifdef __LIBXDMA_DEBUG__
+static void sgt_dump(struct sg_table *sgt)
+{
+	int i;
+	struct scatterlist *sg = sgt->sgl;
+
+	pr_info("sgt 0x%p, sgl 0x%p, nents %u/%u.\n",
+		sgt, sgt->sgl, sgt->nents, sgt->orig_nents);
+
+	for (i = 0; i < sgt->orig_nents; i++, sg = sg_next(sg))
+		pr_info("%d, 0x%p, pg 0x%p,%u+%u, dma 0x%llx,%u.\n",
+			i, sg, sg_page(sg), sg->offset, sg->length,
+			sg_dma_address(sg), sg_dma_len(sg));
+}
+
+static void xdma_request_cb_dump(struct xdma_request_cb *req)
+{
+	int i;
+
+	pr_info("request 0x%p, total %u, ep 0x%llx, sw_desc %u, sgt 0x%p.\n",
+		req, req->total_len, req->ep_addr, req->sw_desc_cnt, req->sgt);
+	sgt_dump(req->sgt);
+	for (i = 0; i < req->sw_desc_cnt; i++)
+		pr_info("%d/%u, 0x%llx, %u.\n",
+			i, req->sw_desc_cnt, req->sdesc[i].addr,
+			req->sdesc[i].len);
+}
+#endif
+
+static void xdma_request_free(struct xdma_request_cb *req)
+{
+	if (((unsigned long)req) >= VMALLOC_START &&
+	    ((unsigned long)req) < VMALLOC_END)
+		vfree(req);
+	else
+		kfree(req);
+}
+
+static struct xdma_request_cb *xdma_request_alloc(unsigned int sdesc_nr)
+{
+	struct xdma_request_cb *req;
+	unsigned int size = sizeof(struct xdma_request_cb) +
+				sdesc_nr * sizeof(struct sw_desc);
+
+	req = kzalloc(size, GFP_KERNEL);
+	if (!req) {
+		req = vmalloc(size);
+		if (req)
+			memset(req, 0, size);
+	}
+	if (!req) {
+		pr_info("OOM, %u sw_desc, %u.\n", sdesc_nr, size);
+		return NULL;
+	}
+
+	return req;
+}
+
+static struct xdma_request_cb *xdma_init_request(struct sg_table *sgt,
+						u64 ep_addr)
+{
+	struct xdma_request_cb *req;
+	struct scatterlist *sg = sgt->sgl;
+	int max = sgt->nents;
+	int extra = 0;
+	int i, j = 0;
+
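+	/*
+	 * count the extra sw descriptors needed: any SG entry longer than
+	 * XDMA_DESC_BLEN_MAX is split into multiple descriptors below
+	 */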
+	for (i = 0;  i < max; i++, sg = sg_next(sg)) {
+		unsigned int len = sg_dma_len(sg);
+
+		if (unlikely(len > XDMA_DESC_BLEN_MAX))
+			extra += len >> XDMA_DESC_BLEN_BITS;
+	}
+
+	//pr_info("ep 0x%llx, desc %u+%u.\n", ep_addr, max, extra);
+
+	max += extra;
+	req = xdma_request_alloc(max);
+	if (!req)
+		return NULL;
+
+	req->sgt = sgt;
+	req->ep_addr = ep_addr;
+
+	for (i = 0, sg = sgt->sgl;  i < sgt->nents; i++, sg = sg_next(sg)) {
+		unsigned int tlen = sg_dma_len(sg);
+		dma_addr_t addr = sg_dma_address(sg);
+
+		req->total_len += tlen;
+		while (tlen) {
+			req->sdesc[j].addr = addr;
+			if (tlen > XDMA_DESC_BLEN_MAX) {
+				req->sdesc[j].len = XDMA_DESC_BLEN_MAX;
+				addr += XDMA_DESC_BLEN_MAX;
+				tlen -= XDMA_DESC_BLEN_MAX;
+			} else {
+				req->sdesc[j].len = tlen;
+				tlen = 0;
+			}
+			j++;
+		}
+	}
+	BUG_ON(j > max);
+
+	req->sw_desc_cnt = j;
+#ifdef __LIBXDMA_DEBUG__
+	xdma_request_cb_dump(req);
+#endif
+	return req;
+}
+
+ssize_t xdma_xfer_submit(void *dev_hndl, int channel, bool write, u64 ep_addr,
+			struct sg_table *sgt, bool dma_mapped, int timeout_ms)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+	struct xdma_engine *engine;
+	int rv = 0;
+	ssize_t done = 0;
+	struct scatterlist *sg = sgt->sgl;
+	int nents;
+	enum dma_data_direction dir = write ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+	struct xdma_request_cb *req = NULL;
+
+	if (!dev_hndl)
+		return -EINVAL;
+
+	if (debug_check_dev_hndl(__func__, xdev->pdev, dev_hndl) < 0)
+		return -EINVAL;
+
+	if (write) {
+		if (channel >= xdev->h2c_channel_max) {
+			pr_warn("H2C channel %d >= %d.\n",
+				channel, xdev->h2c_channel_max);
+			return -EINVAL;
+		}
+		engine = &xdev->engine_h2c[channel];
+	} else {
+		if (channel >= xdev->c2h_channel_max) {
+			pr_warn("C2H channel %d >= %d.\n",
+				channel, xdev->c2h_channel_max);
+			return -EINVAL;
+		}
+		engine = &xdev->engine_c2h[channel];
+	}
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	xdev = engine->xdev;
+	if (xdma_device_flag_check(xdev, XDEV_FLAG_OFFLINE)) {
+		pr_info("xdev 0x%p, offline.\n", xdev);
+		return -EBUSY;
+	}
+
+	/* check the direction */
+	if (engine->dir != dir) {
+		pr_info("0x%p, %s, %d, W %d, 0x%x/0x%x mismatch.\n",
+			engine, engine->name, channel, write, engine->dir, dir);
+		return -EINVAL;
+	}
+
+	if (!dma_mapped) {
+		nents = pci_map_sg(xdev->pdev, sg, sgt->orig_nents, dir);
+		if (!nents) {
+			pr_info("map sgl failed, sgt 0x%p.\n", sgt);
+			return -EIO;
+		}
+		sgt->nents = nents;
+	} else {
+		BUG_ON(!sgt->nents);
+	}
+
+	req = xdma_init_request(sgt, ep_addr);
+	if (!req) {
+		rv = -ENOMEM;
+		goto unmap_sgl;
+	}
+
+	dbg_tfr("%s, len %u sg cnt %u.\n",
+		engine->name, req->total_len, req->sw_desc_cnt);
+
+	sg = sgt->sgl;
+	nents = req->sw_desc_cnt;
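+	/*
+	 * the request is submitted as a series of transfers, each holding at
+	 * most XDMA_TRANSFER_MAX_DESC descriptors; each transfer is built,
+	 * queued and waited on before the next one is started
+	 */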
+	while (nents) {
+		unsigned long flags;
+		struct xdma_transfer *xfer;
+
+		/* one transfer at a time */
+#ifndef CONFIG_PREEMPT_COUNT
+		spin_lock(&engine->desc_lock);
+#else
+		mutex_lock(&engine->desc_mutex);
+#endif
+
+		/* build transfer */
+		rv = transfer_init(engine, req);
+		if (rv < 0) {
+#ifndef CONFIG_PREEMPT_COUNT
+			spin_unlock(&engine->desc_lock);
+#else
+			mutex_unlock(&engine->desc_mutex);
+#endif
+			goto unmap_sgl;
+		}
+		xfer = &req->xfer;
+
+		if (!dma_mapped)
+			xfer->flags = XFER_FLAG_NEED_UNMAP;
+
+		/* last transfer for the given request? */
+		nents -= xfer->desc_num;
+		if (!nents) {
+			xfer->last_in_request = 1;
+			xfer->sgt = sgt;
+		}
+
+		dbg_tfr("xfer, %u, ep 0x%llx, done %lu, sg %u/%u.\n",
+			xfer->len, req->ep_addr, done, req->sw_desc_idx,
+			req->sw_desc_cnt);
+
+#ifdef __LIBXDMA_DEBUG__
+		transfer_dump(xfer);
+#endif
+
+		rv = transfer_queue(engine, xfer);
+		if (rv < 0) {
+#ifndef CONFIG_PREEMPT_COUNT
+			spin_unlock(&engine->desc_lock);
+#else
+			mutex_unlock(&engine->desc_mutex);
+#endif
+			pr_info("unable to submit %s, %d.\n", engine->name, rv);
+			goto unmap_sgl;
+		}
+
+		/*
+		 * When polling, determine how many descriptors have been
+		 * queued on the engine to determine the writeback value
+		 * expected
+		 */
+		if (poll_mode) {
+			unsigned int desc_count;
+
+			spin_lock_irqsave(&engine->lock, flags);
+			desc_count = xfer->desc_num;
+			spin_unlock_irqrestore(&engine->lock, flags);
+
+			dbg_tfr("%s poll desc_count=%d\n",
+				engine->name, desc_count);
+			rv = engine_service_poll(engine, desc_count);
+
+		} else {
+			rv = wait_event_timeout(xfer->wq,
+				(xfer->state != TRANSFER_STATE_SUBMITTED),
+				msecs_to_jiffies(timeout_ms));
+		}
+
+		spin_lock_irqsave(&engine->lock, flags);
+
+		switch (xfer->state) {
+		case TRANSFER_STATE_COMPLETED:
+			spin_unlock_irqrestore(&engine->lock, flags);
+
+			dbg_tfr("transfer %p, %u, ep 0x%llx compl, +%lu.\n",
+				xfer, xfer->len, req->ep_addr - xfer->len, done);
+			done += xfer->len;
+			rv = 0;
+			break;
+		case TRANSFER_STATE_FAILED:
+			pr_info("xfer 0x%p,%u, failed, ep 0x%llx.\n",
+				 xfer, xfer->len, req->ep_addr - xfer->len);
+			spin_unlock_irqrestore(&engine->lock, flags);
+
+#ifdef __LIBXDMA_DEBUG__
+			transfer_dump(xfer);
+			sgt_dump(sgt);
+#endif
+			rv = -EIO;
+			break;
+		default:
+			if (!poll_mode && rv == -ERESTARTSYS) {
+				pr_info("xfer 0x%p,%u, canceled, ep 0x%llx.\n",
+					xfer, xfer->len,
+					req->ep_addr - xfer->len);
+				spin_unlock_irqrestore(&engine->lock, flags);
+				wait_event_timeout(xfer->wq, (xfer->state !=
+					TRANSFER_STATE_SUBMITTED),
+					msecs_to_jiffies(timeout_ms));
+				xdma_engine_stop(engine);
+				break;
+			}
+			/* transfer can still be in-flight */
+			pr_info("xfer 0x%p,%u, s 0x%x timed out, ep 0x%llx.\n",
+				 xfer, xfer->len, xfer->state, req->ep_addr);
+			engine_status_read(engine, 0, 1);
+			//engine_status_dump(engine);
+			transfer_abort(engine, xfer);
+
+			xdma_engine_stop(engine);
+			spin_unlock_irqrestore(&engine->lock, flags);
+
+#ifdef __LIBXDMA_DEBUG__
+			transfer_dump(xfer);
+			sgt_dump(sgt);
+#endif
+			rv = -ERESTARTSYS;
+			break;
+		}
+		transfer_destroy(xdev, xfer);
+#ifndef CONFIG_PREEMPT_COUNT
+		spin_unlock(&engine->desc_lock);
+#else
+		mutex_unlock(&engine->desc_mutex);
+#endif
+
+		if (rv < 0)
+			goto unmap_sgl;
+	} /* while (sg) */
+
+unmap_sgl:
+	if (!dma_mapped && sgt->nents) {
+		pci_unmap_sg(xdev->pdev, sgt->sgl, sgt->orig_nents, dir);
+		sgt->nents = 0;
+	}
+
+	if (req)
+		xdma_request_free(req);
+
+	if (rv < 0)
+		return rv;
+
+	return done;
+}
+EXPORT_SYMBOL_GPL(xdma_xfer_submit);
+
+int xdma_performance_submit(struct xdma_dev *xdev, struct xdma_engine *engine)
+{
+	u8 *buffer_virt;
+	u32 max_consistent_size = 128 * 32 * 1024; /* 1024 pages, 4MB */
+	dma_addr_t buffer_bus;	/* bus address */
+	struct xdma_transfer *transfer;
+	u64 ep_addr = 0;
+	int num_desc_in_a_loop = 128;
+	int size_in_desc = engine->xdma_perf->transfer_size;
+	int size = size_in_desc * num_desc_in_a_loop;
+	int i;
+
+	BUG_ON(size_in_desc > max_consistent_size);
+
+	if (size > max_consistent_size) {
+		size = max_consistent_size;
+		num_desc_in_a_loop = size / size_in_desc;
+	}
+
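+	/* one coherent buffer; each descriptor covers size_in_desc bytes of it */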
+	buffer_virt = dma_alloc_coherent(&xdev->pdev->dev, size,
+					&buffer_bus, GFP_KERNEL);
+	if (!buffer_virt)
+		return -ENOMEM;
+
+	/* allocate transfer data structure */
+	transfer = kzalloc(sizeof(struct xdma_transfer), GFP_KERNEL);
+	BUG_ON(!transfer);
+
+	/* 0 = write engine (to_dev=0) , 1 = read engine (to_dev=1) */
+	transfer->dir = engine->dir;
+	/* set number of descriptors */
+	transfer->desc_num = num_desc_in_a_loop;
+
+	/* allocate descriptor list */
+	if (!engine->desc) {
+		engine->desc = dma_alloc_coherent(&xdev->pdev->dev,
+			num_desc_in_a_loop * sizeof(struct xdma_desc),
+			&engine->desc_bus, GFP_KERNEL);
+		BUG_ON(!engine->desc);
+		dbg_init("device %s, engine %s pre-alloc desc 0x%p,0x%llx.\n",
+			dev_name(&xdev->pdev->dev), engine->name,
+			engine->desc, engine->desc_bus);
+	}
+	transfer->desc_virt = engine->desc;
+	transfer->desc_bus = engine->desc_bus;
+
+	transfer_desc_init(transfer, transfer->desc_num);
+
+	dbg_sg("transfer->desc_bus = 0x%llx.\n", (u64)transfer->desc_bus);
+
+	for (i = 0; i < transfer->desc_num; i++) {
+		struct xdma_desc *desc = transfer->desc_virt + i;
+		dma_addr_t rc_bus_addr = buffer_bus + size_in_desc * i;
+
+		/* fill in descriptor entry with transfer details */
+		xdma_desc_set(desc, rc_bus_addr, ep_addr, size_in_desc,
+			engine->dir);
+	}
+
+	/* clear the first descriptor's control field (no STOP/IRQ requested) */
+	xdma_desc_control_set(transfer->desc_virt, 0);
+	/* create a linked loop */
+	xdma_desc_link(transfer->desc_virt + transfer->desc_num - 1,
+		transfer->desc_virt, transfer->desc_bus);
+
+	transfer->cyclic = 1;
+
+	/* initialize wait queue */
+	init_waitqueue_head(&transfer->wq);
+
+	dbg_perf("Queueing XDMA I/O %s request for performance measurement.\n",
+		engine->dir ? "write (to dev)" : "read (from dev)");
+	transfer_queue(engine, transfer);
+	return 0;
+
+}
+EXPORT_SYMBOL_GPL(xdma_performance_submit);
+
+static struct xdma_dev *alloc_dev_instance(struct pci_dev *pdev)
+{
+	int i;
+	struct xdma_dev *xdev;
+	struct xdma_engine *engine;
+
+	BUG_ON(!pdev);
+
+	/* allocate zeroed device book keeping structure */
+	xdev = kzalloc(sizeof(struct xdma_dev), GFP_KERNEL);
+	if (!xdev)
+		return NULL;
+
+	spin_lock_init(&xdev->lock);
+
+	xdev->magic = MAGIC_DEVICE;
+	xdev->config_bar_idx = -1;
+	xdev->user_bar_idx = -1;
+	xdev->bypass_bar_idx = -1;
+	xdev->irq_line = -1;
+
+	/* create a driver to device reference */
+	xdev->pdev = pdev;
+	dbg_init("xdev = 0x%p\n", xdev);
+
+	/* Set up user IRQ data structures */
+	for (i = 0; i < MAX_USER_IRQ; i++) {
+		xdev->user_irq[i].xdev = xdev;
+		spin_lock_init(&xdev->user_irq[i].events_lock);
+		init_waitqueue_head(&xdev->user_irq[i].events_wq);
+		xdev->user_irq[i].handler = NULL;
+		xdev->user_irq[i].user_idx = i; /* 0 based */
+	}
+
+	engine = xdev->engine_h2c;
+	for (i = 0; i < XDMA_CHANNEL_NUM_MAX; i++, engine++) {
+		spin_lock_init(&engine->lock);
+		spin_lock_init(&engine->desc_lock);
+#ifdef CONFIG_PREEMPT_COUNT
+		mutex_init(&engine->desc_mutex);
+#endif
+		INIT_LIST_HEAD(&engine->transfer_list);
+		init_waitqueue_head(&engine->shutdown_wq);
+		init_waitqueue_head(&engine->xdma_perf_wq);
+	}
+
+	engine = xdev->engine_c2h;
+	for (i = 0; i < XDMA_CHANNEL_NUM_MAX; i++, engine++) {
+		spin_lock_init(&engine->lock);
+		spin_lock_init(&engine->desc_lock);
+#ifdef CONFIG_PREEMPT_COUNT
+		mutex_init(&engine->desc_mutex);
+#endif
+		INIT_LIST_HEAD(&engine->transfer_list);
+		init_waitqueue_head(&engine->shutdown_wq);
+		init_waitqueue_head(&engine->xdma_perf_wq);
+	}
+
+	return xdev;
+}
+
+static int set_dma_mask(struct pci_dev *pdev)
+{
+	BUG_ON(!pdev);
+
+	dbg_init("sizeof(dma_addr_t) == %ld\n", sizeof(dma_addr_t));
+	/* 64-bit addressing capability for XDMA? */
+	if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
+		/* see Documentation/DMA-mapping.txt */
+		dbg_init("pci_set_dma_mask()\n");
+		dbg_init("Using a 64-bit DMA mask.\n");
+		/* use 32-bit DMA for descriptors (consistent memory) */
+		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+	} else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
+		dbg_init("Could not set 64-bit DMA mask.\n");
+		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+		/* use 32-bit DMA */
+		dbg_init("Using a 32-bit DMA mask.\n");
+	} else {
+		dbg_init("No suitable DMA possible.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static u32 get_engine_channel_id(struct engine_regs *regs)
+{
+	u32 value;
+
+	BUG_ON(!regs);
+
+	value = read_register(&regs->identifier);
+
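+	/* channel ID is in bits [11:8] of the identifier register */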
+	return (value & 0x00000f00U) >> 8;
+}
+
+static u32 get_engine_id(struct engine_regs *regs)
+{
+	u32 value;
+
+	BUG_ON(!regs);
+
+	value = read_register(&regs->identifier);
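+	/* engine (subsystem) ID is in bits [31:16] of the identifier register */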
+	return (value & 0xffff0000U) >> 16;
+}
+
+static void remove_engines(struct xdma_dev *xdev)
+{
+	struct xdma_engine *engine;
+	int i;
+
+	BUG_ON(!xdev);
+
+	/* iterate over channels */
+	for (i = 0; i < xdev->h2c_channel_max; i++) {
+		engine = &xdev->engine_h2c[i];
+		if (engine->magic == MAGIC_ENGINE) {
+			dbg_sg("Remove %s, %d", engine->name, i);
+			engine_destroy(xdev, engine);
+			dbg_sg("%s, %d removed", engine->name, i);
+		}
+	}
+
+	for (i = 0; i < xdev->c2h_channel_max; i++) {
+		engine = &xdev->engine_c2h[i];
+		if (engine->magic == MAGIC_ENGINE) {
+			dbg_sg("Remove %s, %d", engine->name, i);
+			engine_destroy(xdev, engine);
+			dbg_sg("%s, %d removed", engine->name, i);
+		}
+	}
+}
+
+static int probe_for_engine(struct xdma_dev *xdev, enum dma_data_direction dir,
+			int channel)
+{
+	struct engine_regs *regs;
+	int offset = channel * CHANNEL_SPACING;
+	u32 engine_id;
+	u32 engine_id_expected;
+	u32 channel_id;
+	struct xdma_engine *engine;
+	int rv;
+
+	/*
+	 * register offset for the engine:
+	 * H2C channels start at 0x0000, C2H channels at 0x1000,
+	 * with channels spaced 0x100 apart
+	 */
+	if (dir == DMA_TO_DEVICE) {
+		engine_id_expected = XDMA_ID_H2C;
+		engine = &xdev->engine_h2c[channel];
+	} else {
+		offset += H2C_CHANNEL_OFFSET;
+		engine_id_expected = XDMA_ID_C2H;
+		engine = &xdev->engine_c2h[channel];
+	}
+
+	regs = xdev->bar[xdev->config_bar_idx] + offset;
+	engine_id = get_engine_id(regs);
+	channel_id = get_engine_channel_id(regs);
+
+	if ((engine_id != engine_id_expected) || (channel_id != channel)) {
+		dbg_init("%s %d engine, reg off 0x%x, id mismatch 0x%x,0x%x,exp 0x%x,0x%x, SKIP.\n",
+			dir == DMA_TO_DEVICE ? "H2C" : "C2H",
+			 channel, offset, engine_id, channel_id,
+			engine_id_expected, channel_id != channel);
+		return -EINVAL;
+	}
+
+	dbg_init("found AXI %s %d engine, reg. off 0x%x, id 0x%x,0x%x.\n",
+		 dir == DMA_TO_DEVICE ? "H2C" : "C2H", channel,
+		 offset, engine_id, channel_id);
+
+	/* allocate and initialize engine */
+	rv = engine_init(engine, xdev, offset, dir, channel);
+	if (rv != 0) {
+		pr_info("failed to create AXI %s %d engine.\n",
+			dir == DMA_TO_DEVICE ? "H2C" : "C2H",
+			channel);
+		return rv;
+	}
+
+	return 0;
+}
+
+static int probe_engines(struct xdma_dev *xdev)
+{
+	int i;
+	int rv = 0;
+
+	BUG_ON(!xdev);
+
+	/* iterate over channels */
+	for (i = 0; i < xdev->h2c_channel_max; i++) {
+		rv = probe_for_engine(xdev, DMA_TO_DEVICE, i);
+		if (rv)
+			break;
+	}
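+	/* trim to the number of H2C engines actually discovered */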
+	xdev->h2c_channel_max = i;
+
+	for (i = 0; i < xdev->c2h_channel_max; i++) {
+		rv = probe_for_engine(xdev, DMA_FROM_DEVICE, i);
+		if (rv)
+			break;
+	}
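+	/* trim to the number of C2H engines actually discovered */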
+	xdev->c2h_channel_max = i;
+
+	return 0;
+}
+
+static void pci_enable_capability(struct pci_dev *pdev, int cap)
+{
+	pcie_capability_set_word(pdev, PCI_EXP_DEVCTL, cap);
+}
+
+static int pci_check_extended_tag(struct xdma_dev *xdev, struct pci_dev *pdev)
+{
+	u16 cap;
+#ifdef _NO_FORCE_EXTTAG
+	void *reg;
+	u32 v;
+#endif
+
+	pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap);
+
+	if ((cap & PCI_EXP_DEVCTL_EXT_TAG))
+		return 0;
+
+	/* extended tag not enabled */
+	pr_info("0x%p EXT_TAG disabled.\n", pdev);
+
+	/*
+	 * Always enable ExtTag, even though xdma has configuration
+	 * XDMA_OFS_CONFIG for system which does not have ExtTag enabled.
+	 *
+	 * We observed that the ExtTag was cleared on some system. The SSD-
+	 * FPGA board will not work on that system (DMA failed). The solution
+	 * is that XDMA driver enables ExtTag in this case.
+	 *
+	 * If ExtTag needs to be disabled on your system, define this.
+	 */
+#ifdef _NO_FORCE_EXTTAG
+	if (xdev->config_bar_idx < 0) {
+		pr_info("pdev 0x%p, xdev 0x%p, config bar UNKNOWN.\n",
+				pdev, xdev);
+		return -EINVAL;
+	}
+
+	reg = xdev->bar[xdev->config_bar_idx] + XDMA_OFS_CONFIG + 0x4C;
+	v =  read_register(reg);
+	v = (v & 0xFF) | (((u32)32) << 8);
+	write_register(v, reg, XDMA_OFS_CONFIG + 0x4C);
+	return 0;
+#else
+	/* returning 1 tells the caller to enable ExtTag */
+	return 1;
+#endif
+}
+
+void *xdma_device_open(const char *mname, struct pci_dev *pdev, int *user_max,
+			int *h2c_channel_max, int *c2h_channel_max)
+{
+	struct xdma_dev *xdev = NULL;
+	int rv = 0;
+
+	pr_info("%s device %s, 0x%p.\n", mname, dev_name(&pdev->dev), pdev);
+
+	/* allocate zeroed device book keeping structure */
+	xdev = alloc_dev_instance(pdev);
+	if (!xdev)
+		return NULL;
+	xdev->mod_name = mname;
+	xdev->user_max = *user_max;
+	xdev->h2c_channel_max = *h2c_channel_max;
+	xdev->c2h_channel_max = *c2h_channel_max;
+
+	xdma_device_flag_set(xdev, XDEV_FLAG_OFFLINE);
+	xdev_list_add(xdev);
+
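+	/* clamp the caller-supplied limits to what the driver supports */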
+	if (xdev->user_max == 0 || xdev->user_max > MAX_USER_IRQ)
+		xdev->user_max = MAX_USER_IRQ;
+	if (xdev->h2c_channel_max == 0 ||
+	    xdev->h2c_channel_max > XDMA_CHANNEL_NUM_MAX)
+		xdev->h2c_channel_max = XDMA_CHANNEL_NUM_MAX;
+	if (xdev->c2h_channel_max == 0 ||
+	    xdev->c2h_channel_max > XDMA_CHANNEL_NUM_MAX)
+		xdev->c2h_channel_max = XDMA_CHANNEL_NUM_MAX;
+
+	/* keep INTx enabled */
+	pci_check_intr_pend(pdev);
+
+	/* enable relaxed ordering */
+	pci_enable_capability(pdev, PCI_EXP_DEVCTL_RELAX_EN);
+
+	/* if extended tag check failed, enable it */
+	if (pci_check_extended_tag(xdev, pdev)) {
+		pr_info("ExtTag is disabled, try enable it.\n");
+		pci_enable_capability(pdev, PCI_EXP_DEVCTL_EXT_TAG);
+	}
+
+	/* force MRRS to be 512 */
+	rv = pcie_get_readrq(pdev);
+	if (rv < 0) {
+		dev_err(&pdev->dev, "failed to read mrrs %d\n", rv);
+		goto err_map;
+	}
+	if (rv > 512) {
+		rv = pcie_set_readrq(pdev, 512);
+		if (rv) {
+			dev_err(&pdev->dev, "failed to force mrrs %d\n", rv);
+			goto err_map;
+		}
+	}
+
+	/* enable bus master capability */
+	pci_set_master(pdev);
+
+	rv = map_bars(xdev, pdev);
+	if (rv)
+		goto err_map;
+
+	rv = set_dma_mask(pdev);
+	if (rv)
+		goto err_mask;
+
+	check_nonzero_interrupt_status(xdev);
+	/* explicitly zero all interrupt enable masks */
+	channel_interrupts_disable(xdev, ~0);
+	user_interrupts_disable(xdev, ~0);
+	read_interrupts(xdev);
+
+	rv = probe_engines(xdev);
+	if (rv)
+		goto err_engines;
+
+	rv = enable_msi_msix(xdev, pdev);
+	if (rv < 0)
+		goto err_enable_msix;
+
+	rv = irq_setup(xdev, pdev);
+	if (rv < 0)
+		goto err_interrupts;
+
+	if (!poll_mode)
+		channel_interrupts_enable(xdev, ~0);
+
+	/* Flush writes */
+	read_interrupts(xdev);
+
+	*user_max = xdev->user_max;
+	*h2c_channel_max = xdev->h2c_channel_max;
+	*c2h_channel_max = xdev->c2h_channel_max;
+
+	xdma_device_flag_clear(xdev, XDEV_FLAG_OFFLINE);
+	return (void *)xdev;
+
+err_interrupts:
+	irq_teardown(xdev);
+err_enable_msix:
+	disable_msi_msix(xdev, pdev);
+err_engines:
+	remove_engines(xdev);
+err_mask:
+	unmap_bars(xdev, pdev);
+err_map:
+	xdev_list_remove(xdev);
+	kfree(xdev);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(xdma_device_open);
+
+void xdma_device_close(struct pci_dev *pdev, void *dev_hndl)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	dbg_init("pdev 0x%p, xdev 0x%p.\n", pdev, dev_hndl);
+
+	if (!dev_hndl)
+		return;
+
+	if (debug_check_dev_hndl(__func__, pdev, dev_hndl) < 0)
+		return;
+
+	dbg_sg("remove(dev = 0x%p) where pdev->dev.driver_data = 0x%p\n",
+		   pdev, xdev);
+	if (xdev->pdev != pdev) {
+		dbg_sg("pci_dev(0x%lx) != pdev(0x%lx)\n",
+			(unsigned long)xdev->pdev, (unsigned long)pdev);
+	}
+
+	channel_interrupts_disable(xdev, ~0);
+	user_interrupts_disable(xdev, ~0);
+	read_interrupts(xdev);
+
+	irq_teardown(xdev);
+	disable_msi_msix(xdev, pdev);
+
+	remove_engines(xdev);
+	unmap_bars(xdev, pdev);
+
+	xdev_list_remove(xdev);
+
+	kfree(xdev);
+}
+EXPORT_SYMBOL_GPL(xdma_device_close);
+
+void xdma_device_offline(struct pci_dev *pdev, void *dev_hndl)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+	struct xdma_engine *engine;
+	int i;
+
+	if (!dev_hndl)
+		return;
+
+	if (debug_check_dev_hndl(__func__, pdev, dev_hndl) < 0)
+		return;
+
+pr_info("pdev 0x%p, xdev 0x%p.\n", pdev, xdev);
+	xdma_device_flag_set(xdev, XDEV_FLAG_OFFLINE);
+
+	/* wait for all engines to be idle */
+	for (i  = 0; i < xdev->h2c_channel_max; i++) {
+		unsigned long flags;
+
+		engine = &xdev->engine_h2c[i];
+
+		if (engine->magic == MAGIC_ENGINE) {
+			spin_lock_irqsave(&engine->lock, flags);
+			engine->shutdown |= ENGINE_SHUTDOWN_REQUEST;
+
+			xdma_engine_stop(engine);
+			engine->running = 0;
+			spin_unlock_irqrestore(&engine->lock, flags);
+		}
+	}
+
+	for (i  = 0; i < xdev->c2h_channel_max; i++) {
+		unsigned long flags;
+
+		engine = &xdev->engine_c2h[i];
+		if (engine->magic == MAGIC_ENGINE) {
+			spin_lock_irqsave(&engine->lock, flags);
+			engine->shutdown |= ENGINE_SHUTDOWN_REQUEST;
+
+			xdma_engine_stop(engine);
+			engine->running = 0;
+			spin_unlock_irqrestore(&engine->lock, flags);
+		}
+	}
+
+	/* turn off interrupts */
+	channel_interrupts_disable(xdev, ~0);
+	user_interrupts_disable(xdev, ~0);
+	read_interrupts(xdev);
+	irq_teardown(xdev);
+
+	pr_info("xdev 0x%p, done.\n", xdev);
+}
+EXPORT_SYMBOL_GPL(xdma_device_offline);
+
+void xdma_device_online(struct pci_dev *pdev, void *dev_hndl)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+	struct xdma_engine *engine;
+	unsigned long flags;
+	int i;
+
+	if (!dev_hndl)
+		return;
+
+	if (debug_check_dev_hndl(__func__, pdev, dev_hndl) < 0)
+		return;
+
+pr_info("pdev 0x%p, xdev 0x%p.\n", pdev, xdev);
+
+	for (i  = 0; i < xdev->h2c_channel_max; i++) {
+		engine = &xdev->engine_h2c[i];
+		if (engine->magic == MAGIC_ENGINE) {
+			engine_init_regs(engine);
+			spin_lock_irqsave(&engine->lock, flags);
+			engine->shutdown &= ~ENGINE_SHUTDOWN_REQUEST;
+			spin_unlock_irqrestore(&engine->lock, flags);
+		}
+	}
+
+	for (i  = 0; i < xdev->c2h_channel_max; i++) {
+		engine = &xdev->engine_c2h[i];
+		if (engine->magic == MAGIC_ENGINE) {
+			engine_init_regs(engine);
+			spin_lock_irqsave(&engine->lock, flags);
+			engine->shutdown &= ~ENGINE_SHUTDOWN_REQUEST;
+			spin_unlock_irqrestore(&engine->lock, flags);
+		}
+	}
+
+	/* re-write the interrupt table */
+	if (!poll_mode) {
+		irq_setup(xdev, pdev);
+
+		channel_interrupts_enable(xdev, ~0);
+		user_interrupts_enable(xdev, xdev->mask_irq_user);
+		read_interrupts(xdev);
+	}
+
+	xdma_device_flag_clear(xdev, XDEV_FLAG_OFFLINE);
+pr_info("xdev 0x%p, done.\n", xdev);
+}
+EXPORT_SYMBOL_GPL(xdma_device_online);
+
+int xdma_device_restart(struct pci_dev *pdev, void *dev_hndl)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	if (!dev_hndl)
+		return -EINVAL;
+
+	if (debug_check_dev_hndl(__func__, pdev, dev_hndl) < 0)
+		return -EINVAL;
+
+	pr_info("NOT implemented, 0x%p.\n", xdev);
+	return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(xdma_device_restart);
+
+int xdma_user_isr_register(void *dev_hndl, unsigned int mask,
+			irq_handler_t handler, void *dev)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+	int i;
+
+	if (!dev_hndl)
+		return -EINVAL;
+
+	if (debug_check_dev_hndl(__func__, xdev->pdev, dev_hndl) < 0)
+		return -EINVAL;
+
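+	/* attach the handler to every user IRQ whose bit is set in mask */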
+	for (i = 0; i < xdev->user_max && mask; i++) {
+		unsigned int bit = (1 << i);
+
+		if ((bit & mask) == 0)
+			continue;
+
+		mask &= ~bit;
+		xdev->user_irq[i].handler = handler;
+		xdev->user_irq[i].dev = dev;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdma_user_isr_register);
+
+int xdma_user_isr_enable(void *dev_hndl, unsigned int mask)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	if (!dev_hndl)
+		return -EINVAL;
+
+	if (debug_check_dev_hndl(__func__, xdev->pdev, dev_hndl) < 0)
+		return -EINVAL;
+
+	xdev->mask_irq_user |= mask;
+	/* enable user interrupts */
+	user_interrupts_enable(xdev, mask);
+	read_interrupts(xdev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdma_user_isr_enable);
+
+int xdma_user_isr_disable(void *dev_hndl, unsigned int mask)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	if (!dev_hndl)
+		return -EINVAL;
+
+	if (debug_check_dev_hndl(__func__, xdev->pdev, dev_hndl) < 0)
+		return -EINVAL;
+
+	xdev->mask_irq_user &= ~mask;
+	user_interrupts_disable(xdev, mask);
+	read_interrupts(xdev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdma_user_isr_disable);
+
+int xdma_get_userio(void *dev_hndl, void * __iomem *base_addr,
+	u64 *len, u32 *bar_idx)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	if (xdev->user_bar_idx < 0)
+		return -ENOENT;
+
+	*base_addr = xdev->bar[xdev->user_bar_idx];
+	*len = pci_resource_len(xdev->pdev, xdev->user_bar_idx);
+	*bar_idx = xdev->user_bar_idx;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdma_get_userio);
+
+int xdma_get_bypassio(void *dev_hndl, u64 *len, u32 *bar_idx)
+{
+	struct xdma_dev *xdev = (struct xdma_dev *)dev_hndl;
+
+	/* the bypass BAR is optional */
+	if (xdev->bypass_bar_idx < 0)
+		return 0;
+
+	*len = pci_resource_len(xdev->pdev, xdev->bypass_bar_idx);
+	*bar_idx = xdev->bypass_bar_idx;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdma_get_bypassio);
+
+
+#ifdef __LIBXDMA_MOD__
+static int __init xdma_base_init(void)
+{
+	pr_info("%s", version);
+	return 0;
+}
+
+static void __exit xdma_base_exit(void)
+{
+}
+
+module_init(xdma_base_init);
+module_exit(xdma_base_exit);
+#endif
+/* makes an existing transfer cyclic */
+static void xdma_transfer_cyclic(struct xdma_transfer *transfer)
+{
+	/* link last descriptor to first descriptor */
+	xdma_desc_link(transfer->desc_virt + transfer->desc_num - 1,
+			transfer->desc_virt, transfer->desc_bus);
+	/* remember transfer is cyclic */
+	transfer->cyclic = 1;
+}
+
+static int transfer_monitor_cyclic(struct xdma_engine *engine,
+			struct xdma_transfer *transfer, int timeout_ms)
+{
+	struct xdma_result *result;
+	int rc = 0;
+
+	BUG_ON(!engine);
+	BUG_ON(!transfer);
+
+	result = engine->cyclic_result;
+	BUG_ON(!result);
+
+	if (poll_mode) {
+		int i;
+
+		for (i = 0; i < 5; i++) {
+			rc = engine_service_poll(engine, 0);
+			if (rc) {
+				pr_info("%s service_poll failed %d.\n",
+					engine->name, rc);
+				rc = -ERESTARTSYS;
+			}
+			if (result[engine->rx_head].status)
+				return 0;
+		}
+	} else {
+		if (enable_credit_mp) {
+			dbg_tfr("%s: rx_head=%d,rx_tail=%d, wait ...\n",
+				engine->name, engine->rx_head, engine->rx_tail);
+			rc = wait_event_timeout(transfer->wq,
+					(engine->rx_head != engine->rx_tail ||
+					 engine->rx_overrun),
+					msecs_to_jiffies(timeout_ms));
+			dbg_tfr("%s: wait returns %d, rx %d/%d, overrun %d.\n",
+				 engine->name, rc, engine->rx_head,
+				engine->rx_tail, engine->rx_overrun);
+		} else {
+			rc = wait_event_timeout(transfer->wq,
+					engine->eop_found,
+					msecs_to_jiffies(timeout_ms));
+			dbg_tfr("%s: wait returns %d, eop_found %d.\n",
+				engine->name, rc, engine->eop_found);
+		}
+	}
+
+	return 0;
+}
+
+struct scatterlist *sglist_index(struct sg_table *sgt, unsigned int idx)
+{
+	struct scatterlist *sg = sgt->sgl;
+	int i;
+
+	if (idx >= sgt->orig_nents)
+		return NULL;
+
+	if (!idx)
+		return sg;
+
+	for (i = 0; i < idx; i++, sg = sg_next(sg))
+		;
+
+	return sg;
+}
+
+static int copy_cyclic_to_user(struct xdma_engine *engine, int pkt_length,
+				int head, char __user *buf, size_t count)
+{
+	struct scatterlist *sg;
+	int more = pkt_length;
+
+	BUG_ON(!engine);
+	BUG_ON(!buf);
+
+	dbg_tfr("%s, pkt_len %d, head %d, user buf idx %u.\n",
+		engine->name, pkt_length, head, engine->user_buffer_index);
+
+	sg = sglist_index(&engine->cyclic_sgt, head);
+	if (!sg) {
+		pr_info("%s, head %d OOR, sgl %u.\n",
+			engine->name, head, engine->cyclic_sgt.orig_nents);
+		return -EIO;
+	}
+
+	/* EOP found? Transfer anything from head to EOP */
+	while (more) {
+		unsigned int copy = more > PAGE_SIZE ? PAGE_SIZE : more;
+		unsigned int blen = count - engine->user_buffer_index;
+		int rv;
+
+		if (copy > blen)
+			copy = blen;
+
+		dbg_tfr("%s sg %d, 0x%p, copy %u to user %u.\n",
+			engine->name, head, sg, copy,
+			engine->user_buffer_index);
+
+		rv = copy_to_user(&buf[engine->user_buffer_index],
+			page_address(sg_page(sg)), copy);
+		if (rv) {
+			pr_info("%s copy_to_user %u failed %d\n",
+				engine->name, copy, rv);
+			return -EIO;
+		}
+
+		more -= copy;
+		engine->user_buffer_index += copy;
+
+		if (engine->user_buffer_index == count) {
+			/* user buffer used up */
+			break;
+		}
+
+		head++;
+		if (head >= CYCLIC_RX_PAGES_MAX) {
+			head = 0;
+			sg = engine->cyclic_sgt.sgl;
+		} else {
+			sg = sg_next(sg);
+		}
+	}
+
+	return pkt_length;
+}
+
+static int complete_cyclic(struct xdma_engine *engine, char __user *buf,
+			   size_t count)
+{
+	struct xdma_result *result;
+	int pkt_length = 0;
+	int fault = 0;
+	int eop = 0;
+	int head;
+	int rc = 0;
+	int num_credit = 0;
+	unsigned long flags;
+
+	BUG_ON(!engine);
+	result = engine->cyclic_result;
+	BUG_ON(!result);
+
+	spin_lock_irqsave(&engine->lock, flags);
+
+	/* where the host currently is in the ring buffer */
+	head = engine->rx_head;
+
+	/* iterate over newly received results */
+	while (engine->rx_head != engine->rx_tail || engine->rx_overrun) {
+
+		WARN_ON(result[engine->rx_head].status == 0);
+
+		dbg_tfr("%s, result[%d].status = 0x%x length = 0x%x.\n",
+			engine->name, engine->rx_head,
+			result[engine->rx_head].status,
+			result[engine->rx_head].length);
+
+		if ((result[engine->rx_head].status >> 16) != C2H_WB) {
+			pr_info("%s, result[%d].status 0x%x, no magic.\n",
+				engine->name, engine->rx_head,
+				result[engine->rx_head].status);
+			fault = 1;
+		} else if (result[engine->rx_head].length > PAGE_SIZE) {
+			pr_info("%s, result[%d].len 0x%x, > PAGE_SIZE 0x%lx.\n",
+				engine->name, engine->rx_head,
+				result[engine->rx_head].length, PAGE_SIZE);
+			fault = 1;
+		} else if (result[engine->rx_head].length == 0) {
+			pr_info("%s, result[%d].length 0x%x.\n",
+				engine->name, engine->rx_head,
+				result[engine->rx_head].length);
+			fault = 1;
+		} else {
+			/* valid result */
+			pkt_length += result[engine->rx_head].length;
+			num_credit++;
+			/* seen eop? */
+			if (result[engine->rx_head].status & RX_STATUS_EOP) {
+				eop = 1;
+				engine->eop_found = 1;
+			}
+
+			dbg_tfr("%s, pkt_length=%d (%s)\n",
+				engine->name, pkt_length,
+				eop ? "with EOP" : "no EOP yet");
+		}
+		/* clear result */
+		result[engine->rx_head].status = 0;
+		result[engine->rx_head].length = 0;
+		/* advance the head pointer so we make progress, even on a fault */
+		engine->rx_head = (engine->rx_head + 1) % CYCLIC_RX_PAGES_MAX;
+
+		/* stop processing if a fault/eop was detected */
+		if (fault || eop)
+			break;
+	}
+
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	if (fault)
+		return -EIO;
+
+	rc = copy_cyclic_to_user(engine, pkt_length, head, buf, count);
+	engine->rx_overrun = 0;
+	/* if copy is successful, release credits */
+	if (rc > 0)
+		write_register(num_credit, &engine->sgdma_regs->credits, 0);
+
+	return rc;
+}
+
+ssize_t xdma_engine_read_cyclic(struct xdma_engine *engine, char __user *buf,
+				size_t count, int timeout_ms)
+{
+	int i = 0;
+	int rc = 0;
+	int rc_len = 0;
+	struct xdma_transfer *transfer;
+
+	BUG_ON(!engine);
+	BUG_ON(engine->magic != MAGIC_ENGINE);
+
+	transfer = &engine->cyclic_req->xfer;
+	BUG_ON(!transfer);
+
+	engine->user_buffer_index = 0;
+
+	do {
+		rc = transfer_monitor_cyclic(engine, transfer, timeout_ms);
+		if (rc < 0)
+			return rc;
+		rc = complete_cyclic(engine, buf, count);
+		if (rc < 0)
+			return rc;
+		rc_len += rc;
+
+		i++;
+		if (i > 10)
+			break;
+	} while (!engine->eop_found);
+
+	if (enable_credit_mp)
+		engine->eop_found = 0;
+
+	return rc_len;
+}
+
+static void sgt_free_with_pages(struct sg_table *sgt, int dir,
+				struct pci_dev *pdev)
+{
+	struct scatterlist *sg = sgt->sgl;
+	int npages = sgt->orig_nents;
+	int i;
+
+	for (i = 0; i < npages; i++, sg = sg_next(sg)) {
+		struct page *pg = sg_page(sg);
+		dma_addr_t bus = sg_dma_address(sg);
+
+		if (pg) {
+			if (pdev)
+				pci_unmap_page(pdev, bus, PAGE_SIZE, dir);
+			__free_page(pg);
+		} else {
+			break;
+		}
+	}
+	sg_free_table(sgt);
+	memset(sgt, 0, sizeof(struct sg_table));
+}
+
+static int sgt_alloc_with_pages(struct sg_table *sgt, unsigned int npages,
+				int dir, struct pci_dev *pdev)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (sg_alloc_table(sgt, npages, GFP_KERNEL)) {
+		pr_info("sgt OOM.\n");
+		return -ENOMEM;
+	}
+
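+	/* back each SG entry with a freshly allocated, DMA-mapped page */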
+	sg = sgt->sgl;
+	for (i = 0; i < npages; i++, sg = sg_next(sg)) {
+		struct page *pg = alloc_page(GFP_KERNEL);
+
+		if (!pg) {
+			pr_info("%d/%u, page OOM.\n", i, npages);
+			goto err_out;
+		}
+
+		if (pdev) {
+			dma_addr_t bus = pci_map_page(pdev, pg, 0, PAGE_SIZE,
+							dir);
+			if (unlikely(pci_dma_mapping_error(pdev, bus))) {
+				pr_info("%d/%u, page 0x%p map err.\n",
+					 i, npages, pg);
+				__free_page(pg);
+				goto err_out;
+			}
+			sg_dma_address(sg) = bus;
+			sg_dma_len(sg) = PAGE_SIZE;
+		}
+		sg_set_page(sg, pg, PAGE_SIZE, 0);
+	}
+
+	sgt->orig_nents = sgt->nents = npages;
+
+	return 0;
+
+err_out:
+	sgt_free_with_pages(sgt, dir, pdev);
+	return -ENOMEM;
+}
+
+int xdma_cyclic_transfer_setup(struct xdma_engine *engine)
+{
+	struct xdma_dev *xdev;
+	struct xdma_transfer *xfer;
+	dma_addr_t bus;
+	unsigned long flags;
+	int i;
+	int rc;
+
+	BUG_ON(!engine);
+	xdev = engine->xdev;
+	BUG_ON(!xdev);
+
+	if (engine->cyclic_req) {
+		pr_info("%s: exclusive access already taken.\n",
+			engine->name);
+		return -EBUSY;
+	}
+
+	spin_lock_irqsave(&engine->lock, flags);
+
+	engine->rx_tail = 0;
+	engine->rx_head = 0;
+	engine->rx_overrun = 0;
+	engine->eop_found = 0;
+
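+	/* back the cyclic RX ring with CYCLIC_RX_PAGES_MAX page-sized buffers */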
+	rc = sgt_alloc_with_pages(&engine->cyclic_sgt, CYCLIC_RX_PAGES_MAX,
+				engine->dir, xdev->pdev);
+	if (rc < 0) {
+		pr_info("%s cyclic pages %u OOM.\n",
+			engine->name, CYCLIC_RX_PAGES_MAX);
+		goto err_out;
+	}
+
+	engine->cyclic_req = xdma_init_request(&engine->cyclic_sgt, 0);
+	if (!engine->cyclic_req) {
+		pr_info("%s cyclic request OOM.\n", engine->name);
+		rc = -ENOMEM;
+		goto err_out;
+	}
+
+#ifdef __LIBXDMA_DEBUG__
+	xdma_request_cb_dump(engine->cyclic_req);
+#endif
+
+	rc = transfer_init(engine, engine->cyclic_req);
+	if (rc < 0)
+		goto err_out;
+
+	xfer = &engine->cyclic_req->xfer;
+
+	/* replace source addresses with result write-back addresses */
+	memset(engine->cyclic_result, 0,
+		CYCLIC_RX_PAGES_MAX * sizeof(struct xdma_result));
+	bus = engine->cyclic_result_bus;
+	for (i = 0; i < xfer->desc_num; i++) {
+		xfer->desc_virt[i].src_addr_lo = cpu_to_le32(PCI_DMA_L(bus));
+		xfer->desc_virt[i].src_addr_hi = cpu_to_le32(PCI_DMA_H(bus));
+		bus += sizeof(struct xdma_result);
+	}
+	/* set control of all descriptors */
+	for (i = 0; i < xfer->desc_num; i++) {
+		xdma_desc_control_clear(xfer->desc_virt + i, LS_BYTE_MASK);
+		xdma_desc_control_set(xfer->desc_virt + i,
+				 XDMA_DESC_EOP | XDMA_DESC_COMPLETED);
+	}
+
+	/* make this a cyclic transfer */
+	xdma_transfer_cyclic(xfer);
+
+#ifdef __LIBXDMA_DEBUG__
+	transfer_dump(xfer);
+#endif
+
+	if (enable_credit_mp)
+		write_register(128, &engine->sgdma_regs->credits, 0);
+
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	/* start cyclic transfer */
+	transfer_queue(engine, xfer);
+
+	return 0;
+
+	/* unwind on errors */
+err_out:
+	if (engine->cyclic_req) {
+		xdma_request_free(engine->cyclic_req);
+		engine->cyclic_req = NULL;
+	}
+
+	if (engine->cyclic_sgt.orig_nents) {
+		sgt_free_with_pages(&engine->cyclic_sgt, engine->dir,
+				xdev->pdev);
+		engine->cyclic_sgt.orig_nents = 0;
+		engine->cyclic_sgt.nents = 0;
+		engine->cyclic_sgt.sgl = NULL;
+	}
+
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	return rc;
+}
+
+
+static int cyclic_shutdown_polled(struct xdma_engine *engine)
+{
+	BUG_ON(!engine);
+
+	spin_lock(&engine->lock);
+
+	dbg_tfr("Polling for shutdown completion\n");
+	do {
+		engine_status_read(engine, 1, 0);
+		schedule();
+	} while (engine->status & XDMA_STAT_BUSY);
+
+	if ((engine->running) && !(engine->status & XDMA_STAT_BUSY)) {
+		dbg_tfr("Engine has stopped\n");
+
+		if (!list_empty(&engine->transfer_list))
+			engine_transfer_dequeue(engine);
+
+		engine_service_shutdown(engine);
+	}
+
+	dbg_tfr("Shutdown completion polling done\n");
+	spin_unlock(&engine->lock);
+
+	return 0;
+}
+
+static int cyclic_shutdown_interrupt(struct xdma_engine *engine)
+{
+	int rc;
+
+	BUG_ON(!engine);
+
+	rc = wait_event_timeout(engine->shutdown_wq,
+				!engine->running, msecs_to_jiffies(10000));
+
+	if (engine->running) {
+		pr_info("%s still running?!, %d\n", engine->name, rc);
+		return -EINVAL;
+	}
+
+	return rc;
+}
+
+int xdma_cyclic_transfer_teardown(struct xdma_engine *engine)
+{
+	int rc;
+	struct xdma_dev *xdev = engine->xdev;
+	struct xdma_transfer *transfer;
+	unsigned long flags;
+
+	transfer = engine_cyclic_stop(engine);
+
+	spin_lock_irqsave(&engine->lock, flags);
+	if (transfer) {
+		dbg_tfr("%s: stop transfer 0x%p.\n", engine->name, transfer);
+		if (transfer != &engine->cyclic_req->xfer) {
+			pr_info("%s unexpected transfer 0x%p/0x%p\n",
+				engine->name, transfer,
+				&engine->cyclic_req->xfer);
+		}
+	}
+	/* allow engine to be serviced after stop request */
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	/* wait for engine to be no longer running */
+	if (poll_mode)
+		rc = cyclic_shutdown_polled(engine);
+	else
+		rc = cyclic_shutdown_interrupt(engine);
+
+	/* obtain spin lock to atomically remove resources */
+	spin_lock_irqsave(&engine->lock, flags);
+
+	if (engine->cyclic_req) {
+		xdma_request_free(engine->cyclic_req);
+		engine->cyclic_req = NULL;
+	}
+
+	if (engine->cyclic_sgt.orig_nents) {
+		sgt_free_with_pages(&engine->cyclic_sgt, engine->dir,
+				xdev->pdev);
+		engine->cyclic_sgt.orig_nents = 0;
+		engine->cyclic_sgt.nents = 0;
+		engine->cyclic_sgt.sgl = NULL;
+	}
+
+	spin_unlock_irqrestore(&engine->lock, flags);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/xocl/lib/libxdma.h b/drivers/gpu/drm/xocl/lib/libxdma.h
new file mode 100644
index 000000000000..b519958f9633
--- /dev/null
+++ b/drivers/gpu/drm/xocl/lib/libxdma.h
@@ -0,0 +1,596 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*******************************************************************************
+ *
+ * Xilinx XDMA IP Core Linux Driver
+ * Copyright(c) 2015 - 2017 Xilinx, Inc.
+ *
+ * Karen Xie <karen.xie@xilinx.com>
+ *
+ ******************************************************************************/
+#ifndef XDMA_LIB_H
+#define XDMA_LIB_H
+
+#include <linux/version.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/module.h>
+#include <linux/dma-mapping.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/workqueue.h>
+
+/* Switch debug printing on/off */
+#define XDMA_DEBUG 0
+
+/* SECTION: Preprocessor macros/constants */
+#define XDMA_BAR_NUM (6)
+
+/* maximum amount of register space to map */
+#define XDMA_BAR_SIZE (0x8000UL)
+
+/* Use this definition to poll several times between calls to schedule */
+#define NUM_POLLS_PER_SCHED 100
+
+#define XDMA_CHANNEL_NUM_MAX (4)
+/*
+ * interrupts per engine, rad2_vul.sv:237
+ * .REG_IRQ_OUT	(reg_irq_from_ch[(channel*2) +: 2]),
+ */
+#define XDMA_ENG_IRQ_NUM (1)
+#define MAX_EXTRA_ADJ (15)
+#define RX_STATUS_EOP (1)
+
+/* Target internal components on XDMA control BAR */
+#define XDMA_OFS_INT_CTRL	(0x2000UL)
+#define XDMA_OFS_CONFIG		(0x3000UL)
+
+/* maximum number of desc per transfer request */
+#define XDMA_TRANSFER_MAX_DESC (2048)
+
+/* maximum size of a single DMA transfer descriptor */
+#define XDMA_DESC_BLEN_BITS	28
+#define XDMA_DESC_BLEN_MAX	((1 << (XDMA_DESC_BLEN_BITS)) - 1)
+
+/* bits of the SG DMA control register */
+#define XDMA_CTRL_RUN_STOP			(1UL << 0)
+#define XDMA_CTRL_IE_DESC_STOPPED		(1UL << 1)
+#define XDMA_CTRL_IE_DESC_COMPLETED		(1UL << 2)
+#define XDMA_CTRL_IE_DESC_ALIGN_MISMATCH	(1UL << 3)
+#define XDMA_CTRL_IE_MAGIC_STOPPED		(1UL << 4)
+#define XDMA_CTRL_IE_IDLE_STOPPED		(1UL << 6)
+#define XDMA_CTRL_IE_READ_ERROR			(0x1FUL << 9)
+#define XDMA_CTRL_IE_DESC_ERROR			(0x1FUL << 19)
+#define XDMA_CTRL_NON_INCR_ADDR			(1UL << 25)
+#define XDMA_CTRL_POLL_MODE_WB			(1UL << 26)
+
+/* bits of the SG DMA status register */
+#define XDMA_STAT_BUSY			(1UL << 0)
+#define XDMA_STAT_DESC_STOPPED		(1UL << 1)
+#define XDMA_STAT_DESC_COMPLETED	(1UL << 2)
+#define XDMA_STAT_ALIGN_MISMATCH	(1UL << 3)
+#define XDMA_STAT_MAGIC_STOPPED		(1UL << 4)
+#define XDMA_STAT_INVALID_LEN		(1UL << 5)
+#define XDMA_STAT_IDLE_STOPPED		(1UL << 6)
+
+#define XDMA_STAT_COMMON_ERR_MASK \
+	(XDMA_STAT_ALIGN_MISMATCH | XDMA_STAT_MAGIC_STOPPED | \
+	 XDMA_STAT_INVALID_LEN)
+
+/* desc_error, C2H & H2C */
+#define XDMA_STAT_DESC_UNSUPP_REQ	(1UL << 19)
+#define XDMA_STAT_DESC_COMPL_ABORT	(1UL << 20)
+#define XDMA_STAT_DESC_PARITY_ERR	(1UL << 21)
+#define XDMA_STAT_DESC_HEADER_EP	(1UL << 22)
+#define XDMA_STAT_DESC_UNEXP_COMPL	(1UL << 23)
+
+#define XDMA_STAT_DESC_ERR_MASK	\
+	(XDMA_STAT_DESC_UNSUPP_REQ | XDMA_STAT_DESC_COMPL_ABORT | \
+	 XDMA_STAT_DESC_PARITY_ERR | XDMA_STAT_DESC_HEADER_EP | \
+	 XDMA_STAT_DESC_UNEXP_COMPL)
+
+/* read error: H2C */
+#define XDMA_STAT_H2C_R_UNSUPP_REQ	(1UL << 9)
+#define XDMA_STAT_H2C_R_COMPL_ABORT	(1UL << 10)
+#define XDMA_STAT_H2C_R_PARITY_ERR	(1UL << 11)
+#define XDMA_STAT_H2C_R_HEADER_EP	(1UL << 12)
+#define XDMA_STAT_H2C_R_UNEXP_COMPL	(1UL << 13)
+
+#define XDMA_STAT_H2C_R_ERR_MASK	\
+	(XDMA_STAT_H2C_R_UNSUPP_REQ | XDMA_STAT_H2C_R_COMPL_ABORT | \
+	 XDMA_STAT_H2C_R_PARITY_ERR | XDMA_STAT_H2C_R_HEADER_EP | \
+	 XDMA_STAT_H2C_R_UNEXP_COMPL)
+
+/* write error, H2C only */
+#define XDMA_STAT_H2C_W_DECODE_ERR	(1UL << 14)
+#define XDMA_STAT_H2C_W_SLAVE_ERR	(1UL << 15)
+
+#define XDMA_STAT_H2C_W_ERR_MASK	\
+	(XDMA_STAT_H2C_W_DECODE_ERR | XDMA_STAT_H2C_W_SLAVE_ERR)
+
+/* read error: C2H */
+#define XDMA_STAT_C2H_R_DECODE_ERR	(1UL << 9)
+#define XDMA_STAT_C2H_R_SLAVE_ERR	(1UL << 10)
+
+#define XDMA_STAT_C2H_R_ERR_MASK	\
+	(XDMA_STAT_C2H_R_DECODE_ERR | XDMA_STAT_C2H_R_SLAVE_ERR)
+
+/* all combined */
+#define XDMA_STAT_H2C_ERR_MASK	\
+	(XDMA_STAT_COMMON_ERR_MASK | XDMA_STAT_DESC_ERR_MASK | \
+	 XDMA_STAT_H2C_R_ERR_MASK | XDMA_STAT_H2C_W_ERR_MASK)
+
+#define XDMA_STAT_C2H_ERR_MASK	\
+	(XDMA_STAT_COMMON_ERR_MASK | XDMA_STAT_DESC_ERR_MASK | \
+	 XDMA_STAT_C2H_R_ERR_MASK)
+
+/* bits of the SGDMA descriptor control field */
+#define XDMA_DESC_STOPPED	(1UL << 0)
+#define XDMA_DESC_COMPLETED	(1UL << 1)
+#define XDMA_DESC_EOP		(1UL << 4)
+
+#define XDMA_PERF_RUN	(1UL << 0)
+#define XDMA_PERF_CLEAR	(1UL << 1)
+#define XDMA_PERF_AUTO	(1UL << 2)
+
+#define MAGIC_ENGINE	0xEEEEEEEEUL
+#define MAGIC_DEVICE	0xDDDDDDDDUL
+
+/* upper 16-bits of engine identifier register */
+#define XDMA_ID_H2C 0x1fc0U
+#define XDMA_ID_C2H 0x1fc1U
+
+/* for C2H AXI-ST mode */
+#define CYCLIC_RX_PAGES_MAX	256
+
+#define LS_BYTE_MASK 0x000000FFUL
+
+#define BLOCK_ID_MASK 0xFFF00000
+#define BLOCK_ID_HEAD 0x1FC00000
+
+#define IRQ_BLOCK_ID 0x1fc20000UL
+#define CONFIG_BLOCK_ID 0x1fc30000UL
+
+#define WB_COUNT_MASK 0x00ffffffUL
+#define WB_ERR_MASK (1UL << 31)
+#define POLL_TIMEOUT_SECONDS 10
+
+#define MAX_USER_IRQ 16
+
+#define MAX_DESC_BUS_ADDR (0xffffffffULL)
+
+#define DESC_MAGIC 0xAD4B0000UL
+
+#define C2H_WB 0x52B4UL
+
+#define MAX_NUM_ENGINES (XDMA_CHANNEL_NUM_MAX * 2)
+#define H2C_CHANNEL_OFFSET 0x1000
+#define SGDMA_OFFSET_FROM_CHANNEL 0x4000
+#define CHANNEL_SPACING 0x100
+#define TARGET_SPACING 0x1000
+
+#define BYPASS_MODE_SPACING 0x0100
+
+/* obtain the 32 most significant (high) bits of a 32-bit or 64-bit address */
+#define PCI_DMA_H(addr) ((addr >> 16) >> 16)
+/* obtain the 32 least significant (low) bits of a 32-bit or 64-bit address */
+#define PCI_DMA_L(addr) (addr & 0xffffffffUL)
+
+#ifndef VM_RESERVED
+	#define VMEM_FLAGS (VM_IO | VM_DONTEXPAND | VM_DONTDUMP)
+#else
+	#define VMEM_FLAGS (VM_IO | VM_RESERVED)
+#endif
+
+#ifdef __LIBXDMA_DEBUG__
+#define dbg_io		pr_err
+#define dbg_fops	pr_err
+#define dbg_perf	pr_err
+#define dbg_sg		pr_err
+#define dbg_tfr		pr_err
+#define dbg_irq		pr_err
+#define dbg_init	pr_err
+#define dbg_desc	pr_err
+#else
+/* disable debugging */
+#define dbg_io(...)
+#define dbg_fops(...)
+#define dbg_perf(...)
+#define dbg_sg(...)
+#define dbg_tfr(...)
+#define dbg_irq(...)
+#define dbg_init(...)
+#define dbg_desc(...)
+#endif
+
+/* SECTION: Enum definitions */
+enum transfer_state {
+	TRANSFER_STATE_NEW = 0,
+	TRANSFER_STATE_SUBMITTED,
+	TRANSFER_STATE_COMPLETED,
+	TRANSFER_STATE_FAILED,
+	TRANSFER_STATE_ABORTED
+};
+
+enum shutdown_state {
+	ENGINE_SHUTDOWN_NONE = 0,	/* No shutdown in progress */
+	ENGINE_SHUTDOWN_REQUEST = 1,	/* engine requested to shutdown */
+	ENGINE_SHUTDOWN_IDLE = 2	/* engine has shutdown and is idle */
+};
+
+enum dev_capabilities {
+	CAP_64BIT_DMA = 2,
+	CAP_64BIT_DESC = 4,
+	CAP_ENGINE_WRITE = 8,
+	CAP_ENGINE_READ = 16
+};
+
+/* SECTION: Structure definitions */
+
+struct config_regs {
+	u32 identifier;
+	u32 reserved_1[4];
+	u32 msi_enable;
+};
+
+/**
+ * SG DMA Controller status and control registers
+ *
+ * These registers make the control interface for DMA transfers.
+ *
+ * It sits in End Point (FPGA) memory BAR[0] for 32-bit or BAR[0:1] for 64-bit.
+ * It references the first descriptor which exists in Root Complex (PC) memory.
+ *
+ * @note The registers must be accessed using 32-bit (PCI DWORD) read/writes,
+ * and their values are in little-endian byte ordering.
+ */
+struct engine_regs {
+	u32 identifier;
+	u32 control;
+	u32 control_w1s;
+	u32 control_w1c;
+	u32 reserved_1[12];	/* padding */
+
+	u32 status;
+	u32 status_rc;
+	u32 completed_desc_count;
+	u32 alignments;
+	u32 reserved_2[14];	/* padding */
+
+	u32 poll_mode_wb_lo;
+	u32 poll_mode_wb_hi;
+	u32 interrupt_enable_mask;
+	u32 interrupt_enable_mask_w1s;
+	u32 interrupt_enable_mask_w1c;
+	u32 reserved_3[9];	/* padding */
+
+	u32 perf_ctrl;
+	u32 perf_cyc_lo;
+	u32 perf_cyc_hi;
+	u32 perf_dat_lo;
+	u32 perf_dat_hi;
+	u32 perf_pnd_lo;
+	u32 perf_pnd_hi;
+} __packed;
+
+struct engine_sgdma_regs {
+	u32 identifier;
+	u32 reserved_1[31];	/* padding */
+
+	/* bus address to first descriptor in Root Complex Memory */
+	u32 first_desc_lo;
+	u32 first_desc_hi;
+	/* number of adjacent descriptors at first_desc */
+	u32 first_desc_adjacent;
+	u32 credits;
+} __packed;
+
+struct msix_vec_table_entry {
+	u32 msi_vec_addr_lo;
+	u32 msi_vec_addr_hi;
+	u32 msi_vec_data_lo;
+	u32 msi_vec_data_hi;
+} __packed;
+
+struct msix_vec_table {
+	struct msix_vec_table_entry entry_list[32];
+} __packed;
+
+struct interrupt_regs {
+	u32 identifier;
+	u32 user_int_enable;
+	u32 user_int_enable_w1s;
+	u32 user_int_enable_w1c;
+	u32 channel_int_enable;
+	u32 channel_int_enable_w1s;
+	u32 channel_int_enable_w1c;
+	u32 reserved_1[9];	/* padding */
+
+	u32 user_int_request;
+	u32 channel_int_request;
+	u32 user_int_pending;
+	u32 channel_int_pending;
+	u32 reserved_2[12];	/* padding */
+
+	u32 user_msi_vector[8];
+	u32 channel_msi_vector[8];
+} __packed;
+
+struct sgdma_common_regs {
+	u32 padding[8];
+	u32 credit_mode_enable;
+	u32 credit_mode_enable_w1s;
+	u32 credit_mode_enable_w1c;
+} __packed;
+
+
+/* Structure for polled mode descriptor writeback */
+struct xdma_poll_wb {
+	u32 completed_desc_count;
+	u32 reserved_1[7];
+} __packed;
+
+
+/**
+ * Descriptor for a single contiguous memory block transfer.
+ *
+ * Multiple descriptors are linked by means of the next pointer. An additional
+ * extra adjacent number gives the amount of extra contiguous descriptors.
+ *
+ * The descriptors are in root complex memory, and the bytes in the 32-bit
+ * words must be in little-endian byte ordering.
+ */
+struct xdma_desc {
+	u32 control;
+	u32 bytes;		/* transfer length in bytes */
+	u32 src_addr_lo;	/* source address (low 32-bit) */
+	u32 src_addr_hi;	/* source address (high 32-bit) */
+	u32 dst_addr_lo;	/* destination address (low 32-bit) */
+	u32 dst_addr_hi;	/* destination address (high 32-bit) */
+	/*
+	 * next descriptor in the single-linked list of descriptors;
+	 * this is the PCIe (bus) address of the next descriptor in the
+	 * root complex memory
+	 */
+	u32 next_lo;		/* next desc address (low 32-bit) */
+	u32 next_hi;		/* next desc address (high 32-bit) */
+} __packed;
+
+/* 32 bytes (four 32-bit words) or 64 bytes (eight 32-bit words) */
+struct xdma_result {
+	u32 status;
+	u32 length;
+	u32 reserved_1[6];	/* padding */
+} __packed;
+
+struct sw_desc {
+	dma_addr_t addr;
+	unsigned int len;
+};
+
+/* Describes a (SG DMA) single transfer for the engine */
+struct xdma_transfer {
+	struct list_head entry;		/* queue of non-completed transfers */
+	struct xdma_desc *desc_virt;	/* virt addr of the 1st descriptor */
+	dma_addr_t desc_bus;		/* bus addr of the first descriptor */
+	int desc_adjacent;		/* adjacent descriptors at desc_bus */
+	int desc_num;			/* number of descriptors in transfer */
+	enum dma_data_direction dir;
+	wait_queue_head_t wq;		/* wait queue for transfer completion */
+
+	enum transfer_state state;	/* state of the transfer */
+	unsigned int flags;
+#define XFER_FLAG_NEED_UNMAP	0x1
+	int cyclic;			/* flag if transfer is cyclic */
+	int last_in_request;		/* flag if last within request */
+	unsigned int len;
+	struct sg_table *sgt;
+};
+
+struct xdma_request_cb {
+	struct sg_table *sgt;
+	unsigned int total_len;
+	u64 ep_addr;
+
+	struct xdma_transfer xfer;
+
+	unsigned int sw_desc_idx;
+	unsigned int sw_desc_cnt;
+	struct sw_desc sdesc[];
+};
+
+struct xdma_engine {
+	unsigned long magic;	/* structure ID for sanity checks */
+	struct xdma_dev *xdev;	/* parent device */
+	char name[16];		/* name of this engine, e.g. "0-H2C0-MM" */
+	int version;		/* version of this engine */
+	//dev_t cdevno;		/* character device major:minor */
+	//struct cdev cdev;	/* character device (embedded struct) */
+
+	/* HW register address offsets */
+	struct engine_regs *regs;		/* Control reg BAR offset */
+	struct engine_sgdma_regs *sgdma_regs;	/* SGDMA reg BAR offset */
+	u32 bypass_offset;			/* Bypass mode BAR offset */
+
+	/* Engine state, configuration and flags */
+	enum shutdown_state shutdown;	/* engine shutdown mode */
+	enum dma_data_direction dir;
+	int running;		/* flag if the driver started engine */
+	int non_incr_addr;	/* flag if non-incremental addressing used */
+	int streaming;
+	int addr_align;		/* source/dest alignment in bytes */
+	int len_granularity;	/* transfer length multiple */
+	int addr_bits;		/* HW datapath address width */
+	int channel;		/* engine indices */
+	int max_extra_adj;	/* descriptor prefetch capability */
+	int desc_dequeued;	/* num descriptors of completed transfers */
+	u32 status;		/* last known status of device */
+	/* only used in MSI-X mode to store the per-engine interrupt mask */
+	u32 interrupt_enable_mask_value;
+
+	/* Transfer list management */
+	struct list_head transfer_list;	/* queue of transfers */
+
+	/* Members applicable to AXI-ST C2H (cyclic) transfers */
+	struct xdma_result *cyclic_result;
+	dma_addr_t cyclic_result_bus;	/* bus addr for transfer */
+	struct xdma_request_cb *cyclic_req;
+	struct sg_table cyclic_sgt;
+	u8 eop_found; /* used only for cyclic(rx:c2h) */
+
+	int rx_tail;	/* follows the HW */
+	int rx_head;	/* where the SW reads from */
+	int rx_overrun;	/* flag if overrun occurred */
+
+	/* for copy from cyclic buffer to user buffer */
+	unsigned int user_buffer_index;
+
+	/* Members associated with polled mode support */
+	u8 *poll_mode_addr_virt;	/* virt addr for descriptor writeback */
+	dma_addr_t poll_mode_bus;	/* bus addr for descriptor writeback */
+
+	/* Members associated with interrupt mode support */
+	wait_queue_head_t shutdown_wq;	/* wait queue for shutdown sync */
+	spinlock_t lock;		/* protects concurrent access */
+	int prev_cpu;			/* remember CPU# of (last) locker */
+	int msix_irq_line;		/* MSI-X vector for this engine */
+	u32 irq_bitmask;		/* IRQ bit mask for this engine */
+	struct work_struct work;	/* Work queue for interrupt handling */
+
+	spinlock_t desc_lock;		/* protects concurrent access */
+#ifdef CONFIG_PREEMPT_COUNT
+	struct mutex desc_mutex;
+#endif
+
+	dma_addr_t desc_bus;
+	struct xdma_desc *desc;
+
+	/* for performance test support */
+	struct xdma_performance_ioctl *xdma_perf;	/* perf test control */
+	wait_queue_head_t xdma_perf_wq;	/* Perf test sync */
+};
+
+struct xdma_user_irq {
+	struct xdma_dev *xdev;		/* parent device */
+	u8 user_idx;			/* 0 ~ 15 */
+	u8 events_irq;			/* accumulated IRQs */
+	spinlock_t events_lock;		/* lock to safely update events_irq */
+	wait_queue_head_t events_wq;	/* wait queue to sync waiting threads */
+	irq_handler_t handler;
+
+	void *dev;
+};
+
+/* XDMA PCIe device specific book-keeping */
+#define XDEV_FLAG_OFFLINE	0x1
+struct xdma_dev {
+	struct list_head list_head;
+	struct list_head rcu_node;
+
+	unsigned long magic;		/* structure ID for sanity checks */
+	struct pci_dev *pdev;	/* pci device struct from probe() */
+	int idx;		/* dev index */
+
+	const char *mod_name;		/* name of module owning the dev */
+
+	spinlock_t lock;		/* protects concurrent access */
+	unsigned int flags;
+
+	/* PCIe BAR management */
+	void __iomem *bar[XDMA_BAR_NUM];	/* addresses for mapped BARs */
+	int user_bar_idx;	/* BAR index of user logic */
+	int config_bar_idx;	/* BAR index of XDMA config logic */
+	int bypass_bar_idx;	/* BAR index of XDMA bypass logic */
+	int regions_in_use;	/* flag if dev was in use during probe() */
+	int got_regions;	/* flag if probe() obtained the regions */
+
+	int user_max;
+	int c2h_channel_max;
+	int h2c_channel_max;
+
+	/* Interrupt management */
+	int irq_count;		/* interrupt counter */
+	int irq_line;		/* flag if irq allocated successfully */
+	int msi_enabled;	/* flag if msi was enabled for the device */
+	int msix_enabled;	/* flag if msi-x was enabled for the device */
+	struct xdma_user_irq user_irq[16];	/* user IRQ management */
+	unsigned int mask_irq_user;
+
+	/* XDMA engine management */
+	int engines_num;	/* Total engine count */
+	u32 mask_irq_h2c;
+	u32 mask_irq_c2h;
+	struct xdma_engine engine_h2c[XDMA_CHANNEL_NUM_MAX];
+	struct xdma_engine engine_c2h[XDMA_CHANNEL_NUM_MAX];
+
+	/* SD_Accel specific */
+	enum dev_capabilities capabilities;
+	u64 feature_id;
+};
+
+static inline int xdma_device_flag_check(struct xdma_dev *xdev, unsigned int f)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&xdev->lock, flags);
+	if (xdev->flags & f) {
+		spin_unlock_irqrestore(&xdev->lock, flags);
+		return 1;
+	}
+	spin_unlock_irqrestore(&xdev->lock, flags);
+	return 0;
+}
+
+static inline int xdma_device_flag_test_n_set(struct xdma_dev *xdev,
+					 unsigned int f)
+{
+	unsigned long flags;
+	int rv = 0;
+
+	spin_lock_irqsave(&xdev->lock, flags);
+	if (xdev->flags & f) {
+		spin_unlock_irqrestore(&xdev->lock, flags);
+		rv = 1;
+	} else
+		xdev->flags |= f;
+	spin_unlock_irqrestore(&xdev->lock, flags);
+	return rv;
+}
+
+static inline void xdma_device_flag_set(struct xdma_dev *xdev, unsigned int f)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&xdev->lock, flags);
+	xdev->flags |= f;
+	spin_unlock_irqrestore(&xdev->lock, flags);
+}
+
+static inline void xdma_device_flag_clear(struct xdma_dev *xdev, unsigned int f)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&xdev->lock, flags);
+	xdev->flags &= ~f;
+	spin_unlock_irqrestore(&xdev->lock, flags);
+}
+
+void write_register(u32 value, void *iomem);
+u32 read_register(void *iomem);
+
+void xdma_device_offline(struct pci_dev *pdev, void *dev_handle);
+void xdma_device_online(struct pci_dev *pdev, void *dev_handle);
+
+int xdma_performance_submit(struct xdma_dev *xdev, struct xdma_engine *engine);
+struct xdma_transfer *engine_cyclic_stop(struct xdma_engine *engine);
+void enable_perf(struct xdma_engine *engine);
+void get_perf_stats(struct xdma_engine *engine);
+
+int xdma_cyclic_transfer_setup(struct xdma_engine *engine);
+int xdma_cyclic_transfer_teardown(struct xdma_engine *engine);
+ssize_t xdma_engine_read_cyclic(struct xdma_engine *engine, char __user *buf,
+	size_t count, int timeout_ms);
+
+#endif /* XDMA_LIB_H */
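
For reference, the flag helpers above can be used to make state transitions
one-shot; a minimal sketch (illustrative, not a quote from the .c files in
this series) using the XDEV_FLAG_OFFLINE bit:

	/* only the first caller actually performs the offline sequence */
	if (xdma_device_flag_test_n_set(xdev, XDEV_FLAG_OFFLINE))
		return;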
diff --git a/drivers/gpu/drm/xocl/lib/libxdma_api.h b/drivers/gpu/drm/xocl/lib/libxdma_api.h
new file mode 100644
index 000000000000..bfa3e7ba9b11
--- /dev/null
+++ b/drivers/gpu/drm/xocl/lib/libxdma_api.h
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2016-2019 Xilinx, Inc
+ *
+ */
+
+#ifndef __XDMA_BASE_API_H__
+#define __XDMA_BASE_API_H__
+
+#include <linux/types.h>
+#include <linux/scatterlist.h>
+#include <linux/interrupt.h>
+
+/*
+ * functions exported by the xdma driver
+ */
+
+/*
+ * xdma_device_open - read the PCI BARs and configure the FPGA
+ *	should be called from probe()
+ *    NOTE:
+ *		user interrupts will not be enabled until
+ *		xdma_user_isr_enable() is called
+ * @pdev: ptr to pci_dev
+ * @mod_name: the module name to be used for request_irq
+ * @user_max: max # of user/event interrupts to be configured
+ * @h2c_channel_max: max # of H2C channels to be configured
+ * @c2h_channel_max: max # of C2H channels to be configured
+ * NOTE: if fewer users/channels are provisioned than the max specified,
+ *	 libxdma will update user_max/channel_max accordingly
+ * returns
+ *	an opaque handle (for libxdma to identify the device)
+ *	NULL, in case of error
+ */
+void *xdma_device_open(const char *mod_name, struct pci_dev *pdev,
+		 int *user_max, int *h2c_channel_max, int *c2h_channel_max);
+
+/*
+ * xdma_device_close - prepare fpga for removal: disable all interrupts (users
+ * and xdma) and release all resources
+ *	should be called from remove()
+ * @pdev: ptr to struct pci_dev
+ * @dev_handle: handle returned by xdma_device_open()
+ */
+void xdma_device_close(struct pci_dev *pdev, void *dev_handle);
+
+/*
+ * xdma_device_restart - restart the fpga
+ * @pdev: ptr to struct pci_dev
+ * TODO:
+ *	may need more refining on the parameter list
+ * return < 0 in case of error
+ * TODO: exact error code will be defined later
+ */
+int xdma_device_restart(struct pci_dev *pdev, void *dev_handle);
+
+/*
+ * xdma_user_isr_register - register a user ISR handler
+ * It is expected that xdma will register the actual ISR and, for each user
+ * interrupt, call the corresponding handler if it is registered and
+ * enabled.
+ *
+ * @dev_hndl: handle returned by xdma_device_open()
+ * @mask: bitmask of user interrupts (0 ~ 15) to be registered
+ *		bit 0: user interrupt 0
+ *		...
+ *		bit 15: user interrupt 15
+ *		any bit above bit 15 will be ignored.
+ * @handler: the corresponding handler
+ *		a NULL handler will be treated as de-registration
+ * @dev: to be passed to the handler, ignored if handler is NULL
+ * return < 0 in case of error
+ * TODO: exact error code will be defined later
+ */
+int xdma_user_isr_register(void *dev_hndl, unsigned int mask,
+			 irq_handler_t handler, void *dev);
+
+/*
+ * xdma_user_isr_enable/disable - enable or disable user interrupt
+ * @dev_hndl: handle returned by xdma_device_open()
+ * @mask: bitmask of user interrupts (0 ~ 15) to be enabled or disabled
+ * return < 0 in case of error
+ * TODO: exact error code will be defined later
+ */
+int xdma_user_isr_enable(void *dev_hndl, unsigned int mask);
+int xdma_user_isr_disable(void *dev_hndl, unsigned int mask);
+
+/*
+ * xdma_xfer_submit - submit data for a DMA operation (read or write)
+ *	This is a blocking call.
+ * @dev_hndl: handle returned by xdma_device_open()
+ * @channel: channel number (< channel_max);
+ *	== channel_max means libxdma can pick any available channel
+ * @write: true for a write to the device (H2C), false for a read (C2H)
+ * @ep_addr: offset into the DDR/BRAM memory to read from or write to
+ * @sgt: the scatter-gather list of data buffers
+ * @dma_mapped: true if @sgt has already been DMA mapped by the caller
+ * @timeout_ms: timeout in milliseconds, currently ignored
+ * return # of bytes transferred or
+ *	 < 0 in case of error
+ * TODO: exact error code will be defined later
+ */
+ssize_t xdma_xfer_submit(void *dev_hndl, int channel, bool write, u64 ep_addr,
+			struct sg_table *sgt, bool dma_mapped, int timeout_ms);
+
+/*
+ * xdma_device_online - bring device online
+ */
+void xdma_device_online(struct pci_dev *pdev, void *dev_hndl);
+
+/*
+ * xdma_device_offline - bring device offline
+ */
+void xdma_device_offline(struct pci_dev *pdev, void *dev_hndl);
+
+/*
+ * xdma_get_userio - get user bar information
+ */
+int xdma_get_userio(void *dev_hndl, void * __iomem *base_addr,
+		    u64 *len, u32 *bar_idx);
+
+/*
+ * xdma_get_bypassio - get bypass bar information
+ */
+int xdma_get_bypassio(void *dev_hndl, u64 *len, u32 *bar_idx);
+
+#endif
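
For reference, a minimal usage sketch of the API above (illustrative only;
it assumes the caller already holds a struct pci_dev *pdev and a DMA-mapped
struct sg_table *sgt, and "my_drv" is a placeholder module name):

	void *hndl;
	int users = 16, h2c = 4, c2h = 4;
	ssize_t done;

	hndl = xdma_device_open("my_drv", pdev, &users, &h2c, &c2h);
	if (!hndl)
		return -ENODEV;

	/* blocking write of sgt to device offset 0 on channel 0 */
	done = xdma_xfer_submit(hndl, 0, true, 0, sgt, true, 10000);
	if (done < 0)
		dev_err(&pdev->dev, "xfer failed: %zd\n", done);

	xdma_device_close(pdev, hndl);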
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC PATCH Xilinx Alveo 5/6] Add management driver
  2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
                   ` (3 preceding siblings ...)
  2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 4/6] Add core of XDMA driver sonal.santan
@ 2019-03-19 21:54 ` sonal.santan
  2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 6/6] Add user physical function driver sonal.santan
  2019-03-25 20:28 ` [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver Daniel Vetter
  6 siblings, 0 replies; 20+ messages in thread
From: sonal.santan @ 2019-03-19 21:54 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c  | 960 +++++++++++++++++++++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h  | 147 ++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c    |  30 +
 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c | 148 ++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h   | 244 ++++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c | 318 ++++++++
 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c | 399 ++++++++++
 7 files changed, 2246 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
 create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c

diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c b/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
new file mode 100644
index 000000000000..2eb0267fc2b2
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
@@ -0,0 +1,960 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Simple Driver for Management PF
+ *
+ * Copyright (C) 2017 Xilinx, Inc.
+ *
+ * Code borrowed from Xilinx SDAccel XDMA driver
+ *
+ * Author(s):
+ * Sonal Santan <sonal.santan@xilinx.com>
+ */
+#include "mgmt-core.h"
+#include <linux/ioctl.h>
+#include <linux/module.h>
+#include <linux/vmalloc.h>
+#include <linux/version.h>
+#include <linux/fs.h>
+#include <linux/platform_device.h>
+#include <linux/i2c.h>
+#include <linux/crc32c.h>
+#include "../xocl_drv.h"
+#include "../version.h"
+
+//#define USE_FEATURE_ROM
+
+static const struct pci_device_id pci_ids[] = XOCL_MGMT_PCI_IDS;
+
+MODULE_DEVICE_TABLE(pci, pci_ids);
+
+int health_interval = 5;
+module_param(health_interval, int, (S_IRUGO|S_IWUSR));
+MODULE_PARM_DESC(health_interval,
+	"Interval (in sec) after which the health thread is run. (1 = Minimum, 5 = default)");
+
+int health_check = 1;
+module_param(health_check, int, (S_IRUGO|S_IWUSR));
+MODULE_PARM_DESC(health_check,
+	"Enable health thread that checks the status of AXI Firewall and SYSMON. (0 = disable, 1 = enable)");
+
+int minimum_initialization;
+module_param(minimum_initialization, int, (S_IRUGO|S_IWUSR));
+MODULE_PARM_DESC(minimum_initialization,
+	"Enable minimum_initialization to force the driver to load without valid firmware or DSA so that xbsak flash can still upgrade the firmware. (0 = normal initialization, 1 = minimum initialization)");
+
+#define	LOW_TEMP		0
+#define	HI_TEMP			85000
+#define	LOW_MILLVOLT		500
+#define	HI_MILLVOLT		2500
+
+
+static dev_t xclmgmt_devnode;
+struct class *xrt_class;
+
+/*
+ * Called when the device goes from unused to used.
+ */
+static int char_open(struct inode *inode, struct file *file)
+{
+	struct xclmgmt_dev *lro;
+
+	/* pointer to containing data structure of the character device inode */
+	lro = xocl_drvinst_open(inode->i_cdev);
+	if (!lro)
+		return -ENXIO;
+
+	/* create a reference to our char device in the opened file */
+	file->private_data = lro;
+	BUG_ON(!lro);
+
+	mgmt_info(lro, "opened file %p by pid: %d\n",
+		file, pid_nr(task_tgid(current)));
+
+	return 0;
+}
+
+/*
+ * Called when the device goes from used to unused.
+ */
+static int char_close(struct inode *inode, struct file *file)
+{
+	struct xclmgmt_dev *lro;
+
+	lro = (struct xclmgmt_dev *)file->private_data;
+	BUG_ON(!lro);
+
+	mgmt_info(lro, "Closing file %p by pid: %d\n",
+		file, pid_nr(task_tgid(current)));
+
+	xocl_drvinst_close(lro);
+
+	return 0;
+}
+
+/*
+ * Unmap the BAR regions that had been mapped earlier using map_bars()
+ */
+static void unmap_bars(struct xclmgmt_dev *lro)
+{
+	if (lro->core.bar_addr) {
+		/* unmap BAR */
+		pci_iounmap(lro->core.pdev, lro->core.bar_addr);
+		/* mark as unmapped */
+		lro->core.bar_addr = NULL;
+	}
+	if (lro->core.intr_bar_addr) {
+		/* unmap BAR */
+		pci_iounmap(lro->core.pdev, lro->core.intr_bar_addr);
+		/* mark as unmapped */
+		lro->core.intr_bar_addr = NULL;
+	}
+}
+
+static int identify_bar(struct xocl_dev_core *core, int bar)
+{
+	void __iomem *bar_addr;
+	resource_size_t bar_len;
+
+	bar_len = pci_resource_len(core->pdev, bar);
+	bar_addr = pci_iomap(core->pdev, bar, bar_len);
+	if (!bar_addr) {
+		xocl_err(&core->pdev->dev, "Could not map BAR #%d", bar);
+		return -EIO;
+	}
+
+	/*
+	 * There is no better way to identify BARs yet. Currently, we have
+	 * DSAs which rely on the VBNV name to differentiate them, and
+	 * reading the VBNV name requires bringing up the Feature ROM.
+	 * So we are not able to specify BARs in devices.h.
+	 */
+	if (bar_len < 1024 * 1024 && bar > 0) {
+		core->intr_bar_idx = bar;
+		core->intr_bar_addr = bar_addr;
+		core->intr_bar_size = bar_len;
+	} else if (bar_len < 256 * 1024 * 1024) {
+		core->bar_idx = bar;
+		core->bar_size = bar_len;
+		core->bar_addr = bar_addr;
+	}
+
+	return 0;
+}
+
+/* map_bars() -- map device regions into kernel virtual address space
+ *
+ * Map the device memory regions into kernel virtual address space after
+ * verifying their sizes respect the minimum sizes needed, given by the
+ * bar_map_sizes[] array.
+ */
+static int map_bars(struct xclmgmt_dev *lro)
+{
+	struct pci_dev *pdev = lro->core.pdev;
+	resource_size_t bar_len;
+	int	i, ret = 0;
+
+	for (i = PCI_STD_RESOURCES; i <= PCI_STD_RESOURCE_END; i++) {
+		bar_len = pci_resource_len(pdev, i);
+		if (bar_len > 0) {
+			ret = identify_bar(&lro->core, i);
+			if (ret)
+				goto failed;
+		}
+	}
+
+	/* successfully mapped all required BAR regions */
+	return 0;
+
+failed:
+	unmap_bars(lro);
+	return ret;
+}
+
+void get_pcie_link_info(struct xclmgmt_dev *lro,
+	unsigned short *link_width, unsigned short *link_speed, bool is_cap)
+{
+	u16 stat;
+	long result;
+	int pos = is_cap ? PCI_EXP_LNKCAP : PCI_EXP_LNKSTA;
+
+	result = pcie_capability_read_word(lro->core.pdev, pos, &stat);
+	if (result) {
+		*link_width = *link_speed = 0;
+		mgmt_err(lro, "Read pcie capability failed");
+		return;
+	}
+	*link_width = (stat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT;
+	*link_speed = stat & PCI_EXP_LNKSTA_CLS;
+}
+
+void device_info(struct xclmgmt_dev *lro, struct xclmgmt_ioc_info *obj)
+{
+	u32 val, major, minor, patch;
+	struct FeatureRomHeader rom;
+
+	memset(obj, 0, sizeof(struct xclmgmt_ioc_info));
+	if (sscanf(XRT_DRIVER_VERSION, "%d.%d.%d", &major, &minor, &patch) != 3)
+		return;
+
+	obj->vendor = lro->core.pdev->vendor;
+	obj->device = lro->core.pdev->device;
+	obj->subsystem_vendor = lro->core.pdev->subsystem_vendor;
+	obj->subsystem_device = lro->core.pdev->subsystem_device;
+	obj->driver_version = XOCL_DRV_VER_NUM(major, minor, patch);
+	obj->pci_slot = PCI_SLOT(lro->core.pdev->devfn);
+
+	val = MGMT_READ_REG32(lro, GENERAL_STATUS_BASE);
+	mgmt_info(lro, "MIG Calibration: %d\n", val);
+
+	obj->mig_calibration[0] = (val & BIT(0)) ? true : false;
+	obj->mig_calibration[1] = obj->mig_calibration[0];
+	obj->mig_calibration[2] = obj->mig_calibration[0];
+	obj->mig_calibration[3] = obj->mig_calibration[0];
+
+	/*
+	 * Get feature rom info
+	 */
+	obj->ddr_channel_num = xocl_get_ddr_channel_count(lro);
+	obj->ddr_channel_size = xocl_get_ddr_channel_size(lro);
+	obj->time_stamp = xocl_get_timestamp(lro);
+	obj->isXPR = XOCL_DSA_XPR_ON(lro);
+	xocl_get_raw_header(lro, &rom);
+	memcpy(obj->vbnv, rom.VBNVName, 64);
+	memcpy(obj->fpga, rom.FPGAPartName, 64);
+
+	/* Get sysmon info */
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_TEMP, &val);
+	obj->onchip_temp = val / 1000;
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_INT, &val);
+	obj->vcc_int = val;
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_AUX, &val);
+	obj->vcc_aux = val;
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_BRAM, &val);
+	obj->vcc_bram = val;
+
+	fill_frequency_info(lro, obj);
+	get_pcie_link_info(lro, &obj->pcie_link_width, &obj->pcie_link_speed,
+		false);
+}
+
+/*
+ * Maps the PCIe BAR into user space for memory-like access using mmap().
+ * Callable even when lro->ready == false.
+ */
+static int bridge_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	int rc;
+	struct xclmgmt_dev *lro;
+	unsigned long off;
+	unsigned long phys;
+	unsigned long vsize;
+	unsigned long psize;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	lro = (struct xclmgmt_dev *)file->private_data;
+	BUG_ON(!lro);
+
+	off = vma->vm_pgoff << PAGE_SHIFT;
+	/* BAR physical address */
+	phys = pci_resource_start(lro->core.pdev, lro->core.bar_idx) + off;
+	vsize = vma->vm_end - vma->vm_start;
+	/* complete resource */
+	psize = pci_resource_end(lro->core.pdev, lro->core.bar_idx) -
+		pci_resource_start(lro->core.pdev, lro->core.bar_idx) + 1 - off;
+
+	mgmt_info(lro, "mmap(): bar %d, phys:0x%lx, vsize:%ld, psize:%ld",
+		lro->core.bar_idx, phys, vsize, psize);
+
+	if (vsize > psize)
+		return -EINVAL;
+
+	/*
+	 * pages must not be cached as this would result in cache line sized
+	 * accesses to the end point
+	 */
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	/*
+	 * prevent touching the pages (byte access) for swap-in,
+	 * and prevent the pages from being swapped out
+	 */
+#ifndef VM_RESERVED
+	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
+#else
+	vma->vm_flags |= VM_IO | VM_RESERVED;
+#endif
+
+	/* make MMIO accessible to user space */
+	rc = io_remap_pfn_range(vma, vma->vm_start, phys >> PAGE_SHIFT,
+				vsize, vma->vm_page_prot);
+	if (rc)
+		return -EAGAIN;
+
+	return rc;
+}
+
+/*
+ * character device file operations for control bus (through control bridge)
+ */
+static const struct file_operations ctrl_fops = {
+	.owner = THIS_MODULE,
+	.open = char_open,
+	.release = char_close,
+	.mmap = bridge_mmap,
+	.unlocked_ioctl = mgmt_ioctl,
+};
+
+/*
+ * create_char() -- create a character device interface to data or control bus
+ *
+ * If at least one SG DMA engine is specified, the character device interface
+ * is coupled to the SG DMA file operations which operate on the data bus. If
+ * no engines are specified, the interface is coupled with the control bus.
+ */
+static int create_char(struct xclmgmt_dev *lro)
+{
+	struct xclmgmt_char *lro_char;
+	int rc;
+
+	lro_char = &lro->user_char_dev;
+
+	/* couple the control device file operations to the character device */
+	lro_char->cdev = cdev_alloc();
+	if (!lro_char->cdev)
+		return -ENOMEM;
+
+	lro_char->cdev->ops = &ctrl_fops;
+	lro_char->cdev->owner = THIS_MODULE;
+	lro_char->cdev->dev = MKDEV(MAJOR(xclmgmt_devnode), lro->core.dev_minor);
+	rc = cdev_add(lro_char->cdev, lro_char->cdev->dev, 1);
+	if (rc < 0) {
+		memset(lro_char, 0, sizeof(*lro_char));
+		mgmt_info(lro, "cdev_add() = %d\n", rc);
+		goto fail_add;
+	}
+
+	lro_char->sys_device = device_create(xrt_class,
+				&lro->core.pdev->dev,
+				lro_char->cdev->dev, NULL,
+				DRV_NAME "%d", lro->instance);
+
+	if (IS_ERR(lro_char->sys_device)) {
+		rc = PTR_ERR(lro_char->sys_device);
+		goto fail_device;
+	}
+
+	return 0;
+
+fail_device:
+	cdev_del(lro_char->cdev);
+fail_add:
+	return rc;
+}
+
+static int destroy_sg_char(struct xclmgmt_char *lro_char)
+{
+	BUG_ON(!lro_char);
+	BUG_ON(!xrt_class);
+
+	if (lro_char->sys_device)
+		device_destroy(xrt_class, lro_char->cdev->dev);
+	cdev_del(lro_char->cdev);
+
+	return 0;
+}
+
+struct pci_dev *find_user_node(const struct pci_dev *pdev)
+{
+	struct xclmgmt_dev *lro;
+	unsigned int slot = PCI_SLOT(pdev->devfn);
+	unsigned int func = PCI_FUNC(pdev->devfn);
+	struct pci_dev *user_dev;
+
+	lro = (struct xclmgmt_dev *)dev_get_drvdata(&pdev->dev);
+
+	/*
+	 * If we are function 1, then function 0 of the
+	 * same device holds the user pf node.
+	 */
+	if (func == 0) {
+		mgmt_err(lro, "failed to get user pf; mgmt pf is unexpectedly at func 0");
+		return NULL;
+	}
+
+	user_dev = pci_get_slot(pdev->bus, PCI_DEVFN(slot, 0));
+	if (!user_dev) {
+		mgmt_err(lro, "did not find user dev");
+		return NULL;
+	}
+
+	return user_dev;
+}
+
+inline void check_temp_within_range(struct xclmgmt_dev *lro, u32 temp)
+{
+	if (temp < LOW_TEMP || temp > HI_TEMP) {
+		mgmt_err(lro, "Temperature outside normal range (%d-%d) %d.",
+			LOW_TEMP, HI_TEMP, temp);
+	}
+}
+
+inline void check_volt_within_range(struct xclmgmt_dev *lro, u16 volt)
+{
+	if (volt < LOW_MILLVOLT || volt > HI_MILLVOLT) {
+		mgmt_err(lro, "Voltage outside normal range (%d-%d)mV %d.",
+			LOW_MILLVOLT, HI_MILLVOLT, volt);
+	}
+}
+
+static void check_sysmon(struct xclmgmt_dev *lro)
+{
+	u32 val;
+
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_TEMP, &val);
+	check_temp_within_range(lro, val);
+
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_INT, &val);
+	check_volt_within_range(lro, val);
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_AUX, &val);
+	check_volt_within_range(lro, val);
+	xocl_sysmon_get_prop(lro, XOCL_SYSMON_PROP_VCC_BRAM, &val);
+	check_volt_within_range(lro, val);
+}
+
+static int health_check_cb(void *data)
+{
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)data;
+	struct mailbox_req mbreq = { MAILBOX_REQ_FIREWALL, };
+	bool tripped;
+
+	if (!health_check)
+		return 0;
+
+	mutex_lock(&lro->busy_mutex);
+	tripped = xocl_af_check(lro, NULL);
+	mutex_unlock(&lro->busy_mutex);
+
+	if (!tripped) {
+		check_sysmon(lro);
+	} else {
+		mgmt_info(lro, "firewall tripped, notify peer");
+		(void) xocl_peer_notify(lro, &mbreq, sizeof(struct mailbox_req));
+	}
+
+	return 0;
+}
+
+static inline bool xclmgmt_support_intr(struct xclmgmt_dev *lro)
+{
+	return lro->core.intr_bar_addr != NULL;
+}
+
+static int xclmgmt_setup_msix(struct xclmgmt_dev *lro)
+{
+	int total, rv;
+
+	if (!xclmgmt_support_intr(lro))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Get start vector (index into msi-x table) of msi-x usr intr on this
+	 * device.
+	 *
+	 * The device has XCLMGMT_MAX_USER_INTR number of usr intrs, the last
+	 * half of them belongs to mgmt pf, and the first half to user pf. All
+	 * vectors are hard-wired.
+	 *
+	 * The device also has some number of DMA intrs whose vectors come
+	 * before usr ones.
+	 *
+	 * This means that mgmt pf needs to allocate msi-x table big enough to
+	 * cover its own usr vectors. So, only the last chunk of the table will
+	 * ever be used for mgmt pf.
+	 */
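+	/*
+	 * Illustrative example (numbers are hypothetical): if the register
+	 * read below reports a user start vector of 4 and
+	 * XCLMGMT_MAX_USER_INTR is 16, pci_alloc_irq_vectors() is asked for
+	 * 4 + 16 = 20 vectors, even though only the trailing chunk reserved
+	 * for the mgmt pf is ever wired up via request_irq() in
+	 * xclmgmt_intr_register().
+	 */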
+	lro->msix_user_start_vector = XOCL_READ_REG32(lro->core.intr_bar_addr +
+		XCLMGMT_INTR_USER_VECTOR) & 0x0f;
+	total = lro->msix_user_start_vector + XCLMGMT_MAX_USER_INTR;
+
+	rv = pci_alloc_irq_vectors(lro->core.pdev, total, total, PCI_IRQ_MSIX);
+	if (rv == total)
+		rv = 0;
+	mgmt_info(lro, "setting up msix, total irqs: %d, rv=%d\n", total, rv);
+	return rv;
+}
+
+static void xclmgmt_teardown_msix(struct xclmgmt_dev *lro)
+{
+	if (xclmgmt_support_intr(lro))
+		pci_disable_msix(lro->core.pdev);
+}
+
+static int xclmgmt_intr_config(xdev_handle_t xdev_hdl, u32 intr, bool en)
+{
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)xdev_hdl;
+
+	if (!xclmgmt_support_intr(lro))
+		return -EOPNOTSUPP;
+
+	XOCL_WRITE_REG32(1 << intr, lro->core.intr_bar_addr +
+		(en ? XCLMGMT_INTR_USER_ENABLE : XCLMGMT_INTR_USER_DISABLE));
+	return 0;
+}
+
+static int xclmgmt_intr_register(xdev_handle_t xdev_hdl, u32 intr,
+	irq_handler_t handler, void *arg)
+{
+	u32 vec;
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)xdev_hdl;
+
+	if (!xclmgmt_support_intr(lro))
+		return -EOPNOTSUPP;
+
+	vec = pci_irq_vector(lro->core.pdev,
+		lro->msix_user_start_vector + intr);
+
+	if (handler)
+		return request_irq(vec, handler, 0, DRV_NAME, arg);
+
+	free_irq(vec, arg);
+	return 0;
+}
+
+static int xclmgmt_reset(xdev_handle_t xdev_hdl)
+{
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)xdev_hdl;
+
+	return reset_hot_ioctl(lro);
+}
+
+struct xocl_pci_funcs xclmgmt_pci_ops = {
+	.intr_config = xclmgmt_intr_config,
+	.intr_register = xclmgmt_intr_register,
+	.reset = xclmgmt_reset,
+};
+
+static int xclmgmt_read_subdev_req(struct xclmgmt_dev *lro, char *data_ptr,
+	void **resp, size_t *sz)
+{
+	uint64_t val = 0;
+	size_t resp_sz = 0;
+	void *ptr = NULL;
+	struct mailbox_subdev_peer *subdev_req =
+		(struct mailbox_subdev_peer *)data_ptr;
+
+	switch (subdev_req->kind) {
+	case VOL_12V_PEX:
+		val = xocl_xmc_get_data(lro, subdev_req->kind);
+		resp_sz = sizeof(u32);
+		ptr = (void *)&val;
+		break;
+	case IDCODE:
+		val = xocl_icap_get_data(lro, subdev_req->kind);
+		resp_sz = sizeof(u32);
+		ptr = (void *)&val;
+		break;
+	case XCLBIN_UUID:
+		ptr = (void *)xocl_icap_get_data(lro, subdev_req->kind);
+		resp_sz = sizeof(uuid_t);
+		break;
+	default:
+		break;
+	}
+
+	if (!resp_sz)
+		return -EINVAL;
+
+	*resp = vmalloc(resp_sz);
+	if (*resp == NULL)
+		return -ENOMEM;
+
+	memcpy(*resp, ptr, resp_sz);
+	*sz = resp_sz;
+	return 0;
+}
+
+static void xclmgmt_mailbox_srv(void *arg, void *data, size_t len,
+	u64 msgid, int err)
+{
+	int ret = 0;
+	size_t sz = 0;
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)arg;
+	struct mailbox_req *req = (struct mailbox_req *)data;
+	struct mailbox_req_bitstream_lock *bitstm_lock = NULL;
+	struct mailbox_bitstream_kaddr *mb_kaddr = NULL;
+	void *resp = NULL;
+
+	bitstm_lock = (struct mailbox_req_bitstream_lock *)req->data;
+
+	if (err != 0)
+		return;
+
+	mgmt_info(lro, "%s received request (%d) from peer\n", __func__, req->req);
+
+	switch (req->req) {
+	case MAILBOX_REQ_LOCK_BITSTREAM:
+		ret = xocl_icap_lock_bitstream(lro, &bitstm_lock->uuid,
+			0);
+		(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		break;
+	case MAILBOX_REQ_UNLOCK_BITSTREAM:
+		ret = xocl_icap_unlock_bitstream(lro, &bitstm_lock->uuid,
+			0);
+		break;
+	case MAILBOX_REQ_HOT_RESET:
+		ret = (int) reset_hot_ioctl(lro);
+		(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		break;
+	case MAILBOX_REQ_LOAD_XCLBIN_KADDR:
+		mb_kaddr = (struct mailbox_bitstream_kaddr *)req->data;
+		ret = xocl_icap_download_axlf(lro, (void *)mb_kaddr->addr);
+		(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		break;
+	case MAILBOX_REQ_LOAD_XCLBIN:
+		ret = xocl_icap_download_axlf(lro, req->data);
+		(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		break;
+	case MAILBOX_REQ_RECLOCK:
+		ret = xocl_icap_ocl_update_clock_freq_topology(lro,
+			(struct xclmgmt_ioc_freqscaling *)req->data);
+		(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		break;
+	case MAILBOX_REQ_PEER_DATA:
+		ret = xclmgmt_read_subdev_req(lro, req->data, &resp, &sz);
+		if (ret) {
+			/* if can't get data, return 0 as response */
+			ret = 0;
+			(void) xocl_peer_response(lro, msgid, &ret, sizeof(ret));
+		} else
+			(void) xocl_peer_response(lro, msgid, resp, sz);
+		vfree(resp);
+		break;
+	default:
+		break;
+	}
+}
+
+/*
+ * Called after minimum initialization is done. Should not return failure.
+ * If something goes wrong, it should clean up and return back to minimum
+ * initialization stage.
+ */
+static void xclmgmt_extended_probe(struct xclmgmt_dev *lro)
+{
+	int ret;
+	struct xocl_board_private *dev_info = &lro->core.priv;
+	struct pci_dev *pdev = lro->pci_dev;
+
+	/* We can only support MSI-X. */
+	ret = xclmgmt_setup_msix(lro);
+	if (ret && (ret != -EOPNOTSUPP)) {
+		xocl_err(&pdev->dev, "set up MSI-X failed\n");
+		goto fail;
+	}
+	lro->core.pci_ops = &xclmgmt_pci_ops;
+	lro->core.pdev = pdev;
+
+	/*
+	 * Workaround needed on some platforms. Will clear out any stale
+	 * data after the platform has been reset
+	 */
+	ret = xocl_subdev_create_one(lro,
+		&(struct xocl_subdev_info)XOCL_DEVINFO_AF);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to register firewall\n");
+		goto fail_firewall;
+	}
+	if (dev_info->flags & XOCL_DSAFLAG_AXILITE_FLUSH)
+		platform_axilite_flush(lro);
+
+	ret = xocl_subdev_create_all(lro, dev_info->subdev_info,
+		dev_info->subdev_num);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to register subdevs\n");
+		goto fail_all_subdev;
+	}
+	xocl_info(&pdev->dev, "created all sub devices");
+
+	ret = xocl_icap_download_boot_firmware(lro);
+	if (ret)
+		goto fail_all_subdev;
+
+	lro->core.thread_arg.health_cb = health_check_cb;
+	lro->core.thread_arg.arg = lro;
+	lro->core.thread_arg.interval = health_interval * 1000;
+
+	health_thread_start(lro);
+
+	/* Launch the mailbox server. */
+	(void) xocl_peer_listen(lro, xclmgmt_mailbox_srv, (void *)lro);
+
+	lro->ready = true;
+	xocl_info(&pdev->dev, "device fully initialized\n");
+	return;
+
+fail_all_subdev:
+	xocl_subdev_destroy_all(lro);
+fail_firewall:
+	xclmgmt_teardown_msix(lro);
+fail:
+	xocl_err(&pdev->dev, "failed to fully probe device, err: %d\n", ret);
+}
+
+/*
+ * Device initialization is done in two phases:
+ * 1. Minimum initialization - init to the point where open/close/mmap entry
+ * points are working, sysfs entries work without register access, ioctl entry
+ * point is completely disabled.
+ * 2. Full initialization - driver is ready for use.
+ * Once we pass minimum initialization point, probe function shall not fail.
+ */
+static int xclmgmt_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	int rc = 0;
+	struct xclmgmt_dev *lro = NULL;
+	struct xocl_board_private *dev_info;
+
+	xocl_info(&pdev->dev, "Driver: %s", XRT_DRIVER_VERSION);
+	xocl_info(&pdev->dev, "probe(pdev = 0x%p, pci_id = 0x%p)\n", pdev, id);
+
+	rc = pci_enable_device(pdev);
+	if (rc) {
+		xocl_err(&pdev->dev, "pci_enable_device() failed, rc = %d.\n",
+			rc);
+		return rc;
+	}
+
+	/* allocate zeroed device book keeping structure */
+	lro = xocl_drvinst_alloc(&pdev->dev, sizeof(struct xclmgmt_dev));
+	if (!lro) {
+		xocl_err(&pdev->dev, "Could not allocate xclmgmt_dev.\n");
+		rc = -ENOMEM;
+		goto err_alloc;
+	}
+
+	/* create a device to driver reference */
+	dev_set_drvdata(&pdev->dev, lro);
+	/* create a driver to device reference */
+	lro->core.pdev = pdev;
+	lro->pci_dev = pdev;
+	lro->ready = false;
+
+	rc = pcie_get_readrq(pdev);
+	if (rc < 0) {
+		dev_err(&pdev->dev, "failed to read mrrs %d\n", rc);
+		goto err_alloc;
+	}
+	if (rc > 512) {
+		rc = pcie_set_readrq(pdev, 512);
+		if (rc) {
+			dev_err(&pdev->dev, "failed to force mrrs %d\n", rc);
+			goto err_alloc;
+		}
+	}
+
+	rc = xocl_alloc_dev_minor(lro);
+	if (rc)
+		goto err_alloc_minor;
+
+	dev_info = (struct xocl_board_private *)id->driver_data;
+	xocl_fill_dsa_priv(lro, dev_info);
+
+	/* map BARs */
+	rc = map_bars(lro);
+	if (rc)
+		goto err_map;
+
+	lro->instance = XOCL_DEV_ID(pdev);
+	rc = create_char(lro);
+	if (rc) {
+		xocl_err(&pdev->dev, "create_char(user_char_dev) failed\n");
+		goto err_cdev;
+	}
+
+	xocl_drvinst_set_filedev(lro, lro->user_char_dev.cdev);
+
+	mutex_init(&lro->busy_mutex);
+
+	mgmt_init_sysfs(&pdev->dev);
+
+	/* Probe will not fail from now on. */
+	xocl_info(&pdev->dev, "minimum initialization done\n");
+
+	/* No further initialization for MFG board. */
+	if (minimum_initialization ||
+		(dev_info->flags & XOCL_DSAFLAG_MFG) != 0) {
+		return 0;
+	}
+
+	xclmgmt_extended_probe(lro);
+
+	return 0;
+
+err_cdev:
+	unmap_bars(lro);
+err_map:
+	xocl_free_dev_minor(lro);
+err_alloc_minor:
+	dev_set_drvdata(&pdev->dev, NULL);
+	xocl_drvinst_free(lro);
+err_alloc:
+	pci_disable_device(pdev);
+
+	return rc;
+}
+
+static void xclmgmt_remove(struct pci_dev *pdev)
+{
+	struct xclmgmt_dev *lro;
+
+	if (!pdev || !dev_get_drvdata(&pdev->dev))
+		return;
+
+	lro = (struct xclmgmt_dev *)dev_get_drvdata(&pdev->dev);
+	mgmt_info(lro, "remove(0x%p) where pdev->dev.driver_data = 0x%p",
+	       pdev, lro);
+	BUG_ON(lro->core.pdev != pdev);
+
+	health_thread_stop(lro);
+
+	mgmt_fini_sysfs(&pdev->dev);
+
+	xocl_subdev_destroy_all(lro);
+
+	xclmgmt_teardown_msix(lro);
+	/* remove user character device */
+	destroy_sg_char(&lro->user_char_dev);
+
+	/* unmap the BARs */
+	unmap_bars(lro);
+	pci_disable_device(pdev);
+
+	xocl_free_dev_minor(lro);
+
+	dev_set_drvdata(&pdev->dev, NULL);
+
+	xocl_drvinst_free(lro);
+}
+
+static pci_ers_result_t mgmt_pci_error_detected(struct pci_dev *pdev,
+	pci_channel_state_t state)
+{
+	switch (state) {
+	case pci_channel_io_normal:
+		xocl_info(&pdev->dev, "PCI normal state error\n");
+		return PCI_ERS_RESULT_CAN_RECOVER;
+	case pci_channel_io_frozen:
+		xocl_info(&pdev->dev, "PCI frozen state error\n");
+		return PCI_ERS_RESULT_NEED_RESET;
+	case pci_channel_io_perm_failure:
+		xocl_info(&pdev->dev, "PCI failure state error\n");
+		return PCI_ERS_RESULT_DISCONNECT;
+	default:
+		xocl_info(&pdev->dev, "PCI unknown state %d error\n", state);
+		break;
+	}
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static const struct pci_error_handlers xclmgmt_err_handler = {
+	.error_detected = mgmt_pci_error_detected,
+};
+
+static struct pci_driver xclmgmt_driver = {
+	.name = DRV_NAME,
+	.id_table = pci_ids,
+	.probe = xclmgmt_probe,
+	.remove = xclmgmt_remove,
+	/* resume, suspend are optional */
+	.err_handler = &xclmgmt_err_handler,
+};
+
+static int (*drv_reg_funcs[])(void) __initdata = {
+	xocl_init_feature_rom,
+	xocl_init_sysmon,
+	xocl_init_mb,
+	xocl_init_xvc,
+	xocl_init_mailbox,
+	xocl_init_firewall,
+	xocl_init_icap,
+	xocl_init_mig,
+	xocl_init_xmc,
+	xocl_init_dna,
+	xocl_init_fmgr,
+};
+
+static void (*drv_unreg_funcs[])(void) = {
+	xocl_fini_feature_rom,
+	xocl_fini_sysmon,
+	xocl_fini_mb,
+	xocl_fini_xvc,
+	xocl_fini_mailbox,
+	xocl_fini_firewall,
+	xocl_fini_icap,
+	xocl_fini_mig,
+	xocl_fini_xmc,
+	xocl_fini_dna,
+	xocl_fini_fmgr,
+};
+
+static int __init xclmgmt_init(void)
+{
+	int res, i;
+
+	pr_info(DRV_NAME " init()\n");
+	xrt_class = class_create(THIS_MODULE, "xrt_mgmt");
+	if (IS_ERR(xrt_class))
+		return PTR_ERR(xrt_class);
+
+	res = alloc_chrdev_region(&xclmgmt_devnode, 0,
+				  XOCL_MAX_DEVICES, DRV_NAME);
+	if (res)
+		goto alloc_err;
+
+	/* Need to init sub device driver before pci driver register */
+	for (i = 0; i < ARRAY_SIZE(drv_reg_funcs); ++i) {
+		res = drv_reg_funcs[i]();
+		if (res)
+			goto drv_init_err;
+	}
+
+	res = pci_register_driver(&xclmgmt_driver);
+	if (res)
+		goto reg_err;
+
+	return 0;
+
+drv_init_err:
+reg_err:
+	for (i--; i >= 0; i--)
+		drv_unreg_funcs[i]();
+
+	unregister_chrdev_region(xclmgmt_devnode, XOCL_MAX_DEVICES);
+alloc_err:
+	pr_info(DRV_NAME " init() err\n");
+	class_destroy(xrt_class);
+	return res;
+}
+
+static void xclmgmt_exit(void)
+{
+	int i;
+
+	pr_info(DRV_NAME" exit()\n");
+	pci_unregister_driver(&xclmgmt_driver);
+
+	for (i = ARRAY_SIZE(drv_unreg_funcs) - 1; i >= 0; i--)
+		drv_unreg_funcs[i]();
+
+	/* unregister this driver from the PCI bus driver */
+	unregister_chrdev_region(xclmgmt_devnode, XOCL_MAX_DEVICES);
+	class_destroy(xrt_class);
+}
+
+module_init(xclmgmt_init);
+module_exit(xclmgmt_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Lizhi Hou <lizhi.hou@xilinx.com>");
+MODULE_VERSION(XRT_DRIVER_VERSION);
+MODULE_DESCRIPTION("Xilinx SDx management function driver");
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h b/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
new file mode 100644
index 000000000000..14ef10e21e00
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/**
+ * Copyright (C) 2017-2019 Xilinx, Inc.
+ *
+ * Author(s):
+ * Sonal Santan <sonal.santan@xilinx.com>
+ */
+
+#ifndef _XCL_MGT_PF_H_
+#define _XCL_MGT_PF_H_
+
+#include <linux/cdev.h>
+#include <linux/list.h>
+#include <linux/signal.h>
+#include <linux/init_task.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <linux/time.h>
+#include <linux/types.h>
+#include <asm/io.h>
+#include <drm/xmgmt_drm.h>
+#include "mgmt-reg.h"
+#include "../xclfeatures.h"
+#include "../xocl_drv.h"
+
+#define DRV_NAME "xmgmt"
+
+#define	MGMT_READ_REG32(lro, off)	\
+	ioread32(lro->core.bar_addr + off)
+#define	MGMT_WRITE_REG32(lro, off, val)	\
+	iowrite32(val, lro->core.bar_addr + off)
+#define	MGMT_WRITE_REG8(lro, off, val)	\
+	iowrite8(val, lro->core.bar_addr + off)
+
+#define	mgmt_err(lro, fmt, args...)	\
+	dev_err(&lro->core.pdev->dev, "%s: "fmt, __func__, ##args)
+#define	mgmt_info(lro, fmt, args...)	\
+	dev_info(&lro->core.pdev->dev, "%s: "fmt, __func__, ##args)
+
+#define	MGMT_PROC_TABLE_HASH_SZ		256
+
+struct xclmgmt_ioc_info;
+
+// List of processes that are using the mgmt driver
+// also saving the task
+struct proc_list {
+	struct list_head head;
+	struct pid      *pid;
+	bool		 signaled;
+};
+
+struct power_val {
+	s32 max;
+	s32 avg;
+	s32 curr;
+};
+
+struct mgmt_power {
+	struct power_val vccint;
+	struct power_val vcc1v8;
+	struct power_val vcc1v2;
+	struct power_val vccbram;
+	struct power_val mgtavcc;
+	struct power_val mgtavtt;
+};
+
+struct xclmgmt_proc_ctx {
+	struct xclmgmt_dev	*lro;
+	struct pid		*pid;
+	bool			signaled;
+};
+
+struct xclmgmt_char {
+	struct xclmgmt_dev *lro;
+	struct cdev *cdev;
+	struct device *sys_device;
+};
+
+struct xclmgmt_data_buf {
+	enum mb_cmd_type cmd_type;
+	uint64_t priv_data;
+	char *data_buf;
+};
+
+struct xclmgmt_dev {
+	struct xocl_dev_core	core;
+	/* MAGIC_DEVICE == 0xAAAAAAAA */
+	unsigned long magic;
+
+	/* the kernel pci device data structure provided by probe() */
+	struct pci_dev *pci_dev;
+	int instance;
+	struct xclmgmt_char user_char_dev;
+	int axi_gate_frozen;
+	unsigned short ocl_frequency[4];
+
+	struct mutex busy_mutex;
+	struct mgmt_power power;
+
+	int msix_user_start_vector;
+	bool ready;
+
+};
+
+extern int health_check;
+
+int ocl_freqscaling_ioctl(struct xclmgmt_dev *lro, const void __user *arg);
+void platform_axilite_flush(struct xclmgmt_dev *lro);
+u16 get_dsa_version(struct xclmgmt_dev *lro);
+void fill_frequency_info(struct xclmgmt_dev *lro, struct xclmgmt_ioc_info *obj);
+void device_info(struct xclmgmt_dev *lro, struct xclmgmt_ioc_info *obj);
+long mgmt_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+void get_pcie_link_info(struct xclmgmt_dev *lro,
+			unsigned short *width, unsigned short *speed, bool is_cap);
+
+// utils.c
+unsigned int compute_unit_busy(struct xclmgmt_dev *lro);
+int pci_fundamental_reset(struct xclmgmt_dev *lro);
+
+long reset_hot_ioctl(struct xclmgmt_dev *lro);
+void xdma_reset(struct pci_dev *pdev, bool prepare);
+void xclmgmt_reset_pci(struct xclmgmt_dev *lro);
+
+// firewall.c
+void init_firewall(struct xclmgmt_dev *lro);
+void xclmgmt_killall_processes(struct xclmgmt_dev *lro);
+void xclmgmt_list_add(struct xclmgmt_dev *lro, struct pid *new_pid);
+void xclmgmt_list_remove(struct xclmgmt_dev *lro, struct pid *remove_pid);
+void xclmgmt_list_del(struct xclmgmt_dev *lro);
+bool xclmgmt_check_proc(struct xclmgmt_dev *lro, struct pid *pid);
+
+// mgmt-xvc.c
+long xvc_ioctl(struct xclmgmt_dev *lro, const void __user *arg);
+
+//mgmt-sysfs.c
+int mgmt_init_sysfs(struct device *dev);
+void mgmt_fini_sysfs(struct device *dev);
+
+//mgmt-mb.c
+int mgmt_init_mb(struct xclmgmt_dev *lro);
+void mgmt_fini_mb(struct xclmgmt_dev *lro);
+int mgmt_start_mb(struct xclmgmt_dev *lro);
+int mgmt_stop_mb(struct xclmgmt_dev *lro);
+
+#endif
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c b/drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
new file mode 100644
index 000000000000..5e60db260b37
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/**
+ *  Copyright (C) 2017-2019 Xilinx, Inc. All rights reserved.
+ *
+ *  Code borrowed from Xilinx SDAccel XDMA driver
+ *  Author: Umang Parekh
+ *
+ */
+
+#include "mgmt-core.h"
+
+int ocl_freqscaling_ioctl(struct xclmgmt_dev *lro, const void __user *arg)
+{
+	struct xclmgmt_ioc_freqscaling freq_obj;
+
+	mgmt_info(lro, "%s called", __func__);
+
+	if (copy_from_user((void *)&freq_obj, arg,
+		sizeof(struct xclmgmt_ioc_freqscaling)))
+		return -EFAULT;
+
+	return xocl_icap_ocl_update_clock_freq_topology(lro, &freq_obj);
+}
+
+void fill_frequency_info(struct xclmgmt_dev *lro, struct xclmgmt_ioc_info *obj)
+{
+	(void) xocl_icap_ocl_get_freq(lro, 0, obj->ocl_frequency,
+		ARRAY_SIZE(obj->ocl_frequency));
+}
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c b/drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
new file mode 100644
index 000000000000..bd53b6997d2a
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/**
+ *  Copyright (C) 2017 Xilinx, Inc. All rights reserved.
+ *  Author: Sonal Santan
+ *  Code copied verbatim from SDAccel xcldma kernel mode driver
+ */
+
+#include "mgmt-core.h"
+
+static int err_info_ioctl(struct xclmgmt_dev *lro, void __user *arg)
+{
+	struct xclmgmt_err_info obj;
+	u32	val, level;
+	u64	t;
+	int	i;
+
+	mgmt_info(lro, "Enter error_info IOCTL");
+
+	xocl_af_get_prop(lro, XOCL_AF_PROP_TOTAL_LEVEL, &val);
+	if (val > ARRAY_SIZE(obj.mAXIErrorStatus)) {
+		mgmt_err(lro, "Too many levels %d", val);
+		return -EINVAL;
+	}
+
+	obj.mNumFirewalls = val;
+	memset(obj.mAXIErrorStatus, 0, sizeof(obj.mAXIErrorStatus));
+	for (i = 0; i < obj.mNumFirewalls; ++i)
+		obj.mAXIErrorStatus[i].mErrFirewallID = i;
+
+	xocl_af_get_prop(lro, XOCL_AF_PROP_DETECTED_LEVEL, &level);
+	if (level >= val) {
+		mgmt_err(lro, "Invalid detected level %d", level);
+		return -EINVAL;
+	}
+	obj.mAXIErrorStatus[level].mErrFirewallID = level;
+
+	xocl_af_get_prop(lro, XOCL_AF_PROP_DETECTED_STATUS, &val);
+	obj.mAXIErrorStatus[level].mErrFirewallStatus = val;
+
+	xocl_af_get_prop(lro, XOCL_AF_PROP_DETECTED_TIME, &t);
+	obj.mAXIErrorStatus[level].mErrFirewallTime = t;
+
+	if (copy_to_user(arg, &obj, sizeof(struct xclErrorStatus)))
+		return -EFAULT;
+	return 0;
+}
+
+static int version_ioctl(struct xclmgmt_dev *lro, void __user *arg)
+{
+	struct xclmgmt_ioc_info obj;
+
+	mgmt_info(lro, "%s: %s\n", DRV_NAME, __func__);
+	device_info(lro, &obj);
+	if (copy_to_user(arg, &obj, sizeof(struct xclmgmt_ioc_info)))
+		return -EFAULT;
+	return 0;
+}
+
+static long reset_ocl_ioctl(struct xclmgmt_dev *lro)
+{
+	xocl_icap_reset_axi_gate(lro);
+	return compute_unit_busy(lro) ? -EBUSY : 0;
+}
+
+static int bitstream_ioctl_axlf(struct xclmgmt_dev *lro, const void __user *arg)
+{
+	void *copy_buffer = NULL;
+	size_t copy_buffer_size = 0;
+	struct xclmgmt_ioc_bitstream_axlf ioc_obj = { 0 };
+	struct axlf xclbin_obj = { 0 };
+	int ret = 0;
+
+	if (copy_from_user((void *)&ioc_obj, arg, sizeof(ioc_obj)))
+		return -EFAULT;
+	if (copy_from_user((void *)&xclbin_obj, ioc_obj.xclbin,
+		sizeof(xclbin_obj)))
+		return -EFAULT;
+
+	copy_buffer_size = xclbin_obj.m_header.m_length;
+	copy_buffer = vmalloc(copy_buffer_size);
+	if (copy_buffer == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user((void *)copy_buffer, ioc_obj.xclbin,
+		copy_buffer_size))
+		ret = -EFAULT;
+	else
+		ret = xocl_icap_download_axlf(lro, copy_buffer);
+
+	vfree(copy_buffer);
+	return ret;
+}
+
+long mgmt_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	struct xclmgmt_dev *lro = (struct xclmgmt_dev *)filp->private_data;
+	long result = 0;
+
+	BUG_ON(!lro);
+
+	if (!lro->ready || _IOC_TYPE(cmd) != XCLMGMT_IOC_MAGIC)
+		return -ENOTTY;
+
+	if (_IOC_DIR(cmd) & _IOC_READ)
+		result = !access_ok((void __user *)arg, _IOC_SIZE(cmd));
+	else if (_IOC_DIR(cmd) & _IOC_WRITE)
+		result =  !access_ok((void __user *)arg, _IOC_SIZE(cmd));
+
+	if (result)
+		return -EFAULT;
+
+	mutex_lock(&lro->busy_mutex);
+
+	switch (cmd) {
+	case XCLMGMT_IOCINFO:
+		result = version_ioctl(lro, (void __user *)arg);
+		break;
+	case XCLMGMT_IOCICAPDOWNLOAD:
+		mgmt_err(lro, "Bitstream ioctl with legacy bitstream not supported");
+		result = -EINVAL;
+		break;
+	case XCLMGMT_IOCICAPDOWNLOAD_AXLF:
+		result = bitstream_ioctl_axlf(lro, (void __user *)arg);
+		break;
+	case XCLMGMT_IOCOCLRESET:
+		result = reset_ocl_ioctl(lro);
+		break;
+	case XCLMGMT_IOCHOTRESET:
+		result = reset_hot_ioctl(lro);
+		break;
+	case XCLMGMT_IOCFREQSCALE:
+		result = ocl_freqscaling_ioctl(lro, (void __user *)arg);
+		break;
+	case XCLMGMT_IOCREBOOT:
+		result = capable(CAP_SYS_ADMIN) ? pci_fundamental_reset(lro) : -EACCES;
+		break;
+	case XCLMGMT_IOCERRINFO:
+		result = err_info_ioctl(lro, (void __user *)arg);
+		break;
+	default:
+		mgmt_info(lro, "MGMT default IOCTL request %u\n", cmd & 0xff);
+		result = -ENOTTY;
+	}
+
+	mutex_unlock(&lro->busy_mutex);
+	return result;
+}
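
For reference, a minimal user space caller of the IOCTLs above might look
like the following (illustrative only; it assumes <fcntl.h>, <sys/ioctl.h>,
<stdio.h> and the xmgmt ioctl header from this series, and the /dev node
name depends on the instance assigned in create_char()):

	struct xclmgmt_ioc_info info;
	int fd = open("/dev/xmgmt0", O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, XCLMGMT_IOCINFO, &info) == 0)
		printf("vendor 0x%x device 0x%x\n", info.vendor, info.device);
	close(fd);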
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h b/drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
new file mode 100644
index 000000000000..cff012c98673
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
@@ -0,0 +1,244 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Apache-2.0 */
+
+/**
+ * Copyright (C) 2016-2019 Xilinx, Inc
+ */
+
+#ifndef _XCL_MGT_REG_H_
+#define _XCL_MGT_REG_H_
+
+
+#define KB(x)   ((unsigned int) (x) << 10)
+#define MB(x)   ((unsigned int) (x) << 20)
+
+enum PFO_BARS {
+	USER_BAR = 0,
+	DMA_BAR,
+	MAX_BAR
+};
+
+/**
+ * Peripherals on AXI-Lite mapped to PCIe BAR
+ */
+
+#define XILINX_VENDOR_ID	0x10EE
+#define OCL_CU_CTRL_RANGE	KB(4)
+
+#define DDR_BUFFER_ALIGNMENT	0x40
+#define MMAP_SIZE_USER		MB(32)
+
+//parameters for HWICAP, Flash and APM on PCIe BAR
+#define HWICAP_OFFSET		0x020000
+#define AXI_GATE_OFFSET		0x030000
+#define AXI_GATE_OFFSET_READ	0x030008
+#define BPI_FLASH_OFFSET	0x040000
+
+//Base addresses for LAPC
+#define LAPC0_BASE	      0x00120000  //ocl master00
+#define LAPC1_BASE	      0x00121000  //ocl master01
+#define LAPC2_BASE	      0x00122000  //ocl master02
+#define LAPC3_BASE	      0x00123000  //ocl master03
+
+//Following status registers are available at each base
+#define LAPC_OVERALL_STATUS_OFFSET	  0x0
+#define LAPC_CUMULATIVE_STATUS_0_OFFSET	  0x100
+#define LAPC_CUMULATIVE_STATUS_1_OFFSET	  0x104
+#define LAPC_CUMULATIVE_STATUS_2_OFFSET	  0x108
+#define LAPC_CUMULATIVE_STATUS_3_OFFSET	  0x10c
+
+#define LAPC_SNAPSHOT_STATUS_0_OFFSET	  0x200
+#define LAPC_SNAPSHOT_STATUS_1_OFFSET	  0x204
+#define LAPC_SNAPSHOT_STATUS_2_OFFSET	  0x208
+#define LAPC_SNAPSHOT_STATUS_3_OFFSET	  0x20c
+
+// NOTE: monitor address offset now defined by PERFMON0_BASE
+#define PERFMON0_OFFSET		0x0
+#define PERFMON1_OFFSET		0x020000
+#define PERFMON2_OFFSET		0x010000
+
+#define PERFMON_START_OFFSET	0x2000
+#define PERFMON_RANGE			0x1000
+
+#define FEATURE_ROM_BASE	   0x0B0000
+#define OCL_CTLR_BASE		   0x000000
+#define HWICAP_BASE		   0x020000
+#define AXI_GATE_BASE		   0x030000
+#define AXI_GATE_BASE_RD_BASE	   0x030008
+#define FEATURE_ID_BASE		   0x031000
+#define GENERAL_STATUS_BASE	   0x032000
+#define AXI_I2C_BASE		   0x041000
+#define PERFMON0_BASE		   0x100000
+#define PERFMON0_BASE2		   0x1800000
+#define OCL_CLKWIZ0_BASE	   0x050000
+#define OCL_CLKWIZ1_BASE	   0x051000
+/* Only needed for workaround for 5.0 platforms */
+#define GPIO_NULL_BASE		   0x1FFF000
+
+
+#define OCL_CLKWIZ_STATUS_OFFSET      0x4
+#define OCL_CLKWIZ_CONFIG_OFFSET(n)   (0x200 + 4 * (n))
+
+/**
+ * AXI Firewall Register definition
+ */
+#define FIREWALL_MGMT_CONTROL_BASE	0xD0000
+#define FIREWALL_USER_CONTROL_BASE	0xE0000
+#define FIREWALL_DATAPATH_BASE		0xF0000
+
+#define AF_MI_FAULT_STATUS_OFFSET	       0x0	//MI Fault Status Register
+#define AF_MI_SOFT_CTRL_OFFSET		       0x4	//MI Soft Fault Control Register
+#define AF_UNBLOCK_CTRL_OFFSET		       0x8	//MI Unblock Control Register
+
+// Currently un-used regs from the Firewall IP.
+#define AF_MAX_CONTINUOUS_RTRANSFERS_WAITS     0x30	//MAX_CONTINUOUS_RTRANSFERS_WAITS
+#define AF_MAX_WRITE_TO_BVALID_WAITS	       0x34	//MAX_WRITE_TO_BVALID_WAITS
+#define AF_MAX_ARREADY_WAITS		       0x38	//MAX_ARREADY_WAITS
+#define AF_MAX_AWREADY_WAITS		       0x3c	//MAX_AWREADY_WAITS
+#define AF_MAX_WREADY_WAITS		       0x40	//MAX_WREADY_WAITS
+
+/**
+ * DDR Zero IP Register definition
+ */
+//#define ENABLE_DDR_ZERO_IP
+#define DDR_ZERO_BASE			0x0B0000
+#define DDR_ZERO_CONFIG_REG_OFFSET	0x10
+#define DDR_ZERO_CTRL_REG_OFFSET	0x0
+
+
+/**
+ * SYSMON Register definition
+ */
+#define SYSMON_BASE		0x0A0000
+#define SYSMON_TEMP		0x400		// TEMPERATURE REGISTER ADDRESS
+#define SYSMON_VCCINT		0x404		// VCCINT REGISTER OFFSET
+#define SYSMON_VCCAUX		0x408		// VCCAUX REGISTER OFFSET
+#define SYSMON_VCCBRAM		0x418		// VCCBRAM REGISTER OFFSET
+#define	SYSMON_TEMP_MAX		0x480
+#define	SYSMON_VCCINT_MAX	0x484
+#define	SYSMON_VCCAUX_MAX	0x488
+#define	SYSMON_VCCBRAM_MAX	0x48c
+#define	SYSMON_TEMP_MIN		0x490
+#define	SYSMON_VCCINT_MIN	0x494
+#define	SYSMON_VCCAUX_MIN	0x498
+#define	SYSMON_VCCBRAM_MIN	0x49c
+
+#define	SYSMON_TO_MILLDEGREE(val)		\
+	(((int64_t)(val) * 501374 >> 16) - 273678)
+#define	SYSMON_TO_MILLVOLT(val)			\
+	((val) * 1000 * 3 >> 16)
+
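+/*
+ * Example conversions (raw readings are illustrative):
+ *	SYSMON_TO_MILLDEGREE(0xA02C) = (41004 * 501374 >> 16) - 273678
+ *				    ~= 40017 millidegrees (~40.0 C)
+ *	SYSMON_TO_MILLVOLT(0x5555)   = 21845 * 3000 >> 16 ~= 999 mV
+ */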
+
+/**
+ * ICAP Register definition
+ */
+
+#define XHWICAP_GIER		(HWICAP_BASE+0x1c)
+#define XHWICAP_ISR		(HWICAP_BASE+0x20)
+#define XHWICAP_IER		(HWICAP_BASE+0x28)
+#define XHWICAP_WF		(HWICAP_BASE+0x100)
+#define XHWICAP_RF		(HWICAP_BASE+0x104)
+#define XHWICAP_SZ		(HWICAP_BASE+0x108)
+#define XHWICAP_CR		(HWICAP_BASE+0x10c)
+#define XHWICAP_SR		(HWICAP_BASE+0x110)
+#define XHWICAP_WFV		(HWICAP_BASE+0x114)
+#define XHWICAP_RFO		(HWICAP_BASE+0x118)
+#define XHWICAP_ASR		(HWICAP_BASE+0x11c)
+
+/* Used for parsing bitstream header */
+#define XHI_EVEN_MAGIC_BYTE	0x0f
+#define XHI_ODD_MAGIC_BYTE	0xf0
+
+/* Extra mode for IDLE */
+#define XHI_OP_IDLE  -1
+
+#define XHI_BIT_HEADER_FAILURE -1
+
+/* The imaginary module length register */
+#define XHI_MLR			 15
+
+#define DMA_HWICAP_BITFILE_BUFFER_SIZE 1024
+
+/*
+ * Flash programming constants
+ * XAPP 518
+ * http://www.xilinx.com/support/documentation/application_notes/xapp518-isp-bpi-prom-virtex-6-pcie.pdf
+ * Table 1
+ */
+
+#define START_ADDR_HI_CMD   0x53420000
+#define START_ADDR_CMD	    0x53410000
+#define END_ADDR_CMD	    0x45000000
+#define END_ADDR_HI_CMD	    0x45420000
+#define UNLOCK_CMD	    0x556E6C6B
+#define ERASE_CMD	    0x45726173
+#define PROGRAM_CMD	    0x50726F67
+#define VERSION_CMD	    0x55726F73
+
+#define READY_STAT	    0x00008000
+#define ERASE_STAT	    0x00000000
+#define PROGRAM_STAT	    0x00000080
+
+/*
+ * Booting FPGA from PROM
+ * http://www.xilinx.com/support/documentation/user_guides/ug470_7Series_Config.pdf
+ * Table 7.1
+ */
+
+#define DUMMY_WORD	   0xFFFFFFFF
+#define SYNC_WORD	   0xAA995566
+#define TYPE1_NOOP	   0x20000000
+#define TYPE1_WRITE_WBSTAR 0x30020001
+#define WBSTAR_ADD10	   0x00000000
+#define WBSTAR_ADD11	   0x01000000
+#define TYPE1_WRITE_CMD	   0x30008001
+#define IPROG_CMD	   0x0000000F
+
+/*
+ * MicroBlaze definition
+ */
+
+#define	MB_REG_BASE		0x120000
+#define	MB_GPIO			0x131000
+#define	MB_IMAGE_MGMT		0x140000
+#define	MB_IMAGE_SCHE		0x160000
+
+#define	MB_REG_VERSION		(MB_REG_BASE)
+#define	MB_REG_ID		(MB_REG_BASE + 0x4)
+#define	MB_REG_STATUS		(MB_REG_BASE + 0x8)
+#define	MB_REG_ERR		(MB_REG_BASE + 0xC)
+#define	MB_REG_CAP		(MB_REG_BASE + 0x10)
+#define	MB_REG_CTL		(MB_REG_BASE + 0x18)
+#define	MB_REG_STOP_CONFIRM	(MB_REG_BASE + 0x1C)
+#define	MB_REG_CURR_BASE	(MB_REG_BASE + 0x20)
+#define	MB_REG_POW_CHK		(MB_REG_BASE + 0x1A4)
+
+#define	MB_CTL_MASK_STOP		0x8
+#define	MB_CTL_MASK_PAUSE		0x4
+#define	MB_CTL_MASK_CLEAR_ERR		0x2
+#define MB_CTL_MASK_CLEAR_POW		0x1
+
+#define	MB_STATUS_MASK_INIT_DONE	0x1
+#define	MB_STATUS_MASK_STOPPED		0x2
+#define	MB_STATUS_MASK_PAUSED		0x4
+
+#define	MB_CAP_MASK_PM			0x1
+
+#define	MB_VALID_ID			0x74736574
+
+#define	MB_GPIO_RESET			0x0
+#define	MB_GPIO_ENABLED			0x1
+
+#define	MB_SELF_JUMP(ins)		(((ins) & 0xfc00ffff) == 0xb8000000)
+
+/*
+ * Interrupt controls
+ */
+#define XCLMGMT_MAX_INTR_NUM		32
+#define XCLMGMT_MAX_USER_INTR		16
+#define XCLMGMT_INTR_CTRL_BASE		(0x2000UL)
+#define XCLMGMT_INTR_USER_ENABLE	(XCLMGMT_INTR_CTRL_BASE + 0x08)
+#define XCLMGMT_INTR_USER_DISABLE	(XCLMGMT_INTR_CTRL_BASE + 0x0C)
+#define XCLMGMT_INTR_USER_VECTOR	(XCLMGMT_INTR_CTRL_BASE + 0x80)
+#define XCLMGMT_MAILBOX_INTR		11
+
+#endif
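
For reference, these offsets and conversion macros combine with the
MGMT_READ_REG32() helper from mgmt-core.h roughly as follows (illustrative
sketch; lro is a struct xclmgmt_dev pointer and the sysmon subdevice driver
itself is not shown here):

	u32 raw = MGMT_READ_REG32(lro, SYSMON_BASE + SYSMON_TEMP);
	int temp_mdeg = SYSMON_TO_MILLDEGREE(raw);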
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c b/drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
new file mode 100644
index 000000000000..40d7c855ab14
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * sysfs for the device attributes.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *    Lizhi Hou <lizhih@xilinx.com>
+ *    Umang Parekh <umang.parekh@xilinx.com>
+ *
+ */
+
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+
+#include "mgmt-core.h"
+#include "../version.h"
+
+static ssize_t instance_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%u\n", lro->instance);
+}
+static DEVICE_ATTR_RO(instance);
+
+static ssize_t error_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+	ssize_t count = sprintf(buf, "%s\n", lro->core.ebuf);
+
+	lro->core.ebuf[0] = 0;
+	return count;
+}
+static DEVICE_ATTR_RO(error);
+
+static ssize_t userbar_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", lro->core.bar_idx);
+}
+static DEVICE_ATTR_RO(userbar);
+
+static ssize_t flash_type_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%s\n",
+		lro->core.priv.flash_type ? lro->core.priv.flash_type : "");
+}
+static DEVICE_ATTR_RO(flash_type);
+
+static ssize_t board_name_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%s\n",
+		lro->core.priv.board_name ? lro->core.priv.board_name : "");
+}
+static DEVICE_ATTR_RO(board_name);
+
+static ssize_t mfg_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", (lro->core.priv.flags & XOCL_DSAFLAG_MFG) != 0);
+}
+static DEVICE_ATTR_RO(mfg);
+
+static ssize_t feature_rom_offset_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%llu\n", lro->core.feature_rom_offset);
+}
+static DEVICE_ATTR_RO(feature_rom_offset);
+
+static ssize_t mgmt_pf_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	// The existence of this entry indicates the mgmt function.
+	return sprintf(buf, "%s", "");
+}
+static DEVICE_ATTR_RO(mgmt_pf);
+
+static ssize_t version_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	u32 major, minor, patch;
+
+	if (sscanf(XRT_DRIVER_VERSION, "%d.%d.%d", &major, &minor, &patch) != 3)
+		return 0;
+	return sprintf(buf, "%d\n", XOCL_DRV_VER_NUM(major, minor, patch));
+}
+static DEVICE_ATTR_RO(version);
+
+static ssize_t slot_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", PCI_SLOT(lro->core.pdev->devfn));
+}
+static DEVICE_ATTR_RO(slot);
+
+static ssize_t link_speed_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	get_pcie_link_info(lro, &width, &speed, false);
+	return sprintf(buf, "%d\n", speed);
+}
+static DEVICE_ATTR_RO(link_speed);
+
+static ssize_t link_width_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	get_pcie_link_info(lro, &width, &speed, false);
+	return sprintf(buf, "%d\n", width);
+}
+static DEVICE_ATTR_RO(link_width);
+
+static ssize_t link_speed_max_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	get_pcie_link_info(lro, &width, &speed, true);
+	return sprintf(buf, "%d\n", speed);
+}
+static DEVICE_ATTR_RO(link_speed_max);
+
+static ssize_t link_width_max_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	get_pcie_link_info(lro, &width, &speed, true);
+	return sprintf(buf, "%d\n", width);
+}
+static DEVICE_ATTR_RO(link_width_max);
+
+static ssize_t mig_calibration_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n",
+		lro->ready ? MGMT_READ_REG32(lro, GENERAL_STATUS_BASE) : 0);
+}
+static DEVICE_ATTR_RO(mig_calibration);
+
+static ssize_t xpr_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", XOCL_DSA_XPR_ON(lro));
+}
+static DEVICE_ATTR_RO(xpr);
+
+static ssize_t ready_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", lro->ready);
+}
+static DEVICE_ATTR_RO(ready);
+
+static ssize_t dev_offline_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+	int val = lro->core.offline ? 1 : 0;
+
+	return sprintf(buf, "%d\n", val);
+}
+
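+/*
+ * dev_offline: writing 1 stops the health thread and destroys all
+ * subdevices (taking the device offline); writing 0 recreates the
+ * subdevices and restarts the health thread.
+ *
+ * Usage sketch (the sysfs path below is illustrative, not mandated by
+ * this patch):
+ *   echo 1 > /sys/bus/pci/devices/<mgmt BDF>/dev_offline
+ *   echo 0 > /sys/bus/pci/devices/<mgmt BDF>/dev_offline
+ */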
+static ssize_t dev_offline_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+	int ret;
+	u32 offline;
+
+	if (kstrtou32(buf, 10, &offline) || offline > 1)
+		return -EINVAL;
+
+	device_lock(dev);
+	if (offline) {
+		ret = health_thread_stop(lro);
+		if (ret) {
+			xocl_err(dev, "stop health thread failed");
+			device_unlock(dev);
+			return -EIO;
+		}
+		xocl_subdev_destroy_all(lro);
+		lro->core.offline = true;
+	} else {
+		ret = xocl_subdev_create_all(lro, lro->core.priv.subdev_info,
+			lro->core.priv.subdev_num);
+		if (ret) {
+			xocl_err(dev, "Online subdevices failed");
+			device_unlock(dev);
+			return -EIO;
+		}
+		ret = health_thread_start(lro);
+		if (ret) {
+			xocl_err(dev, "start health thread failed");
+			device_unlock(dev);
+			return -EIO;
+		}
+		lro->core.offline = false;
+	}
+	device_unlock(dev);
+
+	return count;
+}
+
+static DEVICE_ATTR(dev_offline, 0644, dev_offline_show, dev_offline_store);
+
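+/*
+ * subdev_online/subdev_offline (below): write a platform subdevice name to
+ * create or destroy that subdevice at runtime via
+ * xocl_subdev_create_by_name()/xocl_subdev_destroy_by_name().
+ */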
+static ssize_t subdev_online_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+	int ret;
+	char *name = (char *)buf;
+
+	device_lock(dev);
+	ret = xocl_subdev_create_by_name(lro, name);
+	if (ret)
+		xocl_err(dev, "create subdev by name failed");
+	else
+		ret = count;
+	device_unlock(dev);
+
+	return ret;
+}
+
+static DEVICE_ATTR(subdev_online, 0200, NULL, subdev_online_store);
+
+static ssize_t subdev_offline_store(struct device *dev,
+	struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xclmgmt_dev *lro = dev_get_drvdata(dev);
+	int ret;
+	char *name = (char *)buf;
+
+	device_lock(dev);
+	ret = xocl_subdev_destroy_by_name(lro, name);
+	if (ret)
+		xocl_err(dev, "destroy subdev by name failed");
+	else
+		ret = count;
+	device_unlock(dev);
+
+	return ret;
+}
+
+static DEVICE_ATTR(subdev_offline, 0200, NULL, subdev_offline_store);
+
+static struct attribute *mgmt_attrs[] = {
+	&dev_attr_instance.attr,
+	&dev_attr_error.attr,
+	&dev_attr_userbar.attr,
+	&dev_attr_version.attr,
+	&dev_attr_slot.attr,
+	&dev_attr_link_speed.attr,
+	&dev_attr_link_width.attr,
+	&dev_attr_link_speed_max.attr,
+	&dev_attr_link_width_max.attr,
+	&dev_attr_mig_calibration.attr,
+	&dev_attr_xpr.attr,
+	&dev_attr_ready.attr,
+	&dev_attr_mfg.attr,
+	&dev_attr_mgmt_pf.attr,
+	&dev_attr_flash_type.attr,
+	&dev_attr_board_name.attr,
+	&dev_attr_feature_rom_offset.attr,
+	&dev_attr_dev_offline.attr,
+	&dev_attr_subdev_online.attr,
+	&dev_attr_subdev_offline.attr,
+	NULL,
+};
+
+static struct attribute_group mgmt_attr_group = {
+	.attrs = mgmt_attrs,
+};
+
+int mgmt_init_sysfs(struct device *dev)
+{
+	int err;
+
+	err = sysfs_create_group(&dev->kobj, &mgmt_attr_group);
+	if (err)
+		xocl_err(dev, "create mgmt attrs failed: %d", err);
+
+	return err;
+}
+
+void mgmt_fini_sysfs(struct device *dev)
+{
+	sysfs_remove_group(&dev->kobj, &mgmt_attr_group);
+}
diff --git a/drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c b/drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
new file mode 100644
index 000000000000..ed70ca83d748
--- /dev/null
+++ b/drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
@@ -0,0 +1,399 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ *  Copyright (C) 2017-2019 Xilinx, Inc. All rights reserved.
+ *
+ *  Utility Functions for sysmon, axi firewall and other peripherals.
+ *  Author: Umang Parekh
+ *
+ */
+
+#include "mgmt-core.h"
+#include <linux/module.h>
+#include "../xocl_drv.h"
+
+#define XCLMGMT_RESET_MAX_RETRY		10
+
+/**
+ * @returns: NULL if AER capability is not found walking up to the root port
+ *         : pci_dev ptr to the port which is AER capable.
+ */
+static struct pci_dev *find_aer_cap(struct pci_dev *bridge)
+{
+	struct pci_dev *prev_bridge = bridge;
+	int cap;
+
+	if (bridge == NULL)
+		return NULL;
+	/*
+	 * Walk the hierarchy up to the root port
+	 **/
+	do {
+		cap = pci_find_ext_capability(bridge, PCI_EXT_CAP_ID_ERR);
+		if (cap) {
+			printk(KERN_DEBUG "%s: AER capability found.\n", DRV_NAME);
+			return bridge;
+		}
+
+		prev_bridge = bridge;
+		bridge = bridge->bus->self;
+
+		if (!bridge || prev_bridge == bridge) {
+			printk(KERN_DEBUG "%s: AER capability not found. Ignoring boot command.\n", DRV_NAME);
+			return NULL;
+		}
+
+	} while (pci_pcie_type(bridge) != PCI_EXP_TYPE_ROOT_PORT);
+
+	return NULL;
+}
+
+/*
+ * pcie_(un)mask_surprise_down inspired by myri10ge driver, myri10ge.c
+ */
+static int pcie_mask_surprise_down(struct pci_dev *pdev, u32 *orig_mask)
+{
+	struct pci_dev *bridge = pdev->bus->self;
+	int cap;
+	u32 mask;
+
+	printk(KERN_INFO "%s: pcie_mask_surprise_down\n", DRV_NAME);
+
+	bridge = find_aer_cap(bridge);
+	if (bridge) {
+		cap = pci_find_ext_capability(bridge, PCI_EXT_CAP_ID_ERR);
+		if (cap) {
+			pci_read_config_dword(bridge, cap + PCI_ERR_UNCOR_MASK, orig_mask);
+			mask = *orig_mask;
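+			/* 0x20 = Surprise Down Error (PCI_ERR_UNC_SURPDN) */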
+			mask |= 0x20;
+			pci_write_config_dword(bridge, cap + PCI_ERR_UNCOR_MASK, mask);
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
+static int pcie_unmask_surprise_down(struct pci_dev *pdev, u32 orig_mask)
+{
+	struct pci_dev *bridge = pdev->bus->self;
+	int cap;
+
+	printk(KERN_DEBUG "%s: pcie_unmask_surprise_down\n", DRV_NAME);
+
+	bridge = find_aer_cap(bridge);
+	if (bridge) {
+		cap = pci_find_ext_capability(bridge, PCI_EXT_CAP_ID_ERR);
+		if (cap) {
+			pci_write_config_dword(bridge, cap + PCI_ERR_UNCOR_MASK, orig_mask);
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
+/**
+ * Workaround for some DSAs that need the axilite bus flushed after reset
+ */
+void platform_axilite_flush(struct xclmgmt_dev *lro)
+{
+	u32 val, i, gpio_val;
+
+	mgmt_info(lro, "Flushing axilite busses.");
+
+	/* The flush sequence works as follows:
+	 * Read axilite peripheral up to 4 times
+	 * Check if firewall trips and clear it.
+	 * Touch all axilite interconnects with clock crossing
+	 * in platform which requires reading multiple peripherals
+	 * (Feature ROM, MB Reset GPIO, Sysmon)
+	 */
+	for (i = 0; i < 4; i++) {
+		val = MGMT_READ_REG32(lro, FEATURE_ROM_BASE);
+		xocl_af_clear(lro);
+	}
+
+	for (i = 0; i < 4; i++) {
+		gpio_val = MGMT_READ_REG32(lro, MB_GPIO);
+		xocl_af_clear(lro);
+	}
+
+	for (i = 0; i < 4; i++) {
+		val = MGMT_READ_REG32(lro, SYSMON_BASE);
+		xocl_af_clear(lro);
+	}
+
+	//Can only read this safely if not in reset
+	if (gpio_val == 1) {
+		for (i = 0; i < 4; i++) {
+			val = MGMT_READ_REG32(lro, MB_IMAGE_SCHE);
+			xocl_af_clear(lro);
+		}
+	}
+
+	for (i = 0; i < 4; i++) {
+		val = MGMT_READ_REG32(lro, XHWICAP_CR);
+		xocl_af_clear(lro);
+	}
+
+	for (i = 0; i < 4; i++) {
+		val = MGMT_READ_REG32(lro, GPIO_NULL_BASE);
+		xocl_af_clear(lro);
+	}
+
+	for (i = 0; i < 4; i++) {
+		val = MGMT_READ_REG32(lro, AXI_GATE_BASE);
+		xocl_af_clear(lro);
+	}
+}
+
+/**
+ * Perform a PCIe secondary bus reset. Note: prefer this method over a PCIe
+ * fundamental reset; it is known to work better.
+ */
+
+long reset_hot_ioctl(struct xclmgmt_dev *lro)
+{
+	long err = 0;
+	const char *ep_name;
+	struct pci_dev *pdev = lro->pci_dev;
+	struct xocl_board_private *dev_info = &lro->core.priv;
+	int retry = 0;
+
+	if (!pdev->bus || !pdev->bus->self) {
+		mgmt_err(lro, "Unable to identify device root port for card %d",
+		       lro->instance);
+		err = -ENODEV;
+		goto done;
+	}
+
+	ep_name = pdev->bus->name;
+#if defined(__PPC64__)
+	mgmt_err(lro, "Ignore reset operation for card %d in slot %s:%02x:%1x",
+		lro->instance, ep_name,
+		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+#else
+	mgmt_err(lro, "Trying to reset card %d in slot %s:%02x:%1x",
+		lro->instance, ep_name,
+		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+
+	/* request XMC/ERT to stop */
+	xocl_mb_stop(lro);
+
+	xocl_icap_reset_axi_gate(lro);
+
+	/*
+	 * lock pci config space access from userspace,
+	 * save state and issue PCIe secondary bus reset
+	 */
+	if (!XOCL_DSA_PCI_RESET_OFF(lro)) {
+		(void) xocl_mailbox_reset(lro, false);
+		xclmgmt_reset_pci(lro);
+		(void) xocl_mailbox_reset(lro, true);
+	} else {
+		mgmt_err(lro, "PCI Hot reset is not supported on this board.");
+	}
+
+	/* Workaround for some DSAs. Flush axilite busses */
+	if (dev_info->flags & XOCL_DSAFLAG_AXILITE_FLUSH)
+		platform_axilite_flush(lro);
+
+	/*
+	 * Check firewall status. Status should be 0 (cleared)
+	 * Otherwise issue message that a warm reboot is required.
+	 */
+	do {
+		msleep(20);
+	} while (retry++ < XCLMGMT_RESET_MAX_RETRY &&
+		xocl_af_check(lro, NULL));
+
+	if (retry >= XCLMGMT_RESET_MAX_RETRY) {
+		mgmt_err(lro, "Board is not able to recover by PCI Hot reset, please warm reboot");
+		return -EIO;
+	}
+
+	//Also freeze and free AXI gate to reset the OCL region.
+	xocl_icap_reset_axi_gate(lro);
+
+	/* Workaround for some DSAs. Flush axilite busses */
+	if (dev_info->flags & XOCL_DSAFLAG_AXILITE_FLUSH)
+		platform_axilite_flush(lro);
+
+	/* restart XMC/ERT */
+	xocl_mb_reset(lro);
+
+#endif
+done:
+	return err;
+}
+
+static int xocl_match_slot_and_save(struct device *dev, void *data)
+{
+	struct pci_dev *pdev;
+	unsigned long slot;
+
+	pdev = to_pci_dev(dev);
+	slot = PCI_SLOT(pdev->devfn);
+
+	if (slot == (unsigned long)data) {
+		pci_cfg_access_lock(pdev);
+		pci_save_state(pdev);
+	}
+
+	return 0;
+}
+
+static void xocl_pci_save_config_all(struct pci_dev *pdev)
+{
+	unsigned long slot = PCI_SLOT(pdev->devfn);
+
+	bus_for_each_dev(&pci_bus_type, NULL, (void *)slot,
+		xocl_match_slot_and_save);
+}
+
+static int xocl_match_slot_and_restore(struct device *dev, void *data)
+{
+	struct pci_dev *pdev;
+	unsigned long slot;
+
+	pdev = to_pci_dev(dev);
+	slot = PCI_SLOT(pdev->devfn);
+
+	if (slot == (unsigned long)data) {
+		pci_restore_state(pdev);
+		pci_cfg_access_unlock(pdev);
+	}
+
+	return 0;
+}
+
+static void xocl_pci_restore_config_all(struct pci_dev *pdev)
+{
+	unsigned long slot = PCI_SLOT(pdev->devfn);
+
+	bus_for_each_dev(&pci_bus_type, NULL, (void *)slot,
+		xocl_match_slot_and_restore);
+}
+/*
+ * Inspired by GenWQE driver, card_base.c
+ */
+int pci_fundamental_reset(struct xclmgmt_dev *lro)
+{
+	int rc;
+	u32 orig_mask;
+	u8 hot;
+	struct pci_dev *pci_dev = lro->pci_dev;
+
+	//freeze and free AXI gate to reset the OCL region before and after the pcie reset.
+	xocl_icap_reset_axi_gate(lro);
+
+	/*
+	 * lock pci config space access from userspace,
+	 * save state and issue PCIe fundamental reset
+	 */
+	mgmt_info(lro, "%s\n", __func__);
+
+	// Save PCI config space for both PFs
+	xocl_pci_save_config_all(pci_dev);
+
+	rc = pcie_mask_surprise_down(pci_dev, &orig_mask);
+	if (rc)
+		goto done;
+
+#if defined(__PPC64__)
+	/*
+	 * On PPC64LE use pcie_warm_reset which will cause the FPGA to
+	 * reload from PROM
+	 */
+	rc = pci_set_pcie_reset_state(pci_dev, pcie_warm_reset);
+	if (rc)
+		goto done;
+	/* keep PCIe reset asserted for 250ms */
+	msleep(250);
+	rc = pci_set_pcie_reset_state(pci_dev, pcie_deassert_reset);
+	if (rc)
+		goto done;
+	/* Wait for 2s to reload flash and train the link */
+	msleep(2000);
+#else
+	rc = xocl_icap_reset_bitstream(lro);
+	if (rc)
+		goto done;
+
+	/* Now perform secondary bus reset which should reset most of the device */
+	pci_read_config_byte(pci_dev->bus->self, PCI_BRIDGE_CONTROL, &hot);
+	/* Toggle the secondary bus reset bit in the upstream bridge */
+	pci_write_config_byte(pci_dev->bus->self, PCI_BRIDGE_CONTROL,
+		hot | PCI_BRIDGE_CTL_BUS_RESET);
+	msleep(500);
+	pci_write_config_byte(pci_dev->bus->self, PCI_BRIDGE_CONTROL, hot);
+	msleep(500);
+#endif
+done:
+	// Restore PCI config space for both PFs
+	rc = pcie_unmask_surprise_down(pci_dev, orig_mask);
+	xocl_pci_restore_config_all(pci_dev);
+
+	//Also freeze and free AXI gate to reset the OCL region.
+	xocl_icap_reset_axi_gate(lro);
+
+	return rc;
+}
+
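+/*
+ * Returns a bitmask with one bit per compute unit that reports a busy
+ * status; reads each CU control register in the OCL region, provided the
+ * AXI gate indicates the region is accessible.
+ */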
+unsigned int compute_unit_busy(struct xclmgmt_dev *lro)
+{
+	int i = 0;
+	unsigned int result = 0;
+	u32 r = MGMT_READ_REG32(lro, AXI_GATE_BASE_RD_BASE);
+
+	/*
+	 * r != 0x3 implies that OCL region is isolated and we cannot read
+	 * CUs' status
+	 */
+	if (r != 0x3)
+		return 0;
+
+	/* TODO: this assumes at most 16 CUs */
+	for (i = 0; i < 16; i++) {
+		r = MGMT_READ_REG32(lro, OCL_CTLR_BASE + i * OCL_CU_CTRL_RANGE);
+		if (r == 0x1)
+			result |= (r << i);
+	}
+	return result;
+}
+
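+/*
+ * Issue a PCIe secondary bus reset on the upstream bridge: save config
+ * space of all functions in this slot, toggle PCI_BRIDGE_CTL_BUS_RESET,
+ * poll until the device responds to config reads again, then restore
+ * config space.
+ */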
+void xclmgmt_reset_pci(struct xclmgmt_dev *lro)
+{
+	struct pci_dev *pdev = lro->pci_dev;
+	struct pci_bus *bus;
+	int i;
+	u16 pci_cmd;
+	u8 pci_bctl;
+
+	mgmt_info(lro, "Reset PCI");
+
+	/* what if user PF in VM ? */
+	xocl_pci_save_config_all(pdev);
+
+	/* Reset secondary bus. */
+	bus = pdev->bus;
+	pci_read_config_byte(bus->self, PCI_BRIDGE_CONTROL, &pci_bctl);
+	pci_bctl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl);
+
+	msleep(100);
+	pci_bctl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl);
+
+	for (i = 0; i < 5000; i++) {
+		pci_read_config_word(pdev, PCI_COMMAND, &pci_cmd);
+		if (pci_cmd != 0xffff)
+			break;
+		msleep(1);
+	}
+
+	mgmt_info(lro, "Resetting for %d ms", i);
+
+	xocl_pci_restore_config_all(pdev);
+}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC PATCH Xilinx Alveo 6/6] Add user physical function driver
  2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
                   ` (4 preceding siblings ...)
  2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 5/6] Add management driver sonal.santan
@ 2019-03-19 21:54 ` sonal.santan
  2019-03-25 20:28 ` [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver Daniel Vetter
  6 siblings, 0 replies; 20+ messages in thread
From: sonal.santan @ 2019-03-19 21:54 UTC (permalink / raw)
  To: dri-devel
  Cc: linux-kernel, gregkh, airlied, cyrilc, michals, lizhih, hyunk,
	Sonal Santan

From: Sonal Santan <sonal.santan@xilinx.com>

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 drivers/gpu/drm/xocl/userpf/common.h     |  157 +++
 drivers/gpu/drm/xocl/userpf/xocl_bo.c    | 1255 ++++++++++++++++++++++
 drivers/gpu/drm/xocl/userpf/xocl_bo.h    |  119 ++
 drivers/gpu/drm/xocl/userpf/xocl_drm.c   |  640 +++++++++++
 drivers/gpu/drm/xocl/userpf/xocl_drv.c   |  743 +++++++++++++
 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c |  396 +++++++
 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c |  344 ++++++
 7 files changed, 3654 insertions(+)
 create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
 create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c

diff --git a/drivers/gpu/drm/xocl/userpf/common.h b/drivers/gpu/drm/xocl/userpf/common.h
new file mode 100644
index 000000000000..c7dd4a68441c
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/common.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ * Lizhi Hou <lizhi.hou@xilinx.com>
+ *
+ */
+
+#ifndef _USERPF_COMMON_H
+#define	_USERPF_COMMON_H
+
+#include "../xocl_drv.h"
+#include "xocl_bo.h"
+#include "../xocl_drm.h"
+#include <drm/xocl_drm.h>
+#include <linux/hashtable.h>
+
+#define XOCL_DRIVER_DESC        "Xilinx PCIe Accelerator Device Manager"
+#define XOCL_DRIVER_DATE        "20180612"
+#define XOCL_DRIVER_MAJOR       2018
+#define XOCL_DRIVER_MINOR       2
+#define XOCL_DRIVER_PATCHLEVEL  8
+
+#define XOCL_MAX_CONCURRENT_CLIENTS 32
+
+#define XOCL_DRIVER_VERSION                             \
+	__stringify(XOCL_DRIVER_MAJOR) "."              \
+	__stringify(XOCL_DRIVER_MINOR) "."              \
+	__stringify(XOCL_DRIVER_PATCHLEVEL)
+
+#define XOCL_DRIVER_VERSION_NUMBER                              \
+	((XOCL_DRIVER_MAJOR)*1000 + (XOCL_DRIVER_MINOR)*100 +   \
+	XOCL_DRIVER_PATCHLEVEL)
+
+#define userpf_err(d, args...)                     \
+	xocl_err(&XDEV(d)->pdev->dev, ##args)
+#define userpf_info(d, args...)                    \
+	xocl_info(&XDEV(d)->pdev->dev, ##args)
+#define userpf_dbg(d, args...)                     \
+	xocl_dbg(&XDEV(d)->pdev->dev, ##args)
+
+#define xocl_get_root_dev(dev, root)		\
+	for (root = dev; root->bus && root->bus->self; root = root->bus->self)
+
+#define	XOCL_USER_PROC_HASH_SZ		256
+
+#define XOCL_U32_MASK 0xFFFFFFFF
+
+#define	MAX_SLOTS	128
+#define MAX_CUS		128
+#define MAX_U32_SLOT_MASKS (((MAX_SLOTS-1)>>5) + 1)
+#define MAX_U32_CU_MASKS (((MAX_CUS-1)>>5) + 1)
+#define MAX_DEPS        8
+
+#define XOCL_DRM_FREE_MALLOC
+
+#define XOCL_PA_SECTION_SHIFT		28
+
+struct xocl_dev	{
+	struct xocl_dev_core	core;
+
+	bool			offline;
+
+	/* health thread */
+	struct task_struct	       *health_thread;
+	struct xocl_health_thread_arg	thread_arg;
+
+	u32			p2p_bar_idx;
+	resource_size_t		p2p_bar_len;
+	void __iomem	       *p2p_bar_addr;
+
+	/*should be removed after mailbox is supported */
+	struct percpu_ref ref;
+	struct completion cmp;
+
+	struct dev_pagemap pgmap;
+	struct list_head                ctx_list;
+	struct mutex			ctx_list_lock;
+	unsigned int                    needs_reset; /* bool aligned */
+	atomic_t                        outstanding_execs;
+	atomic64_t                      total_execs;
+	void				*p2p_res_grp;
+};
+
+/**
+ * struct client_ctx: Manage user space client attached to device
+ *
+ * @link: Client context is added to list in device
+ * @xclbin_id: UUID for xclbin loaded by client, or nullid if no xclbin loaded
+ * @xclbin_locked: Flag to denote that this context locked the xclbin
+ * @trigger: Poll wait counter for number of completed exec buffers
+ * @outstanding_execs: Counter for number outstanding exec buffers
+ * @abort: Flag to indicate that this context has detached from user space (ctrl-c)
+ * @num_cus: Number of resources (CUs) explicitly acquired
+ * @lock: Mutex lock for exclusive access
+ * @cu_bitmap: CUs reserved by this context, may contain implicit resources
+ */
+struct client_ctx {
+	struct list_head	link;
+	uuid_t                  xclbin_id;
+	unsigned int            xclbin_locked;
+	unsigned int            abort;
+	unsigned int		num_cus; /* number of resources locked explicitly by client */
+	atomic_t		trigger;     /* count of poll notification to acknowledge */
+	atomic_t                outstanding_execs;
+	struct mutex		lock;
+	struct xocl_dev        *xdev;
+	DECLARE_BITMAP(cu_bitmap, MAX_CUS);  /* may contain implicitly acquired resources such as CDMA */
+	struct pid             *pid;
+};
+
+struct xocl_mm_wrapper {
+	struct drm_mm *mm;
+	struct drm_xocl_mm_stat *mm_usage_stat;
+	uint64_t start_addr;
+	uint64_t size;
+	uint32_t ddr;
+	struct hlist_node node;
+};
+
+/* ioctl functions */
+int xocl_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
+int xocl_execbuf_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_ctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
+int xocl_user_intr_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_read_axlf_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_hot_reset_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_reclock_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+
+/* sysfs functions */
+int xocl_init_sysfs(struct device *dev);
+void xocl_fini_sysfs(struct device *dev);
+
+/* helper functions */
+int64_t xocl_hot_reset(struct xocl_dev *xdev, bool force);
+void xocl_p2p_mem_release(struct xocl_dev *xdev, bool recov_bar_sz);
+int xocl_p2p_mem_reserve(struct xocl_dev *xdev);
+int xocl_get_p2p_bar(struct xocl_dev *xdev, u64 *bar_size);
+int xocl_pci_resize_resource(struct pci_dev *dev, int resno, int size);
+void xocl_reset_notify(struct pci_dev *pdev, bool prepare);
+void user_pci_reset_prepare(struct pci_dev *pdev);
+void user_pci_reset_done(struct pci_dev *pdev);
+
+uint get_live_client_size(struct xocl_dev *xdev);
+void reset_notify_client_ctx(struct xocl_dev *xdev);
+
+void get_pcie_link_info(struct xocl_dev	*xdev,
+	unsigned short *link_width, unsigned short *link_speed, bool is_cap);
+int xocl_reclock(struct xocl_dev *xdev, void *data);
+#endif
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_bo.c b/drivers/gpu/drm/xocl/userpf/xocl_bo.c
new file mode 100644
index 000000000000..546ce5f7e428
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_bo.c
@@ -0,0 +1,1255 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *    Sonal Santan <sonal.santan@xilinx.com>
+ *    Sarabjeet Singh <sarabjeet.singh@xilinx.com>
+ *
+ */
+
+#include <linux/bitops.h>
+#include <linux/swap.h>
+#include <linux/dma-buf.h>
+#include <linux/pagemap.h>
+#include <linux/version.h>
+#include <drm/drmP.h>
+#include "common.h"
+
+#ifdef _XOCL_BO_DEBUG
+#define	BO_ENTER(fmt, args...)		\
+	pr_info("[BO] Entering %s:"fmt"\n", __func__, ##args)
+#define	BO_DEBUG(fmt, args...)		\
+	pr_info("[BO] %s:%d:"fmt"\n", __func__, __LINE__, ##args)
+#else
+#define BO_ENTER(fmt, args...)
+#define	BO_DEBUG(fmt, args...)
+#endif
+
+#if defined(XOCL_DRM_FREE_MALLOC)
+static inline void drm_free_large(void *ptr)
+{
+	kvfree(ptr);
+}
+
+static inline void *drm_malloc_ab(size_t nmemb, size_t size)
+{
+	return kvmalloc_array(nmemb, size, GFP_KERNEL);
+}
+#endif
+
+static inline void xocl_release_pages(struct page **pages, int nr, bool cold)
+{
+	release_pages(pages, nr);
+}
+
+
+static inline void __user *to_user_ptr(u64 address)
+{
+	return (void __user *)(uintptr_t)address;
+}
+
+static size_t xocl_bo_physical_addr(const struct drm_xocl_bo *xobj)
+{
+	uint64_t paddr = xobj->mm_node ? xobj->mm_node->start : 0xffffffffffffffffull;
+
+	//Sarab: Need to check for number of hops & size of DDRs
+	if (xobj->type & XOCL_BO_ARE)
+		paddr |= XOCL_ARE_HOP;
+	return paddr;
+}
+
+void xocl_describe(const struct drm_xocl_bo *xobj)
+{
+	size_t size_in_kb = xobj->base.size / 1024;
+	size_t physical_addr = xocl_bo_physical_addr(xobj);
+	unsigned int ddr = xocl_bo_ddr_idx(xobj->flags);
+	unsigned int userptr = xocl_bo_userptr(xobj) ? 1 : 0;
+
+	DRM_DEBUG("%p: H[%p] SIZE[0x%zxKB] D[0x%zx] DDR[%u] UPTR[%u] SGLCOUNT[%u]\n",
+		  xobj, xobj->vmapping ? xobj->vmapping : xobj->bar_vmapping, size_in_kb,
+			physical_addr, ddr, userptr, xobj->sgt->orig_nents);
+}
+
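+/* Release the drm_mm allocation backing a BO and update the bank usage statistics */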
+static void xocl_free_mm_node(struct drm_xocl_bo *xobj)
+{
+	struct drm_device *ddev = xobj->base.dev;
+	struct xocl_drm *drm_p = ddev->dev_private;
+	unsigned int ddr = xocl_bo_ddr_idx(xobj->flags);
+
+	mutex_lock(&drm_p->mm_lock);
+	BO_ENTER("xobj %p, mm_node %p", xobj, xobj->mm_node);
+	if (!xobj->mm_node)
+		goto end;
+
+	xocl_mm_update_usage_stat(drm_p, ddr, xobj->base.size, -1);
+	BO_DEBUG("remove mm_node:%p, start:%llx size: %llx", xobj->mm_node,
+		xobj->mm_node->start, xobj->mm_node->size);
+	drm_mm_remove_node(xobj->mm_node);
+	kfree(xobj->mm_node);
+	xobj->mm_node = NULL;
+end:
+	mutex_unlock(&drm_p->mm_lock);
+}
+
+static void xocl_free_bo(struct drm_gem_object *obj)
+{
+	struct drm_xocl_bo *xobj = to_xocl_bo(obj);
+	struct drm_device *ddev = xobj->base.dev;
+	struct xocl_drm *drm_p = ddev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	int npages = obj->size >> PAGE_SHIFT;
+
+	DRM_DEBUG("Freeing BO %p\n", xobj);
+
+	BO_ENTER("xobj %p pages %p", xobj, xobj->pages);
+	if (xobj->vmapping)
+		vunmap(xobj->vmapping);
+	xobj->vmapping = NULL;
+
+	if (xobj->dmabuf)
+		unmap_mapping_range(xobj->dmabuf->file->f_mapping, 0, 0, 1);
+
+	if (xobj->dma_nsg) {
+		pci_unmap_sg(xdev->core.pdev, xobj->sgt->sgl, xobj->dma_nsg,
+			PCI_DMA_BIDIRECTIONAL);
+	}
+
+	if (xobj->pages) {
+		if (xocl_bo_userptr(xobj)) {
+			xocl_release_pages(xobj->pages, npages, 0);
+			drm_free_large(xobj->pages);
+		} else if (xocl_bo_p2p(xobj)) {
+			drm_free_large(xobj->pages);
+			/* devm_* will release all the pages when the xocl driver is unloaded */
+			xobj->bar_vmapping = NULL;
+		} else if (!xocl_bo_import(xobj)) {
+			drm_gem_put_pages(obj, xobj->pages, false, false);
+		}
+	}
+	xobj->pages = NULL;
+
+	if (!xocl_bo_import(xobj)) {
+		DRM_DEBUG("Freeing regular buffer\n");
+		if (xobj->sgt) {
+			sg_free_table(xobj->sgt);
+			kfree(xobj->sgt);
+		}
+		xobj->sgt = NULL;
+		xocl_free_mm_node(xobj);
+	} else {
+		DRM_DEBUG("Freeing imported buffer\n");
+		if (!(xobj->type & XOCL_BO_ARE))
+			xocl_free_mm_node(xobj);
+
+		if (obj->import_attach) {
+			DRM_DEBUG("Unnmapping attached dma buf\n");
+			dma_buf_unmap_attachment(obj->import_attach, xobj->sgt, DMA_TO_DEVICE);
+			drm_prime_gem_destroy(obj, NULL);
+		}
+	}
+
+	/* If it is an imported BO then we do not delete the SG table,
+	 * and if it is imported from an ARE device then we do not free the
+	 * mm_node either.
+	 * Call detach here to let the exporting device know that the
+	 * importing device does not need it anymore;
+	 * otherwise free_bo, i.e. this function, is not called for the
+	 * exporting device as it assumes that the exported buffer is still
+	 * being used:
+	 * dmabuf->ops->release(dmabuf);
+	 * The drm_driver.gem_free_object callback is responsible for cleaning
+	 * up the dma_buf attachment and references acquired at import time.
+	 *
+	 * Calling detach directly crashes the machine, so the code above is
+	 * used instead; drm_prime_gem_destroy calls the detach function:
+	 * struct dma_buf *imported_dma_buf = obj->dma_buf;
+	 * if (imported_dma_buf->ops->detach)
+	 *	imported_dma_buf->ops->detach(imported_dma_buf, obj->import_attach);
+	 */
+
+	drm_gem_object_release(obj);
+	kfree(xobj);
+}
+
+void xocl_drm_free_bo(struct drm_gem_object *obj)
+{
+	xocl_free_bo(obj);
+}
+
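+/*
+ * Validate the user-requested BO type and memory bank: execbuf BOs need no
+ * bank, CMA BOs are rejected, and all other BOs must target a used,
+ * non-streaming bank within the device DDR count.
+ */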
+static inline int check_bo_user_reqs(const struct drm_device *dev,
+	unsigned int flags, unsigned int type)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	u16 ddr_count;
+	unsigned int ddr;
+
+	if (type == DRM_XOCL_BO_EXECBUF)
+		return 0;
+	if (type == DRM_XOCL_BO_CMA)
+		return -EINVAL;
+
+	//From "mem_topology" or "feature rom" depending on
+	//unified or non-unified dsa
+	ddr_count = XOCL_DDR_COUNT(xdev);
+
+	if (ddr_count == 0)
+		return -EINVAL;
+	ddr = xocl_bo_ddr_idx(flags);
+	if (ddr >= ddr_count)
+		return -EINVAL;
+	if (XOCL_MEM_TOPOLOGY(xdev)->m_mem_data[ddr].m_type == MEM_STREAMING)
+		return -EINVAL;
+	if (!XOCL_IS_DDR_USED(xdev, ddr)) {
+		userpf_err(xdev, "Bank %d is marked as unused in axlf", ddr);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static struct drm_xocl_bo *xocl_create_bo(struct drm_device *dev,
+					  uint64_t unaligned_size,
+					  unsigned int user_flags,
+					  unsigned int user_type)
+{
+	size_t size = PAGE_ALIGN(unaligned_size);
+	struct drm_xocl_bo *xobj;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	unsigned int ddr = xocl_bo_ddr_idx(user_flags);
+	u16 ddr_count = 0;
+	bool xobj_inited = false;
+	int err = 0;
+
+	BO_DEBUG("New create bo flags:%u type:%u", user_flags, user_type);
+	if (!size)
+		return ERR_PTR(-EINVAL);
+
+	/* Either none or only one DDR should be specified */
+	/* Check the type */
+	if (check_bo_user_reqs(dev, user_flags, user_type))
+		return ERR_PTR(-EINVAL);
+
+	xobj = kzalloc(sizeof(*xobj), GFP_KERNEL);
+	if (!xobj)
+		return ERR_PTR(-ENOMEM);
+
+	BO_ENTER("xobj %p", xobj);
+	err = drm_gem_object_init(dev, &xobj->base, size);
+	if (err)
+		goto failed;
+	xobj_inited = true;
+
+	if (user_type == DRM_XOCL_BO_EXECBUF) {
+		xobj->type = XOCL_BO_EXECBUF;
+		xobj->metadata.state = DRM_XOCL_EXECBUF_STATE_ABORT;
+		return xobj;
+	}
+
+	if (user_type & DRM_XOCL_BO_P2P)
+		xobj->type = XOCL_BO_P2P;
+
+	xobj->mm_node = kzalloc(sizeof(*xobj->mm_node), GFP_KERNEL);
+	if (!xobj->mm_node) {
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	ddr_count = XOCL_DDR_COUNT(xdev);
+
+	mutex_lock(&drm_p->mm_lock);
+	/* Attempt to allocate buffer on the requested DDR */
+	xocl_xdev_dbg(xdev, "alloc bo from bank%u", ddr);
+	err = xocl_mm_insert_node(drm_p, ddr, xobj->mm_node,
+		xobj->base.size);
+	BO_DEBUG("insert mm_node:%p, start:%llx size: %llx",
+		xobj->mm_node, xobj->mm_node->start,
+		xobj->mm_node->size);
+	if (err)
+		goto failed;
+
+	xocl_mm_update_usage_stat(drm_p, ddr, xobj->base.size, 1);
+	mutex_unlock(&drm_p->mm_lock);
+	/* Record the DDR we allocated the buffer on */
+	//xobj->flags |= (1 << ddr);
+	xobj->flags = ddr;
+
+	return xobj;
+failed:
+	mutex_unlock(&drm_p->mm_lock);
+	kfree(xobj->mm_node);
+
+	if (xobj_inited)
+		drm_gem_object_release(&xobj->base);
+
+	kfree(xobj);
+
+	return ERR_PTR(err);
+}
+
+struct drm_xocl_bo *xocl_drm_create_bo(struct xocl_drm *drm_p,
+					  uint64_t unaligned_size,
+					  unsigned int user_flags,
+					  unsigned int user_type)
+{
+	return xocl_create_bo(drm_p->ddev, unaligned_size, user_flags,
+			user_type);
+}
+
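+/*
+ * Build a page array that covers @npages pages of the P2P BAR mapping
+ * starting at @bar_vaddr, so the BO can be backed by BAR memory like a
+ * regular page-backed object.
+ */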
+static struct page **xocl_p2p_get_pages(void *bar_vaddr, int npages)
+{
+	struct page *p, **pages;
+	int i;
+	uint64_t page_offset_enum = 0;
+
+	pages = drm_malloc_ab(npages, sizeof(struct page *));
+
+	if (pages == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < npages; i++) {
+		p = virt_to_page(bar_vaddr+page_offset_enum);
+		pages[i] = p;
+
+		if (IS_ERR(p))
+			goto fail;
+
+		page_offset_enum += PAGE_SIZE;
+	}
+
+	return pages;
+fail:
+	kvfree(pages);
+	return ERR_CAST(p);
+}
+
+/*
+ * For an ARE device do not reserve DDR space.
+ * The import below reuses the mm_node already created by the exporting
+ * application.
+ */
+
+static struct drm_xocl_bo *xocl_create_bo_forARE(struct drm_device *dev,
+						 uint64_t unaligned_size,
+						 struct drm_mm_node   *exporting_mm_node)
+{
+	struct drm_xocl_bo *xobj;
+	size_t size = PAGE_ALIGN(unaligned_size);
+	int err = 0;
+
+	if (!size)
+		return ERR_PTR(-EINVAL);
+
+	xobj = kzalloc(sizeof(*xobj), GFP_KERNEL);
+	if (!xobj)
+		return ERR_PTR(-ENOMEM);
+
+	BO_ENTER("xobj %p", xobj);
+	err = drm_gem_object_init(dev, &xobj->base, size);
+	if (err)
+		goto out3;
+
+	xobj->mm_node = exporting_mm_node;
+	if (!xobj->mm_node) {
+		err = -ENOMEM;
+		goto out3;
+	}
+
+	/* Record that this buffer is on a remote device to be accessed over ARE */
+	//xobj->flags = XOCL_BO_ARE;
+	xobj->type |= XOCL_BO_ARE;
+	return xobj;
+out3:
+	kfree(xobj);
+	return ERR_PTR(err);
+}
+
+
+int xocl_create_bo_ioctl(struct drm_device *dev,
+			 void *data,
+			 struct drm_file *filp)
+{
+	int ret;
+	struct drm_xocl_bo *xobj;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	struct drm_xocl_create_bo *args = data;
+	//unsigned ddr = args->flags & XOCL_MEM_BANK_MSK;
+	unsigned int ddr = args->flags;
+	//unsigned bar_mapped = (args->flags & DRM_XOCL_BO_P2P) ? 1 : 0;
+	unsigned int bar_mapped = (args->type & DRM_XOCL_BO_P2P) ? 1 : 0;
+
+//	//Only one bit should be set in ddr. Other bits are now in "type"
+//	if (hweight_long(ddr) > 1)
+//		return -EINVAL;
+////	if (args->flags && (args->flags != DRM_XOCL_BO_EXECBUF)) {
+//		if (hweight_long(ddr) > 1)
+//			return -EINVAL;
+//	}
+
+	if (bar_mapped) {
+		if (!xdev->p2p_bar_addr) {
+			xocl_xdev_err(xdev, "No P2P mem region available, Can't create p2p BO");
+			return -EINVAL;
+		}
+	}
+
+	xobj = xocl_create_bo(dev, args->size, args->flags, args->type);
+
+	BO_ENTER("xobj %p, mm_node %p", xobj, xobj->mm_node);
+	if (IS_ERR(xobj)) {
+		DRM_DEBUG("object creation failed\n");
+		return PTR_ERR(xobj);
+	}
+
+	if (bar_mapped) {
+		ddr = xocl_bo_ddr_idx(xobj->flags);
+		/*
+		 * DRM allocate contiguous pages, shift the vmapping with
+		 * bar address offset
+		 */
+		xobj->bar_vmapping = xdev->p2p_bar_addr +
+			drm_p->mm_p2p_off[ddr] + xobj->mm_node->start -
+			XOCL_MEM_TOPOLOGY(xdev)->m_mem_data[ddr].m_base_address;
+	}
+
+	if (bar_mapped)
+		xobj->pages = xocl_p2p_get_pages(xobj->bar_vmapping, xobj->base.size >> PAGE_SHIFT);
+	else
+		xobj->pages = drm_gem_get_pages(&xobj->base);
+
+	if (IS_ERR(xobj->pages)) {
+		ret = PTR_ERR(xobj->pages);
+		goto out_free;
+	}
+
+	xobj->sgt = drm_prime_pages_to_sg(xobj->pages, xobj->base.size >> PAGE_SHIFT);
+	if (IS_ERR(xobj->sgt)) {
+		ret = PTR_ERR(xobj->sgt);
+		goto out_free;
+	}
+
+	if (!bar_mapped) {
+		xobj->vmapping = vmap(xobj->pages, xobj->base.size >> PAGE_SHIFT, VM_MAP, PAGE_KERNEL);
+		if (!xobj->vmapping) {
+			ret = -ENOMEM;
+			goto out_free;
+		}
+	}
+
+	ret = drm_gem_create_mmap_offset(&xobj->base);
+	if (ret < 0)
+		goto out_free;
+	ret = drm_gem_handle_create(filp, &xobj->base, &args->handle);
+	if (ret < 0)
+		goto out_free;
+
+	xocl_describe(xobj);
+//PORT4_20
+//	drm_gem_object_unreference_unlocked(&xobj->base);
+	drm_gem_object_put_unlocked(&xobj->base);
+	return ret;
+
+out_free:
+	xocl_free_bo(&xobj->base);
+	return ret;
+}
+
+int xocl_userptr_bo_ioctl(struct drm_device *dev,
+			      void *data,
+			      struct drm_file *filp)
+{
+	int ret;
+	struct drm_xocl_bo *xobj;
+	unsigned int page_count;
+	struct drm_xocl_userptr_bo *args = data;
+	//unsigned ddr = args->flags & XOCL_MEM_BANK_MSK;
+	//unsigned ddr = args->flags;
+
+	if (offset_in_page(args->addr))
+		return -EINVAL;
+
+	if (args->type & DRM_XOCL_BO_EXECBUF)
+		return -EINVAL;
+
+	if (args->type & DRM_XOCL_BO_CMA)
+		return -EINVAL;
+
+//	if (args->flags && (hweight_long(ddr) > 1))
+//		return -EINVAL;
+
+	xobj = xocl_create_bo(dev, args->size, args->flags, args->type);
+	BO_ENTER("xobj %p", xobj);
+
+	if (IS_ERR(xobj)) {
+		DRM_DEBUG("object creation failed\n");
+		return PTR_ERR(xobj);
+	}
+
+	/* Use the page rounded size so we can accurately account for number of pages */
+	page_count = xobj->base.size >> PAGE_SHIFT;
+
+	xobj->pages = drm_malloc_ab(page_count, sizeof(*xobj->pages));
+	if (!xobj->pages) {
+		ret = -ENOMEM;
+		goto out1;
+	}
+	ret = get_user_pages_fast(args->addr, page_count, 1, xobj->pages);
+
+	if (ret != page_count)
+		goto out0;
+
+	xobj->sgt = drm_prime_pages_to_sg(xobj->pages, page_count);
+	if (IS_ERR(xobj->sgt)) {
+		ret = PTR_ERR(xobj->sgt);
+		goto out0;
+	}
+
+	/* TODO: resolve the cache issue */
+	xobj->vmapping = vmap(xobj->pages, page_count, VM_MAP, PAGE_KERNEL);
+
+	if (!xobj->vmapping) {
+		ret = -ENOMEM;
+		goto out1;
+	}
+
+	ret = drm_gem_handle_create(filp, &xobj->base, &args->handle);
+	if (ret)
+		goto out1;
+
+	xobj->type |= XOCL_BO_USERPTR;
+	xocl_describe(xobj);
+//PORT4_20
+//	drm_gem_object_unreference_unlocked(&xobj->base);
+	drm_gem_object_put_unlocked(&xobj->base);
+	return ret;
+
+out0:
+	drm_free_large(xobj->pages);
+	xobj->pages = NULL;
+out1:
+	xocl_free_bo(&xobj->base);
+	DRM_DEBUG("handle creation failed\n");
+	return ret;
+}
+
+
+int xocl_map_bo_ioctl(struct drm_device *dev,
+		      void *data,
+		      struct drm_file *filp)
+{
+	int ret = 0;
+	struct drm_xocl_map_bo *args = data;
+	struct drm_gem_object *obj;
+	struct drm_xocl_bo *xobj;
+
+	obj = xocl_gem_object_lookup(dev, filp, args->handle);
+	if (!obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		return -ENOENT;
+	}
+	xobj = to_xocl_bo(obj);
+
+	BO_ENTER("xobj %p", xobj);
+	if (xocl_bo_userptr(xobj)) {
+		ret = -EPERM;
+		goto out;
+	}
+	/* The mmap offset was set up at BO allocation time. */
+	args->offset = drm_vma_node_offset_addr(&obj->vma_node);
+	xocl_describe(to_xocl_bo(obj));
+out:
+//PORT4_20
+//	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
+	return ret;
+}
+
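+/*
+ * Build a temporary sg_table covering only the [offset, offset + size)
+ * sub-range of an existing page array; the caller frees it once the DMA
+ * completes.
+ */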
+static struct sg_table *alloc_onetime_sg_table(struct page **pages, uint64_t offset, uint64_t size)
+{
+	int ret;
+	unsigned int nr_pages;
+	struct sg_table *sgt = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
+
+	if (!sgt)
+		return ERR_PTR(-ENOMEM);
+
+	pages += (offset >> PAGE_SHIFT);
+	offset &= (~PAGE_MASK);
+	nr_pages = PAGE_ALIGN(size + offset) >> PAGE_SHIFT;
+
+	ret = sg_alloc_table_from_pages(sgt, pages, nr_pages, offset, size, GFP_KERNEL);
+	if (ret)
+		goto cleanup;
+	return sgt;
+
+cleanup:
+	kfree(sgt);
+	return ERR_PTR(-ENOMEM);
+}
+
+int xocl_sync_bo_ioctl(struct drm_device *dev,
+		       void *data,
+		       struct drm_file *filp)
+{
+	const struct drm_xocl_bo *xobj;
+	struct sg_table *sgt;
+	u64 paddr = 0;
+	int channel = 0;
+	ssize_t ret = 0;
+	const struct drm_xocl_sync_bo *args = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+
+	u32 dir = (args->dir == DRM_XOCL_SYNC_BO_TO_DEVICE) ? 1 : 0;
+	struct drm_gem_object *gem_obj = xocl_gem_object_lookup(dev, filp,
+							       args->handle);
+	if (!gem_obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		return -ENOENT;
+	}
+
+	xobj = to_xocl_bo(gem_obj);
+	BO_ENTER("xobj %p", xobj);
+	sgt = xobj->sgt;
+
+	if (xocl_bo_p2p(xobj)) {
+		DRM_DEBUG("P2P_BO doesn't support sync_bo\n");
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
+	//Sarab: If it is a remote BO then why do sync over ARE?
+	//We should sync directly using the other device which owns this bo locally,
+	//so that the txfer is HOST->PCIE->DDR; else it will be HOST->PCIE->ARE->DDR
+	paddr = xocl_bo_physical_addr(xobj);
+
+	if (paddr == 0xffffffffffffffffull) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* If device is offline (due to error), reject all DMA requests */
+	if (xdev->offline) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+
+	if ((args->offset + args->size) > gem_obj->size) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* only invalidate the range of addresses requested by the user */
+	paddr += args->offset;
+
+	if (args->offset || (args->size != xobj->base.size)) {
+		sgt = alloc_onetime_sg_table(xobj->pages, args->offset, args->size);
+		if (IS_ERR(sgt)) {
+			ret = PTR_ERR(sgt);
+			goto out;
+		}
+	}
+
+	//drm_clflush_sg(sgt);
+	channel = xocl_acquire_channel(xdev, dir);
+
+	if (channel < 0) {
+		ret = -EINVAL;
+		goto clear;
+	}
+	/* Now perform DMA */
+	ret = xocl_migrate_bo(xdev, sgt, dir, paddr, channel, args->size);
+	if (ret >= 0)
+		ret = (ret == args->size) ? 0 : -EIO;
+	xocl_release_channel(xdev, dir, channel);
+clear:
+	if (args->offset || (args->size != xobj->base.size)) {
+		sg_free_table(sgt);
+		kfree(sgt);
+	}
+out:
+//PORT4_20
+	drm_gem_object_put_unlocked(gem_obj);
+	return ret;
+}
+
+int xocl_info_bo_ioctl(struct drm_device *dev,
+		       void *data,
+		       struct drm_file *filp)
+{
+	const struct drm_xocl_bo *xobj;
+	struct drm_xocl_info_bo *args = data;
+	struct drm_gem_object *gem_obj = xocl_gem_object_lookup(dev, filp,
+								args->handle);
+
+	if (!gem_obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		return -ENOENT;
+	}
+
+	xobj = to_xocl_bo(gem_obj);
+	BO_ENTER("xobj %p", xobj);
+
+	args->size = xobj->base.size;
+
+	args->paddr = xocl_bo_physical_addr(xobj);
+	xocl_describe(xobj);
+//PORT4_20
+	drm_gem_object_put_unlocked(gem_obj);
+
+	return 0;
+}
+
+int xocl_pwrite_bo_ioctl(struct drm_device *dev, void *data,
+			 struct drm_file *filp)
+{
+	struct drm_xocl_bo *xobj;
+	const struct drm_xocl_pwrite_bo *args = data;
+	struct drm_gem_object *gem_obj = xocl_gem_object_lookup(dev, filp,
+							       args->handle);
+	char __user *user_data = to_user_ptr(args->data_ptr);
+	int ret = 0;
+	void *kaddr;
+
+	if (!gem_obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		return -ENOENT;
+	}
+
+	if ((args->offset > gem_obj->size) || (args->size > gem_obj->size)
+	    || ((args->offset + args->size) > gem_obj->size)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (args->size == 0) {
+		ret = 0;
+		goto out;
+	}
+
+	if (!access_ok(user_data, args->size)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	xobj = to_xocl_bo(gem_obj);
+	BO_ENTER("xobj %p", xobj);
+
+	if (xocl_bo_userptr(xobj)) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	kaddr = xobj->vmapping ? xobj->vmapping : xobj->bar_vmapping;
+	kaddr += args->offset;
+
+	ret = copy_from_user(kaddr, user_data, args->size);
+out:
+//PORT4_20
+	drm_gem_object_put_unlocked(gem_obj);
+
+	return ret;
+}
+
+int xocl_pread_bo_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *filp)
+{
+	struct drm_xocl_bo *xobj;
+	const struct drm_xocl_pread_bo *args = data;
+	struct drm_gem_object *gem_obj = xocl_gem_object_lookup(dev, filp,
+							       args->handle);
+	char __user *user_data = to_user_ptr(args->data_ptr);
+	int ret = 0;
+	void *kaddr;
+
+	if (!gem_obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		return -ENOENT;
+	}
+
+	if (xocl_bo_userptr(to_xocl_bo(gem_obj))) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	if ((args->offset > gem_obj->size) || (args->size > gem_obj->size)
+	    || ((args->offset + args->size) > gem_obj->size)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (args->size == 0) {
+		ret = 0;
+		goto out;
+	}
+
+	if (!access_ok(user_data, args->size)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	xobj = to_xocl_bo(gem_obj);
+	BO_ENTER("xobj %p", xobj);
+	kaddr = xobj->vmapping ? xobj->vmapping : xobj->bar_vmapping;
+	kaddr += args->offset;
+
+	ret = copy_to_user(user_data, kaddr, args->size);
+
+out:
+//PORT4_20
+	drm_gem_object_put_unlocked(gem_obj);
+
+	return ret;
+}
+
+int xocl_copy_bo_ioctl(struct drm_device *dev,
+			 void *data,
+			 struct drm_file *filp)
+{
+	const struct drm_xocl_bo *dst_xobj, *src_xobj;
+	struct sg_table *sgt;
+	u64 paddr = 0;
+	int channel = 0;
+	ssize_t ret = 0;
+	const struct drm_xocl_copy_bo *args = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	u32 dir = 0; //always write data from source to destination
+	struct drm_gem_object *dst_gem_obj, *src_gem_obj;
+
+	dst_gem_obj = xocl_gem_object_lookup(dev, filp,
+							       args->dst_handle);
+	if (!dst_gem_obj) {
+		DRM_ERROR("Failed to look up Destination GEM BO %d\n", args->dst_handle);
+		return -ENOENT;
+	}
+	src_gem_obj = xocl_gem_object_lookup(dev, filp,
+							       args->src_handle);
+	if (!src_gem_obj) {
+		DRM_ERROR("Failed to look up Source GEM BO %d\n", args->src_handle);
+		ret = -ENOENT;
+		goto src_lookup_fail;
+	}
+
+	dst_xobj = to_xocl_bo(dst_gem_obj);
+	src_xobj = to_xocl_bo(src_gem_obj);
+
+	if (!xocl_bo_p2p(src_xobj)) {
+		DRM_ERROR("src_bo must be p2p bo, copy_bo aborted");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	DRM_DEBUG("dst_xobj %p, src_xobj %p", dst_xobj, src_xobj);
+	DRM_DEBUG("dst_xobj->sgt %p, src_xobj->sgt %p", dst_xobj->sgt, src_xobj->sgt);
+	sgt = dst_xobj->sgt;
+
+	paddr = xocl_bo_physical_addr(src_xobj);
+
+	if (paddr == 0xffffffffffffffffull) {
+		ret =  -EINVAL;
+		goto out;
+	}
+	/* If device is offline (due to error), reject all DMA requests */
+	if (xdev->offline) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	if (((args->src_offset + args->size) > src_gem_obj->size) ||
+			((args->dst_offset + args->size) > dst_gem_obj->size)) {
+		DRM_ERROR("offsize + sizes out of boundary, copy_bo abort");
+		ret = -EINVAL;
+		goto out;
+	}
+	paddr += args->src_offset;
+
+	DRM_DEBUG("%s, xobj->pages = %p\n", __func__, dst_xobj->pages);
+
+
+	if (args->dst_offset || (args->size != dst_xobj->base.size)) {
+		sgt = alloc_onetime_sg_table(dst_xobj->pages, args->dst_offset, args->size);
+		if (IS_ERR(sgt)) {
+			ret = PTR_ERR(sgt);
+			goto out;
+		}
+	}
+
+	channel = xocl_acquire_channel(xdev, dir);
+
+	if (channel < 0) {
+		ret = -EINVAL;
+		goto clear;
+	}
+	/* Now perform DMA */
+	ret = xocl_migrate_bo(xdev, sgt, dir, paddr, channel,
+		args->size);
+
+	if (ret >= 0)
+		ret = (ret == args->size) ? 0 : -EIO;
+	xocl_release_channel(xdev, dir, channel);
+
+
+clear:
+	if (args->dst_offset || (args->size != dst_xobj->base.size)) {
+		sg_free_table(sgt);
+		kfree(sgt);
+	}
+out:
+//PORT4_20
+	drm_gem_object_put_unlocked(src_gem_obj);
+src_lookup_fail:
+//PORT4_20
+	drm_gem_object_put_unlocked(dst_gem_obj);
+	return ret;
+
+}
+
+
+struct sg_table *xocl_gem_prime_get_sg_table(struct drm_gem_object *obj)
+{
+	struct drm_xocl_bo *xobj = to_xocl_bo(obj);
+
+	BO_ENTER("xobj %p", xobj);
+	return drm_prime_pages_to_sg(xobj->pages, xobj->base.size >> PAGE_SHIFT);
+}
+
+
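+/*
+ * Determine whether the dma-buf being imported was exported by another
+ * xocl device that sits behind an ARE bridge; if so, return its BO so the
+ * existing mm_node can be reused.
+ */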
+static struct drm_xocl_bo *xocl_is_exporting_xare(struct drm_device *dev, struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *exporting_gem_obj;
+	struct drm_device *exporting_drm_dev;
+	struct xocl_drm *exporting_drmp;
+	struct xocl_dev *exporting_xdev;
+
+	struct device_driver *importing_dma_driver = dev->dev->driver;
+	struct dma_buf *exporting_dma_buf = attach->dmabuf;
+	struct device_driver *exporting_dma_driver = attach->dev->driver;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+
+	if (xocl_is_are(xdev))
+		return NULL;
+
+	//We don't know yet if the exporting device is a Xilinx/XOCL, third party or USB device,
+	//so check it in the code below
+	if (importing_dma_driver != exporting_dma_driver)
+		return NULL;
+
+	//The exporting device has the same driver as us, so this is a Xilinx device
+	//and we can get its gem_object, drm_device & xocl_dev
+	exporting_gem_obj = exporting_dma_buf->priv;
+	exporting_drm_dev = exporting_gem_obj->dev;
+	exporting_drmp = exporting_drm_dev->dev_private;
+	exporting_xdev = exporting_drmp->xdev;
+	//exporting_xdev->header;//This has FeatureROM header
+	if (xocl_is_are(exporting_xdev))
+		return to_xocl_bo(exporting_gem_obj);
+
+	return NULL;
+}
+
+struct drm_gem_object *xocl_gem_prime_import_sg_table(struct drm_device *dev,
+						      struct dma_buf_attachment *attach, struct sg_table *sgt)
+{
+	int ret = 0;
+	struct drm_xocl_bo *exporting_xobj;
+	struct drm_xocl_bo *importing_xobj;
+
+	/*
+	 * For an ARE device reuse the mm_node from the exporting xobj.
+	 * For non ARE devices we need to create a full BO but share the SG
+	 * table
+	 * ???? add flags to create_bo.. for DDR bank??
+	 */
+
+	exporting_xobj = xocl_is_exporting_xare(dev, attach);
+	importing_xobj = exporting_xobj ?
+		xocl_create_bo_forARE(dev, attach->dmabuf->size,
+				exporting_xobj->mm_node) :
+		xocl_create_bo(dev, attach->dmabuf->size, 0, 0);
+
+	BO_ENTER("xobj %p", importing_xobj);
+
+	if (IS_ERR(importing_xobj)) {
+		DRM_DEBUG("object creation failed\n");
+		return (struct drm_gem_object *)importing_xobj;
+	}
+
+	importing_xobj->type |= XOCL_BO_IMPORT;
+	importing_xobj->sgt = sgt;
+	importing_xobj->pages = drm_malloc_ab(attach->dmabuf->size >> PAGE_SHIFT, sizeof(*importing_xobj->pages));
+	if (!importing_xobj->pages) {
+		ret = -ENOMEM;
+		goto out_free;
+	}
+
+	ret = drm_prime_sg_to_page_addr_arrays(sgt, importing_xobj->pages,
+					       NULL, attach->dmabuf->size >> PAGE_SHIFT);
+	if (ret)
+		goto out_free;
+
+	importing_xobj->vmapping = vmap(importing_xobj->pages, importing_xobj->base.size >> PAGE_SHIFT, VM_MAP,
+					PAGE_KERNEL);
+
+	if (!importing_xobj->vmapping) {
+		ret = -ENOMEM;
+		goto out_free;
+	}
+
+	ret = drm_gem_create_mmap_offset(&importing_xobj->base);
+	if (ret < 0)
+		goto out_free;
+
+	xocl_describe(importing_xobj);
+	return &importing_xobj->base;
+
+out_free:
+	xocl_free_bo(&importing_xobj->base);
+	DRM_ERROR("Buffer import failed\n");
+	return ERR_PTR(ret);
+}
+
+void *xocl_gem_prime_vmap(struct drm_gem_object *obj)
+{
+	struct drm_xocl_bo *xobj = to_xocl_bo(obj);
+
+	BO_ENTER("xobj %p", xobj);
+	return xobj->vmapping;
+}
+
+void xocl_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
+{
+
+}
+
+int xocl_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
+{
+	struct drm_xocl_bo *xobj = to_xocl_bo(obj);
+	int ret;
+
+	BO_ENTER("obj %p", obj);
+	if (obj->size < vma->vm_end - vma->vm_start)
+		return -EINVAL;
+
+	if (!obj->filp)
+		return -ENODEV;
+
+	ret = obj->filp->f_op->mmap(obj->filp, vma);
+	if (ret)
+		return ret;
+
+	fput(vma->vm_file);
+	if (!IS_ERR(xobj->dmabuf)) {
+		vma->vm_file = get_file(xobj->dmabuf->file);
+		vma->vm_ops = xobj->dmabuf_vm_ops;
+		vma->vm_private_data = obj;
+		vma->vm_flags |= VM_MIXEDMAP;
+	}
+
+	return 0;
+}
+
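+/*
+ * Pin the user pages backing an unmanaged (non-BO) buffer and wrap them
+ * in an sg_table so they can be handed to the DMA engine; undone by
+ * xocl_finish_unmgd().
+ */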
+int xocl_init_unmgd(struct drm_xocl_unmgd *unmgd, uint64_t data_ptr,
+	uint64_t size, u32 write)
+{
+	int ret;
+	char __user *user_data = to_user_ptr(data_ptr);
+
+	if (!access_ok(user_data, size))
+		return -EFAULT;
+
+	memset(unmgd, 0, sizeof(struct drm_xocl_unmgd));
+
+	unmgd->npages = (((unsigned long)user_data + size + PAGE_SIZE - 1) -
+			((unsigned long)user_data & PAGE_MASK)) >> PAGE_SHIFT;
+
+	unmgd->pages = drm_malloc_ab(unmgd->npages, sizeof(*unmgd->pages));
+	if (!unmgd->pages)
+		return -ENOMEM;
+
+	ret = get_user_pages_fast(data_ptr, unmgd->npages, (write == 0) ? 1 : 0, unmgd->pages);
+
+	if (ret != unmgd->npages)
+		goto clear_pages;
+
+	unmgd->sgt = alloc_onetime_sg_table(unmgd->pages, data_ptr & ~PAGE_MASK, size);
+	if (IS_ERR(unmgd->sgt)) {
+		ret = PTR_ERR(unmgd->sgt);
+		goto clear_release;
+	}
+
+	return 0;
+
+clear_release:
+	xocl_release_pages(unmgd->pages, unmgd->npages, 0);
+clear_pages:
+	drm_free_large(unmgd->pages);
+	unmgd->pages = NULL;
+	return ret;
+}
+
+void xocl_finish_unmgd(struct drm_xocl_unmgd *unmgd)
+{
+	if (!unmgd->pages)
+		return;
+	sg_free_table(unmgd->sgt);
+	kfree(unmgd->sgt);
+	xocl_release_pages(unmgd->pages, unmgd->npages, 0);
+	drm_free_large(unmgd->pages);
+	unmgd->pages = NULL;
+}
+
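+/*
+ * Check that [paddr, paddr + size) falls entirely inside one of the used
+ * banks described by the device memory topology.
+ */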
+static bool xocl_validate_paddr(struct xocl_dev *xdev, u64 paddr, u64 size)
+{
+	struct mem_data *mem_data;
+	int	i;
+	uint64_t addr;
+	bool start_check = false;
+	bool end_check = false;
+
+	for (i = 0; i < XOCL_MEM_TOPOLOGY(xdev)->m_count; i++) {
+		mem_data = &XOCL_MEM_TOPOLOGY(xdev)->m_mem_data[i];
+		addr = mem_data->m_base_address;
+		start_check = (paddr >= addr);
+		end_check = (paddr + size <= addr + mem_data->m_size * 1024);
+		if (mem_data->m_used && start_check && end_check)
+			return true;
+	}
+
+	return false;
+}
+
+int xocl_pwrite_unmgd_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *filp)
+{
+	int channel;
+	struct drm_xocl_unmgd unmgd;
+	const struct drm_xocl_pwrite_unmgd *args = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	u32 dir = 1;
+	ssize_t ret = 0;
+
+	if (args->address_space != 0) {
+		userpf_err(xdev, "invalid addr space");
+		return -EFAULT;
+	}
+
+	if (args->size == 0)
+		return 0;
+
+	if (!xocl_validate_paddr(xdev, args->paddr, args->size)) {
+		userpf_err(xdev, "invalid paddr: 0x%llx, size:0x%llx",
+			args->paddr, args->size);
+		/* Currently we are not able to return an error because
+		 * it is unclear what addresses are valid other than the
+		 * DDR area. We should revisit this sometime.
+		 * return -EINVAL;
+		 */
+	}
+
+	ret = xocl_init_unmgd(&unmgd, args->data_ptr, args->size, dir);
+	if (ret) {
+		userpf_err(xdev, "init unmgd failed %ld", ret);
+		return ret;
+	}
+
+	channel = xocl_acquire_channel(xdev, dir);
+	if (channel < 0) {
+		userpf_err(xdev, "acquire channel failed");
+		ret = -EINVAL;
+		goto clear;
+	}
+	/* Now perform DMA */
+	ret = xocl_migrate_bo(xdev, unmgd.sgt, dir, args->paddr, channel,
+		args->size);
+	if (ret >= 0)
+		ret = (ret == args->size) ? 0 : -EIO;
+	xocl_release_channel(xdev, dir, channel);
+clear:
+	xocl_finish_unmgd(&unmgd);
+	return ret;
+}
+
+int xocl_pread_unmgd_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *filp)
+{
+	int channel;
+	struct drm_xocl_unmgd unmgd;
+	const struct drm_xocl_pwrite_unmgd *args = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	u32 dir = 0;  /* read */
+	ssize_t ret = 0;
+
+	if (args->address_space != 0) {
+		userpf_err(xdev, "invalid addr space");
+		return -EFAULT;
+	}
+
+	if (args->size == 0)
+		return 0;
+
+	if (!xocl_validate_paddr(xdev, args->paddr, args->size)) {
+		userpf_err(xdev, "invalid paddr: 0x%llx, size:0x%llx",
+			args->paddr, args->size);
+		/* Currently we are not able to return an error because
+		 * it is unclear what addresses are valid other than the
+		 * DDR area. We should revisit this sometime.
+		 * return -EINVAL;
+		 */
+	}
+
+	ret = xocl_init_unmgd(&unmgd, args->data_ptr, args->size, dir);
+	if (ret) {
+		userpf_err(xdev, "init unmgd failed %ld", ret);
+		return ret;
+	}
+
+	channel = xocl_acquire_channel(xdev, dir);
+
+	if (channel < 0) {
+		userpf_err(xdev, "acquire channel failed");
+		ret = -EINVAL;
+		goto clear;
+	}
+	/* Now perform DMA */
+	ret = xocl_migrate_bo(xdev, unmgd.sgt, dir, args->paddr, channel,
+		args->size);
+	if (ret >= 0)
+		ret = (ret == args->size) ? 0 : -EIO;
+
+	xocl_release_channel(xdev, dir, channel);
+clear:
+	xocl_finish_unmgd(&unmgd);
+	return ret;
+}
+
+int xocl_usage_stat_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	struct drm_xocl_usage_stat *args = data;
+	int	i;
+
+	args->mm_channel_count = XOCL_DDR_COUNT(xdev);
+	if (args->mm_channel_count > 8)
+		args->mm_channel_count = 8;
+	for (i = 0; i < args->mm_channel_count; i++)
+		xocl_mm_get_usage_stat(drm_p, i, args->mm + i);
+
+	args->dma_channel_count = xocl_get_chan_count(xdev);
+	if (args->dma_channel_count > 8)
+		args->dma_channel_count = 8;
+
+	for (i = 0; i < args->dma_channel_count; i++) {
+		args->h2c[i] = xocl_get_chan_stat(xdev, i, 1);
+		args->c2h[i] = xocl_get_chan_stat(xdev, i, 0);
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_bo.h b/drivers/gpu/drm/xocl/userpf/xocl_bo.h
new file mode 100644
index 000000000000..38ac78cd59f6
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_bo.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _XOCL_BO_H
+#define	_XOCL_BO_H
+
+#include <drm/xocl_drm.h>
+#include "../xocl_drm.h"
+
+#define XOCL_BO_USERPTR (1 << 31)
+#define XOCL_BO_IMPORT  (1 << 30)
+#define XOCL_BO_EXECBUF (1 << 29)
+#define XOCL_BO_CMA     (1 << 28)
+#define XOCL_BO_P2P     (1 << 27)
+
+#define XOCL_BO_DDR0 (1 << 0)
+#define XOCL_BO_DDR1 (1 << 1)
+#define XOCL_BO_DDR2 (1 << 2)
+#define XOCL_BO_DDR3 (1 << 3)
+
+
+
+//#define XOCL_MEM_BANK_MSK (0xFFFFFF)
+/*
+ * Set when the BO is imported from an ARE device; such a remote BO is
+ * accessed over ARE.
+ */
+#define XOCL_BO_ARE  (1 << 26)
+
+static inline bool xocl_bo_userptr(const struct drm_xocl_bo *bo)
+{
+	return (bo->type & XOCL_BO_USERPTR);
+}
+
+static inline bool xocl_bo_import(const struct drm_xocl_bo *bo)
+{
+	return (bo->type & XOCL_BO_IMPORT);
+}
+
+static inline bool xocl_bo_execbuf(const struct drm_xocl_bo *bo)
+{
+	return (bo->type & XOCL_BO_EXECBUF);
+}
+
+static inline bool xocl_bo_cma(const struct drm_xocl_bo *bo)
+{
+	return (bo->type & XOCL_BO_CMA);
+}
+static inline bool xocl_bo_p2p(const struct drm_xocl_bo *bo)
+{
+	return (bo->type & XOCL_BO_P2P);
+}
+
+static inline struct drm_gem_object *xocl_gem_object_lookup(struct drm_device *dev,
+							    struct drm_file *filp,
+							    u32 handle)
+{
+	return drm_gem_object_lookup(filp, handle);
+}
+
+static inline struct drm_xocl_dev *bo_xocl_dev(const struct drm_xocl_bo *bo)
+{
+	return bo->base.dev->dev_private;
+}
+
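+/* BO creation flags currently encode the DDR bank index directly. */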
+static inline unsigned int xocl_bo_ddr_idx(unsigned int flags)
+{
+	return flags;
+}
+
+int xocl_create_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_userptr_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_sync_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_copy_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_map_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_info_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_pwrite_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_pread_bo_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_ctx_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_pwrite_unmgd_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_pread_unmgd_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+int xocl_usage_stat_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp);
+
+struct sg_table *xocl_gem_prime_get_sg_table(struct drm_gem_object *obj);
+struct drm_gem_object *xocl_gem_prime_import_sg_table(struct drm_device *dev,
+	struct dma_buf_attachment *attach, struct sg_table *sgt);
+void *xocl_gem_prime_vmap(struct drm_gem_object *obj);
+void xocl_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+int xocl_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
+
+#endif
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_drm.c b/drivers/gpu/drm/xocl/userpf/xocl_drm.c
new file mode 100644
index 000000000000..e07f5f8a054a
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_drm.c
@@ -0,0 +1,640 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors:
+ *
+ */
+
+#include <linux/version.h>
+#include <drm/drmP.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_mm.h>
+#include "../version.h"
+#include "../lib/libxdma_api.h"
+#include "common.h"
+
+#if defined(__PPC64__)
+#define XOCL_FILE_PAGE_OFFSET	0x10000
+#else
+#define XOCL_FILE_PAGE_OFFSET	0x100000
+#endif
+
+#ifndef VM_RESERVED
+#define VM_RESERVED (VM_DONTEXPAND | VM_DONTDUMP)
+#endif
+
+#ifdef _XOCL_DRM_DEBUG
+#define DRM_ENTER(fmt, args...)		 \
+	printk(KERN_INFO "[DRM] Entering %s:"fmt"\n", __func__, ##args)
+#define DRM_DBG(fmt, args...)	       \
+	printk(KERN_INFO "[DRM] %s:%d:"fmt"\n", __func__, __LINE__, ##args)
+#else
+#define DRM_ENTER(fmt, args...)
+#define DRM_DBG(fmt, args...)
+#endif
+
+static char driver_date[9];
+
+static void xocl_free_object(struct drm_gem_object *obj)
+{
+	DRM_ENTER("");
+	xocl_drm_free_bo(obj);
+}
+
+static int xocl_open(struct inode *inode, struct file *filp)
+{
+	struct xocl_drm *drm_p;
+	struct drm_file *priv;
+	struct drm_device *ddev;
+	int ret;
+
+	ret = drm_open(inode, filp);
+	if (ret)
+		return ret;
+
+	priv = filp->private_data;
+	ddev = priv->minor->dev;
+	drm_p = xocl_drvinst_open(ddev);
+	if (!drm_p)
+		return -ENXIO;
+
+	return 0;
+}
+
+static int xocl_release(struct inode *inode, struct file *filp)
+{
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *ddev = priv->minor->dev;
+	struct xocl_drm	*drm_p = ddev->dev_private;
+	int ret;
+
+	ret = drm_release(inode, filp);
+	xocl_drvinst_close(drm_p);
+
+	return ret;
+}
+
+static int xocl_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	int ret;
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+	struct mm_struct *mm = current->mm;
+	struct xocl_drm *drm_p = dev->dev_private;
+	xdev_handle_t xdev = drm_p->xdev;
+	unsigned long vsize;
+	phys_addr_t res_start;
+
+	DRM_ENTER("vm pgoff %lx", vma->vm_pgoff);
+
+	/*
+	 * Page offsets at or above XOCL_FILE_PAGE_OFFSET are GEM object
+	 * mappings and are handed to drm_gem_mmap(); smaller offsets are
+	 * treated as direct BAR mappings below.
+	 */
+	if (likely(vma->vm_pgoff >= XOCL_FILE_PAGE_OFFSET)) {
+		ret = drm_gem_mmap(filp, vma);
+		if (ret)
+			return ret;
+		/* Clear VM_PFNMAP flag set by drm_gem_mmap()
+		 * we have "struct page" for all backing pages for bo
+		 */
+		vma->vm_flags &= ~VM_PFNMAP;
+		/* Clear VM_IO flag set by drm_gem_mmap()
+		 * it prevents gdb from accessing mapped buffers
+		 */
+		vma->vm_flags &= ~VM_IO;
+		vma->vm_flags |= VM_MIXEDMAP;
+		vma->vm_flags |= mm->def_flags;
+		vma->vm_pgoff = 0;
+
+		/* Override pgprot_writecombine() mapping setup by
+		 * drm_gem_mmap()
+		 * which results in very poor read performance
+		 */
+		if (vma->vm_flags & (VM_READ | VM_MAYREAD))
+			vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+		else
+			vma->vm_page_prot = pgprot_writecombine(
+				vm_get_page_prot(vma->vm_flags));
+		return ret;
+	}
+
+	if (vma->vm_pgoff != 0)
+		return -EINVAL;
+
+	vsize = vma->vm_end - vma->vm_start;
+	if (vsize > XDEV(xdev)->bar_size)
+		return -EINVAL;
+
+	DRM_DBG("MAP size %ld", vsize);
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	vma->vm_flags |= VM_IO;
+	vma->vm_flags |= VM_RESERVED;
+
+	res_start = pci_resource_start(XDEV(xdev)->pdev, XDEV(xdev)->bar_idx);
+	ret = io_remap_pfn_range(vma, vma->vm_start,
+				 res_start >> PAGE_SHIFT,
+				 vsize, vma->vm_page_prot);
+	userpf_info(xdev, "io_remap_pfn_range ret code: %d", ret);
+
+	return ret;
+}
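+
+/*
+ * Illustrative user-space view of the two mappings above (a sketch only;
+ * the DRM_IOCTL_XOCL_MAP_BO macro name is assumed): a page offset of 0 maps
+ * the register BAR, while BO mappings use the fake offset returned by the
+ * map BO ioctl.
+ *
+ *	void *regs = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
+ *			  fd, 0);
+ *	void *bo = mmap(NULL, bo_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+ *			fd, offset_from_map_bo_ioctl);
+ */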
+
+int xocl_gem_fault(struct vm_fault *vmf)
+{
+	loff_t num_pages;
+	unsigned int page_offset;
+	struct vm_area_struct *vma = vmf->vma;
+	struct drm_xocl_bo *xobj = to_xocl_bo(vma->vm_private_data);
+	int ret = 0;
+	unsigned long vmf_address = vmf->address;
+
+	page_offset = (vmf_address - vma->vm_start) >> PAGE_SHIFT;
+
+
+	if (!xobj->pages)
+		return VM_FAULT_SIGBUS;
+
+	num_pages = DIV_ROUND_UP(xobj->base.size, PAGE_SIZE);
+	if (page_offset >= num_pages)
+		return VM_FAULT_SIGBUS;
+
+	/* P2P and regular BOs are both backed by struct page here, so they
+	 * are inserted the same way.
+	 */
+	ret = vm_insert_page(vma, vmf_address, xobj->pages[page_offset]);
+
+	switch (ret) {
+	case -EAGAIN:
+	case 0:
+	case -ERESTARTSYS:
+		return VM_FAULT_NOPAGE;
+	case -ENOMEM:
+		return VM_FAULT_OOM;
+	default:
+		return VM_FAULT_SIGBUS;
+	}
+}
+
+static int xocl_client_open(struct drm_device *dev, struct drm_file *filp)
+{
+	struct xocl_drm	*drm_p = dev->dev_private;
+	int	ret = 0;
+
+	DRM_ENTER("");
+
+	/* Users may not open the PRIMARY node (/dev/dri/cardX); only the
+	 * RENDER node (/dev/dri/renderX) is supported.
+	 */
+	if (drm_is_primary_client(filp))
+		return -EPERM;
+
+	if (get_live_client_size(drm_p->xdev) > XOCL_MAX_CONCURRENT_CLIENTS)
+		return -EBUSY;
+
+	ret = xocl_exec_create_client(drm_p->xdev, &filp->driver_priv);
+	if (ret)
+		goto failed;
+
+	return 0;
+
+failed:
+	return ret;
+}
+
+static void xocl_client_release(struct drm_device *dev, struct drm_file *filp)
+{
+	struct xocl_drm	*drm_p = dev->dev_private;
+
+	xocl_exec_destroy_client(drm_p->xdev, &filp->driver_priv);
+}
+
+static uint xocl_poll(struct file *filp, poll_table *wait)
+{
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+	struct xocl_drm	*drm_p = dev->dev_private;
+
+	BUG_ON(!priv->driver_priv);
+
+	DRM_ENTER("");
+	return xocl_exec_poll_client(drm_p->xdev, filp, wait, priv->driver_priv);
+}
+
+static const struct drm_ioctl_desc xocl_ioctls[] = {
+	DRM_IOCTL_DEF_DRV(XOCL_CREATE_BO, xocl_create_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_USERPTR_BO, xocl_userptr_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_MAP_BO, xocl_map_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_SYNC_BO, xocl_sync_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_INFO_BO, xocl_info_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_PWRITE_BO, xocl_pwrite_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_PREAD_BO, xocl_pread_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_CTX, xocl_ctx_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_INFO, xocl_info_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_READ_AXLF, xocl_read_axlf_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_PWRITE_UNMGD, xocl_pwrite_unmgd_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_PREAD_UNMGD, xocl_pread_unmgd_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_USAGE_STAT, xocl_usage_stat_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_USER_INTR, xocl_user_intr_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_EXECBUF, xocl_execbuf_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_COPY_BO, xocl_copy_bo_ioctl,
+			  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_HOT_RESET, xocl_hot_reset_ioctl,
+		  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(XOCL_RECLOCK, xocl_reclock_ioctl,
+	  DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+};
+
+static long xocl_drm_ioctl(struct file *filp,
+			      unsigned int cmd, unsigned long arg)
+{
+	return drm_ioctl(filp, cmd, arg);
+}
+
+static const struct file_operations xocl_driver_fops = {
+	.owner		= THIS_MODULE,
+	.open		= xocl_open,
+	.mmap		= xocl_mmap,
+	.poll		= xocl_poll,
+	.read		= drm_read,
+	.unlocked_ioctl = xocl_drm_ioctl,
+	.release	= xocl_release,
+};
+
+static const struct vm_operations_struct xocl_vm_ops = {
+	.fault = xocl_gem_fault,
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+};
+
+static struct drm_driver mm_drm_driver = {
+	.driver_features		= DRIVER_GEM | DRIVER_PRIME |
+						DRIVER_RENDER,
+
+	.postclose			= xocl_client_release,
+	.open				= xocl_client_open,
+
+	.gem_free_object		= xocl_free_object,
+	.gem_vm_ops			= &xocl_vm_ops,
+
+	.ioctls				= xocl_ioctls,
+	.num_ioctls			= ARRAY_SIZE(xocl_ioctls),
+	.fops				= &xocl_driver_fops,
+
+	.gem_prime_get_sg_table		= xocl_gem_prime_get_sg_table,
+	.gem_prime_import_sg_table	= xocl_gem_prime_import_sg_table,
+	.gem_prime_vmap			= xocl_gem_prime_vmap,
+	.gem_prime_vunmap		= xocl_gem_prime_vunmap,
+	.gem_prime_mmap			= xocl_gem_prime_mmap,
+
+	.prime_handle_to_fd		= drm_gem_prime_handle_to_fd,
+	.prime_fd_to_handle		= drm_gem_prime_fd_to_handle,
+	.gem_prime_import		= drm_gem_prime_import,
+	.gem_prime_export		= drm_gem_prime_export,
+	.name				= XOCL_MODULE_NAME,
+	.desc				= XOCL_DRIVER_DESC,
+	.date				= driver_date,
+};
+
+void *xocl_drm_init(xdev_handle_t xdev_hdl)
+{
+	struct xocl_drm		*drm_p = NULL;
+	struct drm_device	*ddev = NULL;
+	int			year, mon, day;
+	int			ret = 0;
+	bool			drm_registered = false;
+
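+	/* XRT_DRIVER_VERSION is expected to be "major.minor.patch" and
+	 * xrt_build_version_date "YYYY-MM-DD"; driver_date then holds the
+	 * compact "YYYYMMDD" form exposed through struct drm_driver.
+	 */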
+	sscanf(XRT_DRIVER_VERSION, "%d.%d.%d",
+		&mm_drm_driver.major,
+		&mm_drm_driver.minor,
+		&mm_drm_driver.patchlevel);
+
+	sscanf(xrt_build_version_date, "%d-%d-%d ", &year, &mon, &day);
+	snprintf(driver_date, sizeof(driver_date),
+		"%d%02d%02d", year, mon, day);
+
+	ddev = drm_dev_alloc(&mm_drm_driver, &XDEV(xdev_hdl)->pdev->dev);
+	if (!ddev) {
+		xocl_xdev_err(xdev_hdl, "alloc drm dev failed");
+		ret = -ENOMEM;
+		goto failed;
+	}
+
+	drm_p = xocl_drvinst_alloc(ddev->dev, sizeof(*drm_p));
+	if (!drm_p) {
+		xocl_xdev_err(xdev_hdl, "alloc drm inst failed");
+		ret = -ENOMEM;
+		goto failed;
+	}
+	drm_p->xdev = xdev_hdl;
+
+	ddev->pdev = XDEV(xdev_hdl)->pdev;
+
+	ret = drm_dev_register(ddev, 0);
+	if (ret) {
+		xocl_xdev_err(xdev_hdl, "register drm dev failed 0x%x", ret);
+		goto failed;
+	}
+	drm_registered = true;
+
+	drm_p->ddev = ddev;
+
+	mutex_init(&drm_p->mm_lock);
+	ddev->dev_private = drm_p;
+	hash_init(drm_p->mm_range);
+
+	xocl_drvinst_set_filedev(drm_p, ddev);
+	return drm_p;
+
+failed:
+	if (drm_registered)
+		drm_dev_unregister(ddev);
+	if (ddev)
+		drm_dev_put(ddev);
+	if (drm_p)
+		xocl_drvinst_free(drm_p);
+
+	return NULL;
+}
+
+void xocl_drm_fini(struct xocl_drm *drm_p)
+{
+	xocl_cleanup_mem(drm_p);
+	drm_put_dev(drm_p->ddev);
+	mutex_destroy(&drm_p->mm_lock);
+
+	xocl_drvinst_free(drm_p);
+}
+
+void xocl_mm_get_usage_stat(struct xocl_drm *drm_p, u32 ddr,
+	struct drm_xocl_mm_stat *pstat)
+{
+	pstat->memory_usage = drm_p->mm_usage_stat[ddr] ?
+		drm_p->mm_usage_stat[ddr]->memory_usage : 0;
+	pstat->bo_count = drm_p->mm_usage_stat[ddr] ?
+		drm_p->mm_usage_stat[ddr]->bo_count : 0;
+}
+
+void xocl_mm_update_usage_stat(struct xocl_drm *drm_p, u32 ddr,
+	u64 size, int count)
+{
+	BUG_ON(!drm_p->mm_usage_stat[ddr]);
+
+	drm_p->mm_usage_stat[ddr]->memory_usage += (count > 0) ? size : -size;
+	drm_p->mm_usage_stat[ddr]->bo_count += count;
+}
+
+int xocl_mm_insert_node(struct xocl_drm *drm_p, u32 ddr,
+		struct drm_mm_node *node, u64 size)
+{
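+	/* Older kernels take separate search and allocation flags here (the
+	 * extra trailing 0); XOCL_DRM_FREE_MALLOC appears to select the newer
+	 * six-argument drm_mm_insert_node_generic() signature.
+	 */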
+	return drm_mm_insert_node_generic(drm_p->mm[ddr], node, size, PAGE_SIZE,
+#if defined(XOCL_DRM_FREE_MALLOC)
+		0, 0);
+#else
+		0, 0, 0);
+#endif
+}
+
+int xocl_check_topology(struct xocl_drm *drm_p)
+{
+	struct mem_topology    *topology;
+	u16	i;
+	int	err = 0;
+
+	topology = XOCL_MEM_TOPOLOGY(drm_p->xdev);
+	if (topology == NULL)
+		return 0;
+
+	for (i = 0; i < topology->m_count; i++) {
+		if (!topology->m_mem_data[i].m_used)
+			continue;
+
+		if (topology->m_mem_data[i].m_type == MEM_STREAMING)
+			continue;
+
+		if (drm_p->mm_usage_stat[i]->bo_count != 0) {
+			err = -EPERM;
+			xocl_err(drm_p->ddev->dev,
+				 "DDR bank %d has pre-existing buffer allocations; exit the application and re-run.",
+				 i);
+		}
+	}
+
+	return err;
+}
+
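+/* Look up a previously initialized bank covering the same base address and
+ * size; returns its DDR index, or 0xffffffff when no matching region exists.
+ */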
+uint32_t xocl_get_shared_ddr(struct xocl_drm *drm_p, struct mem_data *m_data)
+{
+	struct xocl_mm_wrapper *wrapper;
+	uint64_t start_addr = m_data->m_base_address;
+	uint64_t sz = m_data->m_size*1024;
+
+	hash_for_each_possible(drm_p->mm_range, wrapper, node, start_addr) {
+		if (!wrapper)
+			continue;
+
+		if (wrapper->start_addr == start_addr) {
+			if (wrapper->size == sz)
+				return wrapper->ddr;
+			else
+				return 0xffffffff;
+		}
+	}
+	return 0xffffffff;
+}
+
+void xocl_cleanup_mem(struct xocl_drm *drm_p)
+{
+	struct mem_topology *topology;
+	u16 i, ddr;
+	uint64_t addr;
+	struct xocl_mm_wrapper *wrapper;
+	struct hlist_node *tmp;
+
+	topology = XOCL_MEM_TOPOLOGY(drm_p->xdev);
+	if (topology) {
+		ddr = topology->m_count;
+		for (i = 0; i < ddr; i++) {
+			if (!topology->m_mem_data[i].m_used)
+				continue;
+
+			if (topology->m_mem_data[i].m_type == MEM_STREAMING)
+				continue;
+
+			xocl_info(drm_p->ddev->dev, "Taking down DDR : %d", i);
+			addr = topology->m_mem_data[i].m_base_address;
+
+			hash_for_each_possible_safe(drm_p->mm_range, wrapper,
+					tmp, node, addr) {
+				if (wrapper->ddr == i) {
+					hash_del(&wrapper->node);
+					vfree(wrapper);
+					drm_mm_takedown(drm_p->mm[i]);
+					vfree(drm_p->mm[i]);
+					vfree(drm_p->mm_usage_stat[i]);
+				}
+			}
+
+			drm_p->mm[i] = NULL;
+			drm_p->mm_usage_stat[i] = NULL;
+		}
+	}
+	vfree(drm_p->mm);
+	drm_p->mm = NULL;
+	vfree(drm_p->mm_usage_stat);
+	drm_p->mm_usage_stat = NULL;
+	vfree(drm_p->mm_p2p_off);
+	drm_p->mm_p2p_off = NULL;
+}
+
+int xocl_init_mem(struct xocl_drm *drm_p)
+{
+	size_t length = 0;
+	size_t mm_size = 0, mm_stat_size = 0;
+	size_t size = 0, wrapper_size = 0;
+	size_t ddr_bank_size;
+	struct mem_topology *topo;
+	struct mem_data *mem_data;
+	uint32_t shared;
+	struct xocl_mm_wrapper *wrapper = NULL;
+	uint64_t reserved1 = 0;
+	uint64_t reserved2 = 0;
+	uint64_t reserved_start;
+	uint64_t reserved_end;
+	int err = 0;
+	int i = -1;
+
+	if (XOCL_DSA_IS_MPSOC(drm_p->xdev)) {
+		/* TODO: the reserved region sizes below are still hard-coded */
+		reserved1 = 0x80000000;
+		reserved2 = 0x1000000;
+	}
+
+	topo = XOCL_MEM_TOPOLOGY(drm_p->xdev);
+	if (topo == NULL)
+		return 0;
+
+	length = topo->m_count * sizeof(struct mem_data);
+	size = topo->m_count * sizeof(void *);
+	wrapper_size = sizeof(struct xocl_mm_wrapper);
+	mm_size = sizeof(struct drm_mm);
+	mm_stat_size = sizeof(struct drm_xocl_mm_stat);
+
+	xocl_info(drm_p->ddev->dev, "Topology count = %d, data_length = %ld",
+		topo->m_count, length);
+
+	drm_p->mm = vzalloc(size);
+	drm_p->mm_usage_stat = vzalloc(size);
+	drm_p->mm_p2p_off = vzalloc(topo->m_count * sizeof(u64));
+	if (!drm_p->mm || !drm_p->mm_usage_stat || !drm_p->mm_p2p_off) {
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	for (i = 0; i < topo->m_count; i++) {
+		mem_data = &topo->m_mem_data[i];
+		ddr_bank_size = mem_data->m_size * 1024;
+
+		xocl_info(drm_p->ddev->dev, "  Mem Index %d", i);
+		xocl_info(drm_p->ddev->dev, "  Base Address:0x%llx",
+			mem_data->m_base_address);
+		xocl_info(drm_p->ddev->dev, "  Size:0x%lx", ddr_bank_size);
+		xocl_info(drm_p->ddev->dev, "  Type:%d", mem_data->m_type);
+		xocl_info(drm_p->ddev->dev, "  Used:%d", mem_data->m_used);
+	}
+
+	/* Initialize the used banks and their sizes */
+	/* Currently only fixed sizes are supported */
+	for (i = 0; i < topo->m_count; i++) {
+		mem_data = &topo->m_mem_data[i];
+		if (!mem_data->m_used)
+			continue;
+
+		if (mem_data->m_type == MEM_STREAMING ||
+			mem_data->m_type == MEM_STREAMING_CONNECTION)
+			continue;
+
+		ddr_bank_size = mem_data->m_size * 1024;
+		xocl_info(drm_p->ddev->dev, "Allocating DDR bank%d", i);
+		xocl_info(drm_p->ddev->dev, "  base_addr:0x%llx, total size:0x%lx",
+			mem_data->m_base_address, ddr_bank_size);
+
+		if (XOCL_DSA_IS_MPSOC(drm_p->xdev)) {
+			reserved_end = mem_data->m_base_address + ddr_bank_size;
+			reserved_start = reserved_end - reserved1 - reserved2;
+			xocl_info(drm_p->ddev->dev, "  reserved region:0x%llx - 0x%llx",
+				reserved_start, reserved_end - 1);
+		}
+
+		shared = xocl_get_shared_ddr(drm_p, mem_data);
+		if (shared != 0xffffffff) {
+			xocl_info(drm_p->ddev->dev, "Found duplicated memory region!");
+			drm_p->mm[i] = drm_p->mm[shared];
+			drm_p->mm_usage_stat[i] = drm_p->mm_usage_stat[shared];
+			continue;
+		}
+
+		xocl_info(drm_p->ddev->dev, "Found a new memory region");
+		wrapper = vzalloc(wrapper_size);
+		drm_p->mm[i] = vzalloc(mm_size);
+		drm_p->mm_usage_stat[i] = vzalloc(mm_stat_size);
+
+		if (!drm_p->mm[i] || !drm_p->mm_usage_stat[i] || !wrapper) {
+			err = -ENOMEM;
+			goto failed;
+		}
+
+		wrapper->start_addr = mem_data->m_base_address;
+		wrapper->size = mem_data->m_size*1024;
+		wrapper->mm = drm_p->mm[i];
+		wrapper->mm_usage_stat = drm_p->mm_usage_stat[i];
+		wrapper->ddr = i;
+		hash_add(drm_p->mm_range, &wrapper->node, wrapper->start_addr);
+
+		drm_mm_init(drm_p->mm[i], mem_data->m_base_address,
+				ddr_bank_size - reserved1 - reserved2);
+		drm_p->mm_p2p_off[i] = ddr_bank_size * i;
+
+		xocl_info(drm_p->ddev->dev, "drm_mm_init called");
+	}
+
+	return 0;
+
+failed:
+	vfree(wrapper);
+	if (drm_p->mm) {
+		for (; i >= 0; i--) {
+			drm_mm_takedown(drm_p->mm[i]);
+			vfree(drm_p->mm[i]);
+			vfree(drm_p->mm_usage_stat[i]);
+		}
+		vfree(drm_p->mm);
+		drm_p->mm = NULL;
+	}
+	vfree(drm_p->mm_usage_stat);
+	drm_p->mm_usage_stat = NULL;
+	vfree(drm_p->mm_p2p_off);
+	drm_p->mm_p2p_off = NULL;
+
+	return err;
+}
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_drv.c b/drivers/gpu/drm/xocl/userpf/xocl_drv.c
new file mode 100644
index 000000000000..6fc57da3deab
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_drv.c
@@ -0,0 +1,743 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Lizhi.Hou@xilinx.com
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include "../xocl_drv.h"
+#include "common.h"
+#include "../version.h"
+#include <linux/memremap.h>
+
+#ifndef PCI_EXT_CAP_ID_REBAR
+#define PCI_EXT_CAP_ID_REBAR 0x15
+#endif
+
+#ifndef PCI_REBAR_CTRL
+#define PCI_REBAR_CTRL          8       /* control register */
+#endif
+
+#ifndef PCI_REBAR_CTRL_BAR_SIZE
+#define  PCI_REBAR_CTRL_BAR_SIZE        0x00001F00  /* BAR size */
+#endif
+
+#ifndef PCI_REBAR_CTRL_BAR_SHIFT
+#define  PCI_REBAR_CTRL_BAR_SHIFT       8           /* shift for BAR size */
+#endif
+
+#define REBAR_FIRST_CAP		4
+
+static const struct pci_device_id pciidlist[] = XOCL_USER_PCI_IDS;
+
+struct class *xrt_class;
+
+MODULE_DEVICE_TABLE(pci, pciidlist);
+
+static int userpf_intr_config(xdev_handle_t xdev_hdl, u32 intr, bool en)
+{
+	return xocl_dma_intr_config(xdev_hdl, intr, en);
+}
+
+static int userpf_intr_register(xdev_handle_t xdev_hdl, u32 intr,
+		irq_handler_t handler, void *arg)
+{
+	return handler ?
+		xocl_dma_intr_register(xdev_hdl, intr, handler, arg, -1) :
+		xocl_dma_intr_unreg(xdev_hdl, intr);
+}
+
+struct xocl_pci_funcs userpf_pci_ops = {
+	.intr_config = userpf_intr_config,
+	.intr_register = userpf_intr_register,
+};
+
+void xocl_reset_notify(struct pci_dev *pdev, bool prepare)
+{
+	struct xocl_dev *xdev = pci_get_drvdata(pdev);
+
+	xocl_info(&pdev->dev, "PCI reset NOTIFY, prepare %d", prepare);
+
+	if (prepare) {
+		xocl_mailbox_reset(xdev, false);
+		xocl_subdev_destroy_by_id(xdev, XOCL_SUBDEV_DMA);
+	} else {
+		reset_notify_client_ctx(xdev);
+		xocl_subdev_create_by_id(xdev, XOCL_SUBDEV_DMA);
+		xocl_mailbox_reset(xdev, true);
+		xocl_exec_reset(xdev);
+	}
+}
+
+static void kill_all_clients(struct xocl_dev *xdev)
+{
+	struct list_head *ptr;
+	struct client_ctx *entry;
+	int ret;
+	int total_wait_secs = 10; // sec
+	int wait_interval = 100; // millisec
+	int retry = total_wait_secs * 1000 / wait_interval;
+
+	mutex_lock(&xdev->ctx_list_lock);
+
+	list_for_each(ptr, &xdev->ctx_list) {
+		entry = list_entry(ptr, struct client_ctx, link);
+		ret = kill_pid(entry->pid, SIGBUS, 1);
+		if (ret) {
+			userpf_err(xdev, "killing pid: %d failed. err: %d",
+				pid_nr(entry->pid), ret);
+		}
+	}
+
+	mutex_unlock(&xdev->ctx_list_lock);
+
+	while (!list_empty(&xdev->ctx_list) && retry--)
+		msleep(wait_interval);
+
+	if (!list_empty(&xdev->ctx_list))
+		userpf_err(xdev, "failed to kill all clients");
+}
+
+int64_t xocl_hot_reset(struct xocl_dev *xdev, bool force)
+{
+	bool skip = false;
+	int64_t ret = 0, mbret = 0;
+	struct mailbox_req mbreq = { MAILBOX_REQ_HOT_RESET, };
+	size_t resplen = sizeof(ret);
+
+	mutex_lock(&xdev->ctx_list_lock);
+	if (xdev->offline) {
+		skip = true;
+	} else if (!force && !list_is_singular(&xdev->ctx_list)) {
+		/* We should have one context for ourselves. */
+		BUG_ON(list_empty(&xdev->ctx_list));
+		userpf_err(xdev, "device is in use, can't reset");
+		ret = -EBUSY;
+	} else {
+		xdev->offline = true;
+	}
+	mutex_unlock(&xdev->ctx_list_lock);
+	if (ret < 0 || skip)
+		return ret;
+
+	userpf_info(xdev, "resetting device...");
+
+	if (force)
+		kill_all_clients(xdev);
+
+	xocl_reset_notify(xdev->core.pdev, true);
+	mbret = xocl_peer_request(xdev, &mbreq, sizeof(struct mailbox_req),
+		&ret, &resplen, NULL, NULL);
+	if (mbret)
+		ret = mbret;
+	xocl_reset_notify(xdev->core.pdev, false);
+
+	mutex_lock(&xdev->ctx_list_lock);
+	xdev->offline = false;
+	mutex_unlock(&xdev->ctx_list_lock);
+
+	return ret;
+}
+
+
+int xocl_reclock(struct xocl_dev *xdev, void *data)
+{
+	int err = 0;
+	int64_t msg = -ENODEV;
+	struct mailbox_req *req = NULL;
+	size_t resplen = sizeof(msg);
+	size_t reqlen = sizeof(struct mailbox_req)+sizeof(struct drm_xocl_reclock_info);
+
+	req = kzalloc(reqlen, GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+	req->req = MAILBOX_REQ_RECLOCK;
+	req->data_total_len = sizeof(struct drm_xocl_reclock_info);
+	memcpy(req->data, data, sizeof(struct drm_xocl_reclock_info));
+
+	err = xocl_peer_request(xdev, req, reqlen,
+		&msg, &resplen, NULL, NULL);
+
+	if (msg != 0)
+		err = -ENODEV;
+
+	kfree(req);
+	return err;
+}
+
+static void xocl_mailbox_srv(void *arg, void *data, size_t len,
+	u64 msgid, int err)
+{
+	struct xocl_dev *xdev = (struct xocl_dev *)arg;
+	struct mailbox_req *req = (struct mailbox_req *)data;
+
+	if (err != 0)
+		return;
+
+	userpf_info(xdev, "received request (%d) from peer\n", req->req);
+
+	switch (req->req) {
+	case MAILBOX_REQ_FIREWALL:
+		(void) xocl_hot_reset(xdev, true);
+		break;
+	default:
+		userpf_err(xdev, "dropped bad request (%d)\n", req->req);
+		break;
+	}
+}
+
+void get_pcie_link_info(struct xocl_dev *xdev,
+	unsigned short *link_width, unsigned short *link_speed, bool is_cap)
+{
+	u16 stat;
+	long result;
+	int pos = is_cap ? PCI_EXP_LNKCAP : PCI_EXP_LNKSTA;
+
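+	/* PCI_EXP_LNKCAP and PCI_EXP_LNKSTA keep speed (bits 3:0) and width
+	 * (bits 9:4) in the same positions within the low 16 bits, so the
+	 * LNKSTA masks can be reused for either read.
+	 */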
+	result = pcie_capability_read_word(xdev->core.pdev, pos, &stat);
+	if (result) {
+		*link_width = *link_speed = 0;
+		xocl_info(&xdev->core.pdev->dev, "Read pcie capability failed");
+		return;
+	}
+	*link_width = (stat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT;
+	*link_speed = stat & PCI_EXP_LNKSTA_CLS;
+}
+
+void user_pci_reset_prepare(struct pci_dev *pdev)
+{
+	xocl_reset_notify(pdev, true);
+}
+
+void user_pci_reset_done(struct pci_dev *pdev)
+{
+	xocl_reset_notify(pdev, false);
+}
+
+static void xocl_dev_percpu_release(struct percpu_ref *ref)
+{
+	struct xocl_dev *xdev = container_of(ref, struct xocl_dev, ref);
+
+	complete(&xdev->cmp);
+}
+
+static void xocl_dev_percpu_exit(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct xocl_dev *xdev = container_of(ref, struct xocl_dev, ref);
+
+	wait_for_completion(&xdev->cmp);
+	percpu_ref_exit(ref);
+}
+
+
+static void xocl_dev_percpu_kill(void *data)
+{
+	struct percpu_ref *ref = data;
+
+	percpu_ref_kill(ref);
+}
+
+void xocl_p2p_mem_release(struct xocl_dev *xdev, bool recov_bar_sz)
+{
+	struct pci_dev *pdev = xdev->core.pdev;
+	int p2p_bar = -1;
+
+	if (xdev->p2p_bar_addr) {
+		devres_release_group(&pdev->dev, xdev->p2p_res_grp);
+		xdev->p2p_bar_addr = NULL;
+		xdev->p2p_res_grp = NULL;
+	}
+	if (xdev->p2p_res_grp) {
+		devres_remove_group(&pdev->dev, xdev->p2p_res_grp);
+		xdev->p2p_res_grp = NULL;
+	}
+
+	if (recov_bar_sz) {
+		p2p_bar = xocl_get_p2p_bar(xdev, NULL);
+		if (p2p_bar < 0)
+			return;
+
+		xocl_pci_resize_resource(pdev, p2p_bar,
+				(XOCL_PA_SECTION_SHIFT - 20));
+
+		xocl_info(&pdev->dev, "Resized p2p bar %d to %d MB", p2p_bar,
+			(1 << (XOCL_PA_SECTION_SHIFT - 20)));
+	}
+}
+
+int xocl_p2p_mem_reserve(struct xocl_dev *xdev)
+{
+	resource_size_t p2p_bar_addr;
+	resource_size_t p2p_bar_len;
+	struct resource res;
+	uint32_t p2p_bar_idx;
+	struct pci_dev *pdev = xdev->core.pdev;
+	int32_t ret;
+
+	xocl_info(&pdev->dev, "reserve p2p mem, bar %d, len %lld",
+			xdev->p2p_bar_idx, xdev->p2p_bar_len);
+
+	if (xdev->p2p_bar_idx < 0 ||
+		xdev->p2p_bar_len <= (1<<XOCL_PA_SECTION_SHIFT)) {
+		/* P2P is only supported when the BAR is larger than one section (256 MB) */
+		xocl_info(&pdev->dev, "Did not find p2p BAR");
+		return 0;
+	}
+
+	p2p_bar_len = xdev->p2p_bar_len;
+	p2p_bar_idx = xdev->p2p_bar_idx;
+
+	xdev->p2p_res_grp = devres_open_group(&pdev->dev, NULL, GFP_KERNEL);
+	if (!xdev->p2p_res_grp) {
+		xocl_err(&pdev->dev, "open p2p resource group failed");
+		ret = -ENOMEM;
+		goto failed;
+	}
+
+	p2p_bar_addr = pci_resource_start(pdev, p2p_bar_idx);
+
+	res.start = p2p_bar_addr;
+	res.end	  = p2p_bar_addr+p2p_bar_len-1;
+	res.name  = NULL;
+	res.flags = IORESOURCE_MEM;
+
+	init_completion(&xdev->cmp);
+
+	ret = percpu_ref_init(&xdev->ref, xocl_dev_percpu_release, 0,
+		GFP_KERNEL);
+	if (ret)
+		goto failed;
+
+	ret = devm_add_action_or_reset(&(pdev->dev), xocl_dev_percpu_exit,
+		&xdev->ref);
+	if (ret)
+		goto failed;
+
+	xdev->pgmap.ref = &xdev->ref;
+	memcpy(&xdev->pgmap.res, &res, sizeof(struct resource));
+	xdev->pgmap.altmap_valid = false;
+	xdev->p2p_bar_addr = devm_memremap_pages(&(pdev->dev), &xdev->pgmap);
+
+	if (!xdev->p2p_bar_addr) {
+		ret = -ENOMEM;
+		percpu_ref_kill(&xdev->ref);
+		devres_close_group(&pdev->dev, xdev->p2p_res_grp);
+		goto failed;
+	}
+
+	ret = devm_add_action_or_reset(&(pdev->dev), xocl_dev_percpu_kill,
+				       &xdev->ref);
+	if (ret) {
+		percpu_ref_kill(&xdev->ref);
+		devres_close_group(&pdev->dev, xdev->p2p_res_grp);
+		goto failed;
+	}
+
+	devres_close_group(&pdev->dev, xdev->p2p_res_grp);
+
+	return 0;
+
+failed:
+	xocl_p2p_mem_release(xdev, false);
+
+	return ret;
+}
+
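+/* PCIe Resizable BAR encodes sizes as a power of two: size code n means
+ * 2^(n + 20) bytes, so code 0 is 1 MB and code 8 is 256 MB.
+ */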
+static inline u64 xocl_pci_rebar_size_to_bytes(int size)
+{
+	return 1ULL << (size + 20);
+}
+
+int xocl_get_p2p_bar(struct xocl_dev *xdev, u64 *bar_size)
+{
+	struct pci_dev *dev = xdev->core.pdev;
+	int i, pos;
+	u32 cap, ctrl, size;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_REBAR);
+	if (!pos) {
+		xocl_err(&dev->dev, "did not find rebar cap");
+		return -ENOTSUPP;
+	}
+
+	pos += REBAR_FIRST_CAP;
+	for (i = PCI_STD_RESOURCES; i <= PCI_STD_RESOURCE_END; i++) {
+		pci_read_config_dword(dev, pos, &cap);
+		pci_read_config_dword(dev, pos + 4, &ctrl);
+		size = (ctrl & PCI_REBAR_CTRL_BAR_SIZE) >>
+			PCI_REBAR_CTRL_BAR_SHIFT;
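+		/* Pick the first BAR whose current size is at least one
+		 * section (256 MB) and whose capability mask advertises
+		 * 256 MB or more (size bits start at 1 MB at bit 4, so
+		 * 0x1000 corresponds to 256 MB).
+		 */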
+		if (xocl_pci_rebar_size_to_bytes(size) >=
+			(1 << XOCL_PA_SECTION_SHIFT) &&
+			cap >= 0x1000) {
+			if (bar_size)
+				*bar_size = xocl_pci_rebar_size_to_bytes(size);
+			return i;
+		}
+		pos += 8;
+	}
+
+	if (bar_size)
+		*bar_size = 0;
+
+	return -1;
+}
+
+static int xocl_reassign_resources(struct pci_dev *dev, int resno)
+{
+	pci_assign_unassigned_bus_resources(dev->bus);
+
+	return 0;
+}
+
+int xocl_pci_resize_resource(struct pci_dev *dev, int resno, int size)
+{
+	struct resource *res = dev->resource + resno;
+	struct pci_dev *root;
+	struct resource *root_res;
+	u64 bar_size, req_size;
+	unsigned long flags;
+	u16 cmd;
+	int pos, ret = 0;
+	u32 ctrl, i;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_REBAR);
+	if (!pos) {
+		xocl_err(&dev->dev, "did not find rebar cap");
+		return -ENOTSUPP;
+	}
+
+	pos += resno * PCI_REBAR_CTRL;
+	pci_read_config_dword(dev, pos + PCI_REBAR_CTRL, &ctrl);
+
+	bar_size = xocl_pci_rebar_size_to_bytes(
+			(ctrl & PCI_REBAR_CTRL_BAR_SIZE) >>
+			PCI_REBAR_CTRL_BAR_SHIFT);
+	req_size = xocl_pci_rebar_size_to_bytes(size);
+
+	xocl_info(&dev->dev, "req_size %lld, bar size %lld\n",
+			req_size, bar_size);
+	if (req_size == bar_size) {
+		xocl_info(&dev->dev, "BAR is already at the requested size");
+		return -EALREADY;
+	}
+
+	xocl_get_root_dev(dev, root);
+
+	for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
+		root_res = root->subordinate->resource[i];
+		root_res = (root_res) ? root_res->parent : NULL;
+		if (root_res && (root_res->flags & IORESOURCE_MEM)
+			&& resource_size(root_res) > req_size)
+			break;
+	}
+
+	if (i == PCI_BRIDGE_RESOURCE_NUM) {
+		xocl_err(&dev->dev, "Not enough IO memory space; please check BIOS settings");
+		return -ENOSPC;
+	}
+	pci_release_selected_regions(dev, (1 << resno));
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	pci_write_config_word(dev, PCI_COMMAND,
+		cmd & ~PCI_COMMAND_MEMORY);
+
+	flags = res->flags;
+	if (res->parent)
+		release_resource(res);
+
+	ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE;
+	ctrl |= size << PCI_REBAR_CTRL_BAR_SHIFT;
+	pci_write_config_dword(dev, pos + PCI_REBAR_CTRL, ctrl);
+
+
+	res->start = 0;
+	res->end = req_size - 1;
+
+	xocl_info(&dev->dev, "new size %lld", resource_size(res));
+	xocl_reassign_resources(dev, resno);
+	res->flags = flags;
+
+	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
+	pci_request_selected_regions(dev, (1 << resno),
+		XOCL_MODULE_NAME);
+
+	return ret;
+}
+
+static int identify_bar(struct xocl_dev *xdev)
+{
+	struct pci_dev *pdev = xdev->core.pdev;
+	resource_size_t bar_len;
+	int		i;
+
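+	/* Heuristic BAR scan: a BAR of one section (256 MB) or more is taken
+	 * to be the P2P BAR, while a BAR of at least 32 MB is mapped as the
+	 * register BAR.
+	 */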
+	for (i = PCI_STD_RESOURCES; i <= PCI_STD_RESOURCE_END; i++) {
+		bar_len = pci_resource_len(pdev, i);
+		if (bar_len >= (1 << XOCL_PA_SECTION_SHIFT)) {
+			xdev->p2p_bar_idx = i;
+			xdev->p2p_bar_len = bar_len;
+			pci_request_selected_regions(pdev, 1 << i,
+				XOCL_MODULE_NAME);
+		} else if (bar_len >= 32 * 1024 * 1024) {
+			xdev->core.bar_addr = ioremap_nocache(
+				pci_resource_start(pdev, i), bar_len);
+			if (!xdev->core.bar_addr)
+				return -EIO;
+			xdev->core.bar_idx = i;
+			xdev->core.bar_size = bar_len;
+		}
+	}
+
+	return 0;
+}
+
+static void unmap_bar(struct xocl_dev *xdev)
+{
+	if (xdev->core.bar_addr) {
+		iounmap(xdev->core.bar_addr);
+		xdev->core.bar_addr = NULL;
+	}
+
+	if (xdev->p2p_bar_len)
+		pci_release_selected_regions(xdev->core.pdev,
+				1 << xdev->p2p_bar_idx);
+}
+
+/* pci driver callbacks */
+int xocl_userpf_probe(struct pci_dev *pdev,
+		const struct pci_device_id *ent)
+{
+	struct xocl_dev			*xdev;
+	struct xocl_board_private	*dev_info;
+	int				ret;
+
+	xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
+	if (!xdev) {
+		xocl_err(&pdev->dev, "failed to alloc xocl_dev");
+		return -ENOMEM;
+	}
+
+	/* this is used for all subdevs, bind it to device earlier */
+	pci_set_drvdata(pdev, xdev);
+	dev_info = (struct xocl_board_private *)ent->driver_data;
+
+	xdev->core.pci_ops = &userpf_pci_ops;
+	xdev->core.pdev = pdev;
+	xocl_fill_dsa_priv(xdev, dev_info);
+
+	ret = identify_bar(xdev);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to identify bar");
+		goto failed_to_bar;
+	}
+
+	ret = pci_enable_device(pdev);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to enable device.");
+		goto failed_to_enable;
+	}
+
+	ret = xocl_alloc_dev_minor(xdev);
+	if (ret)
+		goto failed_alloc_minor;
+
+	ret = xocl_subdev_create_all(xdev, dev_info->subdev_info,
+			dev_info->subdev_num);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to register subdevs");
+		goto failed_create_subdev;
+	}
+
+	ret = xocl_p2p_mem_reserve(xdev);
+	if (ret)
+		xocl_err(&pdev->dev, "failed to reserve p2p memory region");
+
+	ret = xocl_init_sysfs(&pdev->dev);
+	if (ret) {
+		xocl_err(&pdev->dev, "failed to init sysfs");
+		goto failed_init_sysfs;
+	}
+
+	mutex_init(&xdev->ctx_list_lock);
+	xdev->needs_reset = false;
+	atomic64_set(&xdev->total_execs, 0);
+	atomic_set(&xdev->outstanding_execs, 0);
+	INIT_LIST_HEAD(&xdev->ctx_list);
+
+	/* Launch the mailbox server. */
+	(void) xocl_peer_listen(xdev, xocl_mailbox_srv, (void *)xdev);
+
+	return 0;
+
+failed_init_sysfs:
+	xocl_p2p_mem_release(xdev, false);
+	xocl_subdev_destroy_all(xdev);
+
+failed_create_subdev:
+	xocl_free_dev_minor(xdev);
+
+failed_alloc_minor:
+	pci_disable_device(pdev);
+failed_to_enable:
+	unmap_bar(xdev);
+failed_to_bar:
+	devm_kfree(&pdev->dev, xdev);
+	pci_set_drvdata(pdev, NULL);
+
+	return ret;
+}
+
+void xocl_userpf_remove(struct pci_dev *pdev)
+{
+	struct xocl_dev		*xdev;
+
+	xdev = pci_get_drvdata(pdev);
+	if (!xdev) {
+		xocl_err(&pdev->dev, "driver data is NULL");
+		return;
+	}
+
+	xocl_p2p_mem_release(xdev, false);
+	xocl_subdev_destroy_all(xdev);
+
+	xocl_fini_sysfs(&pdev->dev);
+	xocl_free_dev_minor(xdev);
+
+	pci_disable_device(pdev);
+
+	unmap_bar(xdev);
+
+	mutex_destroy(&xdev->ctx_list_lock);
+
+	pci_set_drvdata(pdev, NULL);
+	devm_kfree(&pdev->dev, xdev);
+}
+
+static pci_ers_result_t user_pci_error_detected(struct pci_dev *pdev,
+		pci_channel_state_t state)
+{
+	switch (state) {
+	case pci_channel_io_normal:
+		xocl_info(&pdev->dev, "PCI normal state error\n");
+		return PCI_ERS_RESULT_CAN_RECOVER;
+	case pci_channel_io_frozen:
+		xocl_info(&pdev->dev, "PCI frozen state error\n");
+		return PCI_ERS_RESULT_NEED_RESET;
+	case pci_channel_io_perm_failure:
+		xocl_info(&pdev->dev, "PCI failure state error\n");
+		return PCI_ERS_RESULT_DISCONNECT;
+	default:
+		xocl_info(&pdev->dev, "PCI unknown state (%d) error\n", state);
+		break;
+	}
+
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t user_pci_slot_reset(struct pci_dev *pdev)
+{
+	xocl_info(&pdev->dev, "PCI reset slot");
+	pci_restore_state(pdev);
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void user_pci_error_resume(struct pci_dev *pdev)
+{
+	xocl_info(&pdev->dev, "PCI error resume");
+	pci_cleanup_aer_uncorrect_error_status(pdev);
+}
+
+static const struct pci_error_handlers xocl_err_handler = {
+	.error_detected	= user_pci_error_detected,
+	.slot_reset	= user_pci_slot_reset,
+	.resume		= user_pci_error_resume,
+	.reset_prepare	= user_pci_reset_prepare,
+	.reset_done	= user_pci_reset_done,
+};
+
+static struct pci_driver userpf_driver = {
+	.name = XOCL_MODULE_NAME,
+	.id_table = pciidlist,
+	.probe = xocl_userpf_probe,
+	.remove = xocl_userpf_remove,
+	.err_handler = &xocl_err_handler,
+};
+
+/* INIT */
+static int (*xocl_drv_reg_funcs[])(void) __initdata = {
+	xocl_init_feature_rom,
+	xocl_init_xdma,
+	xocl_init_mb_scheduler,
+	xocl_init_mailbox,
+	xocl_init_xmc,
+	xocl_init_icap,
+	xocl_init_xvc,
+};
+
+static void (*xocl_drv_unreg_funcs[])(void) = {
+	xocl_fini_feature_rom,
+	xocl_fini_xdma,
+	xocl_fini_mb_scheduler,
+	xocl_fini_mailbox,
+	xocl_fini_xmc,
+	xocl_fini_icap,
+	xocl_fini_xvc,
+};
+
+static int __init xocl_init(void)
+{
+	int		ret, i;
+
+	xrt_class = class_create(THIS_MODULE, "xrt_user");
+	if (IS_ERR(xrt_class)) {
+		ret = PTR_ERR(xrt_class);
+		goto err_class_create;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(xocl_drv_reg_funcs); ++i) {
+		ret = xocl_drv_reg_funcs[i]();
+		if (ret)
+			goto failed;
+	}
+
+	ret = pci_register_driver(&userpf_driver);
+	if (ret)
+		goto failed;
+
+	return 0;
+
+failed:
+	for (i--; i >= 0; i--)
+		xocl_drv_unreg_funcs[i]();
+
+	class_destroy(xrt_class);
+	xrt_class = NULL;
+
+err_class_create:
+	return ret;
+}
+
+static void __exit xocl_exit(void)
+{
+	int i;
+
+	pci_unregister_driver(&userpf_driver);
+
+	for (i = ARRAY_SIZE(xocl_drv_unreg_funcs) - 1; i >= 0; i--)
+		xocl_drv_unreg_funcs[i]();
+
+	class_destroy(xrt_class);
+	xrt_class = NULL;
+}
+
+module_init(xocl_init);
+module_exit(xocl_exit);
+
+MODULE_VERSION(XRT_DRIVER_VERSION);
+
+MODULE_DESCRIPTION(XOCL_DRIVER_DESC);
+MODULE_AUTHOR("Lizhi Hou <lizhi.hou@xilinx.com>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_ioctl.c b/drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
new file mode 100644
index 000000000000..665ecb0e27ac
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
@@ -0,0 +1,396 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2018 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Sonal Santan
+ *
+ */
+
+#include <linux/version.h>
+#include <drm/drmP.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_mm.h>
+#include <linux/eventfd.h>
+#include <linux/uuid.h>
+#include <linux/hashtable.h>
+#include "../version.h"
+#include "common.h"
+
+int xocl_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+{
+	struct drm_xocl_info *obj = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	struct pci_dev *pdev = xdev->core.pdev;
+	u32 major, minor, patch;
+
+	userpf_info(xdev, "INFO IOCTL");
+
+	if (sscanf(XRT_DRIVER_VERSION, "%d.%d.%d", &major, &minor, &patch) != 3)
+		return -ENODEV;
+
+	obj->vendor = pdev->vendor;
+	obj->device = pdev->device;
+	obj->subsystem_vendor = pdev->subsystem_vendor;
+	obj->subsystem_device = pdev->subsystem_device;
+	obj->driver_version = XOCL_DRV_VER_NUM(major, minor, patch);
+	obj->pci_slot = PCI_SLOT(pdev->devfn);
+
+	return 0;
+}
+
+int xocl_execbuf_ioctl(struct drm_device *dev,
+	void *data, struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	int ret = 0;
+
+	ret = xocl_exec_client_ioctl(drm_p->xdev,
+		       DRM_XOCL_EXECBUF, data, filp);
+
+	return ret;
+}
+
+/*
+ * Create a context (only shared contexts are supported today) on a CU. Take a
+ * lock on the xclbin if it has not been acquired before, and share that lock
+ * across all context requests from the same process.
+ */
+int xocl_ctx_ioctl(struct drm_device *dev, void *data,
+		   struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	int ret = 0;
+
+	ret = xocl_exec_client_ioctl(drm_p->xdev,
+		       DRM_XOCL_CTX, data, filp);
+
+	return ret;
+}
+
+int xocl_user_intr_ioctl(struct drm_device *dev, void *data,
+			 struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	struct drm_xocl_user_intr *args = data;
+	int	ret = 0;
+
+	xocl_info(dev->dev, "USER INTR ioctl");
+
+	if (args->fd < 0)
+		return -EINVAL;
+
+	xocl_dma_intr_register(xdev, args->msix, NULL, NULL, args->fd);
+	xocl_dma_intr_config(xdev, args->msix, true);
+
+	return ret;
+}
+
+char *kind_to_string(enum axlf_section_kind kind)
+{
+	switch (kind) {
+	case 0:	 return "BITSTREAM";
+	case 1:	 return "CLEARING_BITSTREAM";
+	case 2:	 return "EMBEDDED_METADATA";
+	case 3:	 return "FIRMWARE";
+	case 4:	 return "DEBUG_DATA";
+	case 5:	 return "SCHED_FIRMWARE";
+	case 6:	 return "MEM_TOPOLOGY";
+	case 7:	 return "CONNECTIVITY";
+	case 8:	 return "IP_LAYOUT";
+	case 9:	 return "DEBUG_IP_LAYOUT";
+	case 10: return "DESIGN_CHECK_POINT";
+	case 11: return "CLOCK_FREQ_TOPOLOGY";
+	default: return "UNKNOWN";
+	}
+}
+
+/* should become obsolete once the mailbox is implemented */
+static const struct axlf_section_header *
+get_axlf_section(const struct axlf *top, enum axlf_section_kind kind)
+{
+	int i = 0;
+
+	DRM_INFO("Finding %s section header", kind_to_string(kind));
+	for (i = 0; i < top->m_header.m_numSections; i++) {
+		if (top->m_sections[i].m_sectionKind == kind)
+			return &top->m_sections[i];
+	}
+	DRM_INFO("Did not find AXLF section %s", kind_to_string(kind));
+	return NULL;
+}
+
+static int
+xocl_check_section(const struct axlf_section_header *header, uint64_t len,
+		enum axlf_section_kind kind)
+{
+	uint64_t offset;
+	uint64_t size;
+
+	DRM_INFO("Section %s details:", kind_to_string(kind));
+	DRM_INFO("  offset = 0x%llx", header->m_sectionOffset);
+	DRM_INFO("  size = 0x%llx", header->m_sectionSize);
+
+	offset = header->m_sectionOffset;
+	size = header->m_sectionSize;
+	if (offset + size <= len)
+		return 0;
+
+	DRM_INFO("Section %s extends beyond xclbin boundary 0x%llx\n",
+			kind_to_string(kind), len);
+	return -EINVAL;
+}
+
+/* Return value: negative on error, otherwise the number of bytes copied */
+static int
+xocl_read_sect(enum axlf_section_kind kind, void **sect, struct axlf *axlf_full)
+{
+	const struct axlf_section_header *memHeader;
+	uint64_t xclbin_len;
+	uint64_t offset;
+	uint64_t size;
+	int err = 0;
+
+	memHeader = get_axlf_section(axlf_full, kind);
+	if (!memHeader)
+		return 0;
+
+	xclbin_len = axlf_full->m_header.m_length;
+	err = xocl_check_section(memHeader, xclbin_len, kind);
+	if (err)
+		return err;
+
+	offset = memHeader->m_sectionOffset;
+	size = memHeader->m_sectionSize;
+	*sect = &((char *)axlf_full)[offset];
+
+	return size;
+}
+
+/*
+ * Should be called with xdev->ctx_list_lock held
+ */
+static uint live_client_size(struct xocl_dev *xdev)
+{
+	const struct list_head *ptr;
+	const struct client_ctx *entry;
+	uint count = 0;
+
+	BUG_ON(!mutex_is_locked(&xdev->ctx_list_lock));
+
+	list_for_each(ptr, &xdev->ctx_list) {
+		entry = list_entry(ptr, struct client_ctx, link);
+		count++;
+	}
+	return count;
+}
+
+static int
+xocl_read_axlf_helper(struct xocl_drm *drm_p, struct drm_xocl_axlf *axlf_ptr)
+{
+	long err = 0;
+	struct axlf *axlf = NULL;
+	struct axlf bin_obj;
+	ssize_t size;
+	int preserve_mem = 0;
+	struct mem_topology *new_topology = NULL, *topology;
+	struct xocl_dev *xdev = drm_p->xdev;
+	uuid_t *xclbin_id;
+
+	userpf_info(xdev, "READ_AXLF IOCTL\n");
+
+	if (!xocl_is_unified(xdev)) {
+		userpf_info(xdev, "XOCL: not unified dsa");
+		return err;
+	}
+
+	if (copy_from_user(&bin_obj, axlf_ptr->xclbin, sizeof(struct axlf)))
+		return -EFAULT;
+
+	if (memcmp(bin_obj.m_magic, "xclbin2", 8))
+		return -EINVAL;
+
+	if (xocl_xrt_version_check(xdev, &bin_obj, true))
+		return -EINVAL;
+
+	if (uuid_is_null(&bin_obj.m_header.uuid)) {
+		// Legacy xclbin, convert legacy id to new id
+		memcpy(&bin_obj.m_header.uuid, &bin_obj.m_header.m_timeStamp, 8);
+	}
+
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	if (!xclbin_id)
+		return -EINVAL;
+	/*
+	 * Support for multiple processes:
+	 * 1. We hold &xdev->ctx_list_lock so no new contexts can be opened
+	 *    and no live contexts can be closed.
+	 * 2. If more than one context exists -- more than one client is
+	 *    connected -- we cannot swap the xclbin and return -EPERM.
+	 * 3. Even if no live contexts exist, there may still be submitted
+	 *    exec BOs from a previous (now closed) context, so we check the
+	 *    outstanding exec BO count and return -EBUSY if any remain.
+	 */
+	if (!uuid_equal(xclbin_id, &bin_obj.m_header.uuid)) {
+		if (atomic_read(&xdev->outstanding_execs)) {
+			userpf_err(xdev, "Current xclbin is busy, can't change\n");
+			return -EBUSY;
+		}
+	}
+
+	//Ignore timestamp matching for AWS platform
+	if (!xocl_is_aws(xdev) && !xocl_verify_timestamp(xdev,
+		bin_obj.m_header.m_featureRomTimeStamp)) {
+		userpf_err(xdev, "TimeStamp of ROM did not match Xclbin\n");
+		return -EINVAL;
+	}
+
+	userpf_info(xdev, "XOCL: VBNV and TimeStamps matched\n");
+
+	if (uuid_equal(xclbin_id, &bin_obj.m_header.uuid)) {
+		userpf_info(xdev, "Skipping repopulation of topology, connectivity, ip_layout data\n");
+		goto done;
+	}
+
+	//Copy from user space and proceed.
+	axlf = vmalloc(bin_obj.m_header.m_length);
+	if (!axlf) {
+		userpf_err(xdev, "Unable to create axlf\n");
+		err = -ENOMEM;
+		goto done;
+	}
+
+	if (copy_from_user(axlf, axlf_ptr->xclbin, bin_obj.m_header.m_length)) {
+		err = -EFAULT;
+		goto done;
+	}
+
+	/* Populating MEM_TOPOLOGY sections. */
+	size = xocl_read_sect(MEM_TOPOLOGY, (void **)&new_topology, axlf);
+	if (size <= 0) {
+		if (size != 0)
+			goto done;
+	} else if (sizeof_sect(new_topology, m_mem_data) != size) {
+		err = -EINVAL;
+		goto done;
+	}
+
+	topology = XOCL_MEM_TOPOLOGY(xdev);
+
+	/*
+	 * Compare MEM_TOPOLOGY previous vs new.
+	 * Ignore this and keep disable preserve_mem if not for aws.
+	 */
+	if (xocl_is_aws(xdev) && (topology != NULL)) {
+		if ((size == sizeof_sect(topology, m_mem_data)) &&
+		    !memcmp(new_topology, topology, size)) {
+			xocl_xdev_info(xdev, "MEM_TOPOLOGY match, preserve mem_topology.");
+			preserve_mem = 1;
+		} else {
+			xocl_xdev_info(xdev, "MEM_TOPOLOGY mismatch, do not preserve mem_topology.");
+		}
+	}
+
+	/* Switching the xclbin, make sure none of the buffers are used. */
+	if (!preserve_mem) {
+		err = xocl_check_topology(drm_p);
+		if (err)
+			goto done;
+		xocl_cleanup_mem(drm_p);
+	}
+
+	err = xocl_icap_download_axlf(xdev, axlf);
+	if (err) {
+		userpf_err(xdev, "%s Fail to download\n", __func__);
+		/*
+		 * Don't just bail out here, always recreate drm mem
+		 * since we have cleaned it up before download.
+		 */
+	}
+
+	if (!preserve_mem) {
+		int rc = xocl_init_mem(drm_p);
+
+		if (err == 0)
+			err = rc;
+	}
+
+done:
+	if (size < 0)
+		err = size;
+	if (err)
+		userpf_err(xdev, "err: %ld\n", err);
+	else
+		userpf_info(xdev, "Loaded xclbin %pUb", xclbin_id);
+	vfree(axlf);
+	return err;
+}
+
+int xocl_read_axlf_ioctl(struct drm_device *dev,
+			 void *data,
+			 struct drm_file *filp)
+{
+	struct drm_xocl_axlf *axlf_obj_ptr = data;
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	struct client_ctx *client = filp->driver_priv;
+	int err = 0;
+	uuid_t *xclbin_id;
+
+	mutex_lock(&xdev->ctx_list_lock);
+	err = xocl_read_axlf_helper(drm_p, axlf_obj_ptr);
+	/*
+	 * Record that userland configured this context for the current device
+	 * xclbin. This does not mean the context holds a lock on the xclbin,
+	 * only that when a lock is eventually acquired it can be verified to
+	 * be against the expected xclbin.
+	 */
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	uuid_copy(&client->xclbin_id,
+			((err || !xclbin_id) ? &uuid_null : xclbin_id));
+	mutex_unlock(&xdev->ctx_list_lock);
+	return err;
+}
+
+uint get_live_client_size(struct xocl_dev *xdev)
+{
+	uint count;
+
+	mutex_lock(&xdev->ctx_list_lock);
+	count = live_client_size(xdev);
+	mutex_unlock(&xdev->ctx_list_lock);
+	return count;
+}
+
+void reset_notify_client_ctx(struct xocl_dev *xdev)
+{
+	xdev->needs_reset = false;
+//	wmb();
+}
+
+int xocl_hot_reset_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+
+	int err = xocl_hot_reset(xdev, false);
+
+	userpf_info(xdev, "%s err: %d\n", __func__, err);
+	return err;
+}
+
+int xocl_reclock_ioctl(struct drm_device *dev, void *data,
+	struct drm_file *filp)
+{
+	struct xocl_drm *drm_p = dev->dev_private;
+	struct xocl_dev *xdev = drm_p->xdev;
+	int err = xocl_reclock(xdev, data);
+
+	userpf_info(xdev, "%s err: %d\n", __func__, err);
+	return err;
+}
diff --git a/drivers/gpu/drm/xocl/userpf/xocl_sysfs.c b/drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
new file mode 100644
index 000000000000..fccb27906897
--- /dev/null
+++ b/drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * A GEM style device manager for PCIe based OpenCL accelerators.
+ *
+ * Copyright (C) 2016-2019 Xilinx, Inc. All rights reserved.
+ *
+ * Authors: Lizhi.Hou@xilinx.com
+ *
+ */
+#include "common.h"
+
+/* Attributes, followed by bin_attributes. */
+
+/* -Attributes -- */
+
+/* -xclbinuuid-- (supersedes xclbinid) */
+static ssize_t xclbinuuid_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	uuid_t *xclbin_id;
+
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	return sprintf(buf, "%pUb\n", xclbin_id ? xclbin_id : 0);
+}
+
+static DEVICE_ATTR_RO(xclbinuuid);
+
+/* -userbar-- */
+static ssize_t userbar_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%d\n", xdev->core.bar_idx);
+}
+
+static DEVICE_ATTR_RO(userbar);
+
+static ssize_t user_pf_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	/* The existence of this node indicates a user physical function. */
+	return sprintf(buf, "%s", "");
+}
+static DEVICE_ATTR_RO(user_pf);
+
+/* -live client contexts-- */
+static ssize_t kdsstat_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	int size;
+	uuid_t *xclbin_id;
+
+	xclbin_id = (uuid_t *)xocl_icap_get_data(xdev, XCLBIN_UUID);
+	size = sprintf(buf,
+			   "xclbin:\t\t\t%pUb\noutstanding execs:\t%d\ntotal execs:\t\t%lld\ncontexts:\t\t%d\n",
+			   xclbin_id ? xclbin_id : 0,
+			   atomic_read(&xdev->outstanding_execs),
+			   atomic64_read(&xdev->total_execs),
+			   get_live_client_size(xdev));
+	return size;
+}
+static DEVICE_ATTR_RO(kdsstat);
+
+static ssize_t xocl_mm_stat(struct xocl_dev *xdev, char *buf, bool raw)
+{
+	int i;
+	ssize_t count = 0;
+	ssize_t size = 0;
+	size_t memory_usage = 0;
+	unsigned int bo_count = 0;
+	const char *txt_fmt = "[%s] %s@0x%012llx (%lluMB): %lluKB %dBOs\n";
+	const char *raw_fmt = "%llu %d\n";
+	struct mem_topology *topo = NULL;
+	struct drm_xocl_mm_stat stat;
+	void *drm_hdl;
+
+	drm_hdl = xocl_dma_get_drm_handle(xdev);
+	if (!drm_hdl)
+		return -EINVAL;
+
+	mutex_lock(&xdev->ctx_list_lock);
+
+	topo = XOCL_MEM_TOPOLOGY(xdev);
+	if (!topo) {
+		mutex_unlock(&xdev->ctx_list_lock);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < topo->m_count; i++) {
+		xocl_mm_get_usage_stat(drm_hdl, i, &stat);
+
+		if (raw) {
+			memory_usage = 0;
+			bo_count = 0;
+			memory_usage = stat.memory_usage;
+			bo_count = stat.bo_count;
+
+			count = sprintf(buf, raw_fmt,
+				memory_usage,
+				bo_count);
+		} else {
+			count = sprintf(buf, txt_fmt,
+				topo->m_mem_data[i].m_used ?
+				"IN-USE" : "UNUSED",
+				topo->m_mem_data[i].m_tag,
+				topo->m_mem_data[i].m_base_address,
+				topo->m_mem_data[i].m_size / 1024,
+				stat.memory_usage / 1024,
+				stat.bo_count);
+		}
+		buf += count;
+		size += count;
+	}
+	mutex_unlock(&xdev->ctx_list_lock);
+	return size;
+}
+
+/* -live memory usage-- */
+static ssize_t memstat_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	return xocl_mm_stat(xdev, buf, false);
+}
+static DEVICE_ATTR_RO(memstat);
+
+static ssize_t memstat_raw_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	return xocl_mm_stat(xdev, buf, true);
+}
+static DEVICE_ATTR_RO(memstat_raw);
+
+static ssize_t p2p_enable_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	u64 size;
+
+	if (xdev->p2p_bar_addr)
+		return sprintf(buf, "1\n");
+	else if (xocl_get_p2p_bar(xdev, &size) >= 0 &&
+			size > (1 << XOCL_PA_SECTION_SHIFT))
+		return sprintf(buf, "2\n");
+
+	return sprintf(buf, "0\n");
+}
+
+static ssize_t p2p_enable_store(struct device *dev,
+		struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	struct pci_dev *pdev = xdev->core.pdev;
+	int ret, p2p_bar;
+	u32 enable;
+	u64 size;
+
+
+	if (kstrtou32(buf, 10, &enable) || enable > 1)
+		return -EINVAL;
+
+	p2p_bar = xocl_get_p2p_bar(xdev, NULL);
+	if (p2p_bar < 0) {
+		xocl_err(&pdev->dev, "p2p bar is not configurable");
+		return -EACCES;
+	}
+
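+	/* Convert the total DDR size (in GB) into a ReBAR size code: round
+	 * the GB count up to a power of two, take log2 and add 10 so the
+	 * code encodes 2^(code + 20) bytes. Disabling falls back to the
+	 * minimum section-sized BAR.
+	 */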
+	size = xocl_get_ddr_channel_size(xdev) *
+		xocl_get_ddr_channel_count(xdev); /* GB */
+	size = (ffs(size) == fls(size)) ? (fls(size) - 1) : fls(size);
+	size = enable ? (size + 10) : (XOCL_PA_SECTION_SHIFT - 20);
+	xocl_info(&pdev->dev, "Resize p2p bar %d to %d M ", p2p_bar,
+			(1 << size));
+	xocl_p2p_mem_release(xdev, false);
+
+	ret = xocl_pci_resize_resource(pdev, p2p_bar, size);
+	if (ret) {
+		xocl_err(&pdev->dev, "Failed to resize p2p BAR %d", ret);
+		goto failed;
+	}
+
+	xdev->p2p_bar_idx = p2p_bar;
+	xdev->p2p_bar_len = pci_resource_len(pdev, p2p_bar);
+
+	if (enable) {
+		ret = xocl_p2p_mem_reserve(xdev);
+		if (ret) {
+			xocl_err(&pdev->dev, "Failed to reserve p2p memory %d",
+					ret);
+		}
+	}
+
+	return count;
+
+failed:
+	return ret;
+
+}
+
+static DEVICE_ATTR(p2p_enable, 0644, p2p_enable_show, p2p_enable_store);
+
+static ssize_t dev_offline_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	int val = xdev->core.offline ? 1 : 0;
+
+	return sprintf(buf, "%d\n", val);
+}
+static ssize_t dev_offline_store(struct device *dev,
+		struct device_attribute *da, const char *buf, size_t count)
+{
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+	int ret;
+	u32 offline;
+
+
+	if (kstrtou32(buf, 10, &offline) || offline > 1)
+		return -EINVAL;
+
+	device_lock(dev);
+	if (offline) {
+		xocl_subdev_destroy_all(xdev);
+		xdev->core.offline = true;
+	} else {
+		ret = xocl_subdev_create_all(xdev, xdev->core.priv.subdev_info,
+				xdev->core.priv.subdev_num);
+		if (ret) {
+			xocl_err(dev, "Online subdevices failed");
+			device_unlock(dev);
+			return -EIO;
+		}
+		xdev->core.offline = false;
+	}
+	device_unlock(dev);
+
+	return count;
+}
+
+static DEVICE_ATTR(dev_offline, 0644, dev_offline_show, dev_offline_store);
+
+static ssize_t mig_calibration_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "0\n");
+}
+
+static DEVICE_ATTR_RO(mig_calibration);
+
+static ssize_t link_width_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	get_pcie_link_info(xdev, &width, &speed, false);
+	return sprintf(buf, "%d\n", width);
+}
+static DEVICE_ATTR_RO(link_width);
+
+static ssize_t link_speed_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	get_pcie_link_info(xdev, &width, &speed, false);
+	return sprintf(buf, "%d\n", speed);
+}
+static DEVICE_ATTR_RO(link_speed);
+
+static ssize_t link_width_max_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	get_pcie_link_info(xdev, &width, &speed, true);
+	return sprintf(buf, "%d\n", width);
+}
+static DEVICE_ATTR_RO(link_width_max);
+
+static ssize_t link_speed_max_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	unsigned short speed, width;
+	struct xocl_dev *xdev = dev_get_drvdata(dev);
+
+	get_pcie_link_info(xdev, &width, &speed, true);
+	return sprintf(buf, "%d\n", speed);
+}
+static DEVICE_ATTR_RO(link_speed_max);
+/* End of attributes */
+
+static struct attribute *xocl_attrs[] = {
+	&dev_attr_xclbinuuid.attr,
+	&dev_attr_userbar.attr,
+	&dev_attr_kdsstat.attr,
+	&dev_attr_memstat.attr,
+	&dev_attr_memstat_raw.attr,
+	&dev_attr_user_pf.attr,
+	&dev_attr_p2p_enable.attr,
+	&dev_attr_dev_offline.attr,
+	&dev_attr_mig_calibration.attr,
+	&dev_attr_link_width.attr,
+	&dev_attr_link_speed.attr,
+	&dev_attr_link_speed_max.attr,
+	&dev_attr_link_width_max.attr,
+	NULL,
+};
+
+static struct attribute_group xocl_attr_group = {
+	.attrs = xocl_attrs,
+};
+
+/* sysfs setup and teardown entry points */
+int xocl_init_sysfs(struct device *dev)
+{
+	int ret;
+	struct pci_dev *rdev;
+
+	ret = sysfs_create_group(&dev->kobj, &xocl_attr_group);
+	if (ret) {
+		xocl_err(dev, "create xocl attrs failed: %d", ret);
+		return ret;
+	}
+
+	xocl_get_root_dev(to_pci_dev(dev), rdev);
+	ret = sysfs_create_link(&dev->kobj, &rdev->dev.kobj, "root_dev");
+	if (ret)
+		xocl_err(dev, "create root device link failed: %d", ret);
+
+	return ret;
+}
+
+void xocl_fini_sysfs(struct device *dev)
+{
+	sysfs_remove_link(&dev->kobj, "root_dev");
+	sysfs_remove_group(&dev->kobj, &xocl_attr_group);
+}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
                   ` (5 preceding siblings ...)
  2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 6/6] Add user physical function driver sonal.santan
@ 2019-03-25 20:28 ` Daniel Vetter
  2019-03-26 23:30   ` Sonal Santan
  6 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2019-03-25 20:28 UTC (permalink / raw)
  To: sonal.santan
  Cc: gregkh, cyrilc, linux-kernel, lizhih, michals, dri-devel, airlied

On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.santan@xilinx.com wrote:
> From: Sonal Santan <sonal.santan@xilinx.com>
> 
> Hello,
> 
> This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> These drivers are part of Xilinx Runtime (XRT) open source stack and
> have been deployed by leading FaaS vendors and many enterprise customers.

Cool, first fpga driver submitted to drm! And from a high level I think
this makes a lot of sense.

> PLATFORM ARCHITECTURE
> 
> Alveo PCIe platforms have a static shell and a reconfigurable (dynamic)
> region. The shell is automatically loaded from PROM when host is booted
> and PCIe is enumerated by BIOS. Shell cannot be changed till next cold
> reboot. The shell exposes two physical functions: management physical
> function and user physical function.
> 
> Users compile their high level design in C/C++/OpenCL or RTL into FPGA
> image using SDx compiler. The FPGA image packaged as xclbin file can be
> loaded onto reconfigurable region. The image may contain one or more
> compute unit. Users can dynamically swap the full image running on the
> reconfigurable region in order to switch between different workloads.
> 
> XRT DRIVERS
> 
> XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is modular and
> organized into several platform drivers which primarily handle the
> following functionality:
> 1.  ICAP programming (FPGA bitstream download with FPGA Mgr integration)
> 2.  Clock scaling
> 3.  Loading firmware container also called dsabin (embedded Microblaze
>     firmware for ERT and XMC, optional clearing bitstream)
> 4.  In-band sensors: temp, voltage, power, etc.
> 5.  AXI Firewall management
> 6.  Device reset and rescan
> 7.  Hardware mailbox for communication between two physical functions
> 
> XRT Linux kernel driver xocl binds to user pf. Like its peer, this driver
> is also modular and organized into several platform drivers which handle
> the following functionality:
> 1.  Device memory topology discovery and memory management
> 2.  Buffer object abstraction and management for client process
> 3.  XDMA MM PCIe DMA engine programming
> 4.  Multi-process aware context management
> 5.  Compute unit execution management (optionally with help of ERT) for
>     client processes
> 6.  Hardware mailbox for communication between two physical functions
> 
> The drivers export ioctls and sysfs nodes for various services. xocl
> driver makes heavy use of DRM GEM features for device memory management,
> reference counting, mmap support and export/import. xocl also includes a
> simple scheduler called KDS which schedules compute units and interacts
> with hardware scheduler running ERT firmware. The scheduler understands
> custom opcodes packaged into command objects and provides an asynchronous
> command done notification via POSIX poll.
> 
> More details on architecture, software APIs, ioctl definitions, execution
> model, etc. is available as Sphinx documentation--
> 
> https://xilinx.github.io/XRT/2018.3/html/index.html
> 
> The complete runtime software stack (XRT) which includes out of tree
> kernel drivers, user space libraries, board utilities and firmware for
> the hardware scheduler is open source and available at
> https://github.com/Xilinx/XRT

Before digging into the implementation side more I looked into the
userspace here. I admit I got lost a bit, since there's lots of
indirections and abstractions going on, but it seems like this is just a
fancy ioctl wrapper/driver backend abstractions. Not really something
applications would use.

From the pretty picture on github it looks like there's some
opencl/ml/other fancy stuff sitting on top that applications would use. Is
that also available?

Thanks, Daniel

> 
> Thanks,
> -Sonal
> 
> Sonal Santan (6):
>   Add skeleton code: ioctl definitions and build hooks
>   Global data structures shared between xocl and xmgmt drivers
>   Add platform drivers for various IPs and frameworks
>   Add core of XDMA driver
>   Add management driver
>   Add user physical function driver
> 
>  drivers/gpu/drm/Kconfig                    |    2 +
>  drivers/gpu/drm/Makefile                   |    1 +
>  drivers/gpu/drm/xocl/Kconfig               |   22 +
>  drivers/gpu/drm/xocl/Makefile              |    3 +
>  drivers/gpu/drm/xocl/devices.h             |  954 +++++
>  drivers/gpu/drm/xocl/ert.h                 |  385 ++
>  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
>  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
>  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
>  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
>  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
>  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
>  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
>  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
>  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
>  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
>  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
>  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
>  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
>  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
>  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
>  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
>  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
>  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
>  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
>  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
>  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
>  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
>  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
>  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
>  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
>  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
>  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
>  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
>  drivers/gpu/drm/xocl/version.h             |   22 +
>  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
>  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
>  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
>  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
>  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
>  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
>  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
>  include/uapi/drm/xmgmt_drm.h               |  204 +
>  include/uapi/drm/xocl_drm.h                |  483 +++
>  50 files changed, 28252 insertions(+)
>  create mode 100644 drivers/gpu/drm/xocl/Kconfig
>  create mode 100644 drivers/gpu/drm/xocl/Makefile
>  create mode 100644 drivers/gpu/drm/xocl/devices.h
>  create mode 100644 drivers/gpu/drm/xocl/ert.h
>  create mode 100644 drivers/gpu/drm/xocl/lib/Makefile.in
>  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
>  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
>  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
>  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
>  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c
>  create mode 100644 drivers/gpu/drm/xocl/subdev/xvc.c
>  create mode 100644 drivers/gpu/drm/xocl/userpf/Makefile
>  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
>  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
>  create mode 100644 drivers/gpu/drm/xocl/version.h
>  create mode 100644 drivers/gpu/drm/xocl/xclbin.h
>  create mode 100644 drivers/gpu/drm/xocl/xclfeatures.h
>  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c
>  create mode 100644 drivers/gpu/drm/xocl/xocl_drm.h
>  create mode 100644 drivers/gpu/drm/xocl/xocl_drv.h
>  create mode 100644 drivers/gpu/drm/xocl/xocl_subdev.c
>  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
>  create mode 100644 include/uapi/drm/xmgmt_drm.h
>  create mode 100644 include/uapi/drm/xocl_drm.h
> 
> --
> 2.17.0
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-25 20:28 ` [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver Daniel Vetter
@ 2019-03-26 23:30   ` Sonal Santan
  2019-03-27  8:22     ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Sonal Santan @ 2019-03-26 23:30 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, gregkh, Cyril Chemparathy, linux-kernel, Lizhi Hou,
	Michal Simek, airlied



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, March 25, 2019 1:28 PM
> To: Sonal Santan <sonals@xilinx.com>
> Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org; Cyril
> Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org; Lizhi Hou
> <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>; airlied@redhat.com
> Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> 
> On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.santan@xilinx.com wrote:
> > From: Sonal Santan <sonal.santan@xilinx.com>
> >
> > Hello,
> >
> > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > These drivers are part of Xilinx Runtime (XRT) open source stack and
> > have been deployed by leading FaaS vendors and many enterprise
> customers.
> 
> Cool, first fpga driver submitted to drm! And from a high level I think this
> makes a lot of sense.
> 
> > PLATFORM ARCHITECTURE
> >
> > Alveo PCIe platforms have a static shell and a reconfigurable
> > (dynamic) region. The shell is automatically loaded from PROM when
> > host is booted and PCIe is enumerated by BIOS. Shell cannot be changed
> > till next cold reboot. The shell exposes two physical functions:
> > management physical function and user physical function.
> >
> > Users compile their high level design in C/C++/OpenCL or RTL into FPGA
> > image using SDx compiler. The FPGA image packaged as xclbin file can
> > be loaded onto reconfigurable region. The image may contain one or
> > more compute unit. Users can dynamically swap the full image running
> > on the reconfigurable region in order to switch between different
> workloads.
> >
> > XRT DRIVERS
> >
> > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is modular
> > and organized into several platform drivers which primarily handle the
> > following functionality:
> > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > integration) 2.  Clock scaling 3.  Loading firmware container also
> > called dsabin (embedded Microblaze
> >     firmware for ERT and XMC, optional clearing bitstream) 4.  In-band
> > sensors: temp, voltage, power, etc.
> > 5.  AXI Firewall management
> > 6.  Device reset and rescan
> > 7.  Hardware mailbox for communication between two physical functions
> >
> > XRT Linux kernel driver xocl binds to user pf. Like its peer, this
> > driver is also modular and organized into several platform drivers
> > which handle the following functionality:
> > 1.  Device memory topology discovery and memory management 2.  Buffer
> > object abstraction and management for client process 3.  XDMA MM PCIe
> > DMA engine programming 4.  Multi-process aware context management 5.
> > Compute unit execution management (optionally with help of ERT) for
> >     client processes
> > 6.  Hardware mailbox for communication between two physical functions
> >
> > The drivers export ioctls and sysfs nodes for various services. xocl
> > driver makes heavy use of DRM GEM features for device memory
> > management, reference counting, mmap support and export/import. xocl
> > also includes a simple scheduler called KDS which schedules compute
> > units and interacts with hardware scheduler running ERT firmware. The
> > scheduler understands custom opcodes packaged into command objects
> and
> > provides an asynchronous command done notification via POSIX poll.
> >
> > More details on architecture, software APIs, ioctl definitions,
> > execution model, etc. is available as Sphinx documentation--
> >
> > https://xilinx.github.io/XRT/2018.3/html/index.html
> >
> > The complete runtime software stack (XRT) which includes out of tree
> > kernel drivers, user space libraries, board utilities and firmware for
> > the hardware scheduler is open source and available at
> > https://github.com/Xilinx/XRT
> 
> Before digging into the implementation side more I looked into the userspace
> here. I admit I got lost a bit, since there's lots of indirections and abstractions
> going on, but it seems like this is just a fancy ioctl wrapper/driver backend
> abstractions. Not really something applications would use.
>

Appreciate your feedback. 

The userspace libraries define a common abstraction but have different implementations
for the Zynq UltraScale+ embedded platform, PCIe-based Alveo (and FaaS), and emulation
flows. The latter lets you run your application without physical hardware.
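
To give a rough idea of the layering (the type and function names below are
made up for illustration, they are not the actual XRT symbols), the common
abstraction can be thought of as an operations table that each flow fills in
with its own implementation:

    #include <stddef.h>

    /* Hypothetical sketch of the "one API, several backends" layering. */
    struct xrt_backend_ops {
            void *(*open)(unsigned int device_index);
            int   (*load_xclbin)(void *dev, const void *xclbin, size_t len);
            int   (*exec_cmd)(void *dev, const void *cmd, size_t len);
            void  (*close)(void *dev);
    };

    /* One instance per flow: PCIe Alveo/FaaS, Zynq UltraScale+, emulation. */
    extern const struct xrt_backend_ops alveo_pcie_ops;
    extern const struct xrt_backend_ops zynq_edge_ops;
    extern const struct xrt_backend_ops emulation_ops;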

> 
> From the pretty picture on github it looks like there's some opencl/ml/other
> fancy stuff sitting on top that applications would use. Is that also available?

The full OpenCL runtime is available in the same repository. Xilinx ML Suite is
also based on XRT and its source can be found at https://github.com/Xilinx/ml-suite.

Typically end users use the OpenCL APIs, which are part of the XRT stack. One can also
write an application that directly calls the XRT APIs defined at
https://xilinx.github.io/XRT/2018.3/html/xclhal2.main.html
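
A minimal host application along these lines, assuming the 2018.3-era
xclhal2 C API (exact signatures should be checked against xclhal2.h in the
XRT tree; the device index 0 below is just a placeholder), would look
roughly like:

    #include <stdio.h>
    #include "xclhal2.h"

    int main(void)
    {
            /* Enumerate user PF devices bound to the xocl driver. */
            unsigned count = xclProbe();
            if (!count) {
                    fprintf(stderr, "no Alveo devices found\n");
                    return 1;
            }

            xclDeviceHandle h = xclOpen(0, NULL, XCL_QUIET);
            if (!h)
                    return 1;

            /* Query basic device information exposed by the driver. */
            struct xclDeviceInfo2 info;
            if (!xclGetDeviceInfo2(h, &info))
                    printf("device %s, %llu bytes of DDR\n", info.mName,
                           (unsigned long long)info.mDDRSize);

            xclClose(h);
            return 0;
    }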

Thanks,
-Sonal
> 
> Thanks, Daniel
> 
> >
> > Thanks,
> > -Sonal
> >
> > Sonal Santan (6):
> >   Add skeleton code: ioctl definitions and build hooks
> >   Global data structures shared between xocl and xmgmt drivers
> >   Add platform drivers for various IPs and frameworks
> >   Add core of XDMA driver
> >   Add management driver
> >   Add user physical function driver
> >
> >  drivers/gpu/drm/Kconfig                    |    2 +
> >  drivers/gpu/drm/Makefile                   |    1 +
> >  drivers/gpu/drm/xocl/Kconfig               |   22 +
> >  drivers/gpu/drm/xocl/Makefile              |    3 +
> >  drivers/gpu/drm/xocl/devices.h             |  954 +++++
> >  drivers/gpu/drm/xocl/ert.h                 |  385 ++
> >  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
> >  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
> >  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
> >  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
> >  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
> >  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
> >  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
> >  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
> >  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
> >  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
> >  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
> >  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
> >  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
> >  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
> >  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
> >  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
> >  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
> >  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
> >  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
> >  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
> >  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
> >  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
> >  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
> >  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
> >  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
> >  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
> >  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
> >  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
> >  drivers/gpu/drm/xocl/version.h             |   22 +
> >  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
> >  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
> >  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
> >  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
> >  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
> >  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
> >  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
> >  include/uapi/drm/xmgmt_drm.h               |  204 +
> >  include/uapi/drm/xocl_drm.h                |  483 +++
> >  50 files changed, 28252 insertions(+)  create mode 100644
> > drivers/gpu/drm/xocl/Kconfig  create mode 100644
> > drivers/gpu/drm/xocl/Makefile  create mode 100644
> > drivers/gpu/drm/xocl/devices.h  create mode 100644
> > drivers/gpu/drm/xocl/ert.h  create mode 100644
> > drivers/gpu/drm/xocl/lib/Makefile.in
> >  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
> >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
> >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
> >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
> >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c  create mode
> > 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c  create mode
> > 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
> >  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c  create mode
> > 100644 drivers/gpu/drm/xocl/subdev/xvc.c  create mode 100644
> > drivers/gpu/drm/xocl/userpf/Makefile
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
> >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
> >  create mode 100644 drivers/gpu/drm/xocl/version.h  create mode 100644
> > drivers/gpu/drm/xocl/xclbin.h  create mode 100644
> > drivers/gpu/drm/xocl/xclfeatures.h
> >  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c  create mode
> > 100644 drivers/gpu/drm/xocl/xocl_drm.h  create mode 100644
> > drivers/gpu/drm/xocl/xocl_drv.h  create mode 100644
> > drivers/gpu/drm/xocl/xocl_subdev.c
> >  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
> >  create mode 100644 include/uapi/drm/xmgmt_drm.h  create mode 100644
> > include/uapi/drm/xocl_drm.h
> >
> > --
> > 2.17.0
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-26 23:30   ` Sonal Santan
@ 2019-03-27  8:22     ` Daniel Vetter
  2019-03-27 12:50       ` Sonal Santan
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2019-03-27  8:22 UTC (permalink / raw)
  To: Sonal Santan
  Cc: dri-devel, gregkh, Cyril Chemparathy, linux-kernel, Lizhi Hou,
	Michal Simek, airlied

On Wed, Mar 27, 2019 at 12:30 AM Sonal Santan <sonals@xilinx.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Monday, March 25, 2019 1:28 PM
> > To: Sonal Santan <sonals@xilinx.com>
> > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org; Cyril
> > Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org; Lizhi Hou
> > <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>; airlied@redhat.com
> > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> >
> > On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.santan@xilinx.com wrote:
> > > From: Sonal Santan <sonal.santan@xilinx.com>
> > >
> > > Hello,
> > >
> > > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > > These drivers are part of Xilinx Runtime (XRT) open source stack and
> > > have been deployed by leading FaaS vendors and many enterprise
> > customers.
> >
> > Cool, first fpga driver submitted to drm! And from a high level I think this
> > makes a lot of sense.
> >
> > > PLATFORM ARCHITECTURE
> > >
> > > Alveo PCIe platforms have a static shell and a reconfigurable
> > > (dynamic) region. The shell is automatically loaded from PROM when
> > > host is booted and PCIe is enumerated by BIOS. Shell cannot be changed
> > > till next cold reboot. The shell exposes two physical functions:
> > > management physical function and user physical function.
> > >
> > > Users compile their high level design in C/C++/OpenCL or RTL into FPGA
> > > image using SDx compiler. The FPGA image packaged as xclbin file can
> > > be loaded onto reconfigurable region. The image may contain one or
> > > more compute unit. Users can dynamically swap the full image running
> > > on the reconfigurable region in order to switch between different
> > workloads.
> > >
> > > XRT DRIVERS
> > >
> > > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is modular
> > > and organized into several platform drivers which primarily handle the
> > > following functionality:
> > > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > > integration) 2.  Clock scaling 3.  Loading firmware container also
> > > called dsabin (embedded Microblaze
> > >     firmware for ERT and XMC, optional clearing bitstream) 4.  In-band
> > > sensors: temp, voltage, power, etc.
> > > 5.  AXI Firewall management
> > > 6.  Device reset and rescan
> > > 7.  Hardware mailbox for communication between two physical functions
> > >
> > > XRT Linux kernel driver xocl binds to user pf. Like its peer, this
> > > driver is also modular and organized into several platform drivers
> > > which handle the following functionality:
> > > 1.  Device memory topology discovery and memory management 2.  Buffer
> > > object abstraction and management for client process 3.  XDMA MM PCIe
> > > DMA engine programming 4.  Multi-process aware context management 5.
> > > Compute unit execution management (optionally with help of ERT) for
> > >     client processes
> > > 6.  Hardware mailbox for communication between two physical functions
> > >
> > > The drivers export ioctls and sysfs nodes for various services. xocl
> > > driver makes heavy use of DRM GEM features for device memory
> > > management, reference counting, mmap support and export/import. xocl
> > > also includes a simple scheduler called KDS which schedules compute
> > > units and interacts with hardware scheduler running ERT firmware. The
> > > scheduler understands custom opcodes packaged into command objects
> > and
> > > provides an asynchronous command done notification via POSIX poll.
> > >
> > > More details on architecture, software APIs, ioctl definitions,
> > > execution model, etc. is available as Sphinx documentation--
> > >
> > > https://xilinx.github.io/XRT/2018.3/html/index.html
> > >
> > > The complete runtime software stack (XRT) which includes out of tree
> > > kernel drivers, user space libraries, board utilities and firmware for
> > > the hardware scheduler is open source and available at
> > > https://github.com/Xilinx/XRT
> >
> > Before digging into the implementation side more I looked into the userspace
> > here. I admit I got lost a bit, since there's lots of indirections and abstractions
> > going on, but it seems like this is just a fancy ioctl wrapper/driver backend
> > abstractions. Not really something applications would use.
>
> Appreciate your feedback.
>
> The userspace libraries define a common abstraction but have different implementations
> for Zynq Ultrascale+ embedded platform, PCIe based Alveo (and Faas) and emulation
> flows. The latter lets you run your application without physical hardware.
>
> >
> > From the pretty picture on github it looks like there's some opencl/ml/other
> > fancy stuff sitting on top that applications would use. Is that also available?
>
> The full OpenCL runtime is available in the same repository. Xilinx ML Suite is
> also based on XRT and its source can be found at https://github.com/Xilinx/ml-suite.

Hm, I did a few git grep for the usual opencl entry points, but didn't
find anything. Do I need to run some build scripts first (which
downloads additional sourcecode)? Or is there some symbol mangling
going on and that's why I don't find anything? Pointers very much
appreciated.

> Typically end users use OpenCL APIs which are part of XRT stack. One can write an
> application to directly call XRT APIs defined at
> https://xilinx.github.io/XRT/2018.3/html/xclhal2.main.html

I have no clue about DNN/ML unfortunately, I think I'll try to look
into the ocl side a bit more first.

Thanks, Daniel

>
> Thanks,
> -Sonal
> >
> > Thanks, Daniel
> >
> > >
> > > Thanks,
> > > -Sonal
> > >
> > > Sonal Santan (6):
> > >   Add skeleton code: ioctl definitions and build hooks
> > >   Global data structures shared between xocl and xmgmt drivers
> > >   Add platform drivers for various IPs and frameworks
> > >   Add core of XDMA driver
> > >   Add management driver
> > >   Add user physical function driver
> > >
> > >  drivers/gpu/drm/Kconfig                    |    2 +
> > >  drivers/gpu/drm/Makefile                   |    1 +
> > >  drivers/gpu/drm/xocl/Kconfig               |   22 +
> > >  drivers/gpu/drm/xocl/Makefile              |    3 +
> > >  drivers/gpu/drm/xocl/devices.h             |  954 +++++
> > >  drivers/gpu/drm/xocl/ert.h                 |  385 ++
> > >  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
> > >  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
> > >  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
> > >  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
> > >  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
> > >  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
> > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
> > >  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
> > >  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
> > >  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
> > >  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
> > >  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
> > >  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
> > >  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
> > >  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
> > >  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
> > >  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
> > >  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
> > >  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
> > >  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
> > >  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
> > >  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
> > >  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
> > >  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
> > >  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
> > >  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
> > >  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
> > >  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
> > >  drivers/gpu/drm/xocl/version.h             |   22 +
> > >  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
> > >  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
> > >  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
> > >  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
> > >  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
> > >  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
> > >  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
> > >  include/uapi/drm/xmgmt_drm.h               |  204 +
> > >  include/uapi/drm/xocl_drm.h                |  483 +++
> > >  50 files changed, 28252 insertions(+)  create mode 100644
> > > drivers/gpu/drm/xocl/Kconfig  create mode 100644
> > > drivers/gpu/drm/xocl/Makefile  create mode 100644
> > > drivers/gpu/drm/xocl/devices.h  create mode 100644
> > > drivers/gpu/drm/xocl/ert.h  create mode 100644
> > > drivers/gpu/drm/xocl/lib/Makefile.in
> > >  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
> > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
> > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
> > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
> > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c  create mode
> > > 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c  create mode
> > > 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
> > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c  create mode
> > > 100644 drivers/gpu/drm/xocl/subdev/xvc.c  create mode 100644
> > > drivers/gpu/drm/xocl/userpf/Makefile
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
> > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
> > >  create mode 100644 drivers/gpu/drm/xocl/version.h  create mode 100644
> > > drivers/gpu/drm/xocl/xclbin.h  create mode 100644
> > > drivers/gpu/drm/xocl/xclfeatures.h
> > >  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c  create mode
> > > 100644 drivers/gpu/drm/xocl/xocl_drm.h  create mode 100644
> > > drivers/gpu/drm/xocl/xocl_drv.h  create mode 100644
> > > drivers/gpu/drm/xocl/xocl_subdev.c
> > >  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
> > >  create mode 100644 include/uapi/drm/xmgmt_drm.h  create mode 100644
> > > include/uapi/drm/xocl_drm.h
> > >
> > > --
> > > 2.17.0
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-27  8:22     ` Daniel Vetter
@ 2019-03-27 12:50       ` Sonal Santan
  2019-03-27 14:11         ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Sonal Santan @ 2019-03-27 12:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, gregkh, Cyril Chemparathy, linux-kernel, Lizhi Hou,
	Michal Simek, airlied



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel@ffwll.ch]
> Sent: Wednesday, March 27, 2019 1:23 AM
> To: Sonal Santan <sonals@xilinx.com>
> Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org; Cyril
> Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org; Lizhi Hou
> <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>; airlied@redhat.com
> Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> 
> On Wed, Mar 27, 2019 at 12:30 AM Sonal Santan <sonals@xilinx.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > Daniel Vetter
> > > Sent: Monday, March 25, 2019 1:28 PM
> > > To: Sonal Santan <sonals@xilinx.com>
> > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > Cyril Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org;
> > > Lizhi Hou <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>;
> > > airlied@redhat.com
> > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator
> > > driver
> > >
> > > On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.santan@xilinx.com
> wrote:
> > > > From: Sonal Santan <sonal.santan@xilinx.com>
> > > >
> > > > Hello,
> > > >
> > > > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > > > These drivers are part of Xilinx Runtime (XRT) open source stack
> > > > and have been deployed by leading FaaS vendors and many enterprise
> > > customers.
> > >
> > > Cool, first fpga driver submitted to drm! And from a high level I
> > > think this makes a lot of sense.
> > >
> > > > PLATFORM ARCHITECTURE
> > > >
> > > > Alveo PCIe platforms have a static shell and a reconfigurable
> > > > (dynamic) region. The shell is automatically loaded from PROM when
> > > > host is booted and PCIe is enumerated by BIOS. Shell cannot be
> > > > changed till next cold reboot. The shell exposes two physical functions:
> > > > management physical function and user physical function.
> > > >
> > > > Users compile their high level design in C/C++/OpenCL or RTL into
> > > > FPGA image using SDx compiler. The FPGA image packaged as xclbin
> > > > file can be loaded onto reconfigurable region. The image may
> > > > contain one or more compute unit. Users can dynamically swap the
> > > > full image running on the reconfigurable region in order to switch
> > > > between different
> > > workloads.
> > > >
> > > > XRT DRIVERS
> > > >
> > > > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is
> > > > modular and organized into several platform drivers which
> > > > primarily handle the following functionality:
> > > > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > > > integration) 2.  Clock scaling 3.  Loading firmware container also
> > > > called dsabin (embedded Microblaze
> > > >     firmware for ERT and XMC, optional clearing bitstream) 4.
> > > > In-band
> > > > sensors: temp, voltage, power, etc.
> > > > 5.  AXI Firewall management
> > > > 6.  Device reset and rescan
> > > > 7.  Hardware mailbox for communication between two physical
> > > > functions
> > > >
> > > > XRT Linux kernel driver xocl binds to user pf. Like its peer, this
> > > > driver is also modular and organized into several platform drivers
> > > > which handle the following functionality:
> > > > 1.  Device memory topology discovery and memory management 2.
> > > > Buffer object abstraction and management for client process 3.
> > > > XDMA MM PCIe DMA engine programming 4.  Multi-process aware
> context management 5.
> > > > Compute unit execution management (optionally with help of ERT) for
> > > >     client processes
> > > > 6.  Hardware mailbox for communication between two physical
> > > > functions
> > > >
> > > > The drivers export ioctls and sysfs nodes for various services.
> > > > xocl driver makes heavy use of DRM GEM features for device memory
> > > > management, reference counting, mmap support and export/import.
> > > > xocl also includes a simple scheduler called KDS which schedules
> > > > compute units and interacts with hardware scheduler running ERT
> > > > firmware. The scheduler understands custom opcodes packaged into
> > > > command objects
> > > and
> > > > provides an asynchronous command done notification via POSIX poll.
> > > >
> > > > More details on architecture, software APIs, ioctl definitions,
> > > > execution model, etc. is available as Sphinx documentation--
> > > >
> > > > https://xilinx.github.io/XRT/2018.3/html/index.html
> > > >
> > > > The complete runtime software stack (XRT) which includes out of
> > > > tree kernel drivers, user space libraries, board utilities and
> > > > firmware for the hardware scheduler is open source and available
> > > > at https://github.com/Xilinx/XRT
> > >
> > > Before digging into the implementation side more I looked into the
> > > userspace here. I admit I got lost a bit, since there's lots of
> > > indirections and abstractions going on, but it seems like this is
> > > just a fancy ioctl wrapper/driver backend abstractions. Not really
> something applications would use.
> > >
> >
> > Appreciate your feedback.
> >
> > The userspace libraries define a common abstraction but have different
> > implementations for Zynq Ultrascale+ embedded platform, PCIe based
> > Alveo (and Faas) and emulation flows. The latter lets you run your
> application without physical hardware.
> >
> > >
> > > From the pretty picture on github it looks like there's some
> > > opencl/ml/other fancy stuff sitting on top that applications would use. Is
> that also available?
> >
> > The full OpenCL runtime is available in the same repository. Xilinx ML
> > Suite is also based on XRT and its source can be found at
> https://github.com/Xilinx/ml-suite.
> 
> Hm, I did a few git grep for the usual opencl entry points, but didn't find
> anything. Do I need to run some build scripts first (which downloads
> additional sourcecode)? Or is there some symbol mangling going on and that's
> why I don't find anything? Pointers very much appreciated.

The bulk of the OCL runtime code can be found in
https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xocl; the OCL runtime
also includes https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xrt.
The resulting OCL runtime library, libxilinxopencl.so, in turn uses XRT APIs to
talk to the drivers. For PCIe these XRT APIs are implemented in the library
libxrt_core.so, the source for which is
https://github.com/Xilinx/XRT/tree/master/src/runtime_src/driver/xclng/xrt.
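
For illustration only, here is a minimal, hedged sketch (not taken from the
patch series) of a host application sitting directly on that XRT layer. It
assumes the xclhal2 entry points (xclOpen, xclLoadXclBin, xclClose), the
xclBin header type and the XCL_QUIET verbosity value behave as described in
the published XRT documentation; the "vadd.xclbin" file name, the link line
and the error handling are made up for the example.

/* illustrative build: gcc host.c -lxrt_core */
#include <stdio.h>
#include <stdlib.h>
#include <xclhal2.h>

int main(void)
{
	/* Open the first user PF device exposed by the xocl driver */
	xclDeviceHandle h = xclOpen(0, NULL, XCL_QUIET);
	if (!h)
		return 1;

	/* Read a precompiled xclbin produced offline by the SDx compiler */
	FILE *f = fopen("vadd.xclbin", "rb");
	if (!f) {
		xclClose(h);
		return 1;
	}
	fseek(f, 0, SEEK_END);
	long sz = ftell(f);
	rewind(f);
	char *buf = malloc(sz);
	if (!buf || fread(buf, 1, sz, f) != (size_t)sz) {
		free(buf);
		fclose(f);
		xclClose(h);
		return 1;
	}
	fclose(f);

	/* Program the reconfigurable region with the compute units */
	int ret = xclLoadXclBin(h, (const xclBin *)buf);

	free(buf);
	xclClose(h);
	return ret ? 1 : 0;
}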

You can build a fully functioning runtime stack by following the build
instructions at https://xilinx.github.io/XRT/master/html/build.html

We do have a few dependencies on standard Linux packages, including a few OpenCL
packages bundled by Linux distros: ocl-icd, ocl-icd-devel and opencl-headers.

Thanks,
-Sonal

> 
> > Typically end users use OpenCL APIs which are part of XRT stack. One
> > can write an application to directly call XRT APIs defined at
> > https://xilinx.github.io/XRT/2018.3/html/xclhal2.main.html
> 
> I have no clue about DNN/ML unfortunately, I think I'll try to look into the ocl
> side a bit more first.
> 
> Thanks, Daniel
> 
> >
> > Thanks,
> > -Sonal
> > >
> > > Thanks, Daniel
> > >
> > > >
> > > > Thanks,
> > > > -Sonal
> > > >
> > > > Sonal Santan (6):
> > > >   Add skeleton code: ioctl definitions and build hooks
> > > >   Global data structures shared between xocl and xmgmt drivers
> > > >   Add platform drivers for various IPs and frameworks
> > > >   Add core of XDMA driver
> > > >   Add management driver
> > > >   Add user physical function driver
> > > >
> > > >  drivers/gpu/drm/Kconfig                    |    2 +
> > > >  drivers/gpu/drm/Makefile                   |    1 +
> > > >  drivers/gpu/drm/xocl/Kconfig               |   22 +
> > > >  drivers/gpu/drm/xocl/Makefile              |    3 +
> > > >  drivers/gpu/drm/xocl/devices.h             |  954 +++++
> > > >  drivers/gpu/drm/xocl/ert.h                 |  385 ++
> > > >  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
> > > >  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
> > > >  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
> > > >  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
> > > >  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
> > > >  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
> > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
> > > >  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
> > > >  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
> > > >  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
> > > >  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
> > > >  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
> > > >  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
> > > >  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
> > > >  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
> > > >  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
> > > >  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
> > > >  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
> > > >  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
> > > >  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
> > > >  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
> > > >  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
> > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
> > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
> > > >  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
> > > >  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
> > > >  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
> > > >  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
> > > >  drivers/gpu/drm/xocl/version.h             |   22 +
> > > >  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
> > > >  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
> > > >  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
> > > >  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
> > > >  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
> > > >  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
> > > >  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
> > > >  include/uapi/drm/xmgmt_drm.h               |  204 +
> > > >  include/uapi/drm/xocl_drm.h                |  483 +++
> > > >  50 files changed, 28252 insertions(+)  create mode 100644
> > > > drivers/gpu/drm/xocl/Kconfig  create mode 100644
> > > > drivers/gpu/drm/xocl/Makefile  create mode 100644
> > > > drivers/gpu/drm/xocl/devices.h  create mode 100644
> > > > drivers/gpu/drm/xocl/ert.h  create mode 100644
> > > > drivers/gpu/drm/xocl/lib/Makefile.in
> > > >  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c  create mode
> > > > 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c  create mode
> > > > 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c  create mode
> > > > 100644 drivers/gpu/drm/xocl/subdev/xvc.c  create mode 100644
> > > > drivers/gpu/drm/xocl/userpf/Makefile
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/version.h  create mode
> > > > 100644 drivers/gpu/drm/xocl/xclbin.h  create mode 100644
> > > > drivers/gpu/drm/xocl/xclfeatures.h
> > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c  create mode
> > > > 100644 drivers/gpu/drm/xocl/xocl_drm.h  create mode 100644
> > > > drivers/gpu/drm/xocl/xocl_drv.h  create mode 100644
> > > > drivers/gpu/drm/xocl/xocl_subdev.c
> > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
> > > >  create mode 100644 include/uapi/drm/xmgmt_drm.h  create mode
> > > > 100644 include/uapi/drm/xocl_drm.h
> > > >
> > > > --
> > > > 2.17.0
> > > > _______________________________________________
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation http://blog.ffwll.ch
> 
> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-27 12:50       ` Sonal Santan
@ 2019-03-27 14:11         ` Daniel Vetter
  2019-03-28  0:13           ` Sonal Santan
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2019-03-27 14:11 UTC (permalink / raw)
  To: Sonal Santan
  Cc: gregkh, Cyril Chemparathy, linux-kernel, Lizhi Hou, Michal Simek,
	dri-devel, airlied

On Wed, Mar 27, 2019 at 12:50:14PM +0000, Sonal Santan wrote:
> 
> 
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel@ffwll.ch]
> > Sent: Wednesday, March 27, 2019 1:23 AM
> > To: Sonal Santan <sonals@xilinx.com>
> > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org; Cyril
> > Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org; Lizhi Hou
> > <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>; airlied@redhat.com
> > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> > 
> > On Wed, Mar 27, 2019 at 12:30 AM Sonal Santan <sonals@xilinx.com> wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > > Daniel Vetter
> > > > Sent: Monday, March 25, 2019 1:28 PM
> > > > To: Sonal Santan <sonals@xilinx.com>
> > > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > > Cyril Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org;
> > > > Lizhi Hou <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>;
> > > > airlied@redhat.com
> > > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator
> > > > driver
> > > >
> > > > On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.santan@xilinx.com
> > wrote:
> > > > > From: Sonal Santan <sonal.santan@xilinx.com>
> > > > >
> > > > > Hello,
> > > > >
> > > > > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > > > > These drivers are part of Xilinx Runtime (XRT) open source stack
> > > > > and have been deployed by leading FaaS vendors and many enterprise
> > > > customers.
> > > >
> > > > Cool, first fpga driver submitted to drm! And from a high level I
> > > > think this makes a lot of sense.
> > > >
> > > > > PLATFORM ARCHITECTURE
> > > > >
> > > > > Alveo PCIe platforms have a static shell and a reconfigurable
> > > > > (dynamic) region. The shell is automatically loaded from PROM when
> > > > > host is booted and PCIe is enumerated by BIOS. Shell cannot be
> > > > > changed till next cold reboot. The shell exposes two physical functions:
> > > > > management physical function and user physical function.
> > > > >
> > > > > Users compile their high level design in C/C++/OpenCL or RTL into
> > > > > FPGA image using SDx compiler. The FPGA image packaged as xclbin
> > > > > file can be loaded onto reconfigurable region. The image may
> > > > > contain one or more compute unit. Users can dynamically swap the
> > > > > full image running on the reconfigurable region in order to switch
> > > > > between different
> > > > workloads.
> > > > >
> > > > > XRT DRIVERS
> > > > >
> > > > > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is
> > > > > modular and organized into several platform drivers which
> > > > > primarily handle the following functionality:
> > > > > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > > > > integration) 2.  Clock scaling 3.  Loading firmware container also
> > > > > called dsabin (embedded Microblaze
> > > > >     firmware for ERT and XMC, optional clearing bitstream) 4.
> > > > > In-band
> > > > > sensors: temp, voltage, power, etc.
> > > > > 5.  AXI Firewall management
> > > > > 6.  Device reset and rescan
> > > > > 7.  Hardware mailbox for communication between two physical
> > > > > functions
> > > > >
> > > > > XRT Linux kernel driver xocl binds to user pf. Like its peer, this
> > > > > driver is also modular and organized into several platform drivers
> > > > > which handle the following functionality:
> > > > > 1.  Device memory topology discovery and memory management 2.
> > > > > Buffer object abstraction and management for client process 3.
> > > > > XDMA MM PCIe DMA engine programming 4.  Multi-process aware
> > context management 5.
> > > > > Compute unit execution management (optionally with help of ERT) for
> > > > >     client processes
> > > > > 6.  Hardware mailbox for communication between two physical
> > > > > functions
> > > > >
> > > > > The drivers export ioctls and sysfs nodes for various services.
> > > > > xocl driver makes heavy use of DRM GEM features for device memory
> > > > > management, reference counting, mmap support and export/import.
> > > > > xocl also includes a simple scheduler called KDS which schedules
> > > > > compute units and interacts with hardware scheduler running ERT
> > > > > firmware. The scheduler understands custom opcodes packaged into
> > > > > command objects
> > > > and
> > > > > provides an asynchronous command done notification via POSIX poll.
> > > > >
> > > > > More details on architecture, software APIs, ioctl definitions,
> > > > > execution model, etc. is available as Sphinx documentation--
> > > > >
> > > > > https://xilinx.github.io/XRT/2018.3/html/index.html
> > > > >
> > > > > The complete runtime software stack (XRT) which includes out of
> > > > > tree kernel drivers, user space libraries, board utilities and
> > > > > firmware for the hardware scheduler is open source and available
> > > > > at https://github.com/Xilinx/XRT
> > > >
> > > > Before digging into the implementation side more I looked into the
> > > > userspace here. I admit I got lost a bit, since there's lots of
> > > > indirections and abstractions going on, but it seems like this is
> > > > just a fancy ioctl wrapper/driver backend abstractions. Not really
> > something applications would use.
> > > >
> > >
> > > Appreciate your feedback.
> > >
> > > The userspace libraries define a common abstraction but have different
> > > implementations for Zynq Ultrascale+ embedded platform, PCIe based
> > > Alveo (and Faas) and emulation flows. The latter lets you run your
> > application without physical hardware.
> > >
> > > >
> > > > From the pretty picture on github it looks like there's some
> > > > opencl/ml/other fancy stuff sitting on top that applications would use. Is
> > that also available?
> > >
> > > The full OpenCL runtime is available in the same repository. Xilinx ML
> > > Suite is also based on XRT and its source can be found at
> > https://github.com/Xilinx/ml-suite.
> > 
> > Hm, I did a few git grep for the usual opencl entry points, but didn't find
> > anything. Do I need to run some build scripts first (which downloads
> > additional sourcecode)? Or is there some symbol mangling going on and that's
> > why I don't find anything? Pointers very much appreciated.
> 
> The bulk of the OCL runtime code can be found inside 
> https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xocl. 
> The OCL runtime also includes https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xrt.
> The OCL runtime library called libxilinxopencl.so in turn then uses XRT APIs to talk to the drivers. 
> For PCIe these XRT APIs are implemented in the library libxrt_core.so the source for which is
> https://github.com/Xilinx/XRT/tree/master/src/runtime_src/driver/xclng/xrt.
> 
> You can build a fully functioning runtime stack by following very simple build instructions--
> https://xilinx.github.io/XRT/master/html/build.html
> 
> We do have a few dependencies on standard Linux packages including a few OpenCL packages 
> bundled by Linux distros: ocl-icd, ocl-icd-devel and opencl-headers

Thanks a lot for the pointers. No idea why I didn't find this stuff; I guess I
was blind.

The thing I'm really interested in is the compiler, since at least the
experience from GPUs says that very much is part of the overall uapi, and
definitely needed to be able to make any changes to the implementation.
Looking at clCreateProgramWithSource there's only a lookup of cached
compiles (it looks for xclbin), and src/runtime_src/xclbin doesn't look like
it provides a compiler either. It seems like apps need to precompile
everything first. Am I again missing something, or is this how it's supposed
to work?

Note: There's no expectation of a fully optimizing compiler, and we're
totally ok if there's an optimizing proprietary compiler and a basic open one
(AMD and a bunch of other companies all have such dual stacks running on top
of drm kernel drivers). But a basic compiler that can convert basic kernels
into machine code is expected.

Thanks, Daniel

> 
> Thanks,
> -Sonal
> 
> > 
> > > Typically end users use OpenCL APIs which are part of XRT stack. One
> > > can write an application to directly call XRT APIs defined at
> > > https://xilinx.github.io/XRT/2018.3/html/xclhal2.main.html
> > 
> > I have no clue about DNN/ML unfortunately, I think I'll try to look into the ocl
> > side a bit more first.
> > 
> > Thanks, Daniel
> > 
> > >
> > > Thanks,
> > > -Sonal
> > > >
> > > > Thanks, Daniel
> > > >
> > > > >
> > > > > Thanks,
> > > > > -Sonal
> > > > >
> > > > > Sonal Santan (6):
> > > > >   Add skeleton code: ioctl definitions and build hooks
> > > > >   Global data structures shared between xocl and xmgmt drivers
> > > > >   Add platform drivers for various IPs and frameworks
> > > > >   Add core of XDMA driver
> > > > >   Add management driver
> > > > >   Add user physical function driver
> > > > >
> > > > >  drivers/gpu/drm/Kconfig                    |    2 +
> > > > >  drivers/gpu/drm/Makefile                   |    1 +
> > > > >  drivers/gpu/drm/xocl/Kconfig               |   22 +
> > > > >  drivers/gpu/drm/xocl/Makefile              |    3 +
> > > > >  drivers/gpu/drm/xocl/devices.h             |  954 +++++
> > > > >  drivers/gpu/drm/xocl/ert.h                 |  385 ++
> > > > >  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
> > > > >  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
> > > > >  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368 ++++++++++++++++++++
> > > > >  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
> > > > >  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
> > > > >  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
> > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
> > > > >  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
> > > > >  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
> > > > >  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
> > > > >  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
> > > > >  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
> > > > >  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
> > > > >  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059 ++++++++++++++
> > > > >  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
> > > > >  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
> > > > >  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
> > > > >  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
> > > > >  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
> > > > >  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
> > > > >  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
> > > > >  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
> > > > >  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
> > > > >  drivers/gpu/drm/xocl/version.h             |   22 +
> > > > >  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
> > > > >  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
> > > > >  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
> > > > >  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
> > > > >  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
> > > > >  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
> > > > >  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
> > > > >  include/uapi/drm/xmgmt_drm.h               |  204 +
> > > > >  include/uapi/drm/xocl_drm.h                |  483 +++
> > > > >  50 files changed, 28252 insertions(+)  create mode 100644
> > > > > drivers/gpu/drm/xocl/Kconfig  create mode 100644
> > > > > drivers/gpu/drm/xocl/Makefile  create mode 100644
> > > > > drivers/gpu/drm/xocl/devices.h  create mode 100644
> > > > > drivers/gpu/drm/xocl/ert.h  create mode 100644
> > > > > drivers/gpu/drm/xocl/lib/Makefile.in
> > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c  create mode
> > > > > 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c  create mode
> > > > > 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c  create mode
> > > > > 100644 drivers/gpu/drm/xocl/subdev/xvc.c  create mode 100644
> > > > > drivers/gpu/drm/xocl/userpf/Makefile
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/version.h  create mode
> > > > > 100644 drivers/gpu/drm/xocl/xclbin.h  create mode 100644
> > > > > drivers/gpu/drm/xocl/xclfeatures.h
> > > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c  create mode
> > > > > 100644 drivers/gpu/drm/xocl/xocl_drm.h  create mode 100644
> > > > > drivers/gpu/drm/xocl/xocl_drv.h  create mode 100644
> > > > > drivers/gpu/drm/xocl/xocl_subdev.c
> > > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
> > > > >  create mode 100644 include/uapi/drm/xmgmt_drm.h  create mode
> > > > > 100644 include/uapi/drm/xocl_drm.h
> > > > >
> > > > > --
> > > > > 2.17.0
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation http://blog.ffwll.ch
> > 
> > 
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-27 14:11         ` Daniel Vetter
@ 2019-03-28  0:13           ` Sonal Santan
  2019-03-29  4:56             ` Dave Airlie
  0 siblings, 1 reply; 20+ messages in thread
From: Sonal Santan @ 2019-03-28  0:13 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: dri-devel, gregkh, Cyril Chemparathy, linux-kernel, Lizhi Hou,
	Michal Simek, airlied



> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Wednesday, March 27, 2019 7:12 AM
> To: Sonal Santan <sonals@xilinx.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>; dri-devel@lists.freedesktop.org;
> gregkh@linuxfoundation.org; Cyril Chemparathy <cyrilc@xilinx.com>; linux-
> kernel@vger.kernel.org; Lizhi Hou <lizhih@xilinx.com>; Michal Simek
> <michals@xilinx.com>; airlied@redhat.com
> Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> 
> On Wed, Mar 27, 2019 at 12:50:14PM +0000, Sonal Santan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Daniel Vetter [mailto:daniel@ffwll.ch]
> > > Sent: Wednesday, March 27, 2019 1:23 AM
> > > To: Sonal Santan <sonals@xilinx.com>
> > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > Cyril Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org;
> > > Lizhi Hou <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>;
> > > airlied@redhat.com
> > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator
> > > driver
> > >
> > > On Wed, Mar 27, 2019 at 12:30 AM Sonal Santan <sonals@xilinx.com>
> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > > > Daniel Vetter
> > > > > Sent: Monday, March 25, 2019 1:28 PM
> > > > > To: Sonal Santan <sonals@xilinx.com>
> > > > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > > > Cyril Chemparathy <cyrilc@xilinx.com>;
> > > > > linux-kernel@vger.kernel.org; Lizhi Hou <lizhih@xilinx.com>;
> > > > > Michal Simek <michals@xilinx.com>; airlied@redhat.com
> > > > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe
> > > > > accelerator driver
> > > > >
> > > > > On Tue, Mar 19, 2019 at 02:53:55PM -0700,
> > > > > sonal.santan@xilinx.com
> > > wrote:
> > > > > > From: Sonal Santan <sonal.santan@xilinx.com>
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > > > > > These drivers are part of Xilinx Runtime (XRT) open source
> > > > > > stack and have been deployed by leading FaaS vendors and many
> > > > > > enterprise
> > > > > customers.
> > > > >
> > > > > Cool, first fpga driver submitted to drm! And from a high level
> > > > > I think this makes a lot of sense.
> > > > >
> > > > > > PLATFORM ARCHITECTURE
> > > > > >
> > > > > > Alveo PCIe platforms have a static shell and a reconfigurable
> > > > > > (dynamic) region. The shell is automatically loaded from PROM
> > > > > > when host is booted and PCIe is enumerated by BIOS. Shell
> > > > > > cannot be changed till next cold reboot. The shell exposes two
> physical functions:
> > > > > > management physical function and user physical function.
> > > > > >
> > > > > > Users compile their high level design in C/C++/OpenCL or RTL
> > > > > > into FPGA image using SDx compiler. The FPGA image packaged as
> > > > > > xclbin file can be loaded onto reconfigurable region. The
> > > > > > image may contain one or more compute unit. Users can
> > > > > > dynamically swap the full image running on the reconfigurable
> > > > > > region in order to switch between different
> > > > > workloads.
> > > > > >
> > > > > > XRT DRIVERS
> > > > > >
> > > > > > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is
> > > > > > modular and organized into several platform drivers which
> > > > > > primarily handle the following functionality:
> > > > > > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > > > > > integration) 2.  Clock scaling 3.  Loading firmware container
> > > > > > also called dsabin (embedded Microblaze
> > > > > >     firmware for ERT and XMC, optional clearing bitstream) 4.
> > > > > > In-band
> > > > > > sensors: temp, voltage, power, etc.
> > > > > > 5.  AXI Firewall management
> > > > > > 6.  Device reset and rescan
> > > > > > 7.  Hardware mailbox for communication between two physical
> > > > > > functions
> > > > > >
> > > > > > XRT Linux kernel driver xocl binds to user pf. Like its peer,
> > > > > > this driver is also modular and organized into several
> > > > > > platform drivers which handle the following functionality:
> > > > > > 1.  Device memory topology discovery and memory management 2.
> > > > > > Buffer object abstraction and management for client process 3.
> > > > > > XDMA MM PCIe DMA engine programming 4.  Multi-process aware
> > > context management 5.
> > > > > > Compute unit execution management (optionally with help of ERT)
> for
> > > > > >     client processes
> > > > > > 6.  Hardware mailbox for communication between two physical
> > > > > > functions
> > > > > >
> > > > > > The drivers export ioctls and sysfs nodes for various services.
> > > > > > xocl driver makes heavy use of DRM GEM features for device
> > > > > > memory management, reference counting, mmap support and
> export/import.
> > > > > > xocl also includes a simple scheduler called KDS which
> > > > > > schedules compute units and interacts with hardware scheduler
> > > > > > running ERT firmware. The scheduler understands custom opcodes
> > > > > > packaged into command objects
> > > > > and
> > > > > > provides an asynchronous command done notification via POSIX poll.
> > > > > >
> > > > > > More details on architecture, software APIs, ioctl
> > > > > > definitions, execution model, etc. is available as Sphinx
> > > > > > documentation--
> > > > > >
> > > > > > https://xilinx.github.io/XRT/2018.3/html/index.html
> > > > > >
> > > > > > The complete runtime software stack (XRT) which includes out
> > > > > > of tree kernel drivers, user space libraries, board utilities
> > > > > > and firmware for the hardware scheduler is open source and
> > > > > > available at https://github.com/Xilinx/XRT
> > > > >
> > > > > Before digging into the implementation side more I looked into
> > > > > the userspace here. I admit I got lost a bit, since there's lots
> > > > > of indirections and abstractions going on, but it seems like
> > > > > this is just a fancy ioctl wrapper/driver backend abstractions.
> > > > > Not really
> > > something applications would use.
> > > > >
> > > >
> > > > Appreciate your feedback.
> > > >
> > > > The userspace libraries define a common abstraction but have
> > > > different implementations for Zynq Ultrascale+ embedded platform,
> > > > PCIe based Alveo (and Faas) and emulation flows. The latter lets
> > > > you run your
> > > application without physical hardware.
> > > >
> > > > >
> > > > > From the pretty picture on github it looks like there's some
> > > > > opencl/ml/other fancy stuff sitting on top that applications
> > > > > would use. Is
> > > that also available?
> > > >
> > > > The full OpenCL runtime is available in the same repository.
> > > > Xilinx ML Suite is also based on XRT and its source can be found
> > > > at
> > > https://github.com/Xilinx/ml-suite.
> > >
> > > Hm, I did a few git grep for the usual opencl entry points, but
> > > didn't find anything. Do I need to run some build scripts first
> > > (which downloads additional sourcecode)? Or is there some symbol
> > > mangling going on and that's why I don't find anything? Pointers very
> much appreciated.
> >
> > The bulk of the OCL runtime code can be found inside
> > https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xocl.
> > The OCL runtime also includes
> https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xrt.
> > The OCL runtime library called libxilinxopencl.so in turn then uses XRT APIs
> to talk to the drivers.
> > For PCIe these XRT APIs are implemented in the library libxrt_core.so
> > the source for which is
> https://github.com/Xilinx/XRT/tree/master/src/runtime_src/driver/xclng/xrt.
> >
> > You can build a fully functioning runtime stack by following very
> > simple build instructions--
> > https://xilinx.github.io/XRT/master/html/build.html
> >
> > We do have a few dependencies on standard Linux packages including a
> > few OpenCL packages bundled by Linux distros: ocl-icd, ocl-icd-devel
> > and opencl-headers
> 
> Thanks a lot for pointers. No idea why I didn't find this stuff, I guess I was
> blind.
> 
> The thing I'm really interested in is the compiler, since at least the experience
> from gpus says that very much is part of the overall uapi, and definitely
> needed to be able to make any chances to the implementation.
> Looking at clCreateProgramWithSource there's only a lookup up cached
> compiles (it looks for xclbin), and src/runtime_src/xclbin doesn't look like that
> provides a compiler either. It seems like apps need to precompile everything
> first. Am I again missing something, or is this how it's supposed to work?
> 
XRT works with precompiled binaries which are produced by the Xilinx SDx compiler
called xocc. The binary (xclbin) is loaded by clCreateProgramWithBinary().
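
As a hedged illustration of that flow (the "vadd.xclbin" file and the "vadd"
kernel name are hypothetical, not from the patches, and error handling is
trimmed), a host application using only standard OpenCL entry points exposed
through libxilinxopencl.so would look roughly like this:

#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void)
{
	cl_platform_id plat;
	cl_device_id dev;
	cl_int err;

	/* Xilinx devices enumerate as OpenCL accelerator devices */
	clGetPlatformIDs(1, &plat, NULL);
	clGetDeviceIDs(plat, CL_DEVICE_TYPE_ACCELERATOR, 1, &dev, NULL);
	cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err);

	/* Read the xclbin compiled offline by xocc */
	FILE *f = fopen("vadd.xclbin", "rb");
	fseek(f, 0, SEEK_END);
	size_t sz = (size_t)ftell(f);
	rewind(f);
	unsigned char *bin = malloc(sz);
	if (fread(bin, 1, sz, f) != sz)
		return 1;
	fclose(f);

	/* No clCreateProgramWithSource: the precompiled binary goes straight in */
	cl_program prog = clCreateProgramWithBinary(ctx, 1, &dev, &sz,
			(const unsigned char **)&bin, NULL, &err);
	clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
	cl_kernel krnl = clCreateKernel(prog, "vadd", &err);

	/* ... set kernel args, enqueue, read back results ... */

	clReleaseKernel(krnl);
	clReleaseProgram(prog);
	clReleaseContext(ctx);
	free(bin);
	return 0;
}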

> Note: There's no expectation for the fully optimizing compiler, and we're
> totally ok if there's an optimizing proprietary compiler and a basic open one
> (amd, and bunch of other companies all have such dual stacks running on top
> of drm kernel drivers). But a basic compiler that can convert basic kernels into
> machine code is expected.
> 
Although the compiler is not open source, the compilation flow lets users examine
the output of various stages. For example, if you write your kernel in OpenCL/C/C++
you can view the RTL (Verilog/VHDL) output produced by the first stage of
compilation. Note that the compiler is really generating a custom circuit from a
high level input, which in the last phase gets synthesized into a bitstream. Expert
hardware designers can handcraft a circuit in RTL and feed it to the compiler. Our
FPGA tools let you view the generated hardware design, the register map, etc. You
can get more information about a compiled design by running an XRT tool like
xclbinutil on the generated file.

In essence compiling for FPGAs is quite different from compiling for GPU/CPU/DSP.
Interestingly, FPGA compilers can take anywhere from 30 minutes to a few hours to
compile a testcase.

Thanks,
-Sonal

> Thanks, Daniel
> 
> >
> > Thanks,
> > -Sonal
> >
> > >
> > > > Typically end users use OpenCL APIs which are part of XRT stack.
> > > > One can write an application to directly call XRT APIs defined at
> > > > https://xilinx.github.io/XRT/2018.3/html/xclhal2.main.html
> > >
> > > I have no clue about DNN/ML unfortunately, I think I'll try to look
> > > into the ocl side a bit more first.
> > >
> > > Thanks, Daniel
> > >
> > > >
> > > > Thanks,
> > > > -Sonal
> > > > >
> > > > > Thanks, Daniel
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > -Sonal
> > > > > >
> > > > > > Sonal Santan (6):
> > > > > >   Add skeleton code: ioctl definitions and build hooks
> > > > > >   Global data structures shared between xocl and xmgmt drivers
> > > > > >   Add platform drivers for various IPs and frameworks
> > > > > >   Add core of XDMA driver
> > > > > >   Add management driver
> > > > > >   Add user physical function driver
> > > > > >
> > > > > >  drivers/gpu/drm/Kconfig                    |    2 +
> > > > > >  drivers/gpu/drm/Makefile                   |    1 +
> > > > > >  drivers/gpu/drm/xocl/Kconfig               |   22 +
> > > > > >  drivers/gpu/drm/xocl/Makefile              |    3 +
> > > > > >  drivers/gpu/drm/xocl/devices.h             |  954 +++++
> > > > > >  drivers/gpu/drm/xocl/ert.h                 |  385 ++
> > > > > >  drivers/gpu/drm/xocl/lib/Makefile.in       |   16 +
> > > > > >  drivers/gpu/drm/xocl/lib/cdev_sgdma.h      |   63 +
> > > > > >  drivers/gpu/drm/xocl/lib/libxdma.c         | 4368
> ++++++++++++++++++++
> > > > > >  drivers/gpu/drm/xocl/lib/libxdma.h         |  596 +++
> > > > > >  drivers/gpu/drm/xocl/lib/libxdma_api.h     |  127 +
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/Makefile       |   29 +
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c    |  960 +++++
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h    |  147 +
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c      |   30 +
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c   |  148 +
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h     |  244 ++
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c   |  318 ++
> > > > > >  drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c   |  399 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/dna.c          |  356 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/feature_rom.c  |  412 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/firewall.c     |  389 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/fmgr.c         |  198 +
> > > > > >  drivers/gpu/drm/xocl/subdev/icap.c         | 2859 +++++++++++++
> > > > > >  drivers/gpu/drm/xocl/subdev/mailbox.c      | 1868 +++++++++
> > > > > >  drivers/gpu/drm/xocl/subdev/mb_scheduler.c | 3059
> ++++++++++++++
> > > > > >  drivers/gpu/drm/xocl/subdev/microblaze.c   |  722 ++++
> > > > > >  drivers/gpu/drm/xocl/subdev/mig.c          |  256 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/sysmon.c       |  385 ++
> > > > > >  drivers/gpu/drm/xocl/subdev/xdma.c         |  510 +++
> > > > > >  drivers/gpu/drm/xocl/subdev/xmc.c          | 1480 +++++++
> > > > > >  drivers/gpu/drm/xocl/subdev/xvc.c          |  461 +++
> > > > > >  drivers/gpu/drm/xocl/userpf/Makefile       |   27 +
> > > > > >  drivers/gpu/drm/xocl/userpf/common.h       |  157 +
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.c      | 1255 ++++++
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_bo.h      |  119 +
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_drm.c     |  640 +++
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_drv.c     |  743 ++++
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_ioctl.c   |  396 ++
> > > > > >  drivers/gpu/drm/xocl/userpf/xocl_sysfs.c   |  344 ++
> > > > > >  drivers/gpu/drm/xocl/version.h             |   22 +
> > > > > >  drivers/gpu/drm/xocl/xclbin.h              |  314 ++
> > > > > >  drivers/gpu/drm/xocl/xclfeatures.h         |  107 +
> > > > > >  drivers/gpu/drm/xocl/xocl_ctx.c            |  196 +
> > > > > >  drivers/gpu/drm/xocl/xocl_drm.h            |   91 +
> > > > > >  drivers/gpu/drm/xocl/xocl_drv.h            |  783 ++++
> > > > > >  drivers/gpu/drm/xocl/xocl_subdev.c         |  540 +++
> > > > > >  drivers/gpu/drm/xocl/xocl_thread.c         |   64 +
> > > > > >  include/uapi/drm/xmgmt_drm.h               |  204 +
> > > > > >  include/uapi/drm/xocl_drm.h                |  483 +++
> > > > > >  50 files changed, 28252 insertions(+)  create mode 100644
> > > > > > drivers/gpu/drm/xocl/Kconfig  create mode 100644
> > > > > > drivers/gpu/drm/xocl/Makefile  create mode 100644
> > > > > > drivers/gpu/drm/xocl/devices.h  create mode 100644
> > > > > > drivers/gpu/drm/xocl/ert.h  create mode 100644
> > > > > > drivers/gpu/drm/xocl/lib/Makefile.in
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/cdev_sgdma.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/lib/libxdma_api.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/Makefile
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-core.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-cw.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-ioctl.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-reg.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-sysfs.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/mgmtpf/mgmt-utils.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/dna.c  create
> > > > > > mode
> > > > > > 100644 drivers/gpu/drm/xocl/subdev/feature_rom.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/firewall.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/fmgr.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/icap.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mailbox.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mb_scheduler.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/microblaze.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/mig.c  create
> > > > > > mode
> > > > > > 100644 drivers/gpu/drm/xocl/subdev/sysmon.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xdma.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/subdev/xmc.c  create
> > > > > > mode
> > > > > > 100644 drivers/gpu/drm/xocl/subdev/xvc.c  create mode 100644
> > > > > > drivers/gpu/drm/xocl/userpf/Makefile
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/common.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_bo.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drm.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_drv.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_ioctl.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/userpf/xocl_sysfs.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/version.h  create
> > > > > > mode
> > > > > > 100644 drivers/gpu/drm/xocl/xclbin.h  create mode 100644
> > > > > > drivers/gpu/drm/xocl/xclfeatures.h
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_ctx.c  create
> > > > > > mode
> > > > > > 100644 drivers/gpu/drm/xocl/xocl_drm.h  create mode 100644
> > > > > > drivers/gpu/drm/xocl/xocl_drv.h  create mode 100644
> > > > > > drivers/gpu/drm/xocl/xocl_subdev.c
> > > > > >  create mode 100644 drivers/gpu/drm/xocl/xocl_thread.c
> > > > > >  create mode 100644 include/uapi/drm/xmgmt_drm.h  create mode
> > > > > > 100644 include/uapi/drm/xocl_drm.h
> > > > > >
> > > > > > --
> > > > > > 2.17.0
> > > > > > _______________________________________________
> > > > > > dri-devel mailing list
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation http://blog.ffwll.ch
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-28  0:13           ` Sonal Santan
@ 2019-03-29  4:56             ` Dave Airlie
  2019-03-30  1:09               ` Ronan KERYELL
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Airlie @ 2019-03-29  4:56 UTC (permalink / raw)
  To: Sonal Santan
  Cc: Daniel Vetter, dri-devel, gregkh, Cyril Chemparathy,
	linux-kernel, Lizhi Hou, Michal Simek, airlied

On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Wednesday, March 27, 2019 7:12 AM
> > To: Sonal Santan <sonals@xilinx.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>; dri-devel@lists.freedesktop.org;
> > gregkh@linuxfoundation.org; Cyril Chemparathy <cyrilc@xilinx.com>; linux-
> > kernel@vger.kernel.org; Lizhi Hou <lizhih@xilinx.com>; Michal Simek
> > <michals@xilinx.com>; airlied@redhat.com
> > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> >
> > On Wed, Mar 27, 2019 at 12:50:14PM +0000, Sonal Santan wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Daniel Vetter [mailto:daniel@ffwll.ch]
> > > > Sent: Wednesday, March 27, 2019 1:23 AM
> > > > To: Sonal Santan <sonals@xilinx.com>
> > > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > > Cyril Chemparathy <cyrilc@xilinx.com>; linux-kernel@vger.kernel.org;
> > > > Lizhi Hou <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>;
> > > > airlied@redhat.com
> > > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator
> > > > driver
> > > >
> > > > On Wed, Mar 27, 2019 at 12:30 AM Sonal Santan <sonals@xilinx.com>
> > wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of
> > > > > > Daniel Vetter
> > > > > > Sent: Monday, March 25, 2019 1:28 PM
> > > > > > To: Sonal Santan <sonals@xilinx.com>
> > > > > > Cc: dri-devel@lists.freedesktop.org; gregkh@linuxfoundation.org;
> > > > > > Cyril Chemparathy <cyrilc@xilinx.com>;
> > > > > > linux-kernel@vger.kernel.org; Lizhi Hou <lizhih@xilinx.com>;
> > > > > > Michal Simek <michals@xilinx.com>; airlied@redhat.com
> > > > > > Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe
> > > > > > accelerator driver
> > > > > >
> > > > > > On Tue, Mar 19, 2019 at 02:53:55PM -0700,
> > > > > > sonal.santan@xilinx.com
> > > > wrote:
> > > > > > > From: Sonal Santan <sonal.santan@xilinx.com>
> > > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > > > > > > These drivers are part of Xilinx Runtime (XRT) open source
> > > > > > > stack and have been deployed by leading FaaS vendors and many
> > > > > > > enterprise
> > > > > > customers.
> > > > > >
> > > > > > Cool, first fpga driver submitted to drm! And from a high level
> > > > > > I think this makes a lot of sense.
> > > > > >
> > > > > > > PLATFORM ARCHITECTURE
> > > > > > >
> > > > > > > Alveo PCIe platforms have a static shell and a reconfigurable
> > > > > > > (dynamic) region. The shell is automatically loaded from PROM
> > > > > > > when host is booted and PCIe is enumerated by BIOS. Shell
> > > > > > > cannot be changed till next cold reboot. The shell exposes two
> > physical functions:
> > > > > > > management physical function and user physical function.
> > > > > > >
> > > > > > > Users compile their high level design in C/C++/OpenCL or RTL
> > > > > > > into FPGA image using SDx compiler. The FPGA image packaged as
> > > > > > > xclbin file can be loaded onto reconfigurable region. The
> > > > > > > image may contain one or more compute unit. Users can
> > > > > > > dynamically swap the full image running on the reconfigurable
> > > > > > > region in order to switch between different
> > > > > > workloads.
> > > > > > >
> > > > > > > XRT DRIVERS
> > > > > > >
> > > > > > > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is
> > > > > > > modular and organized into several platform drivers which
> > > > > > > primarily handle the following functionality:
> > > > > > > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> > > > > > > integration) 2.  Clock scaling 3.  Loading firmware container
> > > > > > > also called dsabin (embedded Microblaze
> > > > > > >     firmware for ERT and XMC, optional clearing bitstream) 4.
> > > > > > > In-band
> > > > > > > sensors: temp, voltage, power, etc.
> > > > > > > 5.  AXI Firewall management
> > > > > > > 6.  Device reset and rescan
> > > > > > > 7.  Hardware mailbox for communication between two physical
> > > > > > > functions
> > > > > > >
> > > > > > > XRT Linux kernel driver xocl binds to user pf. Like its peer,
> > > > > > > this driver is also modular and organized into several
> > > > > > > platform drivers which handle the following functionality:
> > > > > > > 1.  Device memory topology discovery and memory management 2.
> > > > > > > Buffer object abstraction and management for client process 3.
> > > > > > > XDMA MM PCIe DMA engine programming 4.  Multi-process aware
> > > > context management 5.
> > > > > > > Compute unit execution management (optionally with help of ERT)
> > for
> > > > > > >     client processes
> > > > > > > 6.  Hardware mailbox for communication between two physical
> > > > > > > functions
> > > > > > >
> > > > > > > The drivers export ioctls and sysfs nodes for various services.
> > > > > > > xocl driver makes heavy use of DRM GEM features for device
> > > > > > > memory management, reference counting, mmap support and
> > export/import.
> > > > > > > xocl also includes a simple scheduler called KDS which
> > > > > > > schedules compute units and interacts with hardware scheduler
> > > > > > > running ERT firmware. The scheduler understands custom opcodes
> > > > > > > packaged into command objects
> > > > > > and
> > > > > > > provides an asynchronous command done notification via POSIX poll.
> > > > > > >
> > > > > > > More details on architecture, software APIs, ioctl
> > > > > > > definitions, execution model, etc. is available as Sphinx
> > > > > > > documentation--
> > > > > > >
> > > > > > > https://xilinx.github.io/XRT/2018.3/html/index.html
> > > > > > >
> > > > > > > The complete runtime software stack (XRT) which includes out
> > > > > > > of tree kernel drivers, user space libraries, board utilities
> > > > > > > and firmware for the hardware scheduler is open source and
> > > > > > > available at https://github.com/Xilinx/XRT
> > > > > >
> > > > > > Before digging into the implementation side more I looked into
> > > > > > the userspace here. I admit I got lost a bit, since there's lots
> > > > > > of indirections and abstractions going on, but it seems like
> > > > > > this is just a fancy ioctl wrapper/driver backend abstractions.
> > > > > > Not really
> > > > something applications would use.
> > > > > >
> > > > >
> > > > > Appreciate your feedback.
> > > > >
> > > > > The userspace libraries define a common abstraction but have
> > > > > different implementations for Zynq Ultrascale+ embedded platform,
> > > > > PCIe based Alveo (and Faas) and emulation flows. The latter lets
> > > > > you run your
> > > > application without physical hardware.
> > > > >
> > > > > >
> > > > > > From the pretty picture on github it looks like there's some
> > > > > > opencl/ml/other fancy stuff sitting on top that applications
> > > > > > would use. Is
> > > > that also available?
> > > > >
> > > > > The full OpenCL runtime is available in the same repository.
> > > > > Xilinx ML Suite is also based on XRT and its source can be found
> > > > > at
> > > > https://github.com/Xilinx/ml-suite.
> > > >
> > > > Hm, I did a few git grep for the usual opencl entry points, but
> > > > didn't find anything. Do I need to run some build scripts first
> > > > (which downloads additional sourcecode)? Or is there some symbol
> > > > mangling going on and that's why I don't find anything? Pointers very
> > much appreciated.
> > >
> > > The bulk of the OCL runtime code can be found inside
> > > https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xocl.
> > > The OCL runtime also includes
> > https://github.com/Xilinx/XRT/tree/master/src/runtime_src/xrt.
> > > The OCL runtime library called libxilinxopencl.so in turn then uses XRT APIs
> > to talk to the drivers.
> > > For PCIe these XRT APIs are implemented in the library libxrt_core.so
> > > the source for which is
> > https://github.com/Xilinx/XRT/tree/master/src/runtime_src/driver/xclng/xrt.
> > >
> > > You can build a fully functioning runtime stack by following very
> > > simple build instructions--
> > > https://xilinx.github.io/XRT/master/html/build.html
> > >
> > > We do have a few dependencies on standard Linux packages including a
> > > few OpenCL packages bundled by Linux distros: ocl-icd, ocl-icd-devel
> > > and opencl-headers
> >
> > Thanks a lot for pointers. No idea why I didn't find this stuff, I guess I was
> > blind.
> >
> > The thing I'm really interested in is the compiler, since at least the experience
> > from gpus says that very much is part of the overall uapi, and definitely
> > needed to be able to make any chances to the implementation.
> > Looking at clCreateProgramWithSource there's only a lookup up cached
> > compiles (it looks for xclbin), and src/runtime_src/xclbin doesn't look like that
> > provides a compiler either. It seems like apps need to precompile everything
> > first. Am I again missing something, or is this how it's supposed to work?
> >
> XRT works with precompiled binaries which are compiled by Xilinx SDx compiler
> called xocc. The binary (xclbin) is loaded by clCreateProgramWithBinary().
>
> > Note: There's no expectation for the fully optimizing compiler, and we're
> > totally ok if there's an optimizing proprietary compiler and a basic open one
> > (amd, and bunch of other companies all have such dual stacks running on top
> > of drm kernel drivers). But a basic compiler that can convert basic kernels into
> > machine code is expected.
> >
> Although the compiler is not open source the compilation flow lets users examine
> output from various stages. For example if you write your kernel in OpenCL/C/C++
> you can view the RTL (Verilog/VHDL) output produced by first stage of compilation.
> Note that the compiler is really generating a custom circuit given a high level
> input which in the last phase gets synthesized into bitstream. Expert hardware
> designers can handcraft a circuit in RTL and feed it to the compiler. Our FPGA tools
> let you view the generated hardware design, the register map, etc. You can get more
> information about a compiled design by running XRT tool like xclbinutil on the
> generated file.
>
> In essence compiling for FPGAs is quite different than compiling for GPU/CPU/DSP.
> Interestingly FPGA compilers can run anywhere from 30 mins to a few hours to
> compile a testcase.

So is there any open source userspace generator for what this
interface provides? Is the bitstream format that gets fed into the
FPGA proprietary and is it signed?

Dave.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-29  4:56             ` Dave Airlie
@ 2019-03-30  1:09               ` Ronan KERYELL
  2019-04-03 13:14                 ` Daniel Vetter
  2019-04-03 15:47                 ` Jerome Glisse
  0 siblings, 2 replies; 20+ messages in thread
From: Ronan KERYELL @ 2019-03-30  1:09 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Sonal Santan, Daniel Vetter, dri-devel, gregkh,
	Cyril Chemparathy, linux-kernel, Lizhi Hou, Michal Simek,
	airlied, linux-fpga, Ralph Wittig, Ronan Keryell

I am adding linux-fpga@vger.kernel.org, since its absence from Cc is why
I missed this thread in the first place...

>>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com> said:

Hi Dave!

    Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:

    >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]

[...]

    >>> Note: There's no expectation for the fully optimizing compiler,
    >>> and we're totally ok if there's an optimizing proprietary
    >>> compiler and a basic open one (amd, and bunch of other
    >>> companies all have such dual stacks running on top of drm
    >>> kernel drivers). But a basic compiler that can convert basic
    >>> kernels into machine code is expected.

    >> Although the compiler is not open source the compilation flow
    >> lets users examine output from various stages. For example if you
    >> write your kernel in OpenCL/C/C++ you can view the RTL
    >> (Verilog/VHDL) output produced by first stage of compilation.
    >> Note that the compiler is really generating a custom circuit
    >> given a high level input which in the last phase gets synthesized
    >> into bitstream. Expert hardware designers can handcraft a circuit
    >> in RTL and feed it to the compiler. Our FPGA tools let you view
    >> the generated hardware design, the register map, etc. You can get
    >> more information about a compiled design by running XRT tool like
    >> xclbinutil on the generated file.

    >> In essence compiling for FPGAs is quite different than compiling
    >> for GPU/CPU/DSP.  Interestingly FPGA compilers can run anywhere
    >> from 30 mins to a few hours to compile a testcase.

    Dave> So is there any open source userspace generator for what this
    Dave> interface provides? Is the bitstream format that gets fed into
    Dave> the FPGA proprietary and is it signed?

Short answer:

- a bitstream is opaque content, similar to the various firmware blobs
  already handled by Linux: EFI capsules, x86 microcode, WiFi modem
  firmware, etc.;

- there is no open-source generator for what the interface consumes;

- I do not know if it is signed;

- it is probably similar to what the Intel FPGA (not GPU) drivers
  already provide inside the Linux kernel, and I guess there is no pure
  open-source way to generate their bitstream either.


Long answer:

- processors, GPUs and other digital circuits are designed from a lot
  of elementary transistors, wires, capacitors, resistors... using some
  very complex (and expensive) tools from EDA companies, but in the
  end, after months of work, they usually come with a "simple" public
  interface, the... instruction set! So it is rather "easy" to generate
  instructions with a compiler such as LLVM from a description of this
  ISA or from some reverse engineering. Note that even if the ISA is
  public, it is very difficult to build another efficient processor
  from scratch just from this ISA, so there is usually no concern about
  making the ISA public to develop the ecosystem;

- FPGAs are field-programmable gate arrays, also made from a lot of
  elementary transistors, wires, capacitors, resistors... but organized
  into billions of very low-level elementary gates, memory elements,
  DSP blocks, I/O blocks, clock generators, specific accelerators...
  directly exposed to the user, which can be programmed according to a
  configuration memory (the bitstream) that details how each part and
  routing element is connected and how each elemental piece of hardware
  is configured. So instead of just writing instructions as on a CPU or
  a GPU, you need to configure each bit of the architecture in such a
  way that it does something interesting for you. Concretely, you write
  some programs in RTL languages (Verilog, VHDL) or at a higher level
  (C/C++, OpenCL, SYCL...) and you use some very complex (and
  expensive) tools from EDA companies to generate a bitstream
  implementing an equivalent circuit with the same semantics. Since the
  architecture is so low level, there is a direct mapping between the
  configuration memory (bitstream) and the hardware architecture
  itself, so if the bitstream format were public it would be easy to
  duplicate the FPGA itself and start a new FPGA company. That is
  unfortunately something the existing FPGA companies do not want... ;-)

To summarize:

- on a CPU & GPU, the vendor used the expensive EDA tools once already
  for you and provides the simpler ISA interface;

- on an FPGA, you have access to a pile of low-level hardware and it is
  up to you to go through the lengthy process of building your own
  computing architecture, using the heavy, expensive and very subtle
  EDA tools that will run for hours or days to generate a good-enough
  placement and routing for your pleasure.

There is some public documentation on-line:
https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html#documentation

To have an idea of the elementary architecture:
https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
https://www.xilinx.com/support/documentation/user_guides/ug579-ultrascale-dsp.pdf
https://www.xilinx.com/support/documentation/user_guides/ug573-ultrascale-memory-resources.pdf

Even on the configuration and the file format, but without any detailed semantics:
https://www.xilinx.com/support/documentation/user_guides/ug570-ultrascale-configuration.pdf


The Xilinx compiler xocc, which takes for example some LLVM IR and
generates a bitstream, is not open source and will probably never be,
for the reasons above... :-(

Xilinx is open-sourcing everything that can reasonably be open-sourced:

- the user-level and system run-time, including the OpenCL runtime:
  https://github.com/Xilinx/XRT to handle the bitstreams generated by
  the closed-source tools;

- the kernel device drivers, which are already in
  https://github.com/Xilinx/XRT but which we want to upstream into the
  Linux kernel to make life easier (that is the subject of this e-mail
  thread);

- to generate some real code in the most (modern and) open-source way,
  there is an open-source framework to compile SYCL C++ (including some
  Xilinx FPGA-specific extensions) down to SPIR LLVM IR using
  Clang/LLVM and to feed the closed-source xocc tool with it:
  https://github.com/triSYCL/triSYCL

  You can see, starting from
  https://github.com/triSYCL/triSYCL/blob/master/tests/Makefile#L322,
  how to start from C++ code, generate some SPIR LLVM IR, feed it to
  xocc and build a fat binary that will use the XRT runtime (a rough
  sketch of what such code looks like follows below).

  Some documentation is in
  https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst

  There are other, more official ways to generate bitstreams (they are
  called products instead of research projects like triSYCL :-) ).

  We are also working on another open-source SYCL compiler with Intel
  to have a better common implementation
  (https://github.com/intel/llvm/wiki) and to upstream it into
  Clang/LLVM.

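To make this flow a bit more concrete, here is a rough sketch (my own
illustration, not code taken from the triSYCL test suite) of the kind
of single-source SYCL C++ a user writes; the lambda passed to
parallel_for is what gets extracted to SPIR LLVM IR and handed to xocc,
while the rest runs on the host CPU on top of the XRT runtime:

  #include <CL/sycl.hpp>
  #include <iostream>
  #include <vector>

  int main() {
    constexpr size_t N = 16;
    std::vector<int> a(N, 1), b(N, 2), c(N);

    {
      // Default device selection; a real FPGA flow would pick the
      // accelerator device explicitly.
      cl::sycl::queue q;
      cl::sycl::buffer<int> A{a.data(), cl::sycl::range<1>{N}};
      cl::sycl::buffer<int> B{b.data(), cl::sycl::range<1>{N}};
      cl::sycl::buffer<int> C{c.data(), cl::sycl::range<1>{N}};

      q.submit([&](cl::sycl::handler &cgh) {
        auto ka = A.get_access<cl::sycl::access::mode::read>(cgh);
        auto kb = B.get_access<cl::sycl::access::mode::read>(cgh);
        auto kc = C.get_access<cl::sycl::access::mode::write>(cgh);
        // This lambda is the "kernel" that the offline tools turn into
        // a circuit.
        cgh.parallel_for<class vadd>(cl::sycl::range<1>{N},
                                     [=](cl::sycl::id<1> i) {
                                       kc[i] = ka[i] + kb[i];
                                     });
      });
    } // buffers go out of scope: results are copied back to the host

    for (size_t i = 0; i < N; ++i)
      std::cout << c[i] << ' ';
    std::cout << '\n';
  }

And on the host side, once xocc has produced the xclbin, the XRT
OpenCL runtime loads it through the standard clCreateProgramWithBinary()
path that Sonal mentioned earlier in the thread. Again just a minimal
sketch: the platform/device selection and the "vadd" kernel name are
placeholders, and error checking is omitted:

  #include <CL/cl.h>
  #include <cstdio>
  #include <vector>

  int main(int argc, char **argv) {
    if (argc < 2)
      return 1;

    // Read the precompiled xclbin (produced offline by xocc) into memory.
    std::FILE *f = std::fopen(argv[1], "rb");
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    std::vector<unsigned char> xclbin(size);
    std::fread(xclbin.data(), 1, size, f);
    std::fclose(f);

    // A real program would search for the Xilinx platform; here we just
    // take the first accelerator device we find.
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ACCELERATOR, 1, &device, nullptr);

    cl_int err;
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);

    // No online compilation: we hand the runtime the finished binary.
    const unsigned char *bin = xclbin.data();
    size_t bin_size = xclbin.size();
    cl_int bin_status;
    cl_program prog = clCreateProgramWithBinary(ctx, 1, &device, &bin_size,
                                                &bin, &bin_status, &err);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);

    cl_kernel krnl = clCreateKernel(prog, "vadd", &err);
    // ... create a command queue, set kernel arguments, enqueue, ...

    clReleaseKernel(krnl);
    clReleaseProgram(prog);
    clReleaseContext(ctx);
    return 0;
  }
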
So for Xilinx FPGA, you can see the LLVM IR as the equivalent of PTX
for NVIDIA. But xocc is closed source for some more fundamental
reasons: it would expose all the details of the FPGA. I guess it is
exactly the same for the other FPGA vendors.

Note that probably most of the toolchains used to generate the
low-level firmware for the various CPUs (microcode), GPUs, etc. are
also closed source.

See you,
-- 
Ronan KERYELL, Xilinx Research Labs / San José, California.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-30  1:09               ` Ronan KERYELL
@ 2019-04-03 13:14                 ` Daniel Vetter
  2019-04-03 14:17                   ` Moritz Fischer
  2019-04-03 15:47                 ` Jerome Glisse
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2019-04-03 13:14 UTC (permalink / raw)
  To: Ronan KERYELL
  Cc: Sonal Santan, gregkh, Cyril Chemparathy, linux-kernel, dri-devel,
	Ralph Wittig, Michal Simek, Lizhi Hou, airlied, Ronan Keryell,
	linux-fpga

On Fri, Mar 29, 2019 at 06:09:18PM -0700, Ronan KERYELL wrote:
> I am adding linux-fpga@vger.kernel.org, since this is why I missed this
> thread in the first place...
> 
> >>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com> said:
> 
> Hi Dave!
> 
>     Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:
> 
>     >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]
> 
> [...]
> 
>     >>> Note: There's no expectation for the fully optimizing compiler,
>     >>> and we're totally ok if there's an optimizing proprietary
>     >>> compiler and a basic open one (amd, and bunch of other
>     >>> companies all have such dual stacks running on top of drm
>     >>> kernel drivers). But a basic compiler that can convert basic
>     >>> kernels into machine code is expected.
> 
>     >> Although the compiler is not open source the compilation flow
>     >> lets users examine output from various stages. For example if you
>     >> write your kernel in OpenCL/C/C++ you can view the RTL
>     >> (Verilog/VHDL) output produced by first stage of compilation.
>     >> Note that the compiler is really generating a custom circuit
>     >> given a high level input which in the last phase gets synthesized
>     >> into bitstream. Expert hardware designers can handcraft a circuit
>     >> in RTL and feed it to the compiler. Our FPGA tools let you view
>     >> the generated hardware design, the register map, etc. You can get
>     >> more information about a compiled design by running XRT tool like
>     >> xclbinutil on the generated file.
> 
>     >> In essence compiling for FPGAs is quite different than compiling
>     >> for GPU/CPU/DSP.  Interestingly FPGA compilers can run anywhere
>     >> from 30 mins to a few hours to compile a testcase.
> 
>     Dave> So is there any open source userspace generator for what this
>     Dave> interface provides? Is the bitstream format that gets fed into
>     Dave> the FPGA proprietary and is it signed?
> 
> Short answer:
> 
> - a bitstream is an opaque content similar to various firmware handled
>   by Linux, EFI capsules, x86 microcode, WiFi modems, etc.
> 
> - there is no open-source generator for what the interface consume;
> 
> - I do not know if it is signed;
> 
> - it is probably similar to what Intel FPGA (not GPU) drivers provide
>   already inside the Linux kernel and I guess there is no pure
>   open-source way to generate their bit-stream either.

Yeah, drivers/gpu folks wouldn't ever have merged drivers/fpga, and I
think there's pretty strong consensus over here that merging fpga stuff
without having clear specs (in the form of an executable open source
compiler/synthesizer/whatever) was a mistake.

We just had a similarly huge discussion around the recently merged
habanalabs driver in drivers/misc, for neural network acceleration.
There was a proposed drivers/accel for these; gpu folks objected, Greg
and Olof were happy with merging.

And the exact same arguments have come up tons of times for gpus too,
with lots of proposals to merge a kernel driver with just the kernel
driver being open source, or just the state tracker/runtime, but most
definitely not anything looking like the compiler. Because $reasons.

The conclusion was that drivers/gpu people will continue to reject
these, everyone else will continue to take whatever, but just don't
complain to us if it all comes crashing down :-)

> Long answer:
> 
> - processors, GPU and other digital circuits are designed from a lot of
>   elementary transistors, wires, capacitors, resistors... using some
>   very complex (and expensive) tools from some EDA companies but at the
>   end, after months of work, they come often with a "simple" public
>   interface, the... instruction set! So it is rather "easy" at the end
>   to generate some instructions with a compiler such as LLVM from a
>   description of this ISA or some reverse engineering. Note that even if
>   the ISA is public, it is very difficult to make another efficient
>   processor from scratch just from this ISA, so there is often no
>   concern about making this ISA public to develop the ecosystem ;
> 
> - FPGA are field-programmable gate arrays, made also from a lot of
>   elementary transistors, wires, capacitors, resistors... but organized
>   in billions of very low-level elementary gates, memory elements, DSP
>   blocks, I/O blocks, clock generators, specific
>   accelerators... directly exposed to the user and that can be
>   programmed according to a configuration memory (the bitstream) that
>   details how to connect each part, routing element, configuring each
>   elemental piece of hardware.  So instead of just writing instructions
>   like on a CPU or a GPU, you need to configure each bit of the
>   architecture in such a way it does something interesting for
>   you. Concretely, you write some programs in RTL languages (Verilog,
>   VHDL) or higher-level (C/C++, OpenCL, SYCL...)  and you use some very
>   complex (and expensive) tools from some EDA companies to generate the
>   bitstream implementing an equivalent circuit with the same
>   semantics. Since the architecture is so low level, there is a direct
>   mapping between the configuration memory (bitstream) and the hardware
>   architecture itself, so if it is public then it is easy to duplicate
>   the FPGA itself and to start a new FPGA company. That is unfortunately
>   something the existing FPGA companies do not want... ;-)

I.e. you have a use case where you absolutely need an offline compiler.
Same as with gpus (in some use cases); the only difference is that for
gpus the latency budget is measured in milliseconds, because anything
slower would drop frames, and worst case compiling a big shader takes
seconds. With FPGAs the limits are just 1000x higher, but it's the same
problem.

> To summarize:
> 
> - on a CPU & GPU, the vendor used the expensive EDA tools once already
>   for you and provide the simpler ISA interface;
> 
> - on an FPGA, you have access to a pile of low-level hardware and it is
>   up to you to use the lengthy process of building your own computing
>   architecture using the heavy expensive very subtle EDA tools that will
>   run for hours or days to generate some good-enough placement for your
>   pleasure.
> 
> There is some public documentation on-line:
> https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html#documentation
> 
> To have an idea of the elementary architecture:
> https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
> https://www.xilinx.com/support/documentation/user_guides/ug579-ultrascale-dsp.pdf
> https://www.xilinx.com/support/documentation/user_guides/ug573-ultrascale-memory-resources.pdf
> 
> Even on the configuration and the file format, but without any detailed semantics:
> https://www.xilinx.com/support/documentation/user_guides/ug570-ultrascale-configuration.pdf
> 
> 
> The Xilinx compiler xocc taking for example some LLVM IR and generating
> some bitstream is not open-source and will probably never be for the
> reasons above... :-(
> 
> Xilinx is open-sourcing all what can reasonably be open-sourced:
> 
> - the user-level and system run-time, including the OpenCL runtime:
>   https://github.com/Xilinx/XRT to handle the bitstreams generated by
>   some close-source tools
> 
> - the kernel device drivers which are already in
>   https://github.com/Xilinx/XRT but we want to upstream into the Linux
>   kernel to make life easier (this is the matter of this e-mail thread);
> 
> - to generate some real code in the most (modern and) open-source way,
>   there is an open-source framework to compile some SYCL C++ including
>   some Xilinx FPGA-specific extensions down to SPIR LLVM IR using
>   Clang/LLVM and to feed the close-source xocc tool with it
>   https://github.com/triSYCL/triSYCL
> 
>   You can see starting from
>   https://github.com/triSYCL/triSYCL/blob/master/tests/Makefile#L322 how
>   to start from C++ code, generate some SPIR LLVM IR and to feed xocc
>   and build a fat binary that will use the XRT runtime.
> 
>   Some documentation in
>   https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst
> 
>   There are other more official ways to generate bitstream (they are
>   called products instead of research projects like triSYCL :-) ).
> 
>   We are also working on an other open-source SYCL compiler with Intel
>   to have a better common implementation
>   https://github.com/intel/llvm/wiki and to upstream this into Clang/LLVM.

Yeah, there have been plenty of gpu stacks with "everything open
sourced that can be open sourced" except the compiler. We didn't take
those drivers either.

And I looked at the entire stack already to see what's there and what's
missing.

> So for Xilinx FPGA, you can see the LLVM IR as the equivalent of PTX for
> nVidia. But xocc is close-source for some more fundamental reasons: it
> would expose all the details of the FPGA. I guess this is exactly the
> same for Xilinx FPGA.

Yeah, neither did we merge a driver with just some IR as the
"compiler", and most definitely not PTX (since that's just nv lock-in;
SPIR-V is the cross-vendor solution that at least seems to have a
fighting chance). We want the low-level stuff (and if the high-level
compiler is the dumbest, least optimizing thing ever that can't run any
real-world workload yet, that's fine, it can be fixed). The low-level
stuff is what matters from an uapi perspective.

> Note that probably most of the tool chains used to generate the
> low-level firmware for the various CPU (microcode), GPU, etc. are
> also close-source.

Yup. None have been successfully used to merge stuff into drivers/gpu.

Note that we're perfectly fine with closed-source stacks running on top
of drivers/gpu, with lots of additional secret sauce/value add/customer
lock-in/whatever compared to the basic open source stack. There are
plenty of vendors doing that. But for the uapi review, and for making
sure we can at least keep the basic stack working, it needs to be the
full open stack. End to end.

I guess I need to actually type up that article on my blog about why
exactly we insist on this so much; it seems to be becoming a bit of an
FAQ.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-04-03 13:14                 ` Daniel Vetter
@ 2019-04-03 14:17                   ` Moritz Fischer
  2019-04-03 14:53                     ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Moritz Fischer @ 2019-04-03 14:17 UTC (permalink / raw)
  To: Ronan KERYELL, Dave Airlie, Sonal Santan, dri-devel, gregkh,
	Cyril Chemparathy, linux-kernel, Lizhi Hou, Michal Simek,
	airlied, linux-fpga, Ralph Wittig, Ronan Keryell

Hi Daniel,

On Wed, Apr 03, 2019 at 03:14:49PM +0200, Daniel Vetter wrote:
> On Fri, Mar 29, 2019 at 06:09:18PM -0700, Ronan KERYELL wrote:
> > I am adding linux-fpga@vger.kernel.org, since this is why I missed this
> > thread in the first place...
> > 
> > >>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com> said:
> > 
> > Hi Dave!
> > 
> >     Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:
> > 
> >     >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]
> > 
> > [...]
> > 
> >     >>> Note: There's no expectation for the fully optimizing compiler,
> >     >>> and we're totally ok if there's an optimizing proprietary
> >     >>> compiler and a basic open one (amd, and bunch of other
> >     >>> companies all have such dual stacks running on top of drm
> >     >>> kernel drivers). But a basic compiler that can convert basic
> >     >>> kernels into machine code is expected.
> > 
> >     >> Although the compiler is not open source the compilation flow
> >     >> lets users examine output from various stages. For example if you
> >     >> write your kernel in OpenCL/C/C++ you can view the RTL
> >     >> (Verilog/VHDL) output produced by first stage of compilation.
> >     >> Note that the compiler is really generating a custom circuit
> >     >> given a high level input which in the last phase gets synthesized
> >     >> into bitstream. Expert hardware designers can handcraft a circuit
> >     >> in RTL and feed it to the compiler. Our FPGA tools let you view
> >     >> the generated hardware design, the register map, etc. You can get
> >     >> more information about a compiled design by running XRT tool like
> >     >> xclbinutil on the generated file.
> > 
> >     >> In essence compiling for FPGAs is quite different than compiling
> >     >> for GPU/CPU/DSP.  Interestingly FPGA compilers can run anywhere
> >     >> from 30 mins to a few hours to compile a testcase.
> > 
> >     Dave> So is there any open source userspace generator for what this
> >     Dave> interface provides? Is the bitstream format that gets fed into
> >     Dave> the FPGA proprietary and is it signed?
> > 
> > Short answer:
> > 
> > - a bitstream is an opaque content similar to various firmware handled
> >   by Linux, EFI capsules, x86 microcode, WiFi modems, etc.
> > 
> > - there is no open-source generator for what the interface consume;
> > 
> > - I do not know if it is signed;
> > 
> > - it is probably similar to what Intel FPGA (not GPU) drivers provide
> >   already inside the Linux kernel and I guess there is no pure
> >   open-source way to generate their bit-stream either.
> 
> Yeah, drivers/gpu folks wouldn't ever have merged drivers/fpga, and I
> think there's pretty strong consensus over here that merging fpga stuff
> without having clear specs (in the form of an executable open source
> compiler/synthesizer/whatever) was a mistake.

I don't totally understand this statement. You don't go out and ask
people to open source their EDA tools that are used to create the ASICs
on any piece of HW (NIC, GPU, USB controller,...) out there.

FPGAs are no different.

I think you need to distinguish between FPGAs in general, as a means to
implement a HW solution, and *FPGA-based devices* that implement flows
such as OpenCL etc. For the latter I'm more inclined to buy the
equivalence-to-GPUs argument.

> We just had a similar huge discussions around the recently merged
> habanalabs driver in drivers/misc, for neural network accel. There was a
> proposed drivers/accel for these. gpu folks objected, Greg and Olof were
> happy with merging.
> 
> And the exact same arguments has come up tons of times for gpus too, with
> lots proposals to merge a kernel driver with just the kernel driver being
> open source, or just the state tracker/runtime, but most definitely not
> anything looking like the compiler. Because $reasons.
> 
> Conclusion was that drivers/gpu people will continue to reject these,
> everyone else will continue to take whatever, but just don't complain to
> us if it all comes crashing down :-)
> 
> > Long answer:
> > 
> > - processors, GPU and other digital circuits are designed from a lot of
> >   elementary transistors, wires, capacitors, resistors... using some
> >   very complex (and expensive) tools from some EDA companies but at the
> >   end, after months of work, they come often with a "simple" public
> >   interface, the... instruction set! So it is rather "easy" at the end
> >   to generate some instructions with a compiler such as LLVM from a
> >   description of this ISA or some reverse engineering. Note that even if
> >   the ISA is public, it is very difficult to make another efficient
> >   processor from scratch just from this ISA, so there is often no
> >   concern about making this ISA public to develop the ecosystem ;
> > 
> > - FPGA are field-programmable gate arrays, made also from a lot of
> >   elementary transistors, wires, capacitors, resistors... but organized
> >   in billions of very low-level elementary gates, memory elements, DSP
> >   blocks, I/O blocks, clock generators, specific
> >   accelerators... directly exposed to the user and that can be
> >   programmed according to a configuration memory (the bitstream) that
> >   details how to connect each part, routing element, configuring each
> >   elemental piece of hardware.  So instead of just writing instructions
> >   like on a CPU or a GPU, you need to configure each bit of the
> >   architecture in such a way it does something interesting for
> >   you. Concretely, you write some programs in RTL languages (Verilog,
> >   VHDL) or higher-level (C/C++, OpenCL, SYCL...)  and you use some very
> >   complex (and expensive) tools from some EDA companies to generate the
> >   bitstream implementing an equivalent circuit with the same
> >   semantics. Since the architecture is so low level, there is a direct
> >   mapping between the configuration memory (bitstream) and the hardware
> >   architecture itself, so if it is public then it is easy to duplicate
> >   the FPGA itself and to start a new FPGA company. That is unfortunately
> >   something the existing FPGA companies do not want... ;-)
> 
> i.e. you have a use case where you absolutely need an offline compiler.
> Like with gpus (in some use cases), the only difference is that for gpus
> the latency requirement that's too high is measured in milliseconds, cause
> that would cause dropped frames, and worst case compiling takes seconds
> for some big shaders. With FPGAs it's just 1000x higher limits, same problem.

As I said above, you'd do the same thing when you design any other piece
of hardware out there, except that with FPGAs you are able to change
things later, whereas with an ASIC your netlist gets fixed at tape-out.
> 
> > To summarize:
> > 
> > - on a CPU & GPU, the vendor used the expensive EDA tools once already
> >   for you and provide the simpler ISA interface;
> > 
> > - on an FPGA, you have access to a pile of low-level hardware and it is
> >   up to you to use the lengthy process of building your own computing
> >   architecture using the heavy expensive very subtle EDA tools that will
> >   run for hours or days to generate some good-enough placement for your
> >   pleasure.
> > 
> > There is some public documentation on-line:
> > https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html#documentation
> > 
> > To have an idea of the elementary architecture:
> > https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
> > https://www.xilinx.com/support/documentation/user_guides/ug579-ultrascale-dsp.pdf
> > https://www.xilinx.com/support/documentation/user_guides/ug573-ultrascale-memory-resources.pdf
> > 
> > Even on the configuration and the file format, but without any detailed semantics:
> > https://www.xilinx.com/support/documentation/user_guides/ug570-ultrascale-configuration.pdf
> > 
> > 
> > The Xilinx compiler xocc taking for example some LLVM IR and generating
> > some bitstream is not open-source and will probably never be for the
> > reasons above... :-(
> > 
> > Xilinx is open-sourcing all what can reasonably be open-sourced:
> > 
> > - the user-level and system run-time, including the OpenCL runtime:
> >   https://github.com/Xilinx/XRT to handle the bitstreams generated by
> >   some close-source tools
> > 
> > - the kernel device drivers which are already in
> >   https://github.com/Xilinx/XRT but we want to upstream into the Linux
> >   kernel to make life easier (this is the matter of this e-mail thread);
> > 
> > - to generate some real code in the most (modern and) open-source way,
> >   there is an open-source framework to compile some SYCL C++ including
> >   some Xilinx FPGA-specific extensions down to SPIR LLVM IR using
> >   Clang/LLVM and to feed the close-source xocc tool with it
> >   https://github.com/triSYCL/triSYCL
> > 
> >   You can see starting from
> >   https://github.com/triSYCL/triSYCL/blob/master/tests/Makefile#L322 how
> >   to start from C++ code, generate some SPIR LLVM IR and to feed xocc
> >   and build a fat binary that will use the XRT runtime.
> > 
> >   Some documentation in
> >   https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst
> > 
> >   There are other more official ways to generate bitstream (they are
> >   called products instead of research projects like triSYCL :-) ).
> > 
> >   We are also working on an other open-source SYCL compiler with Intel
> >   to have a better common implementation
> >   https://github.com/intel/llvm/wiki and to upstream this into Clang/LLVM.
> 
> Yeah, there's been plenty of gpu stacks with "everything open sourced that
> can be open sourced", except the compiler, for gpus. We didn't take those
> drivers either.
> 
> And I looked at the entire stack already to see what's there and what's
> missing.
> 
> > So for Xilinx FPGA, you can see the LLVM IR as the equivalent of PTX for
> > nVidia. But xocc is close-source for some more fundamental reasons: it
> > would expose all the details of the FPGA. I guess this is exactly the
> > same for Xilinx FPGA.
> 
> Yeah, neither did we merge a driver with just some IR as the "compiler",
> and most definitely not PTX (since that's just nv lock-in, spirv is the
> cross vendor solution that at least seems to have a fighting chance). We
> want the low level stuff (and if the high level compiler is the dumbest,
> least optimizing thing ever that can't run any real world workload yet,
> that's fine, it can be fixed). The low level stuff is what matters from an
> uapi perspective.
> 
> > Note that probably most of the tool chains used to generate the
> > low-level firmware for the various CPU (microcode), GPU, etc. are
> > also close-source.
> 
> Yup. None have been successfully used to merge stuff into drivers/gpu.
> 
> Note that we're perfectly fine with closed source stacks running on top of
> drivers/gpu, with lots of additional secret sauce/value add/customer lock
> in/whatever compared to the basic open source stack. There's plenty of
> vendors doing that. But for the uapi review, and making sure we can at
> least keep the basic stack working, it needs to be the full open stack.
> End to end.
> 
> I guess I need to actually type that article on my blog about why exactly
> we're so much insisting on this, seems to become a bit an FAQ.
> 
> Cheers, Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Cheers,
Moritz

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-04-03 14:17                   ` Moritz Fischer
@ 2019-04-03 14:53                     ` Daniel Vetter
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2019-04-03 14:53 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: Ronan KERYELL, Dave Airlie, Sonal Santan, dri-devel, gregkh,
	Cyril Chemparathy, linux-kernel, Lizhi Hou, Michal Simek,
	airlied, linux-fpga, Ralph Wittig, Ronan Keryell

On Wed, Apr 3, 2019 at 4:17 PM Moritz Fischer <mdf@kernel.org> wrote:
>
> Hi Daniel,
>
> On Wed, Apr 03, 2019 at 03:14:49PM +0200, Daniel Vetter wrote:
> > On Fri, Mar 29, 2019 at 06:09:18PM -0700, Ronan KERYELL wrote:
> > > I am adding linux-fpga@vger.kernel.org, since this is why I missed this
> > > thread in the first place...
> > >
> > > >>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com> said:
> > >
> > > Hi Dave!
> > >
> > >     Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:
> > >
> > >     >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]
> > >
> > > [...]
> > >
> > >     >>> Note: There's no expectation for the fully optimizing compiler,
> > >     >>> and we're totally ok if there's an optimizing proprietary
> > >     >>> compiler and a basic open one (amd, and bunch of other
> > >     >>> companies all have such dual stacks running on top of drm
> > >     >>> kernel drivers). But a basic compiler that can convert basic
> > >     >>> kernels into machine code is expected.
> > >
> > >     >> Although the compiler is not open source the compilation flow
> > >     >> lets users examine output from various stages. For example if you
> > >     >> write your kernel in OpenCL/C/C++ you can view the RTL
> > >     >> (Verilog/VHDL) output produced by first stage of compilation.
> > >     >> Note that the compiler is really generating a custom circuit
> > >     >> given a high level input which in the last phase gets synthesized
> > >     >> into bitstream. Expert hardware designers can handcraft a circuit
> > >     >> in RTL and feed it to the compiler. Our FPGA tools let you view
> > >     >> the generated hardware design, the register map, etc. You can get
> > >     >> more information about a compiled design by running XRT tool like
> > >     >> xclbinutil on the generated file.
> > >
> > >     >> In essence compiling for FPGAs is quite different than compiling
> > >     >> for GPU/CPU/DSP.  Interestingly FPGA compilers can run anywhere
> > >     >> from 30 mins to a few hours to compile a testcase.
> > >
> > >     Dave> So is there any open source userspace generator for what this
> > >     Dave> interface provides? Is the bitstream format that gets fed into
> > >     Dave> the FPGA proprietary and is it signed?
> > >
> > > Short answer:
> > >
> > > - a bitstream is an opaque content similar to various firmware handled
> > >   by Linux, EFI capsules, x86 microcode, WiFi modems, etc.
> > >
> > > - there is no open-source generator for what the interface consume;
> > >
> > > - I do not know if it is signed;
> > >
> > > - it is probably similar to what Intel FPGA (not GPU) drivers provide
> > >   already inside the Linux kernel and I guess there is no pure
> > >   open-source way to generate their bit-stream either.
> >
> > Yeah, drivers/gpu folks wouldn't ever have merged drivers/fpga, and I
> > think there's pretty strong consensus over here that merging fpga stuff
> > without having clear specs (in the form of an executable open source
> > compiler/synthesizer/whatever) was a mistake.
>
> I don't totally understand this statement. You don't go out and ask
> people to open source their EDA tools that are used to create the ASICs
> on any piece of HW (NIC, GPU, USB controller,...) out there.
>
> FPGAs are no different.
>
> I think you need to distinguish between the general FPGA as a means to
> implement a HW solution and *FPGA based devices* that implement flows such
> as OpenCL etc. For the latter I'm more inclined to buy the equivalence
> to GPUs argument.

Yeah, maybe there's a misunderstanding: my comments were in the
context of the submitted xilinx driver, and similar drivers that mean
to expose fpgas to userspace for doing stuff. If all you use your FPGA
for is to load a bitstream as a firmware blob, to then instantiate a
device which doesn't really change anymore, then the bitstream is just
like firmware indeed. But we're talking about kernel/userspace api,
where (possibly multiple) unprivileged clients can do whatever they
feel like, and where we have pretty hard requirements about not
breaking userspace. To be able to fully review and more or less
indefinitely support such driver stacks, we need to understand what
they're doing and what's possible. It's the "unprivileged userspace
submits Turing-complete (ok, sometimes not quite Turing-complete, but
really powerful i/o is usually on the menu) blobs to be run on
questionable hardware by the kernel" part that distinguishes a firmware
blob from the compute kernels we're talking about here. It's not gpu vs
fpga vs something else. From a kernel driver pov those are all the
same: you take a shader/bitstream/whatever blob + a bit of
state/configuration from userspace, and need to make sure there's no
DoS or other exploit in there.

And we're not going to take "there's no problem here" on blind faith,
because we know how well designed & validated hw is. This isn't new
with Spectre/Meltdown; gpus have been very fragile/broken/insecure
pieces since forever (it's slowly getting better though, and we rely
ever less on sw to guarantee isolation).

Wrt drivers/fpga: my understanding is that it's very much meant to let
unprivileged userspace use these devices; at least they wouldn't need
an ioctl interface otherwise.
-Daniel

> > We just had a similar huge discussions around the recently merged
> > habanalabs driver in drivers/misc, for neural network accel. There was a
> > proposed drivers/accel for these. gpu folks objected, Greg and Olof were
> > happy with merging.
> >
> > And the exact same arguments has come up tons of times for gpus too, with
> > lots proposals to merge a kernel driver with just the kernel driver being
> > open source, or just the state tracker/runtime, but most definitely not
> > anything looking like the compiler. Because $reasons.
> >
> > Conclusion was that drivers/gpu people will continue to reject these,
> > everyone else will continue to take whatever, but just don't complain to
> > us if it all comes crashing down :-)
> >
> > > Long answer:
> > >
> > > - processors, GPU and other digital circuits are designed from a lot of
> > >   elementary transistors, wires, capacitors, resistors... using some
> > >   very complex (and expensive) tools from some EDA companies but at the
> > >   end, after months of work, they come often with a "simple" public
> > >   interface, the... instruction set! So it is rather "easy" at the end
> > >   to generate some instructions with a compiler such as LLVM from a
> > >   description of this ISA or some reverse engineering. Note that even if
> > >   the ISA is public, it is very difficult to make another efficient
> > >   processor from scratch just from this ISA, so there is often no
> > >   concern about making this ISA public to develop the ecosystem ;
> > >
> > > - FPGA are field-programmable gate arrays, made also from a lot of
> > >   elementary transistors, wires, capacitors, resistors... but organized
> > >   in billions of very low-level elementary gates, memory elements, DSP
> > >   blocks, I/O blocks, clock generators, specific
> > >   accelerators... directly exposed to the user and that can be
> > >   programmed according to a configuration memory (the bitstream) that
> > >   details how to connect each part, routing element, configuring each
> > >   elemental piece of hardware.  So instead of just writing instructions
> > >   like on a CPU or a GPU, you need to configure each bit of the
> > >   architecture in such a way it does something interesting for
> > >   you. Concretely, you write some programs in RTL languages (Verilog,
> > >   VHDL) or higher-level (C/C++, OpenCL, SYCL...)  and you use some very
> > >   complex (and expensive) tools from some EDA companies to generate the
> > >   bitstream implementing an equivalent circuit with the same
> > >   semantics. Since the architecture is so low level, there is a direct
> > >   mapping between the configuration memory (bitstream) and the hardware
> > >   architecture itself, so if it is public then it is easy to duplicate
> > >   the FPGA itself and to start a new FPGA company. That is unfortunately
> > >   something the existing FPGA companies do not want... ;-)
> >
> > i.e. you have a use case where you absolutely need an offline compiler.
> > Like with gpus (in some use cases), the only difference is that for gpus
> > the latency requirement that's too high is measured in milliseconds, cause
> > that would cause dropped frames, and worst case compiling takes seconds
> > for some big shaders. With FPGAs it's just 1000x higher limits, same problem.
>
> As I said above, you'd do the same thing when you design any other piece
> of hardware out there, except for with FPGAs you'd be able to change
> stuff, whereas with an ASIC your netlist gets fixed at tape-out date.
> >
> > > To summarize:
> > >
> > > - on a CPU & GPU, the vendor used the expensive EDA tools once already
> > >   for you and provide the simpler ISA interface;
> > >
> > > - on an FPGA, you have access to a pile of low-level hardware and it is
> > >   up to you to use the lengthy process of building your own computing
> > >   architecture using the heavy expensive very subtle EDA tools that will
> > >   run for hours or days to generate some good-enough placement for your
> > >   pleasure.
> > >
> > > There is some public documentation on-line:
> > > https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html#documentation
> > >
> > > To have an idea of the elementary architecture:
> > > https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf
> > > https://www.xilinx.com/support/documentation/user_guides/ug579-ultrascale-dsp.pdf
> > > https://www.xilinx.com/support/documentation/user_guides/ug573-ultrascale-memory-resources.pdf
> > >
> > > Even on the configuration and the file format, but without any detailed semantics:
> > > https://www.xilinx.com/support/documentation/user_guides/ug570-ultrascale-configuration.pdf
> > >
> > >
> > > The Xilinx compiler xocc taking for example some LLVM IR and generating
> > > some bitstream is not open-source and will probably never be for the
> > > reasons above... :-(
> > >
> > > Xilinx is open-sourcing all what can reasonably be open-sourced:
> > >
> > > - the user-level and system run-time, including the OpenCL runtime:
> > >   https://github.com/Xilinx/XRT to handle the bitstreams generated by
> > >   some close-source tools
> > >
> > > - the kernel device drivers which are already in
> > >   https://github.com/Xilinx/XRT but we want to upstream into the Linux
> > >   kernel to make life easier (this is the matter of this e-mail thread);
> > >
> > > - to generate some real code in the most (modern and) open-source way,
> > >   there is an open-source framework to compile SYCL C++, including
> > >   some Xilinx FPGA-specific extensions, down to SPIR LLVM IR using
> > >   Clang/LLVM and to feed the closed-source xocc tool with it:
> > >   https://github.com/triSYCL/triSYCL
> > >
> > >   You can see, starting from
> > >   https://github.com/triSYCL/triSYCL/blob/master/tests/Makefile#L322, how
> > >   to start from C++ code, generate some SPIR LLVM IR, feed xocc
> > >   and build a fat binary that will use the XRT runtime.
> > >
> > >   Some documentation in
> > >   https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst
> > >
> > >   There are other, more official ways to generate a bitstream (they are
> > >   called products, instead of research projects like triSYCL :-) ).
> > >
> > >   We are also working on another open-source SYCL compiler with Intel
> > >   to have a better common implementation
> > >   https://github.com/intel/llvm/wiki and to upstream this into Clang/LLVM.
> >
> > Yeah, there have been plenty of GPU stacks with "everything open-sourced
> > that can be open-sourced" except the compiler. We didn't take those
> > drivers either.
> >
> > And I looked at the entire stack already to see what's there and what's
> > missing.
> >
> > > So for Xilinx FPGA, you can see the LLVM IR as the equivalent of PTX for
> > > nVidia. But xocc is closed-source for some more fundamental reasons: it
> > > would expose all the details of the FPGA. I guess this is exactly the
> > > same for Xilinx FPGA.
> >
> > Yeah, neither did we merge a driver with just some IR as the "compiler",
> > and most definitely not PTX (since that's just nv lock-in; SPIR-V is the
> > cross-vendor solution that at least seems to have a fighting chance). We
> > want the low-level stuff (and if the high-level compiler is the dumbest,
> > least optimizing thing ever that can't run any real-world workload yet,
> > that's fine, it can be fixed). The low-level stuff is what matters from an
> > uapi perspective.
> >
> > > Note that probably most of the toolchains used to generate the
> > > low-level firmware for the various CPUs (microcode), GPUs, etc. are
> > > also closed-source.
> >
> > Yup. None have been successfully used to merge stuff into drivers/gpu.
> >
> > Note that we're perfectly fine with closed source stacks running on top of
> > drivers/gpu, with lots of additional secret sauce/value add/customer lock
> > in/whatever compared to the basic open source stack. There's plenty of
> > vendors doing that. But for the uapi review, and making sure we can at
> > least keep the basic stack working, it needs to be the full open stack.
> > End to end.
> >
> > I guess I need to actually write that article on my blog about why exactly
> > we're insisting on this so much; it seems to be becoming a bit of an FAQ.
> >
> > Cheers, Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
> Cheers,
> Moritz
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-03-30  1:09               ` Ronan KERYELL
  2019-04-03 13:14                 ` Daniel Vetter
@ 2019-04-03 15:47                 ` Jerome Glisse
  2019-04-05 22:15                   ` Sonal Santan
  1 sibling, 1 reply; 20+ messages in thread
From: Jerome Glisse @ 2019-04-03 15:47 UTC (permalink / raw)
  To: Ronan KERYELL
  Cc: Dave Airlie, Sonal Santan, Daniel Vetter, dri-devel, gregkh,
	Cyril Chemparathy, linux-kernel, Lizhi Hou, Michal Simek,
	airlied, linux-fpga, Ralph Wittig, Ronan Keryell

On Fri, Mar 29, 2019 at 06:09:18PM -0700, Ronan KERYELL wrote:
> I am adding linux-fpga@vger.kernel.org, since this is why I missed this
> thread in the first place...
> >>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com> said:
>     Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com> wrote:
>     >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]

[...]

> Long answer:
> 
> - processors, GPU and other digital circuits are designed from a lot of
>   elementary transistors, wires, capacitors, resistors... using some
>   very complex (and expensive) tools from some EDA companies but at the
>   end, after months of work, they come often with a "simple" public
>   interface, the... instruction set! So it is rather "easy" at the end
>   to generate some instructions with a compiler such as LLVM from a
>   description of this ISA or some reverse engineering. Note that even if
>   the ISA is public, it is very difficult to make another efficient
>   processor from scratch just from this ISA, so there is often no
>   concern about making this ISA public to develop the ecosystem ;
> 
> - FPGA are field-programmable gate arrays, made also from a lot of
>   elementary transistors, wires, capacitors, resistors... but organized
>   in billions of very low-level elementary gates, memory elements, DSP
>   blocks, I/O blocks, clock generators, specific
>   accelerators... directly exposed to the user and that can be
>   programmed according to a configuration memory (the bitstream) that
>   details how to connect each part, routing element, configuring each
>   elemental piece of hardware.  So instead of just writing instructions
>   like on a CPU or a GPU, you need to configure each bit of the
>   architecture in such a way it does something interesting for
>   you. Concretely, you write some programs in RTL languages (Verilog,
>   VHDL) or higher-level (C/C++, OpenCL, SYCL...)  and you use some very
>   complex (and expensive) tools from some EDA companies to generate the
>   bitstream implementing an equivalent circuit with the same
>   semantics. Since the architecture is so low level, there is a direct
>   mapping between the configuration memory (bitstream) and the hardware
>   architecture itself, so if it is public then it is easy to duplicate
>   the FPGA itself and to start a new FPGA company. That is unfortunately
>   something the existing FPGA companies do not want... ;-)

This is a completely bogus argument; all the FPGA documentation I have seen so
far _extensively_ describes _each_ basic block within the FPGA, and this
includes the excellent documentation Xilinx provides on the inner workings and
layout of Xilinx FPGAs. The same applies to Altera, Atmel, Lattice, ...

The extensive public documentation is enough for anyone with the money and
with half-decent engineers to produce an FPGA.

The real know-how of an FPGA vendor is how to produce big chips on a small
process, capable of sustaining high clocks with the best power consumption
possible. This is the part where each company's years of experience pay off.
The cost for anyone to enter the market is in the hundreds of millions just in
setup costs and in catching up with the established vendors on the hardware
side, without any guarantee of revenue at the end.

The bitstream only gives away which bits correspond to which wire and where
the LUT boolean table is stored... Bitstreams that have been reverse
engineered have never revealed anything of value that was not already
publicly documented.
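
To make that concrete, here is a minimal, purely illustrative C sketch of the
idea (the names and the 16-bit encoding are invented for the example, not any
vendor's actual configuration format): a 4-input LUT is just a small truth
table, and the matching slice of the bitstream is nothing more than those
truth-table bits.

#include <stdint.h>
#include <stdio.h>

/* A 4-input LUT is a 16-entry truth table: the inputs form an index
 * and the selected bit is the output.  The "configuration" of such a
 * LUT is nothing more than those 16 bits.
 */
static int lut4(uint16_t truth_table, unsigned a, unsigned b,
		unsigned c, unsigned d)
{
	unsigned index = (d << 3) | (c << 2) | (b << 1) | a;

	return (truth_table >> index) & 1;
}

int main(void)
{
	/* Hypothetical configuration implementing out = a & b:
	 * the bit is set wherever a = b = 1, i.e. indices 3, 7, 11, 15.
	 */
	uint16_t cfg = 0x8888;

	printf("%d %d\n", lut4(cfg, 1, 1, 0, 0), lut4(cfg, 1, 0, 0, 0));
	return 0;
}

Real devices obviously layer routing, carry chains and clocking configuration
on top of this, but the principle is the same: the bits describe wiring and
truth tables, not some secret algorithm.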


So no, the bitstream has _no_ value; please prove me wrong, with the Lattice
bitstream for instance. If anything, the fact that Lattice has a
reverse-engineered bitstream has made that FPGA popular with the maker
community, as it allows people to do experiments for which the closed source
tools are an impediment. So I would argue that an open bitstream is actually
beneficial.


The only valid reason I have ever seen for hiding the bitstream is to protect
the IP of the customer, i.e. those customers that pour quite a lot of money
into designing something with an FPGA and then want to keep the VHDL/Verilog
protected and "safe" from reverse engineering.

But this is security by obscurity, and FPGA companies would be better off
providing strong bitstream encryption (most already do, but I have seen some
papers on how to break it).


I would rather not see bogus arguments being used to justify something that is
not justifiable.


Daniel already stressed that we need to know what the bitstream can do, and it
is even more important with FPGAs where, on some devices, AFAICT the bitstream
can have total control over the PCIe bus and thus can be used to attack either
main memory or other PCIe devices.

For instance, with ATS/PASID you can have the device issue pre-translated
requests, which the IOMMU will not check again, and thus access any memory
despite the IOMMU.

So without total confidence in what the bitstream can and cannot do, and thus
without knowledge of the bitstream format and how it maps to LUTs, switches,
crossbars, clocks and fixed blocks (PCIe, DSP, DAC, ADC, ...), there is no way
for someone independent to check anything.


Cheers,
Jérôme Glisse

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
  2019-04-03 15:47                 ` Jerome Glisse
@ 2019-04-05 22:15                   ` Sonal Santan
  0 siblings, 0 replies; 20+ messages in thread
From: Sonal Santan @ 2019-04-05 22:15 UTC (permalink / raw)
  To: Jerome Glisse, Ronan KERYELL
  Cc: Dave Airlie, Daniel Vetter, dri-devel, gregkh, Cyril Chemparathy,
	linux-kernel, Lizhi Hou, Michal Simek, airlied, linux-fpga,
	Ralph Wittig, Ronan Keryell



> -----Original Message-----
> From: Jerome Glisse [mailto:jglisse@redhat.com]
> Sent: Wednesday, April 03, 2019 8:48 AM
> To: Ronan KERYELL <ronan@keryell.fr>
> Cc: Dave Airlie <airlied@gmail.com>; Sonal Santan <sonals@xilinx.com>;
> Daniel Vetter <daniel@ffwll.ch>; dri-devel@lists.freedesktop.org;
> gregkh@linuxfoundation.org; Cyril Chemparathy <cyrilc@xilinx.com>; linux-
> kernel@vger.kernel.org; Lizhi Hou <lizhih@xilinx.com>; Michal Simek
> <michals@xilinx.com>; airlied@redhat.com; linux-fpga@vger.kernel.org; Ralph
> Wittig <wittig@xilinx.com>; Ronan Keryell <rkeryell@xilinx.com>
> Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> 
> On Fri, Mar 29, 2019 at 06:09:18PM -0700, Ronan KERYELL wrote:
> > I am adding linux-fpga@vger.kernel.org, since this is why I missed
> > this thread in the first place...
> > >>>>> On Fri, 29 Mar 2019 14:56:17 +1000, Dave Airlie <airlied@gmail.com>
> said:
> >     Dave> On Thu, 28 Mar 2019 at 10:14, Sonal Santan <sonals@xilinx.com>
> wrote:
> >     >>> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch]
> 
> [...]
> 
> > Long answer:
> >
> > - processors, GPU and other digital circuits are designed from a lot of
> >   elementary transistors, wires, capacitors, resistors... using some
> >   very complex (and expensive) tools from some EDA companies but at the
> >   end, after months of work, they come often with a "simple" public
> >   interface, the... instruction set! So it is rather "easy" at the end
> >   to generate some instructions with a compiler such as LLVM from a
> >   description of this ISA or some reverse engineering. Note that even if
> >   the ISA is public, it is very difficult to make another efficient
> >   processor from scratch just from this ISA, so there is often no
> >   concern about making this ISA public to develop the ecosystem ;
> >
> > - FPGA are field-programmable gate arrays, made also from a lot of
> >   elementary transistors, wires, capacitors, resistors... but organized
> >   in billions of very low-level elementary gates, memory elements, DSP
> >   blocks, I/O blocks, clock generators, specific
> >   accelerators... directly exposed to the user and that can be
> >   programmed according to a configuration memory (the bitstream) that
> >   details how to connect each part, routing element, configuring each
> >   elemental piece of hardware.  So instead of just writing instructions
> >   like on a CPU or a GPU, you need to configure each bit of the
> >   architecture in such a way it does something interesting for
> >   you. Concretely, you write some programs in RTL languages (Verilog,
> >   VHDL) or higher-level (C/C++, OpenCL, SYCL...)  and you use some very
> >   complex (and expensive) tools from some EDA companies to generate the
> >   bitstream implementing an equivalent circuit with the same
> >   semantics. Since the architecture is so low level, there is a direct
> >   mapping between the configuration memory (bitstream) and the hardware
> >   architecture itself, so if it is public then it is easy to duplicate
> >   the FPGA itself and to start a new FPGA company. That is unfortunately
> >   something the existing FPGA companies do not want... ;-)
> 
> This is a completely bogus argument; all the FPGA documentation I have seen
> so far _extensively_ describes _each_ basic block within the FPGA, and this
> includes the excellent documentation Xilinx provides on the inner workings
> and layout of Xilinx FPGAs. The same applies to Altera, Atmel, Lattice, ...
> 
> The extensive public documentation is enough for anyone with the money
> and with half-decent engineers to produce an FPGA.
> 
> The real know-how of an FPGA vendor is how to produce big chips on a small
> process, capable of sustaining high clocks with the best power consumption
> possible. This is the part where each company's years of experience pay off.
> The cost for anyone to enter the market is in the hundreds of millions just
> in setup costs and in catching up with the established vendors on the
> hardware side, without any guarantee of revenue at the end.
> 
> The bitstream only gives away which bits correspond to which wire and where
> the LUT boolean table is stored... Bitstreams that have been reverse
> engineered have never revealed anything of value that was not already
> publicly documented.
> 
> 
> So no, the bitstream has _no_ value; please prove me wrong, with the Lattice
> bitstream for instance. If anything, the fact that Lattice has a
> reverse-engineered bitstream has made that FPGA popular with the maker
> community, as it allows people to do experiments for which the closed source
> tools are an impediment. So I would argue that an open bitstream is actually
> beneficial.
> 
> 
> The only valid reason I have ever seen for hiding the bitstream is to protect
> the IP of the customer, i.e. those customers that pour quite a lot of money
> into designing something with an FPGA and then want to keep the
> VHDL/Verilog protected and "safe" from reverse engineering.
> 
> But this is security by obscurity, and FPGA companies would be better off
> providing strong bitstream encryption (most already do, but I have seen
> some papers on how to break it).
> 
> 
> I would rather not see bogus arguments being used to justify something that
> is not justifiable.
> 
> 
> Daniel already stressed that we need to know what the bitstream can do, and
> it is even more important with FPGAs where, on some devices, AFAICT the
> bitstream can have total control over the PCIe bus and thus can be used to
> attack either main memory or other PCIe devices.
> 
> For instance, with ATS/PASID you can have the device issue pre-translated
> requests, which the IOMMU will not check again, and thus access any memory
> despite the IOMMU.
> 
> So without total confidence in what the bitstream can and cannot do, and
> thus without knowledge of the bitstream format and how it maps to LUTs,
> switches, crossbars, clocks and fixed blocks (PCIe, DSP, DAC, ADC, ...),
> there is no way for someone independent to check anything.
> 
> 

Thank you for your time and valuable feedback. I will work on addressing these
points and get back.

-Sonal
> Cheers,
> Jérôme Glisse

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-04-05 22:15 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-19 21:53 [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver sonal.santan
2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 1/6] Add skeleton code: ioctl definitions and build hooks sonal.santan
2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 2/6] Global data structures shared between xocl and xmgmt drivers sonal.santan
2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 3/6] Add platform drivers for various IPs and frameworks sonal.santan
2019-03-19 21:53 ` [RFC PATCH Xilinx Alveo 4/6] Add core of XDMA driver sonal.santan
2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 5/6] Add management driver sonal.santan
2019-03-19 21:54 ` [RFC PATCH Xilinx Alveo 6/6] Add user physical function driver sonal.santan
2019-03-25 20:28 ` [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver Daniel Vetter
2019-03-26 23:30   ` Sonal Santan
2019-03-27  8:22     ` Daniel Vetter
2019-03-27 12:50       ` Sonal Santan
2019-03-27 14:11         ` Daniel Vetter
2019-03-28  0:13           ` Sonal Santan
2019-03-29  4:56             ` Dave Airlie
2019-03-30  1:09               ` Ronan KERYELL
2019-04-03 13:14                 ` Daniel Vetter
2019-04-03 14:17                   ` Moritz Fischer
2019-04-03 14:53                     ` Daniel Vetter
2019-04-03 15:47                 ` Jerome Glisse
2019-04-05 22:15                   ` Sonal Santan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).