All of lore.kernel.org
 help / color / mirror / Atom feed
* New driver mtipx2xx submission
@ 2011-04-28 15:53 Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-04-28 22:06 ` Alan Cox
  0 siblings, 1 reply; 40+ messages in thread
From: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR] @ 2011-04-28 15:53 UTC (permalink / raw)
  To: linux-ide

[-- Attachment #1: Type: text/plain, Size: 834 bytes --]

Hi All,

This is my first post/submission to Linux Kernel. Please bear with any
mistakes and give an opportunity to learn/correct. 

We have written a new block driver for our AHCI based PCIe SSDs. The
main objective of our product is providing high performance. Traffic
through OS storage stack is not to able fully utilize the device's
capabilty. To improve the traffic to the device and hence
showcase/utilize the device's capability, we have come up with this new
block driver. This driver includes
	* utilize device's increased queue depth
	* workaround for hardware errata

We want to get this driver into kernel tree to support the device out of
the box. Attached this driver as a patch for latest kernel. We would
like to get your comments, and also open for discussion.


Thanks & Regards,
Asai Thambi


[-- Attachment #2: mtipx2xx.patch --]
[-- Type: application/octet-stream, Size: 177935 bytes --]

diff -uNr linux-2.6.38/drivers/block/Kconfig linux-2.6.38-asai/drivers/block/Kconfig
--- linux-2.6.38/drivers/block/Kconfig	2011-03-14 19:20:32.000000000 -0600
+++ linux-2.6.38-asai/drivers/block/Kconfig	2011-04-15 20:20:15.000000000 -0600
@@ -116,6 +116,8 @@
 
 source "drivers/block/paride/Kconfig"
 
+source "drivers/block/mtipx2xx/Kconfig"
+
 config BLK_CPQ_DA
 	tristate "Compaq SMART2 support"
 	depends on PCI && VIRT_TO_BUS
diff -uNr linux-2.6.38/drivers/block/Makefile linux-2.6.38-asai/drivers/block/Makefile
--- linux-2.6.38/drivers/block/Makefile	2011-03-14 19:20:32.000000000 -0600
+++ linux-2.6.38-asai/drivers/block/Makefile	2011-04-15 20:20:02.000000000 -0600
@@ -38,5 +38,5 @@
 obj-$(CONFIG_XEN_BLKDEV_FRONTEND)	+= xen-blkfront.o
 obj-$(CONFIG_BLK_DEV_DRBD)     += drbd/
 obj-$(CONFIG_BLK_DEV_RBD)     += rbd.o
-
+obj-$(CONFIG_BLK_DEV_PCIE_SSD) += mtipx2xx/
 swim_mod-y	:= swim.o swim_asm.o
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/Kconfig linux-2.6.38-asai/drivers/block/mtipx2xx/Kconfig
--- linux-2.6.38/drivers/block/mtipx2xx/Kconfig	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/Kconfig	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,14 @@
+#
+# mtipx2xx device driver configuration
+#
+
+comment "Micron PCIe SSD"
+        depends on HOTPLUG_PCI_PCIE='y' || HOTPLUG_PCI_PCIE='m'
+
+config BLK_DEV_PCIE_SSD 
+        tristate "Block Device Driver for Micron PCIe SSDs"
+        depends on  HOTPLUG_PCI_PCIE
+        default m
+        help
+
+          This option enables block device driver for Micron PCIe SSDs. 
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/LICENSE linux-2.6.38-asai/drivers/block/mtipx2xx/LICENSE
--- linux-2.6.38/drivers/block/mtipx2xx/LICENSE	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/LICENSE	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,339 @@
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+		     END OF TERMS AND CONDITIONS
+
+	    How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License along
+    with this program; if not, write to the Free Software Foundation, Inc.,
+    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/Makefile linux-2.6.38-asai/drivers/block/mtipx2xx/Makefile
--- linux-2.6.38/drivers/block/mtipx2xx/Makefile	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/Makefile	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,7 @@
+#
+# Makefile for  Block device driver for Micron PCIe SSD
+#
+
+obj-$(CONFIG_BLK_DEV_PCIE_SSD) += mtipx2xx.o
+
+mtipx2xx-objs := module.o pci.o block.o ahci.o
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/ahci.c linux-2.6.38-asai/drivers/block/mtipx2xx/ahci.c
--- linux-2.6.38/drivers/block/mtipx2xx/ahci.c	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/ahci.c	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,3885 @@
+/*****************************************************************************
+ *
+ * ahci.c - Handles the AHCI protocol layer of the Cyclone SSD Block Driver
+ *   Copyright (C) 2009  Integrated Device Technology, Inc.
+ *
+ *  Changes from IDT 1.0.1 are copyright (C) 2010 Micron Technology, Inc.
+ *
+ *  This file is part of the Cyclone SSD Block Driver, it is free software:
+ *  you can redistribute it and/or modify it under the terms of the GNU
+ *  General Public License as published by the Free Software Foundation;
+ *  either version 2 of the License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc.,51 Franklin Street, Fifth Floor,Boston,MA 02110-1301,USA.
+ *
+ *  You can contact Integrated Device Technology, Inc via
+ *  email ssdhelp@idt.com or mail
+ *  Integrated Device Technology, Inc
+ *  6024 Silver Creek Valley Road, San Jose, CA 95128, USA
+ *
+ ****************************************************************************/
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+#include <linux/ata.h>
+#include <linux/delay.h>
+#include <linux/hdreg.h>
+#include <linux/uaccess.h>
+#include <linux/random.h>
+#include "mtipx2xx.h"
+
+/**
+ * @file
+ * The protocol layer interfaces between the block layer and
+ * the actual hardware.
+ * This layer of the driver supports the actual protocol used to talk to the
+ * hardware such as AHCI or NVMHCI. As already mentioned in the Block Layer
+ * section, the Protocol Layer needs to make a number of functions available
+ * to the BlockLayer.
+ *
+ * - ahci_init() - Called by the block layer to initialize the protocol layer.
+ * This includes resetting and initializing the hardware, and requesting IRQ's
+ * and enabling interrupts. This function may also perform some rudimentary
+ * tests to ensure that the drive is operating within specified parameters.
+ * -ahci_exit() - Called by the Block Layer to undo what was done
+ * by the init() call. It is also probably a good idea to sync the drive.
+ * - ahci_shutdown() - Called to sync the drive before a power down.
+ * - ahci_read() - Called by the Block Layer to read a number of
+ * sectors from the device.
+ * - ahci_write() - Called by the Block Layer to write a number of
+ * sectors to the device.
+ * - ahci_hwBlkSize() - Should return the hardware block size in bytes.
+ * For Cyclone devices this is 4KB (4096 bytes).
+ * - ahci_get_capacity() - Will return the capacity of the device in
+ * 512 byte sectors.
+ * - achi_get_scatterList() - Allocate a command slot and return its associated
+ * scatter list.
+ */
+
+#define UNU __attribute__ ((unused))
+
+#define AHCI_CMD_TBL_HDR_SZ	0x80
+#define AHCI_RX_FIS_SZ		0x100
+#define AHCI_CMD_SLOT_SZ	(MAX_COMMAND_SLOTS * 32)
+#define AHCI_CMD_TBL_SZ		(AHCI_CMD_TBL_HDR_SZ + (MAX_SG * 16))
+#define AHCI_CMD_TBL_AR_SZ	(AHCI_CMD_TBL_SZ * MAX_COMMAND_SLOTS)
+#define AHCI_PORT_PRIV_DMA_SZ \
+		(AHCI_CMD_SLOT_SZ + AHCI_CMD_TBL_AR_SZ + AHCI_RX_FIS_SZ)
+
+#define HBA_CAPS		0x00
+#define	HOST_CTRL		0x04
+#define HOST_IRQ_STAT	0x08
+#define PORTS_IMPL		0x0c
+#define VERSION			0x10
+#define CCCC			0x14
+#define CCCP			0x18
+#define	EML				0x1c
+#define	EMC				0x20
+#define	EX_HOST_CAP		0x24
+
+#define HOST_CAP_64		(1 << 31)
+#define	HOST_IRQ_EN		(1 << 1)
+#define	HOST_RESET		(1 << 0)
+#define HOST_HSORG	0xFC
+#define HSORG_DISABLE_SLOTGRP_INTR (1<<24)
+#define HSORG_DISABLE_SLOTGRP_PXIS (1<<16)
+#define HSORG_HWREV 0xFF00
+#define HSORG_STYLE 0x8
+#define HSORG_SLOTGROUPS 0x7
+
+#define PORT_LST_ADDR		0x00
+#define PORT_LST_ADDR_HI	0x04
+#define PORT_FIS_ADDR		0x08
+#define PORT_FIS_ADDR_HI	0x0c
+#define PORT_IRQ_STAT		0x10
+#define PORT_IRQ_EN			0x14
+#define PORT_CMD			0x18
+#define PORT_TFDATA			0x20
+#define PORT_SCR_STAT		0x28
+#define PORT_SCR_CTL		0x2c
+#define PORT_SCR_ERR		0x30
+#define PORT_SACTIVE		0x34
+#define PORT_COMMAND_ISSUE	0x38
+#define PORT_SCR_NTF		0x3c
+#define PORT_SDBV			0x7C
+
+#define PORT_CMD_ICC_ACTIVE	(1 << 28)
+#define PORT_CMD_LIST_ON	(1 << 15)
+#define PORT_CMD_FIS_RX		(1 << 4)
+#define PORT_CMD_CLO		(1 << 3)
+#define PORT_CMD_START		(1 << 0)
+#define PORT_OFFSET			0x100
+#define PORT_MEM_SIZE		0x80
+#define	RX_FIS_D2H_REG		0x40
+#define	RX_FIS_PIO			0x20
+
+#define	PORT_IRQ_COLD_PRES		(1 << 31)
+#define	PORT_IRQ_TF_ERR			(1 << 30)
+#define	PORT_IRQ_HBUS_ERR		(1 << 29)
+#define	PORT_IRQ_HBUS_DATA_ERR	(1 << 28)
+#define	PORT_IRQ_IF_ERR			(1 << 27)
+#define	PORT_IRQ_IF_NONFATAL	(1 << 26)
+#define	PORT_IRQ_OVERFLOW		(1 << 24)
+#define	PORT_IRQ_BAD_PMP		(1 << 23)
+#define	PORT_IRQ_PHYRDY			(1 << 22)
+#define	PORT_IRQ_DEV_ILCK		(1 << 7)
+#define	PORT_IRQ_CONNECT		(1 << 6)
+#define	PORT_IRQ_DPS			(1 << 5)
+#define	PORT_IRQ_UNK_FIS		(1 << 4)
+#define	PORT_IRQ_SDB_FIS		(1 << 3)
+#define	PORT_IRQ_DMAS_FIS		(1 << 2)
+#define	PORT_IRQ_PIOS_FIS		(1 << 1)
+#define	PORT_IRQ_D2H_REG_FIS	(1 << 0)
+
+#define	AHCI_CMD_WRITE			(1 << 6)
+#define	AHCI_CMD_PREFETCH		(1 << 7)
+
+#define	PORT_IRQ_FREEZE\
+	(PORT_IRQ_HBUS_ERR | PORT_IRQ_IF_ERR | PORT_IRQ_CONNECT |\
+	PORT_IRQ_PHYRDY | PORT_IRQ_UNK_FIS | PORT_IRQ_BAD_PMP)
+#define	PORT_IRQ_ERROR\
+	(PORT_IRQ_FREEZE | PORT_IRQ_TF_ERR | PORT_IRQ_HBUS_DATA_ERR |\
+	PORT_IRQ_IF_NONFATAL | PORT_IRQ_OVERFLOW)
+#define DEF_PORT_IRQ\
+	(PORT_IRQ_ERROR | PORT_IRQ_SDB_FIS | PORT_IRQ_DMAS_FIS |\
+	PORT_IRQ_PIOS_FIS | PORT_IRQ_D2H_REG_FIS)
+
+/* made-up magic product numbers.*/
+#define PRODUCT_UNKNOWN  0x00
+#define PRODUCT_OLDFPGA  0x11
+#define PRODUCT_ASICFPGA 0x12
+
+/*void restart_port(struct port *);*/
+
+static int exec_internal_command_polled(struct port *port,
+					void *fis,
+					int fisLen,
+					dma_addr_t buffer,
+					int bufLen,
+					u32 opts,
+					unsigned long timeout);
+
+/**
+ * @brief Display a buffer as hex.
+ *
+ * @param buffer Pointer to the buffer to be displayed.
+ * @param len Number of bytes to display.
+ *
+ * @return N/A
+ */
+static void dump_buffer(void *buffer, int len)
+{
+	int i;
+	unsigned char *data = buffer;
+
+	for (i = 0; i < len; i++) {
+		if ((i%8) == 0)
+			printk(KERN_INFO "\n");
+		printk(KERN_INFO "0x%02x ", data[i]);
+	}
+	printk(KERN_INFO "\n");
+}
+
+/**
+ * @brief Dump mmio register values.
+ *
+ * @param mmio Starting mmio register address.
+ * @param numRegs Number of 32 bit registers to dump.
+ *
+ * @return N/A
+ */
+static void UNU dump_regs(void __iomem *mmio, int numRegs)
+{
+	int n;
+	printk(KERN_INFO "%s:\n", __func__);
+	for (n = 0; n < numRegs; n++)
+		printk(KERN_INFO "0x%p = 0x%08x\n",
+				 mmio + (n * 4),
+				 readl(mmio + (n * 4)));
+}
+
+/**
+ * @brief Obtain an empty command slot.
+ *
+ * This function needs to be reentrant since it could be called
+ * at the same time on multiple CPUs. The allocation of the
+ * command slot must be atomic.
+ *
+ * @param port Pointer to the port data structure.
+ *
+ * @retval >=0 Index of command slot obtained.
+ * @retval -1 No command slots available.
+ */
+static int get_slot(struct port *port)
+{
+	int slot, ii;
+	unsigned int num_command_slots = port->dd->slot_groups * 32;
+
+	/* Try 10 times, because there is a small race here.
+	   that's ok, because it's still cheaper than a lock.*/
+
+	for (ii = 0; ii < 10; ii++) {
+		slot = find_next_zero_bit(port->allocated,
+					 num_command_slots, 1);
+		if (!test_and_set_bit(slot, port->allocated))
+			return slot;
+	}
+	printk(KERN_ERR "get_slot() failed to get a tag.\n");
+
+
+      /*
+       * Check for the device present
+       */
+	if (check_for_surprise_removal(port->dd->pdev)) {
+
+		/* Device is not present clean the outstanding command */
+		command_cleanup(port->dd);
+	}
+	return FAILURE;
+}
+
+static inline void release_slot(struct port *port, int tag)
+{
+	smp_mb__before_clear_bit();
+	clear_bit(tag, port->allocated);
+	smp_mb__after_clear_bit();
+}
+
+
+
+#ifdef COMMAND_TIMEOUT
+/**
+ * @brief Called periodically to see if any read/write commands are
+ * taking too long to complete.
+ *
+ * @param data Pointer to the PORT data structure.
+ *
+ * @return N/A
+ */
+void timeout_function(unsigned long int data)
+{
+	struct port *port = (struct port *) data;
+	struct HOST_TO_DEV_FIS *fis;
+	struct COMMAND *command;
+	int tag;
+	int reset_issued = 0;
+	unsigned int num_command_slots = port->dd->slot_groups * 32;
+
+	/*printk("%s: Checking for timeouts\n", __func__);*/
+
+	for (tag = 0; tag < num_command_slots; tag++) {
+		/*
+		 * Do not check the internal command slot.
+		 */
+		if (tag == TAG_INTERNAL)
+			continue;
+
+		if (atomic_read(&port->commands[tag].active) &&
+		   (time_after(jiffies, port->commands[tag].compTime))) {
+
+			unsigned int group = tag >> 5;
+			unsigned int bit = tag & 0x1f;
+
+			command = &port->commands[tag];
+			fis = (struct HOST_TO_DEV_FIS *) command->command;
+
+			printk(KERN_WARNING "%s:timeout for command tag %d\n",
+						 __func__, tag);
+
+			/* Clear the Completed bit; this should prevent any
+			 * interrupt handlers from trying to retire the
+			 *  command.
+			 */
+			writel(1<<bit, port->Completed[group]);
+
+			/*
+			 * Call the async completion callback.
+			 */
+			if (likely(command->asyncCallback))
+				command->asyncCallback(command->asyncData,
+							 -EIO);
+			command->asyncCallback = NULL;
+			command->completionFunc = NULL;
+
+			/*
+			 *Unmap the DMA scatter list entries for read and
+			 *write commands
+			 */
+			if (fis->command == ATA_CMD_FPDMA_WRITE)
+				dma_unmap_sg(&port->dd->pdev->dev, command->sg,
+						 command->scatterEnts,
+						 DMA_TO_DEVICE);
+			else if (fis->command == ATA_CMD_FPDMA_READ)
+				dma_unmap_sg(&port->dd->pdev->dev, command->sg,
+						 command->scatterEnts,
+						 DMA_FROM_DEVICE);
+
+			if (ISSUE_COMRESET_ON_TIMEOUT && !reset_issued) {
+				printk(KERN_WARNING " Issuing port reset for command timeout");
+				restart_port(port);
+				reset_issued = 1;
+			}
+
+			/*
+			 * Clear the allocated bit and active tag for the
+			 * command.
+			 */
+			atomic_set(&port->commands[tag].active, 0);
+			release_slot(port, tag);
+
+			up(&port->commandSlot);
+		}
+	}
+
+	/*
+	 * Start the timer again.
+	 */
+	mod_timer(&port->commandTimer,
+			 jiffies + msecs_to_jiffies(TIMEOUT_CHECK_PERIOD));
+}
+#endif
+
+/**
+ * @brief Asynchronous read completion function.
+ *
+ * This completion function is called by the driver ISR when a read
+ * command that was issued by the kernel completes. It first calls the
+ * asynchronous completion function which normally calls back into the block
+ * layer passing the asynchronous callback data, then unmaps the
+ * scatter list associated with the completed command, and finally
+ * clears the allocated bit associated with the completed command.
+ *
+ * @param port Pointer to the port data structure.
+ * @param tag Tag of the read command that has completed.
+ * @param data Read completion data.
+ * @param status Completion status.
+ *
+ * @return N/A
+ */
+static void async_read_complete(struct port *port,
+				int tag,
+				void *data,
+				int status)
+{
+	struct driver_data *dd = data;
+	struct COMMAND *command = &port->commands[tag];
+
+	if (unlikely(status == PORT_IRQ_TF_ERR))
+		printk(KERN_WARNING "%s: Command tag %d failed due to TFE\n",
+					__func__, tag);
+
+	/*
+	 * Call the async completion callback.
+	 */
+	if (likely(command->asyncCallback))
+		command->asyncCallback(command->asyncData, status);
+
+	command->asyncCallback = NULL;
+	command->completionFunc = NULL;
+
+	/*
+	 * Unmap the DMA scatter list entries.
+	 */
+	dma_unmap_sg(&dd->pdev->dev,
+				command->sg,
+				command->scatterEnts,
+				DMA_FROM_DEVICE);
+
+	/*
+	 * Clear the allocated and active bits for the command.
+	 */
+	atomic_set(&port->commands[tag].active, 0);
+	release_slot(port, tag);
+
+	up(&port->commandSlot);
+}
+
+/**
+ * @brief Asynchronous write completion function.
+ *
+ * This completion function is called by the driver ISR when a write
+ * command that was issued by the kernel completes. It first calls the
+ * asynchronous completion function which normally calls back into the block
+ * layer passing the asynchronous callback data, then unmaps the
+ * scatter list associated with the completed command, and finally
+ * clears the allocated bit associated with the completed command.
+ *
+ * @param port Pointer to the port data structure.
+ * @param tag Tag of the write command that has completed.
+ * @param data Write completion data.
+ * @param status Completion status.
+ *
+ * @return N/A
+ */
+static void async_write_complete(struct port *port,
+				int tag,
+				void *data,
+				int status)
+{
+	struct driver_data *dd = data;
+	struct COMMAND *command = &port->commands[tag];
+
+	if (status == PORT_IRQ_TF_ERR)
+		printk(KERN_WARNING "%s: Command tag %d did not completed due to TFE\n",
+			__func__, tag);
+
+	/*
+	 * Call the async completion callback.
+	 */
+	if (likely(command->asyncCallback))
+		command->asyncCallback(command->asyncData, status);
+
+	command->asyncCallback = NULL;
+	command->completionFunc = NULL;
+
+	/*
+	 * Unmap the DMA scatter list entries.
+	 */
+	dma_unmap_sg(&dd->pdev->dev,
+			command->sg,
+			command->scatterEnts,
+			DMA_TO_DEVICE);
+
+	/*
+	 *Clear the allocated and active bits for the command.
+	 */
+	atomic_set(&port->commands[tag].active, 0);
+	release_slot(port, tag);
+
+	up(&port->commandSlot);
+}
+
+/**
+ * @brief Internal command completion callback function.
+ *
+ * This function is normally called by the driver ISR when an internal
+ * command completed. This function signals the command completion by
+ * calling complete().
+ *
+ * @param port Pointer to the port data structure.
+ * @param tag Tag of the command that has completed.
+ * @param data Pointer to a completion structure.
+ * @param status Completion status.
+ *
+ * @return N/A
+ */
+static void completion(struct port *port, int tag, void *data, int status)
+{
+	struct COMMAND *command = &port->commands[tag];
+	struct completion *waiting = data;
+	if (status == PORT_IRQ_TF_ERR)
+		printk(KERN_WARNING "%s: Command %d completed with TFE\n",
+						__func__, tag);
+
+	command->asyncCallback = NULL;
+	command->completionFunc = NULL;
+
+	complete(waiting);
+}
+
+/**
+ * @brief Enable/disable the reception of FIS.
+ *
+ * @param port Pointer to the port data structure.
+ * @param enable New state, 1 enabled, 0 disabled.
+ *
+ * @return Previous state of the FIS, 1 enabled, 0 disabled.
+ */
+static int enable_FIS(struct port *port, int enable)
+{
+	u32 tmp;
+	/* enable FIS reception */
+	tmp = readl(port->mmio + PORT_CMD);
+	if (enable)
+		writel(tmp | PORT_CMD_FIS_RX, port->mmio + PORT_CMD);
+	else
+		writel(tmp & ~PORT_CMD_FIS_RX, port->mmio + PORT_CMD);
+
+	readl(port->mmio + PORT_CMD);
+	return (((tmp & PORT_CMD_FIS_RX) == PORT_CMD_FIS_RX));
+}
+
+/**
+ * @brief Enable/disable the DMA engine returning the previous state.
+ *
+ * @param port Pointer to the port data structure.
+ * @param enable New state, 1 enabled, 0 disabled.
+ *
+ * @return Previous state of the DMA engine, 1 enabled, 0 disabled.
+ */
+static int enable_engine(struct port *port, int enable)
+{
+	u32 tmp;
+	/* enable FIS reception */
+	tmp = readl(port->mmio + PORT_CMD);
+	if (enable)
+		writel(tmp | PORT_CMD_START, port->mmio + PORT_CMD);
+	else
+		writel(tmp & ~PORT_CMD_START, port->mmio + PORT_CMD);
+
+	readl(port->mmio + PORT_CMD);
+	return (((tmp & PORT_CMD_START) == PORT_CMD_START));
+}
+
+/**
+ * @brief Make a port active.
+ *
+ * This function enables the port DMA engine and FIS reception.
+ *
+ * @return N/A
+ */
+static void start_port(struct port *port)
+{
+	/*
+	 * Enable FIS reception.
+	 */
+	enable_FIS(port, 1);
+
+	/*
+	 * Enable the DMA engine.
+	 */
+	enable_engine(port, 1);
+}
+
+/**
+ * @brief Deinitialize a port.
+ *
+ * Deinitialize a port by disabling port interrupts, the DMA engine,
+ * and FIS reception.
+ *
+ * @param port Pointer to the port structure.
+ *
+ * @return N/A
+ */
+static void deinit_port(struct port *port)
+{
+	/*
+	 * Disable interrupts on this port.
+	 */
+	writel(0, port->mmio + PORT_IRQ_EN);
+
+	/*
+	 * Disable the DMA engine.
+	 */
+	enable_engine(port, 0);
+
+	/*
+	 * Disable FIS reception.
+	 */
+	enable_FIS(port, 0);
+}
+
+/**
+ * @brief Initialize a port.
+ *
+ * This function deinitializes the port by calling deinit_port() and then
+ * initializes it by setting the command header and RX FIS addresses,
+ * clearing the SError register and any
+ * pending port interrupts before re-enabling the default set of
+ * port interrupts.
+ *
+ * @param port Pointer to the port structure.
+ *
+ * @return N/A
+ */
+static void init_port(struct port *port)
+{
+	int ii;
+	deinit_port(port);
+
+	/*
+	 * Program the command list base and FIS base addresses.
+	 */
+	if (readl(port->dd->mmio + HBA_CAPS) & HOST_CAP_64) {
+		writel((port->commandListDMA >> 16) >> 16,
+			 port->mmio + PORT_LST_ADDR_HI);
+		writel((port->rxFISDMA >> 16) >> 16,
+			 port->mmio + PORT_FIS_ADDR_HI);
+	}
+
+	writel(port->commandListDMA & 0xffffffff, port->mmio + PORT_LST_ADDR);
+	writel(port->rxFISDMA & 0xffffffff, port->mmio + PORT_FIS_ADDR);
+
+
+	/*
+	 * Clear SError.
+	 */
+	writel(readl(port->mmio + PORT_SCR_ERR), port->mmio + PORT_SCR_ERR);
+
+	/*reset the Completed registers.*/
+	for (ii = 0; ii < port->dd->slot_groups; ii++)
+		writel(0xFFFFFFFF, port->Completed[ii]);
+
+
+	/*
+	 * Clear any pending interrupts for this port.
+	 */
+	writel(readl(port->mmio + PORT_IRQ_STAT), port->mmio + PORT_IRQ_STAT);
+
+	/*
+	 * Enable port interrupts.
+	 */
+	writel(DEF_PORT_IRQ, port->mmio + PORT_IRQ_EN);
+}
+
+/**
+ * @brief Reset the HBA (without sleeping)
+ *
+ * Just like hbaReset, except does not call sleep, so can be
+ * run from interrupt/tasklet context.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @retval 0 The reset was successful.
+ * @retval -1 The HBA Reset bit did not clear.
+ */
+int hbaReset_nosleep(struct driver_data *dd)
+{
+	unsigned long timeout;
+	mdelay(10);
+	writel(HOST_RESET, dd->mmio + HOST_CTRL);
+	timeout = jiffies + msecs_to_jiffies(1000);
+	mdelay(10);
+	while ((readl(dd->mmio + HOST_CTRL) & HOST_RESET)
+		 && time_before(jiffies, timeout))
+		mdelay(1);
+	if (readl(dd->mmio + HOST_CTRL) & HOST_RESET)
+		return -1;
+	return 0;
+}
+
+/**
+ * @brief Restart a port after a Task File Error.
+ *
+ * This function is called to restart a port after a Task File Error
+ * has been detected.
+ *
+ * @param port Pointer to the port data structure.
+ *
+ * @return N/A
+ */
+void restart_port(struct port *port)
+{
+	int didReset = 0;
+	unsigned long timeout;
+	struct HOST_TO_DEV_FIS fis;
+
+	/*
+	 * Disable the DMA engine.
+	 */
+	enable_engine(port, 0);
+
+	/*
+	 * Wait for PxCMD.CR == 0
+	 */
+	timeout = jiffies + msecs_to_jiffies(500);
+	while ((readl(port->mmio + PORT_CMD) & PORT_CMD_LIST_ON)
+		 && time_before(jiffies, timeout))
+		;
+
+	if (readl(port->mmio + PORT_CMD) & PORT_CMD_LIST_ON) {
+		printk(KERN_WARNING "%s:PxCMD.CR not clear do HBA reset\n",
+				 __func__);
+		/*Issue HBA reset here.  Don't do any sleeping because we're
+		 *in interrupt context!
+		 */
+		if (hbaReset_nosleep(port->dd))
+			printk(KERN_ERR "HBA Reset escalation failed.\n");
+
+		/*
+		 * no longer any need to do comreset, just exit.
+		*/
+		return;
+	}
+	/*
+	 * Check if a COMRESET is required.
+	 */
+	/*	if (readl(port->mmio + PORT_TFDATA) & (ATA_BUSY | ATA_DRQ))*/
+	{
+		printk(KERN_INFO "%s: Issuing COMRESET\n", __func__);
+		/*
+		 * Set PxSCTL.DET
+		 */
+		writel(readl(port->mmio + PORT_SCR_CTL) |
+				 1, port->mmio + PORT_SCR_CTL);
+		readl(port->mmio + PORT_SCR_CTL);
+
+		/*
+		 * Wait at least 1ms
+		 */
+		timeout = jiffies + msecs_to_jiffies(1);
+		while (time_before(jiffies, timeout))
+			;
+
+		/*
+		 * Clear PxSCTL.DET
+		 */
+		writel(readl(port->mmio + PORT_SCR_CTL) & ~1,
+				 port->mmio + PORT_SCR_CTL);
+		readl(port->mmio + PORT_SCR_CTL);
+
+		/*
+		 * Wait for bit 0 of PORT_SCR_STS to be set.
+		 */
+		timeout = jiffies + msecs_to_jiffies(500);
+		while (((readl(port->mmio + PORT_SCR_STAT) & 0x01) == 0)
+				 && time_before(jiffies, timeout))
+			;
+
+		if ((readl(port->mmio + PORT_SCR_STAT) & 0x01) == 0)
+			printk(KERN_WARNING "%s:PORT_SCR_STAT bit 1 not set\n",
+						__func__);
+
+		didReset = 1;
+	}
+
+	/*
+	 * Clear SError, the PxSERR.DIAG.x should be set so clear it.
+	 */
+	writel(readl(port->mmio + PORT_SCR_ERR), port->mmio + PORT_SCR_ERR);
+
+	/*
+	 * Enable the DMA engine.
+	 */
+	enable_engine(port, 1);
+
+	/*
+	 * Issue Read Log Ext.
+	 */
+	if (!didReset) {
+
+		/*
+		 * Build the FIS.
+		 */
+		memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+		fis.type		= 0x27;
+		fis.opts		= 1 << 7;
+		fis.command		= ATA_CMD_READ_LOG_EXT;
+		fis.sectCount	= 1;
+		fis.LBALow		= ATA_LOG_SATA_NCQ;
+		fis.device		= ATA_DEVICE_OBS;
+
+		memset(port->sectorBuffer, 0, ATA_SECT_SIZE);
+
+		printk(KERN_INFO "%s:executing read logpage 10h to clear err\n"
+				 , __func__);
+		/*
+		 * Execute the command.
+		 */
+		/*
+			if (exec_internal_command_polled(port,
+				&fis,
+				5,
+				port->sectorBufferDMA,
+				ATA_SECT_SIZE,
+				0, 100))
+		*/
+		if (exec_internal_command_polled(port,
+						 &fis,
+						 5,
+						 port->sectorBufferDMA,
+						 ATA_SECT_SIZE,
+						 (1<<10), 100))
+			printk(KERN_WARNING "%s: Error issuing ReadLogExt\n",
+						 __func__);
+
+		{
+			int n;
+			unsigned char *buf =
+				(unsigned char *) port->sectorBuffer;
+			for (n = 0; n < 13; n++)
+				printk(KERN_INFO "%s: 0x%02x\n",
+							 __func__, buf[n]);
+		}
+	}
+
+}
+
+static inline void issue_command(struct port *, int);
+
+static void print_tags(struct driver_data *dd,
+				char *msg,
+				unsigned long *tagbits)
+{
+	unsigned int tag, count = 0;
+	for (tag = 0; tag < (dd->slot_groups)*32; tag++) {
+		if (test_bit(tag, tagbits))
+			count++;
+	}
+	if (count)
+		printk(KERN_INFO "%s [%i tags]\n", msg, count);
+}
+
+/**
+ * @brief Handle a Task File Error.
+ *
+ * @param dd Pointer to the DRIVER_DATA structure.
+ *
+ * @retval 0
+ */
+static int handleTFE(struct driver_data *dd)
+{
+	int group;
+	int tag;
+	int bit;
+	struct port *port ;
+	struct COMMAND  *command;
+	u32 completed;
+	/* used to accumulate tag bits for log messages*/
+	unsigned long tagaccum[SLOTBITS_IN_LONGS];
+
+	/*
+	 * Grab the lock to prevent any more commands from
+	 * being issued.printk(KERN_WARNING "%s: Error issuing ReadLogExt
+	 */
+	/*
+	 *	FIXME: Spinlock CILock went away!
+	 *	Need something to take its place.
+	*/
+	printk(KERN_WARNING "Taskfile error!\n");
+
+	port = dd->port;
+#ifdef COMMAND_TIMEOUT
+	/*
+	 * Stop the timer to prevent command timeouts.
+	 */
+	del_timer(&port->commandTimer);
+#endif
+
+	/*
+	* Loop through all the groups.
+	*/
+	for (group = 0; group < dd->slot_groups; group++) {
+		completed = readl(port->Completed[group]);
+		/* clear completed status register in the hardware.*/
+		writel(completed, port->Completed[group]);
+		/* clear the tag accumulator*/
+		memset(tagaccum, 0, SLOTBITS_IN_LONGS * sizeof(long));
+
+		/*
+		 * Process successfully completed commands.
+		 */
+		for (bit = 0; bit < 32 && completed; bit++) {
+			if (completed & (1<<bit)) {
+				tag = (group << 5) + bit;
+				/*
+				 * Do not process the internal command slot.
+				 */
+				if (tag == TAG_INTERNAL)
+					continue;
+
+				command = &port->commands[tag];
+				if (likely(command->completionFunc)) {
+					set_bit(tag, tagaccum);
+					command->completionFunc(port,
+						 tag,
+						 command->completionData,
+						 0);
+				} else {
+					printk(KERN_WARNING "%s:completion function isNULL, tag=%d\n",
+								__func__, tag);
+				       /*
+					* Check for the device present
+					*/
+					if (check_for_surprise_removal(
+							dd->pdev)) {
+						/*
+						* Device is not present
+						* clean the outstanding
+						* command
+						*/
+						command_cleanup(dd);
+						/*
+						 * Stop executing further
+						 * process in driver
+						*/
+						return SUCCESS;
+					}
+
+				}
+			}
+		}
+	}
+	print_tags(dd, "TFE tags completed:", tagaccum);
+
+
+	if (ISSUE_COMRESET_ON_TFE) {
+		mdelay(20);
+		restart_port(port);
+	}
+
+	/*
+	 * If software failure emulation is enabled. If the error.
+	 */
+	if (dd->makeItFail & 0x03) {
+		struct HOST_TO_DEV_FIS	*fis =
+		(struct HOST_TO_DEV_FIS *)port->commands[
+					dd->makeItFailTag].command;
+
+		fis->LBALow   = (dd->makeItFailStart & 0xff);
+		fis->LBAMid   = (dd->makeItFailStart >> 8) & 0xff;
+		fis->LBAHi    = (dd->makeItFailStart >> 16) & 0xff;
+		fis->LBALowEx = (dd->makeItFailStart >> 24) & 0xff;
+		fis->LBAMidEx = (dd->makeItFailStart >> 32) & 0xff;
+		fis->LBAHiEx  = (dd->makeItFailStart >> 40) & 0xff;
+	} else {
+		/*
+		 * Read log page 10h to determine cause of error.
+		 */
+	}
+
+	/* clear the tag accumulator*/
+	memset(tagaccum, 0, SLOTBITS_IN_LONGS * sizeof(long));
+
+	/*
+	 * Loop through all the groups.
+	 */
+	for (group = 0; group < dd->slot_groups; group++) {
+		for (bit = 0; bit < 32; bit++) {
+			int reissue = 0;
+			tag = (group << 5) + bit;
+			/*
+			 * If the active bit is set re-issue the command.
+			 */
+			if (atomic_read(&port->commands[tag].active)) {
+				/* Should  re-issue an internal/ioctl command*/
+				if (tag == TAG_INTERNAL) {
+					if (REISSUE_INT_COMMANDS_ON_ERR
+					 && port->internalCommandInProgress)
+						reissue = 1;
+				} else {
+				/* Should we re-issue an NCQ command?*/
+					if (REISSUE_NCQ_COMMANDS_ON_ERR)
+						reissue = 1;
+				}
+
+				if (reissue) {
+					/* First check if this command has
+					 *  exceeded its retries.
+					 *  */
+					if (!port->commands[tag].retries--) {
+						/* command will be retired with
+						 * a failure code below.*/
+						reissue = 0;
+					} else {
+						/*printk(KERN_INFO "%s:
+						 * Reissue tag %d\n",
+						 * __func__, tag);*/
+						set_bit(tag, tagaccum);
+
+#ifdef COMMAND_TIMEOUT
+						/*
+						 * Update the timeout value.
+						 */
+						port->commands[tag].compTime =
+						jiffies + msecs_to_jiffies(
+						NCQ_COMMAND_TIMEOUT_MS);
+#endif
+						/*
+						 * Re-issue the command.
+						 */
+						issue_command(port, tag);
+					}
+				}
+				/* Here we retire a command that will not be
+				 * reissued.*/
+				if (!reissue) {
+					printk(KERN_WARNING "retiring tag %d\n",
+								 tag);
+					atomic_set(&port->commands[tag].active,
+							 0);
+
+				if (port->commands[tag].completionFunc)
+					port->commands[tag].completionFunc(
+					port,
+					tag,
+					port->commands[tag].completionData,
+					PORT_IRQ_TF_ERR);
+				else
+					printk(KERN_WARNING "%s: tag %d completion function is NULL\n",
+							__func__, tag);
+				}
+			}
+		}
+	}
+	print_tags(dd, "TFE tags reissued:", tagaccum);
+
+#ifdef COMMAND_TIMEOUT
+	mod_timer(&port->commandTimer,
+		 jiffies + msecs_to_jiffies(TIMEOUT_CHECK_PERIOD));
+#endif
+
+	/*
+	 * Allow commands to continue.
+	 */
+	/*spin_unlock(&dd->CILock);*/
+
+	return 0;
+}
+
+/**
+ * @brief Handle a Set Device Bits interrupt.
+ *
+ * This function performs retirement of completed commands.
+ *
+ * @param dd Pointer to the DRIVER_DATA structure.
+ *
+ * @retval 0
+ */
+static void process_SDBF(struct driver_data *dd)
+{
+	struct port  *port = dd->port;
+	int group, tag, bit;
+	u32 completed;
+	struct COMMAND *command;
+
+	for (group = 0; group < dd->slot_groups; group++) {
+		completed = readl(port->Completed[group]);
+
+		/* clear completed status register in the hardware.*/
+		writel(completed, port->Completed[group]);
+
+		/*
+		 * Process completed commands.
+		 */
+		for (bit = 0; bit < 32 && completed; bit++) {
+			if (completed & 0x01) {
+				tag = (group << 5) | bit;
+				/*
+				 * Do not process the internal command slot.
+				 */
+				if (likely(tag != TAG_INTERNAL)) {
+					command = &port->commands[tag];
+					if (likely(command->completionFunc))
+						command->completionFunc(
+						port,
+						 tag,
+						 command->completionData,
+						 0);
+					else{
+						printk(KERN_WARNING "%s: While"
+						" processing PORT_IRQ_SDB_FIS,"
+						" completion function is NULL,"
+						" tag = %d!\n", __func__, tag);
+
+						/*
+						*Check for the device present
+						*/
+						if (check_for_surprise_removal(
+								dd->pdev)) {
+							/*
+							 * Device is not
+							 * present clean
+							 * the outstanding
+							 * command
+							*/
+							command_cleanup(dd);
+							return ;
+						}
+
+					}
+				}
+			}
+			completed >>= 1;
+		}
+	}
+}
+
+
+
+
+
+#if USE_TASKLET
+static void tasklet_proc(unsigned long data)
+#else
+static inline irqreturn_t process_IRQ(struct driver_data *data)
+#endif
+{
+	struct driver_data *dd = (struct driver_data *) data;
+	struct port *port = dd->port;
+	u32 hbaStat;
+	int portStat;
+	int oldIrqBugWorkaround = 0;
+#if !USE_TASKLET
+	int rv = IRQ_NONE;
+#endif
+
+	if (dd->product_type == PRODUCT_OLDFPGA)
+		oldIrqBugWorkaround = 1;
+
+	hbaStat = readl(dd->mmio + HOST_IRQ_STAT);
+	if (hbaStat) {
+#if !USE_TASKLET
+		rv = IRQ_HANDLED;
+#endif
+		if (oldIrqBugWorkaround) {
+			/* Acknowledge the interrupt status on the HBA.*/
+			writel(hbaStat, dd->mmio + HOST_IRQ_STAT);
+		}
+
+		/* Acknowledge the interrupt status on the port.*/
+		portStat = readl(port->mmio + PORT_IRQ_STAT);
+		writel(portStat, port->mmio + PORT_IRQ_STAT);
+		/*printk("PORT_IRQ_STAT = 0x%08x\n", portStat);*/
+
+		if (likely(portStat & PORT_IRQ_SDB_FIS))
+			process_SDBF(dd);
+
+
+		if (unlikely(portStat & (PORT_IRQ_TF_ERR | PORT_IRQ_IF_ERR))) {
+			if (unlikely(portStat & PORT_IRQ_IF_ERR))
+				printk(KERN_WARNING "PORT_IRQ_IF_ERR.\n");
+
+
+			/*
+			* Check for the device presence
+			*/
+			if (check_for_surprise_removal(dd->pdev)) {
+				/*
+				* Device is not presence clean the
+				* outstanding command
+				*/
+				command_cleanup(dd);
+				/*
+				* Stop executing further process in driver
+				*/
+
+				return ;
+			}
+
+			handleTFE(dd);
+		}
+		if (unlikely(portStat & PORT_IRQ_CONNECT)) {
+			printk(KERN_INFO "%s: Clearing PxSERR.DIAG.x\n",
+						 __func__);
+			/*
+			 * Clear PxSERR.DIAG.x
+			 */
+			writel((1<<26), port->mmio + PORT_SCR_ERR);
+		}
+
+		if (unlikely(portStat & PORT_IRQ_PHYRDY)) {
+			printk(KERN_INFO "%s: Clearing PxSERR.DIAG.n\n",
+					 __func__);
+			/*
+			 * Clear PxSERR.DIAG.n
+			 */
+			writel((1<<16), port->mmio + PORT_SCR_ERR);
+		}
+
+		if (unlikely(portStat & PORT_IRQ_DMAS_FIS))
+			printk(KERN_WARNING "Got DMA FIS\n");
+
+		/*if (unlikely(portStat & PORT_IRQ_DPS))
+		printk(KERN_WARNING "Got DPS IRQ, CI=%x\n",
+		readl(port->CommandIssue[0]));
+		*/
+		if (unlikely(portStat & PORT_IRQ_PIOS_FIS)) {
+			/*printk(KERN_WARNING "Got PIOS FIS IRQ,
+			 *  CI=%x\n", readl(port->CommandIssue[0]));*/
+			if (port->internalCommandInProgress) {
+				if (!(readl(
+					port->CommandIssue[TAG_INDEX(
+						TAG_INTERNAL)])
+						 & (1 << TAG_BIT(TAG_INTERNAL)
+						))) {
+					if (port->commands[
+						TAG_INTERNAL].completionFunc)
+						port->commands[
+						TAG_INTERNAL].completionFunc(
+						port,
+						TAG_INTERNAL,
+						port->commands[
+						TAG_INTERNAL].completionData,
+						0);
+					else
+						printk(KERN_WARNING "%s:"
+							"internal"
+							" command  completion"
+							" function is NULL,"
+							" tag = %d!\n",
+							 __func__,
+							TAG_INTERNAL);
+				}
+			} else
+				printk(KERN_WARNING "Hmmm, got PIOS_FIS but no"
+							"internal"
+							" command is"
+							" outstanding!\n");
+		}
+
+		if (unlikely(portStat & PORT_IRQ_D2H_REG_FIS)) {
+			if (port->internalCommandInProgress) {
+				if (port->commands[TAG_INTERNAL].completionFunc)
+					port->commands[
+					TAG_INTERNAL].completionFunc(
+					port,
+					TAG_INTERNAL,
+					port->commands[
+					TAG_INTERNAL].completionData,
+					 0);
+				else
+					printk(KERN_WARNING "%s: While"
+						"processing"
+						"PORT_IRQ_D2H_REG_FIS,"
+						"completion function is"
+						"NULL, tag = %d!\n",
+						 __func__, TAG_INTERNAL);
+			} else {
+				printk(KERN_WARNING "Hmmm, got D2H_REG_FIS"
+						" but no internal"
+						" command is outstanding!\n");
+				dump_buffer(dd->port->rxFIS + RX_FIS_D2H_REG,
+						 20);
+			}
+		}
+		if (unlikely(portStat & PORT_IRQ_HBUS_ERR))
+			printk(KERN_WARNING "PORT_IRQ_HBUS_ERR unhandled.\n");
+
+		if (unlikely(portStat & PORT_IRQ_UNK_FIS))
+			printk(KERN_WARNING "PORT_IRQ_UNK_FIS unhandled.\n");
+
+		if (unlikely(portStat & PORT_IRQ_BAD_PMP))
+			printk(KERN_WARNING "PORT_IRQ_BAD_PMP unhandled.\n");
+
+		if (unlikely(portStat & PORT_IRQ_HBUS_DATA_ERR))
+			printk(KERN_WARNING "PORT_IRQ_HBUS_DATA_ERR unhandled.\n");
+
+		if (unlikely(portStat & PORT_IRQ_IF_NONFATAL))
+			printk(KERN_WARNING "PORT_IRQ_IF_NONFATAL unhandled.\n");
+
+		if (unlikely(portStat & PORT_IRQ_OVERFLOW))
+			printk(KERN_WARNING "PORT_IRQ_OVERFLOW unhandled.\n");
+
+	}
+
+	if (!oldIrqBugWorkaround) {
+		/*Acknowledge the interrupt status on the HBA.*/
+		writel(hbaStat, dd->mmio + HOST_IRQ_STAT);
+	}
+
+#if USE_TASKLET
+
+#else
+	return rv;
+#endif
+}
+
+/**
+ * @brief HBA interrupt subroutine.
+ *
+ * @param irq IRQ number.
+ * @param dev_instance Pointer to the driver data structure.
+ *
+ * @retval IRQ_HANDLED A HBA interrupt was pending and handled.
+ * @retval IRQ_NONE This interrupt was not for the HBA.
+ */
+static irqreturn_t irq_handler(int irq, void *dev_instance)
+{
+	struct driver_data *dd = dev_instance;
+	atomic_inc(&dd->statistics.interrupts);
+#if USE_TASKLET
+	tasklet_schedule(&dd->tasklet);
+	return IRQ_HANDLED;
+#else
+	return process_IRQ(dd);
+#endif
+}
+
+/**
+ * @brief Dump the contents of a command header.
+ *
+ * @param hdr Pointer to the command header to dump.
+ * @param index Index of the command header to dump.
+ *
+ * @return N/A
+ */
+static void UNU dump_CmdHdr(void *hdr, int index)
+{
+	unsigned int *ptr = (unsigned int *) hdr;
+
+	printk(KERN_INFO "Command Header %d:\n", index);
+	printk(KERN_INFO "dw0: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw1: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw2: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw3: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw4: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw5: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw6: 0x%08x\n", *ptr++);
+	printk(KERN_INFO "dw7: 0x%08x\n\n", *ptr++);
+}
+
+static void issue_non_NCQ_command(struct port *port, int tag)
+{
+	atomic_set(&port->commands[tag].active, 1);
+	writel(1 << TAG_BIT(tag), port->CommandIssue[TAG_INDEX(tag)]);
+}
+
+
+/**
+ * @brief Issue a command to the hardware.
+ *
+ * Set the appropriate bit in the SActive and Command Issue hardware
+ * registers, causing hardware command processing to begin.
+ *
+ * @param port Pointer to the port structure.
+ * @param tag The tag of the command to be issued.
+ *
+ * @return N/A
+ */
+static inline void issue_command(struct port *port, int tag)
+{
+	unsigned long flags = 0;
+	int need_lock = 0;
+
+	atomic_set(&port->commands[tag].active, 1);
+
+	/* "cmdIssueLock": workaround for a bug involving parallel command
+	 * issue.
+	* if two cpus interleave these two steps the hardware will hang.
+	*/
+	if (port->dd->product_type == PRODUCT_ASICFPGA)
+		need_lock = 1;
+	if (need_lock)
+		spin_lock_irqsave(&port->cmdIssueLock, flags);
+
+	writel((1 << TAG_BIT(tag)), port->SActive[TAG_INDEX(tag)]);
+	writel(1 << TAG_BIT(tag), port->CommandIssue[TAG_INDEX(tag)]);
+
+	if (need_lock)
+		spin_unlock_irqrestore(&port->cmdIssueLock, flags);
+}
+
+/**
+ * @brief Execute an internal command and wait for the completion.
+ *
+ * When calling this function the writer portion of the internalSem
+ * should be held.
+ *
+ * @param port Pointer to the port data structure.
+ * @param fis Pointer to the FIS that describes the command.
+ * @param fisLen Length in WORDS of the FIS.
+ * @param buffer DMA accessible for command data.
+ * @param bufLen Length, in bytes, of the data buffer.
+ * @param timeout Time in ms that this function will wait for the command
+ * to complete.
+ *
+ * @retval 0 Command completed successfully.
+ * @retval -EFAULT The buffer address is not correctly aligned.
+ * @retval -EAGAIN An internal command is already in progress.
+ * @retval -EBUSY A response was not received within the timeout period.
+ */
+static int exec_internal_command(struct port *port,
+				void *fis,
+				int fisLen,
+				dma_addr_t buffer,
+				int bufLen,
+				unsigned long timeout)
+{
+	unsigned int active;
+	struct COMMAND_SG	*commandSG;
+	DECLARE_COMPLETION_ONSTACK(wait);
+	unsigned long to;
+
+	port->internalCommandInProgress = 1;
+
+	/*
+	 * Make sure the buffer is 8 byte aligned. This is
+	 * Cyclone specific.
+	 */
+	if (buffer & 0x00000007) {
+		printk(KERN_ERR "Hold it! The SG buffer is not 8 byte aligned!!!\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Wait for all other commands to complete or a timeout.
+	 */
+	to = jiffies + msecs_to_jiffies(5000);
+	do {
+		int n;
+
+		/*
+		 * Ignore SActive bit 0 of array element 0.
+		 * This bit will always be set
+		 */
+		active = readl(port->SActive[0]) & 0xfffffffe;
+		for (n = 1; n < port->dd->slot_groups; n++)
+			active |= readl(port->SActive[n]);
+
+		if (!active)
+			break;
+
+		msleep(20);
+	} while (time_before(jiffies, to));
+
+	if (active)
+		printk(KERN_WARNING "%s timeout wait for commands to complete\n",
+					__func__);
+
+	/*
+	 * Copy the command to the command table.
+	 */
+	memcpy(port->commands[TAG_INTERNAL].command, fis, fisLen*4);
+
+	port->commands[TAG_INTERNAL].commandHeader->opts = cpu_to_le32(fisLen);
+	/*
+	 * Populate the SG list.
+	 */
+	if (bufLen) {
+		commandSG = port->commands[TAG_INTERNAL].command +
+				 AHCI_CMD_TBL_HDR_SZ;
+		commandSG->info	= cpu_to_le32(((bufLen-1) & 0x3fffff));
+		commandSG->dba		= cpu_to_le32(buffer & 0xffffffff);
+		commandSG->dbaUpper = cpu_to_le32((buffer >> 16) >> 16);
+		port->commands[TAG_INTERNAL].commandHeader->opts |= cpu_to_le32(
+								(1 << 16));
+	}
+	port->commands[TAG_INTERNAL].commandHeader->byteCount = 0;
+
+	/*
+	 * Set the completion function and data for the command.
+	 */
+	port->commands[TAG_INTERNAL].completionData = &wait;
+	port->commands[TAG_INTERNAL].completionFunc = completion;
+
+	/*
+	 * Issue the command to the hardware.
+	 */
+	issue_non_NCQ_command(port, TAG_INTERNAL);
+
+	/*
+	 * Wait for the command to complete or timeout.
+	 */
+	if (wait_for_completion_timeout(
+				&wait,
+				msecs_to_jiffies(timeout)) == 0) {
+		port->internalCommandInProgress = 0;
+		printk(KERN_ERR "Timeout waiting for internal command to complete\n");
+		return -EBUSY;
+	}
+	if (readl(port->CommandIssue[TAG_INDEX(TAG_INTERNAL)])
+			 & (1 << TAG_BIT(TAG_INTERNAL)))
+		printk(KERN_ERR "ERROR: retiring internal command but CI is still 1.\n");
+
+	/* Mark the slot as inactive.*/
+	atomic_set(&port->commands[TAG_INTERNAL].active, 0);
+	port->internalCommandInProgress = 0;
+	return 0;
+}
+
+/**
+ * @brief Execute an internal command and wait for the completion.
+ *
+ * @param port Pointer to the port data structure.
+ * @param fis Pointer to the FIS that describes the command.
+ * @param fisLen Length in WORDS of the FIS.
+ * @param buffer DMA accessible for command data.
+ * @param bufLen Length, in bytes, of the data buffer.
+ * @param opts Command header options, excluding the FIS length and the number of PRD entries.
+ * @param timeout Time in ms to wait for the command to complete.
+ *
+ * @retval 0 Command completed successfully.
+ * @retval -EFAULT The buffer address is not correctly aligned.
+ * @retval -EAGAIN An internal command is already in progress.
+ */
+static int exec_internal_command_polled(struct port *port,
+					void *fis,
+					int fisLen,
+					dma_addr_t buffer,
+					int bufLen,
+					u32 opts,
+					unsigned long timeout)
+{
+	struct COMMAND_SG	*commandSG;
+	int rv = 0;
+
+	/*
+	 * Make sure the buffer is 8 byte aligned. This is
+	 * Cyclone specific.
+	 */
+	if (buffer & 0x00000007) {
+		printk(KERN_ERR "Hold it! The SG buffer is not 8 byte aligned!!!\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Only one internal command should be running at a time.
+	 */
+	if (test_and_set_bit(TAG_INTERNAL, port->allocated)) {
+		printk(KERN_ERR "Internal command already active!\n");
+		return -EAGAIN;
+	}
+
+	/*
+	 * Copy the command to the command table.
+	 */
+	memcpy(port->commands[TAG_INTERNAL].command, fis, fisLen*4);
+
+	/*
+	 * Populate the SG list.
+	 */
+	port->commands[TAG_INTERNAL].commandHeader->opts =
+		 cpu_to_le32(opts | fisLen);
+	if (bufLen) {
+		commandSG = port->commands[TAG_INTERNAL].command +
+					 AHCI_CMD_TBL_HDR_SZ;
+		commandSG->info = cpu_to_le32((bufLen-1) & 0x3fffff);
+		commandSG->dba		= cpu_to_le32(buffer & 0xffffffff);
+		commandSG->dbaUpper =
+			cpu_to_le32((buffer >> 16) >> 16);
+		port->commands[TAG_INTERNAL].commandHeader->opts |=
+					cpu_to_le32((1 << 16));
+	}
+
+	/*
+	 * Populate the command header.
+	 */
+	port->commands[TAG_INTERNAL].commandHeader->byteCount = 0;
+
+	/*
+	 * Set the completion function and data for the command.
+	 */
+	port->commands[TAG_INTERNAL].completionData = NULL;
+	port->commands[TAG_INTERNAL].completionFunc = NULL;
+
+	/*
+	 * Issue the command to the hardware.
+	 */
+	writel(1 << TAG_BIT(TAG_INTERNAL),
+		port->CommandIssue[TAG_INDEX(TAG_INTERNAL)]);
+	timeout = jiffies + msecs_to_jiffies(timeout);
+	while ((readl(port->CommandIssue[TAG_INDEX(TAG_INTERNAL)])
+			& (1 << TAG_BIT(TAG_INTERNAL)))
+			 && time_before(jiffies, timeout))
+		;
+
+	if (readl(port->CommandIssue[TAG_INDEX(TAG_INTERNAL)])
+			& (1 << TAG_BIT(TAG_INTERNAL))) {
+		printk(KERN_ERR "Internal command did not complete!\n");
+		rv = -1;
+	}
+	/*
+	 * Clear the allocated and active bits for the internal command.
+	 */
+	atomic_set(&port->commands[TAG_INTERNAL].active, 0);
+	release_slot(port, TAG_INTERNAL);
+
+	return rv;
+}
+
+/**
+ * @brief Byte-swap ATA ID strings.
+ *
+ * ATA identify data contains strings in byte-swapped 16-bit words.
+ * They must be swapped (on all architectures) to be usable as C strings.
+ * This function swaps bytes in-place.
+ *
+ * @param buf The buffer location of the string
+ * @param len The number of bytes to swap
+ *
+ * @return N/A
+ */
+static void ata_swap_string(u16 *buf, unsigned int len)
+{
+	char *cbuf = (char *)buf;
+	unsigned int ii = 0;
+
+	while (ii < len) {
+		char tmp = cbuf[ii];
+		cbuf[ii] = cbuf[ii+1];
+		cbuf[ii+1] = tmp;
+		ii += 2;
+	}
+}
+
+
+/**
+ * @brief Request the device identity information.
+ *
+ * If a user space buffer is not specified, i.e. is NULL, the
+ * identify information is still read from the drive and placed
+ * into the identify data buffer (@e port->identify) in the port data structure.
+ * When the identify buffer contains valid identify information @e
+ * port->identifyValid is non-zero.
+ *
+ * @param port Pointer to the port structure.
+ * @param userBuffer A user space buffer where the identify data should be copied.
+ *
+ * @retval 0 Command completed successfully.
+ * @retval -EFAULT An error occurred while coping data to the user space buffer.
+ * @retval -1 Command failed.
+ */
+static int get_identify(struct port *port, void __user *userBuffer)
+{
+	int rv = 0;
+	struct HOST_TO_DEV_FIS	fis;
+
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_ID_ATA;
+
+	/*
+	 * Set the identify information as invalid.
+	 */
+	clear_bit(0, (unsigned long *) &port->identifyValid);
+
+	/*
+	 * Clear the identify information.
+	 */
+	memset(port->identify, 0, sizeof(u16) * ATA_ID_WORDS);
+
+	/*
+	 * Execute the command.
+	 */
+	if (exec_internal_command(port,
+				&fis,
+				5,
+				port->identifyDMA,
+				sizeof(u16) * ATA_ID_WORDS,
+				INTERNAL_COMMAND_TIMEOUT_MS)
+				< 0) {
+		rv = -1;
+		goto out;
+	}
+	/* Perform any necessary byte-swapping.  Yes, the kernel does in fact
+	 * perform field-sensitive swapping on the string fields.
+	 *See the kernel use of ata_id_string() for proof of this.
+	 */
+#ifdef __LITTLE_ENDIAN
+	ata_swap_string(port->identify + 27, 40);  /* model string*/
+	ata_swap_string(port->identify + 23, 8);   /* firmware string*/
+	ata_swap_string(port->identify + 10, 20);  /* serial# string*/
+#else
+	swap_buf_le16(port->identify, ATA_ID_WORDS);
+#endif
+
+	/*
+	 * Set the identify buffer as valid.
+	 */
+	set_bit(0, (unsigned long *) &port->identifyValid);
+
+	if (userBuffer) {
+		if (copy_to_user(
+				userBuffer,
+				port->identify,
+				ATA_ID_WORDS * sizeof(u16))
+			) {
+			rv = -EFAULT;
+			goto out;
+		}
+	}
+
+out:
+	up_write(&port->dd->internalSem);
+	return rv;
+}
+
+/**
+ * @brief Issue a software reset to the HBA.
+ *
+ * This function issues a software reset to the device by first sending
+ * a FIS with the reset bit set, waiting 500ms, and then sending a FIS
+ *with the reset bit cleared.
+ *
+ * @param port Pointer to the port data structure.
+ *
+ * @retval 0 The reset completed successfullt.
+ * @retval -1 An error occurred executing one of the reset commands.
+ *
+ * @note This function is untested.
+ */
+static int UNU software_reset(struct port *port)
+{
+	struct HOST_TO_DEV_FIS	fis;
+	u32 opts = 0;
+
+	memset(port->rxFIS, 0, AHCI_RX_FIS_SZ);
+	enable_engine(port, 0);
+	msleep(500);
+	enable_engine(port, 1);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.control		= ATA_SRST;
+
+	/*
+	 * Execute the command.
+	 * The C & R bits need to be set in the command header.
+	 */
+	opts = 1 << 8 | 1 << 10;
+	if (exec_internal_command_polled(port,
+					&fis,
+					 5,
+					 0, 0,
+					 opts,
+					 INTERNAL_COMMAND_TIMEOUT_MS) < 0) {
+		printk(KERN_ERR "%s: timeout setting ATA_SRST\n", __func__);
+		return -1;
+	}
+
+	msleep(500);
+
+	fis.control	= 0;
+	opts = 0;
+	/*
+	 * Execute the command.
+	 */
+	if (exec_internal_command_polled(port,
+					 &fis,
+					 5,
+					 0,
+					 0,
+					 opts,
+					 INTERNAL_COMMAND_TIMEOUT_MS) < 0) {
+		printk(KERN_ERR "%s: timeout clearing ATA_SRST\n", __func__);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int UNU set_feature(struct port *port,
+		 unsigned char enable,
+		 unsigned char feature
+		 )
+{
+	int rv;
+	struct HOST_TO_DEV_FIS	fis;
+
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_SET_FEATURES;
+	fis.features	= enable;
+	fis.sectCount	= feature;
+
+	/*
+	 * Execute the command.
+	 */
+	rv = exec_internal_command(port,
+				 &fis,
+				 5,
+				 0,
+				 0,
+				 INTERNAL_COMMAND_TIMEOUT_MS);
+
+	up_write(&port->dd->internalSem);
+	return rv;
+}
+
+static int UNU set_max_address(struct port *port, sector_t sectors)
+{
+	int rv;
+	struct HOST_TO_DEV_FIS	fis;
+
+	down_write(&port->dd->internalSem);
+
+	sectors--;
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_SET_MAX_EXT;
+	fis.device		= ATA_LBA;
+
+	fis.LBALow		= (sectors >> 0) & 0xff;
+	fis.LBAMid		= (sectors >> 8) & 0xff;
+	fis.LBAHi		= (sectors >> 16) & 0xff;
+	fis.LBALowEx	= (sectors >> 24) & 0xff;
+	fis.LBAMidEx	= (sectors >> 32) & 0xff;
+	fis.LBAHiEx		= (sectors >> 40) & 0xff;
+
+	/*
+	 * Execute the command.
+	 */
+	rv = exec_internal_command(port,
+				 &fis,
+				 5,
+				 0,
+				 0,
+				 INTERNAL_COMMAND_TIMEOUT_MS);
+
+	up_write(&port->dd->internalSem);
+	return rv;
+}
+
+/**
+ * @brief Issue an ATA_CMD_READ_NATIVE_MAX_EXT command to the device.
+ *
+ * @param port Pointer to the port structure.
+ *
+ * @return Returns the number of 512 byte sectors on success else all f's
+ */
+static sector_t read_max_address(struct port *port)
+{
+	struct HOST_TO_DEV_FIS	fis;
+	struct HOST_TO_DEV_FIS	*rxFIS;
+	sector_t sectors;
+
+	down_write(&port->dd->internalSem);
+
+	memset(port->rxFIS, 0, AHCI_RX_FIS_SZ);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_READ_NATIVE_MAX_EXT;
+	fis.device		= ATA_LBA;
+	/*
+	 * Execute the command.
+	 */
+	if (exec_internal_command(port,
+				 &fis,
+				 5,
+				 0,
+				 0,
+				 INTERNAL_COMMAND_TIMEOUT_MS) < 0) {
+		up_write(&port->dd->internalSem);
+		return -1;
+	}
+
+	rxFIS = port->rxFIS + RX_FIS_D2H_REG;
+
+	sectors = rxFIS->LBALow;
+	sectors |= rxFIS->LBAMid << 8;
+	sectors |= rxFIS->LBAHi << 16;
+	sectors |= rxFIS->LBALowEx << 24;
+	sectors |= (sector_t) rxFIS->LBAMidEx << 32;
+	sectors |= (sector_t) rxFIS->LBAHiEx << 40;
+
+	up_write(&port->dd->internalSem);
+	return sectors+1;
+}
+
+/**
+ * @brief Issue a standby immediate command to the device.
+ *
+ * @param port Pointer to the port structure.
+ *
+ * @retval 0 Command was executed successfully.
+ * @retval -1 An error occurred while executing the command.
+ */
+static int standby_immediate(struct port *port)
+{
+	int rv;
+	struct HOST_TO_DEV_FIS	fis;
+
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_STANDBYNOW1;
+
+	/*
+	 * Execute the command.  Use a 10-second timeout for large drives.
+	 */
+	rv = exec_internal_command(port, &fis, 5, 0, 0, 10000);
+
+	up_write(&port->dd->internalSem);
+	return rv;
+}
+
+/**
+ * @brief Write an internal drive config register
+ *
+ * @param port Pointer to the port structure.
+ * @param addr Config register address
+ * @param data Config register data
+ *
+ * @retval 0 Command was executed successfully.
+ * @retval -1 An error occurred while executing the command.
+ */
+
+
+/**
+ * @brief Read data from a log page.
+ *
+ * @param port Pointer to the port data structure.
+ * @param page Number of the page to read.
+ * @param buffer Location where read data is placed.
+ * @param sectors The number of sectors to read from the log.
+ *
+ * @retval 0 The log page was read successfully.
+ * @retval -1 A timeout occurred waiting for this command to complete.
+ */
+static int UNU read_logpage(struct port *port,
+				u8 page,
+				dma_addr_t buffer,
+				int sectors)
+{
+	int rv;
+	struct HOST_TO_DEV_FIS	fis;
+	unsigned int *dump = (unsigned int *) &fis;
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= ATA_CMD_READ_LOG_EXT;
+	fis.sectCount	= sectors & 0xff;
+	fis.secCountEx	= (sectors >> 8) & 0xff;
+	fis.LBALow		= page;
+	fis.LBAMid		= 0;
+	fis.device		= ATA_DEVICE_OBS;
+
+	printk(KERN_INFO "0x%08x\n", dump[0]);
+	printk(KERN_INFO "0x%08x\n", dump[1]);
+	printk(KERN_INFO "0x%08x\n", dump[2]);
+	printk(KERN_INFO "0x%08x\n", dump[3]);
+	printk(KERN_INFO "0x%08x\n", dump[4]);
+
+	/*
+	 * Execute the command.
+	 */
+	rv = exec_internal_command(port,
+				&fis,
+				 5,
+				 buffer,
+				 sectors * ATA_SECT_SIZE,
+				 INTERNAL_COMMAND_TIMEOUT_MS);
+
+	dump = (port->rxFIS + RX_FIS_D2H_REG);
+	printk(KERN_INFO "Outputs:\n");
+	printk(KERN_INFO "0x%08x\n", dump[0]);
+	printk(KERN_INFO "0x%08x\n", dump[1]);
+	printk(KERN_INFO "0x%08x\n", dump[2]);
+	printk(KERN_INFO "0x%08x\n", dump[3]);
+	printk(KERN_INFO "0x%08x\n", dump[4]);
+
+	up_write(&port->dd->internalSem);
+	return rv;
+}
+
+/**
+ * @brief Get the drive capacity.
+ *
+ * @param port Pointer to the port structure.
+ * @param sectors Pointer to the variable that will receive the sector count.
+ *
+ * @retval 1 Capacity was returned successfully.
+ * @retval 0 The identify information is invalid.
+ */
+static int getCapacity(struct port *port, sector_t *sectors)
+{
+	u64 total, raw0, raw1, raw2, raw3;
+	raw0 = port->identify[100];
+	raw1 = port->identify[101];
+	raw2 = port->identify[102];
+	raw3 = port->identify[103];
+	total = raw0 | raw1<<16 | raw2<<32 | raw3<<48;
+	*sectors = total;
+	return port->identifyValid;
+}
+
+/**
+ * @brief Reset the HBA.
+ *
+ * Resets the HBA by setting the HBA Reset bit in the Global
+ * HBA Control register. After setting the HBA Reset bit the
+ * function waits for 1 second before reading the HBA Reset
+ * bit to make sure it has cleared. If HBA Reset is not clear
+ * an error is returned.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @retval 0 The reset was successful.
+ * @retval -1 The HBA Reset bit did not clear.
+ */
+static int hba_reset(struct driver_data *dd)
+{
+	deinit_port(dd->port);
+
+	/*
+	 * Set the reset bit.
+	 */
+	writel(HOST_RESET, dd->mmio + HOST_CTRL);
+
+	/*
+	 * Flush.
+	 */
+	readl(dd->mmio + HOST_CTRL);
+
+	/*
+	 * Wait for reset to clear.
+	 */
+	ssleep(1);
+
+	/*
+	 * Check the bit has cleared.
+	 */
+	if (readl(dd->mmio + HOST_CTRL) & HOST_RESET)
+		return -1;
+
+	return 0;
+}
+
+/**
+ * @brief Display the identify command data.
+ *
+ * @param port Pointer to the port data structure.
+ *
+ * @retval 0 The identify information is valid and has been displayed.
+ * @retval -1 The identify information is invalid.
+ */
+static int dump_identify(struct port *port)
+{
+	sector_t sectors;
+	char cbuf[42];
+
+	if (!port->identifyValid)
+		return -1;
+	/* note string length is +1 to allow for null.*/
+	strlcpy(cbuf, (char *)(port->identify+10), 21);
+	printk(KERN_INFO "Serial No.: %s\n", cbuf);
+
+	strlcpy(cbuf, (char *)(port->identify+23), 9);
+	printk(KERN_INFO "Firmware Ver.: %s\n", cbuf);
+
+	strlcpy(cbuf, (char *)(port->identify+27), 41);
+	printk(KERN_INFO "Model: %s\n", cbuf);
+
+	if (getCapacity(port, &sectors))
+		printk(KERN_INFO "Capacity: %llu sectors (%lluMB)\n",
+					 (u64)sectors,
+					 ((u64)sectors) * ATA_SECT_SIZE >> 20);
+
+	return 0;
+}
+
+/**
+ * @brief Map the commands scatter list into the command table.
+ *
+ * @param command Pointer to the command.
+ * @param nents Number of scatter list entries.
+ *
+ * @return This function always returns 0.
+ */
+static inline int fill_command_SG(struct COMMAND *command, int nents)
+{
+	int n;
+	struct COMMAND_SG *commandSG = command->command + AHCI_CMD_TBL_HDR_SZ;
+	struct scatterlist *sg = command->sg;
+
+	for (n = 0; n < nents; n++) {
+		unsigned int dma_len = sg_dma_len(sg);
+		if (dma_len > 0x400000)
+			printk(KERN_ERR "Error: DMA segment length truncated!\n");
+		commandSG->info = cpu_to_le32((dma_len-1) & 0x3fffff);
+#if (BITS_PER_LONG == 64)
+		*((unsigned long *) &commandSG->dba) =
+			 cpu_to_le64(sg_dma_address(sg));
+#else
+		commandSG->dba	= cpu_to_le32(sg_dma_address(sg));
+		commandSG->dbaUpper	=
+			 cpu_to_le32((sg_dma_address(sg) >> 16) >> 16);
+#endif
+		commandSG++;
+		sg++;
+	}
+
+	return 0;
+}
+
+/**
+ * @brief Execute a drive command.
+ *
+ * @retval 0 The command completed successfully.
+ * @retval -1 An error occurred while executing the command.
+ */
+int exec_drive_task(struct port *port, u8 *command)
+{
+	struct HOST_TO_DEV_FIS	fis;
+	struct HOST_TO_DEV_FIS *reply = (port->rxFIS + RX_FIS_D2H_REG);
+
+	/*
+	 * Lock the internal command semaphore.
+	 */
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= command[0];
+	fis.features	= command[1];
+	fis.sectCount	= command[2];
+	fis.sector		= command[3];
+	fis.cylLow		= command[4];
+	fis.cylHi		= command[5];
+	fis.device		= command[6] & ~0x10; /* Clear the dev bit*/
+
+
+	printk(KERN_INFO "User Command %s: command = 0x%x, feature = 0x%x, nsector = 0x%x, sector = 0x%x, lcyl = 0x%x, hcyl = 0x%x, select = 0x%x\n",
+			__func__,
+			command[0],
+			command[1],
+			command[2],
+			command[3],
+			command[4],
+			command[5],
+			command[6]
+			);
+
+	/*
+	 * Execute the command.
+	 */
+	if (exec_internal_command(port,
+				 &fis,
+				 5,
+				 0,
+				 0, IOCTL_COMMAND_TIMEOUT_MS) < 0) {
+		up_write(&port->dd->internalSem);
+		return -1;
+	}
+
+	command[0] = reply->command; /* Status*/
+	command[1] = reply->features; /* Error*/
+	command[4] = reply->cylLow;
+	command[5] = reply->cylHi;
+
+	printk(KERN_INFO "Completion Status from devices %s: Status = 0x%x, Error = 0x%x , CylLow = 0x%x CylHi = 0x%x\n",
+				__func__,
+			  command[0],
+			  command[1],
+			  command[4],
+			  command[5]);
+
+	up_write(&port->dd->internalSem);
+	return 0;
+}
+
+/**
+ * @brief Execute a drive command.
+ *
+ * @param port Pointer to the port data structure.
+ * @param command Pointer to the user specified command parameters.
+ * @param userBuffer Pointer to the user space buffer where read sector data should be copied.
+ *
+ * @retval 0 The command completed successfully.
+ * @retval -EFAULT An error occurred while copying the completion data to the user space buffer.
+ * @retval -1 An error occurred while executing the command.
+ */
+int exec_drive_command(struct port *port, u8 *command, void __user *userBuffer)
+{
+	struct HOST_TO_DEV_FIS	fis;
+	struct HOST_TO_DEV_FIS *reply = (port->rxFIS + RX_FIS_D2H_REG);
+
+	/*
+	 * Lock the internal command semaphore.
+	 */
+	down_write(&port->dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= command[0];
+	fis.features	= command[2];
+	fis.sectCount	= command[3];
+	if (fis.command == WIN_SMART) {
+		fis.sector	= command[1];
+		fis.cylLow	= 0x4f;
+		fis.cylHi	= 0xc2;
+	}
+
+
+	printk(KERN_INFO " UserCommand %s: command = 0x%x, sector = 0x%x, features = 0x%x, sectCount = 0x%x\n",
+			__func__,
+			command[0],
+			command[1],
+			command[2],
+			command[3]);
+
+	memset(port->sectorBuffer, 0x00, ATA_SECT_SIZE);
+
+	/*
+	 * Execute the command.
+	 */
+	if (exec_internal_command(port,
+				&fis,
+				 5,
+				 port->sectorBufferDMA,
+				 (command[3] != 0) ? ATA_SECT_SIZE : 0,
+				 IOCTL_COMMAND_TIMEOUT_MS)
+				 < 0) {
+		up_write(&port->dd->internalSem);
+		return -1;
+	}
+
+	/*
+	 * Collect the completion status.
+	 */
+	command[0] = reply->command; /* Status*/
+	command[1] = reply->features; /* Error*/
+	command[2] = command[3];
+
+	printk(KERN_INFO "Completion Status from devices %s: Status = 0x%x, Error = 0x%x , Command = 0x%x\n",
+				__func__,
+				command[0],
+				command[1],
+				command[2]);
+
+	if (userBuffer && command[3]) {
+		if (copy_to_user(userBuffer,
+				 port->sectorBuffer,
+				 ATA_SECT_SIZE * command[3])) {
+			up_write(&port->dd->internalSem);
+			return -EFAULT;
+		}
+	}
+
+	up_write(&port->dd->internalSem);
+	return 0;
+}
+
+/**
+ *  @brief Execute returns 1 if the command is one that
+ *   always has a single sector payload.
+ *
+ *  @param command passed to the device to perform the certain event.
+ *  @param features passed to the device to perform the certain event.
+ *  @returns 1 if the command is one that always has a single sector payload,
+ *   regardless of the value in the Sector Count field.
+ *
+ */
+
+unsigned int implicit_sector(unsigned char command, unsigned char features)
+{
+	unsigned int rv = 0;
+
+	/* this is a list of commands that have an implicit sector count of 1.*/
+	switch (command) {
+	case 0xF1:
+	case 0xF2:
+	case 0xF3:
+	case 0xF4:
+	case 0xF5:
+	case 0xF6:
+	case 0xE4:
+	case 0xE8:
+		rv = 1;
+		break;
+	case 0xF9:
+		if (features == 0x03)
+			rv = 1;
+		break;
+	case 0xB0:
+		if ((features == 0xD0) || (features == 0xD1))
+			rv = 1;
+		break;
+	case 0xB1:
+		if ((features == 0xC2) || (features == 0xC3))
+			rv = 1;
+		break;
+	}
+	return rv;
+}
+/* Borrows liberally from ide_taskfile_ioctl()
+ *
+ */
+
+static int exec_drive_taskfile(struct driver_data *dd, unsigned long arg)
+{
+	struct HOST_TO_DEV_FIS	fis;
+	struct HOST_TO_DEV_FIS *reply = (dd->port->rxFIS + RX_FIS_D2H_REG);
+
+	ide_task_request_t *req_task;
+	u8 *outbuf = NULL;
+	u8 *inbuf = NULL;
+	dma_addr_t outbuf_dma = (dma_addr_t)NULL;
+	dma_addr_t inbuf_dma = (dma_addr_t)NULL;
+	dma_addr_t dma_buffer = (dma_addr_t)NULL;
+	int err = 0;
+	int tasksize = sizeof(struct ide_task_request_s);
+	unsigned int taskin = 0;
+	unsigned int taskout = 0;
+	u8 nsect = 0;
+	char __user *buf = (char __user *)arg;
+	unsigned int timeout = IOCTL_COMMAND_TIMEOUT_MS;
+	unsigned int force_single_sector;
+	unsigned int transfer_size;
+
+
+	req_task = kzalloc(tasksize, GFP_KERNEL);
+	if (req_task == NULL)
+		return -ENOMEM;
+	if (copy_from_user(req_task, buf, tasksize)) {
+		kfree(req_task);
+		return -EFAULT;
+	}
+
+	/* we don't support the extended register set.*/
+	/*if (req_task->in_flags.b.data_hob ||
+			req_task->in_flags.b.sector_hob ||
+			req_task->in_flags.b.nsector_hob ||
+			req_task->in_flags.b.lcyl_hob) {
+		err = -EINVAL;
+		goto abort;
+	}*/
+
+	taskout = req_task->out_size;
+	taskin = req_task->in_size;
+	/* 130560 = 512 * 0xFF*/
+	if (taskin > 130560 || taskout > 130560) {
+		err = -EINVAL;
+		goto abort;
+	}
+
+	if (taskout) {
+		int outtotal = tasksize;
+		outbuf = kzalloc(taskout, GFP_KERNEL);
+		if (outbuf == NULL) {
+			err = -ENOMEM;
+			goto abort;
+		}
+		if (copy_from_user(outbuf, buf + outtotal, taskout)) {
+			err = -EFAULT;
+			goto abort;
+		}
+		outbuf_dma = pci_map_single(dd->pdev,
+					 outbuf,
+					 taskout,
+					 DMA_TO_DEVICE);
+		if (outbuf_dma == (dma_addr_t)NULL) {
+			err = -ENOMEM;
+			goto abort;
+		}
+		dma_buffer = outbuf_dma;
+	}
+
+	if (taskin) {
+		int intotal = tasksize + taskout;
+		inbuf = kzalloc(taskin, GFP_KERNEL);
+		if (inbuf == NULL) {
+			err = -ENOMEM;
+			goto abort;
+		}
+		/* FIXME: why are we copying the "in" buffer from the user?
+		* keep for now because this is how kernel ATA does it.
+		*/
+		if (copy_from_user(inbuf, buf + intotal, taskin)) {
+			err = -EFAULT;
+			goto abort;
+		}
+		inbuf_dma = pci_map_single(dd->pdev,
+					 inbuf,
+					 taskin, DMA_FROM_DEVICE);
+		if (inbuf_dma == (dma_addr_t)NULL) {
+			err = -ENOMEM;
+			goto abort;
+		}
+		dma_buffer = inbuf_dma;
+	}
+
+	/* This driver only supports PIO and non-data commands
+	 * from this ioctl.*/
+	switch (req_task->data_phase) {
+	case TASKFILE_OUT:
+		nsect = taskout / ATA_SECT_SIZE;
+		break;
+	case TASKFILE_IN:
+	case TASKFILE_NO_DATA:
+		break;
+	default:
+		err = -EINVAL;
+		goto abort;
+	}
+
+	/*
+	 * Lock the internal command semaphore.
+	 */
+	down_write(&dd->internalSem);
+
+	/*
+	 * Build the FIS.
+	 */
+	memset(&fis, 0, sizeof(struct HOST_TO_DEV_FIS));
+
+	fis.type		= 0x27;
+	fis.opts		= 1 << 7;
+	fis.command		= req_task->io_ports[7];
+	fis.features	= req_task->io_ports[1];
+	fis.sectCount	= req_task->io_ports[2];
+	fis.LBALow		= req_task->io_ports[3];
+	fis.LBAMid		= req_task->io_ports[4];
+	fis.LBAHi		= req_task->io_ports[5];
+	 /* Clear the dev bit*/
+	fis.device		= req_task->io_ports[6] & ~0x10;
+
+	if ((req_task->in_flags.all == 0) && (req_task->out_flags.all & 1)) {
+		req_task->in_flags.all	=
+			IDE_TASKFILE_STD_IN_FLAGS | (IDE_HOB_STD_IN_FLAGS << 8);
+		fis.LBALowEx		= req_task->hob_ports[3];
+		fis.LBAMidEx		= req_task->hob_ports[4];
+		fis.LBAHiEx			= req_task->hob_ports[5];
+		fis.featuresEx		= req_task->hob_ports[1];
+		fis.secCountEx		= req_task->hob_ports[2];
+
+	} else {
+		req_task->in_flags.all = IDE_TASKFILE_STD_IN_FLAGS;
+	}
+
+	force_single_sector = implicit_sector(fis.command, fis.features);
+
+	if ((taskin || taskout) && (!fis.sectCount)) {
+		if (nsect)
+			fis.sectCount = nsect;
+		else {
+				if (!force_single_sector) {
+					printk(KERN_WARNING "%s: requested data movement but sectCount is 0!\n",
+								__func__);
+					up_write(&dd->internalSem);
+					err = -EINVAL;
+					goto abort;
+				}
+
+		}
+	}
+
+	printk(KERN_INFO "taskfile command = 0x%x, feature = 0x%x, nsector = 0x%x, sector/lbal = 0x%x, lcyl/lbam = 0x%x, hcyl/lbah = 0x%x, head/device = 0x%x\n",
+			fis.command,
+			fis.features,
+			fis.sectCount,
+			fis.LBALow,
+			fis.LBAMid, fis.LBAHi, fis.device);
+
+	/* If the command is Download Microcode increase the timeout to
+	 * 60 seconds.*/
+	if (fis.command == 0x92)
+		timeout = 60000;
+
+	/* If the command is Security Erase Unit increase the timeout to
+	 * 4 minutes.*/
+	if (fis.command == 0xF4)
+		timeout = 240000;
+
+	/* If the command is standby immediate increase the timeout to
+	 * 10 seconds.*/
+	if (fis.command == 0xE0)
+		timeout = 10000;
+
+	/* If the command is vendor unquie command the timeout to
+	 * 2 minutes.*/
+	if (fis.command == 0xF7)
+		timeout = 10000;
+
+	if (fis.command == 0xFA)
+		timeout = 10000;
+
+	/* Determine the correct transfer size.*/
+	if (force_single_sector)
+		transfer_size = ATA_SECT_SIZE;
+	else
+		transfer_size = ATA_SECT_SIZE * fis.sectCount;
+
+
+	/* Execute the command.*/
+	if (exec_internal_command(dd->port,
+				 &fis,
+				 5,
+				 dma_buffer,
+				 transfer_size, timeout) < 0) {
+		up_write(&dd->internalSem);
+		err = -EIO;
+		goto abort;
+	}
+
+	/* reclaim the DMA buffers.*/
+	if (inbuf_dma)
+		pci_unmap_single(dd->pdev, inbuf_dma, taskin, DMA_FROM_DEVICE);
+	if (outbuf_dma)
+		pci_unmap_single(dd->pdev, outbuf_dma, taskout, DMA_TO_DEVICE);
+	inbuf_dma = outbuf_dma = (dma_addr_t)NULL;
+
+	/* return the ATA registers to the caller.*/
+	req_task->io_ports[7] =	reply->command;
+	req_task->io_ports[1] = reply->features;
+	req_task->io_ports[2] = reply->sectCount;
+	req_task->io_ports[3] = reply->LBALow;
+	req_task->io_ports[4] = reply->LBAMid;
+	req_task->io_ports[5] = reply->LBAHi;
+	req_task->io_ports[6] = reply->device;
+
+	if (req_task->out_flags.all & 1)  {
+
+		req_task->hob_ports[3] = reply->LBALowEx;
+		req_task->hob_ports[4] = reply->LBAMidEx;
+		req_task->hob_ports[5] = reply->LBAHiEx;
+		req_task->hob_ports[1] = reply->featuresEx;
+		req_task->hob_ports[2] = reply->secCountEx;
+	}
+
+	printk(KERN_INFO "Completion Status from devices %s: Status = 0x%x,"
+			"Error = 0x%x , sectCount = 0x%x, Lbalow = 0x%x ,"
+			"LbaMid = 0x%x, LbaHi = 0x%x, Device = 0x%x\n",
+				__func__,
+				req_task->io_ports[7],
+				req_task->io_ports[1],
+				req_task->io_ports[2],
+				req_task->io_ports[3],
+				req_task->io_ports[4],
+				req_task->io_ports[5],
+				req_task->io_ports[6]);
+
+	up_write(&dd->internalSem);
+
+	/* FIXME: why are we copying "out" data back to the user?
+	* keep for now because this is how kernel ATA does it.
+	*/
+	if (copy_to_user(buf, req_task, tasksize)) {
+		err = -EFAULT;
+		goto abort;
+	}
+	if (taskout) {
+		int outtotal = tasksize;
+		if (copy_to_user(buf + outtotal, outbuf, taskout)) {
+			err = -EFAULT;
+			goto abort;
+		}
+	}
+	if (taskin) {
+		int intotal = tasksize + taskout;
+		if (copy_to_user(buf + intotal, inbuf, taskin)) {
+			err = -EFAULT;
+			goto abort;
+		}
+	}
+abort:
+	if (inbuf_dma)
+		pci_unmap_single(dd->pdev, inbuf_dma, taskin, DMA_FROM_DEVICE);
+	if (outbuf_dma)
+		pci_unmap_single(dd->pdev, outbuf_dma, taskout, DMA_TO_DEVICE);
+	kfree(req_task);
+	kfree(outbuf);
+	kfree(inbuf);
+
+	return err;
+}
+
+
+/**
+ * @brief Handle IOCTL calls from the Block Layer.
+ *
+ * This function is called by the Block Layer when it receives an IOCTL command
+ * that it does not understand. If the IOCTL command is not supported
+ * this function returns -ENOTTY.
+ *
+ * @param dd Pointer to the driver data structure.
+ * @param cmd IOCTL command passed from the Block Layer.
+ * @param arg IOCTL argument passed from the Block Layer.
+ *
+ * @retval 0 The IOCTL completed successfully.
+ * @retval -ENOTTY The specified command is not supported.
+ * @retval -EFAULT An error occurred copying data to a user space buffer.
+ * @retval -EIO An error occurred while executing the command.
+ */
+int ahci_ioctl(struct driver_data *dd, unsigned int cmd, unsigned long arg)
+{
+	switch (cmd) {
+	case HDIO_GET_IDENTITY:
+		if (get_identify(dd->port, (void __user *) arg) < 0) {
+			printk(KERN_ERR "%s: Unable to read identity\n",
+						__func__);
+			return -EIO;
+		}
+
+		break;
+	case HDIO_DRIVE_CMD:
+	{
+		u8 driveCommand[4];
+
+		/*
+		 * Copy the user command info to our buffer.
+		 */
+		if (copy_from_user(driveCommand,
+					 (void __user *) arg,
+					 sizeof(driveCommand)))
+			return -EFAULT;
+
+		/*
+		 * Execute the drive command.
+		 */
+		if (exec_drive_command(dd->port,
+					 driveCommand,
+					 (void __user *) (arg+4)))
+			return -EIO;
+
+		/*
+		 * Copy the status back to the users buffer.
+		 */
+		if (copy_to_user((void __user *) arg,
+					 driveCommand,
+					 sizeof(driveCommand)))
+			return -EFAULT;
+
+		break;
+	}
+	case HDIO_DRIVE_TASK:
+	{
+		u8 driveCommand[7];
+
+		/*
+		 * Copy the user command info to our buffer.
+		 */
+		if (copy_from_user(driveCommand,
+					 (void __user *) arg,
+					 sizeof(driveCommand)))
+			return -EFAULT;
+
+		/*
+		 * Execute the drive command.
+		 */
+		if (exec_drive_task(dd->port, driveCommand))
+			return -EIO;
+
+		/*
+		 * Copy the status back to the users buffer.
+		 */
+		if (copy_to_user((void __user *) arg,
+					 driveCommand,
+					 sizeof(driveCommand)))
+			return -EFAULT;
+
+		break;
+	}
+	case HDIO_DRIVE_TASKFILE:
+		return exec_drive_taskfile(dd, arg);
+
+	default:
+		printk(KERN_WARNING "%s: unsupported IOCTL 0x%x\n",
+					 __func__, cmd);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+/**
+ * @brief Asynchronous write.
+ *
+ * This function is called by the block layer to issue a write command
+ * to the device. Upon completion of the write the callback function will
+ * be called with the data parameter passed as the callback data.
+ *
+ * @param dd Pointer to the driver data structure.
+ * @param start First sector to write.
+ * @param nsect Number of sectors to write.
+ * @param nents Number of entries in scatter list for the write command.
+ * @param tag The tag of this write command.
+ * @param callback Pointer to the function that should be called
+ * when the write completes.
+ * @param data Callback data passed to the callback function
+ * when the write completes.
+ * @param barrier If non-zero, this command must be completed
+ * before issuing any other commands.
+ *
+ * @return This function always returns 0.
+ */
+int ahci_write(struct driver_data *dd,
+				sector_t start,
+				int nsect,
+				int nents,
+				int tag,
+				void *callback,
+				void *data,
+				int barrier)
+{
+	struct HOST_TO_DEV_FIS	*fis;
+	struct port *port = dd->port;
+	struct COMMAND *command = &port->commands[tag];
+
+	/*
+	 * Map the scatter list for DMA access.
+	 */
+	nents = dma_map_sg(&dd->pdev->dev, command->sg, nents, DMA_TO_DEVICE);
+
+	/*
+	 * Number of sg entries.
+	 */
+	command->scatterEnts = nents;
+
+	/*
+	 * The number of retries for this command before it is
+	 * reported as a failure to the upper layers.
+	 */
+	command->retries = MAX_RETRIES;
+
+	fis = command->command;
+	/*
+	 * Build the FIS.
+	 */
+	fis->type		= 0x27;
+	fis->opts		= 1 << 7;
+	fis->command	= ATA_CMD_FPDMA_WRITE;
+
+	/*
+	 * This is used to inject errors into the read flow.
+	 */
+	if (unlikely(dd->makeItFail & 0x02)) {
+		spin_lock(&dd->makeItFailLock);
+		if (unlikely(dd->randomWriteCount-- == 0)) {
+			dd->makeItFailTag = tag;
+			dd->makeItFailStart = start;
+			get_random_bytes(&dd->randomWriteCount,
+					 sizeof(dd->randomWriteCount));
+			dd->randomWriteCount &= 0xfffff;
+			printk(KERN_INFO "%s: Random write count = %d\n",
+					 __func__,
+					 dd->randomWriteCount);
+			printk(KERN_INFO "%s: Generating error for tag %d\n",
+						 __func__, tag);
+			/*
+			 * Set start to an invalid value.
+			 */
+			start = (sector_t) -1;
+		}
+		spin_unlock(&dd->makeItFailLock);
+	}
+
+	*((unsigned int *) &fis->LBALow) = (start & 0xffffff);
+	*((unsigned int *) &fis->LBALowEx) = ((start >> 24) & 0xffffff);
+
+	fis->device		= 1 << 6;
+	if (barrier)
+		fis->device |= FUA_BIT;
+
+	fis->features	= nsect & 0xff;
+	fis->featuresEx	= (nsect >> 8) & 0xff;
+
+	fis->sectCount	= ((tag << 3) | (tag >> 5));
+	fis->secCountEx	= 0;
+	fis->control	= 0;
+	fis->res2		= 0;
+	fis->res3		= 0;
+
+	fill_command_SG(command, nents);
+
+	/*
+	 * Populate the command header.
+	 */
+	command->commandHeader->opts = cpu_to_le32(
+			(nents << 16) | 5 | AHCI_CMD_WRITE | AHCI_CMD_PREFETCH
+			);
+	command->commandHeader->byteCount = 0;
+
+	/*
+	 * Set the completion function and data for the command.
+	 */
+	command->completionData = dd;
+	command->completionFunc = async_write_complete;
+
+	command->asyncData = data;
+	command->asyncCallback = callback;
+
+	/*
+	 * Lock used to prevent this command from being issued
+	 * if an internal command is in progress.
+	 */
+	down_read(&port->dd->internalSem);
+
+	atomic_inc(&dd->statistics.writes);
+
+	/*
+	 * Issue the command to the hardware.
+	 */
+	issue_command(port, tag);
+
+#ifdef COMMAND_TIMEOUT
+	/* Set the command's timeout value.*/
+	port->commands[tag].compTime = jiffies + msecs_to_jiffies(
+							NCQ_COMMAND_TIMEOUT_MS
+							);
+#endif
+
+	up_read(&port->dd->internalSem);
+
+	return 0;
+}
+
+/**
+ * @brief Asynchronous read.
+ *
+ * This function is called by the block layer to issue a read command
+ * to the device. Upon completion of the read the callback function will
+ * be called with the data parameter passed as the callback data.
+ *
+ * @param dd Pointer to the driver data structure.
+ * @param start First sector to read.
+ * @param nsect Number of sectors to read.
+ * @param nents Number of entries in scatter list for the read command.
+ * @param tag The tag of this read command.
+ * @param callback Pointer to the function that should be called
+ * when the read completes.
+ * @param data Callback data passed to the callback function
+ * when the read completes.
+ * @param barrier If non-zero, this command must be completed before
+ * issuing any other commands.
+ *
+ * @return This function always returns 0.
+ */
+int ahci_read(struct driver_data *dd,
+			sector_t start,
+			int nsect,
+			int nents,
+			int tag,
+			void *callback,
+			void *data,
+			int barrier)
+{
+	struct HOST_TO_DEV_FIS	*fis;
+	struct port *port = dd->port;
+	struct COMMAND *command = &port->commands[tag];
+
+	/*
+	 * Map the scatter list for DMA access.
+	 */
+	nents = dma_map_sg(&dd->pdev->dev, command->sg, nents, DMA_FROM_DEVICE);
+
+	/*
+	 * Number of sg entries.
+	 */
+	command->scatterEnts = nents;
+
+	/*
+	 * The number of retries for this command before it is
+	 * reported as a failure to the upper layers.
+	 */
+	command->retries = MAX_RETRIES;
+
+	fis = command->command;
+	/*
+	 * Build the FIS.
+	 */
+	fis->type		= 0x27;
+	fis->opts		= 1 << 7;
+	fis->command	= ATA_CMD_FPDMA_READ;
+
+	/*
+	 * This is used to inject errors into the read flow.
+	 */
+	if (unlikely(dd->makeItFail & 0x01)) {
+		spin_lock(&dd->makeItFailLock);
+		if (unlikely(dd->randomReadCount-- == 0)) {
+			dd->makeItFailTag = tag;
+			dd->makeItFailStart = start;
+			get_random_bytes(&dd->randomReadCount,
+				 sizeof(dd->randomReadCount));
+			dd->randomReadCount &= 0xffffff;
+			printk(KERN_INFO "%s: Random read count = %d\n",
+					__func__,
+					dd->randomReadCount);
+			printk(KERN_INFO "%s: Generating error for tag %d\n",
+							__func__, tag);
+			/*
+			 * Set start to an invalid value.
+			 */
+			start = (sector_t) -1;
+		}
+		spin_unlock(&dd->makeItFailLock);
+	}
+
+	*((unsigned int *) &fis->LBALow) = (start & 0xffffff);
+	*((unsigned int *) &fis->LBALowEx) = ((start >> 24) & 0xffffff);
+
+	/*
+	 * This has to be done after writing the start lower bytes.
+	 */
+	fis->device		= 1 << 6;
+	if (barrier)
+		fis->device |= FUA_BIT;
+
+	fis->features	= nsect & 0xff;
+	fis->featuresEx	= (nsect >> 8) & 0xff;
+
+	fis->sectCount	= ((tag << 3) | (tag >> 5));
+	fis->secCountEx	= 0;
+	fis->control	= 0;
+	fis->res2		= 0;
+	fis->res3		= 0;
+	fill_command_SG(command, nents);
+
+	/*
+	 * Populate the command header.
+	 */
+	command->commandHeader->opts = cpu_to_le32(
+			(nents << 16) | 5 | AHCI_CMD_PREFETCH);
+	command->commandHeader->byteCount = 0;
+
+	/*
+	 * Set the completion function and data for the command
+	 * within this layer.
+	 */
+	command->completionData = dd;
+	command->completionFunc = async_read_complete;
+
+	/*
+	 * Set the completion function and data for the command passed
+	 * from the upper layer.
+	 */
+	command->asyncData = data;
+	command->asyncCallback = callback;
+
+	/*
+	 * Lock used to prevent this command from being issued
+	 * if an internal command is in progress.
+	 */
+	down_read(&port->dd->internalSem);
+
+	atomic_inc(&dd->statistics.reads);
+
+	/*
+	 * Issue the command to the hardware.
+	 */
+	issue_command(port, tag);
+
+#ifdef COMMAND_TIMEOUT
+	/* Set the command's timeout value.*/
+	port->commands[tag].compTime = jiffies + msecs_to_jiffies(
+					NCQ_COMMAND_TIMEOUT_MS);
+#endif
+
+	up_read(&port->dd->internalSem);
+
+	return 0;
+}
+
+/**
+ * @brief Obtain a command slot and return its associated scatter list.
+ *
+ * @param dd Pointer to the driver data structure.
+ * @param tag Pointer to an int that will receive the allocated command slot tag.
+ *
+ * @return Pointer to the scatter list for the allocated command slot or NULL if
+ * no command slots are available.
+ */
+struct scatterlist *ahci_get_scatterlist(struct driver_data *dd, int *tag)
+{
+	/*
+	 * It is possible that, even with this semaphore, a thread
+	 * may think that no command slots are available. Therefore, we
+	 * need to make an appempt for  get_slot().
+	 */
+	down(&dd->port->commandSlot);
+	*tag = get_slot(dd->port);
+	if (unlikely(*tag  < 0)) {
+		/*printk(KERN_WARNING "%s: No free command slots\n",
+		 * __func__);*/
+	}
+	return dd->port->commands[*tag].sg;
+}
+
+/**
+ * @brief Get the drive capacity in sectors.
+ *
+ * This function obtains the drive capacity by
+ * issuing a READ NATIVE MAX EXT command to the drive.
+ *
+ * @return Highest sector number accessible on the drive.
+ */
+sector_t ahci_get_capacity(struct driver_data *dd)
+{
+	sector_t capacity;
+
+	if (dd->product_type != PRODUCT_OLDFPGA) {
+		/*why do we have two functions to do the same thing
+		*in different ways?
+		*/
+		if (!getCapacity(dd->port, &capacity)) {
+			printk(KERN_ERR "Error: unable to determine capacity.\n");
+			/*FIXME: Look for a better way to report failure.*/
+			capacity = 0;
+		}
+		return capacity;
+	}
+
+	/* This doesn't work on the ASIC yet.*/
+	return read_max_address(dd->port);
+}
+
+/**
+ * @brief Get the hardware block size.
+ *
+ * For Cyclone this is 4KB
+ *
+ * @retval 4096 for Cyclone devices.
+ */
+int ahci_hard_blksize(void)
+{
+	return 4096;
+}
+
+/**
+ * @brief Copy the statistical information to the supplied buffer.
+ *
+ * @param dev Pointer to the device structure, passed by the kernrel.
+ * @param attr Pointer to the device_attribute structure passed by the kernel.
+ * @param buf Pointer to the char buffer that will receive the stats info.
+ *
+ * @return The size, in bytes, of the data pointed to by buf.
+ */
+static ssize_t ahci_show_stats(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	int size;
+
+	/*
+	 * Protect us from the timer that updates the statistics counters.
+	 */
+	size = sprintf(buf, "%s:Ints = %d, reads = %d, writes = %d, IOPS =%d\n",
+		__func__,
+		(unsigned int) atomic_read(&dd->statistics.currentInts),
+		(unsigned int) atomic_read(&dd->statistics.currentReads),
+		(unsigned int) atomic_read(&dd->statistics.currentWrites),
+		(unsigned int) atomic_read(&dd->statistics.currentIOPS)
+	);
+
+	return size;
+}
+static DEVICE_ATTR(statistics, S_IRUGO, ahci_show_stats, NULL);
+
+/**
+ * @brief Copy the important register information to the supplied buffer.
+ *
+ * @param dev Pointer to the device structure, passed by the kernrel.
+ * @param attr Pointer to the device_attribute structure passed by the kernel.
+ * @param buf Pointer to the char buffer that will receive the stats info.
+ *
+ * @return The size, in bytes, of the data pointed to by buf.
+ */
+static ssize_t ahci_show_registers(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	int size = 0;
+	int n;
+
+	size += sprintf(&buf[size], "%s:\nSActive:\n", __func__);
+
+	for (n = 0; n < dd->slot_groups; n++)
+		size += sprintf(&buf[size], "0x%08x\n",
+					 readl(dd->port->SActive[n]));
+
+	size += sprintf(&buf[size], "Command Issue:\n");
+
+	for (n = 0; n < dd->slot_groups; n++)
+		size += sprintf(&buf[size], "0x%08x\n",
+					readl(dd->port->CommandIssue[n]));
+
+	size += sprintf(&buf[size], "Allocated:\n");
+
+	for (n = 0; n < dd->slot_groups; n++) {
+		/* some magic to work around the fact that 'allocated'
+		 * is an array of longs.*/
+		u32 group_allocated;
+		if (sizeof(long) > sizeof(u32))
+			group_allocated =
+					dd->port->allocated[n/2] >> (32*(n&1));
+		else
+			group_allocated = dd->port->allocated[n];
+		size += sprintf(&buf[size], "0x%08x\n",
+				 group_allocated);
+	}
+
+	size += sprintf(&buf[size], "Completed:\n");
+
+	for (n = 0; n < dd->slot_groups; n++)
+		size += sprintf(&buf[size], "0x%08x\n",
+				readl(dd->port->Completed[n]));
+
+	size += sprintf(&buf[size], "PORT_IRQ_STAT 0x%08x\n",
+				readl(dd->port->mmio + PORT_IRQ_STAT));
+	size += sprintf(&buf[size], "HOST_IRQ_STAT 0x%08x\n",
+				readl(dd->mmio + HOST_IRQ_STAT));
+
+	return size;
+}
+static DEVICE_ATTR(registers, S_IRUGO, ahci_show_registers, NULL);
+
+static ssize_t ahci_show_resptime(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	int size = 0;
+
+	size += sprintf(&buf[size], "%s:\nResponse Time (us):\n", __func__);
+	size += sprintf(&buf[size], "Min\tMax\tAvg\n");
+	size += sprintf(&buf[size], "%d\t%d\t%d\n",
+			atomic_read(&dd->statistics.minRespTime),
+			atomic_read(&dd->statistics.minRespTime),
+			atomic_read(&dd->statistics.currentAvgRespTime));
+
+	return size;
+}
+static DEVICE_ATTR(resptime, S_IRUGO, ahci_show_resptime, NULL);
+
+static ssize_t ahci_show_makeItFail(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	int size = 0;
+
+	size += sprintf(&buf[size], "%d\n", dd->makeItFail);
+
+	return size;
+}
+
+static ssize_t ahci_store_makeItFail(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf,
+					size_t len)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	sscanf(buf, "%d", &dd->makeItFail);
+
+	if (dd->makeItFail & 0x01) {
+		get_random_bytes(&dd->randomReadCount,
+				sizeof(dd->randomReadCount));
+		dd->randomReadCount &= 0xffffff;
+		printk(KERN_INFO "%s: Random read count = %d\n",
+				__func__,
+				dd->randomReadCount);
+	}
+
+	if (dd->makeItFail & 0x02) {
+		get_random_bytes(&dd->randomWriteCount,
+				sizeof(dd->randomWriteCount));
+		dd->randomWriteCount &= 0xfffff;
+		printk(KERN_INFO "%s: Random write count = %d\n", __func__,
+				dd->randomWriteCount);
+	}
+
+	return len;
+}
+static DEVICE_ATTR(make_it_fail,
+			S_IRUGO | S_IWUGO,
+			ahci_show_makeItFail,
+			ahci_store_makeItFail);
+
+static ssize_t ahci_store_reset(struct device *dev,
+			struct device_attribute *attr,
+			const char *buf,
+			size_t len)
+{
+	struct driver_data *dd = dev_to_disk(dev)->private_data;
+	int val;
+	sscanf(buf, "%d", &val);
+
+	if (val == 1)
+		restart_port(dd->port);
+	/* fake a TFE to get rid of hung commands.*/
+	if (val == 4)
+		handleTFE(dd);
+
+	return len;
+}
+
+
+static DEVICE_ATTR(reset, S_IRUGO , NULL, ahci_store_reset);
+
+/**
+ * @brief Create the sysfs related attributes.
+ *
+ * @param dd Pointer to the driver data structure.
+* @param kobj Pointer to the kobj for the block device.
+ *
+ * @retval 0 Operation completed successfully.
+ * @retval -EINVAL Invalid parameter.
+ */
+int ahci_sysfs_init(struct driver_data *dd, struct kobject *kobj)
+{
+	if (!kobj || !dd)
+		return -EINVAL;
+
+	if (sysfs_create_file(kobj, &dev_attr_statistics.attr))
+		printk(KERN_ERR "%s: Error creating statistics sysfs attribute\n",
+					__func__);
+	if (sysfs_create_file(kobj, &dev_attr_registers.attr))
+		printk(KERN_ERR "%s: Error creating registers sysfs attribute\n",
+					__func__);
+	if (sysfs_create_file(kobj, &dev_attr_resptime.attr))
+		printk(KERN_ERR "%s: Error creating resptime sysfs attribute\n",
+					__func__);
+	if (sysfs_create_file(kobj, &dev_attr_make_it_fail.attr))
+		printk(KERN_ERR "%s: Error creating make_it_fail sysfs attribute\n",
+					__func__);
+	if (sysfs_create_file(kobj, &dev_attr_reset.attr))
+		printk(KERN_ERR "%s: Error creating reset sysfs attribute\n",
+					__func__);
+
+	return 0;
+}
+
+/**
+ * @brief Remove the sysfs related attributes.
+ *
+ * @param dd Pointer to the driver data structure.
+ * @param kobj Pointer to the kobj for the block device.
+ *
+ * @retval 0 Operation completed successfully.
+ * @retval -EINVAL Invalid parameter.
+ */
+int ahci_sysfs_exit(struct driver_data *dd, struct kobject *kobj)
+{
+	if (!kobj || !dd)
+		return -EINVAL;
+
+	sysfs_remove_file(kobj, &dev_attr_registers.attr);
+	sysfs_remove_file(kobj, &dev_attr_statistics.attr);
+	sysfs_remove_file(kobj, &dev_attr_resptime.attr);
+	sysfs_remove_file(kobj, &dev_attr_make_it_fail.attr);
+	sysfs_remove_file(kobj, &dev_attr_reset.attr);
+
+	return 0;
+}
+
+/**
+ * @brief Statistics timer.
+ *
+ * Triggered once per second to update the performance counters.
+ *
+ * @param data Pointer to the STATS structure.
+ *
+ * @return N/A
+ */
+static void stats_timeout(unsigned long int data)
+{
+	struct STATS *stats = (struct STATS *) data;
+
+	atomic_set(&stats->currentInts, atomic_read(&stats->interrupts));
+	atomic_set(&stats->interrupts, 0);
+
+	atomic_set(&stats->currentReads, atomic_read(&stats->reads));
+	atomic_set(&stats->reads, 0);
+
+	atomic_set(&stats->currentWrites, atomic_read(&stats->writes));
+	atomic_set(&stats->writes, 0);
+
+	atomic_set(
+	  &stats->currentIOPS,
+	  atomic_read(&stats->currentReads) + atomic_read(
+						&stats->currentWrites));
+
+	if (atomic_read(&stats->currentIOPS))
+		atomic_set(&stats->currentAvgRespTime,
+				100000000 / atomic_read(&stats->currentIOPS));
+	else
+		atomic_set(&stats->currentAvgRespTime, 0);
+
+	atomic_set(&stats->avgRespTime, 0);
+
+	mod_timer(&stats->timer, jiffies + msecs_to_jiffies(1000));
+}
+
+/**
+ * @brief Initialize the statistics counters.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return N/A
+ */
+static void stats_init(struct driver_data *dd)
+{
+	atomic_set(&dd->statistics.interrupts, 0);
+	atomic_set(&dd->statistics.reads, 0);
+	atomic_set(&dd->statistics.writes, 0);
+	atomic_set(&dd->statistics.avgRespTime, 0);
+	atomic_set(&dd->statistics.currentInts, 0);
+	atomic_set(&dd->statistics.currentReads, 0);
+	atomic_set(&dd->statistics.currentWrites, 0);
+	atomic_set(&dd->statistics.currentAvgRespTime, 0);
+	init_timer(&dd->statistics.timer);
+	dd->statistics.timer.data = (unsigned long int) &dd->statistics;
+	dd->statistics.timer.function = stats_timeout;
+}
+
+/**
+ * @brief Start statistics collection.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return N/A
+ */
+static void stats_start(struct driver_data *dd)
+{
+	/*
+	 * Start the statistics timer, once per second.
+	 */
+	mod_timer(&dd->statistics.timer, jiffies + msecs_to_jiffies(1000));
+}
+
+/**
+ * @brief Stop statistics collection.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return N/A
+ */
+static void stats_stop(struct driver_data *dd)
+{
+	/*
+	 * Stop the performance statistics timer.
+	 */
+	del_timer_sync(&dd->statistics.timer);
+}
+
+/**
+ * @brief Perform any one-time hardware setup
+ *
+ * Perform any hardware initialization steps that are needed
+ * at driver initialization time or when resuming from a
+ * suspended state.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return N/A
+ */
+void hba_setup(struct driver_data *dd)
+{
+	u32 hwdata;
+	hwdata = readl(dd->mmio + HOST_HSORG);
+
+	/* interrupt bug workaround: use only 1 IS bit.*/
+	writel(hwdata | HSORG_DISABLE_SLOTGRP_INTR|HSORG_DISABLE_SLOTGRP_PXIS,
+		 dd->mmio + HOST_HSORG);
+}
+
+/**
+ * @brief Detect the design and interface version.
+ *
+ * Detect the details of the product, and store anything needed
+ * into the driver data structure.  This includes product type and
+ * version and number of slot groups.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return N/A
+ */
+static void detect_product(struct driver_data *dd)
+{
+	u32 hwdata;
+	/* HBA base + 0xFC [15:0] - vendor-specific hardware interface
+	 * info register:
+	 * [15:8] hardware/software interface rev#
+	 * [   3] asic-style interface
+	 * [ 2:0] number of slot groups, minus 1 (only valid for asic-style).
+	 */
+	hwdata = readl(dd->mmio + HOST_HSORG);
+
+	dd->product_type = PRODUCT_UNKNOWN;
+
+	if ((hwdata & HSORG_STYLE) == 0) {
+		printk(KERN_INFO "Detected an old FPGA design. Assuming 4 slot groups, 128 slots.\n");
+		dd->product_type = PRODUCT_OLDFPGA;
+		dd->slot_groups = 4;
+	} else if (hwdata & 0x8) {
+		unsigned int rev, slotgroups;
+
+		dd->product_type = PRODUCT_ASICFPGA;
+		rev = hwdata & HSORG_HWREV >> 4;
+		slotgroups = (hwdata & HSORG_SLOTGROUPS) + 1;
+		printk(KERN_INFO "ASIC-FPGA design, HS rev 0x%x, %i slot groups, %i slots\n",
+				 rev,
+				 slotgroups,
+				 slotgroups*32);
+
+		if (slotgroups > MAX_SLOT_GROUPS) {
+			printk(KERN_WARNING "Warning: driver only supports %i slot groups.\n",
+						 MAX_SLOT_GROUPS);
+			slotgroups = MAX_SLOT_GROUPS;
+		}
+		dd->slot_groups = slotgroups;
+	}
+}
+
+/**
+ * @brief Called once for each Cyclone AHCI device.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return 0 on success, else an error code.
+ */
+int ahci_init(struct driver_data *dd)
+{
+	int i;
+	int rv;
+	unsigned int num_command_slots;
+
+	dd->mmio = pcim_iomap_table(dd->pdev)[MTIPX2XX_ABAR];
+
+	detect_product(dd);
+	if (dd->product_type == PRODUCT_UNKNOWN) {
+		rv = -EIO;
+		goto out1;
+	}
+	num_command_slots = dd->slot_groups * 32;
+
+	hba_setup(dd);
+
+	stats_init(dd);
+
+	/*
+	 * Initialize the spin lock for accessing the SActive register and
+	 * the port->active variable.
+	 */
+	spin_lock_init(&dd->makeItFailLock);
+	init_rwsem(&dd->internalSem);
+
+#if USE_TASKLET
+	tasklet_init(&dd->tasklet, tasklet_proc, (unsigned long)dd);
+#endif
+
+	/*
+	 * Allocate memory for the port structures. Use vmalloc rather than
+	 * kmalloc since we don't need this memory to be physically contiguous.
+	 */
+	dd->port = vmalloc(sizeof(struct port));
+
+	if (dd->port == NULL) {
+		printk(KERN_ERR "Unable to allocate memory for port structure\n");
+		return -ENOMEM;
+	}
+	memset(dd->port, 0, sizeof(struct port));
+
+	sema_init(&dd->port->commandSlot, num_command_slots - 1);
+	spin_lock_init(&dd->port->cmdIssueLock);
+
+	/*
+	 * Set the port mmio base address.
+	 */
+	dd->port->mmio	= dd->mmio + PORT_OFFSET;
+	dd->port->dd	= dd;
+
+	/*
+	 * Allocate memory for the command list.
+	 */
+	dd->port->commandList = dmam_alloc_coherent(&dd->pdev->dev,
+				AHCI_PORT_PRIV_DMA_SZ + (ATA_SECT_SIZE * 2),
+				&dd->port->commandListDMA,
+				GFP_KERNEL);
+	if (dd->port->commandList == NULL) {
+		printk(KERN_ERR "Cannot allocate memory for AHCI structures!\n");
+		rv = -ENOMEM;
+		goto out1;
+	}
+	/*
+	 * Clear the memory we have allocated.
+	 */
+	memset(dd->port->commandList,
+				0,
+				AHCI_PORT_PRIV_DMA_SZ + (ATA_SECT_SIZE * 2));
+
+	/*
+	 * Setup the addresse of the RX FIS.
+	 */
+	dd->port->rxFIS		= dd->port->commandList + AHCI_CMD_SLOT_SZ;
+	dd->port->rxFISDMA	= dd->port->commandListDMA + AHCI_CMD_SLOT_SZ;
+
+	/*
+	 * Setup the address of the command tables.
+	 */
+	dd->port->commandTbl	= dd->port->rxFIS + AHCI_RX_FIS_SZ;
+	dd->port->commandTblDMA	= dd->port->rxFISDMA + AHCI_RX_FIS_SZ;
+
+	/*
+	 * Setup the address of the identify data.
+	 */
+	dd->port->identify	= dd->port->commandTbl + AHCI_CMD_TBL_AR_SZ;
+	dd->port->identifyDMA	= dd->port->commandTblDMA + AHCI_CMD_TBL_AR_SZ;
+
+	/*
+	 * Setup the address of the sector buffer.
+	 */
+	dd->port->sectorBuffer	= (void *) dd->port->identify + ATA_SECT_SIZE;
+	dd->port->sectorBufferDMA = dd->port->identifyDMA + ATA_SECT_SIZE;
+
+	/*
+	 * Point the command headers at the command tables.
+	 */
+	for (i = 0; i < num_command_slots; i++) {
+		dd->port->commands[i].commandHeader =
+					dd->port->commandList +
+					(sizeof(struct COMMAND_HDR) * i);
+		dd->port->commands[i].commandHeaderDMA =
+					dd->port->commandListDMA +
+					(sizeof(struct COMMAND_HDR) * i);
+
+		dd->port->commands[i].command = dd->port->commandTbl +
+							(AHCI_CMD_TBL_SZ * i);
+		dd->port->commands[i].commandDMA = dd->port->commandTblDMA +
+							(AHCI_CMD_TBL_SZ * i);
+
+		if (readl(dd->mmio + HBA_CAPS) & HOST_CAP_64)
+			dd->port->commands[i].commandHeader->ctbau =
+			cpu_to_le32(
+			(dd->port->commands[i].commandDMA >> 16) >> 16);
+		dd->port->commands[i].commandHeader->ctba = cpu_to_le32(
+			dd->port->commands[i].commandDMA & 0xffffffff);
+
+		/*
+		 * If this is not done a bug is reported by the stock FC11 i386.
+		 * Due to the fact that it has lots of kernel debugging enabled.
+		 */
+		sg_init_table(dd->port->commands[i].sg, MAX_SG);
+		/* Mark all commands as currently inactive.*/
+		atomic_set(&dd->port->commands[i].active, 0);
+	}
+
+	/*
+	 * Setup the pointers to the extended SActive and CI registers.
+	 */
+	if (dd->product_type == PRODUCT_ASICFPGA) {
+		for (i = 0; i < dd->slot_groups; i++) {
+			dd->port->SActive[i] =
+				dd->port->mmio + i*0x80 + PORT_SACTIVE;
+			dd->port->CommandIssue[i] =
+				dd->port->mmio + i*0x80 + PORT_COMMAND_ISSUE;
+			dd->port->Completed[i] =
+				dd->port->mmio + i*0x80 + PORT_SDBV;
+		}
+	} else if (dd->product_type == PRODUCT_OLDFPGA) {
+		dd->port->SActive[0]	= dd->port->mmio + PORT_SACTIVE;
+		dd->port->CommandIssue[0] =
+				dd->port->mmio + PORT_COMMAND_ISSUE;
+		for (i = 1; i < dd->slot_groups; i++) {
+			dd->port->SActive[i] = dd->mmio + 0xa0 + (8 * (i-1));
+			dd->port->CommandIssue[i] =
+					dd->mmio + 0xa4 + (8 * (i-1));
+		}
+		/* Set up pointers to the Completed registers.*/
+		for (i = 0; i < dd->slot_groups; i++)
+			dd->port->Completed[i] = dd->mmio + 0xd8 + (4 * i);
+
+	}
+
+	/*
+	* Reset the HBA.
+	*/
+	if (hba_reset(dd) < 0) {
+		printk(KERN_ERR "HBA did not reset within timeout\n");
+		rv = -EIO;
+		goto out2;
+	}
+
+	init_port(dd->port);
+	start_port(dd->port);
+
+	/*
+	 * Setup the ISR and enable interrupts.
+	 */
+	rv = devm_request_irq(&dd->pdev->dev,
+				dd->pdev->irq,
+				irq_handler,
+				IRQF_SHARED,
+				dev_driver_string(&dd->pdev->dev),
+				dd);
+
+	if (rv) {
+		printk(KERN_ERR "Unable to allocate IRQ %d\n", dd->pdev->irq);
+		goto out2;
+	}
+
+	/*
+	 * Enable interrupts on the HBA.
+	 */
+	writel(readl(dd->mmio + HOST_CTRL) | HOST_IRQ_EN, dd->mmio + HOST_CTRL);
+
+#ifdef COMMAND_TIMEOUT
+	init_timer(&dd->port->commandTimer);
+	dd->port->commandTimer.data = (unsigned long int) dd->port;
+	dd->port->commandTimer.function = timeout_function;
+	mod_timer(&dd->port->commandTimer,
+			jiffies + msecs_to_jiffies(TIMEOUT_CHECK_PERIOD));
+#endif
+
+/*	restart_port(dd->port);
+	softwareReset(dd->port);
+
+	memset(dd->port->sectorBuffer, 0, 512);
+	read_logpage(dd->port, ATA_LOG_SATA_NCQ, dd->port->sectorBufferDMA, 1);
+	dump_buffer(dd->port->sectorBuffer, 512);
+	read_logpage(dd->port, 0, dd->port->sectorBufferDMA, 1);
+*/
+
+	get_identify(dd->port, NULL);
+	dump_identify(dd->port);
+
+	/*
+	 * Bit 15 of this register needs to be cleared to
+	 * enable 128 command slots in some hardware versions.
+	 */
+	/*FIXME: is this still required?*/
+	if (dd->product_type == PRODUCT_OLDFPGA)
+		writel(0x0, dd->mmio + 0xfc);
+
+
+	stats_start(dd);
+
+	return rv;
+
+/*out3:   label currently unused, but want to preserve the code
+ * in case of future need.*/
+#ifdef COMMAND_TIMEOUT
+	del_timer_sync(&dd->port->commandTimer);
+#endif
+	/*
+	 * Disable interrupts on the HBA.
+	 */
+	writel(readl(dd->mmio + HOST_CTRL) & ~HOST_IRQ_EN,
+			dd->mmio + HOST_CTRL);
+
+	/*
+	 * Release the IRQ.
+	 */
+	devm_free_irq(&dd->pdev->dev, dd->pdev->irq, dd);
+
+out2:
+	deinit_port(dd->port);
+
+	/*
+	 * Free the command/command header memory.
+	 */
+	dmam_free_coherent(&dd->pdev->dev,
+				AHCI_PORT_PRIV_DMA_SZ + (ATA_SECT_SIZE * 2),
+				dd->port->commandList,
+				dd->port->commandListDMA);
+out1:
+	/*
+	 * Free the memory allocated for the for structure.
+	 */
+	vfree(dd->port);
+
+	return rv;
+}
+
+/**
+ * @brief Called to deinitialize an AHCI interface.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return This function always returns 0.
+ */
+int ahci_exit(struct driver_data *dd)
+{
+	stats_stop(dd);
+
+	/*
+	 * Send standby immediate (E0h) to the drive so that it
+	 * saves its state.
+	 */
+	if (atomic_read(&dd->drv_cleanup_done) != true) {
+
+			standby_immediate(dd->port);
+
+			/*
+			* de-initialize the port.
+			*/
+			deinit_port(dd->port);
+
+			/*
+			* Disable interrupts on the HBA.
+			*/
+			writel(readl(dd->mmio + HOST_CTRL) & ~HOST_IRQ_EN,
+					dd->mmio + HOST_CTRL);
+		}
+
+#ifdef COMMAND_TIMEOUT
+	del_timer_sync(&dd->port->commandTimer);
+#endif
+
+#if USE_TASKLET
+	/*
+	 * Stop the bottom half tasklet.
+	 */
+	tasklet_kill(&dd->tasklet);
+#endif
+
+	/*
+	 * Release the IRQ.
+	 */
+	devm_free_irq(&dd->pdev->dev, dd->pdev->irq, dd);
+
+	/* Workaround for Fedora 14 crash:
+	* Must have delay between free_irq and driver exit.
+	*/
+	msleep(100);
+
+	/*
+	 * Free the command/command header memory.
+	 */
+	dmam_free_coherent(&dd->pdev->dev,
+			AHCI_PORT_PRIV_DMA_SZ + (ATA_SECT_SIZE * 2),
+			dd->port->commandList,
+			dd->port->commandListDMA);
+	/*
+	 * Free the memory allocated for the for structure.
+	 */
+	vfree(dd->port);
+
+	return 0;
+}
+
+/**
+ * @brief Issue a Standby Immediate command to the device.
+ *
+ * This function is called by the Block Layer just before the
+ * system powers off during a shutdown.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return This function always returns 0.
+ */
+int ahci_shutdown(struct driver_data *dd)
+{
+	/*
+	 * Send standby immediate (E0h) to the drive so that it
+	 * saves its state.
+	 */
+	standby_immediate(dd->port);
+
+	return 0;
+}
+
+int ahci_suspend(struct driver_data *dd)
+{
+		/* Send standby immediate (E0h) to the drive
+		 *  so that it saves its state.*/
+		if (standby_immediate(dd->port) != 0) {
+			printk(KERN_ERR "Failed to send standby-immediate command\n");
+			return FAILURE;
+		}
+
+		/* Disable interrupts on the HBA.*/
+		writel(readl(dd->mmio + HOST_CTRL) & ~HOST_IRQ_EN,
+				dd->mmio + HOST_CTRL);
+		deinit_port(dd->port);
+
+		return SUCCESS;
+}
+
+int ahci_resume(struct driver_data *dd)
+{
+		/*Perform any needed hardware setup steps*/
+		hba_setup(dd);
+
+		/* Reset the HBA*/
+		if (hba_reset(dd) != 0) {
+			printk(KERN_ERR "Unable to reset the HBA\n");
+			return FAILURE;
+		}
+
+		/* Enable the port, the DMA engine and FIS reception specific
+		* h/w in controller.
+		*/
+		init_port(dd->port);
+		start_port(dd->port);
+
+		/* Enable interrupts on the HBA.*/
+		writel(readl(dd->mmio + HOST_CTRL) | HOST_IRQ_EN,
+				dd->mmio + HOST_CTRL);
+
+		return SUCCESS;
+}
+
+/* * This function command_cleanup is called for clean the pending
+ * command in the command slot during the surprise removal of device
+ * and return error to the upper layer.
+ *
+ *
+ * @param dd Pointer to the DRIVER_DATA structure.
+ *
+ *
+ * @return N/A
+ */
+
+void command_cleanup(struct driver_data *dd)
+{
+		int Group = 0, commandslot = 0, commandindex = 0;
+		struct COMMAND *command;
+		struct HOST_TO_DEV_FIS *fis;
+		struct port *port = dd->port;
+
+		for (Group = 0; Group < 4; Group++) {
+			for (commandslot = 0; commandslot < 32; commandslot++) {
+				if (
+				(port->allocated[Group] << commandslot) == 1) {
+					commandindex =
+					Group  << 5 | commandslot;
+					command =
+					&port->commands[commandindex];
+					if (atomic_read(
+					&command->active)
+					&& (command->asyncCallback)
+					)
+						command->asyncCallback(
+						command->asyncData,
+						ENODEV);
+					fis =
+					(struct HOST_TO_DEV_FIS *)
+					command->command;
+					if (fis->command == ATA_CMD_FPDMA_WRITE)
+						dma_unmap_sg(
+						&port->dd->pdev->dev,
+						command->sg,
+						command->scatterEnts,
+						DMA_TO_DEVICE);
+					else
+						dma_unmap_sg(
+							&port->dd->pdev->dev,
+							command->sg,
+							command->scatterEnts,
+							DMA_FROM_DEVICE);
+
+				}
+			}
+		}
+
+
+		up(&port->commandSlot);
+
+		/*
+		* Set the atomic variable as 1 in case of SRSI
+		*/
+		atomic_set(&dd->drv_cleanup_done, true);
+}
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/block.c linux-2.6.38-asai/drivers/block/mtipx2xx/block.c
--- linux-2.6.38/drivers/block/mtipx2xx/block.c	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/block.c	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,584 @@
+/*****************************************************************************
+ *
+ * block.c - Handles the block layer of the Cyclone SSD Block Driver
+ *   Copyright (C) 2009  Integrated Device Technology, Inc.
+ *
+ *  This file is part of the Cyclone SSD Block Driver, it is free software:
+ *  you can redistribute it and/or modify it under the terms of the GNU
+ *  General Public License as published by the Free Software Foundation;
+ *  either version 2 of the License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ *  MA  02110-1301, USA.
+ *
+ *  You can contact Integrated Device Technology, Inc
+ *  via email ssdhelp@idt.com or mail
+ *  Integrated Device Technology, Inc
+ *  6024 Silver Creek Valley Road, San Jose, CA 95128, USA
+ *
+ ****************************************************************************/
+#include <linux/fs.h>
+#include <linux/genhd.h>
+#include <linux/hdreg.h>
+#include <linux/blkdev.h>
+#include <linux/bio.h>
+#include <linux/dma-mapping.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include "mtipx2xx.h"
+
+/**
+ * This layer will manage the allocation and registration of block devices,
+ * request queuing, and the inter-face into the kernel block layer.
+ * The block layer will also be required to call into the protocol layer.
+ * Since multiple protocols are supported, the block layer uses a structure
+ * called PROTOCOL_FUNCS, which is an array of structures that contain
+ * pointers to functions exported by the protocol layers. An example of the
+ * PROTOCOL_FUNCS table is shown below.
+ * @verbatim
+static PROTOCOL_FUNCS protocol[] = {
+	{
+		.init = ahci_init,
+		.exit = ahci_exit,
+#ifdef CONFIG_PM
+		.suspend = ahci_suspend,
+		.resume = ahci_resume,
+#endif
+		.shutdown = ahci_shutdown,
+		.read = ahci_read,
+		.write = ahci_write,
+		.hwBlkSize = ahci_hard_blksize,
+		.getCapacity = ahci_get_capacity,
+		.getScatterList = ahci_get_scatterList,
+		.ioctl = ahci_ioctl,
+		.sysfsInit = ahci_sysfs_init,
+		.sysfsExit = ahci_sysfs_exit,
+	},
+	{}
+};
+@endverbatim
+ * In the above example only one protocol is supported, AHCI.
+ * The block layer accesses the appropriate array element in the table by
+ * using the protocol index value that was defined in the pci_device_id table
+ * and passed to the block layer in the driver private data structure.
+ * This allows the same Block Layer source code to interface to multiple
+ * protocol layers based on the PCI vendor/device ID of the device.
+ *
+ * All Cyclone devices will utilize the same major device number, allocated by
+ * the Module Layer, and will support 16 minor device numbers. This will allow
+ * each Cyclone device to have an entry under /dev for the "full device" and 15
+ * partitions.
+ * For example, the first Cyclone device detected will have a major/min
+ * device number of 0,0. The first partition on this drive will have a
+ * major/minor device number of 0,1, and so on. The second Cyclone device will
+ * have the same major device number but its minor device number will be 16,
+ * the third Cyclone device will have a minor device number of 32, etc. Cyclone
+ * devices are registered with the kernel block layer as "cyclone",
+ * consequently they appear under /dev as cyclonex where x is a letter a-z
+ * which is assigned in the order that the drives are detected.
+ * Partitions are referenced as /dev/cyclonexn, where n is the
+ * partition number 1-15.
+ *
+ * The Kernel block layer requires that the following functions be implemented
+ * by the block device driver.
+ *
+ * - open() - Called when the block device is opened, i.e. mounted, etc.
+ * This function will perform minimal initialization since there is really
+ * nothing to do when the device is opened.
+ *
+ * - release() - Called when the block device is closed.
+ *
+ * - ioctl() - Called by the kernel block layer when it receives an IOCTL
+ * request that it does not understand. This will be the mechanism by which
+ * the SMART information will be obtained from the device and how vendor
+ * specific commands will be issued to the device.
+ *
+ * - make_request() - Called by the kernel to process a BIO structure.
+ * This function is the main guts of the block layer. It receives BIO requests
+ * from the kernel, processes them, and then issues read or writes to the
+ * protocol layer. Upon completion of a request by the protocol layer it will
+ * callback into the block layer to complete the transfer. This is the only
+ * acceptable manner in which a driver layer may call a function that belongs
+ * to a higher layer. It is also worth noting that this interface will bypass
+ * the I/O scheduler. No merging or reordering of the BIO's will be performed.
+ * See the section titled BIO Processing for detaile information on exactly
+ * how a BIO is handled.
+ *
+ * Additionally, the PCI layer requires that the following management functions
+ * be implemented in the drivers block layer.
+ *
+ * - block_initialize() - Called by the PCI layer to initialize the block device
+ * and layer. This function should first call the Protocol Layer initialization
+ * function. Once that is done, this function will allocate a block request
+ * queue, inform the kernel of the hardware block size, allocate a Linux disk
+ * structure, inform the kernel of the drives capacity, and finally add the new
+ * disk to the system. The hardware block size and drive capacity are obtained
+ * by calling the hwBlkSize() and getCapacity() functions in the Protocol Layer,
+ * respectively.
+ *
+ * - block_remove() - Called by the PCI layer to de-initialize the block device
+ * and layer. This function must remove the block device from the system before
+ * calling the Protocol Layer exit() function.
+ *
+ * - block_shutdown() - Called by the PCI layer just before the system halts
+ * during a shutdown. This function should call the protocol layers shutdown()
+ * function to sync the device before the system is powered off.
+ *
+ * - block_suspend() - Called by the PCI layer during hibernation/suspend of the
+ * system. This function should call the protocol layers suspend() function to
+ * change the power state of the device
+ *
+ * - block_resume() - Called by the PCI layer while resuming from
+ * hiberation/suspend of the system. This function should call the protocol
+ * layers resume() function to change the power state of the device.
+ */
+static struct  PROTOCOL_FUNCS	protocol[] = {
+		{
+				.init = ahci_init,
+				.exit = ahci_exit,
+				.suspend = ahci_suspend,
+				.resume = ahci_resume,
+				.shutdown = ahci_shutdown,
+				.read = ahci_read,
+				.write = ahci_write,
+				.hwBlkSize = ahci_hard_blksize,
+				.getCapacity = ahci_get_capacity,
+				.getScatterList = ahci_get_scatterlist,
+				.ioctl = ahci_ioctl,
+				.sysfsInit = ahci_sysfs_init,
+				.sysfsExit = ahci_sysfs_exit,
+		},
+		{}	/* Terminate the list*/
+};
+
+/**
+ * @brief Block layer IOCTL handler.
+ *
+ * @param dev Pointer to the block_device structure.
+ * @param mode ????
+ * @param cmd IOCTL command passed from the user application.
+ * @param arg Argument passed from the user application.
+ *
+ * @retval 0 IOCTL completed successfully.
+ * @retval -ENOTTY IOCTL not supported or invalid driver data structure pointer.
+ */
+static int block_ioctl(struct block_device *dev,
+			fmode_t mode,
+			unsigned cmd,
+			unsigned long arg)
+{
+	struct driver_data *dd = dev->bd_disk->private_data;
+	int rv = 0;
+
+	if (!dd)
+		return -ENOTTY;
+
+	switch (cmd) {
+	case BLKFLSBUF: /* Flush the device buffers, if it has any. */
+		printk(KERN_INFO "%s: Received BLKFLSBUF\n", __func__);
+		break;
+	default:
+		rv = protocol[dd->protocol].ioctl(dd, cmd, arg);
+	}
+	return rv;
+}
+
+/**
+ * @brief Obtain the geometry of the device.
+ *
+ * You may think that this function is obsolete, but some applications,
+ * fdisk for example still used CHS values. This function describes the
+ * device as having 224 heads and 56 sectors per cylinder. These values are
+ * chosen so that each cylinder is aligned on a 4KB boundary. Since a partition
+ * is described in terms of a start and end cylinder this means that each
+ * partition is also 4KB aligned. Non-aligned partitions adversely affects
+ * performance.
+ *
+ * @param dev Pointer to the block_device strucutre.
+ * @param geo Pointer to a hd_geometry structure.
+ *
+ * @retval 0 Operation completed successfully.
+ * @retval -ENOTTY An error occurred while reading the drive capacity.
+ */
+static int block_getgeo(struct block_device *dev, struct hd_geometry *geo)
+{
+	struct driver_data *dd = dev->bd_disk->private_data;
+	sector_t capacity;
+	capacity = protocol[dd->protocol].getCapacity(dd);
+	if (capacity == (sector_t) -1) {
+		printk(KERN_ERR "%s: Could not get drive capacity.\n",
+					__func__);
+		return -ENOTTY;
+	}
+
+	geo->heads = 224;
+	geo->sectors = 56;
+#if BITS_PER_LONG == 64
+	geo->cylinders = capacity / (geo->heads*geo->sectors);
+#else
+	do_div(capacity, (geo->heads*geo->sectors));
+	geo->cylinders = capacity;
+#endif
+	return 0;
+}
+
+/**
+ * @brief Block device operation function.
+ *
+ * This structure contains pointers to the functions required by the block
+ * layer.
+ */
+static const struct block_device_operations blockOps = {
+		/**
+		 * Called to process an IOCTL for the block device.
+		 */
+		.ioctl		= block_ioctl,
+		/**
+		 * Called to obtain the drives geometry.
+		 */
+		.getgeo		= block_getgeo,
+		/**
+		 * Owner of this structure.
+		 */
+		.owner		= THIS_MODULE
+};
+
+/**
+ * @brief BIO completion function.
+ *
+ * This function is called by the protocol layer to complete a BIO transfer.
+ * A Pointer to this function is passed into the protocol layer read/write
+ * functions as the completion callback.
+ *
+ * @param bio Pointer to the BIO that has completed.
+ * @param status Completion status, 0 = success, non-zero = error.
+ *
+ * @return This function always returns 0.
+ */
+static int complete_bio(struct bio *bio, int status)
+{
+	bio_endio(bio, status);
+	return 0;
+}
+
+
+#define bio_rw_flagged(bio, flag) ((bio)->bi_rw & flag)
+
+/**
+ * @brief Block layer make request function.
+ *
+ * This function is called by the kernel to process a BIO for
+ * the Cyclone device.
+ *
+ * @param queue Pointer to the request queue. Unused other than to obtain the driver data structure.
+ * @param bio Pointer to the BIO.
+ *
+ * @return This function always returns 0.
+ */
+static int make_request(struct request_queue *queue, struct bio *bio)
+{
+	struct driver_data *dd = queue->queuedata;
+	struct scatterlist *sg;
+	struct bio_vec *bvec;
+	int nents = 0;
+	int tag = 0;
+	if (unlikely(!bio_has_data(bio))) {
+		blk_queue_flush(queue, 0);
+		bio_endio(bio, 0);
+		return 0;
+	}
+
+	sg = protocol[dd->protocol].getScatterList(dd, &tag);
+	if (likely(sg != NULL)) {
+		blk_queue_bounce(queue, &bio);
+
+		if (unlikely((bio)->bi_vcnt > MAX_SG)) {
+			printk(KERN_WARNING "Maximum number of scatter list entries exceeded\n");
+			bio_io_error(bio);
+			return 0;
+		}
+
+		/*
+		 * Create the scatter list for this bio.
+		 */
+		bio_for_each_segment(bvec, bio, nents)
+		{
+			sg_set_page(&sg[nents],
+					bvec->bv_page,
+					bvec->bv_len,
+					bvec->bv_offset);
+		}
+
+		/*
+		 * Issue the read or write to the protocol layer.
+		 */
+		if (bio_data_dir(bio) == WRITE) {
+			protocol[dd->protocol].write(dd,/* Driver data */
+					/* Start sector */
+						bio->bi_sector,
+					/* # sectors to write */
+						bio_sectors(bio),
+					/*entries in the scatterlist*/
+						nents,
+					/*tag obtained with scatterlist*/
+						tag,
+					/* Blocklayer completion routineo*/
+						complete_bio,
+					/*Blocklayer completion data*/
+						bio,
+					/*Write must be flushed to media*/
+						bio_rw_flagged(bio, REQ_FLUSH));
+		} else {
+			protocol[dd->protocol].read(dd,	/* Driver data */
+					/* Start sector */
+						bio->bi_sector,
+					/* # sectors to read */
+						bio_sectors(bio),
+					/* # entries in the scatter list */
+						nents,
+					/* tag obtained with scatter list */
+						tag,
+					/* Block layer completion routine */
+						complete_bio,
+					/*Block layer completion data*/
+						bio,
+					/*Read must be from physicalmedia*/
+						bio_rw_flagged(bio, REQ_FLUSH));
+		}
+	} else {
+			bio_io_error(bio);
+		/*printk(KERN_ERR "%s: Interrupted\n", __func__);*/
+	}
+
+	return 0;
+}
+
+/**
+ * @brief Block layer initialization function.
+ *
+ * This function is called once by the PCI layer for each Cyclone
+ * device that is connected to the system.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return 0 on success else an error code.
+ */
+int block_initialize(struct driver_data *dd)
+{
+	int rv = 0;
+	sector_t capacity;
+
+	/*
+	 * Initialize the protocol layer.
+	 */
+	rv = protocol[dd->protocol].init(dd);
+	if (rv  < 0) {
+		printk(KERN_ERR "protocol layer initialization failed\n");
+		rv = -EINVAL;
+		goto protocol_init_error;
+	}
+
+	/*
+	 * Allocate the request queue.
+	 */
+	dd->queue = blk_alloc_queue(GFP_KERNEL);
+	if (dd->queue == NULL) {
+		printk(KERN_ERR "Unable to allocate request queue\n");
+		rv = -ENOMEM;
+		goto block_queue_alloc_init_error;
+	}
+
+	/*
+	 * Attach our request function to the request queue.
+	 */
+	blk_queue_make_request(dd->queue, make_request);
+
+	/* Set device limits.
+	* Even settings normally directed at the elevator are assigned,
+	* because they are exported to userspace via sysfs and might
+	* be used for tuning.
+	*/
+
+	set_bit(QUEUE_FLAG_NOMERGES, &dd->queue->queue_flags);
+	set_bit(QUEUE_FLAG_NONROT, &dd->queue->queue_flags);
+
+
+	/*set_bit(QUEUE_FLAG_SAME_COMP, &dd->queue->queue_flags);
+	 * fixme: should we use this? */
+	/*set_bit(QUEUE_FLAG_ELVSWITCH, &dd->queue->queue_flags);
+	 * fixme: should we use this? */
+
+	blk_queue_max_segments(dd->queue, MAX_SG);
+
+	/*blk_queue_max_hw_sectors(dd->queue, 1<<20);
+	 * fixme: figure out what the hardware's max sector size is.*/
+
+	blk_queue_physical_block_size(dd->queue, 4096);
+	/*blk_queue_alignment_offset(dd->queue, 0);   // default is 0.*/
+	blk_queue_io_min(dd->queue, 4096);          /*  sysfs/minimum_io_size*/
+	/* not sure yet.  sysfs/optimal_io_size*/
+	/*blk_queue_io_opt(dd->queue, 4096);*/
+
+	/* no discard support yet.*/
+	/*set_bit(QUEUE_FLAG_DISCARD, &dd->queue->queue_flags);
+	*dd->queue->limits.discard_granularity=4096;
+	*not sure yet.  1<<22 is 1<<31 in sectors.
+	*blk_queue_max_discard_sectors(dd->queue, 1<<22);
+	*dd->queue->limits.discard_zeroes_data=0; // ?
+	*/
+	dd->disk = alloc_disk(MAX_MINORS);
+	if (dd->disk  == NULL) {
+		printk(KERN_ERR "Unable to allocate gendisk structure\n");
+		rv = -EINVAL;
+		goto alloc_disk_error;
+	}
+
+	dd->disk->driverfs_dev	= &dd->pdev->dev;
+	dd->disk->major		= dd->major;
+	dd->disk->first_minor	= dd->instance * MAX_MINORS;
+	dd->disk->fops		= &blockOps;
+	dd->disk->queue		= dd->queue;
+	dd->disk->private_data	= dd;
+	dd->queue->queuedata	= dd;
+
+	/*
+	 * Create the name that will appear in /dev
+	 */
+	snprintf(dd->disk->disk_name, 32, "rssd%c", dd->instance + 'a');
+
+	/*
+	 * Set the capacity of the device in 512 byte sectors.
+	 */
+	capacity = protocol[dd->protocol].getCapacity(dd);
+	if (capacity == (sector_t) -1) {
+		printk(KERN_ERR "Could not read drive capacity\n");
+		rv = -EIO;
+		goto read_capacity_error;
+	}
+	set_capacity(dd->disk, capacity);
+
+	/*
+	 * Enable the block device and add it to /dev
+	 */
+	add_disk(dd->disk);
+
+	/*
+	 * Now that the disk is active, initialize any sysfs attributes
+	 * managed by the protocol layer.
+	 */
+	if (protocol[dd->protocol].sysfsInit) {
+		struct kobject *kobj =
+			kobject_get(&disk_to_dev(dd->disk)->kobj);
+		protocol[dd->protocol].sysfsInit(dd, kobj);
+		kobject_put(kobj);
+	}
+
+	return rv;
+
+read_capacity_error:
+	/*
+	 * Delete our gendisk structure. This also removes the device
+	 * from /dev
+	 */
+	del_gendisk(dd->disk);
+
+alloc_disk_error:
+
+	blk_cleanup_queue(dd->queue);
+
+block_queue_alloc_init_error:
+	/*
+	 * De-initialize the protocol layer.
+	 */
+	protocol[dd->protocol].exit(dd);
+protocol_init_error:
+	return rv;
+}
+
+/**
+ * @brief Block layer deinitialization function.
+ *
+ * Called by the PCI layer as each Cyclone device is removed.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return This function always returns 0.
+ */
+int block_remove(struct driver_data *dd)
+{
+
+	/*
+	 * Clean up the sysfs attributes managed by the protocol layer.
+	 */
+	if (protocol[dd->protocol].sysfsExit) {
+		struct kobject *kobj =
+			kobject_get(&disk_to_dev(dd->disk)->kobj);
+		protocol[dd->protocol].sysfsExit(dd, kobj);
+		kobject_put(kobj);
+	}
+
+	/*
+	 * Delete our gendisk structure. This also removes the device
+	 * from /dev
+	 */
+	del_gendisk(dd->disk);
+	blk_cleanup_queue(dd->queue);
+	dd->disk  = NULL;
+	dd->queue = NULL;
+
+	/*
+	 * De-initialize the protocol layer.
+	 */
+	protocol[dd->protocol].exit(dd);
+
+	return 0;
+}
+/**
+ * @brief Function called by the PCI layer when just before the
+ * machine shuts down.
+ *
+ * If a protocol layer shutdown function is present it will be called
+ * by this function.
+ *
+ * @param dd Pointer to the driver data structure.
+ *
+ * @return This function always returns 0.
+ */
+int block_shutdown(struct driver_data *dd)
+{
+	printk(KERN_NOTICE "Shutting down %s\n", dd->disk->disk_name);
+
+	/* Delete our gendisk structure, and cleanup the blk queue. */
+	del_gendisk(dd->disk);
+	blk_cleanup_queue(dd->queue);
+	dd->disk  = NULL;
+	dd->queue = NULL;
+
+	if (protocol[dd->protocol].shutdown)
+		protocol[dd->protocol].shutdown(dd);
+	return 0;
+}
+
+int block_suspend(struct driver_data *dd)
+{
+	printk(KERN_NOTICE "Suspending %s\n", dd->disk->disk_name);
+	if (protocol[dd->protocol].suspend)
+		protocol[dd->protocol].suspend(dd);
+	return 0;
+}
+
+int block_resume(struct driver_data *dd)
+{
+	printk(KERN_NOTICE "Resuming %s\n", dd->disk->disk_name);
+	if (protocol[dd->protocol].resume)
+		protocol[dd->protocol].resume(dd);
+	return 0;
+}
+
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/module.c linux-2.6.38-asai/drivers/block/mtipx2xx/module.c
--- linux-2.6.38/drivers/block/mtipx2xx/module.c	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/module.c	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,149 @@
+/*****************************************************************************
+ *
+ * module.c - Handles the module layer of the Cyclone SSD Block Driver
+ *   Copyright (C) 2009  Integrated Device Technology, Inc.
+ *
+ *  This file is part of the Cyclone SSD Block Driver, it is free software:
+ *  you can redistribute it and/or modify it under the terms of the GNU
+ *  General Public License as published by the Free Software Foundation;
+ *  either version 2 of the License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 51 Franklin Street, Fifth Floor,
+ *  Boston, MA  02110-1301, USA.
+ *
+ *  You can contact Integrated Device Technology, Inc
+ *  via email ssdhelp@idt.com or mail
+ *  Integrated Device Technology, Inc
+ *  6024 Silver Creek Valley Road, San Jose, CA 95128, USA
+ *
+ ****************************************************************************/
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/fs.h>
+#include <linux/pci.h>
+#include "mtipx2xx.h"
+
+/**
+ * @file
+ * The module layer manages the loading and unloading of the driver module.
+ * Primarily this comprises of the module init and exit functions, prototypes
+ * for which are shown below.
+ *
+ * static int __init cyclone_init(void);@n
+ * static void __exit cyclone_exit(void);
+ *
+ * These functions are declared as the module entry and exit points, called
+ * when the module is loaded or unloaded by the following Linux macros.
+ *
+ * module_init(cyclone_init);@n
+ * module_exit(cyclone_exit);
+ *
+ * When the module is loaded, cyclone_init() will allocate a new major device
+ * number for the Cyclone devices, this should be evident by an entry in
+ * /proc/devices called "cyclone".
+ * This major device number will be used for all Cyclone devices connected
+ * to the system. Assuming a new major device number was allocated,
+ * cyclone_init() will then register the PCI driver by
+ * calling pci_register_driver() passing the address of the cyclone_pci_driver
+ * structure which is declared by the PCI layer.
+ *
+ * When the module is unloaded, cyclone_exit() is called to undo the work
+ * done by cyclone_init(). So, basically freeing the major device number
+ * and unregistering the PCI driver.
+ */
+
+/** @mainpage Cyclone Linux Block Device Driver
+ *
+ * @section Introduction
+ * The Cyclone SSD Linux Block Device Driver is a high performance low latency
+ * driver that interfaces directly to the Linux kernel block layer, bypassing
+ * the I/O scheduler. The driver will be implemented as a loadable Linux module
+ * that will support multiple Cyclone devices, device versions, and
+ * interface protocols such as AHCI and NVMHCI.
+ * @section Driver Layers
+ * For ease of maintenance the driver will be split into layers as depicted
+ * in Figure 1.
+ * Each layer will be self contained within a single C module.
+ * Each module exports a number of functions that constitute the layers API.
+ * A layer may only access the API of the layer that is directly beneath it.
+ * Calling functions that do not belong to the layer directly below the
+ * calling layer is prohibited, as is calling functions that belong to the
+ * layer above the calling layer unless done so via a callback.
+ * In effect you can only call down through the stack.
+ *
+ * @image html Layers.jpg Figure 1 - Cyclone Driver Layers
+ */
+
+
+/**
+ * Global variable used to hold the major block device number
+ * allocated in cyclone_init().
+ */
+int cyclone_major;
+
+/**
+ * Module initialization function.
+ *
+ * Called once when the module is loaded. This function allocates a major
+ * block device number to the Cyclone devices and registers the PCI layer
+ * of the driver.
+ *
+ * @return 0 on success else error code.
+ */
+static int __init cyclone_init(void)
+{
+	printk(KERN_INFO DRV_NAME " Version " DRV_VERSION "\n");
+
+	/*
+	 * Allocate a major block device number to use with this
+	 * driver.
+	 */
+	cyclone_major = register_blkdev(0, "mtipx2xx");
+	if (cyclone_major < 0) {
+		printk(KERN_ERR "Unable to register block device (%d)\n",
+						cyclone_major);
+		return -EBUSY;
+	}
+
+	/*
+	 * Register our PCI operations.
+	 */
+	return pci_register_driver(&cyclone_pci_driver);
+}
+
+/**
+ * Module de-initialization function.
+ *
+ * Called once when the module is unloaded. This function deallocates
+ * the major block device number allocated by cyclone_init() and
+ * unregisters the PCI layer of the driver.
+ *
+ * @return N/A
+ */
+static void __exit cyclone_exit(void)
+{
+	/*
+	 * Release the allocated major block device number.
+	 */
+	unregister_blkdev(cyclone_major, "mtipx2xx");
+
+	/*
+	 * Unregister the PCI driver.
+	 */
+	pci_unregister_driver(&cyclone_pci_driver);
+}
+
+MODULE_AUTHOR("Integrated Device Technology, Inc, Micron Technology, Inc");
+MODULE_DESCRIPTION("Micron RealSSD PCIe Block Driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_VERSION);
+
+module_init(cyclone_init);
+module_exit(cyclone_exit);
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/mtipx2xx.h linux-2.6.38-asai/drivers/block/mtipx2xx/mtipx2xx.h
--- linux-2.6.38/drivers/block/mtipx2xx/mtipx2xx.h	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/mtipx2xx.h	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,795 @@
+/*****************************************************************************
+ *
+ * mtipx2xx.h - General header file for the Cyclone SSD Block Driver
+ *   Copyright (C) 2009  Integrated Device Technology, Inc.
+ *
+ *  Changes from IDT 1.0.1 are copyright (C) 2010 Micron Technology, Inc.
+ *
+ *  This file is part of the Cyclone SSD Block Driver, it is free software:
+ *  you can redistribute it and/or modify it under the terms of the GNU
+ *  General Public License as published by the Free Software Foundation;
+ *  either version 2 of the License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ *  MA  02110-1301, USA.
+ *
+ *  You can contact Integrated Device Technology, Inc via
+ *  email ssdhelp@idt.com or mail
+ *  Integrated Device Technology, Inc
+ *  6024 Silver Creek Valley Road, San Jose, CA 95128, USA
+ *
+ ****************************************************************************/
+#ifndef __MTIPX2XX_H__
+#define __MTIPX2XX_H__
+/**
+ * @file
+ */
+#include <linux/spinlock.h>
+#include <linux/rwsem.h>
+#include <linux/ata.h>
+#include <linux/interrupt.h>
+#include <linux/atomic.h>
+#include <linux/genhd.h>
+#include <linux/version.h>
+
+#ifndef WIN_SMART
+#define WIN_SMART       0xB0
+#endif
+
+/**
+* Defined for enabling surprise removal handling
+*/
+#define SRSI_IMPLEMENATAION  1
+
+#define MAX_RETRIES	5
+/**
+ * Defined if the driver should timeout commands that take
+ * too long to execute.
+ */
+#define COMMAND_TIMEOUT
+/**
+ * If COMMAND_TIMEOUT is defined, this is the timeout value in ms.
+ */
+#define NCQ_COMMAND_TIMEOUT_MS       5000
+#define IOCTL_COMMAND_TIMEOUT_MS     5000
+#define INTERNAL_COMMAND_TIMEOUT_MS  5000
+
+/* check for timeouts every 500ms.*/
+#define TIMEOUT_CHECK_PERIOD 500
+
+/**
+ * How should the driver react when an error occurs?
+ * Set each of these to 1 (enabled) or 0 (disabled)
+ */
+#define ISSUE_COMRESET_ON_TIMEOUT 1
+#define ISSUE_COMRESET_ON_TFE 1
+#define REISSUE_NCQ_COMMANDS_ON_ERR 1
+#define REISSUE_INT_COMMANDS_ON_ERR 0
+#define NULL_BIO_FIX 1
+/**
+ * Use a tasklet for bottom half IRQ processing.
+ *
+ * Set to 1 to use a tasklet for bottom half IRQ processing.
+ */
+#define USE_TASKLET	1
+/**
+ * Macro to extract the tag bit number from a tag value.
+ */
+#define TAG_BIT(tag)	(tag & 0x1f)
+/**
+ * Macro to extract the tag index from a tag value. The index
+ * is used to access the correct SActive/Command Issue register based
+ * on the tag value.
+ */
+#define TAG_INDEX(tag)	(tag >> 5)
+/**
+ * Maximum number of scatter gather entries
+ * a single command may have.
+ */
+#define MAX_SG				128
+/**
+ * Maximum number of slot groups (Command Issue & SActive registers)
+ * NOTE: This is the driver maximum; check dd->slot_groups for actual value.
+ */
+#define MAX_SLOT_GROUPS			8
+/**
+ * Internal command tag.
+ */
+#define TAG_INTERNAL	0
+/**
+ * Micron Vendor ID
+ */
+#define PCI_VENDOR_ID_MICRON    0x1344
+/**
+ * Micron FPGA SSD device ID
+ */
+#define CYCLONE_FPGA_DEVICE_ID	0x5150
+/**
+ * Driver name string.
+ */
+#define DRV_NAME	"Micron RealSSD PCIe Block Driver"
+/**
+ * Driver version string.
+ */
+#define DRV_VERSION	"1.2.3b3"
+/**
+ * Maximum number of minor device numbers per device.
+ */
+#define MAX_MINORS				16
+/**
+ * Maximum number of supported command slots.
+ */
+#define MAX_COMMAND_SLOTS	(MAX_SLOT_GROUPS * 32)
+/**
+ * Per-tag bitfield size in longs.
+ * Linux bit manipulation functions (i.e. test_and_set_bit, find_next_zero_bit)
+ * manipulate memory in longs, so we try to make the math work.
+ * take the slot groups and find the number of longs, rounding up.
+ * Careful! i386 and x86_64 use different size longs!
+ */
+#define U32_PER_LONG (sizeof(long) / sizeof(u32))
+#define SLOTBITS_IN_LONGS ((MAX_SLOT_GROUPS+(U32_PER_LONG-1))/U32_PER_LONG)
+/**
+ * BAR number used to access the AHCI HBA registers.
+ */
+#define MTIPX2XX_ABAR				5
+/*
+ * Forced Unit Access Bit
+ */
+#define FUA_BIT		0x80
+
+/**
+ * Register Frame Information Structure (FIS), host to device.
+ */
+
+ /**
+ * Macro value specfies the success and failure
+ *
+ */
+#define SUCCESS   0
+#define FAILURE  -1
+
+struct HOST_TO_DEV_FIS {
+	/**
+	 * FIS type.
+	 * - 27h Register FIS, host to device.
+	 * - 34h Register FIS, device to host.
+	 * - 39h DMA Activate FIS, device to host.
+	 * - 41h DMA Setup FIS, bi-directional.
+	 * - 46h Data FIS, bi-directional.
+	 * - 58h BIST Activate FIS, bi-directional.
+	 * - 5Fh PIO Setup FIS, device to host.
+	 * - A1h Set Device Bits FIS, device to host.
+	 */
+	unsigned char type;
+	unsigned char opts;
+	unsigned char command;
+	unsigned char features;
+
+	union {
+		unsigned char LBALow;
+		unsigned char sector;
+	};
+	union {
+		unsigned char LBAMid;
+		unsigned char cylLow;
+	};
+	union {
+		unsigned char LBAHi;
+		unsigned char cylHi;
+	};
+	union {
+		unsigned char device;
+		unsigned char head;
+	};
+
+	union {
+		unsigned char LBALowEx;
+		unsigned char sectorEx;
+	};
+	union {
+		unsigned char LBAMidEx;
+		unsigned char cylLowEx;
+	};
+	union {
+		unsigned char LBAHiEx;
+		unsigned char cylHiEx;
+	};
+	unsigned char featuresEx;
+
+	unsigned char sectCount;
+	unsigned char secCountEx;
+	unsigned char res2;
+	unsigned char control;
+
+	unsigned int res3;
+};
+
+/**
+ * Command header structure.
+ */
+struct COMMAND_HDR {
+	/**
+	 * Command options.
+	 * - Bits 31:16 Number of PRD entries.
+	 * - Bits 15:8 Unused in this implementation.
+	 * - Bit 7 Prefetch bit, informs the drive to prefetch PRD entries.
+	 * - Bit 6 Write bit, should be set when writing data to the device.
+	 * - Bit 5 Unused in this implementation.
+	 * - Bits 4:0 Length of the command FIS in DWords (DWord = 4 bytes).
+	 */
+	unsigned int opts;
+	/**
+	 * This field is unsed when using NCQ.
+	 */
+	union {
+		unsigned int byteCount;
+		unsigned int status;
+	};
+	/**
+	 * Lower 32 bits of the command table address associated with this header.
+	 * The command table addresses must be 128 byte aligned.
+	 */
+	unsigned int ctba;
+	/**
+	 * If 64 bit addressing is used this field is the upper 32 bits
+	 * of the command table address associated with this command.
+	 */
+	unsigned int ctbau;
+	/**
+	 * Reserved and unused.
+	 */
+	unsigned int res[4];
+};
+
+/**
+ * Command scatter gather structure (PRD).
+ */
+struct COMMAND_SG {
+	/**
+	 * Low 32 bits of the data buffer address. For Cyclone this
+	 * address must be 8 byte aligned signified by bits 2:0 being
+	 * set to 0.
+	 */
+	unsigned int dba;
+	/**
+	 * When 64 bit addressing is used this field is the upper
+	 * 32 bits of the data buffer address.
+	 */
+	unsigned int dbaUpper;
+	/**
+	 * Unused.
+	 */
+	unsigned int reserved;
+	/**
+	 * Bit 31: interrupt when this data block has been transferred.
+	 * Bits 30..22: reserved
+	 * Bits 21..0: byte count (minus 1).  For Cyclone the byte count must be
+	 * 8 byte aligned signified by bits 2:0 being set to 1.
+	 */
+	unsigned int info;
+
+};
+struct port;
+/*struct port;*/
+
+/**
+ * Structure used to describe an AHCI command.
+ */
+struct COMMAND {
+	/**
+	 * Pointer to the command headers in virtual kernel address space.
+	 */
+	struct COMMAND_HDR *commandHeader;
+	/**
+	 * Pointer to the command headers in physical address space.
+	 */
+	dma_addr_t commandHeaderDMA;
+	/**
+	 * Pointer to the command tables in virtual kernel address space.
+	 */
+	void *command;
+	/**
+	 * Pointer to the command tables in physical address space.
+	 */
+	dma_addr_t commandDMA;
+	/**
+	 * Completion data passed to completionFunc() upon completion
+	 * of a command.
+	 */
+	void *completionData;
+	/**
+	 * Completion function called by the ISR upon completion of
+	 * a command.
+	 */
+	void (*completionFunc)(struct port *port,
+				int tag,
+				void *data,
+				int status);
+	/**
+	 * Additional callback function that may be called by
+	 * completionFunc().
+	 */
+	void (*asyncCallback)(void *data, int status);
+	/**
+	 * Additional callback data that should be passed to
+	 * asyncCallback().
+	 */
+	void *asyncData;
+	/**
+	 * Number of scatter list entries used by this command.
+	 */
+	int scatterEnts;
+	/**
+	 * Scatter list entries for this command.
+	 */
+	struct scatterlist sg[MAX_SG];
+	/**
+	 * The nuCOMMANDmber of retries left for this command.
+	 */
+	int retries;
+#ifdef COMMAND_TIMEOUT
+	/**
+	 * The time, in jiffies, at which this command should have completed.
+	 */
+	unsigned long compTime;
+#endif
+	/**
+	 * A bit to declare that this command has been sent to the drive.
+	 */
+	atomic_t active;
+};
+
+
+/**
+ * Structure usCOMMAND_SGed to describe an AHCI port.
+ */
+struct port {
+	/**
+	 * Pointer back to the driver data for this port.
+	 */
+	struct driver_data *dd;
+	/**
+	 * Used to determine if the data pointed to by the
+	 * identify field is valid.
+	 */
+	int identifyValid;
+	/**
+	 * Base address of the memory mapped IO for the port.
+	 */
+	void __iomem *mmio;
+	/**
+	 * Array of pointers to the memory mapped SActive registers.
+	 */
+	void __iomem *SActive[MAX_SLOT_GROUPS];
+	/**
+	 * Array of pointers to the memory mapped Completed registers.
+	 */
+	void __iomem *Completed[MAX_SLOT_GROUPS];
+	/**
+	 * Array of pointers to the memory mapped Command Issue registers.
+	 */
+	void __iomem *CommandIssue[MAX_SLOT_GROUPS];
+	/**
+	 * Pointer to the beginning of the command header memory as used
+	 * by the driver.
+	 */
+	void *commandList;
+	/**COMMAND_SG
+	 * Pointer to the beginning of the command header memory as used
+	 * by the DMA.
+	 */
+	dma_addr_t commandListDMA;
+	/**
+	 * Pointer to the beginning of the RX FIS memory as used
+	 * by the driver.
+	 */
+	void *rxFIS;
+	/**
+	 * Pointer to the beginning of the RX FIS memory as used
+	 * by the DMA.
+	 */
+	dma_addr_t rxFISDMA;
+	/**COMMAND_SG
+	 * Pointer to the beginning of the command table memory as used
+	 * by the driver.
+	 */
+	void *commandTbl;
+	/**
+	 * Pointer to the beginning of the command table memory as used
+	 * by the DMA.
+	 */
+	dma_addr_t commandTblDMA;
+	/**
+	 * Pointer to the beginning of the identify data memory as used
+	 * by the driver.
+	 */
+	u16 *identify;
+	/**
+	 * Pointer to the beginning of the identify data memory as used
+	 * by the DMA.
+	 */
+	dma_addr_t identifyDMA;
+	/**
+	 * Pointer to the beginning of a sector buffer that is used
+	 * by the driver when issuing internal commands.
+	 */
+	u16 *sectorBuffer;
+	/**
+	 * Pointer to the beginning of a sector buffer that is used
+	 * by the DMA when the driver issues internal commands.
+	 */
+	dma_addr_t sectorBufferDMA;
+	/**
+	 * Bit significant, used to determine if a command slot has
+	 * been allocated. i.e. the slot is in use.  Bits are cleared
+	 * when the command slot and all associated data structures
+	 * are no longer needed.
+	 */
+	unsigned long allocated[SLOTBITS_IN_LONGS];
+	/**
+	 * Array of command slots. Structure includes pointers to the
+	 * command header and command table, and completion function and data
+	 * pointers.
+	 */
+	struct COMMAND commands[MAX_COMMAND_SLOTS];
+	/**
+	 * Non-zero if an internal command is in progress.
+	 */
+	int internalCommandInProgress;
+#ifdef COMMAND_TIMEOUT
+	/**
+	 * Timer used to complete commands that have been active for too long.
+	 */
+	struct timer_list	commandTimer;
+#endif
+	/**
+	 * Semaphore used to block threads if there are no
+	 * command slots available.
+	 */
+	struct semaphore commandSlot;
+	/**
+	 * Spinlock for working around command-issue bug.
+	 */
+	spinlock_t cmdIssueLock;
+};
+
+/**
+ * Structure used to hold information relating to performance
+ * statistics.
+ */
+struct STATS {
+	/**
+	 * Statistics timer.
+	 */
+	struct timer_list	timer;
+	/**
+	 * Statistics, incremented each time the SSD ISR is executed.
+	 * Cleared to zero each second when the value is copied to currentInts.
+	 */
+	atomic_t interrupts;
+	/**
+	 * Statistics, incremented each time a read is issued to the device.
+	 * Cleared to zero each second when the value is copied to
+	 * currentReads.
+	 */
+	atomic_t reads;
+	/**
+	 * Statistics, incremented each time a write is issued to the device.
+	 * Cleared to zero each second when the value is copied to
+	 * currentWrites.
+	 */
+	atomic_t writes;
+	/**
+	 * The current number of interrupts per second.
+	 */
+	atomic_t currentInts;
+	/**
+	 * The current number of reads per second.
+	 */
+	atomic_t currentReads;
+	/**
+	 * The current number of writes per second.
+	 */
+	atomic_t currentWrites;
+	/**
+	 * The current number of IOPS.
+	 */
+	atomic_t currentIOPS;
+	/**
+	 * Minimum response time.
+	 */
+	atomic_t minRespTime;
+	/**
+	 * Maximum response time.
+	 */
+	atomic_t maxRespTime;
+	/**
+	 * Average response time.
+	 */
+	atomic_t avgRespTime;
+	/**
+	 * Average response time.
+	 */
+	atomic_t currentAvgRespTime;
+};
+
+
+/**
+ * Driver private data structure.
+ *
+ * One structure is allocated per probed device.
+ */
+struct  driver_data {
+	/**
+	 * Base address of the HBA registers.
+	 */
+	void __iomem *mmio;
+	/**
+	 * Major device number.
+	 */
+	int	major;
+	/**
+	 * Instance number for this device. First device probed is 0, second
+	 * is 1, etc.
+	 */
+	int instance;
+	/**
+	 * Protocol ops array index.
+	 */
+	int protocol;
+	/**
+	 * Pointer to our gendisk structure.
+	 */
+	struct gendisk *disk;
+	/**
+	 * Pointer to the PCI device structure.
+	 */
+	struct pci_dev *pdev;
+	/**
+	 * Our request queue.
+	 */
+	struct request_queue	*queue;
+	/**
+	 * Semaphore used to lock out read/write commands during the
+	 * execution of an internal command.
+	 */
+	struct rw_semaphore internalSem;
+	/**
+	 * Pointer to the port data structure.
+	 */
+	struct port  *port;
+#ifdef USE_TASKLET
+	/**
+	 * Tasklet used to process the bottom half of
+	 * the ISR.
+	 */
+	struct tasklet_struct tasklet;
+#endif
+	/**
+	 * Performance statistics.
+	 */
+	struct STATS statistics;
+	/**
+	 * Used to inject random read/write errors.
+	 * - Bit 0 - Insert random read errors.
+	 * - Bit 1 - Insert random write errors.
+	 */
+	int makeItFail;
+	/**
+	 * Tag which caused the simulated failure.
+	 */
+	int makeItFailTag;
+	/**
+	 * Used to store the original start sector value from the FIS
+	 * when an error is injected.
+	 */
+	sector_t makeItFailStart;
+	/**
+	 * Spinlock used to access the make_it_fail counters below.
+	 */
+	spinlock_t	makeItFailLock;
+	/**
+	 * Initialized with a random value when bit 0 is set in makeItFail.
+	 * The value is decremented each time there is a read, when the count
+	 * is zero an error is injected into the read command and the counter
+	 * is written with a new random value.
+	 */
+	unsigned  randomReadCount;
+	/**
+	 * Initialized with a random value when bit 1 is set in makeItFail.
+	 * The value is decremented each time there is a write, when the count
+	 * is zero an error is injected into the write command and the counter
+	 * is written with a new random value.
+	 */
+	unsigned  randomWriteCount;
+	/**
+	 * Store a magic value declaring the product type.
+	 */
+	unsigned product_type;
+	/**
+	 * Store the number of slot groups the product supports.
+	 */
+	unsigned slot_groups;
+	/**
+	* Atomic varaiable for SRSI
+	*/
+	atomic_t drv_cleanup_done;
+};
+
+/**
+ * Protocol layer exported functions.
+ *
+ * This structure contains functions that are required by
+ * the block layer.
+ */
+struct PROTOCOL_FUNCS {
+	/**
+	 * Protocol layer initialization function. Called by the
+	 * block layer initialization function (blockInitialize())
+	 * to initialize the protocol layer.
+	 */
+	int (*init)(struct driver_data *dd);
+	/**
+	 * Protocol layer exit function. Called by the block layer
+	 * exit function (blockRemove()) to deinitialize the protocol
+	 * layer.
+	 */
+	int (*exit)(struct driver_data *dd);
+#ifdef CONFIG_PM
+	/**
+	 * Protocol layer suspend function. Called by the block layer suspend
+	 * routine (blockSuspend()) to suspend the device
+	 */
+	int (*suspend)(struct driver_data *dd);
+	/**
+	 * Protocol layer resume function. Called by the block layer resume
+	 * routine (blockResume()) to resume the device
+	 */
+	int (*resume)(struct driver_data *dd);
+#endif
+	/**
+	 * Protocol layer shutdown function. Called by the kernel when
+	 * the machine is being shutdown. This function should sync the
+	 * hardware. For Cyclone, this means issuing a Standby Immediate
+	 * command. This function is optional.
+	 */
+	int (*shutdown)(struct driver_data *dd);
+	/**
+	 * Function that returns the actual hardware block size. For the Cyclone
+	 * SSD this function should return 4096. The information returned from
+	 * this function is used to be block layer to inform the kernel of the
+	 * hardware block size. The kernel will then only issue block read/write
+	 * commands that are a multiples of that size.
+	 */
+	int (*hwBlkSize)(void);
+	/**
+	 * Function to obtain and return the size of the device in 512
+	 * byte sectors.
+	 */
+	sector_t (*getCapacity)(struct driver_data *dd);
+	/**
+	 * Function called by the block later to obtain a free command
+	 * slot and to return a pointer to the scatter list associated
+	 * with it.
+	 */
+	struct scatterlist *(*getScatterList)(struct driver_data *dd, int *tag);
+	/**
+	 * Called by the Block Layer to issue a read to the hardware.
+	 */
+	int (*read)(struct driver_data *dd,
+			sector_t start,
+			int nsect,
+			int nents,
+			int tag,
+			void *callback,
+			void *data,
+			int barrier);
+	/**
+	 * Called by the Block Layer to issue a write to the hardware.
+	 */
+	int (*write)(struct driver_data *dd,
+			sector_t start,
+			int nsect,
+			int nents,
+			int tag,
+			void *callback,
+			void *data,
+			int barrier);
+	/**
+	 * Called by the Block Layer to handle device specific IOCTL commands.
+	 */
+	int (*ioctl)(struct driver_data *dd,
+			unsigned int cmd,
+			unsigned long arg);
+	/**
+	 * Function to initialize sysfs attributes for the protocol layer. This
+	 * function is called by the block layer after the block device has
+	 * been added to the system by a call to add_disk(). This field is
+	 * optional.
+	 */
+	int (*sysfsInit)(struct driver_data *dd, struct kobject *kobj);
+	/**
+	 * Function to deinitialize sysfs attributes for the protocol layer. This
+	 * function is called by the block layer before the block device is
+	 * removed from the system by a call to del_gendisk(). This function
+	 * is optional.
+	 */
+	int (*sysfsExit)(struct driver_data *dd, struct kobject *kobj);
+};
+
+extern int cyclone_major;
+extern int block_initialize(struct driver_data *dd);
+extern int block_remove(struct driver_data *dd);
+extern int block_shutdown(struct driver_data *dd);
+extern int block_suspend(struct driver_data *dd);
+extern int block_resume(struct driver_data *dd);
+extern int ahci_init(struct driver_data *dd);
+extern int ahci_exit(struct driver_data *dd);
+extern int ahci_shutdown(struct driver_data *dd);
+extern int ahci_hard_blksize(void);
+extern sector_t ahci_get_capacity(struct driver_data *dd);
+extern struct scatterlist *ahci_get_scatterlist(
+				struct driver_data *dd,
+				int *tag);
+extern int ahci_read(struct driver_data *dd,
+			sector_t start,
+			int nsect,
+			int nents,
+			int tag,
+			void *callback,
+			void *data,
+			int barrier);
+extern int ahci_write(struct driver_data *dd,
+			sector_t start,
+			int nsect,
+			int nents,
+			int tag,
+			void *callback,
+			void *data,
+			int barrier);
+extern int ahci_ioctl(struct driver_data *dd,
+			unsigned int cmd,
+			unsigned long arg);
+extern int ahci_sysfs_init(struct driver_data *dd, struct kobject *kobj);
+extern int ahci_sysfs_exit(struct driver_data *dd, struct kobject *kobj);
+extern int ahci_resume(struct driver_data *dd);
+extern int ahci_suspend(struct driver_data *dd);
+
+
+extern struct pci_driver cyclone_pci_driver;
+void command_cleanup(struct driver_data *dd);
+int check_for_surprise_removal(struct pci_dev *pdev);
+
+void restart_port(struct port *);
+/**
+ * Swap halves of 16-bit words in place
+ *
+ * Swap halves of 16-bit words if needed to convert from
+ * little-endian byte order to native cpu byte order, or
+ * vice-versa.
+ *
+ * @param buf Buffer to swap
+ * @param buf_words Number of 16-bit words in buffer.
+ *
+ * @return N/A
+ */
+static inline void swap_buf_le16(u16 *buf, unsigned int buf_words)
+{
+#ifdef __BIG_ENDIAN
+	unsigned int i;
+
+	for (i = 0; i < buf_words; i++)
+		buf[i] = le16_to_cpu(buf[i]);
+#endif /* __BIG_ENDIAN */
+}
+
+
+#endif
diff -uNr linux-2.6.38/drivers/block/mtipx2xx/pci.c linux-2.6.38-asai/drivers/block/mtipx2xx/pci.c
--- linux-2.6.38/drivers/block/mtipx2xx/pci.c	1969-12-31 17:00:00.000000000 -0700
+++ linux-2.6.38-asai/drivers/block/mtipx2xx/pci.c	2011-04-15 20:18:50.000000000 -0600
@@ -0,0 +1,384 @@
+/*****************************************************************************
+ *
+ * pci.c - Handles the PCI layer of the Cyclone SSD Block Driver
+ *   Copyright (C) 2009  Integrated Device Technology, Inc.
+ *
+ *  This file is part of the Cyclone SSD Block Driver, it is free software:
+ *  you can redistribute it and/or modify it under the terms of the GNU
+ *  General Public License as published by the Free Software Foundation;
+ *  either version 2 of the License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+ *  MA  02110-1301, USA.
+ *
+ *  You can contact Integrated Device Technology, Inc
+ *  via email ssdhelp@idt.com or mail
+ *  Integrated Device Technology, Inc
+ *  6024 Silver Creek Valley Road, San Jose, CA 95128, USA
+ *
+ ****************************************************************************/
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include "mtipx2xx.h"
+/**
+ * @file
+ * The PCI layer contains the pci_device_id structure which is used to
+ * associate a hardware vendor/device ID with the device driver. An example of
+ * a pci_device_id structure for Cyclone is shown below.
+ * @verbatim
+static const struct pci_device_id pci_tbl[] = {
+	{ PCI_VDEVICE(MICRON, CYCLONE_FPGA_DEVICE_ID), 0},
+	{}
+};
+@endverbatim
+ * Each entry in the pci_device_id table consists of a vendor/device ID, the
+ * macro PCI_VDEVICE() is used to correctly populate the vendor/device entries,
+ * and an integer value that is used by the block layer as an
+ * index into the PROTOCOL_FUNCS table which is described in the Block Layer
+ * section.
+ * The Linux macro MODULE_DEVICE_TABLE is used to create the relevant linker
+ * sections so that the depmod command may be used to update the module
+ * dependency files.
+ *
+ * Another structure declared in the PCI layer is the cyclone_pci_driver
+ * structure. This is the structure that describes the PCI driver entry points.
+ * A pointer to this structure is passed into the pci_register_driver()
+ * function by the module layer to register the PCI driver.
+ * The following PCI driver methods will be supported by the PCI layer.
+ *
+ * - probe() - When the module is loaded this function is called for each device
+ * that matches the vendor/device ID's specified in the pci_device_id table as
+ * they are discovered.
+ * The probe() function performs the following operations.
+ *  - Allocate memory for the drivers private data
+ *  cyclone_major = register_blkdev(0, "mtipx2xx")a structures.
+ *  -	Enables the PCI device.
+ *  -	Iomap the devices BARs.
+ * cyclone_major = register_blkdev(0, "mtipx2xx") -Set the DMA mask for the
+ * device.
+ *  -	Enable PCI device master operations.
+ *  -	Call the block layer initialization function (blockInitialize()).
+ * - remove() - When the module is unloaded this function is called for each
+ * device that matches the vendor/device ID's specified in the pci_device_id
+ * table.
+ * The remove() function must undo the operations of the probe() function.
+ *  -	Call the block layer remove function (blockRemove()).
+ *  -	Free the memory allocated for the driver private data structures.
+ *  -	Unmap the devices BARs.
+ * - shutdown() - This function is called by the kernel PCI infrastructure
+ * just before the system is halted during a shutdown. This function should
+ * call into the block layer to sync the device before the power is switched
+ * off.
+ *
+ * The PCI layer will also handle the suspend and resume operations which are
+ * issued when the system enters and exits low power states. The initial
+ * version of the driver will not support these methods.
+ */
+
+/**
+ * Device instance number, incremented each time a device is probed.
+ */
+static int instance;
+
+/**
+ * Called for each supported PCI device detected.
+ *
+ * This function allocates the private data structure, enables the
+ * PCI device and then calls the block layer initialization function.
+ *
+ * @return 0 on success else an error code.
+ */
+static int pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct driver_data	*dd = NULL;
+	int rv = 0;
+
+	/*
+	 * Allocate memory for this devices private data.
+	 */
+	dd = vmalloc(sizeof(struct driver_data));
+	if (dd == NULL) {
+		printk(KERN_ERR "Unable to allocate memory for driver data\n");
+		return -ENOMEM;
+	}
+	memset(dd, 0, sizeof(struct driver_data));
+	/**
+	* Set the atomic variable as 1 in case of SRSI
+	*/
+
+	atomic_set(&dd->drv_cleanup_done, true);
+
+	/*
+	 * Attach the private data to this PCI device.
+	 */
+	pci_set_drvdata(pdev, dd);
+
+	/*
+	 * Enable the PCI device.
+	 */
+	rv = pcim_enable_device(pdev);
+	if (rv < 0) {
+		printk(KERN_ERR "Unable to enable device\n");
+		goto iomap_err;
+	}
+
+	/*
+	 * Map BAR5 to memory.
+	 */
+	rv = pcim_iomap_regions(pdev, 1 << MTIPX2XX_ABAR, DRV_NAME);
+	if (rv  < 0) {
+		printk(KERN_ERR "Unable to map regions\n");
+		goto iomap_err;
+	}
+
+	if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
+		rv = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+
+		if (rv) {
+			rv = pci_set_consistent_dma_mask(pdev,
+						DMA_BIT_MASK(32));
+			if (rv) {
+				dev_printk(KERN_ERR, &pdev->dev,
+					"64-bit DMA enable failed\n");
+				return rv;
+			}
+		}
+	}
+
+	/*
+	 * Enable PCI device master operations.
+	 */
+	pci_set_master(pdev);
+
+	/*
+	 * Enable MSI for this device.
+	 */
+	if (pci_enable_msi(pdev))
+		printk(KERN_WARNING "Unable to enable MSI interrupt.\n");
+
+
+	/*
+	 * Copy the info we may need later into the private data structure.
+	 */
+	dd->major	= cyclone_major;
+	dd->protocol	= ent->driver_data;
+	dd->instance	= instance;
+	dd->pdev	= pdev;
+
+	/*
+	 * Initialize the block layer.
+	 */
+	rv = block_initialize(dd);
+	if (rv < 0) {
+		printk(KERN_ERR "Unable to initialize block layer\n");
+		goto block_initialize_err;
+	}
+
+	/*
+	 * Increment the instance count so that each device has a unique
+	 * instance number.
+	 */
+	instance++;
+
+	goto done;
+
+block_initialize_err:
+
+	pcim_iounmap_regions(pdev, 1 << MTIPX2XX_ABAR);
+iomap_err:
+	vfree(dd);
+	pci_set_drvdata(pdev, NULL);
+done:
+	/**
+	* Set the atomic variable as 0 in case of SRSI
+	*/
+
+	atomic_set(&dd->drv_cleanup_done, false);
+
+	return rv;
+}
+
+/**
+ * Called for each probed device when the device is removed or the
+ * driver is unloaded.
+ *
+ * @return N/A
+ */
+static void pci_remove(struct pci_dev *pdev)
+{
+	struct driver_data	*dd = pci_get_drvdata(pdev);
+	int Counter = 0;
+
+	/*
+	* Check for the device presence
+	*/
+	if (check_for_surprise_removal(pdev)) {
+		while (atomic_read(&dd->drv_cleanup_done) == false) {
+			Counter++;
+			msleep(20);
+			if (Counter == 10) {
+				/*
+				* Cleanup the outstanding the command
+				*/
+				command_cleanup(dd);
+				break;
+			}
+		}
+	}
+       /*
+	* Set the atomic variable as 1 in case of SRSI
+	*/
+	atomic_set(&dd->drv_cleanup_done, true);
+
+
+	/*
+	 * Clean up the block layer.
+	 */
+	block_remove(dd);
+
+	/*
+	 * Reverse the effects of pci_enable_msi().
+	 */
+	pci_disable_msi(pdev);
+
+	/*
+	 * Free the memory allocated for the private data structure.
+	 */
+	vfree(dd);
+
+	/*
+	 * Unmap the memory regions allocated to the BARs.
+	 */
+	pcim_iounmap_regions(pdev, 1 << MTIPX2XX_ABAR);
+}
+
+
+static int pci_suspend(struct pci_dev *pdev, pm_message_t mesg)
+{
+		int rv = 0;
+		struct driver_data	 *dd ;
+
+		dd = pci_get_drvdata(pdev);
+		if (dd == NULL)
+			printk(KERN_ERR "Driver private datastructure is NULL\n");
+
+
+		/* Disable PORTs and Interrupts.
+		 * It sends standbyImmediate command to controller.*/
+		rv = block_suspend(dd);
+
+		if (rv < 0) {
+			printk(KERN_ERR "Failed to suspend AHCI controller\n");
+			return rv;
+		}
+
+		/* Save the pci config space to pdev structure &
+		 * disable the device */
+		pci_save_state(pdev);
+		pci_disable_device(pdev);
+
+		/* Move to Low power state*/
+		pci_set_power_state(pdev, PCI_D3hot);
+
+		return rv;
+}
+
+static int pci_resume(struct pci_dev *pdev)
+{
+		int rv = 0;
+		struct driver_data	 *dd;
+
+		dd = pci_get_drvdata(pdev);
+		if (dd == NULL)
+			printk(KERN_ERR "Driver private datastructure is NULL\n");
+
+
+		/* Move the device to active State*/
+		pci_set_power_state(pdev, PCI_D0);
+
+		/* Restore PCI configuration space*/
+		pci_restore_state(pdev);
+
+		/* Enable the PCI device*/
+		rv = pcim_enable_device(pdev);
+		if (rv < 0) {
+			printk(KERN_ERR "Failed to enable P320 card during resume\n");
+			return rv;
+		}
+		pci_set_master(pdev);
+
+		/*
+		 * It calls hbaReset , initPort,startPort function
+		 * and enable the interrupt
+		 */
+		rv = block_resume(dd);
+		if (rv < 0) {
+			printk(KERN_ERR "Unable to initialize port\n");
+			return rv;
+		}
+
+		return rv;
+}
+
+static void pci_shutdown(struct pci_dev *pdev)
+{
+	struct driver_data	*dd = pci_get_drvdata(pdev);
+	block_shutdown(dd);
+}
+
+/**
+ * This function check_for_surprise_removal is called
+ * while card is removed from the system and it will
+ * read the vendor id from the configration space
+ *
+ *
+ * @param pdev Pointer to the pci_dev structure.
+ *
+ *
+ * @return 1 if device removed, else 0
+ */
+
+int check_for_surprise_removal(struct pci_dev *pdev)
+{
+	u16 vendor_id = 0;
+
+       /*
+	* Read the vendorID from the configuration space
+	*/
+	pci_read_config_word(pdev, 0x00, &vendor_id);
+	if (vendor_id == 0xFFFF)
+		return 1; /* device removed */
+
+	return 0; /* device present */
+}
+
+/**
+ * Table of devices supported by this driver.
+ */
+static DEFINE_PCI_DEVICE_TABLE(pci_tbl) = {
+	{  PCI_DEVICE(PCI_VENDOR_ID_MICRON, CYCLONE_FPGA_DEVICE_ID) },
+
+	{0}	/* terminate list */
+};
+
+/**
+* Structure that describes the PCI driver functions.
+*/
+struct pci_driver cyclone_pci_driver = {
+	.name			= DRV_NAME,
+	.id_table		= pci_tbl,
+	.probe			= pci_probe,
+	.remove			= pci_remove,
+	.suspend		= pci_suspend,
+	.resume			= pci_resume,
+	.shutdown		= pci_shutdown,
+};
+
+MODULE_DEVICE_TABLE(pci, pci_tbl);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-04-28 15:53 New driver mtipx2xx submission Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-04-28 22:06 ` Alan Cox
  2011-05-02 12:40   ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-05-02 18:40   ` Jeff Moyer
  0 siblings, 2 replies; 40+ messages in thread
From: Alan Cox @ 2011-04-28 22:06 UTC (permalink / raw)
  To: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]; +Cc: linux-ide

> We have written a new block driver for our AHCI based PCIe SSDs. The
> main objective of our product is providing high performance. Traffic
> through OS storage stack is not to able fully utilize the device's
> capabilty. To improve the traffic to the device and hence
> showcase/utilize the device's capability, we have come up with this new
> block driver. This driver includes
> 	* utilize device's increased queue depth
> 	* workaround for hardware errata
> 
> We want to get this driver into kernel tree to support the device out of
> the box. Attached this driver as a patch for latest kernel. We would
> like to get your comments, and also open for discussion.

The kernel starting point would be that we have an AHCI driver. If you
need workarounds for hardware errata then they can go into it and that is
fine. We support NCQ so we can use the queue depths. If there are
extensions then the AHCI driver can be enhanced.

Similarly if you've got performnace problems going through the libata
core and AHCI driver the right thing to do is to fix the performance
problems in the core (or rewrite bits of the core as needed).

I think the starting point would be to explain what problems you see with
the existing driver, and where the profiling tools say the bottlenecks
are when you try and get full speed under libata + libata ahci driver.

Otherwise we will end up with one AHCIish driver per vendor and it turns
into a nightmare for maintenance.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: New driver mtipx2xx submission
  2011-04-28 22:06 ` Alan Cox
@ 2011-05-02 12:40   ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-05-02 17:42     ` Alan Cox
  2011-05-02 18:40   ` Jeff Moyer
  1 sibling, 1 reply; 40+ messages in thread
From: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR] @ 2011-05-02 12:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-ide

> The kernel starting point would be that we have an AHCI driver. If you
> need workarounds for hardware errata then they can go into it and that
is
> fine. We support NCQ so we can use the queue depths. If there are
> extensions then the AHCI driver can be enhanced.

We understand that AHCI driver can be patched for hardware errata and
other customizations, but main concern is performance.

> Similarly if you've got performnace problems going through the libata
> core and AHCI driver the right thing to do is to fix the performance
> problems in the core (or rewrite bits of the core as needed).

> I think the starting point would be to explain what problems you see
with
> the existing driver, and where the profiling tools say the bottlenecks
> are when you try and get full speed under libata + libata ahci driver.

The performance difference is not just because of libata + libata ahci
driver. Our driver gets the request before elevator comes into picture.
So, the stack starting from elevator,scsi upper, scsi mid, libata and
libata ahci driver attributes to the performance difference. 

> Otherwise we will end up with one AHCIish driver per vendor and it
turns
> into a nightmare for maintenance.

Regards,
Asai Thambi

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-02 12:40   ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-05-02 17:42     ` Alan Cox
  2011-05-03 20:07       ` [PATCH 0/3] " Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-05-11 17:40       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  0 siblings, 2 replies; 40+ messages in thread
From: Alan Cox @ 2011-05-02 17:42 UTC (permalink / raw)
  To: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]; +Cc: linux-ide

On Mon, 2 May 2011 06:40:31 -0600
"Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]"
<asamymuthupa@micron.com> wrote:

> > The kernel starting point would be that we have an AHCI driver. If you
> > need workarounds for hardware errata then they can go into it and that
> is
> > fine. We support NCQ so we can use the queue depths. If there are
> > extensions then the AHCI driver can be enhanced.
> 
> We understand that AHCI driver can be patched for hardware errata and
> other customizations, but main concern is performance.

Yes I understand that - but our main concern is maintainability, which
means one driver per AHCI flash vendor is a mess that isn't practicable
to deal with.

Documenting the errata would be a help anyway so we can get them into the
mainline AHCI driver, irrespective of what drivers we end up with
ultimately.

> > I think the starting point would be to explain what problems you see
> with
> > the existing driver, and where the profiling tools say the bottlenecks
> > are when you try and get full speed under libata + libata ahci driver.
> 
> The performance difference is not just because of libata + libata ahci
> driver. Our driver gets the request before elevator comes into picture.
> So, the stack starting from elevator,scsi upper, scsi mid, libata and
> libata ahci driver attributes to the performance difference. 

Elevator for flash devices should automatically be null and most of the
SCSI layer isn't actually used so it would be interesting to know for
example what shows up in comparative profiles so that it can be optimised
or dropped out.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-04-28 22:06 ` Alan Cox
  2011-05-02 12:40   ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-05-02 18:40   ` Jeff Moyer
  2011-05-02 18:52     ` Alan Cox
  2011-05-03 15:02     ` Mark Lord
  1 sibling, 2 replies; 40+ messages in thread
From: Jeff Moyer @ 2011-05-02 18:40 UTC (permalink / raw)
  To: Alan Cox
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> We have written a new block driver for our AHCI based PCIe SSDs. The
>> main objective of our product is providing high performance. Traffic
>> through OS storage stack is not to able fully utilize the device's
>> capabilty. To improve the traffic to the device and hence
>> showcase/utilize the device's capability, we have come up with this new
>> block driver. This driver includes
>> 	* utilize device's increased queue depth
>> 	* workaround for hardware errata
>> 
>> We want to get this driver into kernel tree to support the device out of
>> the box. Attached this driver as a patch for latest kernel. We would
>> like to get your comments, and also open for discussion.
>
> The kernel starting point would be that we have an AHCI driver. If you
> need workarounds for hardware errata then they can go into it and that is
> fine. We support NCQ so we can use the queue depths. If there are
> extensions then the AHCI driver can be enhanced.

Given the highly parallel nature of these parts, I wouldn't be surprised
if the ahci queue depth of 31 is one of the main bottlenecks.  Can you
think of a way to extend the ahci driver in this manner to accommodate
devices like this one?

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-02 18:40   ` Jeff Moyer
@ 2011-05-02 18:52     ` Alan Cox
  2011-05-03 15:04       ` Mark Lord
  2011-05-03 15:02     ` Mark Lord
  1 sibling, 1 reply; 40+ messages in thread
From: Alan Cox @ 2011-05-02 18:52 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

> > The kernel starting point would be that we have an AHCI driver. If you
> > need workarounds for hardware errata then they can go into it and that is
> > fine. We support NCQ so we can use the queue depths. If there are
> > extensions then the AHCI driver can be enhanced.
> 
> Given the highly parallel nature of these parts, I wouldn't be surprised
> if the ahci queue depth of 31 is one of the main bottlenecks.  Can you
> think of a way to extend the ahci driver in this manner to accommodate
> devices like this one?

The queue depth and tag limits come from the SATA standard and also leak
fairly comprehensively into the AHCI spec so not unless the hardware has
extensions for it and is merely faked SATA (and if you are faking SATA
you have to wonder why if your goal is max performance)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-02 18:40   ` Jeff Moyer
  2011-05-02 18:52     ` Alan Cox
@ 2011-05-03 15:02     ` Mark Lord
  2011-05-12 14:39       ` Jeff Garzik
  1 sibling, 1 reply; 40+ messages in thread
From: Mark Lord @ 2011-05-03 15:02 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Alan Cox,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

On 11-05-02 02:40 PM, Jeff Moyer wrote:
..
> Given the highly parallel nature of these parts, I wouldn't be surprised
> if the ahci queue depth of 31 is one of the main bottlenecks.  Can you
> think of a way to extend the ahci driver in this manner to accommodate
> devices like this one?
..

I imagine that another performance drain might be libata's lack of
support for host controller queues --> a hardware queuing layer that
exists between the host and the drive's own queuing mechanism.

Most modern ATA/SATA chipsets have something of that nature.
These are designed to eliminate the activity-gaps that happen between
when a drive finishes a command, and when the host interrupt -> BH -> whatever
gets around to submitting a new command.

With commands already batched up in a hardware host queue, the controller
takes care of that part, keeping the drive active more of the time,
increasing IOPS.

Cheers

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-02 18:52     ` Alan Cox
@ 2011-05-03 15:04       ` Mark Lord
  2011-05-03 15:07         ` Alan Cox
  0 siblings, 1 reply; 40+ messages in thread
From: Mark Lord @ 2011-05-03 15:04 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jeff Moyer,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

On 11-05-02 02:52 PM, Alan Cox wrote:
>
> The queue depth and tag limits come from the SATA standard and also leak
> fairly comprehensively into the AHCI spec so not unless the hardware has
> extensions for it and is merely faked SATA (and if you are faking SATA
> you have to wonder why if your goal is max performance)


The reason to "fake" SATA and/or ATA is unchanged since the 1990s:
so that the system BIOS can boot from the device, and the operating system
can work up to the point where it finally loads hardware-specific drivers.

Cheers

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-03 15:04       ` Mark Lord
@ 2011-05-03 15:07         ` Alan Cox
  2011-05-03 15:08           ` Mark Lord
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Cox @ 2011-05-03 15:07 UTC (permalink / raw)
  To: Mark Lord
  Cc: Jeff Moyer,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

On Tue, 03 May 2011 11:04:12 -0400
Mark Lord <kernel@teksavvy.com> wrote:

> On 11-05-02 02:52 PM, Alan Cox wrote:
> >
> > The queue depth and tag limits come from the SATA standard and also leak
> > fairly comprehensively into the AHCI spec so not unless the hardware has
> > extensions for it and is merely faked SATA (and if you are faking SATA
> > you have to wonder why if your goal is max performance)
> 
> 
> The reason to "fake" SATA and/or ATA is unchanged since the 1990s:
> so that the system BIOS can boot from the device, and the operating system
> can work up to the point where it finally loads hardware-specific drivers.

The BIOS emulation is for legacy IDE style interfaces not AHCI.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-03 15:07         ` Alan Cox
@ 2011-05-03 15:08           ` Mark Lord
  0 siblings, 0 replies; 40+ messages in thread
From: Mark Lord @ 2011-05-03 15:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jeff Moyer,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

On 11-05-03 11:07 AM, Alan Cox wrote:
> On Tue, 03 May 2011 11:04:12 -0400
> Mark Lord <kernel@teksavvy.com> wrote:
> 
>> On 11-05-02 02:52 PM, Alan Cox wrote:
>>>
>>> The queue depth and tag limits come from the SATA standard and also leak
>>> fairly comprehensively into the AHCI spec so not unless the hardware has
>>> extensions for it and is merely faked SATA (and if you are faking SATA
>>> you have to wonder why if your goal is max performance)
>>
>>
>> The reason to "fake" SATA and/or ATA is unchanged since the 1990s:
>> so that the system BIOS can boot from the device, and the operating system
>> can work up to the point where it finally loads hardware-specific drivers.
> 
> The BIOS emulation is for legacy IDE style interfaces not AHCI.

I know that.  That's only half of it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 0/3] RE: New driver mtipx2xx submission
  2011-05-02 17:42     ` Alan Cox
@ 2011-05-03 20:07       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-05-11 17:40       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  1 sibling, 0 replies; 40+ messages in thread
From: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR] @ 2011-05-03 20:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-ide

> > > I think the starting point would be to explain what problems you
see
> > with
> > > the existing driver, and where the profiling tools say the
bottlenecks
> > > are when you try and get full speed under libata + libata ahci
driver.
> > 
> > The performance difference is not just because of libata + libata
ahci
> > driver. Our driver gets the request before elevator comes into
picture.
> > So, the stack starting from elevator,scsi upper, scsi mid, libata
and
> > libata ahci driver attributes to the performance difference. 
>
> Elevator for flash devices should automatically be null and most of
the
> SCSI layer isn't actually used so it would be interesting to know for
> example what shows up in comparative profiles so that it can be
optimised
> or dropped out.

We are working on getting performance figures for both paths. Meanwhile,
I am reposting the new patch inline in 3 parts.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: New driver mtipx2xx submission
  2011-05-02 17:42     ` Alan Cox
  2011-05-03 20:07       ` [PATCH 0/3] " Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-05-11 17:40       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-05-11 19:20         ` Alan Cox
  1 sibling, 1 reply; 40+ messages in thread
From: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR] @ 2011-05-11 17:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-ide


> Elevator for flash devices should automatically be null and most of
the
> SCSI layer isn't actually used so it would be interesting to know for
> example what shows up in comparative profiles so that it can be
optimised
> or dropped out.

Here is the performance number from some experiments.

Test System : 
	Dell Power Edge R710 (1 x Quad core processor @3.20GHz) 
OS:
	RHEL (Enterprise) ver5.5 x64 (2.6.18-194.el5)
Test Software:
	Vdbench ver 5.02

These experiments were done by changing the queue depth in
vdbench(Q=xxx).

                        |------------------------------------|
                        | ahci   |  ahci | mtipx2xx| mtipx2xx|
                        | Q=256  |  Q=32 |  Q=32   |  Q=256  |
|------------------------------------------------------------|
|4k random read (IOPS)  | 15,831 |15,763 | 244,795 | 783,778 |
|4k random write (IOPS) |  4,058 | 4,072 |  46,696 | 135,864 |
|128k seq. read (MBps)  |  2,284 | 2,258 |   3,295 |   3,060 |
|128k seq. write (MBps) |  1,049 | 1,053 |   2,092 |   2,078 |
|------------------------------------------------------------|

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-11 17:40       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-05-11 19:20         ` Alan Cox
  2011-05-21  2:26           ` Asai Thambi S P
  0 siblings, 1 reply; 40+ messages in thread
From: Alan Cox @ 2011-05-11 19:20 UTC (permalink / raw)
  To: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]; +Cc: linux-ide

d core processor @3.20GHz) 
> OS:
> 	RHEL (Enterprise) ver5.5 x64 (2.6.18-194.el5)

Linux 2.6.18 was released in July 2006

> Test Software:
> 	Vdbench ver 5.02
> 
> These experiments were done by changing the queue depth in
> vdbench(Q=xxx).
> 
>                         |------------------------------------|
>                         | ahci   |  ahci | mtipx2xx| mtipx2xx|
>                         | Q=256  |  Q=32 |  Q=32   |  Q=256  |
> |------------------------------------------------------------|
> |4k random read (IOPS)  | 15,831 |15,763 | 244,795 | 783,778 |
> |4k random write (IOPS) |  4,058 | 4,072 |  46,696 | 135,864 |
> |128k seq. read (MBps)  |  2,284 | 2,258 |   3,295 |   3,060 |
> |128k seq. write (MBps) |  1,049 | 1,053 |   2,092 |   2,078 |
> |------------------------------------------------------------|

So a bigger queue helped (at least in 2006). The AHCI driver can be
taught your bigger queue easily enough. The question is where with a
*current* kernel are any remaining bottlenecks if you do that and how do
we fix them.

I was expecting at the very least numbers versus a modern kernel (Would
you do Windows 7 submission benchmarks against XP SP 2 ??) and profile
data to show where the bottlenecks appear to be.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-03 15:02     ` Mark Lord
@ 2011-05-12 14:39       ` Jeff Garzik
  0 siblings, 0 replies; 40+ messages in thread
From: Jeff Garzik @ 2011-05-12 14:39 UTC (permalink / raw)
  To: Mark Lord
  Cc: Jeff Moyer, Alan Cox,
	Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

On 05/03/2011 11:02 AM, Mark Lord wrote:
> On 11-05-02 02:40 PM, Jeff Moyer wrote:
> ..
>> Given the highly parallel nature of these parts, I wouldn't be surprised
>> if the ahci queue depth of 31 is one of the main bottlenecks.  Can you
>> think of a way to extend the ahci driver in this manner to accommodate
>> devices like this one?
> ..
>
> I imagine that another performance drain might be libata's lack of
> support for host controller queues -->  a hardware queuing layer that
> exists between the host and the drive's own queuing mechanism.

What do you feel libata lacks, in this area?

AHCI has a host controller queue that it actively manages, and we ship 
off requests to that queue ASAP.

	Jeff




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-11 19:20         ` Alan Cox
@ 2011-05-21  2:26           ` Asai Thambi S P
  2011-05-25 14:36             ` Jeff Moyer
  0 siblings, 1 reply; 40+ messages in thread
From: Asai Thambi S P @ 2011-05-21  2:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-ide

[-- Attachment #1: Type: text/plain, Size: 1748 bytes --]

On 5/11/2011 1:20 PM, Alan Cox wrote:
> So a bigger queue helped (at least in 2006). The AHCI driver can be
> taught your bigger queue easily enough. The question is where with a
> *current* kernel are any remaining bottlenecks if you do that and how do
> we fix them.

Attached image/table shows the performance numbers on current kernel.

The main objectives of our new mtipx2xx driver are
	1.) highest performance (see attached image/table),
	2.) lowest CPU utilization, and
	3.) vendor unique code required to control the P320

We ran our tests (2 iterations) on RHEL 6.0 running kernel 2.6.38.6, 
latest stable kernel then (now I see 2.6.39 as latest stable one). The 
colored cells in the table indicates that AHCI driver was experiencing 
excessive write failures on the P320 which caused the AHCI driver to 
disable NCQ.

Other aspects:
  * This driver works with standard tools like smartctl, hdparm, etc.
  * We are committed to ongoing support in the kernel for this driver
  * This driver is open for other vendors to use
  * Layered driver architecture gives scope for adding interfaces other 
than AHCI i.e. this driver is not limited by AHCI interface

Driver achitecture:
mtipx2xx contain three layers –
  * pci – implementation of all pci related functions
  * block – implementation of all block related functions
  * ahci – implementation of all ahci interface.


> I was expecting at the very least numbers versus a modern kernel (Would
> you do Windows 7 submission benchmarks against XP SP 2 ??) and profile
> data to show where the bottlenecks appear to be.

I knew the numbers on current kernel would make more sense, but then I 
wanted to first post what I had, and then get the latest numbers ready.

Regards,
Asai Thambi

[-- Attachment #2: mtipx2xx_benchmark_comparison.PNG --]
[-- Type: image/png, Size: 63470 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-05-21  2:26           ` Asai Thambi S P
@ 2011-05-25 14:36             ` Jeff Moyer
       [not found]               ` <22A973199D2C2F46933448F6E7990A300239EA77@ntxboimbx31.micron.com>
  0 siblings, 1 reply; 40+ messages in thread
From: Jeff Moyer @ 2011-05-25 14:36 UTC (permalink / raw)
  To: asamymuthupa; +Cc: Alan Cox, linux-ide

Asai Thambi S P <asamymuthupa@micron.com> writes:

> On 5/11/2011 1:20 PM, Alan Cox wrote:
>> So a bigger queue helped (at least in 2006). The AHCI driver can be
>> taught your bigger queue easily enough. The question is where with a
>> *current* kernel are any remaining bottlenecks if you do that and how do
>> we fix them.
>
> Attached image/table shows the performance numbers on current kernel.
>
> The main objectives of our new mtipx2xx driver are
> 	1.) highest performance (see attached image/table),
> 	2.) lowest CPU utilization, and

Can you collect perf data to show why the ahci driver is taking up so
much more CPU for the random I/O case?

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
       [not found]               ` <22A973199D2C2F46933448F6E7990A300239EA77@ntxboimbx31.micron.com>
@ 2011-06-01 19:51                 ` Jeff Moyer
  2011-06-01 20:21                   ` Alan Cox
  2011-06-02  1:21                   ` David Dillow
  0 siblings, 2 replies; 40+ messages in thread
From: Jeff Moyer @ 2011-06-01 19:51 UTC (permalink / raw)
  To: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  Cc: Alan Cox, linux-ide

"Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]"
<asamymuthupa@micron.com> writes:

> On 5/25/2011 8:36 AM, Jeff Moyer wrote:
>> Asai Thambi S P<asamymuthupa@micron.com>  writes:
>>
>>> On 5/11/2011 1:20 PM, Alan Cox wrote:
>>>> So a bigger queue helped (at least in 2006). The AHCI driver can be
>>>> taught your bigger queue easily enough. The question is where with a
>>>> *current* kernel are any remaining bottlenecks if you do that and
> how do
>>>> we fix them.
>>>
>>> Attached image/table shows the performance numbers on current kernel.
>>>
>>> The main objectives of our new mtipx2xx driver are
>>> 	1.) highest performance (see attached image/table),
>>> 	2.) lowest CPU utilization, and
>>
>> Can you collect perf data to show why the ahci driver is taking up so
>> much more CPU for the random I/O case?
>>
>
> Collected the perf data for ahci driver. As the call graph is getting 
> distorted in the email, attaching the perf data call graph report.

Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
testing.  Could you run again with deadline and/or noop and see how that
changes your throughput and perf report?  Also, just for completeness,
could you tell us which kernel you ran this against?

Thanks!
Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-01 19:51                 ` Jeff Moyer
@ 2011-06-01 20:21                   ` Alan Cox
  2011-06-15  1:29                     ` Asai Thambi S P
  2011-06-02  1:21                   ` David Dillow
  1 sibling, 1 reply; 40+ messages in thread
From: Alan Cox @ 2011-06-01 20:21 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

> Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
> testing.  Could you run again with deadline and/or noop and see how that
> changes your throughput and perf report?  Also, just for completeness,
> could you tell us which kernel you ran this against?

How many processors is this system, just looking at the lock contention
which is pretty horrible.

I'd been expecting various red flags in the AHCI/libata/scsi queue code
but it seems at first glance that the block queue stuff is killing us and
the scsi/ata code is a distraction (unless of course its causing a lot of
the lock time)

No-op would be most interesting but the I/O scheduler numbers don't look
pretty.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-01 19:51                 ` Jeff Moyer
  2011-06-01 20:21                   ` Alan Cox
@ 2011-06-02  1:21                   ` David Dillow
  2011-06-15  1:33                     ` Asai Thambi S P
  1 sibling, 1 reply; 40+ messages in thread
From: David Dillow @ 2011-06-02  1:21 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	Alan Cox, linux-ide

On Wed, 2011-06-01 at 15:51 -0400, Jeff Moyer wrote:
> "Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]"
> <asamymuthupa@micron.com> writes:
> > Collected the perf data for ahci driver. As the call graph is getting 
> > distorted in the email, attaching the perf data call graph report.
> 
> Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
> testing.  Could you run again with deadline and/or noop and see how that
> changes your throughput and perf report?  Also, just for completeness,
> could you tell us which kernel you ran this against?

Did the cc list get trimmed, or was the message too big for vger? It
didn't seem to make it to the list...

If it was too big, could someone trim it down a bit and repost?
Inquiring minds want to play along at home...

Thanks!


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-01 20:21                   ` Alan Cox
@ 2011-06-15  1:29                     ` Asai Thambi S P
  2011-06-15 14:43                       ` Jeff Moyer
  0 siblings, 1 reply; 40+ messages in thread
From: Asai Thambi S P @ 2011-06-15  1:29 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Moyer, linux-ide

On 6/1/2011 2:21 PM, Alan Cox wrote:
>> Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
>> testing.  Could you run again with deadline and/or noop and see how that
>> changes your throughput and perf report?  Also, just for completeness,
>> could you tell us which kernel you ran this against?

kernel 2.6.38.6

> How many processors is this system, just looking at the lock contention
> which is pretty horrible.

8 processors (2 quad-core CPUs)
Intel(R) Xeon(R) CPU           X5672  @ 3.20GHz

> I'd been expecting various red flags in the AHCI/libata/scsi queue code
> but it seems at first glance that the block queue stuff is killing us and
> the scsi/ata code is a distraction (unless of course its causing a lot of
> the lock time)
>
> No-op would be most interesting but the I/O scheduler numbers don't look
> pretty.
>
> Alan

On looking into the data in below links, lock and block queue are 
consuming more time when running with ahci driver. Correct me if I 
missing something.

With mtipx2xx, the driver is spot on processing I/O. Any idea of who is 
the top offender when running with mtipx2xx?

Filename                  |    Description
-------------------------------------------------------------------------------------------
perf_report_ahci_cfq      : perf call graph for ahci driver with cfq
                             scheduler enabled
http://www.micron.com/get-document/?documentId=6768

vdbench.ahci.cfq.html     : vdbench summary of a run for ahci driver
                             with cfq scheduler enabled
http://www.micron.com/get-document/?documentId=6774

perf_report_ahci_deadline : perf call graph for ahci driver with
                             deadline scheduler enabled
http://www.micron.com/get-document/?documentId=6769

vdbench_ahci.deadline.html: vdbench summary of a run for ahci driver
                             with deadline scheduler enabled
http://www.micron.com/get-document/?documentId=6775

perf_report_ahci_noop     : perf call graph for ahci driver with noop
                             scheduler enabled
http://www.micron.com/get-document/?documentId=6770

vdbench_ahci.noop.html    : vdbench summary of a run for ahci driver
                             with noop scheduler enabled
http://www.micron.com/get-document/?documentId=6776

perf_report_micron        : perf call graph for Micron block driver
                             (mtipx2xx)
http://www.micron.com/get-document/?documentId=6772

vdbench_micron.html       : vdbench summary of a run for Micron block
                             driver (mtipx2xx)
http://www.micron.com/get-document/?documentId=6777

This is the first time we are hosting files for public access, that 
process took time to get to here.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-02  1:21                   ` David Dillow
@ 2011-06-15  1:33                     ` Asai Thambi S P
  2011-06-15  3:12                       ` David Dillow
  0 siblings, 1 reply; 40+ messages in thread
From: Asai Thambi S P @ 2011-06-15  1:33 UTC (permalink / raw)
  To: David Dillow; +Cc: Jeff Moyer, Alan Cox, linux-ide

On 6/1/2011 7:21 PM, David Dillow wrote:
> On Wed, 2011-06-01 at 15:51 -0400, Jeff Moyer wrote:
>> "Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]"
>> <asamymuthupa@micron.com>  writes:
>>> Collected the perf data for ahci driver. As the call graph is getting
>>> distorted in the email, attaching the perf data call graph report.
>>
>> Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
>> testing.  Could you run again with deadline and/or noop and see how that
>> changes your throughput and perf report?  Also, just for completeness,
>> could you tell us which kernel you ran this against?
>
> Did the cc list get trimmed, or was the message too big for vger? It
> didn't seem to make it to the list...
>
> If it was too big, could someone trim it down a bit and repost?
> Inquiring minds want to play along at home...
>
> Thanks!
>

That message with attachment was too big for vger.

Here is the attachment:
Filename                  |    Description
-------------------------------------------------------------------------------------------
perf_report_-g_ahci_txt   : perf call graph for ahci driver with default
                             scheduler (cfq)
http://www.micron.com/get-document/?documentId=6771

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-15  1:33                     ` Asai Thambi S P
@ 2011-06-15  3:12                       ` David Dillow
  0 siblings, 0 replies; 40+ messages in thread
From: David Dillow @ 2011-06-15  3:12 UTC (permalink / raw)
  To: asamymuthupa; +Cc: linux-ide

On Tue, 2011-06-14 at 19:33 -0600, Asai Thambi S P wrote:
> On 6/1/2011 7:21 PM, David Dillow wrote:
> > Did the cc list get trimmed, or was the message too big for vger? It
> > didn't seem to make it to the list...
> >
> > If it was too big, could someone trim it down a bit and repost?
> > Inquiring minds want to play along at home...
> >
> > Thanks!
> >
> 
> That message with attachment was too big for vger.
> 
> Here is the attachment:
> Filename                  |    Description
> -------------------------------------------------------------------------------------------
> perf_report_-g_ahci_txt   : perf call graph for ahci driver with default
>                              scheduler (cfq)
> http://www.micron.com/get-document/?documentId=6771

Thank you for posting this,
Dave


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-15  1:29                     ` Asai Thambi S P
@ 2011-06-15 14:43                       ` Jeff Moyer
  2011-06-27 23:38                         ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  0 siblings, 1 reply; 40+ messages in thread
From: Jeff Moyer @ 2011-06-15 14:43 UTC (permalink / raw)
  To: asamymuthupa; +Cc: Alan Cox, linux-ide

Asai Thambi S P <asamymuthupa@micron.com> writes:

> On 6/1/2011 2:21 PM, Alan Cox wrote:
>>> Thanks, Asai!  I don't think cfq is the ideal I/O scheduler to be
>>> testing.  Could you run again with deadline and/or noop and see how that
>>> changes your throughput and perf report?  Also, just for completeness,
>>> could you tell us which kernel you ran this against?
>
> kernel 2.6.38.6

>> How many processors is this system, just looking at the lock contention
>> which is pretty horrible.
>
> 8 processors (2 quad-core CPUs)
> Intel(R) Xeon(R) CPU           X5672  @ 3.20GHz
>
[...]
> On looking into the data in below links, lock and block queue are
> consuming more time when running with ahci driver. Correct me if I
> missing something.

2.6.39 introduced the on-stack plugging work from Jens, the intent of
which is to reduce queue lock contention.  It would be great if you
could run with that kernel and noop to see if we make up some of the
performance gap (which looks to be just north of 10%).

Asai, thanks for running these tests and providing all of this data!

-Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: New driver mtipx2xx submission
  2011-06-15 14:43                       ` Jeff Moyer
@ 2011-06-27 23:38                         ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  2011-06-28 15:18                           ` Jeff Moyer
  0 siblings, 1 reply; 40+ messages in thread
From: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR] @ 2011-06-27 23:38 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Alan Cox, linux-ide

> 2.6.39 introduced the on-stack plugging work from Jens, the intent of
> which is to reduce queue lock contention.  It would be great if you
> could run with that kernel and noop to see if we make up some of the
> performance gap (which looks to be just north of 10%).
> 
> Asai, thanks for running these tests and providing all of this data!
> 
> -Jeff

At least with noop and deadline for ahci driver, now queue lock is not
the top offender. With the new change in plugging, seems noop and
deadline are spending more time in processing I/O similar to Micron
block driver.

With the current test results with 2.6.39.1 (new optimization for
plugging) and application queue depth of 32, 
     * Micron block driver exhibits 43% better IOPS than ahci driver
with noop
     * Micron block driver slightly better in CPU utilization.
     
With application queue depth of 256, Micron block driver is able to
leverage the device capability, and hence performance increases more
than 225%.

NOTE: To set up this benchmark, ran 'zerodrive' on the drive twice
before each test (standard benchmark procedure). The objective is to
compare the driver performance to call out the relative performance
capabilities between the drivers.

Links to perf report files below:

Filename                       |    Description
------------------------------------------------------------------------
------------------
mtip_ncq32.html                : vdbench summary of a run for Micron
block driver with app. queue depth of 32
http://www.micron.com/document_download/?documentId=6808

mtip_ncq32_report.txt          : performance call graph for Micron block
driver with app. queue depth of 32
http://www.micron.com/document_download/?documentId=6809

ahci_noop_ncq32.html           : vdbench summary of a run for ahci
driver with noop scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6814

ahci_noop_ncq32_report.txt     : performance call graph for ahci driver
with noop scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6815

ahci_cfq_ncq32.html            : vdbench summary of a run for ahci
driver with cfq scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6810

ahci_cfq_ncq32_report.txt      : performance call graph for ahci driver
with cfq scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6811

ahci_deadline_ncq32.html       : vdbench summary of a run for ahci
driver with deadline scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6812

ahci_deadline_ncq32_report.txt : performance call graph for ahci driver
with deadline scheduler enabled and app. queue depth of 32
http://www.micron.com/document_download/?documentId=6813

mtip_ncq256.html               : vdbench summary of a run for Micron
block driver with app. queue depth of 256
http://www.micron.com/document_download/?documentId=6816

mtip_ncq256_report.txt         : performance call graph for Micron block
driver with app. queue depth of 256
http://www.micron.com/document_download/?documentId=6817

config-2.6.39.1                : config file used for kernel 2.6.39.1
http://www.micron.com/document_download/?documentId=6818

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-27 23:38                         ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
@ 2011-06-28 15:18                           ` Jeff Moyer
  2011-06-28 15:31                             ` Alan Cox
  2011-07-06 21:39                             ` Asai Thambi S P
  0 siblings, 2 replies; 40+ messages in thread
From: Jeff Moyer @ 2011-06-28 15:18 UTC (permalink / raw)
  To: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
  Cc: Alan Cox, linux-ide

"Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]"
<asamymuthupa@micron.com> writes:

>> 2.6.39 introduced the on-stack plugging work from Jens, the intent of
>> which is to reduce queue lock contention.  It would be great if you
>> could run with that kernel and noop to see if we make up some of the
>> performance gap (which looks to be just north of 10%).
>> 
>> Asai, thanks for running these tests and providing all of this data!
>> 
>> -Jeff
>
> At least with noop and deadline for ahci driver, now queue lock is not
> the top offender. With the new change in plugging, seems noop and
> deadline are spending more time in processing I/O similar to Micron
> block driver.
>
> With the current test results with 2.6.39.1 (new optimization for
> plugging) and application queue depth of 32, 
>      * Micron block driver exhibits 43% better IOPS than ahci driver
> with noop
>      * Micron block driver slightly better in CPU utilization.
>
> With application queue depth of 256, Micron block driver is able to
> leverage the device capability, and hence performance increases more
> than 225%.
>

>From the perf report, I would have guessed that the CPU utilization for
the ahci test case would have been lower than the Micron block driver.
Odd, I wonder what I'm missing.  Asai, did you notice if any of the CPUs
was completely pegged during testing with ahci?  You're using a NUMA
box, right?  I also wonder what the irq distribution looked like, and
whether rq_affinity is hurting performance for the ahci case.  Also,
does the Micron driver do any sort of interrupt coalescing that maybe
the ahci driver isn't doing?

Anywho, a 40% difference is pretty significant (though NUMA can have
that sort of impact).  Alan, what do you think?  I was never clear on
how exactly the ahci driver would handle a queue depth larger than 32
(if it can't, then clearly we'd need a block driver for this hardware).

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-28 15:18                           ` Jeff Moyer
@ 2011-06-28 15:31                             ` Alan Cox
  2011-06-28 15:38                               ` Jeff Moyer
  2011-07-06 21:39                             ` Asai Thambi S P
  1 sibling, 1 reply; 40+ messages in thread
From: Alan Cox @ 2011-06-28 15:31 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

> Anywho, a 40% difference is pretty significant (though NUMA can have
> that sort of impact).  Alan, what do you think?  I was never clear on
> how exactly the ahci driver would handle a queue depth larger than 32
> (if it can't, then clearly we'd need a block driver for this hardware).

The AHCI driver can drop that support in fairly easily all the queue
handling is nicely pluggable.

A 40% improvement is a big win, but it seems to me that if any of that is
not hardware related (ie the bigger command queue) then we need to be
fixing the existing driver to improve *ALL* AHCI devices. Or are we going
to end up with a pile of AHCIish drivers for every random device on the
planet ?

My request for info on the Micron errata has so far been ignored,
questions on what is involved in the queue stuff likewise. I'd like to
see those matters resolved.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-28 15:31                             ` Alan Cox
@ 2011-06-28 15:38                               ` Jeff Moyer
  2011-07-06 21:43                                 ` Asai Thambi S P
  0 siblings, 1 reply; 40+ messages in thread
From: Jeff Moyer @ 2011-06-28 15:38 UTC (permalink / raw)
  To: Alan Cox
  Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR],
	linux-ide

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> Anywho, a 40% difference is pretty significant (though NUMA can have
>> that sort of impact).  Alan, what do you think?  I was never clear on
>> how exactly the ahci driver would handle a queue depth larger than 32
>> (if it can't, then clearly we'd need a block driver for this hardware).
>
> The AHCI driver can drop that support in fairly easily all the queue
> handling is nicely pluggable.
>
> A 40% improvement is a big win, but it seems to me that if any of that is
> not hardware related (ie the bigger command queue) then we need to be
> fixing the existing driver to improve *ALL* AHCI devices. Or are we going
> to end up with a pile of AHCIish drivers for every random device on the
> planet ?

The 40% number was for a queue depth of 32, so yes, there's room for
improvement in AHCI (so long as it isn't related to the block layer's
rq_affinity interacting poorly with NUMA or some such thing).

> My request for info on the Micron errata has so far been ignored,
> questions on what is involved in the queue stuff likewise. I'd like to
> see those matters resolved.

Asai?

-Jeff

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-28 15:18                           ` Jeff Moyer
  2011-06-28 15:31                             ` Alan Cox
@ 2011-07-06 21:39                             ` Asai Thambi S P
  1 sibling, 0 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-07-06 21:39 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Alan Cox, linux-ide

On 6/28/2011 9:18 AM, Jeff Moyer wrote:
>  From the perf report, I would have guessed that the CPU utilization for
> the ahci test case would have been lower than the Micron block driver.
> Odd, I wonder what I'm missing.  Asai, did you notice if any of the CPUs
> was completely pegged during testing with ahci?

Yes, whichever cpu was handling interrupts was pegged in both cases. 
The driver is configured to use a single MSI interrupt.


You're using a NUMA
> box, right?  I also wonder what the irq distribution looked like, and
> whether rq_affinity is hurting performance for the ahci case.

No, it's not a NUMA box.  Our runs were done with rq_affinity=0.  If you 
think it would be helpful, we can re-run with rq_affinity=1.


   Also,
> does the Micron driver do any sort of interrupt coalescing that maybe
> the ahci driver isn't doing?
>

We have a scheme to coalesce multiple completions into a single 
interrupt.  Both drivers were using this feature for these runs.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-06-28 15:38                               ` Jeff Moyer
@ 2011-07-06 21:43                                 ` Asai Thambi S P
  2011-07-07  7:37                                   ` Alan Cox
  2011-07-26 10:46                                     ` Alan Cox
  0 siblings, 2 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-07-06 21:43 UTC (permalink / raw)
  To: Jeff Moyer, Alan Cox; +Cc: linux-ide

On 6/28/2011 9:38 AM, Jeff Moyer wrote:
> Alan Cox<alan@lxorguk.ukuu.org.uk>  writes:
>> My request for info on the Micron errata has so far been ignored,
>> questions on what is involved in the queue stuff likewise. I'd like to
>> see those matters resolved.
>
> Asai?

We will follow up with you directly, Alan & Jeff, regarding the errata & 
workarounds.  We are compiling it and filtering it through legal, which 
as you might expect, is taking a bit of time.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-06 21:43                                 ` Asai Thambi S P
@ 2011-07-07  7:37                                   ` Alan Cox
  2011-07-26 10:46                                     ` Alan Cox
  1 sibling, 0 replies; 40+ messages in thread
From: Alan Cox @ 2011-07-07  7:37 UTC (permalink / raw)
  To: asamymuthupa; +Cc: Jeff Moyer, linux-ide

On Wed, 6 Jul 2011 15:43:46 -0600
Asai Thambi S P <asamymuthupa@micron.com> wrote:

> On 6/28/2011 9:38 AM, Jeff Moyer wrote:
> > Alan Cox<alan@lxorguk.ukuu.org.uk>  writes:
> >> My request for info on the Micron errata has so far been ignored,
> >> questions on what is involved in the queue stuff likewise. I'd like to
> >> see those matters resolved.
> >
> > Asai?
> 
> We will follow up with you directly, Alan & Jeff, regarding the errata & 
> workarounds.  We are compiling it and filtering it through legal, which 
> as you might expect, is taking a bit of time.

Sure.. I'm all to familiar with that, from both sides of the fence.

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-06 21:43                                 ` Asai Thambi S P
@ 2011-07-26 10:46                                     ` Alan Cox
  2011-07-26 10:46                                     ` Alan Cox
  1 sibling, 0 replies; 40+ messages in thread
From: Alan Cox @ 2011-07-26 10:46 UTC (permalink / raw)
  To: asamymuthupa; +Cc: Jeff Moyer, linux-ide, linux-kernel

Sorry this has taken a while - I've been away and also dealing with
various bits of graphics security stuff.

I've now been through the errata, the timing data and the driver code in
somewhat more detail

Overall:
  The hardware deviates a bit from AHCI. The AHCI driver could be taught
  to support it but even with the longer queue supported it's not clear
  this is the right path, and some of the error handling needs deviate a
  bit.

  The performance numbers are pretty definitive, and the data shows that
  is mostly higher up in the queue handling. That's awkward in some ways
  because it means there isn't an obvious way to fix it, and we still
  want the queue stuff for 'normal' disks.

  Looking at other vendors there don't seem to be a pile of them also
  planning to do AHCI with extras instead most seem focussed on NVHMCI so
  it doesn't look like a pile of near identikit drivers will appear. Also
  if they do we would probably want them all to be related to this driver
  not to the general AHCI driver.

So having gone over it all I think the case is rather well made for this
being added as its own driver matching their specific PCI idents, but with
some code clean up, and possibly some further compatibility code if it
turns out some general ide/scsi tools don't work on it as expected.

Comments on the driver code

Questions:
  Should there be security checks on the ioctl interfaces ?

Code:
  Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
  of ovherad you don't need

- Lots of global function names with general naming. This causes problems
  in Linux because all the compiled in drivers share a common namespace.
  So they really ought to be something like

	mtip_ahci_write()

  and so on

- Semaphores. Unless you need the counting properties please use mutexes.
  Sempahores really make for problems in hard real time environments if
  using the -rt kernel additions

Style:
- Confuses our kernel-doc tools as it has its own different comment
  extraction format. That wants pulling into line (it looks like all the
  info is there and its a 'perl script from hell' sort of conversion)

- Various struct names in capitals - please search/replace those as for
  style we keep capitals for defines

- Various ifdefs and a lot of printk stuff. Some of this is clearly
  because its a development driver, but it ought to be tidied for a final
  submission. Also use of dev_info/dev_err etc is strongly preferred as
  it means a user and tools can clearly identify which device generated
  the message (dev_dbg() supports runtime debug switching so may also
  deal with stuff you'd otherwise remove later)

- for ata_swap_string look at bswap()


Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
@ 2011-07-26 10:46                                     ` Alan Cox
  0 siblings, 0 replies; 40+ messages in thread
From: Alan Cox @ 2011-07-26 10:46 UTC (permalink / raw)
  To: asamymuthupa; +Cc: Jeff Moyer, linux-ide, linux-kernel

Sorry this has taken a while - I've been away and also dealing with
various bits of graphics security stuff.

I've now been through the errata, the timing data and the driver code in
somewhat more detail

Overall:
  The hardware deviates a bit from AHCI. The AHCI driver could be taught
  to support it but even with the longer queue supported it's not clear
  this is the right path, and some of the error handling needs deviate a
  bit.

  The performance numbers are pretty definitive, and the data shows that
  is mostly higher up in the queue handling. That's awkward in some ways
  because it means there isn't an obvious way to fix it, and we still
  want the queue stuff for 'normal' disks.

  Looking at other vendors there don't seem to be a pile of them also
  planning to do AHCI with extras instead most seem focussed on NVHMCI so
  it doesn't look like a pile of near identikit drivers will appear. Also
  if they do we would probably want them all to be related to this driver
  not to the general AHCI driver.

So having gone over it all I think the case is rather well made for this
being added as its own driver matching their specific PCI idents, but with
some code clean up, and possibly some further compatibility code if it
turns out some general ide/scsi tools don't work on it as expected.

Comments on the driver code

Questions:
  Should there be security checks on the ioctl interfaces ?

Code:
  Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
  of ovherad you don't need

- Lots of global function names with general naming. This causes problems
  in Linux because all the compiled in drivers share a common namespace.
  So they really ought to be something like

	mtip_ahci_write()

  and so on

- Semaphores. Unless you need the counting properties please use mutexes.
  Sempahores really make for problems in hard real time environments if
  using the -rt kernel additions

Style:
- Confuses our kernel-doc tools as it has its own different comment
  extraction format. That wants pulling into line (it looks like all the
  info is there and its a 'perl script from hell' sort of conversion)

- Various struct names in capitals - please search/replace those as for
  style we keep capitals for defines

- Various ifdefs and a lot of printk stuff. Some of this is clearly
  because its a development driver, but it ought to be tidied for a final
  submission. Also use of dev_info/dev_err etc is strongly preferred as
  it means a user and tools can clearly identify which device generated
  the message (dev_dbg() supports runtime debug switching so may also
  deal with stuff you'd otherwise remove later)

- for ata_swap_string look at bswap()


Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-26 10:46                                     ` Alan Cox
  (?)
@ 2011-07-26 11:44                                     ` Christoph Hellwig
  2011-07-26 11:49                                       ` Alan Cox
  2011-07-26 18:50                                       ` Jeff Garzik
  -1 siblings, 2 replies; 40+ messages in thread
From: Christoph Hellwig @ 2011-07-26 11:44 UTC (permalink / raw)
  To: Alan Cox; +Cc: asamymuthupa, Jeff Moyer, linux-ide, linux-kernel

On Tue, Jul 26, 2011 at 11:46:40AM +0100, Alan Cox wrote:
> Sorry this has taken a while - I've been away and also dealing with
> various bits of graphics security stuff.

Alan, where did the mail that you reply to originate from?  Care to post
the whole driver to lkml?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-26 11:44                                     ` Christoph Hellwig
@ 2011-07-26 11:49                                       ` Alan Cox
  2011-07-26 18:50                                       ` Jeff Garzik
  1 sibling, 0 replies; 40+ messages in thread
From: Alan Cox @ 2011-07-26 11:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: asamymuthupa, Jeff Moyer, linux-ide, linux-kernel

On Tue, 26 Jul 2011 07:44:45 -0400
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jul 26, 2011 at 11:46:40AM +0100, Alan Cox wrote:
> > Sorry this has taken a while - I've been away and also dealing with
> > various bits of graphics security stuff.
> 
> Alan, where did the mail that you reply to originate from?  Care to post
> the whole driver to lkml?

linux-ide

Alan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-26 11:44                                     ` Christoph Hellwig
  2011-07-26 11:49                                       ` Alan Cox
@ 2011-07-26 18:50                                       ` Jeff Garzik
  1 sibling, 0 replies; 40+ messages in thread
From: Jeff Garzik @ 2011-07-26 18:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alan Cox, asamymuthupa, Jeff Moyer, linux-ide, linux-kernel

On 07/26/2011 07:44 AM, Christoph Hellwig wrote:
> On Tue, Jul 26, 2011 at 11:46:40AM +0100, Alan Cox wrote:
>> Sorry this has taken a while - I've been away and also dealing with
>> various bits of graphics security stuff.
>
> Alan, where did the mail that you reply to originate from?  Care to post
> the whole driver to lkml?

The original, full post was back in April:
http://marc.info/?l=linux-ide&m=130400649131349&w=2


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-26 10:46                                     ` Alan Cox
@ 2011-07-29 18:13                                       ` Asai Thambi S P
  -1 siblings, 0 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-07-29 18:13 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Moyer, linux-ide, linux-kernel, Jeff Garzik

On 7/26/2011 4:46 AM, Alan Cox wrote:
> Sorry this has taken a while - I've been away and also dealing with
> various bits of graphics security stuff.
> 
> I've now been through the errata, the timing data and the driver code in
> somewhat more detail
> 
> Overall:
>   The hardware deviates a bit from AHCI. The AHCI driver could be taught
>   to support it but even with the longer queue supported it's not clear
>   this is the right path, and some of the error handling needs deviate a
>   bit.
> 
>   The performance numbers are pretty definitive, and the data shows that
>   is mostly higher up in the queue handling. That's awkward in some ways
>   because it means there isn't an obvious way to fix it, and we still
>   want the queue stuff for 'normal' disks.
> 
>   Looking at other vendors there don't seem to be a pile of them also
>   planning to do AHCI with extras instead most seem focussed on NVHMCI so
>   it doesn't look like a pile of near identikit drivers will appear. Also
>   if they do we would probably want them all to be related to this driver
>   not to the general AHCI driver.
> 
> So having gone over it all I think the case is rather well made for this
> being added as its own driver matching their specific PCI idents, but with
> some code clean up, and possibly some further compatibility code if it
> turns out some general ide/scsi tools don't work on it as expected.
> 
> Comments on the driver code
> 
> Questions:
>   Should there be security checks on the ioctl interfaces ?
> 
> Code:
>   Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
>   of ovherad you don't need
> 
> - Lots of global function names with general naming. This causes problems
>   in Linux because all the compiled in drivers share a common namespace.
>   So they really ought to be something like
> 
> 	mtip_ahci_write()
> 
>   and so on
> 
> - Semaphores. Unless you need the counting properties please use mutexes.
>   Sempahores really make for problems in hard real time environments if
>   using the -rt kernel additions
> 
> Style:
> - Confuses our kernel-doc tools as it has its own different comment
>   extraction format. That wants pulling into line (it looks like all the
>   info is there and its a 'perl script from hell' sort of conversion)
> 
> - Various struct names in capitals - please search/replace those as for
>   style we keep capitals for defines
> 
> - Various ifdefs and a lot of printk stuff. Some of this is clearly
>   because its a development driver, but it ought to be tidied for a final
>   submission. Also use of dev_info/dev_err etc is strongly preferred as
>   it means a user and tools can clearly identify which device generated
>   the message (dev_dbg() supports runtime debug switching so may also
>   deal with stuff you'd otherwise remove later)
> 
> - for ata_swap_string look at bswap()

Thanks for your valuable feedback. We started making changes to the
driver, and expect to complete it soon.

-- 
Regards,
Asai Thambi

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
@ 2011-07-29 18:13                                       ` Asai Thambi S P
  0 siblings, 0 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-07-29 18:13 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Moyer, linux-ide, linux-kernel, Jeff Garzik

On 7/26/2011 4:46 AM, Alan Cox wrote:
> Sorry this has taken a while - I've been away and also dealing with
> various bits of graphics security stuff.
> 
> I've now been through the errata, the timing data and the driver code in
> somewhat more detail
> 
> Overall:
>   The hardware deviates a bit from AHCI. The AHCI driver could be taught
>   to support it but even with the longer queue supported it's not clear
>   this is the right path, and some of the error handling needs deviate a
>   bit.
> 
>   The performance numbers are pretty definitive, and the data shows that
>   is mostly higher up in the queue handling. That's awkward in some ways
>   because it means there isn't an obvious way to fix it, and we still
>   want the queue stuff for 'normal' disks.
> 
>   Looking at other vendors there don't seem to be a pile of them also
>   planning to do AHCI with extras instead most seem focussed on NVHMCI so
>   it doesn't look like a pile of near identikit drivers will appear. Also
>   if they do we would probably want them all to be related to this driver
>   not to the general AHCI driver.
> 
> So having gone over it all I think the case is rather well made for this
> being added as its own driver matching their specific PCI idents, but with
> some code clean up, and possibly some further compatibility code if it
> turns out some general ide/scsi tools don't work on it as expected.
> 
> Comments on the driver code
> 
> Questions:
>   Should there be security checks on the ioctl interfaces ?
> 
> Code:
>   Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
>   of ovherad you don't need
> 
> - Lots of global function names with general naming. This causes problems
>   in Linux because all the compiled in drivers share a common namespace.
>   So they really ought to be something like
> 
> 	mtip_ahci_write()
> 
>   and so on
> 
> - Semaphores. Unless you need the counting properties please use mutexes.
>   Sempahores really make for problems in hard real time environments if
>   using the -rt kernel additions
> 
> Style:
> - Confuses our kernel-doc tools as it has its own different comment
>   extraction format. That wants pulling into line (it looks like all the
>   info is there and its a 'perl script from hell' sort of conversion)
> 
> - Various struct names in capitals - please search/replace those as for
>   style we keep capitals for defines
> 
> - Various ifdefs and a lot of printk stuff. Some of this is clearly
>   because its a development driver, but it ought to be tidied for a final
>   submission. Also use of dev_info/dev_err etc is strongly preferred as
>   it means a user and tools can clearly identify which device generated
>   the message (dev_dbg() supports runtime debug switching so may also
>   deal with stuff you'd otherwise remove later)
> 
> - for ata_swap_string look at bswap()

Thanks for your valuable feedback. We started making changes to the
driver, and expect to complete it soon.

-- 
Regards,
Asai Thambi

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
  2011-07-26 10:46                                     ` Alan Cox
@ 2011-08-11 18:36                                       ` Asai Thambi S P
  -1 siblings, 0 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-08-11 18:36 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Moyer, linux-ide, linux-kernel, Jens Axboe

On 7/26/2011 4:46 AM, Alan Cox wrote:
> Sorry this has taken a while - I've been away and also dealing with
> various bits of graphics security stuff.
> 
> I've now been through the errata, the timing data and the driver code in
> somewhat more detail
> 
> Overall:
>   The hardware deviates a bit from AHCI. The AHCI driver could be taught
>   to support it but even with the longer queue supported it's not clear
>   this is the right path, and some of the error handling needs deviate a
>   bit.
> 
>   The performance numbers are pretty definitive, and the data shows that
>   is mostly higher up in the queue handling. That's awkward in some ways
>   because it means there isn't an obvious way to fix it, and we still
>   want the queue stuff for 'normal' disks.
> 
>   Looking at other vendors there don't seem to be a pile of them also
>   planning to do AHCI with extras instead most seem focussed on NVHMCI so
>   it doesn't look like a pile of near identikit drivers will appear. Also
>   if they do we would probably want them all to be related to this driver
>   not to the general AHCI driver.
> 
> So having gone over it all I think the case is rather well made for this
> being added as its own driver matching their specific PCI idents, but with
> some code clean up, and possibly some further compatibility code if it
> turns out some general ide/scsi tools don't work on it as expected.

Thanks for taking the time to review the errata, performance profiles,
and early driver code.  We've cleaned up much of the ugliness in the
version you inspected so it should be easier on the eyes now.

We changed the driver name from mtipx2xx to mtip32xx. Open to a generic
name if other vendors are planning to use this driver.

> 
> Comments on the driver code
> 
> Questions:
>   Should there be security checks on the ioctl interfaces ?

We added capable(CAP_SYS_ADMIN) checks on the ioctls.

> 
> Code:
>   Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
>   of ovherad you don't need

All vmallocs were converted to kzallocs.

> 
> - Lots of global function names with general naming. This causes problems
>   in Linux because all the compiled in drivers share a common namespace.
>   So they really ought to be something like
> 
> 	mtip_ahci_write()
> 
>   and so on

We converted all non-static functions to kernel-compatible nomenclature.
We used the mtip prefix based on your suggestion.

> 
> - Semaphores. Unless you need the counting properties please use mutexes.
>   Sempahores really make for problems in hard real time environments if
>   using the -rt kernel additions

Here we had some trouble.  We needed the counting semaphore to put the
make_request calling context to sleep if there are no empty slots.  We
also needed the rw semaphore to prioritize internal commands and ioctls
during heavy IO load.  There seemed to be a fairness problem that was
best solved through the rw semaphore.  If you have another suggestion
for a "fair" semaphore, we'd love to hear it.

> 
> Style:
> - Confuses our kernel-doc tools as it has its own different comment
>   extraction format. That wants pulling into line (it looks like all the
>   info is there and its a 'perl script from hell' sort of conversion)

We did our best to make the comments and format consistent with other
drivers.

> 
> - Various struct names in capitals - please search/replace those as for
>   style we keep capitals for defines
> 
> - Various ifdefs and a lot of printk stuff. Some of this is clearly
>   because its a development driver, but it ought to be tidied for a final
>   submission. Also use of dev_info/dev_err etc is strongly preferred as
>   it means a user and tools can clearly identify which device generated
>   the message (dev_dbg() supports runtime debug switching so may also
>   deal with stuff you'd otherwise remove later)

We've converted much of the logging to dev_* semantics.  The challenge
for us was reconciling all the messages that we feel are important
against the need to not spam the system log.  We think we've made a
reasonable compromise, at least compared to the last driver we posted.

We are of course open to removing any logging deemed superfluous.

> 
> - for ata_swap_string look at bswap()

We experimented with bswap but in the end felt that be16_to_cpus
was a better choice.

I will post the updated patch shortly.

-- 
Regards,
Asai Thambi

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
@ 2011-08-11 18:36                                       ` Asai Thambi S P
  0 siblings, 0 replies; 40+ messages in thread
From: Asai Thambi S P @ 2011-08-11 18:36 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Moyer, linux-ide, linux-kernel, Jens Axboe

On 7/26/2011 4:46 AM, Alan Cox wrote:
> Sorry this has taken a while - I've been away and also dealing with
> various bits of graphics security stuff.
> 
> I've now been through the errata, the timing data and the driver code in
> somewhat more detail
> 
> Overall:
>   The hardware deviates a bit from AHCI. The AHCI driver could be taught
>   to support it but even with the longer queue supported it's not clear
>   this is the right path, and some of the error handling needs deviate a
>   bit.
> 
>   The performance numbers are pretty definitive, and the data shows that
>   is mostly higher up in the queue handling. That's awkward in some ways
>   because it means there isn't an obvious way to fix it, and we still
>   want the queue stuff for 'normal' disks.
> 
>   Looking at other vendors there don't seem to be a pile of them also
>   planning to do AHCI with extras instead most seem focussed on NVHMCI so
>   it doesn't look like a pile of near identikit drivers will appear. Also
>   if they do we would probably want them all to be related to this driver
>   not to the general AHCI driver.
> 
> So having gone over it all I think the case is rather well made for this
> being added as its own driver matching their specific PCI idents, but with
> some code clean up, and possibly some further compatibility code if it
> turns out some general ide/scsi tools don't work on it as expected.

Thanks for taking the time to review the errata, performance profiles,
and early driver code.  We've cleaned up much of the ugliness in the
version you inspected so it should be easier on the eyes now.

We changed the driver name from mtipx2xx to mtip32xx. Open to a generic
name if other vendors are planning to use this driver.

> 
> Comments on the driver code
> 
> Questions:
>   Should there be security checks on the ioctl interfaces ?

We added capable(CAP_SYS_ADMIN) checks on the ioctls.

> 
> Code:
>   Use k[mz]alloc/kfree for small objects like structs, vmalloc has a lot
>   of ovherad you don't need

All vmallocs were converted to kzallocs.

> 
> - Lots of global function names with general naming. This causes problems
>   in Linux because all the compiled in drivers share a common namespace.
>   So they really ought to be something like
> 
> 	mtip_ahci_write()
> 
>   and so on

We converted all non-static functions to kernel-compatible nomenclature.
We used the mtip prefix based on your suggestion.

> 
> - Semaphores. Unless you need the counting properties please use mutexes.
>   Sempahores really make for problems in hard real time environments if
>   using the -rt kernel additions

Here we had some trouble.  We needed the counting semaphore to put the
make_request calling context to sleep if there are no empty slots.  We
also needed the rw semaphore to prioritize internal commands and ioctls
during heavy IO load.  There seemed to be a fairness problem that was
best solved through the rw semaphore.  If you have another suggestion
for a "fair" semaphore, we'd love to hear it.

> 
> Style:
> - Confuses our kernel-doc tools as it has its own different comment
>   extraction format. That wants pulling into line (it looks like all the
>   info is there and its a 'perl script from hell' sort of conversion)

We did our best to make the comments and format consistent with other
drivers.

> 
> - Various struct names in capitals - please search/replace those as for
>   style we keep capitals for defines
> 
> - Various ifdefs and a lot of printk stuff. Some of this is clearly
>   because its a development driver, but it ought to be tidied for a final
>   submission. Also use of dev_info/dev_err etc is strongly preferred as
>   it means a user and tools can clearly identify which device generated
>   the message (dev_dbg() supports runtime debug switching so may also
>   deal with stuff you'd otherwise remove later)

We've converted much of the logging to dev_* semantics.  The challenge
for us was reconciling all the messages that we feel are important
against the need to not spam the system log.  We think we've made a
reasonable compromise, at least compared to the last driver we posted.

We are of course open to removing any logging deemed superfluous.

> 
> - for ata_swap_string look at bswap()

We experimented with bswap but in the end felt that be16_to_cpus
was a better choice.

I will post the updated patch shortly.

-- 
Regards,
Asai Thambi

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: New driver mtipx2xx submission
@ 2011-05-03 11:09 Jordan_Hargrave
  0 siblings, 0 replies; 40+ messages in thread
From: Jordan_Hargrave @ 2011-05-03 11:09 UTC (permalink / raw)
  To: linux-ide

>Alan Cox <alan@xxxxxxxxxxxxxxxxxxx> writes:
>
>>> We have written a new block driver for our AHCI based PCIe SSDs. The
>>> main objective of our product is providing high performance. Traffic
>>> through OS storage stack is not to able fully utilize the device's
>>> capabilty. To improve the traffic to the device and hence
>>> showcase/utilize the device's capability, we have come up with this new
>>> block driver. This driver includes
>>> 	* utilize device's increased queue depth
>>> 	* workaround for hardware errata
>>> 
>>> We want to get this driver into kernel tree to support the device out of
>>> the box. Attached this driver as a patch for latest kernel. We would
>>> like to get your comments, and also open for discussion.
>>
>> The kernel starting point would be that we have an AHCI driver. If you
>> need workarounds for hardware errata then they can go into it and that is
>> fine. We support NCQ so we can use the queue depths. If there are
>> extensions then the AHCI driver can be enhanced.
>
>Given the highly parallel nature of these parts, I wouldn't be surprised
>if the ahci queue depth of 31 is one of the main bottlenecks.  Can you
>think of a way to extend the ahci driver in this manner to accommodate
>devices like this one?
>
>Cheers,
>Jeff

Correct, the device appears as a single AHCI port device but supports a
queue depth of 128.  It does this by muxing the registers from port 1,2,3
onto port 0. I think this might be possible by creating device specific
port_start, qc_issue and change_queue_depth functions.

--jordan hargrave
Dell Enterprise Linux Engineering



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2011-08-11 18:37 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-28 15:53 New driver mtipx2xx submission Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
2011-04-28 22:06 ` Alan Cox
2011-05-02 12:40   ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
2011-05-02 17:42     ` Alan Cox
2011-05-03 20:07       ` [PATCH 0/3] " Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
2011-05-11 17:40       ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
2011-05-11 19:20         ` Alan Cox
2011-05-21  2:26           ` Asai Thambi S P
2011-05-25 14:36             ` Jeff Moyer
     [not found]               ` <22A973199D2C2F46933448F6E7990A300239EA77@ntxboimbx31.micron.com>
2011-06-01 19:51                 ` Jeff Moyer
2011-06-01 20:21                   ` Alan Cox
2011-06-15  1:29                     ` Asai Thambi S P
2011-06-15 14:43                       ` Jeff Moyer
2011-06-27 23:38                         ` Asai Thambi Samymuthu Pattrayasamy (asamymuthupa) [CONTRACTOR]
2011-06-28 15:18                           ` Jeff Moyer
2011-06-28 15:31                             ` Alan Cox
2011-06-28 15:38                               ` Jeff Moyer
2011-07-06 21:43                                 ` Asai Thambi S P
2011-07-07  7:37                                   ` Alan Cox
2011-07-26 10:46                                   ` Alan Cox
2011-07-26 10:46                                     ` Alan Cox
2011-07-26 11:44                                     ` Christoph Hellwig
2011-07-26 11:49                                       ` Alan Cox
2011-07-26 18:50                                       ` Jeff Garzik
2011-07-29 18:13                                     ` Asai Thambi S P
2011-07-29 18:13                                       ` Asai Thambi S P
2011-08-11 18:36                                     ` Asai Thambi S P
2011-08-11 18:36                                       ` Asai Thambi S P
2011-07-06 21:39                             ` Asai Thambi S P
2011-06-02  1:21                   ` David Dillow
2011-06-15  1:33                     ` Asai Thambi S P
2011-06-15  3:12                       ` David Dillow
2011-05-02 18:40   ` Jeff Moyer
2011-05-02 18:52     ` Alan Cox
2011-05-03 15:04       ` Mark Lord
2011-05-03 15:07         ` Alan Cox
2011-05-03 15:08           ` Mark Lord
2011-05-03 15:02     ` Mark Lord
2011-05-12 14:39       ` Jeff Garzik
2011-05-03 11:09 Jordan_Hargrave

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.