* [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
@ 2013-05-21 19:00 Alexander Gordeev
  2013-05-21 19:00 ` [PATCH RESEND 1/1] " Alexander Gordeev
  2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
  0 siblings, 2 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-05-21 19:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, linux-ide, Jeff Garzik

Hi Tejun,

This is a patch I sent to Jeff a few months ago. As you asked, I am
resending it on top of the 3.10-rc2 Linus tree. Jeff has said he
applied this patch, but I am not sure what exactly that means ;)

Also, I am not sure about my reading of the statistics and the
trade-off I identified (below), so this patch is an RFC.

The numbers were taken by running 'dd if=/dev/sd{a,b,c} of=/dev/null'.
All time values are in us.

Before this update host lock average holdtime was 3.266532061 and
average waittime was 0.009832679 [1]. After the update average
holdtime (slightly) rose up to 0.335267418 while average waittime
decreased to 0.000320469 [2]. This means the host lock, with local
interrupts disabled, is held for roughly the same time while the
average waittime dropped about 30 times.

After this update port events are handled with local interrupts
enabled and compete on individual per-port locks with average
holdtime 1.540987475 and average waittime 0.000714864 [3].
Compared to [1], ata_scsi_queuecmd() holds port locks about 2 times
less and waits for locks about 13 times less.
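
The "30 times", "2 times" and "13 times" figures can be re-derived from the
averages quoted above. The following is only a small arithmetic sketch; the
function and variable names are made up for illustration and are not part of
the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Averages quoted in the text above, in microseconds. */
const double host_wait_before = 0.009832679;	/* [1] host->lock waittime */
const double host_wait_after  = 0.000320469;	/* [2] host->lock waittime */
const double host_hold_before = 3.266532061;	/* [1] host->lock holdtime */
const double port_hold_after  = 1.540987475;	/* [3] per-port lock holdtime */
const double port_wait_after  = 0.000714864;	/* [3] per-port lock waittime */

/* Return how many times smaller 'after' is than 'before'. */
double reduction_factor(double before, double after)
{
	assert(after > 0.0);	/* guard against division by zero */
	return before / after;
}

/* Print the three ratios the summary refers to. */
void print_reductions(void)
{
	printf("waittime [1] vs [2]: %.1fx\n",
	       reduction_factor(host_wait_before, host_wait_after));
	printf("holdtime [1] vs [3]: %.1fx\n",
	       reduction_factor(host_hold_before, port_hold_after));
	printf("waittime [1] vs [3]: %.1fx\n",
	       reduction_factor(host_wait_before, port_wait_after));
}
```

Calling print_reductions() gives ratios of roughly 30.7, 2.1 and 13.8, which
the text rounds down to 30, 2 and 13.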

The downside of this change is the introduction of a kernel thread
and a (supposedly) increased total average time of handling an
AHCI interrupt - at most 1.5 times.

The upside is better access times from ata_scsi_queuecmd() to
port locks and moving port interrupt handling out of the
hardware interrupt context.

Thanks!

Lock usage statistics.

	1. ahci_interrupt vs ata_scsi_queuecmd (host->lock)

Test	holdtime-total	waittime-total	acquisitions	holdtime-avg	waittime-avg
#
01.	22732497.77	93399.89	06393367	3.555637862	0.014608874
02.	20358052.08	52869.72	06454133	3.154265969	0.008191607
03.	20322516.57	54981.40	06459318	3.146232554	0.008511951
04.	18558686.89	39178.05	06469468	2.868657344	0.006055838
05.	19069799.90	31961.00	06455953	2.953831897	0.004950625
06.	23783542.56	97159.79	06387322	3.723554654	0.015211350
07.	23889266.74	102625.45	06386666	3.740491007	0.016068705
08.	19284522.61	32655.91	06450568	2.989585198	0.005062486
							-----------	-----------
avg							3.266532061	0.009832679
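
The "avg" row above is the plain arithmetic mean of the eight per-run
averages. A minimal sketch for checking it, with array and function names
invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Per-run holdtime-avg values from table 1, in microseconds. */
const double holdtime_avg[8] = {
	3.555637862, 3.154265969, 3.146232554, 2.868657344,
	2.953831897, 3.723554654, 3.740491007, 2.989585198,
};

/* Per-run waittime-avg values from table 1, in microseconds. */
const double waittime_avg[8] = {
	0.014608874, 0.008191607, 0.008511951, 0.006055838,
	0.004950625, 0.015211350, 0.016068705, 0.005062486,
};

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);	/* an empty sample set has no mean */
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}
```

mean(holdtime_avg, 8) and mean(waittime_avg, 8) agree with the reported
3.266532061 and 0.009832679 to within rounding of the last digit.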

	2. ahci_single_irq_intr vs ahci_port_thread_fn (host->lock)


Alexander Gordeev (1):
  AHCI: Optimize interrupt processing

 drivers/ata/acard-ahci.c    |    8 ++---
 drivers/ata/ahci.c          |   54 ++++++++++++++++++-------------
 drivers/ata/ahci.h          |   10 +++--
 drivers/ata/ahci_platform.c |    3 +-
 drivers/ata/libahci.c       |   74 +++++++++++++++++++++++++------------------
 5 files changed, 85 insertions(+), 64 deletions(-)

-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [PATCH RESEND 1/1] AHCI: Optimize interrupt processing
  2013-05-21 19:00 [PATCH RESEND 0/1] AHCI: Optimize interrupt processing Alexander Gordeev
@ 2013-05-21 19:00 ` Alexander Gordeev
  2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
  1 sibling, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-05-21 19:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, linux-ide, Jeff Garzik, Jan Beulich

Split the interrupt service routine into a hardware context handler and
a threaded context handler. That allows ports to be protected with
individual locks rather than with a single host-wide lock, which results
in better parallelism.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
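
The split described in the commit message can be modelled in userspace as
follows. This is only a sketch under stated assumptions, not the driver code:
pthread mutexes stand in for host->lock and the per-port lock, all model_*
names are hypothetical, and the snapshot-and-clear step in the threaded half
is an assumption, since the hunks below do not show how the latched status is
read back.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct model_host {
	pthread_mutex_t lock;		/* stands in for host->lock */
};

struct model_port {
	struct model_host *host;
	pthread_mutex_t lock;		/* stands in for the per-port lock */
	uint32_t intr_status;		/* bits latched by the hard half */
	uint32_t handled;		/* bits processed by the threaded half */
};

void model_init(struct model_host *host, struct model_port *port)
{
	int rc;

	rc = pthread_mutex_init(&host->lock, NULL);
	assert(rc == 0);
	rc = pthread_mutex_init(&port->lock, NULL);
	assert(rc == 0);
	port->host = host;
	port->intr_status = 0;
	port->handled = 0;
}

/* Hard half: only latch the status bits, briefly, under the host lock. */
void model_hard_irq(struct model_port *port, uint32_t status)
{
	pthread_mutex_lock(&port->host->lock);
	port->intr_status |= status;
	pthread_mutex_unlock(&port->host->lock);
}

/*
 * Threaded half: snapshot and clear the latched bits under the host lock
 * (assumed), then do the actual handling under the per-port lock only.
 */
uint32_t model_port_thread(struct model_port *port)
{
	uint32_t status;

	pthread_mutex_lock(&port->host->lock);
	status = port->intr_status;
	port->intr_status = 0;
	pthread_mutex_unlock(&port->host->lock);

	pthread_mutex_lock(&port->lock);
	port->handled |= status;	/* placeholder for real event handling */
	pthread_mutex_unlock(&port->lock);

	return status;
}
```

In this model the host-wide lock is held only for the short latch and
snapshot steps, while the longer handling step contends only with users of
the same port.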
 drivers/ata/acard-ahci.c    |    8 ++---
 drivers/ata/ahci.c          |   54 ++++++++++++++++++-------------
 drivers/ata/ahci.h          |   10 +++--
 drivers/ata/ahci_platform.c |    3 +-
 drivers/ata/libahci.c       |   74 +++++++++++++++++++++++++------------------
 5 files changed, 85 insertions(+), 64 deletions(-)

diff --git a/drivers/ata/acard-ahci.c b/drivers/ata/acard-ahci.c
index 4e94ba2..e429225 100644
--- a/drivers/ata/acard-ahci.c
+++ b/drivers/ata/acard-ahci.c
@@ -409,7 +409,7 @@ static int acard_ahci_init_one(struct pci_dev *pdev, const struct pci_device_id
 	struct device *dev = &pdev->dev;
 	struct ahci_host_priv *hpriv;
 	struct ata_host *host;
-	int n_ports, i, rc;
+	int n_ports, n_msis, i, rc;
 
 	VPRINTK("ENTER\n");
 
@@ -436,8 +436,7 @@ static int acard_ahci_init_one(struct pci_dev *pdev, const struct pci_device_id
 		return -ENOMEM;
 	hpriv->flags |= (unsigned long)pi.private_data;
 
-	if (!(hpriv->flags & AHCI_HFLAG_NO_MSI))
-		pci_enable_msi(pdev);
+	n_msis = ahci_init_interrupts(pdev, hpriv);
 
 	hpriv->mmio = pcim_iomap_table(pdev)[AHCI_PCI_BAR];
 
@@ -499,8 +498,7 @@ static int acard_ahci_init_one(struct pci_dev *pdev, const struct pci_device_id
 	acard_ahci_pci_print_info(host);
 
 	pci_set_master(pdev);
-	return ata_host_activate(host, pdev->irq, ahci_interrupt, IRQF_SHARED,
-				 &acard_ahci_sht);
+	return ahci_host_activate(host, pdev->irq, n_msis, &acard_ahci_sht);
 }
 
 module_pci_driver(acard_ahci_pci_driver);
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 251e57d..e4d915f 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1110,14 +1110,14 @@ int ahci_init_interrupts(struct pci_dev *pdev, struct ahci_host_priv *hpriv)
 	pci_intx(pdev, 1);
 	return 0;
 }
+EXPORT_SYMBOL_GPL(ahci_init_interrupts);
 
 /**
  *	ahci_host_activate - start AHCI host, request IRQs and register it
  *	@host: target ATA host
  *	@irq: base IRQ number to request
  *	@n_msis: number of MSIs allocated for this host
- *	@irq_handler: irq_handler used when requesting IRQs
- *	@irq_flags: irq_flags used when requesting IRQs
+ *	@sht: scsi_host_template to use when registering the host
  *
  *	Similar to ata_host_activate, but requests IRQs according to AHCI-1.1
  *	when multiple MSIs were allocated. That is one MSI per port, starting
@@ -1129,43 +1129,59 @@ int ahci_init_interrupts(struct pci_dev *pdev, struct ahci_host_priv *hpriv)
  *	RETURNS:
  *	0 on success, -errno otherwise.
  */
-int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis)
+int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis,
+		       struct scsi_host_template *sht)
 {
 	int i, rc;
-
-	/* Sharing Last Message among several ports is not supported */
-	if (n_msis < host->n_ports)
-		return -EINVAL;
+	unsigned int n_irqs;
 
 	rc = ata_host_start(host);
 	if (rc)
 		return rc;
 
-	for (i = 0; i < host->n_ports; i++) {
-		rc = devm_request_threaded_irq(host->dev,
-			irq + i, ahci_hw_interrupt, ahci_thread_fn, IRQF_SHARED,
-			dev_driver_string(host->dev), host->ports[i]);
+	n_irqs = min(host->n_ports, n_msis);
+	n_irqs = max(n_irqs, 1u);
+
+	if (n_irqs > 1) {
+		/* Sharing Last Message among several ports is not supported */
+		if (n_irqs < host->n_ports)
+			return -EINVAL;
+
+		for (i = 0; i < n_irqs; i++) {
+			rc = devm_request_threaded_irq(host->dev, irq + i,
+				ahci_multi_irqs_intr, ahci_port_thread_fn,
+				IRQF_SHARED, dev_driver_string(host->dev),
+				host->ports[i]);
+			if (rc)
+				goto out_free_irqs;
+		}
+	} else {
+		rc = devm_request_threaded_irq(host->dev, irq,
+			ahci_single_irq_intr, ahci_thread_fn, IRQF_SHARED,
+			dev_driver_string(host->dev), host);
 		if (rc)
-			goto out_free_irqs;
+			goto out;
 	}
 
-	for (i = 0; i < host->n_ports; i++)
+	for (i = 0; i < n_irqs; i++)
 		ata_port_desc(host->ports[i], "irq %d", irq + i);
 
-	rc = ata_host_register(host, &ahci_sht);
+	rc = ata_host_register(host, sht);
 	if (rc)
 		goto out_free_all_irqs;
 
 	return 0;
 
 out_free_all_irqs:
-	i = host->n_ports;
+	i = n_irqs;
 out_free_irqs:
 	for (i--; i >= 0; i--)
 		devm_free_irq(host->dev, irq + i, host->ports[i]);
+out:
 
 	return rc;
 }
+EXPORT_SYMBOL_GPL(ahci_host_activate);
 
 static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
@@ -1265,8 +1281,6 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	hpriv->mmio = pcim_iomap_table(pdev)[ahci_pci_bar];
 
 	n_msis = ahci_init_interrupts(pdev, hpriv);
-	if (n_msis > 1)
-		hpriv->flags |= AHCI_HFLAG_MULTI_MSI;
 
 	/* save initial config */
 	ahci_pci_save_initial_config(pdev, hpriv);
@@ -1364,11 +1378,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	pci_set_master(pdev);
 
-	if (hpriv->flags & AHCI_HFLAG_MULTI_MSI)
-		return ahci_host_activate(host, pdev->irq, n_msis);
-
-	return ata_host_activate(host, pdev->irq, ahci_interrupt, IRQF_SHARED,
-				 &ahci_sht);
+	return ahci_host_activate(host, pdev->irq, n_msis, &ahci_sht);
 }
 
 module_pci_driver(ahci_pci_driver);
diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index b830e6c..ed1fbc8 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -231,7 +231,6 @@ enum {
 	AHCI_HFLAG_DELAY_ENGINE		= (1 << 15), /* do not start engine on
 						        port start (wait until
 						        error-handling stage) */
-	AHCI_HFLAG_MULTI_MSI		= (1 << 16), /* multiple PCI MSIs */
 
 	/* ap->flags bits */
 
@@ -361,11 +360,14 @@ int ahci_port_resume(struct ata_port *ap);
 void ahci_set_em_messages(struct ahci_host_priv *hpriv,
 			  struct ata_port_info *pi);
 int ahci_reset_em(struct ata_host *host);
-irqreturn_t ahci_interrupt(int irq, void *dev_instance);
-irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance);
+irqreturn_t ahci_single_irq_intr(int irq, void *dev_instance);
+irqreturn_t ahci_multi_irqs_intr(int irq, void *dev_instance);
 irqreturn_t ahci_thread_fn(int irq, void *dev_instance);
+irqreturn_t ahci_port_thread_fn(int irq, void *dev_instance);
 void ahci_print_info(struct ata_host *host, const char *scc_s);
-int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis);
+int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis,
+		       struct scsi_host_template *sht);
+int ahci_init_interrupts(struct pci_dev *pdev, struct ahci_host_priv *hpriv);
 
 static inline void __iomem *__ahci_port_base(struct ata_host *host,
 					     unsigned int port_no)
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 7a8a284..b10cd87 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -211,8 +211,7 @@ static int ahci_probe(struct platform_device *pdev)
 	ahci_init_controller(host);
 	ahci_print_info(host, "platform");
 
-	rc = ata_host_activate(host, irq, ahci_interrupt, IRQF_SHARED,
-			       &ahci_platform_sht);
+	rc = ahci_host_activate(host, irq, 0, &ahci_platform_sht);
 	if (rc)
 		goto pdata_exit;
 
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index 34c8216..68b3bdd 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1655,9 +1655,9 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
 		ata_port_abort(ap);
 }
 
-static void ahci_handle_port_interrupt(struct ata_port *ap,
-				       void __iomem *port_mmio, u32 status)
+static void ahci_handle_port_interrupt(struct ata_port *ap, u32 status)
 {
+	void __iomem *port_mmio = ahci_port_base(ap);
 	struct ata_eh_info *ehi = &ap->link.eh_info;
 	struct ahci_port_priv *pp = ap->private_data;
 	struct ahci_host_priv *hpriv = ap->host->private_data;
@@ -1740,22 +1740,10 @@ static void ahci_handle_port_interrupt(struct ata_port *ap,
 	}
 }
 
-void ahci_port_intr(struct ata_port *ap)
-{
-	void __iomem *port_mmio = ahci_port_base(ap);
-	u32 status;
-
-	status = readl(port_mmio + PORT_IRQ_STAT);
-	writel(status, port_mmio + PORT_IRQ_STAT);
-
-	ahci_handle_port_interrupt(ap, port_mmio, status);
-}
-
-irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
+irqreturn_t ahci_port_thread_fn(int irq, void *dev_instance)
 {
 	struct ata_port *ap = dev_instance;
 	struct ahci_port_priv *pp = ap->private_data;
-	void __iomem *port_mmio = ahci_port_base(ap);
 	unsigned long flags;
 	u32 status;
 
@@ -1766,14 +1754,43 @@ irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
 	spin_unlock_irqrestore(&ap->host->lock, flags);
 
 	spin_lock_bh(ap->lock);
-	ahci_handle_port_interrupt(ap, port_mmio, status);
+	ahci_handle_port_interrupt(ap, status);
 	spin_unlock_bh(ap->lock);
 
 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_GPL(ahci_port_thread_fn);
+
+irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
+{
+	struct ata_host *host = dev_instance;
+	struct ahci_host_priv *hpriv = host->private_data;
+	u32 irq_masked = hpriv->port_map;
+	unsigned int i;
+
+	for (i = 0; i < host->n_ports; i++) {
+		struct ata_port *ap;
+
+		if (!(irq_masked & (1 << i)))
+			continue;
+
+		ap = host->ports[i];
+		if (ap) {
+			ahci_port_thread_fn(irq, ap);
+			VPRINTK("port %u\n", i);
+		} else {
+			VPRINTK("port %u (no irq)\n", i);
+			if (ata_ratelimit())
+				dev_warn(host->dev,
+					 "interrupt on disabled port %u\n", i);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
 EXPORT_SYMBOL_GPL(ahci_thread_fn);
 
-void ahci_hw_port_interrupt(struct ata_port *ap)
+void ahci_update_intr_status(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	struct ahci_port_priv *pp = ap->private_data;
@@ -1785,7 +1802,7 @@ void ahci_hw_port_interrupt(struct ata_port *ap)
 	pp->intr_status |= status;
 }
 
-irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
+irqreturn_t ahci_multi_irqs_intr(int irq, void *dev_instance)
 {
 	struct ata_port *ap_this = dev_instance;
 	struct ahci_port_priv *pp = ap_this->private_data;
@@ -1821,7 +1838,7 @@ irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
 
 		ap = host->ports[i];
 		if (ap) {
-			ahci_hw_port_interrupt(ap);
+			ahci_update_intr_status(ap);
 			VPRINTK("port %u\n", i);
 		} else {
 			VPRINTK("port %u (no irq)\n", i);
@@ -1839,9 +1856,9 @@ irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
 
 	return IRQ_WAKE_THREAD;
 }
-EXPORT_SYMBOL_GPL(ahci_hw_interrupt);
+EXPORT_SYMBOL_GPL(ahci_multi_irqs_intr);
 
-irqreturn_t ahci_interrupt(int irq, void *dev_instance)
+irqreturn_t ahci_single_irq_intr(int irq, void *dev_instance)
 {
 	struct ata_host *host = dev_instance;
 	struct ahci_host_priv *hpriv;
@@ -1871,7 +1888,7 @@ irqreturn_t ahci_interrupt(int irq, void *dev_instance)
 
 		ap = host->ports[i];
 		if (ap) {
-			ahci_port_intr(ap);
+			ahci_update_intr_status(ap);
 			VPRINTK("port %u\n", i);
 		} else {
 			VPRINTK("port %u (no irq)\n", i);
@@ -1898,9 +1915,9 @@ irqreturn_t ahci_interrupt(int irq, void *dev_instance)
 
 	VPRINTK("EXIT\n");
 
-	return IRQ_RETVAL(handled);
+	return handled ? IRQ_WAKE_THREAD : IRQ_NONE;
 }
-EXPORT_SYMBOL_GPL(ahci_interrupt);
+EXPORT_SYMBOL_GPL(ahci_single_irq_intr);
 
 static unsigned int ahci_qc_issue(struct ata_queued_cmd *qc)
 {
@@ -2294,13 +2311,8 @@ static int ahci_port_start(struct ata_port *ap)
 	 */
 	pp->intr_mask = DEF_PORT_IRQ;
 
-	/*
-	 * Switch to per-port locking in case each port has its own MSI vector.
-	 */
-	if ((hpriv->flags & AHCI_HFLAG_MULTI_MSI)) {
-		spin_lock_init(&pp->lock);
-		ap->lock = &pp->lock;
-	}
+	spin_lock_init(&pp->lock);
+	ap->lock = &pp->lock;
 
 	ap->private_data = pp;
 
-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-05-21 19:00 [PATCH RESEND 0/1] AHCI: Optimize interrupt processing Alexander Gordeev
  2013-05-21 19:00 ` [PATCH RESEND 1/1] " Alexander Gordeev
@ 2013-05-21 23:50 ` Tejun Heo
  2013-05-22 14:39   ` Alexander Gordeev
  2014-09-11 12:42   ` Alexander Gordeev
  1 sibling, 2 replies; 75+ messages in thread
From: Tejun Heo @ 2013-05-21 23:50 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, Nicholas A. Bellinger

Hello, Alexander.

(cc'ing Jens and Nicholas, hey guys)

On Tue, May 21, 2013 at 09:00:27PM +0200, Alexander Gordeev wrote:
> Before this update host lock average holdtime was 3.266532061 and
> average waittime was 0.009832679 [1]. After the update average
> holdtime (slightly) rose up to 0.335267418 while average waittime
> decreased to 0.000320469 [2]. This means the host lock, with local
> interrupts disabled, is held for roughly the same time while the
> average waittime dropped about 30 times.
> 
> After this update port events are handled with local interrupts
> enabled and compete on individual per-port locks with average
> holdtime 1.540987475 and average waittime 0.000714864 [3].
> Compared to [1], ata_scsi_queuecmd() holds port locks about 2 times
> less and waits for locks about 13 times less.

Hmmmmmm..... I'd normally apply this patch but block layer is just
growing multi-queue support and libata is likely to be converted to mq
in the foreseeable future, so I'm a bit hesitant to make irq handling more
sophisticated right now.  Would you be interested in looking into
converting libata to blk mq support?  I'm pretty sure it'd yield far
better outcome if done properly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
@ 2013-05-22 14:39   ` Alexander Gordeev
  2013-05-22 17:03     ` Jens Axboe
  2014-09-11 12:42   ` Alexander Gordeev
  1 sibling, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-05-22 14:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, Nicholas A. Bellinger

On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> Hmmmmmm..... I'd normally apply this patch but block layer is just
> growing multi-queue support and libata is likely to be converted to mq
> in the foreseeable future, so I'm a bit hesitant to make irq handling more
> sophisticated right now.  Would you be interested in looking into
> converting libata to blk mq support?  I'm pretty sure it'd yield far
> better outcome if done properly.

I am not committing, but will look into it, sure.

> Thanks.
> 
> -- 
> tejun

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-05-22 14:39   ` Alexander Gordeev
@ 2013-05-22 17:03     ` Jens Axboe
  2013-07-11 10:26         ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Jens Axboe @ 2013-05-22 17:03 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Nicholas A. Bellinger

On Wed, May 22 2013, Alexander Gordeev wrote:
> On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > Hmmmmmm..... I'd normally apply this patch but block layer is just
> > growing multi-queue support and libata is likely to be converted to mq
> > in the foreseeable future, so I'm a bit hesitant to make irq handling more
> > sophisticated right now.  Would you be interested in looking into
> > converting libata to blk mq support?  I'm pretty sure it'd yield far
> > better outcome if done properly.
> 
> I am not committing, but will look into it, sure.

Would be most awesome, I'm sure Nic would not mind a bit of help on the
SCSI/libata side :-)

And personally, can't wait to run it on the laptop! That's right, I
alpha test on the laptop.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-05-22 17:03     ` Jens Axboe
@ 2013-07-11 10:26         ` Alexander Gordeev
  0 siblings, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-11 10:26 UTC (permalink / raw)
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik,
	Nicholas A. Bellinger, Jens Axboe

On Wed, May 22, 2013 at 07:03:05PM +0200, Jens Axboe wrote:
> On Wed, May 22 2013, Alexander Gordeev wrote:
> > On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > > Hmmmmmm..... I'd normally apply this patch but block layer is just
> > > growing multi-queue support and libata is likely to be converted to mq
> > > in the foreseeable future, so I'm a bit hesitant to make irq handling more
> > > sophisticated right now.  Would you be interested in looking into
> > > converting libata to blk mq support?  I'm pretty sure it'd yield far
> > > better outcome if done properly.
> > 
> > I am not committing, but will look into it, sure.
> 
> Would be most awesome, I'm sure Nic would not mind a bit of help on the
> SCSI/libata side :-)

Hi Nicholas,

Could you please clarify the status of SCSI MQ support? Is it usable now?

I tried git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git,
but it does not appear to work without (at least) the changes below to SCSI lib:

Thanks!

diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
index ca6ff67..d8cc7a4 100644
--- a/drivers/scsi/scsi-mq.c
+++ b/drivers/scsi/scsi-mq.c
@@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
 static struct blk_mq_ops scsi_mq_ops = {
 	.queue_rq	= scsi_mq_queue_rq,
 	.map_queue	= blk_mq_map_queue,
+	.timeout	= scsi_times_out,
 	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
 	.free_hctx	= blk_mq_free_single_hw_queue,
 };
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 65360db..33aa373 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -283,7 +283,10 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
 	/*
 	 * head injection *required* here otherwise quiesce won't work
 	 */
-	blk_execute_rq(req->q, NULL, req, 1);
+	if (q->mq_ops)
+		blk_mq_execute_rq(req->q, req);
+	else
+		blk_execute_rq(req->q, NULL, req, 1);
 
 	/*
 	 * Some devices (USB mass-storage in particular) may transfer
@@ -298,12 +301,8 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
 		*resid = req->resid_len;
 	ret = req->errors;
  out:
-	if (q->mq_ops) {
-		printk("scsi_execute(): Calling blk_mq_free_request >>>\n");
-		blk_mq_free_request(req);
-	} else {
+	if (!q->mq_ops)
 		blk_put_request(req);
-	}
 
 	return ret;
 }


> And personally, can't wait to run it on the laptop! That's right, I
> alpha test on the laptop.
> 
> -- 
> Jens Axboe
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-11 10:26         ` Alexander Gordeev
  (?)
@ 2013-07-11 23:00         ` Nicholas A. Bellinger
  2013-07-12  7:46           ` Alexander Gordeev
  -1 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-11 23:00 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe

On Thu, 2013-07-11 at 12:26 +0200, Alexander Gordeev wrote:
> On Wed, May 22, 2013 at 07:03:05PM +0200, Jens Axboe wrote:
> > On Wed, May 22 2013, Alexander Gordeev wrote:
> > > On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > > > Hmmmmmm..... I'd normally apply this patch but block layer is just
> > > > growing multi-queue support and libata is likely to be converted to mq
> > > > in the foreseeable future, so I'm a bit hesitant to make irq handling more
> > > > sophisticated right now.  Would you be interested in looking into
> > > > converting libata to blk mq support?  I'm pretty sure it'd yield far
> > > > better outcome if done properly.
> > > 
> > > I am not committing, but will look into it, sure.
> > 
> > Would be most awesome, I'm sure Nic would not mind a bit of help on the
> > SCSI/libata side :-)
> 
> Hi Nicholas,
> 
> Could you please clarify the status of SCSI MQ support? Is it usable now?
> 

Hi Alexander,

Thanks for taking a look.  I've not made further progress in the last
weeks on scsi-mq, but am still using virtio-scsi + scsi-mq <->
vhost-scsi + per-cpu-ida for quite a bit for benchmarking purposes.

> I tried git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git,
> but it does not appear to work without (at least) the changes below to SCSI lib:
> 

The only scsi-mq LLD conversions so far have been to virtio-scsi +
scsi_debug to nop REQ_TYPE_FS.

Just so I understand, your patch below is required in order to make what
LLD function with scsi-mq..?

Thanks!

--nab

> Thanks!
> 
> diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> index ca6ff67..d8cc7a4 100644
> --- a/drivers/scsi/scsi-mq.c
> +++ b/drivers/scsi/scsi-mq.c
> @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
>  static struct blk_mq_ops scsi_mq_ops = {
>  	.queue_rq	= scsi_mq_queue_rq,
>  	.map_queue	= blk_mq_map_queue,
> +	.timeout	= scsi_times_out,
>  	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
>  	.free_hctx	= blk_mq_free_single_hw_queue,
>  };
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 65360db..33aa373 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -283,7 +283,10 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
>  	/*
>  	 * head injection *required* here otherwise quiesce won't work
>  	 */
> -	blk_execute_rq(req->q, NULL, req, 1);
> +	if (q->mq_ops)
> +		blk_mq_execute_rq(req->q, req);
> +	else
> +		blk_execute_rq(req->q, NULL, req, 1);
>  
>  	/*
>  	 * Some devices (USB mass-storage in particular) may transfer
> @@ -298,12 +301,8 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
>  		*resid = req->resid_len;
>  	ret = req->errors;
>   out:
> -	if (q->mq_ops) {
> -		printk("scsi_execute(): Calling blk_mq_free_request >>>\n");
> -		blk_mq_free_request(req);
> -	} else {
> +	if (!q->mq_ops)
>  		blk_put_request(req);
> -	}
>  
>  	return ret;
>  }
> 
> 
> > And personally, can't wait to run it on the laptop! That's right, I
> > alpha test on the laptop.
> > 
> > -- 
> > Jens Axboe
> > 
> 



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-11 23:00         ` Nicholas A. Bellinger
@ 2013-07-12  7:46           ` Alexander Gordeev
  2013-07-13  5:20             ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-12  7:46 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe

On Thu, Jul 11, 2013 at 04:00:37PM -0700, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-11 at 12:26 +0200, Alexander Gordeev wrote:
> > On Wed, May 22, 2013 at 07:03:05PM +0200, Jens Axboe wrote:
> > > On Wed, May 22 2013, Alexander Gordeev wrote:
> > > > On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > > > > Hmmmmmm..... I'd normally apply this patch but block layer is just
> > > > > growing multi-queue support and libata is likely to be converted to mq
> > > > > in the foreseeable future, so I'm a bit hesitant to make irq handling more
> > > > > sophisticated right now.  Would you be interested in looking into
> > > > > converting libata to blk mq support?  I'm pretty sure it'd yield far
> > > > > better outcome if done properly.
> > > > 
> > > > I am not committing, but will look into it, sure.
> > > 
> > > Would be most awesome, I'm sure Nic would not mind a bit of help on the
> > > SCSI/libata side :-)
> > 
> > Hi Nicholas,
> > 
> > Could you please clarify the status of SCSI MQ support? Is it usable now?
> > 
> 
> Hi Alexander,
> 
> Thanks for taking a look.  I've not made further progress in the last
> weeks on scsi-mq, but am still using virtio-scsi + scsi-mq <->
> vhost-scsi + per-cpu-ida for quite a bit for benchmarking purposes.

Thanks for the clarification, Nicholas.

> > I tried git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git,
> > but it does not appear to work without (at least) the changes below to SCSI lib:
> > 
> 
> The only scsi-mq LLD conversions so far have been to virtio-scsi +
> scsi_debug to nop REQ_TYPE_FS.

I see. Do you think the changes I made are the right way to go?

I had to make them to avoid a NULL-pointer assignment and to make BIO
bounce buffers work, but I do not really understand the mixture of old
and new code (neither, in fact :) )

> Just so I understand, your patch below is required in order to make what
> LLD function with scsi-mq..?

ata_piix

> 
> Thanks!
> 
> --nab
> 
> > Thanks!
> > 
> > diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> > index ca6ff67..d8cc7a4 100644
> > --- a/drivers/scsi/scsi-mq.c
> > +++ b/drivers/scsi/scsi-mq.c
> > @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
> >  static struct blk_mq_ops scsi_mq_ops = {
> >  	.queue_rq	= scsi_mq_queue_rq,
> >  	.map_queue	= blk_mq_map_queue,
> > +	.timeout	= scsi_times_out,
> >  	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
> >  	.free_hctx	= blk_mq_free_single_hw_queue,
> >  };
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 65360db..33aa373 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -283,7 +283,10 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
> >  	/*
> >  	 * head injection *required* here otherwise quiesce won't work
> >  	 */
> > -	blk_execute_rq(req->q, NULL, req, 1);
> > +	if (q->mq_ops)
> > +		blk_mq_execute_rq(req->q, req);
> > +	else
> > +		blk_execute_rq(req->q, NULL, req, 1);
> >  
> >  	/*
> >  	 * Some devices (USB mass-storage in particular) may transfer
> > @@ -298,12 +301,8 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
> >  		*resid = req->resid_len;
> >  	ret = req->errors;
> >   out:
> > -	if (q->mq_ops) {
> > -		printk("scsi_execute(): Calling blk_mq_free_request >>>\n");
> > -		blk_mq_free_request(req);
> > -	} else {
> > +	if (!q->mq_ops)
> >  		blk_put_request(req);
> > -	}
> >  
> >  	return ret;
> >  }
> > 
> > 
> > > And personally, can't wait to run it on the laptop! That's right, I
> > > alpha test on the laptop.
> > > 
> > > -- 
> > > Jens Axboe
> > > 
> > 
> 
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-12  7:46           ` Alexander Gordeev
@ 2013-07-13  5:20             ` Nicholas A. Bellinger
  2013-07-16 18:32               ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-13  5:20 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe

On Fri, 2013-07-12 at 09:46 +0200, Alexander Gordeev wrote:
> On Thu, Jul 11, 2013 at 04:00:37PM -0700, Nicholas A. Bellinger wrote:
> > On Thu, 2013-07-11 at 12:26 +0200, Alexander Gordeev wrote:
> > > On Wed, May 22, 2013 at 07:03:05PM +0200, Jens Axboe wrote:
> > > > On Wed, May 22 2013, Alexander Gordeev wrote:
> > > > > On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > > > > > Hmmmmmm..... I'd normally apply this patch but block layer is just
> > > > > > growing multi-queue support and libata is likely to be converted to mq
> > > > > > in foreseeable future, so I'm a bit hesitant to make irq handling more
> > > > > > > sophisticated right now.  Would you be interested in looking into
> > > > > > converting libata to blk mq support?  I'm pretty sure it'd yield far
> > > > > > better outcome if done properly.
> > > > > 
> > > > > I am not committing, but will look into it, sure.
> > > > 
> > > > Would be most awesome, I'm sure Nic would not mind a bit of help on the
> > > > SCSI/libata side :-)
> > > 
> > > Hi Nicholas,
> > > 
> > > Could you please clarify the status of SCSI MQ support? Is it usable now?
> > > 
> > 
> > Hi Alexander,
> > 
> > Thanks for taking a look.  I've not made further progress in the last
> > weeks on scsi-mq, but am still using virtio-scsi + scsi-mq <->
> > vhost-scsi + per-cpu-ida for quite a bit for benchmarking purposes.
> 
> Thanks for the clarification, Nicholas.
> 
> > > I tried git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git,
> > > but it does not appear working without (at least) changes below to SCSI lib:
> > > 
> > 
> > The only scsi-mq LLD conversions so far have been to virtio-scsi +
> > scsi_debug to nop REQ_TYPE_FS.
> 
> I see. Do you think the changes I made are the right way to go?
> 
> I had to make them to avoid a NULL-pointer assignment and to make BIO bounce
> buffers work, but I do not really understand the mixture of old and
> new code (or either of them, in fact :) )
> 

Hi Alexander,

Comments below.

> > Just so I understand, your patch below is required in order to make what
> > LLD function with scsi-mq..?
> 
> ata_piix
> 

<nod>

> > 
> > Thanks!
> > 
> > --nab
> > 
> > > Thanks!
> > > 
> > > diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> > > index ca6ff67..d8cc7a4 100644
> > > --- a/drivers/scsi/scsi-mq.c
> > > +++ b/drivers/scsi/scsi-mq.c
> > > @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
> > >  static struct blk_mq_ops scsi_mq_ops = {
> > >  	.queue_rq	= scsi_mq_queue_rq,
> > >  	.map_queue	= blk_mq_map_queue,
> > > +	.timeout	= scsi_times_out,
> > >  	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
> > >  	.free_hctx	= blk_mq_free_single_hw_queue,
> > >  };

So you're actually triggering a blk-mq timeout with ata_piix..?

> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index 65360db..33aa373 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -283,7 +283,10 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
> > >  	/*
> > >  	 * head injection *required* here otherwise quiesce won't work
> > >  	 */
> > > -	blk_execute_rq(req->q, NULL, req, 1);
> > > +	if (q->mq_ops)
> > > +		blk_mq_execute_rq(req->q, req);
> > > +	else
> > > +		blk_execute_rq(req->q, NULL, req, 1);
> > >  

The scsi_execute() -> REQ_TYPE_BLOCK_RQ special case (scsi_scan +
scsi_ioctl) has a small issue preventing its conversion to use
blk_mq_execute_rq().

Namely that scsi_execute_cmd() currently expects to be able to check for
the existence of req->errors + req->resid_len *after*
blk_mq_execute_rq() -> blk_mq_finish_request() -> blk_mq_free_request()
has been called to mark the pre-allocated struct request as free for
blk-mq re-use.

That is why scsi-mq still uses blk_execute_rq(), and this will need to
be addressed in order to safely use blk_mq_execute_rq() in the above
context.
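
To make the ordering problem concrete, here is a minimal user-space
sketch, not kernel code.  All sketch_* names are hypothetical and only
model the two fields discussed above; the assumption is simply that an
"execute" call which frees the request on completion has to hand the
status back to the caller before the request is released for re-use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request: only the fields of interest. */
struct sketch_request {
	int errors;
	unsigned int resid_len;
	bool in_use;
};

/* Completion status copied out before the request is recycled. */
struct sketch_result {
	int errors;
	unsigned int resid_len;
};

/* A single pre-allocated request, standing in for the blk-mq request pool. */
static struct sketch_request pool_slot;

static struct sketch_request *sketch_alloc(void)
{
	pool_slot.errors = 0;
	pool_slot.resid_len = 0;
	pool_slot.in_use = true;
	return &pool_slot;
}

/* Releasing the request poisons it, so stale reads are easy to spot. */
static void sketch_free(struct sketch_request *req)
{
	req->errors = -1;
	req->resid_len = ~0u;
	req->in_use = false;
}

/*
 * Models an execute path that frees the request on completion.
 * The status is copied into *out first; reading req->errors or
 * req->resid_len after this returns would see recycled state.
 */
static void sketch_execute(struct sketch_request *req, unsigned int resid,
			   struct sketch_result *out)
{
	assert(req->in_use);

	/* Pretend the device completed the command successfully. */
	req->errors = 0;
	req->resid_len = resid;

	if (out) {
		out->errors = req->errors;
		out->resid_len = req->resid_len;
	}

	sketch_free(req);
}
```

A caller in this model would read out->errors and out->resid_len rather
than touching the request after sketch_execute() returns.  Whether the
real fix copies the status out like this or delays the free until the
caller is done is a separate design question.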

> > >  	/*
> > >  	 * Some devices (USB mass-storage in particular) may transfer
> > > @@ -298,12 +301,8 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
> > >  		*resid = req->resid_len;
> > >  	ret = req->errors;
> > >   out:
> > > -	if (q->mq_ops) {
> > > -		printk("scsi_execute(): Calling blk_mq_free_request >>>\n");
> > > -		blk_mq_free_request(req);
> > > -	} else {
> > > +	if (!q->mq_ops)
> > >  		blk_put_request(req);
> > > -	}
> > >  

Do you have an OOPs backtrace handy to post w/o the last two changes in
place..?

Also, just a heads up that so far I've been using IDE (/dev/hdX) for the
root-device in order to make scsi-mq debugging safer and easier.

I *very* much recommend doing the same if at all possible for ata_piix
scsi-mq development + testing, as you'll want to be very careful when
using a real file-system on top of this early alpha code.

--nab



* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-13  5:20             ` Nicholas A. Bellinger
@ 2013-07-16 18:32               ` Alexander Gordeev
  2013-07-16 21:38                 ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-16 18:32 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 1547 bytes --]

On Fri, Jul 12, 2013 at 10:20:12PM -0700, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-12 at 09:46 +0200, Alexander Gordeev wrote:
> > > > diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> > > > index ca6ff67..d8cc7a4 100644
> > > > --- a/drivers/scsi/scsi-mq.c
> > > > +++ b/drivers/scsi/scsi-mq.c
> > > > @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
> > > >  static struct blk_mq_ops scsi_mq_ops = {
> > > >  	.queue_rq	= scsi_mq_queue_rq,
> > > >  	.map_queue	= blk_mq_map_queue,
> > > > +	.timeout	= scsi_times_out,
> > > >  	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
> > > >  	.free_hctx	= blk_mq_free_single_hw_queue,
> > > >  };
> 
> So you're actually triggering a blk-mq timeout with ata_piix..?

No.
That is to avoid a NULL-pointer assignment from ->timeout elsewhere.
In fact I return -ENODEV for sr_probe() to not hit it.

> That is why scsi-mq still uses blk_execute_rq(), and this will need to
> be addressed in order to safely use blk_mq_execute_rq() in the above
> context.

Got it.

> Do you have an OOPs backtrace handy to post w/o the last two changes in
> place..?

Attaching the output. No oops actually (due to aforementioned .timeout).

> I *very* much recommend doing the same if at all possible for ata_piix
> scsi-mq development + testing, as you'll want to be very careful when
> using a real file-system on top of this early alpha code.

Thank you for the warning.
Just getting as far as writing CDBs would already be a success :)

> --nab
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

[-- Attachment #2: output_for_nab.txt --]
[-- Type: text/plain, Size: 129996 bytes --]

Loading Fedora (3.10.0-rc5.nab+)
Loading initial ramdisk ...
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.10.0-rc5.nab+ (agordeev@dhcp47-148.lab.bos.redhat.com) (gcc version 4.7.2 20120921 (Red Hat 4.7.2-2) (GCC) ) #11 SMP Tue Jul 16 14:06:45 EDT 2013
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.10.0-rc5.nab+ root=/dev/mapper/vg_dhcp47--148-lv_root ro rd.md=0 rd.dm=0 KEYTABLE=us rd.lvm.lv=vg_dhcp47-148/lv_root console=ttyS0,115200 SYSFONT=True rd.luks=0 rd.lvm.lv=vg_dhcp47-148/lv_swap LANG=en_US.UTF-8 debug
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009bfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009c000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000008bd5cfff] usable
[    0.000000] BIOS-e820: [mem 0x000000008bd5d000-0x000000008be4efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000008be4f000-0x000000008c26dfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000008c26e000-0x000000008d66dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000008d66e000-0x000000008f601fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000008f602000-0x000000008f64efff] reserved
[    0.000000] BIOS-e820: [mem 0x000000008f64f000-0x000000008f6e1fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000008f6e2000-0x000000008f6ecfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000008f6ed000-0x000000008f6effff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000008f6f0000-0x000000008f7cefff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000008f7cf000-0x000000008f7fffff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000008f800000-0x000000008fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000a0000000-0x00000000afffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000fcffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000046fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.5 present.
[    0.000000] DMI: Cisco Systems Inc R210-2121605W/R210-2121605W, BIOS C200.1.4.3j.0.020720132258 02/07/2013
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] No AGP bridge found
[    0.000000] e820: last_pfn = 0x470000 max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-DFFFF uncachable
[    0.000000]   E0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 0000000000 mask FF80000000 write-back
[    0.000000]   1 base 0080000000 mask FFF0000000 write-back
[    0.000000]   2 base 0100000000 mask FF00000000 write-back
[    0.000000]   3 base 0200000000 mask FE00000000 write-back
[    0.000000]   4 base 0400000000 mask FFC0000000 write-back
[    0.000000]   5 base 0440000000 mask FFE0000000 write-back
[    0.000000]   6 base 0460000000 mask FFF0000000 write-back
[    0.000000]   7 base 00B0000000 mask FFFF000000 write-combining
[    0.000000]   8 disabled
[    0.000000]   9 disabled
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] e820: last_pfn = 0x8bd5d max_arch_pfn = 0x400000000
[    0.000000] found SMP MP-table at [mem 0x000fc640-0x000fc64f] mapped at [ffff8800000fc640]
[    0.000000] Base memory trampoline at [ffff880000095000] 95000 size 24576
[    0.000000] Using GB pages for direct mapping
[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
[    0.000000] BRK [0x01fba000, 0x01fbafff] PGTABLE
[    0.000000] BRK [0x01fbb000, 0x01fbbfff] PGTABLE
[    0.000000] BRK [0x01fbc000, 0x01fbcfff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x46fe00000-0x46fffffff]
[    0.000000]  [mem 0x46fe00000-0x46fffffff] page 2M
[    0.000000] BRK [0x01fbd000, 0x01fbdfff] PGTABLE
[    0.000000] init_memory_mapping: [mem 0x46c000000-0x46fdfffff]
[    0.000000]  [mem 0x46c000000-0x46fdfffff] page 2M
[    0.000000] init_memory_mapping: [mem 0x400000000-0x46bffffff]
[    0.000000]  [mem 0x400000000-0x43fffffff] page 1G
[    0.000000]  [mem 0x440000000-0x46bffffff] page 2M
[    0.000000] init_memory_mapping: [mem 0x00100000-0x8bd5cfff]
[    0.000000]  [mem 0x00100000-0x001fffff] page 4k
[    0.000000]  [mem 0x00200000-0x3fffffff] page 2M
[    0.000000]  [mem 0x40000000-0x7fffffff] page 1G
[    0.000000]  [mem 0x80000000-0x8bbfffff] page 2M
[    0.000000]  [mem 0x8bc00000-0x8bd5cfff] page 4k
[    0.000000] init_memory_mapping: [mem 0x100000000-0x3ffffffff]
[    0.000000]  [mem 0x100000000-0x3ffffffff] page 1G
[    0.000000] RAMDISK: [mem 0x363ca000-0x371dcfff]
[    0.000000] ACPI: RSDP 00000000000f0410 00024 (v02 Cisco0)
[    0.000000] ACPI: XSDT 000000008f7fe120 000A4 (v01 Cisco0 CiscoUCS 00000000      01000013)
[    0.000000] ACPI: FACP 000000008f7fc000 000F4 (v04 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
[    0.000000] ACPI: DSDT 000000008f7f6000 05EC4 (v02 Cisco0 CiscoUCS 00000003 MSFT 0100000D)
[    0.000000] ACPI: FACS 000000008f6f0000 00040
[    0.000000] ACPI: APIC 000000008f7f5000 001A8 (v02 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
[    0.000000] ACPI: MCFG 000000008f7f4000 0003C (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
[    0.000000] ACPI: HPET 000000008f7f3000 00038 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
[    0.000000] ACPI: SLIT 000000008f7f2000 00030 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
[    0.000000] ACPI: SRAT 000000008f7f1000 00430 (v02 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
[    0.000000] ACPI: SPCR 000000008f7f0000 00050 (v01 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
[    0.000000] ACPI: WDDT 000000008f7ef000 00040 (v01 Cisco0 CiscoUCS 00000000 MSFT 0100000D)
[    0.000000] ACPI: SSDT 000000008f7d4000 1AFC4 (v02  Cisco SSDT  PM 00004000 INTL 20090730)
[    0.000000] ACPI: SSDT 000000008f7d3000 001D8 (v02  Cisco IPMI     00004000 INTL 20090730)
[    0.000000] ACPI: SSDT 000000008f7d2000 00962 (v02 CISCO  PMETER   00004000 INTL 20090730)
[    0.000000] ACPI: HEST 000000008f7d0000 000A8 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
[    0.000000] ACPI: BERT 000000008f7cf000 00030 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
[    0.000000] ACPI: ERST 000000008f6ef000 00230 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
[    0.000000] ACPI: EINJ 000000008f6ee000 00130 (v01 Cisco  CiscoTbl 00000001 CISC 00000001)
[    0.000000] ACPI: DMAR 000000008f6ed000 001A8 (v01 Cisco0 CiscoUCS 00000001 MSFT 0100000D)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x20 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x22 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x12 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x32 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x14 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x34 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x21 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x23 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x13 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x33 -> Node 1
[    0.000000] SRAT: PXM 0 -> APIC 0x15 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 0x35 -> Node 1
[    0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0x8fffffff]
[    0.000000] SRAT: Node 0 PXM 0 [mem 0x100000000-0x26fffffff]
[    0.000000] SRAT: Node 1 PXM 1 [mem 0x270000000-0x46fffffff]
[    0.000000] NUMA: Initialized distance table, cnt=2
[    0.000000] NUMA: Node 0 [mem 0x00000000-0x8fffffff] + [mem 0x100000000-0x26fffffff] -> [mem 0x00000000-0x26fffffff]
[    0.000000] Initmem setup node 0 [mem 0x00000000-0x26fffffff]
[    0.000000]   NODE_DATA [mem 0x26ffec000-0x26fffffff]
[    0.000000] Initmem setup node 1 [mem 0x270000000-0x46fffffff]
[    0.000000]   NODE_DATA [mem 0x46ffe9000-0x46fffcfff]
[    0.000000]  [ffffea0000000000-ffffea0009bfffff] PMD -> [ffff880267e00000-ffff88026fdfffff] on node 0
[    0.000000]  [ffffea0009c00000-ffffea0011bfffff] PMD -> [ffff880467600000-ffff88046f5fffff] on node 1
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   [mem 0x100000000-0x46fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00001000-0x0009bfff]
[    0.000000]   node   0: [mem 0x00100000-0x8bd5cfff]
[    0.000000]   node   0: [mem 0x100000000-0x26fffffff]
[    0.000000]   node   1: [mem 0x270000000-0x46fffffff]
[    0.000000] On node 0 totalpages: 2079992
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 22 pages reserved
[    0.000000]   DMA zone: 3995 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 8886 pages used for memmap
[    0.000000]   DMA32 zone: 568669 pages, LIFO batch:31
[    0.000000]   Normal zone: 23552 pages used for memmap
[    0.000000]   Normal zone: 1507328 pages, LIFO batch:31
[    0.000000] On node 1 totalpages: 2097152
[    0.000000]   Normal zone: 32768 pages used for memmap
[    0.000000]   Normal zone: 2097152 pages, LIFO batch:31
[    0.000000] ACPI: PM-Timer IO Port: 0x408
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x20] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x22] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x12] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x32] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x14] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x34] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x01] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x21] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x03] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x23] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x13] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x33] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x15] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x35] enabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x10] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x11] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x12] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x13] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x14] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x15] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x16] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC (acpi_id[0x17] lapic_id[0xff] disabled)
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x03] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x04] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x05] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x06] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x07] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x08] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x09] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0a] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0b] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0c] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0d] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0e] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x0f] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x10] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x11] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x12] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x13] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x14] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x15] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x16] high level lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x17] high level lint[0x1])
[    0.000000] ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x09] address[0xfec90000] gsi_base[24])
[    0.000000] IOAPIC[1]: apic_id 9, version 32, address 0xfec90000, GSI 24-47
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ2 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a401 base: 0xfed00000
[    0.000000] smpboot: Allowing 24 CPUs, 8 hotplug CPUs
[    0.000000] nr_irqs_gsi: 64
[    0.000000] PM: Registered nosave memory: 000000000009c000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
[    0.000000] PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
[    0.000000] PM: Registered nosave memory: 000000008bd5d000 - 000000008be4f000
[    0.000000] PM: Registered nosave memory: 000000008be4f000 - 000000008c26e000
[    0.000000] PM: Registered nosave memory: 000000008c26e000 - 000000008d66e000
[    0.000000] PM: Registered nosave memory: 000000008d66e000 - 000000008f602000
[    0.000000] PM: Registered nosave memory: 000000008f602000 - 000000008f64f000
[    0.000000] PM: Registered nosave memory: 000000008f64f000 - 000000008f6e2000
[    0.000000] PM: Registered nosave memory: 000000008f6e2000 - 000000008f6ed000
[    0.000000] PM: Registered nosave memory: 000000008f6ed000 - 000000008f6f0000
[    0.000000] PM: Registered nosave memory: 000000008f6f0000 - 000000008f7cf000
[    0.000000] PM: Registered nosave memory: 000000008f7cf000 - 000000008f800000
[    0.000000] PM: Registered nosave memory: 000000008f800000 - 0000000090000000
[    0.000000] PM: Registered nosave memory: 0000000090000000 - 00000000a0000000
[    0.000000] PM: Registered nosave memory: 00000000a0000000 - 00000000b0000000
[    0.000000] PM: Registered nosave memory: 00000000b0000000 - 00000000fc000000
[    0.000000] PM: Registered nosave memory: 00000000fc000000 - 00000000fd000000
[    0.000000] PM: Registered nosave memory: 00000000fd000000 - 00000000fed1c000
[    0.000000] PM: Registered nosave memory: 00000000fed1c000 - 00000000fed20000
[    0.000000] PM: Registered nosave memory: 00000000fed20000 - 00000000ff800000
[    0.000000] PM: Registered nosave memory: 00000000ff800000 - 0000000100000000
[    0.000000] e820: [mem 0xb0000000-0xfbffffff] available for PCI devices
[    0.000000] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:24 nr_node_ids:2
[    0.000000] PERCPU: Embedded 27 pages/cpu @ffff880267c00000 s81216 r8192 d21184 u131072
[    0.000000] pcpu-alloc: s81216 r8192 d21184 u131072 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 00 02 04 06 08 10 12 14 16 18 20 22 -- -- -- -- 
[    0.000000] pcpu-alloc: [1] 01 03 05 07 09 11 13 15 17 19 21 23 -- -- -- -- 
[    0.000000] Built 2 zonelists in Zone order, mobility grouping on.  Total pages: 4111852
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-rc5.nab+ root=/dev/mapper/vg_dhcp47--148-lv_root ro rd.md=0 rd.dm=0 KEYTABLE=us rd.lvm.lv=vg_dhcp47-148/lv_root console=ttyS0,115200 SYSFONT=True rd.luks=0 rd.lvm.lv=vg_dhcp47-148/lv_swap LANG=en_US.UTF-8 debug
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Memory: 16346740k/18612224k available (6396k kernel code, 1903648k absent, 361836k reserved, 6834k data, 1312k init)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=24, Nodes=2
[    0.000000] Hierarchical RCU implementation.
[    0.000000] 	RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=24.
[    0.000000] NR_IRQS:8448 nr_irqs:1280 16
[    0.000000] Console: colour VGA+ 80x25
[    0.000000] console [ttyS0] enabled
[    0.000000] allocated 67108864 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] Enabling automatic NUMA balancing. Configure with numa_balancing= or sysctl
[    0.000000] hpet clockevent registered
[    0.001000] tsc: Fast TSC calibration using PIT
[    0.002000] tsc: Detected 2400.120 MHz processor
[    0.000006] Calibrating delay loop (skipped), value calculated using timer frequency.. 4800.24 BogoMIPS (lpj=2400120)
[    0.011880] pid_max: default: 32768 minimum: 301
[    0.017103] Security Framework initialized
[    0.021690] SELinux:  Initializing.
[    0.025604] SELinux:  Starting in permissive mode
[    0.032441] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
[    0.044914] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[    0.054586] Mount-cache hash table entries: 256
[    0.059896] Initializing cgroup subsys memory
[    0.064787] Initializing cgroup subsys devices
[    0.069754] Initializing cgroup subsys freezer
[    0.074720] Initializing cgroup subsys net_cls
[    0.079686] Initializing cgroup subsys blkio
[    0.084459] Initializing cgroup subsys perf_event
[    0.089749] CPU: Physical Processor ID: 0
[    0.094230] CPU: Processor Core ID: 0
[    0.098326] mce: CPU supports 9 MCE banks
[    0.102814] CPU0: Thermal monitoring enabled (TM1)
[    0.108176] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
[    0.108176] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32
[    0.108176] tlb_flushall_shift: 6
[    0.124388] Freeing SMP alternatives: 24k freed
[    0.129457] ACPI: Core revision 20130328
[    0.146166] ACPI: All ACPI Tables successfully acquired
[    0.157144] ftrace: allocating 24325 entries in 96 pages
[    0.175780] dmar: Host address width 40
[    0.180071] dmar: DRHD base: 0x000000fe710000 flags: 0x1
[    0.186014] dmar: IOMMU 0: reg_base_addr fe710000 ver 1:0 cap c90780106f0462 ecap f020fe
[    0.195056] dmar: RMRR base: 0x0000008f62f000 end: 0x0000008f631fff
[    0.202061] dmar: RMRR base: 0x0000008f61a000 end: 0x0000008f61afff
[    0.209065] dmar: RMRR base: 0x0000008f617000 end: 0x0000008f617fff
[    0.216068] dmar: RMRR base: 0x0000008f614000 end: 0x0000008f614fff
[    0.223072] dmar: RMRR base: 0x0000008f611000 end: 0x0000008f611fff
[    0.230077] dmar: RMRR base: 0x0000008f60e000 end: 0x0000008f60efff
[    0.237082] dmar: RMRR base: 0x0000008f60b000 end: 0x0000008f60bfff
[    0.244085] dmar: RMRR base: 0x0000008f608000 end: 0x0000008f608fff
[    0.251089] dmar: RMRR base: 0x0000008f605000 end: 0x0000008f605fff
[    0.258195] IOAPIC id 8 under DRHD base  0xfe710000 IOMMU 0
[    0.264407] IOAPIC id 9 under DRHD base  0xfe710000 IOMMU 0
[    0.270820] Enabled IRQ remapping in xapic mode
[    0.275886] Switched APIC routing to physical flat.
[    0.281928] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.298646] smpboot: CPU0: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz (fam: 06, model: 2c, stepping: 02)
[    0.411839] Performance Events: PEBS fmt1+, 16-deep LBR, Westmere events, Intel PMU driver.
[    0.421215] perf_event_intel: CPUID marked event: 'bus cycles' unavailable
[    0.428893] ... version:                3
[    0.433367] ... bit width:              48
[    0.437940] ... generic registers:      4
[    0.442416] ... value mask:             0000ffffffffffff
[    0.448347] ... max period:             000000007fffffff
[    0.454279] ... fixed-purpose events:   3
[    0.458755] ... event mask:             000000070000000f
[    0.466272] smpboot: Booting Node   1, Processors  #1 OK
[    0.562463] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[    0.571999] smpboot: Booting Node   0, Processors  #2 OK
[    0.591330] smpboot: Booting Node   1, Processors  #3 OK
[    0.610596] smpboot: Booting Node   0, Processors  #4 OK
[    0.629920] smpboot: Booting Node   1, Processors  #5 OK
[    0.649202] smpboot: Booting Node   0, Processors  #6 OK
[    0.668532] smpboot: Booting Node   1, Processors  #7 OK
[    0.687810] smpboot: Booting Node   0, Processors  #8 OK
[    0.707145] smpboot: Booting Node   1, Processors  #9 OK
[    0.726403] smpboot: Booting Node   0, Processors  #10 OK
[    0.745806] smpboot: Booting Node   1, Processors  #11 OK
[    0.765195] smpboot: Booting Node   0, Processors  #12 OK
[    0.784617] smpboot: Booting Node   1, Processors  #13 OK
[    0.803988] smpboot: Booting Node   0, Processors  #14 OK
[    0.823413] smpboot: Booting Node   1, Processors  #15
[    0.842210] Brought up 16 CPUs
[    0.845824] smpboot: Total of 16 processors activated (76801.07 BogoMIPS)
[    0.870916] devtmpfs: initialized
[    0.874904] PM: Registering ACPI NVS region [mem 0x8bd5d000-0x8be4efff] (991232 bytes)
[    0.883778] PM: Registering ACPI NVS region [mem 0x8c26e000-0x8d66dfff] (20971520 bytes)
[    0.893458] PM: Registering ACPI NVS region [mem 0x8f6e2000-0x8f6ecfff] (45056 bytes)
[    0.902205] PM: Registering ACPI NVS region [mem 0x8f6f0000-0x8f7cefff] (913408 bytes)
[    0.912322] atomic64 test passed for x86-64 platform with CX8 and with SSE
[    0.920026] RTC time: 14:17:10, date: 07/16/13
[    0.925053] NET: Registered protocol family 16
[    0.930189] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[    0.938647] ACPI: bus type PCI registered
[    0.943123] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.950399] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xa0000000-0xafffffff] (base 0xa0000000)
[    0.960797] PCI: MMCONFIG at [mem 0xa0000000-0xafffffff] reserved in E820
[    0.991889] PCI: Using configuration type 1 for base access
[    0.999361] bio: create slab <bio-0> at 0
[    1.004097] ACPI: Added _OSI(Module Device)
[    1.008770] ACPI: Added _OSI(Processor Device)
[    1.013734] ACPI: Added _OSI(3.0 _SCP Extensions)
[    1.018988] ACPI: Added _OSI(Processor Aggregator Device)
[    1.026534] ACPI: EC: Look up EC in DSDT
[    1.052643] ACPI: Interpreter enabled
[    1.056728] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20130328/hwxface-568)
[    1.067052] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20130328/hwxface-568)
[    1.077376] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S3_] (20130328/hwxface-568)
[    1.087701] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S4_] (20130328/hwxface-568)
[    1.098029] ACPI: (supports S0 S5)
[    1.101827] ACPI: Using IOAPIC for interrupt routing
[    1.107414] HEST: Table parsing has been initialized.
[    1.113056] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    1.131160] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-fd])
[    1.138489] acpi PNP0A08:00: ignoring host bridge window [mem 0x000c4000-0x000cbfff] (conflicts with Video ROM [mem 0x000c0000-0x000c7fff])
[    1.152629] PCI host bridge to bus 0000:00
[    1.157204] pci_bus 0000:00: root bus resource [bus 00-fd]
[    1.163332] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7]
[    1.170236] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff]
[    1.177139] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
[    1.184819] pci_bus 0000:00: root bus resource [mem 0xfed40000-0xfedfffff]
[    1.192499] pci_bus 0000:00: root bus resource [mem 0xb0000000-0xfdffffff]
[    1.200191] pci 0000:00:00.0: [8086:3406] type 00 class 0x060000
[    1.206952] pci 0000:00:00.0: PME# supported from D0 D3hot D3cold
[    1.213809] pci 0000:00:01.0: [8086:3408] type 01 class 0x060400
[    1.220570] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
[    1.227397] pci 0000:00:01.0: System wakeup disabled by ACPI
[    1.233751] pci 0000:00:02.0: [8086:3409] type 01 class 0x060400
[    1.240512] pci 0000:00:02.0: PME# supported from D0 D3hot D3cold
[    1.247373] pci 0000:00:03.0: [8086:340a] type 01 class 0x060400
[    1.254135] pci 0000:00:03.0: PME# supported from D0 D3hot D3cold
[    1.260963] pci 0000:00:03.0: System wakeup disabled by ACPI
[    1.267316] pci 0000:00:04.0: [8086:340b] type 01 class 0x060400
[    1.274075] pci 0000:00:04.0: PME# supported from D0 D3hot D3cold
[    1.280904] pci 0000:00:04.0: System wakeup disabled by ACPI
[    1.287257] pci 0000:00:05.0: [8086:340c] type 01 class 0x060400
[    1.294016] pci 0000:00:05.0: PME# supported from D0 D3hot D3cold
[    1.300845] pci 0000:00:05.0: System wakeup disabled by ACPI
[    1.307200] pci 0000:00:06.0: [8086:340d] type 01 class 0x060400
[    1.313960] pci 0000:00:06.0: PME# supported from D0 D3hot D3cold
[    1.320795] pci 0000:00:06.0: System wakeup disabled by ACPI
[    1.327150] pci 0000:00:07.0: [8086:340e] type 01 class 0x060400
[    1.333911] pci 0000:00:07.0: PME# supported from D0 D3hot D3cold
[    1.340750] pci 0000:00:07.0: System wakeup disabled by ACPI
[    1.347103] pci 0000:00:08.0: [8086:340f] type 01 class 0x060400
[    1.353863] pci 0000:00:08.0: PME# supported from D0 D3hot D3cold
[    1.360695] pci 0000:00:08.0: System wakeup disabled by ACPI
[    1.367050] pci 0000:00:09.0: [8086:3410] type 01 class 0x060400
[    1.373811] pci 0000:00:09.0: PME# supported from D0 D3hot D3cold
[    1.380645] pci 0000:00:09.0: System wakeup disabled by ACPI
[    1.386994] pci 0000:00:0a.0: [8086:3411] type 01 class 0x060400
[    1.393744] pci 0000:00:0a.0: PME# supported from D0 D3hot D3cold
[    1.400581] pci 0000:00:0a.0: System wakeup disabled by ACPI
[    1.406937] pci 0000:00:10.0: [8086:3425] type 00 class 0x080000
[    1.413778] pci 0000:00:10.1: [8086:3426] type 00 class 0x080000
[    1.420600] pci 0000:00:11.0: [8086:3427] type 00 class 0x080000
[    1.427436] pci 0000:00:11.1: [8086:3428] type 00 class 0x080000
[    1.434265] pci 0000:00:13.0: [8086:342d] type 00 class 0x080020
[    1.440984] pci 0000:00:13.0: reg 10: [mem 0xb2423000-0xb2423fff]
[    1.447833] pci 0000:00:13.0: PME# supported from D0 D3hot D3cold
[    1.454725] pci 0000:00:14.0: [8086:342e] type 00 class 0x080000
[    1.461561] pci 0000:00:14.1: [8086:3422] type 00 class 0x080000
[    1.468354] pci 0000:00:14.2: [8086:3423] type 00 class 0x080000
[    1.475189] pci 0000:00:14.3: [8086:3438] type 00 class 0x080000
[    1.482018] pci 0000:00:16.0: [8086:3430] type 00 class 0x088000
[    1.488739] pci 0000:00:16.0: reg 10: [mem 0xb2400000-0xb2403fff 64bit]
[    1.496222] pci 0000:00:16.1: [8086:3431] type 00 class 0x088000
[    1.502942] pci 0000:00:16.1: reg 10: [mem 0xb2404000-0xb2407fff 64bit]
[    1.510450] pci 0000:00:16.2: [8086:3432] type 00 class 0x088000
[    1.517170] pci 0000:00:16.2: reg 10: [mem 0xb2408000-0xb240bfff 64bit]
[    1.524685] pci 0000:00:16.3: [8086:3433] type 00 class 0x088000
[    1.531404] pci 0000:00:16.3: reg 10: [mem 0xb240c000-0xb240ffff 64bit]
[    1.538924] pci 0000:00:16.4: [8086:3429] type 00 class 0x088000
[    1.545645] pci 0000:00:16.4: reg 10: [mem 0xb2410000-0xb2413fff 64bit]
[    1.553167] pci 0000:00:16.5: [8086:342a] type 00 class 0x088000
[    1.559885] pci 0000:00:16.5: reg 10: [mem 0xb2414000-0xb2417fff 64bit]
[    1.567398] pci 0000:00:16.6: [8086:342b] type 00 class 0x088000
[    1.574117] pci 0000:00:16.6: reg 10: [mem 0xb2418000-0xb241bfff 64bit]
[    1.581639] pci 0000:00:16.7: [8086:342c] type 00 class 0x088000
[    1.588357] pci 0000:00:16.7: reg 10: [mem 0xb241c000-0xb241ffff 64bit]
[    1.595878] pci 0000:00:1a.0: [8086:3a37] type 00 class 0x0c0300
[    1.602627] pci 0000:00:1a.0: reg 20: [io  0x50c0-0x50df]
[    1.608742] pci 0000:00:1a.0: System wakeup disabled by ACPI
[    1.615104] pci 0000:00:1a.1: [8086:3a38] type 00 class 0x0c0300
[    1.621853] pci 0000:00:1a.1: reg 20: [io  0x50a0-0x50bf]
[    1.627965] pci 0000:00:1a.1: System wakeup disabled by ACPI
[    1.634321] pci 0000:00:1a.2: [8086:3a39] type 00 class 0x0c0300
[    1.641069] pci 0000:00:1a.2: reg 20: [io  0x5080-0x509f]
[    1.647181] pci 0000:00:1a.2: System wakeup disabled by ACPI
[    1.653546] pci 0000:00:1a.7: [8086:3a3c] type 00 class 0x0c0320
[    1.660273] pci 0000:00:1a.7: reg 10: [mem 0xb2421000-0xb24213ff]
[    1.667162] pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
[    1.674027] pci 0000:00:1a.7: System wakeup disabled by ACPI
[    1.680384] pci 0000:00:1c.0: [8086:3a40] type 01 class 0x060400
[    1.687158] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[    1.694019] pci 0000:00:1c.0: System wakeup disabled by ACPI
[    1.700383] pci 0000:00:1c.4: [8086:3a48] type 01 class 0x060400
[    1.707157] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
[    1.714011] pci 0000:00:1c.4: System wakeup disabled by ACPI
[    1.720371] pci 0000:00:1c.5: [8086:3a4a] type 01 class 0x060400
[    1.727145] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
[    1.734002] pci 0000:00:1c.5: System wakeup disabled by ACPI
[    1.740363] pci 0000:00:1d.0: [8086:3a34] type 00 class 0x0c0300
[    1.747111] pci 0000:00:1d.0: reg 20: [io  0x5060-0x507f]
[    1.753234] pci 0000:00:1d.0: System wakeup disabled by ACPI
[    1.759591] pci 0000:00:1d.1: [8086:3a35] type 00 class 0x0c0300
[    1.766339] pci 0000:00:1d.1: reg 20: [io  0x5040-0x505f]
[    1.772472] pci 0000:00:1d.1: System wakeup disabled by ACPI
[    1.778829] pci 0000:00:1d.2: [8086:3a36] type 00 class 0x0c0300
[    1.785568] pci 0000:00:1d.2: reg 20: [io  0x5020-0x503f]
[    1.791691] pci 0000:00:1d.2: System wakeup disabled by ACPI
[    1.798062] pci 0000:00:1d.7: [8086:3a3a] type 00 class 0x0c0320
[    1.804790] pci 0000:00:1d.7: reg 10: [mem 0xb2420000-0xb24203ff]
[    1.811679] pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
[    1.818557] pci 0000:00:1d.7: System wakeup disabled by ACPI
[    1.824915] pci 0000:00:1e.0: [8086:244e] type 01 class 0x060401
[    1.831718] pci 0000:00:1e.0: System wakeup disabled by ACPI
[    1.838076] pci 0000:00:1f.0: [8086:3a16] type 00 class 0x060100
[    1.844858] pci 0000:00:1f.0: quirk: [io  0x0400-0x047f] claimed by ICH6 ACPI/GPIO/TCO
[    1.853704] pci 0000:00:1f.0: quirk: [io  0x0500-0x053f] claimed by ICH6 GPIO
[    1.861668] pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 000f)
[    1.870124] pci 0000:00:1f.0: ICH7 LPC Generic IO decode 2 PIO at 0ca0 (mask 000f)
[    1.878582] pci 0000:00:1f.0: ICH7 LPC Generic IO decode 3 PIO at 0600 (mask 003f)
[    1.887135] pci 0000:00:1f.2: [8086:3a20] type 00 class 0x01018f
[    1.893858] pci 0000:00:1f.2: reg 10: [io  0x5138-0x513f]
[    1.899893] pci 0000:00:1f.2: reg 14: [io  0x514c-0x514f]
[    1.905928] pci 0000:00:1f.2: reg 18: [io  0x5130-0x5137]
[    1.911964] pci 0000:00:1f.2: reg 1c: [io  0x5148-0x514b]
[    1.917999] pci 0000:00:1f.2: reg 20: [io  0x5110-0x511f]
[    1.924035] pci 0000:00:1f.2: reg 24: [io  0x5100-0x510f]
[    1.930182] pci 0000:00:1f.3: [8086:3a30] type 00 class 0x0c0500
[    1.936905] pci 0000:00:1f.3: reg 10: [mem 0xb2422000-0xb24220ff 64bit]
[    1.944310] pci 0000:00:1f.3: reg 20: [io  0x5000-0x501f]
[    1.950443] pci 0000:00:1f.5: [8086:3a26] type 00 class 0x010185
[    1.957167] pci 0000:00:1f.5: reg 10: [io  0x5128-0x512f]
[    1.963203] pci 0000:00:1f.5: reg 14: [io  0x5144-0x5147]
[    1.969238] pci 0000:00:1f.5: reg 18: [io  0x5120-0x5127]
[    1.975272] pci 0000:00:1f.5: reg 1c: [io  0x5140-0x5143]
[    1.981307] pci 0000:00:1f.5: reg 20: [io  0x50f0-0x50ff]
[    1.987342] pci 0000:00:1f.5: reg 24: [io  0x50e0-0x50ef]
[    1.993531] pci 0000:00:01.0: PCI bridge to [bus 01]
[    1.999131] pci 0000:00:02.0: PCI bridge to [bus 02]
[    2.004722] pci 0000:00:03.0: PCI bridge to [bus 03]
[    2.010319] pci 0000:00:04.0: PCI bridge to [bus 04]
[    2.015921] pci 0000:05:00.0: [10b5:8624] type 01 class 0x060400
[    2.022639] pci 0000:05:00.0: reg 10: [mem 0xb2300000-0xb231ffff]
[    2.029499] pci 0000:05:00.0: PME# supported from D0 D3hot D3cold
[    2.036635] pci 0000:00:05.0: PCI bridge to [bus 05-14]
[    2.042473] pci 0000:00:05.0:   bridge window [io  0x3000-0x4fff]
[    2.049272] pci 0000:00:05.0:   bridge window [mem 0xb1c00000-0xb23fffff]
[    2.056916] pci 0000:06:04.0: [10b5:8624] type 01 class 0x060400
[    2.063685] pci 0000:06:04.0: PME# supported from D0 D3hot D3cold
[    2.070542] pci 0000:06:05.0: [10b5:8624] type 01 class 0x060400
[    2.077311] pci 0000:06:05.0: PME# supported from D0 D3hot D3cold
[    2.084172] pci 0000:06:08.0: [10b5:8624] type 01 class 0x060400
[    2.090941] pci 0000:06:08.0: PME# supported from D0 D3hot D3cold
[    2.097803] pci 0000:05:00.0: PCI bridge to [bus 06-14]
[    2.103644] pci 0000:05:00.0:   bridge window [io  0x3000-0x4fff]
[    2.110451] pci 0000:05:00.0:   bridge window [mem 0xb1c00000-0xb22fffff]
[    2.118084] pci 0000:06:04.0: PCI bridge to [bus 07]
[    2.123688] pci 0000:06:05.0: PCI bridge to [bus 08]
[    2.129301] pci 0000:09:00.0: [10b5:8632] type 01 class 0x060400
[    2.136023] pci 0000:09:00.0: reg 10: [mem 0xb2200000-0xb221ffff]
[    2.142899] pci 0000:09:00.0: PME# supported from D0 D3hot D3cold
[    2.151136] pci 0000:06:08.0: PCI bridge to [bus 09-14]
[    2.156976] pci 0000:06:08.0:   bridge window [io  0x3000-0x4fff]
[    2.163786] pci 0000:06:08.0:   bridge window [mem 0xb1c00000-0xb22fffff]
[    2.171448] pci 0000:0a:01.0: [10b5:8632] type 01 class 0x060400
[    2.178228] pci 0000:0a:01.0: PME# supported from D0 D3hot D3cold
[    2.185094] pci 0000:0a:04.0: [10b5:8632] type 01 class 0x060400
[    2.191881] pci 0000:0a:04.0: PME# supported from D0 D3hot D3cold
[    2.198760] pci 0000:09:00.0: PCI bridge to [bus 0a-14]
[    2.204600] pci 0000:09:00.0:   bridge window [io  0x3000-0x4fff]
[    2.211408] pci 0000:09:00.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    2.219051] pci 0000:0a:01.0: PCI bridge to [bus 0b]
[    2.224681] pci 0000:0c:00.0: [1137:0023] type 01 class 0x060400
[    2.231479] pci 0000:0c:00.0: PME# supported from D0 D3hot D3cold
[    2.240201] pci 0000:0a:04.0: PCI bridge to [bus 0c-14]
[    2.246042] pci 0000:0a:04.0:   bridge window [io  0x3000-0x4fff]
[    2.252851] pci 0000:0a:04.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    2.260544] pci 0000:0d:00.0: [1137:0041] type 01 class 0x060400
[    2.267330] pci 0000:0d:00.0: reg 10: [mem 0xb2100000-0xb210ffff 64bit]
[    2.274816] pci 0000:0d:00.0: PME# supported from D0 D3hot D3cold
[    2.281696] pci 0000:0d:01.0: [1137:0041] type 01 class 0x060400
[    2.288549] pci 0000:0d:01.0: PME# supported from D0 D3hot D3cold
[    2.295442] pci 0000:0c:00.0: PCI bridge to [bus 0d-14]
[    2.301282] pci 0000:0c:00.0:   bridge window [io  0x3000-0x4fff]
[    2.308092] pci 0000:0c:00.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    2.315782] pci 0000:0e:00.0: [1137:0042] type 00 class 0x00ff00
[    2.322563] pci 0000:0e:00.0: reg 10: [mem 0xb2002000-0xb2002fff 64bit]
[    2.330008] pci 0000:0e:00.0: reg 18: [mem 0xb2000000-0xb2001fff 64bit]
[    2.337521] pci 0000:0e:00.0: reg 30: [mem 0xffff8000-0xffffffff pref]
[    2.344955] pci 0000:0d:00.0: PCI bridge to [bus 0e]
[    2.350505] pci 0000:0d:00.0:   bridge window [mem 0xb2000000-0xb20fffff]
[    2.358187] pci 0000:0f:00.0: [1137:0040] type 01 class 0x060400
[    2.365042] pci 0000:0f:00.0: PME# supported from D0 D3hot D3cold
[    2.371903] pci 0000:0d:01.0: PCI bridge to [bus 0f-14]
[    2.377742] pci 0000:0d:01.0:   bridge window [io  0x3000-0x4fff]
[    2.384553] pci 0000:0d:01.0:   bridge window [mem 0xb1c00000-0xb1ffffff]
[    2.392227] pci 0000:10:00.0: [1137:0041] type 01 class 0x060400
[    2.399075] pci 0000:10:00.0: PME# supported from D0 D3hot D3cold
[    2.405948] pci 0000:10:01.0: [1137:0041] type 01 class 0x060400
[    2.412812] pci 0000:10:01.0: PME# supported from D0 D3hot D3cold
[    2.419685] pci 0000:10:02.0: [1137:0041] type 01 class 0x060400
[    2.426533] pci 0000:10:02.0: PME# supported from D0 D3hot D3cold
[    2.433419] pci 0000:10:03.0: [1137:0041] type 01 class 0x060400
[    2.440265] pci 0000:10:03.0: PME# supported from D0 D3hot D3cold
[    2.447158] pci 0000:0f:00.0: PCI bridge to [bus 10-14]
[    2.452996] pci 0000:0f:00.0:   bridge window [io  0x3000-0x4fff]
[    2.459796] pci 0000:0f:00.0:   bridge window [mem 0xb1c00000-0xb1ffffff]
[    2.467471] pci 0000:11:00.0: [1137:0043] type 00 class 0x020000
[    2.474260] pci 0000:11:00.0: reg 10: [mem 0xb1f00000-0xb1f07fff 64bit]
[    2.481714] pci 0000:11:00.0: reg 18: [mem 0xb1f08000-0xb1f09fff 64bit]
[    2.489155] pci 0000:11:00.0: reg 20: [io  0x4000-0x407f]
[    2.495286] pci 0000:11:00.0: reg 30: [mem 0xffff0000-0xffffffff pref]
[    2.502733] pci 0000:10:00.0: PCI bridge to [bus 11]
[    2.508281] pci 0000:10:00.0:   bridge window [io  0x4000-0x4fff]
[    2.515090] pci 0000:10:00.0:   bridge window [mem 0xb1f00000-0xb1ffffff]
[    2.522777] pci 0000:12:00.0: [1137:0043] type 00 class 0x020000
[    2.529564] pci 0000:12:00.0: reg 10: [mem 0xb1e00000-0xb1e07fff 64bit]
[    2.537016] pci 0000:12:00.0: reg 18: [mem 0xb1e08000-0xb1e09fff 64bit]
[    2.544460] pci 0000:12:00.0: reg 20: [io  0x3000-0x307f]
[    2.550591] pci 0000:12:00.0: reg 30: [mem 0xffff0000-0xffffffff pref]
[    2.558034] pci 0000:10:01.0: PCI bridge to [bus 12]
[    2.563584] pci 0000:10:01.0:   bridge window [io  0x3000-0x3fff]
[    2.570394] pci 0000:10:01.0:   bridge window [mem 0xb1e00000-0xb1efffff]
[    2.578083] pci 0000:13:00.0: [1137:0045] type 00 class 0x0c0400
[    2.584875] pci 0000:13:00.0: reg 10: [mem 0xb1d00000-0xb1d07fff 64bit]
[    2.592323] pci 0000:13:00.0: reg 18: [mem 0xb1d08000-0xb1d09fff 64bit]
[    2.600010] pci 0000:10:02.0: PCI bridge to [bus 13]
[    2.605561] pci 0000:10:02.0:   bridge window [mem 0xb1d00000-0xb1dfffff]
[    2.613255] pci 0000:14:00.0: [1137:0045] type 00 class 0x0c0400
[    2.620041] pci 0000:14:00.0: reg 10: [mem 0xb1c00000-0xb1c07fff 64bit]
[    2.627491] pci 0000:14:00.0: reg 18: [mem 0xb1c08000-0xb1c09fff 64bit]
[    2.635158] pci 0000:10:03.0: PCI bridge to [bus 14]
[    2.640708] pci 0000:10:03.0:   bridge window [mem 0xb1c00000-0xb1cfffff]
[    2.648476] pci 0000:00:06.0: PCI bridge to [bus 15]
[    2.654095] pci 0000:16:00.0: [1077:2432] type 00 class 0x0c0400
[    2.660816] pci 0000:16:00.0: reg 10: [io  0x2200-0x22ff]
[    2.666858] pci 0000:16:00.0: reg 14: [mem 0xb1b00000-0xb1b03fff 64bit]
[    2.674275] pci 0000:16:00.0: reg 30: [mem 0xfffc0000-0xffffffff pref]
[    2.681658] pci 0000:16:00.1: [1077:2432] type 00 class 0x0c0400
[    2.688379] pci 0000:16:00.1: reg 10: [io  0x2000-0x20ff]
[    2.694421] pci 0000:16:00.1: reg 14: [mem 0xb1b04000-0xb1b07fff 64bit]
[    2.701839] pci 0000:16:00.1: reg 30: [mem 0xfffc0000-0xffffffff pref]
[    2.709497] pci 0000:00:07.0: PCI bridge to [bus 16]
[    2.715035] pci 0000:00:07.0:   bridge window [io  0x2000-0x2fff]
[    2.721842] pci 0000:00:07.0:   bridge window [mem 0xb1b00000-0xb1bfffff]
[    2.729478] pci 0000:00:08.0: PCI bridge to [bus 17]
[    2.735072] pci 0000:00:09.0: PCI bridge to [bus 18]
[    2.740669] pci 0000:00:0a.0: PCI bridge to [bus 19]
[    2.746295] pci 0000:1a:00.0: [8086:10c9] type 00 class 0x020000
[    2.753018] pci 0000:1a:00.0: reg 10: [mem 0xb1960000-0xb197ffff]
[    2.759835] pci 0000:1a:00.0: reg 14: [mem 0xb1940000-0xb195ffff]
[    2.766652] pci 0000:1a:00.0: reg 18: [io  0x1020-0x103f]
[    2.772691] pci 0000:1a:00.0: reg 1c: [mem 0xb1a04000-0xb1a07fff]
[    2.779526] pci 0000:1a:00.0: reg 30: [mem 0xfffe0000-0xffffffff pref]
[    2.786874] pci 0000:1a:00.0: PME# supported from D0 D3hot D3cold
[    2.793723] pci 0000:1a:00.0: reg 184: [mem 0xb1980000-0xb1983fff 64bit]
[    2.801229] pci 0000:1a:00.0: reg 190: [mem 0xb19a0000-0xb19a3fff 64bit]
[    2.808786] pci 0000:1a:00.1: [8086:10c9] type 00 class 0x020000
[    2.815500] pci 0000:1a:00.1: reg 10: [mem 0xb1920000-0xb193ffff]
[    2.822316] pci 0000:1a:00.1: reg 14: [mem 0xb1900000-0xb191ffff]
[    2.829133] pci 0000:1a:00.1: reg 18: [io  0x1000-0x101f]
[    2.835163] pci 0000:1a:00.1: reg 1c: [mem 0xb1a00000-0xb1a03fff]
[    2.841997] pci 0000:1a:00.1: reg 30: [mem 0xfffe0000-0xffffffff pref]
[    2.849337] pci 0000:1a:00.1: PME# supported from D0 D3hot D3cold
[    2.856185] pci 0000:1a:00.1: reg 184: [mem 0xb19c0000-0xb19c3fff 64bit]
[    2.863692] pci 0000:1a:00.1: reg 190: [mem 0xb19e0000-0xb19e3fff 64bit]
[    2.872673] pci 0000:00:1c.0: PCI bridge to [bus 1a-1c]
[    2.878513] pci 0000:00:1c.0:   bridge window [io  0x1000-0x1fff]
[    2.885322] pci 0000:00:1c.0:   bridge window [mem 0xb1900000-0xb1afffff]
[    2.892986] pci 0000:1d:00.0: [102b:0522] type 00 class 0x030000
[    2.899715] pci 0000:1d:00.0: reg 10: [mem 0xb0000000-0xb0ffffff pref]
[    2.907011] pci 0000:1d:00.0: reg 14: [mem 0xb1800000-0xb1803fff]
[    2.913829] pci 0000:1d:00.0: reg 18: [mem 0xb1000000-0xb17fffff]
[    2.920686] pci 0000:1d:00.0: reg 30: [mem 0xffff0000-0xffffffff pref]
[    2.929714] pci 0000:00:1c.4: PCI bridge to [bus 1d]
[    2.935264] pci 0000:00:1c.4:   bridge window [mem 0xb1000000-0xb18fffff]
[    2.942852] pci 0000:00:1c.4:   bridge window [mem 0xb0000000-0xb0ffffff 64bit pref]
[    2.951551] pci 0000:00:1c.5: PCI bridge to [bus 1e]
[    2.957175] pci 0000:00:1e.0: PCI bridge to [bus 1f] (subtractive decode)
[    2.964766] pci 0000:00:1e.0:   bridge window [io  0x0000-0x0cf7] (subtractive decode)
[    2.973609] pci 0000:00:1e.0:   bridge window [io  0x0d00-0xffff] (subtractive decode)
[    2.982454] pci 0000:00:1e.0:   bridge window [mem 0x000a0000-0x000bffff] (subtractive decode)
[    2.992074] pci 0000:00:1e.0:   bridge window [mem 0xfed40000-0xfedfffff] (subtractive decode)
[    3.001693] pci 0000:00:1e.0:   bridge window [mem 0xb0000000-0xfdffffff] (subtractive decode)
[    3.011408] acpi PNP0A08:00: Requesting ACPI _OSC control (0x1d)
[    3.018207] acpi PNP0A08:00: ACPI _OSC control (0x1d) granted
[    3.024769] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
[    3.033015] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
[    3.041259] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
[    3.049503] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
[    3.057750] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
[    3.067280] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
[    3.075520] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
[    3.085046] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
[    3.094830] acpi root: \_SB_.PCI0 notify handler is installed
[    3.101284] Found 1 acpi root devices
[    3.105475] ACPI: No dock devices found.
[    3.109940] vgaarb: device added: PCI:0000:1d:00.0,decodes=io+mem,owns=io+mem,locks=none
[    3.118980] vgaarb: loaded
[    3.122002] vgaarb: bridge control possible 0000:1d:00.0
[    3.128017] SCSI subsystem initialized
[    3.132204] ACPI: bus type ATA registered
[    3.136744] libata version 3.00 loaded.
[    3.141056] ACPI: bus type USB registered
[    3.145539] usbcore: registered new interface driver usbfs
[    3.151672] usbcore: registered new interface driver hub
[    3.157632] usbcore: registered new device driver usb
[    3.163406] PCI: Using ACPI for IRQ routing
[    3.173349] PCI: Discovered peer bus fe
[    3.177631] PCI: root bus fe: using default resources
[    3.183272] PCI: Probing PCI hardware (bus fe)
[    3.188259] PCI host bridge to bus 0000:fe
[    3.192834] pci_bus 0000:fe: root bus resource [io  0x0000-0xffff]
[    3.199739] pci_bus 0000:fe: root bus resource [mem 0x00000000-0xffffffffff]
[    3.207612] pci_bus 0000:fe: No busn resource found for root bus, will use [bus fe-ff]
[    3.216459] pci 0000:fe:00.0: [8086:2c70] type 00 class 0x060000
[    3.223212] pci 0000:fe:00.1: [8086:2d81] type 00 class 0x060000
[    3.229967] pci 0000:fe:02.0: [8086:2d90] type 00 class 0x060000
[    3.236718] pci 0000:fe:02.1: [8086:2d91] type 00 class 0x060000
[    3.243472] pci 0000:fe:02.2: [8086:2d92] type 00 class 0x060000
[    3.250222] pci 0000:fe:02.3: [8086:2d93] type 00 class 0x060000
[    3.256973] pci 0000:fe:02.4: [8086:2d94] type 00 class 0x060000
[    3.263723] pci 0000:fe:02.5: [8086:2d95] type 00 class 0x060000
[    3.270475] pci 0000:fe:03.0: [8086:2d98] type 00 class 0x060000
[    3.277226] pci 0000:fe:03.1: [8086:2d99] type 00 class 0x060000
[    3.283979] pci 0000:fe:03.2: [8086:2d9a] type 00 class 0x060000
[    3.290732] pci 0000:fe:03.4: [8086:2d9c] type 00 class 0x060000
[    3.297484] pci 0000:fe:04.0: [8086:2da0] type 00 class 0x060000
[    3.304232] pci 0000:fe:04.1: [8086:2da1] type 00 class 0x060000
[    3.310982] pci 0000:fe:04.2: [8086:2da2] type 00 class 0x060000
[    3.317724] pci 0000:fe:04.3: [8086:2da3] type 00 class 0x060000
[    3.324473] pci 0000:fe:05.0: [8086:2da8] type 00 class 0x060000
[    3.331220] pci 0000:fe:05.1: [8086:2da9] type 00 class 0x060000
[    3.337967] pci 0000:fe:05.2: [8086:2daa] type 00 class 0x060000
[    3.344715] pci 0000:fe:05.3: [8086:2dab] type 00 class 0x060000
[    3.351466] pci 0000:fe:06.0: [8086:2db0] type 00 class 0x060000
[    3.358212] pci 0000:fe:06.1: [8086:2db1] type 00 class 0x060000
[    3.364949] pci 0000:fe:06.2: [8086:2db2] type 00 class 0x060000
[    3.371687] pci 0000:fe:06.3: [8086:2db3] type 00 class 0x060000
[    3.378449] pci_bus 0000:fe: busn_res: [bus fe-ff] end is updated to fe
[    3.385841] PCI: Discovered peer bus ff
[    3.390124] PCI: root bus ff: using default resources
[    3.395763] PCI: Probing PCI hardware (bus ff)
[    3.400748] PCI host bridge to bus 0000:ff
[    3.405322] pci_bus 0000:ff: root bus resource [io  0x0000-0xffff]
[    3.412226] pci_bus 0000:ff: root bus resource [mem 0x00000000-0xffffffffff]
[    3.420101] pci_bus 0000:ff: No busn resource found for root bus, will use [bus ff-ff]
[    3.428948] pci 0000:ff:00.0: [8086:2c70] type 00 class 0x060000
[    3.435690] pci 0000:ff:00.1: [8086:2d81] type 00 class 0x060000
[    3.442439] pci 0000:ff:02.0: [8086:2d90] type 00 class 0x060000
[    3.449183] pci 0000:ff:02.1: [8086:2d91] type 00 class 0x060000
[    3.455929] pci 0000:ff:02.2: [8086:2d92] type 00 class 0x060000
[    3.462666] pci 0000:ff:02.3: [8086:2d93] type 00 class 0x060000
[    3.469415] pci 0000:ff:02.4: [8086:2d94] type 00 class 0x060000
[    3.476160] pci 0000:ff:02.5: [8086:2d95] type 00 class 0x060000
[    3.482904] pci 0000:ff:03.0: [8086:2d98] type 00 class 0x060000
[    3.489650] pci 0000:ff:03.1: [8086:2d99] type 00 class 0x060000
[    3.496394] pci 0000:ff:03.2: [8086:2d9a] type 00 class 0x060000
[    3.503141] pci 0000:ff:03.4: [8086:2d9c] type 00 class 0x060000
[    3.509890] pci 0000:ff:04.0: [8086:2da0] type 00 class 0x060000
[    3.516631] pci 0000:ff:04.1: [8086:2da1] type 00 class 0x060000
[    3.523375] pci 0000:ff:04.2: [8086:2da2] type 00 class 0x060000
[    3.530123] pci 0000:ff:04.3: [8086:2da3] type 00 class 0x060000
[    3.536870] pci 0000:ff:05.0: [8086:2da8] type 00 class 0x060000
[    3.543615] pci 0000:ff:05.1: [8086:2da9] type 00 class 0x060000
[    3.550365] pci 0000:ff:05.2: [8086:2daa] type 00 class 0x060000
[    3.557114] pci 0000:ff:05.3: [8086:2dab] type 00 class 0x060000
[    3.563860] pci 0000:ff:06.0: [8086:2db0] type 00 class 0x060000
[    3.570605] pci 0000:ff:06.1: [8086:2db1] type 00 class 0x060000
[    3.577351] pci 0000:ff:06.2: [8086:2db2] type 00 class 0x060000
[    3.584098] pci 0000:ff:06.3: [8086:2db3] type 00 class 0x060000
[    3.590854] pci_bus 0000:ff: busn_res: [bus ff] end is updated to ff
[    3.597954] PCI: pci_cache_line_size set to 64 bytes
[    3.603770] e820: reserve RAM buffer [mem 0x0009c000-0x0009ffff]
[    3.610478] e820: reserve RAM buffer [mem 0x8bd5d000-0x8bffffff]
[    3.617275] NetLabel: Initializing
[    3.621064] NetLabel:  domain hash size = 128
[    3.625929] NetLabel:  protocols = UNLABELED CIPSOv4
[    3.631482] NetLabel:  unlabeled traffic allowed by default
[    3.637747] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
[    3.643534] hpet0: 4 comparators, 64-bit 14.318180 MHz counter
[    3.652074] Switching to clocksource hpet
[    3.663947] pnp: PnP ACPI init
[    3.667378] ACPI: bus type PNP registered
[    3.671942] pnp 00:00: Plug and Play ACPI device, IDs PNP0003 (active)
[    3.679412] pnp 00:01: [dma 4]
[    3.682852] pnp 00:01: Plug and Play ACPI device, IDs PNP0200 (active)
[    3.690190] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[    3.697515] pnp 00:03: Plug and Play ACPI device, IDs PNP0c04 (active)
[    3.704839] pnp 00:04: Plug and Play ACPI device, IDs PNP0800 (active)
[    3.712201] pnp 00:05: Plug and Play ACPI device, IDs PNP0103 (active)
[    3.719662] system 00:06: [io  0x0500-0x057f] could not be reserved
[    3.726664] system 00:06: [io  0x0400-0x047f] has been reserved
[    3.733271] system 00:06: [io  0x0800-0x081f] has been reserved
[    3.739885] system 00:06: [io  0x0ca2-0x0ca3] has been reserved
[    3.746490] system 00:06: [io  0x0600-0x063f] has been reserved
[    3.753108] system 00:06: [io  0x0880-0x0883] has been reserved
[    3.759723] system 00:06: [io  0x0ca4-0x0ca5] has been reserved
[    3.766339] system 00:06: [mem 0xfed1c000-0xfed3fffe] could not be reserved
[    3.774118] system 00:06: [mem 0xff000000-0xffffffff] could not be reserved
[    3.781897] system 00:06: [mem 0xfee00000-0xfeefffff] has been reserved
[    3.789288] system 00:06: [mem 0xfe900000-0xfe90001f] has been reserved
[    3.796678] system 00:06: [mem 0xfea00000-0xfea0001f] has been reserved
[    3.804068] system 00:06: [mem 0xfed1b000-0xfed1bfff] has been reserved
[    3.811459] system 00:06: Plug and Play ACPI device, IDs PNP0c02 (active)
[    3.819214] pnp 00:07: Plug and Play ACPI device, IDs PNP0501 (active)
[    3.826573] pnp 00:08: Plug and Play ACPI device, IDs IPI0001 (active)
[    3.833960] pnp: PnP ACPI: found 9 devices
[    3.838536] ACPI: bus type PNP unregistered
[    3.849623] pci 0000:0e:00.0: no compatible bridge window for [mem 0xffff8000-0xffffffff pref]
[    3.859248] pci 0000:11:00.0: no compatible bridge window for [mem 0xffff0000-0xffffffff pref]
[    3.868871] pci 0000:12:00.0: no compatible bridge window for [mem 0xffff0000-0xffffffff pref]
[    3.878495] pci 0000:16:00.0: no compatible bridge window for [mem 0xfffc0000-0xffffffff pref]
[    3.888117] pci 0000:16:00.1: no compatible bridge window for [mem 0xfffc0000-0xffffffff pref]
[    3.897743] pci 0000:1a:00.0: no compatible bridge window for [mem 0xfffe0000-0xffffffff pref]
[    3.907366] pci 0000:1a:00.1: no compatible bridge window for [mem 0xfffe0000-0xffffffff pref]
[    3.916987] pci 0000:1d:00.0: no compatible bridge window for [mem 0xffff0000-0xffffffff pref]
[    3.926649] pci 0000:06:05.0: bridge window [io  0x1000-0x0fff] to [bus 08] add_size 1000
[    3.935788] pci 0000:06:05.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 08] add_size 200000
[    3.946962] pci 0000:06:05.0: bridge window [mem 0x00100000-0x000fffff] to [bus 08] add_size 200000
[    3.957077] pci 0000:0a:01.0: bridge window [io  0x1000-0x0fff] to [bus 0b] add_size 1000
[    3.966216] pci 0000:0a:01.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 0b] add_size 200000
[    3.977392] pci 0000:0a:01.0: bridge window [mem 0x00100000-0x000fffff] to [bus 0b] add_size 200000
[    3.987624] pci 0000:0a:04.0: bridge window [mem 0x00100000-0x003fffff pref] to [bus 0c-14] add_size 200000
[    3.998515] pci 0000:0a:01.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] get_res_add_size add_size 200000
[    4.009593] pci 0000:0a:04.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 200000
[    4.020089] pci 0000:09:00.0: bridge window [mem 0x00100000-0x003fffff pref] to [bus 0a-14] add_size 400000
[    4.030979] pci 0000:09:00.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 400000
[    4.041475] pci 0000:06:08.0: bridge window [mem 0x00100000-0x003fffff pref] to [bus 09-14] add_size 400000
[    4.052355] pci 0000:06:05.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] get_res_add_size add_size 200000
[    4.063433] pci 0000:06:08.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 400000
[    4.073929] pci 0000:05:00.0: bridge window [mem 0x00100000-0x003fffff pref] to [bus 06-14] add_size 600000
[    4.084808] pci 0000:05:00.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 600000
[    4.095305] pci 0000:00:05.0: bridge window [mem 0x00100000-0x003fffff pref] to [bus 05-14] add_size 600000
[    4.106217] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x001fffff pref] to [bus 1a-1c] add_size 200000
[    4.117106] pci 0000:00:1c.4: bridge window [io  0x1000-0x0fff] to [bus 1d] add_size 1000
[    4.126248] pci 0000:00:1c.5: bridge window [io  0x1000-0x0fff] to [bus 1e] add_size 1000
[    4.135386] pci 0000:00:1c.5: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 1e] add_size 200000
[    4.146562] pci 0000:00:1c.5: bridge window [mem 0x00100000-0x000fffff] to [bus 1e] add_size 200000
[    4.156690] pci 0000:00:05.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 600000
[    4.167187] pci 0000:00:1c.0: res[15]=[mem 0x00100000-0x001fffff pref] get_res_add_size add_size 200000
[    4.177683] pci 0000:00:1c.5: res[14]=[mem 0x00100000-0x000fffff] get_res_add_size add_size 200000
[    4.187695] pci 0000:00:1c.5: res[15]=[mem 0x00100000-0x000fffff 64bit pref] get_res_add_size add_size 200000
[    4.198774] pci 0000:00:1c.4: res[13]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
[    4.207814] pci 0000:00:1c.5: res[13]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
[    4.216848] pci 0000:00:05.0: BAR 15: assigned [mem 0xb2500000-0xb2dfffff pref]
[    4.225015] pci 0000:00:07.0: BAR 15: assigned [mem 0xb2e00000-0xb2efffff pref]
[    4.233183] pci 0000:00:1c.0: BAR 15: assigned [mem 0xb2f00000-0xb31fffff pref]
[    4.241350] pci 0000:00:1c.5: BAR 14: assigned [mem 0xb3200000-0xb33fffff]
[    4.249024] pci 0000:00:1c.5: BAR 15: assigned [mem 0xb3400000-0xb35fffff 64bit pref]
[    4.257774] pci 0000:00:1c.4: BAR 13: assigned [io  0x6000-0x6fff]
[    4.264679] pci 0000:00:1c.5: BAR 13: assigned [io  0x7000-0x7fff]
[    4.271576] pci 0000:00:01.0: PCI bridge to [bus 01]
[    4.277131] pci 0000:00:02.0: PCI bridge to [bus 02]
[    4.282682] pci 0000:00:03.0: PCI bridge to [bus 03]
[    4.288235] pci 0000:00:04.0: PCI bridge to [bus 04]
[    4.293787] pci 0000:05:00.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 600000
[    4.304283] pci 0000:05:00.0: BAR 15: assigned [mem 0xb2500000-0xb2dfffff pref]
[    4.312442] pci 0000:06:05.0: res[14]=[mem 0x00100000-0x000fffff] get_res_add_size add_size 200000
[    4.322455] pci 0000:06:05.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] get_res_add_size add_size 200000
[    4.333534] pci 0000:06:08.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 400000
[    4.344029] pci 0000:06:05.0: res[13]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
[    4.353069] pci 0000:06:05.0: BAR 14: can't assign mem (size 0x200000)
[    4.360363] pci 0000:06:05.0: BAR 15: assigned [mem 0xb2500000-0xb26fffff 64bit pref]
[    4.369113] pci 0000:06:08.0: BAR 15: assigned [mem 0xb2700000-0xb2dfffff pref]
[    4.377281] pci 0000:06:05.0: BAR 13: can't assign io (size 0x1000)
[    4.384284] pci 0000:06:08.0: BAR 15: assigned [mem 0xb2500000-0xb27fffff pref]
[    4.392450] pci 0000:06:08.0: BAR 15: reassigned [mem 0xb2500000-0xb2bfffff pref]
[    4.400810] pci 0000:06:05.0: BAR 14: can't assign mem (size 0x200000)
[    4.408104] pci 0000:06:05.0: BAR 15: assigned [mem 0xb2c00000-0xb2dfffff 64bit pref]
[    4.416845] pci 0000:06:05.0: BAR 13: can't assign io (size 0x1000)
[    4.423848] pci 0000:06:04.0: PCI bridge to [bus 07]
[    4.429400] pci 0000:06:05.0: PCI bridge to [bus 08]
[    4.434949] pci 0000:06:05.0:   bridge window [mem 0xb2c00000-0xb2dfffff 64bit pref]
[    4.443606] pci 0000:09:00.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 400000
[    4.454101] pci 0000:09:00.0: BAR 15: assigned [mem 0xb2500000-0xb2bfffff pref]
[    4.462269] pci 0000:0a:01.0: res[14]=[mem 0x00100000-0x000fffff] get_res_add_size add_size 200000
[    4.472280] pci 0000:0a:01.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] get_res_add_size add_size 200000
[    4.483358] pci 0000:0a:04.0: res[15]=[mem 0x00100000-0x003fffff pref] get_res_add_size add_size 200000
[    4.493853] pci 0000:0a:01.0: res[13]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
[    4.502891] pci 0000:0a:01.0: BAR 14: can't assign mem (size 0x200000)
[    4.510186] pci 0000:0a:01.0: BAR 15: assigned [mem 0xb2500000-0xb26fffff 64bit pref]
[    4.518933] pci 0000:0a:04.0: BAR 15: assigned [mem 0xb2700000-0xb2bfffff pref]
[    4.527099] pci 0000:0a:01.0: BAR 13: can't assign io (size 0x1000)
[    4.534101] pci 0000:0a:04.0: BAR 15: assigned [mem 0xb2500000-0xb27fffff pref]
[    4.542269] pci 0000:0a:04.0: BAR 15: reassigned [mem 0xb2500000-0xb29fffff pref]
[    4.550631] pci 0000:0a:01.0: BAR 14: can't assign mem (size 0x200000)
[    4.557925] pci 0000:0a:01.0: BAR 15: assigned [mem 0xb2a00000-0xb2bfffff 64bit pref]
[    4.566675] pci 0000:0a:01.0: BAR 13: can't assign io (size 0x1000)
[    4.573677] pci 0000:0a:01.0: PCI bridge to [bus 0b]
[    4.579221] pci 0000:0a:01.0:   bridge window [mem 0xb2a00000-0xb2bfffff 64bit pref]
[    4.587879] pci 0000:0c:00.0: BAR 15: assigned [mem 0xb2500000-0xb27fffff pref]
[    4.596046] pci 0000:0d:00.0: BAR 15: assigned [mem 0xb2500000-0xb25fffff pref]
[    4.604213] pci 0000:0d:01.0: BAR 15: assigned [mem 0xb2600000-0xb27fffff pref]
[    4.612371] pci 0000:0e:00.0: BAR 6: assigned [mem 0xb2500000-0xb2507fff pref]
[    4.620432] pci 0000:0d:00.0: PCI bridge to [bus 0e]
[    4.626002] pci 0000:0d:00.0:   bridge window [mem 0xb2000000-0xb20fffff]
[    4.633594] pci 0000:0d:00.0:   bridge window [mem 0xb2500000-0xb25fffff pref]
[    4.641682] pci 0000:0f:00.0: BAR 15: assigned [mem 0xb2600000-0xb27fffff pref]
[    4.649850] pci 0000:10:00.0: BAR 15: assigned [mem 0xb2600000-0xb26fffff pref]
[    4.658016] pci 0000:10:01.0: BAR 15: assigned [mem 0xb2700000-0xb27fffff pref]
[    4.666184] pci 0000:11:00.0: BAR 6: assigned [mem 0xb2600000-0xb260ffff pref]
[    4.674253] pci 0000:10:00.0: PCI bridge to [bus 11]
[    4.679800] pci 0000:10:00.0:   bridge window [io  0x4000-0x4fff]
[    4.686618] pci 0000:10:00.0:   bridge window [mem 0xb1f00000-0xb1ffffff]
[    4.694212] pci 0000:10:00.0:   bridge window [mem 0xb2600000-0xb26fffff pref]
[    4.702299] pci 0000:12:00.0: BAR 6: assigned [mem 0xb2700000-0xb270ffff pref]
[    4.710368] pci 0000:10:01.0: PCI bridge to [bus 12]
[    4.715917] pci 0000:10:01.0:   bridge window [io  0x3000-0x3fff]
[    4.722743] pci 0000:10:01.0:   bridge window [mem 0xb1e00000-0xb1efffff]
[    4.730338] pci 0000:10:01.0:   bridge window [mem 0xb2700000-0xb27fffff pref]
[    4.738426] pci 0000:10:02.0: PCI bridge to [bus 13]
[    4.743987] pci 0000:10:02.0:   bridge window [mem 0xb1d00000-0xb1dfffff]
[    4.751596] pci 0000:10:03.0: PCI bridge to [bus 14]
[    4.757154] pci 0000:10:03.0:   bridge window [mem 0xb1c00000-0xb1cfffff]
[    4.764763] pci 0000:0f:00.0: PCI bridge to [bus 10-14]
[    4.770602] pci 0000:0f:00.0:   bridge window [io  0x3000-0x4fff]
[    4.777428] pci 0000:0f:00.0:   bridge window [mem 0xb1c00000-0xb1ffffff]
[    4.785012] pci 0000:0f:00.0:   bridge window [mem 0xb2600000-0xb27fffff pref]
[    4.793103] pci 0000:0d:01.0: PCI bridge to [bus 0f-14]
[    4.798943] pci 0000:0d:01.0:   bridge window [io  0x3000-0x4fff]
[    4.805764] pci 0000:0d:01.0:   bridge window [mem 0xb1c00000-0xb1ffffff]
[    4.813356] pci 0000:0d:01.0:   bridge window [mem 0xb2600000-0xb27fffff pref]
[    4.821447] pci 0000:0c:00.0: PCI bridge to [bus 0d-14]
[    4.827286] pci 0000:0c:00.0:   bridge window [io  0x3000-0x4fff]
[    4.834097] pci 0000:0c:00.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    4.841684] pci 0000:0c:00.0:   bridge window [mem 0xb2500000-0xb27fffff pref]
[    4.849759] pci 0000:0a:04.0: PCI bridge to [bus 0c-14]
[    4.855598] pci 0000:0a:04.0:   bridge window [io  0x3000-0x4fff]
[    4.862409] pci 0000:0a:04.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    4.869996] pci 0000:0a:04.0:   bridge window [mem 0xb2500000-0xb29fffff pref]
[    4.878070] pci 0000:09:00.0: PCI bridge to [bus 0a-14]
[    4.883909] pci 0000:09:00.0:   bridge window [io  0x3000-0x4fff]
[    4.890711] pci 0000:09:00.0:   bridge window [mem 0xb1c00000-0xb21fffff]
[    4.898297] pci 0000:09:00.0:   bridge window [mem 0xb2500000-0xb2bfffff pref]
[    4.906370] pci 0000:06:08.0: PCI bridge to [bus 09-14]
[    4.912208] pci 0000:06:08.0:   bridge window [io  0x3000-0x4fff]
[    4.919010] pci 0000:06:08.0:   bridge window [mem 0xb1c00000-0xb22fffff]
[    4.926596] pci 0000:06:08.0:   bridge window [mem 0xb2500000-0xb2bfffff pref]
[    4.934668] pci 0000:05:00.0: PCI bridge to [bus 06-14]
[    4.940506] pci 0000:05:00.0:   bridge window [io  0x3000-0x4fff]
[    4.947316] pci 0000:05:00.0:   bridge window [mem 0xb1c00000-0xb22fffff]
[    4.954900] pci 0000:05:00.0:   bridge window [mem 0xb2500000-0xb2dfffff pref]
[    4.962965] pci 0000:00:05.0: PCI bridge to [bus 05-14]
[    4.968803] pci 0000:00:05.0:   bridge window [io  0x3000-0x4fff]
[    4.975604] pci 0000:00:05.0:   bridge window [mem 0xb1c00000-0xb23fffff]
[    4.983189] pci 0000:00:05.0:   bridge window [mem 0xb2500000-0xb2dfffff pref]
[    4.991252] pci 0000:00:06.0: PCI bridge to [bus 15]
[    4.996804] pci 0000:16:00.0: BAR 6: assigned [mem 0xb2e00000-0xb2e3ffff pref]
[    5.004874] pci 0000:16:00.1: BAR 6: assigned [mem 0xb2e40000-0xb2e7ffff pref]
[    5.012936] pci 0000:00:07.0: PCI bridge to [bus 16]
[    5.018483] pci 0000:00:07.0:   bridge window [io  0x2000-0x2fff]
[    5.025293] pci 0000:00:07.0:   bridge window [mem 0xb1b00000-0xb1bfffff]
[    5.032878] pci 0000:00:07.0:   bridge window [mem 0xb2e00000-0xb2efffff pref]
[    5.040950] pci 0000:00:08.0: PCI bridge to [bus 17]
[    5.046501] pci 0000:00:09.0: PCI bridge to [bus 18]
[    5.052055] pci 0000:00:0a.0: PCI bridge to [bus 19]
[    5.057608] pci 0000:1a:00.0: BAR 6: assigned [mem 0xb2f00000-0xb2f1ffff pref]
[    5.065679] pci 0000:1a:00.1: BAR 6: assigned [mem 0xb2f20000-0xb2f3ffff pref]
[    5.073749] pci 0000:00:1c.0: PCI bridge to [bus 1a-1c]
[    5.079587] pci 0000:00:1c.0:   bridge window [io  0x1000-0x1fff]
[    5.086389] pci 0000:00:1c.0:   bridge window [mem 0xb1900000-0xb1afffff]
[    5.093974] pci 0000:00:1c.0:   bridge window [mem 0xb2f00000-0xb31fffff pref]
[    5.102039] pci 0000:1d:00.0: BAR 6: assigned [mem 0xb1810000-0xb181ffff pref]
[    5.110109] pci 0000:00:1c.4: PCI bridge to [bus 1d]
[    5.115657] pci 0000:00:1c.4:   bridge window [io  0x6000-0x6fff]
[    5.122469] pci 0000:00:1c.4:   bridge window [mem 0xb1000000-0xb18fffff]
[    5.130054] pci 0000:00:1c.4:   bridge window [mem 0xb0000000-0xb0ffffff 64bit pref]
[    5.138710] pci 0000:00:1c.5: PCI bridge to [bus 1e]
[    5.144257] pci 0000:00:1c.5:   bridge window [io  0x7000-0x7fff]
[    5.151067] pci 0000:00:1c.5:   bridge window [mem 0xb3200000-0xb33fffff]
[    5.158652] pci 0000:00:1c.5:   bridge window [mem 0xb3400000-0xb35fffff 64bit pref]
[    5.167307] pci 0000:00:1e.0: PCI bridge to [bus 1f]
[    5.179614] pci 0000:00:1e.0: setting latency timer to 64
[    5.185651] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7]
[    5.191878] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff]
[    5.198105] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff]
[    5.205106] pci_bus 0000:00: resource 7 [mem 0xfed40000-0xfedfffff]
[    5.212109] pci_bus 0000:00: resource 8 [mem 0xb0000000-0xfdffffff]
[    5.219114] pci_bus 0000:05: resource 0 [io  0x3000-0x4fff]
[    5.225338] pci_bus 0000:05: resource 1 [mem 0xb1c00000-0xb23fffff]
[    5.232339] pci_bus 0000:05: resource 2 [mem 0xb2500000-0xb2dfffff pref]
[    5.239825] pci_bus 0000:06: resource 0 [io  0x3000-0x4fff]
[    5.246050] pci_bus 0000:06: resource 1 [mem 0xb1c00000-0xb22fffff]
[    5.253052] pci_bus 0000:06: resource 2 [mem 0xb2500000-0xb2dfffff pref]
[    5.260539] pci_bus 0000:08: resource 2 [mem 0xb2c00000-0xb2dfffff 64bit pref]
[    5.268600] pci_bus 0000:09: resource 0 [io  0x3000-0x4fff]
[    5.274827] pci_bus 0000:09: resource 1 [mem 0xb1c00000-0xb22fffff]
[    5.281828] pci_bus 0000:09: resource 2 [mem 0xb2500000-0xb2bfffff pref]
[    5.289315] pci_bus 0000:0a: resource 0 [io  0x3000-0x4fff]
[    5.295541] pci_bus 0000:0a: resource 1 [mem 0xb1c00000-0xb21fffff]
[    5.302544] pci_bus 0000:0a: resource 2 [mem 0xb2500000-0xb2bfffff pref]
[    5.310031] pci_bus 0000:0b: resource 2 [mem 0xb2a00000-0xb2bfffff 64bit pref]
[    5.318102] pci_bus 0000:0c: resource 0 [io  0x3000-0x4fff]
[    5.324326] pci_bus 0000:0c: resource 1 [mem 0xb1c00000-0xb21fffff]
[    5.331329] pci_bus 0000:0c: resource 2 [mem 0xb2500000-0xb29fffff pref]
[    5.338815] pci_bus 0000:0d: resource 0 [io  0x3000-0x4fff]
[    5.345042] pci_bus 0000:0d: resource 1 [mem 0xb1c00000-0xb21fffff]
[    5.352045] pci_bus 0000:0d: resource 2 [mem 0xb2500000-0xb27fffff pref]
[    5.359532] pci_bus 0000:0e: resource 1 [mem 0xb2000000-0xb20fffff]
[    5.366535] pci_bus 0000:0e: resource 2 [mem 0xb2500000-0xb25fffff pref]
[    5.374021] pci_bus 0000:0f: resource 0 [io  0x3000-0x4fff]
[    5.380246] pci_bus 0000:0f: resource 1 [mem 0xb1c00000-0xb1ffffff]
[    5.387249] pci_bus 0000:0f: resource 2 [mem 0xb2600000-0xb27fffff pref]
[    5.394736] pci_bus 0000:10: resource 0 [io  0x3000-0x4fff]
[    5.400962] pci_bus 0000:10: resource 1 [mem 0xb1c00000-0xb1ffffff]
[    5.407956] pci_bus 0000:10: resource 2 [mem 0xb2600000-0xb27fffff pref]
[    5.415435] pci_bus 0000:11: resource 0 [io  0x4000-0x4fff]
[    5.421651] pci_bus 0000:11: resource 1 [mem 0xb1f00000-0xb1ffffff]
[    5.428653] pci_bus 0000:11: resource 2 [mem 0xb2600000-0xb26fffff pref]
[    5.436141] pci_bus 0000:12: resource 0 [io  0x3000-0x3fff]
[    5.442367] pci_bus 0000:12: resource 1 [mem 0xb1e00000-0xb1efffff]
[    5.449370] pci_bus 0000:12: resource 2 [mem 0xb2700000-0xb27fffff pref]
[    5.456857] pci_bus 0000:13: resource 1 [mem 0xb1d00000-0xb1dfffff]
[    5.463859] pci_bus 0000:14: resource 1 [mem 0xb1c00000-0xb1cfffff]
[    5.470853] pci_bus 0000:16: resource 0 [io  0x2000-0x2fff]
[    5.477078] pci_bus 0000:16: resource 1 [mem 0xb1b00000-0xb1bfffff]
[    5.484079] pci_bus 0000:16: resource 2 [mem 0xb2e00000-0xb2efffff pref]
[    5.491566] pci_bus 0000:1a: resource 0 [io  0x1000-0x1fff]
[    5.497783] pci_bus 0000:1a: resource 1 [mem 0xb1900000-0xb1afffff]
[    5.504776] pci_bus 0000:1a: resource 2 [mem 0xb2f00000-0xb31fffff pref]
[    5.512263] pci_bus 0000:1d: resource 0 [io  0x6000-0x6fff]
[    5.518480] pci_bus 0000:1d: resource 1 [mem 0xb1000000-0xb18fffff]
[    5.525481] pci_bus 0000:1d: resource 2 [mem 0xb0000000-0xb0ffffff 64bit pref]
[    5.533550] pci_bus 0000:1e: resource 0 [io  0x7000-0x7fff]
[    5.539777] pci_bus 0000:1e: resource 1 [mem 0xb3200000-0xb33fffff]
[    5.546778] pci_bus 0000:1e: resource 2 [mem 0xb3400000-0xb35fffff 64bit pref]
[    5.554849] pci_bus 0000:1f: resource 4 [io  0x0000-0x0cf7]
[    5.561075] pci_bus 0000:1f: resource 5 [io  0x0d00-0xffff]
[    5.567300] pci_bus 0000:1f: resource 6 [mem 0x000a0000-0x000bffff]
[    5.574301] pci_bus 0000:1f: resource 7 [mem 0xfed40000-0xfedfffff]
[    5.581302] pci_bus 0000:1f: resource 8 [mem 0xb0000000-0xfdffffff]
[    5.588306] pci_bus 0000:fe: resource 4 [io  0x0000-0xffff]
[    5.594533] pci_bus 0000:fe: resource 5 [mem 0x00000000-0xffffffffff]
[    5.601730] pci_bus 0000:ff: resource 4 [io  0x0000-0xffff]
[    5.607956] pci_bus 0000:ff: resource 5 [mem 0x00000000-0xffffffffff]
[    5.615218] NET: Registered protocol family 2
[    5.620472] TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
[    5.629206] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[    5.636879] TCP: Hash tables configured (established 131072 bind 65536)
[    5.644305] TCP: reno registered
[    5.647943] UDP hash table entries: 8192 (order: 6, 262144 bytes)
[    5.654826] UDP-Lite hash table entries: 8192 (order: 6, 262144 bytes)
[    5.662298] NET: Registered protocol family 1
[    5.669299] pci 0000:1d:00.0: Boot video device
[    5.674414] PCI: CLS 64 bytes, default 64
[    5.678930] Unpacking initramfs...
[    5.950385] Freeing initrd memory: 14412k freed
[    5.959081] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    5.966287] software IO TLB [mem 0x87d5d000-0x8bd5d000] (64MB) mapped at [ffff880087d5d000-ffff88008bd5cfff]
[    5.981757] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[    5.989377] audit: initializing netlink socket (disabled)
[    5.995432] type=2000 audit(1373984231.242:1): initialized
[    6.037818] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    6.046864] VFS: Disk quotas dquot_6.5.2
[    6.051307] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    6.058995] msgmni has been set to 31955
[    6.063460] SELinux:  Registering netfilter hooks
[    6.069889] alg: No test for stdrng (krng)
[    6.074477] NET: Registered protocol family 38
[    6.079482] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    6.087816] io scheduler noop registered
[    6.092200] io scheduler deadline registered
[    6.097005] io scheduler cfq registered (default)
[    6.102528] pcieport 0000:00:01.0: irq 65 for MSI/MSI-X
[    6.108444] pcieport 0000:00:02.0: irq 66 for MSI/MSI-X
[    6.114346] pcieport 0000:00:03.0: irq 67 for MSI/MSI-X
[    6.120260] pcieport 0000:00:04.0: irq 68 for MSI/MSI-X
[    6.126168] pcieport 0000:00:05.0: irq 69 for MSI/MSI-X
[    6.132078] pcieport 0000:00:06.0: irq 70 for MSI/MSI-X
[    6.137999] pcieport 0000:00:07.0: irq 71 for MSI/MSI-X
[    6.143908] pcieport 0000:00:08.0: irq 72 for MSI/MSI-X
[    6.149816] pcieport 0000:00:09.0: irq 73 for MSI/MSI-X
[    6.155725] pcieport 0000:00:0a.0: irq 74 for MSI/MSI-X
[    6.161653] pcieport 0000:00:1c.0: irq 75 for MSI/MSI-X
[    6.167584] pcieport 0000:00:1c.4: irq 76 for MSI/MSI-X
[    6.173509] pcieport 0000:00:1c.5: irq 77 for MSI/MSI-X
[    6.179428] pcieport 0000:05:00.0: irq 78 for MSI/MSI-X
[    6.185345] pcieport 0000:06:04.0: irq 79 for MSI/MSI-X
[    6.191264] pcieport 0000:06:05.0: irq 80 for MSI/MSI-X
[    6.197196] pcieport 0000:06:08.0: irq 81 for MSI/MSI-X
[    6.203127] pcieport 0000:09:00.0: irq 82 for MSI/MSI-X
[    6.209069] pcieport 0000:0a:01.0: irq 83 for MSI/MSI-X
[    6.215014] pcieport 0000:0a:04.0: irq 84 for MSI/MSI-X
[    6.220955] pcieport 0000:0c:00.0: irq 85 for MSI/MSI-X
[    6.227250] aer 0000:00:01.0:pcie02: service driver aer loaded
[    6.233787] aer 0000:00:02.0:pcie02: service driver aer loaded
[    6.240321] aer 0000:00:03.0:pcie02: service driver aer loaded
[    6.246855] aer 0000:00:04.0:pcie02: service driver aer loaded
[    6.253419] aer 0000:00:05.0:pcie02: service driver aer loaded
[    6.259955] aer 0000:00:06.0:pcie02: service driver aer loaded
[    6.266491] aer 0000:00:07.0:pcie02: service driver aer loaded
[    6.273026] aer 0000:00:08.0:pcie02: service driver aer loaded
[    6.279557] aer 0000:00:09.0:pcie02: service driver aer loaded
[    6.286091] aer 0000:00:0a.0:pcie02: service driver aer loaded
[    6.292622] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[    6.300403] pcie_pme 0000:00:01.0:pcie01: service driver pcie_pme loaded
[    6.307900] pcieport 0000:00:02.0: Signaling PME through PCIe PME interrupt
[    6.315680] pcie_pme 0000:00:02.0:pcie01: service driver pcie_pme loaded
[    6.323173] pcieport 0000:00:03.0: Signaling PME through PCIe PME interrupt
[    6.330951] pcie_pme 0000:00:03.0:pcie01: service driver pcie_pme loaded
[    6.338442] pcieport 0000:00:04.0: Signaling PME through PCIe PME interrupt
[    6.346223] pcie_pme 0000:00:04.0:pcie01: service driver pcie_pme loaded
[    6.353720] pcieport 0000:00:05.0: Signaling PME through PCIe PME interrupt
[    6.361499] pcieport 0000:05:00.0: Signaling PME through PCIe PME interrupt
[    6.369269] pcieport 0000:06:04.0: Signaling PME through PCIe PME interrupt
[    6.377038] pcieport 0000:06:05.0: Signaling PME through PCIe PME interrupt
[    6.384816] pcieport 0000:06:08.0: Signaling PME through PCIe PME interrupt
[    6.392585] pcieport 0000:09:00.0: Signaling PME through PCIe PME interrupt
[    6.400362] pcieport 0000:0a:01.0: Signaling PME through PCIe PME interrupt
[    6.408139] pcieport 0000:0a:04.0: Signaling PME through PCIe PME interrupt
[    6.415918] pcieport 0000:0c:00.0: Signaling PME through PCIe PME interrupt
[    6.423696] pcieport 0000:0d:00.0: Signaling PME through PCIe PME interrupt
[    6.431466] pci 0000:0e:00.0: Signaling PME through PCIe PME interrupt
[    6.438758] pcieport 0000:0d:01.0: Signaling PME through PCIe PME interrupt
[    6.446534] pcieport 0000:0f:00.0: Signaling PME through PCIe PME interrupt
[    6.454311] pcieport 0000:10:00.0: Signaling PME through PCIe PME interrupt
[    6.462087] pci 0000:11:00.0: Signaling PME through PCIe PME interrupt
[    6.469379] pcieport 0000:10:01.0: Signaling PME through PCIe PME interrupt
[    6.477157] pci 0000:12:00.0: Signaling PME through PCIe PME interrupt
[    6.484448] pcieport 0000:10:02.0: Signaling PME through PCIe PME interrupt
[    6.492226] pci 0000:13:00.0: Signaling PME through PCIe PME interrupt
[    6.499518] pcieport 0000:10:03.0: Signaling PME through PCIe PME interrupt
[    6.507294] pci 0000:14:00.0: Signaling PME through PCIe PME interrupt
[    6.514588] pcie_pme 0000:00:05.0:pcie01: service driver pcie_pme loaded
[    6.522071] pcieport 0000:00:06.0: Signaling PME through PCIe PME interrupt
[    6.529850] pcie_pme 0000:00:06.0:pcie01: service driver pcie_pme loaded
[    6.537332] pcieport 0000:00:07.0: Signaling PME through PCIe PME interrupt
[    6.545110] pci 0000:16:00.0: Signaling PME through PCIe PME interrupt
[    6.552404] pci 0000:16:00.1: Signaling PME through PCIe PME interrupt
[    6.559690] pcie_pme 0000:00:07.0:pcie01: service driver pcie_pme loaded
[    6.567181] pcieport 0000:00:08.0: Signaling PME through PCIe PME interrupt
[    6.574963] pcie_pme 0000:00:08.0:pcie01: service driver pcie_pme loaded
[    6.582455] pcieport 0000:00:09.0: Signaling PME through PCIe PME interrupt
[    6.590236] pcie_pme 0000:00:09.0:pcie01: service driver pcie_pme loaded
[    6.597728] pcieport 0000:00:0a.0: Signaling PME through PCIe PME interrupt
[    6.605507] pcie_pme 0000:00:0a.0:pcie01: service driver pcie_pme loaded
[    6.613016] pcieport 0000:00:1c.0: Signaling PME through PCIe PME interrupt
[    6.620795] pci 0000:1a:00.0: Signaling PME through PCIe PME interrupt
[    6.628080] pci 0000:1a:00.1: Signaling PME through PCIe PME interrupt
[    6.635366] pcie_pme 0000:00:1c.0:pcie01: service driver pcie_pme loaded
[    6.642865] pcieport 0000:00:1c.4: Signaling PME through PCIe PME interrupt
[    6.650645] pci 0000:1d:00.0: Signaling PME through PCIe PME interrupt
[    6.657940] pcie_pme 0000:00:1c.4:pcie01: service driver pcie_pme loaded
[    6.665440] pcieport 0000:00:1c.5: Signaling PME through PCIe PME interrupt
[    6.673219] pcie_pme 0000:00:1c.5:pcie01: service driver pcie_pme loaded
[    6.680731] ioapic: probe of 0000:00:13.0 failed with error -22
[    6.687356] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    6.693638] pciehp 0000:00:1c.0:pcie04: HPC vendor_id 8086 device_id 3a40 ss_vid 1137 ss_did 101
[    6.703504] pciehp 0000:00:1c.0:pcie04: service driver pciehp loaded
[    6.710615] pciehp 0000:00:1c.4:pcie04: HPC vendor_id 8086 device_id 3a48 ss_vid 1137 ss_did 101
[    6.720481] pciehp 0000:00:1c.4:pcie04: service driver pciehp loaded
[    6.727591] pciehp 0000:00:1c.5:pcie04: HPC vendor_id 8086 device_id 3a4a ss_vid 1137 ss_did 101
[    6.737449] pciehp 0000:00:1c.5:pcie04: service driver pciehp loaded
[    6.744561] pciehp 0000:06:05.0:pcie24: HPC vendor_id 10b5 device_id 8624 ss_vid 10b5 ss_did 8624
[    6.754545] pciehp 0000:06:05.0:pcie24: service driver pciehp loaded
[    6.761668] pciehp 0000:06:08.0:pcie24: HPC vendor_id 10b5 device_id 8624 ss_vid 10b5 ss_did 8624
[    6.771649] pciehp 0000:06:08.0:pcie24: service driver pciehp loaded
[    6.778764] pciehp 0000:0a:01.0:pcie24: HPC vendor_id 10b5 device_id 8632 ss_vid 10b5 ss_did 8632
[    6.788736] pciehp 0000:0a:01.0:pcie24: service driver pciehp loaded
[    6.795850] pciehp 0000:0a:04.0:pcie24: HPC vendor_id 10b5 device_id 8632 ss_vid 10b5 ss_did 8632
[    6.805823] pciehp 0000:0a:04.0:pcie24: service driver pciehp loaded
[    6.812936] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[    6.820407] intel_idle: MWAIT substates: 0x1120
[    6.825470] intel_idle: v0.4 model 0x2C
[    6.829746] intel_idle: lapic_timer_reliable_states 0xffffffff
[    6.836414] ACPI Error: No handler for Region [POWS] (ffff88046484cb88) [IPMI] (20130328/evregion-161)
[    6.846843] ACPI Error: Region IPMI (ID=7) has no handler (20130328/exfldio-305)
[    6.855133] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPC0.P111._PSR] (Node ffff88046483d5f0), AE_NOT_EXIST (20130328/psparse-537)
[    6.869466] ACPI Exception: AE_NOT_EXIST, Error reading AC Adapter state (20130328/ac-126)
[    6.878755] ACPI Error: No handler for Region [POWS] (ffff88046484cbd0) [IPMI] (20130328/evregion-161)
[    6.889183] ACPI Error: Region IPMI (ID=7) has no handler (20130328/exfldio-305)
[    6.897477] ACPI Error: Method parse/execution failed [\_SB_.PCI0.LPC0.P112._PSR] (Node ffff88046483d7a8), AE_NOT_EXIST (20130328/psparse-537)
[    6.911816] ACPI Exception: AE_NOT_EXIST, Error reading AC Adapter state (20130328/ac-126)
[    6.921128] input: Sleep Button as /devices/LNXSYSTM:00/device:00/PNP0C0E:00/input/input0
[    6.930275] ACPI: Sleep Button [SLPB]
[    6.934403] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[    6.942667] ACPI: Power Button [PWRF]
[    6.946795] ACPI: Requesting acpi_cpufreq
[    6.955727] ERST: Error Record Serialization Table (ERST) support is initialized.
[    6.964304] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[    6.972636] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    6.980307] tsc: Refined TSC clocksource calibration: 2400.084 MHz
[    7.000267] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[    7.013378] Switching to clocksource tsc
[    7.028126] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[    7.035260] Non-volatile memory driver v1.3
[    7.039934] Linux agpgart interface v0.103
[    7.045560] loop: module loaded
[    7.049227] ata_piix 0000:00:1f.2: version 2.13
[    7.054453] ata_piix 0000:00:1f.2: MAP [
[    7.058834]  P0 P2 P1 P3 ]
[    7.061937] ata_piix 0000:00:1f.2: setting latency timer to 64
[    7.068672] scsi0 : ata_piix
[    7.072204] scsi1 : ata_piix
[    7.075654] ata1: SATA max UDMA/133 cmd 0x5138 ctl 0x514c bmdma 0x5110 irq 18
[    7.083629] ata2: SATA max UDMA/133 cmd 0x5130 ctl 0x5148 bmdma 0x5118 irq 18
[    7.091733] ata_piix 0000:00:1f.5: MAP [
[    7.096116]  P0 -- P1 -- ]
[    7.249526] ata_piix 0000:00:1f.5: SCR access via SIDPR is available but doesn't work
[    7.258281] ata_piix 0000:00:1f.5: setting latency timer to 64
[    7.264981] scsi2 : ata_piix
[    7.268387] scsi3 : ata_piix
[    7.271797] ata3: SATA max UDMA/133 cmd 0x5128 ctl 0x5144 bmdma 0x50f0 irq 21
[    7.279770] ata4: SATA max UDMA/133 cmd 0x5120 ctl 0x5140 bmdma 0x50f8 irq 21
[    7.287792] libphy: Fixed MDIO Bus: probed
[    7.292447] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    7.299741] ehci-pci: EHCI PCI platform driver
[    7.304844] ehci-pci 0000:00:1a.7: setting latency timer to 64
[    7.311376] ehci-pci 0000:00:1a.7: EHCI Host Controller
[    7.317250] ehci-pci 0000:00:1a.7: new USB bus registered, assigned bus number 1
[    7.325524] ehci-pci 0000:00:1a.7: debug port 1
[    7.334483] ehci-pci 0000:00:1a.7: cache line size of 64 is not supported
[    7.342076] ehci-pci 0000:00:1a.7: irq 19, io mem 0xb2421000
[    7.353598] ehci-pci 0000:00:1a.7: USB 2.0 started, EHCI 1.00
[    7.360039] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[    7.367621] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.375688] usb usb1: Product: EHCI Host Controller
[    7.381136] usb usb1: Manufacturer: Linux 3.10.0-rc5.nab+ ehci_hcd
[    7.388038] usb usb1: SerialNumber: 0000:00:1a.7
[    7.393283] hub 1-0:1.0: USB hub found
[    7.397467] hub 1-0:1.0: 6 ports detected
[    7.402194] ehci-pci 0000:00:1d.7: setting latency timer to 64
[    7.408725] ehci-pci 0000:00:1d.7: EHCI Host Controller
[    7.414600] ehci-pci 0000:00:1d.7: new USB bus registered, assigned bus number 2
[    7.422875] ehci-pci 0000:00:1d.7: debug port 1
[    7.431821] ehci-pci 0000:00:1d.7: cache line size of 64 is not supported
[    7.439416] ehci-pci 0000:00:1d.7: irq 16, io mem 0xb2420000
[    7.451678] ehci-pci 0000:00:1d.7: USB 2.0 started, EHCI 1.00
[    7.458116] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[    7.465702] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.467885] ata4.00: ATAPI: Optiarc DVD RW AD-7580S, FX04, max UDMA/100
[    7.481153] usb usb2: Product: EHCI Host Controller
[    7.486599] usb usb2: Manufacturer: Linux 3.10.0-rc5.nab+ ehci_hcd
[    7.489900] ata4.00: configured for UDMA/100
[    7.498270] usb usb2: SerialNumber: 0000:00:1d.7
[    7.503525] hub 2-0:1.0: USB hub found
[    7.507719] hub 2-0:1.0: 6 ports detected
[    7.512308] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    7.519224] uhci_hcd: USB Universal Host Controller Interface driver
[    7.526451] uhci_hcd 0000:00:1a.0: setting latency timer to 64
[    7.532971] uhci_hcd 0000:00:1a.0: UHCI Host Controller
[    7.538845] uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3
[    7.547131] uhci_hcd 0000:00:1a.0: irq 19, io base 0x000050c0
[    7.553580] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
[    7.561164] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.569230] usb usb3: Product: UHCI Host Controller
[    7.574675] usb usb3: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[    7.581579] usb usb3: SerialNumber: 0000:00:1a.0
[    7.586811] hub 3-0:1.0: USB hub found
[    7.591001] hub 3-0:1.0: 2 ports detected
[    7.595669] uhci_hcd 0000:00:1a.1: setting latency timer to 64
[    7.602189] uhci_hcd 0000:00:1a.1: UHCI Host Controller
[    7.608073] uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4
[    7.616382] uhci_hcd 0000:00:1a.1: irq 19, io base 0x000050a0
[    7.622827] usb usb4: New USB device found, idVendor=1d6b, idProduct=0001
[    7.630409] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.638475] usb usb4: Product: UHCI Host Controller
[    7.643925] usb usb4: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[    7.650819] usb usb4: SerialNumber: 0000:00:1a.1
[    7.656051] hub 4-0:1.0: USB hub found
[    7.660241] hub 4-0:1.0: 2 ports detected
[    7.664907] uhci_hcd 0000:00:1a.2: setting latency timer to 64
[    7.671424] uhci_hcd 0000:00:1a.2: UHCI Host Controller
[    7.677293] uhci_hcd 0000:00:1a.2: new USB bus registered, assigned bus number 5
[    7.685578] uhci_hcd 0000:00:1a.2: irq 19, io base 0x00005080
[    7.692021] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001
[    7.699606] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.707687] usb usb5: Product: UHCI Host Controller
[    7.713134] usb usb5: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[    7.720038] usb usb5: SerialNumber: 0000:00:1a.2
[    7.725308] hub 5-0:1.0: USB hub found
[    7.729516] hub 5-0:1.0: 2 ports detected
[    7.734226] uhci_hcd 0000:00:1d.0: setting latency timer to 64
[    7.740747] uhci_hcd 0000:00:1d.0: UHCI Host Controller
[    7.746623] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 6
[    7.754906] uhci_hcd 0000:00:1d.0: irq 16, io base 0x00005060
[    7.761355] usb usb6: New USB device found, idVendor=1d6b, idProduct=0001
[    7.768937] usb usb6: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.777005] usb usb6: Product: UHCI Host Controller
[    7.782453] usb usb6: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[    7.789354] usb usb6: SerialNumber: 0000:00:1d.0
[    7.794585] hub 6-0:1.0: USB hub found
[    7.798775] hub 6-0:1.0: 2 ports detected
[    7.803440] uhci_hcd 0000:00:1d.1: setting latency timer to 64
[    7.809958] uhci_hcd 0000:00:1d.1: UHCI Host Controller
[    7.815828] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 7
[    7.824112] uhci_hcd 0000:00:1d.1: irq 16, io base 0x00005040
[    7.830555] usb usb7: New USB device found, idVendor=1d6b, idProduct=0001
[    7.838140] usb usb7: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    7.846207] usb usb7: Product: UHCI Host Controller
[    7.851654] usb usb7: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[    7.853087] ata2.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.853104] ata2.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.859545] ata2.00: ATA-8: ST9500530NS, CC04, max UDMA/133
[    7.859547] ata2.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    7.859888] ata2.01: ATA-8: ST9500530NS, CC04, max UDMA/133
[    7.859890] ata2.01: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    7.866571] ata2.00: configured for UDMA/133
[    7.872535] ata2.01: configured for UDMA/133
[    7.909880] usb usb7: SerialNumber: 0000:00:1d.1
[    7.909965] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.909981] ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.915578] ata1.00: ATA-8: ST9500530NS, CC03, max UDMA/133
[    7.915579] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    7.915919] ata1.01: ATA-8: ST9500530NS, CC04, max UDMA/133
[    7.915920] ata1.01: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    7.921573] ata1.00: configured for UDMA/133
[    7.927581] ata1.01: configured for UDMA/133
[    7.927596] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0
[    7.927639] blk-mq: CPU -> queue map
[    7.927640]   CPU 0 -> Queue 0
[    7.927640]   CPU 1 -> Queue 0
[    7.927640]   CPU 2 -> Queue 0
[    7.927641]   CPU 3 -> Queue 0
[    7.927641]   CPU 4 -> Queue 0
[    7.927642]   CPU 5 -> Queue 0
[    7.927642]   CPU 6 -> Queue 0
[    7.927643]   CPU 7 -> Queue 0
[    7.927643]   CPU 8 -> Queue 0
[    7.927643]   CPU 9 -> Queue 0
[    7.927644]   CPU10 -> Queue 0
[    7.927644]   CPU11 -> Queue 0
[    7.927645]   CPU12 -> Queue 0
[    7.927645]   CPU13 -> Queue 0
[    7.927646]   CPU14 -> Queue 0
[    7.927646]   CPU15 -> Queue 0
[    7.927673] Performing sc map setup on q: ffff880462430000 hctx: ffff880462a14200 i: 0
[    7.927780] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.927784] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927785] Allocated blk-mq req: ffff88046244ad40, req->tag: 63
[    7.927790] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927803] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927815] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927816] Allocated blk-mq req: ffff88046244ad40, req->tag: 63
[    7.927817] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
[    7.927944] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927946] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927946] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927949] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927951] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927952] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927955] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927958] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
[    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
[    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
[    7.927966] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927967] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927968] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927970] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927970] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927971] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927972] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927973] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927975] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927976] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927977] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927979] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927981] sd 0:0:0:0: [sda] Write Protect is off
[    7.927982] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[    7.927983] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927984] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927985] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    7.927986] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927987] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927989] sd 0:0:0:0: [sda] Asking for cache data failed
[    7.927990] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    7.927995] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0
[    7.928030] blk-mq: CPU -> queue map
[    7.928030]   CPU 0 -> Queue 0
[    7.928031]   CPU 1 -> Queue 0
[    7.928031]   CPU 2 -> Queue 0
[    7.928032]   CPU 3 -> Queue 0
[    7.928032]   CPU 4 -> Queue 0
[    7.928032]   CPU 5 -> Queue 0
[    7.928033]   CPU 6 -> Queue 0
[    7.928033]   CPU 7 -> Queue 0
[    7.928034]   CPU 8 -> Queue 0
[    7.928034]   CPU 9 -> Queue 0
[    7.928035]   CPU10 -> Queue 0
[    7.928035]   CPU11 -> Queue 0
[    7.928035]   CPU12 -> Queue 0
[    7.928036]   CPU13 -> Queue 0
[    7.928036]   CPU14 -> Queue 0
[    7.928037]   CPU15 -> Queue 0
[    7.928055] Performing sc map setup on q: ffff8804624f0000 hctx: ffff880462938400 i: 0
[    7.928087] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928088] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928089] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928092] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928093] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928094] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928096] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928097] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928099] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
[    7.928100] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928101] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928102] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928103] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928104] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928105] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928106] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928107] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928108] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928109] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928110] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928112] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928113] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928113] Allocated blk-mq req: ffff88046244a240, req->tag: 59
[    7.928114] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928116] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928117] sd 0:0:0:0: [sda] Asking for cache data failed
[    7.928118] sd 0:0:0:0: [sda] Assuming drive cache: write through
[    7.928174] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.928177] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928178] Allocated blk-mq req: ffff88046250ad40, req->tag: 63
[    7.928180] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928232] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928237] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928238] Allocated blk-mq req: ffff88046250ad40, req->tag: 63
[    7.928238] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928240] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928246] scsi 0:0:1:0: Direct-Access     ATA      ST9500530NS      CC04 PQ: 0 ANSI: 5
[    7.928336] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928337] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928338] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928340] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928341] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928342] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928344] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928346] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928347] sd 0:0:1:0: [sdb] Sector size 0 reported, assuming 512.
[    7.928349] sd 0:0:1:0: [sdb] 1 512-byte logical blocks: (512 B/512 B)
[    7.928351] sd 0:0:1:0: Attached scsi generic sg1 type 0
[    7.928351] sd 0:0:1:0: [sdb] 0-byte physical blocks
[    7.928352] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928353] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928354] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928356] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928356] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928357] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928362] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928364] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928366] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0
[    7.928366] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928367] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928368] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928369] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928371] sd 0:0:1:0: [sdb] Write Protect is off
[    7.928372] sd 0:0:1:0: [sdb] Mode Sense: 00 00 00 00
[    7.928373] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928374] Allocated blk-mq req: ffff88046250a7c0, req->tag: 61
[    7.928375] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928376] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928377] sd 0:0:1:0: [sdb] Asking for cache data failed
[    7.928378] sd 0:0:1:0: [sdb] Assuming drive cache: write through
[    7.928394] blk-mq: CPU -> queue map
[    7.928394]   CPU 0 -> Queue 0
[    7.928395]   CPU 1 -> Queue 0
[    7.928395]   CPU 2 -> Queue 0
[    7.928396]   CPU 3 -> Queue 0
[    7.928396]   CPU 4 -> Queue 0
[    7.928396]   CPU 5 -> Queue 0
[    7.928397]   CPU 6 -> Queue 0
[    7.928397]   CPU 7 -> Queue 0
[    7.928398]   CPU 8 -> Queue 0
[    7.928398]   CPU 9 -> Queue 0
[    7.928399]   CPU10 -> Queue 0
[    7.928399]   CPU11 -> Queue 0
[    7.928399]   CPU12 -> Queue 0
[    7.928400]   CPU13 -> Queue 0
[    7.928400]   CPU14 -> Queue 0
[    7.928401]   CPU15 -> Queue 0
[    7.928420] Performing sc map setup on q: ffff8804624f08e0 hctx: ffff880462938a00 i: 0
[    7.928435] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928436] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928437] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928439] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928441] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928441] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928443] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928444] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928446] sd 0:0:1:0: [sdb] Sector size 0 reported, assuming 512.
[    7.928448] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928448] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928450] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928451] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928452] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928452] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928457] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928458] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928459] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928460] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928461] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928462] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928463] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928464] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928465] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928466] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928468] sd 0:0:1:0: [sdb] Asking for cache data failed
[    7.928469] sd 0:0:1:0: [sdb] Assuming drive cache: write through
[    7.928487] ------------[ cut here ]------------
[    7.928492] WARNING: at drivers/ata/libata-core.c:5038 ata_qc_issue+0x266/0x380()
[    7.928494] Modules linked in:
[    7.928496] CPU: 9 PID: 153 Comm: kworker/u50:5 Not tainted 3.10.0-rc5.nab+ #11
[    7.928497] Hardware name: Cisco Systems Inc R210-2121605W/R210-2121605W, BIOS C200.1.4.3j.0.020720132258 02/07/2013
[    7.928502] Workqueue: events_unbound async_run_entry_fn
[    7.928505]  0000000000000009 ffff88046241f588 ffffffff8162cc58 ffff88046241f5c8
[    7.928507]  ffffffff8104a200 ffff88046241f5d8 ffff880462b90000 0000000000000003
[    7.928509]  ffff880462b91c68 ffffffff81415450 ffff880462b90230 ffff88046241f5d8
[    7.928510] Call Trace:
[    7.928514]  [<ffffffff8162cc58>] dump_stack+0x19/0x1b
[    7.928518]  [<ffffffff8104a200>] warn_slowpath_common+0x70/0xa0
[    7.928521]  [<ffffffff81415450>] ? ata_scsi_set_sense.constprop.26+0x30/0x30
[    7.928523]  [<ffffffff8104a24a>] warn_slowpath_null+0x1a/0x20
[    7.928525]  [<ffffffff8140f586>] ata_qc_issue+0x266/0x380
[    7.928526]  [<ffffffff814155b3>] ? ata_scsi_rw_xlat+0x163/0x210
[    7.928528]  [<ffffffff81415450>] ? ata_scsi_set_sense.constprop.26+0x30/0x30
[    7.928530]  [<ffffffff814140d7>] ata_scsi_translate+0xa7/0x180
[    7.928531] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.928533]  [<ffffffff814181f9>] ata_scsi_queuecmd+0xa9/0x2b0
[    7.928534] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928537]  [<ffffffff813ed956>] scsi_dispatch_cmd+0x1c6/0x310
[    7.928537] Allocated blk-mq req: ffff88046259ad40, req->tag: 63
[    7.928540]  [<ffffffff813f63bb>] scsi_mq_queue_rq+0x17b/0x280
[    7.928541] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928546]  [<ffffffff812c15c5>] __blk_mq_run_hw_queue+0x1b5/0x3a0
[    7.928549]  [<ffffffff812c1c65>] blk_mq_run_hw_queue+0x35/0x40
[    7.928550]  [<ffffffff812c1fbb>] blk_mq_make_request+0x34b/0x4a0
[    7.928554]  [<ffffffff812b74f2>] generic_make_request+0xc2/0x110
[    7.928556]  [<ffffffff812b7a1b>] submit_bio+0x7b/0x160
[    7.928560]  [<ffffffff811baa8d>] ? bio_alloc_bioset+0x9d/0x1b0
[    7.928562]  [<ffffffff811b576e>] _submit_bh+0x13e/0x200
[    7.928564]  [<ffffffff811b5840>] submit_bh+0x10/0x20
[    7.928566]  [<ffffffff811b754d>] block_read_full_page+0x21d/0x350
[    7.928568]  [<ffffffff811bbff0>] ? I_BDEV+0x10/0x10
[    7.928571]  [<ffffffff8113be73>] ? __inc_zone_page_state+0x33/0x40
[    7.928573]  [<ffffffff8111f4bf>] ? add_to_page_cache_locked+0xdf/0x190
[    7.928575]  [<ffffffff811bc4a0>] ? blkdev_write_begin+0x30/0x30
[    7.928577]  [<ffffffff811bc4b8>] blkdev_readpage+0x18/0x20
[    7.928579]  [<ffffffff8111fcfa>] do_read_cache_page+0x7a/0x170
[    7.928581]  [<ffffffff8112837a>] ? __alloc_pages_nodemask+0x17a/0xad0
[    7.928583]  [<ffffffff8111fe09>] read_cache_page_async+0x19/0x20
[    7.928585]  [<ffffffff8112015e>] read_cache_page+0xe/0x20
[    7.928588]  [<ffffffff812c80cd>] read_dev_sector+0x2d/0x90
[    7.928590]  [<ffffffff812cdc4c>] read_lba+0xec/0x190
[    7.928592]  [<ffffffff812ce255>] ? efi_partition+0xe5/0x5f0
[    7.928594]  [<ffffffff812ce26f>] efi_partition+0xff/0x5f0
[    7.928596]  [<ffffffff812e9c34>] ? snprintf+0x34/0x40
[    7.928598]  [<ffffffff812ce170>] ? is_gpt_valid+0x480/0x480
[    7.928600]  [<ffffffff812c9138>] check_partition+0x108/0x220
[    7.928602]  [<ffffffff812c8d44>] rescan_partitions+0xb4/0x2c0
[    7.928604]  [<ffffffff811bda35>] __blkdev_get+0x375/0x4b0
[    7.928606]  [<ffffffff8119e447>] ? inode_init_always+0x107/0x1c0
[    7.928608]  [<ffffffff811bc010>] ? blkdev_get_block+0x20/0x20
[    7.928610]  [<ffffffff811bdd05>] blkdev_get+0x195/0x2e0
[    7.928612]  [<ffffffff8119f0c7>] ? unlock_new_inode+0x47/0x70
[    7.928613]  [<ffffffff811bcff0>] ? bdget+0x120/0x140
[    7.928615]  [<ffffffff812c6531>] add_disk+0x391/0x490
[    7.928618]  [<ffffffff81402c4a>] sd_probe_async+0x13a/0x230
[    7.928620]  [<ffffffff81075de6>] async_run_entry_fn+0x46/0x140
[    7.928623]  [<ffffffff810683c4>] process_one_work+0x174/0x400
[    7.928624]  [<ffffffff81068acc>] worker_thread+0x11c/0x370
[    7.928626]  [<ffffffff810689b0>] ? rescuer_thread+0x320/0x320
[    7.928629]  [<ffffffff8106f410>] kthread+0xc0/0xd0
[    7.928631]  [<ffffffff8106f350>] ? flush_kthread_worker+0x80/0x80
[    7.928633]  [<ffffffff8163b2dc>] ret_from_fork+0x7c/0xb0
[    7.928635]  [<ffffffff8106f350>] ? flush_kthread_worker+0x80/0x80
[    7.928638] ---[ end trace 9f4b3fe3fb787a07 ]---
[    7.931709] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931715] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931715] Allocated blk-mq req: ffff88046259ad40, req->tag: 63
[    7.931716] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931718] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931724] scsi 1:0:0:0: Direct-Access     ATA      ST9500530NS      CC04 PQ: 0 ANSI: 5
[    7.931786] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931787] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931788] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931791] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931792] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931792] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931795] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931796] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931798] sd 1:0:0:0: [sdc] Sector size 0 reported, assuming 512.
[    7.931800] sd 1:0:0:0: [sdc] 1 512-byte logical blocks: (512 B/512 B)
[    7.931801] sd 1:0:0:0: [sdc] 0-byte physical blocks
[    7.931802] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931803] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931804] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931807] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931808] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931808] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931809] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931813] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931814] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931815] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931816] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931821] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931822] sd 1:0:0:0: [sdc] Write Protect is off
[    7.931823] sd 1:0:0:0: [sdc] Mode Sense: 00 00 00 00
[    7.931824] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931825] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931826] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931827] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931829] sd 1:0:0:0: [sdc] Asking for cache data failed
[    7.931830] sd 1:0:0:0: Attached scsi generic sg2 type 0
[    7.931831] sd 1:0:0:0: [sdc] Assuming drive cache: write through
[    7.931837] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0
[    7.931860] blk-mq: CPU -> queue map
[    7.931860]   CPU 0 -> Queue 0
[    7.931861]   CPU 1 -> Queue 0
[    7.931861]   CPU 2 -> Queue 0
[    7.931862]   CPU 3 -> Queue 0
[    7.931862]   CPU 4 -> Queue 0
[    7.931863]   CPU 5 -> Queue 0
[    7.931863]   CPU 6 -> Queue 0
[    7.931863]   CPU 7 -> Queue 0
[    7.931864]   CPU 8 -> Queue 0
[    7.931864]   CPU 9 -> Queue 0
[    7.931865]   CPU10 -> Queue 0
[    7.931865]   CPU11 -> Queue 0
[    7.931866]   CPU12 -> Queue 0
[    7.931866]   CPU13 -> Queue 0
[    7.931866]   CPU14 -> Queue 0
[    7.931867]   CPU15 -> Queue 0
[    7.931886] Performing sc map setup on q: ffff8804624308e0 hctx: ffff880462a14600 i: 0
[    7.931895] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931896] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931896] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931899] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931900] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931901] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931902] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931904] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931906] sd 1:0:0:0: [sdc] Sector size 0 reported, assuming 512.
[    7.931907] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931908] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931909] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931910] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931911] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931912] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931913] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931917] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931918] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931919] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931930] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931932] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931933] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.931933] Allocated blk-mq req: ffff88046259a7c0, req->tag: 61
[    7.931934] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.931937] scsi_execute(): Calling blk_mq_free_request >>>
[    7.931938] sd 1:0:0:0: [sdc] Asking for cache data failed
[    7.931940] sd 1:0:0:0: [sdc] Assuming drive cache: write through
[    7.932002] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.932006] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932007] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932008] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932042] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932047] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932048] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932048] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932050] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932056] scsi 1:0:1:0: Direct-Access     ATA      ST9500530NS      CC04 PQ: 0 ANSI: 5
[    7.932160] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932161] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932161] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932168] sd 1:0:1:0: Attached scsi generic sg3 type 0
[    7.932168] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932170] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932170] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932172] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932175] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932177] sd 1:0:1:0: [sdd] Sector size 0 reported, assuming 512.
[    7.932179] sd 1:0:1:0: [sdd] 1 512-byte logical blocks: (512 B/512 B)
[    7.932180] sd 1:0:1:0: [sdd] 0-byte physical blocks
[    7.932181] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932182] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932183] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932184] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932185] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932186] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932187] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932190] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932191] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932192] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932193] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932198] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932200] sd 1:0:1:0: [sdd] Write Protect is off
[    7.932201] sd 1:0:1:0: [sdd] Mode Sense: 00 00 00 00
[    7.932202] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932203] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932204] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932206] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932207] sd 1:0:1:0: [sdd] Asking for cache data failed
[    7.932208] sd 1:0:1:0: [sdd] Assuming drive cache: write through
[    7.932218] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0
[    7.932251] blk-mq: CPU -> queue map
[    7.932252]   CPU 0 -> Queue 0
[    7.932253]   CPU 1 -> Queue 0
[    7.932253]   CPU 2 -> Queue 0
[    7.932254]   CPU 3 -> Queue 0
[    7.932255]   CPU 4 -> Queue 0
[    7.932255]   CPU 5 -> Queue 0
[    7.932256]   CPU 6 -> Queue 0
[    7.932256]   CPU 7 -> Queue 0
[    7.932257]   CPU 8 -> Queue 0
[    7.932258]   CPU 9 -> Queue 0
[    7.932259]   CPU10 -> Queue 0
[    7.932259]   CPU11 -> Queue 0
[    7.932260]   CPU12 -> Queue 0
[    7.932260]   CPU13 -> Queue 0
[    7.932261]   CPU14 -> Queue 0
[    7.932262]   CPU15 -> Queue 0
[    7.932282] Performing sc map setup on q: ffff8804626f8000 hctx: ffff8802628f0200 i: 0
[    7.932286] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932287] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932288] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932293] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932294] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932295] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932296] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932300] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932302] sd 1:0:1:0: [sdd] Sector size 0 reported, assuming 512.
[    7.932303] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932304] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932305] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932308] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932309] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932310] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932311] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932316] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932317] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932318] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932318] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932324] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932325] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932325] Allocated blk-mq req: ffff88046264ad40, req->tag: 63
[    7.932326] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.932331] scsi_execute(): Calling blk_mq_free_request >>>
[    7.932333] sd 1:0:1:0: [sdd] Asking for cache data failed
[    7.932334] sd 1:0:1:0: [sdd] Assuming drive cache: write through
[    7.932394] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.932398] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.932399] Allocated blk-mq req: ffff88046271ad40, req->tag: 63
[    7.932400] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.943187]  sdb: sdb1
[    7.943191] sdb: p1 start 2048 is beyond EOD, enabling native capacity
[    7.943222] scsi_execute(): Calling blk_mq_free_request >>>
[    7.943230] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.943231] Allocated blk-mq req: ffff88046271ad40, req->tag: 63
[    7.943232] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.944750] scsi_execute(): Calling blk_mq_free_request >>>
[    7.944757] scsi 3:0:0:0: CD-ROM            Optiarc  DVD RW AD-7580S  FX04 PQ: 0 ANSI: 5
[    7.944813] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.944814] Allocated blk-mq req: ffff88046271ad40, req->tag: 63
[    7.944815] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.957010]  sdd: sdd1
[    7.957013] sdd: p1 start 2048 is beyond EOD, enabling native capacity
[    7.963106] usb 1-4: new high-speed USB device number 3 using ehci-pci
[    8.218646] usb 1-4: config 1 has no interfaces?
[    8.219146] usb 1-4: New USB device found, idVendor=0624, idProduct=0249
[    8.219148] usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    8.219149] usb 1-4: Product: USB Composite Device-1
[    8.219150] usb 1-4: Manufacturer: Avocent
[    8.426474] usb 2-5: new high-speed USB device number 2 using ehci-pci
[    8.719067] usb 2-5: New USB device found, idVendor=04b4, idProduct=6560
[    8.719069] usb 2-5: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    8.719215] hub 2-5:1.0: USB hub found
[    8.719315] hub 2-5:1.0: 4 ports detected
[    8.957907] usb 4-1: new full-speed USB device number 2 using uhci_hcd
[    9.112050] usb 4-1: New USB device found, idVendor=0624, idProduct=0248
[    9.112051] usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    9.112053] usb 4-1: Product: USB Composite Device-0
[    9.112054] usb 4-1: Manufacturer: Avocent
[    9.112055] usb 4-1: SerialNumber: 20080930
[    9.329209] usb 2-3: new high-speed USB device number 3 using ehci-pci
[    9.470178] usb 2-3: New USB device found, idVendor=14dd, idProduct=0002
[    9.470180] usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    9.470181] usb 2-3: Product: Multidevice
[    9.470182] usb 2-3: Manufacturer: Peppercon AG
[    9.470183] usb 2-3: SerialNumber: 18EB7D234B1BCF34ADCBEA6091EC9869
[   10.986077] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986079] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986082] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986083] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986088] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986090] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986091] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986094] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986097] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986100] sd 1:0:1:0: [sdd] Sector size 0 reported, assuming 512.
[   10.986101] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986102] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986104] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986107] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986110] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986111] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986113] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986115] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986116] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986117] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986118] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986121] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986124] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986124] Allocated blk-mq req: ffff88046264a7c0, req->tag: 61
[   10.986126] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986129] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986132] sd 1:0:1:0: [sdd] Asking for cache data failed
[   10.986133] sd 1:0:1:0: [sdd] Assuming drive cache: write through
[   10.986155] hub 7-0:1.0: USB hub found
[   10.986158] hub 7-0:1.0: 2 ports detected
[   10.986355] uhci_hcd 0000:00:1d.2: setting latency timer to 64
[   10.986358] uhci_hcd 0000:00:1d.2: UHCI Host Controller
[   10.986397] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 8
[   10.986428] uhci_hcd 0000:00:1d.2: irq 16, io base 0x00005020
[   10.986429]  sdd: sdd1
[   10.986432] sdd: p1 start 2048 is beyond EOD, truncated
[   10.986462] usb usb8: New USB device found, idVendor=1d6b, idProduct=0001
[   10.986464] usb usb8: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   10.986465] usb usb8: Product: UHCI Host Controller
[   10.986466] usb usb8: Manufacturer: Linux 3.10.0-rc5.nab+ uhci_hcd
[   10.986467] usb usb8: SerialNumber: 0000:00:1d.2
[   10.986547] hub 8-0:1.0: USB hub found
[   10.986550] hub 8-0:1.0: 2 ports detected
[   10.986570] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986572] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986572] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986576] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986581] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986582] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986582] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986584] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986585] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986586] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986588] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986590] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986593] sd 1:0:1:0: [sdd] Sector size 0 reported, assuming 512.
[   10.986595] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986595] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986598] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986600] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986601] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986602] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986603] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986606] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986607] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986608] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986609] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986612] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986614] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   10.986615] Allocated blk-mq req: ffff88046264a500, req->tag: 60
[   10.986617] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   10.986619] scsi_execute(): Calling blk_mq_free_request >>>
[   10.986620] sd 1:0:1:0: [sdd] Asking for cache data failed
[   10.986622] sd 1:0:1:0: [sdd] Assuming drive cache: write through
[   10.986623] sd 1:0:1:0: [sdd] Attached SCSI disk
[   10.986693] usbcore: registered new interface driver usbserial
[   10.986699] usbcore: registered new interface driver usbserial_generic
[   10.986706] usbserial: USB Serial support registered for generic
[   10.986730] i8042: PNP: No PS/2 controller found. Probing ports directly.
[   11.523406] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.530116] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.538869] scsi_execute(): Calling blk_mq_free_request >>>
[   11.545095] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.551996] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.558705] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.567452] scsi_execute(): Calling blk_mq_free_request >>>
[   11.573677] sd 0:0:1:0: [sdb] Sector size 0 reported, assuming 512.
[   11.580679] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.587580] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.594288] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.603035] scsi_execute(): Calling blk_mq_free_request >>>
[   11.609259] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.616161] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.622871] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.631617] scsi_execute(): Calling blk_mq_free_request >>>
[   11.637840] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.644741] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.651450] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.660198] scsi_execute(): Calling blk_mq_free_request >>>
[   11.666421] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.673324] Allocated blk-mq req: ffff880462509f80, req->tag: 58
[   11.680034] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.688782] scsi_execute(): Calling blk_mq_free_request >>>
[   11.695005] sd 0:0:1:0: [sdb] Asking for cache data failed
[   11.701132] sd 0:0:1:0: [sdb] Assuming drive cache: write through
[   11.708212]  sdb: sdb1
[   11.710851] sdb: p1 start 2048 is beyond EOD, truncated
[   11.716790] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723696] Allocated blk-mq req: ffff880462509cc0, req->tag: 57
[   11.723744] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723745] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723746] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723749] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723750] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723751] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723754] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723755] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723757] sd 0:0:1:0: [sdb] Sector size 0 reported, assuming 512.
[   11.723759] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723760] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723761] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723762] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723763] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723764] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723765] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723766] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723767] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723768] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723769] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723770] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723771] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[   11.723772] Allocated blk-mq req: ffff880462509740, req->tag: 55
[   11.723773] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.723774] scsi_execute(): Calling blk_mq_free_request >>>
[   11.723776] sd 0:0:1:0: [sdb] Asking for cache data failed
[   11.723777] sd 0:0:1:0: [sdb] Assuming drive cache: write through
[   11.723778] sd 0:0:1:0: [sdb] Attached SCSI disk
[   11.926903] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[   11.935652] scsi_execute(): Calling blk_mq_free_request >>>
[   12.033094] i8042: No controller found
[   12.037373] mousedev: PS/2 mouse device common for all mice
[   12.043839] rtc_cmos 00:02: RTC can wake from S4
[   12.049118] rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
[   12.055954] rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
[   12.064566] device-mapper: uevent: version 1.0.3
[   12.069788] device-mapper: ioctl: 4.24.0-ioctl (2013-01-15) initialised: dm-devel@redhat.com
[   12.079541] cpuidle: using governor ladder
[   12.084416] cpuidle: using governor menu
[   12.089357] hidraw: raw HID events driver (C) Jiri Kosina
[   12.100665] input: Avocent USB Composite Device-0 as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1:1.0/input/input2
[   12.112415] hid-generic 0003:0624:0248.0001: input,hidraw0: USB HID v1.00 Keyboard [Avocent USB Composite Device-0] on usb-0000:00:1a.1-1/input0
[   12.130629] input: Avocent USB Composite Device-0 as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1:1.1/input/input3
[   12.142391] hid-generic 0003:0624:0248.0002: input,hidraw1: USB HID v1.00 Mouse [Avocent USB Composite Device-0] on usb-0000:00:1a.1-1/input1
[   12.161647] input: Avocent USB Composite Device-0 as /devices/pci0000:00/0000:00:1a.1/usb4/4-1/4-1:1.2/input/input4
[   12.173406] hid-generic 0003:0624:0248.0003: input,hidraw2: USB HID v1.00 Mouse [Avocent USB Composite Device-0] on usb-0000:00:1a.1-1/input2
[   12.188165] input: Peppercon AG Multidevice as /devices/pci0000:00/0000:00:1d.7/usb2/2-3/2-3:1.0/input/input5
[   12.199328] hid-generic 0003:14DD:0002.0004: input,hidraw3: USB HID v1.01 Keyboard [Peppercon AG Multidevice] on usb-0000:00:1d.7-3/input0
[   12.214075] input: Peppercon AG Multidevice as /devices/pci0000:00/0000:00:1d.7/usb2/2-3/2-3:1.1/input/input6
[   12.225243] hid-generic 0003:14DD:0002.0005: input,hidraw4: USB HID v1.01 Mouse [Peppercon AG Multidevice] on usb-0000:00:1d.7-3/input1
[   12.238856] usbcore: registered new interface driver usbhid
[   12.245087] usbhid: USB HID core driver
[   12.249415] drop_monitor: Initializing network drop monitor service
[   12.256511] ip_tables: (C) 2000-2006 Netfilter Core Team
[   12.262688] TCP: cubic registered
[   12.266394] Initializing XFRM netlink socket
[   12.271252] NET: Registered protocol family 10
[   12.276437] mip6: Mobile IPv6
[   12.279754] NET: Registered protocol family 17
[   12.285687] PM: Hibernation image not present or could not be loaded.
[   12.292888] registered taskstats version 1
[   12.299282]   Magic number: 5:412:283
[   12.304146] rtc_cmos 00:02: setting system clock to 2013-07-16 14:17:22 UTC (1373984242)


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-16 18:32               ` Alexander Gordeev
@ 2013-07-16 21:38                 ` Nicholas A. Bellinger
  2013-07-17 16:19                   ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-16 21:38 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, linux-scsi

On Tue, 2013-07-16 at 20:32 +0200, Alexander Gordeev wrote:
> On Fri, Jul 12, 2013 at 10:20:12PM -0700, Nicholas A. Bellinger wrote:
> > On Fri, 2013-07-12 at 09:46 +0200, Alexander Gordeev wrote:
> > > > > diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> > > > > index ca6ff67..d8cc7a4 100644
> > > > > --- a/drivers/scsi/scsi-mq.c
> > > > > +++ b/drivers/scsi/scsi-mq.c
> > > > > @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
> > > > >  static struct blk_mq_ops scsi_mq_ops = {
> > > > >  	.queue_rq	= scsi_mq_queue_rq,
> > > > >  	.map_queue	= blk_mq_map_queue,
> > > > > +	.timeout	= scsi_times_out,
> > > > >  	.alloc_hctx	= blk_mq_alloc_single_hw_queue,
> > > > >  	.free_hctx	= blk_mq_free_single_hw_queue,
> > > > >  };
> > 
> > So you're actually triggering a blk-mq timeout with ata_piix..?
> 
> No.
> That is to avoid a NULL-pointer assignment from ->timeout elsewhere.
> In fact I return -ENODEV for sr_probe() to not hit it.
> 
> > That is why scsi-mq still uses blk_execute_rq(), and this will need
> > to be addressed in order to safely use blk_mq_execute_rq() in the
> > above context.
> 
> Got it.
> 
> > Do you have an OOPs backtrace handy to post w/o the last two changes in
> > place..?
> 
> Attaching the output. No oops actually (due to aforementioned .timeout).
> 

Hi Alexander,

Thanks for the logs.  I'm inlining some of the output here for
reference:

[    7.927596] Calling blk_mq_init_queue: scsi_mq_ops: ffffffff81ca13e0, queue_depth: 64, cmd_size: 296 SCSI cmd_size: 0

Just FYI, a SCSI cmd_size of zero here means that scsi-mq will not be
providing pre-allocated LLD descriptors (located at scsi_cmnd->SCp.ptr)
for use by libata driver code.

That is fine for initial testing, but libata will eventually want to
take advantage of scsi_host_template->cmd_size = sizeof(ata_queued_cmd)
in order to remove (all) memory allocations from the I/O fast-path.
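
To illustrate what a non-zero cmd_size buys, here is a minimal userspace
sketch, not kernel code.  It assumes every command slot is allocated once
at queue setup, with the LLD's private area placed directly behind the
command.  All names (demo_pool, demo_cmd, demo_lld_priv) are hypothetical
stand-ins for struct scsi_cmnd and struct ata_queued_cmd.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct demo_cmd {
	unsigned int tag;
};

struct demo_lld_priv {
	unsigned char cdb[16];
	int in_flight;
};

struct demo_pool {
	unsigned char *mem;	/* one allocation covering every slot */
	size_t priv_off;	/* offset of the private area within a slot */
	size_t stride;		/* bytes per slot */
	size_t cmd_size;	/* private bytes requested by the LLD */
	unsigned int depth;
};

/* Round n up so that every slot and private area stays suitably aligned. */
static size_t demo_round_up(size_t n)
{
	size_t a = _Alignof(max_align_t);

	return (n + a - 1) / a * a;
}

/* Done once at queue setup; nothing is allocated per I/O afterwards. */
static int demo_pool_init(struct demo_pool *p, unsigned int depth, size_t cmd_size)
{
	p->priv_off = demo_round_up(sizeof(struct demo_cmd));
	p->stride = p->priv_off + demo_round_up(cmd_size);
	p->cmd_size = cmd_size;
	p->depth = depth;
	p->mem = calloc(depth, p->stride);
	return p->mem ? 0 : -1;
}

/* Look up the command slot that belongs to a tag. */
static struct demo_cmd *demo_pool_cmd(struct demo_pool *p, unsigned int tag)
{
	return (struct demo_cmd *)(p->mem + (size_t)tag * p->stride);
}

/* With cmd_size == 0 there is no private area, hence NULL. */
static void *demo_cmd_priv(struct demo_pool *p, struct demo_cmd *cmd)
{
	return p->cmd_size ? (unsigned char *)cmd + p->priv_off : NULL;
}

static void demo_pool_free(struct demo_pool *p)
{
	free(p->mem);
	p->mem = NULL;
}
```

With cmd_size = sizeof(struct demo_lld_priv) the fast path only does
pointer arithmetic on a tag; with cmd_size = 0, as in the log line above,
the driver has to find memory for its descriptor some other way.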

[    7.927639] blk-mq: CPU -> queue map
[    7.927640]   CPU 0 -> Queue 0
[    7.927640]   CPU 1 -> Queue 0
[    7.927640]   CPU 2 -> Queue 0
[    7.927641]   CPU 3 -> Queue 0
[    7.927641]   CPU 4 -> Queue 0
[    7.927642]   CPU 5 -> Queue 0
[    7.927642]   CPU 6 -> Queue 0
[    7.927643]   CPU 7 -> Queue 0
[    7.927643]   CPU 8 -> Queue 0
[    7.927643]   CPU 9 -> Queue 0
[    7.927644]   CPU10 -> Queue 0
[    7.927644]   CPU11 -> Queue 0
[    7.927645]   CPU12 -> Queue 0
[    7.927645]   CPU13 -> Queue 0
[    7.927646]   CPU14 -> Queue 0
[    7.927646]   CPU15 -> Queue 0
[    7.927673] Performing sc map setup on q: ffff880462430000 hctx: ffff880462a14200 i: 0
[    7.927780] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.927784] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927785] Allocated blk-mq req: ffff88046244ad40, req->tag: 63
[    7.927790] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927803] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927815] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927816] Allocated blk-mq req: ffff88046244ad40, req->tag: 63
[    7.927817] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5

OK, so INQUIRY response payload is looking as expected here.

[    7.927944] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927946] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927946] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927949] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927951] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927952] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927955] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927958] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
[    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
[    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks

Strange..  READ_CAPACITY appears to be returning a payload as zeros..?

[    7.927966] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927967] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927968] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927970] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927970] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927971] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927972] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927973] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927975] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927976] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927977] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927979] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927981] sd 0:0:0:0: [sda] Write Protect is off
[    7.927982] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00

Ditto here..

[    7.927983] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.927984] Allocated blk-mq req: ffff88046244a7c0, req->tag: 61
[    7.927985] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    7.927986] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.927987] scsi_execute(): Calling blk_mq_free_request >>>
[    7.927989] sd 0:0:0:0: [sda] Asking for cache data failed
[    7.927990] sd 0:0:0:0: [sda] Assuming drive cache: write through

and here as well..

Not sure yet why some control CDBs are getting back the expected
payload, while others are returning zeros..

Also, looking at the included stack back-trace:

[    7.928394] blk-mq: CPU -> queue map
[    7.928394]   CPU 0 -> Queue 0
[    7.928395]   CPU 1 -> Queue 0
[    7.928395]   CPU 2 -> Queue 0
[    7.928396]   CPU 3 -> Queue 0
[    7.928396]   CPU 4 -> Queue 0
[    7.928396]   CPU 5 -> Queue 0
[    7.928397]   CPU 6 -> Queue 0
[    7.928397]   CPU 7 -> Queue 0
[    7.928398]   CPU 8 -> Queue 0
[    7.928398]   CPU 9 -> Queue 0
[    7.928399]   CPU10 -> Queue 0
[    7.928399]   CPU11 -> Queue 0
[    7.928399]   CPU12 -> Queue 0
[    7.928400]   CPU13 -> Queue 0
[    7.928400]   CPU14 -> Queue 0
[    7.928401]   CPU15 -> Queue 0
[    7.928420] Performing sc map setup on q: ffff8804624f08e0 hctx: ffff880462938a00 i: 0
[    7.928435] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928436] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928437] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928439] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928441] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928441] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928443] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928444] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928446] sd 0:0:1:0: [sdb] Sector size 0 reported, assuming 512.
[    7.928448] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928448] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928450] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928451] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928452] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928452] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928457] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928458] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928459] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928460] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928461] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928462] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928463] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928464] Allocated blk-mq req: ffff88046250a240, req->tag: 59
[    7.928465] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928466] scsi_execute(): Calling blk_mq_free_request >>>
[    7.928468] sd 0:0:1:0: [sdb] Asking for cache data failed
[    7.928469] sd 0:0:1:0: [sdb] Assuming drive cache: write through
[    7.928487] ------------[ cut here ]------------
[    7.928492] WARNING: at drivers/ata/libata-core.c:5038 ata_qc_issue+0x266/0x380()

Here is the code in question:

void ata_qc_issue(struct ata_queued_cmd *qc)
{
        struct ata_port *ap = qc->ap;
        struct ata_link *link = qc->dev->link;
        u8 prot = qc->tf.protocol;

        /* Make sure only one non-NCQ command is outstanding.  The
         * check is skipped for old EH because it reuses active qc to
         * request ATAPI sense.
         */
        WARN_ON_ONCE(ap->ops->error_handler && ata_tag_valid(link->active_tag));

	....
}
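
A minimal userspace model of that check may help when reading the trace.
It is a sketch under stated assumptions: the DEMO_* constants and the
demo_* helpers are hypothetical, their values are assumed rather than
taken from this thread, and the ap->ops->error_handler half of the
condition is left out.

```c
#include <stdbool.h>

#define DEMO_MAX_QUEUE	32u		/* assumed number of valid tags */
#define DEMO_TAG_POISON	0xfafbfcfdu	/* assumed "nothing active" marker */

/* Hypothetical stand-in for the part of struct ata_link used here. */
struct demo_link {
	unsigned int active_tag;
};

/* A tag counts as valid only if it names one of the queue slots. */
static bool demo_tag_valid(unsigned int tag)
{
	return tag < DEMO_MAX_QUEUE;
}

/*
 * Issue a non-NCQ command.  Returns true if the one-shot warning would
 * fire, i.e. if another non-NCQ command is still outstanding on the link.
 */
static bool demo_issue_non_ncq(struct demo_link *link, unsigned int tag)
{
	bool would_warn = demo_tag_valid(link->active_tag);

	link->active_tag = tag;
	return would_warn;
}

/* Completion marks the link idle again. */
static void demo_complete(struct demo_link *link)
{
	link->active_tag = DEMO_TAG_POISON;
}
```

In this model, issuing a second command before demo_complete() runs is
exactly what makes the check fire.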

So I think that ata_tag_valid() is triggering this WARN_ON_ONCE due to
the default queue_depth setting in scsi-mq.c:scsi_mq_alloc_queue(),
which is queueing more than a single outstanding struct scsi_cmnd at a
time into the underlying LLD.  Note that this value is currently
hard-coded to:

        sdev->sdev_mq_reg.queue_depth = 64;

This value should actually be coming from what the hardware is
advertising, eg:

   min(scsi_host_template->cmd_per_lun, scsi_host_template->can_queue)

but I'd recommend hardcoding this value to 1 for the moment in
order to match what ata_piix is reporting to SCSI.
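
The min() expression above could be sketched roughly as follows.  This is
a standalone illustration, not a patch: struct demo_host_template is a
hypothetical subset of scsi_host_template, and clamping the result to at
least 1 is an assumption added here.

```c
/* Hypothetical subset of the host template fields named above. */
struct demo_host_template {
	int can_queue;		/* commands the host accepts in total */
	short cmd_per_lun;	/* commands allowed per logical unit */
};

/* Derive the blk-mq queue depth from what the host advertises. */
static unsigned int demo_mq_queue_depth(const struct demo_host_template *sht)
{
	int depth = sht->cmd_per_lun < sht->can_queue ?
		    sht->cmd_per_lun : sht->can_queue;

	/* Assumption: never hand blk-mq a depth below one. */
	return depth < 1 ? 1u : (unsigned int)depth;
}
```

For a host that reports both fields as 1, this yields the same depth of 1
as the temporary hard-coding suggested above.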

Also, just to be safe, please also disable scsi-generic
(CONFIG_CHR_DEV_SG) in your kernel config, as it's not hooked up to
scsi-mq just yet, and may be causing problems elsewhere.

Thanks!

--nab

[    7.928494] Modules linked in:
[    7.928496] CPU: 9 PID: 153 Comm: kworker/u50:5 Not tainted 3.10.0-rc5.nab+ #11
[    7.928497] Hardware name: Cisco Systems Inc R210-2121605W/R210-2121605W, BIOS C200.1.4.3j.0.020720132258 02/07/2013
[    7.928502] Workqueue: events_unbound async_run_entry_fn
[    7.928505]  0000000000000009 ffff88046241f588 ffffffff8162cc58 ffff88046241f5c8
[    7.928507]  ffffffff8104a200 ffff88046241f5d8 ffff880462b90000 0000000000000003
[    7.928509]  ffff880462b91c68 ffffffff81415450 ffff880462b90230 ffff88046241f5d8
[    7.928510] Call Trace:
[    7.928514]  [<ffffffff8162cc58>] dump_stack+0x19/0x1b
[    7.928518]  [<ffffffff8104a200>] warn_slowpath_common+0x70/0xa0
[    7.928521]  [<ffffffff81415450>] ? ata_scsi_set_sense.constprop.26+0x30/0x30
[    7.928523]  [<ffffffff8104a24a>] warn_slowpath_null+0x1a/0x20
[    7.928525]  [<ffffffff8140f586>] ata_qc_issue+0x266/0x380
[    7.928526]  [<ffffffff814155b3>] ? ata_scsi_rw_xlat+0x163/0x210
[    7.928528]  [<ffffffff81415450>] ? ata_scsi_set_sense.constprop.26+0x30/0x30
[    7.928530]  [<ffffffff814140d7>] ata_scsi_translate+0xa7/0x180
[    7.928531] scsi_mq_alloc_queue() complete !! >>>>>>>>>>>>>>>>>>>>>>>>>>>
[    7.928533]  [<ffffffff814181f9>] ata_scsi_queuecmd+0xa9/0x2b0
[    7.928534] Entering scsi_execute with q->mq_ops: ffffffff81ca13e0
[    7.928537]  [<ffffffff813ed956>] scsi_dispatch_cmd+0x1c6/0x310
[    7.928537] Allocated blk-mq req: ffff88046259ad40, req->tag: 63
[    7.928540]  [<ffffffff813f63bb>] scsi_mq_queue_rq+0x17b/0x280
[    7.928541] Calling blk_mq_insert_request from blk_execute_rq_nowait >>>>>>>>>>>>>>>>
[    7.928546]  [<ffffffff812c15c5>] __blk_mq_run_hw_queue+0x1b5/0x3a0
[    7.928549]  [<ffffffff812c1c65>] blk_mq_run_hw_queue+0x35/0x40
[    7.928550]  [<ffffffff812c1fbb>] blk_mq_make_request+0x34b/0x4a0
[    7.928554]  [<ffffffff812b74f2>] generic_make_request+0xc2/0x110
[    7.928556]  [<ffffffff812b7a1b>] submit_bio+0x7b/0x160
[    7.928560]  [<ffffffff811baa8d>] ? bio_alloc_bioset+0x9d/0x1b0
[    7.928562]  [<ffffffff811b576e>] _submit_bh+0x13e/0x200
[    7.928564]  [<ffffffff811b5840>] submit_bh+0x10/0x20
[    7.928566]  [<ffffffff811b754d>] block_read_full_page+0x21d/0x350
[    7.928568]  [<ffffffff811bbff0>] ? I_BDEV+0x10/0x10
[    7.928571]  [<ffffffff8113be73>] ? __inc_zone_page_state+0x33/0x40
[    7.928573]  [<ffffffff8111f4bf>] ? add_to_page_cache_locked+0xdf/0x190
[    7.928575]  [<ffffffff811bc4a0>] ? blkdev_write_begin+0x30/0x30
[    7.928577]  [<ffffffff811bc4b8>] blkdev_readpage+0x18/0x20
[    7.928579]  [<ffffffff8111fcfa>] do_read_cache_page+0x7a/0x170
[    7.928581]  [<ffffffff8112837a>] ? __alloc_pages_nodemask+0x17a/0xad0
[    7.928583]  [<ffffffff8111fe09>] read_cache_page_async+0x19/0x20
[    7.928585]  [<ffffffff8112015e>] read_cache_page+0xe/0x20
[    7.928588]  [<ffffffff812c80cd>] read_dev_sector+0x2d/0x90
[    7.928590]  [<ffffffff812cdc4c>] read_lba+0xec/0x190
[    7.928592]  [<ffffffff812ce255>] ? efi_partition+0xe5/0x5f0
[    7.928594]  [<ffffffff812ce26f>] efi_partition+0xff/0x5f0
[    7.928596]  [<ffffffff812e9c34>] ? snprintf+0x34/0x40
[    7.928598]  [<ffffffff812ce170>] ? is_gpt_valid+0x480/0x480
[    7.928600]  [<ffffffff812c9138>] check_partition+0x108/0x220
[    7.928602]  [<ffffffff812c8d44>] rescan_partitions+0xb4/0x2c0
[    7.928604]  [<ffffffff811bda35>] __blkdev_get+0x375/0x4b0
[    7.928606]  [<ffffffff8119e447>] ? inode_init_always+0x107/0x1c0
[    7.928608]  [<ffffffff811bc010>] ? blkdev_get_block+0x20/0x20
[    7.928610]  [<ffffffff811bdd05>] blkdev_get+0x195/0x2e0
[    7.928612]  [<ffffffff8119f0c7>] ? unlock_new_inode+0x47/0x70
[    7.928613]  [<ffffffff811bcff0>] ? bdget+0x120/0x140
[    7.928615]  [<ffffffff812c6531>] add_disk+0x391/0x490
[    7.928618]  [<ffffffff81402c4a>] sd_probe_async+0x13a/0x230
[    7.928620]  [<ffffffff81075de6>] async_run_entry_fn+0x46/0x140
[    7.928623]  [<ffffffff810683c4>] process_one_work+0x174/0x400
[    7.928624]  [<ffffffff81068acc>] worker_thread+0x11c/0x370
[    7.928626]  [<ffffffff810689b0>] ? rescuer_thread+0x320/0x320
[    7.928629]  [<ffffffff8106f410>] kthread+0xc0/0xd0
[    7.928631]  [<ffffffff8106f350>] ? flush_kthread_worker+0x80/0x80
[    7.928633]  [<ffffffff8163b2dc>] ret_from_fork+0x7c/0xb0
[    7.928635]  [<ffffffff8106f350>] ? flush_kthread_worker+0x80/0x80
[    7.928638] ---[ end trace 9f4b3fe3fb787a07 ]---

> > I *very* much recommend doing the same if at all possible for ata_piix
> > scsi-mq development + testing, as you'll want to be very careful when
> > using a real file-system on top of this early alpha code.
> 
> Thank you for the warning.
> Getting as far as writing CDBs would already be a success :)
> 
> > --nab
> > 
> 


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-16 21:38                 ` Nicholas A. Bellinger
@ 2013-07-17 16:19                   ` Alexander Gordeev
  2013-07-18 18:51                     ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-17 16:19 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, linux-scsi

On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> 
> OK, so INQUIRY response payload is looking as expected here.

Yep. I do not remember the details off the top of my head, but I believe
INQUIRYs are emulated and thus do not carry a payload.

> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> 
> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?

Yep. That is because blk_execute_rq() does not set the proper callback, so
data does not get copied from the sg's to the bounce buffer. That is why I
tried to use blk_mq_execute_rq() instead. Once I do that, data starts
getting read and booting stops elsewhere.

Of course, I was suspecting that change alone is not valid and wondered
about the status of scsi-mq in the first place, and if more changes are
coming.

So it turns out the "req->errors + req->resid_len" issue (you described
earlier) needs to be addressed before going forward with libata (only?).

> Not sure yet why some control CDBs are getting back the expected
> payload, while others are returning zeros..

Bio buffers do not get updated from callback.

> Also, looking at the included stack back-trace:

[...]

Thanks a lot for these and your other comments, Nicholas!

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-17 16:19                   ` Alexander Gordeev
@ 2013-07-18 18:51                     ` Nicholas A. Bellinger
  2013-07-18 19:12                       ` Mike Christie
  2013-07-18 19:14                       ` Nicholas A. Bellinger
  0 siblings, 2 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-18 18:51 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, linux-scsi

On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> > [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> > [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> > 
> > OK, so INQUIRY response payload is looking as expected here.
> 
> Yep. It is not on the top of my head, but I remember something like INQUIRYs
> are emulated and thus do not have payload.
> 
> > [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> > [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> > 
> > Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
> 
> Yep. Because blk_execute_rq() does not put the proper callback and data do
> not get copied from sg's to bounce buffer. That is why I tried to use
> blk_mq_execute_rq() instead. Once I do that, data start getting read and
> booting stops elsewhere.

Mmmmmm.

The call to blk_queue_bounce() exists within blk_mq_make_request(), but
AFAICT this should still be getting invoked regardless of whether the
struct request is dispatched into blk-mq via the modified
blk_execute_rq() -> blk_execute_rq_nowait() -> blk_mq_insert_request()
codepath, or directly via blk_mq_execute_rq()..

Jens..?

> 
> Of course, I was suspecting that change alone is not valid and wondered
> about the status of scsi-mq in the first place, and if more changes are
> coming.

Most certainly.  ;)

> 
> So it turns out the "req->errors + req->resid_len" issue (you described
> earlier) needs to be addressed before going forward with libata (only?).

AFAICT, getting an initial conversion of libata up does not depend upon
this specific issue being addressed first.  I could be wrong however..

> 
> > Not sure yet why some control CDBs are getting back the expected
> > payload, while others are returning zeros..
> 
> Bio buffers do not get updated from callback.
> 
> > Also, looking at the included stack back-trace:
> 
> [...]
> 
> Thanks a lot for these and your other comments, Nicholas!
> 

Sure.  I should have a few extra cycles to hack on this over the
weekend.

Also, thinking about this some more, trying to convert ahci to scsi-mq
first (using QEMU emulation), while keeping a rootfs on PIIX_IDE might
make debugging slightly easier..

--nab



* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-18 18:51                     ` Nicholas A. Bellinger
@ 2013-07-18 19:12                       ` Mike Christie
  2013-07-19  0:23                         ` Nicholas A. Bellinger
  2013-07-18 19:14                       ` Nicholas A. Bellinger
  1 sibling, 1 reply; 75+ messages in thread
From: Mike Christie @ 2013-07-18 19:12 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Alexander Gordeev, Tejun Heo, linux-kernel, linux-ide,
	Jeff Garzik, Jens Axboe, linux-scsi

On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:
> On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
>> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
>>> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
>>> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
>>>
>>> OK, so INQUIRY response payload is looking as expected here.
>>
>> Yep. It is not on the top of my head, but I remember something like INQUIRYs
>> are emulated and thus do not have payload.
>>
>>> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
>>> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
>>> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
>>>
>>> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
>>
>> Yep. Because blk_execute_rq() does not put the proper callback and data do
>> not get copied from sg's to bounce buffer. That is why I tried to use
>> blk_mq_execute_rq() instead. Once I do that, data start getting read and
>> booting stops elsewhere.
> 
> Mmmmmm.
> 
> The call to blk_queue_bounce() exists within blk_mq_make_request(), but
> AFAICT this should still be getting invoked regardless of if the struct
> request is dispatched into blk-mq via the modified blk_execute_rq() ->
> blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
> via blk_mq_execute_rq()..
> 

blk_mq_make_request is not called from the blk insert/execute paths.
blk_mq_make_request takes a bio, tries to merge it with a request, and
adds it to the queue. It is only called through the make_request_fn,
for example when generic_make_request is called.

blk_mq_insert_request adds an already formed request to the queue.
Because the request is already formed, that path does not bounce bios.
The bios/pages should already have been added within the driver's
restrictions. So
for the read_cap path, the call to blk_rq_map_kern in scsi_execute does
the blk_queue_bounce call.
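
A minimal userspace sketch of the two entry points described above, for
illustration only. The toy_* names are hypothetical stand-ins and not the
blk-mq API; the model only records where bouncing would happen, assuming
the split between the bio path and the request path is as stated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bio / struct request. */
struct toy_bio {
	bool bounced;	/* set once the pages were checked against driver limits */
};

struct toy_request {
	struct toy_bio *bio;
	bool queued;
};

/* Models blk_queue_bounce(): make the bio's pages usable by the driver. */
void toy_bounce(struct toy_bio *bio)
{
	assert(bio != NULL);
	bio->bounced = true;
}

/* bio entry point (make_request_fn style): bounce, then build and queue. */
void toy_make_request(struct toy_request *rq, struct toy_bio *bio)
{
	assert(rq != NULL);
	toy_bounce(bio);
	rq->bio = bio;
	rq->queued = true;
}

/* Passthrough mapping step (blk_rq_map_kern style): attach and bounce. */
void toy_map_kern(struct toy_request *rq, struct toy_bio *bio)
{
	assert(rq != NULL);
	rq->bio = bio;
	toy_bounce(bio);
}

/* Request entry point: the request is taken as-is, nothing is bounced. */
void toy_insert_request(struct toy_request *rq)
{
	assert(rq != NULL);
	rq->queued = true;
}
```

In this model a request that reaches toy_insert_request() without a prior
toy_map_kern() is queued with an unbounced bio, which is why the mapping
step has to do the bounce.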

Just saw this while trying out iscsi with the scsi-mq stuff :)


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-18 18:51                     ` Nicholas A. Bellinger
  2013-07-18 19:12                       ` Mike Christie
@ 2013-07-18 19:14                       ` Nicholas A. Bellinger
  2013-07-18 21:21                         ` Jens Axboe
  1 sibling, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-18 19:14 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, Jens Axboe, linux-scsi

On Thu, 2013-07-18 at 11:51 -0700, Nicholas A. Bellinger wrote:
> On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
> > On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> > > [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> > > [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> > > 
> > > OK, so INQUIRY response payload is looking as expected here.
> > 
> > Yep. It is not on the top of my head, but I remember something like INQUIRYs
> > are emulated and thus do not have payload.
> > 
> > > [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> > > [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> > > 
> > > Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
> > 
> > Yep. Because blk_execute_rq() does not put the proper callback and data do
> > not get copied from sg's to bounce buffer. That is why I tried to use
> > blk_mq_execute_rq() instead. Once I do that, data start getting read and
> > booting stops elsewhere.
> 
> Mmmmmm.
> 
> The call to blk_queue_bounce() exists within blk_mq_make_request(), but
> AFAICT this should still be getting invoked regardless of if the struct
> request is dispatched into blk-mq via the modified blk_execute_rq() ->
> blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
> via blk_mq_execute_rq()..
> 
> Jens..?
> 

Actually sorry, you're right.  A call to blk_mq_insert_request() for
REQ_TYPE_BLOCK_PC will not invoke blk_queue_bounce() located near the
top of blk_mq_make_request(), which means that only REQ_TYPE_FS is
currently using bounce buffers, if required.

Need to think a bit more about what to do here for the REQ_TYPE_BLOCK_PC
bounce buffer special case with blk_execute_rq(), but I'm thinking that
blk_mq_execute_rq() should really not be used here..

Jens..?

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-18 19:14                       ` Nicholas A. Bellinger
@ 2013-07-18 21:21                         ` Jens Axboe
  0 siblings, 0 replies; 75+ messages in thread
From: Jens Axboe @ 2013-07-18 21:21 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Alexander Gordeev, Tejun Heo, linux-kernel, linux-ide,
	Jeff Garzik, linux-scsi

On 07/18/2013 01:14 PM, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-18 at 11:51 -0700, Nicholas A. Bellinger wrote:
>> On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
>>> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
>>>> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
>>>> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
>>>>
>>>> OK, so INQUIRY response payload is looking as expected here.
>>>
>>> Yep. It is not on the top of my head, but I remember something like INQUIRYs
>>> are emulated and thus do not have payload.
>>>
>>>> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
>>>> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
>>>> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
>>>>
>>>> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
>>>
>>> Yep. Because blk_execute_rq() does not put the proper callback and data do
>>> not get copied from sg's to bounce buffer. That is why I tried to use
>>> blk_mq_execute_rq() instead. Once I do that, data start getting read and
>>> booting stops elsewhere.
>>
>> Mmmmmm.
>>
>> The call to blk_queue_bounce() exists within blk_mq_make_request(), but
>> AFAICT this should still be getting invoked regardless of if the struct
>> request is dispatched into blk-mq via the modified blk_execute_rq() ->
>> blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
>> via blk_mq_execute_rq()..
>>
>> Jens..?
>>
> 
> Actually sorry, you're right.  A call to blk_mq_insert_request() for
> REQ_TYPE_BLOCK_PC will not invoke blk_queue_bounce() located near the
> top of blk_mq_make_request(), which means that only REQ_TYPE_FS is
> currently using bounce buffers, if required.
> 
> Need to think a bit more about what to do here for REQ_TYPE_BLOCK_PC
> bounce buffer special case with blk_execute_rq(), but I'm thinking that
> blk_mq_execute_rq() should really not be used here..
> 
> Jens..?

It needs to be pre-bounced; blk-mq will only bounce incoming bios, not
requests merely added to the queue(s). It might be useful to add an
equivalent of blk_mq_make_request() for this.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-18 19:12                       ` Mike Christie
@ 2013-07-19  0:23                         ` Nicholas A. Bellinger
  2013-07-19  0:30                           ` Jens Axboe
  2013-07-19 15:58                           ` Mike Christie
  0 siblings, 2 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-19  0:23 UTC (permalink / raw)
  To: Mike Christie
  Cc: Alexander Gordeev, Tejun Heo, linux-kernel, linux-ide,
	Jeff Garzik, Jens Axboe, linux-scsi

On Thu, 2013-07-18 at 13:12 -0600, Mike Christie wrote:
> On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:
> > On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
> >> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> >>> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> >>> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> >>>
> >>> OK, so INQUIRY response payload is looking as expected here.
> >>
> >> Yep. It is not on the top of my head, but I remember something like INQUIRYs
> >> are emulated and thus do not have payload.
> >>
> >>> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> >>> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> >>> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> >>>
> >>> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
> >>
> >> Yep. Because blk_execute_rq() does not put the proper callback and data do
> >> not get copied from sg's to bounce buffer. That is why I tried to use
> >> blk_mq_execute_rq() instead. Once I do that, data start getting read and
> >> booting stops elsewhere.
> > 
> > Mmmmmm.
> > 
> > The call to blk_queue_bounce() exists within blk_mq_make_request(), but
> > AFAICT this should still be getting invoked regardless of if the struct
> > request is dispatched into blk-mq via the modified blk_execute_rq() ->
> > blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
> > via blk_mq_execute_rq()..
> > 
> 
> blk_mq_make_request is not called from the blk insert/execute paths.
> blk_mq_make_request takes a bio and tries to merge it with a request and
> adds it to the queue. It is only called when the make_request_fn is
> called like when generic_make_request is called.
> 
> blk_mq_insert_request adds an already formed request to the queue. It is
> already formed so that is why that path does not bounce bios. The
> bios/pages should already be added within the driver's restrictions. So
> for the read_cap path, the call to blk_rq_map_kern in scsi_execute does
> the blk_queue_bounce call.
> 

<nod>, just noticed the blk_queue_bounce() in blk_rq_map_kern().  

Not sure why this doesn't seem to be doing what it's supposed to for
libata just yet..

> Just saw this while trying out iscsi with the scsi-mq stuff :)
> 

Took a stab at this a while back, but ended up getting distracted by other
items.  Do you have an initial conversion running yet..?

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19  0:23                         ` Nicholas A. Bellinger
@ 2013-07-19  0:30                           ` Jens Axboe
  2013-07-19  1:03                             ` Nicholas A. Bellinger
  2013-07-19 15:58                           ` Mike Christie
  1 sibling, 1 reply; 75+ messages in thread
From: Jens Axboe @ 2013-07-19  0:30 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Mike Christie, Alexander Gordeev, Tejun Heo, linux-kernel,
	linux-ide, Jeff Garzik, linux-scsi

On Thu, Jul 18 2013, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-18 at 13:12 -0600, Mike Christie wrote:
> > On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:
> > > On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
> > >> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> > >>> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> > >>> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> > >>>
> > >>> OK, so INQUIRY response payload is looking as expected here.
> > >>
> > >> Yep. It is not on the top of my head, but I remember something like INQUIRYs
> > >> are emulated and thus do not have payload.
> > >>
> > >>> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > >>> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> > >>> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> > >>>
> > >>> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
> > >>
> > >> Yep. Because blk_execute_rq() does not put the proper callback and data do
> > >> not get copied from sg's to bounce buffer. That is why I tried to use
> > >> blk_mq_execute_rq() instead. Once I do that, data start getting read and
> > >> booting stops elsewhere.
> > > 
> > > Mmmmmm.
> > > 
> > > The call to blk_queue_bounce() exists within blk_mq_make_request(), but
> > > AFAICT this should still be getting invoked regardless of if the struct
> > > request is dispatched into blk-mq via the modified blk_execute_rq() ->
> > > blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
> > > via blk_mq_execute_rq()..
> > > 
> > 
> > blk_mq_make_request is not called from the blk insert/execute paths.
> > blk_mq_make_request takes a bio and tries to merge it with a request and
> > adds it to the queue. It is only called when the make_request_fn is
> > called like when generic_make_request is called.
> > 
> > blk_mq_insert_request adds an already formed request to the queue. It is
> > already formed so that is why that path does not bounce bios. The
> > bios/pages should already be added within the driver's restrictions. So
> > for the read_cap path, the call to blk_rq_map_kern in scsi_execute does
> > the blk_queue_bounce call.
> > 
> 
> <nod>, just noticed the blk_queue_bounce() in blk_rq_map_kern().  
> 
> Not sure why this doesn't seem to be doing what it's supposed to for
> libata just yet..

How are you making the request from the bio? It'd be pretty trivial to
ensure that it gets bounced properly... blk_mq_execute_rq() assumes a
fully complete request, so it won't bounce anything.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19  0:30                           ` Jens Axboe
@ 2013-07-19  1:03                             ` Nicholas A. Bellinger
  2013-07-19  6:34                                 ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-19  1:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Alexander Gordeev, Tejun Heo, linux-kernel,
	linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-07-18 at 18:30 -0600, Jens Axboe wrote:
> On Thu, Jul 18 2013, Nicholas A. Bellinger wrote:
> > On Thu, 2013-07-18 at 13:12 -0600, Mike Christie wrote:
> > > On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:
> > > > On Wed, 2013-07-17 at 18:19 +0200, Alexander Gordeev wrote:
> > > >> On Tue, Jul 16, 2013 at 02:38:03PM -0700, Nicholas A. Bellinger wrote:
> > > >>> [    7.927818] scsi_execute(): Calling blk_mq_free_request >>>
> > > >>> [    7.927826] scsi 0:0:0:0: Direct-Access     ATA      ST9500530NS      CC03 PQ: 0 ANSI: 5
> > > >>>
> > > >>> OK, so INQUIRY response payload is looking as expected here.
> > > >>
> > > >> Yep. It is not on the top of my head, but I remember something like INQUIRYs
> > > >> are emulated and thus do not have payload.
> > > >>
> > > >>> [    7.927960] sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > >>> [    7.927964] sd 0:0:0:0: [sda] 1 512-byte logical blocks: (512 B/512 B)
> > > >>> [    7.927965] sd 0:0:0:0: [sda] 0-byte physical blocks
> > > >>>
> > > >>> Strange..  READ_CAPACITY appears to be returning a payload as zeros..?
> > > >>
> > > >> Yep. Because blk_execute_rq() does not put the proper callback and data do
> > > >> not get copied from sg's to bounce buffer. That is why I tried to use
> > > >> blk_mq_execute_rq() instead. Once I do that, data start getting read and
> > > >> booting stops elsewhere.
> > > > 
> > > > Mmmmmm.
> > > > 
> > > > The call to blk_queue_bounce() exists within blk_mq_make_request(), but
> > > > AFAICT this should still be getting invoked regardless of if the struct
> > > > request is dispatched into blk-mq via the modified blk_execute_rq() ->
> > > > blk_execute_rq_nowait() -> blk_mq_insert_request() codepath, or directly
> > > > via blk_mq_execute_rq()..
> > > > 
> > > 
> > > blk_mq_make_request is not called from the blk insert/execute paths.
> > > blk_mq_make_request takes a bio and tries to merge it with a request and
> > > adds it to the queue. It is only called when the make_request_fn is
> > > called like when generic_make_request is called.
> > > 
> > > blk_mq_insert_request adds an already formed request to the queue. It is
> > > already formed so that is why that path does not bounce bios. The
> > > bios/pages should already be added within the driver's restrictions. So
> > > for the read_cap path, the call to blk_rq_map_kern in scsi_execute does
> > > the blk_queue_bounce call.
> > > 
> > 
> > <nod>, just noticed the blk_queue_bounce() in blk_rq_map_kern().  
> > 
> > Not sure why this doesn't seem to be doing what it's supposed to for
> > libata just yet..
> 
> How are you making the request from the bio? It'd be pretty trivial to
> ensure that it gets bounced properly... blk_mq_execute_rq() assumes a
> fully complete request, so it won't bounce anything.
> 

From what I gather for REQ_TYPE_BLOCK_PC, scsi_execute() ->
blk_rq_map_kern() -> blk_rq_append_bio() -> blk_rq_bio_prep() is what
does the request setup from the bios returned by bio_[copy,map]_kern()
in blk_rq_map_kern() code.

blk_queue_bounce() is called immediately after blk_rq_append_bio() here,
which AFAICT looks like it's doing the correct thing for scsi-mq..

What is strange here is that libata-scsi.c CDB emulation code is doing
the same stuff for both INQUIRY (that seems to be OK) and READ_CAPACITY
(that is returning zeros), which makes me think that something else is
going on..

Alexander, were you able to re-test using sdev->sdev_mq_reg.queue_depth
= 1 in scsi_mq_alloc_queue()..?

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19  1:03                             ` Nicholas A. Bellinger
@ 2013-07-19  6:34                                 ` Nicholas A. Bellinger
  0 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-19  6:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Alexander Gordeev, Tejun Heo, linux-kernel,
	linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-07-18 at 18:03 -0700, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-18 at 18:30 -0600, Jens Axboe wrote:
> > On Thu, Jul 18 2013, Nicholas A. Bellinger wrote:
> > > On Thu, 2013-07-18 at 13:12 -0600, Mike Christie wrote:
> > > > On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:

<SNIP>

> > > <nod>, just noticed the blk_queue_bounce() in blk_rq_map_kern().  
> > > 
> > > Not sure why this doesn't seem to be doing what it's supposed to for
> > > libata just yet..
> > 
> > How are you making the request from the bio? It'd be pretty trivial to
> > ensure that it gets bounced properly... blk_mq_execute_rq() assumes a
> > fully complete request, so it won't bounce anything.
> > 
> 
> From what I gather for REQ_TYPE_BLOCK_PC, scsi_execute() ->
> blk_rq_map_kern() -> blk_rq_append_bio() -> blk_rq_bio_prep() is what
> does the request setup from the bios returned by bio_[copy,map]_kern()
> in blk_rq_map_kern() code.
> 
> blk_queue_bounce() is called immediately after blk_rq_append_bio() here,
> which AFAICT looks like it's doing the correct thing for scsi-mq..
> 
> What is strange here is that libata-scsi.c CDB emulation code is doing
> the same stuff for both INQUIRY (that seems to be OK) and READ_CAPACITY
> (that is returning zeros), which makes me think that something else is
> going on..
> 
> Alexander, were you able to re-test using sdev->sdev_mq_reg.queue_depth
> = 1 in scsi_mq_alloc_queue()..?
> 

So after a bit more hacking tonight it appears the explicit setting of
dma_alignment by libata (sdev->sector_size - 1) is what is making
blk_rq_map_kern() invoke bio_copy_kern() instead of bio_map_kern(), and
triggering this scsi-mq specific bug with libata.  I'm able to confirm
with QEMU IDE emulation the bug is occurring only after INQUIRY and
before READ_CAPACITY, as reported by Alexander.
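
The choice between copying and mapping can be sketched in userspace as
below. This is a simplified model that assumes the check compares both the
buffer address and the transfer length against the queue's DMA alignment
mask; needs_copy() is a hypothetical name, and the padding mask and
on-stack checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the decision made when mapping a kernel buffer
 * into a passthrough request: copy into a freshly allocated buffer when
 * either the address or the length violates the alignment mask,
 * otherwise map the caller's buffer directly.
 */
bool needs_copy(uintptr_t addr, size_t len, unsigned int dma_alignment_mask)
{
	/* The mask is expected to be (power of two) - 1, e.g. 0x3 or 511. */
	assert(((dma_alignment_mask + 1u) & dma_alignment_mask) == 0);

	return (addr & dma_alignment_mask) != 0 ||
	       (len & dma_alignment_mask) != 0;
}
```

Under this model an 8-byte transfer, such as a READ CAPACITY(10) response,
is mapped directly with the default mask 0x3 but is copied once the mask
becomes sector_size - 1 = 511.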

Below is a quick hack that skips this setting in ata_scsi_dev_config() for
blk-mq and leaves the default dma_alignment=0x03 for REQ_TYPE_BLOCK_PC
requests as initially set by scsi-core in scsi_init_request_queue().

Also included is the change for using queue_depth = min(SHT->can_queue,
SHT->cmd_per_lun) during scsi-mq request_queue initialization, along
with a very basic ata_piix conversion for testing purposes.

With these three changes in place, I'm able to register a single 1GB
ata_piix LUN using QEMU IDE emulation, and successfully run simple fio
writeverify tests with blocksize=4k @ queue_depth=1.

Obviously this is not a proper bugfix for the unaligned bio_copy_kern()
case with scsi-mq + REQ_TYPE_BLOCK_PC, but should be enough to at least get
libata LUN scanning to work.  Need to take a deeper look at the problem
here..

Alexander, care to give this work-around a shot on your bare-metal setup
using ata_piix..?

--nab

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index 9a8a674..ac05cd6 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -1066,6 +1066,8 @@ static u8 piix_vmw_bmdma_status(struct ata_port *ap)
 
 static struct scsi_host_template piix_sht = {
        ATA_BMDMA_SHT(DRV_NAME),
+       .scsi_mq        = true,
+       .queuecommand_mq = ata_scsi_queuecmd,
 };
 
 static struct ata_port_operations piix_sata_ops = {
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0101af5..191bc15 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
                        "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
                        sdev->sector_size);
 
-       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
+       if (!q->mq_ops) {
+               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
+       } else {
+               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
+       }
 
        if (dev->flags & ATA_DFLAG_AN)
                set_bit(SDEV_EVT_MEDIA_CHANGE, sdev->supported_events);
diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
index ca6ff67..81b2633 100644
--- a/drivers/scsi/scsi-mq.c
+++ b/drivers/scsi/scsi-mq.c
@@ -199,11 +199,11 @@ int scsi_mq_alloc_queue(struct Scsi_Host *sh, struct scsi_device *sdev)
        int i, j;
 
        sdev->sdev_mq_reg.ops = &scsi_mq_ops;
-       sdev->sdev_mq_reg.queue_depth = sdev->queue_depth;
+       sdev->sdev_mq_reg.queue_depth = min_t(int, sh->hostt->can_queue,
+                                             sh->hostt->cmd_per_lun);
        sdev->sdev_mq_reg.cmd_size = sizeof(struct scsi_cmnd) + sh->hostt->cmd_size;
        sdev->sdev_mq_reg.numa_node = NUMA_NO_NODE;
        sdev->sdev_mq_reg.nr_hw_queues = 1;
-       sdev->sdev_mq_reg.queue_depth = 64;
        sdev->sdev_mq_reg.flags = BLK_MQ_F_SHOULD_MERGE;
 
        printk("Calling blk_mq_init_queue: scsi_mq_ops: %p, queue_depth: %d,"


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19  6:34                                 ` Nicholas A. Bellinger
  (?)
@ 2013-07-19 15:33                                 ` James Bottomley
  2013-07-19 21:01                                   ` Nicholas A. Bellinger
  -1 siblings, 1 reply; 75+ messages in thread
From: James Bottomley @ 2013-07-19 15:33 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Mike Christie, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 0101af5..191bc15 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
>                         sdev->sector_size);
>  
> -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> +       if (!q->mq_ops) {
> +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> +       } else {
> +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
> +       }

Amazingly enough there is a reason for the dma alignment, and it wasn't
just to annoy you, so you can't blindly do this.

The email thread is probably lost in the mists of time, but if I
remember correctly the problem is that some ahci DMA controllers barf if
the sector they're doing DMA on crosses a page boundary.  Some are
annoying enough to actually cause silent data corruption.  You won't
find every ahci DMA controller doing this, so the change will work for
some, but it will be hard to identify those it won't work for until
people start losing data.

The correct fix, obviously, is to do the bio copy on the kernel path for
unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
aligned (because of the block to page alignment).
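
To illustrate the alignment argument, here is a small userspace sketch
(hypothetical helper name, not kernel code) that checks whether one sector
starting at a given address straddles a page boundary. It assumes the page
size is a multiple of the sector size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when the sector_size bytes starting at addr span two
 * pages. Hypothetical helper for illustration only.
 */
bool sector_crosses_page(uintptr_t addr, size_t sector_size, size_t page_size)
{
	assert(sector_size != 0 && page_size != 0);

	uintptr_t first_page = addr / page_size;
	uintptr_t last_page = (addr + sector_size - 1) / page_size;

	return first_page != last_page;
}
```

With 512-byte sectors and 4096-byte pages, a sector-aligned buffer never
straddles a page, whereas a buffer aligned to only 4 bytes can.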

James

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19  0:23                         ` Nicholas A. Bellinger
  2013-07-19  0:30                           ` Jens Axboe
@ 2013-07-19 15:58                           ` Mike Christie
  2013-07-19 21:05                             ` Nicholas A. Bellinger
  1 sibling, 1 reply; 75+ messages in thread
From: Mike Christie @ 2013-07-19 15:58 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Alexander Gordeev, Tejun Heo, linux-kernel, linux-ide,
	Jeff Garzik, Jens Axboe, linux-scsi

On 07/18/2013 06:23 PM, Nicholas A. Bellinger wrote:
>> Just saw this while trying out iscsi with the scsi-mq stuff :)
>> > 
> Took a stab at this a while back, but ended up getting distracted by other
> items.  Do you have an initial conversion running yet..?

Not running well :) Have a patch but I am debugging it now. I messed up
something with the cmd_size stuff.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19 15:33                                 ` James Bottomley
@ 2013-07-19 21:01                                   ` Nicholas A. Bellinger
  2013-07-20  4:56                                     ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-19 21:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, Mike Christie, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
> On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
> > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> > index 0101af5..191bc15 100644
> > --- a/drivers/ata/libata-scsi.c
> > +++ b/drivers/ata/libata-scsi.c
> > @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
> >                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
> >                         sdev->sector_size);
> >  
> > -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > +       if (!q->mq_ops) {
> > +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > +       } else {
> > +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
> > +       }
> 
> Amazingly enough there is a reason for the dma alignment, and it wasn't
> just to annoy you, so you can't blindly do this.
> 
> The email thread is probably lost in the mists of time, but if I
> remember correctly the problem is that some ahci DMA controllers barf if
> the sector they're doing DMA on crosses a page boundary.  Some are
> annoying enough to actually cause silent data corruption.  You won't
> find every ahci DMA controller doing this, so the change will work for
> some, but it will be hard to identify those it won't work for until
> people start losing data.

Thanks for the extra background.

So at least from what I gather thus far this shouldn't be an issue for
initial testing with scsi-mq <-> libata w/ ata_piix.
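
For readers following the alignment argument, here is a rough userspace sketch (not kernel code) of what the mask set by blk_queue_update_dma_alignment(q, sdev->sector_size - 1) is protecting against. As far as I understand it, the update helper only ever raises the mask, and a buffer whose address or length has any mask bit set is treated as unaligned and has to be bounce-copied. The helper names and the 4096-byte page size below are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Hypothetical stand-in for the "only ever raise the mask" update. */
static unsigned int update_dma_alignment(unsigned int cur_mask,
					 unsigned int new_mask)
{
	return new_mask > cur_mask ? new_mask : cur_mask;
}

/* Aligned means neither the address nor the length has a mask bit set. */
static bool buf_aligned(uintptr_t addr, unsigned int len, unsigned int mask)
{
	return !((addr | len) & mask);
}

/* True when [addr, addr + len) spans more than one page. */
static bool crosses_page(uintptr_t addr, unsigned int len)
{
	return (addr / SKETCH_PAGE_SIZE) !=
	       ((addr + len - 1) / SKETCH_PAGE_SIZE);
}
```

With a 512-byte sector the mask is 511. A sector-sized buffer that passes buf_aligned() cannot straddle a page, which is the property the troublesome controllers depend on; a buffer at 0x1f00 fails the check and does cross a page boundary.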

> 
> The correct fix, obviously, is to do the bio copy on the kernel path for
> unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
> aligned (because of the block to page alignment).
> 

Indeed.  Looking into the bio_copy_kern() breakage next..

--nab

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19 15:58                           ` Mike Christie
@ 2013-07-19 21:05                             ` Nicholas A. Bellinger
  0 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-19 21:05 UTC (permalink / raw)
  To: Mike Christie
  Cc: Alexander Gordeev, Tejun Heo, linux-kernel, linux-ide,
	Jeff Garzik, Jens Axboe, linux-scsi

On Fri, 2013-07-19 at 09:58 -0600, Mike Christie wrote:
> On 07/18/2013 06:23 PM, Nicholas A. Bellinger wrote:
> >> Just saw this while trying out iscsi with the scsi-mq stuff :)
> >> > 
> > Took a stab at this a while back, but ended up getting distracted on other
> > items.  Do you have an initial conversion running yet..?
> 
> Not running well :) Have a patch but I am debugging it now. I messed up
> something with the cmd_size stuff.

<nod>, looking forward to seeing ib_iser move beyond the existing
scsi_request_fn() bottleneck.   ;)

Feel free to send me the WIP off-list if you'd like another pair of
eyes.

--nab




^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-19 21:01                                   ` Nicholas A. Bellinger
@ 2013-07-20  4:56                                     ` Nicholas A. Bellinger
  2013-07-20 14:48                                         ` Mike Christie
  2013-07-22 15:03                                       ` Alexander Gordeev
  0 siblings, 2 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-20  4:56 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, Mike Christie, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
> > On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
> > > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> > > index 0101af5..191bc15 100644
> > > --- a/drivers/ata/libata-scsi.c
> > > +++ b/drivers/ata/libata-scsi.c
> > > @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
> > >                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
> > >                         sdev->sector_size);
> > >  
> > > -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > > +       if (!q->mq_ops) {
> > > +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > > +       } else {
> > > +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
> > > +       }
> > 
> > Amazingly enough there is a reason for the dma alignment, and it wasn't
> > just to annoy you, so you can't blindly do this.
> > 
> > The email thread is probably lost in the mists of time, but if I
> > remember correctly the problem is that some ahci DMA controllers barf if
> > the sector they're doing DMA on crosses a page boundary.  Some are
> > annoying enough to actually cause silent data corruption.  You won't
> > find every ahci DMA controller doing this, so the change will work for
> > some, but it will be hard to identify those it won't work for until
> > people start losing data.
> 
> Thanks for the extra background.
> 
> So at least from what I gather thus far this shouldn't be an issue for
> initial testing with scsi-mq <-> libata w/ ata_piix.
> 
> > 
> > The correct fix, obviously, is to do the bio copy on the kernel path for
> > unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
> > aligned (because of the block to page alignment).
> > 
> 
> Indeed.  Looking into the bio_copy_kern() breakage next..
> 

OK, after further investigation the root cause is actually a missing
bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
currently using.

Including the following patch into the scsi-mq working branch now, and
reverting the libata dma_alignment=0x03 hack.

Alexander, care to give this a try..?

--nab

diff --git a/block/blk-exec.c b/block/blk-exec.c
index 0761c89..70303d2 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -25,7 +25,10 @@ static void blk_end_sync_rq(struct request *rq, int error)
        struct completion *waiting = rq->end_io_data;
 
        rq->end_io_data = NULL;
-       if (!rq->q->mq_ops) {
+       if (rq->q->mq_ops) {
+               if (rq->bio)
+                       bio_endio(rq->bio, error);
+       } else {
                __blk_put_request(rq->q, rq);
        }
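
To make the failure mode concrete, a minimal userspace model of the missing bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() chain might look like the following. Every toy_* name is hypothetical, and the real bio_copy_kern_endio() does more than drop a reference; this only shows why a sync end_io callback that skips the bio completion leaves the reference held.

```c
#include <assert.h>
#include <stddef.h>

struct toy_bio {
	int refcnt;                             /* 1 while the submitter's reference is held */
	void (*end_io)(struct toy_bio *, int);
	int error;
};

struct toy_request {
	struct toy_bio *bio;
	int done;                               /* stands in for complete(waiting) */
};

static void toy_bio_put(struct toy_bio *bio)
{
	bio->refcnt--;
}

/* Plays the role of bio_copy_kern_endio(): record status, drop the reference. */
static void toy_copy_kern_endio(struct toy_bio *bio, int error)
{
	bio->error = error;
	toy_bio_put(bio);
}

static void toy_bio_endio(struct toy_bio *bio, int error)
{
	if (bio->end_io)
		bio->end_io(bio, error);
}

/* Broken variant: wakes the waiter but never completes the bio. */
static void toy_end_sync_rq_broken(struct toy_request *rq, int error)
{
	(void)error;
	rq->done = 1;
}

/* Patched variant: complete the bio first, then wake the waiter. */
static void toy_end_sync_rq_fixed(struct toy_request *rq, int error)
{
	if (rq->bio)
		toy_bio_endio(rq->bio, error);
	rq->done = 1;
}
```

In the broken variant the waiter is woken with refcnt still at 1, so the bounce buffer is never released; in the patched variant it drops to 0.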


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20  4:56                                     ` Nicholas A. Bellinger
@ 2013-07-20 14:48                                         ` Mike Christie
  2013-07-22 15:03                                       ` Alexander Gordeev
  1 sibling, 0 replies; 75+ messages in thread
From: Mike Christie @ 2013-07-20 14:48 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: James Bottomley, Jens Axboe, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 4062 bytes --]

On 07/19/2013 11:56 PM, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
>> On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
>>> On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
>>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>>> index 0101af5..191bc15 100644
>>>> --- a/drivers/ata/libata-scsi.c
>>>> +++ b/drivers/ata/libata-scsi.c
>>>> @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>>>>                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
>>>>                         sdev->sector_size);
>>>>  
>>>> -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +       if (!q->mq_ops) {
>>>> +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +       } else {
>>>> +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
>>>> +       }
>>>
>>> Amazingly enough there is a reason for the dma alignment, and it wasn't
>>> just to annoy you, so you can't blindly do this.
>>>
>>> The email thread is probably lost in the mists of time, but if I
>>> remember correctly the problem is that some ahci DMA controllers barf if
>>> the sector they're doing DMA on crosses a page boundary.  Some are
>>> annoying enough to actually cause silent data corruption.  You won't
>>> find every ahci DMA controller doing this, so the change will work for
>>> some, but it will be hard to identify those it won't work for until
>>> people start losing data.
>>
>> Thanks for the extra background.
>>
>> So at least from what I gather thus far this shouldn't be an issue for
>> initial testing with scsi-mq <-> libata w/ ata_piix.
>>
>>>
>>> The correct fix, obviously, is to do the bio copy on the kernel path for
>>> unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
>>> aligned (because of the block to page alignment).
>>>
>>
>> Indeed.  Looking into the bio_copy_kern() breakage next..
>>
> 
> OK, after further investigation the root cause is actually a missing
> bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> currently using.
> 
> Including the following patch into the scsi-mq working branch now, and
> reverting the libata dma_alignment=0x03 hack.
> 
> Alexander, care to give this a try..?
> 
> --nab
> 
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index 0761c89..70303d2 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -25,7 +25,10 @@ static void blk_end_sync_rq(struct request *rq, int error)
>         struct completion *waiting = rq->end_io_data;
>  
>         rq->end_io_data = NULL;
> -       if (!rq->q->mq_ops) {
> +       if (rq->q->mq_ops) {
> +               if (rq->bio)
> +                       bio_endio(rq->bio, error);
> +       } else {
>                 __blk_put_request(rq->q, rq);
>         }
> 


This does not handle requests with multiple bios, and for the mq style
passthrough insertion completions you actually want to call
blk_mq_finish_request in scsi_execute. Same for all the other
passthrough code in your scsi mq tree. That is your root bug. Instead of
doing that though I think we want to have the block layer free the bios
like before.
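
The multiple-bio point can be sketched in userspace as follows. This is a simplified model with hypothetical names (chain_bio, complete_head_only, complete_all), assuming only that a request's bios form a singly linked chain: completing rq->bio once touches the head and nothing else, while a walk over the chain reaches every bio.

```c
#include <assert.h>
#include <stddef.h>

struct chain_bio {
	struct chain_bio *next;  /* plays the role of bi_next */
	int completed;
};

/* Completes only the first bio, like a single bio_endio(rq->bio, error). */
static int complete_head_only(struct chain_bio *head)
{
	if (!head)
		return 0;
	head->completed = 1;
	return 1;
}

/*
 * Walks the whole chain. The next pointer is saved before the bio is
 * completed, because in real code completion may free the bio.
 */
static int complete_all(struct chain_bio *head)
{
	int n = 0;

	while (head) {
		struct chain_bio *next = head->next;

		head->next = NULL;
		head->completed = 1;
		head = next;
		n++;
	}
	return n;
}
```

For a three-bio request, complete_head_only() leaves two bios pending, whereas complete_all() returns 3.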

For the non mq calls, blk_end_request type of calls will complete the
bios when blk_finish_request is called from that path. It will then call
the rq end_io callback.

I think the blk mq code assumes that if the end_io callback is set, the
end_io function will do the bio cleanup. See __blk_mq_end_io. Also see
how blk_mq_execute_rq calls blk_mq_finish_request for an example of how
rq passthrough execution and cleanup is being done in the mq paths.

Now with the scsi mq changes, when blk_execute_rq_nowait calls
blk_mq_insert_request it calls it with an old non mq style of end io
function that does not complete the bios.

What about the attached patch? It is only compile tested. It makes the mq
block code work like the non mq code for bio cleanups.
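
The change in completion ordering that the attached patch makes can be modelled in userspace like this. It is a sketch under simplifying assumptions: all toy_* names are hypothetical, and "finishing the bios" and "freeing the request" are reduced to flags. It contrasts the old rule (a set ->end_io owns the rest of the completion) with the new one (the block layer always finishes the bios, and ->end_io only replaces the final free).

```c
#include <assert.h>
#include <stddef.h>

struct toy_rq {
	void (*end_io)(struct toy_rq *, int);
	int bios_completed;
	int freed;
	int end_io_calls;
};

static void toy_finish_bios(struct toy_rq *rq)
{
	rq->bios_completed = 1;
}

static void toy_free(struct toy_rq *rq)
{
	rq->freed = 1;
}

/* An old-style callback that only signals the waiter. */
static void toy_legacy_end_io(struct toy_rq *rq, int error)
{
	(void)error;
	rq->end_io_calls++;
}

/* Before: a set ->end_io is responsible for everything, bios included. */
static void toy_complete_before(struct toy_rq *rq, int error)
{
	if (rq->end_io) {
		rq->end_io(rq, error);
	} else {
		toy_finish_bios(rq);
		toy_free(rq);
	}
}

/* After: bios are always finished first; ->end_io only replaces the free. */
static void toy_complete_after(struct toy_rq *rq, int error)
{
	toy_finish_bios(rq);
	if (rq->end_io)
		rq->end_io(rq, error);
	else
		toy_free(rq);
}
```

With a legacy callback installed, toy_complete_before() never finishes the bios, while toy_complete_after() does and leaves freeing the request to the callback's owner.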



[-- Attachment #2: blk-mq-free-bio.patch --]
[-- Type: text/plain, Size: 2513 bytes --]

blk-mq: blk-mq should free bios in pass through case

For non mq calls, the block layer will free the bios when
blk_finish_request is called.

For mq calls, the blk mq code wants the caller to do this.

This patch has the blk mq code work like the non mq code
and has the block layer free the bios.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>

diff --git a/block/blk-flush.c b/block/blk-flush.c
index c56c37d..3e4cc9c 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -231,7 +231,7 @@ static void flush_end_io(struct request *flush_rq, int error)
 	unsigned long flags = 0;
 
 	if (q->mq_ops) {
-		blk_mq_finish_request(flush_rq, error);
+		blk_mq_free_request(flush_rq);
 		spin_lock_irqsave(&q->mq_flush_lock, flags);
 	}
 	running = &q->flush_queue[q->flush_running_idx];
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 799d305..5489b5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -270,7 +270,7 @@ void blk_mq_free_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_free_request);
 
-void blk_mq_finish_request(struct request *rq, int error)
+static void blk_mq_finish_request(struct request *rq, int error)
 {
 	struct bio *bio = rq->bio;
 	unsigned int bytes = 0;
@@ -286,22 +286,17 @@ void blk_mq_finish_request(struct request *rq, int error)
 
 	blk_account_io_completion(rq, bytes);
 	blk_account_io_done(rq);
-	blk_mq_free_request(rq);
 }
 
 void blk_mq_complete_request(struct request *rq, int error)
 {
 	trace_block_rq_complete(rq->q, rq);
+	blk_mq_finish_request(rq, error);
 
-	/*
-	 * If ->end_io is set, it's responsible for doing the rest of the
-	 * completion.
-	 */
 	if (rq->end_io)
 		rq->end_io(rq, error);
 	else
-		blk_mq_finish_request(rq, error);
-
+		blk_mq_free_request(rq);
 }
 
 void __blk_mq_end_io(struct request *rq, int error)
@@ -973,8 +968,7 @@ int blk_mq_execute_rq(struct request_queue *q, struct request *rq)
 	if (rq->errors)
 		err = -EIO;
 
-	blk_mq_finish_request(rq, rq->errors);
-
+	blk_mq_free_request(rq);
 	return err;
 }
 EXPORT_SYMBOL(blk_mq_execute_rq);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 42d0110..52bf1f9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -27,7 +27,6 @@ void blk_mq_complete_request(struct request *rq, int error);
 void blk_mq_run_request(struct request *rq, bool run_queue, bool async);
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_init_flush(struct request_queue *q);
-void blk_mq_finish_request(struct request *rq, int error);
 
 /*
  * CPU hotplug helpers

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20 14:48                                         ` Mike Christie
  (?)
@ 2013-07-20 22:14                                         ` Nicholas A. Bellinger
  -1 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-20 22:14 UTC (permalink / raw)
  To: Mike Christie
  Cc: James Bottomley, Jens Axboe, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Sat, 2013-07-20 at 09:48 -0500, Mike Christie wrote:
> On 07/19/2013 11:56 PM, Nicholas A. Bellinger wrote:
> > On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
> >> On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:

<SNIP>

> >>
> >> Indeed.  Looking into the bio_copy_kern() breakage next..
> >>
> > 
> > OK, after further investigation the root cause is actually a missing
> > bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> > blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> > currently using.
> > 
> > Including the following patch into the scsi-mq working branch now, and
> > reverting the libata dma_alignment=0x03 hack.
> > 
> > Alexander, care to give this a try..?
> > 
> > --nab
> > 
> > diff --git a/block/blk-exec.c b/block/blk-exec.c
> > index 0761c89..70303d2 100644
> > --- a/block/blk-exec.c
> > +++ b/block/blk-exec.c
> > @@ -25,7 +25,10 @@ static void blk_end_sync_rq(struct request *rq, int error)
> >         struct completion *waiting = rq->end_io_data;
> >  
> >         rq->end_io_data = NULL;
> > -       if (!rq->q->mq_ops) {
> > +       if (rq->q->mq_ops) {
> > +               if (rq->bio)
> > +                       bio_endio(rq->bio, error);
> > +       } else {
> >                 __blk_put_request(rq->q, rq);
> >         }
> > 
> 
> 
> This does not handle requests with multiple bios, and for the mq style
> passthrough insertion completions you actually want to call
> blk_mq_finish_request in scsi_execute. Same for all the other
> passthrough code in your scsi mq tree. That is your root bug. Instead of
> doing that though I think we want to have the block layer free the bios
> like before.
> 
> For the non mq calls, blk_end_request type of calls will complete the
> bios when blk_finish_request is called from that path. It will then call
> the rq end_io callback.
> 
> I think the blk mq code assumes that if the end_io callback is set, the
> end_io function will do the bio cleanup. See __blk_mq_end_io. Also see
> how blk_mq_execute_rq calls blk_mq_finish_request for an example of how
> rq passthrough execution and cleanup is being done in the mq paths.
> 
> Now with the scsi mq changes, when blk_execute_rq_nowait calls
> blk_mq_insert_request it calls it with an old non mq style of end io
> function that does not complete the bios.
> 
> What about the attached only compile tested patch. The patch has the mq
> block code work like the non mq code for bio cleanups.
> 
> 

OK, so with your blk-mq patch in place to always call
blk_mq_finish_request() in blk_mq_complete_request() regardless of
rq->end_io, the preceding scsi_mq_end_request() can now simply call
blk_mq_end_io() for both BLOCK_PC and FS request types.

diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
index 81b2633..f1d4789 100644
--- a/drivers/scsi/scsi-mq.c
+++ b/drivers/scsi/scsi-mq.c
@@ -93,19 +93,7 @@ struct scsi_cmnd *scsi_mq_end_request(struct scsi_cmnd *sc, int error,
 #endif
 
 //FIXME: Add proper blk_mq_end_io residual bytes + requeue
-       if (rq->end_io) {
-#if 0
-               printk("scsi_mq_end_request: Calling rq->end_io BLOCK_PC for"
-                       " CDB: 0x%02x\n", sc->cmnd[0]);
-#endif
-               rq->end_io(rq, error);
-       } else {
-#if 0
-               printk("scsi_mq_end_request: Calling blk_mq_end_io for CDB: 0x%02x\n",
-                               sc->cmnd[0]);
-#endif
-               blk_mq_end_io(rq, error);
-       }
+       blk_mq_end_io(rq, error);
 //FIXME: Need to do equiv of scsi_next_command to kick hctx..?
 
        return NULL;

Thanks for fixing up that bit of ugliness.  ;)

Jens, care to review+include Mike's change into your working branch..?

--nab


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20 14:48                                         ` Mike Christie
  (?)
  (?)
@ 2013-07-20 23:57                                         ` Jens Axboe
  -1 siblings, 0 replies; 75+ messages in thread
From: Jens Axboe @ 2013-07-20 23:57 UTC (permalink / raw)
  To: Mike Christie
  Cc: Nicholas A. Bellinger, James Bottomley, Alexander Gordeev,
	Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Sat, Jul 20 2013, Mike Christie wrote:
> blk-mq: blk-mq should free bios in pass through case
> 
> For non mq calls, the block layer will free the bios when
> blk_finish_request is called.
> 
> For mq calls, the blk mq code wants the caller to do this.
> 
> This patch has the blk mq code work like the non mq code
> and has the block layer free the bios.

Thanks Mike, looks good, applied.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20  4:56                                     ` Nicholas A. Bellinger
  2013-07-20 14:48                                         ` Mike Christie
@ 2013-07-22 15:03                                       ` Alexander Gordeev
  2013-07-22 21:10                                         ` Nicholas A. Bellinger
  1 sibling, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-22 15:03 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: James Bottomley, Jens Axboe, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Jul 19, 2013 at 09:56:02PM -0700, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
> OK, after further investigation the root cause is actually a missing
> bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> currently using.

Yes, the missing bio_copy_kern_endio() callback is exactly the reason I
turned to blk_mq_execute_rq() initially. I should have been more
specific on this :|

I will try Mike's and your other change, hopefully soon (sorry,
constantly getting distracted).

> Including the following patch into the scsi-mq working branch now, and
> reverting the libata dma_alignment=0x03 hack.

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-22 15:03                                       ` Alexander Gordeev
@ 2013-07-22 21:10                                         ` Nicholas A. Bellinger
  2013-07-25 10:16                                           ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-22 21:10 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: James Bottomley, Jens Axboe, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Mon, 2013-07-22 at 17:03 +0200, Alexander Gordeev wrote:
> On Fri, Jul 19, 2013 at 09:56:02PM -0700, Nicholas A. Bellinger wrote:
> > On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
> > OK, after further investigation the root cause is actually a missing
> > bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> > blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> > currently using.
> 
> Yes, the missing bio_copy_kern_endio() callback is exactly the reason I
> turned to blk_mq_execute_rq() initially. I should have been more
> specific on this :|
> 
> I will try Mike's and your other change, hopefully soon (sorry,
> constantly getting distracted).
> 

Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
target-pending/scsi-mq, which now has functioning scsi-generic support.

Also, your scsi_times_out patch from earlier has not been included just
yet, but that should be the only extra patch you need to apply in order
to get scsi-mq enabled libata/ata_piix running.

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-22 21:10                                         ` Nicholas A. Bellinger
@ 2013-07-25 10:16                                           ` Alexander Gordeev
  2013-07-25 22:08                                             ` Nicholas A. Bellinger
  2013-07-31 17:11                                             ` Alexander Gordeev
  0 siblings, 2 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-25 10:16 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: James Bottomley, Jens Axboe, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> target-pending/scsi-mq, which now has functioning scsi-generic support.

Survives a boot, a kernel build and the build's result :)

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-25 10:16                                           ` Alexander Gordeev
@ 2013-07-25 22:08                                             ` Nicholas A. Bellinger
  2013-07-26  2:09                                               ` Jens Axboe
  2013-07-31 17:11                                             ` Alexander Gordeev
  1 sibling, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-25 22:08 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: James Bottomley, Jens Axboe, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
> On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > target-pending/scsi-mq, which now has functioning scsi-generic support.
> 
> Survives a boot, a kernel build and the build's result :)

Great.  Thanks for the feedback Alexander!

So the next step on my end is to enable -mq for ahci, and verify initial
correctness using QEMU/KVM hardware emulation.

Btw, I've been looking at enabling the SHT->cmd_size for struct
ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
are already all pre-allocated by libata and obtained via ata_qc_new() ->
__ata_qc_from_tag() during ata_scsi_queuecmd().

So that said, with the struct request + struct scsi_cmnd pre-allocations
already provided by blk-mq -> scsi-mq code, all memory allocations
should have already been eliminated from I/O fast path.
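
As a rough picture of why the ata_qc_new() -> __ata_qc_from_tag() path needs no allocation, here is a userspace sketch of a tag-indexed pool. The toy_* names, the queue depth of 32, and the bitmap bookkeeping are assumptions made for the example and are not taken from libata.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_QUEUE 32u               /* assumed queue depth for the example */

struct toy_qc {
	unsigned int tag;
	int active;
};

struct toy_port {
	struct toy_qc qcmd[TOY_MAX_QUEUE];  /* allocated once, together with the port */
	unsigned long in_use;               /* one bit per tag */
};

/* Lookup by tag is plain array indexing. */
static struct toy_qc *toy_qc_from_tag(struct toy_port *ap, unsigned int tag)
{
	return tag < TOY_MAX_QUEUE ? &ap->qcmd[tag] : NULL;
}

/* Claims the first free tag; returns NULL when every tag is busy. */
static struct toy_qc *toy_qc_new(struct toy_port *ap)
{
	for (unsigned int tag = 0; tag < TOY_MAX_QUEUE; tag++) {
		if (!(ap->in_use & (1UL << tag))) {
			struct toy_qc *qc = toy_qc_from_tag(ap, tag);

			ap->in_use |= 1UL << tag;
			qc->tag = tag;
			qc->active = 1;
			return qc;
		}
	}
	return NULL;
}

/* Releasing a command only clears its bit; the storage stays in place. */
static void toy_qc_free(struct toy_port *ap, struct toy_qc *qc)
{
	qc->active = 0;
	ap->in_use &= ~(1UL << qc->tag);
}
```

Obtaining a command is a bit search plus an array index, so nothing on this path calls into the allocator.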

--nab

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-25 22:08                                             ` Nicholas A. Bellinger
@ 2013-07-26  2:09                                               ` Jens Axboe
  2013-07-26 21:14                                                 ` Nicholas A. Bellinger
  2013-07-29  7:28                                                 ` Hannes Reinecke
  0 siblings, 2 replies; 75+ messages in thread
From: Jens Axboe @ 2013-07-26  2:09 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Alexander Gordeev, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, Jul 25 2013, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
> > On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > > target-pending/scsi-mq, which now has functioning scsi-generic support.
> > 
> > Survives a boot, a kernel build and the build's result :)
> 
> Great.  Thanks for the feedback Alexander!
> 
> So the next step on my end is to enable -mq for ahci, and verify initial
> correctness using QEMU/KVM hardware emulation.
> 
> Btw, I've been looking at enabling the SHT->cmd_size for struct
> ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
> are already all pre-allocated by libata and obtained via ata_qc_new() ->
> __ata_qc_from_tag() during ata_scsi_queuecmd().

Might still not be a bad idea to do it:

- Cleans up a driver, getting rid of the need to alloc, maintain, and
  free those structures.

- Should be some cache locality benefits to having it all sequential.

> So that said, with the struct request + struct scsi_cmnd pre-allocations
> already provided by blk-mq -> scsi-mq code, all memory allocations
> should have already been eliminated from I/O fast path.

Nice!

-- 
Jens Axboe



* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-26  2:09                                               ` Jens Axboe
@ 2013-07-26 21:14                                                 ` Nicholas A. Bellinger
  2013-07-27  0:43                                                   ` Nicholas A. Bellinger
  2013-07-29 11:46                                                   ` Tejun Heo
  2013-07-29  7:28                                                 ` Hannes Reinecke
  1 sibling, 2 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-26 21:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Alexander Gordeev, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-07-25 at 20:09 -0600, Jens Axboe wrote:
> On Thu, Jul 25 2013, Nicholas A. Bellinger wrote:
> > On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
> > > On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > > > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > > > target-pending/scsi-mq, which now has functioning scsi-generic support.
> > > 
> > > Survives a boot, a kernel build and the build's result :)
> > 
> > Great.  Thanks for the feedback Alexander!
> > 
> > So the next step on my end is to enable -mq for ahci, and verify initial
> > correctness using QEMU/KVM hardware emulation.
> > 
> > Btw, I've been looking at enabling the SHT->cmd_size for struct
> > ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
> > are already all pre-allocated by libata and obtained via ata_qc_new() ->
> > __ata_qc_from_tag() during ata_scsi_queuecmd().
> 
> Might still not be a bad idea to do it:
> 
> - Cleans up a driver, getting rid of the need to alloc, maintain, and
>   free those structures.
> 
> - Should be some cache locality benefits to having it all sequential.
> 

Looking at this some more, there are a number of locations outside of
the main blk_mq_ops->queue_rq() -> SHT->queuecommand_mq() dispatch that
use *ata_qc_from_tag() to obtain *ata_queued_cmd, and a few without an
associated struct scsi_cmnd like libata-core.c:ata_exec_internal_sg()
for example..

So I don't think (completely) getting rid of ata_port->qcmd[] will be
possible, and just converting the ata_scsi_queuecmd() path to use the
extra SHT->cmd_size pre-allocation for *ata_queued_cmd might end up
being more trouble than it's worth.  Still undecided on that part..

Tejun, do you have any thoughts + input here..?

--nab


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-26 21:14                                                 ` Nicholas A. Bellinger
@ 2013-07-27  0:43                                                   ` Nicholas A. Bellinger
  2013-07-29 11:18                                                     ` Alexander Gordeev
  2013-07-29 11:50                                                     ` Tejun Heo
  2013-07-29 11:46                                                   ` Tejun Heo
  1 sibling, 2 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-27  0:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Alexander Gordeev, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, 2013-07-26 at 14:14 -0700, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-25 at 20:09 -0600, Jens Axboe wrote:
> > On Thu, Jul 25 2013, Nicholas A. Bellinger wrote:
> > > On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
> > > > On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > > > > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > > > > target-pending/scsi-mq, which now has functioning scsi-generic support.
> > > > 
> > > > Survives a boot, a kernel build and the build's result :)
> > > 
> > > Great.  Thanks for the feedback Alexander!
> > > 
> > > So the next step on my end is to enable -mq for ahci, and verify initial
> > > correctness using QEMU/KVM hardware emulation.
> > > 
> > > Btw, I've been looking at enabling the SHT->cmd_size for struct
> > > ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
> > > are already all pre-allocated by libata and obtained via ata_qc_new() ->
> > > __ata_qc_from_tag() during ata_scsi_queuecmd().
> > 
> > Might still not be a bad idea to do it:
> > 
> > - Cleans up a driver, getting rid of the need to alloc, maintain, and
> >   free those structures.
> > 
> > - Should be some cache locality benefits to having it all sequential.
> > 
> 
> Looking at this some more, there are a number of locations outside of
> the main blk_mq_ops->queue_rq() -> SHT->queuecommand_mq() dispatch that
> use *ata_qc_from_tag() to obtain *ata_queued_cmd, and a few without an
> associated struct scsi_cmnd like libata-core.c:ata_exec_internal_sg()
> for example..
> 
> So I don't think (completely) getting rid of ata_port->qcmd[] will be
> possible, and just converting the ata_scsi_queuecmd() path to use the
> extra SHT->cmd_size pre-allocation for *ata_queued_cmd might end up
> being more trouble than it's worth.  Still undecided on that part..
> 
> Tejun, do you have any thoughts + input here..?
> 

OK, so I decided to give this a shot anyways..  Here is a quick
conversion for libata + AHCI to use blk-mq -> scsi-mq pre-allocation for
ata_queued_cmd descriptors:

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 2b50dfd..61b3db8 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -92,6 +92,9 @@ static int ahci_pci_device_resume(struct pci_dev *pdev);
 
 static struct scsi_host_template ahci_sht = {
        AHCI_SHT("ahci"),
+       .scsi_mq = true,
+       .cmd_size = sizeof(struct ata_queued_cmd),
+       .queuecommand_mq = ata_scsi_queuecmd,
 };
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index f218427..e21814d 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4725,29 +4725,25 @@ void swap_buf_le16(u16 *buf, unsigned int buf_words)
 /**
  *     ata_qc_new - Request an available ATA command, for queueing
  *     @ap: target port
+ *     @sc: incoming scsi_cmnd descriptor
  *
  *     LOCKING:
  *     None.
  */
 
-static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
+static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap,
+                                        struct scsi_cmnd *sc)
 {
        struct ata_queued_cmd *qc = NULL;
-       unsigned int i;
+       struct request *rq = sc->request;
 
        /* no command while frozen */
        if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
                return NULL;
 
-       /* the last tag is reserved for internal command. */
-       for (i = 0; i < ATA_MAX_QUEUE - 1; i++)
-               if (!test_and_set_bit(i, &ap->qc_allocated)) {
-                       qc = __ata_qc_from_tag(ap, i);
-                       break;
-               }
-
-       if (qc)
-               qc->tag = i;
+       qc = (struct ata_queued_cmd *)sc->SCp.ptr;
+       qc->scsicmd = sc;
+       qc->tag = rq->tag;
 
        return qc;
 }
@@ -4755,19 +4751,20 @@ static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
 /**
  *     ata_qc_new_init - Request an available ATA command, and initialize it
  *     @dev: Device from whom we request an available command structure
+ *     @sc: incoming scsi_cmnd descriptor
  *
  *     LOCKING:
  *     None.
  */
 
-struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev)
+struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev,
+                                      struct scsi_cmnd *sc)
 {
        struct ata_port *ap = dev->link->ap;
        struct ata_queued_cmd *qc;
 
-       qc = ata_qc_new(ap);
+       qc = ata_qc_new(ap, sc);
        if (qc) {
-               qc->scsicmd = NULL;
                qc->ap = ap;
                qc->dev = dev;
 
@@ -4797,10 +4794,9 @@ void ata_qc_free(struct ata_queued_cmd *qc)
 
        qc->flags = 0;
        tag = qc->tag;
-       if (likely(ata_tag_valid(tag))) {
+
+       if (likely(ata_tag_valid(tag)))
                qc->tag = ATA_TAG_POISON;
-               clear_bit(tag, &ap->qc_allocated);
-       }
 }
 
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0101af5..e5ab880 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -742,9 +742,8 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
 {
        struct ata_queued_cmd *qc;
 
-       qc = ata_qc_new_init(dev);
+       qc = ata_qc_new_init(dev, cmd);
        if (qc) {
-               qc->scsicmd = cmd;
                qc->scsidone = cmd->scsi_done;
 
                qc->sg = scsi_sglist(cmd);
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index c949dd3..4cd88af 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -63,7 +63,8 @@ extern struct ata_link *ata_dev_phys_link(struct ata_device *dev);
 extern void ata_force_cbl(struct ata_port *ap);
 extern u64 ata_tf_to_lba(const struct ata_taskfile *tf);
 extern u64 ata_tf_to_lba48(const struct ata_taskfile *tf);
-extern struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev);
+extern struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev,
+                                             struct scsi_cmnd *sc);
 extern int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
                           u64 block, u32 n_block, unsigned int tf_flags,
                           unsigned int tag);
diff --git a/include/linux/libata.h b/include/linux/libata.h
index eae7a05..52e9e9e 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -35,9 +35,13 @@
 #include <linux/ata.h>
 #include <linux/workqueue.h>
 #include <scsi/scsi_host.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_cmnd.h>
 #include <linux/acpi.h>
 #include <linux/cdrom.h>
 #include <linux/sched.h>
+#include <linux/blk-mq.h>
+#include <../../block/blk-mq.h>
 
 /*
  * Define if arch has non-standard setup.  This is a _PCI_ standard
@@ -1500,9 +1504,26 @@ static inline void ata_qc_set_polling(struct ata_queued_cmd *qc)
 static inline struct ata_queued_cmd *__ata_qc_from_tag(struct ata_port *ap,
                                                       unsigned int tag)
 {
-       if (likely(ata_tag_valid(tag)))
-               return &ap->qcmd[tag];
-       return NULL;
+       struct scsi_device *sdev = ap->link.device[0].sdev;
+       struct request_queue *q = sdev->request_queue;
+
+       if (unlikely(!ata_tag_valid(tag)))
+               return NULL;
+
+       if (likely(sdev->host->hostt->scsi_mq)) {
+               struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
+               struct blk_mq_hw_ctx *hctx = q->mq_ops->map_queue(q, ctx->cpu);
+               struct request *rq;
+               struct ata_queued_cmd *qc;
+
+               BUG_ON(tag >= hctx->queue_depth);
+
+               rq = hctx->rqs[tag];
+               qc = blk_mq_rq_to_pdu(rq) + sizeof(struct scsi_cmnd);
+               blk_mq_put_ctx(ctx);
+               return qc;
+       }
+       return &ap->qcmd[tag];
 }

The thing that I'm hung up on now for existing __ata_qc_from_tag() usage
outside of the main blk_mq_ops->queue_rq -> SHT->queuecommand_mq()
dispatch path, is how to actually locate the underlying scsi_device ->
request_queue -> blk_mq_ctx -> blk_mq_hw_ctx from the passed
ata_port..?

Considering there can be more than a single ata_device hanging off each
ata_port, the '*sdev = ap->link.device[0].sdev' in __ata_qc_from_tag()
is definitely bogus, but I'm not sure how else to correlate
blk-mq/scsi-mq per device descriptors to existing code expecting
ata_port->qcmd[] descriptors to be shared across multiple devices..
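
For anyone following along, here is a rough userspace model of the
per-request layout that the blk_mq_rq_to_pdu(rq) + sizeof(struct
scsi_cmnd) arithmetic in __ata_qc_from_tag() assumes.  It is only a
sketch: struct mq_slot, rq_to_qc() and qc_to_rq() are made-up names, and
the three structs are tiny stand-ins for the real kernel ones.

```c
#include <assert.h>
#include <stddef.h>

/* Tiny stand-ins; the real kernel structures are far larger. */
struct request { int tag; };
struct scsi_cmnd { struct request *request; };
struct ata_queued_cmd { unsigned int tag; struct scsi_cmnd *scsicmd; };

/*
 * Hypothetical model of one pre-allocated slot: the request comes
 * first, then the scsi_cmnd, then the cmd_size bytes of driver payload
 * (here an ata_queued_cmd).
 */
struct mq_slot {
	struct request rq;
	struct scsi_cmnd sc;
	struct ata_queued_cmd qc;
};

/* Walk forward from the request to the qc embedded in the same slot. */
static struct ata_queued_cmd *rq_to_qc(struct request *rq)
{
	struct mq_slot *slot =
		(struct mq_slot *)((char *)rq - offsetof(struct mq_slot, rq));

	return &slot->qc;
}

/* Walk back from an embedded qc to the request that owns it. */
static struct request *qc_to_rq(struct ata_queued_cmd *qc)
{
	struct mq_slot *slot =
		(struct mq_slot *)((char *)qc - offsetof(struct mq_slot, qc));

	return &slot->rq;
}
```

With this layout, request -> qc is trivial.  The open question is still
which request_queue to start from when all we have is an ata_port.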

Tejun..?

--nab



* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-26  2:09                                               ` Jens Axboe
  2013-07-26 21:14                                                 ` Nicholas A. Bellinger
@ 2013-07-29  7:28                                                 ` Hannes Reinecke
  1 sibling, 0 replies; 75+ messages in thread
From: Hannes Reinecke @ 2013-07-29  7:28 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Nicholas A. Bellinger, Alexander Gordeev, James Bottomley,
	Mike Christie, Tejun Heo, linux-kernel, linux-ide, Jeff Garzik,
	linux-scsi

On 07/26/2013 04:09 AM, Jens Axboe wrote:
> On Thu, Jul 25 2013, Nicholas A. Bellinger wrote:
>> On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
>>> On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
>>>> Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
>>>> target-pending/scsi-mq, which now has functioning scsi-generic support.
>>>
>>> Survives a boot, a kernel build and the build's result :)
>>
>> Great.  Thanks for the feedback Alexander!
>>
>> So the next step on my end is to enable -mq for ahci, and verify initial
>> correctness using QEMU/KVM hardware emulation.
>>
>> Btw, I've been looking at enabling the SHT->cmd_size for struct
>> ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
>> are already all pre-allocated by libata and obtained via ata_qc_new() ->
>> __ata_qc_from_tag() during ata_scsi_queuecmd().
> 
> Might still not be a bad idea to do it:
> 
> - Cleans up a driver, getting rid of the need to alloc, maintain, and
>   free those structures.
> 
> - Should be some cache locality benefits to having it all sequential.
> 
>> So that said, with the struct request + struct scsi_cmnd pre-allocations
>> already provided by blk-mq -> scsi-mq code, all memory allocations
>> should have already been eliminated from I/O fast path.
> 
> Nice!
> 
Hmm.

I'm trying to work out if it would be possible to move multipath
handling over to scsi-mq.
However, when doing so I would need to reconfigure 'nr_hw_queues' on
the fly. Now with all the static cmd preallocation going on this is
going to be tricky.

This leaves me with two choices:
- Tear down the command pool altogether whenever I need to
  reconfigure the device (which is going to be painful)
- Allocate some max nr_hw_queues, and mark the superfluous
  ones as 'unused' or something. Seeing that a sane max nr_hw_queues
  would probably be the number of CPUs, this might end up hogging
  quite some memory.
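
To put a rough number on the second option, a back-of-the-envelope
helper, assuming every hardware queue pre-allocates queue_depth
requests plus the per-command payload.  prealloc_bytes() is a made-up
name and the sizes fed into it are placeholders, not measured values.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimated bytes pinned by static command pre-allocation.
 * Assumption: each hardware queue owns queue_depth slots, and each
 * slot holds one request followed by cmd_size bytes of payload.
 */
static size_t prealloc_bytes(size_t nr_hw_queues, size_t queue_depth,
			     size_t rq_size, size_t cmd_size)
{
	return nr_hw_queues * queue_depth * (rq_size + cmd_size);
}
```

The cost grows linearly with the number of queues, so sizing for a
maximum nr_hw_queues equal to the CPU count multiplies the single-queue
footprint by that count, whether or not the queues are in use.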

Would you accept patches moving the static command allocation over
to pools or is this a desired feature?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-27  0:43                                                   ` Nicholas A. Bellinger
@ 2013-07-29 11:18                                                     ` Alexander Gordeev
  2013-07-29 14:08                                                       ` Jens Axboe
  2013-07-29 19:19                                                       ` Nicholas A. Bellinger
  2013-07-29 11:50                                                     ` Tejun Heo
  1 sibling, 2 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-29 11:18 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Jul 26, 2013 at 05:43:13PM -0700, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-26 at 14:14 -0700, Nicholas A. Bellinger wrote:
> > On Thu, 2013-07-25 at 20:09 -0600, Jens Axboe wrote:
> > > On Thu, Jul 25 2013, Nicholas A. Bellinger wrote:
> > > > On Thu, 2013-07-25 at 12:16 +0200, Alexander Gordeev wrote:
> > > > > On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > > > > > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > > > > > target-pending/scsi-mq, which now has functioning scsi-generic support.
> > > > > 
> > > > > Survives a boot, a kernel build and the build's result :)
> > > > 
> > > > Great.  Thanks for the feedback Alexander!
> > > > 
> > > > So the next step on my end is to enable -mq for ahci, and verify initial
> > > > correctness using QEMU/KVM hardware emulation.
> > > > 
> > > > Btw, I've been looking at enabling the SHT->cmd_size for struct
> > > > ata_queued_cmd descriptor pre-allocation, but AFAICT these descriptors
> > > > are already all pre-allocated by libata and obtained via ata_qc_new() ->
> > > > __ata_qc_from_tag() during ata_scsi_queuecmd().
> > > 
> > > Might still not be a bad idea to do it:
> > > 
> > > - Cleans up a driver, getting rid of the need to alloc, maintain, and
> > >   free those structures.
> > > 
> > > - Should be some cache locality benefits to having it all sequential.
> > > 
> > 
> > Looking at this some more, there are a number of locations outside of
> > the main blk_mq_ops->queue_rq() -> SHT->queuecommand_mq() dispatch that
> > use *ata_qc_from_tag() to obtain *ata_queued_cmd, and a few without an
> > associated struct scsi_cmnd like libata-core.c:ata_exec_internal_sg()
> > for example..
> > 
> > So I don't think (completely) getting rid of ata_port->qcmd[] will be
> > possible, and just converting the ata_scsi_queuecmd() path to use the
> > extra SHT->cmd_size pre-allocation for *ata_queued_cmd might end up
> > being more trouble than it's worth.  Still undecided on that part..
> > 
> > Tejun, do you have any thoughts + input here..?
> > 
> 
> OK, so I decided to give this a shot anyways..  Here is a quick
> conversion for libata + AHCI to use blk-mq -> scsi-mq pre-allocation for
> ata_queued_cmd descriptors:
> 
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index 2b50dfd..61b3db8 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -92,6 +92,9 @@ static int ahci_pci_device_resume(struct pci_dev *pdev);
>  
>  static struct scsi_host_template ahci_sht = {
>         AHCI_SHT("ahci"),
> +       .scsi_mq = true,
> +       .cmd_size = sizeof(struct ata_queued_cmd),
> +       .queuecommand_mq = ata_scsi_queuecmd,
>  };
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index f218427..e21814d 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4725,29 +4725,25 @@ void swap_buf_le16(u16 *buf, unsigned int buf_words)
>  /**
>   *     ata_qc_new - Request an available ATA command, for queueing
>   *     @ap: target port
> + *     @sc: incoming scsi_cmnd descriptor
>   *
>   *     LOCKING:
>   *     None.
>   */
>  
> -static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
> +static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap,
> +                                        struct scsi_cmnd *sc)
>  {
>         struct ata_queued_cmd *qc = NULL;
> -       unsigned int i;
> +       struct request *rq = sc->request;
>  
>         /* no command while frozen */
>         if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
>                 return NULL;
>  
> -       /* the last tag is reserved for internal command. */
> -       for (i = 0; i < ATA_MAX_QUEUE - 1; i++)

blk-mq does not prevent tag ATA_TAG_INTERNAL from being used. Would it make
sense to advertise a queue depth of (ATA_MAX_QUEUE - 1) while always
pointing ATA_TAG_INTERNAL to qcmd (see below)?

> -               if (!test_and_set_bit(i, &ap->qc_allocated)) {

ata_port::qc_allocated becomes redundant.

> -                       qc = __ata_qc_from_tag(ap, i);
> -                       break;
> -               }
> -
> -       if (qc)
> -               qc->tag = i;
> +       qc = (struct ata_queued_cmd *)sc->SCp.ptr;
> +       qc->scsicmd = sc;
> +       qc->tag = rq->tag;
>  
>         return qc;
>  }
> @@ -4755,19 +4751,20 @@ static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
>  /**
>   *     ata_qc_new_init - Request an available ATA command, and initialize it
>   *     @dev: Device from whom we request an available command structure
> + *     @sc: incoming scsi_cmnd descriptor
>   *
>   *     LOCKING:
>   *     None.
>   */
>  
> -struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev)
> +struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev,
> +                                      struct scsi_cmnd *sc)
>  {
>         struct ata_port *ap = dev->link->ap;
>         struct ata_queued_cmd *qc;
>  
> -       qc = ata_qc_new(ap);
> +       qc = ata_qc_new(ap, sc);
>         if (qc) {
> -               qc->scsicmd = NULL;
>                 qc->ap = ap;
>                 qc->dev = dev;
>  
> @@ -4797,10 +4794,9 @@ void ata_qc_free(struct ata_queued_cmd *qc)
>  
>         qc->flags = 0;
>         tag = qc->tag;
> -       if (likely(ata_tag_valid(tag))) {
> +
> +       if (likely(ata_tag_valid(tag)))
>                 qc->tag = ATA_TAG_POISON;
> -               clear_bit(tag, &ap->qc_allocated);
> -       }
>  }
>  
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 0101af5..e5ab880 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -742,9 +742,8 @@ static struct ata_queued_cmd *ata_scsi_qc_new(struct ata_device *dev,
>  {
>         struct ata_queued_cmd *qc;
>  
> -       qc = ata_qc_new_init(dev);
> +       qc = ata_qc_new_init(dev, cmd);
>         if (qc) {
> -               qc->scsicmd = cmd;
>                 qc->scsidone = cmd->scsi_done;
>  
>                 qc->sg = scsi_sglist(cmd);
> diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
> index c949dd3..4cd88af 100644
> --- a/drivers/ata/libata.h
> +++ b/drivers/ata/libata.h
> @@ -63,7 +63,8 @@ extern struct ata_link *ata_dev_phys_link(struct ata_device *dev);
>  extern void ata_force_cbl(struct ata_port *ap);
>  extern u64 ata_tf_to_lba(const struct ata_taskfile *tf);
>  extern u64 ata_tf_to_lba48(const struct ata_taskfile *tf);
> -extern struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev);
> +extern struct ata_queued_cmd *ata_qc_new_init(struct ata_device *dev,
> +                                             struct scsi_cmnd *sc);
>  extern int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
>                            u64 block, u32 n_block, unsigned int tf_flags,
>                            unsigned int tag);
> diff --git a/include/linux/libata.h b/include/linux/libata.h
> index eae7a05..52e9e9e 100644
> --- a/include/linux/libata.h
> +++ b/include/linux/libata.h
> @@ -35,9 +35,13 @@
>  #include <linux/ata.h>
>  #include <linux/workqueue.h>
>  #include <scsi/scsi_host.h>
> +#include <scsi/scsi_device.h>
> +#include <scsi/scsi_cmnd.h>
>  #include <linux/acpi.h>
>  #include <linux/cdrom.h>
>  #include <linux/sched.h>
> +#include <linux/blk-mq.h>
> +#include <../../block/blk-mq.h>
>  
>  /*
>   * Define if arch has non-standard setup.  This is a _PCI_ standard
> @@ -1500,9 +1504,26 @@ static inline void ata_qc_set_polling(struct ata_queued_cmd *qc)
>  static inline struct ata_queued_cmd *__ata_qc_from_tag(struct ata_port *ap,
>                                                        unsigned int tag)
>  {
> -       if (likely(ata_tag_valid(tag)))
> -               return &ap->qcmd[tag];
> -       return NULL;
> +       struct scsi_device *sdev = ap->link.device[0].sdev;
> +       struct request_queue *q = sdev->request_queue;
> +
> +       if (unlikely(!ata_tag_valid(tag)))
> +               return NULL;
> +
> +       if (likely(sdev->host->hostt->scsi_mq)) {

Together with the comment above:

          if (likely(sdev->host->hostt->scsi_mq && (tag != ATA_TAG_INTERNAL))) {
          	...
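
Spelled out as a userspace sketch, the lookup I have in mind would be
roughly the following.  The scsi_mq field on the port and the mq_qc[]
table are invented for illustration only (the real code would have to
find the blk-mq payload some other way); qc_from_tag() is likewise a
made-up name.

```c
#include <assert.h>
#include <stddef.h>

#define ATA_MAX_QUEUE		32
#define ATA_TAG_INTERNAL	(ATA_MAX_QUEUE - 1)

struct ata_queued_cmd { unsigned int tag; };

struct ata_port {
	int scsi_mq;		/* hypothetical: set when scsi-mq is in use */
	/* static libata array, still used for the internal tag */
	struct ata_queued_cmd qcmd[ATA_MAX_QUEUE];
	/* hypothetical: per-tag pointers into the blk-mq payloads */
	struct ata_queued_cmd *mq_qc[ATA_MAX_QUEUE - 1];
};

static int ata_tag_valid(unsigned int tag)
{
	return tag < ATA_MAX_QUEUE;
}

static struct ata_queued_cmd *qc_from_tag(struct ata_port *ap,
					  unsigned int tag)
{
	if (!ata_tag_valid(tag))
		return NULL;

	/* Regular tags come from blk-mq, the internal tag never does. */
	if (ap->scsi_mq && tag != ATA_TAG_INTERNAL)
		return ap->mq_qc[tag];

	return &ap->qcmd[tag];
}
```

This only works if the advertised queue depth is (ATA_MAX_QUEUE - 1),
so that blk-mq never hands out ATA_TAG_INTERNAL itself.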

> +               struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
> +               struct blk_mq_hw_ctx *hctx = q->mq_ops->map_queue(q, ctx->cpu);
> +               struct request *rq;
> +               struct ata_queued_cmd *qc;
> +
> +               BUG_ON(tag >= hctx->queue_depth);
> +
> +               rq = hctx->rqs[tag];
> +               qc = blk_mq_rq_to_pdu(rq) + sizeof(struct scsi_cmnd);
> +               blk_mq_put_ctx(ctx);
> +               return qc;
> +       }
> +       return &ap->qcmd[tag];
>  }

I also tried to make a "quick" conversion and hit the same issue(s) as you.
Generally, I am concerned about these assumptions in such an approach:

1. While the libata concept of tags matches nicely with blk-mq
(blk_mq_hw_ctx::rqs[] vs ata_port::qcmd[]) right now, it is too exposed to
changes in blk-mq in the long run. E.g. ata_link::sactive limits tags to
indices, while tags might become hashes. Easily fixable, but still.

2. Unallocated requests in blk-mq are accessed/analyzed from libata-eh.c as
a result of iterations such as:

        for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
                qc = __ata_qc_from_tag(ap, tag);

                if (!(qc->flags & ATA_QCFLAG_FAILED))
                        continue;

		...
	}

While it is probably okay right now, it is still based on a premise that
blk-mq will not change the contents/concept of "payload", i.e. from embedded
to (re-)allocated memory.
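
To illustrate [2] in isolation, here is a userspace sketch of such a
scan.  count_failed() is a made-up helper, the structs are stripped to
the bare minimum, and the ATA_QCFLAG_FAILED value is only a stand-in.

```c
#include <assert.h>
#include <stddef.h>

#define ATA_MAX_QUEUE		32
#define ATA_QCFLAG_FAILED	(1UL << 16)	/* stand-in value */

struct ata_queued_cmd { unsigned long flags; };
struct ata_port { struct ata_queued_cmd qcmd[ATA_MAX_QUEUE]; };

/*
 * EH-style scan: every tag is visited, allocated or not, so every
 * slot must be readable and must carry sane flags while idle.
 */
static unsigned int count_failed(struct ata_port *ap)
{
	unsigned int tag, nr = 0;

	for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
		struct ata_queued_cmd *qc = &ap->qcmd[tag];

		if (!(qc->flags & ATA_QCFLAG_FAILED))
			continue;
		nr++;
	}
	return nr;
}
```

With a static qcmd[] array this is harmless.  With payloads owned by
blk-mq, the same loop reads memory of requests that were never handed
out, which is exactly the dependency on blk-mq internals I mean.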

> The thing that I'm hung up on now for existing __ata_qc_from_tag() usage
> outside of the main blk_mq_ops->queue_rq -> SHT->queuecommand_mq()
> dispatch path, is how to actually locate the underlying scsi_device ->
> request_queue -> blk_mq_ctx -> blk_mq_hw_ctx from the passed
> ata_port..?

I am actually in favor of getting rid of ata_queued_cmd::tag. Converting
ata_link::sactive to a list, turning ata_link::active_tag into struct
ata_queued_cmd *ata_link::active_qc and converting ata_port::qc_allocated to
a list seems to solve it all, including [2]. I have not checked it, though.

Anyway, if we need a blk-mq tag (why?), we have qc->scsicmd->request->tag.

> Considering there can be more than a single ata_device hanging off each
> ata_port, the '*sdev = ap->link.device[0].sdev' in __ata_qc_from_tag()
> is definitely bogus, but I'm not sure how else to correlate
> blk-mq/scsi-mq per device descriptors to existing code expecting
> ata_port->qcmd[] descriptors to be shared across multiple devices..
> 
> Tejun..?
> 
> --nab
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-26 21:14                                                 ` Nicholas A. Bellinger
  2013-07-27  0:43                                                   ` Nicholas A. Bellinger
@ 2013-07-29 11:46                                                   ` Tejun Heo
  2013-07-29 14:03                                                     ` Jens Axboe
  2013-08-09  8:23                                                     ` Alexander Gordeev
  1 sibling, 2 replies; 75+ messages in thread
From: Tejun Heo @ 2013-07-29 11:46 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Alexander Gordeev, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

Hello,

On Fri, Jul 26, 2013 at 02:14:36PM -0700, Nicholas A. Bellinger wrote:
> So I don't think (completely) getting rid of ata_port->qcmd[] will be
> possible, and just converting the ata_scsi_queuecmd() path to use the
> extra SHT->cmd_size pre-allocation for *ata_queued_cmd might end up
> being more trouble than it's worth.  Still undecided on that part..
> 
> Tejun, do you have any thoughts + input here..?

libata exception handling which includes probing doesn't go through
SCSI at all.  It all works inside libata proper using ata_queued_cmds
and only the result is exposed to SCSI.  Most of those SCSI semantics
need to be emulated anyway, so this makes things a lot easier than
going through SCSI for each command.  As it currently stands, it'd be
a lot of effort to try to embed ata_qc's into higher layer construct.
Given how it's used, I don't think it's a high priority task.

One thing which would probably be worthwhile tho is getting rid of the
bitmap based qc tag allocator in libata.  That one is just borderline
stupid to keep around on any setup which is supposed to be scalable.
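
For reference, the allocator in question boils down to something like
the userspace sketch below.  tag_alloc() and tag_free() are made-up
names, and plain bit operations stand in for the kernel's atomic
test_and_set_bit() / clear_bit(), so this models the logic only, not
the concurrency.

```c
#include <assert.h>

#define ATA_MAX_QUEUE	32

/*
 * Scan one shared word for a free tag.  Returns the tag, or -1 if all
 * regular tags are busy.  The last tag is reserved for the internal
 * command and is never handed out here.
 */
static int tag_alloc(unsigned long *map)
{
	unsigned int i;

	for (i = 0; i < ATA_MAX_QUEUE - 1; i++) {
		unsigned long bit = 1UL << i;

		if (!(*map & bit)) {
			*map |= bit;	/* atomic test-and-set in the kernel */
			return (int)i;
		}
	}
	return -1;
}

static void tag_free(unsigned long *map, unsigned int tag)
{
	*map &= ~(1UL << tag);		/* atomic clear_bit in the kernel */
}
```

Every submitter on every CPU walks and modifies the same word, which is
the part that does not scale.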

Thanks.

-- 
tejun


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-27  0:43                                                   ` Nicholas A. Bellinger
  2013-07-29 11:18                                                     ` Alexander Gordeev
@ 2013-07-29 11:50                                                     ` Tejun Heo
  2013-07-29 19:11                                                       ` Nicholas A. Bellinger
  1 sibling, 1 reply; 75+ messages in thread
From: Tejun Heo @ 2013-07-29 11:50 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Jens Axboe, Alexander Gordeev, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

Yo,

On Fri, Jul 26, 2013 at 05:43:13PM -0700, Nicholas A. Bellinger wrote:
> Considering there can be more than a single ata_device hanging off each
> ata_port, the '*sdev = ap->link.device[0].sdev' in __ata_qc_from_tag()
> is definitely bogus, but I'm not sure how else to correlate
> blk-mq/scsi-mq per device descriptors to existing code expecting
> ata_port->qcmd[] descriptors to be shared across multiple devices..
> 
> Tejun..?

I have no idea.  Let's please just do simpler conversion and worry
about embedding qc's into scsi_cmnds later.  libata isn't a normal
SCSI driver and has its own rather thick midlayer doing the
impedance matching in between && I really don't think there is too much
benefit to be reaped from embedding qc's into scsi_cmnds.

Thanks.

-- 
tejun


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 11:46                                                   ` Tejun Heo
@ 2013-07-29 14:03                                                     ` Jens Axboe
  2013-08-09  8:23                                                     ` Alexander Gordeev
  1 sibling, 0 replies; 75+ messages in thread
From: Jens Axboe @ 2013-07-29 14:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas A. Bellinger, Alexander Gordeev, James Bottomley,
	Mike Christie, linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On 07/29/2013 05:46 AM, Tejun Heo wrote:
> Hello,
> 
> On Fri, Jul 26, 2013 at 02:14:36PM -0700, Nicholas A. Bellinger wrote:
>> So I don't think (completely) getting rid of ata_port->qcmd[] will be
>> possible, and just converting the ata_scsi_queuecmd() path to use the
>> extra SHT->cmd_size pre-allocation for *ata_queued_cmd might end up
>> being more trouble than it's worth.  Still undecided on that part..
>>
>> Tejun, do you have any thoughts + input here..?
> 
> libata exception handling which includes probing doesn't go through
> SCSI at all.  It all works inside libata proper using ata_queued_cmds
> and only the result is exposed to SCSI.  Most of those SCSI semantics
> need to be emulated anyway, so this makes things a lot easier than
> going through SCSI for each command.  As it currently stands, it'd be
> a lot of effort to try to embed ata_qc's into higher layer construct.
> Given how it's used, I don't think it's a high priority task.
> 
> One thing which would probably be worthwhile tho is getting rid of the
> bitmap based qc tag allocator in libata.  That one is just borderline
> stupid to keep around on any setup which is supposed to be scalable.

Your border might be wider than mine :-). Yes, the bitmap should
definitely go.

-- 
Jens Axboe


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 11:18                                                     ` Alexander Gordeev
@ 2013-07-29 14:08                                                       ` Jens Axboe
  2013-07-29 19:19                                                       ` Nicholas A. Bellinger
  1 sibling, 0 replies; 75+ messages in thread
From: Jens Axboe @ 2013-07-29 14:08 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Nicholas A. Bellinger, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On 07/29/2013 05:18 AM, Alexander Gordeev wrote:
>> -static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
>> +static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap,
>> +                                        struct scsi_cmnd *sc)
>>  {
>>         struct ata_queued_cmd *qc = NULL;
>> -       unsigned int i;
>> +       struct request *rq = sc->request;
>>  
>>         /* no command while frozen */
>>         if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
>>                 return NULL;
>>  
>> -       /* the last tag is reserved for internal command. */
>> -       for (i = 0; i < ATA_MAX_QUEUE - 1; i++)
> 
> blk-mq does not prevent tag ATA_TAG_INTERNAL from being using. Would it make
> sense to promote queue depth of length (ATA_MAX_QUEUE - 1) while always
> pointing ATA_TAG_INTERNAL to qcmd (see below)?

blk-mq does support a number of reserved tags; the information just
needs to be passed in appropriately. So there is support for reserving
X number of error handling / emergency tags.
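
To illustrate the idea of reserved tags, here is a minimal userspace
sketch. It is not the blk-mq implementation; tag_pool, pool_get() and
pool_put() are hypothetical names, and the depth of 32 is chosen only
to mirror ATA_MAX_QUEUE. The first nr_reserved tags are handed out
only to callers that ask for a reserved tag, so an error-handling
command can still get one after normal I/O has used up the rest.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_DEPTH 32	/* total tags; assumption mirroring ATA_MAX_QUEUE */
#define NO_TAG     (-1)

/* Hypothetical pool: tags [0, nr_reserved) are reserved. */
struct tag_pool {
	bool busy[POOL_DEPTH];
	int nr_reserved;
};

/* Hand out a free tag from the reserved or the normal range. */
int pool_get(struct tag_pool *p, bool reserved)
{
	int lo = reserved ? 0 : p->nr_reserved;
	int hi = reserved ? p->nr_reserved : POOL_DEPTH;

	for (int tag = lo; tag < hi; tag++) {
		if (!p->busy[tag]) {
			p->busy[tag] = true;
			return tag;
		}
	}
	return NO_TAG;
}

/* Return a tag to the pool; it must be in range and currently busy. */
void pool_put(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < POOL_DEPTH && p->busy[tag]);
	p->busy[tag] = false;
}
```

With nr_reserved = 1, normal allocations can take tags 1..31, and tag 0
stays available to a reserved allocation no matter how many normal
tags are outstanding.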

-- 
Jens Axboe


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 11:50                                                     ` Tejun Heo
@ 2013-07-29 19:11                                                       ` Nicholas A. Bellinger
  0 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-29 19:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Alexander Gordeev, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Mon, 2013-07-29 at 07:50 -0400, Tejun Heo wrote:
> Yo,
> 
> On Fri, Jul 26, 2013 at 05:43:13PM -0700, Nicholas A. Bellinger wrote:
> > Considering there can be more than a single ata_device hanging off each
> > ata_port, the '*sdev = ap->link.device[0].sdev' in __ata_qc_from_tag()
> > is definitely bogus, but I'm not sure how else to correlate
> > blk-mq/scsi-mq per device descriptors to existing code expecting
> > ata_port->qcmd[] descriptors to be shared across multiple devices..
> > 
> > Tejun..?
> 
> I have no idea.  Let's please just do simpler conversion and worry
> about embedding qc's into scsi_cmnds later.  libata isn't a normal
> SCSI driver and has a rather its own thick midlayer doing the
> impedance matching inbetween && I really don't think there is too much
> benefit to be reaped from embedding qc's into scsi_cmnds.
> 

That is essentially the same conclusion that I came to, but wanted to at
least give you a chance to comment here.  ;)

So that said, I'll include a simple conversion for libata into the
scsi-mq WIP branch, and folks who are interested in more detailed
conversions can pursue them as separate items.

--nab 


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 11:18                                                     ` Alexander Gordeev
  2013-07-29 14:08                                                       ` Jens Axboe
@ 2013-07-29 19:19                                                       ` Nicholas A. Bellinger
  2013-07-31  4:16                                                         ` Marc C
  1 sibling, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-07-29 19:19 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Jens Axboe, James Bottomley, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Mon, 2013-07-29 at 13:18 +0200, Alexander Gordeev wrote:
> On Fri, Jul 26, 2013 at 05:43:13PM -0700, Nicholas A. Bellinger wrote:
> > On Fri, 2013-07-26 at 14:14 -0700, Nicholas A. Bellinger wrote:

<SNIP>

> I also tried to make a "quick" conversion and hit the same issue(s) as you.
> Generally, I am concerned with these assumptions in such approach:
> 
> 1. While libata concept of tags matches nicely with blk-mq (blk_mq_hw_ctx::
> rqs[] vs ata_port::qcmd[]) right now, it is too exposed to changes in blk-mq
> in the long run. I.e. ata_link::sactive limits tags to indices, while tags
> might become hashes. Easily fixable, but still.
> 
> 2. Unallocated requests in blk-mq are accessed/analized from libata-eh.c as
> result of such iterations:
> 
>         for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
>                 qc = __ata_qc_from_tag(ap, tag);
> 
>                 if (!(qc->flags & ATA_QCFLAG_FAILED))
>                         continue;
> 
> 		...
> 	}
> 
> While it is probably okay right now, it is still based on a premise that
> blk-mq will not change the contents/concept of "payload", i.e. from embedded
> to (re-)allocated memory.
> 
> > The thing that I'm hung up on now for existing __ata_qc_from_tag() usage
> > outside of the main blk_mq_ops->queue_rq -> SHT->queuecommand_mq()
> > dispatch path, is how to actually locate the underlying scsi_device ->
> > request_queue -> blk_mq_ctx -> blk_mq_hw_hctx from the passed
> > ata_port..?
> 
> I am actually in favor of getting rid of ata_queued_cmd::tag. Converting
> ata_link::sactive to a list, making ata_link::active_tag as struct
> ata_queued_cmd *ata_link::active_qc and converting ata_port::qc_allocated to a
> list seems solves it all, including [2]. Have not checked it though.
> 
> Anyway, if we need a blk-mq tag (why?), we have qc->scsicmd->request->tag.
> 

Hi Alexander,

So given the feedback from Tejun, I'm going to step back for the moment
from a larger conversion, and keep the SHT->cmd_size=0 setting for
libata in the scsi-mq WIP branch.

I'm happy to accept patches to drop the bitmap piece that Tejun
mentioned if you're interested, but at least on my end right now there
are bigger fish to fry for scsi-mq.  ;)

Thanks,

--nab


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 19:19                                                       ` Nicholas A. Bellinger
@ 2013-07-31  4:16                                                         ` Marc C
  2013-07-31 10:23                                                           ` Tejun Heo
  0 siblings, 1 reply; 75+ messages in thread
From: Marc C @ 2013-07-31  4:16 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Alexander Gordeev, Jens Axboe, James Bottomley, Mike Christie,
	Tejun Heo, linux-kernel, linux-ide, Jeff Garzik, linux-scsi

>> One thing which would probably be worthwhile tho is getting rid of the
>> bitmap based qc tag allocator in libata.  That one is just borderline
>> stupid to keep around on any setup which is supposed to be scalable.
> Your border might be wider than mine :-). Yes, the bitmap should
> definitely go.

A naive implementation is obviously less than efficient. However, what
other problems exist with the libata QC tag allocator? I highly doubt
SATA will move beyond 32 queue tags, primarily because it would
be a pain to change SDB FIS (it's likely to break the dozens of AHCI
controller implementations out there). Further, it seems like the
industry stopped caring about SATA and is pushing NVMe for future
offerings.

In any event, most modern systems should have instructions to count
leading zeroes and modify bits atomically.
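
As a rough illustration of how cheap the scan itself can be, a free tag
in a 32-bit map can be found with one count-zeros operation on the
inverted word. This is only a sketch: first_free_tag() is an
illustrative name, and __builtin_ctz() is the GCC/Clang builtin for
counting trailing zeroes, used here in place of a specific CPU
instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the lowest clear bit in a 32-bit allocation map, or -1 if full. */
int first_free_tag(uint32_t allocated)
{
	uint32_t free_bits = ~allocated;

	/* __builtin_ctz() is undefined for 0, so handle the full map first */
	if (free_bits == 0)
		return -1;

	/* index of the lowest set bit in free_bits */
	return __builtin_ctz(free_bits);
}
```

This only finds a candidate; claiming the bit still needs an atomic
update of the shared word.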

-MC

On Mon, Jul 29, 2013 at 12:19 PM, Nicholas A. Bellinger
<nab@linux-iscsi.org> wrote:
> On Mon, 2013-07-29 at 13:18 +0200, Alexander Gordeev wrote:
>> On Fri, Jul 26, 2013 at 05:43:13PM -0700, Nicholas A. Bellinger wrote:
>> > On Fri, 2013-07-26 at 14:14 -0700, Nicholas A. Bellinger wrote:
>
> <SNIP>
>
>> I also tried to make a "quick" conversion and hit the same issue(s) as you.
>> Generally, I am concerned with these assumptions in such approach:
>>
>> 1. While libata concept of tags matches nicely with blk-mq (blk_mq_hw_ctx::
>> rqs[] vs ata_port::qcmd[]) right now, it is too exposed to changes in blk-mq
>> in the long run. I.e. ata_link::sactive limits tags to indices, while tags
>> might become hashes. Easily fixable, but still.
>>
>> 2. Unallocated requests in blk-mq are accessed/analized from libata-eh.c as
>> result of such iterations:
>>
>>         for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
>>                 qc = __ata_qc_from_tag(ap, tag);
>>
>>                 if (!(qc->flags & ATA_QCFLAG_FAILED))
>>                         continue;
>>
>>               ...
>>       }
>>
>> While it is probably okay right now, it is still based on a premise that
>> blk-mq will not change the contents/concept of "payload", i.e. from embedded
>> to (re-)allocated memory.
>>
>> > The thing that I'm hung up on now for existing __ata_qc_from_tag() usage
>> > outside of the main blk_mq_ops->queue_rq -> SHT->queuecommand_mq()
>> > dispatch path, is how to actually locate the underlying scsi_device ->
>> > request_queue -> blk_mq_ctx -> blk_mq_hw_hctx from the passed
>> > ata_port..?
>>
>> I am actually in favor of getting rid of ata_queued_cmd::tag. Converting
>> ata_link::sactive to a list, making ata_link::active_tag as struct
>> ata_queued_cmd *ata_link::active_qc and converting ata_port::qc_allocated to a
>> list seems solves it all, including [2]. Have not checked it though.
>>
>> Anyway, if we need a blk-mq tag (why?), we have qc->scsicmd->request->tag.
>>
>
> Hi Alexander,
>
> So given the feedback from Tejun, I'm going to setup back for the moment
> from a larger conversion, and keep the SHT->cmd_size=0 setting for
> libata in the scsi-mq WIP branch.
>
> I'm happy to accept patches to drop the bitmap piece that Tejun
> mentioned if your interested, but at least on my end right now there are
> bigger fish to fry for scsi-mq.  ;)
>
> Thanks,
>
> --nab

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-31  4:16                                                         ` Marc C
@ 2013-07-31 10:23                                                           ` Tejun Heo
  0 siblings, 0 replies; 75+ messages in thread
From: Tejun Heo @ 2013-07-31 10:23 UTC (permalink / raw)
  To: Marc C
  Cc: Nicholas A. Bellinger, Alexander Gordeev, Jens Axboe,
	James Bottomley, Mike Christie, linux-kernel, linux-ide,
	Jeff Garzik, linux-scsi

Hello,

On Tue, Jul 30, 2013 at 09:16:02PM -0700, Marc C wrote:
> >> One thing which would probably be worthwhile tho is getting rid of the
> >> bitmap based qc tag allocator in libata.  That one is just borderline
> >> stupid to keep around on any setup which is supposed to be scalable.
> > Your border might be wider than mine :-). Yes, the bitmap should
> > definitely go.
>
> A naive implementation is obviously less-than-efficient. However, what
> other problems exist with the libata QC tag allocator? I highly doubt
> SATA will change to beyond 32 queue tags, primarily because it would
> be a pain to change SDB FIS (it's likely to break the dozens of AHCI
> controller implementations out there). Further, it seems like the
> industry stopped caring about SATA and is pushing NVMe for future
> offerings.
> 
> In any event, most modern systems should have instructions to count
> leading zeroes and modify bits atomically.

It's inefficient not because scanning is expensive but because it
makes all CPUs in the system hit the exact same cacheline over and
over again.
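
The point can be seen in a userspace sketch of a bitmap allocator in
the style of the ata_qc_new() loop quoted earlier in the thread. This
is an assumption-laden illustration, not libata code: C11 atomics stand
in for test_and_set_bit()/clear_bit(), and bitmap_tag_alloc() and
bitmap_tag_free() are made-up names. Every allocation and every free is
an atomic read-modify-write on the same word, so all CPUs issuing
commands keep pulling that one cacheline away from each other, however
cheap the scan is.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_QUEUE 32

/* One word shared by every CPU, like ata_port::qc_allocated. */
static atomic_ulong qc_allocated;

/* Returns a free tag in [0, MAX_QUEUE - 1), or -1 if none is free. */
int bitmap_tag_alloc(void)
{
	/* the last tag is left out, as in the libata loop */
	for (int i = 0; i < MAX_QUEUE - 1; i++) {
		unsigned long bit = 1UL << i;

		/* atomic RMW on the shared word: this is the contended step */
		if (!(atomic_fetch_or(&qc_allocated, bit) & bit))
			return i;
	}
	return -1;
}

/* Releases a tag with another atomic RMW on the same shared word. */
void bitmap_tag_free(int tag)
{
	assert(tag >= 0 && tag < MAX_QUEUE);
	atomic_fetch_and(&qc_allocated, ~(1UL << tag));
}
```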

Thanks.

-- 
tejun

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-25 10:16                                           ` Alexander Gordeev
  2013-07-25 22:08                                             ` Nicholas A. Bellinger
@ 2013-07-31 17:11                                             ` Alexander Gordeev
  1 sibling, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-07-31 17:11 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: James Bottomley, Jens Axboe, Mike Christie, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, Jul 25, 2013 at 12:16:41PM +0200, Alexander Gordeev wrote:
> On Mon, Jul 22, 2013 at 02:10:36PM -0700, Nicholas A. Bellinger wrote:
> > Np.  FYI, you'll want to use the latest commit e7827b351 HEAD from
> > target-pending/scsi-mq, which now has functioning scsi-generic support.
> 
> Survives a boot, a kernel build and the build's result :)

Not that rosy. It turned out the old code was being called. It hangs
with this hunk:

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index ac05cd6..a75fd41 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -1103,6 +1103,8 @@ static struct device_attribute *piix_sidpr_shost_attrs[] = {
 static struct scsi_host_template piix_sidpr_sht = {
 	ATA_BMDMA_SHT(DRV_NAME),
 	.shost_attrs		= piix_sidpr_shost_attrs,
+	.scsi_mq		= true,
+	.queuecommand_mq	= ata_scsi_queuecmd,
 };
 
 static struct ata_port_operations piix_sidpr_sata_ops = {


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-29 11:46                                                   ` Tejun Heo
  2013-07-29 14:03                                                     ` Jens Axboe
@ 2013-08-09  8:23                                                     ` Alexander Gordeev
  2013-08-09 14:15                                                       ` Tejun Heo
  2013-08-09 14:24                                                       ` Jens Axboe
  1 sibling, 2 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-09  8:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas A. Bellinger, Jens Axboe, James Bottomley,
	Mike Christie, linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Mon, Jul 29, 2013 at 07:46:53AM -0400, Tejun Heo wrote:
> One thing which would probably be worthwhile tho is getting rid of the
> bitmap based qc tag allocator in libata.  That one is just borderline
> stupid to keep around on any setup which is supposed to be scalable.

Hi Tejun,

How about this approach?

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index f218427..5c2a236 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -72,6 +72,8 @@
 #include "libata.h"
 #include "libata-transport.h"
 
+#include "../../block/blk-mq-tag.h"
+
 /* debounce timing parameters in msecs { interval, duration, timeout } */
 const unsigned long sata_deb_timing_normal[]		= {   5,  100, 2000 };
 const unsigned long sata_deb_timing_hotplug[]		= {  25,  500, 2000 };
@@ -1569,18 +1571,8 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
 
 	/* initialize internal qc */
 
-	/* XXX: Tag 0 is used for drivers with legacy EH as some
-	 * drivers choke if any other tag is given.  This breaks
-	 * ata_tag_internal() test for those drivers.  Don't use new
-	 * EH stuff without converting to it.
-	 */
-	if (ap->ops->error_handler)
-		tag = ATA_TAG_INTERNAL;
-	else
-		tag = 0;
-
-	if (test_and_set_bit(tag, &ap->qc_allocated))
-		BUG();
+	tag = blk_mq_get_tag(ap->qc_tags, GFP_ATOMIC, true);
+	BUG_ON(!ata_tag_internal(tag));
 	qc = __ata_qc_from_tag(ap, tag);
 
 	qc->tag = tag;
@@ -4733,21 +4725,17 @@ void swap_buf_le16(u16 *buf, unsigned int buf_words)
 static struct ata_queued_cmd *ata_qc_new(struct ata_port *ap)
 {
 	struct ata_queued_cmd *qc = NULL;
-	unsigned int i;
+	unsigned int tag;
 
 	/* no command while frozen */
 	if (unlikely(ap->pflags & ATA_PFLAG_FROZEN))
 		return NULL;
 
-	/* the last tag is reserved for internal command. */
-	for (i = 0; i < ATA_MAX_QUEUE - 1; i++)
-		if (!test_and_set_bit(i, &ap->qc_allocated)) {
-			qc = __ata_qc_from_tag(ap, i);
-			break;
-		}
+	tag = blk_mq_get_tag(ap->qc_tags, GFP_ATOMIC, false);
+	qc = __ata_qc_from_tag(ap, tag);
 
 	if (qc)
-		qc->tag = i;
+		qc->tag = tag;
 
 	return qc;
 }
@@ -4799,7 +4787,7 @@ void ata_qc_free(struct ata_queued_cmd *qc)
 	tag = qc->tag;
 	if (likely(ata_tag_valid(tag))) {
 		qc->tag = ATA_TAG_POISON;
-		clear_bit(tag, &ap->qc_allocated);
+		blk_mq_put_tag(ap->qc_tags, tag);
 	}
 }
 
@@ -5639,6 +5627,12 @@ struct ata_port *ata_port_alloc(struct ata_host *host)
 	if (!ap)
 		return NULL;
 
+	ap->qc_tags = blk_mq_init_tags(ATA_MAX_QUEUE, 1, NUMA_NO_NODE);
+	if (!ap->qc_tags) {
+		kfree(ap);
+		return NULL;
+	}
+
 	ap->pflags |= ATA_PFLAG_INITIALIZING | ATA_PFLAG_FROZEN;
 	ap->lock = &host->lock;
 	ap->print_id = -1;
@@ -5677,6 +5671,14 @@ struct ata_port *ata_port_alloc(struct ata_host *host)
 	return ap;
 }
 
+static void ata_port_free(struct ata_port *ap)
+{
+	kfree(ap->pmp_link);
+	kfree(ap->slave_link);
+	blk_mq_free_tags(ap->qc_tags);
+	kfree(ap);
+}
+
 static void ata_host_release(struct device *gendev, void *res)
 {
 	struct ata_host *host = dev_get_drvdata(gendev);
@@ -5691,9 +5693,7 @@ static void ata_host_release(struct device *gendev, void *res)
 		if (ap->scsi_host)
 			scsi_host_put(ap->scsi_host);
 
-		kfree(ap->pmp_link);
-		kfree(ap->slave_link);
-		kfree(ap);
+		ata_port_free(ap);
 		host->ports[i] = NULL;
 	}
 
diff --git a/include/linux/libata.h b/include/linux/libata.h
index eae7a05..4ff9494 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -126,7 +126,7 @@ enum {
 	ATA_DEF_QUEUE		= 1,
 	/* tag ATA_MAX_QUEUE - 1 is reserved for internal commands */
 	ATA_MAX_QUEUE		= 32,
-	ATA_TAG_INTERNAL	= ATA_MAX_QUEUE - 1,
+	ATA_TAG_INTERNAL	= 0,
 	ATA_SHORT_PAUSE		= 16,
 
 	ATAPI_MAX_DRAIN		= 16 << 10,
@@ -766,7 +766,7 @@ struct ata_port {
 	unsigned int		cbl;	/* cable type; ATA_CBL_xxx */
 
 	struct ata_queued_cmd	qcmd[ATA_MAX_QUEUE];
-	unsigned long		qc_allocated;
+	struct blk_mq_tags	*qc_tags;
 	unsigned int		qc_active;
 	int			nr_active_links; /* #links with active qcs */
 

> Thanks.
> 
> -- 
> tejun

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09  8:23                                                     ` Alexander Gordeev
@ 2013-08-09 14:15                                                       ` Tejun Heo
  2013-08-09 14:24                                                       ` Jens Axboe
  1 sibling, 0 replies; 75+ messages in thread
From: Tejun Heo @ 2013-08-09 14:15 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Nicholas A. Bellinger, Jens Axboe, James Bottomley,
	Mike Christie, linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Aug 09, 2013 at 10:23:35AM +0200, Alexander Gordeev wrote:
> On Mon, Jul 29, 2013 at 07:46:53AM -0400, Tejun Heo wrote:
> > One thing which would probably be worthwhile tho is getting rid of the
> > bitmap based qc tag allocator in libata.  That one is just borderline
> > stupid to keep around on any setup which is supposed to be scalable.
> 
> Hi Tejun,
> 
> How about this approach?

Haven't looked at it thoroughly and I still don't know anything about
blk-mq, but it looks good at first glance.

Thanks.

-- 
tejun

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09  8:23                                                     ` Alexander Gordeev
  2013-08-09 14:15                                                       ` Tejun Heo
@ 2013-08-09 14:24                                                       ` Jens Axboe
  2013-08-09 15:07                                                         ` Alexander Gordeev
  1 sibling, 1 reply; 75+ messages in thread
From: Jens Axboe @ 2013-08-09 14:24 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On 08/09/2013 02:23 AM, Alexander Gordeev wrote:
> On Mon, Jul 29, 2013 at 07:46:53AM -0400, Tejun Heo wrote:
>> One thing which would probably be worthwhile tho is getting rid of the
>> bitmap based qc tag allocator in libata.  That one is just borderline
>> stupid to keep around on any setup which is supposed to be scalable.
> 
> Hi Tejun,
> 
> How about this approach?
> 
> @@ -5639,6 +5627,12 @@ struct ata_port *ata_port_alloc(struct ata_host *host)
>  	if (!ap)
>  		return NULL;
>  
> +	ap->qc_tags = blk_mq_init_tags(ATA_MAX_QUEUE, 1, NUMA_NO_NODE);
> +	if (!ap->qc_tags) {
> +		kfree(ap);
> +		return NULL;
> +	}

This should be blk_mq_init_tags(ATA_MAX_QUEUE - 1, 1, ...) since the
total depth is normal_tags + reserved_tags. Apart from that, I think it
looks alright based on a cursory look.

-- 
Jens Axboe


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 14:24                                                       ` Jens Axboe
@ 2013-08-09 15:07                                                         ` Alexander Gordeev
  2013-08-09 15:52                                                           ` Jens Axboe
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-09 15:07 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Aug 09, 2013 at 08:24:38AM -0600, Jens Axboe wrote:
> On 08/09/2013 02:23 AM, Alexander Gordeev wrote:
> > +	ap->qc_tags = blk_mq_init_tags(ATA_MAX_QUEUE, 1, NUMA_NO_NODE);
> > +	if (!ap->qc_tags) {
> > +		kfree(ap);
> > +		return NULL;
> > +	}
> 
> This should be blk_mq_init_tags(ATA_MAX_QUEUE - 1, 1, ...) since the
> total depth is normal_tags + reserved_tags.

Aha.. Should blk_mq_init_tags() be like this then?

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index dcbc2a4..b131a48 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -468,10 +468,9 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
 	 * Rest of the tags start at the queue list
 	 */
 	tags->nr_free = 0;
-	while (nr_tags - tags->nr_reserved) {
+	while (nr_tags--) {
 		tags->freelist[tags->nr_free] = tags->nr_free +
 							tags->nr_reserved;
-		nr_tags--;
 		tags->nr_free++;
 	}

> -- 
> Jens Axboe

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 15:07                                                         ` Alexander Gordeev
@ 2013-08-09 15:52                                                           ` Jens Axboe
  2013-08-09 16:46                                                             ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Jens Axboe @ 2013-08-09 15:52 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On 08/09/2013 09:07 AM, Alexander Gordeev wrote:
> On Fri, Aug 09, 2013 at 08:24:38AM -0600, Jens Axboe wrote:
>> On 08/09/2013 02:23 AM, Alexander Gordeev wrote:
>>> +	ap->qc_tags = blk_mq_init_tags(ATA_MAX_QUEUE, 1, NUMA_NO_NODE);
>>> +	if (!ap->qc_tags) {
>>> +		kfree(ap);
>>> +		return NULL;
>>> +	}
>>
>> This should be blk_mq_init_tags(ATA_MAX_QUEUE - 1, 1, ...) since the
>> total depth is normal_tags + reserved_tags.
> 
> Aha.. If blk_mq_init_tags() should be like this then?
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index dcbc2a4..b131a48 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -468,10 +468,9 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
>  	 * Rest of the tags start at the queue list
>  	 */
>  	tags->nr_free = 0;
> -	while (nr_tags - tags->nr_reserved) {
> +	while (nr_tags--) {
>  		tags->freelist[tags->nr_free] = tags->nr_free +
>  							tags->nr_reserved;
> -		nr_tags--;
>  		tags->nr_free++;
>  	}

I misremembered, just checked the code. I think I used to have it like I
described, but changed it since I thought it would be more logical to
pass in full depth, and then what part of that is reserved. Looking at
the current code, your patch looks correct as-is.
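
The two readings of the first argument correspond to the two loop forms
in the hunk quoted above. The userspace sketch below copies only the
loop bodies from that diff; the helper names and the caller-supplied
freelist array are hypothetical. It shows how many normal tags each
form puts on the freelist.

```c
#include <assert.h>

/* Loop form on the "-" lines: nr_tags is the full depth. */
unsigned int fill_current(unsigned int *freelist, unsigned int nr_tags,
			  unsigned int nr_reserved)
{
	unsigned int nr_free = 0;

	assert(nr_tags >= nr_reserved);
	while (nr_tags - nr_reserved) {
		freelist[nr_free] = nr_free + nr_reserved;
		nr_tags--;
		nr_free++;
	}
	return nr_free;
}

/* Loop form on the "+" lines: nr_tags counts normal tags only. */
unsigned int fill_patched(unsigned int *freelist, unsigned int nr_tags,
			  unsigned int nr_reserved)
{
	unsigned int nr_free = 0;

	while (nr_tags--) {
		freelist[nr_free] = nr_free + nr_reserved;
		nr_free++;
	}
	return nr_free;
}
```

Called with (32, 1), the first form yields 31 normal tags numbered
1..31, so 32 is the full depth including the reserved tag. The second
form yields 32 normal tags numbered 1..32, so 32 counts normal tags
only and the total depth becomes 33.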

-- 
Jens Axboe


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 15:52                                                           ` Jens Axboe
@ 2013-08-09 16:46                                                             ` Alexander Gordeev
  2013-08-09 17:07                                                               ` Jens Axboe
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-09 16:46 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Aug 09, 2013 at 09:52:19AM -0600, Jens Axboe wrote:
> On 08/09/2013 09:07 AM, Alexander Gordeev wrote:
> > diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> > index dcbc2a4..b131a48 100644
> > --- a/block/blk-mq-tag.c
> > +++ b/block/blk-mq-tag.c
> > @@ -468,10 +468,9 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
> >  	 * Rest of the tags start at the queue list
> >  	 */
> >  	tags->nr_free = 0;
> > -	while (nr_tags - tags->nr_reserved) {
> > +	while (nr_tags--) {
> >  		tags->freelist[tags->nr_free] = tags->nr_free +
> >  							tags->nr_reserved;
> > -		nr_tags--;
> >  		tags->nr_free++;
> >  	}
> 
> I misremembered, just checked the code. I think I used to have it like I
> described, but changed it since I thought it would be more logical to
> pass in full depth, and then what part of that is reserved. Looking at
> the current code, your patch looks correct as-is.

Ok, then the whole series "[PATCH 0/3] blk-mq: Avoid effects of a weird
queue depth" (which I posted earlier in a separate thread) should make
sense. Besides the hunk above, it limits the per-cpu cache size and
sanity-checks the total vs reserved length. I can resubmit if you want.

> 
> -- 
> Jens Axboe
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 16:46                                                             ` Alexander Gordeev
@ 2013-08-09 17:07                                                               ` Jens Axboe
  2013-08-12 15:21                                                                 ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Jens Axboe @ 2013-08-09 17:07 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On 08/09/2013 10:46 AM, Alexander Gordeev wrote:
> On Fri, Aug 09, 2013 at 09:52:19AM -0600, Jens Axboe wrote:
>> On 08/09/2013 09:07 AM, Alexander Gordeev wrote:
>>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>>> index dcbc2a4..b131a48 100644
>>> --- a/block/blk-mq-tag.c
>>> +++ b/block/blk-mq-tag.c
>>> @@ -468,10 +468,9 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
>>>  	 * Rest of the tags start at the queue list
>>>  	 */
>>>  	tags->nr_free = 0;
>>> -	while (nr_tags - tags->nr_reserved) {
>>> +	while (nr_tags--) {
>>>  		tags->freelist[tags->nr_free] = tags->nr_free +
>>>  							tags->nr_reserved;
>>> -		nr_tags--;
>>>  		tags->nr_free++;
>>>  	}
>>
>> I misremembered, just checked the code. I think I used to have it like I
>> described, but changed it since I thought it would be more logical to
>> pass in full depth, and then what part of that is reserved. Looking at
>> the current code, your patch looks correct as-is.
> 
> Ok, then a whole series "[PATCH 0/3] blk-mq: Avoid effects of a weird queue
> depth" (I posted earlier in a separate thread) should make sense. Besides
> the hunk above it limits the per-cpu cache size and sanity-checks total vs
> reserved length. I can resubmit if you want.

You don't have to resubmit, I'll get it reviewed and applied today.

-- 
Jens Axboe


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20 14:48                                         ` Mike Christie
                                                           ` (2 preceding siblings ...)
  (?)
@ 2013-08-09 19:15                                         ` Alexander Gordeev
  2013-08-09 20:17                                           ` Nicholas A. Bellinger
  -1 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-09 19:15 UTC (permalink / raw)
  To: Mike Christie
  Cc: Nicholas A. Bellinger, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Sat, Jul 20, 2013 at 09:48:28AM -0500, Mike Christie wrote:
> What about the attached only compile tested patch. The patch has the mq
> block code work like the non mq code for bio cleanups.

Not sure if it is related to the patch or not, but it never returns from
wait_for_completion_io(&wait) in blkdev_issue_flush():

# ps axl | awk '$10 ~ /D\+/'
4     0   938   879  20   0 111216   656 blkdev D+   pts/1      0:00 fdisk/dev/sda
#
# cat /proc/938/stack
[<ffffffff812a8a5c>] blkdev_issue_flush+0xfc/0x160
[<ffffffff811ac606>] blkdev_fsync+0x96/0xc0
[<ffffffff811a2f4d>] do_fsync+0x5d/0x90
[<ffffffff811a3330>] SyS_fsync+0x10/0x20
[<ffffffff81611582>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Any ideas?

Thanks!

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 19:15                                         ` Alexander Gordeev
@ 2013-08-09 20:17                                           ` Nicholas A. Bellinger
  2013-08-15 16:23                                             ` Alexander Gordeev
  0 siblings, 1 reply; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-08-09 20:17 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, 2013-08-09 at 21:15 +0200, Alexander Gordeev wrote:
> On Sat, Jul 20, 2013 at 09:48:28AM -0500, Mike Christie wrote:
> > What about the attached only compile tested patch. The patch has the mq
> > block code work like the non mq code for bio cleanups.
> 
> Not sure if it is related to the patch or not, but it never returns from
> wait_for_completion_io(&wait) in blkdev_issue_flush():
> 
> # ps axl | awk '$10 ~ /D\+/'
> 4     0   938   879  20   0 111216   656 blkdev D+   pts/1      0:00 fdisk/dev/sda
> #
> # cat /proc/938/stack
> [<ffffffff812a8a5c>] blkdev_issue_flush+0xfc/0x160
> [<ffffffff811ac606>] blkdev_fsync+0x96/0xc0
> [<ffffffff811a2f4d>] do_fsync+0x5d/0x90
> [<ffffffff811a3330>] SyS_fsync+0x10/0x20
> [<ffffffff81611582>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Any ideas?
> 

Mmmm, I'm able to reproduce over here with ahci + scsi-mq, and it
appears to be a bug related to using sdev->sdev_md_req.queue_depth=1,
which ends up causing blkdev_issue_flush() to wait forever because
blk_mq_wait_for_tags() never gets the single tag back for the
WRITE_FLUSH bio -> SYNCHRONIZE_CACHE cdb.

Here's the echo w > /proc/sysrq-trigger output:

[  282.620140] SysRq : Show Blocked State
[  282.620958]   task                        PC stack   pid father
[  282.622228] kworker/2:1H    D 0000000000000002     0   532      2 0x00000000
[  282.623607] Workqueue: kblockd mq_flush_work
[  282.624027]  ffff880037869c98 0000000000000046 ffff880037868010 0000000000011380
[  282.624027]  ffff88007d255910 0000000000011380 ffff880037869fd8 0000000000011380
[  282.624027]  ffff880037869fd8 0000000000011380 ffff88007d06f0d0 ffff88007d255910
[  282.624027] Call Trace:
[  282.624027]  [<ffffffff8125b4fd>] ? do_rw_taskfile+0x2ab/0x2bf
[  282.624027]  [<ffffffff810235c4>] ? kvm_clock_read+0x1f/0x21
[  282.624027]  [<ffffffff81054fbc>] ? update_curr+0x4f/0xcd
[  282.624027]  [<ffffffff810235c4>] ? kvm_clock_read+0x1f/0x21
[  282.624027]  [<ffffffff810235cf>] ? kvm_clock_get_cycles+0x9/0xb
[  282.624027]  [<ffffffff81383946>] schedule+0x5f/0x61
[  282.624027]  [<ffffffff813839cf>] io_schedule+0x87/0xca
[  282.624027]  [<ffffffff81192402>] wait_on_tags+0x10f/0x146
[  282.624027]  [<ffffffff81192462>] blk_mq_wait_for_tags+0x29/0x3b
[  282.624027]  [<ffffffff8119132d>] blk_mq_alloc_request_pinned+0xcf/0xe5
[  282.624027]  [<ffffffff811913a6>] blk_mq_alloc_request+0x2d/0x34
[  282.624027]  [<ffffffff8118c60f>] mq_flush_work+0x1a/0x3d
[  282.624027]  [<ffffffff8104474b>] process_one_work+0x257/0x368
[  282.624027]  [<ffffffff81044a4a>] worker_thread+0x1ee/0x34b
[  282.624027]  [<ffffffff8104485c>] ? process_one_work+0x368/0x368
[  282.624027]  [<ffffffff81049771>] kthread+0xb0/0xb8
[  282.624027]  [<ffffffff810496c1>] ? kthread_freezable_should_stop+0x60/0x60
[  282.624027]  [<ffffffff8138a07c>] ret_from_fork+0x7c/0xb0
[  282.624027]  [<ffffffff810496c1>] ? kthread_freezable_should_stop+0x60/0x60
[  282.624027] fdisk           D 0000000000000002     0  1947   1930 0x00000000
[  282.624027]  ffff880037bd9d48 0000000000000082 ffff880037bd8010 0000000000011380
[  282.624027]  ffff88007ca223a0 0000000000011380 ffff880037bd9fd8 0000000000011380
[  282.624027]  ffff880037bd9fd8 0000000000011380 ffff88007d06bb60 ffff88007ca223a0
[  282.624027] Call Trace:
[  282.624027]  [<ffffffff813835e8>] ? __schedule+0x687/0x726
[  282.624027]  [<ffffffff81383946>] schedule+0x5f/0x61
[  282.624027]  [<ffffffff81381d18>] schedule_timeout+0x24/0x183
[  282.624027]  [<ffffffff810235c4>] ? kvm_clock_read+0x1f/0x21
[  282.624027]  [<ffffffff810235cf>] ? kvm_clock_get_cycles+0x9/0xb
[  282.624027]  [<ffffffff8105d2bd>] ? ktime_get_ts+0x53/0xc7
[  282.624027]  [<ffffffff81382e01>] io_schedule_timeout+0x93/0xe4
[  282.624027]  [<ffffffff81051fc7>] ? __cond_resched+0x25/0x31
[  282.624027]  [<ffffffff81383c46>] T.1554+0x8e/0xfc
[  282.624027]  [<ffffffff81054159>] ? try_to_wake_up+0x222/0x222
[  282.624027]  [<ffffffff81383cc7>] wait_for_completion_io+0x13/0x15
[  282.624027]  [<ffffffff8118cbde>] blkdev_issue_flush+0xfb/0x145
[  282.624027]  [<ffffffff810f067a>] blkdev_fsync+0x30/0x3d
[  282.624027]  [<ffffffff810e9259>] vfs_fsync_range+0x18/0x21
[  282.624027]  [<ffffffff810e9279>] vfs_fsync+0x17/0x19
[  282.624027]  [<ffffffff810e942e>] do_fsync+0x35/0x53
[  282.624027]  [<ffffffff810d5574>] ? SyS_ioctl+0x47/0x69
[  282.624027]  [<ffffffff810e9469>] SyS_fsync+0xb/0xf
[  282.624027]  [<ffffffff8138a129>] system_call_fastpath+0x16/0x1b
[  282.624027] blkid           D 0000000000000001     0  1952      1 0x00000000
[  282.679428]  ffff8800371638a8 0000000000000082 ffff880037162010 0000000000011380
[  282.679428]  ffff88007ca205f0 0000000000011380 ffff880037163fd8 0000000000011380
[  282.679428]  ffff880037163fd8 0000000000011380 ffff88007d06b570 ffff88007ca205f0
[  282.679428] Call Trace:
[  282.679428]  [<ffffffff810677a9>] ? generic_exec_single+0x75/0x93
[  282.679428]  [<ffffffff8119212a>] ? blk_mq_tag_busy_iter+0x116/0x116
[  282.679428]  [<ffffffff8106797f>] ? smp_call_function_single+0xf9/0x111
[  282.679428]  [<ffffffff81383946>] schedule+0x5f/0x61
[  282.679428]  [<ffffffff813839cf>] io_schedule+0x87/0xca
[  282.679428]  [<ffffffff81192402>] wait_on_tags+0x10f/0x146
[  282.679428]  [<ffffffff81192462>] blk_mq_wait_for_tags+0x29/0x3b
[  282.679428]  [<ffffffff8119132d>] blk_mq_alloc_request_pinned+0xcf/0xe5
[  282.679428]  [<ffffffff811916b9>] blk_mq_make_request+0x14d/0x2dc
[  282.679428]  [<ffffffff810978c4>] ? mempool_alloc_slab+0x10/0x12
[  282.679428]  [<ffffffff8118951e>] generic_make_request+0x9c/0xdf
[  282.679428]  [<ffffffff81189648>] submit_bio+0xe7/0xf2
[  282.679428]  [<ffffffff810eaaeb>] _submit_bh+0x1b0/0x1d3
[  282.679428]  [<ffffffff810eab19>] submit_bh+0xb/0xd
[  282.679428]  [<ffffffff810ed6e5>] block_read_full_page+0x24d/0x26d
[  282.679428]  [<ffffffff810ef905>] ? I_BDEV+0xd/0xd
[  282.679428]  [<ffffffff810a7624>] ? __inc_zone_page_state+0x1e/0x20
[  282.679428]  [<ffffffff81096188>] ? add_to_page_cache_locked+0x78/0xb0
[  282.679428]  [<ffffffff810f04a5>] blkdev_readpage+0x13/0x15
[  282.679428]  [<ffffffff8109de8d>] __do_page_cache_readahead+0x194/0x1d0
[  282.679428]  [<ffffffff81381f41>] ? __wait_on_bit_lock+0x79/0x8a
[  282.679428]  [<ffffffff8109df50>] force_page_cache_readahead+0x67/0x8d
[  282.679428]  [<ffffffff8109e29a>] page_cache_sync_readahead+0x26/0x3a
[  282.679428]  [<ffffffff81097510>] generic_file_aio_read+0x265/0x5cd
[  282.679428]  [<ffffffff810efaae>] blkdev_aio_read+0x57/0x5e
[  282.679428]  [<ffffffff810c6b8d>] do_sync_read+0x79/0x9f
[  282.679428]  [<ffffffff810c7db7>] vfs_read+0xab/0x130
[  282.679428]  [<ffffffff810c7f06>] SyS_read+0x4f/0x79
[  282.679428]  [<ffffffff8138a129>] system_call_fastpath+0x16/0x1b
[  282.679428] blkid           D 0000000000000003     0  1992    927 0x00000000
[  282.679428]  ffff88007848f8a8 0000000000000086 ffff88007848e010 0000000000011380
[  282.679428]  ffff88007d250000 0000000000011380 ffff88007848ffd8 0000000000011380
[  282.679428]  ffff88007848ffd8 0000000000011380 ffff88007d06c150 ffff88007d250000
[  282.679428] Call Trace:
[  282.679428]  [<ffffffff8109b9b6>] ? __alloc_pages_nodemask+0xf7/0x5eb
[  282.679428]  [<ffffffff81383946>] schedule+0x5f/0x61
[  282.679428]  [<ffffffff813839cf>] io_schedule+0x87/0xca
[  282.679428]  [<ffffffff81192402>] wait_on_tags+0x10f/0x146
[  282.679428]  [<ffffffff81192462>] blk_mq_wait_for_tags+0x29/0x3b
[  282.679428]  [<ffffffff8119132d>] blk_mq_alloc_request_pinned+0xcf/0xe5
[  282.679428]  [<ffffffff811916b9>] blk_mq_make_request+0x14d/0x2dc
[  282.679428]  [<ffffffff8118d81f>] ? create_task_io_context+0xa6/0xf5
[  282.679428]  [<ffffffff8118951e>] generic_make_request+0x9c/0xdf
[  282.679428]  [<ffffffff81189648>] submit_bio+0xe7/0xf2
[  282.679428]  [<ffffffff810eaaeb>] _submit_bh+0x1b0/0x1d3
[  282.679428]  [<ffffffff810eab19>] submit_bh+0xb/0xd
[  282.679428]  [<ffffffff810ed6e5>] block_read_full_page+0x24d/0x26d
[  282.679428]  [<ffffffff810ef905>] ? I_BDEV+0xd/0xd
[  282.679428]  [<ffffffff810f04a5>] blkdev_readpage+0x13/0x15
[  282.679428]  [<ffffffff8109de8d>] __do_page_cache_readahead+0x194/0x1d0
[  282.679428]  [<ffffffff8109df50>] force_page_cache_readahead+0x67/0x8d
[  282.679428]  [<ffffffff8109e29a>] page_cache_sync_readahead+0x26/0x3a
[  282.679428]  [<ffffffff81097510>] generic_file_aio_read+0x265/0x5cd
[  282.679428]  [<ffffffff810efaae>] blkdev_aio_read+0x57/0x5e
[  282.679428]  [<ffffffff810c6b8d>] do_sync_read+0x79/0x9f
[  282.679428]  [<ffffffff810c7db7>] vfs_read+0xab/0x130
[  282.679428]  [<ffffffff810c7f06>] SyS_read+0x4f/0x79
[  282.679428]  [<ffffffff8138a129>] system_call_fastpath+0x16/0x1b

Bumping queue_depth to 2 seems to work around the issue, but AFAICT it's a
genuine tag starvation bug with queue_depth=1 and WRITE_FLUSH..

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 17:07                                                               ` Jens Axboe
@ 2013-08-12 15:21                                                                 ` Alexander Gordeev
  0 siblings, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-12 15:21 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, Nicholas A. Bellinger, James Bottomley, Mike Christie,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Aug 09, 2013 at 11:07:37AM -0600, Jens Axboe wrote:
> You don't have to resubmit, I'll get it reviewed and applied today.

Hi Jens,

I limited the minimal queue depth to 4, which is apparently wrong
in the case of libata. I will post a new series.

> -- 
> Jens Axboe
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-09 20:17                                           ` Nicholas A. Bellinger
@ 2013-08-15 16:23                                             ` Alexander Gordeev
  2013-08-16  2:19                                               ` Nicholas A. Bellinger
  0 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-15 16:23 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, Aug 09, 2013 at 01:17:37PM -0700, Nicholas A. Bellinger wrote:
> On Fri, 2013-08-09 at 21:15 +0200, Alexander Gordeev wrote:
> Mmmm, I'm able to reproduce over here with ahci + scsi-mq, and it
> appears to be a bug related to using sdev->sdev_md_req.queue_depth=1,
> which ends up causing blkdev_issue_flush() to wait forever because
> blk_mq_wait_for_tags() never gets the single tag back for the
> WRITE_FLUSH bio -> SYNCHRONIZE_CACHE cdb.

It turns out this way - blkdev_issue_flush() claims the only tag, submits
the bio and waits for the completion. But because blk_mq_make_request()
does not mark any context in blk_mq_hw_ctx::ctx_map (nor enlists the request
in blk_mq_ctx::rq_list), the request never gets processed from
blk_mq_work_fn() -> __blk_mq_run_hw_queue(), and blkdev_issue_flush() waits
endlessly. As a result, all other requests are just waiting for a tag to
become available.

[...]

> Bumping queue_depth to 2 seems to work around the issue, but AFAICT it's a
> genuine tag starvation bug with queue_depth=1 and WRITE_FLUSH..

If I try to hack and force __blk_mq_run_hw_queue() to process the request...

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6fc1df3..c22b6f66 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -889,9 +962,12 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	hctx->queued++;
 
 	if (unlikely(is_flush_fua)) {
+		list_add(&rq->queuelist, &hctx->dispatch);
 		blk_mq_bio_to_request(q, rq, bio);
 		blk_mq_put_ctx(ctx);
 		blk_insert_flush(rq);
 		goto run_queue;
 	}

... I get a kernel BUG at drivers/scsi/scsi_lib.c:1233

	BUG_ON(!req->nr_phys_segments);

IOW I am not sure how to proceed.

> --nab
> 

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-15 16:23                                             ` Alexander Gordeev
@ 2013-08-16  2:19                                               ` Nicholas A. Bellinger
  2013-08-16 16:41                                                 ` Alexander Gordeev
                                                                   ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-08-16  2:19 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, 2013-08-15 at 18:23 +0200, Alexander Gordeev wrote:
> On Fri, Aug 09, 2013 at 01:17:37PM -0700, Nicholas A. Bellinger wrote:
> > On Fri, 2013-08-09 at 21:15 +0200, Alexander Gordeev wrote:
> > Mmmm, I'm able to reproduce over here with ahci + scsi-mq, and it
> > appears to be a bug related to using sdev->sdev_md_req.queue_depth=1,
> > which ends up causing blkdev_issue_flush() to wait forever because
> > blk_mq_wait_for_tags() never gets the single tag back for the
> > WRITE_FLUSH bio -> SYNCHRONIZE_CACHE cdb.
> 
> It turns out this way - blkdev_issue_flush() claims the only tag, submits
> the bio and waits for the completion. But because blk_mq_make_request()
> does not mark any context in blk_mq_hw_ctx::ctx_map (nor enlists the request
> in blk_mq_ctx::rq_list), the request never gets processed from
> blk_mq_work_fn() -> __blk_mq_run_hw_queue(), and blkdev_issue_flush() waits
> endlessly. As a result, all other requests are just waiting for a tag to
> become available.
> 

Ok, here's a bit better idea of what is going on now..

The problem is that blkdev_issue_flush() -> blk_mq_make_request() ->
__blk_mq_alloc_request() allocates the first tag, which calls
blk_insert_flush() -> blk_flush_complete_seq() -> blk_flush_kick() ->
mq_flush_work() -> blk_mq_alloc_request() to allocate a second tag for
the struct request that actually gets dispatched into scsi-mq as a
SYNCHRONIZE_CACHE command..

I'm not exactly sure why this double tag usage of struct request is
occurring, but AFAICT it does happen for every flush, and is not
specific to the blkdev_issue_flush() codepath..  I'm sure that Jens can
fill us in on that bit.  ;)

So, assuming that this double tag usage is necessary and not a bug,
perhaps using a reserved tag for the first tag (eg: the one that's not
dispatched into scsi_mq_queue_rq) makes sense..?
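
To make the starvation concrete, here is a toy userspace model of the
sequence above. It is only a sketch: the toy_* names are made up for
illustration and are not blk-mq code, and it assumes that the first tag
stays held until the second request completes.

```c
#include <stdbool.h>

/* Toy tag pool: just a counter of free tags (hypothetical, not blk-mq). */
struct toy_tags {
	unsigned int nr_free;
};

/* Take one tag; false means the caller would have to sleep waiting. */
static bool toy_tag_get(struct toy_tags *t)
{
	if (t->nr_free == 0)
		return false;
	t->nr_free--;
	return true;
}

static void toy_tag_put(struct toy_tags *t)
{
	t->nr_free++;
}

/*
 * Model one flush: the first tag is taken for the flush bio's request,
 * the second for the request that is actually dispatched. Returns true
 * if the flush can complete, false if it would wait forever.
 */
static bool toy_flush_completes(unsigned int queue_depth)
{
	struct toy_tags tags = { .nr_free = queue_depth };

	if (!toy_tag_get(&tags))
		return false;
	if (!toy_tag_get(&tags))
		return false;	/* first tag still held, nobody can free it */

	toy_tag_put(&tags);	/* dispatched request completes */
	toy_tag_put(&tags);	/* original flush request completes */
	return true;
}
```

In this model toy_flush_completes(1) is false and toy_flush_completes(2)
is true, which matches the queue_depth=2 work-around.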

I'm playing with a patch to do this, but am currently getting hung-up on
what appear to be some separate blk-mq reserved_tags > 0 bugs, the first
of which is that passing queue_depth=1 + reserved_tags=1 is broken and
results in tags->nr_free = 0.

Here's the quick fix:

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 6718007..ffdf686 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -470,7 +470,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
 	 * Rest of the tags start at the queue list
 	 */
 	tags->nr_free = 0;
-	while (nr_tags - tags->nr_reserved) {
+	while (nr_tags) {
 		tags->freelist[tags->nr_free] = tags->nr_free +
 							tags->nr_reserved;
 		nr_tags--;

Anyways, before digging further into reserved tags logic, Jens, what are
your thoughts for addressing this special queue_depth=1 case for libata
+ the like..?

--nab


^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-16  2:19                                               ` Nicholas A. Bellinger
@ 2013-08-16 16:41                                                 ` Alexander Gordeev
  2013-08-16 17:46                                                   ` Nicholas A. Bellinger
  2013-08-28 15:56                                                   ` Alexander Gordeev
  2013-09-20 15:19                                                   ` Alexander Gordeev
  2 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-16 16:41 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, Aug 15, 2013 at 07:19:29PM -0700, Nicholas A. Bellinger wrote:
> I'm playing with a patch to do this, but am currently getting hung-up on
> what appear to be some separate blk-mq reserved_tags > 0 bugs, the first
> of which is that passing queue_depth=1 + reserved_tags=1 is broken and
> results in tags->nr_free = 0.

That is not a bug - please look at Jens' replies in this thread about a week ago.
In short, queue_depth=1 means 1 tag in total, and reserved_tags=1 then leaves
zero normal tags. You need to request depth=2 and reserved_tags=1.
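
The arithmetic can be sketched as below. This is only my reading of the
summary above, not the blk-mq code itself: toy_split_tags() and its struct
are hypothetical names, and clamping reserved_tags to queue_depth is an
assumption made to keep the subtraction safe.

```c
/* How a total depth divides into reserved and normal tags (toy model). */
struct toy_tag_split {
	unsigned int reserved;
	unsigned int normal;
};

static struct toy_tag_split toy_split_tags(unsigned int queue_depth,
					   unsigned int reserved_tags)
{
	struct toy_tag_split s;

	/* Assumption: never reserve more tags than exist in total. */
	if (reserved_tags > queue_depth)
		reserved_tags = queue_depth;

	s.reserved = reserved_tags;
	/* queue_depth is the total, so reserved tags come out of it. */
	s.normal = queue_depth - reserved_tags;
	return s;
}
```

With queue_depth=1 and reserved_tags=1 the normal count is 0; with
queue_depth=2 and reserved_tags=1 it is 1.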

But yes, this is a separate topic and I am looking forward to hearing from Jens
wrt flushes.

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-16 16:41                                                 ` Alexander Gordeev
@ 2013-08-16 17:46                                                   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-08-16 17:46 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Fri, 2013-08-16 at 18:41 +0200, Alexander Gordeev wrote:
> On Thu, Aug 15, 2013 at 07:19:29PM -0700, Nicholas A. Bellinger wrote:
> > I'm playing with a patch to do this, but am currently getting hung-up on
> > what appear to be some separate blk-mq reserved_tags > 0 bugs, the first
> > of which is that passing queue_depth=1 + reserved_tags=1 is broken and
> > results in tags->nr_free = 0.
> 
> That is not a bug - please look at Jens' replies in this thread about a week ago.
> In short, queue_depth=1 means 1 tag in total, and reserved_tags=1 then leaves
> zero normal tags. You need to request depth=2 and reserved_tags=1.
> 

Ahhh, yes of course.  I'll re-work a proposed patch this afternoon with
this in mind..

> But yes, this is a separate topic and I am looking forward to hearing from Jens
> wrt flushes.
> 

Indeed.  ;)

--nab


^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-16  2:19                                               ` Nicholas A. Bellinger
@ 2013-08-28 15:56                                                   ` Alexander Gordeev
  2013-08-28 15:56                                                   ` Alexander Gordeev
  2013-09-20 15:19                                                   ` Alexander Gordeev
  2 siblings, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-08-28 15:56 UTC (permalink / raw)
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi,
	Nicholas A. Bellinger

On Thu, Aug 15, 2013 at 07:19:29PM -0700, Nicholas A. Bellinger wrote:
> Anyways, before digging further into reserved tags logic, Jens, what are
> your thoughts for addressing this special queue_depth=1 case for libata
> + the like..?

Hi Jens,

Do you have any comments?

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-08-16  2:19                                               ` Nicholas A. Bellinger
@ 2013-09-20 15:19                                                   ` Alexander Gordeev
  2013-08-28 15:56                                                   ` Alexander Gordeev
  2013-09-20 15:19                                                   ` Alexander Gordeev
  2 siblings, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-09-20 15:19 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Mike Christie, James Bottomley, Jens Axboe, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

On Thu, Aug 15, 2013 at 07:19:29PM -0700, Nicholas A. Bellinger wrote:
> Ok, here's a bit better idea of what is going on now..
> 
> The problem is that blkdev_issue_flush() -> blk_mq_make_request() ->
> __blk_mq_alloc_request() allocates the first tag, which calls
> blk_insert_flush() -> blk_flush_complete_seq() -> blk_flush_kick() ->
> mq_flush_work() -> blk_mq_alloc_request() to allocate a second tag for
> the struct request that actually gets dispatched into scsi-mq as a
> SYNCHRONIZE_CACHE command..
> 
> I'm not exactly sure why this double tag usage of struct request is
> occurring, but AFAICT it does happen for every flush, and is not
> specific to the blkdev_issue_flush() codepath..  I'm sure that Jens can
> fill us in on that bit.  ;)

I also played with the double tag using a reserved tag (below).

While it fixes the 'fdisk /dev/sda' issue, trying to 'mount /dev/sda1 /mnt'
hits what appears to be a call to bio->bi_end_io() from an already free'd bio.

Not sure if I should pursue the root cause until the whole double-tag
thingy is confirmed.

Jens?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6fc1df3..81794dc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -874,14 +874,14 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
 	trace_block_getrq(q, bio, rw);
-	rq = __blk_mq_alloc_request(hctx, GFP_ATOMIC, false);
+	rq = __blk_mq_alloc_request(hctx, GFP_ATOMIC, is_flush_fua);
 	if (likely(rq))
 		blk_mq_rq_ctx_init(ctx, rq, rw);
 	else {
 		blk_mq_put_ctx(ctx);
 		trace_block_sleeprq(q, bio, rw);
 		rq = blk_mq_alloc_request_pinned(q, rw, __GFP_WAIT|GFP_ATOMIC,
-							false);
+							is_flush_fua);
 		ctx = rq->mq_ctx;
 		hctx = q->mq_ops->map_queue(q, ctx->cpu);
 	}
@@ -1317,6 +1317,9 @@ struct request_queue *blk_mq_init_queue(struct blk_mq_reg *reg,
 		reg->queue_depth = BLK_MQ_MAX_DEPTH;
 	}
 
+	reg->queue_depth++;
+	reg->reserved_tags++;
+
 	ctx = alloc_percpu(struct blk_mq_ctx);
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-09-20 15:19                                                   ` Alexander Gordeev
  (?)
@ 2013-09-20 20:41                                                   ` Nicholas A. Bellinger
  -1 siblings, 0 replies; 75+ messages in thread
From: Nicholas A. Bellinger @ 2013-09-20 20:41 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Jens Axboe, Mike Christie, James Bottomley, Tejun Heo,
	linux-kernel, linux-ide, Jeff Garzik, linux-scsi

Hi Alexander!

Apologies for the long delay on this follow-up..  Comments below.

On Fri, 2013-09-20 at 17:19 +0200, Alexander Gordeev wrote:
> On Thu, Aug 15, 2013 at 07:19:29PM -0700, Nicholas A. Bellinger wrote:
> > Ok, here's a bit better idea of what is going on now..
> > 
> > The problem is that blkdev_issue_flush() -> blk_mq_make_request() ->
> > __blk_mq_alloc_request() allocates the first tag, which calls
> > blk_insert_flush() -> blk_flush_complete_seq() -> blk_flush_kick() ->
> > mq_flush_work() -> blk_mq_alloc_request() to allocate a second tag for
> > the struct request that actually gets dispatched into scsi-mq as a
> > SYNCHRONIZE_CACHE command..
> > 
> > I'm not exactly sure why this double tag usage of struct request is
> > occurring, but AFAICT it does happen for every flush, and is not
> > specific to the blkdev_issue_flush() codepath..  I'm sure that Jens can
> > fill us in on that bit.  ;)
> 
> I also played with the double tag using a reserved tag (below).
> 
> While it fixes the 'fdisk /dev/sda' issue, trying to 'mount /dev/sda1 /mnt'
> hits what appears to be a call to bio->bi_end_io() from an already free'd bio.
> 
> Not sure if I should pursue the root cause until the whole double-tag
> thingy is confirmed.
> 
> Jens?
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 6fc1df3..81794dc 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -874,14 +874,14 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  	hctx = q->mq_ops->map_queue(q, ctx->cpu);
>  
>  	trace_block_getrq(q, bio, rw);
> -	rq = __blk_mq_alloc_request(hctx, GFP_ATOMIC, false);
> +	rq = __blk_mq_alloc_request(hctx, GFP_ATOMIC, is_flush_fua);
>  	if (likely(rq))
>  		blk_mq_rq_ctx_init(ctx, rq, rw);
>  	else {
>  		blk_mq_put_ctx(ctx);
>  		trace_block_sleeprq(q, bio, rw);
>  		rq = blk_mq_alloc_request_pinned(q, rw, __GFP_WAIT|GFP_ATOMIC,
> -							false);
> +							is_flush_fua);
>  		ctx = rq->mq_ctx;
>  		hctx = q->mq_ops->map_queue(q, ctx->cpu);
>  	}

So this is what I ended up doing as well, and does address the specific
bug with queue_depth=1.

> @@ -1317,6 +1317,9 @@ struct request_queue *blk_mq_init_queue(struct blk_mq_reg *reg,
>  		reg->queue_depth = BLK_MQ_MAX_DEPTH;
>  	}
>  
> +	reg->queue_depth++;
> +	reg->reserved_tags++;
> +
>  	ctx = alloc_percpu(struct blk_mq_ctx);
>  	if (!ctx)
>  		return ERR_PTR(-ENOMEM);
> 

I was actually setting this within scsi_mq_alloc_queue(), but given that
the queue_depth=1 issue is independent of scsi-mq, this does make more
sense.

Also, these extra increments should probably happen only when the passed
queue_depth == 1 && reserved_tags == 0.
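
Roughly what I have in mind, as a standalone sketch rather than a patch:
struct toy_mq_reg and toy_reserve_flush_tag() are hypothetical stand-ins,
and only the two fields touched by the hunk above are modelled.

```c
/* Hypothetical stand-in for the two registration fields in question. */
struct toy_mq_reg {
	unsigned int queue_depth;
	unsigned int reserved_tags;
};

/*
 * Add the extra reserved tag only for the special case: a caller that
 * asked for a single tag and reserved none of its own.
 */
static void toy_reserve_flush_tag(struct toy_mq_reg *reg)
{
	if (reg->queue_depth == 1 && reg->reserved_tags == 0) {
		reg->queue_depth++;
		reg->reserved_tags++;
	}
}
```

A caller that passes queue_depth=32, or one that already set
reserved_tags itself, would be left untouched by this check.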

Other than that minor nit.

Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-07-20 14:48                                         ` Mike Christie
                                                           ` (3 preceding siblings ...)
  (?)
@ 2013-10-03 11:06                                         ` Christoph Hellwig
  2013-10-07 14:44                                           ` Alexander Gordeev
  -1 siblings, 1 reply; 75+ messages in thread
From: Christoph Hellwig @ 2013-10-03 11:06 UTC (permalink / raw)
  To: Mike Christie
  Cc: Nicholas A. Bellinger, Jens Axboe, Alexander Gordeev, Tejun Heo,
	linux-kernel, linux-ide, linux-scsi

On Sat, Jul 20, 2013 at 09:48:28AM -0500, Mike Christie wrote:
> What about the attached only compile tested patch. The patch has the mq
> block code work like the non mq code for bio cleanups.
> 
> 

> blk-mq: blk-mq should free bios in pass through case
> 
> For non mq calls, the block layer will free the bios when
> blk_finish_request is called.
> 
> For mq calls, the blk mq code wants the caller to do this.
> 
> This patch has the blk mq code work like the non mq code
> and has the block layer free the bios.
> 
> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>

This patch breaks booting for me in the current blk multiqueue tree,
with an apparent double free of a bio when using virtio-blk in writeback
mode (cache=writeback or cache=none in qemu):

[   15.253608] ------------[ cut here ]------------
[   15.256422] kernel BUG at /work/hch/linux/fs/bio.c:498!
[   15.256879] invalid opcode: 0000 [#1] SMP 
[   15.256879] Modules linked in:
[   15.256879] CPU: 3 PID: 353 Comm: kblockd Not tainted 3.11.0+ #25
[   15.256879] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   15.256879] task: ffff88007d75e0c0 ti: ffff88007d676000 task.ti: ffff88007d676000
[   15.256879] RIP: 0010:[<ffffffff811b470a>]  [<ffffffff811b470a>] bio_put+0x8a/0x90
[   15.256879] RSP: 0018:ffff88007fd83b50  EFLAGS: 00010046
[   15.256879] RAX: 0000000000000000 RBX: ffff88007d713080 RCX: 0000000000000035
[   15.256879] RDX: 0000000000000002 RSI: ffff88007ad50338 RDI: ffff88007d713080
[   15.256879] RBP: ffff88007fd83b60 R08: 7010000000000000 R09: 007ad50338080000
[   15.256879] R10: ff672b1b7d38ce02 R11: 000000000000028b R12: 0000000000000000
[   15.256879] R13: 0000000000000000 R14: ffff88007b4c36c0 R15: ffff88007b40d608
[   15.256879] FS:  0000000000000000(0000) GS:ffff88007fd80000(0000) knlGS:0000000000000000
[   15.256879] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   15.256879] CR2: 0000000000000138 CR3: 0000000002124000 CR4: 00000000000006e0
[   15.256879] Stack:
[   15.256879]  ffff88007d713080 0000000000000000 ffff88007fd83b80 ffffffff811ae8a3
[   15.256879]  ffff88007fd83bf0 0000000000001000 ffff88007fd83b90 ffffffff811b3268
[   15.256879]  ffff88007fd83bc0 ffffffff816ac847 ffff88007b4c36c0 ffff88007fd99d00
[   15.256879] Call Trace:
[   15.256879]  <IRQ> 
[   15.256879]  [<ffffffff811ae8a3>] end_bio_bh_io_sync+0x33/0x50
[   15.256879]  [<ffffffff811b3268>] bio_endio+0x18/0x30
[   15.256879]  [<ffffffff816ac847>] blk_mq_complete_request+0x47/0xd0
[   15.256879]  [<ffffffff816ac8e9>] __blk_mq_end_io+0x19/0x20
[   15.256879]  [<ffffffff816ac958>] blk_mq_end_io+0x68/0xd0
[   15.256879]  [<ffffffff816a6162>] blk_flush_complete_seq+0xe2/0x370
[   15.256879]  [<ffffffff816a653b>] flush_end_io+0x11b/0x200
[   15.256879]  [<ffffffff816ac875>] blk_mq_complete_request+0x75/0xd0
[   15.256879]  [<ffffffff816ac8e9>] __blk_mq_end_io+0x19/0x20
[   15.256879]  [<ffffffff816ac958>] blk_mq_end_io+0x68/0xd0
[   15.256879]  [<ffffffff81844c2f>] virtblk_done+0xef/0x260
[   15.256879]  [<ffffffff81753cc0>] vring_interrupt+0x30/0x60
[   15.256879]  [<ffffffff81103724>] handle_irq_event_percpu+0x54/0x1f0
[   15.256879]  [<ffffffff81103903>] handle_irq_event+0x43/0x70
[   15.256879]  [<ffffffff8110609f>] handle_edge_irq+0x6f/0x120
[   15.256879]  [<ffffffff810445b8>] handle_irq+0x58/0x140
[   15.256879]  [<ffffffff81094bbf>] ? irq_enter+0x4f/0x90
[   15.256879]  [<ffffffff810440b5>] do_IRQ+0x55/0xd0
[   15.256879]  [<ffffffff81bd3972>] common_interrupt+0x72/0x72
[   15.256879]  [<ffffffff810c5135>] ? sched_clock_local+0x25/0xa0
[   15.256879]  [<ffffffff81094960>] ? __do_softirq+0xb0/0x250
[   15.256879]  [<ffffffff81094959>] ? __do_softirq+0xa9/0x250
[   15.256879]  [<ffffffff81094cae>] irq_exit+0xae/0xd0
[   15.256879]  [<ffffffff8106dcd5>] smp_apic_timer_interrupt+0x45/0x60
[   15.256879]  [<ffffffff81bdc772>] apic_timer_interrupt+0x72/0x80
[   15.256879]  <EOI> 
[   15.256879]  [<ffffffff81bd3a33>] ? retint_restore_args+0x13/0x13
[   15.256879]  [<ffffffff81bd3502>] ? _raw_spin_unlock_irq+0x32/0x40
[   15.256879]  [<ffffffff81bd34fb>] ? _raw_spin_unlock_irq+0x2b/0x40
[   15.256879]  [<ffffffff810ac0c4>] rescuer_thread+0xe4/0x2f0
[   15.256879]  [<ffffffff810abfe0>] ? process_scheduled_works+0x40/0x40
[   15.256879]  [<ffffffff810b3916>] kthread+0xd6/0xe0
[   15.256879]  [<ffffffff81bd34fb>] ? _raw_spin_unlock_irq+0x2b/0x40
[   15.256879]  [<ffffffff810b3840>] ? __init_kthread_worker+0x70/0x70
[   15.256879]  [<ffffffff81bdbabc>] ret_from_fork+0x7c/0xb0
[   15.256879]  [<ffffffff810b3840>] ? __init_kthread_worker+0x70/0x70
[   15.256879] Code: ff 41 8b 44 24 08 48 89 df 49 8b 74 24 10 48 29 c7 e8 cb 88 f8 ff 48 8b 5d f0 4c 8b 65 f8 c9 c3 90 48 89 df e8 b8 5c fc ff eb 9b <0f> 0b 0f 1f 40 00 55 48 89 e5 41 57 45 31 ff 41 56 41 55 41 54 
[   15.256879] RIP  [<ffffffff811b470a>] bio_put+0x8a/0x90
[   15.256879]  RSP <ffff88007fd83b50>
[   15.256879] ---[ end trace 1f201608bfddfca7 ]---
[   15.256879] Kernel panic - not syncing: Fatal exception in interrupt
[   15.256879] Shutting down cpus with NMI


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-10-03 11:06                                         ` Christoph Hellwig
@ 2013-10-07 14:44                                           ` Alexander Gordeev
  0 siblings, 0 replies; 75+ messages in thread
From: Alexander Gordeev @ 2013-10-07 14:44 UTC (permalink / raw)
  To: Mike Christie
  Cc: Nicholas A. Bellinger, Jens Axboe, Tejun Heo, linux-kernel,
	linux-ide, linux-scsi, Christoph Hellwig

On Thu, Oct 03, 2013 at 04:06:51AM -0700, Christoph Hellwig wrote:
> On Sat, Jul 20, 2013 at 09:48:28AM -0500, Mike Christie wrote:
> > What about the attached patch, which is only compile tested? It makes the
> > mq block code work like the non-mq code for bio cleanups.
> > 
> > 
> 
> > blk-mq: blk-mq should free bios in pass through case
> > 
> > For non mq calls, the block layer will free the bios when
> > blk_finish_request is called.
> > 
> > For mq calls, the blk mq code wants the caller to do this.
> > 
> > This patch has the blk mq code work like the non mq code
> > and has the block layer free the bios.
> > 
> > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> 
> This patch breaks booting for me in the current blk multiqueue tree,
> with an apparent double free of a bio when using virtio-blk in writeback
> mode (cache=writeback or cache=none in qemu):

I am not sure whether the root cause is the same, but the panic I hit when
mounting an AHCI device (along with the double tag usage described in another
thread) looks somewhat similar:


[  181.184510] general protection fault: 0000 [#1] SMP 
[  181.184546] Modules linked in: lockd sunrpc snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm mperf snd_page_alloc i5000_edac coretemp snd_timer edac_core iTCO_wdt snd kvm_intel iTCO_vendor_support lpc_ich mfd_core igb dca i5k_amb ppdev soundcore hp_wmi tg3 kvm sparse_keymap serio_raw ptp microcode pcspkr rfkill pps_core shpchp parport_pc parport mptsas scsi_transport_sas mptscsih mptbase floppy nouveau video mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core
[  181.184550] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G        W    3.10.0-rc5.debug+ #180
[  181.184552] Hardware name: Hewlett-Packard HP xw6400 Workstation/0A04h, BIOS 786D4 v02.31 03/14/2008
[  181.184554] task: ffff88007b1a8000 ti: ffff88007b19c000 task.ti: ffff88007b19c000
[  181.184563] RIP: 0010:[<ffffffff811fa97b>]  [<ffffffff811fa97b>] bio_endio+0x1b/0x40
[  181.184565] RSP: 0018:ffff88007d203a28  EFLAGS: 00010002
[  181.184567] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: dead000000200200
[  181.184568] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880068e29200
[  181.184570] RBP: ffff88007d203a28 R08: ffffe8ffff201240 R09: 0000000000000000
[  181.184571] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880074d86000
[  181.184572] R13: 6b6b6b6b6b6b6b6b R14: 000000006b6b6b6b R15: 0000000000000001
[  181.184575] FS:  0000000000000000(0000) GS:ffff88007d200000(0000) knlGS:0000000000000000
[  181.184576] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  181.184578] CR2: 00007f8afac8c45c CR3: 000000005f431000 CR4: 00000000000007e0
[  181.184580] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  181.184581] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  181.184582] Stack:
[  181.184587]  ffff88007d203a78 ffffffff813202aa ffff88007d203a98 0000000000000046
[  181.184591]  0000000000000046 ffff880072d08000 0000000000000000 ffff880074d860d8
[  181.184594]  0000000000000000 0000000000000001 ffff88007d203a88 ffffffff81320445
[  181.184595] Call Trace:
[  181.184597]  <IRQ> 
[  181.184603]  [<ffffffff813202aa>] blk_mq_complete_request+0x5a/0x1d0
[  181.184607]  [<ffffffff81320445>] __blk_mq_end_io+0x25/0x30
[  181.184609]  [<ffffffff81320535>] blk_mq_end_io+0xe5/0xf0
[  181.184613]  [<ffffffff81319754>] blk_flush_complete_seq+0xf4/0x360
[  181.184616]  [<ffffffff81319a4b>] ? flush_end_io+0x4b/0x210
[  181.184619]  [<ffffffff81319b2a>] flush_end_io+0x12a/0x210
[  181.184622]  [<ffffffff813202da>] blk_mq_complete_request+0x8a/0x1d0
[  181.184626]  [<ffffffff8147b9fd>] ? scsi_device_unbusy+0x9d/0xd0
[  181.184629]  [<ffffffff81320445>] __blk_mq_end_io+0x25/0x30
[  181.184632]  [<ffffffff81320535>] blk_mq_end_io+0xe5/0xf0
[  181.184635]  [<ffffffff8147cae5>] scsi_mq_end_request+0x15/0x20
[  181.184638]  [<ffffffff8147bf20>] scsi_io_completion+0xa0/0x650
[  181.184643]  [<ffffffff810bbc3d>] ? trace_hardirqs_off+0xd/0x10
[  181.184648]  [<ffffffff814722f7>] scsi_finish_command+0x87/0xe0
[  181.184650]  [<ffffffff8147bccf>] scsi_softirq_done+0x13f/0x160
[  181.184653]  [<ffffffff8147cba5>] scsi_mq_done+0x15/0x20
[  181.184658]  [<ffffffff81495a73>] ata_scsi_qc_complete+0x63/0x470
[  181.184661]  [<ffffffff8148fad0>] __ata_qc_complete+0x90/0x140
[  181.184664]  [<ffffffff8148fc1d>] ata_qc_complete+0x9d/0x230
[  181.184667]  [<ffffffff8148fe51>] ata_qc_complete_multiple+0xa1/0xe0
[  181.184673]  [<ffffffff814aa449>] ahci_handle_port_interrupt+0x109/0x560
[  181.184676]  [<ffffffff814ab63f>] ahci_port_intr+0x2f/0x40
[  181.184678]  [<ffffffff814ab6f1>] ahci_interrupt+0xa1/0x100
[  181.184683]  [<ffffffff810ff7b5>] handle_irq_event_percpu+0x75/0x3d0
[  181.184686]  [<ffffffff810ffb58>] handle_irq_event+0x48/0x70
[  181.184689]  [<ffffffff81102d9e>] ? handle_fasteoi_irq+0x1e/0x100
[  181.184692]  [<ffffffff81102dda>] handle_fasteoi_irq+0x5a/0x100
[  181.184696]  [<ffffffff81004320>] handle_irq+0x60/0x150
[  181.184702]  [<ffffffff816ff846>] ? atomic_notifier_call_chain+0x16/0x20
[  181.184706]  [<ffffffff81705f7a>] do_IRQ+0x5a/0xe0
[  181.184710]  [<ffffffff816fb52f>] common_interrupt+0x6f/0x6f
[  181.184712]  <EOI> 
[  181.184716]  [<ffffffff8100aa45>] ? default_idle+0x25/0x280
[  181.184719]  [<ffffffff8100aa43>] ? default_idle+0x23/0x280
[  181.184722]  [<ffffffff8100b4f6>] arch_cpu_idle+0x26/0x30
[  181.184726]  [<ffffffff810afe66>] cpu_startup_entry+0x96/0x3e0
[  181.184729]  [<ffffffff810b7ad5>] ? clockevents_register_device+0xb5/0x120
[  181.184734]  [<ffffffff816e67ea>] start_secondary+0x27a/0x27c
[  181.184767] Code: 47 c1 c1 ea 03 83 c2 01 39 d0 0f 47 c2 c3 66 90 66 66 66 66 90 55 85 f6 48 89 e5 74 1b f0 80 67 18 fe 48 8b 47 40 48 85 c0 74 02 <ff> d0 5d 66 90 c3 0f 1f 80 00 00 00 00 48 8b 47 18 a8 01 b8 fb 
[  181.184770] RIP  [<ffffffff811fa97b>] bio_endio+0x1b/0x40
[  181.184772]  RSP <ffff88007d203a28>
[  181.184777] ---[ end trace 5e8fd083b8562c3c ]---
[  181.184779] Kernel panic - not syncing: Fatal exception in interrupt
[  181.185483] drm_kms_helper: panic occurred, switching back to text console

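A note on reading the dump above: the faulting instruction bytes (<ff> d0)
decode to an indirect call through RAX, and RAX holds 6b6b6b6b6b6b6b6b. 0x6b
is the byte the slab allocator writes over freed objects when slab poisoning
is enabled (POISON_FREE in include/linux/poison.h), and RCX holds
dead000000200200, which matches LIST_POISON2 on x86_64 kernels of this era.
That is consistent with bio_endio() calling a function pointer read from an
already freed bio, although the trace alone does not prove it. The userspace
sketch below shows one way to flag such values when scanning an oops. The
helper names are made up for illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Values as found in include/linux/poison.h for x86_64 kernels around 3.10.
 * Treat them as assumptions for other kernel versions or architectures.
 */
#define POISON_FREE_BYTE	0x6bU
#define LIST_POISON1_X86_64	0xdead000000100100ULL
#define LIST_POISON2_X86_64	0xdead000000200200ULL

/* Hypothetical helper: true if every byte of v is the slab free-poison byte. */
bool is_slab_free_poison(uint64_t v)
{
	for (int i = 0; i < 8; i++) {
		if (((v >> (8 * i)) & 0xffU) != POISON_FREE_BYTE)
			return false;
	}
	return true;
}

/* Hypothetical helper: true if v equals one of the list poison pointers. */
bool is_list_poison(uint64_t v)
{
	return v == LIST_POISON1_X86_64 || v == LIST_POISON2_X86_64;
}
```

Applied to the registers above, RAX and R13 would be flagged as slab free
poison and RCX as list poison, while RDI (ffff880068e29200) looks like an
ordinary kernel pointer.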

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
  2013-05-22 14:39   ` Alexander Gordeev
@ 2014-09-11 12:42   ` Alexander Gordeev
  2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
                       ` (2 more replies)
  1 sibling, 3 replies; 75+ messages in thread
From: Alexander Gordeev @ 2014-09-11 12:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, linux-ide

On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> Hmmmmmm..... I'd normally apply this patch but block layer is just
> growing multi-queue support and libata is likely to be converted to mq
> in the foreseeable future, so I'm a bit hesitant to make irq handling
> more sophisticated right now.  Would you be interested in looking into
> converting libata to blk mq support?  I'm pretty sure it'd yield far
> better outcome if done properly.

Hi Tejun,

As the conversion of libata to blk mq was done long ago, I tried the change
against a recent version, and the results still appear worthwhile.

The numbers are taken by running 'dd if=/dev/sd{a,b} of=/dev/null'
in parallel. All time values are in us.

Before this update, the host lock average holdtime was 2.45 and the
average waittime was 1.24. After the update, the average holdtime
dropped to 0.29 (about eight times lower) while the average waittime
decreased to 0.58 (about two times lower).

Also, port events are now handled with local interrupts enabled
and compete on individual per-port locks, with an average holdtime
of 1.25 and an average waittime of 1.48. So the combined average
holdtime across the host and port locks decreased from 2.45 to
0.29 + 1.25 = 1.54 (about 1.6 times lower).
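
The ratios quoted above can be rechecked with a few lines of C. This is only
a sketch of the arithmetic; the function names are illustrative and the
inputs are the averages from this message.

```c
/* Factor by which a lock time shrank: time before divided by time after. */
double improvement_factor(double before_us, double after_us)
{
	return before_us / after_us;
}

/* Total hold time once the work is split between the host and port locks. */
double combined_holdtime(double host_us, double port_us)
{
	return host_us + port_us;
}
```

With the numbers above, improvement_factor(2.45, 0.29) is roughly 8.4,
improvement_factor(1.24, 0.58) is roughly 2.1, and
improvement_factor(2.45, combined_holdtime(0.29, 1.25)) is roughly 1.6.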

The downside of this change is the introduction of a kernel thread.

The upside is shorter access time to port locks and moving port
interrupt handling out of the hardware interrupt context.
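
For readers unfamiliar with the pattern, here is a userspace sketch of the
split, loosely modelled on the patch that follows. It is not the driver code:
the sim_* names, the two-port host and the toy spinlock are all assumptions
made for illustration. The hard-IRQ half only acknowledges and latches events
under the host lock; the threaded half takes the latched events and does the
expensive work under the per-port lock.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SIM_NR_PORTS 2

/* Toy test-and-set spinlock standing in for the kernel's spinlock_t. */
struct sim_lock { atomic_flag held; };

static void sim_lock_acquire(struct sim_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the holder releases */
}

static void sim_lock_release(struct sim_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct sim_port {
	struct sim_lock lock;	/* per-port lock, used by the threaded half */
	uint32_t hw_irq_stat;	/* stands in for the PORT_IRQ_STAT register */
	uint32_t intr_status;	/* events latched by the hard-IRQ half */
	uint32_t handled;	/* events the threaded half has processed */
};

struct sim_host {
	struct sim_lock lock;	/* host-wide lock, held only briefly */
	struct sim_port ports[SIM_NR_PORTS];
};

void sim_host_init(struct sim_host *host)
{
	atomic_flag_clear(&host->lock.held);
	for (int i = 0; i < SIM_NR_PORTS; i++) {
		atomic_flag_clear(&host->ports[i].lock.held);
		host->ports[i].hw_irq_stat = 0;
		host->ports[i].intr_status = 0;
		host->ports[i].handled = 0;
	}
}

/*
 * Hard-IRQ half: acknowledge the hardware and latch the events.
 * Returns 1 when the thread should run (IRQ_WAKE_THREAD in the kernel).
 */
int sim_hard_irq(struct sim_host *host)
{
	int wake = 0;

	sim_lock_acquire(&host->lock);
	for (int i = 0; i < SIM_NR_PORTS; i++) {
		struct sim_port *pp = &host->ports[i];
		uint32_t status = pp->hw_irq_stat;

		pp->hw_irq_stat = 0;		/* models write-to-clear */
		pp->intr_status |= status;
		wake |= (status != 0);
	}
	sim_lock_release(&host->lock);

	return wake;
}

/* Threaded half: fetch latched events, then process under the port lock. */
void sim_port_thread_fn(struct sim_host *host, struct sim_port *pp)
{
	uint32_t status;

	sim_lock_acquire(&host->lock);
	status = pp->intr_status;
	pp->intr_status = 0;
	sim_lock_release(&host->lock);

	sim_lock_acquire(&pp->lock);
	pp->handled |= status;	/* placeholder for command completion work */
	sim_lock_release(&pp->lock);
}
```

The point of the design is that the host lock is held only long enough to
move a status word, so submitters contending on a port lock no longer wait
behind completion processing for other ports.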

Thanks!

> -- 
> tejun

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* [PATCH v2] AHCI: Optimize interrupt processing
  2014-09-11 12:42   ` Alexander Gordeev
@ 2014-09-11 12:44     ` Alexander Gordeev
  2014-09-13  4:43       ` Tejun Heo
  2014-09-11 13:36     ` [PATCH RESEND 0/1] " Bartlomiej Zolnierkiewicz
  2014-09-13  4:46     ` Tejun Heo
  2 siblings, 1 reply; 75+ messages in thread
From: Alexander Gordeev @ 2014-09-11 12:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, linux-ide

Split the interrupt service routine into a hardware context handler and a
threaded context handler. That allows ports to be protected with individual
locks rather than with a single host-wide lock, which results in better
parallelism.

Cc: linux-ide@vger.kernel.org
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
 drivers/ata/acard-ahci.c       |   3 +-
 drivers/ata/ahci.c             | 100 ++++++++++++++++++++++++++---------------
 drivers/ata/ahci.h             |   8 ++--
 drivers/ata/libahci.c          |  74 +++++++++++++++++-------------
 drivers/ata/libahci_platform.c |   3 +-
 5 files changed, 115 insertions(+), 73 deletions(-)

diff --git a/drivers/ata/acard-ahci.c b/drivers/ata/acard-ahci.c
index 25d0ac3..c962886 100644
--- a/drivers/ata/acard-ahci.c
+++ b/drivers/ata/acard-ahci.c
@@ -498,8 +498,7 @@ static int acard_ahci_init_one(struct pci_dev *pdev, const struct pci_device_id
 	acard_ahci_pci_print_info(host);
 
 	pci_set_master(pdev);
-	return ata_host_activate(host, pdev->irq, ahci_interrupt, IRQF_SHARED,
-				 &acard_ahci_sht);
+	return ahci_host_activate(host, pdev->irq, &acard_ahci_sht);
 }
 
 module_pci_driver(acard_ahci_pci_driver);
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index a29f801..52decdc 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1211,6 +1211,9 @@ static int ahci_init_interrupts(struct pci_dev *pdev, unsigned int n_ports,
 		goto single_msi;
 	}
 
+	if (nvec > 1)
+		hpriv->flags |= AHCI_HFLAG_MULTI_MSI;
+
 	return nvec;
 
 single_msi:
@@ -1223,32 +1226,11 @@ intx:
 	return 0;
 }
 
-/**
- *	ahci_host_activate - start AHCI host, request IRQs and register it
- *	@host: target ATA host
- *	@irq: base IRQ number to request
- *	@n_msis: number of MSIs allocated for this host
- *	@irq_handler: irq_handler used when requesting IRQs
- *	@irq_flags: irq_flags used when requesting IRQs
- *
- *	Similar to ata_host_activate, but requests IRQs according to AHCI-1.1
- *	when multiple MSIs were allocated. That is one MSI per port, starting
- *	from @irq.
- *
- *	LOCKING:
- *	Inherited from calling layer (may sleep).
- *
- *	RETURNS:
- *	0 on success, -errno otherwise.
- */
-int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis)
+static int ahci_host_activate_multi_irqs(struct ata_host *host, int irq,
+					 struct scsi_host_template *sht)
 {
 	int i, rc;
 
-	/* Sharing Last Message among several ports is not supported */
-	if (n_msis < host->n_ports)
-		return -EINVAL;
-
 	rc = ata_host_start(host);
 	if (rc)
 		return rc;
@@ -1263,8 +1245,8 @@ int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis)
 		}
 
 		rc = devm_request_threaded_irq(host->dev, irq + i,
-					       ahci_hw_interrupt,
-					       ahci_thread_fn, IRQF_SHARED,
+					       ahci_multi_irqs_intr,
+					       ahci_port_thread_fn, IRQF_SHARED,
 					       pp->irq_desc, host->ports[i]);
 		if (rc)
 			goto out_free_irqs;
@@ -1273,7 +1255,7 @@ int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis)
 	for (i = 0; i < host->n_ports; i++)
 		ata_port_desc(host->ports[i], "irq %d", irq + i);
 
-	rc = ata_host_register(host, &ahci_sht);
+	rc = ata_host_register(host, sht);
 	if (rc)
 		goto out_free_all_irqs;
 
@@ -1288,6 +1270,60 @@ out_free_irqs:
 	return rc;
 }
 
+static int ahci_host_activate_single_irq(struct ata_host *host, int irq,
+					 struct scsi_host_template *sht)
+{
+	int i, rc;
+
+	rc = ata_host_start(host);
+	if (rc)
+		return rc;
+
+	rc = devm_request_threaded_irq(host->dev, irq, ahci_single_irq_intr,
+				       ahci_thread_fn, IRQF_SHARED,
+				       dev_driver_string(host->dev), host);
+	if (rc)
+		return rc;
+
+	for (i = 0; i < host->n_ports; i++)
+		ata_port_desc(host->ports[i], "irq %d", irq);
+
+	rc = ata_host_register(host, sht);
+	if (rc)
+		devm_free_irq(host->dev, irq, host);
+
+	return rc;
+
+}
+
+/**
+ *	ahci_host_activate - start AHCI host, request IRQs and register it
+ *	@host: target ATA host
+ *	@irq: base IRQ number to request
+ *	@sht: scsi_host_template to use when registering the host
+ *
+ *	Similar to ata_host_activate, but requests IRQs according to AHCI-1.1
+ *	when multiple MSIs were allocated. That is one MSI per port, starting
+ *	from @irq.
+ *
+ *	LOCKING:
+ *	Inherited from calling layer (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno otherwise.
+ */
+int ahci_host_activate(struct ata_host *host, int irq,
+		       struct scsi_host_template *sht)
+{
+	struct ahci_host_priv *hpriv = host->private_data;
+
+	if (hpriv->flags & AHCI_HFLAG_MULTI_MSI)
+		return ahci_host_activate_multi_irqs(host, irq, sht);
+	else
+		return ahci_host_activate_single_irq(host, irq, sht);
+}
+EXPORT_SYMBOL_GPL(ahci_host_activate);
+
 static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	unsigned int board_id = ent->driver_data;
@@ -1296,7 +1332,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	struct device *dev = &pdev->dev;
 	struct ahci_host_priv *hpriv;
 	struct ata_host *host;
-	int n_ports, n_msis, i, rc;
+	int n_ports, i, rc;
 	int ahci_pci_bar = AHCI_PCI_BAR_STANDARD;
 
 	VPRINTK("ENTER\n");
@@ -1437,9 +1473,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 */
 	n_ports = max(ahci_nr_ports(hpriv->cap), fls(hpriv->port_map));
 
-	n_msis = ahci_init_interrupts(pdev, n_ports, hpriv);
-	if (n_msis > 1)
-		hpriv->flags |= AHCI_HFLAG_MULTI_MSI;
+	ahci_init_interrupts(pdev, n_ports, hpriv);
 
 	host = ata_host_alloc_pinfo(&pdev->dev, ppi, n_ports);
 	if (!host)
@@ -1491,11 +1525,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	pci_set_master(pdev);
 
-	if (hpriv->flags & AHCI_HFLAG_MULTI_MSI)
-		return ahci_host_activate(host, pdev->irq, n_msis);
-
-	return ata_host_activate(host, pdev->irq, ahci_interrupt, IRQF_SHARED,
-				 &ahci_sht);
+	return ahci_host_activate(host, pdev->irq, &ahci_sht);
 }
 
 module_pci_driver(ahci_pci_driver);
diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index 59ae0ee..c12f590 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -388,11 +388,13 @@ int ahci_port_resume(struct ata_port *ap);
 void ahci_set_em_messages(struct ahci_host_priv *hpriv,
 			  struct ata_port_info *pi);
 int ahci_reset_em(struct ata_host *host);
-irqreturn_t ahci_interrupt(int irq, void *dev_instance);
-irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance);
+irqreturn_t ahci_single_irq_intr(int irq, void *dev_instance);
+irqreturn_t ahci_multi_irqs_intr(int irq, void *dev_instance);
 irqreturn_t ahci_thread_fn(int irq, void *dev_instance);
+irqreturn_t ahci_port_thread_fn(int irq, void *dev_instance);
 void ahci_print_info(struct ata_host *host, const char *scc_s);
-int ahci_host_activate(struct ata_host *host, int irq, unsigned int n_msis);
+int ahci_host_activate(struct ata_host *host, int irq,
+		       struct scsi_host_template *sht);
 void ahci_error_handler(struct ata_port *ap);
 
 static inline void __iomem *__ahci_port_base(struct ata_host *host,
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index b784e9d..169c272 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1693,9 +1693,9 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
 		ata_port_abort(ap);
 }
 
-static void ahci_handle_port_interrupt(struct ata_port *ap,
-				       void __iomem *port_mmio, u32 status)
+static void ahci_handle_port_interrupt(struct ata_port *ap, u32 status)
 {
+	void __iomem *port_mmio = ahci_port_base(ap);
 	struct ata_eh_info *ehi = &ap->link.eh_info;
 	struct ahci_port_priv *pp = ap->private_data;
 	struct ahci_host_priv *hpriv = ap->host->private_data;
@@ -1778,22 +1778,10 @@ static void ahci_handle_port_interrupt(struct ata_port *ap,
 	}
 }
 
-static void ahci_port_intr(struct ata_port *ap)
-{
-	void __iomem *port_mmio = ahci_port_base(ap);
-	u32 status;
-
-	status = readl(port_mmio + PORT_IRQ_STAT);
-	writel(status, port_mmio + PORT_IRQ_STAT);
-
-	ahci_handle_port_interrupt(ap, port_mmio, status);
-}
-
-irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
+irqreturn_t ahci_port_thread_fn(int irq, void *dev_instance)
 {
 	struct ata_port *ap = dev_instance;
 	struct ahci_port_priv *pp = ap->private_data;
-	void __iomem *port_mmio = ahci_port_base(ap);
 	unsigned long flags;
 	u32 status;
 
@@ -1804,14 +1792,43 @@ irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
 	spin_unlock_irqrestore(&ap->host->lock, flags);
 
 	spin_lock_bh(ap->lock);
-	ahci_handle_port_interrupt(ap, port_mmio, status);
+	ahci_handle_port_interrupt(ap, status);
 	spin_unlock_bh(ap->lock);
 
 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_GPL(ahci_port_thread_fn);
+
+irqreturn_t ahci_thread_fn(int irq, void *dev_instance)
+{
+	struct ata_host *host = dev_instance;
+	struct ahci_host_priv *hpriv = host->private_data;
+	u32 irq_masked = hpriv->port_map;
+	unsigned int i;
+
+	for (i = 0; i < host->n_ports; i++) {
+		struct ata_port *ap;
+
+		if (!(irq_masked & (1 << i)))
+			continue;
+
+		ap = host->ports[i];
+		if (ap) {
+			ahci_port_thread_fn(irq, ap);
+			VPRINTK("port %u\n", i);
+		} else {
+			VPRINTK("port %u (no irq)\n", i);
+			if (ata_ratelimit())
+				dev_warn(host->dev,
+					 "interrupt on disabled port %u\n", i);
+		}
+	}
+
+	return IRQ_HANDLED;
+}
 EXPORT_SYMBOL_GPL(ahci_thread_fn);
 
-static void ahci_hw_port_interrupt(struct ata_port *ap)
+static void ahci_update_intr_status(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	struct ahci_port_priv *pp = ap->private_data;
@@ -1823,7 +1840,7 @@ static void ahci_hw_port_interrupt(struct ata_port *ap)
 	pp->intr_status |= status;
 }
 
-irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
+irqreturn_t ahci_multi_irqs_intr(int irq, void *dev_instance)
 {
 	struct ata_port *ap_this = dev_instance;
 	struct ahci_port_priv *pp = ap_this->private_data;
@@ -1859,7 +1876,7 @@ irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
 
 		ap = host->ports[i];
 		if (ap) {
-			ahci_hw_port_interrupt(ap);
+			ahci_update_intr_status(ap);
 			VPRINTK("port %u\n", i);
 		} else {
 			VPRINTK("port %u (no irq)\n", i);
@@ -1877,9 +1894,9 @@ irqreturn_t ahci_hw_interrupt(int irq, void *dev_instance)
 
 	return IRQ_WAKE_THREAD;
 }
-EXPORT_SYMBOL_GPL(ahci_hw_interrupt);
+EXPORT_SYMBOL_GPL(ahci_multi_irqs_intr);
 
-irqreturn_t ahci_interrupt(int irq, void *dev_instance)
+irqreturn_t ahci_single_irq_intr(int irq, void *dev_instance)
 {
 	struct ata_host *host = dev_instance;
 	struct ahci_host_priv *hpriv;
@@ -1909,7 +1926,7 @@ irqreturn_t ahci_interrupt(int irq, void *dev_instance)
 
 		ap = host->ports[i];
 		if (ap) {
-			ahci_port_intr(ap);
+			ahci_update_intr_status(ap);
 			VPRINTK("port %u\n", i);
 		} else {
 			VPRINTK("port %u (no irq)\n", i);
@@ -1936,9 +1953,9 @@ irqreturn_t ahci_interrupt(int irq, void *dev_instance)
 
 	VPRINTK("EXIT\n");
 
-	return IRQ_RETVAL(handled);
+	return handled ? IRQ_WAKE_THREAD : IRQ_NONE;
 }
-EXPORT_SYMBOL_GPL(ahci_interrupt);
+EXPORT_SYMBOL_GPL(ahci_single_irq_intr);
 
 unsigned int ahci_qc_issue(struct ata_queued_cmd *qc)
 {
@@ -2349,13 +2366,8 @@ static int ahci_port_start(struct ata_port *ap)
 	 */
 	pp->intr_mask = DEF_PORT_IRQ;
 
-	/*
-	 * Switch to per-port locking in case each port has its own MSI vector.
-	 */
-	if ((hpriv->flags & AHCI_HFLAG_MULTI_MSI)) {
-		spin_lock_init(&pp->lock);
-		ap->lock = &pp->lock;
-	}
+	spin_lock_init(&pp->lock);
+	ap->lock = &pp->lock;
 
 	ap->private_data = pp;
 
diff --git a/drivers/ata/libahci_platform.c b/drivers/ata/libahci_platform.c
index 5b92c29..a085224 100644
--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -495,8 +495,7 @@ int ahci_platform_init_host(struct platform_device *pdev,
 	ahci_init_controller(host);
 	ahci_print_info(host, "platform");
 
-	return ata_host_activate(host, irq, ahci_interrupt, IRQF_SHARED,
-				 &ahci_platform_sht);
+	return ahci_host_activate(host, irq, &ahci_platform_sht);
 }
 EXPORT_SYMBOL_GPL(ahci_platform_init_host);
 
-- 
1.9.3

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2014-09-11 12:42   ` Alexander Gordeev
  2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
@ 2014-09-11 13:36     ` Bartlomiej Zolnierkiewicz
  2014-09-13  4:46     ` Tejun Heo
  2 siblings, 0 replies; 75+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2014-09-11 13:36 UTC (permalink / raw)
  To: Alexander Gordeev; +Cc: Tejun Heo, linux-kernel, linux-ide


Hi,

On Thursday, September 11, 2014 02:42:49 PM Alexander Gordeev wrote:

> The numbers are taken by running 'dd if=/dev/sd{a,b} of=/dev/null'
> in parallel. All time values are in us.
> 
> Before this update, the host lock average holdtime was 2.45 and the
> average waittime was 1.24. After the update, the average holdtime
> dropped to 0.29 (about eight times lower) while the average waittime
> decreased to 0.58 (about two times lower).
> 
> Also, port events are now handled with local interrupts enabled
> and compete on individual per-port locks, with an average holdtime
> of 1.25 and an average waittime of 1.48. So the combined average
> holdtime across the host and port locks decreased from 2.45 to
> 0.29 + 1.25 = 1.54 (about 1.6 times lower).
> 
> The downside of this change is the introduction of a kernel thread.
> 
> The upside is shorter access time to port locks and moving port
> interrupt handling out of the hardware interrupt context.

IMHO it would be great to put the above results into the patch
description (it now looks a bit skimpy).

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics



* Re: [PATCH v2] AHCI: Optimize interrupt processing
  2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
@ 2014-09-13  4:43       ` Tejun Heo
  0 siblings, 0 replies; 75+ messages in thread
From: Tejun Heo @ 2014-09-13  4:43 UTC (permalink / raw)
  To: Alexander Gordeev; +Cc: linux-kernel, linux-ide

On Thu, Sep 11, 2014 at 02:44:37PM +0200, Alexander Gordeev wrote:
> Split the interrupt service routine into a hardware context handler and a
> threaded context handler. That allows ports to be protected with individual
> locks rather than with a single host-wide lock, which results in better
> parallelism.

This patch is way too large.  Can you please split it up into smaller
logical changes?

Thanks.

-- 
tejun


* Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
  2014-09-11 12:42   ` Alexander Gordeev
  2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
  2014-09-11 13:36     ` [PATCH RESEND 0/1] " Bartlomiej Zolnierkiewicz
@ 2014-09-13  4:46     ` Tejun Heo
  2 siblings, 0 replies; 75+ messages in thread
From: Tejun Heo @ 2014-09-13  4:46 UTC (permalink / raw)
  To: Alexander Gordeev; +Cc: linux-kernel, linux-ide

Hello,

On Thu, Sep 11, 2014 at 02:42:49PM +0200, Alexander Gordeev wrote:
> As the conversion of libata to blk mq was done long ago, I tried the change

Hmmm?  You mean scsi-mq?

> against the recent version and the results still appear worthwhile.
> 
> The numbers are taken by running 'dd if=/dev/sd{a,b} of=/dev/null'
> in parallel. All time values are in us.
> 
> Before this update, the host lock average holdtime was 2.45 and the
> average waittime was 1.24. After the update, the average holdtime
> dropped to 0.29 (about eight times lower) while the average waittime
> decreased to 0.58 (about two times lower).
> 
> Also, port events are now handled with local interrupts enabled
> and compete on individual per-port locks, with an average holdtime
> of 1.25 and an average waittime of 1.48. So the combined average
> holdtime across the host and port locks decreased from 2.45 to
> 0.29 + 1.25 = 1.54 (about 1.6 times lower).
> 
> The downside of this change is the introduction of a kernel thread.

That shouldn't matter at all, but can you please present the
information in a more digestible form?  e.g. CPU usage decreased from
A to B when transferring N MB/s on a certain setup.

Thanks.

-- 
tejun


Thread overview: 75+ messages
2013-05-21 19:00 [PATCH RESEND 0/1] AHCI: Optimize interrupt processing Alexander Gordeev
2013-05-21 19:00 ` [PATCH RESEND 1/1] " Alexander Gordeev
2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
2013-05-22 14:39   ` Alexander Gordeev
2013-05-22 17:03     ` Jens Axboe
2013-07-11 10:26       ` Alexander Gordeev
2013-07-11 10:26         ` Alexander Gordeev
2013-07-11 23:00         ` Nicholas A. Bellinger
2013-07-12  7:46           ` Alexander Gordeev
2013-07-13  5:20             ` Nicholas A. Bellinger
2013-07-16 18:32               ` Alexander Gordeev
2013-07-16 21:38                 ` Nicholas A. Bellinger
2013-07-17 16:19                   ` Alexander Gordeev
2013-07-18 18:51                     ` Nicholas A. Bellinger
2013-07-18 19:12                       ` Mike Christie
2013-07-19  0:23                         ` Nicholas A. Bellinger
2013-07-19  0:30                           ` Jens Axboe
2013-07-19  1:03                             ` Nicholas A. Bellinger
2013-07-19  6:34                               ` Nicholas A. Bellinger
2013-07-19  6:34                                 ` Nicholas A. Bellinger
2013-07-19 15:33                                 ` James Bottomley
2013-07-19 21:01                                   ` Nicholas A. Bellinger
2013-07-20  4:56                                     ` Nicholas A. Bellinger
2013-07-20 14:48                                       ` Mike Christie
2013-07-20 14:48                                         ` Mike Christie
2013-07-20 22:14                                         ` Nicholas A. Bellinger
2013-07-20 23:57                                         ` Jens Axboe
2013-08-09 19:15                                         ` Alexander Gordeev
2013-08-09 20:17                                           ` Nicholas A. Bellinger
2013-08-15 16:23                                             ` Alexander Gordeev
2013-08-16  2:19                                               ` Nicholas A. Bellinger
2013-08-16 16:41                                                 ` Alexander Gordeev
2013-08-16 17:46                                                   ` Nicholas A. Bellinger
2013-08-28 15:56                                                 ` Alexander Gordeev
2013-08-28 15:56                                                   ` Alexander Gordeev
2013-09-20 15:19                                                 ` Alexander Gordeev
2013-09-20 15:19                                                   ` Alexander Gordeev
2013-09-20 20:41                                                   ` Nicholas A. Bellinger
2013-10-03 11:06                                         ` Christoph Hellwig
2013-10-07 14:44                                           ` Alexander Gordeev
2013-07-22 15:03                                       ` Alexander Gordeev
2013-07-22 21:10                                         ` Nicholas A. Bellinger
2013-07-25 10:16                                           ` Alexander Gordeev
2013-07-25 22:08                                             ` Nicholas A. Bellinger
2013-07-26  2:09                                               ` Jens Axboe
2013-07-26 21:14                                                 ` Nicholas A. Bellinger
2013-07-27  0:43                                                   ` Nicholas A. Bellinger
2013-07-29 11:18                                                     ` Alexander Gordeev
2013-07-29 14:08                                                       ` Jens Axboe
2013-07-29 19:19                                                       ` Nicholas A. Bellinger
2013-07-31  4:16                                                         ` Marc C
2013-07-31 10:23                                                           ` Tejun Heo
2013-07-29 11:50                                                     ` Tejun Heo
2013-07-29 19:11                                                       ` Nicholas A. Bellinger
2013-07-29 11:46                                                   ` Tejun Heo
2013-07-29 14:03                                                     ` Jens Axboe
2013-08-09  8:23                                                     ` Alexander Gordeev
2013-08-09 14:15                                                       ` Tejun Heo
2013-08-09 14:24                                                       ` Jens Axboe
2013-08-09 15:07                                                         ` Alexander Gordeev
2013-08-09 15:52                                                           ` Jens Axboe
2013-08-09 16:46                                                             ` Alexander Gordeev
2013-08-09 17:07                                                               ` Jens Axboe
2013-08-12 15:21                                                                 ` Alexander Gordeev
2013-07-29  7:28                                                 ` Hannes Reinecke
2013-07-31 17:11                                             ` Alexander Gordeev
2013-07-19 15:58                           ` Mike Christie
2013-07-19 21:05                             ` Nicholas A. Bellinger
2013-07-18 19:14                       ` Nicholas A. Bellinger
2013-07-18 21:21                         ` Jens Axboe
2014-09-11 12:42   ` Alexander Gordeev
2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
2014-09-13  4:43       ` Tejun Heo
2014-09-11 13:36     ` [PATCH RESEND 0/1] " Bartlomiej Zolnierkiewicz
2014-09-13  4:46     ` Tejun Heo