All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 1/6] g_NCR5380: Fix PDMA transfer size
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

generic_NCR5380_dma_xfer_len() incorrectly uses cmd->transfersize
which causes rescan-scsi-bus and CD-ROM access to hang the system.
Use cmd->SCp.this_residual instead, like other NCR5380 drivers.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 67c8dac321ad..14ef4e8c4713 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -76,6 +76,7 @@
 #define IRQ_AUTO 254
 
 #define MAX_CARDS 8
+#define DMA_MAX_SIZE 32768
 
 /* old-style parameters for compatibility */
 static int ncr_irq = -1;
@@ -629,23 +630,16 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
                                         struct scsi_cmnd *cmd)
 {
-	int transfersize = cmd->transfersize;
+	int transfersize = cmd->SCp.this_residual;
 
 	if (hostdata->flags & FLAG_NO_PSEUDO_DMA)
 		return 0;
 
-	/* Limit transfers to 32K, for xx400 & xx406
-	 * pseudoDMA that transfers in 128 bytes blocks.
-	 */
-	if (transfersize > 32 * 1024 && cmd->SCp.this_residual &&
-	    !(cmd->SCp.this_residual % transfersize))
-		transfersize = 32 * 1024;
-
 	/* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */
 	if (transfersize % 128)
 		transfersize = 0;
 
-	return transfersize;
+	return min(transfersize, DMA_MAX_SIZE);
 }
 
 /*
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
@ 2017-06-29  5:24 ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

Ondrej, would you please test this new series?

Changed since v1:
- PDMA transfer residual is calculated earlier.
- End of DMA flag check is now polled (if there is any residual).

Changed since v2:
- Bail out of transfer loops when Gated IRQ gets asserted.
- Make udelay conditional on board type.
- Drop sg_tablesize patch due to performance regression.

Changed since v3:
- Add Ondrej's workaround for corrupt WRITE commands on DTC boards.
- Reset the 53c400 logic after any short PDMA transfer.
- Don't fail the transfer if the 53c400 logic got a reset.

Changed since v4:
- Bail out of transfer loops when Gated IRQ gets asserted. (Again.)
- Always call wait_for_53c80_registers() at end of transfer.
- Drain chip buffers after PDMA receive is interrupted.
- Rework residual calculation.
- Add new patch to correct DMA terminology.


Finn Thain (2):
  g_NCR5380: Cleanup comments and whitespace
  g_NCR5380: Use unambiguous terminology for PDMA send and receive

Ondrej Zary (4):
  g_NCR5380: Fix PDMA transfer size
  g_NCR5380: End PDMA transfer correctly on target disconnection
  g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on
    DTC3181E
  g_NCR5380: Re-work PDMA loops

 drivers/scsi/g_NCR5380.c | 260 +++++++++++++++++++++++++----------------------
 1 file changed, 139 insertions(+), 121 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 4/6] g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on DTC3181E
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

The corruption is always the same: one byte missing at the beginning of
a 128 B block. It happens only with slow Quantum LPS 240 drive, not with
faster IBM DORS-32160. It's not clear what causes this. Documentation
for the DTC436 chip has not been made available. Hence this workaround.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index dedaed2d16e4..5fd227bb1830 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -45,7 +45,8 @@
 	int c400_blk_cnt; \
 	int c400_host_buf; \
 	int io_width; \
-	int pdma_residual
+	int pdma_residual; \
+	int board
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
 #define NCR5380_dma_recv_setup          generic_NCR5380_pread
@@ -316,6 +317,7 @@ static int generic_NCR5380_init_one(struct scsi_host_template *tpnt,
 	}
 	hostdata = shost_priv(instance);
 
+	hostdata->board = board;
 	hostdata->io = iomem;
 	hostdata->region_size = region_size;
 
@@ -644,7 +646,12 @@ static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
 
 	/* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */
 	if (transfersize % 128)
-		transfersize = 0;
+		return 0;
+
+	/* Limit PDMA send to 512 B to avoid random corruption on DTC3181E */
+	if (hostdata->board == BOARD_DTC3181E &&
+	    cmd->sc_data_direction == DMA_TO_DEVICE)
+		transfersize = min(cmd->SCp.this_residual, 512);
 
 	return min(transfersize, DMA_MAX_SIZE);
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 2/6] g_NCR5380: End PDMA transfer correctly on target disconnection
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

When an IRQ arrives during PDMA transfer, pread() and pwrite() return
without waiting for the 53C80 registers to be ready and this ends up
messing up the chip state. This was observed with SONY CDU-55S which is
slow enough to disconnect during 4096-byte reads.

IRQ during PDMA is not an error so don't return -1. Instead, store the
remaining byte count for use by NCR5380_dma_residual().

[Poll for the BASR_END_DMA_TRANSFER condition rather than remove the
error message -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 48 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 14ef4e8c4713..911a4300ea51 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -44,12 +44,13 @@
 	int c400_ctl_status; \
 	int c400_blk_cnt; \
 	int c400_host_buf; \
-	int io_width
+	int io_width; \
+	int pdma_residual
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
 #define NCR5380_dma_recv_setup          generic_NCR5380_pread
 #define NCR5380_dma_send_setup          generic_NCR5380_pwrite
-#define NCR5380_dma_residual            NCR5380_dma_residual_none
+#define NCR5380_dma_residual            generic_NCR5380_dma_residual
 
 #define NCR5380_intr                    generic_NCR5380_intr
 #define NCR5380_queue_command           generic_NCR5380_queue_command
@@ -500,10 +501,8 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 	while (1) {
 		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
 			break;
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ) {
-			printk(KERN_ERR "53C400r: Got 53C80_IRQ start=%d, blocks=%d\n", start, blocks);
-			return -1;
-		}
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
+			goto out_wait;
 		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
 			; /* FIXME - no timeout */
 
@@ -542,13 +541,19 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 	if (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ))
 		printk("53C400r: no 53C80 gated irq after transfer");
 
+out_wait:
+	hostdata->pdma_residual = len - start;
+
 	/* wait for 53C80 registers to be available */
 	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG))
 		;
 
-	if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_END_DMA_TRANSFER))
-		printk(KERN_ERR "53C400r: no end dma signal\n");
-		
+	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+	                          HZ / 64) < 0)
+		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+		            __func__, hostdata->pdma_residual);
+
 	return 0;
 }
 
@@ -571,10 +576,8 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
 	NCR5380_write(hostdata->c400_blk_cnt, blocks);
 	while (1) {
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ) {
-			printk(KERN_ERR "53C400w: Got 53C80_IRQ start=%d, blocks=%d\n", start, blocks);
-			return -1;
-		}
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
+			goto out_wait;
 
 		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
 			break;
@@ -612,18 +615,24 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 		blocks--;
 	}
 
+out_wait:
+	hostdata->pdma_residual = len - start;
+
 	/* wait for 53C80 registers to be available */
 	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)) {
 		udelay(4); /* DTC436 chip hangs without this */
 		/* FIXME - no timeout */
 	}
 
-	if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_END_DMA_TRANSFER)) {
-		printk(KERN_ERR "53C400w: no end dma signal\n");
-	}
-
 	while (!(NCR5380_read(TARGET_COMMAND_REG) & TCR_LAST_BYTE_SENT))
 		; 	// TIMEOUT
+
+	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+	                          HZ / 64) < 0)
+		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+		            __func__, hostdata->pdma_residual);
+
 	return 0;
 }
 
@@ -642,6 +651,11 @@ static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
 	return min(transfersize, DMA_MAX_SIZE);
 }
 
+static int generic_NCR5380_dma_residual(struct NCR5380_hostdata *hostdata)
+{
+	return hostdata->pdma_residual;
+}
+
 /*
  *	Include the NCR5380 core code that we build our driver around	
  */
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 3/6] g_NCR5380: Cleanup comments and whitespace
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 61 ++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 911a4300ea51..dedaed2d16e4 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -1,17 +1,17 @@
 /*
  * Generic Generic NCR5380 driver
- *	
+ *
  * Copyright 1993, Drew Eckhardt
- *	Visionary Computing
- *	(Unix and Linux consulting and custom programming)
- *	drew@colorado.edu
- *      +1 (303) 440-4894
+ * Visionary Computing
+ * (Unix and Linux consulting and custom programming)
+ * drew@colorado.edu
+ * +1 (303) 440-4894
  *
  * NCR53C400 extensions (c) 1994,1995,1996, Kevin Lentin
- *    K.Lentin@cs.monash.edu.au
+ * K.Lentin@cs.monash.edu.au
  *
  * NCR53C400A extensions (c) 1996, Ingmar Baumgart
- *    ingmar@gonzo.schwaben.de
+ * ingmar@gonzo.schwaben.de
  *
  * DTC3181E extensions (c) 1997, Ronald van Cuijlenborg
  * ronald.van.cuijlenborg@tip.nl or nutty@dds.nl
@@ -481,15 +481,14 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 }
 
 /**
- *	generic_NCR5380_pread - pseudo DMA read
- *	@hostdata: scsi host private data
- *	@dst: buffer to read into
- *	@len: buffer length
+ * generic_NCR5380_pread - pseudo DMA read
+ * @hostdata: scsi host private data
+ * @dst: buffer to write into
+ * @len: transfer size
  *
- *	Perform a pseudo DMA mode read from an NCR53C400 or equivalent
- *	controller
+ * Perform a pseudo DMA mode receive from a 53C400 or equivalent device.
  */
- 
+
 static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
@@ -508,10 +507,10 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			insw(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 64);
+			     dst + start, 64);
 		else if (hostdata->io_port)
 			insb(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 128);
+			     dst + start, 128);
 		else
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
@@ -558,13 +557,12 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 }
 
 /**
- *	generic_NCR5380_pwrite - pseudo DMA write
- *	@hostdata: scsi host private data
- *	@dst: buffer to read into
- *	@len: buffer length
+ * generic_NCR5380_pwrite - pseudo DMA write
+ * @hostdata: scsi host private data
+ * @src: buffer to read from
+ * @len: transfer size
  *
- *	Perform a pseudo DMA mode read from an NCR53C400 or equivalent
- *	controller
+ * Perform a pseudo DMA mode send to a 53C400 or equivalent device.
  */
 
 static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
@@ -603,10 +601,10 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			outsw(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 64);
+			      src + start, 64);
 		else if (hostdata->io_port)
 			outsb(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 128);
+			      src + start, 128);
 		else
 			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
 			            src + start, 128);
@@ -656,10 +654,8 @@ static int generic_NCR5380_dma_residual(struct NCR5380_hostdata *hostdata)
 	return hostdata->pdma_residual;
 }
 
-/*
- *	Include the NCR5380 core code that we build our driver around	
- */
- 
+/* Include the core driver code. */
+
 #include "NCR5380.c"
 
 static struct scsi_host_template driver_template = {
@@ -679,11 +675,10 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= 128,
 };
 
-
 static int generic_NCR5380_isa_match(struct device *pdev, unsigned int ndev)
 {
 	int ret = generic_NCR5380_init_one(&driver_template, pdev, base[ndev],
-					  irq[ndev], card[ndev]);
+	                                   irq[ndev], card[ndev]);
 	if (ret) {
 		if (base[ndev])
 			printk(KERN_WARNING "Card not found at address 0x%03x\n",
@@ -695,7 +690,7 @@ static int generic_NCR5380_isa_match(struct device *pdev, unsigned int ndev)
 }
 
 static int generic_NCR5380_isa_remove(struct device *pdev,
-				   unsigned int ndev)
+                                      unsigned int ndev)
 {
 	generic_NCR5380_release_resources(dev_get_drvdata(pdev));
 	dev_set_drvdata(pdev, NULL);
@@ -718,7 +713,7 @@ static struct pnp_device_id generic_NCR5380_pnp_ids[] = {
 MODULE_DEVICE_TABLE(pnp, generic_NCR5380_pnp_ids);
 
 static int generic_NCR5380_pnp_probe(struct pnp_dev *pdev,
-			       const struct pnp_device_id *id)
+                                     const struct pnp_device_id *id)
 {
 	int base, irq;
 
@@ -729,7 +724,7 @@ static int generic_NCR5380_pnp_probe(struct pnp_dev *pdev,
 	irq = pnp_irq(pdev, 0);
 
 	return generic_NCR5380_init_one(&driver_template, &pdev->dev, base, irq,
-				       id->driver_data);
+	                                id->driver_data);
 }
 
 static void generic_NCR5380_pnp_remove(struct pnp_dev *pdev)
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 5/6] g_NCR5380: Re-work PDMA loops
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

The polling loops in pread() and pwrite() can easily become infinite
loops and hang the machine.

On DTC chips, IRQ can arrive late and we miss it because we only check
once. Merge the IRQ check into host buffer wait and add polling limit.

Also place a limit on polling for 53C80 registers accessibility.

[Use NCR5380_poll_politely2() for register polling. Rely on polling for
gated IRQ rather than polling for phase error, like the algorithm in the
datasheet. Calculate residual from block count register instead of the
loop counter. Factor-out common code as wait_for_53c80_access(). -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 168 +++++++++++++++++++++++++----------------------
 1 file changed, 88 insertions(+), 80 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 5fd227bb1830..f7e50d2bca07 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -482,6 +482,30 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 		release_mem_region(base, region_size);
 }
 
+/* wait_for_53c80_access - wait for 53C80 registers to become accessible
+ * @hostdata: scsi host private data
+ *
+ * The registers within the 53C80 logic block are inaccessible until
+ * bit 7 in the 53C400 control status register gets asserted.
+ */
+
+static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
+{
+	int count = 10000;
+
+	do {
+		if (hostdata->board == BOARD_DTC3181E)
+			udelay(4); /* DTC436 chip hangs without this */
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)
+			return;
+	} while (--count > 0);
+
+	scmd_printk(KERN_ERR, hostdata->connected,
+	            "53c80 registers not accessible, device will be reset\n");
+	NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+}
+
 /**
  * generic_NCR5380_pread - pseudo DMA read
  * @hostdata: scsi host private data
@@ -494,18 +518,23 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
-	int blocks = len / 128;
+	int residual;
 	int start = 0;
 
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE | CSR_TRANS_DIR);
-	NCR5380_write(hostdata->c400_blk_cnt, blocks);
-	while (1) {
-		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
+	NCR5380_write(hostdata->c400_blk_cnt, len / 128);
+
+	while (start < len) {
+		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
+		                           CSR_HOST_BUF_NOT_RDY, 0,
+		                           hostdata->c400_ctl_status,
+		                           CSR_GATED_53C80_IRQ,
+		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
+			break;
+
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_HOST_BUF_NOT_RDY)
 			break;
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
-			goto out_wait;
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; /* FIXME - no timeout */
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			insw(hostdata->io_port + hostdata->c400_host_buf,
@@ -516,44 +545,30 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 		else
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
-
 		start += 128;
-		blocks--;
-	}
-
-	if (blocks) {
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; /* FIXME - no timeout */
 
-		if (hostdata->io_port && hostdata->io_width == 2)
-			insw(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 64);
-		else if (hostdata->io_port)
-			insb(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 128);
-		else
-			memcpy_fromio(dst + start,
-				hostdata->io + NCR53C400_host_buffer, 128);
-
-		start += 128;
-		blocks--;
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_GATED_53C80_IRQ)
+			break;
 	}
 
-	if (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ))
-		printk("53C400r: no 53C80 gated irq after transfer");
-
-out_wait:
-	hostdata->pdma_residual = len - start;
+	residual = len - start;
 
-	/* wait for 53C80 registers to be available */
-	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG))
-		;
+	if (residual != 0) {
+		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
+		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+	}
+	wait_for_53c80_access(hostdata);
 
-	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
-	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
-	                          HZ / 64) < 0)
+	if (residual == 0 && NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                                           BASR_END_DMA_TRANSFER,
+	                                           BASR_END_DMA_TRANSFER,
+	                                           HZ / 64) < 0)
 		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
-		            __func__, hostdata->pdma_residual);
+		            __func__, residual);
+
+	hostdata->pdma_residual = residual;
 
 	return 0;
 }
@@ -570,36 +585,23 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
                                          unsigned char *src, int len)
 {
-	int blocks = len / 128;
+	int residual;
 	int start = 0;
 
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
-	NCR5380_write(hostdata->c400_blk_cnt, blocks);
-	while (1) {
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
-			goto out_wait;
-
-		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
+	NCR5380_write(hostdata->c400_blk_cnt, len / 128);
+
+	while (start < len) {
+		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
+		                           CSR_HOST_BUF_NOT_RDY, 0,
+		                           hostdata->c400_ctl_status,
+		                           CSR_GATED_53C80_IRQ,
+		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
 			break;
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; // FIXME - timeout
-
-		if (hostdata->io_port && hostdata->io_width == 2)
-			outsw(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 64);
-		else if (hostdata->io_port)
-			outsb(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 128);
-		else
-			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
-			            src + start, 128);
 
-		start += 128;
-		blocks--;
-	}
-	if (blocks) {
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; // FIXME - no timeout
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_GATED_53C80_IRQ)
+			break;
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			outsw(hostdata->io_port + hostdata->c400_host_buf,
@@ -610,28 +612,34 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 		else
 			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
 			            src + start, 128);
-
 		start += 128;
-		blocks--;
 	}
 
-out_wait:
-	hostdata->pdma_residual = len - start;
+	residual = max(len - start,
+	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
 
-	/* wait for 53C80 registers to be available */
-	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)) {
-		udelay(4); /* DTC436 chip hangs without this */
-		/* FIXME - no timeout */
+	if (residual != 0) {
+		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
+		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+	}
+	wait_for_53c80_access(hostdata);
+
+	if (residual == 0) {
+		if (NCR5380_poll_politely(hostdata, TARGET_COMMAND_REG,
+		                          TCR_LAST_BYTE_SENT, TCR_LAST_BYTE_SENT,
+		                          HZ / 64) < 0)
+			scmd_printk(KERN_ERR, hostdata->connected,
+			            "%s: Last Byte Sent timeout\n", __func__);
+
+		if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+		                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+		                          HZ / 64) < 0)
+			scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+			            __func__, residual);
 	}
 
-	while (!(NCR5380_read(TARGET_COMMAND_REG) & TCR_LAST_BYTE_SENT))
-		; 	// TIMEOUT
-
-	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
-	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
-	                          HZ / 64) < 0)
-		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
-		            __func__, hostdata->pdma_residual);
+	hostdata->pdma_residual = residual;
 
 	return 0;
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 1/6] g_NCR5380: Fix PDMA transfer size
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

generic_NCR5380_dma_xfer_len() incorrectly uses cmd->transfersize
which causes rescan-scsi-bus and CD-ROM access to hang the system.
Use cmd->SCp.this_residual instead, like other NCR5380 drivers.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 67c8dac321ad..14ef4e8c4713 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -76,6 +76,7 @@
 #define IRQ_AUTO 254
 
 #define MAX_CARDS 8
+#define DMA_MAX_SIZE 32768
 
 /* old-style parameters for compatibility */
 static int ncr_irq = -1;
@@ -629,23 +630,16 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
                                         struct scsi_cmnd *cmd)
 {
-	int transfersize = cmd->transfersize;
+	int transfersize = cmd->SCp.this_residual;
 
 	if (hostdata->flags & FLAG_NO_PSEUDO_DMA)
 		return 0;
 
-	/* Limit transfers to 32K, for xx400 & xx406
-	 * pseudoDMA that transfers in 128 bytes blocks.
-	 */
-	if (transfersize > 32 * 1024 && cmd->SCp.this_residual &&
-	    !(cmd->SCp.this_residual % transfersize))
-		transfersize = 32 * 1024;
-
 	/* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */
 	if (transfersize % 128)
 		transfersize = 0;
 
-	return transfersize;
+	return min(transfersize, DMA_MAX_SIZE);
 }
 
 /*
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 4/6] g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on DTC3181E
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

The corruption is always the same: one byte missing at the beginning of
a 128 B block. It happens only with slow Quantum LPS 240 drive, not with
faster IBM DORS-32160. It's not clear what causes this. Documentation
for the DTC436 chip has not been made available. Hence this workaround.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index dedaed2d16e4..5fd227bb1830 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -45,7 +45,8 @@
 	int c400_blk_cnt; \
 	int c400_host_buf; \
 	int io_width; \
-	int pdma_residual
+	int pdma_residual; \
+	int board
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
 #define NCR5380_dma_recv_setup          generic_NCR5380_pread
@@ -316,6 +317,7 @@ static int generic_NCR5380_init_one(struct scsi_host_template *tpnt,
 	}
 	hostdata = shost_priv(instance);
 
+	hostdata->board = board;
 	hostdata->io = iomem;
 	hostdata->region_size = region_size;
 
@@ -644,7 +646,12 @@ static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
 
 	/* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */
 	if (transfersize % 128)
-		transfersize = 0;
+		return 0;
+
+	/* Limit PDMA send to 512 B to avoid random corruption on DTC3181E */
+	if (hostdata->board == BOARD_DTC3181E &&
+	    cmd->sc_data_direction == DMA_TO_DEVICE)
+		transfersize = min(cmd->SCp.this_residual, 512);
 
 	return min(transfersize, DMA_MAX_SIZE);
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
@ 2017-06-29  5:24 ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

Ondrej, would you please test this new series?

Changed since v1:
- PDMA transfer residual is calculated earlier.
- End of DMA flag check is now polled (if there is any residual).

Changed since v2:
- Bail out of transfer loops when Gated IRQ gets asserted.
- Make udelay conditional on board type.
- Drop sg_tablesize patch due to performance regression.

Changed since v3:
- Add Ondrej's workaround for corrupt WRITE commands on DTC boards.
- Reset the 53c400 logic after any short PDMA transfer.
- Don't fail the transfer if the 53c400 logic got a reset.

Changed since v4:
- Bail out of transfer loops when Gated IRQ gets asserted. (Again.)
- Always call wait_for_53c80_registers() at end of transfer.
- Drain chip buffers after PDMA receive is interrupted.
- Rework residual calculation.
- Add new patch to correct DMA terminology.


Finn Thain (2):
  g_NCR5380: Cleanup comments and whitespace
  g_NCR5380: Use unambiguous terminology for PDMA send and receive

Ondrej Zary (4):
  g_NCR5380: Fix PDMA transfer size
  g_NCR5380: End PDMA transfer correctly on target disconnection
  g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on
    DTC3181E
  g_NCR5380: Re-work PDMA loops

 drivers/scsi/g_NCR5380.c | 260 +++++++++++++++++++++++++----------------------
 1 file changed, 139 insertions(+), 121 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 2/6] g_NCR5380: End PDMA transfer correctly on target disconnection
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

When an IRQ arrives during PDMA transfer, pread() and pwrite() return
without waiting for the 53C80 registers to be ready and this ends up
messing up the chip state. This was observed with SONY CDU-55S which is
slow enough to disconnect during 4096-byte reads.

IRQ during PDMA is not an error so don't return -1. Instead, store the
remaining byte count for use by NCR5380_dma_residual().

[Poll for the BASR_END_DMA_TRANSFER condition rather than remove the
error message -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 48 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 14ef4e8c4713..911a4300ea51 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -44,12 +44,13 @@
 	int c400_ctl_status; \
 	int c400_blk_cnt; \
 	int c400_host_buf; \
-	int io_width
+	int io_width; \
+	int pdma_residual
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
 #define NCR5380_dma_recv_setup          generic_NCR5380_pread
 #define NCR5380_dma_send_setup          generic_NCR5380_pwrite
-#define NCR5380_dma_residual            NCR5380_dma_residual_none
+#define NCR5380_dma_residual            generic_NCR5380_dma_residual
 
 #define NCR5380_intr                    generic_NCR5380_intr
 #define NCR5380_queue_command           generic_NCR5380_queue_command
@@ -500,10 +501,8 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 	while (1) {
 		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
 			break;
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ) {
-			printk(KERN_ERR "53C400r: Got 53C80_IRQ start=%d, blocks=%d\n", start, blocks);
-			return -1;
-		}
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
+			goto out_wait;
 		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
 			; /* FIXME - no timeout */
 
@@ -542,13 +541,19 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 	if (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ))
 		printk("53C400r: no 53C80 gated irq after transfer");
 
+out_wait:
+	hostdata->pdma_residual = len - start;
+
 	/* wait for 53C80 registers to be available */
 	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG))
 		;
 
-	if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_END_DMA_TRANSFER))
-		printk(KERN_ERR "53C400r: no end dma signal\n");
-		
+	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+	                          HZ / 64) < 0)
+		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+		            __func__, hostdata->pdma_residual);
+
 	return 0;
 }
 
@@ -571,10 +576,8 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
 	NCR5380_write(hostdata->c400_blk_cnt, blocks);
 	while (1) {
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ) {
-			printk(KERN_ERR "53C400w: Got 53C80_IRQ start=%d, blocks=%d\n", start, blocks);
-			return -1;
-		}
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
+			goto out_wait;
 
 		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
 			break;
@@ -612,18 +615,24 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 		blocks--;
 	}
 
+out_wait:
+	hostdata->pdma_residual = len - start;
+
 	/* wait for 53C80 registers to be available */
 	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)) {
 		udelay(4); /* DTC436 chip hangs without this */
 		/* FIXME - no timeout */
 	}
 
-	if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_END_DMA_TRANSFER)) {
-		printk(KERN_ERR "53C400w: no end dma signal\n");
-	}
-
 	while (!(NCR5380_read(TARGET_COMMAND_REG) & TCR_LAST_BYTE_SENT))
 		; 	// TIMEOUT
+
+	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+	                          HZ / 64) < 0)
+		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+		            __func__, hostdata->pdma_residual);
+
 	return 0;
 }
 
@@ -642,6 +651,11 @@ static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata,
 	return min(transfersize, DMA_MAX_SIZE);
 }
 
+static int generic_NCR5380_dma_residual(struct NCR5380_hostdata *hostdata)
+{
+	return hostdata->pdma_residual;
+}
+
 /*
  *	Include the NCR5380 core code that we build our driver around	
  */
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 3/6] g_NCR5380: Cleanup comments and whitespace
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 61 ++++++++++++++++++++++--------------------------
 1 file changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 911a4300ea51..dedaed2d16e4 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -1,17 +1,17 @@
 /*
  * Generic Generic NCR5380 driver
- *	
+ *
  * Copyright 1993, Drew Eckhardt
- *	Visionary Computing
- *	(Unix and Linux consulting and custom programming)
- *	drew@colorado.edu
- *      +1 (303) 440-4894
+ * Visionary Computing
+ * (Unix and Linux consulting and custom programming)
+ * drew@colorado.edu
+ * +1 (303) 440-4894
  *
  * NCR53C400 extensions (c) 1994,1995,1996, Kevin Lentin
- *    K.Lentin@cs.monash.edu.au
+ * K.Lentin@cs.monash.edu.au
  *
  * NCR53C400A extensions (c) 1996, Ingmar Baumgart
- *    ingmar@gonzo.schwaben.de
+ * ingmar@gonzo.schwaben.de
  *
  * DTC3181E extensions (c) 1997, Ronald van Cuijlenborg
  * ronald.van.cuijlenborg@tip.nl or nutty@dds.nl
@@ -481,15 +481,14 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 }
 
 /**
- *	generic_NCR5380_pread - pseudo DMA read
- *	@hostdata: scsi host private data
- *	@dst: buffer to read into
- *	@len: buffer length
+ * generic_NCR5380_pread - pseudo DMA read
+ * @hostdata: scsi host private data
+ * @dst: buffer to write into
+ * @len: transfer size
  *
- *	Perform a pseudo DMA mode read from an NCR53C400 or equivalent
- *	controller
+ * Perform a pseudo DMA mode receive from a 53C400 or equivalent device.
  */
- 
+
 static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
@@ -508,10 +507,10 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			insw(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 64);
+			     dst + start, 64);
 		else if (hostdata->io_port)
 			insb(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 128);
+			     dst + start, 128);
 		else
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
@@ -558,13 +557,12 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 }
 
 /**
- *	generic_NCR5380_pwrite - pseudo DMA write
- *	@hostdata: scsi host private data
- *	@dst: buffer to read into
- *	@len: buffer length
+ * generic_NCR5380_pwrite - pseudo DMA write
+ * @hostdata: scsi host private data
+ * @src: buffer to read from
+ * @len: transfer size
  *
- *	Perform a pseudo DMA mode read from an NCR53C400 or equivalent
- *	controller
+ * Perform a pseudo DMA mode send to a 53C400 or equivalent device.
  */
 
 static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
@@ -603,10 +601,10 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			outsw(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 64);
+			      src + start, 64);
 		else if (hostdata->io_port)
 			outsb(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 128);
+			      src + start, 128);
 		else
 			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
 			            src + start, 128);
@@ -656,10 +654,8 @@ static int generic_NCR5380_dma_residual(struct NCR5380_hostdata *hostdata)
 	return hostdata->pdma_residual;
 }
 
-/*
- *	Include the NCR5380 core code that we build our driver around	
- */
- 
+/* Include the core driver code. */
+
 #include "NCR5380.c"
 
 static struct scsi_host_template driver_template = {
@@ -679,11 +675,10 @@ static struct scsi_host_template driver_template = {
 	.max_sectors		= 128,
 };
 
-
 static int generic_NCR5380_isa_match(struct device *pdev, unsigned int ndev)
 {
 	int ret = generic_NCR5380_init_one(&driver_template, pdev, base[ndev],
-					  irq[ndev], card[ndev]);
+	                                   irq[ndev], card[ndev]);
 	if (ret) {
 		if (base[ndev])
 			printk(KERN_WARNING "Card not found at address 0x%03x\n",
@@ -695,7 +690,7 @@ static int generic_NCR5380_isa_match(struct device *pdev, unsigned int ndev)
 }
 
 static int generic_NCR5380_isa_remove(struct device *pdev,
-				   unsigned int ndev)
+                                      unsigned int ndev)
 {
 	generic_NCR5380_release_resources(dev_get_drvdata(pdev));
 	dev_set_drvdata(pdev, NULL);
@@ -718,7 +713,7 @@ static struct pnp_device_id generic_NCR5380_pnp_ids[] = {
 MODULE_DEVICE_TABLE(pnp, generic_NCR5380_pnp_ids);
 
 static int generic_NCR5380_pnp_probe(struct pnp_dev *pdev,
-			       const struct pnp_device_id *id)
+                                     const struct pnp_device_id *id)
 {
 	int base, irq;
 
@@ -729,7 +724,7 @@ static int generic_NCR5380_pnp_probe(struct pnp_dev *pdev,
 	irq = pnp_irq(pdev, 0);
 
 	return generic_NCR5380_init_one(&driver_template, &pdev->dev, base, irq,
-				       id->driver_data);
+	                                id->driver_data);
 }
 
 static void generic_NCR5380_pnp_remove(struct pnp_dev *pdev)
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 5/6] g_NCR5380: Re-work PDMA loops
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

From: Ondrej Zary <linux@rainbow-software.org>

The polling loops in pread() and pwrite() can easily become infinite
loops and hang the machine.

On DTC chips, IRQ can arrive late and we miss it because we only check
once. Merge the IRQ check into host buffer wait and add polling limit.

Also place a limit on polling for 53C80 registers accessibility.

[Use NCR5380_poll_politely2() for register polling. Rely on polling for
gated IRQ rather than polling for phase error, like the algorithm in the
datasheet. Calculate residual from block count register instead of the
loop counter. Factor-out common code as wait_for_53c80_access(). -- F.T.]

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 168 +++++++++++++++++++++++++----------------------
 1 file changed, 88 insertions(+), 80 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 5fd227bb1830..f7e50d2bca07 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -482,6 +482,30 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 		release_mem_region(base, region_size);
 }
 
+/* wait_for_53c80_access - wait for 53C80 registers to become accessible
+ * @hostdata: scsi host private data
+ *
+ * The registers within the 53C80 logic block are inaccessible until
+ * bit 7 in the 53C400 control status register gets asserted.
+ */
+
+static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
+{
+	int count = 10000;
+
+	do {
+		if (hostdata->board == BOARD_DTC3181E)
+			udelay(4); /* DTC436 chip hangs without this */
+		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)
+			return;
+	} while (--count > 0);
+
+	scmd_printk(KERN_ERR, hostdata->connected,
+	            "53c80 registers not accessible, device will be reset\n");
+	NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+}
+
 /**
  * generic_NCR5380_pread - pseudo DMA read
  * @hostdata: scsi host private data
@@ -494,18 +518,23 @@ static void generic_NCR5380_release_resources(struct Scsi_Host *instance)
 static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
-	int blocks = len / 128;
+	int residual;
 	int start = 0;
 
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE | CSR_TRANS_DIR);
-	NCR5380_write(hostdata->c400_blk_cnt, blocks);
-	while (1) {
-		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
+	NCR5380_write(hostdata->c400_blk_cnt, len / 128);
+
+	while (start < len) {
+		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
+		                           CSR_HOST_BUF_NOT_RDY, 0,
+		                           hostdata->c400_ctl_status,
+		                           CSR_GATED_53C80_IRQ,
+		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
+			break;
+
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_HOST_BUF_NOT_RDY)
 			break;
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
-			goto out_wait;
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; /* FIXME - no timeout */
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			insw(hostdata->io_port + hostdata->c400_host_buf,
@@ -516,44 +545,30 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 		else
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
-
 		start += 128;
-		blocks--;
-	}
-
-	if (blocks) {
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; /* FIXME - no timeout */
 
-		if (hostdata->io_port && hostdata->io_width == 2)
-			insw(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 64);
-		else if (hostdata->io_port)
-			insb(hostdata->io_port + hostdata->c400_host_buf,
-							dst + start, 128);
-		else
-			memcpy_fromio(dst + start,
-				hostdata->io + NCR53C400_host_buffer, 128);
-
-		start += 128;
-		blocks--;
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_GATED_53C80_IRQ)
+			break;
 	}
 
-	if (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ))
-		printk("53C400r: no 53C80 gated irq after transfer");
-
-out_wait:
-	hostdata->pdma_residual = len - start;
+	residual = len - start;
 
-	/* wait for 53C80 registers to be available */
-	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG))
-		;
+	if (residual != 0) {
+		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
+		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+	}
+	wait_for_53c80_access(hostdata);
 
-	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
-	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
-	                          HZ / 64) < 0)
+	if (residual == 0 && NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+	                                           BASR_END_DMA_TRANSFER,
+	                                           BASR_END_DMA_TRANSFER,
+	                                           HZ / 64) < 0)
 		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
-		            __func__, hostdata->pdma_residual);
+		            __func__, residual);
+
+	hostdata->pdma_residual = residual;
 
 	return 0;
 }
@@ -570,36 +585,23 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
                                          unsigned char *src, int len)
 {
-	int blocks = len / 128;
+	int residual;
 	int start = 0;
 
 	NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
-	NCR5380_write(hostdata->c400_blk_cnt, blocks);
-	while (1) {
-		if (NCR5380_read(hostdata->c400_ctl_status) & CSR_GATED_53C80_IRQ)
-			goto out_wait;
-
-		if (NCR5380_read(hostdata->c400_blk_cnt) == 0)
+	NCR5380_write(hostdata->c400_blk_cnt, len / 128);
+
+	while (start < len) {
+		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
+		                           CSR_HOST_BUF_NOT_RDY, 0,
+		                           hostdata->c400_ctl_status,
+		                           CSR_GATED_53C80_IRQ,
+		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
 			break;
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; // FIXME - timeout
-
-		if (hostdata->io_port && hostdata->io_width == 2)
-			outsw(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 64);
-		else if (hostdata->io_port)
-			outsb(hostdata->io_port + hostdata->c400_host_buf,
-							src + start, 128);
-		else
-			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
-			            src + start, 128);
 
-		start += 128;
-		blocks--;
-	}
-	if (blocks) {
-		while (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY)
-			; // FIXME - no timeout
+		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_GATED_53C80_IRQ)
+			break;
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			outsw(hostdata->io_port + hostdata->c400_host_buf,
@@ -610,28 +612,34 @@ static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
 		else
 			memcpy_toio(hostdata->io + NCR53C400_host_buffer,
 			            src + start, 128);
-
 		start += 128;
-		blocks--;
 	}
 
-out_wait:
-	hostdata->pdma_residual = len - start;
+	residual = max(len - start,
+	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
 
-	/* wait for 53C80 registers to be available */
-	while (!(NCR5380_read(hostdata->c400_ctl_status) & CSR_53C80_REG)) {
-		udelay(4); /* DTC436 chip hangs without this */
-		/* FIXME - no timeout */
+	if (residual != 0) {
+		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
+		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
+		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
+	}
+	wait_for_53c80_access(hostdata);
+
+	if (residual == 0) {
+		if (NCR5380_poll_politely(hostdata, TARGET_COMMAND_REG,
+		                          TCR_LAST_BYTE_SENT, TCR_LAST_BYTE_SENT,
+		                          HZ / 64) < 0)
+			scmd_printk(KERN_ERR, hostdata->connected,
+			            "%s: Last Byte Sent timeout\n", __func__);
+
+		if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
+		                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
+		                          HZ / 64) < 0)
+			scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
+			            __func__, residual);
 	}
 
-	while (!(NCR5380_read(TARGET_COMMAND_REG) & TCR_LAST_BYTE_SENT))
-		; 	// TIMEOUT
-
-	if (NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG,
-	                          BASR_END_DMA_TRANSFER, BASR_END_DMA_TRANSFER,
-	                          HZ / 64) < 0)
-		scmd_printk(KERN_ERR, hostdata->connected, "%s: End of DMA timeout (%d)\n",
-		            __func__, hostdata->pdma_residual);
+	hostdata->pdma_residual = residual;
 
 	return 0;
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 6/6] g_NCR5380: Use unambiguous terminology for PDMA send and receive
  2017-06-29  5:24 ` Finn Thain
@ 2017-06-29  5:24   ` Finn Thain
  -1 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

The word "read" may be used to mean "DMA read operation" or
"SCSI READ command", though a READ command implies writing to memory.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index f7e50d2bca07..b8b1ed2806bb 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -49,8 +49,8 @@
 	int board
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
-#define NCR5380_dma_recv_setup          generic_NCR5380_pread
-#define NCR5380_dma_send_setup          generic_NCR5380_pwrite
+#define NCR5380_dma_recv_setup          generic_NCR5380_precv
+#define NCR5380_dma_send_setup          generic_NCR5380_psend
 #define NCR5380_dma_residual            generic_NCR5380_dma_residual
 
 #define NCR5380_intr                    generic_NCR5380_intr
@@ -507,7 +507,7 @@ static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
 }
 
 /**
- * generic_NCR5380_pread - pseudo DMA read
+ * generic_NCR5380_precv - pseudo DMA receive
  * @hostdata: scsi host private data
  * @dst: buffer to write into
  * @len: transfer size
@@ -515,7 +515,7 @@ static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
  * Perform a pseudo DMA mode receive from a 53C400 or equivalent device.
  */
 
-static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
+static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
 	int residual;
@@ -574,7 +574,7 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 }
 
 /**
- * generic_NCR5380_pwrite - pseudo DMA write
+ * generic_NCR5380_psend - pseudo DMA send
  * @hostdata: scsi host private data
  * @src: buffer to read from
  * @len: transfer size
@@ -582,8 +582,8 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
  * Perform a pseudo DMA mode send to a 53C400 or equivalent device.
  */
 
-static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
-                                         unsigned char *src, int len)
+static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
+                                        unsigned char *src, int len)
 {
 	int residual;
 	int start = 0;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v5 6/6] g_NCR5380: Use unambiguous terminology for PDMA send and receive
@ 2017-06-29  5:24   ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-06-29  5:24 UTC (permalink / raw)
  To: James E.J. Bottomley, Martin K. Petersen, Ondrej Zary
  Cc: linux-scsi, linux-kernel, Michael Schmitz

The word "read" may be used to mean "DMA read operation" or
"SCSI READ command", though a READ command implies writing to memory.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/scsi/g_NCR5380.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index f7e50d2bca07..b8b1ed2806bb 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -49,8 +49,8 @@
 	int board
 
 #define NCR5380_dma_xfer_len            generic_NCR5380_dma_xfer_len
-#define NCR5380_dma_recv_setup          generic_NCR5380_pread
-#define NCR5380_dma_send_setup          generic_NCR5380_pwrite
+#define NCR5380_dma_recv_setup          generic_NCR5380_precv
+#define NCR5380_dma_send_setup          generic_NCR5380_psend
 #define NCR5380_dma_residual            generic_NCR5380_dma_residual
 
 #define NCR5380_intr                    generic_NCR5380_intr
@@ -507,7 +507,7 @@ static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
 }
 
 /**
- * generic_NCR5380_pread - pseudo DMA read
+ * generic_NCR5380_precv - pseudo DMA receive
  * @hostdata: scsi host private data
  * @dst: buffer to write into
  * @len: transfer size
@@ -515,7 +515,7 @@ static void wait_for_53c80_access(struct NCR5380_hostdata *hostdata)
  * Perform a pseudo DMA mode receive from a 53C400 or equivalent device.
  */
 
-static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
+static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
                                         unsigned char *dst, int len)
 {
 	int residual;
@@ -574,7 +574,7 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
 }
 
 /**
- * generic_NCR5380_pwrite - pseudo DMA write
+ * generic_NCR5380_psend - pseudo DMA send
  * @hostdata: scsi host private data
  * @src: buffer to read from
  * @len: transfer size
@@ -582,8 +582,8 @@ static inline int generic_NCR5380_pread(struct NCR5380_hostdata *hostdata,
  * Perform a pseudo DMA mode send to a 53C400 or equivalent device.
  */
 
-static inline int generic_NCR5380_pwrite(struct NCR5380_hostdata *hostdata,
-                                         unsigned char *src, int len)
+static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
+                                        unsigned char *src, int len)
 {
 	int residual;
 	int start = 0;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
  2017-06-29  5:24 ` Finn Thain
                   ` (6 preceding siblings ...)
  (?)
@ 2017-06-29 18:06 ` Ondrej Zary
  2017-06-30  7:12   ` Finn Thain
  -1 siblings, 1 reply; 19+ messages in thread
From: Ondrej Zary @ 2017-06-29 18:06 UTC (permalink / raw)
  To: Finn Thain
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-scsi,
	linux-kernel, Michael Schmitz

On Thursday 29 June 2017 07:24:18 Finn Thain wrote:
> Ondrej, would you please test this new series?
>
> Changed since v1:
> - PDMA transfer residual is calculated earlier.
> - End of DMA flag check is now polled (if there is any residual).
>
> Changed since v2:
> - Bail out of transfer loops when Gated IRQ gets asserted.
> - Make udelay conditional on board type.
> - Drop sg_tablesize patch due to performance regression.
>
> Changed since v3:
> - Add Ondrej's workaround for corrupt WRITE commands on DTC boards.
> - Reset the 53c400 logic after any short PDMA transfer.
> - Don't fail the transfer if the 53c400 logic got a reset.
>
> Changed since v4:
> - Bail out of transfer loops when Gated IRQ gets asserted. (Again.)
> - Always call wait_for_53c80_registers() at end of transfer.
> - Drain chip buffers after PDMA receive is interrupted.
> - Rework residual calculation.
> - Add new patch to correct DMA terminology.
>
>
> Finn Thain (2):
>   g_NCR5380: Cleanup comments and whitespace
>   g_NCR5380: Use unambiguous terminology for PDMA send and receive
>
> Ondrej Zary (4):
>   g_NCR5380: Fix PDMA transfer size
>   g_NCR5380: End PDMA transfer correctly on target disconnection
>   g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on
>     DTC3181E
>   g_NCR5380: Re-work PDMA loops
>
>  drivers/scsi/g_NCR5380.c | 260
> +++++++++++++++++++++++++---------------------- 1 file changed, 139
> insertions(+), 121 deletions(-)

The write corruption is still there. I'm afraid it can't be fixed without
rolling "start" back (or inceasing residual) if an error occured, something like this:
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -619,6 +621,9 @@ static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
 	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);

 	if (residual != 0) {
+		residual += 128;
 		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
 		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
 		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);

(seems to work - wrote 230MB and read it back with no differences)

The corruption mechanism is:
1. Host buffer is ready so we write 128 B of data there and increment "start".
2. Chip swaps the buffers, decrements the block counter and starts writing the
data to drive.
3. Drive does not like it (e.g. its buffer is full) so it disconnects.
4. Chip stops writing and asserts an IRQ.
5. We detect the IRQ. The block counter is already decremented, "start" is
already incremented but the data was not written to the drive.



No more log spamming on DTC but reads are corrupted even more than before.
The IRQ check after data transfer increases the chance of catching an IRQ
before the buffer could become ready. This patch:
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -548,8 +548,10 @@ static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
 		start += 128;
 
 		if (NCR5380_read(hostdata->c400_ctl_status) &
-		    CSR_GATED_53C80_IRQ)
+		    CSR_GATED_53C80_IRQ) {
+			printk("r irq at start=%d basr=0x%02x\n", start, NCR5380_read(BUS_AND_STATUS_REG));
 			break;
+		}
 	}
 
 	residual = len - start;

produces lots of these lines:
[  896.194054] r irq at start=128 basr=0x98
[  896.197758] r irq at start=3968 basr=0x98


-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
  2017-06-29  5:24 ` Finn Thain
                   ` (7 preceding siblings ...)
  (?)
@ 2017-06-29 18:30 ` Ondrej Zary
  -1 siblings, 0 replies; 19+ messages in thread
From: Ondrej Zary @ 2017-06-29 18:30 UTC (permalink / raw)
  To: Finn Thain
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-scsi,
	linux-kernel, Michael Schmitz

On Thursday 29 June 2017 07:24:18 Finn Thain wrote:
> Ondrej, would you please test this new series?
>
> Changed since v1:
> - PDMA transfer residual is calculated earlier.
> - End of DMA flag check is now polled (if there is any residual).
>
> Changed since v2:
> - Bail out of transfer loops when Gated IRQ gets asserted.
> - Make udelay conditional on board type.
> - Drop sg_tablesize patch due to performance regression.
>
> Changed since v3:
> - Add Ondrej's workaround for corrupt WRITE commands on DTC boards.
> - Reset the 53c400 logic after any short PDMA transfer.
> - Don't fail the transfer if the 53c400 logic got a reset.
>
> Changed since v4:
> - Bail out of transfer loops when Gated IRQ gets asserted. (Again.)
> - Always call wait_for_53c80_registers() at end of transfer.
> - Drain chip buffers after PDMA receive is interrupted.
> - Rework residual calculation.
> - Add new patch to correct DMA terminology.
>
>
> Finn Thain (2):
>   g_NCR5380: Cleanup comments and whitespace
>   g_NCR5380: Use unambiguous terminology for PDMA send and receive
>
> Ondrej Zary (4):
>   g_NCR5380: Fix PDMA transfer size
>   g_NCR5380: End PDMA transfer correctly on target disconnection
>   g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on
>     DTC3181E
>   g_NCR5380: Re-work PDMA loops
>
>  drivers/scsi/g_NCR5380.c | 260
> +++++++++++++++++++++++++---------------------- 1 file changed, 139
> insertions(+), 121 deletions(-)

This fixes the DTC read corruption, although I don't like the repeated
ctl_status register reads:
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -533,7 +533,7 @@ static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
 			break;

 		if (NCR5380_read(hostdata->c400_ctl_status) &
-		    CSR_HOST_BUF_NOT_RDY)
+		    CSR_GATED_53C80_IRQ && (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY))
 			break;

                if (hostdata->io_port && hostdata->io_width == 2)
@@ -546,10 +546,6 @@ static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
 		start += 128;
-
-		if (NCR5380_read(hostdata->c400_ctl_status) &
-		    CSR_GATED_53C80_IRQ)
-			break;
 	}
 
 	residual = len - start;

Writes seem to work correctly.

-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
  2017-06-29 18:06 ` [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup Ondrej Zary
@ 2017-06-30  7:12   ` Finn Thain
  2017-06-30 18:07     ` Ondrej Zary
  0 siblings, 1 reply; 19+ messages in thread
From: Finn Thain @ 2017-06-30  7:12 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-scsi,
	linux-kernel, Michael Schmitz

On Thu, 29 Jun 2017, Ondrej Zary wrote:

> The write corruption is still there. I'm afraid it can't be fixed 
> without rolling "start" back (or inceasing residual) if an error 
> occured, something like this:
> 
> --- a/drivers/scsi/g_NCR5380.c
> +++ b/drivers/scsi/g_NCR5380.c
> @@ -619,6 +621,9 @@ static inline int generic_NCR5380_psend(struct 
>  	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
> 
>  	if (residual != 0) {
> +		residual += 128;
>  		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
>  		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
>  		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
> 
> (seems to work - wrote 230MB and read it back with no differences)
> 
> The corruption mechanism is:
> 1. Host buffer is ready so we write 128 B of data there and increment 
>    "start".
> 2. Chip swaps the buffers, decrements the block counter and starts 
>    writing the data to drive.
> 3. Drive does not like it (e.g. its buffer is full) so it disconnects.
> 4. Chip stops writing and asserts an IRQ.
> 5. We detect the IRQ. The block counter is already decremented, "start" 
>    is already incremented but the data was not written to the drive.
> 
> 

OK. Thanks for that analysis.

It sounds like the c400_blk_cnt value gives the number of buffer swaps 
remaining. If so, that value isn't useful for calculating a residual. I'll 
rework that calculation again.

In your patch, the residual gets increased regardless of the actual cause 
of the short transfer. Nothing prevents the residual from being increased 
beyond the original length of the transfer (due to a flaky target or bus). 
Therefore I've taken a slightly different approach in my patch (below).

> 
> No more log spamming on DTC but reads are corrupted even more than before.
> The IRQ check after data transfer increases the chance of catching an IRQ
> before the buffer could become ready.

If we delay the IRQ check, that just means that CSR_GATED_53C80_IRQ will 
be detected a bit later (128 bytes later)... so not much difference.

> This patch:
> --- a/drivers/scsi/g_NCR5380.c
> +++ b/drivers/scsi/g_NCR5380.c
> @@ -548,8 +548,10 @@ static inline int generic_NCR5380_precv(struct
>  		start += 128;
>  
>  		if (NCR5380_read(hostdata->c400_ctl_status) &
> -		    CSR_GATED_53C80_IRQ)
> +		    CSR_GATED_53C80_IRQ) {
> +			printk("r irq at start=%d basr=0x%02x\n", start, NCR5380_read(BUS_AND_STATUS_REG));
>  			break;
> +		}
>  	}
>  
>  	residual = len - start;
> 
> produces lots of these lines:
> [  896.194054] r irq at start=128 basr=0x98
> [  896.197758] r irq at start=3968 basr=0x98
> 

Assuming that the registers are available and valid, the value 0x98 means 
BASR_END_DMA_TRANSFER | BASR_IRQ | BASR_PHASE_MATCH. There is no 
BASR_BUSY_ERROR here, so the cause of the CSR_GATED_53C80_IRQ must be that 
the 53c400 has terminated the transfer by asserting /EOP. That shouldn't 
happen before before the counters run down.

It doesn't make sense. So maybe the 53c80 registers are not valid at this 
point? That means a phase mismatch can't be excluded... unlikely at 128 
bytes into the transfer. Busy error? Also unlikely.

I have to conclude that CSR_GATED_53C80_IRQ and BASR_END_DMA_TRANSFER 
can't be trusted on this board. I guess that's why you examine the BASR 
directly in your original algorithm but ignore BASR_END_DMA_TRANSFER.

It does look like some kind of timing issue: the "start" value above 
changes from one log message to the next. Who knows?


> This fixes the DTC read corruption, although I don't like the repeated
> ctl_status register reads:    
> --- a/drivers/scsi/g_NCR5380.c
> +++ b/drivers/scsi/g_NCR5380.c
> @@ -533,7 +533,7 @@ static inline int generic_NCR5380_precv(struct
>  			break;
>
>  		if (NCR5380_read(hostdata->c400_ctl_status) &
> -		    CSR_HOST_BUF_NOT_RDY)
> +		    CSR_GATED_53C80_IRQ && (NCR5380_read(hostdata->c400_ctl_status) & CSR_HOST_BUF_NOT_RDY))
>  			break;
> 
>  		if (hostdata->io_port && hostdata->io_width == 2)

But that means the transfer will continue even when CSR_HOST_BUF_NOT_RDY. 
Your original algorithm doesn't attempt that. Neither does the algorithm 
in the datasheet. We should try to omit this change.

> @@ -546,10 +546,6 @@ static inline int generic_NCR5380_precv(struct 
>  		memcpy_fromio(dst + start,
>  			hostdata->io + NCR53C400_host_buffer, 128);
>  		start += 128;
> -
> -		if (NCR5380_read(hostdata->c400_ctl_status) &
> -		    CSR_GATED_53C80_IRQ)
> -			break;
>  	}
>  
>  	residual = len - start;

I think we should keep the CSR_GATED_53C80_IRQ check for the other boards, 
if this bogus BASR_END_DMA_TRANSFER problem is confined to DTC436.

How about this change? (to be applied on top of 6/6)

diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 3948f522b4e1..8e80379cfaaa 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -525,16 +525,22 @@ static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
 	NCR5380_write(hostdata->c400_blk_cnt, len / 128);
 
 	do {
-		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
-		                           CSR_HOST_BUF_NOT_RDY, 0,
-		                           hostdata->c400_ctl_status,
-		                           CSR_GATED_53C80_IRQ,
-		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
-			break;
-
-		if (NCR5380_read(hostdata->c400_ctl_status) &
-		    CSR_HOST_BUF_NOT_RDY)
-			break;
+		if (hostdata->board == BOARD_DTC3181E) {
+			/* Ignore bogus CSR_GATED_53C80_IRQ */
+			if (NCR5380_poll_politely(hostdata, hostdata->c400_ctl_status,
+			                          CSR_HOST_BUF_NOT_RDY, 0, HZ / 64) < 0)
+				break;
+		} else {
+			if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
+			                           CSR_HOST_BUF_NOT_RDY, 0,
+			                           hostdata->c400_ctl_status,
+			                           CSR_GATED_53C80_IRQ,
+			                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
+				break;
+			if (NCR5380_read(hostdata->c400_ctl_status) &
+			    CSR_HOST_BUF_NOT_RDY)
+				break;
+		}
 
 		if (hostdata->io_port && hostdata->io_width == 2)
 			insw(hostdata->io_port + hostdata->c400_host_buf,
@@ -546,10 +552,6 @@ static inline int generic_NCR5380_precv(struct NCR5380_hostdata *hostdata,
 			memcpy_fromio(dst + start,
 				hostdata->io + NCR53C400_host_buffer, 128);
 		start += 128;
-
-		if (NCR5380_read(hostdata->c400_ctl_status) &
-		    CSR_GATED_53C80_IRQ)
-			break;
 	} while (start < len);
 
 	residual = len - start;
@@ -600,6 +602,12 @@ static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
 			break;
 
 		if (NCR5380_read(hostdata->c400_ctl_status) &
+		    CSR_HOST_BUF_NOT_RDY && start > 0) {
+			start -= 128;
+			break;
+		}
+
+		if (NCR5380_read(hostdata->c400_ctl_status) &
 		    CSR_GATED_53C80_IRQ)
 			break;
 
@@ -615,8 +623,7 @@ static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
 		start += 128;
 	} while (start < len);
 
-	residual = max(len - start,
-	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
+	residual = len - start;
 
 	if (residual != 0) {
 		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
  2017-06-30  7:12   ` Finn Thain
@ 2017-06-30 18:07     ` Ondrej Zary
  2017-07-01  2:40       ` Finn Thain
  0 siblings, 1 reply; 19+ messages in thread
From: Ondrej Zary @ 2017-06-30 18:07 UTC (permalink / raw)
  To: Finn Thain
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-scsi,
	linux-kernel, Michael Schmitz

On Friday 30 June 2017 09:12:37 Finn Thain wrote:
> On Thu, 29 Jun 2017, Ondrej Zary wrote:
> > The write corruption is still there. I'm afraid it can't be fixed
> > without rolling "start" back (or inceasing residual) if an error
> > occured, something like this:
> >
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -619,6 +621,9 @@ static inline int generic_NCR5380_psend(struct
> >  	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
> >
> >  	if (residual != 0) {
> > +		residual += 128;
> >  		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
> >  		NCR5380_write(hostdata->c400_ctl_status, CSR_RESET);
> >  		NCR5380_write(hostdata->c400_ctl_status, CSR_BASE);
> >
> > (seems to work - wrote 230MB and read it back with no differences)
> >
> > The corruption mechanism is:
> > 1. Host buffer is ready so we write 128 B of data there and increment
> >    "start".
> > 2. Chip swaps the buffers, decrements the block counter and starts
> >    writing the data to drive.
> > 3. Drive does not like it (e.g. its buffer is full) so it disconnects.
> > 4. Chip stops writing and asserts an IRQ.
> > 5. We detect the IRQ. The block counter is already decremented, "start"
> >    is already incremented but the data was not written to the drive.
>
> OK. Thanks for that analysis.
>
> It sounds like the c400_blk_cnt value gives the number of buffer swaps
> remaining. If so, that value isn't useful for calculating a residual. I'll
> rework that calculation again.
>
> In your patch, the residual gets increased regardless of the actual cause
> of the short transfer. Nothing prevents the residual from being increased
> beyond the original length of the transfer (due to a flaky target or bus).
> Therefore I've taken a slightly different approach in my patch (below).

Yes, that's wrong. My original patch had "start -= 128" and then check for 
negative value.

> > No more log spamming on DTC but reads are corrupted even more than
> > before. The IRQ check after data transfer increases the chance of
> > catching an IRQ before the buffer could become ready.
>
> If we delay the IRQ check, that just means that CSR_GATED_53C80_IRQ will
> be detected a bit later (128 bytes later)... so not much difference.

The main difference is that my original patch read the CSR once and then:
- transfer data if the buffer is ready, ignoring any IRQ
- if the buffer is not ready, check for an IRQ
- if neither buffer is ready nor IRQ was asserted, check for timeout

So yes, the IRQ will be detected 128 bytes later - but the IRQ is asserted at 
3968 bytes which means the transfer will then be complete.

> > This patch:
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -548,8 +548,10 @@ static inline int generic_NCR5380_precv(struct
> >  		start += 128;
> >
> >  		if (NCR5380_read(hostdata->c400_ctl_status) &
> > -		    CSR_GATED_53C80_IRQ)
> > +		    CSR_GATED_53C80_IRQ) {
> > +			printk("r irq at start=%d basr=0x%02x\n", start,
> > NCR5380_read(BUS_AND_STATUS_REG)); break;
> > +		}
> >  	}
> >
> >  	residual = len - start;
> >
> > produces lots of these lines:
> > [  896.194054] r irq at start=128 basr=0x98
> > [  896.197758] r irq at start=3968 basr=0x98
>
> Assuming that the registers are available and valid, the value 0x98 means
> BASR_END_DMA_TRANSFER | BASR_IRQ | BASR_PHASE_MATCH. There is no
> BASR_BUSY_ERROR here, so the cause of the CSR_GATED_53C80_IRQ must be that
> the 53c400 has terminated the transfer by asserting /EOP. That shouldn't
> happen before before the counters run down.
>
> It doesn't make sense. So maybe the 53c80 registers are not valid at this
> point? That means a phase mismatch can't be excluded... unlikely at 128
> bytes into the transfer. Busy error? Also unlikely.
>
> I have to conclude that CSR_GATED_53C80_IRQ and BASR_END_DMA_TRANSFER
> can't be trusted on this board. I guess that's why you examine the BASR
> directly in your original algorithm but ignore BASR_END_DMA_TRANSFER.

Exactly, BASR_END_DMA_TRANSFER causes problems on DTC.

> It does look like some kind of timing issue: the "start" value above
> changes from one log message to the next. Who knows?

Only that two values (128 and 3968) appeared. 3968 = 4096-128, one block 
before the end of transfer. So probably BASR_END_DMA_TRANSFER is asserted one 
block earlier (i.e. when the last block of data is transferred from the drive 
to the chip buffer).
Haven't seen this problem at start=128 before.

> > This fixes the DTC read corruption, although I don't like the repeated
> > ctl_status register reads:
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -533,7 +533,7 @@ static inline int generic_NCR5380_precv(struct
> >  			break;
> >
> >  		if (NCR5380_read(hostdata->c400_ctl_status) &
> > -		    CSR_HOST_BUF_NOT_RDY)
> > +		    CSR_GATED_53C80_IRQ && (NCR5380_read(hostdata->c400_ctl_status) &
> > CSR_HOST_BUF_NOT_RDY)) break;
> >
> >  		if (hostdata->io_port && hostdata->io_width == 2)
>
> But that means the transfer will continue even when CSR_HOST_BUF_NOT_RDY.
> Your original algorithm doesn't attempt that. Neither does the algorithm
> in the datasheet. We should try to omit this change.

Sorry for that, it's a bug. I got lost in the conditions.

> > @@ -546,10 +546,6 @@ static inline int generic_NCR5380_precv(struct
> >  		memcpy_fromio(dst + start,
> >  			hostdata->io + NCR53C400_host_buffer, 128);
> >  		start += 128;
> > -
> > -		if (NCR5380_read(hostdata->c400_ctl_status) &
> > -		    CSR_GATED_53C80_IRQ)
> > -			break;
> >  	}
> >
> >  	residual = len - start;
>
> I think we should keep the CSR_GATED_53C80_IRQ check for the other boards,
> if this bogus BASR_END_DMA_TRANSFER problem is confined to DTC436.
>
> How about this change? (to be applied on top of 6/6)

It does not apply. Changed while() {} -> do {} while() manually.

> diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
> index 3948f522b4e1..8e80379cfaaa 100644
> --- a/drivers/scsi/g_NCR5380.c
> +++ b/drivers/scsi/g_NCR5380.c
> @@ -525,16 +525,22 @@ static inline int generic_NCR5380_precv(struct
> NCR5380_hostdata *hostdata, NCR5380_write(hostdata->c400_blk_cnt, len /
> 128);
>
>  	do {
> -		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
> -		                           CSR_HOST_BUF_NOT_RDY, 0,
> -		                           hostdata->c400_ctl_status,
> -		                           CSR_GATED_53C80_IRQ,
> -		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
> -			break;
> -
> -		if (NCR5380_read(hostdata->c400_ctl_status) &
> -		    CSR_HOST_BUF_NOT_RDY)
> -			break;
> +		if (hostdata->board == BOARD_DTC3181E) {
> +			/* Ignore bogus CSR_GATED_53C80_IRQ */
> +			if (NCR5380_poll_politely(hostdata, hostdata->c400_ctl_status,
> +			                          CSR_HOST_BUF_NOT_RDY, 0, HZ / 64) < 0)
> +				break;
> +		} else {
> +			if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
> +			                           CSR_HOST_BUF_NOT_RDY, 0,
> +			                           hostdata->c400_ctl_status,
> +			                           CSR_GATED_53C80_IRQ,
> +			                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
> +				break;
> +			if (NCR5380_read(hostdata->c400_ctl_status) &
> +			    CSR_HOST_BUF_NOT_RDY)
> +				break;
> +		}
>
>  		if (hostdata->io_port && hostdata->io_width == 2)
>  			insw(hostdata->io_port + hostdata->c400_host_buf,

It works but it looks really ugly (too many ifs). It also means that we detect 
disconnects only by waiting for timeout.

> @@ -546,10 +552,6 @@ static inline int generic_NCR5380_precv(struct
> NCR5380_hostdata *hostdata, memcpy_fromio(dst + start,
>  				hostdata->io + NCR53C400_host_buffer, 128);
>  		start += 128;
> -
> -		if (NCR5380_read(hostdata->c400_ctl_status) &
> -		    CSR_GATED_53C80_IRQ)
> -			break;
>  	} while (start < len);
>
>  	residual = len - start;
> @@ -600,6 +602,12 @@ static inline int generic_NCR5380_psend(struct
> NCR5380_hostdata *hostdata, break;
>
>  		if (NCR5380_read(hostdata->c400_ctl_status) &
> +		    CSR_HOST_BUF_NOT_RDY && start > 0) {
> +			start -= 128;
> +			break;
> +		}
> +
> +		if (NCR5380_read(hostdata->c400_ctl_status) &
>  		    CSR_GATED_53C80_IRQ)
>  			break;
>

This works too, although I'm not sure if the conditions are 100% correct. I'm 
getting lost again. That means that we roll back 128 bytes & "break" when the 
buffer is not ready and we already wrote something. We also "break" when IRQ 
arrives. But what if the buffer is not ready and start == 0?

> @@ -615,8 +623,7 @@ static inline int generic_NCR5380_psend(struct
> NCR5380_hostdata *hostdata, start += 128;
>  	} while (start < len);
>
> -	residual = max(len - start,
> -	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
> +	residual = len - start;
>
>  	if (residual != 0) {
>  		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */


-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup
  2017-06-30 18:07     ` Ondrej Zary
@ 2017-07-01  2:40       ` Finn Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Finn Thain @ 2017-07-01  2:40 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: James E.J. Bottomley, Martin K. Petersen, linux-scsi,
	linux-kernel, Michael Schmitz

On Fri, 30 Jun 2017, Ondrej Zary wrote:

> > > No more log spamming on DTC but reads are corrupted even more than 
> > > before. The IRQ check after data transfer increases the chance of 
> > > catching an IRQ before the buffer could become ready.
> >
> > If we delay the IRQ check, that just means that CSR_GATED_53C80_IRQ 
> > will be detected a bit later (128 bytes later)... so not much 
> > difference.
> 
> The main difference is that my original patch read the CSR once and 
> then:
> - transfer data if the buffer is ready, ignoring any IRQ
> - if the buffer is not ready, check for an IRQ
> - if neither buffer is ready nor IRQ was asserted, check for timeout
> 

I think your pseudocode is equivalent to this code* from the patch,

	if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
	                           CSR_HOST_BUF_NOT_RDY, 0,
	                           hostdata->c400_ctl_status,
	                           CSR_GATED_53C80_IRQ,
	                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
		break;

	if (NCR5380_read(hostdata->c400_ctl_status) &
	    CSR_HOST_BUF_NOT_RDY)
		break;

	/* transfer data */

If this doesn't work on DTC devices, I think we have to special case them 
somehow.

> So yes, the IRQ will be detected 128 bytes later - but the IRQ is 
> asserted at 3968 bytes which means the transfer will then be complete.
> 

Yes, unless the IRQ was asserted 128 bytes into the transfer...

> > > This patch:
> > > --- a/drivers/scsi/g_NCR5380.c
> > > +++ b/drivers/scsi/g_NCR5380.c
> > > @@ -548,8 +548,10 @@ static inline int generic_NCR5380_precv(struct
> > >  		start += 128;
> > >
> > >  		if (NCR5380_read(hostdata->c400_ctl_status) &
> > > -		    CSR_GATED_53C80_IRQ)
> > > +		    CSR_GATED_53C80_IRQ) {
> > > +			printk("r irq at start=%d basr=0x%02x\n", start,
> > > NCR5380_read(BUS_AND_STATUS_REG)); break;
> > > +		}
> > >  	}
> > >
> > >  	residual = len - start;
> > >
> > > produces lots of these lines:
> > > [  896.194054] r irq at start=128 basr=0x98
> > > [  896.197758] r irq at start=3968 basr=0x98
> >
> > Assuming that the registers are available and valid, the value 0x98 
> > means BASR_END_DMA_TRANSFER | BASR_IRQ | BASR_PHASE_MATCH. There is no 
> > BASR_BUSY_ERROR here, so the cause of the CSR_GATED_53C80_IRQ must be 
> > that the 53c400 has terminated the transfer by asserting /EOP. That 
> > shouldn't happen before before the counters run down.
> >
> > It doesn't make sense. So maybe the 53c80 registers are not valid at 
> > this point? That means a phase mismatch can't be excluded... unlikely 
> > at 128 bytes into the transfer. Busy error? Also unlikely.
> >
> > I have to conclude that CSR_GATED_53C80_IRQ and BASR_END_DMA_TRANSFER 
> > can't be trusted on this board. I guess that's why you examine the 
> > BASR directly in your original algorithm but ignore 
> > BASR_END_DMA_TRANSFER.
> 
> Exactly, BASR_END_DMA_TRANSFER causes problems on DTC.
> 
> > It does look like some kind of timing issue: the "start" value above 
> > changes from one log message to the next. Who knows?
> 
> Only that two values (128 and 3968) appeared. 3968 = 4096-128, one block 
> before the end of transfer. So probably BASR_END_DMA_TRANSFER is 
> asserted one block earlier (i.e. when the last block of data is 
> transferred from the drive to the chip buffer). Haven't seen this 
> problem at start=128 before.
> 

Well, if we ignore the 128 B short transfer as an aberrant outlier, and if 
we assume that BASR_END_DMA_TRANSFER can be made to work correctly, then 
perhaps we can tweak the timing so that a 4096 byte transfer is not cut 
short.

Alternatively, perhaps we could workaround the BASR_END_DMA_TRANSFER 
issue with a special case that ignores Gated IRQ when
	hostdata->board == BOARD_DTC3181E && start == len - 128

> > > This fixes the DTC read corruption, although I don't like the repeated
> > > ctl_status register reads:
> > > --- a/drivers/scsi/g_NCR5380.c
> > > +++ b/drivers/scsi/g_NCR5380.c
> > > @@ -533,7 +533,7 @@ static inline int generic_NCR5380_precv(struct
> > >  			break;
> > >
> > >  		if (NCR5380_read(hostdata->c400_ctl_status) &
> > > -		    CSR_HOST_BUF_NOT_RDY)
> > > +		    CSR_GATED_53C80_IRQ && (NCR5380_read(hostdata->c400_ctl_status) &
> > > CSR_HOST_BUF_NOT_RDY)) break;
> > >
> > >  		if (hostdata->io_port && hostdata->io_width == 2)
> >
> > But that means the transfer will continue even when 
> > CSR_HOST_BUF_NOT_RDY. Your original algorithm doesn't attempt that. 
> > Neither does the algorithm in the datasheet. We should try to omit 
> > this change.
> 
> Sorry for that, it's a bug. I got lost in the conditions.
> 
> > > @@ -546,10 +546,6 @@ static inline int generic_NCR5380_precv(struct
> > >  		memcpy_fromio(dst + start,
> > >  			hostdata->io + NCR53C400_host_buffer, 128);
> > >  		start += 128;
> > > -
> > > -		if (NCR5380_read(hostdata->c400_ctl_status) &
> > > -		    CSR_GATED_53C80_IRQ)
> > > -			break;
> > >  	}
> > >
> > >  	residual = len - start;
> >
> > I think we should keep the CSR_GATED_53C80_IRQ check for the other 
> > boards, if this bogus BASR_END_DMA_TRANSFER problem is confined to 
> > DTC436.
> >
> > How about this change? (to be applied on top of 6/6)
> 
> It does not apply. Changed while() {} -> do {} while() manually.
> 

Sorry, I sent you the wrong diff again.

> > diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
> > index 3948f522b4e1..8e80379cfaaa 100644
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -525,16 +525,22 @@ static inline int generic_NCR5380_precv(struct
> > NCR5380_hostdata *hostdata, NCR5380_write(hostdata->c400_blk_cnt, len /
> > 128);
> >
> >  	do {
> > -		if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
> > -		                           CSR_HOST_BUF_NOT_RDY, 0,
> > -		                           hostdata->c400_ctl_status,
> > -		                           CSR_GATED_53C80_IRQ,
> > -		                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
> > -			break;
> > -
> > -		if (NCR5380_read(hostdata->c400_ctl_status) &
> > -		    CSR_HOST_BUF_NOT_RDY)
> > -			break;
> > +		if (hostdata->board == BOARD_DTC3181E) {
> > +			/* Ignore bogus CSR_GATED_53C80_IRQ */
> > +			if (NCR5380_poll_politely(hostdata, hostdata->c400_ctl_status,
> > +			                          CSR_HOST_BUF_NOT_RDY, 0, HZ / 64) < 0)
> > +				break;
> > +		} else {
> > +			if (NCR5380_poll_politely2(hostdata, hostdata->c400_ctl_status,
> > +			                           CSR_HOST_BUF_NOT_RDY, 0,
> > +			                           hostdata->c400_ctl_status,
> > +			                           CSR_GATED_53C80_IRQ,
> > +			                           CSR_GATED_53C80_IRQ, HZ / 64) < 0)
> > +				break;
> > +			if (NCR5380_read(hostdata->c400_ctl_status) &
> > +			    CSR_HOST_BUF_NOT_RDY)
> > +				break;
> > +		}
> >
> >  		if (hostdata->io_port && hostdata->io_width == 2)
> >  			insw(hostdata->io_port + hostdata->c400_host_buf,
> 
> It works but it looks really ugly (too many ifs). It also means that we 
> detect disconnects only by waiting for timeout.
> 

You're right about disconnect detection. When I wrote this code, I 
reasoned that the timeout may slow down CD-ROM reads (because they often 
disconnect) but disconnect/reconnect is slow anyway. The impact would need 
to be measured. Anyway, there are a couple of alternatives:

1) Fix the timing on DTC so that Gated IRQ can be polled. That would allow 
   us to use the same algorithm for all boards (i.e. the code snippet 
   marked * above).
 
2) Poll BASR even if that register may be inaccessible (no CSR_53C80_REG 
   flag). As I said, I'm reluctant to attempt this.

3) The test for hostdata->board == BOARD_DTC3181E && start == len - 128 
   should avoid this issue, since CD-ROMs disconnect on 2048-byte 
   boundaries. I'll try this in v6.

> > @@ -546,10 +552,6 @@ static inline int generic_NCR5380_precv(struct
> > NCR5380_hostdata *hostdata, memcpy_fromio(dst + start,
> >  				hostdata->io + NCR53C400_host_buffer, 128);
> >  		start += 128;
> > -
> > -		if (NCR5380_read(hostdata->c400_ctl_status) &
> > -		    CSR_GATED_53C80_IRQ)
> > -			break;
> >  	} while (start < len);
> >
> >  	residual = len - start;
> > @@ -600,6 +602,12 @@ static inline int generic_NCR5380_psend(struct
> > NCR5380_hostdata *hostdata, break;
> >
> >  		if (NCR5380_read(hostdata->c400_ctl_status) &
> > +		    CSR_HOST_BUF_NOT_RDY && start > 0) {
> > +			start -= 128;
> > +			break;
> > +		}
> > +
> > +		if (NCR5380_read(hostdata->c400_ctl_status) &
> >  		    CSR_GATED_53C80_IRQ)
> >  			break;
> >
> 
> This works too, although I'm not sure if the conditions are 100% 
> correct. I'm getting lost again. That means that we roll back 128 bytes 
> & "break" when the buffer is not ready and we already wrote something. 
> We also "break" when IRQ arrives. But what if the buffer is not ready 
> and start == 0?
> 

If start == 0 the residual ends up being the entire transfer. I think it 
is correct. I will see whether this logic can be expressed more clearly.

> > @@ -615,8 +623,7 @@ static inline int generic_NCR5380_psend(struct
> > NCR5380_hostdata *hostdata, start += 128;
> >  	} while (start < len);
> >
> > -	residual = max(len - start,
> > -	               (int)NCR5380_read(hostdata->c400_blk_cnt) * 128);
> > +	residual = len - start;
> >
> >  	if (residual != 0) {
> >  		/* 53c80 interrupt or transfer timeout. Reset 53c400 logic. */
> 

Thanks for testing.

-- 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-07-01  2:40 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-29  5:24 [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup Finn Thain
2017-06-29  5:24 ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 3/6] g_NCR5380: Cleanup comments and whitespace Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 2/6] g_NCR5380: End PDMA transfer correctly on target disconnection Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 5/6] g_NCR5380: Re-work PDMA loops Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 1/6] g_NCR5380: Fix PDMA transfer size Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 4/6] g_NCR5380: Limit PDMA send to 512 B to avoid data corruption on DTC3181E Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29  5:24 ` [PATCH v5 6/6] g_NCR5380: Use unambiguous terminology for PDMA send and receive Finn Thain
2017-06-29  5:24   ` Finn Thain
2017-06-29 18:06 ` [PATCH v5 0/6] g_NCR5380: PDMA fixes and cleanup Ondrej Zary
2017-06-30  7:12   ` Finn Thain
2017-06-30 18:07     ` Ondrej Zary
2017-07-01  2:40       ` Finn Thain
2017-06-29 18:30 ` Ondrej Zary

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.