All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] remove ide-scsi
@ 2008-12-03  1:38 FUJITA Tomonori
  2008-12-03 10:06 ` Christoph Hellwig
  0 siblings, 1 reply; 65+ messages in thread
From: FUJITA Tomonori @ 2008-12-03  1:38 UTC (permalink / raw)
  To: bzolnier; +Cc: James.Bottomley, linux-ide, linux-scsi

This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.

It's against linux-next (which seems to has some changes to ide-scsi
for 2.6.29 from the ide tree). 

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] remove ide-scsi

As planed, this removes ide-scsi.

The 2.6 kernel supports direct writing to ide CD drives, which
eliminates the need for ide-scsi. ide-scsi has been unmaintained and
marked as deprecated.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 Documentation/feature-removal-schedule.txt |    9 -
 MAINTAINERS                                |    5 -
 drivers/ide/Kconfig                        |   17 -
 drivers/scsi/Kconfig                       |    8 +-
 drivers/scsi/Makefile                      |    1 -
 drivers/scsi/ide-scsi.c                    |  840 ----------------------------
 6 files changed, 4 insertions(+), 876 deletions(-)
 delete mode 100644 drivers/scsi/ide-scsi.c

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 77eb6b1..31f7906 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -322,15 +322,6 @@ Who:  Krzysztof Piotr Oledzki <ole@ans.pl>
 
 ---------------------------
 
-What: ide-scsi (BLK_DEV_IDESCSI)
-When: 2.6.29
-Why:  The 2.6 kernel supports direct writing to ide CD drives, which
-      eliminates the need for ide-scsi. The new method is more
-      efficient in every way.
-Who:  FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
-
----------------------------
-
 What:	i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client()
 When:	2.6.29 (ideally) or 2.6.30 (more likely)
 Why:	Deprecated by the new (standard) device driver binding model. Use
diff --git a/MAINTAINERS b/MAINTAINERS
index 758dd7a..8e07bb9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2138,11 +2138,6 @@ M:	Gadi Oxman <gadio@netvision.net.il>
 L:	linux-kernel@vger.kernel.org
 S:	Maintained
 
-IDE-SCSI DRIVER
-L:	linux-ide@vger.kernel.org
-L:	linux-scsi@vger.kernel.org
-S:	Orphan
-
 IDLE-I7300
 P:	Andy Henroid
 M:	andrew.d.henroid@intel.com
diff --git a/drivers/ide/Kconfig b/drivers/ide/Kconfig
index 8d84fe4..d806271 100644
--- a/drivers/ide/Kconfig
+++ b/drivers/ide/Kconfig
@@ -185,23 +185,6 @@ config BLK_DEV_IDETAPE
 	  To compile this driver as a module, choose M here: the
 	  module will be called ide-tape.
 
-config BLK_DEV_IDESCSI
-	tristate "SCSI emulation support (DEPRECATED)"
-	depends on SCSI
-	select IDE_ATAPI
-	---help---
-	  WARNING: ide-scsi is no longer needed for cd writing applications!
-	  The 2.6 kernel supports direct writing to ide-cd, which eliminates
-	  the need for ide-scsi + the entire scsi stack just for writing a
-	  cd. The new method is more efficient in every way.
-
-	  This will provide SCSI host adapter emulation for IDE ATAPI devices,
-	  and will allow you to use a SCSI device driver instead of a native
-	  ATAPI driver.
-
-	  If both this SCSI emulation and native ATAPI support are compiled
-	  into the kernel, the native support will be used.
-
 config BLK_DEV_IDEACPI
 	bool "IDE ACPI support"
 	depends on ACPI
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index e6af21c..2eb172d 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -21,7 +21,7 @@ config SCSI
 	  You also need to say Y here if you have a device which speaks
 	  the SCSI protocol.  Examples of this include the parallel port
 	  version of the IOMEGA ZIP drive, USB storage devices, Fibre
-	  Channel, FireWire storage and the IDE-SCSI emulation driver.
+	  Channel, and FireWire storage.
 
 	  To compile this driver as a module, choose M here and read
 	  <file:Documentation/scsi/scsi.txt>.
@@ -101,9 +101,9 @@ config CHR_DEV_OSST
 	---help---
 	  The OnStream SC-x0 SCSI tape drives cannot be driven by the
 	  standard st driver, but instead need this special osst driver and
-	  use the  /dev/osstX char device nodes (major 206).  Via usb-storage
-	  and ide-scsi, you may be able to drive the USB-x0 and DI-x0 drives
-	  as well.  Note that there is also a second generation of OnStream
+	  use the  /dev/osstX char device nodes (major 206).  Via usb-storage,
+	  you may be able to drive the USB-x0 and DI-x0 drives as well.
+	  Note that there is also a second generation of OnStream
 	  tape drives (ADR-x0) that supports the standard SCSI-2 commands for
 	  tapes (QIC-157) and can be driven by the standard driver st.
 	  For more information, you may have a look at the SCSI-HOWTO
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 1e49632..42bf88b 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -103,7 +103,6 @@ obj-$(CONFIG_SCSI_GDTH)		+= gdth.o
 obj-$(CONFIG_SCSI_INITIO)	+= initio.o
 obj-$(CONFIG_SCSI_INIA100)	+= a100u2w.o
 obj-$(CONFIG_SCSI_QLOGICPTI)	+= qlogicpti.o
-obj-$(CONFIG_BLK_DEV_IDESCSI)	+= ide-scsi.o
 obj-$(CONFIG_SCSI_MESH)		+= mesh.o
 obj-$(CONFIG_SCSI_MAC53C94)	+= mac53c94.o
 obj-$(CONFIG_BLK_DEV_3W_XXXX_RAID) += 3w-xxxx.o
diff --git a/drivers/scsi/ide-scsi.c b/drivers/scsi/ide-scsi.c
deleted file mode 100644
index c24140a..0000000
--- a/drivers/scsi/ide-scsi.c
+++ /dev/null
@@ -1,840 +0,0 @@
-/*
- * Copyright (C) 1996-1999  Gadi Oxman <gadio@netvision.net.il>
- * Copyright (C) 2004-2005  Bartlomiej Zolnierkiewicz
- */
-
-/*
- * Emulation of a SCSI host adapter for IDE ATAPI devices.
- *
- * With this driver, one can use the Linux SCSI drivers instead of the
- * native IDE ATAPI drivers.
- *
- * Ver 0.1   Dec  3 96   Initial version.
- * Ver 0.2   Jan 26 97   Fixed bug in cleanup_module() and added emulation
- *                        of MODE_SENSE_6/MODE_SELECT_6 for cdroms. Thanks
- *                        to Janos Farkas for pointing this out.
- *                       Avoid using bitfields in structures for m68k.
- *                       Added Scatter/Gather and DMA support.
- * Ver 0.4   Dec  7 97   Add support for ATAPI PD/CD drives.
- *                       Use variable timeout for each command.
- * Ver 0.5   Jan  2 98   Fix previous PD/CD support.
- *                       Allow disabling of SCSI-6 to SCSI-10 transformation.
- * Ver 0.6   Jan 27 98   Allow disabling of SCSI command translation layer
- *                        for access through /dev/sg.
- *                       Fix MODE_SENSE_6/MODE_SELECT_6/INQUIRY translation.
- * Ver 0.7   Dec 04 98   Ignore commands where lun != 0 to avoid multiple
- *                        detection of devices with CONFIG_SCSI_MULTI_LUN
- * Ver 0.8   Feb 05 99   Optical media need translation too. Reverse 0.7.
- * Ver 0.9   Jul 04 99   Fix a bug in SG_SET_TRANSFORM.
- * Ver 0.91  Jun 10 02   Fix "off by one" error in transforms
- * Ver 0.92  Dec 31 02   Implement new SCSI mid level API
- */
-
-#define IDESCSI_VERSION "0.92"
-
-#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/string.h>
-#include <linux/kernel.h>
-#include <linux/mm.h>
-#include <linux/ioport.h>
-#include <linux/blkdev.h>
-#include <linux/errno.h>
-#include <linux/slab.h>
-#include <linux/ide.h>
-#include <linux/scatterlist.h>
-#include <linux/delay.h>
-#include <linux/mutex.h>
-#include <linux/bitops.h>
-
-#include <asm/io.h>
-#include <asm/uaccess.h>
-
-#include <scsi/scsi.h>
-#include <scsi/scsi_cmnd.h>
-#include <scsi/scsi_device.h>
-#include <scsi/scsi_host.h>
-#include <scsi/scsi_tcq.h>
-#include <scsi/sg.h>
-
-#define IDESCSI_DEBUG_LOG		0
-
-#if IDESCSI_DEBUG_LOG
-#define debug_log(fmt, args...) \
-	printk(KERN_INFO "ide-scsi: " fmt, ## args)
-#else
-#define debug_log(fmt, args...) do {} while (0)
-#endif
-
-/*
- *	SCSI command transformation layer
- */
-#define IDESCSI_SG_TRANSFORM		1	/* /dev/sg transformation */
-
-/*
- *	Log flags
- */
-#define IDESCSI_LOG_CMD			0	/* Log SCSI commands */
-
-typedef struct ide_scsi_obj {
-	ide_drive_t		*drive;
-	ide_driver_t		*driver;
-	struct gendisk		*disk;
-	struct Scsi_Host	*host;
-
-	unsigned long transform;		/* SCSI cmd translation layer */
-	unsigned long log;			/* log flags */
-} idescsi_scsi_t;
-
-static DEFINE_MUTEX(idescsi_ref_mutex);
-/* Set by module param to skip cd */
-static int idescsi_nocd;
-
-#define ide_scsi_g(disk) \
-	container_of((disk)->private_data, struct ide_scsi_obj, driver)
-
-static struct ide_scsi_obj *ide_scsi_get(struct gendisk *disk)
-{
-	struct ide_scsi_obj *scsi = NULL;
-
-	mutex_lock(&idescsi_ref_mutex);
-	scsi = ide_scsi_g(disk);
-	if (scsi) {
-		if (ide_device_get(scsi->drive))
-			scsi = NULL;
-		else
-			scsi_host_get(scsi->host);
-	}
-	mutex_unlock(&idescsi_ref_mutex);
-	return scsi;
-}
-
-static void ide_scsi_put(struct ide_scsi_obj *scsi)
-{
-	ide_drive_t *drive = scsi->drive;
-
-	mutex_lock(&idescsi_ref_mutex);
-	scsi_host_put(scsi->host);
-	ide_device_put(drive);
-	mutex_unlock(&idescsi_ref_mutex);
-}
-
-static inline idescsi_scsi_t *scsihost_to_idescsi(struct Scsi_Host *host)
-{
-	return (idescsi_scsi_t*) (&host[1]);
-}
-
-static inline idescsi_scsi_t *drive_to_idescsi(ide_drive_t *ide_drive)
-{
-	return scsihost_to_idescsi(ide_drive->driver_data);
-}
-
-static void ide_scsi_hex_dump(u8 *data, int len)
-{
-	print_hex_dump(KERN_CONT, "", DUMP_PREFIX_NONE, 16, 1, data, len, 0);
-}
-
-static int idescsi_end_request(ide_drive_t *, int, int);
-
-static void ide_scsi_callback(ide_drive_t *drive, int dsc)
-{
-	idescsi_scsi_t *scsi = drive_to_idescsi(drive);
-	struct ide_atapi_pc *pc = drive->pc;
-
-	if (pc->flags & PC_FLAG_TIMEDOUT)
-		debug_log("%s: got timed out packet %lu at %lu\n", __func__,
-			  pc->scsi_cmd->serial_number, jiffies);
-		/* end this request now - scsi should retry it*/
-	else if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
-		printk(KERN_INFO "Packet command completed, %d bytes"
-				 " transferred\n", pc->xferred);
-
-	idescsi_end_request(drive, 1, 0);
-}
-
-static int idescsi_check_condition(ide_drive_t *drive,
-		struct request *failed_cmd)
-{
-	idescsi_scsi_t *scsi = drive_to_idescsi(drive);
-	struct ide_atapi_pc   *pc;
-	struct request *rq;
-	u8             *buf;
-
-	/* stuff a sense request in front of our current request */
-	pc = kzalloc(sizeof(struct ide_atapi_pc), GFP_ATOMIC);
-	rq = blk_get_request(drive->queue, READ, GFP_ATOMIC);
-	buf = kzalloc(SCSI_SENSE_BUFFERSIZE, GFP_ATOMIC);
-	if (!pc || !rq || !buf) {
-		kfree(buf);
-		if (rq)
-			blk_put_request(rq);
-		kfree(pc);
-		return -ENOMEM;
-	}
-	rq->special = (char *) pc;
-	pc->rq = rq;
-	pc->buf = buf;
-	pc->c[0] = REQUEST_SENSE;
-	pc->c[4] = pc->req_xfer = pc->buf_size = SCSI_SENSE_BUFFERSIZE;
-	rq->cmd_type = REQ_TYPE_SENSE;
-	rq->cmd_flags |= REQ_PREEMPT;
-	pc->timeout = jiffies + WAIT_READY;
-	/* NOTE! Save the failed packet command in "rq->buffer" */
-	rq->buffer = (void *) failed_cmd->special;
-	pc->scsi_cmd = ((struct ide_atapi_pc *) failed_cmd->special)->scsi_cmd;
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
-		printk ("ide-scsi: %s: queue cmd = ", drive->name);
-		ide_scsi_hex_dump(pc->c, 6);
-	}
-	rq->rq_disk = scsi->disk;
-	rq->ref_count++;
-	memcpy(rq->cmd, pc->c, 12);
-	ide_do_drive_cmd(drive, rq);
-	return 0;
-}
-
-static ide_startstop_t
-idescsi_atapi_error(ide_drive_t *drive, struct request *rq, u8 stat, u8 err)
-{
-	ide_hwif_t *hwif = drive->hwif;
-
-	if (hwif->tp_ops->read_status(hwif) & (ATA_BUSY | ATA_DRQ))
-		/* force an abort */
-		hwif->tp_ops->exec_command(hwif, ATA_CMD_IDLEIMMEDIATE);
-
-	rq->errors++;
-
-	idescsi_end_request(drive, 0, 0);
-
-	return ide_stopped;
-}
-
-static int idescsi_end_request (ide_drive_t *drive, int uptodate, int nrsecs)
-{
-	idescsi_scsi_t *scsi = drive_to_idescsi(drive);
-	struct request *rq = HWGROUP(drive)->rq;
-	struct ide_atapi_pc *pc = (struct ide_atapi_pc *) rq->special;
-	int log = test_bit(IDESCSI_LOG_CMD, &scsi->log);
-	struct Scsi_Host *host;
-	int errors = rq->errors;
-	unsigned long flags;
-
-	if (!blk_special_request(rq) && !blk_sense_request(rq)) {
-		ide_end_request(drive, uptodate, nrsecs);
-		return 0;
-	}
-	ide_end_drive_cmd (drive, 0, 0);
-	if (blk_sense_request(rq)) {
-		struct ide_atapi_pc *opc = (struct ide_atapi_pc *) rq->buffer;
-		if (log) {
-			printk ("ide-scsi: %s: wrap up check %lu, rst = ", drive->name, opc->scsi_cmd->serial_number);
-			ide_scsi_hex_dump(pc->buf, 16);
-		}
-		memcpy((void *) opc->scsi_cmd->sense_buffer, pc->buf,
-			SCSI_SENSE_BUFFERSIZE);
-		kfree(pc->buf);
-		kfree(pc);
-		blk_put_request(rq);
-		pc = opc;
-		rq = pc->rq;
-		pc->scsi_cmd->result = (CHECK_CONDITION << 1) |
-				(((pc->flags & PC_FLAG_TIMEDOUT) ?
-				  DID_TIME_OUT :
-				  DID_OK) << 16);
-	} else if (pc->flags & PC_FLAG_TIMEDOUT) {
-		if (log)
-			printk (KERN_WARNING "ide-scsi: %s: timed out for %lu\n",
-					drive->name, pc->scsi_cmd->serial_number);
-		pc->scsi_cmd->result = DID_TIME_OUT << 16;
-	} else if (errors >= ERROR_MAX) {
-		pc->scsi_cmd->result = DID_ERROR << 16;
-		if (log)
-			printk ("ide-scsi: %s: I/O error for %lu\n", drive->name, pc->scsi_cmd->serial_number);
-	} else if (errors) {
-		if (log)
-			printk ("ide-scsi: %s: check condition for %lu\n", drive->name, pc->scsi_cmd->serial_number);
-		if (!idescsi_check_condition(drive, rq))
-			/* we started a request sense, so we'll be back, exit for now */
-			return 0;
-		pc->scsi_cmd->result = (CHECK_CONDITION << 1) | (DID_OK << 16);
-	} else {
-		pc->scsi_cmd->result = DID_OK << 16;
-	}
-	host = pc->scsi_cmd->device->host;
-	spin_lock_irqsave(host->host_lock, flags);
-	pc->done(pc->scsi_cmd);
-	spin_unlock_irqrestore(host->host_lock, flags);
-	kfree(pc);
-	blk_put_request(rq);
-	drive->pc = NULL;
-	return 0;
-}
-
-static inline int idescsi_set_direction(struct ide_atapi_pc *pc)
-{
-	switch (pc->c[0]) {
-		case READ_6: case READ_10: case READ_12:
-			pc->flags &= ~PC_FLAG_WRITING;
-			return 0;
-		case WRITE_6: case WRITE_10: case WRITE_12:
-			pc->flags |= PC_FLAG_WRITING;
-			return 0;
-		default:
-			return 1;
-	}
-}
-
-static int idescsi_map_sg(ide_drive_t *drive, struct ide_atapi_pc *pc)
-{
-	ide_hwif_t *hwif = drive->hwif;
-	struct scatterlist *sg, *scsi_sg;
-	int segments;
-
-	if (!pc->req_xfer || pc->req_xfer % 1024)
-		return 1;
-
-	if (idescsi_set_direction(pc))
-		return 1;
-
-	sg = hwif->sg_table;
-	scsi_sg = scsi_sglist(pc->scsi_cmd);
-	segments = scsi_sg_count(pc->scsi_cmd);
-
-	if (segments > hwif->sg_max_nents)
-		return 1;
-
-	hwif->sg_nents = segments;
-	memcpy(sg, scsi_sg, sizeof(*sg) * segments);
-
-	return 0;
-}
-
-static ide_startstop_t idescsi_issue_pc(ide_drive_t *drive,
-		struct ide_atapi_pc *pc)
-{
-	/* Set the current packet command */
-	drive->pc = pc;
-
-	return ide_issue_pc(drive, ide_scsi_get_timeout(pc), ide_scsi_expiry);
-}
-
-/*
- *	idescsi_do_request is our request handling function.
- */
-static ide_startstop_t idescsi_do_request (ide_drive_t *drive, struct request *rq, sector_t block)
-{
-	debug_log("dev: %s, cmd: %x, errors: %d\n", rq->rq_disk->disk_name,
-		  rq->cmd[0], rq->errors);
-	debug_log("sector: %ld, nr_sectors: %ld, current_nr_sectors: %d\n",
-		  rq->sector, rq->nr_sectors, rq->current_nr_sectors);
-
-	if (blk_sense_request(rq) || blk_special_request(rq)) {
-		struct ide_atapi_pc *pc = (struct ide_atapi_pc *)rq->special;
-
-		if ((drive->dev_flags & IDE_DFLAG_USING_DMA) &&
-		    idescsi_map_sg(drive, pc) == 0)
-			pc->flags |= PC_FLAG_DMA_OK;
-
-		return idescsi_issue_pc(drive, pc);
-	}
-	blk_dump_rq_flags(rq, "ide-scsi: unsup command");
-	idescsi_end_request (drive, 0, 0);
-	return ide_stopped;
-}
-
-#ifdef CONFIG_IDE_PROC_FS
-static ide_proc_entry_t idescsi_proc[] = {
-	{ "capacity", S_IFREG|S_IRUGO, proc_ide_read_capacity, NULL },
-	{ NULL, 0, NULL, NULL }
-};
-
-#define ide_scsi_devset_get(name, field) \
-static int get_##name(ide_drive_t *drive) \
-{ \
-	idescsi_scsi_t *scsi = drive_to_idescsi(drive); \
-	return scsi->field; \
-}
-
-#define ide_scsi_devset_set(name, field) \
-static int set_##name(ide_drive_t *drive, int arg) \
-{ \
-	idescsi_scsi_t *scsi = drive_to_idescsi(drive); \
-	scsi->field = arg; \
-	return 0; \
-}
-
-#define ide_scsi_devset_rw_field(_name, _field) \
-ide_scsi_devset_get(_name, _field); \
-ide_scsi_devset_set(_name, _field); \
-IDE_DEVSET(_name, DS_SYNC, get_##_name, set_##_name);
-
-ide_devset_rw_field(bios_cyl, bios_cyl);
-ide_devset_rw_field(bios_head, bios_head);
-ide_devset_rw_field(bios_sect, bios_sect);
-
-ide_scsi_devset_rw_field(transform, transform);
-ide_scsi_devset_rw_field(log, log);
-
-static const struct ide_proc_devset idescsi_settings[] = {
-	IDE_PROC_DEVSET(bios_cyl,  0, 1023),
-	IDE_PROC_DEVSET(bios_head, 0,  255),
-	IDE_PROC_DEVSET(bios_sect, 0,	63),
-	IDE_PROC_DEVSET(log,	   0,	 1),
-	IDE_PROC_DEVSET(transform, 0,	 3),
-	{ 0 },
-};
-
-static ide_proc_entry_t *ide_scsi_proc_entries(ide_drive_t *drive)
-{
-	return idescsi_proc;
-}
-
-static const struct ide_proc_devset *ide_scsi_proc_devsets(ide_drive_t *drive)
-{
-	return idescsi_settings;
-}
-#endif
-
-/*
- *	Driver initialization.
- */
-static void idescsi_setup (ide_drive_t *drive, idescsi_scsi_t *scsi)
-{
-	clear_bit(IDESCSI_SG_TRANSFORM, &scsi->transform);
-#if IDESCSI_DEBUG_LOG
-	set_bit(IDESCSI_LOG_CMD, &scsi->log);
-#endif /* IDESCSI_DEBUG_LOG */
-
-	drive->pc_callback	 = ide_scsi_callback;
-	drive->pc_update_buffers = NULL;
-	drive->pc_io_buffers	 = ide_io_buffers;
-
-	ide_proc_register_driver(drive, scsi->driver);
-}
-
-static void ide_scsi_remove(ide_drive_t *drive)
-{
-	struct Scsi_Host *scsihost = drive->driver_data;
-	struct ide_scsi_obj *scsi = scsihost_to_idescsi(scsihost);
-	struct gendisk *g = scsi->disk;
-
-	scsi_remove_host(scsihost);
-	ide_proc_unregister_driver(drive, scsi->driver);
-
-	ide_unregister_region(g);
-
-	drive->driver_data = NULL;
-	g->private_data = NULL;
-	put_disk(g);
-
-	ide_scsi_put(scsi);
-
-	drive->dev_flags &= ~IDE_DFLAG_SCSI;
-}
-
-static int ide_scsi_probe(ide_drive_t *);
-
-static ide_driver_t idescsi_driver = {
-	.gen_driver = {
-		.owner		= THIS_MODULE,
-		.name		= "ide-scsi",
-		.bus		= &ide_bus_type,
-	},
-	.probe			= ide_scsi_probe,
-	.remove			= ide_scsi_remove,
-	.version		= IDESCSI_VERSION,
-	.do_request		= idescsi_do_request,
-	.end_request		= idescsi_end_request,
-	.error                  = idescsi_atapi_error,
-#ifdef CONFIG_IDE_PROC_FS
-	.proc_entries		= ide_scsi_proc_entries,
-	.proc_devsets		= ide_scsi_proc_devsets,
-#endif
-};
-
-static int idescsi_ide_open(struct block_device *bdev, fmode_t mode)
-{
-	struct ide_scsi_obj *scsi = ide_scsi_get(bdev->bd_disk);
-
-	if (!scsi)
-		return -ENXIO;
-
-	return 0;
-}
-
-static int idescsi_ide_release(struct gendisk *disk, fmode_t mode)
-{
-	ide_scsi_put(ide_scsi_g(disk));
-	return 0;
-}
-
-static int idescsi_ide_ioctl(struct block_device *bdev, fmode_t mode,
-			unsigned int cmd, unsigned long arg)
-{
-	struct ide_scsi_obj *scsi = ide_scsi_g(bdev->bd_disk);
-	return generic_ide_ioctl(scsi->drive, bdev, cmd, arg);
-}
-
-static struct block_device_operations idescsi_ops = {
-	.owner		= THIS_MODULE,
-	.open		= idescsi_ide_open,
-	.release	= idescsi_ide_release,
-	.locked_ioctl	= idescsi_ide_ioctl,
-};
-
-static int idescsi_slave_configure(struct scsi_device * sdp)
-{
-	/* Configure detected device */
-	sdp->use_10_for_rw = 1;
-	sdp->use_10_for_ms = 1;
-	scsi_adjust_queue_depth(sdp, MSG_SIMPLE_TAG, sdp->host->cmd_per_lun);
-	return 0;
-}
-
-static const char *idescsi_info (struct Scsi_Host *host)
-{
-	return "SCSI host adapter emulation for IDE ATAPI devices";
-}
-
-static int idescsi_ioctl (struct scsi_device *dev, int cmd, void __user *arg)
-{
-	idescsi_scsi_t *scsi = scsihost_to_idescsi(dev->host);
-
-	if (cmd == SG_SET_TRANSFORM) {
-		if (arg)
-			set_bit(IDESCSI_SG_TRANSFORM, &scsi->transform);
-		else
-			clear_bit(IDESCSI_SG_TRANSFORM, &scsi->transform);
-		return 0;
-	} else if (cmd == SG_GET_TRANSFORM)
-		return put_user(test_bit(IDESCSI_SG_TRANSFORM, &scsi->transform), (int __user *) arg);
-	return -EINVAL;
-}
-
-static int idescsi_queue (struct scsi_cmnd *cmd,
-		void (*done)(struct scsi_cmnd *))
-{
-	struct Scsi_Host *host = cmd->device->host;
-	idescsi_scsi_t *scsi = scsihost_to_idescsi(host);
-	ide_drive_t *drive = scsi->drive;
-	struct request *rq = NULL;
-	struct ide_atapi_pc *pc = NULL;
-	int write = cmd->sc_data_direction == DMA_TO_DEVICE;
-
-	if (!drive) {
-		scmd_printk (KERN_ERR, cmd, "drive not present\n");
-		goto abort;
-	}
-	scsi = drive_to_idescsi(drive);
-	pc = kmalloc(sizeof(struct ide_atapi_pc), GFP_ATOMIC);
-	rq = blk_get_request(drive->queue, write, GFP_ATOMIC);
-	if (rq == NULL || pc == NULL) {
-		printk (KERN_ERR "ide-scsi: %s: out of memory\n", drive->name);
-		goto abort;
-	}
-
-	memset (pc->c, 0, 12);
-	pc->flags = 0;
-	if (cmd->sc_data_direction == DMA_TO_DEVICE)
-		pc->flags |= PC_FLAG_WRITING;
-	pc->rq = rq;
-	memcpy (pc->c, cmd->cmnd, cmd->cmd_len);
-	pc->buf = NULL;
-	pc->sg = scsi_sglist(cmd);
-	pc->sg_cnt = scsi_sg_count(cmd);
-	pc->b_count = 0;
-	pc->req_xfer = pc->buf_size = scsi_bufflen(cmd);
-	pc->scsi_cmd = cmd;
-	pc->done = done;
-	pc->timeout = jiffies + cmd->request->timeout;
-
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log)) {
-		printk ("ide-scsi: %s: que %lu, cmd = ", drive->name, cmd->serial_number);
-		ide_scsi_hex_dump(cmd->cmnd, cmd->cmd_len);
-		if (memcmp(pc->c, cmd->cmnd, cmd->cmd_len)) {
-			printk ("ide-scsi: %s: que %lu, tsl = ", drive->name, cmd->serial_number);
-			ide_scsi_hex_dump(pc->c, 12);
-		}
-	}
-
-	rq->special = (char *) pc;
-	rq->cmd_type = REQ_TYPE_SPECIAL;
-	spin_unlock_irq(host->host_lock);
-	rq->ref_count++;
-	memcpy(rq->cmd, pc->c, 12);
-	blk_execute_rq_nowait(drive->queue, scsi->disk, rq, 0, NULL);
-	spin_lock_irq(host->host_lock);
-	return 0;
-abort:
-	kfree (pc);
-	if (rq)
-		blk_put_request(rq);
-	cmd->result = DID_ERROR << 16;
-	done(cmd);
-	return 0;
-}
-
-static int idescsi_eh_abort (struct scsi_cmnd *cmd)
-{
-	idescsi_scsi_t *scsi  = scsihost_to_idescsi(cmd->device->host);
-	ide_drive_t    *drive = scsi->drive;
-	ide_hwif_t     *hwif;
-	ide_hwgroup_t  *hwgroup;
-	int		busy;
-	int             ret   = FAILED;
-
-	struct ide_atapi_pc *pc;
-
-	/* In idescsi_eh_abort we try to gently pry our command from the ide subsystem */
-
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
-		printk (KERN_WARNING "ide-scsi: abort called for %lu\n", cmd->serial_number);
-
-	if (!drive) {
-		printk (KERN_WARNING "ide-scsi: Drive not set in idescsi_eh_abort\n");
-		WARN_ON(1);
-		goto no_drive;
-	}
-
-	hwif = drive->hwif;
-	hwgroup = hwif->hwgroup;
-
-	/* First give it some more time, how much is "right" is hard to say :-(
-	   FIXME - uses mdelay which causes latency? */
-	busy = ide_wait_not_busy(hwif, 100);
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
-		printk (KERN_WARNING "ide-scsi: drive did%s become ready\n", busy?" not":"");
-
-	spin_lock_irq(&hwgroup->lock);
-
-	/* If there is no pc running we're done (our interrupt took care of it) */
-	pc = drive->pc;
-	if (pc == NULL) {
-		ret = SUCCESS;
-		goto ide_unlock;
-	}
-
-	/* It's somewhere in flight. Does ide subsystem agree? */
-	if (pc->scsi_cmd->serial_number == cmd->serial_number && !busy &&
-	    elv_queue_empty(drive->queue) && HWGROUP(drive)->rq != pc->rq) {
-		/*
-		 * FIXME - not sure this condition can ever occur
-		 */
-		printk (KERN_ERR "ide-scsi: cmd aborted!\n");
-
-		if (blk_sense_request(pc->rq))
-			kfree(pc->buf);
-		/* we need to call blk_put_request twice. */
-		blk_put_request(pc->rq);
-		blk_put_request(pc->rq);
-		kfree(pc);
-		drive->pc = NULL;
-
-		ret = SUCCESS;
-	}
-
-ide_unlock:
-	spin_unlock_irq(&hwgroup->lock);
-no_drive:
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
-		printk (KERN_WARNING "ide-scsi: abort returns %s\n", ret == SUCCESS?"success":"failed");
-
-	return ret;
-}
-
-static int idescsi_eh_reset (struct scsi_cmnd *cmd)
-{
-	struct request *req;
-	idescsi_scsi_t *scsi  = scsihost_to_idescsi(cmd->device->host);
-	ide_drive_t    *drive = scsi->drive;
-	ide_hwgroup_t  *hwgroup;
-	int             ready = 0;
-	int             ret   = SUCCESS;
-
-	struct ide_atapi_pc *pc;
-
-	/* In idescsi_eh_reset we forcefully remove the command from the ide subsystem and reset the device. */
-
-	if (test_bit(IDESCSI_LOG_CMD, &scsi->log))
-		printk (KERN_WARNING "ide-scsi: reset called for %lu\n", cmd->serial_number);
-
-	if (!drive) {
-		printk (KERN_WARNING "ide-scsi: Drive not set in idescsi_eh_reset\n");
-		WARN_ON(1);
-		return FAILED;
-	}
-
-	hwgroup = drive->hwif->hwgroup;
-
-	spin_lock_irq(cmd->device->host->host_lock);
-	spin_lock(&hwgroup->lock);
-
-	pc = drive->pc;
-	if (pc)
-		req = pc->rq;
-
-	if (pc == NULL || req != hwgroup->rq || hwgroup->handler == NULL) {
-		printk (KERN_WARNING "ide-scsi: No active request in idescsi_eh_reset\n");
-		spin_unlock(&hwgroup->lock);
-		spin_unlock_irq(cmd->device->host->host_lock);
-		return FAILED;
-	}
-
-	/* kill current request */
-	if (__blk_end_request(req, -EIO, 0))
-		BUG();
-	if (blk_sense_request(req))
-		kfree(pc->buf);
-	kfree(pc);
-	drive->pc = NULL;
-	blk_put_request(req);
-
-	/* now nuke the drive queue */
-	while ((req = elv_next_request(drive->queue))) {
-		if (__blk_end_request(req, -EIO, 0))
-			BUG();
-	}
-
-	hwgroup->rq = NULL;
-	hwgroup->handler = NULL;
-	hwgroup->busy = 1; /* will set this to zero when ide reset finished */
-	spin_unlock(&hwgroup->lock);
-
-	ide_do_reset(drive);
-
-	/* ide_do_reset starts a polling handler which restarts itself every 50ms until the reset finishes */
-
-	do {
-		spin_unlock_irq(cmd->device->host->host_lock);
-		msleep(50);
-		spin_lock_irq(cmd->device->host->host_lock);
-	} while ( HWGROUP(drive)->handler );
-
-	ready = drive_is_ready(drive);
-	HWGROUP(drive)->busy--;
-	if (!ready) {
-		printk (KERN_ERR "ide-scsi: reset failed!\n");
-		ret = FAILED;
-	}
-
-	spin_unlock_irq(cmd->device->host->host_lock);
-	return ret;
-}
-
-static int idescsi_bios(struct scsi_device *sdev, struct block_device *bdev,
-		sector_t capacity, int *parm)
-{
-	idescsi_scsi_t *idescsi = scsihost_to_idescsi(sdev->host);
-	ide_drive_t *drive = idescsi->drive;
-
-	if (drive->bios_cyl && drive->bios_head && drive->bios_sect) {
-		parm[0] = drive->bios_head;
-		parm[1] = drive->bios_sect;
-		parm[2] = drive->bios_cyl;
-	}
-	return 0;
-}
-
-static struct scsi_host_template idescsi_template = {
-	.module			= THIS_MODULE,
-	.name			= "idescsi",
-	.info			= idescsi_info,
-	.slave_configure        = idescsi_slave_configure,
-	.ioctl			= idescsi_ioctl,
-	.queuecommand		= idescsi_queue,
-	.eh_abort_handler	= idescsi_eh_abort,
-	.eh_host_reset_handler  = idescsi_eh_reset,
-	.bios_param		= idescsi_bios,
-	.can_queue		= 40,
-	.this_id		= -1,
-	.sg_tablesize		= 256,
-	.cmd_per_lun		= 5,
-	.max_sectors		= 128,
-	.use_clustering		= DISABLE_CLUSTERING,
-	.emulated		= 1,
-	.proc_name		= "ide-scsi",
-};
-
-static int ide_scsi_probe(ide_drive_t *drive)
-{
-	idescsi_scsi_t *idescsi;
-	struct Scsi_Host *host;
-	struct gendisk *g;
-	static int warned;
-	int err = -ENOMEM;
-	u16 last_lun;
-
-	if (!warned && drive->media == ide_cdrom) {
-		printk(KERN_WARNING "ide-scsi is deprecated for cd burning! Use ide-cd and give dev=/dev/hdX as device\n");
-		warned = 1;
-	}
-
-	if (idescsi_nocd && drive->media == ide_cdrom)
-		return -ENODEV;
-
-	if (!strstr("ide-scsi", drive->driver_req) ||
-	    drive->media == ide_disk ||
-	    !(host = scsi_host_alloc(&idescsi_template,sizeof(idescsi_scsi_t))))
-		return -ENODEV;
-
-	drive->dev_flags |= IDE_DFLAG_SCSI;
-
-	g = alloc_disk(1 << PARTN_BITS);
-	if (!g)
-		goto out_host_put;
-
-	ide_init_disk(g, drive);
-
-	host->max_id = 1;
-
-	last_lun = drive->id[ATA_ID_LAST_LUN];
-	if (last_lun)
-		debug_log("%s: last_lun=%u\n", drive->name, last_lun);
-
-	if ((last_lun & 7) != 7)
-		host->max_lun = (last_lun & 7) + 1;
-	else
-		host->max_lun = 1;
-
-	drive->driver_data = host;
-	idescsi = scsihost_to_idescsi(host);
-	idescsi->drive = drive;
-	idescsi->driver = &idescsi_driver;
-	idescsi->host = host;
-	idescsi->disk = g;
-	g->private_data = &idescsi->driver;
-	err = 0;
-	idescsi_setup(drive, idescsi);
-	g->fops = &idescsi_ops;
-	ide_register_region(g);
-	err = scsi_add_host(host, &drive->gendev);
-	if (!err) {
-		scsi_scan_host(host);
-		return 0;
-	}
-	/* fall through on error */
-	ide_unregister_region(g);
-	ide_proc_unregister_driver(drive, &idescsi_driver);
-
-	put_disk(g);
-out_host_put:
-	drive->dev_flags &= ~IDE_DFLAG_SCSI;
-	scsi_host_put(host);
-	return err;
-}
-
-static int __init init_idescsi_module(void)
-{
-	return driver_register(&idescsi_driver.gen_driver);
-}
-
-static void __exit exit_idescsi_module(void)
-{
-	driver_unregister(&idescsi_driver.gen_driver);
-}
-
-module_param(idescsi_nocd, int, 0600);
-MODULE_PARM_DESC(idescsi_nocd, "Disable handling of CD-ROMs so they may be driven by ide-cd");
-module_init(init_idescsi_module);
-module_exit(exit_idescsi_module);
-MODULE_LICENSE("GPL");
-- 
1.5.5.GIT


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03  1:38 [PATCH] remove ide-scsi FUJITA Tomonori
@ 2008-12-03 10:06 ` Christoph Hellwig
  2008-12-03 13:31   ` Willem Riede
  2008-12-03 15:09   ` James Bottomley
  0 siblings, 2 replies; 65+ messages in thread
From: Christoph Hellwig @ 2008-12-03 10:06 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: bzolnier, James.Bottomley, linux-ide, linux-scsi, osst

On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
> This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
> 
> It's against linux-next (which seems to has some changes to ide-scsi
> for 2.6.29 from the ide tree). 

Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
the osst driver?


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 10:06 ` Christoph Hellwig
@ 2008-12-03 13:31   ` Willem Riede
  2008-12-03 13:55     ` Matthew Wilcox
  2008-12-03 15:09   ` James Bottomley
  1 sibling, 1 reply; 65+ messages in thread
From: Willem Riede @ 2008-12-03 13:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: FUJITA Tomonori, bzolnier, James.Bottomley, linux-ide, linux-scsi

On Wed, Dec 3, 2008 at 3:06 AM, Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
> > This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
> >
> > It's against linux-next (which seems to has some changes to ide-scsi
> > for 2.6.29 from the ide tree).
>
> Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
> the osst driver?
>
You are right. There are IDE versions of the OnStream drives that need ide-scsi.
There are also SCSI versions of the OnStream, which work with osst directly.

Regards, Willem Riede.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 13:31   ` Willem Riede
@ 2008-12-03 13:55     ` Matthew Wilcox
  2008-12-03 14:02       ` Alan Cox
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-03 13:55 UTC (permalink / raw)
  To: Willem Riede
  Cc: Christoph Hellwig, FUJITA Tomonori, bzolnier, James.Bottomley,
	linux-ide, linux-scsi

On Wed, Dec 03, 2008 at 06:31:15AM -0700, Willem Riede wrote:
> On Wed, Dec 3, 2008 at 3:06 AM, Christoph Hellwig <hch@infradead.org> wrote:
> > Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
> > the osst driver?
>
> You are right. There are IDE versions of the OnStream drives that need ide-scsi.
> There are also SCSI versions of the OnStream, which work with osst directly.

Why can't osst / libata support the IDE versions of the OnStream drives?
What code needs to be written?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 13:55     ` Matthew Wilcox
@ 2008-12-03 14:02       ` Alan Cox
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Cox @ 2008-12-03 14:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Willem Riede, Christoph Hellwig, FUJITA Tomonori, bzolnier,
	James.Bottomley, linux-ide, linux-scsi

> > You are right. There are IDE versions of the OnStream drives that need ide-scsi.
> > There are also SCSI versions of the OnStream, which work with osst directly.
> 
> Why can't osst / libata support the IDE versions of the OnStream drives?
> What code needs to be written?

They should work just fine that way.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 10:06 ` Christoph Hellwig
  2008-12-03 13:31   ` Willem Riede
@ 2008-12-03 15:09   ` James Bottomley
  2008-12-06  6:12     ` Pete Zaitcev
  2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
  1 sibling, 2 replies; 65+ messages in thread
From: James Bottomley @ 2008-12-03 15:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: FUJITA Tomonori, bzolnier, linux-ide, linux-scsi, osst

On Wed, 2008-12-03 at 05:06 -0500, Christoph Hellwig wrote:
> On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
> > This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
> > 
> > It's against linux-next (which seems to has some changes to ide-scsi
> > for 2.6.29 from the ide tree). 
> 
> Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
> the osst driver?

Depends.  If you're still using drivers/ide then yes, it is.  With
libata (which is what most modern distros use), osst just works as an
ATAPI transport.

git log tells me ide-scsi has been updated quite a bit recently, but it
mostly looks to be fallout around the drivers/ide churn.  Can we get ide
maintainer's buy in for this (I think they've been maintaining ide-tape
and ide-cd in preference to ide-scsi)?

James



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 15:09   ` James Bottomley
@ 2008-12-06  6:12     ` Pete Zaitcev
  2008-12-06 14:06       ` Bartlomiej Zolnierkiewicz
  2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 65+ messages in thread
From: Pete Zaitcev @ 2008-12-06  6:12 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, FUJITA Tomonori, bzolnier, linux-ide,
	linux-scsi, osst

On Wed, 03 Dec 2008 09:09:00 -0600, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> git log tells me ide-scsi has been updated quite a bit recently, but it
> mostly looks to be fallout around the drivers/ide churn.  Can we get ide
> maintainer's buy in for this (I think they've been maintaining ide-tape
> and ide-cd in preference to ide-scsi)?

They must be bonkers to do it. I fixed a couple of bugs in ide-tape
and it was one of the worst drivers I worked on, ever (worse than
floppy.c, I kid you not). It definitely must die before ide-scsi.
Fortunately, I solved this issue by hooking my tape to USB bridge,
or else I'd be up in arms and wanting Tomo's blood.

-- Pete

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06  6:12     ` Pete Zaitcev
@ 2008-12-06 14:06       ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 14:06 UTC (permalink / raw)
  To: Pete Zaitcev
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Saturday 06 December 2008, Pete Zaitcev wrote:
> On Wed, 03 Dec 2008 09:09:00 -0600, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > git log tells me ide-scsi has been updated quite a bit recently, but it
> > mostly looks to be fallout around the drivers/ide churn.  Can we get ide
> > maintainer's buy in for this (I think they've been maintaining ide-tape
> > and ide-cd in preference to ide-scsi)?
> 
> They must be bonkers to do it. I fixed a couple of bugs in ide-tape
> and it was one of the worst drivers I worked on, ever (worse than
> floppy.c, I kid you not). It definitely must die before ide-scsi.
> Fortunately, I solved this issue by hooking my tape to USB bridge,
> or else I'd be up in arms and wanting Tomo's blood.

Pete, I don't mean any disrespect or to deny your credit for your past
contributions but it must have been "quite a some" time ago since you were
last working on ide-tape. ;-)

[ To be sure I checked both Linus' tree (which goes back to Apr 2005,
  2.6.12 timeframe) and Thomas' historical git tree (Feb 2002 / 2.4.1)
  -- I don't see any IDE changes from you there. ]

In the recent years Borislav and me put substantial work into putting
ide-tape into a sane state (mainly as a by-product of working on generic
ATAPI support) and although many improvements are still possible (BTW
patches are always warmly welcomed) it doesn't contain any horrors like
idescsi_eh_abort() / idescsi_eh_reset() in ide-scsi.  Moreover ide-tape
is of much less complexity than SCSI st driver (afterall ide-tape.c is
now almost half of the size / lines of code of st.c).

See for yourself, ide-tape.c from Linus' tree:

/*
 * IDE ATAPI streaming tape driver.
 *
 * Copyright (C) 1995-1999  Gadi Oxman <gadio@netvision.net.il>
 * Copyright (C) 2003-2005  Bartlomiej Zolnierkiewicz
 *
 * This driver was constructed as a student project in the software laboratory
 * of the faculty of electrical engineering in the Technion - Israel's
 * Institute Of Technology, with the guide of Avner Lottem and Dr. Ilana David.
 *
 * It is hereby placed under the terms of the GNU general public license.
 * (See linux/COPYING).
 *
 * For a historical changelog see
 * Documentation/ide/ChangeLog.ide-tape.1995-2002
 */

#define DRV_NAME "ide-tape"

#define IDETAPE_VERSION "1.20"

#include <linux/module.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/delay.h>
#include <linux/timer.h>
#include <linux/mm.h>
#include <linux/interrupt.h>
#include <linux/jiffies.h>
#include <linux/major.h>
#include <linux/errno.h>
#include <linux/genhd.h>
#include <linux/slab.h>
#include <linux/pci.h>
#include <linux/ide.h>
#include <linux/smp_lock.h>
#include <linux/completion.h>
#include <linux/bitops.h>
#include <linux/mutex.h>
#include <scsi/scsi.h>

#include <asm/byteorder.h>
#include <linux/irq.h>
#include <linux/uaccess.h>
#include <linux/io.h>
#include <asm/unaligned.h>
#include <linux/mtio.h>

enum {
	/* output errors only */
	DBG_ERR =		(1 << 0),
	/* output all sense key/asc */
	DBG_SENSE =		(1 << 1),
	/* info regarding all chrdev-related procedures */
	DBG_CHRDEV =		(1 << 2),
	/* all remaining procedures */
	DBG_PROCS =		(1 << 3),
};

/* define to see debug info */
#define IDETAPE_DEBUG_LOG		0

#if IDETAPE_DEBUG_LOG
#define debug_log(lvl, fmt, args...)			\
{							\
	if (tape->debug_mask & lvl)			\
	printk(KERN_INFO "ide-tape: " fmt, ## args);	\
}
#else
#define debug_log(lvl, fmt, args...) do {} while (0)
#endif

/**************************** Tunable parameters *****************************/
/*
 * After each failed packet command we issue a request sense command and retry
 * the packet command IDETAPE_MAX_PC_RETRIES times.
 *
 * Setting IDETAPE_MAX_PC_RETRIES to 0 will disable retries.
 */
#define IDETAPE_MAX_PC_RETRIES		3

/*
 * The following parameter is used to select the point in the internal tape fifo
 * in which we will start to refill the buffer. Decreasing the following
 * parameter will improve the system's latency and interactive response, while
 * using a high value might improve system throughput.
 */
#define IDETAPE_FIFO_THRESHOLD		2

/*
 * DSC polling parameters.
 *
 * Polling for DSC (a single bit in the status register) is a very important
 * function in ide-tape. There are two cases in which we poll for DSC:
 *
 * 1. Before a read/write packet command, to ensure that we can transfer data
 * from/to the tape's data buffers, without causing an actual media access.
 * In case the tape is not ready yet, we take out our request from the device
 * request queue, so that ide.c could service requests from the other device
 * on the same interface in the meantime.
 *
 * 2. After the successful initialization of a "media access packet command",
 * which is a command that can take a long time to complete (the interval can
 * range from several seconds to even an hour). Again, we postpone our request
 * in the middle to free the bus for the other device. The polling frequency
 * here should be lower than the read/write frequency since those media access
 * commands are slow. We start from a "fast" frequency - IDETAPE_DSC_MA_FAST
 * (1 second), and if we don't receive DSC after IDETAPE_DSC_MA_THRESHOLD
 * (5 min), we switch it to a lower frequency - IDETAPE_DSC_MA_SLOW (1 min).
 *
 * We also set a timeout for the timer, in case something goes wrong. The
 * timeout should be longer then the maximum execution time of a tape operation.
 */

/* DSC timings. */
#define IDETAPE_DSC_RW_MIN		5*HZ/100	/* 50 msec */
#define IDETAPE_DSC_RW_MAX		40*HZ/100	/* 400 msec */
#define IDETAPE_DSC_RW_TIMEOUT		2*60*HZ		/* 2 minutes */
#define IDETAPE_DSC_MA_FAST		2*HZ		/* 2 seconds */
#define IDETAPE_DSC_MA_THRESHOLD	5*60*HZ		/* 5 minutes */
#define IDETAPE_DSC_MA_SLOW		30*HZ		/* 30 seconds */
#define IDETAPE_DSC_MA_TIMEOUT		2*60*60*HZ	/* 2 hours */

/*************************** End of tunable parameters ***********************/

/* tape directions */
enum {
	IDETAPE_DIR_NONE  = (1 << 0),
	IDETAPE_DIR_READ  = (1 << 1),
	IDETAPE_DIR_WRITE = (1 << 2),
};

struct idetape_bh {
	u32 b_size;
	atomic_t b_count;
	struct idetape_bh *b_reqnext;
	char *b_data;
};

/* Tape door status */
#define DOOR_UNLOCKED			0
#define DOOR_LOCKED			1
#define DOOR_EXPLICITLY_LOCKED		2

/* Some defines for the SPACE command */
#define IDETAPE_SPACE_OVER_FILEMARK	1
#define IDETAPE_SPACE_TO_EOD		3

/* Some defines for the LOAD UNLOAD command */
#define IDETAPE_LU_LOAD_MASK		1
#define IDETAPE_LU_RETENSION_MASK	2
#define IDETAPE_LU_EOT_MASK		4

/* Error codes returned in rq->errors to the higher part of the driver. */
#define IDETAPE_ERROR_GENERAL		101
#define IDETAPE_ERROR_FILEMARK		102
#define IDETAPE_ERROR_EOD		103

/* Structures related to the SELECT SENSE / MODE SENSE packet commands. */
#define IDETAPE_BLOCK_DESCRIPTOR	0
#define IDETAPE_CAPABILITIES_PAGE	0x2a

/*
 * Most of our global data which we need to save even as we leave the driver due
 * to an interrupt or a timer event is stored in the struct defined below.
 */
typedef struct ide_tape_obj {
	ide_drive_t	*drive;
	ide_driver_t	*driver;
	struct gendisk	*disk;
	struct kref	kref;

	/*
	 *	failed_pc points to the last failed packet command, or contains
	 *	NULL if we do not need to retry any packet command. This is
	 *	required since an additional packet command is needed before the
	 *	retry, to get detailed information on what went wrong.
	 */
	/* Last failed packet command */
	struct ide_atapi_pc *failed_pc;
	/* used by REQ_IDETAPE_{READ,WRITE} requests */
	struct ide_atapi_pc queued_pc;

	/*
	 * DSC polling variables.
	 *
	 * While polling for DSC we use postponed_rq to postpone the current
	 * request so that ide.c will be able to service pending requests on the
	 * other device. Note that at most we will have only one DSC (usually
	 * data transfer) request in the device request queue.
	 */
	struct request *postponed_rq;
	/* The time in which we started polling for DSC */
	unsigned long dsc_polling_start;
	/* Timer used to poll for dsc */
	struct timer_list dsc_timer;
	/* Read/Write dsc polling frequency */
	unsigned long best_dsc_rw_freq;
	unsigned long dsc_poll_freq;
	unsigned long dsc_timeout;

	/* Read position information */
	u8 partition;
	/* Current block */
	unsigned int first_frame;

	/* Last error information */
	u8 sense_key, asc, ascq;

	/* Character device operation */
	unsigned int minor;
	/* device name */
	char name[4];
	/* Current character device data transfer direction */
	u8 chrdev_dir;

	/* tape block size, usually 512 or 1024 bytes */
	unsigned short blk_size;
	int user_bs_factor;

	/* Copy of the tape's Capabilities and Mechanical Page */
	u8 caps[20];

	/*
	 * Active data transfer request parameters.
	 *
	 * At most, there is only one ide-tape originated data transfer request
	 * in the device request queue. This allows ide.c to easily service
	 * requests from the other device when we postpone our active request.
	 */

	/* Data buffer size chosen based on the tape's recommendation */
	int buffer_size;
	/* merge buffer */
	struct idetape_bh *merge_bh;
	/* size of the merge buffer */
	int merge_bh_size;
	/* pointer to current buffer head within the merge buffer */
	struct idetape_bh *bh;
	char *b_data;
	int b_count;

	int pages_per_buffer;
	/* Wasted space in each stage */
	int excess_bh_size;

	/* protects the ide-tape queue */
	spinlock_t lock;

	/* Measures average tape speed */
	unsigned long avg_time;
	int avg_size;
	int avg_speed;

	/* the door is currently locked */
	int door_locked;
	/* the tape hardware is write protected */
	char drv_write_prot;
	/* the tape is write protected (hardware or opened as read-only) */
	char write_prot;

	u32 debug_mask;
} idetape_tape_t;

static DEFINE_MUTEX(idetape_ref_mutex);

static struct class *idetape_sysfs_class;

static void ide_tape_release(struct kref *);

static struct ide_tape_obj *ide_tape_get(struct gendisk *disk)
{
	struct ide_tape_obj *tape = NULL;

	mutex_lock(&idetape_ref_mutex);
	tape = ide_drv_g(disk, ide_tape_obj);
	if (tape) {
		if (ide_device_get(tape->drive))
			tape = NULL;
		else
			kref_get(&tape->kref);
	}
	mutex_unlock(&idetape_ref_mutex);
	return tape;
}

static void ide_tape_put(struct ide_tape_obj *tape)
{
	ide_drive_t *drive = tape->drive;

	mutex_lock(&idetape_ref_mutex);
	kref_put(&tape->kref, ide_tape_release);
	ide_device_put(drive);
	mutex_unlock(&idetape_ref_mutex);
}

/*
 * The variables below are used for the character device interface. Additional
 * state variables are defined in our ide_drive_t structure.
 */
static struct ide_tape_obj *idetape_devs[MAX_HWIFS * MAX_DRIVES];

static struct ide_tape_obj *ide_tape_chrdev_get(unsigned int i)
{
	struct ide_tape_obj *tape = NULL;

	mutex_lock(&idetape_ref_mutex);
	tape = idetape_devs[i];
	if (tape)
		kref_get(&tape->kref);
	mutex_unlock(&idetape_ref_mutex);
	return tape;
}

static void idetape_input_buffers(ide_drive_t *drive, struct ide_atapi_pc *pc,
				  unsigned int bcount)
{
	struct idetape_bh *bh = pc->bh;
	int count;

	while (bcount) {
		if (bh == NULL) {
			printk(KERN_ERR "ide-tape: bh == NULL in "
				"idetape_input_buffers\n");
			ide_pad_transfer(drive, 0, bcount);
			return;
		}
		count = min(
			(unsigned int)(bh->b_size - atomic_read(&bh->b_count)),
			bcount);
		drive->hwif->tp_ops->input_data(drive, NULL, bh->b_data +
					atomic_read(&bh->b_count), count);
		bcount -= count;
		atomic_add(count, &bh->b_count);
		if (atomic_read(&bh->b_count) == bh->b_size) {
			bh = bh->b_reqnext;
			if (bh)
				atomic_set(&bh->b_count, 0);
		}
	}
	pc->bh = bh;
}

static void idetape_output_buffers(ide_drive_t *drive, struct ide_atapi_pc *pc,
				   unsigned int bcount)
{
	struct idetape_bh *bh = pc->bh;
	int count;

	while (bcount) {
		if (bh == NULL) {
			printk(KERN_ERR "ide-tape: bh == NULL in %s\n",
					__func__);
			return;
		}
		count = min((unsigned int)pc->b_count, (unsigned int)bcount);
		drive->hwif->tp_ops->output_data(drive, NULL, pc->b_data, count);
		bcount -= count;
		pc->b_data += count;
		pc->b_count -= count;
		if (!pc->b_count) {
			bh = bh->b_reqnext;
			pc->bh = bh;
			if (bh) {
				pc->b_data = bh->b_data;
				pc->b_count = atomic_read(&bh->b_count);
			}
		}
	}
}

static void idetape_update_buffers(ide_drive_t *drive, struct ide_atapi_pc *pc)
{
	struct idetape_bh *bh = pc->bh;
	int count;
	unsigned int bcount = pc->xferred;

	if (pc->flags & PC_FLAG_WRITING)
		return;
	while (bcount) {
		if (bh == NULL) {
			printk(KERN_ERR "ide-tape: bh == NULL in %s\n",
					__func__);
			return;
		}
		count = min((unsigned int)bh->b_size, (unsigned int)bcount);
		atomic_set(&bh->b_count, count);
		if (atomic_read(&bh->b_count) == bh->b_size)
			bh = bh->b_reqnext;
		bcount -= count;
	}
	pc->bh = bh;
}

/*
 * called on each failed packet command retry to analyze the request sense. We
 * currently do not utilize this information.
 */
static void idetape_analyze_error(ide_drive_t *drive, u8 *sense)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc *pc = tape->failed_pc;

	tape->sense_key = sense[2] & 0xF;
	tape->asc       = sense[12];
	tape->ascq      = sense[13];

	debug_log(DBG_ERR, "pc = %x, sense key = %x, asc = %x, ascq = %x\n",
		 pc->c[0], tape->sense_key, tape->asc, tape->ascq);

	/* Correct pc->xferred by asking the tape.	 */
	if (pc->flags & PC_FLAG_DMA_ERROR) {
		pc->xferred = pc->req_xfer -
			tape->blk_size *
			get_unaligned_be32(&sense[3]);
		idetape_update_buffers(drive, pc);
	}

	/*
	 * If error was the result of a zero-length read or write command,
	 * with sense key=5, asc=0x22, ascq=0, let it slide.  Some drives
	 * (i.e. Seagate STT3401A Travan) don't support 0-length read/writes.
	 */
	if ((pc->c[0] == READ_6 || pc->c[0] == WRITE_6)
	    /* length == 0 */
	    && pc->c[4] == 0 && pc->c[3] == 0 && pc->c[2] == 0) {
		if (tape->sense_key == 5) {
			/* don't report an error, everything's ok */
			pc->error = 0;
			/* don't retry read/write */
			pc->flags |= PC_FLAG_ABORT;
		}
	}
	if (pc->c[0] == READ_6 && (sense[2] & 0x80)) {
		pc->error = IDETAPE_ERROR_FILEMARK;
		pc->flags |= PC_FLAG_ABORT;
	}
	if (pc->c[0] == WRITE_6) {
		if ((sense[2] & 0x40) || (tape->sense_key == 0xd
		     && tape->asc == 0x0 && tape->ascq == 0x2)) {
			pc->error = IDETAPE_ERROR_EOD;
			pc->flags |= PC_FLAG_ABORT;
		}
	}
	if (pc->c[0] == READ_6 || pc->c[0] == WRITE_6) {
		if (tape->sense_key == 8) {
			pc->error = IDETAPE_ERROR_EOD;
			pc->flags |= PC_FLAG_ABORT;
		}
		if (!(pc->flags & PC_FLAG_ABORT) &&
		    pc->xferred)
			pc->retries = IDETAPE_MAX_PC_RETRIES + 1;
	}
}

/* Free data buffers completely. */
static void ide_tape_kfree_buffer(idetape_tape_t *tape)
{
	struct idetape_bh *prev_bh, *bh = tape->merge_bh;

	while (bh) {
		u32 size = bh->b_size;

		while (size) {
			unsigned int order = fls(size >> PAGE_SHIFT)-1;

			if (bh->b_data)
				free_pages((unsigned long)bh->b_data, order);

			size &= (order-1);
			bh->b_data += (1 << order) * PAGE_SIZE;
		}
		prev_bh = bh;
		bh = bh->b_reqnext;
		kfree(prev_bh);
	}
}

static int idetape_end_request(ide_drive_t *drive, int uptodate, int nr_sects)
{
	struct request *rq = HWGROUP(drive)->rq;
	idetape_tape_t *tape = drive->driver_data;
	unsigned long flags;
	int error;

	debug_log(DBG_PROCS, "Enter %s\n", __func__);

	switch (uptodate) {
	case 0:	error = IDETAPE_ERROR_GENERAL; break;
	case 1: error = 0; break;
	default: error = uptodate;
	}
	rq->errors = error;
	if (error)
		tape->failed_pc = NULL;

	if (!blk_special_request(rq)) {
		ide_end_request(drive, uptodate, nr_sects);
		return 0;
	}

	spin_lock_irqsave(&tape->lock, flags);

	ide_end_drive_cmd(drive, 0, 0);

	spin_unlock_irqrestore(&tape->lock, flags);
	return 0;
}

static void ide_tape_handle_dsc(ide_drive_t *);

static void ide_tape_callback(ide_drive_t *drive, int dsc)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc *pc = drive->pc;
	int uptodate = pc->error ? 0 : 1;

	debug_log(DBG_PROCS, "Enter %s\n", __func__);

	if (dsc)
		ide_tape_handle_dsc(drive);

	if (tape->failed_pc == pc)
		tape->failed_pc = NULL;

	if (pc->c[0] == REQUEST_SENSE) {
		if (uptodate)
			idetape_analyze_error(drive, pc->buf);
		else
			printk(KERN_ERR "ide-tape: Error in REQUEST SENSE "
					"itself - Aborting request!\n");
	} else if (pc->c[0] == READ_6 || pc->c[0] == WRITE_6) {
		struct request *rq = drive->hwif->hwgroup->rq;
		int blocks = pc->xferred / tape->blk_size;

		tape->avg_size += blocks * tape->blk_size;

		if (time_after_eq(jiffies, tape->avg_time + HZ)) {
			tape->avg_speed = tape->avg_size * HZ /
				(jiffies - tape->avg_time) / 1024;
			tape->avg_size = 0;
			tape->avg_time = jiffies;
		}

		tape->first_frame += blocks;
		rq->current_nr_sectors -= blocks;

		if (pc->error)
			uptodate = pc->error;
	} else if (pc->c[0] == READ_POSITION && uptodate) {
		u8 *readpos = pc->buf;

		debug_log(DBG_SENSE, "BOP - %s\n",
				(readpos[0] & 0x80) ? "Yes" : "No");
		debug_log(DBG_SENSE, "EOP - %s\n",
				(readpos[0] & 0x40) ? "Yes" : "No");

		if (readpos[0] & 0x4) {
			printk(KERN_INFO "ide-tape: Block location is unknown"
					 "to the tape\n");
			clear_bit(IDE_AFLAG_ADDRESS_VALID, &drive->atapi_flags);
			uptodate = 0;
		} else {
			debug_log(DBG_SENSE, "Block Location - %u\n",
					be32_to_cpup((__be32 *)&readpos[4]));

			tape->partition = readpos[1];
			tape->first_frame = be32_to_cpup((__be32 *)&readpos[4]);
			set_bit(IDE_AFLAG_ADDRESS_VALID, &drive->atapi_flags);
		}
	}

	idetape_end_request(drive, uptodate, 0);
}

/*
 * Postpone the current request so that ide.c will be able to service requests
 * from another device on the same hwgroup while we are polling for DSC.
 */
static void idetape_postpone_request(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;

	debug_log(DBG_PROCS, "Enter %s\n", __func__);

	tape->postponed_rq = HWGROUP(drive)->rq;
	ide_stall_queue(drive, tape->dsc_poll_freq);
}

static void ide_tape_handle_dsc(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;

	/* Media access command */
	tape->dsc_polling_start = jiffies;
	tape->dsc_poll_freq = IDETAPE_DSC_MA_FAST;
	tape->dsc_timeout = jiffies + IDETAPE_DSC_MA_TIMEOUT;
	/* Allow ide.c to handle other requests */
	idetape_postpone_request(drive);
}

static int ide_tape_io_buffers(ide_drive_t *drive, struct ide_atapi_pc *pc,
				unsigned int bcount, int write)
{
	if (write)
		idetape_output_buffers(drive, pc, bcount);
	else
		idetape_input_buffers(drive, pc, bcount);

	return bcount;
}

/*
 * Packet Command Interface
 *
 * The current Packet Command is available in drive->pc, and will not change
 * until we finish handling it. Each packet command is associated with a
 * callback function that will be called when the command is finished.
 *
 * The handling will be done in three stages:
 *
 * 1. idetape_issue_pc will send the packet command to the drive, and will set
 * the interrupt handler to ide_pc_intr.
 *
 * 2. On each interrupt, ide_pc_intr will be called. This step will be
 * repeated until the device signals us that no more interrupts will be issued.
 *
 * 3. ATAPI Tape media access commands have immediate status with a delayed
 * process. In case of a successful initiation of a media access packet command,
 * the DSC bit will be set when the actual execution of the command is finished.
 * Since the tape drive will not issue an interrupt, we have to poll for this
 * event. In this case, we define the request as "low priority request" by
 * setting rq_status to IDETAPE_RQ_POSTPONED, set a timer to poll for DSC and
 * exit the driver.
 *
 * ide.c will then give higher priority to requests which originate from the
 * other device, until will change rq_status to RQ_ACTIVE.
 *
 * 4. When the packet command is finished, it will be checked for errors.
 *
 * 5. In case an error was found, we queue a request sense packet command in
 * front of the request queue and retry the operation up to
 * IDETAPE_MAX_PC_RETRIES times.
 *
 * 6. In case no error was found, or we decided to give up and not to retry
 * again, the callback function will be called and then we will handle the next
 * request.
 */

static ide_startstop_t idetape_issue_pc(ide_drive_t *drive,
		struct ide_atapi_pc *pc)
{
	idetape_tape_t *tape = drive->driver_data;

	if (drive->pc->c[0] == REQUEST_SENSE &&
	    pc->c[0] == REQUEST_SENSE) {
		printk(KERN_ERR "ide-tape: possible ide-tape.c bug - "
			"Two request sense in serial were issued\n");
	}

	if (tape->failed_pc == NULL && pc->c[0] != REQUEST_SENSE)
		tape->failed_pc = pc;

	/* Set the current packet command */
	drive->pc = pc;

	if (pc->retries > IDETAPE_MAX_PC_RETRIES ||
		(pc->flags & PC_FLAG_ABORT)) {
		/*
		 * We will "abort" retrying a packet command in case legitimate
		 * error code was received (crossing a filemark, or end of the
		 * media, for example).
		 */
		if (!(pc->flags & PC_FLAG_ABORT)) {
			if (!(pc->c[0] == TEST_UNIT_READY &&
			      tape->sense_key == 2 && tape->asc == 4 &&
			     (tape->ascq == 1 || tape->ascq == 8))) {
				printk(KERN_ERR "ide-tape: %s: I/O error, "
						"pc = %2x, key = %2x, "
						"asc = %2x, ascq = %2x\n",
						tape->name, pc->c[0],
						tape->sense_key, tape->asc,
						tape->ascq);
			}
			/* Giving up */
			pc->error = IDETAPE_ERROR_GENERAL;
		}
		tape->failed_pc = NULL;
		drive->pc_callback(drive, 0);
		return ide_stopped;
	}
	debug_log(DBG_SENSE, "Retry #%d, cmd = %02X\n", pc->retries, pc->c[0]);

	pc->retries++;

	return ide_issue_pc(drive, WAIT_TAPE_CMD, NULL);
}

/* A mode sense command is used to "sense" tape parameters. */
static void idetape_create_mode_sense_cmd(struct ide_atapi_pc *pc, u8 page_code)
{
	ide_init_pc(pc);
	pc->c[0] = MODE_SENSE;
	if (page_code != IDETAPE_BLOCK_DESCRIPTOR)
		/* DBD = 1 - Don't return block descriptors */
		pc->c[1] = 8;
	pc->c[2] = page_code;
	/*
	 * Changed pc->c[3] to 0 (255 will at best return unused info).
	 *
	 * For SCSI this byte is defined as subpage instead of high byte
	 * of length and some IDE drives seem to interpret it this way
	 * and return an error when 255 is used.
	 */
	pc->c[3] = 0;
	/* We will just discard data in that case */
	pc->c[4] = 255;
	if (page_code == IDETAPE_BLOCK_DESCRIPTOR)
		pc->req_xfer = 12;
	else if (page_code == IDETAPE_CAPABILITIES_PAGE)
		pc->req_xfer = 24;
	else
		pc->req_xfer = 50;
}

static ide_startstop_t idetape_media_access_finished(ide_drive_t *drive)
{
	ide_hwif_t *hwif = drive->hwif;
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc *pc = drive->pc;
	u8 stat;

	stat = hwif->tp_ops->read_status(hwif);

	if (stat & ATA_DSC) {
		if (stat & ATA_ERR) {
			/* Error detected */
			if (pc->c[0] != TEST_UNIT_READY)
				printk(KERN_ERR "ide-tape: %s: I/O error, ",
						tape->name);
			/* Retry operation */
			ide_retry_pc(drive, tape->disk);
			return ide_stopped;
		}
		pc->error = 0;
	} else {
		pc->error = IDETAPE_ERROR_GENERAL;
		tape->failed_pc = NULL;
	}
	drive->pc_callback(drive, 0);
	return ide_stopped;
}

static void ide_tape_create_rw_cmd(idetape_tape_t *tape,
				   struct ide_atapi_pc *pc, struct request *rq,
				   u8 opcode)
{
	struct idetape_bh *bh = (struct idetape_bh *)rq->special;
	unsigned int length = rq->current_nr_sectors;

	ide_init_pc(pc);
	put_unaligned(cpu_to_be32(length), (unsigned int *) &pc->c[1]);
	pc->c[1] = 1;
	pc->bh = bh;
	pc->buf = NULL;
	pc->buf_size = length * tape->blk_size;
	pc->req_xfer = pc->buf_size;
	if (pc->req_xfer == tape->buffer_size)
		pc->flags |= PC_FLAG_DMA_OK;

	if (opcode == READ_6) {
		pc->c[0] = READ_6;
		atomic_set(&bh->b_count, 0);
	} else if (opcode == WRITE_6) {
		pc->c[0] = WRITE_6;
		pc->flags |= PC_FLAG_WRITING;
		pc->b_data = bh->b_data;
		pc->b_count = atomic_read(&bh->b_count);
	}

	memcpy(rq->cmd, pc->c, 12);
}

static ide_startstop_t idetape_do_request(ide_drive_t *drive,
					  struct request *rq, sector_t block)
{
	ide_hwif_t *hwif = drive->hwif;
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc *pc = NULL;
	struct request *postponed_rq = tape->postponed_rq;
	u8 stat;

	debug_log(DBG_SENSE, "sector: %llu, nr_sectors: %lu,"
			" current_nr_sectors: %u\n",
			(unsigned long long)rq->sector, rq->nr_sectors,
			rq->current_nr_sectors);

	if (!blk_special_request(rq)) {
		/* We do not support buffer cache originated requests. */
		printk(KERN_NOTICE "ide-tape: %s: Unsupported request in "
			"request queue (%d)\n", drive->name, rq->cmd_type);
		ide_end_request(drive, 0, 0);
		return ide_stopped;
	}

	/* Retry a failed packet command */
	if (tape->failed_pc && drive->pc->c[0] == REQUEST_SENSE) {
		pc = tape->failed_pc;
		goto out;
	}

	if (postponed_rq != NULL)
		if (rq != postponed_rq) {
			printk(KERN_ERR "ide-tape: ide-tape.c bug - "
					"Two DSC requests were queued\n");
			idetape_end_request(drive, 0, 0);
			return ide_stopped;
		}

	tape->postponed_rq = NULL;

	/*
	 * If the tape is still busy, postpone our request and service
	 * the other device meanwhile.
	 */
	stat = hwif->tp_ops->read_status(hwif);

	if ((drive->dev_flags & IDE_DFLAG_DSC_OVERLAP) == 0 &&
	    (rq->cmd[13] & REQ_IDETAPE_PC2) == 0)
		set_bit(IDE_AFLAG_IGNORE_DSC, &drive->atapi_flags);

	if (drive->dev_flags & IDE_DFLAG_POST_RESET) {
		set_bit(IDE_AFLAG_IGNORE_DSC, &drive->atapi_flags);
		drive->dev_flags &= ~IDE_DFLAG_POST_RESET;
	}

	if (!test_and_clear_bit(IDE_AFLAG_IGNORE_DSC, &drive->atapi_flags) &&
	    (stat & ATA_DSC) == 0) {
		if (postponed_rq == NULL) {
			tape->dsc_polling_start = jiffies;
			tape->dsc_poll_freq = tape->best_dsc_rw_freq;
			tape->dsc_timeout = jiffies + IDETAPE_DSC_RW_TIMEOUT;
		} else if (time_after(jiffies, tape->dsc_timeout)) {
			printk(KERN_ERR "ide-tape: %s: DSC timeout\n",
				tape->name);
			if (rq->cmd[13] & REQ_IDETAPE_PC2) {
				idetape_media_access_finished(drive);
				return ide_stopped;
			} else {
				return ide_do_reset(drive);
			}
		} else if (time_after(jiffies,
					tape->dsc_polling_start +
					IDETAPE_DSC_MA_THRESHOLD))
			tape->dsc_poll_freq = IDETAPE_DSC_MA_SLOW;
		idetape_postpone_request(drive);
		return ide_stopped;
	}
	if (rq->cmd[13] & REQ_IDETAPE_READ) {
		pc = &tape->queued_pc;
		ide_tape_create_rw_cmd(tape, pc, rq, READ_6);
		goto out;
	}
	if (rq->cmd[13] & REQ_IDETAPE_WRITE) {
		pc = &tape->queued_pc;
		ide_tape_create_rw_cmd(tape, pc, rq, WRITE_6);
		goto out;
	}
	if (rq->cmd[13] & REQ_IDETAPE_PC1) {
		pc = (struct ide_atapi_pc *) rq->buffer;
		rq->cmd[13] &= ~(REQ_IDETAPE_PC1);
		rq->cmd[13] |= REQ_IDETAPE_PC2;
		goto out;
	}
	if (rq->cmd[13] & REQ_IDETAPE_PC2) {
		idetape_media_access_finished(drive);
		return ide_stopped;
	}
	BUG();

out:
	return idetape_issue_pc(drive, pc);
}

/*
 * The function below uses __get_free_pages to allocate a data buffer of size
 * tape->buffer_size (or a bit more). We attempt to combine sequential pages as
 * much as possible.
 *
 * It returns a pointer to the newly allocated buffer, or NULL in case of
 * failure.
 */
static struct idetape_bh *ide_tape_kmalloc_buffer(idetape_tape_t *tape,
						  int full, int clear)
{
	struct idetape_bh *prev_bh, *bh, *merge_bh;
	int pages = tape->pages_per_buffer;
	unsigned int order, b_allocd;
	char *b_data = NULL;

	merge_bh = kmalloc(sizeof(struct idetape_bh), GFP_KERNEL);
	bh = merge_bh;
	if (bh == NULL)
		goto abort;

	order = fls(pages) - 1;
	bh->b_data = (char *) __get_free_pages(GFP_KERNEL, order);
	if (!bh->b_data)
		goto abort;
	b_allocd = (1 << order) * PAGE_SIZE;
	pages &= (order-1);

	if (clear)
		memset(bh->b_data, 0, b_allocd);
	bh->b_reqnext = NULL;
	bh->b_size = b_allocd;
	atomic_set(&bh->b_count, full ? bh->b_size : 0);

	while (pages) {
		order = fls(pages) - 1;
		b_data = (char *) __get_free_pages(GFP_KERNEL, order);
		if (!b_data)
			goto abort;
		b_allocd = (1 << order) * PAGE_SIZE;

		if (clear)
			memset(b_data, 0, b_allocd);

		/* newly allocated page frames below buffer header or ...*/
		if (bh->b_data == b_data + b_allocd) {
			bh->b_size += b_allocd;
			bh->b_data -= b_allocd;
			if (full)
				atomic_add(b_allocd, &bh->b_count);
			continue;
		}
		/* they are above the header */
		if (b_data == bh->b_data + bh->b_size) {
			bh->b_size += b_allocd;
			if (full)
				atomic_add(b_allocd, &bh->b_count);
			continue;
		}
		prev_bh = bh;
		bh = kmalloc(sizeof(struct idetape_bh), GFP_KERNEL);
		if (!bh) {
			free_pages((unsigned long) b_data, order);
			goto abort;
		}
		bh->b_reqnext = NULL;
		bh->b_data = b_data;
		bh->b_size = b_allocd;
		atomic_set(&bh->b_count, full ? bh->b_size : 0);
		prev_bh->b_reqnext = bh;

		pages &= (order-1);
	}

	bh->b_size -= tape->excess_bh_size;
	if (full)
		atomic_sub(tape->excess_bh_size, &bh->b_count);
	return merge_bh;
abort:
	ide_tape_kfree_buffer(tape);
	return NULL;
}

static int idetape_copy_stage_from_user(idetape_tape_t *tape,
					const char __user *buf, int n)
{
	struct idetape_bh *bh = tape->bh;
	int count;
	int ret = 0;

	while (n) {
		if (bh == NULL) {
			printk(KERN_ERR "ide-tape: bh == NULL in %s\n",
					__func__);
			return 1;
		}
		count = min((unsigned int)
				(bh->b_size - atomic_read(&bh->b_count)),
				(unsigned int)n);
		if (copy_from_user(bh->b_data + atomic_read(&bh->b_count), buf,
				count))
			ret = 1;
		n -= count;
		atomic_add(count, &bh->b_count);
		buf += count;
		if (atomic_read(&bh->b_count) == bh->b_size) {
			bh = bh->b_reqnext;
			if (bh)
				atomic_set(&bh->b_count, 0);
		}
	}
	tape->bh = bh;
	return ret;
}

static int idetape_copy_stage_to_user(idetape_tape_t *tape, char __user *buf,
				      int n)
{
	struct idetape_bh *bh = tape->bh;
	int count;
	int ret = 0;

	while (n) {
		if (bh == NULL) {
			printk(KERN_ERR "ide-tape: bh == NULL in %s\n",
					__func__);
			return 1;
		}
		count = min(tape->b_count, n);
		if  (copy_to_user(buf, tape->b_data, count))
			ret = 1;
		n -= count;
		tape->b_data += count;
		tape->b_count -= count;
		buf += count;
		if (!tape->b_count) {
			bh = bh->b_reqnext;
			tape->bh = bh;
			if (bh) {
				tape->b_data = bh->b_data;
				tape->b_count = atomic_read(&bh->b_count);
			}
		}
	}
	return ret;
}

static void idetape_init_merge_buffer(idetape_tape_t *tape)
{
	struct idetape_bh *bh = tape->merge_bh;
	tape->bh = tape->merge_bh;

	if (tape->chrdev_dir == IDETAPE_DIR_WRITE)
		atomic_set(&bh->b_count, 0);
	else {
		tape->b_data = bh->b_data;
		tape->b_count = atomic_read(&bh->b_count);
	}
}

/*
 * Write a filemark if write_filemark=1. Flush the device buffers without
 * writing a filemark otherwise.
 */
static void idetape_create_write_filemark_cmd(ide_drive_t *drive,
		struct ide_atapi_pc *pc, int write_filemark)
{
	ide_init_pc(pc);
	pc->c[0] = WRITE_FILEMARKS;
	pc->c[4] = write_filemark;
	pc->flags |= PC_FLAG_WAIT_FOR_DSC;
}

static int idetape_wait_ready(ide_drive_t *drive, unsigned long timeout)
{
	idetape_tape_t *tape = drive->driver_data;
	struct gendisk *disk = tape->disk;
	int load_attempted = 0;

	/* Wait for the tape to become ready */
	set_bit(IDE_AFLAG_MEDIUM_PRESENT, &drive->atapi_flags);
	timeout += jiffies;
	while (time_before(jiffies, timeout)) {
		if (ide_do_test_unit_ready(drive, disk) == 0)
			return 0;
		if ((tape->sense_key == 2 && tape->asc == 4 && tape->ascq == 2)
		    || (tape->asc == 0x3A)) {
			/* no media */
			if (load_attempted)
				return -ENOMEDIUM;
			ide_do_start_stop(drive, disk, IDETAPE_LU_LOAD_MASK);
			load_attempted = 1;
		/* not about to be ready */
		} else if (!(tape->sense_key == 2 && tape->asc == 4 &&
			     (tape->ascq == 1 || tape->ascq == 8)))
			return -EIO;
		msleep(100);
	}
	return -EIO;
}

static int idetape_flush_tape_buffers(ide_drive_t *drive)
{
	struct ide_tape_obj *tape = drive->driver_data;
	struct ide_atapi_pc pc;
	int rc;

	idetape_create_write_filemark_cmd(drive, &pc, 0);
	rc = ide_queue_pc_tail(drive, tape->disk, &pc);
	if (rc)
		return rc;
	idetape_wait_ready(drive, 60 * 5 * HZ);
	return 0;
}

static void idetape_create_read_position_cmd(struct ide_atapi_pc *pc)
{
	ide_init_pc(pc);
	pc->c[0] = READ_POSITION;
	pc->req_xfer = 20;
}

static int idetape_read_position(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc pc;
	int position;

	debug_log(DBG_PROCS, "Enter %s\n", __func__);

	idetape_create_read_position_cmd(&pc);
	if (ide_queue_pc_tail(drive, tape->disk, &pc))
		return -1;
	position = tape->first_frame;
	return position;
}

static void idetape_create_locate_cmd(ide_drive_t *drive,
		struct ide_atapi_pc *pc,
		unsigned int block, u8 partition, int skip)
{
	ide_init_pc(pc);
	pc->c[0] = POSITION_TO_ELEMENT;
	pc->c[1] = 2;
	put_unaligned(cpu_to_be32(block), (unsigned int *) &pc->c[3]);
	pc->c[8] = partition;
	pc->flags |= PC_FLAG_WAIT_FOR_DSC;
}

static void __ide_tape_discard_merge_buffer(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;

	if (tape->chrdev_dir != IDETAPE_DIR_READ)
		return;

	clear_bit(IDE_AFLAG_FILEMARK, &drive->atapi_flags);
	tape->merge_bh_size = 0;
	if (tape->merge_bh != NULL) {
		ide_tape_kfree_buffer(tape);
		tape->merge_bh = NULL;
	}

	tape->chrdev_dir = IDETAPE_DIR_NONE;
}

/*
 * Position the tape to the requested block using the LOCATE packet command.
 * A READ POSITION command is then issued to check where we are positioned. Like
 * all higher level operations, we queue the commands at the tail of the request
 * queue and wait for their completion.
 */
static int idetape_position_tape(ide_drive_t *drive, unsigned int block,
		u8 partition, int skip)
{
	idetape_tape_t *tape = drive->driver_data;
	struct gendisk *disk = tape->disk;
	int retval;
	struct ide_atapi_pc pc;

	if (tape->chrdev_dir == IDETAPE_DIR_READ)
		__ide_tape_discard_merge_buffer(drive);
	idetape_wait_ready(drive, 60 * 5 * HZ);
	idetape_create_locate_cmd(drive, &pc, block, partition, skip);
	retval = ide_queue_pc_tail(drive, disk, &pc);
	if (retval)
		return (retval);

	idetape_create_read_position_cmd(&pc);
	return ide_queue_pc_tail(drive, disk, &pc);
}

static void ide_tape_discard_merge_buffer(ide_drive_t *drive,
					  int restore_position)
{
	idetape_tape_t *tape = drive->driver_data;
	int seek, position;

	__ide_tape_discard_merge_buffer(drive);
	if (restore_position) {
		position = idetape_read_position(drive);
		seek = position > 0 ? position : 0;
		if (idetape_position_tape(drive, seek, 0, 0)) {
			printk(KERN_INFO "ide-tape: %s: position_tape failed in"
					 " %s\n", tape->name, __func__);
			return;
		}
	}
}

/*
 * Generate a read/write request for the block device interface and wait for it
 * to be serviced.
 */
static int idetape_queue_rw_tail(ide_drive_t *drive, int cmd, int blocks,
				 struct idetape_bh *bh)
{
	idetape_tape_t *tape = drive->driver_data;
	struct request *rq;
	int ret, errors;

	debug_log(DBG_SENSE, "%s: cmd=%d\n", __func__, cmd);

	rq = blk_get_request(drive->queue, READ, __GFP_WAIT);
	rq->cmd_type = REQ_TYPE_SPECIAL;
	rq->cmd[13] = cmd;
	rq->rq_disk = tape->disk;
	rq->special = (void *)bh;
	rq->sector = tape->first_frame;
	rq->nr_sectors = blocks;
	rq->current_nr_sectors = blocks;
	blk_execute_rq(drive->queue, tape->disk, rq, 0);

	errors = rq->errors;
	ret = tape->blk_size * (blocks - rq->current_nr_sectors);
	blk_put_request(rq);

	if ((cmd & (REQ_IDETAPE_READ | REQ_IDETAPE_WRITE)) == 0)
		return 0;

	if (tape->merge_bh)
		idetape_init_merge_buffer(tape);
	if (errors == IDETAPE_ERROR_GENERAL)
		return -EIO;
	return ret;
}

static void idetape_create_inquiry_cmd(struct ide_atapi_pc *pc)
{
	ide_init_pc(pc);
	pc->c[0] = INQUIRY;
	pc->c[4] = 254;
	pc->req_xfer = 254;
}

static void idetape_create_rewind_cmd(ide_drive_t *drive,
		struct ide_atapi_pc *pc)
{
	ide_init_pc(pc);
	pc->c[0] = REZERO_UNIT;
	pc->flags |= PC_FLAG_WAIT_FOR_DSC;
}

static void idetape_create_erase_cmd(struct ide_atapi_pc *pc)
{
	ide_init_pc(pc);
	pc->c[0] = ERASE;
	pc->c[1] = 1;
	pc->flags |= PC_FLAG_WAIT_FOR_DSC;
}

static void idetape_create_space_cmd(struct ide_atapi_pc *pc, int count, u8 cmd)
{
	ide_init_pc(pc);
	pc->c[0] = SPACE;
	put_unaligned(cpu_to_be32(count), (unsigned int *) &pc->c[1]);
	pc->c[1] = cmd;
	pc->flags |= PC_FLAG_WAIT_FOR_DSC;
}

/* Queue up a character device originated write request. */
static int idetape_add_chrdev_write_request(ide_drive_t *drive, int blocks)
{
	idetape_tape_t *tape = drive->driver_data;

	debug_log(DBG_CHRDEV, "Enter %s\n", __func__);

	return idetape_queue_rw_tail(drive, REQ_IDETAPE_WRITE,
				     blocks, tape->merge_bh);
}

static void ide_tape_flush_merge_buffer(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	int blocks, min;
	struct idetape_bh *bh;

	if (tape->chrdev_dir != IDETAPE_DIR_WRITE) {
		printk(KERN_ERR "ide-tape: bug: Trying to empty merge buffer"
				" but we are not writing.\n");
		return;
	}
	if (tape->merge_bh_size > tape->buffer_size) {
		printk(KERN_ERR "ide-tape: bug: merge_buffer too big\n");
		tape->merge_bh_size = tape->buffer_size;
	}
	if (tape->merge_bh_size) {
		blocks = tape->merge_bh_size / tape->blk_size;
		if (tape->merge_bh_size % tape->blk_size) {
			unsigned int i;

			blocks++;
			i = tape->blk_size - tape->merge_bh_size %
				tape->blk_size;
			bh = tape->bh->b_reqnext;
			while (bh) {
				atomic_set(&bh->b_count, 0);
				bh = bh->b_reqnext;
			}
			bh = tape->bh;
			while (i) {
				if (bh == NULL) {
					printk(KERN_INFO "ide-tape: bug,"
							 " bh NULL\n");
					break;
				}
				min = min(i, (unsigned int)(bh->b_size -
						atomic_read(&bh->b_count)));
				memset(bh->b_data + atomic_read(&bh->b_count),
						0, min);
				atomic_add(min, &bh->b_count);
				i -= min;
				bh = bh->b_reqnext;
			}
		}
		(void) idetape_add_chrdev_write_request(drive, blocks);
		tape->merge_bh_size = 0;
	}
	if (tape->merge_bh != NULL) {
		ide_tape_kfree_buffer(tape);
		tape->merge_bh = NULL;
	}
	tape->chrdev_dir = IDETAPE_DIR_NONE;
}

static int idetape_init_read(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	int bytes_read;

	/* Initialize read operation */
	if (tape->chrdev_dir != IDETAPE_DIR_READ) {
		if (tape->chrdev_dir == IDETAPE_DIR_WRITE) {
			ide_tape_flush_merge_buffer(drive);
			idetape_flush_tape_buffers(drive);
		}
		if (tape->merge_bh || tape->merge_bh_size) {
			printk(KERN_ERR "ide-tape: merge_bh_size should be"
					 " 0 now\n");
			tape->merge_bh_size = 0;
		}
		tape->merge_bh = ide_tape_kmalloc_buffer(tape, 0, 0);
		if (!tape->merge_bh)
			return -ENOMEM;
		tape->chrdev_dir = IDETAPE_DIR_READ;

		/*
		 * Issue a read 0 command to ensure that DSC handshake is
		 * switched from completion mode to buffer available mode.
		 * No point in issuing this if DSC overlap isn't supported, some
		 * drives (Seagate STT3401A) will return an error.
		 */
		if (drive->dev_flags & IDE_DFLAG_DSC_OVERLAP) {
			bytes_read = idetape_queue_rw_tail(drive,
							REQ_IDETAPE_READ, 0,
							tape->merge_bh);
			if (bytes_read < 0) {
				ide_tape_kfree_buffer(tape);
				tape->merge_bh = NULL;
				tape->chrdev_dir = IDETAPE_DIR_NONE;
				return bytes_read;
			}
		}
	}

	return 0;
}

/* called from idetape_chrdev_read() to service a chrdev read request. */
static int idetape_add_chrdev_read_request(ide_drive_t *drive, int blocks)
{
	idetape_tape_t *tape = drive->driver_data;

	debug_log(DBG_PROCS, "Enter %s, %d blocks\n", __func__, blocks);

	/* If we are at a filemark, return a read length of 0 */
	if (test_bit(IDE_AFLAG_FILEMARK, &drive->atapi_flags))
		return 0;

	idetape_init_read(drive);

	return idetape_queue_rw_tail(drive, REQ_IDETAPE_READ, blocks,
				     tape->merge_bh);
}

static void idetape_pad_zeros(ide_drive_t *drive, int bcount)
{
	idetape_tape_t *tape = drive->driver_data;
	struct idetape_bh *bh;
	int blocks;

	while (bcount) {
		unsigned int count;

		bh = tape->merge_bh;
		count = min(tape->buffer_size, bcount);
		bcount -= count;
		blocks = count / tape->blk_size;
		while (count) {
			atomic_set(&bh->b_count,
				   min(count, (unsigned int)bh->b_size));
			memset(bh->b_data, 0, atomic_read(&bh->b_count));
			count -= atomic_read(&bh->b_count);
			bh = bh->b_reqnext;
		}
		idetape_queue_rw_tail(drive, REQ_IDETAPE_WRITE, blocks,
				      tape->merge_bh);
	}
}

/*
 * Rewinds the tape to the Beginning Of the current Partition (BOP). We
 * currently support only one partition.
 */
static int idetape_rewind_tape(ide_drive_t *drive)
{
	struct ide_tape_obj *tape = drive->driver_data;
	struct gendisk *disk = tape->disk;
	int retval;
	struct ide_atapi_pc pc;

	debug_log(DBG_SENSE, "Enter %s\n", __func__);

	idetape_create_rewind_cmd(drive, &pc);
	retval = ide_queue_pc_tail(drive, disk, &pc);
	if (retval)
		return retval;

	idetape_create_read_position_cmd(&pc);
	retval = ide_queue_pc_tail(drive, disk, &pc);
	if (retval)
		return retval;
	return 0;
}

/* mtio.h compatible commands should be issued to the chrdev interface. */
static int idetape_blkdev_ioctl(ide_drive_t *drive, unsigned int cmd,
				unsigned long arg)
{
	idetape_tape_t *tape = drive->driver_data;
	void __user *argp = (void __user *)arg;

	struct idetape_config {
		int dsc_rw_frequency;
		int dsc_media_access_frequency;
		int nr_stages;
	} config;

	debug_log(DBG_PROCS, "Enter %s\n", __func__);

	switch (cmd) {
	case 0x0340:
		if (copy_from_user(&config, argp, sizeof(config)))
			return -EFAULT;
		tape->best_dsc_rw_freq = config.dsc_rw_frequency;
		break;
	case 0x0350:
		config.dsc_rw_frequency = (int) tape->best_dsc_rw_freq;
		config.nr_stages = 1;
		if (copy_to_user(argp, &config, sizeof(config)))
			return -EFAULT;
		break;
	default:
		return -EIO;
	}
	return 0;
}

static int idetape_space_over_filemarks(ide_drive_t *drive, short mt_op,
					int mt_count)
{
	idetape_tape_t *tape = drive->driver_data;
	struct gendisk *disk = tape->disk;
	struct ide_atapi_pc pc;
	int retval, count = 0;
	int sprev = !!(tape->caps[4] & 0x20);

	if (mt_count == 0)
		return 0;
	if (MTBSF == mt_op || MTBSFM == mt_op) {
		if (!sprev)
			return -EIO;
		mt_count = -mt_count;
	}

	if (tape->chrdev_dir == IDETAPE_DIR_READ) {
		tape->merge_bh_size = 0;
		if (test_and_clear_bit(IDE_AFLAG_FILEMARK, &drive->atapi_flags))
			++count;
		ide_tape_discard_merge_buffer(drive, 0);
	}

	switch (mt_op) {
	case MTFSF:
	case MTBSF:
		idetape_create_space_cmd(&pc, mt_count - count,
					 IDETAPE_SPACE_OVER_FILEMARK);
		return ide_queue_pc_tail(drive, disk, &pc);
	case MTFSFM:
	case MTBSFM:
		if (!sprev)
			return -EIO;
		retval = idetape_space_over_filemarks(drive, MTFSF,
						      mt_count - count);
		if (retval)
			return retval;
		count = (MTBSFM == mt_op ? 1 : -1);
		return idetape_space_over_filemarks(drive, MTFSF, count);
	default:
		printk(KERN_ERR "ide-tape: MTIO operation %d not supported\n",
				mt_op);
		return -EIO;
	}
}

/*
 * Our character device read / write functions.
 *
 * The tape is optimized to maximize throughput when it is transferring an
 * integral number of the "continuous transfer limit", which is a parameter of
 * the specific tape (26kB on my particular tape, 32kB for Onstream).
 *
 * As of version 1.3 of the driver, the character device provides an abstract
 * continuous view of the media - any mix of block sizes (even 1 byte) on the
 * same backup/restore procedure is supported. The driver will internally
 * convert the requests to the recommended transfer unit, so that an unmatch
 * between the user's block size to the recommended size will only result in a
 * (slightly) increased driver overhead, but will no longer hit performance.
 * This is not applicable to Onstream.
 */
static ssize_t idetape_chrdev_read(struct file *file, char __user *buf,
				   size_t count, loff_t *ppos)
{
	struct ide_tape_obj *tape = file->private_data;
	ide_drive_t *drive = tape->drive;
	ssize_t bytes_read, temp, actually_read = 0, rc;
	ssize_t ret = 0;
	u16 ctl = *(u16 *)&tape->caps[12];

	debug_log(DBG_CHRDEV, "Enter %s, count %Zd\n", __func__, count);

	if (tape->chrdev_dir != IDETAPE_DIR_READ) {
		if (test_bit(IDE_AFLAG_DETECT_BS, &drive->atapi_flags))
			if (count > tape->blk_size &&
			    (count % tape->blk_size) == 0)
				tape->user_bs_factor = count / tape->blk_size;
	}
	rc = idetape_init_read(drive);
	if (rc < 0)
		return rc;
	if (count == 0)
		return (0);
	if (tape->merge_bh_size) {
		actually_read = min((unsigned int)(tape->merge_bh_size),
				    (unsigned int)count);
		if (idetape_copy_stage_to_user(tape, buf, actually_read))
			ret = -EFAULT;
		buf += actually_read;
		tape->merge_bh_size -= actually_read;
		count -= actually_read;
	}
	while (count >= tape->buffer_size) {
		bytes_read = idetape_add_chrdev_read_request(drive, ctl);
		if (bytes_read <= 0)
			goto finish;
		if (idetape_copy_stage_to_user(tape, buf, bytes_read))
			ret = -EFAULT;
		buf += bytes_read;
		count -= bytes_read;
		actually_read += bytes_read;
	}
	if (count) {
		bytes_read = idetape_add_chrdev_read_request(drive, ctl);
		if (bytes_read <= 0)
			goto finish;
		temp = min((unsigned long)count, (unsigned long)bytes_read);
		if (idetape_copy_stage_to_user(tape, buf, temp))
			ret = -EFAULT;
		actually_read += temp;
		tape->merge_bh_size = bytes_read-temp;
	}
finish:
	if (!actually_read && test_bit(IDE_AFLAG_FILEMARK, &drive->atapi_flags)) {
		debug_log(DBG_SENSE, "%s: spacing over filemark\n", tape->name);

		idetape_space_over_filemarks(drive, MTFSF, 1);
		return 0;
	}

	return ret ? ret : actually_read;
}

static ssize_t idetape_chrdev_write(struct file *file, const char __user *buf,
				     size_t count, loff_t *ppos)
{
	struct ide_tape_obj *tape = file->private_data;
	ide_drive_t *drive = tape->drive;
	ssize_t actually_written = 0;
	ssize_t ret = 0;
	u16 ctl = *(u16 *)&tape->caps[12];

	/* The drive is write protected. */
	if (tape->write_prot)
		return -EACCES;

	debug_log(DBG_CHRDEV, "Enter %s, count %Zd\n", __func__, count);

	/* Initialize write operation */
	if (tape->chrdev_dir != IDETAPE_DIR_WRITE) {
		if (tape->chrdev_dir == IDETAPE_DIR_READ)
			ide_tape_discard_merge_buffer(drive, 1);
		if (tape->merge_bh || tape->merge_bh_size) {
			printk(KERN_ERR "ide-tape: merge_bh_size "
				"should be 0 now\n");
			tape->merge_bh_size = 0;
		}
		tape->merge_bh = ide_tape_kmalloc_buffer(tape, 0, 0);
		if (!tape->merge_bh)
			return -ENOMEM;
		tape->chrdev_dir = IDETAPE_DIR_WRITE;
		idetape_init_merge_buffer(tape);

		/*
		 * Issue a write 0 command to ensure that DSC handshake is
		 * switched from completion mode to buffer available mode. No
		 * point in issuing this if DSC overlap isn't supported, some
		 * drives (Seagate STT3401A) will return an error.
		 */
		if (drive->dev_flags & IDE_DFLAG_DSC_OVERLAP) {
			ssize_t retval = idetape_queue_rw_tail(drive,
							REQ_IDETAPE_WRITE, 0,
							tape->merge_bh);
			if (retval < 0) {
				ide_tape_kfree_buffer(tape);
				tape->merge_bh = NULL;
				tape->chrdev_dir = IDETAPE_DIR_NONE;
				return retval;
			}
		}
	}
	if (count == 0)
		return (0);
	if (tape->merge_bh_size) {
		if (tape->merge_bh_size >= tape->buffer_size) {
			printk(KERN_ERR "ide-tape: bug: merge buf too big\n");
			tape->merge_bh_size = 0;
		}
		actually_written = min((unsigned int)
				(tape->buffer_size - tape->merge_bh_size),
				(unsigned int)count);
		if (idetape_copy_stage_from_user(tape, buf, actually_written))
				ret = -EFAULT;
		buf += actually_written;
		tape->merge_bh_size += actually_written;
		count -= actually_written;

		if (tape->merge_bh_size == tape->buffer_size) {
			ssize_t retval;
			tape->merge_bh_size = 0;
			retval = idetape_add_chrdev_write_request(drive, ctl);
			if (retval <= 0)
				return (retval);
		}
	}
	while (count >= tape->buffer_size) {
		ssize_t retval;
		if (idetape_copy_stage_from_user(tape, buf, tape->buffer_size))
			ret = -EFAULT;
		buf += tape->buffer_size;
		count -= tape->buffer_size;
		retval = idetape_add_chrdev_write_request(drive, ctl);
		actually_written += tape->buffer_size;
		if (retval <= 0)
			return (retval);
	}
	if (count) {
		actually_written += count;
		if (idetape_copy_stage_from_user(tape, buf, count))
			ret = -EFAULT;
		tape->merge_bh_size += count;
	}
	return ret ? ret : actually_written;
}

static int idetape_write_filemark(ide_drive_t *drive)
{
	struct ide_tape_obj *tape = drive->driver_data;
	struct ide_atapi_pc pc;

	/* Write a filemark */
	idetape_create_write_filemark_cmd(drive, &pc, 1);
	if (ide_queue_pc_tail(drive, tape->disk, &pc)) {
		printk(KERN_ERR "ide-tape: Couldn't write a filemark\n");
		return -EIO;
	}
	return 0;
}

/*
 * Called from idetape_chrdev_ioctl when the general mtio MTIOCTOP ioctl is
 * requested.
 *
 * Note: MTBSF and MTBSFM are not supported when the tape doesn't support
 * spacing over filemarks in the reverse direction. In this case, MTFSFM is also
 * usually not supported.
 *
 * The following commands are currently not supported:
 *
 * MTFSS, MTBSS, MTWSM, MTSETDENSITY, MTSETDRVBUFFER, MT_ST_BOOLEANS,
 * MT_ST_WRITE_THRESHOLD.
 */
static int idetape_mtioctop(ide_drive_t *drive, short mt_op, int mt_count)
{
	idetape_tape_t *tape = drive->driver_data;
	struct gendisk *disk = tape->disk;
	struct ide_atapi_pc pc;
	int i, retval;

	debug_log(DBG_ERR, "Handling MTIOCTOP ioctl: mt_op=%d, mt_count=%d\n",
			mt_op, mt_count);

	switch (mt_op) {
	case MTFSF:
	case MTFSFM:
	case MTBSF:
	case MTBSFM:
		if (!mt_count)
			return 0;
		return idetape_space_over_filemarks(drive, mt_op, mt_count);
	default:
		break;
	}

	switch (mt_op) {
	case MTWEOF:
		if (tape->write_prot)
			return -EACCES;
		ide_tape_discard_merge_buffer(drive, 1);
		for (i = 0; i < mt_count; i++) {
			retval = idetape_write_filemark(drive);
			if (retval)
				return retval;
		}
		return 0;
	case MTREW:
		ide_tape_discard_merge_buffer(drive, 0);
		if (idetape_rewind_tape(drive))
			return -EIO;
		return 0;
	case MTLOAD:
		ide_tape_discard_merge_buffer(drive, 0);
		return ide_do_start_stop(drive, disk, IDETAPE_LU_LOAD_MASK);
	case MTUNLOAD:
	case MTOFFL:
		/*
		 * If door is locked, attempt to unlock before
		 * attempting to eject.
		 */
		if (tape->door_locked) {
			if (!ide_set_media_lock(drive, disk, 0))
				tape->door_locked = DOOR_UNLOCKED;
		}
		ide_tape_discard_merge_buffer(drive, 0);
		retval = ide_do_start_stop(drive, disk, !IDETAPE_LU_LOAD_MASK);
		if (!retval)
			clear_bit(IDE_AFLAG_MEDIUM_PRESENT, &drive->atapi_flags);
		return retval;
	case MTNOP:
		ide_tape_discard_merge_buffer(drive, 0);
		return idetape_flush_tape_buffers(drive);
	case MTRETEN:
		ide_tape_discard_merge_buffer(drive, 0);
		return ide_do_start_stop(drive, disk,
			IDETAPE_LU_RETENSION_MASK | IDETAPE_LU_LOAD_MASK);
	case MTEOM:
		idetape_create_space_cmd(&pc, 0, IDETAPE_SPACE_TO_EOD);
		return ide_queue_pc_tail(drive, disk, &pc);
	case MTERASE:
		(void)idetape_rewind_tape(drive);
		idetape_create_erase_cmd(&pc);
		return ide_queue_pc_tail(drive, disk, &pc);
	case MTSETBLK:
		if (mt_count) {
			if (mt_count < tape->blk_size ||
			    mt_count % tape->blk_size)
				return -EIO;
			tape->user_bs_factor = mt_count / tape->blk_size;
			clear_bit(IDE_AFLAG_DETECT_BS, &drive->atapi_flags);
		} else
			set_bit(IDE_AFLAG_DETECT_BS, &drive->atapi_flags);
		return 0;
	case MTSEEK:
		ide_tape_discard_merge_buffer(drive, 0);
		return idetape_position_tape(drive,
			mt_count * tape->user_bs_factor, tape->partition, 0);
	case MTSETPART:
		ide_tape_discard_merge_buffer(drive, 0);
		return idetape_position_tape(drive, 0, mt_count, 0);
	case MTFSR:
	case MTBSR:
	case MTLOCK:
		retval = ide_set_media_lock(drive, disk, 1);
		if (retval)
			return retval;
		tape->door_locked = DOOR_EXPLICITLY_LOCKED;
		return 0;
	case MTUNLOCK:
		retval = ide_set_media_lock(drive, disk, 0);
		if (retval)
			return retval;
		tape->door_locked = DOOR_UNLOCKED;
		return 0;
	default:
		printk(KERN_ERR "ide-tape: MTIO operation %d not supported\n",
				mt_op);
		return -EIO;
	}
}

/*
 * Our character device ioctls. General mtio.h magnetic io commands are
 * supported here, and not in the corresponding block interface. Our own
 * ide-tape ioctls are supported on both interfaces.
 */
static int idetape_chrdev_ioctl(struct inode *inode, struct file *file,
				unsigned int cmd, unsigned long arg)
{
	struct ide_tape_obj *tape = file->private_data;
	ide_drive_t *drive = tape->drive;
	struct mtop mtop;
	struct mtget mtget;
	struct mtpos mtpos;
	int block_offset = 0, position = tape->first_frame;
	void __user *argp = (void __user *)arg;

	debug_log(DBG_CHRDEV, "Enter %s, cmd=%u\n", __func__, cmd);

	if (tape->chrdev_dir == IDETAPE_DIR_WRITE) {
		ide_tape_flush_merge_buffer(drive);
		idetape_flush_tape_buffers(drive);
	}
	if (cmd == MTIOCGET || cmd == MTIOCPOS) {
		block_offset = tape->merge_bh_size /
			(tape->blk_size * tape->user_bs_factor);
		position = idetape_read_position(drive);
		if (position < 0)
			return -EIO;
	}
	switch (cmd) {
	case MTIOCTOP:
		if (copy_from_user(&mtop, argp, sizeof(struct mtop)))
			return -EFAULT;
		return idetape_mtioctop(drive, mtop.mt_op, mtop.mt_count);
	case MTIOCGET:
		memset(&mtget, 0, sizeof(struct mtget));
		mtget.mt_type = MT_ISSCSI2;
		mtget.mt_blkno = position / tape->user_bs_factor - block_offset;
		mtget.mt_dsreg =
			((tape->blk_size * tape->user_bs_factor)
			 << MT_ST_BLKSIZE_SHIFT) & MT_ST_BLKSIZE_MASK;

		if (tape->drv_write_prot)
			mtget.mt_gstat |= GMT_WR_PROT(0xffffffff);

		if (copy_to_user(argp, &mtget, sizeof(struct mtget)))
			return -EFAULT;
		return 0;
	case MTIOCPOS:
		mtpos.mt_blkno = position / tape->user_bs_factor - block_offset;
		if (copy_to_user(argp, &mtpos, sizeof(struct mtpos)))
			return -EFAULT;
		return 0;
	default:
		if (tape->chrdev_dir == IDETAPE_DIR_READ)
			ide_tape_discard_merge_buffer(drive, 1);
		return idetape_blkdev_ioctl(drive, cmd, arg);
	}
}

/*
 * Do a mode sense page 0 with block descriptor and if it succeeds set the tape
 * block size with the reported value.
 */
static void ide_tape_get_bsize_from_bdesc(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc pc;

	idetape_create_mode_sense_cmd(&pc, IDETAPE_BLOCK_DESCRIPTOR);
	if (ide_queue_pc_tail(drive, tape->disk, &pc)) {
		printk(KERN_ERR "ide-tape: Can't get block descriptor\n");
		if (tape->blk_size == 0) {
			printk(KERN_WARNING "ide-tape: Cannot deal with zero "
					    "block size, assuming 32k\n");
			tape->blk_size = 32768;
		}
		return;
	}
	tape->blk_size = (pc.buf[4 + 5] << 16) +
				(pc.buf[4 + 6] << 8)  +
				 pc.buf[4 + 7];
	tape->drv_write_prot = (pc.buf[2] & 0x80) >> 7;
}

static int idetape_chrdev_open(struct inode *inode, struct file *filp)
{
	unsigned int minor = iminor(inode), i = minor & ~0xc0;
	ide_drive_t *drive;
	idetape_tape_t *tape;
	int retval;

	if (i >= MAX_HWIFS * MAX_DRIVES)
		return -ENXIO;

	lock_kernel();
	tape = ide_tape_chrdev_get(i);
	if (!tape) {
		unlock_kernel();
		return -ENXIO;
	}

	debug_log(DBG_CHRDEV, "Enter %s\n", __func__);

	/*
	 * We really want to do nonseekable_open(inode, filp); here, but some
	 * versions of tar incorrectly call lseek on tapes and bail out if that
	 * fails.  So we disallow pread() and pwrite(), but permit lseeks.
	 */
	filp->f_mode &= ~(FMODE_PREAD | FMODE_PWRITE);

	drive = tape->drive;

	filp->private_data = tape;

	if (test_and_set_bit(IDE_AFLAG_BUSY, &drive->atapi_flags)) {
		retval = -EBUSY;
		goto out_put_tape;
	}

	retval = idetape_wait_ready(drive, 60 * HZ);
	if (retval) {
		clear_bit(IDE_AFLAG_BUSY, &drive->atapi_flags);
		printk(KERN_ERR "ide-tape: %s: drive not ready\n", tape->name);
		goto out_put_tape;
	}

	idetape_read_position(drive);
	if (!test_bit(IDE_AFLAG_ADDRESS_VALID, &drive->atapi_flags))
		(void)idetape_rewind_tape(drive);

	/* Read block size and write protect status from drive. */
	ide_tape_get_bsize_from_bdesc(drive);

	/* Set write protect flag if device is opened as read-only. */
	if ((filp->f_flags & O_ACCMODE) == O_RDONLY)
		tape->write_prot = 1;
	else
		tape->write_prot = tape->drv_write_prot;

	/* Make sure drive isn't write protected if user wants to write. */
	if (tape->write_prot) {
		if ((filp->f_flags & O_ACCMODE) == O_WRONLY ||
		    (filp->f_flags & O_ACCMODE) == O_RDWR) {
			clear_bit(IDE_AFLAG_BUSY, &drive->atapi_flags);
			retval = -EROFS;
			goto out_put_tape;
		}
	}

	/* Lock the tape drive door so user can't eject. */
	if (tape->chrdev_dir == IDETAPE_DIR_NONE) {
		if (!ide_set_media_lock(drive, tape->disk, 1)) {
			if (tape->door_locked != DOOR_EXPLICITLY_LOCKED)
				tape->door_locked = DOOR_LOCKED;
		}
	}
	unlock_kernel();
	return 0;

out_put_tape:
	ide_tape_put(tape);
	unlock_kernel();
	return retval;
}

static void idetape_write_release(ide_drive_t *drive, unsigned int minor)
{
	idetape_tape_t *tape = drive->driver_data;

	ide_tape_flush_merge_buffer(drive);
	tape->merge_bh = ide_tape_kmalloc_buffer(tape, 1, 0);
	if (tape->merge_bh != NULL) {
		idetape_pad_zeros(drive, tape->blk_size *
				(tape->user_bs_factor - 1));
		ide_tape_kfree_buffer(tape);
		tape->merge_bh = NULL;
	}
	idetape_write_filemark(drive);
	idetape_flush_tape_buffers(drive);
	idetape_flush_tape_buffers(drive);
}

static int idetape_chrdev_release(struct inode *inode, struct file *filp)
{
	struct ide_tape_obj *tape = filp->private_data;
	ide_drive_t *drive = tape->drive;
	unsigned int minor = iminor(inode);

	lock_kernel();
	tape = drive->driver_data;

	debug_log(DBG_CHRDEV, "Enter %s\n", __func__);

	if (tape->chrdev_dir == IDETAPE_DIR_WRITE)
		idetape_write_release(drive, minor);
	if (tape->chrdev_dir == IDETAPE_DIR_READ) {
		if (minor < 128)
			ide_tape_discard_merge_buffer(drive, 1);
	}

	if (minor < 128 && test_bit(IDE_AFLAG_MEDIUM_PRESENT, &drive->atapi_flags))
		(void) idetape_rewind_tape(drive);
	if (tape->chrdev_dir == IDETAPE_DIR_NONE) {
		if (tape->door_locked == DOOR_LOCKED) {
			if (!ide_set_media_lock(drive, tape->disk, 0))
				tape->door_locked = DOOR_UNLOCKED;
		}
	}
	clear_bit(IDE_AFLAG_BUSY, &drive->atapi_flags);
	ide_tape_put(tape);
	unlock_kernel();
	return 0;
}

static void idetape_get_inquiry_results(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc pc;
	char fw_rev[4], vendor_id[8], product_id[16];

	idetape_create_inquiry_cmd(&pc);
	if (ide_queue_pc_tail(drive, tape->disk, &pc)) {
		printk(KERN_ERR "ide-tape: %s: can't get INQUIRY results\n",
				tape->name);
		return;
	}
	memcpy(vendor_id, &pc.buf[8], 8);
	memcpy(product_id, &pc.buf[16], 16);
	memcpy(fw_rev, &pc.buf[32], 4);

	ide_fixstring(vendor_id, 8, 0);
	ide_fixstring(product_id, 16, 0);
	ide_fixstring(fw_rev, 4, 0);

	printk(KERN_INFO "ide-tape: %s <-> %s: %.8s %.16s rev %.4s\n",
			drive->name, tape->name, vendor_id, product_id, fw_rev);
}

/*
 * Ask the tape about its various parameters. In particular, we will adjust our
 * data transfer buffer	size to the recommended value as returned by the tape.
 */
static void idetape_get_mode_sense_results(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;
	struct ide_atapi_pc pc;
	u8 *caps;
	u8 speed, max_speed;

	idetape_create_mode_sense_cmd(&pc, IDETAPE_CAPABILITIES_PAGE);
	if (ide_queue_pc_tail(drive, tape->disk, &pc)) {
		printk(KERN_ERR "ide-tape: Can't get tape parameters - assuming"
				" some default values\n");
		tape->blk_size = 512;
		put_unaligned(52,   (u16 *)&tape->caps[12]);
		put_unaligned(540,  (u16 *)&tape->caps[14]);
		put_unaligned(6*52, (u16 *)&tape->caps[16]);
		return;
	}
	caps = pc.buf + 4 + pc.buf[3];

	/* convert to host order and save for later use */
	speed = be16_to_cpup((__be16 *)&caps[14]);
	max_speed = be16_to_cpup((__be16 *)&caps[8]);

	*(u16 *)&caps[8] = max_speed;
	*(u16 *)&caps[12] = be16_to_cpup((__be16 *)&caps[12]);
	*(u16 *)&caps[14] = speed;
	*(u16 *)&caps[16] = be16_to_cpup((__be16 *)&caps[16]);

	if (!speed) {
		printk(KERN_INFO "ide-tape: %s: invalid tape speed "
				"(assuming 650KB/sec)\n", drive->name);
		*(u16 *)&caps[14] = 650;
	}
	if (!max_speed) {
		printk(KERN_INFO "ide-tape: %s: invalid max_speed "
				"(assuming 650KB/sec)\n", drive->name);
		*(u16 *)&caps[8] = 650;
	}

	memcpy(&tape->caps, caps, 20);

	/* device lacks locking support according to capabilities page */
	if ((caps[6] & 1) == 0)
		drive->dev_flags &= ~IDE_DFLAG_DOORLOCKING;

	if (caps[7] & 0x02)
		tape->blk_size = 512;
	else if (caps[7] & 0x04)
		tape->blk_size = 1024;
}

#ifdef CONFIG_IDE_PROC_FS
#define ide_tape_devset_get(name, field) \
static int get_##name(ide_drive_t *drive) \
{ \
	idetape_tape_t *tape = drive->driver_data; \
	return tape->field; \
}

#define ide_tape_devset_set(name, field) \
static int set_##name(ide_drive_t *drive, int arg) \
{ \
	idetape_tape_t *tape = drive->driver_data; \
	tape->field = arg; \
	return 0; \
}

#define ide_tape_devset_rw_field(_name, _field) \
ide_tape_devset_get(_name, _field) \
ide_tape_devset_set(_name, _field) \
IDE_DEVSET(_name, DS_SYNC, get_##_name, set_##_name)

#define ide_tape_devset_r_field(_name, _field) \
ide_tape_devset_get(_name, _field) \
IDE_DEVSET(_name, 0, get_##_name, NULL)

static int mulf_tdsc(ide_drive_t *drive)	{ return 1000; }
static int divf_tdsc(ide_drive_t *drive)	{ return   HZ; }
static int divf_buffer(ide_drive_t *drive)	{ return    2; }
static int divf_buffer_size(ide_drive_t *drive)	{ return 1024; }

ide_devset_rw_flag(dsc_overlap, IDE_DFLAG_DSC_OVERLAP);

ide_tape_devset_rw_field(debug_mask, debug_mask);
ide_tape_devset_rw_field(tdsc, best_dsc_rw_freq);

ide_tape_devset_r_field(avg_speed, avg_speed);
ide_tape_devset_r_field(speed, caps[14]);
ide_tape_devset_r_field(buffer, caps[16]);
ide_tape_devset_r_field(buffer_size, buffer_size);

static const struct ide_proc_devset idetape_settings[] = {
	__IDE_PROC_DEVSET(avg_speed,	0, 0xffff, NULL, NULL),
	__IDE_PROC_DEVSET(buffer,	0, 0xffff, NULL, divf_buffer),
	__IDE_PROC_DEVSET(buffer_size,	0, 0xffff, NULL, divf_buffer_size),
	__IDE_PROC_DEVSET(debug_mask,	0, 0xffff, NULL, NULL),
	__IDE_PROC_DEVSET(dsc_overlap,	0,      1, NULL, NULL),
	__IDE_PROC_DEVSET(speed,	0, 0xffff, NULL, NULL),
	__IDE_PROC_DEVSET(tdsc,		IDETAPE_DSC_RW_MIN, IDETAPE_DSC_RW_MAX,
					mulf_tdsc, divf_tdsc),
	{ 0 },
};
#endif

/*
 * The function below is called to:
 *
 * 1. Initialize our various state variables.
 * 2. Ask the tape for its capabilities.
 * 3. Allocate a buffer which will be used for data transfer. The buffer size
 * is chosen based on the recommendation which we received in step 2.
 *
 * Note that at this point ide.c already assigned us an irq, so that we can
 * queue requests here and wait for their completion.
 */
static void idetape_setup(ide_drive_t *drive, idetape_tape_t *tape, int minor)
{
	unsigned long t;
	int speed;
	int buffer_size;
	u16 *ctl = (u16 *)&tape->caps[12];

	drive->pc_callback	 = ide_tape_callback;
	drive->pc_update_buffers = idetape_update_buffers;
	drive->pc_io_buffers	 = ide_tape_io_buffers;

	spin_lock_init(&tape->lock);

	drive->dev_flags |= IDE_DFLAG_DSC_OVERLAP;

	if (drive->hwif->host_flags & IDE_HFLAG_NO_DSC) {
		printk(KERN_INFO "ide-tape: %s: disabling DSC overlap\n",
				 tape->name);
		drive->dev_flags &= ~IDE_DFLAG_DSC_OVERLAP;
	}

	/* Seagate Travan drives do not support DSC overlap. */
	if (strstr((char *)&drive->id[ATA_ID_PROD], "Seagate STT3401"))
		drive->dev_flags &= ~IDE_DFLAG_DSC_OVERLAP;

	tape->minor = minor;
	tape->name[0] = 'h';
	tape->name[1] = 't';
	tape->name[2] = '0' + minor;
	tape->chrdev_dir = IDETAPE_DIR_NONE;

	idetape_get_inquiry_results(drive);
	idetape_get_mode_sense_results(drive);
	ide_tape_get_bsize_from_bdesc(drive);
	tape->user_bs_factor = 1;
	tape->buffer_size = *ctl * tape->blk_size;
	while (tape->buffer_size > 0xffff) {
		printk(KERN_NOTICE "ide-tape: decreasing stage size\n");
		*ctl /= 2;
		tape->buffer_size = *ctl * tape->blk_size;
	}
	buffer_size = tape->buffer_size;
	tape->pages_per_buffer = buffer_size / PAGE_SIZE;
	if (buffer_size % PAGE_SIZE) {
		tape->pages_per_buffer++;
		tape->excess_bh_size = PAGE_SIZE - buffer_size % PAGE_SIZE;
	}

	/* select the "best" DSC read/write polling freq */
	speed = max(*(u16 *)&tape->caps[14], *(u16 *)&tape->caps[8]);

	t = (IDETAPE_FIFO_THRESHOLD * tape->buffer_size * HZ) / (speed * 1000);

	/*
	 * Ensure that the number we got makes sense; limit it within
	 * IDETAPE_DSC_RW_MIN and IDETAPE_DSC_RW_MAX.
	 */
	tape->best_dsc_rw_freq = clamp_t(unsigned long, t, IDETAPE_DSC_RW_MIN,
					 IDETAPE_DSC_RW_MAX);
	printk(KERN_INFO "ide-tape: %s <-> %s: %dKBps, %d*%dkB buffer, "
		"%lums tDSC%s\n",
		drive->name, tape->name, *(u16 *)&tape->caps[14],
		(*(u16 *)&tape->caps[16] * 512) / tape->buffer_size,
		tape->buffer_size / 1024,
		tape->best_dsc_rw_freq * 1000 / HZ,
		(drive->dev_flags & IDE_DFLAG_USING_DMA) ? ", DMA" : "");

	ide_proc_register_driver(drive, tape->driver);
}

static void ide_tape_remove(ide_drive_t *drive)
{
	idetape_tape_t *tape = drive->driver_data;

	ide_proc_unregister_driver(drive, tape->driver);

	ide_unregister_region(tape->disk);

	ide_tape_put(tape);
}

static void ide_tape_release(struct kref *kref)
{
	struct ide_tape_obj *tape = to_ide_drv(kref, ide_tape_obj);
	ide_drive_t *drive = tape->drive;
	struct gendisk *g = tape->disk;

	BUG_ON(tape->merge_bh_size);

	drive->dev_flags &= ~IDE_DFLAG_DSC_OVERLAP;
	drive->driver_data = NULL;
	device_destroy(idetape_sysfs_class, MKDEV(IDETAPE_MAJOR, tape->minor));
	device_destroy(idetape_sysfs_class,
			MKDEV(IDETAPE_MAJOR, tape->minor + 128));
	idetape_devs[tape->minor] = NULL;
	g->private_data = NULL;
	put_disk(g);
	kfree(tape);
}

#ifdef CONFIG_IDE_PROC_FS
static int proc_idetape_read_name
	(char *page, char **start, off_t off, int count, int *eof, void *data)
{
	ide_drive_t	*drive = (ide_drive_t *) data;
	idetape_tape_t	*tape = drive->driver_data;
	char		*out = page;
	int		len;

	len = sprintf(out, "%s\n", tape->name);
	PROC_IDE_READ_RETURN(page, start, off, count, eof, len);
}

static ide_proc_entry_t idetape_proc[] = {
	{ "capacity",	S_IFREG|S_IRUGO,	proc_ide_read_capacity, NULL },
	{ "name",	S_IFREG|S_IRUGO,	proc_idetape_read_name,	NULL },
	{ NULL, 0, NULL, NULL }
};

static ide_proc_entry_t *ide_tape_proc_entries(ide_drive_t *drive)
{
	return idetape_proc;
}

static const struct ide_proc_devset *ide_tape_proc_devsets(ide_drive_t *drive)
{
	return idetape_settings;
}
#endif

static int ide_tape_probe(ide_drive_t *);

static ide_driver_t idetape_driver = {
	.gen_driver = {
		.owner		= THIS_MODULE,
		.name		= "ide-tape",
		.bus		= &ide_bus_type,
	},
	.probe			= ide_tape_probe,
	.remove			= ide_tape_remove,
	.version		= IDETAPE_VERSION,
	.do_request		= idetape_do_request,
	.end_request		= idetape_end_request,
	.error			= __ide_error,
#ifdef CONFIG_IDE_PROC_FS
	.proc_entries		= ide_tape_proc_entries,
	.proc_devsets		= ide_tape_proc_devsets,
#endif
};

/* Our character device supporting functions, passed to register_chrdev. */
static const struct file_operations idetape_fops = {
	.owner		= THIS_MODULE,
	.read		= idetape_chrdev_read,
	.write		= idetape_chrdev_write,
	.ioctl		= idetape_chrdev_ioctl,
	.open		= idetape_chrdev_open,
	.release	= idetape_chrdev_release,
};

static int idetape_open(struct block_device *bdev, fmode_t mode)
{
	struct ide_tape_obj *tape = ide_tape_get(bdev->bd_disk);

	if (!tape)
		return -ENXIO;

	return 0;
}

static int idetape_release(struct gendisk *disk, fmode_t mode)
{
	struct ide_tape_obj *tape = ide_drv_g(disk, ide_tape_obj);

	ide_tape_put(tape);
	return 0;
}

static int idetape_ioctl(struct block_device *bdev, fmode_t mode,
			unsigned int cmd, unsigned long arg)
{
	struct ide_tape_obj *tape = ide_drv_g(bdev->bd_disk, ide_tape_obj);
	ide_drive_t *drive = tape->drive;
	int err = generic_ide_ioctl(drive, bdev, cmd, arg);
	if (err == -EINVAL)
		err = idetape_blkdev_ioctl(drive, cmd, arg);
	return err;
}

static struct block_device_operations idetape_block_ops = {
	.owner		= THIS_MODULE,
	.open		= idetape_open,
	.release	= idetape_release,
	.locked_ioctl	= idetape_ioctl,
};

static int ide_tape_probe(ide_drive_t *drive)
{
	idetape_tape_t *tape;
	struct gendisk *g;
	int minor;

	if (!strstr("ide-tape", drive->driver_req))
		goto failed;

	if (drive->media != ide_tape)
		goto failed;

	if ((drive->dev_flags & IDE_DFLAG_ID_READ) &&
	    ide_check_atapi_device(drive, DRV_NAME) == 0) {
		printk(KERN_ERR "ide-tape: %s: not supported by this version of"
				" the driver\n", drive->name);
		goto failed;
	}
	tape = kzalloc(sizeof(idetape_tape_t), GFP_KERNEL);
	if (tape == NULL) {
		printk(KERN_ERR "ide-tape: %s: Can't allocate a tape struct\n",
				drive->name);
		goto failed;
	}

	g = alloc_disk(1 << PARTN_BITS);
	if (!g)
		goto out_free_tape;

	ide_init_disk(g, drive);

	kref_init(&tape->kref);

	tape->drive = drive;
	tape->driver = &idetape_driver;
	tape->disk = g;

	g->private_data = &tape->driver;

	drive->driver_data = tape;

	mutex_lock(&idetape_ref_mutex);
	for (minor = 0; idetape_devs[minor]; minor++)
		;
	idetape_devs[minor] = tape;
	mutex_unlock(&idetape_ref_mutex);

	idetape_setup(drive, tape, minor);

	device_create(idetape_sysfs_class, &drive->gendev,
		      MKDEV(IDETAPE_MAJOR, minor), NULL, "%s", tape->name);
	device_create(idetape_sysfs_class, &drive->gendev,
		      MKDEV(IDETAPE_MAJOR, minor + 128), NULL,
		      "n%s", tape->name);

	g->fops = &idetape_block_ops;
	ide_register_region(g);

	return 0;

out_free_tape:
	kfree(tape);
failed:
	return -ENODEV;
}

static void __exit idetape_exit(void)
{
	driver_unregister(&idetape_driver.gen_driver);
	class_destroy(idetape_sysfs_class);
	unregister_chrdev(IDETAPE_MAJOR, "ht");
}

static int __init idetape_init(void)
{
	int error = 1;
	idetape_sysfs_class = class_create(THIS_MODULE, "ide_tape");
	if (IS_ERR(idetape_sysfs_class)) {
		idetape_sysfs_class = NULL;
		printk(KERN_ERR "Unable to create sysfs class for ide tapes\n");
		error = -EBUSY;
		goto out;
	}

	if (register_chrdev(IDETAPE_MAJOR, "ht", &idetape_fops)) {
		printk(KERN_ERR "ide-tape: Failed to register chrdev"
				" interface\n");
		error = -EBUSY;
		goto out_free_class;
	}

	error = driver_register(&idetape_driver.gen_driver);
	if (error)
		goto out_free_driver;

	return 0;

out_free_driver:
	driver_unregister(&idetape_driver.gen_driver);
out_free_class:
	class_destroy(idetape_sysfs_class);
out:
	return error;
}

MODULE_ALIAS("ide:*m-tape*");
module_init(idetape_init);
module_exit(idetape_exit);
MODULE_ALIAS_CHARDEV_MAJOR(IDETAPE_MAJOR);
MODULE_DESCRIPTION("ATAPI Streaming TAPE Driver");
MODULE_LICENSE("GPL");
\0

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-03 15:09   ` James Bottomley
  2008-12-06  6:12     ` Pete Zaitcev
@ 2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
  2008-12-06 15:06       ` Alan Cox
                         ` (2 more replies)
  1 sibling, 3 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 14:51 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christoph Hellwig, FUJITA Tomonori, linux-ide, linux-scsi, osst

On Wednesday 03 December 2008, James Bottomley wrote:
> On Wed, 2008-12-03 at 05:06 -0500, Christoph Hellwig wrote:
> > On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
> > > This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
> > > 
> > > It's against linux-next (which seems to has some changes to ide-scsi
> > > for 2.6.29 from the ide tree). 
> > 
> > Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
> > the osst driver?
> 
> Depends.  If you're still using drivers/ide then yes, it is.  With
> libata (which is what most modern distros use), osst just works as an
> ATAPI transport.
> 
> git log tells me ide-scsi has been updated quite a bit recently, but it
> mostly looks to be fallout around the drivers/ide churn.  Can we get ide
> maintainer's buy in for this (I think they've been maintaining ide-tape
> and ide-cd in preference to ide-scsi)?

Certainly, native ide-{cd,gd,tape} drivers are far superior over ide-scsi
and are actively maintained.

ide-scsi has been practically a dead code for a long time now (IIRC it has
even been broken -- by some general kernel changes -- for few releases and
nobody noticed, till Boaz discovered it while doing unrelated SCSI fixes)
but I kept it on the live support anyway.  However there needs to be some
limit to it, especially given that driver has been officially orphaned for
over a year now and nobody stepped in...

I applied the scheduled removal patch to pata-2.6 tree, thanks Tomo!

Thanks,
Bart

PS If somebody wants to work on OSST support for IDE we can provide an
assistance into porting osst.c over generic ATAPI code -- it would still
be much less hassle than trying to figure out remaining ide-scsi issues
(lifetime rules for IDE / SCSI / IDE-SCSI objects, error handling etc.).

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
@ 2008-12-06 15:06       ` Alan Cox
  2008-12-06 16:29         ` Bartlomiej Zolnierkiewicz
  2008-12-06 15:25       ` Willem Riede
  2008-12-06 17:00       ` Dan Noé
  2 siblings, 1 reply; 65+ messages in thread
From: Alan Cox @ 2008-12-06 15:06 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> ide-scsi has been practically a dead code for a long time now (IIRC it has
> even been broken -- by some general kernel changes -- for few releases and
> nobody noticed, till Boaz discovered it while doing unrelated SCSI fixes)
> but I kept it on the live support anyway.  However there needs to be some
> limit to it, especially given that driver has been officially orphaned for
> over a year now and nobody stepped in...

And it basically still works. You've just removed some hardware support
for folks still trapped with the old IDE legacy drivers and some tape
drives etc.

I shall certainly be asking Linus not to apply your changeset.

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
  2008-12-06 15:06       ` Alan Cox
@ 2008-12-06 15:25       ` Willem Riede
  2008-12-06 15:59         ` Bartlomiej Zolnierkiewicz
  2008-12-06 17:00       ` Dan Noé
  2 siblings, 1 reply; 65+ messages in thread
From: Willem Riede @ 2008-12-06 15:25 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi

On Sat, Dec 6, 2008 at 7:51 AM, Bartlomiej Zolnierkiewicz
<bzolnier@gmail.com> wrote:
> On Wednesday 03 December 2008, James Bottomley wrote:
>> On Wed, 2008-12-03 at 05:06 -0500, Christoph Hellwig wrote:
>> > On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
>> > > This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
>> > >
>> > > It's against linux-next (which seems to has some changes to ide-scsi
>> > > for 2.6.29 from the ide tree).
>> >
>> > Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
>> > the osst driver?
>>
>> Depends.  If you're still using drivers/ide then yes, it is.  With
>> libata (which is what most modern distros use), osst just works as an
>> ATAPI transport.
>>
>> git log tells me ide-scsi has been updated quite a bit recently, but it
>> mostly looks to be fallout around the drivers/ide churn.  Can we get ide
>> maintainer's buy in for this (I think they've been maintaining ide-tape
>> and ide-cd in preference to ide-scsi)?
>
> Certainly, native ide-{cd,gd,tape} drivers are far superior over ide-scsi
> and are actively maintained.
>
> ide-scsi has been practically a dead code for a long time now (IIRC it has
> even been broken -- by some general kernel changes -- for few releases and
> nobody noticed, till Boaz discovered it while doing unrelated SCSI fixes)
> but I kept it on the live support anyway.  However there needs to be some
> limit to it, especially given that driver has been officially orphaned for
> over a year now and nobody stepped in...
>
> I applied the scheduled removal patch to pata-2.6 tree, thanks Tomo!
>
> Thanks,
> Bart
>
> PS If somebody wants to work on OSST support for IDE we can provide an
> assistance into porting osst.c over generic ATAPI code -- it would still
> be much less hassle than trying to figure out remaining ide-scsi issues
> (lifetime rules for IDE / SCSI / IDE-SCSI objects, error handling etc.).
>
Don't know exactly what you mean with "porting osst.c over generic ATAPI code",
but osst is SCSI by design - a SCSI version of the drive exists.
Making osst ATAPI
is therefore not appropriate. If osst works with libata as earlier suggested,
then that would be the best solution.

Unfortunately, my IDE test drive died, OnStream went out of business some time
ago, so I can't get a new one, hence I (osst maintainer) can't test this :-(

Regards, Willem Riede.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 15:25       ` Willem Riede
@ 2008-12-06 15:59         ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 15:59 UTC (permalink / raw)
  To: Willem Riede
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi

On Saturday 06 December 2008, Willem Riede wrote:
> On Sat, Dec 6, 2008 at 7:51 AM, Bartlomiej Zolnierkiewicz
> <bzolnier@gmail.com> wrote:
> > On Wednesday 03 December 2008, James Bottomley wrote:
> >> On Wed, 2008-12-03 at 05:06 -0500, Christoph Hellwig wrote:
> >> > On Wed, Dec 03, 2008 at 10:38:54AM +0900, FUJITA Tomonori wrote:
> >> > > This is for 2.6.29 (not 2.6.28) as feature-removal-schedule.txt says.
> >> > >
> >> > > It's against linux-next (which seems to has some changes to ide-scsi
> >> > > for 2.6.29 from the ide tree).
> >> >
> >> > Isn't ide-scsi the only way to use ATAPI OnStream tapes supported by
> >> > the osst driver?
> >>
> >> Depends.  If you're still using drivers/ide then yes, it is.  With
> >> libata (which is what most modern distros use), osst just works as an
> >> ATAPI transport.
> >>
> >> git log tells me ide-scsi has been updated quite a bit recently, but it
> >> mostly looks to be fallout around the drivers/ide churn.  Can we get ide
> >> maintainer's buy in for this (I think they've been maintaining ide-tape
> >> and ide-cd in preference to ide-scsi)?
> >
> > Certainly, native ide-{cd,gd,tape} drivers are far superior over ide-scsi
> > and are actively maintained.
> >
> > ide-scsi has been practically a dead code for a long time now (IIRC it has
> > even been broken -- by some general kernel changes -- for few releases and
> > nobody noticed, till Boaz discovered it while doing unrelated SCSI fixes)
> > but I kept it on the live support anyway.  However there needs to be some
> > limit to it, especially given that driver has been officially orphaned for
> > over a year now and nobody stepped in...
> >
> > I applied the scheduled removal patch to pata-2.6 tree, thanks Tomo!
> >
> > Thanks,
> > Bart
> >
> > PS If somebody wants to work on OSST support for IDE we can provide an
> > assistance into porting osst.c over generic ATAPI code -- it would still
> > be much less hassle than trying to figure out remaining ide-scsi issues
> > (lifetime rules for IDE / SCSI / IDE-SCSI objects, error handling etc.).
> >
> Don't know exactly what you mean with "porting osst.c over generic ATAPI code",
> but osst is SCSI by design - a SCSI version of the drive exists.
> Making osst ATAPI

I mean porting it over generic IDE ATAPI code (ide-atapi.c) in case somebody
wants to use it with drivers/ide instead of libata.

> is therefore not appropriate. If osst works with libata as earlier suggested,
> then that would be the best solution.

It should work fine I just doubt that anybody has ever tested it because of
lack of osst users in general.

> Unfortunately, my IDE test drive died, OnStream went out of business some time
> ago, so I can't get a new one, hence I (osst maintainer) can't test this :-(

:-( indeed.

Thanks,
Bart

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 15:06       ` Alan Cox
@ 2008-12-06 16:29         ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 16:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Saturday 06 December 2008, Alan Cox wrote:
> > ide-scsi has been practically a dead code for a long time now (IIRC it has
> > even been broken -- by some general kernel changes -- for few releases and
> > nobody noticed, till Boaz discovered it while doing unrelated SCSI fixes)
> > but I kept it on the live support anyway.  However there needs to be some
> > limit to it, especially given that driver has been officially orphaned for
> > over a year now and nobody stepped in...
> 
> And it basically still works. You've just removed some hardware support
> for folks still trapped with the old IDE legacy drivers and some tape
> drives etc.

Please note that I already offered to put my time into helping people
potentially affected by the change.

> I shall certainly be asking Linus not to apply your changeset.

If you actually cared about such folks (whose must be really tough on
their luck since both IDE/ide-tape and SCSI/libata/st+osst should cover
99.9% of tape drives) you would be proposing constructive solutions
(i.e. offering better libata PATA support) instead of threats.

Thanks,
Bart

PS Looking at the patch history since 2002 (I'm lazy I didn't look past
tglx's tree) it was mostly Willem, James or me who paid the cost of keeping
ide-scsi and were targetted with insane bugreports.  Your actual ide-scsi
contributions in the meantime are near non-existing and I think that it
could account into the weight of your voice...

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
  2008-12-06 15:06       ` Alan Cox
  2008-12-06 15:25       ` Willem Riede
@ 2008-12-06 17:00       ` Dan Noé
  2008-12-06 21:41         ` Bartlomiej Zolnierkiewicz
  2 siblings, 1 reply; 65+ messages in thread
From: Dan Noé @ 2008-12-06 17:00 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sat, 6 Dec 2008 15:51:08 +0100
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:

> PS If somebody wants to work on OSST support for IDE we can provide an
> assistance into porting osst.c over generic ATAPI code -- it would
> still be much less hassle than trying to figure out remaining
> ide-scsi issues (lifetime rules for IDE / SCSI / IDE-SCSI objects,
> error handling etc.). --

I wouldn't mind doing this - I was a user of an IDE OnStream for many
years but alas my drive died years ago and OnStream is out of
business.  If someone had a drive to offer maybe it could be done.

Cheers,
Dan

-- 
                    /--------------- - -  -  -   -   -
                   |  Dan Noé
                   |  http://isomerica.net/~dpn/

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 17:00       ` Dan Noé
@ 2008-12-06 21:41         ` Bartlomiej Zolnierkiewicz
  2008-12-06 22:24           ` Alan Cox
  2008-12-06 22:33           ` Al Viro
  0 siblings, 2 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 21:41 UTC (permalink / raw)
  To: Dan Noé
  Cc: James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Saturday 06 December 2008, Dan Noé wrote:
> On Sat, 6 Dec 2008 15:51:08 +0100
> Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:
> 
> > PS If somebody wants to work on OSST support for IDE we can provide an
> > assistance into porting osst.c over generic ATAPI code -- it would
> > still be much less hassle than trying to figure out remaining
> > ide-scsi issues (lifetime rules for IDE / SCSI / IDE-SCSI objects,
> > error handling etc.). --
> 
> I wouldn't mind doing this - I was a user of an IDE OnStream for many
> years but alas my drive died years ago and OnStream is out of
> business.  If someone had a drive to offer maybe it could be done.

It would be great but it seems like IDE OnStream drives are a real rarity
nowadays (unfortunately I won't be of a much help here, don't have any)...

I think that for the time being it is best to just proceed with the removal
and see if there are any users needing the driver (+ we should probably try
SCSI/libata/osst path first).

Thanks,
Bart
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 21:41         ` Bartlomiej Zolnierkiewicz
@ 2008-12-06 22:24           ` Alan Cox
  2008-12-06 22:52             ` Sergei Shtylyov
  2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
  2008-12-06 22:33           ` Al Viro
  1 sibling, 2 replies; 65+ messages in thread
From: Alan Cox @ 2008-12-06 22:24 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> I think that for the time being it is best to just proceed with the removal
> and see if there are any users needing the driver (+ we should probably try
> SCSI/libata/osst path first).

Far better to just leave it there. It generally works for users so all
you are doing is creating a regression with no possible benefit (other
than encouraging people to move to libata so we can obsolete all of
drivers/ide - which is what we really need to do and move the last few
users over)

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 21:41         ` Bartlomiej Zolnierkiewicz
  2008-12-06 22:24           ` Alan Cox
@ 2008-12-06 22:33           ` Al Viro
  2008-12-06 23:13             ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:17             ` Willem Riede
  1 sibling, 2 replies; 65+ messages in thread
From: Al Viro @ 2008-12-06 22:33 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Dan No??,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sat, Dec 06, 2008 at 10:41:34PM +0100, Bartlomiej Zolnierkiewicz wrote:

> It would be great but it seems like IDE OnStream drives are a real rarity
> nowadays (unfortunately I won't be of a much help here, don't have any)...

I have one, actually.  If you want it sent your way...

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:24           ` Alan Cox
@ 2008-12-06 22:52             ` Sergei Shtylyov
  2008-12-06 23:02               ` Alan Cox
  2008-12-06 23:28               ` Jeff Garzik
  2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
  1 sibling, 2 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-06 22:52 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:
>> I think that for the time being it is best to just proceed with the removal
>> and see if there are any users needing the driver (+ we should probably try
>> SCSI/libata/osst path first).
>>     
>
> Far better to just leave it there. It generally works for users so all
> you are doing is creating a regression with no possible benefit (other
> than encouraging people to move to libata so we can obsolete all of
> drivers/ide - which is what we really need to do and move the last few
> users over)
>   

   Oh, yes. SCSI emulation is just what Linux embedded world is asking 
for...

> Alan

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:52             ` Sergei Shtylyov
@ 2008-12-06 23:02               ` Alan Cox
  2008-12-06 23:19                 ` Sergei Shtylyov
                                   ` (3 more replies)
  2008-12-06 23:28               ` Jeff Garzik
  1 sibling, 4 replies; 65+ messages in thread
From: Alan Cox @ 2008-12-06 23:02 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
> for...

Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).

I'm actually seeing two strands of requests (including from embedded)

- CF only small "dumb as president" type driver that is written to be as
compact as possible and preferably considers IRQs as optional
- Full SATA and NCQ aware platform support.

although the former is growing quieter and it seems the CF formfactor is
just too clunky for embedded nowdays, especially with the horrendously
complex and pricy connector - and is being eliminated by MMC/SD and
friends.

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:33           ` Al Viro
@ 2008-12-06 23:13             ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:17             ` Willem Riede
  1 sibling, 0 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 23:13 UTC (permalink / raw)
  To: Al Viro
  Cc: Dan No??,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Saturday 06 December 2008, Al Viro wrote:
> On Sat, Dec 06, 2008 at 10:41:34PM +0100, Bartlomiej Zolnierkiewicz wrote:
> 
> > It would be great but it seems like IDE OnStream drives are a real rarity
> > nowadays (unfortunately I won't be of a much help here, don't have any)...
> 
> I have one, actually.  If you want it sent your way...

I think it would be much better to send it directly into Willem's
(osst.c maintainer) or Dan's way since they volunteered to look into
the issue and have experience with osst.c...

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:33           ` Al Viro
  2008-12-06 23:13             ` Bartlomiej Zolnierkiewicz
@ 2008-12-06 23:17             ` Willem Riede
  2008-12-07  0:09               ` Al Viro
  1 sibling, 1 reply; 65+ messages in thread
From: Willem Riede @ 2008-12-06 23:17 UTC (permalink / raw)
  To: Al Viro
  Cc: Bartlomiej Zolnierkiewicz, Dan No??,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi

On Sat, Dec 6, 2008 at 3:33 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sat, Dec 06, 2008 at 10:41:34PM +0100, Bartlomiej Zolnierkiewicz wrote:
>
>> It would be great but it seems like IDE OnStream drives are a real rarity
>> nowadays (unfortunately I won't be of a much help here, don't have any)...
>
> I have one, actually.  If you want it sent your way...
>
If you would be willing to send it to me to replace the now defective drive I
used to develop and maintain osst with, I'll test against libata.
Contact me directly, if you would, to discuss arrangements.

Thanks, Willem Riede.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:02               ` Alan Cox
@ 2008-12-06 23:19                 ` Sergei Shtylyov
  2008-12-06 23:32                   ` Alan Cox
  2008-12-07 15:04                   ` James Bottomley
  2008-12-07  0:19                 ` [PATCH] remove ide-scsi Sergei Shtylyov
                                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-06 23:19 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:

>>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>> for...
>>     
>
> Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).
>   

   ATAPI is SCSI transport (with maybe some quirks at SCSI command level 
tho, IIRC). ATA is neither thansport nor does it map to SCSI 1:1. The 
code for emulating SCSI on ATA only burdens the kernel (and causes user 
complaints about changing disk names from /dev/hdx to /dev/sda :-).

> I'm actually seeing two strands of requests (including from embedded)
>
> - CF only small "dumb as president" type driver that is written to be as
> compact as possible and preferably considers IRQs as optional
>   

   Yes, support for IRQ-less CF is what IDE core lacks. I must note 
however that IRQ-less mode looks inherently risky to me because of the 
raciness of tbothe the ATA spec and its implimentatin WRT the "interrupt 
pending" state. I may be mistaking but someone of T13 experts (Hale 
Landis I guess) told me that the devices require that state to be 
cleared to proceed with the command, and that's what the fast polling 
host is likely to fail at because it doesn't know if rthe device has 
actually entered this state when BSY is cleared... Oh well, that's an 
old story...

> - Full SATA and NCQ aware platform support.
>   

   I don't see how SATA/NCQ support is connected to SCSI.

> although the former is growing quieter and it seems the CF formfactor is
>   

   We're seeing the IRQ-less CF driver submitted for the Octeon SoC 
(which employs up to 16 MIPS cores) -- though it's probably needed only 
for the development boards... :-)

> just too clunky for embedded nowdays, especially with the horrendously
> complex and pricy connector - and is being eliminated by MMC/SD and
> friends.
>   

  Oh, don't tell me about that MMC crap. :-)
  For some reason they keep wring the card insert and write protect 
signals via GPIO, not directly into controller (and it's yet good if 
directly into SoC's own GPIO, not an expander) -- which requires the 
drivers to call the platform code hooks. :-(

> Alan
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:52             ` Sergei Shtylyov
  2008-12-06 23:02               ` Alan Cox
@ 2008-12-06 23:28               ` Jeff Garzik
  2008-12-06 23:42                 ` Sergei Shtylyov
  2008-12-06 23:45                 ` Bartlomiej Zolnierkiewicz
  1 sibling, 2 replies; 65+ messages in thread
From: Jeff Garzik @ 2008-12-06 23:28 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Sergei Shtylyov wrote:
> Hello.
> 
> Alan Cox wrote:
>>> I think that for the time being it is best to just proceed with the 
>>> removal
>>> and see if there are any users needing the driver (+ we should 
>>> probably try
>>> SCSI/libata/osst path first).
>>>     
>>
>> Far better to just leave it there. It generally works for users so all
>> you are doing is creating a regression with no possible benefit (other
>> than encouraging people to move to libata so we can obsolete all of
>> drivers/ide - which is what we really need to do and move the last few
>> users over)
>>   
> 
>   Oh, yes. SCSI emulation is just what Linux embedded world is asking 
> for...

The goal is to make SCSI emulation optional for ATA devices, by creating 
a libata ATA block device driven by the libata driver framework.

	Jeff





^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:19                 ` Sergei Shtylyov
@ 2008-12-06 23:32                   ` Alan Cox
  2008-12-07  0:08                     ` Sergei Shtylyov
  2008-12-07 15:04                   ` James Bottomley
  1 sibling, 1 reply; 65+ messages in thread
From: Alan Cox @ 2008-12-06 23:32 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> > - Full SATA and NCQ aware platform support.
> >   
>    I don't see how SATA/NCQ support is connected to SCSI.

Because the chunks of scsi midlayer we inherit (actually nowdays mostly
block) are the pieces you need anyway to do multiple command queues,
error recovery from multiple pending commands, barriers and all the other
nasty sequencing and recovery stuff.

> directly into SoC's own GPIO, not an expander) -- which requires the 
> drivers to call the platform code hooks. :-(

Embedded system design in my experience primarily consists of shooting
yourself and the programmer in the foot simultaenously with automatic
weapons while attempting to save 2 cents/unit

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 22:24           ` Alan Cox
  2008-12-06 22:52             ` Sergei Shtylyov
@ 2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:51               ` Alan Cox
  2008-12-06 23:51               ` Jeff Garzik
  1 sibling, 2 replies; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 23:40 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Saturday 06 December 2008, Alan Cox wrote:
> > I think that for the time being it is best to just proceed with the removal
> > and see if there are any users needing the driver (+ we should probably try
> > SCSI/libata/osst path first).
> 
> Far better to just leave it there. It generally works for users so all

Unfortunately "generally" here means that once there is some more
advanced driver functionality needed (i.e. error handling) it fails
in the major way...  Thus leaving it there is confusing for users
which are better using either IDE native drivers or libata.  It is
also a waste of time on part of both IDE and SCSI people.

Still, we can certainly leave ide-scsi there if you or somebody else
want to maintain it (which didn't happen for the last year).  It would
be best to start with fixing years long issues with error handling and
taking the work on updating driver for IDE changes off my shoulder...

> you are doing is creating a regression with no possible benefit (other
> than encouraging people to move to libata so we can obsolete all of
> drivers/ide - which is what we really need to do and move the last few
> users over)

Please don't put my in the same box as I see no value added by libata
PATA to vast majority of IDE users.  However I certainly commend that you
want to take care of the last few OnStream users...

Thanks,
Bart

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:28               ` Jeff Garzik
@ 2008-12-06 23:42                 ` Sergei Shtylyov
  2008-12-06 23:48                   ` Jeff Garzik
  2008-12-06 23:45                 ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-06 23:42 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Jeff Garzik wrote:

>>>> I think that for the time being it is best to just proceed with the 
>>>> removal
>>>> and see if there are any users needing the driver (+ we should 
>>>> probably try
>>>> SCSI/libata/osst path first).
>>>>     
>>>
>>> Far better to just leave it there. It generally works for users so all
>>> you are doing is creating a regression with no possible benefit (other
>>> than encouraging people to move to libata so we can obsolete all of
>>> drivers/ide - which is what we really need to do and move the last few
>>> users over)
>>>   
>>
>>   Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>> for...
>
> The goal is to make SCSI emulation optional for ATA devices, by 
> creating a libata ATA block device driven by the libata driver framework.

   Please remind me for how many years this remains a goal?

>     Jeff 

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:28               ` Jeff Garzik
  2008-12-06 23:42                 ` Sergei Shtylyov
@ 2008-12-06 23:45                 ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:50                   ` Jeff Garzik
  1 sibling, 1 reply; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-06 23:45 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Sergei Shtylyov, Alan Cox, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sunday 07 December 2008, Jeff Garzik wrote:
> Sergei Shtylyov wrote:
> > Hello.
> > 
> > Alan Cox wrote:
> >>> I think that for the time being it is best to just proceed with the 
> >>> removal
> >>> and see if there are any users needing the driver (+ we should 
> >>> probably try
> >>> SCSI/libata/osst path first).
> >>>     
> >>
> >> Far better to just leave it there. It generally works for users so all
> >> you are doing is creating a regression with no possible benefit (other
> >> than encouraging people to move to libata so we can obsolete all of
> >> drivers/ide - which is what we really need to do and move the last few
> >> users over)
> >>   
> > 
> >   Oh, yes. SCSI emulation is just what Linux embedded world is asking 
> > for...
> 
> The goal is to make SCSI emulation optional for ATA devices, by creating 
> a libata ATA block device driven by the libata driver framework.

Given five years this is too-less-too-late by my book.  Especially given
that you are still talking about the "goal" not the actual "code"...

Anyway this becomes offtopic to ide-scsi removal.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:42                 ` Sergei Shtylyov
@ 2008-12-06 23:48                   ` Jeff Garzik
  2008-12-07  3:36                     ` Yinghai Lu
  0 siblings, 1 reply; 65+ messages in thread
From: Jeff Garzik @ 2008-12-06 23:48 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Sergei Shtylyov wrote:
> Hello.
> 
> Jeff Garzik wrote:
> 
>>>>> I think that for the time being it is best to just proceed with the 
>>>>> removal
>>>>> and see if there are any users needing the driver (+ we should 
>>>>> probably try
>>>>> SCSI/libata/osst path first).
>>>>>     
>>>>
>>>> Far better to just leave it there. It generally works for users so all
>>>> you are doing is creating a regression with no possible benefit (other
>>>> than encouraging people to move to libata so we can obsolete all of
>>>> drivers/ide - which is what we really need to do and move the last few
>>>> users over)
>>>>   
>>>
>>>   Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>>> for...
>>
>> The goal is to make SCSI emulation optional for ATA devices, by 
>> creating a libata ATA block device driven by the libata driver framework.
> 
>   Please remind me for how many years this remains a goal?

A long time, yes :)   But clear progress is being made in that 
direction, too.

A key reason why, years ago, libata used SCSI as a framework was its 
utility as a generic driver framework.

Therefore, it is an obvious requirement that the non-SCSI "driver 
framework" code that libata uses must be reimplemented -- in block layer 
or libata -- in order to export an ATA disk as a pure block device.

The latest step towards that goal occurred quite recently:  block layer 
timeouts.

	Jeff



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:45                 ` Bartlomiej Zolnierkiewicz
@ 2008-12-06 23:50                   ` Jeff Garzik
  0 siblings, 0 replies; 65+ messages in thread
From: Jeff Garzik @ 2008-12-06 23:50 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Sergei Shtylyov, Alan Cox, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Bartlomiej Zolnierkiewicz wrote:
> Given five years this is too-less-too-late by my book.  Especially given
> that you are still talking about the "goal" not the actual "code"...

As noted in other reply, libata has been actively moving in that 
direction, by intentionally isolating the SCSI pieces from the rest of 
the code, and by working to get the prerequisites upstream (e.g. block 
layer timeouts).

	Jeff



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
@ 2008-12-06 23:51               ` Alan Cox
  2008-12-07  0:56                 ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:51               ` Jeff Garzik
  1 sibling, 1 reply; 65+ messages in thread
From: Alan Cox @ 2008-12-06 23:51 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> Unfortunately "generally" here means that once there is some more
> advanced driver functionality needed (i.e. error handling) it fails
> in the major way...  Thus leaving it there is confusing for users

Whereas removing it ensures it doesn't work in the first place, which is
a regression and pointless.

> Still, we can certainly leave ide-scsi there if you or somebody else
> want to maintain it (which didn't happen for the last year).  It would
> be best to start with fixing years long issues with error handling and
> taking the work on updating driver for IDE changes off my shoulder...

You could just stop turning the old IDE driver into an experimental
playground. The problems are only arising because old IDE is being
continually changed (frequently in directly the reverse direction it came
over the years)

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
  2008-12-06 23:51               ` Alan Cox
@ 2008-12-06 23:51               ` Jeff Garzik
  1 sibling, 0 replies; 65+ messages in thread
From: Jeff Garzik @ 2008-12-06 23:51 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Alan Cox, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Bartlomiej Zolnierkiewicz wrote:
> Please don't put my in the same box as I see no value added by libata
> PATA to vast majority of IDE users.  However I certainly commend that you
> want to take care of the last few OnStream users...

If you have SATA as well as PATA, there is obvious benefit...

	Jeff



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:32                   ` Alan Cox
@ 2008-12-07  0:08                     ` Sergei Shtylyov
  2008-12-07 11:40                       ` Alan Cox
  0 siblings, 1 reply; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07  0:08 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:

>>> - Full SATA and NCQ aware platform support.
>>>   
>>>       
>>    I don't see how SATA/NCQ support is connected to SCSI.
>>     
>
> Because the chunks of scsi midlayer we inherit (actually nowdays mostly
> block) are the pieces you need anyway to do multiple command queues,
>   

   Er, what does this term mean? Several queues per device or a tagged 
queue?

> error recovery from multiple pending commands, barriers and all the other
> nasty sequencing and recovery stuff.
>   

   So, you're just presenting SCSI emulation as a "lesser evil". But 5 
years seems a long enough term to unbind all that stuff from SCSI.

>> directly into SoC's own GPIO, not an expander) -- which requires the 
>> drivers to call the platform code hooks. :-(
>>     
>
> Embedded system design in my experience primarily consists of shooting
> yourself and the programmer in the foot simultaenously with automatic
> weapons while attempting to save 2 cents/unit
>   

   Some h/w designers and programmers do indeed deserve to be shot. :-D

> Alan
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:17             ` Willem Riede
@ 2008-12-07  0:09               ` Al Viro
  0 siblings, 0 replies; 65+ messages in thread
From: Al Viro @ 2008-12-07  0:09 UTC (permalink / raw)
  To: Willem Riede
  Cc: Bartlomiej Zolnierkiewicz, Dan No??,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi

On Sat, Dec 06, 2008 at 04:17:55PM -0700, Willem Riede wrote:
> On Sat, Dec 6, 2008 at 3:33 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Sat, Dec 06, 2008 at 10:41:34PM +0100, Bartlomiej Zolnierkiewicz wrote:
> >
> >> It would be great but it seems like IDE OnStream drives are a real rarity
> >> nowadays (unfortunately I won't be of a much help here, don't have any)...
> >
> > I have one, actually.  If you want it sent your way...
> >
> If you would be willing to send it to me to replace the now defective drive I
> used to develop and maintain osst with, I'll test against libata.
> Contact me directly, if you would, to discuss arrangements.
> 
> Thanks, Willem Riede.

Argh...  Having tested them...  SC30 (SCSI one) is alive, DI30 is *not* ;-/
Sorry.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:02               ` Alan Cox
  2008-12-06 23:19                 ` Sergei Shtylyov
@ 2008-12-07  0:19                 ` Sergei Shtylyov
  2008-12-07  9:59                   ` Sergei Shtylyov
  2008-12-07 10:41                 ` Sergei Shtylyov
  2008-12-09 21:41                 ` Matthew Wilcox
  3 siblings, 1 reply; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07  0:19 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst



Alan Cox wrote:
>>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>> for...
>>     
>
> Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).
>
> I'm actually seeing two strands of requests (including from embedded)
>
> - CF only small "dumb as president" type driver that is written to be as
> compact as possible and preferably considers IRQs as optional
>   

   IDE core seems much closer to the "dumb" driver (not that it's 
actually that dumb :-) at this point. The only thing it lacks for "being 
dumb enough" is the polled mode support... The idea of creating yet 
another driver framework just for dumb CF doesn't appeal to me. And 
liabat iscertainly an overkill. If I only had time to look into adding 
the polled mode support now... and a project needing that (well, Octeon 
did but I wasn't involved and it had a standalone driver at that time)...
   Actually, embedded world is much more diverse in its IDE 
implementations/requirements (you can still encounter a full fledged 
UltraDMA/133 PATA contoller embedded within a modern SoC), even to the 
point of complete perversions. I know of a flash device which claims to 
be IDE compatible (and is indeed registrer level compatible) but only 
supports certain standard commands, but implements internal paritioning 
scheme with reads/writes done by vendor specific commands (not even 
using the standard IDE command execution protocol, IIRC :-).

> Alan
>   

WBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:51               ` Alan Cox
@ 2008-12-07  0:56                 ` Bartlomiej Zolnierkiewicz
  2008-12-07  1:14                   ` Alan Cox
  0 siblings, 1 reply; 65+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-12-07  0:56 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sunday 07 December 2008, Alan Cox wrote:

> > Still, we can certainly leave ide-scsi there if you or somebody else
> > want to maintain it (which didn't happen for the last year).  It would
> > be best to start with fixing years long issues with error handling and
> > taking the work on updating driver for IDE changes off my shoulder...
> 
> You could just stop turning the old IDE driver into an experimental

I find your mail offensive/unfair and I'm not going to discuss things
at such level if you're going to continue in this style.

I also regretfully assume that the answer to my question is negative...

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  0:56                 ` Bartlomiej Zolnierkiewicz
@ 2008-12-07  1:14                   ` Alan Cox
  2008-12-07 10:32                     ` Sergei Shtylyov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Cox @ 2008-12-07  1:14 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> > You could just stop turning the old IDE driver into an experimental
> 
> I find your mail offensive/unfair and I'm not going to discuss things
> at such level if you're going to continue in this style.

I think the number of changesets and the amount of change versus the
amount of new functionality speaks for itself.

Alan


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:48                   ` Jeff Garzik
@ 2008-12-07  3:36                     ` Yinghai Lu
  2008-12-07  4:17                       ` Jeff Garzik
  0 siblings, 1 reply; 65+ messages in thread
From: Yinghai Lu @ 2008-12-07  3:36 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Sergei Shtylyov, Alan Cox, Bartlomiej Zolnierkiewicz,
	Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sat, Dec 6, 2008 at 3:48 PM, Jeff Garzik <jeff@garzik.org> wrote:

>
> A key reason why, years ago, libata used SCSI as a framework was its utility
> as a generic driver framework.
>
> Therefore, it is an obvious requirement that the non-SCSI "driver framework"
> code that libata uses must be reimplemented -- in block layer or libata --
> in order to export an ATA disk as a pure block device.
>
> The latest step towards that goal occurred quite recently:  block layer
> timeouts.

interesting, so will get /dev/ada instead of /dev/sda?

anyone one is building block layer for USB disk? will get /dev/uda?

YH

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  3:36                     ` Yinghai Lu
@ 2008-12-07  4:17                       ` Jeff Garzik
  2008-12-07  5:07                         ` Yinghai Lu
  2008-12-09 19:59                         ` Mark Lord
  0 siblings, 2 replies; 65+ messages in thread
From: Jeff Garzik @ 2008-12-07  4:17 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Sergei Shtylyov, Alan Cox, Bartlomiej Zolnierkiewicz,
	Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Yinghai Lu wrote:
> On Sat, Dec 6, 2008 at 3:48 PM, Jeff Garzik <jeff@garzik.org> wrote:
> 
>> A key reason why, years ago, libata used SCSI as a framework was its utility
>> as a generic driver framework.
>>
>> Therefore, it is an obvious requirement that the non-SCSI "driver framework"
>> code that libata uses must be reimplemented -- in block layer or libata --
>> in order to export an ATA disk as a pure block device.
>>
>> The latest step towards that goal occurred quite recently:  block layer
>> timeouts.
> 
> interesting, so will get /dev/ada instead of /dev/sda?

That is an interesting question.  The easiest thing is to allocate a new 
32-bit block major.  But the more compatible (and more controversial) 
solution is to allocate from the SCSI disk blkmajor space.


> anyone one is building block layer for USB disk? will get /dev/uda?

That already exists -- drivers/block/ub   :)

	Jeff




^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  4:17                       ` Jeff Garzik
@ 2008-12-07  5:07                         ` Yinghai Lu
  2008-12-07 11:00                           ` Sergei Shtylyov
  2008-12-09 19:59                         ` Mark Lord
  1 sibling, 1 reply; 65+ messages in thread
From: Yinghai Lu @ 2008-12-07  5:07 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Sergei Shtylyov, Alan Cox, Bartlomiej Zolnierkiewicz,
	Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sat, Dec 6, 2008 at 8:17 PM, Jeff Garzik <jeff@garzik.org> wrote:
>> anyone one is building block layer for USB disk? will get /dev/uda?
>
> That already exists -- drivers/block/ub   :)

this one ?

config BLK_DEV_UB
        tristate "Low Performance USB Block driver"
        depends on USB
        help
          This driver supports certain USB attached storage devices
          such as flash keys.

          If you enable this driver, it is recommended to avoid conflicts
          with usb-storage by enabling USB_LIBUSUAL.

          If unsure, say N.

I was thinking it should be more faster with less one layer.

YH

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  0:19                 ` [PATCH] remove ide-scsi Sergei Shtylyov
@ 2008-12-07  9:59                   ` Sergei Shtylyov
  0 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07  9:59 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello, I wrote:

>   Actually, embedded world is much more diverse in its IDE 
> implementations/requirements (you can still encounter a full fledged 
> UltraDMA/133 PATA contoller embedded within a modern SoC), even to the 
> point of complete perversions. I know of a flash device which claims 
> to be IDE compatible (and is indeed registrer level compatible) but 
> only supports certain standard commands, but implements internal 
> paritioning scheme with reads/writes done by vendor specific commands 
> (not even using the standard IDE command execution protocol, IIRC :-).

   Er, no. Judging on the patch that I had to review last year, the 
command protocol is also compatible, so it could've been "riding" on the 
IDE core, just like ide-disk.c (probably libata as well). Instead it's 
driven by a huge out of tree driver (size of a mammoth) -- not sure 
why... :-)
>> Alan 

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  1:14                   ` Alan Cox
@ 2008-12-07 10:32                     ` Sergei Shtylyov
  0 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07 10:32 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:

>>> You could just stop turning the old IDE driver into an experimental
>>>       
>> I find your mail offensive/unfair and I'm not going to discuss things
>> at such level if you're going to continue in this style.
>>     
>
> I think the number of changesets and the amount of change versus the
> amount of new functionality speaks for itself.
>   

   The ID core has suffered from to much negliction, so it indeed needed 
many changes (and when the loocking problems that you kept compalining 
about are finally being dealt with, everybody starts screaming to stop 
changing the old code because IDE is going to die soon (as if we had a 
non SCSI emulting ATA disk driver after 5 years past the libata 
merging). What libata people have been really successful at during this 
time (I don't include the SATA work here) is brainwashing everybody 
(even in the non carrier grade embedded sector) that libata is good for 
just about everything...
   As for the new features, what features are really lacking (except the 
polled mode)?

> Alan
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:02               ` Alan Cox
  2008-12-06 23:19                 ` Sergei Shtylyov
  2008-12-07  0:19                 ` [PATCH] remove ide-scsi Sergei Shtylyov
@ 2008-12-07 10:41                 ` Sergei Shtylyov
  2008-12-09 21:41                 ` Matthew Wilcox
  3 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07 10:41 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:

>>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>> for...
>>     
>
> Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).
>
> I'm actually seeing two strands of requests (including from embedded)
>
> - CF only small "dumb as president" type driver that is written to be as
> compact as possible and preferably considers IRQs as optional
> - Full SATA and NCQ aware platform support.
>   

    I think that such diverse requests are coming from the different 
sectors of the embedded market, e.g. consumer electronics and carrier 
grade respectfully (Octeon probably being an exception here) which 
different greatly in the horsepower of the backing CPUs -- with carrier 
grade CPUs being multicore CPUs running at several GHz with a lot of 
memory (and using chipset ATA controllers) and the consumer electronics 
employing mostly RISC CPUs running at 200-300 MHz with not that much 
memory). The only SoC integrated SATA controller I know of in the 
embedded market is produced by Freescale, others continue embedding good 
old PATA controllers (although this is probably going to change given 
much lower pin count needed for SATA).

> Alan
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  5:07                         ` Yinghai Lu
@ 2008-12-07 11:00                           ` Sergei Shtylyov
  0 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07 11:00 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Jeff Garzik, Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Yinghai Lu wrote:

>>> anyone one is building block layer for USB disk? will get /dev/uda?
>>>       
>> That already exists -- drivers/block/ub   :)
>>     
>
> this one ?
>
> config BLK_DEV_UB
>         tristate "Low Performance USB Block driver"
>         depends on USB
>         help
>           This driver supports certain USB attached storage devices
>           such as flash keys.
>
>           If you enable this driver, it is recommended to avoid conflicts
>           with usb-storage by enabling USB_LIBUSUAL.
>
>           If unsure, say N.
>
> I was thinking it should be more faster with less one layer.
>   

   I'm not exactly happy with those "one-less-layer" solutions for the 
devices that are SCSI in nature (like ide-{cd|tape|floppy} or this one) 
but at least they have a smaller memory footprint than a multi-layer 
SCSI-based implementation. I hope the "low performace" here means that 
the devices are slow, not the driver though... :-)

> YH
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  0:08                     ` Sergei Shtylyov
@ 2008-12-07 11:40                       ` Alan Cox
  2008-12-07 14:46                         ` Sergei Shtylyov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Cox @ 2008-12-07 11:40 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

> > Because the chunks of scsi midlayer we inherit (actually nowdays mostly
> > block) are the pieces you need anyway to do multiple command queues,
> >   
> 
>    Er, what does this term mean? Several queues per device or a tagged 
> queue?

Anything beyond issuing one command at a time. The moment you get errors
with multiple command queues you really need the rest of the block
supporting logic (small bits of which are still in scsi).

>    So, you're just presenting SCSI emulation as a "lesser evil". But 5 
> years seems a long enough term to unbind all that stuff from SCSI.

It's being done bit by bit. I wasn't aware it was a race, I always
thought that being correct, logical, testable and evolutionary bisectable
steps was more important somehow.

Alan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07 11:40                       ` Alan Cox
@ 2008-12-07 14:46                         ` Sergei Shtylyov
  0 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07 14:46 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bartlomiej Zolnierkiewicz, Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Alan Cox wrote:

>>> Because the chunks of scsi midlayer we inherit (actually nowdays mostly
>>> block) are the pieces you need anyway to do multiple command queues,
>>>   
>>>       
>>    Er, what does this term mean? Several queues per device or a tagged 
>> queue?
>>     
>
> Anything beyond issuing one command at a time. The moment you get errors
> with multiple command queues you really need the rest of the block
> supporting logic (small bits of which are still in scsi).
>   

   Something like freezing the queue while error handling is done I guess?

>>    So, you're just presenting SCSI emulation as a "lesser evil". But 5 
>> years seems a long enough term to unbind all that stuff from SCSI.
>>     
>
> It's being done bit by bit. I wasn't aware it was a race, I always
>   

  5 years seems like libata has been running a marathon distance. :-)
   IMHO, you're kind of trying to turn that into a race with constant 
appeals of getting rid of IDE, clearly without enough efforts spent to 
bring about that step so far (looks like there's just not much interest 
in doing that now that all major x86 distributions have adopted libata 
anyway).

> thought that being correct, logical, testable and evolutionary bisectable
> steps was more important somehow.
>   

    I'd question the "evolutionary" and "bisectable" properties of the 
major distributions' decision to switch from IDE to libata. Though I 
guess the state in which IDE code had been at that point contributed to 
that (well, anyway I don't know exactly what was the history beyond the 
stopping of active IDE work)...

> Alan
>   

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:19                 ` Sergei Shtylyov
  2008-12-06 23:32                   ` Alan Cox
@ 2008-12-07 15:04                   ` James Bottomley
  2008-12-07 15:21                     ` Sergei Shtylyov
  2008-12-09 22:21                     ` libata / scsi separation Matthew Wilcox
  1 sibling, 2 replies; 65+ messages in thread
From: James Bottomley @ 2008-12-07 15:04 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	Christoph Hellwig, FUJITA Tomonori, linux-ide, linux-scsi, osst

On Sun, 2008-12-07 at 02:19 +0300, Sergei Shtylyov wrote:
> Hello.
> 
> Alan Cox wrote:
> 
> >>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
> >> for...
> >>     
> >
> > Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).
> >   
> 
>    ATAPI is SCSI transport (with maybe some quirks at SCSI command level 
> tho, IIRC). ATA is neither thansport nor does it map to SCSI 1:1.


Well, to be wholly accurate, since SCSI-3, SCSI has been separated into
an architecture, primary command, device specific command and transport
model (called the SCSI architecture model).  Starting with ATA-8, ATA
will go this way again.  What ATAPI actually is is a SCSI (really MMC
for CD and SSC for tape) command transported over ATA using the ATA
PACKET command.  With ATA-8 it will be much more analogous to SCSI
command over ATA transport.

>  The 
> code for emulating SCSI on ATA only burdens the kernel (and causes user 
> complaints about changing disk names from /dev/hdx to /dev/sda :-).

The code for sorting this out is already upstream in the block tree for
2.6.29.

> > I'm actually seeing two strands of requests (including from embedded)
> >
> > - CF only small "dumb as president" type driver that is written to be as
> > compact as possible and preferably considers IRQs as optional
> >   
> 
>    Yes, support for IRQ-less CF is what IDE core lacks. I must note 
> however that IRQ-less mode looks inherently risky to me because of the 
> raciness of tbothe the ATA spec and its implimentatin WRT the "interrupt 
> pending" state. I may be mistaking but someone of T13 experts (Hale 
> Landis I guess) told me that the devices require that state to be 
> cleared to proceed with the command, and that's what the fast polling 
> host is likely to fail at because it doesn't know if rthe device has 
> actually entered this state when BSY is cleared... Oh well, that's an 
> old story...
> 
> > - Full SATA and NCQ aware platform support.
> >   
> 
>    I don't see how SATA/NCQ support is connected to SCSI.

They require features that SCSI has had for a long time for TCQ but
which weren't present in block.

Originally I'd been promised that libata would be out of SCSI within a
year (that was when it went in).  The slight problem is that having all
the features it needed, SCSI became a very comfortable host. Getting
libata out of SCSI was also made difficult by the fact that few people
cared enough to help.  The only significant external problem is the size
of the stack and the slight performance penalty for SATA disks going
over SAT.  Unfortunately for the latter, slight turns out to be pretty
unmeasurable, so the only hope became people who cared about
footprint ... and there don't seem to be any of those.

The other problem is that it isn't just moving libata out of SCSI.  To
do this correctly, SCSI itself needs to be sliced up because lots of
devices running over ATA transports are genuinely governed by SCSI
protocols (CD with MMC and tape with SSC), so we have to be able to run
sr and st over an ATA transport, thus it needs the minimal pieces of
SCSI to function.  This in turn necessitates splitting the whole of the
SCSI mid layer into a stand alone upper layer support library and a
lower layer platform, which is a significant piece of work.

The way it's happening is that people keep thinking of incremental steps
towards this (ULD separated from SCSI, block timers, sd device
allocation separation, block error handling) and doing them, but lacking
a major spur, it's going slowly.

James



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07 15:04                   ` James Bottomley
@ 2008-12-07 15:21                     ` Sergei Shtylyov
  2008-12-09 22:21                     ` libata / scsi separation Matthew Wilcox
  1 sibling, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-07 15:21 UTC (permalink / raw)
  To: James Bottomley
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan Noé,
	Christoph Hellwig, FUJITA Tomonori, linux-ide, linux-scsi, osst

Hello.

James Bottomley wrote:

>>>>    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
>>>> for...
>>>>     
>>>>         
>>> Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).
>>>   
>>>       
>>    ATAPI is SCSI transport (with maybe some quirks at SCSI command level 
>> tho, IIRC). ATA is neither thansport nor does it map to SCSI 1:1.
>>     
>
>
> Well, to be wholly accurate, since SCSI-3, SCSI has been separated into
> an architecture, primary command, device specific command and transport
> model (called the SCSI architecture model).

   Thanks, I'm well aware of all this. :-)
   My first encounter with SCSI dates back to 1993-94 -- it's a pity 
that I had to abandon this area (switching to that puny IDE ;-)...

> Starting with ATA-8, ATA
> will go this way again.

   It's gone that way in ATA/PI-7 actually, being broken into 3 separate 
documents then, one spoecifying the command set and 2 others PATA and 
SATA tranports.

>   What ATAPI actually is is a SCSI (really MMC
> for CD and SSC for tape)

   Unfortunately, the initial SFF documents specified both the transport 
protocol and the command sets (which somewhat diverged form what SCSI-2 
had, IIRC).
   Fortunately, once ANSI finally taken over ATAPI work, they dropped 
that stupid practice and started referring to MMC and SSC.

> command transported over ATA using the ATA
> PACKET command.  With ATA-8 it will be much more analogous to SCSI
> command over ATA transport.
>   

    I don't think "analogous" means that it will be sending SCSI CDBs 
over PATA/SATA is of native commands and turn ATA into ATAPI. So all 
this is fine but changes nothing about the SCSI emulation thing.

>> The code for emulating SCSI on ATA only burdens the kernel (and causes user 
>> complaints about changing disk names from /dev/hdx to /dev/sda :-).
>>     
>
> The code for sorting this out is already upstream in the block tree for
> 2.6.29.
>   

   Sorting out what, emulation?

   I have to cut my response short that that point. I must be totally 
crazy to allow myself to be dragged into this discussion having so much 
work to do... :-/

MBR, Sergei



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-07  4:17                       ` Jeff Garzik
  2008-12-07  5:07                         ` Yinghai Lu
@ 2008-12-09 19:59                         ` Mark Lord
  2008-12-09 20:07                           ` Jeff Garzik
  1 sibling, 1 reply; 65+ messages in thread
From: Mark Lord @ 2008-12-09 19:59 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Yinghai Lu, Sergei Shtylyov, Alan Cox, Bartlomiej Zolnierkiewicz,
	Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Jeff Garzik wrote:
> Yinghai Lu wrote:
>> On Sat, Dec 6, 2008 at 3:48 PM, Jeff Garzik <jeff@garzik.org> wrote:
..
>> interesting, so will get /dev/ada instead of /dev/sda?
> 
> That is an interesting question.  The easiest thing is to allocate a new 
> 32-bit block major.  But the more compatible (and more controversial) 
> solution is to allocate from the SCSI disk blkmajor space.
..

Perhaps "sdx" and "scdx" should simply be redefined as "System disk"
and "System CD", and moved under block layer ownership from SCSI ?

Cheers

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-09 19:59                         ` Mark Lord
@ 2008-12-09 20:07                           ` Jeff Garzik
  2008-12-09 21:04                             ` James Bottomley
  0 siblings, 1 reply; 65+ messages in thread
From: Jeff Garzik @ 2008-12-09 20:07 UTC (permalink / raw)
  To: Mark Lord
  Cc: Yinghai Lu, Sergei Shtylyov, Alan Cox, Bartlomiej Zolnierkiewicz,
	Dan Noé,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Mark Lord wrote:
> Jeff Garzik wrote:
>> Yinghai Lu wrote:
>>> On Sat, Dec 6, 2008 at 3:48 PM, Jeff Garzik <jeff@garzik.org> wrote:
> ..
>>> interesting, so will get /dev/ada instead of /dev/sda?
>>
>> That is an interesting question.  The easiest thing is to allocate a 
>> new 32-bit block major.  But the more compatible (and more 
>> controversial) solution is to allocate from the SCSI disk blkmajor space.
> ..
> 
> Perhaps "sdx" and "scdx" should simply be redefined as "System disk"
> and "System CD", and moved under block layer ownership from SCSI ?

Yes, precisely.  SCSI and libata (and others) would allocate disk/MMC 
majors from the block layer.

The biggest problem with this plan is that it introduces a mismatch in 
ioctl capabilities, a behavior change from the current situation where 
SCSI exports SCSI_IOCTL_SEND_COMMAND in addition to the more universal 
SG_IO.  Installers and existing clients may rely on this portion of the 
Linux application ABI.

	Jeff




^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-09 20:07                           ` Jeff Garzik
@ 2008-12-09 21:04                             ` James Bottomley
  0 siblings, 0 replies; 65+ messages in thread
From: James Bottomley @ 2008-12-09 21:04 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Mark Lord, Yinghai Lu, Sergei Shtylyov, Alan Cox,
	Bartlomiej Zolnierkiewicz, Dan Noé,
	Christoph Hellwig, FUJITA Tomonori, linux-ide, linux-scsi, osst

On Tue, 2008-12-09 at 15:07 -0500, Jeff Garzik wrote:
> Mark Lord wrote:
> > Jeff Garzik wrote:
> >> Yinghai Lu wrote:
> >>> On Sat, Dec 6, 2008 at 3:48 PM, Jeff Garzik <jeff@garzik.org> wrote:
> > ..
> >>> interesting, so will get /dev/ada instead of /dev/sda?
> >>
> >> That is an interesting question.  The easiest thing is to allocate a 
> >> new 32-bit block major.  But the more compatible (and more 
> >> controversial) solution is to allocate from the SCSI disk blkmajor space.
> > ..
> > 
> > Perhaps "sdx" and "scdx" should simply be redefined as "System disk"
> > and "System CD", and moved under block layer ownership from SCSI ?
> 
> Yes, precisely.  SCSI and libata (and others) would allocate disk/MMC 
> majors from the block layer.

Actually, The mmc major will still be sr (11).  The split out disk major
will only really be used for the current block devices served by sd.

> The biggest problem with this plan is that it introduces a mismatch in 
> ioctl capabilities, a behavior change from the current situation where 
> SCSI exports SCSI_IOCTL_SEND_COMMAND

That's deprecated and has been for several years.

>  in addition to the more universal 
> SG_IO.  Installers and existing clients may rely on this portion of the 
> Linux application ABI.

Hopefully not.  It warns noisily on use of deprecated ioctls.  The door
lock ones still aren't, so libata will have to grow it's own removable
media handling for them, but I think all the rest are (except the
bus/host probes which return largely made up numbers).

James



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-06 23:02               ` Alan Cox
                                   ` (2 preceding siblings ...)
  2008-12-07 10:41                 ` Sergei Shtylyov
@ 2008-12-09 21:41                 ` Matthew Wilcox
  2008-12-10 17:46                   ` Sergei Shtylyov
  3 siblings, 1 reply; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-09 21:41 UTC (permalink / raw)
  To: Alan Cox
  Cc: Sergei Shtylyov, Bartlomiej Zolnierkiewicz, Dan No?,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

On Sat, Dec 06, 2008 at 11:02:27PM +0000, Alan Cox wrote:
> >    Oh, yes. SCSI emulation is just what Linux embedded world is asking 
> > for...
> 
> Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).

I'm told (by people who work with embedded people) that first the
customer says "Oh, I don't need the block layer, I just need MTD", so
they disable CONFIG_BLOCK, then the customer says "Oh, I need to support
USB storage", so they re-enable CONFIG_BLOCK, add CONFIG_SCSI and scsi
disk support and usb-storage.

OK, there is now 'ub' so you can do this without SCSI, but still ...

> I'm actually seeing two strands of requests (including from embedded)
> 
> - CF only small "dumb as president" type driver that is written to be as

Just six weeks until you can't make that joke any more ...

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* libata / scsi separation
  2008-12-07 15:04                   ` James Bottomley
  2008-12-07 15:21                     ` Sergei Shtylyov
@ 2008-12-09 22:21                     ` Matthew Wilcox
  2008-12-09 22:38                       ` James Bottomley
  2008-12-10  1:54                       ` Tejun Heo
  1 sibling, 2 replies; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-09 22:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-ide, linux-scsi, Tejun Heo

On Sun, Dec 07, 2008 at 09:04:58AM -0600, James Bottomley wrote:
> Originally I'd been promised that libata would be out of SCSI within a
> year (that was when it went in).  The slight problem is that having all
> the features it needed, SCSI became a very comfortable host. Getting
> libata out of SCSI was also made difficult by the fact that few people
> cared enough to help.  The only significant external problem is the size
> of the stack and the slight performance penalty for SATA disks going
> over SAT.  Unfortunately for the latter, slight turns out to be pretty
> unmeasurable, so the only hope became people who cared about
> footprint ... and there don't seem to be any of those.

The performance penalty is certainly measurable.  It's about 1 microsecond
per request extra to go from userspace -> scsi -> libata -> driver
than it is to go from userspace -> scsi -> driver.  If you issue 400
commands per second (as you might do with a 15k RPM SCSI drive), that's
400 microseconds.  If you issue 10,000 commands per second (as you might
do with an SSD), that's 10ms of additional CPU time spent in the kernel
per second (or 1%).

So it's insignificant overhead ... unless you have an SSD.  I have asked
Tejun if there's anything he wants help with to move the libata-scsi
separation along, but he's not come up with anything yet.  Right now,
I'm investigating a technique that may significantly increase the number
of requests we can do per second without rewriting the whole thing.

(OK, I haven't measured the overhead of the *SCSI* layer, I've measured
the overhead of the *libata* layer.  I think the point here is that you
can't measure the difference at a macro level unless you're sending a
lot of commands.)

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-09 22:21                     ` libata / scsi separation Matthew Wilcox
@ 2008-12-09 22:38                       ` James Bottomley
  2008-12-10  3:37                         ` Matthew Wilcox
  2008-12-10  1:54                       ` Tejun Heo
  1 sibling, 1 reply; 65+ messages in thread
From: James Bottomley @ 2008-12-09 22:38 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-ide, linux-scsi, Tejun Heo

On Tue, 2008-12-09 at 15:21 -0700, Matthew Wilcox wrote:
> On Sun, Dec 07, 2008 at 09:04:58AM -0600, James Bottomley wrote:
> > Originally I'd been promised that libata would be out of SCSI within a
> > year (that was when it went in).  The slight problem is that having all
> > the features it needed, SCSI became a very comfortable host. Getting
> > libata out of SCSI was also made difficult by the fact that few people
> > cared enough to help.  The only significant external problem is the size
> > of the stack and the slight performance penalty for SATA disks going
> > over SAT.  Unfortunately for the latter, slight turns out to be pretty
> > unmeasurable, so the only hope became people who cared about
> > footprint ... and there don't seem to be any of those.
> 
> The performance penalty is certainly measurable.  It's about 1 microsecond
> per request extra to go from userspace -> scsi -> libata -> driver
> than it is to go from userspace -> scsi -> driver.  If you issue 400
> commands per second (as you might do with a 15k RPM SCSI drive), that's
> 400 microseconds.  If you issue 10,000 commands per second (as you might
> do with an SSD), that's 10ms of additional CPU time spent in the kernel
> per second (or 1%).

Um, not quite.  What you're talking about is increased latency.  It's
not cumulative because we use TCQ (well mostly).  The question is really
how it impacts the benchmarks, which are mostly throughput based (and
really, our block layer trades latency for throughput anyway, so it's
not clear what the impact really is).

> So it's insignificant overhead ... unless you have an SSD.

Actually, surely this is the other way around.  We use complex elevators
which try to optimise throughput every device other than a SSD .. there
we usually set noop and the latency becomes more visible .. whether it
still has a noticeable benchmark impact is another matter.

>   I have asked
> Tejun if there's anything he wants help with to move the libata-scsi
> separation along, but he's not come up with anything yet.  Right now,
> I'm investigating a technique that may significantly increase the number
> of requests we can do per second without rewriting the whole thing.
> 
> (OK, I haven't measured the overhead of the *SCSI* layer, I've measured
> the overhead of the *libata* layer.  I think the point here is that you
> can't measure the difference at a macro level unless you're sending a
> lot of commands.)

Perhaps one of the things we should agree on is exactly how we want to
measure things like this.  Making the layering thinner for less latency
is usually good ... unless there are other tradeoffs.  I think not
forcing ata disks to go through SCSI will probably be tradeoff free, but
we need to make sure it is.

James



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-09 22:21                     ` libata / scsi separation Matthew Wilcox
  2008-12-09 22:38                       ` James Bottomley
@ 2008-12-10  1:54                       ` Tejun Heo
  2008-12-10  2:29                         ` Grant Grundler
  1 sibling, 1 reply; 65+ messages in thread
From: Tejun Heo @ 2008-12-10  1:54 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: James Bottomley, linux-ide, linux-scsi

(cc'ing Jens)
Hello,

Matthew Wilcox wrote:
> On Sun, Dec 07, 2008 at 09:04:58AM -0600, James Bottomley wrote:
>> Originally I'd been promised that libata would be out of SCSI within a
>> year (that was when it went in).  The slight problem is that having all
>> the features it needed, SCSI became a very comfortable host. Getting
>> libata out of SCSI was also made difficult by the fact that few people
>> cared enough to help.  The only significant external problem is the size
>> of the stack and the slight performance penalty for SATA disks going
>> over SAT.  Unfortunately for the latter, slight turns out to be pretty
>> unmeasurable, so the only hope became people who cared about
>> footprint ... and there don't seem to be any of those.
> 
> The performance penalty is certainly measurable.  It's about 1 microsecond
> per request extra to go from userspace -> scsi -> libata -> driver
> than it is to go from userspace -> scsi -> driver.  If you issue 400
> commands per second (as you might do with a 15k RPM SCSI drive), that's
> 400 microseconds.  If you issue 10,000 commands per second (as you might
> do with an SSD), that's 10ms of additional CPU time spent in the kernel
> per second (or 1%).
>
> So it's insignificant overhead ... unless you have an SSD.  I have asked
> Tejun if there's anything he wants help with to move the libata-scsi
> separation along, but he's not come up with anything yet.

I'm working on it and will keep one or two patchsets in flight toward
Jens' direction (one is already in Jens' mailbox, I'm working on
another one and yet another got nacked and waiting for update).
Making libata independent of SCSI basically means move non-SCSI
specific parts of SCSI midlayer into block layer and make libata a
direct customer of block layer once everything is in place.  It's a
slow process (for me, especially with upcoming SLES11 release) but
we're getting there bit by bit.

It's kind of difficult for me to say which direction we should go at
this point as the decision doesn't really fall on me and I doubt
anyone has complete picture of it either, so anything which moves
stuff from SCSI midlayer to block layer will be helpful like the
recent timeout changes.

> Right now, I'm investigating a technique that may significantly
> increase the number of requests we can do per second without
> rewriting the whole thing.

Is the command issue rate really the bottleneck?  It seem a bit
unlikely unless you're issuing lots of really small IOs but then again
those new SSDs are pretty fast.

> (OK, I haven't measured the overhead of the *SCSI* layer, I've measured
> the overhead of the *libata* layer.  I think the point here is that you
> can't measure the difference at a macro level unless you're sending a
> lot of commands.)

How did you measure it?  The issue path isn't thick at all although
command allocation logic there is a bit brain damaged and should use
block layer tag management.  All it does is - allocate qc, interpret
SCSI command to ATA command and write it to qc, map dma and build dma
table and pass it over to the low level issue function.  The only
extra step there is the translation part and I don't think that can
take a full microsecond on modern processors.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10  1:54                       ` Tejun Heo
@ 2008-12-10  2:29                         ` Grant Grundler
  2008-12-10  2:47                           ` Tejun Heo
  0 siblings, 1 reply; 65+ messages in thread
From: Grant Grundler @ 2008-12-10  2:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Matthew Wilcox, James Bottomley, linux-ide, linux-scsi

On Tue, Dec 9, 2008 at 5:54 PM, Tejun Heo <htejun@gmail.com> wrote:
> (cc'ing Jens)
...
> Is the command issue rate really the bottleneck?

Not directly. It's the lack of CPU leftover at high transaction rates
( > 10000 IOPS per disk). So yes, the system does bottle neck on CPU
utilization.

> It seem a bit
> unlikely unless you're issuing lots of really small IOs but then again
> those new SSDs are pretty fast.

That's the whole point of SSDs (lots of small, random IO).

The second desirable attribute SSDs have is consistent response for
reads. HDs vary from microseconds to 100's of milliseconds. Very long
tail in the read latency response.

>> (OK, I haven't measured the overhead of the *SCSI* layer, I've measured
>> the overhead of the *libata* layer.  I think the point here is that you
>> can't measure the difference at a macro level unless you're sending a
>> lot of commands.)
>
> How did you measure it?

Willy presented how he measured SCSI stack at LSF2008. ISTR he was
advised to use oprofile in his test application so there is probably
an updated version of these slides:
    http://iou.parisc-linux.org/lsf2008/IO-latency-Kristen-Carlson-Accardi.pdf

> The issue path isn't thick at all although
> command allocation logic there is a bit brain damaged and should use
> block layer tag management.  All it does is - allocate qc, interpret
> SCSI command to ATA command and write it to qc, map dma and build dma
> table and pass it over to the low level issue function.  The only
> extra step there is the translation part and I don't think that can
> take a full microsecond on modern processors.

Maybe you are counting instructions and not cycles? Every cache miss
is 200-300 cycles (say 100ns). When running multiple threads, we will
miss on nearly every spinlock acquisition and probably on several data
accesses. 1 microsecond isn't alot when counting this way.

hth,
grant

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10  2:29                         ` Grant Grundler
@ 2008-12-10  2:47                           ` Tejun Heo
  2008-12-10  3:23                             ` Grant Grundler
  0 siblings, 1 reply; 65+ messages in thread
From: Tejun Heo @ 2008-12-10  2:47 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Matthew Wilcox, James Bottomley, linux-ide, linux-scsi

Hello,

Grant Grundler wrote:
> On Tue, Dec 9, 2008 at 5:54 PM, Tejun Heo <htejun@gmail.com> wrote:
>> (cc'ing Jens)
> ...
>> Is the command issue rate really the bottleneck?
> 
> Not directly. It's the lack of CPU leftover at high transaction rates
> ( > 10000 IOPS per disk). So yes, the system does bottle neck on CPU
> utilization.
> 
>> It seem a bit
>> unlikely unless you're issuing lots of really small IOs but then again
>> those new SSDs are pretty fast.
> 
> That's the whole point of SSDs (lots of small, random IO).

But on many workloads, filesystems manage to colocate what belongs
together and with little help from read ahead and block layer we
manage to dish out decently sized requests.  It will be great to serve
4k requests as fast as we can but whether that should be (or rather
how much) the focal point of optimization is a slightly different
problem.

> The second desirable attribute SSDs have is consistent response for
> reads. HDs vary from microseconds to 100's of milliseconds. Very long
> tail in the read latency response.
> 
>>> (OK, I haven't measured the overhead of the *SCSI* layer, I've measured
>>> the overhead of the *libata* layer.  I think the point here is that you
>>> can't measure the difference at a macro level unless you're sending a
>>> lot of commands.)
>> How did you measure it?
> 
> Willy presented how he measured SCSI stack at LSF2008. ISTR he was
> advised to use oprofile in his test application so there is probably
> an updated version of these slides:
>     http://iou.parisc-linux.org/lsf2008/IO-latency-Kristen-Carlson-Accardi.pdf

Ah... okay, with ram low level driver.

>> The issue path isn't thick at all although
>> command allocation logic there is a bit brain damaged and should use
>> block layer tag management.  All it does is - allocate qc, interpret
>> SCSI command to ATA command and write it to qc, map dma and build dma
>> table and pass it over to the low level issue function.  The only
>> extra step there is the translation part and I don't think that can
>> take a full microsecond on modern processors.
> 
> Maybe you are counting instructions and not cycles? Every cache miss
> is 200-300 cycles (say 100ns). When running multiple threads, we will
> miss on nearly every spinlock acquisition and probably on several data
> accesses. 1 microsecond isn't alot when counting this way.

Yeah, ata uses its own locking and the qc allocation does atomic
bitops for each bit for no good reason which can hurt for very hi-ops
with NCQ tags filled up.  If serving 4k requests as fast as possible
is the goal, I'm not really sure the current SCSI or ATA commands are
the best suited ones.  Both SCSI and ATA are focused on rotating media
with seek latency and thus have SG on the host bus side in mode cases
but never on the device side.  If getting the maximum random scattered
access throughput is a must, the best way would be adding a SG r/w
commands to ATA and adapt our storage stack accordingly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10  2:47                           ` Tejun Heo
@ 2008-12-10  3:23                             ` Grant Grundler
  2008-12-10  3:44                               ` Tejun Heo
  0 siblings, 1 reply; 65+ messages in thread
From: Grant Grundler @ 2008-12-10  3:23 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Matthew Wilcox, James Bottomley, linux-ide, linux-scsi

Hi Tejun,

On Tue, Dec 9, 2008 at 6:47 PM, Tejun Heo <htejun@gmail.com> wrote:
...
>> That's the whole point of SSDs (lots of small, random IO).
>
> But on many workloads, filesystems manage to colocate what belongs
> together and with little help from read ahead and block layer we
> manage to dish out decently sized requests.

True. And plenty of applications use a database which can't co-locate
the data. Read ahead for random IO just wastes BW and CPU cycles.

> It will be great to serve
> 4k requests as fast as we can but whether that should be (or rather
> how much) the focal point of optimization is a slightly different
> problem.

"How much the focal point" is a fair question. If someone can produce
a super efficient SATA or SAS storage controller, I'd think it would
matter more.

...
>> Willy presented how he measured SCSI stack at LSF2008. ISTR he was
>> advised to use oprofile in his test application so there is probably
>> an updated version of these slides:
>>     http://iou.parisc-linux.org/lsf2008/IO-latency-Kristen-Carlson-Accardi.pdf
>
> Ah... okay, with ram low level driver.

Right. that's alot faster than any SSD. But it's a convenient way to
get consistent, precise numbers for workloads that can be scaled down
to fit into RAM.

...
>> Maybe you are counting instructions and not cycles? Every cache miss
>> is 200-300 cycles (say 100ns). When running multiple threads, we will
>> miss on nearly every spinlock acquisition and probably on several data
>> accesses. 1 microsecond isn't alot when counting this way.
>
> Yeah, ata uses its own locking and the qc allocation does atomic
> bitops for each bit for no good reason which can hurt for very hi-ops
> with NCQ tags filled up.  If serving 4k requests as fast as possible
> is the goal, I'm not really sure the current SCSI or ATA commands are
> the best suited ones.  Both SCSI and ATA are focused on rotating media
> with seek latency

I think existing File Systems and block IO schedulers (except NOOP) are
tuned for rotating media and access patterns that benefit this media the most.

> and thus have SG on the host bus side in mode cases
> but never on the device side.

SG == scatter-gather? I'm not sure why that is specific to rotating media.
Or is this referring to "SCSI-generic" pass through?

In any case, only traversing one fewer layers (SCSI or libata) in
block code path would help serve 4k requests more efficiently.

> If getting the maximum random scattered
> access throughput is a must, the best way would be adding a SG r/w
> commands to ATA and adapt our storage stack accordingly.

I don't think everyone wants to throw out the entire stack.
But adding a passthrough for ATA and connecting that to FUSE might
be a performant alternative.

thanks,
grant

> Thanks.
>
> --
> tejun
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-09 22:38                       ` James Bottomley
@ 2008-12-10  3:37                         ` Matthew Wilcox
  0 siblings, 0 replies; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-10  3:37 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-ide, linux-scsi, Tejun Heo

On Tue, Dec 09, 2008 at 04:38:07PM -0600, James Bottomley wrote:
> On Tue, 2008-12-09 at 15:21 -0700, Matthew Wilcox wrote:
> > The performance penalty is certainly measurable.  It's about 1 microsecond
> > per request extra to go from userspace -> scsi -> libata -> driver
> > than it is to go from userspace -> scsi -> driver.  If you issue 400
> > commands per second (as you might do with a 15k RPM SCSI drive), that's
> > 400 microseconds.  If you issue 10,000 commands per second (as you might
> > do with an SSD), that's 10ms of additional CPU time spent in the kernel
> > per second (or 1%).
> 
> Um, not quite.  What you're talking about is increased latency.  It's

Tsk.  I was quite clear I wasn't talking about latency or bandwidth.  I
was talking about the amount of CPU used to keep a device busy.

> not cumulative because we use TCQ (well mostly).  The question is really
> how it impacts the benchmarks, which are mostly throughput based (and
> really, our block layer trades latency for throughput anyway, so it's
> not clear what the impact really is).

If 1% of CPU is being used by the kernel, that's 1% of CPU not available
for the user application (or alternatively an extra centisecond the CPU
could be in a low-power state if you're not CPU-bound).

> > (OK, I haven't measured the overhead of the *SCSI* layer, I've measured
> > the overhead of the *libata* layer.  I think the point here is that you
> > can't measure the difference at a macro level unless you're sending a
> > lot of commands.)
> 
> Perhaps one of the things we should agree on is exactly how we want to
> measure things like this.  Making the layering thinner for less latency
> is usually good ... unless there are other tradeoffs.  I think not
> forcing ata disks to go through SCSI will probably be tradeoff free, but
> we need to make sure it is.

That would certainly be a good idea.  I don't think we have a consensus
about what we should be measuring yet ;-)

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10  3:23                             ` Grant Grundler
@ 2008-12-10  3:44                               ` Tejun Heo
  2008-12-10 15:24                                 ` Matthew Wilcox
  0 siblings, 1 reply; 65+ messages in thread
From: Tejun Heo @ 2008-12-10  3:44 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Matthew Wilcox, James Bottomley, linux-ide, linux-scsi

Hello,

Grant Grundler wrote:
>>> Maybe you are counting instructions and not cycles? Every cache miss
>>> is 200-300 cycles (say 100ns). When running multiple threads, we will
>>> miss on nearly every spinlock acquisition and probably on several data
>>> accesses. 1 microsecond isn't alot when counting this way.
>> Yeah, ata uses its own locking and the qc allocation does atomic
>> bitops for each bit for no good reason which can hurt for very hi-ops
>> with NCQ tags filled up.  If serving 4k requests as fast as possible
>> is the goal, I'm not really sure the current SCSI or ATA commands are
>> the best suited ones.  Both SCSI and ATA are focused on rotating media
>> with seek latency
> 
> I think existing File Systems and block IO schedulers (except NOOP) are
> tuned for rotating media and access patterns that benefit this media the most.

Acutally, the whole stack is optimized toward IO devices with seek
latency, from the hardware to our drivers and the whole block layer
itself.

>> and thus have SG on the host bus side in mode cases
>> but never on the device side.
> 
> SG == scatter-gather? I'm not sure why that is specific to rotating media.
> Or is this referring to "SCSI-generic" pass through?

I was talking about scatter-gather.  All the IO commands are about one
continuous extent of data on the device and the whole stack from the
bio is built that way and the overhead of libata is minute compared to
the whole thing including emitting single command and receiving
completion for each 4k transfer.

> In any case, only traversing one fewer layers (SCSI or libata) in
> block code path would help serve 4k requests more efficiently.

Yes, no doubt.

>> If getting the maximum random scattered
>> access throughput is a must, the best way would be adding a SG r/w
>> commands to ATA and adapt our storage stack accordingly.
> 
> I don't think everyone wants to throw out the entire stack.
> But adding a passthrough for ATA and connecting that to FUSE might
> be a performant alternative.

Don't know how FUSE would come into play but if the device can receive
list of IOs to perform in a single command and reply accordingly, the
block layer (possibly bio interface too?) can be modified to merge
random IOs into a single request and things will be really fast and
whether we grab one more spinlock or not at the bottom of the stack
wouldn't really matter.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10  3:44                               ` Tejun Heo
@ 2008-12-10 15:24                                 ` Matthew Wilcox
  2008-12-10 15:33                                   ` Tejun Heo
  2008-12-10 17:21                                   ` Grant Grundler
  0 siblings, 2 replies; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-10 15:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Grant Grundler, James Bottomley, linux-ide, linux-scsi

On Wed, Dec 10, 2008 at 12:44:51PM +0900, Tejun Heo wrote:
> >> and thus have SG on the host bus side in mode cases
> >> but never on the device side.
> > 
> > SG == scatter-gather? I'm not sure why that is specific to rotating media.
> > Or is this referring to "SCSI-generic" pass through?
> 
> I was talking about scatter-gather.  All the IO commands are about one
> continuous extent of data on the device and the whole stack from the
> bio is built that way and the overhead of libata is minute compared to
> the whole thing including emitting single command and receiving
> completion for each 4k transfer.

I think what Tejun means is add a new command, say READ_SG which would
transfer a page of LBA ranges (see how TRIM works for details), then the
drive would do a single transfer which contained the data from all those
ranges.

This would work, but (ignoring the political / standardisation efforts
required to make this happen), this is just a cop-out.  When networking
people were first faced with gigabit, they tried the same thing (oooh,
9000 byte packets, oooh, TCP Offload, etc), all in the name of passing
larger amounts of data to the card in a single transaction so they
didn't have to fix their per-transaction overheads.

Users weren't interested.  They wanted to keep sending 1500 byte packets
(because most equipment couldn't handle jumbo frames) and they wanted
netfilter and SACK and all the other goodies that the Linux networking
stack offered and the card's TCP stack didn't.

We should stop denying that users actually want to do 4k IOs and just
get on with fixing the storage stack to cope with lots of them.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10 15:24                                 ` Matthew Wilcox
@ 2008-12-10 15:33                                   ` Tejun Heo
  2008-12-10 16:01                                     ` Matthew Wilcox
  2008-12-10 17:11                                     ` Grant Grundler
  2008-12-10 17:21                                   ` Grant Grundler
  1 sibling, 2 replies; 65+ messages in thread
From: Tejun Heo @ 2008-12-10 15:33 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Grant Grundler, James Bottomley, linux-ide, linux-scsi

Hello, Matthew.

Matthew Wilcox wrote:
> I think what Tejun means is add a new command, say READ_SG which would
> transfer a page of LBA ranges (see how TRIM works for details), then the
> drive would do a single transfer which contained the data from all those
> ranges.

Yes, I did.

> This would work, but (ignoring the political / standardisation efforts
> required to make this happen), this is just a cop-out.  When networking
> people were first faced with gigabit, they tried the same thing (oooh,
> 9000 byte packets, oooh, TCP Offload, etc), all in the name of passing
> larger amounts of data to the card in a single transaction so they
> didn't have to fix their per-transaction overheads.
> 
> Users weren't interested.  They wanted to keep sending 1500 byte packets
> (because most equipment couldn't handle jumbo frames) and they wanted
> netfilter and SACK and all the other goodies that the Linux networking
> stack offered and the card's TCP stack didn't.
> 
> We should stop denying that users actually want to do 4k IOs and just
> get on with fixing the storage stack to cope with lots of them.

And yeap we definitely should try to do that too but I don't think
RW_SG would be as useless as jumbo frames (much less compatibility
problem and no loss of functionality), and the actual hardware
overhead of issuing separate commands for each 4k segment is way
higher than anything we do along the block and low level driver layers
in terms of IO access, host bus and ATA (or SAS) bus overhead.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10 15:33                                   ` Tejun Heo
@ 2008-12-10 16:01                                     ` Matthew Wilcox
  2008-12-10 17:11                                     ` Grant Grundler
  1 sibling, 0 replies; 65+ messages in thread
From: Matthew Wilcox @ 2008-12-10 16:01 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Grant Grundler, James Bottomley, linux-ide, linux-scsi

On Thu, Dec 11, 2008 at 12:33:48AM +0900, Tejun Heo wrote:
> And yeap we definitely should try to do that too but I don't think
> RW_SG would be as useless as jumbo frames (much less compatibility
> problem and no loss of functionality), and the actual hardware
> overhead of issuing separate commands for each 4k segment is way
> higher than anything we do along the block and low level driver layers
> in terms of IO access, host bus and ATA (or SAS) bus overhead.

You're probably right that there's high overhead at the hardware level,
and we can't do anything about that.  There is some evidence that Linux
is the limiting factor in SSD performance right now due to not being
able to get enough commands to the device.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10 15:33                                   ` Tejun Heo
  2008-12-10 16:01                                     ` Matthew Wilcox
@ 2008-12-10 17:11                                     ` Grant Grundler
  1 sibling, 0 replies; 65+ messages in thread
From: Grant Grundler @ 2008-12-10 17:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Matthew Wilcox, James Bottomley, linux-ide, linux-scsi

On Wed, Dec 10, 2008 at 7:33 AM, Tejun Heo <htejun@gmail.com> wrote:
...
> And yeap we definitely should try to do that too but I don't think
> RW_SG would be as useless as jumbo frames (much less compatibility
> problem and no loss of functionality),

Jumbo frames aren't useless. They just don't apply to the "small message
passing overhead" problem. Users doing bulk data transfer (NAS, FTP, etc)
are pretty happy with TCP Segmentation Offloading (cousin of Jumbo Frames).

> and the actual hardware
> overhead of issuing separate commands for each 4k segment is way
> higher than anything we do along the block and low level driver layers
> in terms of IO access, host bus and ATA (or SAS) bus overhead.

That's true and it was also true gigabit NICs in the 90's. NIC HW vendors
have figured out host to avoid doing MMIO reads/writes during normal IO.
Infiniband has an even more efficient interface that's mostly Host RAM based
(a few MMIO writes). Last time I measured (~2006), TCP stack was 4x the
CPU cost of the HW interface. I don't know what the current ratio is for
any given SATA controller vs libata/SCSI stack, but I'm certain it will
change as new controllers are introduced.

hth,
grant

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: libata / scsi separation
  2008-12-10 15:24                                 ` Matthew Wilcox
  2008-12-10 15:33                                   ` Tejun Heo
@ 2008-12-10 17:21                                   ` Grant Grundler
  1 sibling, 0 replies; 65+ messages in thread
From: Grant Grundler @ 2008-12-10 17:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Tejun Heo, James Bottomley, linux-ide, linux-scsi

On Wed, Dec 10, 2008 at 7:24 AM, Matthew Wilcox <matthew@wil.cx> wrote:
...
> This would work, but (ignoring the political / standardisation efforts
> required to make this happen), this is just a cop-out.  When networking
> people were first faced with gigabit, they tried the same thing (oooh,
> 9000 byte packets, oooh, TCP Offload, etc), all in the name of passing
> larger amounts of data to the card in a single transaction so they
> didn't have to fix their per-transaction overheads.

Willy,
I agree with your points except for one: TCP Offload.

James Bottomley gets credit for publicly observing that storage
protocols have been offloaded for ages. I once worked on a SCSI
controller that interrupted on every phase change (early 90's). Next
generation of SCSI controllers had offload. Same thing occured
to FC in the late 90s and legacy PATA is being replaced with
equivalent SATA offload engines more recently. Enough examples.

> Users weren't interested.  They wanted to keep sending 1500 byte packets
> (because most equipment couldn't handle jumbo frames) and they wanted
> netfilter and SACK and all the other goodies that the Linux networking
> stack offered and the card's TCP stack didn't.
>
> We should stop denying that users actually want to do 4k IOs and just
> get on with fixing the storage stack to cope with lots of them.

+1

thanks,
grant

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH] remove ide-scsi
  2008-12-09 21:41                 ` Matthew Wilcox
@ 2008-12-10 17:46                   ` Sergei Shtylyov
  0 siblings, 0 replies; 65+ messages in thread
From: Sergei Shtylyov @ 2008-12-10 17:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alan Cox, Bartlomiej Zolnierkiewicz, Dan No?,
	James Bottomley, Christoph Hellwig, FUJITA Tomonori, linux-ide,
	linux-scsi, osst

Hello.

Matthew Wilcox wrote:

>>Well ATAPI is SCSI emulation (its a sort of pidgin SCSI admittedly).

> I'm told (by people who work with embedded people) that first the
> customer says "Oh, I don't need the block layer, I just need MTD", so
> they disable CONFIG_BLOCK, then the customer says "Oh, I need to support
> USB storage", so they re-enable CONFIG_BLOCK, add CONFIG_SCSI and scsi
> disk support and usb-storage.

    That depends. And that is still not the same as SCSI subsys + SCSI disk 
driver + libata SCSI emulation.

> OK, there is now 'ub' so you can do this without SCSI, but still ...

>>I'm actually seeing two strands of requests (including from embedded)
>>
>>- CF only small "dumb as president" type driver that is written to be as

> Just six weeks until you can't make that joke any more ...

    You're frightening me... What's going to happen after these 6 weeks?

MBR, Sergei

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2008-12-10 17:46 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-03  1:38 [PATCH] remove ide-scsi FUJITA Tomonori
2008-12-03 10:06 ` Christoph Hellwig
2008-12-03 13:31   ` Willem Riede
2008-12-03 13:55     ` Matthew Wilcox
2008-12-03 14:02       ` Alan Cox
2008-12-03 15:09   ` James Bottomley
2008-12-06  6:12     ` Pete Zaitcev
2008-12-06 14:06       ` Bartlomiej Zolnierkiewicz
2008-12-06 14:51     ` Bartlomiej Zolnierkiewicz
2008-12-06 15:06       ` Alan Cox
2008-12-06 16:29         ` Bartlomiej Zolnierkiewicz
2008-12-06 15:25       ` Willem Riede
2008-12-06 15:59         ` Bartlomiej Zolnierkiewicz
2008-12-06 17:00       ` Dan Noé
2008-12-06 21:41         ` Bartlomiej Zolnierkiewicz
2008-12-06 22:24           ` Alan Cox
2008-12-06 22:52             ` Sergei Shtylyov
2008-12-06 23:02               ` Alan Cox
2008-12-06 23:19                 ` Sergei Shtylyov
2008-12-06 23:32                   ` Alan Cox
2008-12-07  0:08                     ` Sergei Shtylyov
2008-12-07 11:40                       ` Alan Cox
2008-12-07 14:46                         ` Sergei Shtylyov
2008-12-07 15:04                   ` James Bottomley
2008-12-07 15:21                     ` Sergei Shtylyov
2008-12-09 22:21                     ` libata / scsi separation Matthew Wilcox
2008-12-09 22:38                       ` James Bottomley
2008-12-10  3:37                         ` Matthew Wilcox
2008-12-10  1:54                       ` Tejun Heo
2008-12-10  2:29                         ` Grant Grundler
2008-12-10  2:47                           ` Tejun Heo
2008-12-10  3:23                             ` Grant Grundler
2008-12-10  3:44                               ` Tejun Heo
2008-12-10 15:24                                 ` Matthew Wilcox
2008-12-10 15:33                                   ` Tejun Heo
2008-12-10 16:01                                     ` Matthew Wilcox
2008-12-10 17:11                                     ` Grant Grundler
2008-12-10 17:21                                   ` Grant Grundler
2008-12-07  0:19                 ` [PATCH] remove ide-scsi Sergei Shtylyov
2008-12-07  9:59                   ` Sergei Shtylyov
2008-12-07 10:41                 ` Sergei Shtylyov
2008-12-09 21:41                 ` Matthew Wilcox
2008-12-10 17:46                   ` Sergei Shtylyov
2008-12-06 23:28               ` Jeff Garzik
2008-12-06 23:42                 ` Sergei Shtylyov
2008-12-06 23:48                   ` Jeff Garzik
2008-12-07  3:36                     ` Yinghai Lu
2008-12-07  4:17                       ` Jeff Garzik
2008-12-07  5:07                         ` Yinghai Lu
2008-12-07 11:00                           ` Sergei Shtylyov
2008-12-09 19:59                         ` Mark Lord
2008-12-09 20:07                           ` Jeff Garzik
2008-12-09 21:04                             ` James Bottomley
2008-12-06 23:45                 ` Bartlomiej Zolnierkiewicz
2008-12-06 23:50                   ` Jeff Garzik
2008-12-06 23:40             ` Bartlomiej Zolnierkiewicz
2008-12-06 23:51               ` Alan Cox
2008-12-07  0:56                 ` Bartlomiej Zolnierkiewicz
2008-12-07  1:14                   ` Alan Cox
2008-12-07 10:32                     ` Sergei Shtylyov
2008-12-06 23:51               ` Jeff Garzik
2008-12-06 22:33           ` Al Viro
2008-12-06 23:13             ` Bartlomiej Zolnierkiewicz
2008-12-06 23:17             ` Willem Riede
2008-12-07  0:09               ` Al Viro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.