All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-02-01  2:55 ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen


Hello,

This patchset implements alt_size, which is a size hint to the users
of a block device.  It's primarily to communicate the BIOS (HPA) size
on ATA devices to userland, so that dmraid can consider it when trying
to figure out BIOS raid layout.  This is critical as dmraid can be
tricked into assemblying the wrong raid when fed with the unlocked
size and if the disk content is right (or, rather, wrong) and good
number of distros are shipping with ignore_hpa=1 as default.

This patch contains the following three patches.

  0001-block-add-alt_size.patch
  0002-scsi-add-scsi_device-alt_capacity.patch
  0003-libata-export-HPA-size-as-alt_size.patch

Without the above three patches, I get the following on my nv RAID-0
if HPA unlocking is turned on.

  # ~/work/dmraid/tools/dmraid -a y
  RAID set "nvidia_djgdjigi" was activated
  # mount /dev/dm-0 /mnt/tmp
  mount: wrong fs type, bad option, bad superblock on /dev/dm-0,
	 missing codepage or helper program, or other error
	 In some cases useful info is found in syslog - try
	 dmesg | tail  or so

With the above three patches and modified dmraid,

  ck804:~/os/work/build # ~/work/dmraid/tools/dmraid -a y -v -v -v
  ...
  NOTICE: /dev/sdb: using BIOS sectors 234439535
  RAID set "nvidia_djgdjigi" was activated
  ...
  ck804:~/os/work/build # mount /dev/dm-0 /mnt/tmp
  ck804:~/os/work/build # umount /dev/dm-0

Any ideas on through which tree this should be pushed?  I'll post
dmraid patches as a reply to this thread for reference.

Thanks.

This patchset is against the current linus#master[1] and contains the
following changes.

 block/genhd.c              |   26 ++++++++++++++++++++++++++
 drivers/ata/libata-core.c  |    2 ++
 drivers/ata/libata-scsi.c  |    2 ++
 drivers/scsi/sd.c          |    1 +
 include/linux/genhd.h      |   13 ++++++++++++-
 include/linux/libata.h     |    1 +
 include/scsi/scsi_device.h |    1 +
 7 files changed, 45 insertions(+), 1 deletion(-)

--
tejun

[1] 45c82b5a770be66845687a7d027c8b52946d59af

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-02-01  2:55 ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel


Hello,

This patchset implements alt_size, which is a size hint to the users
of a block device.  It's primarily to communicate the BIOS (HPA) size
on ATA devices to userland, so that dmraid can consider it when trying
to figure out BIOS raid layout.  This is critical as dmraid can be
tricked into assemblying the wrong raid when fed with the unlocked
size and if the disk content is right (or, rather, wrong) and good
number of distros are shipping with ignore_hpa=1 as default.

This patch contains the following three patches.

  0001-block-add-alt_size.patch
  0002-scsi-add-scsi_device-alt_capacity.patch
  0003-libata-export-HPA-size-as-alt_size.patch

Without the above three patches, I get the following on my nv RAID-0
if HPA unlocking is turned on.

  # ~/work/dmraid/tools/dmraid -a y
  RAID set "nvidia_djgdjigi" was activated
  # mount /dev/dm-0 /mnt/tmp
  mount: wrong fs type, bad option, bad superblock on /dev/dm-0,
	 missing codepage or helper program, or other error
	 In some cases useful info is found in syslog - try
	 dmesg | tail  or so

With the above three patches and modified dmraid,

  ck804:~/os/work/build # ~/work/dmraid/tools/dmraid -a y -v -v -v
  ...
  NOTICE: /dev/sdb: using BIOS sectors 234439535
  RAID set "nvidia_djgdjigi" was activated
  ...
  ck804:~/os/work/build # mount /dev/dm-0 /mnt/tmp
  ck804:~/os/work/build # umount /dev/dm-0

Any ideas on through which tree this should be pushed?  I'll post
dmraid patches as a reply to this thread for reference.

Thanks.

This patchset is against the current linus#master[1] and contains the
following changes.

 block/genhd.c              |   26 ++++++++++++++++++++++++++
 drivers/ata/libata-core.c  |    2 ++
 drivers/ata/libata-scsi.c  |    2 ++
 drivers/scsi/sd.c          |    1 +
 include/linux/genhd.h      |   13 ++++++++++++-
 include/linux/libata.h     |    1 +
 include/scsi/scsi_device.h |    1 +
 7 files changed, 45 insertions(+), 1 deletion(-)

--
tejun

[1] 45c82b5a770be66845687a7d027c8b52946d59af

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/3] block: add alt_size
  2009-02-01  2:55 ` Tejun Heo
@ 2009-02-01  2:55   ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen
  Cc: Tejun Heo

ATA disks have two sizes - the user visible size and native size.  The
user visible size is usually used by the BIOS to hide certain area
(e.g. recovery partition) or trim size for various reasons.  IDE and
libata can unlock these HPA areas and IDE does so by default and
libata depending on kernel parameters (some distros default to unlock
to maintain compatibility with IDE).

This usually doesn't matter but certain tools need to know what BIOS
wants the disk to look like.  For example, certain BIOS RAID formats
put the metadata at the end of the disk where the 'end' is the trimmed
size, so without knowing the BIOS size, dmraid can't work reliably if
HPA is unlocked.

This patch adds alt_size to genhd and exports it via sysfs.  alt_size
is always smaller than the regular size.  Being a hint, alt_size can
also be set from userland by writing to the sysfs node mostly for
debugging and testing.  Because initialization order isn't enforced,
the alt_size < size limitation is applied while getting the attribute.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/genhd.c         |   26 ++++++++++++++++++++++++++
 include/linux/genhd.h |   13 ++++++++++++-
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 397960c..db7b501 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -824,6 +824,29 @@ static ssize_t disk_ro_show(struct device *dev,
 	return sprintf(buf, "%d\n", get_disk_ro(disk) ? 1 : 0);
 }
 
+static ssize_t disk_alt_size_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+
+	return sprintf(buf, "%llu\n",
+		       (unsigned long long)get_alt_capacity(disk));
+}
+
+static ssize_t disk_alt_size_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+	sector_t new_size = simple_strtoull(buf, NULL, 0);
+
+	set_alt_capacity(disk, new_size);
+
+	if (get_alt_capacity(disk) == new_size)
+		return count;
+	return -EINVAL;
+}
+
 static ssize_t disk_capability_show(struct device *dev,
 				    struct device_attribute *attr, char *buf)
 {
@@ -837,6 +860,8 @@ static DEVICE_ATTR(ext_range, S_IRUGO, disk_ext_range_show, NULL);
 static DEVICE_ATTR(removable, S_IRUGO, disk_removable_show, NULL);
 static DEVICE_ATTR(ro, S_IRUGO, disk_ro_show, NULL);
 static DEVICE_ATTR(size, S_IRUGO, part_size_show, NULL);
+static DEVICE_ATTR(alt_size, S_IRUGO | S_IWUSR, disk_alt_size_show,
+		   disk_alt_size_store);
 static DEVICE_ATTR(capability, S_IRUGO, disk_capability_show, NULL);
 static DEVICE_ATTR(stat, S_IRUGO, part_stat_show, NULL);
 #ifdef CONFIG_FAIL_MAKE_REQUEST
@@ -855,6 +880,7 @@ static struct attribute *disk_attrs[] = {
 	&dev_attr_removable.attr,
 	&dev_attr_ro.attr,
 	&dev_attr_size.attr,
+	&dev_attr_alt_size.attr,
 	&dev_attr_capability.attr,
 	&dev_attr_stat.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 16948ea..763affe 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -159,7 +159,8 @@ struct gendisk {
 
 	struct timer_rand_state *random;
 
-	atomic_t sync_io;		/* RAID */
+	sector_t alt_nr_sects;	/* alt sects, should be <= part0.sects */
+	atomic_t sync_io;	/* RAID */
 	struct work_struct async_notify;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
@@ -364,10 +365,20 @@ static inline sector_t get_capacity(struct gendisk *disk)
 {
 	return disk->part0.nr_sects;
 }
+static inline sector_t get_alt_capacity(struct gendisk *disk)
+{
+	if (disk->alt_nr_sects && disk->alt_nr_sects < disk->part0.nr_sects)
+		return disk->alt_nr_sects;
+	return 0;
+}
 static inline void set_capacity(struct gendisk *disk, sector_t size)
 {
 	disk->part0.nr_sects = size;
 }
+static inline void set_alt_capacity(struct gendisk *disk, sector_t size)
+{
+	disk->alt_nr_sects = size;
+}
 
 #ifdef CONFIG_SOLARIS_X86_PARTITION
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 1/3] block: add alt_size
@ 2009-02-01  2:55   ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel
  Cc: Tejun Heo

ATA disks have two sizes - the user visible size and native size.  The
user visible size is usually used by the BIOS to hide certain area
(e.g. recovery partition) or trim size for various reasons.  IDE and
libata can unlock these HPA areas and IDE does so by default and
libata depending on kernel parameters (some distros default to unlock
to maintain compatibility with IDE).

This usually doesn't matter but certain tools need to know what BIOS
wants the disk to look like.  For example, certain BIOS RAID formats
put the metadata at the end of the disk where the 'end' is the trimmed
size, so without knowing the BIOS size, dmraid can't work reliably if
HPA is unlocked.

This patch adds alt_size to genhd and exports it via sysfs.  alt_size
is always smaller than the regular size.  Being a hint, alt_size can
also be set from userland by writing to the sysfs node mostly for
debugging and testing.  Because initialization order isn't enforced,
the alt_size < size limitation is applied while getting the attribute.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/genhd.c         |   26 ++++++++++++++++++++++++++
 include/linux/genhd.h |   13 ++++++++++++-
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 397960c..db7b501 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -824,6 +824,29 @@ static ssize_t disk_ro_show(struct device *dev,
 	return sprintf(buf, "%d\n", get_disk_ro(disk) ? 1 : 0);
 }
 
+static ssize_t disk_alt_size_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+
+	return sprintf(buf, "%llu\n",
+		       (unsigned long long)get_alt_capacity(disk));
+}
+
+static ssize_t disk_alt_size_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct gendisk *disk = dev_to_disk(dev);
+	sector_t new_size = simple_strtoull(buf, NULL, 0);
+
+	set_alt_capacity(disk, new_size);
+
+	if (get_alt_capacity(disk) == new_size)
+		return count;
+	return -EINVAL;
+}
+
 static ssize_t disk_capability_show(struct device *dev,
 				    struct device_attribute *attr, char *buf)
 {
@@ -837,6 +860,8 @@ static DEVICE_ATTR(ext_range, S_IRUGO, disk_ext_range_show, NULL);
 static DEVICE_ATTR(removable, S_IRUGO, disk_removable_show, NULL);
 static DEVICE_ATTR(ro, S_IRUGO, disk_ro_show, NULL);
 static DEVICE_ATTR(size, S_IRUGO, part_size_show, NULL);
+static DEVICE_ATTR(alt_size, S_IRUGO | S_IWUSR, disk_alt_size_show,
+		   disk_alt_size_store);
 static DEVICE_ATTR(capability, S_IRUGO, disk_capability_show, NULL);
 static DEVICE_ATTR(stat, S_IRUGO, part_stat_show, NULL);
 #ifdef CONFIG_FAIL_MAKE_REQUEST
@@ -855,6 +880,7 @@ static struct attribute *disk_attrs[] = {
 	&dev_attr_removable.attr,
 	&dev_attr_ro.attr,
 	&dev_attr_size.attr,
+	&dev_attr_alt_size.attr,
 	&dev_attr_capability.attr,
 	&dev_attr_stat.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 16948ea..763affe 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -159,7 +159,8 @@ struct gendisk {
 
 	struct timer_rand_state *random;
 
-	atomic_t sync_io;		/* RAID */
+	sector_t alt_nr_sects;	/* alt sects, should be <= part0.sects */
+	atomic_t sync_io;	/* RAID */
 	struct work_struct async_notify;
 #ifdef  CONFIG_BLK_DEV_INTEGRITY
 	struct blk_integrity *integrity;
@@ -364,10 +365,20 @@ static inline sector_t get_capacity(struct gendisk *disk)
 {
 	return disk->part0.nr_sects;
 }
+static inline sector_t get_alt_capacity(struct gendisk *disk)
+{
+	if (disk->alt_nr_sects && disk->alt_nr_sects < disk->part0.nr_sects)
+		return disk->alt_nr_sects;
+	return 0;
+}
 static inline void set_capacity(struct gendisk *disk, sector_t size)
 {
 	disk->part0.nr_sects = size;
 }
+static inline void set_alt_capacity(struct gendisk *disk, sector_t size)
+{
+	disk->alt_nr_sects = size;
+}
 
 #ifdef CONFIG_SOLARIS_X86_PARTITION
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/3] scsi: add scsi_device->alt_capacity
  2009-02-01  2:55 ` Tejun Heo
@ 2009-02-01  2:55   ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen
  Cc: Tejun Heo

Add scsi_device->alt_capacity and let sd pass it over to genhd.  This
is to allow SCSI low level drivers to configure alt_capacity via
slave_configure().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/scsi/sd.c          |    1 +
 include/scsi/scsi_device.h |    1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d57566b..b83d850 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1751,6 +1751,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	blk_queue_ordered(sdkp->disk->queue, ordered, sd_prepare_flush);
 
 	set_capacity(disk, sdkp->capacity);
+	set_alt_capacity(disk, sdp->alt_capacity);
 	kfree(buffer);
 
  out:
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 01a4c58..55be54a 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -92,6 +92,7 @@ struct scsi_device {
 	unsigned int manufacturer;	/* Manufacturer of device, for using 
 					 * vendor-specific cmd's */
 	unsigned sector_size;	/* size in bytes */
+	size_t alt_capacity;	/* alternative capacity, used by sd */
 
 	void *hostdata;		/* available to low-level driver */
 	char type;
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/3] scsi: add scsi_device->alt_capacity
@ 2009-02-01  2:55   ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel
  Cc: Tejun Heo

Add scsi_device->alt_capacity and let sd pass it over to genhd.  This
is to allow SCSI low level drivers to configure alt_capacity via
slave_configure().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/scsi/sd.c          |    1 +
 include/scsi/scsi_device.h |    1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d57566b..b83d850 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1751,6 +1751,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	blk_queue_ordered(sdkp->disk->queue, ordered, sd_prepare_flush);
 
 	set_capacity(disk, sdkp->capacity);
+	set_alt_capacity(disk, sdp->alt_capacity);
 	kfree(buffer);
 
  out:
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 01a4c58..55be54a 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -92,6 +92,7 @@ struct scsi_device {
 	unsigned int manufacturer;	/* Manufacturer of device, for using 
 					 * vendor-specific cmd's */
 	unsigned sector_size;	/* size in bytes */
+	size_t alt_capacity;	/* alternative capacity, used by sd */
 
 	void *hostdata;		/* available to low-level driver */
 	char type;
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/3] libata: export HPA size as alt_size
  2009-02-01  2:55 ` Tejun Heo
@ 2009-02-01  2:55   ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen
  Cc: Tejun Heo

Export HPA size as alt_size so that userland tools which need the BIOS
size can determine it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/ata/libata-core.c |    2 ++
 drivers/ata/libata-scsi.c |    2 ++
 include/linux/libata.h    |    1 +
 3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 88c2428..cfea50a 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1537,6 +1537,8 @@ static int ata_hpa_resize(struct ata_device *dev)
 	}
 
 	/* let's unlock HPA */
+	dev->alt_n_sectors = sectors;
+
 	rc = ata_set_max_sectors(dev, native_sectors);
 	if (rc == -EACCES) {
 		/* if device aborted the command, skip HPA resizing */
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 3c4c5ae..911fe7c 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1104,6 +1104,8 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
 		blk_queue_update_dma_alignment(sdev->request_queue,
 					       ATA_SECT_SIZE - 1);
 		sdev->manage_start_stop = 1;
+
+		sdev->alt_capacity = dev->alt_n_sectors;
 	}
 
 	if (dev->flags & ATA_DFLAG_AN)
diff --git a/include/linux/libata.h b/include/linux/libata.h
index bca3ba2..97736d5 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -582,6 +582,7 @@ struct ata_device {
 #endif
 	/* n_sector is used as CLEAR_OFFSET, read comment above CLEAR_OFFSET */
 	u64			n_sectors;	/* size of device, if ATA */
+	u64			alt_n_sectors;	/* size bios wants us to see */
 	unsigned int		class;		/* ATA_DEV_xxx */
 	unsigned long		unpark_deadline;
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/3] libata: export HPA size as alt_size
@ 2009-02-01  2:55   ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:55 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel
  Cc: Tejun Heo

Export HPA size as alt_size so that userland tools which need the BIOS
size can determine it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 drivers/ata/libata-core.c |    2 ++
 drivers/ata/libata-scsi.c |    2 ++
 include/linux/libata.h    |    1 +
 3 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 88c2428..cfea50a 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1537,6 +1537,8 @@ static int ata_hpa_resize(struct ata_device *dev)
 	}
 
 	/* let's unlock HPA */
+	dev->alt_n_sectors = sectors;
+
 	rc = ata_set_max_sectors(dev, native_sectors);
 	if (rc == -EACCES) {
 		/* if device aborted the command, skip HPA resizing */
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 3c4c5ae..911fe7c 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1104,6 +1104,8 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
 		blk_queue_update_dma_alignment(sdev->request_queue,
 					       ATA_SECT_SIZE - 1);
 		sdev->manage_start_stop = 1;
+
+		sdev->alt_capacity = dev->alt_n_sectors;
 	}
 
 	if (dev->flags & ATA_DFLAG_AN)
diff --git a/include/linux/libata.h b/include/linux/libata.h
index bca3ba2..97736d5 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -582,6 +582,7 @@ struct ata_device {
 #endif
 	/* n_sector is used as CLEAR_OFFSET, read comment above CLEAR_OFFSET */
 	u64			n_sectors;	/* size of device, if ATA */
+	u64			alt_n_sectors;	/* size bios wants us to see */
 	unsigned int		class;		/* ATA_DEV_xxx */
 	unsigned long		unpark_deadline;
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCHSET] dmraid: use alt_size
  2009-02-01  2:55 ` Tejun Heo
@ 2009-02-01  2:59   ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:59 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

Hello again,

This patchset is for dmraid-1.0.0.rc15 and NOT for kernel.  It
contains the following three patches to make dmraid use alt_size if
available.

  0001-dmraid-set-read_info-to-offset-by-default-in-read_.patch
  0002-dmraid-add-alt_size-to-dev_info.patch
  0003-dmraid-make-nv-use-alt_size-if-available.patch

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCHSET] dmraid: use alt_size
@ 2009-02-01  2:59   ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:59 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Hello again,

This patchset is for dmraid-1.0.0.rc15 and NOT for kernel.  It
contains the following three patches to make dmraid use alt_size if
available.

  0001-dmraid-set-read_info-to-offset-by-default-in-read_.patch
  0002-dmraid-add-alt_size-to-dev_info.patch
  0003-dmraid-make-nv-use-alt_size-if-available.patch

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCHSET] dmraid: use alt_size
  2009-02-01  2:55 ` Tejun Heo
                   ` (4 preceding siblings ...)
  (?)
@ 2009-02-01  2:59 ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  2:59 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

Hello again,

This patchset is for dmraid-1.0.0.rc15 and NOT for kernel.  It
contains the following three patches to make dmraid use alt_size if
available.

  0001-dmraid-set-read_info-to-offset-by-default-in-read_.patch
  0002-dmraid-add-alt_size-to-dev_info.patch
  0003-dmraid-make-nv-use-alt_size-if-available.patch

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/3] dmraid: set read_info to @offset by default in read_raid_dev()
  2009-02-01  2:59   ` Tejun Heo
@ 2009-02-01  3:00     ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:00 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

read_info is passed from optional f_read_metatdata() to f_setup_rd().
If f_read_metadata() doesn't exist or doesn't initialize it, read_info
contains garbage.  Initialize it to @offset by default.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 lib/format/format.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/format/format.c b/lib/format/format.c
index 88cb938..4a2594a 100644
--- a/lib/format/format.c
+++ b/lib/format/format.c
@@ -435,7 +435,7 @@ read_raid_dev(struct lib_context *lc,
 {
 	struct raid_dev *rd = NULL;
 	void *meta;
-	union read_info info;
+	union read_info info = { .u64 = offset };
 
 	/*
 	 * In case the metadata format handler provides a special
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 1/3] dmraid: set read_info to @offset by default in read_raid_dev()
@ 2009-02-01  3:00     ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:00 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

read_info is passed from optional f_read_metatdata() to f_setup_rd().
If f_read_metadata() doesn't exist or doesn't initialize it, read_info
contains garbage.  Initialize it to @offset by default.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 lib/format/format.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/format/format.c b/lib/format/format.c
index 88cb938..4a2594a 100644
--- a/lib/format/format.c
+++ b/lib/format/format.c
@@ -435,7 +435,7 @@ read_raid_dev(struct lib_context *lc,
 {
 	struct raid_dev *rd = NULL;
 	void *meta;
-	union read_info info;
+	union read_info info = { .u64 = offset };
 
 	/*
 	 * In case the metadata format handler provides a special
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/3] dmraid: add alt_size to dev_info
  2009-02-01  2:59   ` Tejun Heo
@ 2009-02-01  3:00     ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:00 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

Add alt_size to dev_info and set it during device scanning if
available.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/dmraid/metadata.h |    1 +
 lib/device/scan.c         |   51 ++++++++++++++++++++++++++++++++------------
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/include/dmraid/metadata.h b/include/dmraid/metadata.h
index 91bd221..538043e 100644
--- a/include/dmraid/metadata.h
+++ b/include/dmraid/metadata.h
@@ -120,6 +120,7 @@ struct dev_info {
 	char *path;		/* Actual device node path. */
 	char *serial;		/* ATA/SCSI serial number. */
 	uint64_t sectors;	/* Device size. */
+	uint64_t alt_sectors;	/* Alternative device size. (BIOS size) */
 };
 
 /* Metadata areas and size stored on a RAID device. */
diff --git a/lib/device/scan.c b/lib/device/scan.c
index e9618c2..3befb08 100644
--- a/lib/device/scan.c
+++ b/lib/device/scan.c
@@ -233,29 +233,52 @@ sysfs_get_size(struct lib_context *lc, struct dev_info *di,
 {
 	int ret = 0;
 	char buf[22], *sysfs_file;
-	const char *sysfs_size = "size";
+	const char *sysfs_size = "size", *sysfs_alt_size = "alt_size";
 	FILE *f;
 
 	if (!(sysfs_file = dbg_malloc(strlen(path) + strlen(name) +
-				      strlen(sysfs_size) + 3)))
+				      max(strlen(sysfs_size),
+					  strlen(sysfs_alt_size)) + 3)))
 		return log_alloc_err(lc, __func__);
 
+	/* read size */
 	sprintf(sysfs_file, "%s/%s/%s", path, name, sysfs_size);
-	if ((f = fopen(sysfs_file, "r"))) {
-		/* Use fread+sscanf for klibc compatibility. */
-		if (fread(buf, sizeof(char), sizeof buf - 1, f) &&
-		    (ret = sscanf(buf, "%" PRIu64, &di->sectors)) != 1) {
-			ret = 0;
-			log_err(lc, "reading disk size for %s from sysfs",
-				di->path);
-		}
-
-		fclose(f);
-	} else
+	if (!(f = fopen(sysfs_file, "r"))) {
 		log_err(lc, "opening %s", sysfs_file);
+		goto out;
+	}
 
-	dbg_free(sysfs_file);
+	/* Use fread+sscanf for klibc compatibility. */
+	if (!fread(buf, sizeof(char), sizeof buf - 1, f) ||
+	    sscanf(buf, "%" PRIu64, &di->sectors) != 1) {
+		log_err(lc, "reading disk size for %s from sysfs",
+			di->path);
+		goto out;
+	}
+	fclose(f);
+
+	/* read alt_size, missing alt_size isn't an error */
+	sprintf(sysfs_file, "%s/%s/%s", path, name, sysfs_alt_size);
+	if (!(f = fopen(sysfs_file, "r"))) {
+		if (errno == ENOENT)
+			ret = 1;
+		else
+			log_err(lc, "opening %s", sysfs_file);
+		goto out;
+	}
+
+	if (!fread(buf, sizeof(char), sizeof buf - 1, f) ||
+	    sscanf(buf, "%" PRIu64, &di->alt_sectors) != 1) {
+		log_err(lc, "reading disk alt size for %s from sysfs",
+			di->path);
+		goto out;
+	}
+	ret = 1;
 
+out:
+	if (f)
+		fclose(f);
+	dbg_free(sysfs_file);
 	return ret;
 }
 
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/3] dmraid: add alt_size to dev_info
@ 2009-02-01  3:00     ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:00 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Add alt_size to dev_info and set it during device scanning if
available.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/dmraid/metadata.h |    1 +
 lib/device/scan.c         |   51 ++++++++++++++++++++++++++++++++------------
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/include/dmraid/metadata.h b/include/dmraid/metadata.h
index 91bd221..538043e 100644
--- a/include/dmraid/metadata.h
+++ b/include/dmraid/metadata.h
@@ -120,6 +120,7 @@ struct dev_info {
 	char *path;		/* Actual device node path. */
 	char *serial;		/* ATA/SCSI serial number. */
 	uint64_t sectors;	/* Device size. */
+	uint64_t alt_sectors;	/* Alternative device size. (BIOS size) */
 };
 
 /* Metadata areas and size stored on a RAID device. */
diff --git a/lib/device/scan.c b/lib/device/scan.c
index e9618c2..3befb08 100644
--- a/lib/device/scan.c
+++ b/lib/device/scan.c
@@ -233,29 +233,52 @@ sysfs_get_size(struct lib_context *lc, struct dev_info *di,
 {
 	int ret = 0;
 	char buf[22], *sysfs_file;
-	const char *sysfs_size = "size";
+	const char *sysfs_size = "size", *sysfs_alt_size = "alt_size";
 	FILE *f;
 
 	if (!(sysfs_file = dbg_malloc(strlen(path) + strlen(name) +
-				      strlen(sysfs_size) + 3)))
+				      max(strlen(sysfs_size),
+					  strlen(sysfs_alt_size)) + 3)))
 		return log_alloc_err(lc, __func__);
 
+	/* read size */
 	sprintf(sysfs_file, "%s/%s/%s", path, name, sysfs_size);
-	if ((f = fopen(sysfs_file, "r"))) {
-		/* Use fread+sscanf for klibc compatibility. */
-		if (fread(buf, sizeof(char), sizeof buf - 1, f) &&
-		    (ret = sscanf(buf, "%" PRIu64, &di->sectors)) != 1) {
-			ret = 0;
-			log_err(lc, "reading disk size for %s from sysfs",
-				di->path);
-		}
-
-		fclose(f);
-	} else
+	if (!(f = fopen(sysfs_file, "r"))) {
 		log_err(lc, "opening %s", sysfs_file);
+		goto out;
+	}
 
-	dbg_free(sysfs_file);
+	/* Use fread+sscanf for klibc compatibility. */
+	if (!fread(buf, sizeof(char), sizeof buf - 1, f) ||
+	    sscanf(buf, "%" PRIu64, &di->sectors) != 1) {
+		log_err(lc, "reading disk size for %s from sysfs",
+			di->path);
+		goto out;
+	}
+	fclose(f);
+
+	/* read alt_size, missing alt_size isn't an error */
+	sprintf(sysfs_file, "%s/%s/%s", path, name, sysfs_alt_size);
+	if (!(f = fopen(sysfs_file, "r"))) {
+		if (errno == ENOENT)
+			ret = 1;
+		else
+			log_err(lc, "opening %s", sysfs_file);
+		goto out;
+	}
+
+	if (!fread(buf, sizeof(char), sizeof buf - 1, f) ||
+	    sscanf(buf, "%" PRIu64, &di->alt_sectors) != 1) {
+		log_err(lc, "reading disk alt size for %s from sysfs",
+			di->path);
+		goto out;
+	}
+	ret = 1;
 
+out:
+	if (f)
+		fclose(f);
+	dbg_free(sysfs_file);
 	return ret;
 }
 
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/3] dmraid: make nv use alt_size if available
  2009-02-01  2:59   ` Tejun Heo
@ 2009-02-01  3:01     ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:01 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

nv puts metatdata at the end of a device, so it's important that
dmraid and BIOS agree where the end of the device is.  Later kernel
exports the BIOS as via sysfs and it's available as di->alt_sectors.
Use it if available.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 lib/format/ataraid/nv.c |   15 +++++++++++----
 lib/format/ataraid/nv.h |    2 +-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/format/ataraid/nv.c b/lib/format/ataraid/nv.c
index 893cc3a..bd4589c 100644
--- a/lib/format/ataraid/nv.c
+++ b/lib/format/ataraid/nv.c
@@ -216,9 +216,16 @@ static int setup_rd(struct lib_context *lc, struct raid_dev *rd,
 static struct raid_dev *
 nv_read(struct lib_context *lc, struct dev_info *di)
 {
-	return read_raid_dev(lc, di, NULL,
-			     sizeof(struct nv), NV_CONFIGOFFSET,
-			     to_cpu, is_nv, NULL, setup_rd, handler);
+	uint64_t sectors = di->alt_sectors ? di->alt_sectors : di->sectors;
+	struct raid_dev *rd;
+
+	rd = read_raid_dev(lc, di, NULL, sizeof(struct nv),
+			   NV_CONFIGOFFSET(sectors),
+			   to_cpu, is_nv, NULL, setup_rd, handler);
+	if (rd && di->alt_sectors)
+		log_notice(lc, "%s: using BIOS sectors %"PRIu64,
+			   di->path, di->alt_sectors);
+	return rd;
 }
 
 /* Write private RAID metadata to device */
@@ -427,7 +434,7 @@ setup_rd(struct lib_context *lc, struct raid_dev *rd,
 	if (!(rd->meta_areas = alloc_meta_areas(lc, rd, handler, 1)))
 		return 0;
 
-	rd->meta_areas->offset = NV_CONFIGOFFSET >> 9;
+	rd->meta_areas->offset = info->u64 >> 9;
 	rd->meta_areas->size = sizeof(*nv);
 	rd->meta_areas->area = (void *) nv;
 
diff --git a/lib/format/ataraid/nv.h b/lib/format/ataraid/nv.h
index 020900c..cca32da 100644
--- a/lib/format/ataraid/nv.h
+++ b/lib/format/ataraid/nv.h
@@ -23,7 +23,7 @@
 
 #include <stdint.h>
 
-#define	NV_CONFIGOFFSET		((di->sectors - 2) << 9)
+#define	NV_CONFIGOFFSET(sects)	(((sects) - 2) << 9)
 #define	NV_DATAOFFSET		0
 
 #define NV_ID_LENGTH		8
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/3] dmraid: make nv use alt_size if available
@ 2009-02-01  3:01     ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  3:01 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

nv puts metatdata at the end of a device, so it's important that
dmraid and BIOS agree where the end of the device is.  Later kernel
exports the BIOS as via sysfs and it's available as di->alt_sectors.
Use it if available.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 lib/format/ataraid/nv.c |   15 +++++++++++----
 lib/format/ataraid/nv.h |    2 +-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/format/ataraid/nv.c b/lib/format/ataraid/nv.c
index 893cc3a..bd4589c 100644
--- a/lib/format/ataraid/nv.c
+++ b/lib/format/ataraid/nv.c
@@ -216,9 +216,16 @@ static int setup_rd(struct lib_context *lc, struct raid_dev *rd,
 static struct raid_dev *
 nv_read(struct lib_context *lc, struct dev_info *di)
 {
-	return read_raid_dev(lc, di, NULL,
-			     sizeof(struct nv), NV_CONFIGOFFSET,
-			     to_cpu, is_nv, NULL, setup_rd, handler);
+	uint64_t sectors = di->alt_sectors ? di->alt_sectors : di->sectors;
+	struct raid_dev *rd;
+
+	rd = read_raid_dev(lc, di, NULL, sizeof(struct nv),
+			   NV_CONFIGOFFSET(sectors),
+			   to_cpu, is_nv, NULL, setup_rd, handler);
+	if (rd && di->alt_sectors)
+		log_notice(lc, "%s: using BIOS sectors %"PRIu64,
+			   di->path, di->alt_sectors);
+	return rd;
 }
 
 /* Write private RAID metadata to device */
@@ -427,7 +434,7 @@ setup_rd(struct lib_context *lc, struct raid_dev *rd,
 	if (!(rd->meta_areas = alloc_meta_areas(lc, rd, handler, 1)))
 		return 0;
 
-	rd->meta_areas->offset = NV_CONFIGOFFSET >> 9;
+	rd->meta_areas->offset = info->u64 >> 9;
 	rd->meta_areas->size = sizeof(*nv);
 	rd->meta_areas->area = (void *) nv;
 
diff --git a/lib/format/ataraid/nv.h b/lib/format/ataraid/nv.h
index 020900c..cca32da 100644
--- a/lib/format/ataraid/nv.h
+++ b/lib/format/ataraid/nv.h
@@ -23,7 +23,7 @@
 
 #include <stdint.h>
 
-#define	NV_CONFIGOFFSET		((di->sectors - 2) << 9)
+#define	NV_CONFIGOFFSET(sects)	(((sects) - 2) << 9)
 #define	NV_DATAOFFSET		0
 
 #define NV_ID_LENGTH		8
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-02-01  2:55 ` Tejun Heo
                   ` (5 preceding siblings ...)
  (?)
@ 2009-02-01  3:20 ` Jeff Garzik
  2009-02-01  4:14     ` Tejun Heo
  -1 siblings, 1 reply; 42+ messages in thread
From: Jeff Garzik @ 2009-02-01  3:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-ide, jens.axboe, linux-kernel, linux-scsi, James.Bottomley,
	Mauelshagen, dm-devel

Tejun Heo wrote:
> Hello,
> 
> This patchset implements alt_size, which is a size hint to the users
> of a block device.  It's primarily to communicate the BIOS (HPA) size
> on ATA devices to userland, so that dmraid can consider it when trying
> to figure out BIOS raid layout.  This is critical as dmraid can be
> tricked into assemblying the wrong raid when fed with the unlocked
> size and if the disk content is right (or, rather, wrong) and good
> number of distros are shipping with ignore_hpa=1 as default.
> 
> This patch contains the following three patches.
> 
>   0001-block-add-alt_size.patch
>   0002-scsi-add-scsi_device-alt_capacity.patch
>   0003-libata-export-HPA-size-as-alt_size.patch
> 
> Without the above three patches, I get the following on my nv RAID-0
> if HPA unlocking is turned on.
> 
>   # ~/work/dmraid/tools/dmraid -a y
>   RAID set "nvidia_djgdjigi" was activated
>   # mount /dev/dm-0 /mnt/tmp
>   mount: wrong fs type, bad option, bad superblock on /dev/dm-0,
> 	 missing codepage or helper program, or other error
> 	 In some cases useful info is found in syslog - try
> 	 dmesg | tail  or so
> 
> With the above three patches and modified dmraid,
> 
>   ck804:~/os/work/build # ~/work/dmraid/tools/dmraid -a y -v -v -v
>   ...
>   NOTICE: /dev/sdb: using BIOS sectors 234439535
>   RAID set "nvidia_djgdjigi" was activated
>   ...
>   ck804:~/os/work/build # mount /dev/dm-0 /mnt/tmp
>   ck804:~/os/work/build # umount /dev/dm-0

But the net result is that you are telling dmraid that it is OK to 
proceed, even though part of the disk it wants is really missing.  That 
seems unwise, because are you not basically proceeding with a known 
corrupt dataset at that point?

	Jeff





^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-02-01  3:20 ` [PATCHSET] block,scsi,libata: implement alt_size Jeff Garzik
@ 2009-02-01  4:14     ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  4:14 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: linux-scsi, jens.axboe, linux-kernel, James.Bottomley, linux-ide,
	dm-devel, Mauelshagen

Hello, Jeff.

Jeff Garzik wrote:
> But the net result is that you are telling dmraid that it is OK to
> proceed, even though part of the disk it wants is really missing.  That
> seems unwise, because are you not basically proceeding with a known
> corrupt dataset at that point?

Can you elaborate a bit?  I don't really understand what you mean.  To
clarify...

* What happens now.

BIOS sets up HPA expecting the raid driver to look at the metadata
which is at fixed offset from the end of the HPA locked size.  dmraid
doesn't know about it so looks at the wrong place.  Results can be
failed assembly or corrupt assembly depending on what's actually on
the disk.

* With the patches applied.

BIOS does the same but libata tells dmraid the BIOS size.  dmraid
looks at the same offset as the BIOS and everything is well.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-02-01  4:14     ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-02-01  4:14 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: linux-ide, jens.axboe, linux-kernel, linux-scsi, James.Bottomley,
	Mauelshagen, dm-devel

Hello, Jeff.

Jeff Garzik wrote:
> But the net result is that you are telling dmraid that it is OK to
> proceed, even though part of the disk it wants is really missing.  That
> seems unwise, because are you not basically proceeding with a known
> corrupt dataset at that point?

Can you elaborate a bit?  I don't really understand what you mean.  To
clarify...

* What happens now.

BIOS sets up HPA expecting the raid driver to look at the metadata
which is at fixed offset from the end of the HPA locked size.  dmraid
doesn't know about it so looks at the wrong place.  Results can be
failed assembly or corrupt assembly depending on what's actually on
the disk.

* With the patches applied.

BIOS does the same but libata tells dmraid the BIOS size.  dmraid
looks at the same offset as the BIOS and everything is well.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH] libata: change drive ready wait after hard reset to 5s
  2009-02-01  2:55   ` Tejun Heo
@ 2009-02-02 17:12     ` Stuart_Hayes
  -1 siblings, 0 replies; 42+ messages in thread
From: Stuart_Hayes @ 2009-02-02 17:12 UTC (permalink / raw)
  To: jeff, linux-scsi, James.Bottomley
  Cc: linux-kernel, linux-ide, Mario_Limonciello

[-- Attachment #1: Type: text/plain, Size: 787 bytes --]


This fixes problems during resume with drives that take longer than 1s
to be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to
be ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which
also times out, then an IDENTIFY command is sent to the drive which
times out, and finally the error handler will try to send another hard
reset which will finally get things working.

Sorry to send as an attachment, but my mail server will wrap text.


Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>


[-- Attachment #2: sata_wait5s_on_resume.patch --]
[-- Type: application/octet-stream, Size: 384 bytes --]

--- linux-2.6.28.2/include/linux/libata.h.orig	2009-01-24 18:42:07.000000000 -0600
+++ linux-2.6.28.2/include/linux/libata.h	2009-02-02 10:36:30.000000000 -0600
@@ -271,7 +271,7 @@ enum {
 	 * advised to wait only for the following duration before
 	 * doing SRST.
 	 */
-	ATA_TMOUT_PMP_SRST_WAIT	= 1000,
+	ATA_TMOUT_PMP_SRST_WAIT	= 5000,
 
 	/* ATA bus states */
 	BUS_UNKNOWN		= 0,

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH] libata: change drive ready wait after hard reset to 5s
@ 2009-02-02 17:12     ` Stuart_Hayes
  0 siblings, 0 replies; 42+ messages in thread
From: Stuart_Hayes @ 2009-02-02 17:12 UTC (permalink / raw)
  To: jeff, linux-scsi, James.Bottomley
  Cc: linux-kernel, linux-ide, Mario_Limonciello

[-- Attachment #1: Type: text/plain, Size: 787 bytes --]


This fixes problems during resume with drives that take longer than 1s
to be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to
be ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which
also times out, then an IDENTIFY command is sent to the drive which
times out, and finally the error handler will try to send another hard
reset which will finally get things working.

Sorry to send as an attachment, but my mail server will wrap text.


Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>


[-- Attachment #2: sata_wait5s_on_resume.patch --]
[-- Type: application/octet-stream, Size: 384 bytes --]

--- linux-2.6.28.2/include/linux/libata.h.orig	2009-01-24 18:42:07.000000000 -0600
+++ linux-2.6.28.2/include/linux/libata.h	2009-02-02 10:36:30.000000000 -0600
@@ -271,7 +271,7 @@ enum {
 	 * advised to wait only for the following duration before
 	 * doing SRST.
 	 */
-	ATA_TMOUT_PMP_SRST_WAIT	= 1000,
+	ATA_TMOUT_PMP_SRST_WAIT	= 5000,
 
 	/* ATA bus states */
 	BUS_UNKNOWN		= 0,

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] libata: change drive ready wait after hard reset to 5s
  2009-02-02 17:12     ` Stuart_Hayes
  (?)
@ 2009-02-02 17:17     ` Mario Limonciello
  -1 siblings, 0 replies; 42+ messages in thread
From: Mario Limonciello @ 2009-02-02 17:17 UTC (permalink / raw)
  To: Hayes, Stuart; +Cc: jeff, linux-scsi, James.Bottomley, linux-kernel, linux-ide

[-- Attachment #1: Type: text/plain, Size: 2036 bytes --]

I believe this is the reference that Stuart refers to in the ATA-6 spec:

http://www.t10.org/t13/project/d1410r3a-ATA-ATAPI-6.pdf

Here's the snippet, (p327):

"
Transition D0HR1:D0HR1: When the sample indicates that DASP- is negated
and            less than 450 ms have
elapsed since the negation of RESET-, then the device shall make a
transition to the   D0HR1: Sample_DASP-
state. When the sample indicates that DASP- is negated and greater than
450 ms          but less than 5 s have
elapsed since the negation of RESET-, then the device may make a
transition to the     D0HR1: Sample_DASP-
state.
Transition D0HR1:D0HR3: When the sample indicates that DASP- is negated
and 5 s have elapsed since the
negation of RESET-, then the device shall clear bit 7 in the Error
register and make a transition to the D0HR3:
Set_status state. When the sample indicates that DASP- is negated and
greater than 450 ms but less than 5 s
have elapsed since the negation of RESET-, then the device may clear bit
7 in the Error register and make a
transition to the D0HR3: Set_status state.
"

Hayes, Stuart wrote:
> This fixes problems during resume with drives that take longer than 1s
> to be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to
> be ready.
>
> On one affected system, this patch changes "PM: resume devices took..."
> message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
> timeout/error messages.
>
> Without this patch, the libata code moves on after 1s, tries to send a
> soft reset (which the drive doesn't see because it isn't ready) which
> also times out, then an IDENTIFY command is sent to the drive which
> times out, and finally the error handler will try to send another hard
> reset which will finally get things working.
>
> Sorry to send as an attachment, but my mail server will wrap text.
>
>
> Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>
>
>   

-- 
Mario Limonciello
*Dell | Linux Engineering*
mario_limonciello@dell.com


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 260 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH] libata: change drive ready wait after hard reset to 5s
  2009-02-02 17:12     ` Stuart_Hayes
@ 2009-02-09 21:48       ` Stuart_Hayes
  -1 siblings, 0 replies; 42+ messages in thread
From: Stuart_Hayes @ 2009-02-09 21:48 UTC (permalink / raw)
  To: jeff, linux-scsi, James.Bottomley
  Cc: linux-kernel, linux-ide, Mario_Limonciello

[-- Attachment #1: Type: text/plain, Size: 944 bytes --]


I'm re-sending this patch... I didn't get a response last time I sent
this in a week ago.  (Is there something wrong with it?)  Thanks!  ...
Stuart



This fixes problems during resume with drives that take longer than 1s
to be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to
be ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which
also times out, then an IDENTIFY command is sent to the drive which
times out, and finally the error handler will try to send another hard
reset which will finally get things working.

Sorry to send as an attachment, but my mail server will wrap text.


Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>


[-- Attachment #2: sata_wait5s_on_resume.patch --]
[-- Type: application/octet-stream, Size: 384 bytes --]

--- linux-2.6.28.2/include/linux/libata.h.orig	2009-01-24 18:42:07.000000000 -0600
+++ linux-2.6.28.2/include/linux/libata.h	2009-02-02 10:36:30.000000000 -0600
@@ -271,7 +271,7 @@ enum {
 	 * advised to wait only for the following duration before
 	 * doing SRST.
 	 */
-	ATA_TMOUT_PMP_SRST_WAIT	= 1000,
+	ATA_TMOUT_PMP_SRST_WAIT	= 5000,
 
 	/* ATA bus states */
 	BUS_UNKNOWN		= 0,

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH] libata: change drive ready wait after hard reset to 5s
@ 2009-02-09 21:48       ` Stuart_Hayes
  0 siblings, 0 replies; 42+ messages in thread
From: Stuart_Hayes @ 2009-02-09 21:48 UTC (permalink / raw)
  To: jeff, linux-scsi, James.Bottomley
  Cc: linux-kernel, linux-ide, Mario_Limonciello

[-- Attachment #1: Type: text/plain, Size: 944 bytes --]


I'm re-sending this patch... I didn't get a response last time I sent
this in a week ago.  (Is there something wrong with it?)  Thanks!  ...
Stuart



This fixes problems during resume with drives that take longer than 1s
to be ready.  The ATA-6 spec appears to allow 5 seconds for a drive to
be ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which
also times out, then an IDENTIFY command is sent to the drive which
times out, and finally the error handler will try to send another hard
reset which will finally get things working.

Sorry to send as an attachment, but my mail server will wrap text.


Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>


[-- Attachment #2: sata_wait5s_on_resume.patch --]
[-- Type: application/octet-stream, Size: 384 bytes --]

--- linux-2.6.28.2/include/linux/libata.h.orig	2009-01-24 18:42:07.000000000 -0600
+++ linux-2.6.28.2/include/linux/libata.h	2009-02-02 10:36:30.000000000 -0600
@@ -271,7 +271,7 @@ enum {
 	 * advised to wait only for the following duration before
 	 * doing SRST.
 	 */
-	ATA_TMOUT_PMP_SRST_WAIT	= 1000,
+	ATA_TMOUT_PMP_SRST_WAIT	= 5000,
 
 	/* ATA bus states */
 	BUS_UNKNOWN		= 0,

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-02-01  2:55 ` Tejun Heo
@ 2009-04-30  1:13   ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-04-30  1:13 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

Tejun Heo wrote:
> Hello,
> 
> This patchset implements alt_size, which is a size hint to the users
> of a block device.  It's primarily to communicate the BIOS (HPA) size
> on ATA devices to userland, so that dmraid can consider it when trying
> to figure out BIOS raid layout.  This is critical as dmraid can be
> tricked into assemblying the wrong raid when fed with the unlocked
> size and if the disk content is right (or, rather, wrong) and good
> number of distros are shipping with ignore_hpa=1 as default.

PING.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-04-30  1:13   ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-04-30  1:13 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Tejun Heo wrote:
> Hello,
> 
> This patchset implements alt_size, which is a size hint to the users
> of a block device.  It's primarily to communicate the BIOS (HPA) size
> on ATA devices to userland, so that dmraid can consider it when trying
> to figure out BIOS raid layout.  This is critical as dmraid can be
> tricked into assemblying the wrong raid when fed with the unlocked
> size and if the disk content is right (or, rather, wrong) and good
> number of distros are shipping with ignore_hpa=1 as default.

PING.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-02-01  2:55 ` Tejun Heo
                   ` (7 preceding siblings ...)
  (?)
@ 2009-04-30  1:13 ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-04-30  1:13 UTC (permalink / raw)
  To: jeff, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen

Tejun Heo wrote:
> Hello,
> 
> This patchset implements alt_size, which is a size hint to the users
> of a block device.  It's primarily to communicate the BIOS (HPA) size
> on ATA devices to userland, so that dmraid can consider it when trying
> to figure out BIOS raid layout.  This is critical as dmraid can be
> tricked into assemblying the wrong raid when fed with the unlocked
> size and if the disk content is right (or, rather, wrong) and good
> number of distros are shipping with ignore_hpa=1 as default.

PING.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-02-01  2:55 ` Tejun Heo
                   ` (8 preceding siblings ...)
  (?)
@ 2009-04-30  1:45 ` Jeff Garzik
  2009-04-30  1:50   ` Tejun Heo
  -1 siblings, 1 reply; 42+ messages in thread
From: Jeff Garzik @ 2009-04-30  1:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-ide, jens.axboe, linux-kernel, linux-scsi, James.Bottomley,
	Mauelshagen, dm-devel

Tejun Heo wrote:
> Hello,
> 
> This patchset implements alt_size, which is a size hint to the users
> of a block device.  It's primarily to communicate the BIOS (HPA) size
> on ATA devices to userland, so that dmraid can consider it when trying
> to figure out BIOS raid layout.  This is critical as dmraid can be
> tricked into assemblying the wrong raid when fed with the unlocked
> size and if the disk content is right (or, rather, wrong) and good
> number of distros are shipping with ignore_hpa=1 as default.
> 
> This patch contains the following three patches.
> 
>   0001-block-add-alt_size.patch
>   0002-scsi-add-scsi_device-alt_capacity.patch
>   0003-libata-export-HPA-size-as-alt_size.patch
> 
> Without the above three patches, I get the following on my nv RAID-0
> if HPA unlocking is turned on.
> 
>   # ~/work/dmraid/tools/dmraid -a y
>   RAID set "nvidia_djgdjigi" was activated
>   # mount /dev/dm-0 /mnt/tmp
>   mount: wrong fs type, bad option, bad superblock on /dev/dm-0,
> 	 missing codepage or helper program, or other error
> 	 In some cases useful info is found in syslog - try
> 	 dmesg | tail  or so
> 
> With the above three patches and modified dmraid,
> 
>   ck804:~/os/work/build # ~/work/dmraid/tools/dmraid -a y -v -v -v
>   ...
>   NOTICE: /dev/sdb: using BIOS sectors 234439535
>   RAID set "nvidia_djgdjigi" was activated
>   ...
>   ck804:~/os/work/build # mount /dev/dm-0 /mnt/tmp
>   ck804:~/os/work/build # umount /dev/dm-0
> 
> Any ideas on through which tree this should be pushed?  I'll post
> dmraid patches as a reply to this thread for reference.
> 
> Thanks.
> 
> This patchset is against the current linus#master[1] and contains the
> following changes.
> 
>  block/genhd.c              |   26 ++++++++++++++++++++++++++
>  drivers/ata/libata-core.c  |    2 ++
>  drivers/ata/libata-scsi.c  |    2 ++
>  drivers/scsi/sd.c          |    1 +
>  include/linux/genhd.h      |   13 ++++++++++++-
>  include/linux/libata.h     |    1 +
>  include/scsi/scsi_device.h |    1 +
>  7 files changed, 45 insertions(+), 1 deletion(-)

<shrug>  I suppose...

It just seems like a nasty hack, but unfortunately I don't see anyone 
stepping up to do it properly -- with a DM device automatically layered 
on top that splits the device into separate regions:  one block device 
for the 'regular' area, and one for the HPA.

	Jeff



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-04-30  1:45 ` Jeff Garzik
@ 2009-04-30  1:50   ` Tejun Heo
  2009-04-30 20:00     ` Jeff Garzik
  0 siblings, 1 reply; 42+ messages in thread
From: Tejun Heo @ 2009-04-30  1:50 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: linux-ide, jens.axboe, linux-kernel, linux-scsi, James.Bottomley,
	Mauelshagen, dm-devel

Hello,

Jeff Garzik wrote:
> <shrug>  I suppose...
> 
> It just seems like a nasty hack, but unfortunately I don't see
> anyone stepping up to do it properly -- with a DM device
> automatically layered on top that splits the device into separate
> regions: one block device for the 'regular' area, and one for the
> HPA.

Isn't that more hacky?  I don't know.  All that dm needs to know is
the location of the metadata which is located w.r.t. the end of the
device which might be at a different location if BIOS tried to pull
silly stunts.  So, exporting the size BIOS might have used seems like
a straight forward solution to me.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-04-30  1:50   ` Tejun Heo
@ 2009-04-30 20:00     ` Jeff Garzik
  2009-04-30 21:15         ` Dan Williams
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff Garzik @ 2009-04-30 20:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-ide, jens.axboe, linux-kernel, linux-scsi, James.Bottomley,
	Mauelshagen, dm-devel

Tejun Heo wrote:
> Hello,
> 
> Jeff Garzik wrote:
>> <shrug>  I suppose...
>>
>> It just seems like a nasty hack, but unfortunately I don't see
>> anyone stepping up to do it properly -- with a DM device
>> automatically layered on top that splits the device into separate
>> regions: one block device for the 'regular' area, and one for the
>> HPA.
> 
> Isn't that more hacky?  I don't know.  All that dm needs to know is
> the location of the metadata which is located w.r.t. the end of the
> device which might be at a different location if BIOS tried to pull
> silly stunts.  So, exporting the size BIOS might have used seems like
> a straight forward solution to me.

"<shrug> I suppose" is basically a reluctant ack, in the absence of 
other solutions :)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-04-30 20:00     ` Jeff Garzik
@ 2009-04-30 21:15         ` Dan Williams
  0 siblings, 0 replies; 42+ messages in thread
From: Dan Williams @ 2009-04-30 21:15 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

On Thu, Apr 30, 2009 at 1:00 PM, Jeff Garzik <jeff@garzik.org> wrote:
> Tejun Heo wrote:
>>
>> Hello,
>>
>> Jeff Garzik wrote:
>>>
>>> <shrug>  I suppose...
>>>
>>> It just seems like a nasty hack, but unfortunately I don't see
>>> anyone stepping up to do it properly -- with a DM device
>>> automatically layered on top that splits the device into separate
>>> regions: one block device for the 'regular' area, and one for the
>>> HPA.
>>
>> Isn't that more hacky?  I don't know.  All that dm needs to know is
>> the location of the metadata which is located w.r.t. the end of the
>> device which might be at a different location if BIOS tried to pull
>> silly stunts.  So, exporting the size BIOS might have used seems like
>> a straight forward solution to me.
>
> "<shrug> I suppose" is basically a reluctant ack, in the absence of other
> solutions :)
>

Couldn't the "fix" also just be a note to users to disable ignore_hpa
if they notice that there raid arrays are not assembling correctly?

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-04-30 21:15         ` Dan Williams
  0 siblings, 0 replies; 42+ messages in thread
From: Dan Williams @ 2009-04-30 21:15 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

On Thu, Apr 30, 2009 at 1:00 PM, Jeff Garzik <jeff@garzik.org> wrote:
> Tejun Heo wrote:
>>
>> Hello,
>>
>> Jeff Garzik wrote:
>>>
>>> <shrug>  I suppose...
>>>
>>> It just seems like a nasty hack, but unfortunately I don't see
>>> anyone stepping up to do it properly -- with a DM device
>>> automatically layered on top that splits the device into separate
>>> regions: one block device for the 'regular' area, and one for the
>>> HPA.
>>
>> Isn't that more hacky?  I don't know.  All that dm needs to know is
>> the location of the metadata which is located w.r.t. the end of the
>> device which might be at a different location if BIOS tried to pull
>> silly stunts.  So, exporting the size BIOS might have used seems like
>> a straight forward solution to me.
>
> "<shrug> I suppose" is basically a reluctant ack, in the absence of other
> solutions :)
>

Couldn't the "fix" also just be a note to users to disable ignore_hpa
if they notice that there raid arrays are not assembling correctly?

--
Dan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-04-30 21:15         ` Dan Williams
@ 2009-05-04  8:02           ` Tejun Heo
  -1 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-05-04  8:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jeff Garzik, linux-scsi, Mauelshagen, linux-kernel, linux-ide,
	dm-devel, jens.axboe

Dan Williams wrote:
> On Thu, Apr 30, 2009 at 1:00 PM, Jeff Garzik <jeff@garzik.org> wrote:
>> Tejun Heo wrote:
>>> Hello,
>>>
>>> Jeff Garzik wrote:
>>>> <shrug>  I suppose...
>>>>
>>>> It just seems like a nasty hack, but unfortunately I don't see
>>>> anyone stepping up to do it properly -- with a DM device
>>>> automatically layered on top that splits the device into separate
>>>> regions: one block device for the 'regular' area, and one for the
>>>> HPA.
>>> Isn't that more hacky?  I don't know.  All that dm needs to know is
>>> the location of the metadata which is located w.r.t. the end of the
>>> device which might be at a different location if BIOS tried to pull
>>> silly stunts.  So, exporting the size BIOS might have used seems like
>>> a straight forward solution to me.
>> "<shrug> I suppose" is basically a reluctant ack, in the absence of other
>> solutions :)
>>
> 
> Couldn't the "fix" also just be a note to users to disable ignore_hpa
> if they notice that there raid arrays are not assembling correctly?

I don't know.  If it doesn't work automatically, the solution really
isn't worth much.  People generally don't (shouldn't need to) have any
idea what HPA is.  It basically boils down to "Linux doesn't support
RAID".  :-)

Maybe we should just put this issue to the rest and strongly advise
people against BIOS raids.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-05-04  8:02           ` Tejun Heo
  0 siblings, 0 replies; 42+ messages in thread
From: Tejun Heo @ 2009-05-04  8:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Dan Williams wrote:
> On Thu, Apr 30, 2009 at 1:00 PM, Jeff Garzik <jeff@garzik.org> wrote:
>> Tejun Heo wrote:
>>> Hello,
>>>
>>> Jeff Garzik wrote:
>>>> <shrug>  I suppose...
>>>>
>>>> It just seems like a nasty hack, but unfortunately I don't see
>>>> anyone stepping up to do it properly -- with a DM device
>>>> automatically layered on top that splits the device into separate
>>>> regions: one block device for the 'regular' area, and one for the
>>>> HPA.
>>> Isn't that more hacky?  I don't know.  All that dm needs to know is
>>> the location of the metadata which is located w.r.t. the end of the
>>> device which might be at a different location if BIOS tried to pull
>>> silly stunts.  So, exporting the size BIOS might have used seems like
>>> a straight forward solution to me.
>> "<shrug> I suppose" is basically a reluctant ack, in the absence of other
>> solutions :)
>>
> 
> Couldn't the "fix" also just be a note to users to disable ignore_hpa
> if they notice that there raid arrays are not assembling correctly?

I don't know.  If it doesn't work automatically, the solution really
isn't worth much.  People generally don't (shouldn't need to) have any
idea what HPA is.  It basically boils down to "Linux doesn't support
RAID".  :-)

Maybe we should just put this issue to the rest and strongly advise
people against BIOS raids.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-04  8:02           ` Tejun Heo
  (?)
@ 2009-05-04 16:48           ` Dan Williams
  2009-05-05  3:28             ` Tejun Heo
  -1 siblings, 1 reply; 42+ messages in thread
From: Dan Williams @ 2009-05-04 16:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

On Mon, May 4, 2009 at 1:02 AM, Tejun Heo <tj@kernel.org> wrote:
>> Couldn't the "fix" also just be a note to users to disable ignore_hpa
>> if they notice that there raid arrays are not assembling correctly?
>
> I don't know.  If it doesn't work automatically, the solution really
> isn't worth much.  People generally don't (shouldn't need to) have any
> idea what HPA is.  It basically boils down to "Linux doesn't support
> RAID".  :-)
>
> Maybe we should just put this issue to the rest and strongly advise
> people against BIOS raids.

That's a bit too far :-).  The real issue is that Linux is ignoring a
property of the platform.  To me ignoring HPA is like ignoring an e820
reserved memory region.  You might be able to get away with it
sometimes, but other times it breaks.

However, in the end I think having 'alt_size' available is a useful
mechanism for discovering and handling the ignore_hpa=1 case.   The
policy about when to trust it can be left to userspace.

Thanks,
Dan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-04 16:48           ` Dan Williams
@ 2009-05-05  3:28             ` Tejun Heo
  2009-05-05 15:34               ` Dan Williams
  0 siblings, 1 reply; 42+ messages in thread
From: Tejun Heo @ 2009-05-05  3:28 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Hello,

Dan Williams wrote:
> On Mon, May 4, 2009 at 1:02 AM, Tejun Heo <tj@kernel.org> wrote:
>>> Couldn't the "fix" also just be a note to users to disable ignore_hpa
>>> if they notice that there raid arrays are not assembling correctly?
>> I don't know.  If it doesn't work automatically, the solution really
>> isn't worth much.  People generally don't (shouldn't need to) have any
>> idea what HPA is.  It basically boils down to "Linux doesn't support
>> RAID".  :-)
>>
>> Maybe we should just put this issue to the rest and strongly advise
>> people against BIOS raids.
> 
> That's a bit too far :-).  The real issue is that Linux is ignoring a
> property of the platform.  To me ignoring HPA is like ignoring an e820
> reserved memory region.  You might be able to get away with it
> sometimes, but other times it breaks.

Well, somewhat but not quite.  The difference is with backward
compatibility and the fact that HPA has been used for many different
creative ;-) purposes including working around BIOS limitations.

> However, in the end I think having 'alt_size' available is a useful
> mechanism for discovering and handling the ignore_hpa=1 case.   The
> policy about when to trust it can be left to userspace.

Pushing this in the kernel without updating dm-raid is kind of
pointless.  Who should I be poking to get the userland part moving?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-05  3:28             ` Tejun Heo
@ 2009-05-05 15:34               ` Dan Williams
  2009-05-08  9:13                 ` Tejun Heo
  0 siblings, 1 reply; 42+ messages in thread
From: Dan Williams @ 2009-05-05 15:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

On Mon, May 4, 2009 at 8:28 PM, Tejun Heo <tj@kernel.org> wrote:
> Pushing this in the kernel without updating dm-raid is kind of
> pointless.  Who should I be poking to get the userland part moving?

Dmraid is not the only potential user.  Mdadm will also care about
this as I have seen a report that points to a user having what looks
like an HPA issue with an Intel(R) Matrix array.  So, there are two
immediate users for this interface.  Heinz is the right person to poke
for dmraid updates.  I'll handle the mdadm update.

Thanks,
Dan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-05 15:34               ` Dan Williams
@ 2009-05-08  9:13                 ` Tejun Heo
  2009-05-08 16:21                     ` Jeff Garzik
  2009-05-08 16:23                   ` Dan Williams
  0 siblings, 2 replies; 42+ messages in thread
From: Tejun Heo @ 2009-05-08  9:13 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Dan Williams wrote:
> On Mon, May 4, 2009 at 8:28 PM, Tejun Heo <tj@kernel.org> wrote:
>> Pushing this in the kernel without updating dm-raid is kind of
>> pointless.  Who should I be poking to get the userland part moving?
> 
> Dmraid is not the only potential user.  Mdadm will also care about
> this as I have seen a report that points to a user having what looks
> like an HPA issue with an Intel(R) Matrix array.  So, there are two
> immediate users for this interface.  Heinz is the right person to poke
> for dmraid updates.  I'll handle the mdadm update.

Alright, then.  With Jeff's reluctant ack and scheduled mdadm update,
I guess this one can go upstream.  I'll refresh the patchset and
repost.  Umm... BTW do I have Heinz's mail address right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-08  9:13                 ` Tejun Heo
@ 2009-05-08 16:21                     ` Jeff Garzik
  2009-05-08 16:23                   ` Dan Williams
  1 sibling, 0 replies; 42+ messages in thread
From: Jeff Garzik @ 2009-05-08 16:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-scsi, Mauelshagen, linux-kernel, linux-ide, dm-devel, jens.axboe

Tejun Heo wrote:
> Dan Williams wrote:
>> On Mon, May 4, 2009 at 8:28 PM, Tejun Heo <tj@kernel.org> wrote:
>>> Pushing this in the kernel without updating dm-raid is kind of
>>> pointless.  Who should I be poking to get the userland part moving?
>> Dmraid is not the only potential user.  Mdadm will also care about
>> this as I have seen a report that points to a user having what looks
>> like an HPA issue with an Intel(R) Matrix array.  So, there are two
>> immediate users for this interface.  Heinz is the right person to poke
>> for dmraid updates.  I'll handle the mdadm update.
> 
> Alright, then.  With Jeff's reluctant ack and scheduled mdadm update,
> I guess this one can go upstream.  I'll refresh the patchset and
> repost.  Umm... BTW do I have Heinz's mail address right?

The email addr is correct according to dmraid-current.tar.bz2 (dated Sep 
2008), at least.  I don't have any better info.

	Jeff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
@ 2009-05-08 16:21                     ` Jeff Garzik
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff Garzik @ 2009-05-08 16:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dan Williams, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, dm-devel

Tejun Heo wrote:
> Dan Williams wrote:
>> On Mon, May 4, 2009 at 8:28 PM, Tejun Heo <tj@kernel.org> wrote:
>>> Pushing this in the kernel without updating dm-raid is kind of
>>> pointless.  Who should I be poking to get the userland part moving?
>> Dmraid is not the only potential user.  Mdadm will also care about
>> this as I have seen a report that points to a user having what looks
>> like an HPA issue with an Intel(R) Matrix array.  So, there are two
>> immediate users for this interface.  Heinz is the right person to poke
>> for dmraid updates.  I'll handle the mdadm update.
> 
> Alright, then.  With Jeff's reluctant ack and scheduled mdadm update,
> I guess this one can go upstream.  I'll refresh the patchset and
> repost.  Umm... BTW do I have Heinz's mail address right?

The email addr is correct according to dmraid-current.tar.bz2 (dated Sep 
2008), at least.  I don't have any better info.

	Jeff





^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCHSET] block,scsi,libata: implement alt_size
  2009-05-08  9:13                 ` Tejun Heo
  2009-05-08 16:21                     ` Jeff Garzik
@ 2009-05-08 16:23                   ` Dan Williams
  1 sibling, 0 replies; 42+ messages in thread
From: Dan Williams @ 2009-05-08 16:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, linux-ide, jens.axboe, linux-kernel, linux-scsi,
	James.Bottomley, Mauelshagen, device-mapper development, heinzm

[ added heinzm@redhat.com ]

On Fri, May 8, 2009 at 2:13 AM, Tejun Heo <tj@kernel.org> wrote:
> Dan Williams wrote:
>> On Mon, May 4, 2009 at 8:28 PM, Tejun Heo <tj@kernel.org> wrote:
>>> Pushing this in the kernel without updating dm-raid is kind of
>>> pointless.  Who should I be poking to get the userland part moving?
>>
>> Dmraid is not the only potential user.  Mdadm will also care about
>> this as I have seen a report that points to a user having what looks
>> like an HPA issue with an Intel(R) Matrix array.  So, there are two
>> immediate users for this interface.  Heinz is the right person to poke
>> for dmraid updates.  I'll handle the mdadm update.
>
> Alright, then.  With Jeff's reluctant ack and scheduled mdadm update,
> I guess this one can go upstream.  I'll refresh the patchset and
> repost.  Umm... BTW do I have Heinz's mail address right?

Heinz, here is the whole thread:

http://marc.info/?t=123345711600004&r=1&w=2

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2009-05-08 16:23 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-01  2:55 [PATCHSET] block,scsi,libata: implement alt_size Tejun Heo
2009-02-01  2:55 ` Tejun Heo
2009-02-01  2:55 ` [PATCH 1/3] block: add alt_size Tejun Heo
2009-02-01  2:55   ` Tejun Heo
2009-02-01  2:55 ` [PATCH 2/3] scsi: add scsi_device->alt_capacity Tejun Heo
2009-02-01  2:55   ` Tejun Heo
2009-02-01  2:55 ` [PATCH 3/3] libata: export HPA size as alt_size Tejun Heo
2009-02-01  2:55   ` Tejun Heo
2009-02-02 17:12   ` [PATCH] libata: change drive ready wait after hard reset to 5s Stuart_Hayes
2009-02-02 17:12     ` Stuart_Hayes
2009-02-02 17:17     ` Mario Limonciello
2009-02-09 21:48     ` Stuart_Hayes
2009-02-09 21:48       ` Stuart_Hayes
2009-02-01  2:59 ` [PATCHSET] dmraid: use alt_size Tejun Heo
2009-02-01  2:59   ` Tejun Heo
2009-02-01  3:00   ` [PATCH 1/3] dmraid: set read_info to @offset by default in read_raid_dev() Tejun Heo
2009-02-01  3:00     ` Tejun Heo
2009-02-01  3:00   ` [PATCH 2/3] dmraid: add alt_size to dev_info Tejun Heo
2009-02-01  3:00     ` Tejun Heo
2009-02-01  3:01   ` [PATCH 3/3] dmraid: make nv use alt_size if available Tejun Heo
2009-02-01  3:01     ` Tejun Heo
2009-02-01  2:59 ` [PATCHSET] dmraid: use alt_size Tejun Heo
2009-02-01  3:20 ` [PATCHSET] block,scsi,libata: implement alt_size Jeff Garzik
2009-02-01  4:14   ` Tejun Heo
2009-02-01  4:14     ` Tejun Heo
2009-04-30  1:13 ` Tejun Heo
2009-04-30  1:13   ` Tejun Heo
2009-04-30  1:13 ` Tejun Heo
2009-04-30  1:45 ` Jeff Garzik
2009-04-30  1:50   ` Tejun Heo
2009-04-30 20:00     ` Jeff Garzik
2009-04-30 21:15       ` Dan Williams
2009-04-30 21:15         ` Dan Williams
2009-05-04  8:02         ` Tejun Heo
2009-05-04  8:02           ` Tejun Heo
2009-05-04 16:48           ` Dan Williams
2009-05-05  3:28             ` Tejun Heo
2009-05-05 15:34               ` Dan Williams
2009-05-08  9:13                 ` Tejun Heo
2009-05-08 16:21                   ` Jeff Garzik
2009-05-08 16:21                     ` Jeff Garzik
2009-05-08 16:23                   ` Dan Williams

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.