All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution
       [not found] <20180404085418.29205-1-damien.lemoal@wdc.com>
@ 2018-04-04  8:54 ` Damien Le Moal
  2018-04-04 15:08   ` Bart Van Assche
  2018-04-04  8:54 ` [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution Damien Le Moal
  1 sibling, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2018-04-04  8:54 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen; +Cc: Bart Van Assche, Hannes Reinecke, stable

When sd_revalidate() is executed for a ZBC disk (zoned block device),
sd_zbc_read_zones() is called to revalidate the disk zone configuration.
This executes sd_zbc_check_zone_size() to check that the disk zone sizes
are in line with the defined constraints (all zones must be the same
size and a power of 2 number of LBAs). As part of its execution,
sd_zbc_check_zone_size() was temporarily setting sdkp->zone_blocks to 0.
If during the execution of sd_zbc_check_zone_size() within
sd_revalidate() context, another context issues a command which
references sdkp->zone_blocks to check zone alignment of the command
(e.g. a zone reset is issued and sd_zbc_setup_reset_cmnd() is called),
an invalid value for the disk zone size is used and the alignment check
fails.

Simply fix this by using an on-stack variable inside
sd_zbc_check_zone_size() instead of directly using sdkp->zone_blocks.
This change is valid for both revalidate as well as for the first scan
of the device.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/sd_zbc.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 89cf4498f535..b59454ed5087 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -403,6 +403,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf)
  */
 static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 {
+	u64 sdkp_zone_blocks = sdkp->zone_blocks;
 	u64 zone_blocks = 0;
 	sector_t block = 0;
 	unsigned char *buf;
@@ -412,8 +413,6 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 	int ret;
 	u8 same;
 
-	sdkp->zone_blocks = 0;
-
 	/* Get a buffer */
 	buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL);
 	if (!buf)
@@ -446,11 +445,11 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 		/* Parse zone descriptors */
 		while (rec < buf + buf_len) {
 			zone_blocks = get_unaligned_be64(&rec[8]);
-			if (sdkp->zone_blocks == 0) {
-				sdkp->zone_blocks = zone_blocks;
-			} else if (zone_blocks != sdkp->zone_blocks &&
+			if (sdkp_zone_blocks == 0) {
+				sdkp_zone_blocks = zone_blocks;
+			} else if (zone_blocks != sdkp_zone_blocks &&
 				   (block + zone_blocks < sdkp->capacity
-				    || zone_blocks > sdkp->zone_blocks)) {
+				    || zone_blocks > sdkp_zone_blocks)) {
 				zone_blocks = 0;
 				goto out;
 			}
@@ -467,7 +466,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
 
 	} while (block < sdkp->capacity);
 
-	zone_blocks = sdkp->zone_blocks;
+	zone_blocks = sdkp_zone_blocks;
 
 out:
 	if (!zone_blocks) {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution
       [not found] <20180404085418.29205-1-damien.lemoal@wdc.com>
  2018-04-04  8:54 ` [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution Damien Le Moal
@ 2018-04-04  8:54 ` Damien Le Moal
  2018-04-04 15:22   ` Bart Van Assche
  1 sibling, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2018-04-04  8:54 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen
  Cc: Bart Van Assche, Hannes Reinecke, Bart Van Assche,
	Christoph Hellwig, Hannes Reinecke, stable

From: Bart Van Assche <bart.vanassche@wdc.com>

Since SCSI scanning occurs asynchronously, since sd_revalidate_disk()
is called from sd_probe_async() and since sd_revalidate_disk() calls
sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called
concurrently with operations referencing a drive zone bitmaps and number
of zones. Make sure that this race does not cause failures when
revalidate does not detect any change by making the following changes to
sd_zbc_setup():
- Ensure that sd_zbc_setup_seq_zones_bitmap() does not change any
  ZBC metadata in the request queue.
- Only modify the ZBC information in the request queue that has
  changed. If the number of zones has changed, update q->nr_zones,
  q->seq_zones_wlock and q->seq_zones_bitmap. If the type of some
  zones has changed but not the number of zones, only update the
  zone type information.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
[Damien] Updated commit message and changed nr_zones/bitmap swap order.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: stable@vger.kernel.org
---
 drivers/scsi/sd_zbc.c | 45 +++++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index b59454ed5087..39ddbe92769c 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -551,14 +551,13 @@ static sector_t sd_zbc_get_seq_zones(struct scsi_disk *sdkp, unsigned char *buf,
 }
 
 /**
- * sd_zbc_setup_seq_zones_bitmap - Initialize the disk seq zone bitmap.
+ * sd_zbc_setup_seq_zones_bitmap - Initialize a seq zone bitmap.
  * @sdkp: target disk
  *
  * Allocate a zone bitmap and initialize it by identifying sequential zones.
  */
-static int sd_zbc_setup_seq_zones_bitmap(struct scsi_disk *sdkp)
+static unsigned long *sd_zbc_setup_seq_zones_bitmap(struct scsi_disk *sdkp)
 {
-	struct request_queue *q = sdkp->disk->queue;
 	unsigned long *seq_zones_bitmap;
 	sector_t lba = 0;
 	unsigned char *buf;
@@ -566,7 +565,7 @@ static int sd_zbc_setup_seq_zones_bitmap(struct scsi_disk *sdkp)
 
 	seq_zones_bitmap = sd_zbc_alloc_zone_bitmap(sdkp);
 	if (!seq_zones_bitmap)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL);
 	if (!buf)
@@ -589,12 +588,9 @@ static int sd_zbc_setup_seq_zones_bitmap(struct scsi_disk *sdkp)
 	kfree(buf);
 	if (ret) {
 		kfree(seq_zones_bitmap);
-		return ret;
+		return ERR_PTR(ret);
 	}
-
-	q->seq_zones_bitmap = seq_zones_bitmap;
-
-	return 0;
+	return seq_zones_bitmap;
 }
 
 static void sd_zbc_cleanup(struct scsi_disk *sdkp)
@@ -630,24 +626,37 @@ static int sd_zbc_setup(struct scsi_disk *sdkp)
 	 * of zones changed.
 	 */
 	if (sdkp->nr_zones != q->nr_zones) {
+		struct request_queue *q = sdkp->disk->queue;
+		unsigned long *seq_zones_wlock = NULL, *seq_zones_bitmap = NULL;
+		size_t zone_bitmap_size;
 
-		sd_zbc_cleanup(sdkp);
-
-		q->nr_zones = sdkp->nr_zones;
 		if (sdkp->nr_zones) {
-			q->seq_zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
-			if (!q->seq_zones_wlock) {
+			seq_zones_wlock = sd_zbc_alloc_zone_bitmap(sdkp);
+			if (!seq_zones_wlock) {
 				ret = -ENOMEM;
 				goto err;
 			}
 
-			ret = sd_zbc_setup_seq_zones_bitmap(sdkp);
-			if (ret) {
-				sd_zbc_cleanup(sdkp);
+			seq_zones_bitmap = sd_zbc_setup_seq_zones_bitmap(sdkp);
+			if (IS_ERR(seq_zones_bitmap)) {
+				ret = PTR_ERR(seq_zones_bitmap);
+				kfree(seq_zones_wlock);
 				goto err;
 			}
 		}
-
+		zone_bitmap_size = BITS_TO_LONGS(sdkp->nr_zones) *
+			sizeof(unsigned long);
+		if (q->nr_zones != sdkp->nr_zones) {
+			swap(q->seq_zones_wlock, seq_zones_wlock);
+			swap(q->seq_zones_bitmap, seq_zones_bitmap);
+			q->nr_zones = sdkp->nr_zones;
+		} else if (memcmp(q->seq_zones_bitmap, seq_zones_bitmap,
+				  zone_bitmap_size) != 0) {
+			memcpy(q->seq_zones_bitmap, seq_zones_bitmap,
+			       zone_bitmap_size);
+		}
+		kfree(seq_zones_wlock);
+		kfree(seq_zones_bitmap);
 	}
 
 	return 0;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution
  2018-04-04  8:54 ` [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution Damien Le Moal
@ 2018-04-04 15:08   ` Bart Van Assche
  2018-04-04 23:54     ` Damien Le Moal
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2018-04-04 15:08 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen; +Cc: Hannes Reinecke, stable

On 04/04/18 01:54, Damien Le Moal wrote:
>   static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
>   {
> +	u64 sdkp_zone_blocks = sdkp->zone_blocks;

Shouldn't this variable be initialized to zero such that zone size 
changes are accepted even if the SAME field in the REPORT ZONES response 
is zero?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution
  2018-04-04  8:54 ` [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution Damien Le Moal
@ 2018-04-04 15:22   ` Bart Van Assche
  2018-04-04 23:59     ` Damien Le Moal
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2018-04-04 15:22 UTC (permalink / raw)
  To: linux-scsi, Damien Le Moal, martin.petersen; +Cc: hch, hare, hare, stable

On Wed, 2018-04-04 at 17:54 +0900, Damien Le Moal wrote:
> Since SCSI scanning occurs asynchronously, since sd_revalidate_disk()
> is called from sd_probe_async() and since sd_revalidate_disk() calls
> sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called
> concurrently with operations referencing a drive zone bitmaps and number
                                           ^^^^^^^^^^^^^^^^^^^^

Should "a" be changed into "the"?

> [Damien] Updated commit message and changed nr_zones/bitmap swap order.

Updating the number of zones after having updated the bitmap pointers is not
sufficient to avoid trouble if the number of zones as reported by the drive
changes while I/O is in progress. With the current implementation if the
number of zones changes the seq_zones_bitmap is cleared. Can this cause
trouble for the mq-deadline scheduler? Additionally, CPUs other than x86 can
reorder store operations. Even worse, a CPU could cache the zone bitmap
pointers which means that at least RCU protection + kfree_rcu() is needed to
avoid trouble. I think we either should handle this case properly or issue a
kernel warning.

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution
  2018-04-04 15:08   ` Bart Van Assche
@ 2018-04-04 23:54     ` Damien Le Moal
  0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2018-04-04 23:54 UTC (permalink / raw)
  To: Bart Van Assche, linux-scsi, Martin K . Petersen; +Cc: Hannes Reinecke, stable

Bart,

On 4/5/18 00:08, Bart Van Assche wrote:
> On 04/04/18 01:54, Damien Le Moal wrote:
>>   static int sd_zbc_check_zone_size(struct scsi_disk *sdkp)
>>   {
>> +	u64 sdkp_zone_blocks = sdkp->zone_blocks;
> 
> Shouldn't this variable be initialized to zero such that zone size 
> changes are accepted even if the SAME field in the REPORT ZONES response 
> is zero?

sdkp_zone_blocks will be 0 when sd_zbc_check_zone_size() is called for
the first scan of a disk and will hold the current disk value if
sd_zbc_check_zone_size() is called on a revalidate after first scan. If
the initial value is 0, there is no check and the variable is first
initialized. Otherwise, the value is compared to the zone size reported.
In both cases, the zone size change will be cought.

But granted, setting the value initially to 0 is easier to understand.
I will also change:

	} else {
                sdkp->zone_blocks = zone_blocks;

                sdkp->zone_shift = ilog2(zone_blocks);

        }

to

	} else if (sdkp->zone_blocks != zone_blocks) {

                sdkp->zone_blocks = zone_blocks;

                sdkp->zone_shift = ilog2(zone_blocks);

        }

to make things really clear.

Similarly to a capacity change, It may also be good to add a warning
message. After all, on a drive swap, we can have the case where the
capacity does not change, but the zone size does.

Sending a v2.

Best regards.

-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution
  2018-04-04 15:22   ` Bart Van Assche
@ 2018-04-04 23:59     ` Damien Le Moal
  0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2018-04-04 23:59 UTC (permalink / raw)
  To: Bart Van Assche, linux-scsi, martin.petersen; +Cc: hch, hare, hare, stable

Bart,

On 4/5/18 00:22, Bart Van Assche wrote:
> On Wed, 2018-04-04 at 17:54 +0900, Damien Le Moal wrote:
>> Since SCSI scanning occurs asynchronously, since sd_revalidate_disk()
>> is called from sd_probe_async() and since sd_revalidate_disk() calls
>> sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called
>> concurrently with operations referencing a drive zone bitmaps and number
>                                            ^^^^^^^^^^^^^^^^^^^^
> 
> Should "a" be changed into "the"?

Yes.


>> [Damien] Updated commit message and changed nr_zones/bitmap swap order.
> 
> Updating the number of zones after having updated the bitmap pointers is not
> sufficient to avoid trouble if the number of zones as reported by the drive
> changes while I/O is in progress. With the current implementation if the
> number of zones changes the seq_zones_bitmap is cleared. Can this cause
> trouble for the mq-deadline scheduler? Additionally, CPUs other than x86 can
> reorder store operations. Even worse, a CPU could cache the zone bitmap
> pointers which means that at least RCU protection + kfree_rcu() is needed to
> avoid trouble. I think we either should handle this case properly or issue a
> kernel warning.

OK. Let's work on that.


-- 
Damien Le Moal,
Western Digital

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-04-04 23:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180404085418.29205-1-damien.lemoal@wdc.com>
2018-04-04  8:54 ` [PATCH 1/2] sd_zbc: Avoid errors due to sd_zbc_check_zone_size() execution Damien Le Moal
2018-04-04 15:08   ` Bart Van Assche
2018-04-04 23:54     ` Damien Le Moal
2018-04-04  8:54 ` [PATCH 2/2] sd_zbc: Avoid errors due to sd_zbc_setup() execution Damien Le Moal
2018-04-04 15:22   ` Bart Van Assche
2018-04-04 23:59     ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.