From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>, Ilya Dryomov <idryomov@gmail.com>,
	Song Liu <song@kernel.org>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	Stefan Haberland <sth@linux.ibm.com>,
	Jan Hoeppner <hoeppner@linux.ibm.com>,
	linux-block@vger.kernel.org, ceph-devel@vger.kernel.org,
	linux-bcache@vger.kernel.org, linux-raid@vger.kernel.org,
	linux-mtd@lists.infradead.org, linux-s390@vger.kernel.org,
	martin.petersen@oracle.com
Subject: Re: [PATCH, RFC 11/10] block: propagate BLKROSET to all partitions
Date: Tue, 10 Nov 2020 23:38:22 -0500
Message-ID: <yq1imacecwz.fsf@ca-mkp.ca.oracle.com>
In-Reply-To: <20201106140817.GA23557@lst.de> (Christoph Hellwig's message of "Fri, 6 Nov 2020 15:08:17 +0100")


Christoph,

> When setting the whole device read-only (or clearing the read-only
> state), also update the policy for all partitions.  The s390 dasd
> driver has always been doing this and it makes a lot of sense.

For your amusement, here's my attempt at addressing this from a while
back. Can't remember exactly why this stranded, I even wrote blktests
for it...

-- 
Martin K. Petersen	Oracle Linux Engineering

From a7898967402a69e59607300aa8e2e2a6941a61c1 Mon Sep 17 00:00:00 2001
From: "Martin K. Petersen" <martin.petersen@oracle.com>
Date: Wed, 27 Mar 2019 12:21:41 -0400
Subject: [PATCH] block: Fix read-only block device setting after revalidate

Commit 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading
partition") addressed a long-standing problem with user read-only
policy being overridden as a result of a device-initiated revalidate.
The commit has since been reverted due to a regression that left some
USB devices read-only indefinitely.

To fix the underlying problems with revalidate we need to keep track
of hardware state and user policy separately. Every time the state is
changed, either via a hardware event or the BLKROSET ioctl, the
per-partition read-only state is updated based on the combination of
device state and policy. The resulting active state is stored in a
separate hd_struct flag to avoid introducing additional lookups in the
I/O hot path.

The gendisk has been updated to reflect the current hardware state set
by the device driver. This is done to allow returning the device to
the hardware state once the user clears the BLKROSET flag.

For partitions, the existing hd_struct 'policy' flag is split into
two:

 - 'read_only' indicates the currently active read-only state of a
   whole disk device or partition.

 - 'ro_policy' indicates whether the user has administratively set
   the whole disk or partition read-only via the BLKROSET ioctl.
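
For reference, a minimal userspace sketch of how the policy is set and
queried, which is roughly what blockdev --setro / --getro do. The helper
names blk_set_ro() and blk_get_ro() are made up for illustration; only
the BLKROSET and BLKROGET ioctls from <linux/fs.h> are real. Setting the
flag normally requires CAP_SYS_ADMIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* BLKROSET, BLKROGET */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set (ro = 1) or clear (ro = 0) the read-only
 * policy on a whole disk or partition node such as /dev/sdX or /dev/sdX1.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int blk_set_ro(const char *path, int ro)
{
	int fd, ret, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, BLKROSET, &ro);	/* argument is a pointer to int */
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}

/*
 * Hypothetical helper: read back the read-only state the kernel reports
 * for the node. On success *ro is nonzero if the device is read-only.
 */
static int blk_get_ro(const char *path, int *ro)
{
	int fd, ret, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, BLKROGET, ro);
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```

With this patch applied, BLKROGET should report the combined 'read_only'
state rather than the raw user policy, since bdev_read_only() returns
'read_only'.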

The resulting semantics are as follows:

 - If BLKROSET is used to set a whole-disk device read-only, any
   partitions will end up in a read-only state until the user
   explicitly clears the flag.

 - If BLKROSET sets a given partition read-only, that partition will
   remain read-only even if the underlying storage stack initiates a
   revalidate. However, the BLKRRPART ioctl will cause the partition
   table to be dropped and any user policy on partitions will be lost.

 - If BLKROSET has not been set, both the whole disk device and any
   partitions will reflect the current write-protect state of the
   underlying device.
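
The semantics above can be sketched as a small standalone model. This is
not kernel code: struct disk_model, struct part_model, MAX_PARTS and the
model_*() functions are invented stand-ins for gendisk, hd_struct and
the functions touched by the diff below. Only the way the three flags
are combined is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARTS 4	/* arbitrary size for the model */

/* Stand-in for hd_struct: one entry per partition, index 0 = whole disk. */
struct part_model {
	bool read_only;	/* effective state, checked on the I/O path */
	bool ro_policy;	/* user policy set via BLKROSET */
};

/* Stand-in for gendisk. */
struct disk_model {
	bool read_only;	/* hardware write-protect state from the driver */
	struct part_model part[MAX_PARTS];
};

/* Mirrors update_part_ro_state(): recompute every effective state. */
static void model_update_parts(struct disk_model *d)
{
	for (size_t i = 0; i < MAX_PARTS; i++)
		d->part[i].read_only = d->read_only ||
				       d->part[0].ro_policy ||
				       d->part[i].ro_policy;
}

/* Mirrors set_device_ro(): the BLKROSET path for a disk or partition. */
static void model_set_device_ro(struct disk_model *d, size_t partno, bool state)
{
	assert(partno < MAX_PARTS);
	d->part[partno].read_only = d->part[partno].ro_policy = state;
	if (partno == 0)
		model_update_parts(d);
}

/* Mirrors set_disk_ro(): the driver reports a hardware state change. */
static void model_set_disk_ro(struct disk_model *d, bool state)
{
	if (d->read_only == state)
		return;
	d->read_only = state;
	model_update_parts(d);
}
```

In this model, setting policy on partition 2 and then toggling the
hardware state on and off leaves partition 2 read-only while the other
partitions return to writable, which is the revalidate behavior the
second bullet describes.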

Cc: Jeremy Cline <jeremy@jcline.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Reported-by: Oleksii Kurochko <olkuroch@cisco.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

diff --git a/block/blk-core.c b/block/blk-core.c
index 4673ebe42255..932f179a9095 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -792,7 +792,7 @@ static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part)
 {
 	const int op = bio_op(bio);
 
-	if (part->policy && op_is_write(op)) {
+	if (part->read_only && op_is_write(op)) {
 		char b[BDEVNAME_SIZE];
 
 		if (op_is_flush(bio->bi_opf) && !bio_sectors(bio))
diff --git a/block/genhd.c b/block/genhd.c
index 703267865f14..75138cf5540d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1539,38 +1539,73 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
 }
 
-void set_device_ro(struct block_device *bdev, int flag)
-{
-	bdev->bd_part->policy = flag;
-}
-
-EXPORT_SYMBOL(set_device_ro);
-
-void set_disk_ro(struct gendisk *disk, int flag)
+/**
+ * update_part_ro_state - iterate over partitions to update read-only state
+ * @disk:	The disk device
+ *
+ * This function updates the read-only state for all partitions on a
+ * given disk device. This is required every time a hardware event
+ * signals that the device write-protect state has changed. It is also
+ * necessary when the user sets or clears the read-only flag on the
+ * whole-disk device.
+ */
+static void update_part_ro_state(struct gendisk *disk)
 {
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
-	if (disk->part0.policy != flag) {
-		set_disk_ro_uevent(disk, flag);
-		disk->part0.policy = flag;
-	}
-
-	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY);
+	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY_PART0);
 	while ((part = disk_part_iter_next(&piter)))
-		part->policy = flag;
+		if (disk->read_only || disk->part0.ro_policy || part->ro_policy)
+			part->read_only = true;
+		else
+			part->read_only = false;
 	disk_part_iter_exit(&piter);
 }
 
+/**
+ * set_device_ro - set a block device read-only
+ * @bdev:	The block device (whole disk or partition)
+ * @state:	true or false
+ *
+ * This function is used to specify the read-only policy for a
+ * block_device (whole disk or partition). set_device_ro() is called
+ * by the BLKROSET ioctl.
+ */
+void set_device_ro(struct block_device *bdev, bool state)
+{
+	bdev->bd_part->read_only = bdev->bd_part->ro_policy = state;
+	if (bdev->bd_part->partno == 0)
+		update_part_ro_state(bdev->bd_disk);
+}
+EXPORT_SYMBOL(set_device_ro);
+
+/**
+ * set_disk_ro - set a gendisk read-only
+ * @disk:	The disk device
+ * @state:	true or false
+ *
+ * This function is used to indicate whether a given disk device
+ * should have its read-only flag set. set_disk_ro() is typically used
+ * by device drivers to indicate whether the underlying physical
+ * device is write-protected.
+ */
+void set_disk_ro(struct gendisk *disk, bool state)
+{
+	if (disk->read_only == state)
+		return;
+	set_disk_ro_uevent(disk, state);
+	disk->read_only = state;
+	update_part_ro_state(disk);
+}
 EXPORT_SYMBOL(set_disk_ro);
 
 int bdev_read_only(struct block_device *bdev)
 {
 	if (!bdev)
 		return 0;
-	return bdev->bd_part->policy;
+	return bdev->bd_part->read_only;
 }
-
 EXPORT_SYMBOL(bdev_read_only);
 
 int invalidate_partition(struct gendisk *disk, int partno)
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 8e596a8dff32..8c55b90c918d 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -98,7 +98,7 @@ static ssize_t part_ro_show(struct device *dev,
 			    struct device_attribute *attr, char *buf)
 {
 	struct hd_struct *p = dev_to_part(dev);
-	return sprintf(buf, "%d\n", p->policy ? 1 : 0);
+	return sprintf(buf, "%u\n", p->read_only ? 1 : 0);
 }
 
 static ssize_t part_alignment_offset_show(struct device *dev,
@@ -338,7 +338,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
 		queue_limit_discard_alignment(&disk->queue->limits, start);
 	p->nr_sects = len;
 	p->partno = partno;
-	p->policy = get_disk_ro(disk);
+	p->read_only = get_disk_ro(disk);
 
 	if (info) {
 		struct partition_meta_info *pinfo = alloc_part_info(disk);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 06c0fd594097..3ebd94f520cc 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -118,7 +118,8 @@ struct hd_struct {
 	unsigned int discard_alignment;
 	struct device __dev;
 	struct kobject *holder_dir;
-	int policy, partno;
+	bool read_only, ro_policy;
+	int partno;
 	struct partition_meta_info *info;
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 	int make_it_fail;
@@ -183,6 +184,7 @@ struct gendisk {
 
 	char disk_name[DISK_NAME_LEN];	/* name of major driver */
 	char *(*devnode)(struct gendisk *gd, umode_t *mode);
+	bool read_only;			/* device read-only state */
 
 	unsigned int events;		/* supported events */
 	unsigned int async_events;	/* async events, subset of all */
@@ -431,12 +433,12 @@ extern void del_gendisk(struct gendisk *gp);
 extern struct gendisk *get_gendisk(dev_t dev, int *partno);
 extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
 
-extern void set_device_ro(struct block_device *bdev, int flag);
-extern void set_disk_ro(struct gendisk *disk, int flag);
+extern void set_device_ro(struct block_device *bdev, bool state);
+extern void set_disk_ro(struct gendisk *disk, bool state);
 
 static inline int get_disk_ro(struct gendisk *disk)
 {
-	return disk->part0.policy;
+	return disk->part0.read_only;
 }
 
 extern void disk_block_events(struct gendisk *disk);
