linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2][V3] block: Support online resize of disk partitions
@ 2012-02-14 20:39 Vivek Goyal
  2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
  2012-02-14 20:39 ` [PATCH 2/2] util-linux: resizepart: Utility to resize a partition Vivek Goyal
  0 siblings, 2 replies; 10+ messages in thread
From: Vivek Goyal @ 2012-02-14 20:39 UTC (permalink / raw)
  To: linux-kernel, axboe, dm-devel, kzak; +Cc: vgoyal, psusi, psusi, maxim.patlasov

Hi,

This is V3 of the patch that adds support for online resizing of a partition.
It is based on patches previously posted by Phillip Susi.

There are two patches: a kernel patch, and a util-linux patch that adds a
user space utility, "resizepart", for resizing a partition.

This ioctl only changes the partition size in the kernel; it does not change
the size recorded on disk. If the change is to be retained across reboot,
the user must also update the on-disk data structures using fdisk (or
partx).

Changes since V2
----------------
- Do not ignore the "start" parameter in RESIZE ioctl.
- Change resizepart utility to parse sysfs to get to partition start.
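
As a rough illustration of what honouring "start" means, here is a small user
space model of the checks the RESIZE path performs: the requested start must
match the partition's current start, and the new range must not overlap any
other partition. The struct part_model type and check_resize() are
hypothetical names made up for this sketch; the real code operates on struct
hd_struct under bd_mutex.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user space model of a partition (not struct hd_struct). */
struct part_model {
	int partno;
	unsigned long long start_sect;	/* first sector */
	unsigned long long nr_sects;	/* length in sectors */
};

/*
 * Validate a resize request for partition 'partno'.
 * Returns 0 if acceptable, -ENXIO if the partition does not exist,
 * -EINVAL if 'start' differs from the current start, and -EBUSY if the
 * new range would overlap another partition.
 */
static int check_resize(const struct part_model *tbl, size_t n, int partno,
			unsigned long long start, unsigned long long length)
{
	const struct part_model *target = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].partno == partno)
			target = &tbl[i];
	if (!target)
		return -ENXIO;

	/* The start may not move; only the length is allowed to change. */
	if (start != target->start_sect)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		const struct part_model *p = &tbl[i];

		if (p->partno == partno)
			continue;
		/* Disjoint if the new range ends before p or starts after it. */
		if (!(start + length <= p->start_sect ||
		      start >= p->start_sect + p->nr_sects))
			return -EBUSY;
	}
	return 0;
}
```

With partitions at sectors [2048, 22528) and [61440, 81920), growing the first
one to 40960 sectors passes, while growing it to 81920 sectors is rejected
because it would run into the second.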

Changes since V1
----------------
The following changes were made since the version Phillip posted.

- RESIZE ioctl ignores the partition "start" and does not expect the user to
  specify one. The caller only needs to specify the "device", "partition
  number" and "size" of the new partition.

- Got rid of part_nr_sects_write_begin/part_nr_sects_write_end functions
  and replaced these with single part_nr_sects_write().

- Some sequence counter related changes are simply lifted from i_size_write().

- Initialized part->nr_sects_seq using seqcount_init().
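
For readers unfamiliar with the sequence counter pattern, the following is a
user space sketch of the idea using C11 atomics. It is not the kernel
implementation; struct sized_part and its helpers are hypothetical names, and
the 64-bit size is deliberately split into two 32-bit halves to mimic a 32-bit
machine where the update is not atomic. Writers are assumed to be serialized
by an external lock, as the patch does with bd_mutex.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user space analogue of nr_sects plus nr_sects_seq. */
struct sized_part {
	atomic_uint seq;		/* even: stable, odd: write in progress */
	_Atomic uint32_t lo, hi;	/* 64-bit size kept as two 32-bit halves */
};

static void sized_part_init(struct sized_part *p, uint64_t size)
{
	atomic_init(&p->seq, 0);
	atomic_init(&p->lo, (uint32_t)size);
	atomic_init(&p->hi, (uint32_t)(size >> 32));
}

/* Writers must be serialized by the caller (e.g. by holding a mutex). */
static void sized_part_write(struct sized_part *p, uint64_t size)
{
	unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

	/* Make the counter odd so readers know an update is in flight. */
	atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->lo, (uint32_t)size, memory_order_relaxed);
	atomic_store_explicit(&p->hi, (uint32_t)(size >> 32),
			      memory_order_relaxed);
	/* Make the counter even again; publishes both halves. */
	atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Lockless read: retry until both halves came from the same update. */
static uint64_t sized_part_read(struct sized_part *p)
{
	unsigned s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load_explicit(&p->seq, memory_order_acquire);
		lo = atomic_load_explicit(&p->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&p->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&p->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it simply retries if the counter was odd
or changed underneath it. This is also why writers must not race with each
other: two unserialized writers could leave the counter odd and readers would
then spin forever.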

Phillip, do let me know if I should put your signed-off-by also in the
patch.

Any review feedback is welcome.

I ran the following test:

- Create a partition of 10MB on a disk using fdisk.
- Add this partition to a volume group
- Use fdisk to increase the partition size to 20MB. (First delete the
  partition and then create a new one of 20MB size).
- Use resizepart to extend partition size in kernel.
	resizepart /dev/sdc 1 40960
- Do pvresize on the partition so that the physical volume can be increased
  in size online.
	pvresize /dev/sdc1

pvresize does recognize the new size. Also lsblk and /proc/partitions
report the new size of partition.
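
For reference, the 40960 passed to resizepart above is the 20MB size expressed
in 512-byte sectors. A minimal sketch of the unit conversions involved follows;
the helper names and EXAMPLE_SECTOR_SHIFT are made up for illustration and are
not part of either patch.

```c
#include <stdint.h>

/* 512-byte sectors, matching the ">> 9" and "<< 9" used in the patches. */
#define EXAMPLE_SECTOR_SHIFT 9

/* Size in MiB -> number of 512-byte sectors (what resizepart expects). */
static uint64_t mib_to_sectors(uint64_t mib)
{
	return (mib << 20) >> EXAMPLE_SECTOR_SHIFT;
}

/* Sectors -> bytes (the unit of p.start and p.length in the ioctl). */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors << EXAMPLE_SECTOR_SHIFT;
}

/* Sectors -> 1KiB blocks (the "nr_sects >> 1" printed in /proc/partitions). */
static uint64_t sectors_to_kib(uint64_t sectors)
{
	return sectors >> 1;
}
```

So a 20MB partition is 40960 sectors, is passed to the ioctl as 20971520
bytes, and shows up in /proc/partitions as 20480 blocks.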

Thanks
Vivek

[PATCH 1/2] block: add partition resize function to blkpg ioctl
[PATCH 2/2] util-linux: resizepart: Utility to resize a partition

* [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-14 20:39 [PATCH 0/2][V3] block: Support online resize of disk partitions Vivek Goyal
@ 2012-02-14 20:39 ` Vivek Goyal
  2012-02-20 14:42   ` Vivek Goyal
                     ` (2 more replies)
  2012-02-14 20:39 ` [PATCH 2/2] util-linux: resizepart: Utility to resize a partition Vivek Goyal
  1 sibling, 3 replies; 10+ messages in thread
From: Vivek Goyal @ 2012-02-14 20:39 UTC (permalink / raw)
  To: linux-kernel, axboe, dm-devel, kzak; +Cc: vgoyal, psusi, psusi, maxim.patlasov

Add a new operation code ( BLKPG_RESIZE_PARTITION ) to the
BLKPG ioctl that allows altering the size of an existing
partition, even if it is currently in use.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 block/genhd.c             |   20 ++++++++++++----
 block/ioctl.c             |   57 ++++++++++++++++++++++++++++++++++++++++++--
 block/partition-generic.c |    4 ++-
 include/linux/blkpg.h     |    1 +
 include/linux/genhd.h     |   57 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 130 insertions(+), 9 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 23b4f70..935e09b 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -153,7 +153,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
 		part = rcu_dereference(ptbl->part[piter->idx]);
 		if (!part)
 			continue;
-		if (!part->nr_sects &&
+		if (!part_nr_sects_read(part) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY) &&
 		    !(piter->flags & DISK_PITER_INCL_EMPTY_PART0 &&
 		      piter->idx == 0))
@@ -190,7 +190,7 @@ EXPORT_SYMBOL_GPL(disk_part_iter_exit);
 static inline int sector_in_part(struct hd_struct *part, sector_t sector)
 {
 	return part->start_sect <= sector &&
-		sector < part->start_sect + part->nr_sects;
+		sector < part->start_sect + part_nr_sects_read(part);
 }
 
 /**
@@ -765,8 +765,8 @@ void __init printk_all_partitions(void)
 
 			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
 			       bdevt_str(part_devt(part), devt_buf),
-			       (unsigned long long)part->nr_sects >> 1,
-			       disk_name(disk, part->partno, name_buf), uuid);
+			       (unsigned long long)part_nr_sects_read(part) >> 1
+			       , disk_name(disk, part->partno, name_buf), uuid);
 			if (is_part0) {
 				if (disk->driverfs_dev != NULL &&
 				    disk->driverfs_dev->driver != NULL)
@@ -857,7 +857,7 @@ static int show_partition(struct seq_file *seqf, void *v)
 	while ((part = disk_part_iter_next(&piter)))
 		seq_printf(seqf, "%4d  %7d %10llu %s\n",
 			   MAJOR(part_devt(part)), MINOR(part_devt(part)),
-			   (unsigned long long)part->nr_sects >> 1,
+			   (unsigned long long)part_nr_sects_read(part) >> 1,
 			   disk_name(sgp, part->partno, buf));
 	disk_part_iter_exit(&piter);
 
@@ -1263,6 +1263,16 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
 		}
 		disk->part_tbl->part[0] = &disk->part0;
 
+		/*
+		 * set_capacity() and get_capacity() currently don't use
+		 * seqcounter to read/update the part0->nr_sects. Still init
+		 * the counter as we can read the sectors in IO submission
+		 * path using sequence counters.
+		 *
+		 * TODO: Ideally set_capacity() and get_capacity() should be
+		 * converted to make use of bd_mutex and sequence counters.
+		 */
+		seqcount_init(&disk->part0.nr_sects_seq);
 		hd_ref_init(&disk->part0);
 
 		disk->minors = minors;
diff --git a/block/ioctl.c b/block/ioctl.c
index ba15b2d..ddbc649 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -13,7 +13,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
 {
 	struct block_device *bdevp;
 	struct gendisk *disk;
-	struct hd_struct *part;
+	struct hd_struct *part, *lpart;
 	struct blkpg_ioctl_arg a;
 	struct blkpg_partition p;
 	struct disk_part_iter piter;
@@ -36,8 +36,8 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
 		case BLKPG_ADD_PARTITION:
 			start = p.start >> 9;
 			length = p.length >> 9;
-			/* check for fit in a hd_struct */ 
-			if (sizeof(sector_t) == sizeof(long) && 
+			/* check for fit in a hd_struct */
+			if (sizeof(sector_t) == sizeof(long) &&
 			    sizeof(long long) > sizeof(long)) {
 				long pstart = start, plength = length;
 				if (pstart != start || plength != length
@@ -92,6 +92,57 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
 			bdput(bdevp);
 
 			return 0;
+		case BLKPG_RESIZE_PARTITION:
+			start = p.start >> 9;
+			/* new length: p.length is in bytes, length in sectors */
+			length = p.length >> 9;
+			/* check for fit in a hd_struct */
+			if (sizeof(sector_t) == sizeof(long) &&
+			    sizeof(long long) > sizeof(long)) {
+				long pstart = start, plength = length;
+				if (pstart != start || plength != length
+				    || pstart < 0 || plength < 0)
+					return -EINVAL;
+			}
+			part = disk_get_part(disk, partno);
+			if (!part)
+				return -ENXIO;
+			bdevp = bdget(part_devt(part));
+			if (!bdevp) {
+				disk_put_part(part);
+				return -ENOMEM;
+			}
+			mutex_lock(&bdevp->bd_mutex);
+			mutex_lock_nested(&bdev->bd_mutex, 1);
+			if (start != part->start_sect) {
+				mutex_unlock(&bdevp->bd_mutex);
+				mutex_unlock(&bdev->bd_mutex);
+				disk_put_part(part);
+				return -EINVAL;
+			}
+			/* overlap? */
+			disk_part_iter_init(&piter, disk,
+					    DISK_PITER_INCL_EMPTY);
+			while ((lpart = disk_part_iter_next(&piter))) {
+				if (lpart->partno != partno &&
+				   !(start + length <= lpart->start_sect ||
+				   start >= lpart->start_sect + lpart->nr_sects)
+				   ) {
+					disk_part_iter_exit(&piter);
+					mutex_unlock(&bdevp->bd_mutex);
+					mutex_unlock(&bdev->bd_mutex);
+					disk_put_part(part);
+					return -EBUSY;
+				}
+			}
+			disk_part_iter_exit(&piter);
+			part_nr_sects_write(part, (sector_t)length);
+			i_size_write(bdevp->bd_inode, p.length);
+			mutex_unlock(&bdevp->bd_mutex);
+			mutex_unlock(&bdev->bd_mutex);
+			bdput(bdevp);
+			disk_put_part(part);
+			return 0;
 		default:
 			return -EINVAL;
 	}
diff --git a/block/partition-generic.c b/block/partition-generic.c
index d06ec1c..363a6f6 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -84,7 +84,7 @@ ssize_t part_size_show(struct device *dev,
 		       struct device_attribute *attr, char *buf)
 {
 	struct hd_struct *p = dev_to_part(dev);
-	return sprintf(buf, "%llu\n",(unsigned long long)p->nr_sects);
+	return sprintf(buf, "%llu\n",(unsigned long long)part_nr_sects_read(p));
 }
 
 static ssize_t part_ro_show(struct device *dev,
@@ -294,6 +294,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
 		err = -ENOMEM;
 		goto out_free;
 	}
+
+	seqcount_init(&p->nr_sects_seq);
 	pdev = part_to_dev(p);
 
 	p->start_sect = start;
diff --git a/include/linux/blkpg.h b/include/linux/blkpg.h
index faf8a45..a851944 100644
--- a/include/linux/blkpg.h
+++ b/include/linux/blkpg.h
@@ -40,6 +40,7 @@ struct blkpg_ioctl_arg {
 /* The subfunctions (for the op field) */
 #define BLKPG_ADD_PARTITION	1
 #define BLKPG_DEL_PARTITION	2
+#define BLKPG_RESIZE_PARTITION	3
 
 /* Sizes of name fields. Unused at present. */
 #define BLKPG_DEVNAMELTH	64
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index fe23ee7..0def3ef 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -98,7 +98,13 @@ struct partition_meta_info {
 
 struct hd_struct {
 	sector_t start_sect;
+	/*
+	 * nr_sects is protected by sequence counter. One might extend a
+	 * partition while IO is happening to it and update of nr_sects
+	 * can be non-atomic on 32bit machines with 64bit sector_t.
+	 */
 	sector_t nr_sects;
+	seqcount_t nr_sects_seq;
 	sector_t alignment_offset;
 	unsigned int discard_alignment;
 	struct device __dev;
@@ -653,6 +659,57 @@ static inline void hd_struct_put(struct hd_struct *part)
 		__delete_partition(part);
 }
 
+/*
+ * Any access of part->nr_sects which is not protected by partition
+ * bd_mutex or gendisk bdev bd_mutex, should be done using this
+ * accessor function.
+ *
+ * Code written along the lines of i_size_read() and i_size_write().
+ * CONFIG_PREEMPT case optimizes the case of UP kernel with preemption
+ * on.
+ */
+static inline sector_t part_nr_sects_read(struct hd_struct *part)
+{
+#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
+	sector_t nr_sects;
+	unsigned seq;
+	do {
+		seq = read_seqcount_begin(&part->nr_sects_seq);
+		nr_sects = part->nr_sects;
+	} while (read_seqcount_retry(&part->nr_sects_seq, seq));
+	return nr_sects;
+#elif BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_PREEMPT)
+	sector_t nr_sects;
+
+	preempt_disable();
+	nr_sects = part->nr_sects;
+	preempt_enable();
+	return nr_sects;
+#else
+	return part->nr_sects;
+#endif
+}
+
+/*
+ * Should be called with the partition's mutex (typically bd_mutex) held
+ * to provide mutual exclusion among writers; otherwise the seqcount might
+ * be left in a wrong state, leaving the readers spinning infinitely.
+ */
+static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
+{
+#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
+	write_seqcount_begin(&part->nr_sects_seq);
+	part->nr_sects = size;
+	write_seqcount_end(&part->nr_sects_seq);
+#elif BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_PREEMPT)
+	preempt_disable();
+	part->nr_sects = size;
+	preempt_enable();
+#else
+	part->nr_sects = size;
+#endif
+}
+
 #else /* CONFIG_BLOCK */
 
 static inline void printk_all_partitions(void) { }
-- 
1.7.6.4


* [PATCH 2/2] util-linux: resizepart: Utility to resize a partition
  2012-02-14 20:39 [PATCH 0/2][V3] block: Support online resize of disk partitions Vivek Goyal
  2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
@ 2012-02-14 20:39 ` Vivek Goyal
  1 sibling, 0 replies; 10+ messages in thread
From: Vivek Goyal @ 2012-02-14 20:39 UTC (permalink / raw)
  To: linux-kernel, axboe, dm-devel, kzak; +Cc: vgoyal, psusi, psusi, maxim.patlasov

A simple user space utility to resize an existing partition. It reads the
start of the partition from sysfs.

This is a quick and dirty patch I used for my testing. I am sure there
are better and faster ways of getting the partition "start" from the device
and partition number.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 partx/Makefile.am  |    9 ++++-
 partx/partx.h      |   19 ++++++++++
 partx/resizepart.8 |   38 +++++++++++++++++++++
 partx/resizepart.c |   95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 159 insertions(+), 2 deletions(-)
 create mode 100644 partx/resizepart.8
 create mode 100644 partx/resizepart.c

diff --git a/partx/Makefile.am b/partx/Makefile.am
index 080bc47..72b605c 100644
--- a/partx/Makefile.am
+++ b/partx/Makefile.am
@@ -1,7 +1,12 @@
 include $(top_srcdir)/config/include-Makefile.am
 
-usrsbin_exec_PROGRAMS = addpart delpart
-dist_man_MANS = addpart.8 delpart.8
+usrsbin_exec_PROGRAMS = addpart delpart resizepart
+dist_man_MANS = addpart.8 delpart.8 resizepart.8
+
+resizepart_SOURCES = resizepart.c \
+		$(top_srcdir)/lib/sysfs.c \
+		$(top_srcdir)/lib/canonicalize.c \
+		$(top_srcdir)/lib/at.c
 
 usrsbin_exec_PROGRAMS += partx
 partx_SOURCES = partx.c partx.h \
diff --git a/partx/partx.h b/partx/partx.h
index b40fa8f..c2b87f0 100644
--- a/partx/partx.h
+++ b/partx/partx.h
@@ -41,4 +41,23 @@ static inline int partx_add_partition(int fd, int partno,
 	return ioctl(fd, BLKPG, &a);
 }
 
+static inline int partx_resize_partition(int fd, int partno,
+			long long start, long long size)
+{
+	struct blkpg_ioctl_arg a;
+	struct blkpg_partition p;
+
+	p.pno = partno;
+	p.start = start << 9;
+	p.length = size << 9;
+	p.devname[0] = 0;
+	p.volname[0] = 0;
+	a.op = BLKPG_RESIZE_PARTITION;
+	a.flags = 0;
+	a.datalen = sizeof(p);
+	a.data = &p;
+
+	return ioctl(fd, BLKPG, &a);
+}
+
 #endif /*  UTIL_LINUX_PARTX_H */
diff --git a/partx/resizepart.8 b/partx/resizepart.8
new file mode 100644
index 0000000..0b47e81
--- /dev/null
+++ b/partx/resizepart.8
@@ -0,0 +1,38 @@
+.\" resizepart.8 --
+.\" Copyright 2012 Vivek Goyal <vgoyal@redhat.com>
+.\" Copyright 2012 Red Hat, Inc.
+.\" May be distributed under the GNU General Public License
+.TH RESIZEPART 8 "February 2012" "util-linux" "System Administration"
+.SH NAME
+resizepart \-
+simple wrapper around the "resize partition" ioctl
+.SH SYNOPSIS
+.B resizepart
+.I device partition length
+.SH DESCRIPTION
+.B resizepart
+is a program that informs the Linux kernel of the new partition size.
+
+This command doesn't manipulate partitions on the hard drive.
+
+.SH PARAMETERS
+.TP
+.I device
+Specify the disk device.
+.TP
+.I partition
+Specify the partition number.
+.TP
+.I length
+Specify the length of the partition (in 512-byte sectors).
+
+.SH SEE ALSO
+.BR addpart (8),
+.BR delpart (8),
+.BR fdisk (8),
+.BR parted (8),
+.BR partprobe (8),
+.BR partx (8)
+.SH AVAILABILITY
+The resizepart command is part of the util-linux package and is available from
+ftp://ftp.kernel.org/pub/linux/utils/util-linux/.
diff --git a/partx/resizepart.c b/partx/resizepart.c
new file mode 100644
index 0000000..64dcfdb
--- /dev/null
+++ b/partx/resizepart.c
@@ -0,0 +1,95 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <ctype.h>
+#include "canonicalize.h"
+#include "sysfs.h"
+#include "partx.h"
+
+char *
+get_devname_from_canonical_path(char *path)
+{
+	struct sysfs_cxt cxt;
+	dev_t devno;
+	char name[PATH_MAX];
+	char *devname;
+
+	devno = sysfs_devname_to_devno(path, NULL);
+	if (!devno) {
+		fprintf(stderr, "failed to read devno. \n");
+		exit(1);
+	}
+
+	sysfs_init(&cxt, devno, NULL);
+	devname = sysfs_get_devname(&cxt, name, sizeof(name));
+	return strdup(devname);
+}
+
+char *
+get_partname_from_devname(char *devname, int partno)
+{
+	char partname[PATH_MAX];
+
+	if (isdigit(devname[strlen(devname) - 1]))
+		snprintf(partname, PATH_MAX, "%sp%d", devname, partno);
+	else
+		snprintf(partname, PATH_MAX, "%s%d", devname, partno);
+
+	return strdup(partname);
+}
+
+
+int
+main(int argc, char **argv)
+{
+	int fd;
+	char *real_path, *devname, *partname, *pstart;
+	char part_sysfs_path[PATH_MAX], part_start[30];
+	FILE *fp;
+
+	if (argc != 4) {
+		fprintf(stderr,
+			"usage: %s diskdevice partitionnr length\n",
+			argv[0]);
+		exit(1);
+	}
+	if ((fd = open(argv[1], O_RDONLY)) < 0) {
+		perror(argv[1]);
+		exit(1);
+	}
+
+	real_path = canonicalize_path(argv[1]);
+
+	if (real_path == NULL) {
+		fprintf(stderr, "canonicalize_path(%s) failed. \n", argv[1]);
+		exit(1);
+	}
+
+	devname = get_devname_from_canonical_path(real_path);
+	partname = get_partname_from_devname(devname, atoi(argv[2]));
+
+	snprintf(part_sysfs_path, PATH_MAX, "/sys/block/%s/%s/start",
+			devname, partname);
+
+	fp = fopen(part_sysfs_path, "r");
+
+	if (!fp) {
+		perror(part_sysfs_path);
+		exit(1);
+	}
+
+	pstart = fgets(part_start, 30, fp);
+
+	if (!pstart) {
+		perror(part_sysfs_path);
+		exit(1);
+	}
+
+	if (partx_resize_partition(fd, atoi(argv[2]), atoll(pstart),
+				atoll(argv[3]))) {
+		perror("BLKPG");
+		exit(1);
+	}
+
+	return 0;
+}
-- 
1.7.6.4


* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
@ 2012-02-20 14:42   ` Vivek Goyal
  2012-02-20 15:17     ` Phillip Susi
  2012-02-20 15:28   ` Vivek Goyal
  2012-04-09 16:40   ` Maxim V. Patlasov
  2 siblings, 1 reply; 10+ messages in thread
From: Vivek Goyal @ 2012-02-20 14:42 UTC (permalink / raw)
  To: psusi, psusi; +Cc: maxim.patlasov, linux-kernel, axboe, dm-devel, kzak

On Tue, Feb 14, 2012 at 03:39:50PM -0500, Vivek Goyal wrote:
> Add a new operation code ( BLKPG_RESIZE_PARTITION ) to the
> BLKPG ioctl that allows altering the size of an existing
> partition, even if it is currently in use.

Hi Phillip,

Are you ok with the change? 

Thanks
Vivek


* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-20 14:42   ` Vivek Goyal
@ 2012-02-20 15:17     ` Phillip Susi
  0 siblings, 0 replies; 10+ messages in thread
From: Phillip Susi @ 2012-02-20 15:17 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: psusi, maxim.patlasov, linux-kernel, axboe, dm-devel, kzak

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2/20/2012 9:42 AM, Vivek Goyal wrote:
> On Tue, Feb 14, 2012 at 03:39:50PM -0500, Vivek Goyal wrote:
>> Add a new operation code ( BLKPG_RESIZE_PARTITION ) to the BLKPG
>> ioctl that allows altering the size of an existing partition,
>> even if it is currently in use.
> 
> Hi Phillip,
> 
> Are you ok with the change?

Yes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPQmQWAAoJEJrBOlT6nu75V8YH/0khEQy1sCiDmjqaVAqJGn3R
xjS7eCU6ndgGj3CHE+j4AJCTJ6cdrcaheEF/QWd3f2GxI0kQ6bBiWzxMeEXaSA57
VQOyKqoGalaWA78a3xFR1oax8ZwOAQi7LdyttdvzXUTKDXrO57cAAIHGQcLTPFiv
a926d27kMulNQzXvjhBj/h8LAOeUVFIEbrGk5QxOw28gdStEv/RtMKeSvuq3e4qu
/TxNT78K49HBaWuhTZJB4Mg7ttyBTQJrQr5c23oo9KLSUgd+3ZvaHF53vUsP7IMm
Kbor7u648P0Xo2gaWGXPF5z4hmyhO08/1SNFKAKw0CSycpOlwjsbel/Ys57gIAY=
=oTDA
-----END PGP SIGNATURE-----

* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
  2012-02-20 14:42   ` Vivek Goyal
@ 2012-02-20 15:28   ` Vivek Goyal
  2012-03-02 18:54     ` Vivek Goyal
  2012-04-09 16:40   ` Maxim V. Patlasov
  2 siblings, 1 reply; 10+ messages in thread
From: Vivek Goyal @ 2012-02-20 15:28 UTC (permalink / raw)
  To: axboe; +Cc: psusi, psusi, maxim.patlasov, dm-devel, kzak, linux-kernel

On Tue, Feb 14, 2012 at 03:39:50PM -0500, Vivek Goyal wrote:
> Add a new operation code ( BLKPG_RESIZE_PARTITION ) to the
> BLKPG ioctl that allows altering the size of an existing
> partition, even if it is currently in use.
> 
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>

Hi Jens,

Can you please consider this patch for inclusion? One of our customers
wants to be able to grow partitions without having to reboot
the system.

Thanks
Vivek

> ---
>  block/genhd.c             |   20 ++++++++++++----
>  block/ioctl.c             |   57 ++++++++++++++++++++++++++++++++++++++++++--
>  block/partition-generic.c |    4 ++-
>  include/linux/blkpg.h     |    1 +
>  include/linux/genhd.h     |   57 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 130 insertions(+), 9 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 23b4f70..935e09b 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -153,7 +153,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
>  		part = rcu_dereference(ptbl->part[piter->idx]);
>  		if (!part)
>  			continue;
> -		if (!part->nr_sects &&
> +		if (!part_nr_sects_read(part) &&
>  		    !(piter->flags & DISK_PITER_INCL_EMPTY) &&
>  		    !(piter->flags & DISK_PITER_INCL_EMPTY_PART0 &&
>  		      piter->idx == 0))
> @@ -190,7 +190,7 @@ EXPORT_SYMBOL_GPL(disk_part_iter_exit);
>  static inline int sector_in_part(struct hd_struct *part, sector_t sector)
>  {
>  	return part->start_sect <= sector &&
> -		sector < part->start_sect + part->nr_sects;
> +		sector < part->start_sect + part_nr_sects_read(part);
>  }
>  
>  /**
> @@ -765,8 +765,8 @@ void __init printk_all_partitions(void)
>  
>  			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
>  			       bdevt_str(part_devt(part), devt_buf),
> -			       (unsigned long long)part->nr_sects >> 1,
> -			       disk_name(disk, part->partno, name_buf), uuid);
> +			       (unsigned long long)part_nr_sects_read(part) >> 1
> +			       , disk_name(disk, part->partno, name_buf), uuid);
>  			if (is_part0) {
>  				if (disk->driverfs_dev != NULL &&
>  				    disk->driverfs_dev->driver != NULL)
> @@ -857,7 +857,7 @@ static int show_partition(struct seq_file *seqf, void *v)
>  	while ((part = disk_part_iter_next(&piter)))
>  		seq_printf(seqf, "%4d  %7d %10llu %s\n",
>  			   MAJOR(part_devt(part)), MINOR(part_devt(part)),
> -			   (unsigned long long)part->nr_sects >> 1,
> +			   (unsigned long long)part_nr_sects_read(part) >> 1,
>  			   disk_name(sgp, part->partno, buf));
>  	disk_part_iter_exit(&piter);
>  
> @@ -1263,6 +1263,16 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
>  		}
>  		disk->part_tbl->part[0] = &disk->part0;
>  
> +		/*
> +		 * set_capacity() and get_capacity() currently don't use
> +		 * seqcounter to read/update the part0->nr_sects. Still init
> +		 * the counter as we can read the sectors in IO submission
> +		 * path using sequence counters.
> +		 *
> +		 * TODO: Ideally set_capacity() and get_capacity() should be
> +		 * converted to make use of bd_mutex and sequence counters.
> +		 */
> +		seqcount_init(&disk->part0.nr_sects_seq);
>  		hd_ref_init(&disk->part0);
>  
>  		disk->minors = minors;
> diff --git a/block/ioctl.c b/block/ioctl.c
> index ba15b2d..ddbc649 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -13,7 +13,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
>  {
>  	struct block_device *bdevp;
>  	struct gendisk *disk;
> -	struct hd_struct *part;
> +	struct hd_struct *part, *lpart;
>  	struct blkpg_ioctl_arg a;
>  	struct blkpg_partition p;
>  	struct disk_part_iter piter;
> @@ -36,8 +36,8 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
>  		case BLKPG_ADD_PARTITION:
>  			start = p.start >> 9;
>  			length = p.length >> 9;
> -			/* check for fit in a hd_struct */ 
> -			if (sizeof(sector_t) == sizeof(long) && 
> +			/* check for fit in a hd_struct */
> +			if (sizeof(sector_t) == sizeof(long) &&
>  			    sizeof(long long) > sizeof(long)) {
>  				long pstart = start, plength = length;
>  				if (pstart != start || plength != length
> @@ -92,6 +92,57 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
>  			bdput(bdevp);
>  
>  			return 0;
> +		case BLKPG_RESIZE_PARTITION:
> +			start = p.start >> 9;
> +			/* new length of partition in bytes */
> +			length = p.length >> 9;
> +			/* check for fit in a hd_struct */
> +			if (sizeof(sector_t) == sizeof(long) &&
> +			    sizeof(long long) > sizeof(long)) {
> +				long pstart = start, plength = length;
> +				if (pstart != start || plength != length
> +				    || pstart < 0 || plength < 0)
> +					return -EINVAL;
> +			}
> +			part = disk_get_part(disk, partno);
> +			if (!part)
> +				return -ENXIO;
> +			bdevp = bdget(part_devt(part));
> +			if (!bdevp) {
> +				disk_put_part(part);
> +				return -ENOMEM;
> +			}
> +			mutex_lock(&bdevp->bd_mutex);
> +			mutex_lock_nested(&bdev->bd_mutex, 1);
> +			if (start != part->start_sect) {
> +				mutex_unlock(&bdevp->bd_mutex);
> +				mutex_unlock(&bdev->bd_mutex);
> +				disk_put_part(part);
> +				return -EINVAL;
> +			}
> +			/* overlap? */
> +			disk_part_iter_init(&piter, disk,
> +					    DISK_PITER_INCL_EMPTY);
> +			while ((lpart = disk_part_iter_next(&piter))) {
> +				if (lpart->partno != partno &&
> +				   !(start + length <= lpart->start_sect ||
> +				   start >= lpart->start_sect + lpart->nr_sects)
> +				   ) {
> +					disk_part_iter_exit(&piter);
> +					mutex_unlock(&bdevp->bd_mutex);
> +					mutex_unlock(&bdev->bd_mutex);
> +					disk_put_part(part);
> +					return -EBUSY;
> +				}
> +			}
> +			disk_part_iter_exit(&piter);
> +			part_nr_sects_write(part, (sector_t)length);
> +			i_size_write(bdevp->bd_inode, p.length);
> +			mutex_unlock(&bdevp->bd_mutex);
> +			mutex_unlock(&bdev->bd_mutex);
> +			bdput(bdevp);
> +			disk_put_part(part);
> +			return 0;
>  		default:
>  			return -EINVAL;
>  	}
> diff --git a/block/partition-generic.c b/block/partition-generic.c
> index d06ec1c..363a6f6 100644
> --- a/block/partition-generic.c
> +++ b/block/partition-generic.c
> @@ -84,7 +84,7 @@ ssize_t part_size_show(struct device *dev,
>  		       struct device_attribute *attr, char *buf)
>  {
>  	struct hd_struct *p = dev_to_part(dev);
> -	return sprintf(buf, "%llu\n",(unsigned long long)p->nr_sects);
> +	return sprintf(buf, "%llu\n",(unsigned long long)part_nr_sects_read(p));
>  }
>  
>  static ssize_t part_ro_show(struct device *dev,
> @@ -294,6 +294,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
>  		err = -ENOMEM;
>  		goto out_free;
>  	}
> +
> +	seqcount_init(&p->nr_sects_seq);
>  	pdev = part_to_dev(p);
>  
>  	p->start_sect = start;
> diff --git a/include/linux/blkpg.h b/include/linux/blkpg.h
> index faf8a45..a851944 100644
> --- a/include/linux/blkpg.h
> +++ b/include/linux/blkpg.h
> @@ -40,6 +40,7 @@ struct blkpg_ioctl_arg {
>  /* The subfunctions (for the op field) */
>  #define BLKPG_ADD_PARTITION	1
>  #define BLKPG_DEL_PARTITION	2
> +#define BLKPG_RESIZE_PARTITION	3
>  
>  /* Sizes of name fields. Unused at present. */
>  #define BLKPG_DEVNAMELTH	64
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index fe23ee7..0def3ef 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -98,7 +98,13 @@ struct partition_meta_info {
>  
>  struct hd_struct {
>  	sector_t start_sect;
> +	/*
> +	 * nr_sects is protected by sequence counter. One might extend a
> +	 * partition while IO is happening to it and update of nr_sects
> +	 * can be non-atomic on 32bit machines with 64bit sector_t.
> +	 */
>  	sector_t nr_sects;
> +	seqcount_t nr_sects_seq;
>  	sector_t alignment_offset;
>  	unsigned int discard_alignment;
>  	struct device __dev;
> @@ -653,6 +659,57 @@ static inline void hd_struct_put(struct hd_struct *part)
>  		__delete_partition(part);
>  }
>  
> +/*
> + * Any access of part->nr_sects which is not protected by partition
> + * bd_mutex or gendisk bdev bd_mutex, should be done using this
> + * accessor function.
> + *
> + * Code written along the lines of i_size_read() and i_size_write().
> + * CONFIG_PREEMPT case optimizes the case of UP kernel with preemption
> + * on.
> + */
> +static inline sector_t part_nr_sects_read(struct hd_struct *part)
> +{
> +#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
> +	sector_t nr_sects;
> +	unsigned seq;
> +	do {
> +		seq = read_seqcount_begin(&part->nr_sects_seq);
> +		nr_sects = part->nr_sects;
> +	} while (read_seqcount_retry(&part->nr_sects_seq, seq));
> +	return nr_sects;
> +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
> +	sector_t nr_sects;
> +
> +	preempt_disable();
> +	nr_sects = part->nr_sects;
> +	preempt_enable();
> +	return nr_sects;
> +#else
> +	return part->nr_sects;
> +#endif
> +}
> +
> +/*
> + * Should be called with mutex lock held (typically bd_mutex) of partition
> + * to provide mutual exclusion among writers otherwise seqcount might be
> + * left in wrong state leaving the readers spinning infinitely.
> + */
> +static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
> +{
> +#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
> +	write_seqcount_begin(&part->nr_sects_seq);
> +	part->nr_sects = size;
> +	write_seqcount_end(&part->nr_sects_seq);
> +#elif BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_PREEMPT)
> +	preempt_disable();
> +	part->nr_sects = size;
> +	preempt_enable();
> +#else
> +	part->nr_sects = size;
> +#endif
> +}
> +
>  #else /* CONFIG_BLOCK */
>  
>  static inline void printk_all_partitions(void) { }
> -- 
> 1.7.6.4
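
A side note on the accessors above, for readers who have not seen the
i_size_read()/i_size_write() pattern they are modelled on: the writer bumps
a counter before and after the update, and the reader retries whenever the
counter was odd or changed underneath it. Below is a rough userspace sketch
of that idea using C11 atomics. The names part_model, model_write() and
model_read() are hypothetical and do not exist in the kernel, and the real
seqcount_t relies on lighter barriers than the sequentially consistent
atomics assumed here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model of the nr_sects / nr_sects_seq pair. */
struct part_model {
	atomic_uint seq;		/* even: stable, odd: write in progress */
	_Atomic uint32_t lo, hi;	/* 64-bit value kept as two 32-bit halves */
};

/*
 * Writer side. Writers must be serialized by the caller (the patch relies
 * on bd_mutex for that); this model does not do it for you.
 */
static void model_write(struct part_model *p, uint64_t v)
{
	atomic_fetch_add(&p->seq, 1);		/* seq becomes odd */
	atomic_store(&p->lo, (uint32_t)v);
	atomic_store(&p->hi, (uint32_t)(v >> 32));
	atomic_fetch_add(&p->seq, 1);		/* seq becomes even again */
}

/* Reader side: retry until both halves were read with no writer active. */
static uint64_t model_read(struct part_model *p)
{
	unsigned int s1, s2;
	uint64_t v;

	do {
		s1 = atomic_load(&p->seq);
		v = atomic_load(&p->lo);
		v |= (uint64_t)atomic_load(&p->hi) << 32;
		s2 = atomic_load(&p->seq);
	} while ((s1 & 1) || s1 != s2);
	return v;
}
```

A reader that races with model_write() can never return a value stitched
together from an old half and a new half, which is the torn read that the
nr_sects comment worries about on 32-bit machines with a 64-bit sector_t.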

* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-20 15:28   ` Vivek Goyal
@ 2012-03-02 18:54     ` Vivek Goyal
  2012-07-07  1:51       ` Phillip Susi
  0 siblings, 1 reply; 10+ messages in thread
From: Vivek Goyal @ 2012-03-02 18:54 UTC (permalink / raw)
  To: axboe; +Cc: psusi, psusi, maxim.patlasov, dm-devel, kzak, linux-kernel

On Mon, Feb 20, 2012 at 10:28:49AM -0500, Vivek Goyal wrote:
> On Tue, Feb 14, 2012 at 03:39:50PM -0500, Vivek Goyal wrote:
> > Add a new operation code ( BLKPG_RESIZE_PARTITION ) to the
> > BLKPG ioctl that allows altering the size of an existing
> > partition, even if it is currently in use.
> > 
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> 
> Hi Jens,
> 
> Could you please consider this patch for inclusion? One of our customers
> wants to be able to grow partitions without having to reboot
> the system.

Hi Jens,

Do you have any concerns about this patch? If not, could you please consider
merging it?

Thanks
Vivek

> 
> > ---
> >  block/genhd.c             |   20 ++++++++++++----
> >  block/ioctl.c             |   57 ++++++++++++++++++++++++++++++++++++++++++--
> >  block/partition-generic.c |    4 ++-
> >  include/linux/blkpg.h     |    1 +
> >  include/linux/genhd.h     |   57 +++++++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 130 insertions(+), 9 deletions(-)
> > 
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 23b4f70..935e09b 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -153,7 +153,7 @@ struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter)
> >  		part = rcu_dereference(ptbl->part[piter->idx]);
> >  		if (!part)
> >  			continue;
> > -		if (!part->nr_sects &&
> > +		if (!part_nr_sects_read(part) &&
> >  		    !(piter->flags & DISK_PITER_INCL_EMPTY) &&
> >  		    !(piter->flags & DISK_PITER_INCL_EMPTY_PART0 &&
> >  		      piter->idx == 0))
> > @@ -190,7 +190,7 @@ EXPORT_SYMBOL_GPL(disk_part_iter_exit);
> >  static inline int sector_in_part(struct hd_struct *part, sector_t sector)
> >  {
> >  	return part->start_sect <= sector &&
> > -		sector < part->start_sect + part->nr_sects;
> > +		sector < part->start_sect + part_nr_sects_read(part);
> >  }
> >  
> >  /**
> > @@ -765,8 +765,8 @@ void __init printk_all_partitions(void)
> >  
> >  			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
> >  			       bdevt_str(part_devt(part), devt_buf),
> > -			       (unsigned long long)part->nr_sects >> 1,
> > -			       disk_name(disk, part->partno, name_buf), uuid);
> > +			       (unsigned long long)part_nr_sects_read(part) >> 1
> > +			       , disk_name(disk, part->partno, name_buf), uuid);
> >  			if (is_part0) {
> >  				if (disk->driverfs_dev != NULL &&
> >  				    disk->driverfs_dev->driver != NULL)
> > @@ -857,7 +857,7 @@ static int show_partition(struct seq_file *seqf, void *v)
> >  	while ((part = disk_part_iter_next(&piter)))
> >  		seq_printf(seqf, "%4d  %7d %10llu %s\n",
> >  			   MAJOR(part_devt(part)), MINOR(part_devt(part)),
> > -			   (unsigned long long)part->nr_sects >> 1,
> > +			   (unsigned long long)part_nr_sects_read(part) >> 1,
> >  			   disk_name(sgp, part->partno, buf));
> >  	disk_part_iter_exit(&piter);
> >  
> > @@ -1263,6 +1263,16 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
> >  		}
> >  		disk->part_tbl->part[0] = &disk->part0;
> >  
> > +		/*
> > +		 * set_capacity() and get_capacity() currently don't use
> > +		 * seqcounter to read/update the part0->nr_sects. Still init
> > +		 * the counter as we can read the sectors in IO submission
> > +		 * path using sequence counters.
> > +		 *
> > +		 * TODO: Ideally set_capacity() and get_capacity() should be
> > +		 * converted to make use of bd_mutex and sequence counters.
> > +		 */
> > +		seqcount_init(&disk->part0.nr_sects_seq);
> >  		hd_ref_init(&disk->part0);
> >  
> >  		disk->minors = minors;
> > diff --git a/block/ioctl.c b/block/ioctl.c
> > index ba15b2d..ddbc649 100644
> > --- a/block/ioctl.c
> > +++ b/block/ioctl.c
> > @@ -13,7 +13,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
> >  {
> >  	struct block_device *bdevp;
> >  	struct gendisk *disk;
> > -	struct hd_struct *part;
> > +	struct hd_struct *part, *lpart;
> >  	struct blkpg_ioctl_arg a;
> >  	struct blkpg_partition p;
> >  	struct disk_part_iter piter;
> > @@ -36,8 +36,8 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
> >  		case BLKPG_ADD_PARTITION:
> >  			start = p.start >> 9;
> >  			length = p.length >> 9;
> > -			/* check for fit in a hd_struct */ 
> > -			if (sizeof(sector_t) == sizeof(long) && 
> > +			/* check for fit in a hd_struct */
> > +			if (sizeof(sector_t) == sizeof(long) &&
> >  			    sizeof(long long) > sizeof(long)) {
> >  				long pstart = start, plength = length;
> >  				if (pstart != start || plength != length
> > @@ -92,6 +92,57 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
> >  			bdput(bdevp);
> >  
> >  			return 0;
> > +		case BLKPG_RESIZE_PARTITION:
> > +			start = p.start >> 9;
> > +			/* new length of partition in bytes */
> > +			length = p.length >> 9;
> > +			/* check for fit in a hd_struct */
> > +			if (sizeof(sector_t) == sizeof(long) &&
> > +			    sizeof(long long) > sizeof(long)) {
> > +				long pstart = start, plength = length;
> > +				if (pstart != start || plength != length
> > +				    || pstart < 0 || plength < 0)
> > +					return -EINVAL;
> > +			}
> > +			part = disk_get_part(disk, partno);
> > +			if (!part)
> > +				return -ENXIO;
> > +			bdevp = bdget(part_devt(part));
> > +			if (!bdevp) {
> > +				disk_put_part(part);
> > +				return -ENOMEM;
> > +			}
> > +			mutex_lock(&bdevp->bd_mutex);
> > +			mutex_lock_nested(&bdev->bd_mutex, 1);
> > +			if (start != part->start_sect) {
> > +				mutex_unlock(&bdevp->bd_mutex);
> > +				mutex_unlock(&bdev->bd_mutex);
> > +				disk_put_part(part);
> > +				return -EINVAL;
> > +			}
> > +			/* overlap? */
> > +			disk_part_iter_init(&piter, disk,
> > +					    DISK_PITER_INCL_EMPTY);
> > +			while ((lpart = disk_part_iter_next(&piter))) {
> > +				if (lpart->partno != partno &&
> > +				   !(start + length <= lpart->start_sect ||
> > +				   start >= lpart->start_sect + lpart->nr_sects)
> > +				   ) {
> > +					disk_part_iter_exit(&piter);
> > +					mutex_unlock(&bdevp->bd_mutex);
> > +					mutex_unlock(&bdev->bd_mutex);
> > +					disk_put_part(part);
> > +					return -EBUSY;
> > +				}
> > +			}
> > +			disk_part_iter_exit(&piter);
> > +			part_nr_sects_write(part, (sector_t)length);
> > +			i_size_write(bdevp->bd_inode, p.length);
> > +			mutex_unlock(&bdevp->bd_mutex);
> > +			mutex_unlock(&bdev->bd_mutex);
> > +			bdput(bdevp);
> > +			disk_put_part(part);
> > +			return 0;
> >  		default:
> >  			return -EINVAL;
> >  	}
> > diff --git a/block/partition-generic.c b/block/partition-generic.c
> > index d06ec1c..363a6f6 100644
> > --- a/block/partition-generic.c
> > +++ b/block/partition-generic.c
> > @@ -84,7 +84,7 @@ ssize_t part_size_show(struct device *dev,
> >  		       struct device_attribute *attr, char *buf)
> >  {
> >  	struct hd_struct *p = dev_to_part(dev);
> > -	return sprintf(buf, "%llu\n",(unsigned long long)p->nr_sects);
> > +	return sprintf(buf, "%llu\n",(unsigned long long)part_nr_sects_read(p));
> >  }
> >  
> >  static ssize_t part_ro_show(struct device *dev,
> > @@ -294,6 +294,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
> >  		err = -ENOMEM;
> >  		goto out_free;
> >  	}
> > +
> > +	seqcount_init(&p->nr_sects_seq);
> >  	pdev = part_to_dev(p);
> >  
> >  	p->start_sect = start;
> > diff --git a/include/linux/blkpg.h b/include/linux/blkpg.h
> > index faf8a45..a851944 100644
> > --- a/include/linux/blkpg.h
> > +++ b/include/linux/blkpg.h
> > @@ -40,6 +40,7 @@ struct blkpg_ioctl_arg {
> >  /* The subfunctions (for the op field) */
> >  #define BLKPG_ADD_PARTITION	1
> >  #define BLKPG_DEL_PARTITION	2
> > +#define BLKPG_RESIZE_PARTITION	3
> >  
> >  /* Sizes of name fields. Unused at present. */
> >  #define BLKPG_DEVNAMELTH	64
> > diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> > index fe23ee7..0def3ef 100644
> > --- a/include/linux/genhd.h
> > +++ b/include/linux/genhd.h
> > @@ -98,7 +98,13 @@ struct partition_meta_info {
> >  
> >  struct hd_struct {
> >  	sector_t start_sect;
> > +	/*
> > +	 * nr_sects is protected by sequence counter. One might extend a
> > +	 * partition while IO is happening to it and update of nr_sects
> > +	 * can be non-atomic on 32bit machines with 64bit sector_t.
> > +	 */
> >  	sector_t nr_sects;
> > +	seqcount_t nr_sects_seq;
> >  	sector_t alignment_offset;
> >  	unsigned int discard_alignment;
> >  	struct device __dev;
> > @@ -653,6 +659,57 @@ static inline void hd_struct_put(struct hd_struct *part)
> >  		__delete_partition(part);
> >  }
> >  
> > +/*
> > + * Any access of part->nr_sects which is not protected by partition
> > + * bd_mutex or gendisk bdev bd_mutex, should be done using this
> > + * accessor function.
> > + *
> > + * Code written along the lines of i_size_read() and i_size_write().
> > + * CONFIG_PREEMPT case optimizes the case of UP kernel with preemption
> > + * on.
> > + */
> > +static inline sector_t part_nr_sects_read(struct hd_struct *part)
> > +{
> > +#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
> > +	sector_t nr_sects;
> > +	unsigned seq;
> > +	do {
> > +		seq = read_seqcount_begin(&part->nr_sects_seq);
> > +		nr_sects = part->nr_sects;
> > +	} while (read_seqcount_retry(&part->nr_sects_seq, seq));
> > +	return nr_sects;
> > +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
> > +	sector_t nr_sects;
> > +
> > +	preempt_disable();
> > +	nr_sects = part->nr_sects;
> > +	preempt_enable();
> > +	return nr_sects;
> > +#else
> > +	return part->nr_sects;
> > +#endif
> > +}
> > +
> > +/*
> > + * Should be called with mutex lock held (typically bd_mutex) of partition
> > + * to provide mutual exclusion among writers otherwise seqcount might be
> > + * left in wrong state leaving the readers spinning infinitely.
> > + */
> > +static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
> > +{
> > +#if BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_SMP)
> > +	write_seqcount_begin(&part->nr_sects_seq);
> > +	part->nr_sects = size;
> > +	write_seqcount_end(&part->nr_sects_seq);
> > +#elif BITS_PER_LONG==32 && defined(CONFIG_LBDAF) && defined(CONFIG_PREEMPT)
> > +	preempt_disable();
> > +	part->nr_sects = size;
> > +	preempt_enable();
> > +#else
> > +	part->nr_sects = size;
> > +#endif
> > +}
> > +
> >  #else /* CONFIG_BLOCK */
> >  
> >  static inline void printk_all_partitions(void) { }
> > -- 
> > 1.7.6.4

* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
  2012-02-20 14:42   ` Vivek Goyal
  2012-02-20 15:28   ` Vivek Goyal
@ 2012-04-09 16:40   ` Maxim V. Patlasov
  2 siblings, 0 replies; 10+ messages in thread
From: Maxim V. Patlasov @ 2012-04-09 16:40 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-kernel, axboe, dm-devel, kzak, psusi, psusi, maxim.patlasov

Hi Vivek,

Please see the inline comments below.

On 02/15/2012 12:39 AM, Vivek Goyal wrote:
> ...
> @@ -765,8 +765,8 @@ void __init printk_all_partitions(void)
>
>   			printk("%s%s %10llu %s %s", is_part0 ? "" : "  ",
>   			       bdevt_str(part_devt(part), devt_buf),
> -			       (unsigned long long)part->nr_sects >> 1,
> -			       disk_name(disk, part->partno, name_buf), uuid);
> +			       (unsigned long long)part_nr_sects_read(part) >> 1
> +			       , disk_name(disk, part->partno, name_buf), uuid);

A line starting with a comma looks unusual. Is that what you intended?

> diff --git a/block/ioctl.c b/block/ioctl.c
> index ba15b2d..ddbc649 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> ...
> @@ -92,6 +92,57 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
>   			bdput(bdevp);
>
>   			return 0;
> +		case BLKPG_RESIZE_PARTITION:
> +			start = p.start >> 9;
> +			/* new length of partition in bytes */
> +			length = p.length >> 9;
> +			/* check for fit in a hd_struct */
> +			if (sizeof(sector_t) == sizeof(long) &&
> +			    sizeof(long long) > sizeof(long)) {
> +				long pstart = start, plength = length;
> +				if (pstart != start || plength != length
> +				    || pstart < 0 || plength < 0)
> +					return -EINVAL;
> +			}
> +			part = disk_get_part(disk, partno);
> +			if (!part)
> +				return -ENXIO;
> +			bdevp = bdget(part_devt(part));
> +			if (!bdevp) {
> +				disk_put_part(part);
> +				return -ENOMEM;
> +			}
> +			mutex_lock(&bdevp->bd_mutex);
> +			mutex_lock_nested(&bdev->bd_mutex, 1);
> +			if (start != part->start_sect) {
> +				mutex_unlock(&bdevp->bd_mutex);
> +				mutex_unlock(&bdev->bd_mutex);
> +				disk_put_part(part);
> +				return -EINVAL;

Is bdput(bdevp) missing on this error path?


> +			}
> +			/* overlap? */
> +			disk_part_iter_init(&piter, disk,
> +					    DISK_PITER_INCL_EMPTY);
> +			while ((lpart = disk_part_iter_next(&piter))) {
> +				if (lpart->partno != partno &&
> +				   !(start + length <= lpart->start_sect ||
> +				   start >= lpart->start_sect + lpart->nr_sects)
> +				   ) {
> +					disk_part_iter_exit(&piter);
> +					mutex_unlock(&bdevp->bd_mutex);
> +					mutex_unlock(&bdev->bd_mutex);
> +					disk_put_part(part);
> +					return -EBUSY;

Is bdput(bdevp) missing here as well?

> +				}
> +			}
> +			disk_part_iter_exit(&piter);
> +			part_nr_sects_write(part, (sector_t)length);
> +			i_size_write(bdevp->bd_inode, p.length);
> +			mutex_unlock(&bdevp->bd_mutex);
> +			mutex_unlock(&bdev->bd_mutex);
> +			bdput(bdevp);
> +			disk_put_part(part);
> +			return 0;
>   		default:
>   			return -EINVAL;
>   	}

Thanks,
Maxim

* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-03-02 18:54     ` Vivek Goyal
@ 2012-07-07  1:51       ` Phillip Susi
  2012-07-09 15:13         ` Vivek Goyal
  0 siblings, 1 reply; 10+ messages in thread
From: Phillip Susi @ 2012-07-07  1:51 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: axboe, maxim.patlasov, dm-devel, kzak, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

What's the status of this patch?  Forgotten, or are there still any outstanding concerns?

On 03/02/2012 01:54 PM, Vivek Goyal wrote:
> Hi Jens,
> 
> Do you have any concerns about this patch? If not, could you please consider
> merging it?
> 
> Thanks
> Vivek
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJP95YUAAoJEJrBOlT6nu75A/kIAIEWs+MlA8Me05jjBGpSFQsn
VigiYTF4UdWjA3bG0CNB41eqpzOKVl/B4vTBAy1YezuUXMamBRp1OD6hatEL/blO
ps/M2S2NNPgFOzDmZBgfWIib6tnbCJvTowLdt4n4NnP0DoQRn+5bXopL/jcm4lwU
XWheiqFFX1xSB5YgP+GMl4zVWZhyrHYcynqK/25EimbEXtjgTyR3Cy4wMfGgMdnI
HkY7D0Kn420n+X6uRLXZW8hV3apATZCz3PGsxg7FI83gFi7Tc9rneOhwgRkAXHxq
FcJ2NABK83dACAYOU0fhVTmxoumxuHNCmp7iRGiavnbNCBJWxLV2x1WhceX23lc=
=1FUQ
-----END PGP SIGNATURE-----

* Re: [PATCH 1/2] block: add partition resize function to blkpg ioctl
  2012-07-07  1:51       ` Phillip Susi
@ 2012-07-09 15:13         ` Vivek Goyal
  0 siblings, 0 replies; 10+ messages in thread
From: Vivek Goyal @ 2012-07-09 15:13 UTC (permalink / raw)
  To: Phillip Susi; +Cc: axboe, maxim.patlasov, dm-devel, kzak, linux-kernel

On Fri, Jul 06, 2012 at 09:51:16PM -0400, Phillip Susi wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> What's the status of this patch?  Forgotten, or are there still any outstanding concerns?

There was one outstanding concern from Maxim about the missing bdput(bdevp)
calls. I will see if I can find some time to brush up the patches and test
them again. If somebody can beat me to it, that would be great.
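
For whoever picks this up: one way to close both leaks would be to route
every exit taken after bdget() through a single unwind label, so that the
put calls cannot be skipped. The following is only a minimal userspace
sketch of that shape, not the kernel code. struct ref, get_ref(), put_ref()
and resize_model() are made-up stand-ins for the partition and block device
references, and the two flags stand in for the start check and the overlap
scan.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a refcounted kernel object. */
struct ref { int count; };

static void get_ref(struct ref *r) { r->count++; }

static void put_ref(struct ref *r)
{
	assert(r->count > 0);	/* catch an unbalanced put */
	r->count--;
}

/*
 * Model of the BLKPG_RESIZE_PARTITION flow: take both references, run the
 * two checks, and release everything through one exit path.
 */
static int resize_model(struct ref *part, struct ref *bdevp,
			int start_ok, int overlaps)
{
	int ret = 0;

	get_ref(part);		/* models disk_get_part() */
	get_ref(bdevp);		/* models bdget() */

	if (!start_ok) {	/* models start != part->start_sect */
		ret = -EINVAL;
		goto out;
	}
	if (overlaps) {		/* models a hit in the overlap scan */
		ret = -EBUSY;
		goto out;
	}
	/* the size update would happen here */
out:
	put_ref(bdevp);		/* models bdput(), reached on every path */
	put_ref(part);		/* models disk_put_part() */
	return ret;
}
```

With this shape, calling resize_model() with a bad start or an overlapping
range still leaves both counts where they started. In the real ioctl the
two mutex_unlock() calls would sit under the same label.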

Thanks
Vivek

> 
> On 03/02/2012 01:54 PM, Vivek Goyal wrote:
> > Hi Jens,
> > 
> > Do you have any concerns about this patch? If not, could you please consider
> > merging it?
> > 
> > Thanks
> > Vivek
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iQEcBAEBAgAGBQJP95YUAAoJEJrBOlT6nu75A/kIAIEWs+MlA8Me05jjBGpSFQsn
> VigiYTF4UdWjA3bG0CNB41eqpzOKVl/B4vTBAy1YezuUXMamBRp1OD6hatEL/blO
> ps/M2S2NNPgFOzDmZBgfWIib6tnbCJvTowLdt4n4NnP0DoQRn+5bXopL/jcm4lwU
> XWheiqFFX1xSB5YgP+GMl4zVWZhyrHYcynqK/25EimbEXtjgTyR3Cy4wMfGgMdnI
> HkY7D0Kn420n+X6uRLXZW8hV3apATZCz3PGsxg7FI83gFi7Tc9rneOhwgRkAXHxq
> FcJ2NABK83dACAYOU0fhVTmxoumxuHNCmp7iRGiavnbNCBJWxLV2x1WhceX23lc=
> =1FUQ
> -----END PGP SIGNATURE-----

end of thread, other threads:[~2012-07-09 15:13 UTC | newest]

Thread overview: 10+ messages
2012-02-14 20:39 [PATCH 0/2][V3] block: Support online resize of disk partitions Vivek Goyal
2012-02-14 20:39 ` [PATCH 1/2] block: add partition resize function to blkpg ioctl Vivek Goyal
2012-02-20 14:42   ` Vivek Goyal
2012-02-20 15:17     ` Phillip Susi
2012-02-20 15:28   ` Vivek Goyal
2012-03-02 18:54     ` Vivek Goyal
2012-07-07  1:51       ` Phillip Susi
2012-07-09 15:13         ` Vivek Goyal
2012-04-09 16:40   ` Maxim V. Patlasov
2012-02-14 20:39 ` [PATCH 2/2] util-linux: resizepart: Utility to resize a partition Vivek Goyal
