dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] misc patches
@ 2020-07-03  0:38 Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 1/4] libmultipath: fix sysfs dev_loss_tmo parsing Benjamin Marzinski
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Benjamin Marzinski @ 2020-07-03  0:38 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This is a small collection of individual bug fix patches that apply on
top of my previous patch set.

Changes from v1:

0001:	Going with Martin's strtoul() method instead

0002:	Now getpagesize() is only called once, and size_p is only set if
	posix_memalign is successful, as suggested by Martin.
	

Benjamin Marzinski (4):
  libmultipath: fix sysfs dev_loss_tmo parsing
  kpartx: read devices with direct IO
  kpartx: handle alternate bsd disklabel location
  libmultipath: fix checker detection for nvme devices

 kpartx/bsd.c             | 16 ++++++++++--
 kpartx/dasd.c            |  7 ++---
 kpartx/gpt.c             | 22 +++++++---------
 kpartx/kpartx.c          | 56 ++++++++++++++++++++++++++++++++--------
 kpartx/kpartx.h          |  2 ++
 libmultipath/discovery.c | 12 ++++++---
 libmultipath/propsel.c   |  4 ++-
 7 files changed, 87 insertions(+), 32 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/4] libmultipath: fix sysfs dev_loss_tmo parsing
  2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
@ 2020-07-03  0:38 ` Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 2/4] kpartx: read devices with direct IO Benjamin Marzinski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Benjamin Marzinski @ 2020-07-03  0:38 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck, Martin Wilck

dev_loss_tmo is a u32 value. However the kernel sysfs code prints it as
a signed integer. This means that if dev_loss_tmo is above INT_MAX, the
sysfs value will be a negative number. Parsing this was causing
sysfs_set_rport_tmo() to fail.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/discovery.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index ffec5162..83a41a4a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -583,7 +583,7 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
 	struct udev_device *rport_dev = NULL;
 	char value[16], *eptr;
 	char rport_id[32];
-	unsigned long long tmo = 0;
+	unsigned int tmo;
 	int ret;
 
 	sprintf(rport_id, "rport-%d:%d-%d",
@@ -607,8 +607,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
 			"error %d", rport_id, -ret);
 		goto out;
 	}
-	tmo = strtoull(value, &eptr, 0);
-	if (value == eptr || tmo == ULLONG_MAX) {
+	tmo = strtoul(value, &eptr, 0);
+	if (value == eptr) {
 		condlog(0, "%s: Cannot parse dev_loss_tmo "
 			"attribute '%s'", rport_id, value);
 		goto out;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/4] kpartx: read devices with direct IO
  2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 1/4] libmultipath: fix sysfs dev_loss_tmo parsing Benjamin Marzinski
@ 2020-07-03  0:38 ` Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 3/4] kpartx: handle alternate bsd disklabel location Benjamin Marzinski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Benjamin Marzinski @ 2020-07-03  0:38 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If kpartx is used on top of shared storage, and a device has its
partition table changed on one machine, and then kpartx is run on
another, it may not see the new data, because the cache still contains
the old data, and there is nothing to tell the machine running kpartx to
invalidate it. To solve this, kpartx should read the devices using
direct io.

One issue with how this code has been updated is that the original code
for getblock() always read 1024 bytes. The new code reads a logical
sector size chunk of the device, and returns a pointer to the 512 byte
sector that the caller asked for, within that (possibly larger) chunk.
This means that if the logical sector size is 512, then the code is now
only reading 512 bytes.  Looking through the code for the various
partition types, I can't see a case where more than 512 bytes is needed
and getblock() is used.  If anyone has a reason why this code should be
reading 1024 bytes at minmum, I can certainly change this.  But when I
looked, I couldn't find a case where reading 512 bytes would cause a
problem.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 kpartx/dasd.c   |  7 ++++---
 kpartx/gpt.c    | 22 +++++++++----------
 kpartx/kpartx.c | 56 +++++++++++++++++++++++++++++++++++++++----------
 kpartx/kpartx.h |  2 ++
 4 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/kpartx/dasd.c b/kpartx/dasd.c
index 14b9d3aa..f0398645 100644
--- a/kpartx/dasd.c
+++ b/kpartx/dasd.c
@@ -22,6 +22,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#define _GNU_SOURCE
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
@@ -117,13 +118,13 @@ read_dasd_pt(int fd, __attribute__((unused)) struct slice all,
 
 		sprintf(pathname, "/dev/.kpartx-node-%u-%u",
 			(unsigned int)major(dev), (unsigned int)minor(dev));
-		if ((fd_dasd = open(pathname, O_RDONLY)) == -1) {
+		if ((fd_dasd = open(pathname, O_RDONLY | O_DIRECT)) == -1) {
 			/* Devicenode does not exist. Try to create one */
 			if (mknod(pathname, 0600 | S_IFBLK, dev) == -1) {
 				/* Couldn't create a device node */
 				return -1;
 			}
-			fd_dasd = open(pathname, O_RDONLY);
+			fd_dasd = open(pathname, O_RDONLY | O_DIRECT);
 			/*
 			 * The file will vanish when the last process (we)
 			 * has ceased to access it.
@@ -175,7 +176,7 @@ read_dasd_pt(int fd, __attribute__((unused)) struct slice all,
 	 * Get volume label, extract name and type.
 	 */
 
-	if (!(data = (unsigned char *)malloc(blocksize)))
+	if (aligned_malloc((void **)&data, blocksize, NULL))
 		goto out;
 
 
diff --git a/kpartx/gpt.c b/kpartx/gpt.c
index 785b34ea..f7fefb70 100644
--- a/kpartx/gpt.c
+++ b/kpartx/gpt.c
@@ -243,8 +243,7 @@ alloc_read_gpt_entries(int fd, gpt_header * gpt)
 
 	if (!count) return NULL;
 
-	pte = (gpt_entry *)malloc(count);
-	if (!pte)
+	if (aligned_malloc((void **)&pte, get_sector_size(fd), &count))
 		return NULL;
 	memset(pte, 0, count);
 
@@ -269,12 +268,11 @@ static gpt_header *
 alloc_read_gpt_header(int fd, uint64_t lba)
 {
 	gpt_header *gpt;
-	gpt = (gpt_header *)
-	    malloc(sizeof (gpt_header));
-	if (!gpt)
+	size_t size = sizeof (gpt_header);
+	if (aligned_malloc((void **)&gpt, get_sector_size(fd), &size))
 		return NULL;
-	memset(gpt, 0, sizeof (*gpt));
-	if (!read_lba(fd, lba, gpt, sizeof (gpt_header))) {
+	memset(gpt, 0, size);
+	if (!read_lba(fd, lba, gpt, size)) {
 		free(gpt);
 		return NULL;
 	}
@@ -498,6 +496,7 @@ find_valid_gpt(int fd, gpt_header ** gpt, gpt_entry ** ptes)
 	gpt_header *pgpt = NULL, *agpt = NULL;
 	gpt_entry *pptes = NULL, *aptes = NULL;
 	legacy_mbr *legacymbr = NULL;
+	size_t size = sizeof(legacy_mbr);
 	uint64_t lastlba;
 	if (!gpt || !ptes)
 		return 0;
@@ -526,11 +525,10 @@ find_valid_gpt(int fd, gpt_header ** gpt, gpt_entry ** ptes)
 	}
 
 	/* This will be added to the EFI Spec. per Intel after v1.02. */
-	legacymbr = malloc(sizeof (*legacymbr));
-	if (legacymbr) {
-		memset(legacymbr, 0, sizeof (*legacymbr));
-		read_lba(fd, 0, (uint8_t *) legacymbr,
-			 sizeof (*legacymbr));
+	if (aligned_malloc((void **)&legacymbr, get_sector_size(fd),
+			   &size) == 0) {
+		memset(legacymbr, 0, size);
+		read_lba(fd, 0, (uint8_t *) legacymbr, size);
 		good_pmbr = is_pmbr_valid(legacymbr);
 		free(legacymbr);
 		legacymbr=NULL;
diff --git a/kpartx/kpartx.c b/kpartx/kpartx.c
index d3620c5c..c24ad6d9 100644
--- a/kpartx/kpartx.c
+++ b/kpartx/kpartx.c
@@ -19,6 +19,7 @@
  * cva, 2002-10-26
  */
 
+#define _GNU_SOURCE
 #include <stdio.h>
 #include <fcntl.h>
 #include <errno.h>
@@ -41,7 +42,6 @@
 
 #define SIZE(a) (sizeof(a)/sizeof((a)[0]))
 
-#define READ_SIZE	1024
 #define MAXTYPES	64
 #define MAXSLICES	256
 #define DM_TARGET	"linear"
@@ -388,7 +388,7 @@ main(int argc, char **argv){
 		set_delimiter(mapname, delim);
 	}
 
-	fd = open(device, O_RDONLY);
+	fd = open(device, O_RDONLY | O_DIRECT);
 
 	if (fd == -1) {
 		perror(device);
@@ -690,9 +690,9 @@ xmalloc (size_t size) {
  */
 
 static int
-sseek(int fd, unsigned int secnr) {
+sseek(int fd, unsigned int secnr, int secsz) {
 	off64_t in, out;
-	in = ((off64_t) secnr << 9);
+	in = ((off64_t) secnr * secsz);
 	out = 1;
 
 	if ((out = lseek64(fd, in, SEEK_SET)) != in)
@@ -703,6 +703,31 @@ sseek(int fd, unsigned int secnr) {
 	return 0;
 }
 
+int
+aligned_malloc(void **mem_p, size_t align, size_t *size_p)
+{
+	static size_t pgsize = 0;
+	size_t size;
+	int err;
+
+	if (!mem_p || !align || (size_p && !*size_p))
+		return EINVAL;
+
+	if (!pgsize)
+		pgsize = getpagesize();
+
+	if (size_p)
+		size = ((*size_p + align - 1) / align) * align;
+	else
+		size = pgsize;
+
+	err = posix_memalign(mem_p, pgsize, size);
+	if (!err && size_p)
+		*size_p = size;
+	return err;
+}
+
+/* always in sector size blocks */
 static
 struct block {
 	unsigned int secnr;
@@ -710,30 +735,39 @@ struct block {
 	struct block *next;
 } *blockhead;
 
+/* blknr is always in 512 byte blocks */
 char *
-getblock (int fd, unsigned int secnr) {
+getblock (int fd, unsigned int blknr) {
+	unsigned int secsz = get_sector_size(fd);
+	unsigned int blks_per_sec = secsz / 512;
+	unsigned int secnr = blknr / blks_per_sec;
+	unsigned int blk_off = (blknr % blks_per_sec) * 512;
 	struct block *bp;
 
 	for (bp = blockhead; bp; bp = bp->next)
 
 		if (bp->secnr == secnr)
-			return bp->block;
+			return bp->block + blk_off;
 
-	if (sseek(fd, secnr))
+	if (sseek(fd, secnr, secsz))
 		return NULL;
 
 	bp = xmalloc(sizeof(struct block));
 	bp->secnr = secnr;
 	bp->next = blockhead;
 	blockhead = bp;
-	bp->block = (char *) xmalloc(READ_SIZE);
+	if (aligned_malloc((void **)&bp->block, secsz, NULL)) {
+		fprintf(stderr, "aligned_malloc failed\n");
+		exit(1);
+	}
 
-	if (read(fd, bp->block, READ_SIZE) != READ_SIZE) {
+	if (read(fd, bp->block, secsz) != secsz) {
 		fprintf(stderr, "read error, sector %d\n", secnr);
-		bp->block = NULL;
+		blockhead = bp->next;
+		return NULL;
 	}
 
-	return bp->block;
+	return bp->block + blk_off;
 }
 
 int
diff --git a/kpartx/kpartx.h b/kpartx/kpartx.h
index 67edeb82..727632c1 100644
--- a/kpartx/kpartx.h
+++ b/kpartx/kpartx.h
@@ -1,6 +1,7 @@
 #ifndef _KPARTX_H
 #define _KPARTX_H
 
+#include <stddef.h>
 #include <stdint.h>
 #include <sys/ioctl.h>
 
@@ -61,6 +62,7 @@ extern ptreader read_mac_pt;
 extern ptreader read_sun_pt;
 extern ptreader read_ps3_pt;
 
+int aligned_malloc(void **mem_p, size_t align, size_t *size_p);
 char *getblock(int fd, unsigned int secnr);
 
 static inline unsigned int
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/4] kpartx: handle alternate bsd disklabel location
  2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 1/4] libmultipath: fix sysfs dev_loss_tmo parsing Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 2/4] kpartx: read devices with direct IO Benjamin Marzinski
@ 2020-07-03  0:38 ` Benjamin Marzinski
  2020-07-03  0:38 ` [PATCH v2 4/4] libmultipath: fix checker detection for nvme devices Benjamin Marzinski
  2020-07-03  9:48 ` [PATCH v2 0/4] misc patches Martin Wilck
  4 siblings, 0 replies; 6+ messages in thread
From: Benjamin Marzinski @ 2020-07-03  0:38 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

bsd disk labels can either be at the start of the second sector, or 64
bytes into the first sector, but kpartx only handled the first case.
However the second case is what parted creates, and what the linux
kernel partition code expects.  kpartx should handle both cases.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 kpartx/bsd.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kpartx/bsd.c b/kpartx/bsd.c
index 0e661fbc..950b0f92 100644
--- a/kpartx/bsd.c
+++ b/kpartx/bsd.c
@@ -1,6 +1,7 @@
 #include "kpartx.h"
 #include <stdio.h>
 
+#define BSD_LABEL_OFFSET	64
 #define BSD_DISKMAGIC	(0x82564557UL)	/* The disk magic number */
 #define XBSD_MAXPARTITIONS	16
 #define BSD_FS_UNUSED		0
@@ -60,8 +61,19 @@ read_bsd_pt(int fd, struct slice all, struct slice *sp, unsigned int ns) {
 		return -1;
 
 	l = (struct bsd_disklabel *) bp;
-	if (l->d_magic != BSD_DISKMAGIC)
-		return -1;
+	if (l->d_magic != BSD_DISKMAGIC) {
+		/*
+		 * BSD disklabels can also start 64 bytes offset from the
+		 * start of the first sector
+		 */
+		bp = getblock(fd, offset);
+		if (bp == NULL)
+			return -1;
+
+		l = (struct bsd_disklabel *)(bp + 64);
+		if (l->d_magic != BSD_DISKMAGIC)
+			return -1;
+	}
 
 	max_partitions = 16;
 	if (l->d_npartitions < max_partitions)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 4/4] libmultipath: fix checker detection for nvme devices
  2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2020-07-03  0:38 ` [PATCH v2 3/4] kpartx: handle alternate bsd disklabel location Benjamin Marzinski
@ 2020-07-03  0:38 ` Benjamin Marzinski
  2020-07-03  9:48 ` [PATCH v2 0/4] misc patches Martin Wilck
  4 siblings, 0 replies; 6+ messages in thread
From: Benjamin Marzinski @ 2020-07-03  0:38 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

In order to fix hwhandler autodetection, commit 8794a776 made
detect_alua() differentiate between failures to detect whether alua was
supported, and successfully detecting that it was not supported.
However, this causes nvme devices to get the TUR checker assigned to
them. This is because there is nothing in detect_alua() to make it only
work on scsi devices, and select_checker wasn't updated to handle
detect_alua() failing without setting pp->tpgs to TPGS_NONE.

detect_alua() should automatically set pp->tpgs to TPGS_NONE and exit on
non-scsi devices. Also, select_checker() should not assume that a
devices is ALUA, simply because if failed to detect if alua was
supported.

Fixes: 8794a776 "libmultipath: fix ALUA autodetection when paths are
                 down"
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 6 ++++++
 libmultipath/propsel.c   | 4 +++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 83a41a4a..aa5942c3 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -887,6 +887,12 @@ detect_alua(struct path * pp)
 	int tpgs;
 	unsigned int timeout;
 
+
+	if (pp->bus != SYSFS_BUS_SCSI) {
+		pp->tpgs = TPGS_NONE;
+		return;
+	}
+
 	if (sysfs_get_timeout(pp, &timeout) <= 0)
 		timeout = DEF_TIMEOUT;
 
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 897e48ca..d362beb4 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -521,7 +521,9 @@ int select_checker(struct config *conf, struct path *pp)
 		if (check_rdac(pp)) {
 			ckr_name = RDAC;
 			goto out;
-		} else if (path_get_tpgs(pp) != TPGS_NONE) {
+		}
+		path_get_tpgs(pp);
+		if (pp->tpgs != TPGS_NONE && pp->tpgs != TPGS_UNDEF) {
 			ckr_name = TUR;
 			goto out;
 		}
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/4] misc patches
  2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2020-07-03  0:38 ` [PATCH v2 4/4] libmultipath: fix checker detection for nvme devices Benjamin Marzinski
@ 2020-07-03  9:48 ` Martin Wilck
  4 siblings, 0 replies; 6+ messages in thread
From: Martin Wilck @ 2020-07-03  9:48 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-07-02 at 19:38 -0500, Benjamin Marzinski wrote:
> This is a small collection of individual bug fix patches that apply
> on
> top of my previous patch set.
> 
> Changes from v1:
> 
> 0001:	Going with Martin's strtoul() method instead
> 
> 0002:	Now getpagesize() is only called once, and size_p is only set
> if
> 	posix_memalign is successful, as suggested by Martin.
> 	
> 
> Benjamin Marzinski (4):
>   libmultipath: fix sysfs dev_loss_tmo parsing
>   kpartx: read devices with direct IO
>   kpartx: handle alternate bsd disklabel location
>   libmultipath: fix checker detection for nvme devices
> 
>  kpartx/bsd.c             | 16 ++++++++++--
>  kpartx/dasd.c            |  7 ++---
>  kpartx/gpt.c             | 22 +++++++---------
>  kpartx/kpartx.c          | 56 ++++++++++++++++++++++++++++++++----
> ----
>  kpartx/kpartx.h          |  2 ++
>  libmultipath/discovery.c | 12 ++++++---
>  libmultipath/propsel.c   |  4 ++-
>  7 files changed, 87 insertions(+), 32 deletions(-)
> 

For the series:

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-07-03  9:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-03  0:38 [PATCH v2 0/4] misc patches Benjamin Marzinski
2020-07-03  0:38 ` [PATCH v2 1/4] libmultipath: fix sysfs dev_loss_tmo parsing Benjamin Marzinski
2020-07-03  0:38 ` [PATCH v2 2/4] kpartx: read devices with direct IO Benjamin Marzinski
2020-07-03  0:38 ` [PATCH v2 3/4] kpartx: handle alternate bsd disklabel location Benjamin Marzinski
2020-07-03  0:38 ` [PATCH v2 4/4] libmultipath: fix checker detection for nvme devices Benjamin Marzinski
2020-07-03  9:48 ` [PATCH v2 0/4] misc patches Martin Wilck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).