* [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme
@ 2021-04-25  2:22 Leo Yan
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

This patch set enables the In-Drive-Mutex (IDM) locking scheme.

The In-Drive-Mutex (IDM) offers an alternative to the sanlock and DLM
locking schemes for lvmlockd-controlled metadata updates.  This
mechanism works differently from the two existing methods in that it
does not use a dedicated logical volume (LV) for lock leasing;
instead it relies on drives that implement new SCSI commands, so that
every host can acquire and release locks directly from the drive
firmware.  Managing the synchronization of metadata updates from
multiple servers inside the drive creates a single source of truth,
which removes potential state-synchronization errors caused by drive
hotswap and other issues.  The reduction in complexity simplifies
lvmlockd and increases the performance of metadata updates for shared
volume groups.

To make IDM locking easier to use, an IDM lock manager and an IDM
library are provided.  The IDM lock manager runs as a daemon and acts
as a bridge between lvmlockd and the drive firmware; the IDM library
provides the APIs that lvmlockd invokes.  The lock manager and the
library are therefore a good place to gain a deeper understanding of
the locking scheme and its algorithms; their code can be found on
GitHub [1].
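
To give a feel for the wrapper APIs used across this series, below is
a minimal, illustrative client flow against the IDM lock manager.  It
only uses calls that appear in this patch set (ilm_connect, ilm_lock,
ilm_unlock, ilm_disconnect); the drive path and field values are
made-up placeholders, not a tested configuration:

	#include <string.h>
	#include "ilm.h"	/* from the propeller repository [1] */

	static int idm_demo(void)
	{
		struct idm_lock_id id;
		struct idm_lock_op op;
		int sock, rv;

		rv = ilm_connect(&sock);	/* one connection per lockspace */
		if (rv < 0)
			return rv;

		memset(&id, 0x0, sizeof(id));	/* all-zero resource ID: the global lock */
		memset(&op, 0x0, sizeof(op));
		op.mode = IDM_MODE_EXCLUSIVE;	/* or IDM_MODE_SHAREABLE */
		op.timeout = 60000;		/* unit: millisecond */
		op.drive_num = 1;
		op.drives[0] = (char *)"/dev/sda";	/* placeholder drive path */

		rv = ilm_lock(sock, &id, &op);	/* firmware grants or rejects the mutex */
		if (!rv)
			rv = ilm_unlock(sock, &id);

		ilm_disconnect(sock);
		return rv;
	}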

The patches in this set are arranged bottom-to-top.

Patch 01 introduces the IDM wrapper layer for lvmlockd and shows the
interface with the IDM lock manager daemon.  Reading this patch,
especially the lock/unlock/convert APIs, should give a basic idea of
how IDM is used.  Patch 02 hooks IDM into the lvmlockd core layer.

Patches 03 and 04 enable Seagate IDM in the locking library, which is
invoked by the LVM commands.  After the IDM locking scheme is enabled,
the locking library must generate the PV list for a VG/LV and pass
that list from the LVM command to lvmlockd.  These PVs are the targets
to which the IDM SCSI commands are sent, ultimately by the IDM lock
manager.

The last patch is a minor change enabling the IDM locking scheme in
the tools.

This patch set has been tested with Seagate drives flashed with
firmware that supports IDM.  The basic operations (VG/LV creation and
removal, activation and deactivation, and thin pools) pass testing
with IDM.

Enabling IDM testing in the LVM testing framework is planned; that
work will be sent out later in a separate patch series.

[1] https://github.com/Seagate/propeller


Leo Yan (5):
  lvmlockd: idm: Introduce new locking scheme
  lvmlockd: idm: Hook Seagate IDM wrapper APIs
  lib: locking: Add new type "idm"
  lib: locking: Parse PV list for IDM locking
  tools: Add support for "idm" lock type

 configure                            | 173 ++++++
 configure.ac                         |  20 +
 daemons/lvmlockd/Makefile.in         |   5 +
 daemons/lvmlockd/lvmlockd-core.c     | 279 ++++++++-
 daemons/lvmlockd/lvmlockd-idm.c      | 837 +++++++++++++++++++++++++++
 daemons/lvmlockd/lvmlockd-internal.h | 110 ++++
 lib/display/display.c                |   4 +
 lib/locking/lvmlockd.c               | 356 +++++++++++-
 lib/metadata/metadata-exported.h     |   1 +
 lib/metadata/metadata.c              |  12 +-
 tools/lvconvert.c                    |   2 +
 tools/toollib.c                      |  11 +-
 12 files changed, 1756 insertions(+), 54 deletions(-)
 create mode 100644 daemons/lvmlockd/lvmlockd-idm.c

-- 
2.25.1




* [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

Alongside the existing DLM and sanlock locking schemes, this patch
introduces a new locking scheme: In-Drive-Mutex (IDM).

With IDM support in the drive, the locks reside in the drive itself,
so the locking lease is maintained in a central place: the drive
firmware.  This can be seen as a typical client-server model: every
host (or node) in the cluster sends a request to the drive firmware to
lease a mutex, and the firmware acts as an arbitrator, granting the
mutex to one requester and rejecting other applicants while the mutex
is held.  To satisfy LVM activation in its different modes, IDM
supports two locking modes: exclusive and shareable.

Every IDM is identified by two IDs: a host ID and a resource ID.  The
resource ID is a unique identifier for the resource being protected;
in the integration with lvmlockd, the resource ID combines the VG's
UUID and the LV's UUID.  For the global lock, the resource ID bytes
are all zeros, and for a VG lock the LV UUID part is set to zero.
Every host generates a random UUID and uses it as the host ID in the
SCSI commands; this ID establishes the ownership of a mutex.
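
A minimal sketch of that ID layout, using the idm_lock_id structure
and the uuid_read_format() helper from this patch (the UUID strings
below are made-up placeholders; uuid_read_format() strips the dashes
and expects exactly 32 remaining characters):

	struct idm_lock_id id;

	/* Global lock: the resource ID stays all zeros */
	memset(&id, 0x0, sizeof(id));

	/* VG lock: fill in the VG UUID, leave the LV UUID as zeros */
	uuid_read_format(id.vg_uuid, "9OZ3qL-1Fd2-82xc-AJ3c-8Z0k-Hx5m-WqP2aK");

	/* LV lock: fill in both the VG and LV UUIDs */
	uuid_read_format(id.lv_uuid, "xFgE71-uJb2-49cd-QR5s-7T1k-Pm3n-ZsV4bL");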

To make it easy to issue the IDM commands to the drive, as with other
locking schemes (e.g. sanlock), a daemon program named the IDM lock
manager is used; the detailed IDM SCSI commands are encapsulated in
the daemon, and lvmlockd uses wrapper APIs to communicate with the
daemon program.

This patch introduces the IDM locking wrapper layer; it forwards
locking requests from lvmlockd to the IDM lock manager and returns the
results from the drives' responses.

One thing worth mentioning is IDM's LVB.  IDM supports an LVB of at
most 7 bytes when stored in the drive; the most significant byte of
the 8 bytes is reserved for control bits.  For this reason, the patch
maps a timestamp, in microseconds, onto the cached LVB: essentially,
if the timestamp has been updated by another node, the local LVB is
invalid and the metadata should be invalidated.  When the timestamp is
stored into the drive's LVB, a time-going-backwards issue is possible,
caused by limited clock precision or missing time synchronization
across the nodes.  So the IDM wrapper fixes up the timestamp by
incrementing the latest value by 1 and writing that back to the drive.
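
As a concrete illustration of the fix-up (it mirrors the
lm_idm_update_vb_timestamp() helper in this patch; the numbers are
made up):

	uint64_t cached = 1619317200000008ULL;	/* last value written to the drive */
	uint64_t now    = 1619317200000003ULL;	/* local clock (us) appears behind */

	if (cached >= now)
		cached++;	/* keep it strictly increasing: ...009 */
	else
		cached = now;	/* normal case: take the current time */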

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 configure                            | 173 ++++++
 configure.ac                         |  20 +
 daemons/lvmlockd/Makefile.in         |   5 +
 daemons/lvmlockd/lvmlockd-idm.c      | 837 +++++++++++++++++++++++++++
 daemons/lvmlockd/lvmlockd-internal.h | 108 ++++
 5 files changed, 1143 insertions(+)
 create mode 100644 daemons/lvmlockd/lvmlockd-idm.c

diff --git a/configure b/configure
index 4c38cbebb..df9aa2af6 100755
--- a/configure
+++ b/configure
@@ -745,6 +745,7 @@ BUILD_DMFILEMAPD
 BUILD_LOCKDDLM_CONTROL
 BUILD_LOCKDDLM
 BUILD_LOCKDSANLOCK
+BUILD_LOCKDIDM
 BUILD_LVMLOCKD
 BUILD_LVMPOLLD
 BUILD_LVMDBUSD
@@ -780,6 +781,8 @@ LOCKD_DLM_LIBS
 LOCKD_DLM_CFLAGS
 LOCKD_SANLOCK_LIBS
 LOCKD_SANLOCK_CFLAGS
+LOCKD_IDM_LIBS
+LOCKD_IDM_CFLAGS
 VALGRIND_LIBS
 VALGRIND_CFLAGS
 GENPNG
@@ -940,6 +943,7 @@ enable_lvmpolld
 enable_lvmlockd_sanlock
 enable_lvmlockd_dlm
 enable_lvmlockd_dlmcontrol
+enable_lvmlockd_idm
 enable_use_lvmlockd
 with_lvmlockd_pidfile
 enable_use_lvmpolld
@@ -1011,6 +1015,8 @@ LOCKD_DLM_CFLAGS
 LOCKD_DLM_LIBS
 LOCKD_DLM_CONTROL_CFLAGS
 LOCKD_DLM_CONTROL_LIBS
+LOCKD_IDM_CFLAGS
+LOCKD_IDM_LIBS
 NOTIFY_DBUS_CFLAGS
 NOTIFY_DBUS_LIBS
 BLKID_CFLAGS
@@ -1659,6 +1665,7 @@ Optional Features:
   --enable-lvmlockd-dlm   enable the LVM lock daemon using dlm
   --enable-lvmlockd-dlmcontrol
                           enable lvmlockd remote refresh using libdlmcontrol
+  --enable-lvmlockd-idm   enable the LVM lock daemon using idm
   --disable-use-lvmlockd  disable usage of LVM lock daemon
   --disable-use-lvmpolld  disable usage of LVM Poll Daemon
   --enable-dmfilemapd     enable the dmstats filemap daemon
@@ -1810,6 +1817,10 @@ Some influential environment variables:
               C compiler flags for LOCKD_DLM_CONTROL, overriding pkg-config
   LOCKD_DLM_CONTROL_LIBS
               linker flags for LOCKD_DLM_CONTROL, overriding pkg-config
+  LOCKD_IDM_CFLAGS
+              C compiler flags for LOCKD_IDM, overriding pkg-config
+  LOCKD_IDM_LIBS
+              linker flags for LOCKD_IDM, overriding pkg-config
   NOTIFY_DBUS_CFLAGS
               C compiler flags for NOTIFY_DBUS, overriding pkg-config
   NOTIFY_DBUS_LIBS
@@ -3104,6 +3115,7 @@ case "$host_os" in
 		LOCKDSANLOCK=no
 		LOCKDDLM=no
 		LOCKDDLM_CONTROL=no
+		LOCKDIDM=no
 		ODIRECT=yes
 		DM_IOCTLS=yes
 		SELINUX=yes
@@ -11121,6 +11133,167 @@ $as_echo "#define LOCKDDLM_CONTROL_SUPPORT 1" >>confdefs.h
 	BUILD_LVMLOCKD=yes
 fi
 
+################################################################################
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build lvmlockdidm" >&5
+$as_echo_n "checking whether to build lvmlockdidm... " >&6; }
+# Check whether --enable-lvmlockd-idm was given.
+if test "${enable_lvmlockd_idm+set}" = set; then :
+  enableval=$enable_lvmlockd_idm; LOCKDIDM=$enableval
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $LOCKDIDM" >&5
+$as_echo "$LOCKDIDM" >&6; }
+
+BUILD_LOCKDIDM=$LOCKDIDM
+
+if test "$BUILD_LOCKDIDM" = yes; then
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for LOCKD_IDM" >&5
+$as_echo_n "checking for LOCKD_IDM... " >&6; }
+
+if test -n "$LOCKD_IDM_CFLAGS"; then
+    pkg_cv_LOCKD_IDM_CFLAGS="$LOCKD_IDM_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libseagate_ilm >= 0.1.0\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "libseagate_ilm >= 0.1.0") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_LOCKD_IDM_CFLAGS=`$PKG_CONFIG --cflags "libseagate_ilm >= 0.1.0" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$LOCKD_IDM_LIBS"; then
+    pkg_cv_LOCKD_IDM_LIBS="$LOCKD_IDM_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libseagate_ilm >= 0.1.0\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "libseagate_ilm >= 0.1.0") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_LOCKD_IDM_LIBS=`$PKG_CONFIG --libs "libseagate_ilm >= 0.1.0" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        LOCKD_IDM_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libseagate_ilm >= 0.1.0" 2>&1`
+        else
+	        LOCKD_IDM_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libseagate_ilm >= 0.1.0" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$LOCKD_IDM_PKG_ERRORS" >&5
+
+	$bailout
+elif test $pkg_failed = untried; then
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	$bailout
+else
+	LOCKD_IDM_CFLAGS=$pkg_cv_LOCKD_IDM_CFLAGS
+	LOCKD_IDM_LIBS=$pkg_cv_LOCKD_IDM_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+fi
+
+pkg_failed=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for BLKID" >&5
+$as_echo_n "checking for BLKID... " >&6; }
+
+if test -n "$BLKID_CFLAGS"; then
+    pkg_cv_BLKID_CFLAGS="$BLKID_CFLAGS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"blkid >= 2.24\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "blkid >= 2.24") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_BLKID_CFLAGS=`$PKG_CONFIG --cflags "blkid >= 2.24" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+if test -n "$BLKID_LIBS"; then
+    pkg_cv_BLKID_LIBS="$BLKID_LIBS"
+ elif test -n "$PKG_CONFIG"; then
+    if test -n "$PKG_CONFIG" && \
+    { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"blkid >= 2.24\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "blkid >= 2.24") 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; then
+  pkg_cv_BLKID_LIBS=`$PKG_CONFIG --libs "blkid >= 2.24" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
+else
+  pkg_failed=yes
+fi
+ else
+    pkg_failed=untried
+fi
+
+
+
+if test $pkg_failed = yes; then
+   	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+
+if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
+        _pkg_short_errors_supported=yes
+else
+        _pkg_short_errors_supported=no
+fi
+        if test $_pkg_short_errors_supported = yes; then
+	        BLKID_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "blkid >= 2.24" 2>&1`
+        else
+	        BLKID_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "blkid >= 2.24" 2>&1`
+        fi
+	# Put the nasty error message in config.log where it belongs
+	echo "$BLKID_PKG_ERRORS" >&5
+
+	$bailout
+elif test $pkg_failed = untried; then
+     	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	$bailout
+else
+	BLKID_CFLAGS=$pkg_cv_BLKID_CFLAGS
+	BLKID_LIBS=$pkg_cv_BLKID_LIBS
+        { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	HAVE_LOCKD_IDM=yes
+fi
+
+$as_echo "#define LOCKDIDM_SUPPORT 1" >>confdefs.h
+
+	BUILD_LVMLOCKD=yes
+fi
+
 ################################################################################
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build lvmlockd" >&5
 $as_echo_n "checking whether to build lvmlockd... " >&6; }
diff --git a/configure.ac b/configure.ac
index ee21b879d..8a2ccf912 100644
--- a/configure.ac
+++ b/configure.ac
@@ -43,6 +43,7 @@ case "$host_os" in
 		LOCKDSANLOCK=no
 		LOCKDDLM=no
 		LOCKDDLM_CONTROL=no
+		LOCKDIDM=no
 		ODIRECT=yes
 		DM_IOCTLS=yes
 		SELINUX=yes
@@ -960,6 +961,25 @@ if test "$BUILD_LOCKDDLM_CONTROL" = yes; then
 	BUILD_LVMLOCKD=yes
 fi
 
+################################################################################
+dnl -- Build lvmlockdidm
+AC_MSG_CHECKING(whether to build lvmlockdidm)
+AC_ARG_ENABLE(lvmlockd-idm,
+	      AC_HELP_STRING([--enable-lvmlockd-idm],
+			     [enable the LVM lock daemon using idm]),
+	      LOCKDIDM=$enableval)
+AC_MSG_RESULT($LOCKDIDM)
+
+BUILD_LOCKDIDM=$LOCKDIDM
+
+dnl -- Look for Seagate IDM libraries
+if test "$BUILD_LOCKDIDM" = yes; then
+	PKG_CHECK_MODULES(LOCKD_IDM, libseagate_ilm >= 0.1.0, [HAVE_LOCKD_IDM=yes], $bailout)
+	PKG_CHECK_MODULES(BLKID, blkid >= 2.24, [HAVE_LOCKD_IDM=yes], $bailout)
+	AC_DEFINE([LOCKDIDM_SUPPORT], 1, [Define to 1 to include code that uses lvmlockd IDM option.])
+	BUILD_LVMLOCKD=yes
+fi
+
 ################################################################################
 dnl -- Build lvmlockd
 AC_MSG_CHECKING(whether to build lvmlockd)
diff --git a/daemons/lvmlockd/Makefile.in b/daemons/lvmlockd/Makefile.in
index bd577d1e6..578292166 100644
--- a/daemons/lvmlockd/Makefile.in
+++ b/daemons/lvmlockd/Makefile.in
@@ -30,6 +30,11 @@ ifeq ("@BUILD_LOCKDDLM@", "yes")
   LOCK_LIBS += -ldlmcontrol
 endif
 
+ifeq ("@BUILD_LOCKDIDM@", "yes")
+  SOURCES += lvmlockd-idm.c
+  LOCK_LIBS += -lseagate_ilm -lblkid
+endif
+
 SOURCES2 = lvmlockctl.c
 
 TARGETS = lvmlockd lvmlockctl
diff --git a/daemons/lvmlockd/lvmlockd-idm.c b/daemons/lvmlockd/lvmlockd-idm.c
new file mode 100644
index 000000000..e9f50535c
--- /dev/null
+++ b/daemons/lvmlockd/lvmlockd-idm.c
@@ -0,0 +1,837 @@
+/*
+ * Copyright (C) 2020-2021 Seagate Ltd.
+ *
+ * This file is part of LVM2.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU Lesser General Public License v.2.1.
+ */
+
+#define _XOPEN_SOURCE 500  /* pthread */
+#define _ISOC99_SOURCE
+
+#include "tools/tool.h"
+
+#include "daemon-server.h"
+#include "lib/mm/xlate.h"
+
+#include "lvmlockd-internal.h"
+#include "daemons/lvmlockd/lvmlockd-client.h"
+
+#include "ilm.h"
+
+#include <blkid/blkid.h>
+#include <ctype.h>
+#include <dirent.h>
+#include <errno.h>
+#include <poll.h>
+#include <regex.h>
+#include <stddef.h>
+#include <syslog.h>
+#include <sys/sysmacros.h>
+#include <time.h>
+
+#define IDM_TIMEOUT	60000	/* unit: millisecond, 60 seconds */
+
+/*
+ * Each lockspace thread has its own connection to the In-Drive Mutex
+ * (IDM) lock manager.  Once the socket connection is established, the
+ * lockspace has been created in the IDM lock manager; afterwards the
+ * socket fd is used to send requests for any lock-related operations.
+ */
+
+struct lm_idm {
+	int sock;	/* IDM lock manager connection */
+};
+
+struct rd_idm {
+	struct idm_lock_id id;
+	struct idm_lock_op op;
+	uint64_t vb_timestamp;
+	struct val_blk *vb;
+};
+
+int lm_data_size_idm(void)
+{
+	return sizeof(struct rd_idm);
+}
+
+static uint64_t read_utc_us(void)
+{
+	struct timespec cur_time;
+
+	clock_gettime(CLOCK_REALTIME, &cur_time);
+
+	/*
+	 * Convert to microseconds.  IDM reserves the MSB of the 8 bytes
+	 * and the low 56 bits are used for the timestamp; 56 bits of
+	 * microseconds can represent calendar years up to 2284, about 260
+	 * years of headroom, so overflow is not a practical concern.
+	 */
+	return cur_time.tv_sec * 1000000 + cur_time.tv_nsec / 1000;
+}
+
+static int uuid_read_format(char *uuid_str, const char *buffer)
+{
+	int out = 0;
+
+	/* just strip out any dashes */
+	while (*buffer) {
+
+		if (*buffer == '-') {
+			buffer++;
+			continue;
+		}
+
+		if (out >= 32) {
+			log_error("Too many characters to be uuid.");
+			return -1;
+		}
+
+		uuid_str[out++] = *buffer;
+		buffer++;
+	}
+
+	if (out != 32) {
+		log_error("Couldn't read uuid: incorrect number of "
+			  "characters.");
+		return -1;
+	}
+
+	return 0;
+}
+
+#define SYSFS_ROOT		"/sys"
+#define BUS_SCSI_DEVS		"/bus/scsi/devices"
+
+static struct idm_lock_op glb_lock_op;
+
+static void lm_idm_free_dir_list(struct dirent **dir_list, int dir_num)
+{
+	int i;
+
+	for (i = 0; i < dir_num; ++i)
+		free(dir_list[i]);
+	free(dir_list);
+}
+
+static int lm_idm_scsi_directory_select(const struct dirent *s)
+{
+	regex_t regex;
+	int ret;
+
+	/* Only select directory with the format x:x:x:x */
+	ret = regcomp(&regex, "^[0-9]+:[0-9]+:[0-9]+:[0-9]+$", REG_EXTENDED);
+	if (ret)
+		return 0;
+
+	ret = regexec(&regex, s->d_name, 0, NULL, 0);
+	if (!ret) {
+		regfree(&regex);
+		return 1;
+	}
+
+	regfree(&regex);
+	return 0;
+}
+
+static int lm_idm_scsi_find_block_directory(const char *block_path)
+{
+	struct stat stats;
+
+	if ((stat(block_path, &stats) >= 0) && S_ISDIR(stats.st_mode))
+		return 0;
+
+	return -1;
+}
+
+static int lm_idm_scsi_block_node_select(const struct dirent *s)
+{
+	if (DT_LNK != s->d_type && DT_DIR != s->d_type)
+		return 0;
+
+	if (DT_DIR == s->d_type) {
+		/* Skip this directory: '.' and parent: '..' */
+		if (!strcmp(s->d_name, ".") || !strcmp(s->d_name, ".."))
+			return 0;
+	}
+
+	return 1;
+}
+
+static int lm_idm_scsi_find_block_node(const char *blk_path, char **blk_dev)
+{
+	struct dirent **dir_list;
+	int dir_num;
+
+	dir_num = scandir(blk_path, &dir_list, lm_idm_scsi_block_node_select, NULL);
+	if (dir_num < 0) {
+		log_error("Cannot find valid directory entry in %s", blk_path);
+		return -1;
+	}
+
+	/*
+	 * There should be exactly one block device name under the path;
+	 * if dir_num is not 1 (i.e. 0 or greater than 1), something is
+	 * wrong and this should never happen.
+	 */
+	if (dir_num == 1)
+		*blk_dev = strdup(dir_list[0]->d_name);
+	else
+		*blk_dev = NULL;
+
+	lm_idm_free_dir_list(dir_list, dir_num);
+
+	if (!*blk_dev)
+		return -1;
+
+	return dir_num;
+}
+
+static int lm_idm_scsi_search_propeller_partition(char *dev)
+{
+	int i, nparts;
+	blkid_probe pr;
+	blkid_partlist ls;
+	int found = -1;
+
+	pr = blkid_new_probe_from_filename(dev);
+	if (!pr) {
+		log_error("%s: failed to create a new libblkid probe", dev);
+		return -1;
+	}
+
+	/* Binary interface */
+	ls = blkid_probe_get_partitions(pr);
+	if (!ls) {
+		log_error("%s: failed to read partitions", dev);
+		return -1;
+	}
+
+	/* List partitions */
+	nparts = blkid_partlist_numof_partitions(ls);
+	if (!nparts)
+		goto done;
+
+	for (i = 0; i < nparts; i++) {
+		const char *p;
+		blkid_partition par = blkid_partlist_get_partition(ls, i);
+
+		p = blkid_partition_get_name(par);
+		if (p) {
+			log_debug("partition name='%s'", p);
+
+			if (!strcmp(p, "propeller"))
+				found = blkid_partition_get_partno(par);
+		}
+
+		if (found >= 0)
+			break;
+	}
+
+done:
+	blkid_free_probe(pr);
+	return found;
+}
+
+static char *lm_idm_scsi_get_block_device_node(const char *scsi_path)
+{
+	char *blk_path = NULL;
+	char *blk_dev = NULL;
+	char *dev_node = NULL;
+	int ret;
+
+	/*
+	 * Locate the "block" directory, e.g.:
+	 * /sys/bus/scsi/devices/1:0:0:0/block
+	 */
+	ret = asprintf(&blk_path, "%s/%s", scsi_path, "block");
+	if (ret < 0) {
+		log_error("Fail to allocate block path for %s", scsi_path);
+		goto fail;
+	}
+
+	ret = lm_idm_scsi_find_block_directory(blk_path);
+	if (ret < 0) {
+		log_error("Failed to find block path %s", blk_path);
+		goto fail;
+	}
+
+	/*
+	 * Locate the block device name, e.g.:
+	 * /sys/bus/scsi/devices/1:0:0:0/block/sdb
+	 *
+	 * After this function returns successfully, the output
+	 * argument "blk_dev" points to the block device name; in
+	 * this example it points to the string "sdb".
+	 */
+	ret = lm_idm_scsi_find_block_node(blk_path, &blk_dev);
+	if (ret < 0) {
+		log_error("Fail to find block node");
+		goto fail;
+	}
+
+	ret = asprintf(&dev_node, "/dev/%s", blk_dev);
+	if (ret < 0) {
+		log_error("Fail to allocate memory for blk node path");
+		goto fail;
+	}
+
+	ret = lm_idm_scsi_search_propeller_partition(dev_node);
+	if (ret < 0)
+		goto fail;
+
+	free(blk_path);
+	free(blk_dev);
+	return dev_node;
+
+fail:
+	free(blk_path);
+	free(blk_dev);
+	free(dev_node);
+	return NULL;
+}
+
+static int lm_idm_get_gl_lock_pv_list(void)
+{
+	struct dirent **dir_list;
+	char scsi_bus_path[PATH_MAX];
+	char *drive_path;
+	int i, dir_num, ret;
+
+	if (glb_lock_op.drive_num)
+		return 0;
+
+	snprintf(scsi_bus_path, sizeof(scsi_bus_path), "%s%s",
+		 SYSFS_ROOT, BUS_SCSI_DEVS);
+
+	dir_num = scandir(scsi_bus_path, &dir_list,
+			  lm_idm_scsi_directory_select, NULL);
+	if (dir_num < 0) {  /* scsi mid level may not be loaded */
+		log_error("Attached devices: none");
+		return -1;
+	}
+
+	for (i = 0; i < dir_num; i++) {
+		char *scsi_path;
+
+		ret = asprintf(&scsi_path, "%s/%s", scsi_bus_path,
+			       dir_list[i]->d_name);
+		if (ret < 0) {
+			log_error("Fail to allocate memory for scsi directory");
+			goto failed;
+		}
+
+		if (glb_lock_op.drive_num >= ILM_DRIVE_MAX_NUM) {
+			log_error("Global lock: drive number %d exceeds limitation (%d) ?!",
+				  glb_lock_op.drive_num, ILM_DRIVE_MAX_NUM);
+			free(scsi_path);
+			goto failed;
+		}
+
+		drive_path = lm_idm_scsi_get_block_device_node(scsi_path);
+		if (!drive_path) {
+			free(scsi_path);
+			continue;
+		}
+
+		glb_lock_op.drives[glb_lock_op.drive_num] = drive_path;
+		glb_lock_op.drive_num++;
+
+		free(scsi_path);
+	}
+
+	lm_idm_free_dir_list(dir_list, dir_num);
+	return 0;
+
+failed:
+	lm_idm_free_dir_list(dir_list, dir_num);
+
+	for (i = 0; i < glb_lock_op.drive_num; i++) {
+		if (glb_lock_op.drives[i]) {
+			free(glb_lock_op.drives[i]);
+			glb_lock_op.drives[i] = NULL;
+		}
+	}
+
+	return -1;
+}
+
+static void lm_idm_update_vb_timestamp(uint64_t *vb_timestamp)
+{
+	uint64_t utc_us = read_utc_us();
+
+	/*
+	 * It's possible that multiple nodes have no clock
+	 * synchronization with microsecond precision and the time
+	 * appears to go backward.  In this case, simply increment the
+	 * existing timestamp and write it out to the drive.
+	 */
+	if (*vb_timestamp >= utc_us)
+		(*vb_timestamp)++;
+	else
+		*vb_timestamp = utc_us;
+}
+
+int lm_prepare_lockspace_idm(struct lockspace *ls)
+{
+	struct lm_idm *lm = NULL;
+
+	lm = malloc(sizeof(struct lm_idm));
+	if (!lm) {
+		log_error("S %s prepare_lockspace_idm fail to allocate lm_idm for %s",
+			  ls->name, ls->vg_name);
+		return -ENOMEM;
+	}
+	memset(lm, 0x0, sizeof(struct lm_idm));
+
+	ls->lm_data = lm;
+	log_debug("S %s prepare_lockspace_idm done", ls->name);
+	return 0;
+}
+
+int lm_add_lockspace_idm(struct lockspace *ls, int adopt)
+{
+	char killpath[IDM_FAILURE_PATH_LEN];
+	char killargs[IDM_FAILURE_ARGS_LEN];
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	int rv;
+
+	if (daemon_test)
+		return 0;
+
+	if (!strcmp(ls->name, S_NAME_GL_IDM)) {
+		/*
+		 * Prepare the PV list for the global lock: if a drive
+		 * contains a "propeller" partition, the drive is considered
+		 * a member of the PV list.
+		 */
+		rv = lm_idm_get_gl_lock_pv_list();
+		if (rv < 0) {
+			log_error("S %s add_lockspace_idm failed to get pv list for glb lock",
+				  ls->name);
+			return -EIO;
+		} else {
+			log_debug("S %s add_lockspace_idm got pv list for glb lock",
+				  ls->name);
+		}
+	}
+
+	/*
+	 * Construct the execution path for command "lvmlockctl" by using the
+	 * path to the lvm binary and appending "lockctl".
+	 */
+	memset(killpath, 0, sizeof(killpath));
+	snprintf(killpath, IDM_FAILURE_PATH_LEN, "%slockctl", LVM_PATH);
+
+	/* Pass the argument "--kill vg_name" for killpath */
+	memset(killargs, 0, sizeof(killargs));
+	snprintf(killargs, IDM_FAILURE_ARGS_LEN, "--kill %s", ls->vg_name);
+
+	/* Connect to the IDM lock manager; one connection per lockspace. */
+	rv = ilm_connect(&lmi->sock);
+	if (rv < 0) {
+		log_error("S %s add_lockspace_idm fail to connect the lock manager %d",
+			  ls->name, lmi->sock);
+		lmi->sock = 0;
+		rv = -EMANAGER;
+		goto fail;
+	}
+
+	rv = ilm_set_killpath(lmi->sock, killpath, killargs);
+	if (rv < 0) {
+		log_error("S %s add_lockspace_idm fail to set kill path %d",
+			  ls->name, rv);
+		rv = -EMANAGER;
+		goto fail;
+	}
+
+	log_debug("S %s add_lockspace_idm kill path is: \"%s %s\"",
+		  ls->name, killpath, killargs);
+
+	log_debug("S %s add_lockspace_idm done", ls->name);
+	return 0;
+
+fail:
+	if (lmi && lmi->sock)
+		close(lmi->sock);
+	if (lmi)
+		free(lmi);
+	return rv;
+}
+
+int lm_rem_lockspace_idm(struct lockspace *ls, int free_vg)
+{
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	int i, rv = 0;
+
+	if (daemon_test)
+		goto out;
+
+	rv = ilm_disconnect(lmi->sock);
+	if (rv < 0)
+		log_error("S %s rem_lockspace_idm error %d", ls->name, rv);
+
+	/* Release pv list for global lock */
+	if (!strcmp(ls->name, "lvm_global")) {
+		for (i = 0; i < glb_lock_op.drive_num; i++) {
+			if (glb_lock_op.drives[i]) {
+				free(glb_lock_op.drives[i]);
+				glb_lock_op.drives[i] = NULL;
+			}
+		}
+	}
+
+out:
+	free(lmi);
+	ls->lm_data = NULL;
+	return rv;
+}
+
+static int lm_add_resource_idm(struct lockspace *ls, struct resource *r)
+{
+	struct rd_idm *rdi = (struct rd_idm *)r->lm_data;
+
+	if (r->type == LD_RT_GL || r->type == LD_RT_VG) {
+		rdi->vb = zalloc(sizeof(struct val_blk));
+		if (!rdi->vb)
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+int lm_rem_resource_idm(struct lockspace *ls, struct resource *r)
+{
+	struct rd_idm *rdi = (struct rd_idm *)r->lm_data;
+
+	if (rdi->vb)
+		free(rdi->vb);
+
+	memset(rdi, 0, sizeof(struct rd_idm));
+	r->lm_init = 0;
+	return 0;
+}
+
+static int to_idm_mode(int ld_mode)
+{
+	switch (ld_mode) {
+	case LD_LK_EX:
+		return IDM_MODE_EXCLUSIVE;
+	case LD_LK_SH:
+		return IDM_MODE_SHAREABLE;
+	default:
+		break;
+	};
+
+	return -1;
+}
+
+int lm_lock_idm(struct lockspace *ls, struct resource *r, int ld_mode,
+		struct val_blk *vb_out, char *lv_uuid, struct pvs *pvs,
+		int adopt)
+{
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	struct rd_idm *rdi = (struct rd_idm *)r->lm_data;
+	char **drive_path = NULL;
+	uint64_t timestamp;
+	int reset_vb = 0;
+	int rv, i;
+
+	if (!r->lm_init) {
+		rv = lm_add_resource_idm(ls, r);
+		if (rv < 0)
+			return rv;
+		r->lm_init = 1;
+	}
+
+	rdi->op.mode = to_idm_mode(ld_mode);
+	if (rdi->op.mode < 0) {
+		log_error("lock_idm invalid mode %d", ld_mode);
+		return -EINVAL;
+	}
+
+	log_debug("S %s R %s lock_idm", ls->name, r->name);
+
+	if (daemon_test) {
+		if (rdi->vb) {
+			vb_out->version = le16_to_cpu(rdi->vb->version);
+			vb_out->flags = le16_to_cpu(rdi->vb->flags);
+			vb_out->r_version = le32_to_cpu(rdi->vb->r_version);
+		}
+		return 0;
+	}
+
+	rdi->op.timeout = IDM_TIMEOUT;
+
+	/*
+	 * Generate the UUID strings: for RT_VG only the VG-level UUID
+	 * string is needed, while for RT_LV UUID strings are needed for
+	 * both the VG and LV levels.  These IDs are ultimately used as
+	 * the identifier for the IDM in the drive firmware.
+	 */
+	if (r->type == LD_RT_VG || r->type == LD_RT_LV)
+		log_debug("S %s R %s VG uuid %s", ls->name, r->name, ls->vg_uuid);
+	if (r->type == LD_RT_LV)
+		log_debug("S %s R %s LV uuid %s", ls->name, r->name, lv_uuid);
+
+	memset(&rdi->id, 0x0, sizeof(struct idm_lock_id));
+	if (r->type == LD_RT_VG) {
+		uuid_read_format(rdi->id.vg_uuid, ls->vg_uuid);
+	} else if (r->type == LD_RT_LV) {
+		uuid_read_format(rdi->id.vg_uuid, ls->vg_uuid);
+		uuid_read_format(rdi->id.lv_uuid, lv_uuid);
+	}
+
+	/*
+	 * Establish the drive path list for the lock; each lock type has
+	 * its own drive list: the GL lock uses the global PV list, the VG
+	 * lock uses the PV list spanning the whole volume group, and the
+	 * LV lock uses the PV list for the logical volume.
+	 */
+	switch (r->type) {
+	case LD_RT_GL:
+		drive_path = glb_lock_op.drives;
+		rdi->op.drive_num = glb_lock_op.drive_num;
+		break;
+	case LD_RT_VG:
+		drive_path = (char **)ls->pvs.path;
+		rdi->op.drive_num = ls->pvs.num;
+		break;
+	case LD_RT_LV:
+		drive_path = (char **)pvs->path;
+		rdi->op.drive_num = pvs->num;
+		break;
+	default:
+		break;
+	}
+
+	if (!drive_path) {
+		log_error("S %s R %s cannot find the valid drive path array",
+			  ls->name, r->name);
+		return -EINVAL;
+	}
+
+	if (rdi->op.drive_num >= ILM_DRIVE_MAX_NUM) {
+		log_error("S %s R %s exceeds limitation for drive path array",
+			  ls->name, r->name);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < rdi->op.drive_num; i++)
+		rdi->op.drives[i] = drive_path[i];
+
+	log_debug("S %s R %s mode %d drive_num %d timeout %d",
+		  ls->name, r->name, rdi->op.mode,
+		  rdi->op.drive_num, rdi->op.timeout);
+
+	for (i = 0; i < rdi->op.drive_num; i++)
+		log_debug("S %s R %s drive path[%d] %s",
+			  ls->name, r->name, i, rdi->op.drives[i]);
+
+	rv = ilm_lock(lmi->sock, &rdi->id, &rdi->op);
+	if (rv < 0) {
+		log_debug("S %s R %s lock_idm acquire mode %d rv %d",
+			  ls->name, r->name, ld_mode, rv);
+		return -ELOCKIO;
+	}
+
+	if (rdi->vb) {
+		rv = ilm_read_lvb(lmi->sock, &rdi->id, (char *)&timestamp,
+				  sizeof(uint64_t));
+
+		/*
+		 * If reading the value block fails, possibly due to a drive
+		 * failure, notify the upper layer to invalidate the metadata.
+		 */
+		if (rv < 0) {
+			log_error("S %s R %s lock_idm get_lvb error %d",
+				  ls->name, r->name, rv);
+			reset_vb = 1;
+
+			/* Reset timestamp */
+			rdi->vb_timestamp = 0;
+
+		/*
+		 * If the cached timestamp does not match the value stored
+		 * in the IDM, another host has updated the timestamp for a
+		 * new VB.  Reset the VB and notify the upper layer to
+		 * invalidate the metadata.
+		 */
+		} else if (rdi->vb_timestamp != timestamp) {
+			log_debug("S %s R %s lock_idm get lvb timestamp %lu:%lu",
+				  ls->name, r->name, rdi->vb_timestamp,
+				  timestamp);
+
+			rdi->vb_timestamp = timestamp;
+			reset_vb = 1;
+		}
+
+		if (reset_vb == 1) {
+			memset(rdi->vb, 0, sizeof(struct val_blk));
+			memset(vb_out, 0, sizeof(struct val_blk));
+
+			/*
+			 * The lock is still acquired, but the VB values have
+			 * been invalidated.
+			 */
+			rv = 0;
+			goto out;
+		}
+
+		/* Otherwise, copy the cached VB to up layer */
+		memcpy(vb_out, rdi->vb, sizeof(struct val_blk));
+	}
+
+out:
+	return rv;
+}
+
+int lm_convert_idm(struct lockspace *ls, struct resource *r,
+		   int ld_mode, uint32_t r_version)
+{
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	struct rd_idm *rdi = (struct rd_idm *)r->lm_data;
+	int mode, rv;
+
+	if (rdi->vb && r_version && (r->mode == LD_LK_EX)) {
+		if (!rdi->vb->version) {
+			/* first time vb has been written */
+			rdi->vb->version = VAL_BLK_VERSION;
+		}
+		rdi->vb->r_version = r_version;
+
+		log_debug("S %s R %s convert_idm set r_version %u",
+			  ls->name, r->name, r_version);
+
+		lm_idm_update_vb_timestamp(&rdi->vb_timestamp);
+		log_debug("S %s R %s convert_idm vb %x %x %u timestamp %lu",
+			  ls->name, r->name, rdi->vb->version, rdi->vb->flags,
+			  rdi->vb->r_version, rdi->vb_timestamp);
+	}
+
+	mode = to_idm_mode(ld_mode);
+	if (mode < 0) {
+		log_error("S %s R %s convert_idm invalid mode %d",
+			  ls->name, r->name, ld_mode);
+		return -EINVAL;
+	}
+
+	log_debug("S %s R %s convert_idm", ls->name, r->name);
+
+	if (daemon_test)
+		return 0;
+
+	if (rdi->vb && r_version && (r->mode == LD_LK_EX)) {
+		rv = ilm_write_lvb(lmi->sock, &rdi->id,
+				   (char *)&rdi->vb_timestamp, sizeof(uint64_t));
+		if (rv < 0) {
+			log_error("S %s R %s convert_idm write lvb error %d",
+				  ls->name, r->name, rv);
+			return -ELMERR;
+		}
+	}
+
+	rv = ilm_convert(lmi->sock, &rdi->id, mode);
+	if (rv < 0)
+		log_error("S %s R %s convert_idm convert error %d",
+			  ls->name, r->name, rv);
+
+	return rv;
+}
+
+int lm_unlock_idm(struct lockspace *ls, struct resource *r,
+		  uint32_t r_version, uint32_t lmu_flags)
+{
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	struct rd_idm *rdi = (struct rd_idm *)r->lm_data;
+	int rv;
+
+	if (rdi->vb && r_version && (r->mode == LD_LK_EX)) {
+		if (!rdi->vb->version) {
+			/* first time vb has been written */
+			rdi->vb->version = VAL_BLK_VERSION;
+		}
+		if (r_version)
+			rdi->vb->r_version = r_version;
+
+		lm_idm_update_vb_timestamp(&rdi->vb_timestamp);
+		log_debug("S %s R %s unlock_idm vb %x %x %u timestamp %lu",
+			  ls->name, r->name, rdi->vb->version, rdi->vb->flags,
+			  rdi->vb->r_version, rdi->vb_timestamp);
+	}
+
+	log_debug("S %s R %s unlock_idm", ls->name, r->name);
+
+	if (daemon_test)
+		return 0;
+
+	if (rdi->vb && r_version && (r->mode == LD_LK_EX)) {
+		rv = ilm_write_lvb(lmi->sock, &rdi->id,
+				   (char *)&rdi->vb_timestamp, sizeof(uint64_t));
+		if (rv < 0) {
+			log_error("S %s R %s unlock_idm set_lvb error %d",
+				  ls->name, r->name, rv);
+			return -ELMERR;
+		}
+	}
+
+	rv = ilm_unlock(lmi->sock, &rdi->id);
+	if (rv < 0)
+		log_error("S %s R %s unlock_idm error %d", ls->name, r->name, rv);
+
+	return rv;
+}
+
+int lm_hosts_idm(struct lockspace *ls, int notify)
+{
+	struct resource *r;
+	struct lm_idm *lmi = (struct lm_idm *)ls->lm_data;
+	struct rd_idm *rdi;
+	int count, self, found_others = 0;
+	int rv;
+
+	list_for_each_entry(r, &ls->resources, list) {
+		if (!r->lm_init)
+			continue;
+
+		rdi = (struct rd_idm *)r->lm_data;
+
+		rv = ilm_get_host_count(lmi->sock, &rdi->id, &rdi->op,
+					&count, &self);
+		if (rv < 0) {
+			log_error("S %s lm_hosts_idm error %d", ls->name, rv);
+			return rv;
+		}
+
+		/* FIXME: the self count should be subtracted from "count" */
+		if (count > found_others)
+			found_others = count;
+	}
+
+	return found_others;
+}
+
+int lm_get_lockspaces_idm(struct list_head *ls_rejoin)
+{
+	/* TODO: Need to add support for adoption. */
+	return -1;
+}
+
+int lm_is_running_idm(void)
+{
+	int sock, rv;
+
+	if (daemon_test)
+		return gl_use_idm;
+
+	rv = ilm_connect(&sock);
+	if (rv < 0) {
+		log_error("Fail to connect seagate IDM lock manager %d", rv);
+		return 0;
+	}
+
+	ilm_disconnect(sock);
+	return 1;
+}
diff --git a/daemons/lvmlockd/lvmlockd-internal.h b/daemons/lvmlockd/lvmlockd-internal.h
index 14bdfeed0..983d66589 100644
--- a/daemons/lvmlockd/lvmlockd-internal.h
+++ b/daemons/lvmlockd/lvmlockd-internal.h
@@ -20,6 +20,7 @@
 #define R_NAME_GL          "GLLK"
 #define R_NAME_VG          "VGLK"
 #define S_NAME_GL_DLM      "lvm_global"
+#define S_NAME_GL_IDM      "lvm_global"
 #define LVM_LS_PREFIX      "lvm_"           /* ls name is prefix + vg_name */
 /* global lockspace name for sanlock is a vg name */
 
@@ -29,6 +30,7 @@ enum {
 	LD_LM_UNUSED = 1, /* place holder so values match lib/locking/lvmlockd.h */
 	LD_LM_DLM = 2,
 	LD_LM_SANLOCK = 3,
+	LD_LM_IDM = 4,
 };
 
 /* operation types */
@@ -118,6 +120,11 @@ struct client {
  */
 #define DEFAULT_MAX_RETRIES 4
 
+struct pvs {
+	const char **path;
+	int num;
+};
+
 struct action {
 	struct list_head list;
 	uint32_t client_id;
@@ -140,6 +147,7 @@ struct action {
 	char vg_args[MAX_ARGS+1];
 	char lv_args[MAX_ARGS+1];
 	char vg_sysid[MAX_NAME+1];
+	struct pvs pvs;			/* PV list for idm */
 };
 
 struct resource {
@@ -184,6 +192,7 @@ struct lockspace {
 	uint64_t free_lock_offset;	/* for sanlock, start search for free lock here */
 	int free_lock_sector_size;	/* for sanlock */
 	int free_lock_align_size;	/* for sanlock */
+	struct pvs pvs;			/* for idm: PV list */
 
 	uint32_t start_client_id;	/* client_id that started the lockspace */
 	pthread_t thread;		/* makes synchronous lock requests */
@@ -325,6 +334,7 @@ static inline int list_empty(const struct list_head *head)
 EXTERN int gl_type_static;
 EXTERN int gl_use_dlm;
 EXTERN int gl_use_sanlock;
+EXTERN int gl_use_idm;
 EXTERN int gl_vg_removed;
 EXTERN char gl_lsname_dlm[MAX_NAME+1];
 EXTERN char gl_lsname_sanlock[MAX_NAME+1];
@@ -619,4 +629,102 @@ static inline int lm_support_sanlock(void)
 
 #endif /* sanlock support */
 
+#ifdef LOCKDIDM_SUPPORT
+
+int lm_data_size_idm(void);
+int lm_init_vg_idm(char *ls_name, char *vg_name, uint32_t flags, char *vg_args);
+int lm_prepare_lockspace_idm(struct lockspace *ls);
+int lm_add_lockspace_idm(struct lockspace *ls, int adopt);
+int lm_rem_lockspace_idm(struct lockspace *ls, int free_vg);
+int lm_lock_idm(struct lockspace *ls, struct resource *r, int ld_mode,
+		struct val_blk *vb_out, char *lv_uuid, struct pvs *pvs,
+		int adopt);
+int lm_convert_idm(struct lockspace *ls, struct resource *r,
+		   int ld_mode, uint32_t r_version);
+int lm_unlock_idm(struct lockspace *ls, struct resource *r,
+		  uint32_t r_version, uint32_t lmu_flags);
+int lm_hosts_idm(struct lockspace *ls, int notify);
+int lm_get_lockspaces_idm(struct list_head *ls_rejoin);
+int lm_is_running_idm(void);
+int lm_rem_resource_idm(struct lockspace *ls, struct resource *r);
+
+static inline int lm_support_idm(void)
+{
+	return 1;
+}
+
+#else
+
+static inline int lm_data_size_idm(void)
+{
+	return -1;
+}
+
+static inline int lm_init_vg_idm(char *ls_name, char *vg_name, uint32_t flags,
+				 char *vg_args)
+{
+	return -1;
+}
+
+static inline int lm_prepare_lockspace_idm(struct lockspace *ls)
+{
+	return -1;
+}
+
+static inline int lm_add_lockspace_idm(struct lockspace *ls, int adopt)
+{
+	return -1;
+}
+
+static inline int lm_rem_lockspace_idm(struct lockspace *ls, int free_vg)
+{
+	return -1;
+}
+
+static inline int lm_lock_idm(struct lockspace *ls, struct resource *r, int ld_mode,
+			      struct val_blk *vb_out, char *lv_uuid, struct pvs *pvs,
+			      int adopt)
+{
+	return -1;
+}
+
+static inline int lm_convert_idm(struct lockspace *ls, struct resource *r,
+				 int ld_mode, uint32_t r_version)
+{
+	return -1;
+}
+
+static inline int lm_unlock_idm(struct lockspace *ls, struct resource *r,
+				uint32_t r_version, uint32_t lmu_flags)
+{
+	return -1;
+}
+
+static inline int lm_hosts_idm(struct lockspace *ls, int notify)
+{
+	return -1;
+}
+
+static inline int lm_get_lockspaces_idm(struct list_head *ls_rejoin)
+{
+	return -1;
+}
+
+static inline int lm_is_running_idm(void)
+{
+	return 0;
+}
+
+static inline int lm_rem_resource_idm(struct lockspace *ls, struct resource *r)
+{
+	return -1;
+}
+
+static inline int lm_support_idm(void)
+{
+	return 0;
+}
+
+#endif /* Seagate IDM support */
+
 #endif	/* _LVM_LVMLOCKD_INTERNAL_H */
-- 
2.25.1




* [LVM2 RFCv1 2/5] lvmlockd: idm: Hook Seagate IDM wrapper APIs
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

To allow the IDM locking scheme to be used, this patch hooks up the
IDM wrapper; it also introduces a new locking type "idm", which can be
used for the global lock with the option '-g idm'.

To support the IDM locking type, the main change in the data structures
is the addition of a PV path array.  The PV list is transferred from
the LVM commands; when the lvmlockd core layer receives a message, it
extracts the paths using the keyword "path[idx]".  Finally, the PV list
is passed to the IDM lock manager as the target drives for sending the
IDM SCSI commands.
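
For illustration, a request carrying three PVs would include fields
like the following (the device paths are made-up placeholders; the
"path_num" and "path[idx]" keywords are the ones parsed in
client_recv_action() below):

  path_num = 3
  path[0] = /dev/sda
  path[1] = /dev/sdb
  path[2] = /dev/sdc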

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 daemons/lvmlockd/lvmlockd-core.c     | 279 ++++++++++++++++++++++++---
 daemons/lvmlockd/lvmlockd-internal.h |   4 +-
 2 files changed, 255 insertions(+), 28 deletions(-)

diff --git a/daemons/lvmlockd/lvmlockd-core.c b/daemons/lvmlockd/lvmlockd-core.c
index 094491029..d7971ccc1 100644
--- a/daemons/lvmlockd/lvmlockd-core.c
+++ b/daemons/lvmlockd/lvmlockd-core.c
@@ -421,6 +421,63 @@ struct lockspace *alloc_lockspace(void)
 	return ls;
 }
 
+static char **alloc_pvs_path(struct pvs *pvs, int num)
+{
+	if (!num)
+		return NULL;
+
+	pvs->path = malloc(sizeof(char *) * num);
+	if (!pvs->path)
+		return NULL;
+
+	memset(pvs->path, 0x0, sizeof(char *) * num);
+	return pvs->path;
+}
+
+static void free_pvs_path(struct pvs *pvs)
+{
+	int i;
+
+	for (i = 0; i < pvs->num; i++) {
+		if (!pvs->path[i])
+			continue;
+
+		free((char *)pvs->path[i]);
+		pvs->path[i] = NULL;
+	}
+
+	if (pvs->path) {
+		free(pvs->path);
+		pvs->path = NULL;
+	}
+}
+
+static char **alloc_and_copy_pvs_path(struct pvs *dst, struct pvs *src)
+{
+	int i;
+
+	if (!alloc_pvs_path(dst, src->num))
+		return NULL;
+
+	dst->num = 0;
+	for (i = 0; i < src->num; i++) {
+		if (!src->path[i] || !strcmp(src->path[i], "none"))
+			continue;
+
+		dst->path[dst->num] = strdup(src->path[i]);
+		if (!dst->path[dst->num]) {
+			log_error("out of memory for copying pvs path");
+			goto failed;
+		}
+		dst->num++;
+	}
+	return dst->path;
+
+failed:
+	free_pvs_path(dst);
+	return NULL;
+}
+
 static struct action *alloc_action(void)
 {
 	struct action *act;
@@ -510,6 +567,9 @@ static void free_action(struct action *act)
 		free(act->path);
 		act->path = NULL;
 	}
+
+	free_pvs_path(&act->pvs);
+
 	pthread_mutex_lock(&unused_struct_mutex);
 	if (unused_action_count >= MAX_UNUSED_ACTION) {
 		free(act);
@@ -564,9 +624,12 @@ static int setup_structs(void)
 	struct lock *lk;
 	int data_san = lm_data_size_sanlock();
 	int data_dlm = lm_data_size_dlm();
+	int data_idm = lm_data_size_idm();
 	int i;
 
 	resource_lm_data_size = data_san > data_dlm ? data_san : data_dlm;
+	resource_lm_data_size = resource_lm_data_size > data_idm ?
+					resource_lm_data_size : data_idm;
 
 	pthread_mutex_init(&unused_struct_mutex, NULL);
 	INIT_LIST_HEAD(&unused_action);
@@ -683,6 +746,8 @@ static const char *lm_str(int x)
 		return "dlm";
 	case LD_LM_SANLOCK:
 		return "sanlock";
+	case LD_LM_IDM:
+		return "idm";
 	default:
 		return "lm_unknown";
 	}
@@ -968,6 +1033,8 @@ static int lm_prepare_lockspace(struct lockspace *ls, struct action *act)
 		rv = lm_prepare_lockspace_dlm(ls);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_prepare_lockspace_sanlock(ls);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_prepare_lockspace_idm(ls);
 	else
 		return -1;
 
@@ -984,6 +1051,8 @@ static int lm_add_lockspace(struct lockspace *ls, struct action *act, int adopt)
 		rv = lm_add_lockspace_dlm(ls, adopt);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_add_lockspace_sanlock(ls, adopt);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_add_lockspace_idm(ls, adopt);
 	else
 		return -1;
 
@@ -1000,6 +1069,8 @@ static int lm_rem_lockspace(struct lockspace *ls, struct action *act, int free_v
 		rv = lm_rem_lockspace_dlm(ls, free_vg);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_rem_lockspace_sanlock(ls, free_vg);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_rem_lockspace_idm(ls, free_vg);
 	else
 		return -1;
 
@@ -1017,6 +1088,9 @@ static int lm_lock(struct lockspace *ls, struct resource *r, int mode, struct ac
 		rv = lm_lock_dlm(ls, r, mode, vb_out, adopt);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_lock_sanlock(ls, r, mode, vb_out, retry, adopt);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_lock_idm(ls, r, mode, vb_out, act->lv_uuid,
+				 &act->pvs, adopt);
 	else
 		return -1;
 
@@ -1034,6 +1108,8 @@ static int lm_convert(struct lockspace *ls, struct resource *r,
 		rv = lm_convert_dlm(ls, r, mode, r_version);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_convert_sanlock(ls, r, mode, r_version);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_convert_idm(ls, r, mode, r_version);
 	else
 		return -1;
 
@@ -1051,6 +1127,8 @@ static int lm_unlock(struct lockspace *ls, struct resource *r, struct action *ac
 		rv = lm_unlock_dlm(ls, r, r_version, lmu_flags);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		rv = lm_unlock_sanlock(ls, r, r_version, lmu_flags);
+	else if (ls->lm_type == LD_LM_IDM)
+		rv = lm_unlock_idm(ls, r, r_version, lmu_flags);
 	else
 		return -1;
 
@@ -1065,6 +1143,8 @@ static int lm_hosts(struct lockspace *ls, int notify)
 		return lm_hosts_dlm(ls, notify);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		return lm_hosts_sanlock(ls, notify);
+	else if (ls->lm_type == LD_LM_IDM)
+		return lm_hosts_idm(ls, notify);
 	return -1;
 }
 
@@ -1074,6 +1154,8 @@ static void lm_rem_resource(struct lockspace *ls, struct resource *r)
 		lm_rem_resource_dlm(ls, r);
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		lm_rem_resource_sanlock(ls, r);
+	else if (ls->lm_type == LD_LM_IDM)
+		lm_rem_resource_idm(ls, r);
 }
 
 static int lm_find_free_lock(struct lockspace *ls, uint64_t *free_offset, int *sector_size, int *align_size)
@@ -1082,6 +1164,8 @@ static int lm_find_free_lock(struct lockspace *ls, uint64_t *free_offset, int *s
 		return 0;
 	else if (ls->lm_type == LD_LM_SANLOCK)
 		return lm_find_free_lock_sanlock(ls, free_offset, sector_size, align_size);
+	else if (ls->lm_type == LD_LM_IDM)
+		return 0;
 	return -1;
 }
 
@@ -1690,8 +1774,8 @@ static int res_update(struct lockspace *ls, struct resource *r,
 }
 
 /*
- * There is nothing to deallocate when freeing a dlm LV, the LV
- * will simply be unlocked by rem_resource.
+ * For the DLM and IDM locking schemes, there is nothing to deallocate when
+ * freeing an LV; the LV will simply be unlocked by rem_resource.
  */
 
 static int free_lv(struct lockspace *ls, struct resource *r)
@@ -1700,6 +1784,8 @@ static int free_lv(struct lockspace *ls, struct resource *r)
 		return lm_free_lv_sanlock(ls, r);
 	else if (ls->lm_type == LD_LM_DLM)
 		return 0;
+	else if (ls->lm_type == LD_LM_IDM)
+		return 0;
 	else
 		return -EINVAL;
 }
@@ -2760,6 +2846,8 @@ out_act:
 	ls->drop_vg = drop_vg;
 	if (ls->lm_type == LD_LM_DLM && !strcmp(ls->name, gl_lsname_dlm))
 		global_dlm_lockspace_exists = 0;
+	if (ls->lm_type == LD_LM_IDM && !strcmp(ls->name, gl_lsname_idm))
+		global_idm_lockspace_exists = 0;
 
 	/*
 	 * Avoid a name collision of the same lockspace is added again before
@@ -2851,6 +2939,8 @@ static void gl_ls_name(char *ls_name)
 		memcpy(ls_name, gl_lsname_dlm, MAX_NAME);
 	else if (gl_use_sanlock)
 		memcpy(ls_name, gl_lsname_sanlock, MAX_NAME);
+	else if (gl_use_idm)
+		memcpy(ls_name, gl_lsname_idm, MAX_NAME);
 	else
 		memset(ls_name, 0, MAX_NAME);
 }
@@ -2879,9 +2969,19 @@ static int add_lockspace_thread(const char *ls_name,
 	strncpy(ls->name, ls_name, MAX_NAME);
 	ls->lm_type = lm_type;
 
-	if (act)
+	if (act) {
 		ls->start_client_id = act->client_id;
 
+		/*
+		 * Copy the PV list to the lockspace structure; it is
+		 * used for VG locking in the idm scheme.
+		 */
+		if (!alloc_and_copy_pvs_path(&ls->pvs, &act->pvs)) {
+			free(ls);
+			return -ENOMEM;
+		}
+	}
+
 	if (vg_uuid)
 		strncpy(ls->vg_uuid, vg_uuid, 64);
 
@@ -2908,6 +3008,17 @@ static int add_lockspace_thread(const char *ls_name,
 	pthread_mutex_lock(&lockspaces_mutex);
 	ls2 = find_lockspace_name(ls->name);
 	if (ls2) {
+		/*
+		 * If an existing lockspace is found, update its PV list
+		 * with the latest information, and release the old PV
+		 * list in case it holds stale information.
+		 */
+		free_pvs_path(&ls2->pvs);
+		if (!alloc_and_copy_pvs_path(&ls2->pvs, &ls->pvs)) {
+			log_debug("add_lockspace_thread %s fails to allocate pvs", ls->name);
+			rv = -ENOMEM;
+		}
+
 		if (ls2->thread_stop) {
 			log_debug("add_lockspace_thread %s exists and stopping", ls->name);
 			rv = -EAGAIN;
@@ -2920,6 +3031,7 @@ static int add_lockspace_thread(const char *ls_name,
 		}
 		pthread_mutex_unlock(&lockspaces_mutex);
 		free_resource(r);
+		free_pvs_path(&ls->pvs);
 		free(ls);
 		return rv;
 	}
@@ -2933,6 +3045,8 @@ static int add_lockspace_thread(const char *ls_name,
 
 	if (ls->lm_type == LD_LM_DLM && !strcmp(ls->name, gl_lsname_dlm))
 		global_dlm_lockspace_exists = 1;
+	if (ls->lm_type == LD_LM_IDM && !strcmp(ls->name, gl_lsname_idm))
+		global_idm_lockspace_exists = 1;
 	list_add_tail(&ls->list, &lockspaces);
 	pthread_mutex_unlock(&lockspaces_mutex);
 
@@ -2943,6 +3057,7 @@ static int add_lockspace_thread(const char *ls_name,
 		list_del(&ls->list);
 		pthread_mutex_unlock(&lockspaces_mutex);
 		free_resource(r);
+		free_pvs_path(&ls->pvs);
 		free(ls);
 		return rv;
 	}
@@ -2951,16 +3066,15 @@ static int add_lockspace_thread(const char *ls_name,
 }
 
 /*
- * There is no add_sanlock_global_lockspace or
- * rem_sanlock_global_lockspace because with sanlock,
- * the global lockspace is one of the vg lockspaces.
+ * There is no variant for sanlock because, with sanlock, the global
+ * lockspace is one of the vg lockspaces.
  */
-
-static int add_dlm_global_lockspace(struct action *act)
+static int add_global_lockspace(char *ls_name, int lm_type,
+				struct action *act)
 {
 	int rv;
 
-	if (global_dlm_lockspace_exists)
+	if (global_dlm_lockspace_exists || global_idm_lockspace_exists)
 		return 0;
 
 	/*
@@ -2968,9 +3082,9 @@ static int add_dlm_global_lockspace(struct action *act)
 	 * lock request, insert an internal gl sh lock request?
 	 */
 
-	rv = add_lockspace_thread(gl_lsname_dlm, NULL, NULL, LD_LM_DLM, NULL, act);
+	rv = add_lockspace_thread(ls_name, NULL, NULL, lm_type, NULL, act);
 	if (rv < 0)
-		log_debug("add_dlm_global_lockspace add_lockspace_thread %d", rv);
+		log_debug("add_global_lockspace add_lockspace_thread %d", rv);
 
 	/*
 	 * EAGAIN may be returned for a short period because
@@ -2983,12 +3097,12 @@ static int add_dlm_global_lockspace(struct action *act)
 }
 
 /*
- * If dlm gl lockspace is the only one left, then stop it.
- * This is not used for an explicit rem_lockspace action from
- * the client, only for auto remove.
+ * When the DLM or IDM locking scheme is used for the global lock, and
+ * the global lockspace is the only one left, stop it.  This is not
+ * used for an explicit rem_lockspace action from the client, only for
+ * auto remove.
  */
-
-static int rem_dlm_global_lockspace(void)
+static int rem_global_lockspace(char *ls_name)
 {
 	struct lockspace *ls, *ls_gl = NULL;
 	int others = 0;
@@ -2996,7 +3110,7 @@ static int rem_dlm_global_lockspace(void)
 
 	pthread_mutex_lock(&lockspaces_mutex);
 	list_for_each_entry(ls, &lockspaces, list) {
-		if (!strcmp(ls->name, gl_lsname_dlm)) {
+		if (!strcmp(ls->name, ls_name)) {
 			ls_gl = ls;
 			continue;
 		}
@@ -3028,6 +3142,26 @@ out:
 	return rv;
 }
 
+static int add_dlm_global_lockspace(struct action *act)
+{
+	return add_global_lockspace(gl_lsname_dlm, LD_LM_DLM, act);
+}
+
+static int rem_dlm_global_lockspace(void)
+{
+	return rem_global_lockspace(gl_lsname_dlm);
+}
+
+static int add_idm_global_lockspace(struct action *act)
+{
+	return add_global_lockspace(gl_lsname_idm, LD_LM_IDM, act);
+}
+
+static int rem_idm_global_lockspace(void)
+{
+	return rem_global_lockspace(gl_lsname_idm);
+}
+
 /*
  * When the first dlm lockspace is added for a vg, automatically add a separate
  * dlm lockspace for the global lock.
@@ -3053,6 +3187,9 @@ static int add_lockspace(struct action *act)
 		if (gl_use_dlm) {
 			rv = add_dlm_global_lockspace(act);
 			return rv;
+		} else if (gl_use_idm) {
+			rv = add_idm_global_lockspace(act);
+			return rv;
 		} else {
 			return -EINVAL;
 		}
@@ -3061,6 +3198,8 @@ static int add_lockspace(struct action *act)
 	if (act->rt == LD_RT_VG) {
 		if (gl_use_dlm)
 			add_dlm_global_lockspace(NULL);
+		else if (gl_use_idm)
+			add_idm_global_lockspace(NULL);
 
 		vg_ls_name(act->vg_name, ls_name);
 
@@ -3128,14 +3267,15 @@ static int rem_lockspace(struct action *act)
 	pthread_mutex_unlock(&lockspaces_mutex);
 
 	/*
-	 * The dlm global lockspace was automatically added when
-	 * the first dlm vg lockspace was added, now reverse that
-	 * by automatically removing the dlm global lockspace when
-	 * the last dlm vg lockspace is removed.
+	 * For the DLM and IDM locking schemes, the global lockspace was
+	 * automatically added when the first vg lockspace was added;
+	 * now reverse that by automatically removing the global
+	 * lockspace when the last vg lockspace is removed.
 	 */
-
 	if (rt == LD_RT_VG && gl_use_dlm)
 		rem_dlm_global_lockspace();
+	else if (rt == LD_RT_VG && gl_use_idm)
+		rem_idm_global_lockspace();
 
 	return 0;
 }
@@ -3259,6 +3399,7 @@ static int for_each_lockspace(int do_stop, int do_free, int do_force)
 				if (ls->free_vg) {
 					/* In future we may need to free ls->actions here */
 					free_ls_resources(ls);
+					free_pvs_path(&ls->pvs);
 					free(ls);
 					free_count++;
 				}
@@ -3272,6 +3413,7 @@ static int for_each_lockspace(int do_stop, int do_free, int do_force)
 		if (!gl_type_static) {
 			gl_use_dlm = 0;
 			gl_use_sanlock = 0;
+			gl_use_idm = 0;
 		}
 	}
 	pthread_mutex_unlock(&lockspaces_mutex);
@@ -3347,6 +3489,9 @@ static int work_init_vg(struct action *act)
 		rv = lm_init_vg_sanlock(ls_name, act->vg_name, act->flags, act->vg_args);
 	else if (act->lm_type == LD_LM_DLM)
 		rv = lm_init_vg_dlm(ls_name, act->vg_name, act->flags, act->vg_args);
+	else if (act->lm_type == LD_LM_IDM)
+		/* Nothing to do for IDM when initializing a VG */
+		rv = 0;
 	else
 		rv = -EINVAL;
 
@@ -3450,6 +3595,8 @@ static int work_init_lv(struct action *act)
 
 	} else if (act->lm_type == LD_LM_DLM) {
 		return 0;
+	} else if (act->lm_type == LD_LM_IDM) {
+		return 0;
 	} else {
 		log_error("init_lv ls_name %s bad lm_type %d", ls_name, act->lm_type);
 		return -EINVAL;
@@ -3513,20 +3660,29 @@ static void *worker_thread_main(void *arg_in)
 		if (act->op == LD_OP_RUNNING_LM) {
 			int run_sanlock = lm_is_running_sanlock();
 			int run_dlm = lm_is_running_dlm();
+			int run_idm = lm_is_running_idm();
 
 			if (daemon_test) {
 				run_sanlock = gl_use_sanlock;
 				run_dlm = gl_use_dlm;
+				run_idm = gl_use_idm;
 			}
 
-			if (run_sanlock && run_dlm)
+			/*
+			 * Multiple lock managers must not be running at the
+			 * same time for the global lock; if more than one is
+			 * detected, report the conflict.
+			 */
+			if ((run_sanlock + run_dlm + run_idm) >= 2)
 				act->result = -EXFULL;
-			else if (!run_sanlock && !run_dlm)
+			else if (!run_sanlock && !run_dlm && !run_idm)
 				act->result = -ENOLCK;
 			else if (run_sanlock)
 				act->result = LD_LM_SANLOCK;
 			else if (run_dlm)
 				act->result = LD_LM_DLM;
+			else if (run_idm)
+				act->result = LD_LM_IDM;
 			add_client_result(act);
 
 		} else if ((act->op == LD_OP_LOCK) && (act->flags & LD_AF_SEARCH_LS)) {
@@ -3814,6 +3970,9 @@ static int client_send_result(struct client *cl, struct action *act)
 		} else if (gl_use_dlm) {
 			if (!gl_lsname_dlm[0])
 				strcat(result_flags, "NO_GL_LS,");
+		} else if (gl_use_idm) {
+			if (!gl_lsname_idm[0])
+				strcat(result_flags, "NO_GL_LS,");
 		} else {
 			int found_lm = 0;
 
@@ -3821,6 +3980,8 @@ static int client_send_result(struct client *cl, struct action *act)
 				found_lm++;
 			if (lm_support_sanlock() && lm_is_running_sanlock())
 				found_lm++;
+			if (lm_support_idm() && lm_is_running_idm())
+				found_lm++;
 
 			if (!found_lm)
 				strcat(result_flags, "NO_GL_LS,NO_LM");
@@ -3996,11 +4157,13 @@ static int add_lock_action(struct action *act)
 		if (gl_use_sanlock && (act->op == LD_OP_ENABLE || act->op == LD_OP_DISABLE)) {
 			vg_ls_name(act->vg_name, ls_name);
 		} else {
-			if (!gl_use_dlm && !gl_use_sanlock) {
+			if (!gl_use_dlm && !gl_use_sanlock && !gl_use_idm) {
 				if (lm_is_running_dlm())
 					gl_use_dlm = 1;
 				else if (lm_is_running_sanlock())
 					gl_use_sanlock = 1;
+				else if (lm_is_running_idm())
+					gl_use_idm = 1;
 			}
 			gl_ls_name(ls_name);
 		}
@@ -4048,6 +4211,17 @@ static int add_lock_action(struct action *act)
 			add_dlm_global_lockspace(NULL);
 			goto retry;
 
+		} else if (act->op == LD_OP_LOCK && act->rt == LD_RT_GL && act->mode != LD_LK_UN && gl_use_idm) {
+			/*
+			 * Automatically start the idm global lockspace when
+			 * a command tries to acquire the global lock.
+			 */
+			log_debug("lockspace \"%s\" not found for idm gl, adding...", ls_name);
+			act->flags |= LD_AF_SEARCH_LS;
+			act->flags |= LD_AF_WAIT_STARTING;
+			add_idm_global_lockspace(NULL);
+			goto retry;
+
 		} else if (act->op == LD_OP_LOCK && act->mode == LD_LK_UN) {
 			log_debug("lockspace \"%s\" not found for unlock ignored", ls_name);
 			return -ENOLS;
@@ -4268,6 +4442,8 @@ static int str_to_lm(const char *str)
 		return LD_LM_SANLOCK;
 	if (!strcmp(str, "dlm"))
 		return LD_LM_DLM;
+	if (!strcmp(str, "idm"))
+		return LD_LM_IDM;
 	return -2; 
 }
 
@@ -4603,12 +4779,14 @@ static void client_recv_action(struct client *cl)
 	const char *vg_sysid;
 	const char *path;
 	const char *str;
+	struct pvs pvs;
+	char buf[11];	/* p a t h [ x x x x ] \0 */
 	int64_t val;
 	uint32_t opts = 0;
 	int result = 0;
 	int cl_pid;
 	int op, rt, lm, mode;
-	int rv;
+	int rv, i;
 
 	buffer_init(&req.buffer);
 
@@ -4697,11 +4875,13 @@ static void client_recv_action(struct client *cl)
 	if (!cl->name[0] && cl_name)
 		strncpy(cl->name, cl_name, MAX_NAME);
 
-	if (!gl_use_dlm && !gl_use_sanlock && (lm > 0)) {
+	if (!gl_use_dlm && !gl_use_sanlock && !gl_use_idm && (lm > 0)) {
 		if (lm == LD_LM_DLM && lm_support_dlm())
 			gl_use_dlm = 1;
 		else if (lm == LD_LM_SANLOCK && lm_support_sanlock())
 			gl_use_sanlock = 1;
+		else if (lm == LD_LM_IDM && lm_support_idm())
+			gl_use_idm = 1;
 
 		log_debug("set gl_use_%s", lm_str(lm));
 	}
@@ -4758,6 +4938,40 @@ static void client_recv_action(struct client *cl)
 	if (val)
 		act->host_id = val;
 
+	/* Create PV list for idm */
+	if (lm == LD_LM_IDM) {
+		memset(&pvs, 0x0, sizeof(pvs));
+
+		pvs.num = daemon_request_int(req, "path_num", 0);
+		log_error("pvs_num = %d", pvs.num);
+
+		if (!pvs.num)
+			goto skip_pvs_path;
+
+		/* Receive the pv list which is transferred from LVM command */
+		if (!alloc_pvs_path(&pvs, pvs.num)) {
+			log_error("fail to allocate pvs path");
+			rv = -ENOMEM;
+			goto out;
+		}
+
+		for (i = 0; i < pvs.num; i++) {
+			snprintf(buf, sizeof(buf), "path[%d]", i);
+			pvs.path[i] = (char *)daemon_request_str(req, buf, NULL);
+		}
+
+		if (!alloc_and_copy_pvs_path(&act->pvs, &pvs)) {
+			log_error("fail to allocate pvs path");
+			rv = -ENOMEM;
+			goto out;
+		}
+
+		if (pvs.path)
+			free(pvs.path);
+		pvs.path = NULL;
+	}
+
+skip_pvs_path:
 	act->max_retries = daemon_request_int(req, "max_retries", DEFAULT_MAX_RETRIES);
 
 	dm_config_destroy(req.cft);
@@ -4779,6 +4993,12 @@ static void client_recv_action(struct client *cl)
 		goto out;
 	}
 
+	if (lm == LD_LM_IDM && !lm_support_idm()) {
+		log_debug("idm not supported");
+		rv = -EPROTONOSUPPORT;
+		goto out;
+	}
+
 	if (act->op == LD_OP_LOCK && act->mode != LD_LK_UN)
 		cl->lock_ops = 1;
 
@@ -5377,6 +5597,7 @@ static void adopt_locks(void)
 		}
 		
 		list_del(&ls->list);
+		free_pvs_path(&ls->pvs);
 		free(ls);
 	}
 
@@ -5417,6 +5638,7 @@ static void adopt_locks(void)
 		if (rv < 0) {
 			log_error("Failed to create lockspace thread for VG %s", ls->vg_name);
 			list_del(&ls->list);
+			free_pvs_path(&ls->pvs);
 			free(ls);
 			free_action(act);
 			count_start_fail++;
@@ -5859,6 +6081,7 @@ static int main_loop(daemon_state *ds_arg)
 	}
 
 	strcpy(gl_lsname_dlm, S_NAME_GL_DLM);
+	strcpy(gl_lsname_idm, S_NAME_GL_IDM);
 
 	INIT_LIST_HEAD(&lockspaces);
 	pthread_mutex_init(&lockspaces_mutex, NULL);
@@ -6112,6 +6335,8 @@ int main(int argc, char *argv[])
 				gl_use_dlm = 1;
 			else if (lm == LD_LM_SANLOCK && lm_support_sanlock())
 				gl_use_sanlock = 1;
+			else if (lm == LD_LM_IDM && lm_support_idm())
+				gl_use_idm = 1;
 			else {
 				fprintf(stderr, "invalid gl-type option\n");
 				exit(EXIT_FAILURE);
diff --git a/daemons/lvmlockd/lvmlockd-internal.h b/daemons/lvmlockd/lvmlockd-internal.h
index 983d66589..06a50ad85 100644
--- a/daemons/lvmlockd/lvmlockd-internal.h
+++ b/daemons/lvmlockd/lvmlockd-internal.h
@@ -121,7 +121,7 @@ struct client {
 #define DEFAULT_MAX_RETRIES 4
 
 struct pvs {
-	const char **path;
+	char **path;
 	int num;
 };
 
@@ -338,7 +338,9 @@ EXTERN int gl_use_idm;
 EXTERN int gl_vg_removed;
 EXTERN char gl_lsname_dlm[MAX_NAME+1];
 EXTERN char gl_lsname_sanlock[MAX_NAME+1];
+EXTERN char gl_lsname_idm[MAX_NAME+1];
 EXTERN int global_dlm_lockspace_exists;
+EXTERN int global_idm_lockspace_exists;
 
 EXTERN int daemon_test; /* run as much as possible without a live lock manager */
 EXTERN int daemon_debug;
-- 
2.25.1
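
For reference, the PV list arrives in the daemon request as plain
key/value pairs, which the client_recv_action() hunk above decodes with
daemon_request_int() and daemon_request_str(); a request carrying two
PVs would hold keys roughly like this (device paths hypothetical; the
sender side that builds these keys is added in patch 4/5):

	path_num = 2
	path[0] = /dev/sda
	path[1] = /dev/sdb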




* [LVM2 RFCv1 3/5] lib: locking: Add new type "idm"
  2021-04-25  2:22 [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme Leo Yan
  2021-04-25  2:22 ` [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme Leo Yan
  2021-04-25  2:22 ` [LVM2 RFCv1 2/5] lvmlockd: idm: Hook Seagate IDM wrapper APIs Leo Yan
@ 2021-04-25  2:22 ` Leo Yan
  2021-04-25  2:22 ` [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking Leo Yan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

We can consider the drive firmware a server that handles the locking
requests from nodes, which is essentially a client-server model.  DLM
uses the kernel as a central place to manage locks, so it also follows
the client-server model for its locking operations.  This is why the
IDM and DLM wrappers are similar to each other.

This patch largely works by generalizing the common DLM code paths and
then providing thin per-lock-type wrappers on top of them for both IDM
and DLM.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 lib/display/display.c            |  4 ++
 lib/locking/lvmlockd.c           | 72 +++++++++++++++++++++++++++-----
 lib/metadata/metadata-exported.h |  1 +
 lib/metadata/metadata.c          | 12 +++++-
 4 files changed, 78 insertions(+), 11 deletions(-)

diff --git a/lib/display/display.c b/lib/display/display.c
index f0f03c0a5..f9c9ef836 100644
--- a/lib/display/display.c
+++ b/lib/display/display.c
@@ -95,6 +95,8 @@ const char *get_lock_type_string(lock_type_t lock_type)
 		return "dlm";
 	case LOCK_TYPE_SANLOCK:
 		return "sanlock";
+	case LOCK_TYPE_IDM:
+		return "idm";
 	}
 	return "invalid";
 }
@@ -111,6 +113,8 @@ lock_type_t get_lock_type_from_string(const char *str)
 		return LOCK_TYPE_DLM;
 	if (!strcmp(str, "sanlock"))
 		return LOCK_TYPE_SANLOCK;
+	if (!strcmp(str, "idm"))
+		return LOCK_TYPE_IDM;
 	return LOCK_TYPE_INVALID;
 }
 
diff --git a/lib/locking/lvmlockd.c b/lib/locking/lvmlockd.c
index 3b9abd6bf..268f9fc2f 100644
--- a/lib/locking/lvmlockd.c
+++ b/lib/locking/lvmlockd.c
@@ -553,7 +553,8 @@ static int _deactivate_sanlock_lv(struct cmd_context *cmd, struct volume_group *
 	return 1;
 }
 
-static int _init_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+static int _init_vg(struct cmd_context *cmd, struct volume_group *vg,
+		    const char *lock_type)
 {
 	daemon_reply reply;
 	const char *reply_str;
@@ -569,7 +570,7 @@ static int _init_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
 	reply = _lockd_send("init_vg",
 				"pid = " FMTd64, (int64_t) getpid(),
 				"vg_name = %s", vg->name,
-				"vg_lock_type = %s", "dlm",
+				"vg_lock_type = %s", lock_type,
 				NULL);
 
 	if (!_lockd_result(reply, &result, NULL)) {
@@ -589,10 +590,12 @@ static int _init_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
 		log_error("VG %s init failed: invalid parameters for dlm", vg->name);
 		break;
 	case -EMANAGER:
-		log_error("VG %s init failed: lock manager dlm is not running", vg->name);
+		log_error("VG %s init failed: lock manager %s is not running",
+			  vg->name, lock_type);
 		break;
 	case -EPROTONOSUPPORT:
-		log_error("VG %s init failed: lock manager dlm is not supported by lvmlockd", vg->name);
+		log_error("VG %s init failed: lock manager %s is not supported by lvmlockd",
+			  vg->name, lock_type);
 		break;
 	case -EEXIST:
 		log_error("VG %s init failed: a lockspace with the same name exists", vg->name);
@@ -616,7 +619,7 @@ static int _init_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
 		goto out;
 	}
 
-	vg->lock_type = "dlm";
+	vg->lock_type = lock_type;
 	vg->lock_args = vg_lock_args;
 
 	if (!vg_write(vg) || !vg_commit(vg)) {
@@ -631,6 +634,16 @@ out:
 	return ret;
 }
 
+static int _init_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _init_vg(cmd, vg, "dlm");
+}
+
+static int _init_vg_idm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _init_vg(cmd, vg, "idm");
+}
+
 static int _init_vg_sanlock(struct cmd_context *cmd, struct volume_group *vg, int lv_lock_count)
 {
 	daemon_reply reply;
@@ -794,7 +807,7 @@ out:
 
 /* called after vg_remove on disk */
 
-static int _free_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+static int _free_vg(struct cmd_context *cmd, struct volume_group *vg)
 {
 	daemon_reply reply;
 	uint32_t lockd_flags = 0;
@@ -820,16 +833,27 @@ static int _free_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
 	}
 
 	if (!ret)
-		log_error("_free_vg_dlm lvmlockd result %d", result);
+		log_error("%s: lock type %s lvmlockd result %d",
+			  __func__, vg->lock_type, result);
 
 	daemon_reply_destroy(reply);
 
 	return 1;
 }
 
+static int _free_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _free_vg(cmd, vg);
+}
+
+static int _free_vg_idm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _free_vg(cmd, vg);
+}
+
 /* called before vg_remove on disk */
 
-static int _busy_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+static int _busy_vg(struct cmd_context *cmd, struct volume_group *vg)
 {
 	daemon_reply reply;
 	uint32_t lockd_flags = 0;
@@ -864,13 +888,24 @@ static int _busy_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
 	}
 
 	if (!ret)
-		log_error("_busy_vg_dlm lvmlockd result %d", result);
+		log_error("%s: lock type %s lvmlockd result %d", __func__,
+			  vg->lock_type, result);
 
  out:
 	daemon_reply_destroy(reply);
 	return ret;
 }
 
+static int _busy_vg_dlm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _busy_vg(cmd, vg);
+}
+
+static int _busy_vg_idm(struct cmd_context *cmd, struct volume_group *vg)
+{
+	return _busy_vg(cmd, vg);
+}
+
 /* called before vg_remove on disk */
 
 static int _free_vg_sanlock(struct cmd_context *cmd, struct volume_group *vg)
@@ -976,6 +1011,8 @@ int lockd_init_vg(struct cmd_context *cmd, struct volume_group *vg,
 		return _init_vg_dlm(cmd, vg);
 	case LOCK_TYPE_SANLOCK:
 		return _init_vg_sanlock(cmd, vg, lv_lock_count);
+	case LOCK_TYPE_IDM:
+		return _init_vg_idm(cmd, vg);
 	default:
 		log_error("Unknown lock_type.");
 		return 0;
@@ -1017,7 +1054,8 @@ int lockd_free_vg_before(struct cmd_context *cmd, struct volume_group *vg,
 	 * When removing (not changing), each LV is locked
 	 * when it is removed, they do not need checking here.
 	 */
-	if (lock_type_num == LOCK_TYPE_DLM || lock_type_num == LOCK_TYPE_SANLOCK) {
+	if (lock_type_num == LOCK_TYPE_DLM || lock_type_num == LOCK_TYPE_SANLOCK ||
+	    lock_type_num == LOCK_TYPE_IDM) {
 		if (changing && !_lockd_all_lvs(cmd, vg)) {
 			log_error("Cannot change VG %s with active LVs", vg->name);
 			return 0;
@@ -1041,6 +1079,9 @@ int lockd_free_vg_before(struct cmd_context *cmd, struct volume_group *vg,
 	case LOCK_TYPE_SANLOCK:
 		/* returning an error will prevent vg_remove() */
 		return _free_vg_sanlock(cmd, vg);
+	case LOCK_TYPE_IDM:
+		/* returning an error will prevent vg_remove() */
+		return _busy_vg_idm(cmd, vg);
 	default:
 		log_error("Unknown lock_type.");
 		return 0;
@@ -1059,6 +1100,9 @@ void lockd_free_vg_final(struct cmd_context *cmd, struct volume_group *vg)
 	case LOCK_TYPE_DLM:
 		_free_vg_dlm(cmd, vg);
 		break;
+	case LOCK_TYPE_IDM:
+		_free_vg_idm(cmd, vg);
+		break;
 	default:
 		log_error("Unknown lock_type.");
 	}
@@ -2679,6 +2723,7 @@ int lockd_init_lv(struct cmd_context *cmd, struct volume_group *vg, struct logic
 		return 1;
 	case LOCK_TYPE_SANLOCK:
 	case LOCK_TYPE_DLM:
+	case LOCK_TYPE_IDM:
 		break;
 	default:
 		log_error("lockd_init_lv: unknown lock_type.");
@@ -2821,6 +2866,8 @@ int lockd_init_lv(struct cmd_context *cmd, struct volume_group *vg, struct logic
 		lv->lock_args = "pending";
 	else if (!strcmp(vg->lock_type, "dlm"))
 		lv->lock_args = "dlm";
+	else if (!strcmp(vg->lock_type, "idm"))
+		lv->lock_args = "idm";
 
 	return 1;
 }
@@ -2836,6 +2883,7 @@ int lockd_free_lv(struct cmd_context *cmd, struct volume_group *vg,
 		return 1;
 	case LOCK_TYPE_DLM:
 	case LOCK_TYPE_SANLOCK:
+	case LOCK_TYPE_IDM:
 		if (!lock_args)
 			return 1;
 		return _free_lv(cmd, vg, lv_name, lv_id, lock_args);
@@ -3007,6 +3055,10 @@ const char *lockd_running_lock_type(struct cmd_context *cmd, int *found_multiple
 		log_debug("lvmlockd found dlm");
 		lock_type = "dlm";
 		break;
+	case LOCK_TYPE_IDM:
+		log_debug("lvmlockd found idm");
+		lock_type = "idm";
+		break;
 	default:
 		log_error("Failed to find a running lock manager.");
 		break;
diff --git a/lib/metadata/metadata-exported.h b/lib/metadata/metadata-exported.h
index 874088993..80755c2a3 100644
--- a/lib/metadata/metadata-exported.h
+++ b/lib/metadata/metadata-exported.h
@@ -353,6 +353,7 @@ typedef enum {
 	LOCK_TYPE_CLVM = 1,
 	LOCK_TYPE_DLM = 2,
 	LOCK_TYPE_SANLOCK = 3,
+	LOCK_TYPE_IDM = 4,
 } lock_type_t;
 
 struct cmd_context;
diff --git a/lib/metadata/metadata.c b/lib/metadata/metadata.c
index ed1c05f75..b815eda87 100644
--- a/lib/metadata/metadata.c
+++ b/lib/metadata/metadata.c
@@ -2235,6 +2235,13 @@ static int _validate_lv_lock_args(struct logical_volume *lv)
 				   lv->vg->name, display_lvname(lv), lv->lock_args);
 			r = 0;
 		}
+
+	} else if (!strcmp(lv->vg->lock_type, "idm")) {
+		if (strcmp(lv->lock_args, "idm")) {
+			log_error(INTERNAL_ERROR "LV %s/%s has invalid lock_args \"%s\"",
+				   lv->vg->name, display_lvname(lv), lv->lock_args);
+			r = 0;
+		}
 	}
 
 	return r;
@@ -2582,7 +2589,8 @@ int vg_validate(struct volume_group *vg)
 			r = 0;
 		}
 
-		if (strcmp(vg->lock_type, "sanlock") && strcmp(vg->lock_type, "dlm")) {
+		if (strcmp(vg->lock_type, "sanlock") && strcmp(vg->lock_type, "dlm") &&
+		    strcmp(vg->lock_type, "idm")) {
 			log_error(INTERNAL_ERROR "VG %s has unknown lock_type %s",
 				  vg->name, vg->lock_type);
 			r = 0;
@@ -4368,6 +4376,8 @@ int is_lockd_type(const char *lock_type)
 		return 1;
 	if (!strcmp(lock_type, "sanlock"))
 		return 1;
+	if (!strcmp(lock_type, "idm"))
+		return 1;
 	return 0;
 }
 
-- 
2.25.1
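
For illustration, the display helpers in this patch let the new type
round-trip cleanly between string and enum; a hypothetical sanity check
(not part of the patch) could read:

	/* e.g. inside a unit-test style function */
	assert(get_lock_type_from_string("idm") == LOCK_TYPE_IDM);
	assert(!strcmp(get_lock_type_string(LOCK_TYPE_IDM), "idm"));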




* [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking
  2021-04-25  2:22 [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme Leo Yan
                   ` (2 preceding siblings ...)
  2021-04-25  2:22 ` [LVM2 RFCv1 3/5] lib: locking: Add new type "idm" Leo Yan
@ 2021-04-25  2:22 ` Leo Yan
  2021-04-28 19:39   ` David Teigland
  2021-04-25  2:22 ` [LVM2 RFCv1 5/5] tools: Add support for "idm" lock type Leo Yan
  2021-04-27 22:23 ` [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme David Teigland
  5 siblings, 1 reply; 14+ messages in thread
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

For shared VG or LV locking, the IDM locking scheme needs the PV list
associated with the VG or LV for sending SCSI commands, so somewhere in
the code has to generate that PV list.

Reviewing the flow of LVM commands, the best place to generate the PV
list is in the locking lib, which is what this patch does.  It iterates
over all the PV nodes one by one and compares each against the VG name
or the LV prefix string.  If a PV matches, it is added into the PV
list; finally the PV list is sent to the lvmlockd daemon.

As mentioned, it compares LV names against a prefix string of the
format "lv_name_"; the reason is that all relevant PVs need to be
found, e.g. a thin pool has LVs for metadata, pool, error, and the raw
LV, so the prefix string can be used to find all PVs belonging to the
thin pool.

The global lock is not covered in this patch.  To avoid a
chicken-and-egg problem, the global lock must be prepared before any
locking can be used, so the global lock's PV list is established in the
lvmlockd daemon by iterating over all drives with a partition labelled
"propeller".

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 lib/locking/lvmlockd.c | 284 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 273 insertions(+), 11 deletions(-)

diff --git a/lib/locking/lvmlockd.c b/lib/locking/lvmlockd.c
index 268f9fc2f..ca3ebfec3 100644
--- a/lib/locking/lvmlockd.c
+++ b/lib/locking/lvmlockd.c
@@ -25,6 +25,11 @@ static int _use_lvmlockd = 0;         /* is 1 if command is configured to use lv
 static int _lvmlockd_connected = 0;   /* is 1 if command is connected to lvmlockd */
 static int _lvmlockd_init_failed = 0; /* used to suppress further warnings */
 
+struct lvmlockd_pvs {
+	char **path;
+	int num;
+};
+
 void lvmlockd_set_socket(const char *sock)
 {
 	_lvmlockd_socket = sock;
@@ -178,18 +183,34 @@ static int _lockd_result(daemon_reply reply, int *result, uint32_t *lockd_flags)
 	return 1;
 }
 
-static daemon_reply _lockd_send(const char *req_name, ...)
+static daemon_reply _lockd_send_with_pvs(const char *req_name,
+				const struct lvmlockd_pvs *lock_pvs, ...)
 {
-	va_list ap;
 	daemon_reply repl;
 	daemon_request req;
+	int i;
+	char key[32];
+	const char *val;
+	va_list ap;
 
 	req = daemon_request_make(req_name);
 
-	va_start(ap, req_name);
+	va_start(ap, lock_pvs);
 	daemon_request_extend_v(req, ap);
 	va_end(ap);
 
+	/* Pass PV list */
+	if (lock_pvs) {
+		daemon_request_extend(req, "path_num = " FMTd64,
+				      (int64_t)(lock_pvs)->num, NULL);
+
+		for (i = 0; i < lock_pvs->num; i++) {
+			snprintf(key, sizeof(key), "path[%d] = %%s", i);
+			val = lock_pvs->path[i] ? lock_pvs->path[i] : "none";
+			daemon_request_extend(req, key, val, NULL);
+		}
+	}
+
 	repl = daemon_send(_lvmlockd, req);
 
 	daemon_request_destroy(req);
@@ -197,6 +218,218 @@ static daemon_reply _lockd_send(const char *req_name, ...)
 	return repl;
 }
 
+#define _lockd_send(req_name, args...)	\
+	_lockd_send_with_pvs(req_name, NULL, ##args)
+
+static int _lockd_retrive_vg_pv_num(struct volume_group *vg)
+{
+	struct pv_list *pvl;
+	int num = 0;
+
+	dm_list_iterate_items(pvl, &vg->pvs)
+		num++;
+
+	return num;
+}
+
+static void _lockd_retrive_vg_pv_list(struct volume_group *vg,
+				      struct lvmlockd_pvs *lock_pvs)
+{
+	struct pv_list *pvl;
+	int pv_num, i;
+	char **path;
+
+	memset(lock_pvs, 0x0, sizeof(*lock_pvs));
+
+	pv_num = _lockd_retrive_vg_pv_num(vg);
+	if (!pv_num) {
+		log_error("Fail to any PVs for VG %s", vg->name);
+		return;
+	}
+
+	/* Use a zeroed PV path array so the fail path can safely skip unset entries */
+	path = calloc(pv_num, sizeof(*path));
+	if (!path) {
+		log_error("Failed to allocate PV list for VG %s", vg->name);
+		return;
+	}
+	lock_pvs->path = path;
+
+	i = 0;
+	dm_list_iterate_items(pvl, &vg->pvs) {
+		lock_pvs->path[i] = strdup(pv_dev_name(pvl->pv));
+		if (!lock_pvs->path[i]) {
+			log_error("Fail to allocate PV path for VG %s", vg->name);
+			goto fail;
+		}
+
+		log_debug("VG %s find PV device %s", vg->name, lock_pvs->path[i]);
+		i++;
+	}
+
+	lock_pvs->num = pv_num;
+	return;
+
+fail:
+	for (i = 0; i < pv_num; i++) {
+		if (!lock_pvs->path[i])
+			continue;
+		free(lock_pvs->path[i]);
+		lock_pvs->path[i] = NULL;
+	}
+	free(lock_pvs->path);
+	lock_pvs->path = NULL;
+	lock_pvs->num = 0;
+	return;
+}
+
+static int _lockd_retrive_lv_pv_num(struct volume_group *vg,
+				    const char *lv_name)
+{
+	struct pv_list *pvl;
+	struct physical_volume *pv;
+	const struct pv_segment *pvseg;
+	char *lv_name_prefix;
+	int pv_num = 0;
+
+	/* Allocate buffer for 'lv_name' + '_' + '\0' */
+	lv_name_prefix = malloc(strlen(lv_name) + 1 + 1);
+	if (!lv_name_prefix)
+		return 0;
+	snprintf(lv_name_prefix, strlen(lv_name) + 1 + 1, "%s_", lv_name);
+
+	dm_list_iterate_items(pvl, &vg->pvs) {
+		pv = pvl->pv;
+		dm_list_iterate_items(pvseg, &pv->segments) {
+
+			if (!pvseg || !pvseg->lvseg ||
+			    !pvseg->lvseg->lv || !pvseg->lvseg->lv->name)
+				continue;
+
+			if (!strcmp(lv_name, pvseg->lvseg->lv->name)) {
+				pv_num++;
+				break;
+			}
+
+			/* Find out corresponding PVs with lv name prefix */
+			if (strstr(pvseg->lvseg->lv->name, lv_name_prefix)) {
+				pv_num++;
+				break;
+			}
+		}
+	}
+
+	free(lv_name_prefix);
+	return pv_num;
+}
+
+static void _lockd_retrive_lv_pv_list(struct volume_group *vg,
+				      const char *lv_name,
+				      struct lvmlockd_pvs *lock_pvs)
+{
+	struct pv_list *pvl;
+	struct physical_volume *pv;
+	const struct pv_segment *pvseg;
+	char *lv_name_prefix;
+	char **path;
+	int found, pv_num, i = 0;
+
+	memset(lock_pvs, 0x0, sizeof(*lock_pvs));
+
+	pv_num = _lockd_retrive_lv_pv_num(vg, lv_name);
+	if (!pv_num) {
+		/*
+		 * Fixup for 'lvcreate --type error -L1 -n $lv1 $vg', in this
+		 * case, the drive path list is empty since it doesn't establish
+		 * the structure 'pvseg->lvseg->lv->name'.
+		 *
+		 * So create drive path list with all drives in the VG.
+		 */
+		log_error("Fail to find any PVs for %s/%s", vg->name, lv_name);
+		log_error("Try to find PVs from VG %s instead", vg->name);
+		_lockd_retrive_vg_pv_list(vg, lock_pvs);
+		return;
+	}
+
+	/* Use a zeroed PV path array so the fail path can safely skip unset entries */
+	path = calloc(pv_num, sizeof(*path));
+	if (!path) {
+		log_error("Failed to allocate PV list for %s/%s", vg->name, lv_name);
+		return;
+	}
+	lock_pvs->path = path;
+
+	/* Allocate buffer for 'lv_name' + '_' + '\0' */
+	lv_name_prefix = malloc(strlen(lv_name) + 1 + 1);
+	if (!lv_name_prefix) {
+		free(lock_pvs->path);
+		lock_pvs->path = NULL;
+		return;
+	}
+	snprintf(lv_name_prefix, strlen(lv_name) + 1 + 1, "%s_", lv_name);
+
+	dm_list_iterate_items(pvl, &vg->pvs) {
+		found = 0;
+		pv = pvl->pv;
+		dm_list_iterate_items(pvseg, &pv->segments) {
+
+			if (!pvseg || !pvseg->lvseg ||
+			    !pvseg->lvseg->lv || !pvseg->lvseg->lv->name)
+				continue;
+
+			log_debug("%s pvseg->lvseg->name=%s", __func__,
+				  pvseg->lvseg->lv->name);
+
+			if (!strcmp(lv_name, pvseg->lvseg->lv->name)) {
+				found = 1;
+				break;
+			}
+
+			/* Find out corresponding PVs with lv name prefix */
+			if (strstr(pvseg->lvseg->lv->name, lv_name_prefix)) {
+				found = 1;
+				break;
+			}
+		}
+
+		if (found) {
+			lock_pvs->path[i] = strdup(pv_dev_name(pv));
+			if (!lock_pvs->path[i]) {
+				log_error("Fail to allocate PV path for LV %s/%s",
+					  vg->name, lv_name);
+				goto fail;
+			}
+
+			log_debug("Find PV device %s for LV %s/%s",
+				  lock_pvs->path[i], vg->name, lv_name);
+			i++;
+		}
+	}
+
+	lock_pvs->num = pv_num;
+	free(lv_name_prefix);
+	return;
+
+fail:
+	for (i = 0; i < pv_num; i++) {
+		if (!lock_pvs->path[i])
+			continue;
+		free(lock_pvs->path[i]);
+		lock_pvs->path[i] = NULL;
+	}
+	free(lock_pvs->path);
+	lock_pvs->path = NULL;
+	lock_pvs->num = 0;
+	free(lv_name_prefix);
+	return;
+}
+
+static void _lockd_free_pv_list(struct lvmlockd_pvs *lock_pvs)
+{
+	int i;
+
+	for (i = 0; i < lock_pvs->num; i++) {
+		free(lock_pvs->path[i]);
+		lock_pvs->path[i] = NULL;
+	}
+
+	free(lock_pvs->path);
+	lock_pvs->path = NULL;
+	lock_pvs->num = 0;
+}
+
 /*
  * result/lockd_flags are values returned from lvmlockd.
  *
@@ -227,6 +460,7 @@ static int _lockd_request(struct cmd_context *cmd,
 		          const char *lv_lock_args,
 		          const char *mode,
 		          const char *opts,
+			  const struct lvmlockd_pvs *lock_pvs,
 		          int *result,
 		          uint32_t *lockd_flags)
 {
@@ -251,7 +485,16 @@ static int _lockd_request(struct cmd_context *cmd,
 		cmd_name = "none";
 
 	if (vg_name && lv_name) {
-		reply = _lockd_send(req_name,
+		/*
+		 * For an LV operation, the PV list must be passed for idm;
+		 * otherwise the IDM lock manager has no idea which drives
+		 * to send the locking request to, so return failure.
+		 */
+		if (!lock_pvs)
+			return 1;
+
+		reply = _lockd_send_with_pvs(req_name,
+					lock_pvs,
 					"cmd = %s", cmd_name,
 					"pid = " FMTd64, (int64_t) pid,
 					"mode = %s", mode,
@@ -271,7 +514,8 @@ static int _lockd_request(struct cmd_context *cmd,
 			  req_name, mode, vg_name, lv_name, *result, *lockd_flags);
 
 	} else if (vg_name) {
-		reply = _lockd_send(req_name,
+		reply = _lockd_send_with_pvs(req_name,
+					lock_pvs,
 					"cmd = %s", cmd_name,
 					"pid = " FMTd64, (int64_t) pid,
 					"mode = %s", mode,
@@ -288,7 +532,8 @@ static int _lockd_request(struct cmd_context *cmd,
 			  req_name, mode, vg_name, *result, *lockd_flags);
 
 	} else {
-		reply = _lockd_send(req_name,
+		reply = _lockd_send_with_pvs(req_name,
+					lock_pvs,
 					"cmd = %s", cmd_name,
 					"pid = " FMTd64, (int64_t) pid,
 					"mode = %s", mode,
@@ -1134,6 +1379,7 @@ int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_i
 	int host_id = 0;
 	int result;
 	int ret;
+	struct lvmlockd_pvs lock_pvs;
 
 	memset(uuid, 0, sizeof(uuid));
 
@@ -1169,7 +1415,15 @@ int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_i
 		host_id = find_config_tree_int(cmd, local_host_id_CFG, NULL);
 	}
 
-	reply = _lockd_send("start_vg",
+	/*
+	 * Create the VG's PV list when starting the VG; the PV list
+	 * is passed to lvmlockd, and the PV paths will be used to
+	 * send SCSI commands for the idm locking scheme.
+	 */
+	_lockd_retrive_vg_pv_list(vg, &lock_pvs);
+
+	reply = _lockd_send_with_pvs("start_vg",
+				&lock_pvs,
 				"pid = " FMTd64, (int64_t) getpid(),
 				"vg_name = %s", vg->name,
 				"vg_lock_type = %s", vg->lock_type,
@@ -1180,6 +1434,8 @@ int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_i
 				"opts = %s", start_init ? "start_init" : "none",
 				NULL);
 
+	_lockd_free_pv_list(&lock_pvs);
+
 	if (!_lockd_result(reply, &result, &lockd_flags)) {
 		ret = 0;
 		result = -ELOCKD;
@@ -1406,7 +1662,7 @@ int lockd_global_create(struct cmd_context *cmd, const char *def_mode, const cha
  req:
 	if (!_lockd_request(cmd, "lock_gl",
 			      NULL, vg_lock_type, NULL, NULL, NULL, NULL, mode, NULL,
-			      &result, &lockd_flags)) {
+			      NULL, &result, &lockd_flags)) {
 		/* No result from lvmlockd, it is probably not running. */
 		log_error("Global lock failed: check that lvmlockd is running.");
 		return 0;
@@ -1642,7 +1898,7 @@ int lockd_global(struct cmd_context *cmd, const char *def_mode)
 
 	if (!_lockd_request(cmd, "lock_gl",
 			    NULL, NULL, NULL, NULL, NULL, NULL, mode, opts,
-			    &result, &lockd_flags)) {
+			    NULL, &result, &lockd_flags)) {
 		/* No result from lvmlockd, it is probably not running. */
 
 		/* We don't care if an unlock fails. */
@@ -1910,7 +2166,7 @@ int lockd_vg(struct cmd_context *cmd, const char *vg_name, const char *def_mode,
 
 	if (!_lockd_request(cmd, "lock_vg",
 			      vg_name, NULL, NULL, NULL, NULL, NULL, mode, NULL,
-			      &result, &lockd_flags)) {
+			      NULL, &result, &lockd_flags)) {
 		/*
 		 * No result from lvmlockd, it is probably not running.
 		 * Decide if it is ok to continue without a lock in
@@ -2170,6 +2426,7 @@ int lockd_lv_name(struct cmd_context *cmd, struct volume_group *vg,
 	uint32_t lockd_flags;
 	int refreshed = 0;
 	int result;
+	struct lvmlockd_pvs lock_pvs;
 
 	/*
 	 * Verify that when --readonly is used, no LVs should be activated or used.
@@ -2235,15 +2492,20 @@ int lockd_lv_name(struct cmd_context *cmd, struct volume_group *vg,
  retry:
 	log_debug("lockd LV %s/%s mode %s uuid %s", vg->name, lv_name, mode, lv_uuid);
 
+	_lockd_retrive_lv_pv_list(vg, lv_name, &lock_pvs);
+
 	if (!_lockd_request(cmd, "lock_lv",
 			       vg->name, vg->lock_type, vg->lock_args,
 			       lv_name, lv_uuid, lock_args, mode, opts,
-			       &result, &lockd_flags)) {
+			       &lock_pvs, &result, &lockd_flags)) {
+		_lockd_free_pv_list(&lock_pvs);
 		/* No result from lvmlockd, it is probably not running. */
 		log_error("Locking failed for LV %s/%s", vg->name, lv_name);
 		return 0;
 	}
 
+	_lockd_free_pv_list(&lock_pvs);
+
 	/* The lv was not active/locked. */
 	if (result == -ENOENT && !strcmp(mode, "un"))
 		return 1;
-- 
2.25.1




* [LVM2 RFCv1 5/5] tools: Add support for "idm" lock type
  2021-04-25  2:22 [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme Leo Yan
                   ` (3 preceding siblings ...)
  2021-04-25  2:22 ` [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking Leo Yan
@ 2021-04-25  2:22 ` Leo Yan
  2021-04-27 22:23 ` [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme David Teigland
  5 siblings, 0 replies; 14+ messages in thread
From: Leo Yan @ 2021-04-25  2:22 UTC (permalink / raw)
  To: lvm-devel

This patch updates the comments and code in the LVM tools to support
the "idm" lock type.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/lvconvert.c |  2 ++
 tools/toollib.c   | 11 ++++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/lvconvert.c b/tools/lvconvert.c
index 4c159b01e..1f2178ce5 100644
--- a/tools/lvconvert.c
+++ b/tools/lvconvert.c
@@ -3415,6 +3415,8 @@ static int _lvconvert_to_pool(struct cmd_context *cmd,
 				pool_lv->lock_args = "pending";
 			else if (!strcmp(vg->lock_type, "dlm"))
 				pool_lv->lock_args = "dlm";
+			else if (!strcmp(vg->lock_type, "idm"))
+				pool_lv->lock_args = "idm";
 			/* The lock_args will be set in vg_write(). */
 		}
 	}
diff --git a/tools/toollib.c b/tools/toollib.c
index 67fbbdaa0..cda6d74f3 100644
--- a/tools/toollib.c
+++ b/tools/toollib.c
@@ -591,15 +591,15 @@ int vgcreate_params_set_from_args(struct cmd_context *cmd,
 	 * new VG, and is it compatible with current lvm.conf settings.
 	 *
 	 * The end result is to set vp_new->lock_type to:
-	 * none | clvm | dlm | sanlock.
+	 * none | clvm | dlm | sanlock | idm.
 	 *
 	 * If 'vgcreate --lock-type <arg>' is set, the answer is given
-	 * directly by <arg> which is one of none|clvm|dlm|sanlock.
+	 * directly by <arg> which is one of none|clvm|dlm|sanlock|idm.
 	 *
 	 * 'vgcreate --clustered y' is the way to create clvm VGs.
 	 *
 	 * 'vgcreate --shared' is the way to create lockd VGs.
-	 * lock_type of sanlock or dlm is selected based on
+	 * lock_type of sanlock, dlm or idm is selected based on
 	 * which lock manager is running.
 	 *
 	 *
@@ -646,7 +646,7 @@ int vgcreate_params_set_from_args(struct cmd_context *cmd,
 	 * - lvmlockd is used
 	 * - VGs with CLUSTERED set are ignored (requires clvmd)
 	 * - VGs with lockd type can be used
-	 * - vgcreate can create new VGs with lock_type sanlock or dlm
+	 * - vgcreate can create new VGs with lock_type sanlock, dlm or idm
 	 * - 'vgcreate --clustered y' fails
 	 * - 'vgcreate --shared' works
 	 * - 'vgcreate' (neither option) creates a local VG
@@ -658,7 +658,7 @@ int vgcreate_params_set_from_args(struct cmd_context *cmd,
 		lock_type = arg_str_value(cmd, locktype_ARG, "");
 
 		if (arg_is_set(cmd, shared_ARG) && !is_lockd_type(lock_type)) {
-			log_error("The --shared option requires lock type sanlock or dlm.");
+			log_error("The --shared option requires lock type sanlock, dlm or idm.");
 			return 0;
 		}
 
@@ -697,6 +697,7 @@ int vgcreate_params_set_from_args(struct cmd_context *cmd,
 
 	case LOCK_TYPE_SANLOCK:
 	case LOCK_TYPE_DLM:
+	case LOCK_TYPE_IDM:
 		if (!use_lvmlockd) {
 			log_error("Using a shared lock type requires lvmlockd.");
 			return 0;
-- 
2.25.1
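
As a usage illustration (assuming lvmlockd and the Seagate IDM lock
manager daemon are both running; see vgcreate(8) for the exact
options):

	# pick the lock type explicitly
	vgcreate --shared --locktype idm vg0 /dev/sda /dev/sdb

	# or let --shared select the running lock manager
	vgcreate --shared vg0 /dev/sda /dev/sdb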




* [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme
  2021-04-25  2:22 [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme Leo Yan
                   ` (4 preceding siblings ...)
  2021-04-25  2:22 ` [LVM2 RFCv1 5/5] tools: Add support for "idm" lock type Leo Yan
@ 2021-04-27 22:23 ` David Teigland
  2021-04-28  1:50   ` Leo Yan
  5 siblings, 1 reply; 14+ messages in thread
From: David Teigland @ 2021-04-27 22:23 UTC (permalink / raw)
  To: lvm-devel

On Sun, Apr 25, 2021 at 10:22:36AM +0800, Leo Yan wrote:
> This patch set enables the In-Drive-Mutex (IDM) locking scheme.

Hi, nice work, I'll begin looking through this in the next few days.
It seems to closely follow the existing pattern of sanlock and dlm,
so it should be pretty easy to follow.

Dave




* [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme
  2021-04-27 22:23 ` [LVM2 RFCv1 0/5] Enable In-Drive-Mutex Locking scheme David Teigland
@ 2021-04-28  1:50   ` Leo Yan
  0 siblings, 0 replies; 14+ messages in thread
From: Leo Yan @ 2021-04-28  1:50 UTC (permalink / raw)
  To: lvm-devel

On Tue, Apr 27, 2021 at 05:23:23PM -0500, David Teigland wrote:
> On Sun, Apr 25, 2021 at 10:22:36AM +0800, Leo Yan wrote:
> > This patch set enables the In-Drive-Mutex (IDM) locking scheme.
> 
> Hi, nice work, I'll begin looking through this in the next few days.
> It seems to closely follow the existing pattern of sanlock and dlm,
> so it should be pretty easy to follow.

Thank you, Dave!  If have any questions, just let us know.

Leo




* [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking
  2021-04-25  2:22 ` [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking Leo Yan
@ 2021-04-28 19:39   ` David Teigland
  2021-04-29  3:12     ` Leo Yan
  0 siblings, 1 reply; 14+ messages in thread
From: David Teigland @ 2021-04-28 19:39 UTC (permalink / raw)
  To: lvm-devel

On Sun, Apr 25, 2021 at 10:22:40AM +0800, Leo Yan wrote:
> +static void _lockd_retrive_lv_pv_list(struct volume_group *vg,
> +				      const char *lv_name,
> +				      struct lvmlockd_pvs *lock_pvs)
> +{

It looks like this wants a list of PVs (names) used by the LV.  Try
iterating through all PVs in the VG and using the existing lv_is_on_pv()
function to check if the LV is using that PV, e.g.

for each pv in vg->pvs,
	if (lv_is_on_pv(lv, pv))
		copy the pv name;

You could pass the lv struct through to here, or use find_lv(vg, lv_name)
to get it again here.
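
Spelled out a bit more, a minimal sketch of that (assuming the
lvmlockd_pvs struct from the patch and the existing find_lv(),
lv_is_on_pv() and pv_dev_name() helpers; error handling trimmed):

static void _lockd_get_lv_pvs(struct volume_group *vg, const char *lv_name,
			      struct lvmlockd_pvs *lock_pvs)
{
	struct logical_volume *lv = find_lv(vg, lv_name);
	struct pv_list *pvl;
	int i = 0;

	memset(lock_pvs, 0, sizeof(*lock_pvs));
	if (!lv)
		return;

	/* zeroed array sized for the worst case: every PV in the VG */
	if (!(lock_pvs->path = calloc(vg->pv_count, sizeof(*lock_pvs->path))))
		return;

	dm_list_iterate_items(pvl, &vg->pvs) {
		if (!lv_is_on_pv(lv, pvl->pv))
			continue;
		if ((lock_pvs->path[i] = strdup(pv_dev_name(pvl->pv))))
			i++;
	}

	lock_pvs->num = i;
}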

> @@ -251,7 +485,16 @@ static int _lockd_request(struct cmd_context *cmd,
>  	if (vg_name && lv_name) {
> -		reply = _lockd_send(req_name,
> +		/*
> +		 * For an LV operation, the PV list must be passed for idm;
> +		 * otherwise the IDM lock manager has no idea which drives
> +		 * to send the locking request to, so return failure.
> +		 */
> +		if (!lock_pvs)
> +			return 1;
> +
> +		reply = _lockd_send_with_pvs(req_name,

Requires other lock managers to include lock_pvs?

> +	/*
> +	 * Create the VG's PV list when starting the VG; the PV list
> +	 * is passed to lvmlockd, and the PV paths will be used to
> +	 * send SCSI commands for the idm locking scheme.
> +	 */

if vg->lock_type is IDM before creating pv_list?

> +	_lockd_retrive_vg_pv_list(vg, &lock_pvs);
> +
> +	reply = _lockd_send_with_pvs("start_vg",
> +				&lock_pvs,

and probably NULL instead of &lock_pvs for non-IDM.

Dave




* [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme
  2021-04-25  2:22 ` [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme Leo Yan
@ 2021-04-28 19:54   ` David Teigland
  2021-04-29  3:26     ` Leo Yan
  0 siblings, 1 reply; 14+ messages in thread
From: David Teigland @ 2021-04-28 19:54 UTC (permalink / raw)
  To: lvm-devel

On Sun, Apr 25, 2021 at 10:22:37AM +0800, Leo Yan wrote:
> One thing that should be mentioned is the IDM's LVB.  IDM supports an
> LVB of at most 7 bytes when stored in the drive; the most significant
> byte of the 8 bytes is reserved for control bits.  For this reason, the
> patch maps a timestamp in microsecond units to its cached LVB;
> essentially, if the timestamp was updated by another node, that means
> the local LVB is invalid and thus the metadata should be invalidated.
> When the timestamp is stored into the drive's LVB, it's possible to hit
> a time-going-backwards issue, introduced by limited time precision or
> missing time synchronization across nodes.  So the IDM wrapper fixes up
> the timestamp by incrementing the latest value by 1 and writing it back
> to the drive.

While lvmlockd, sanlock and dlm still update the LVB to track VG changes,
it's not actually used for anything any longer.  When lvm used
lvmetad to cache metadata, we used the LVB to trigger lvmetad cache
invalidation.  It's possible that this LVB functionality could be useful
again in the future.
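
For reference, the timestamp fixup described in the quoted text boils
down to something like this (a hypothetical helper, not code from the
patch; only the low 7 bytes of the LVB carry the timestamp):

	#include <stdint.h>

	#define IDM_LVB_TS_MASK	0x00ffffffffffffffULL	/* usable 7 bytes */

	static uint64_t idm_lvb_next_ts(uint64_t cached_ts, uint64_t now_us)
	{
		uint64_t ts = now_us & IDM_LVB_TS_MASK;

		/* clocks are not synchronized across nodes; never go backwards */
		if (ts <= cached_ts)
			ts = cached_ts + 1;

		return ts & IDM_LVB_TS_MASK;
	}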

Dave




* [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking
  2021-04-28 19:39   ` David Teigland
@ 2021-04-29  3:12     ` Leo Yan
  2021-04-29  3:36       ` Leo Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Leo Yan @ 2021-04-29  3:12 UTC (permalink / raw)
  To: lvm-devel

On Wed, Apr 28, 2021 at 02:39:27PM -0500, David Teigland wrote:
> On Sun, Apr 25, 2021 at 10:22:40AM +0800, Leo Yan wrote:
> > +static void _lockd_retrive_lv_pv_list(struct volume_group *vg,
> > +				      const char *lv_name,
> > +				      struct lvmlockd_pvs *lock_pvs)
> > +{
> 
> It looks like this wants a list of PVs (names) used by the LV.  Try
> iterating through all PVs in the VG and using the existing lv_is_on_pv()
> function to check if the LV is using that PV, e.g.
> 
> for each pv in vg->pvs,
> 	if (lv_is_on_pv(lv, pv))
> 		copy the pv name;

Using lv_is_on_pv() is much better.  I read the code a bit; besides
checking whether the LV itself is on the PV, it also checks the sub LVs
(like metadata, pool, etc.).  Will fix it.

> You could pass the lv struct through to here, or use find_lv(vg, lv_name)
> to get it again here.
> 
> > @@ -251,7 +485,16 @@ static int _lockd_request(struct cmd_context *cmd,
> >  	if (vg_name && lv_name) {
> > -		reply = _lockd_send(req_name,
> > +		/*
> > +		 * For an LV operation, the PV list must be passed for idm;
> > +		 * otherwise the IDM lock manager has no idea which drives
> > +		 * to send the locking request to, so return failure.
> > +		 */
> > +		if (!lock_pvs)
> > +			return 1;
> > +
> > +		reply = _lockd_send_with_pvs(req_name,
> 
> Requires other lock managers to include lock_pvs?

No; I will change the code to pass "lock_pvs" only for IDM, and to
pass a NULL pointer for the other locking schemes.

> > +	/*
> > +	 * Create the VG's PV list when start the VG, the PV list
> > +	 * is passed to lvmlockd, and the the PVs path will be used
> > +	 * to send SCSI commands for idm locking scheme.
> > +	 */
> 
> if vg->lock_type is IDM before creating pv_list?

Yes, IIUC, when starting the VG, "vg->lock_type" has already been
assigned the locking scheme in use; for the IDM locking scheme,
"vg->lock_type" is "idm" before creating pv_list.

> > +	_lockd_retrive_vg_pv_list(vg, &lock_pvs);
> > +
> > +	reply = _lockd_send_with_pvs("start_vg",
> > +				&lock_pvs,
> 
> and probably NULL instead of &lock_pvs for non-IDM.

Will do.

Will follow up the comments in next patch set, thanks for reviewing!

Leo




* [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme
  2021-04-28 19:54   ` David Teigland
@ 2021-04-29  3:26     ` Leo Yan
  2021-04-29 15:31       ` David Teigland
  0 siblings, 1 reply; 14+ messages in thread
From: Leo Yan @ 2021-04-29  3:26 UTC (permalink / raw)
  To: lvm-devel

On Wed, Apr 28, 2021 at 02:54:45PM -0500, David Teigland wrote:
> On Sun, Apr 25, 2021 at 10:22:37AM +0800, Leo Yan wrote:
> > One thing that should be mentioned is the IDM's LVB.  IDM supports an
> > LVB of at most 7 bytes when stored in the drive; the most significant
> > byte of the 8 bytes is reserved for control bits.  For this reason, the
> > patch maps a timestamp in microsecond units to its cached LVB;
> > essentially, if the timestamp was updated by another node, that means
> > the local LVB is invalid and thus the metadata should be invalidated.
> > When the timestamp is stored into the drive's LVB, it's possible to hit
> > a time-going-backwards issue, introduced by limited time precision or
> > missing time synchronization across nodes.  So the IDM wrapper fixes up
> > the timestamp by incrementing the latest value by 1 and writing it back
> > to the drive.
> 
> While lvmlockd, sanlock and dlm still update the LVB to track VG changes,
> it's not actually used for anything any longer.  When lvm used
> lvmetad to cache metadata, we used the LVB to trigger lvmetad cache
> invalidation.  It's possible that this LVB functionality could be useful
> again in the future.

Thanks for the reminder.  So I think it's good to keep the LVB
functionality for IDM in lvmlockd, which keeps IDM consistent with the
other locking schemes (sanlock and dlm) and may become useful again
later as you said; I will update the commit log to reflect this.

As a side topic, just out of curiosity: if LVM doesn't use the LVB to
invalidate cached metadata, which mechanism does LVM use now for
metadata invalidation?  Sorry if this question shows my lack of
knowledge, but I just want to ensure I don't miss anything for the IDM
enabling.

Thanks,
Leo




* [LVM2 RFCv1 4/5] lib: locking: Parse PV list for IDM locking
  2021-04-29  3:12     ` Leo Yan
@ 2021-04-29  3:36       ` Leo Yan
  0 siblings, 0 replies; 14+ messages in thread
From: Leo Yan @ 2021-04-29  3:36 UTC (permalink / raw)
  To: lvm-devel

On Thu, Apr 29, 2021 at 11:12:09AM +0800, Leo Yan wrote:

[...]

> > > +	/*
> > > +	 * Create the VG's PV list when starting the VG; the PV list
> > > +	 * is passed to lvmlockd, and the PV paths will be used to
> > > +	 * send SCSI commands for the idm locking scheme.
> > > +	 */
> > 
> > if vg->lock_type is IDM before creating pv_list?
> 
> Yes, IIUC, when start the VG, "vg->lock_type" has been assigned to the
> locking scheme which it's using; for IDM locking scheme, "vg->lock_type"
> is "idm" before creating pv_list.

Not sure if I understand your question correctly, but I think you are
suggesting checking that "vg->lock_type" is IDM and only then creating
pv_list.  Will refine the code for this.

Thanks,
Leo




* [LVM2 RFCv1 1/5] lvmlockd: idm: Introduce new locking scheme
  2021-04-29  3:26     ` Leo Yan
@ 2021-04-29 15:31       ` David Teigland
  0 siblings, 0 replies; 14+ messages in thread
From: David Teigland @ 2021-04-29 15:31 UTC (permalink / raw)
  To: lvm-devel

On Thu, Apr 29, 2021 at 11:26:45AM +0800, Leo Yan wrote:
> As a side topic, just out of curiosity: if LVM doesn't use the LVB to
> invalidate cached metadata, which mechanism does LVM use now for
> metadata invalidation?

Every command reads metadata from disk, so there is no cache that needs to
be invalidated.  There is the /run/lvm/hints file, but that is not used
for shared VGs (it caches which devices on the system are LVM PVs).

> Not sure if I understand your question correctly, but I think you are
> suggesting checking that "vg->lock_type" is IDM and only then creating
> pv_list.  Will refine the code for this.

Yes that's what I was thinking, thanks.

Dave


