[Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes
@ 2021-02-17  0:00 Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 01/12] fs: dlm: fix debugfs dump Alexander Aring
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

this patch series contains some of the patches of the previous patch series
to fix dlm message drops if reconnects occurs. These patches have
less to do something to introduce the reliable midcomms layer so I split
it off the patch series to have them at first upstream which makes the
new feature easier to review.

However I did some changes:

- fix "fs: dlm: remove unaligned memory access handling" to use the
  correct pointer.
- add WARN_ON(1) in "fs: dlm: change allocation limits" to see where an
  illegal allocation size was coming from.
- add patch "fs: dlm: check on dlm header size" from changes of
  "fs: dlm: remove unaligned memory access handling"

- Alex

changes since v2:
 - fix length check in dlm processing loop 
 - add "fs: dlm: fix debugfs dump" patch
 - add "fs: dlm: flush swork on shutdown" patch
 - add "fs: dlm: add shutdown hook"

About the shutdown hook:

The lowcomms layer will synchronize workqueue at "stop" hook and we
already released the lockspace. I introduce the shutdown callback
to synchronize the workqueue/tcp before releasing the lockspace
but after stopping the recoverd thread. I hope at this point we
don't transmit anymore (calling dlm_lowcomms_commit_buffer()) from
dlm application layer and we are still capable to deliver dlm messages
from lowcomms to application. However the current code definitely has
some issues there which I expirence with tcpkill and reliable
connection patches. I need to recheck if DLM_FIN is really necessary.

Alexander Aring (12):
  fs: dlm: fix debugfs dump
  fs: dlm: set connected bit after accept
  fs: dlm: set subclass for othercon sock_mutex
  fs: dlm: add errno handling to check callback
  fs: dlm: add check if dlm is currently running
  fs: dlm: change allocation limits
  fs: dlm: use GFP_ZERO for page buffer
  fs: dlm: simplify writequeue handling
  fs: dlm: check on minimum msglen size
  fs: dlm: remove unaligned memory access handling
  fs: dlm: flush swork on shutdown
  fs: dlm: add shutdown hook

 fs/dlm/config.c    |  57 +++++++++++++++---
 fs/dlm/debug_fs.c  |   1 +
 fs/dlm/lock.c      |   2 -
 fs/dlm/lockspace.c |   3 +
 fs/dlm/lowcomms.c  | 144 ++++++++++++++++++++++++---------------------
 fs/dlm/lowcomms.h  |   4 ++
 fs/dlm/midcomms.c  |  33 +++++------
 fs/dlm/rcom.c      |   2 -
 8 files changed, 148 insertions(+), 98 deletions(-)

-- 
2.26.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 01/12] fs: dlm: fix debugfs dump
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 02/12] fs: dlm: set connected bit after accept Alexander Aring
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch fixes the following message which randomly pops up during
glocktop call:

seq_file: buggy .next function table_seq_next did not update position index

The issue is that seq_read_iter() in fs/seq_file.c also needs an
increment of the index in an non next record case as well which this
patch fixes otherwise seq_read_iter() will print out the above message.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/debug_fs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/dlm/debug_fs.c b/fs/dlm/debug_fs.c
index d6bbccb0ed15..d5bd990bcab8 100644
--- a/fs/dlm/debug_fs.c
+++ b/fs/dlm/debug_fs.c
@@ -542,6 +542,7 @@ static void *table_seq_next(struct seq_file *seq, void *iter_ptr, loff_t *pos)
 
 		if (bucket >= ls->ls_rsbtbl_size) {
 			kfree(ri);
+			++*pos;
 			return NULL;
 		}
 		tree = toss ? &ls->ls_rsbtbl[bucket].toss : &ls->ls_rsbtbl[bucket].keep;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 02/12] fs: dlm: set connected bit after accept
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 01/12] fs: dlm: fix debugfs dump Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 03/12] fs: dlm: set subclass for othercon sock_mutex Alexander Aring
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch sets the CF_CONNECTED bit when dlm accepts a connection from
another node. If we don't set this bit, next time if the connection
socket gets writable it will assume an event that the connection is
successfully connected. However that is only the case when the
connection did a connect.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 372c34ff8594..2fd1e4c13663 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -930,6 +930,7 @@ static int accept_from_sock(struct listen_connection *con)
 		addcon = newcon;
 	}
 
+	set_bit(CF_CONNECTED, &addcon->flags);
 	mutex_unlock(&newcon->sock_mutex);
 
 	/*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 03/12] fs: dlm: set subclass for othercon sock_mutex
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 01/12] fs: dlm: fix debugfs dump Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 02/12] fs: dlm: set connected bit after accept Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 04/12] fs: dlm: add errno handling to check callback Alexander Aring
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch sets the lockdep subclass for the othercon socket mutex. In
various places the connection socket mutex is held while locking the
othercon socket mutex. This patch will remove lockdep warnings when such
case occurs.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 2fd1e4c13663..d772e1d4461d 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -911,13 +911,14 @@ static int accept_from_sock(struct listen_connection *con)
 				goto accept_err;
 			}
 
+			lockdep_set_subclass(&othercon->sock_mutex, 1);
 			newcon->othercon = othercon;
 		} else {
 			/* close other sock con if we have something new */
 			close_connection(othercon, false, true, false);
 		}
 
-		mutex_lock_nested(&othercon->sock_mutex, 1);
+		mutex_lock(&othercon->sock_mutex);
 		add_sock(newsock, othercon);
 		addcon = othercon;
 		mutex_unlock(&othercon->sock_mutex);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 04/12] fs: dlm: add errno handling to check callback
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (2 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 03/12] fs: dlm: set subclass for othercon sock_mutex Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 05/12] fs: dlm: add check if dlm is currently running Alexander Aring
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This allows to return individual errno values for the config attribute
check callback instead of returning invalid argument only.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/config.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 49c5f9407098..73e6643903af 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -125,7 +125,7 @@ static ssize_t cluster_cluster_name_store(struct config_item *item,
 CONFIGFS_ATTR(cluster_, cluster_name);
 
 static ssize_t cluster_set(struct dlm_cluster *cl, unsigned int *cl_field,
-			   int *info_field, bool (*check_cb)(unsigned int x),
+			   int *info_field, int (*check_cb)(unsigned int x),
 			   const char *buf, size_t len)
 {
 	unsigned int x;
@@ -137,8 +137,11 @@ static ssize_t cluster_set(struct dlm_cluster *cl, unsigned int *cl_field,
 	if (rc)
 		return rc;
 
-	if (check_cb && check_cb(x))
-		return -EINVAL;
+	if (check_cb) {
+		rc = check_cb(x);
+		if (rc)
+			return rc;
+	}
 
 	*cl_field = x;
 	*info_field = x;
@@ -161,14 +164,20 @@ static ssize_t cluster_##name##_show(struct config_item *item, char *buf)     \
 }                                                                             \
 CONFIGFS_ATTR(cluster_, name);
 
-static bool dlm_check_zero(unsigned int x)
+static int dlm_check_zero(unsigned int x)
 {
-	return !x;
+	if (!x)
+		return -EINVAL;
+
+	return 0;
 }
 
-static bool dlm_check_buffer_size(unsigned int x)
+static int dlm_check_buffer_size(unsigned int x)
 {
-	return (x < DEFAULT_BUFFER_SIZE);
+	if (x < DEFAULT_BUFFER_SIZE)
+		return -EINVAL;
+
+	return 0;
 }
 
 CLUSTER_ATTR(tcp_port, dlm_check_zero);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 05/12] fs: dlm: add check if dlm is currently running
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (3 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 04/12] fs: dlm: add errno handling to check callback Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 06/12] fs: dlm: change allocation limits Alexander Aring
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch adds checks for dlm config attributes regarding to protocol
parameters as it makes only sense to change them when dlm is not running.
It also adds a check for valid protocol specifiers and return invalid
argument if they are not supported.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/config.c   | 34 ++++++++++++++++++++++++++++++++--
 fs/dlm/lowcomms.c |  2 +-
 fs/dlm/lowcomms.h |  3 +++
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 73e6643903af..ab26cf135710 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -164,6 +164,36 @@ static ssize_t cluster_##name##_show(struct config_item *item, char *buf)     \
 }                                                                             \
 CONFIGFS_ATTR(cluster_, name);
 
+static int dlm_check_protocol_and_dlm_running(unsigned int x)
+{
+	switch (x) {
+	case 0:
+		/* TCP */
+		break;
+	case 1:
+		/* SCTP */
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (dlm_allow_conn)
+		return -EBUSY;
+
+	return 0;
+}
+
+static int dlm_check_zero_and_dlm_running(unsigned int x)
+{
+	if (!x)
+		return -EINVAL;
+
+	if (dlm_allow_conn)
+		return -EBUSY;
+
+	return 0;
+}
+
 static int dlm_check_zero(unsigned int x)
 {
 	if (!x)
@@ -180,7 +210,7 @@ static int dlm_check_buffer_size(unsigned int x)
 	return 0;
 }
 
-CLUSTER_ATTR(tcp_port, dlm_check_zero);
+CLUSTER_ATTR(tcp_port, dlm_check_zero_and_dlm_running);
 CLUSTER_ATTR(buffer_size, dlm_check_buffer_size);
 CLUSTER_ATTR(rsbtbl_size, dlm_check_zero);
 CLUSTER_ATTR(recover_timer, dlm_check_zero);
@@ -188,7 +218,7 @@ CLUSTER_ATTR(toss_secs, dlm_check_zero);
 CLUSTER_ATTR(scan_secs, dlm_check_zero);
 CLUSTER_ATTR(log_debug, NULL);
 CLUSTER_ATTR(log_info, NULL);
-CLUSTER_ATTR(protocol, NULL);
+CLUSTER_ATTR(protocol, dlm_check_protocol_and_dlm_running);
 CLUSTER_ATTR(mark, NULL);
 CLUSTER_ATTR(timewarn_cs, dlm_check_zero);
 CLUSTER_ATTR(waitwarn_us, NULL);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index d772e1d4461d..d25b9132c593 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -134,7 +134,7 @@ static DEFINE_SPINLOCK(dlm_node_addrs_spin);
 static struct listen_connection listen_con;
 static struct sockaddr_storage *dlm_local_addr[DLM_MAX_ADDR_COUNT];
 static int dlm_local_count;
-static int dlm_allow_conn;
+int dlm_allow_conn;
 
 /* Work queues */
 static struct workqueue_struct *recv_workqueue;
diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h
index 0918f9376489..f74888ed43b4 100644
--- a/fs/dlm/lowcomms.h
+++ b/fs/dlm/lowcomms.h
@@ -14,6 +14,9 @@
 
 #define LOWCOMMS_MAX_TX_BUFFER_LEN	4096
 
+/* switch to check if dlm is running */
+extern int dlm_allow_conn;
+
 int dlm_lowcomms_start(void);
 void dlm_lowcomms_stop(void);
 void dlm_lowcomms_exit(void);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 06/12] fs: dlm: change allocation limits
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (4 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 05/12] fs: dlm: add check if dlm is currently running Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 07/12] fs: dlm: use GFP_ZERO for page buffer Alexander Aring
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

While running tcpkill I experienced invalid header length values while
receiving to check that a node doesn't try to send a invalid dlm message
we also check on applications minimum allocation limit. Also use
DEFAULT_BUFFER_SIZE as maximum allocation limit. The define
LOWCOMMS_MAX_TX_BUFFER_LEN is to calculate maximum buffer limits on
application layer, future midcomms layer will subtract their needs from
this define.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index d25b9132c593..cef45239a5c0 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1357,9 +1357,11 @@ void *dlm_lowcomms_get_buffer(int nodeid, int len, gfp_t allocation, char **ppc)
 	struct writequeue_entry *e;
 	int offset = 0;
 
-	if (len > LOWCOMMS_MAX_TX_BUFFER_LEN) {
-		BUILD_BUG_ON(PAGE_SIZE < LOWCOMMS_MAX_TX_BUFFER_LEN);
+	if (len > DEFAULT_BUFFER_SIZE ||
+	    len < sizeof(struct dlm_header)) {
+		BUILD_BUG_ON(PAGE_SIZE < DEFAULT_BUFFER_SIZE);
 		log_print("failed to allocate a buffer of size %d", len);
+		WARN_ON(1);
 		return NULL;
 	}
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 07/12] fs: dlm: use GFP_ZERO for page buffer
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (5 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 06/12] fs: dlm: change allocation limits Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 08/12] fs: dlm: simplify writequeue handling Alexander Aring
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch uses GFP_ZERO for allocate a page for the internal dlm
sending buffer allocator instead of calling memset zero after every
allocation. An already allocated space will never be reused again.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lock.c     | 2 --
 fs/dlm/lowcomms.c | 2 +-
 fs/dlm/rcom.c     | 2 --
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 002123efc6b0..b93df39d0915 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3541,8 +3541,6 @@ static int _create_message(struct dlm_ls *ls, int mb_len,
 	if (!mh)
 		return -ENOBUFS;
 
-	memset(mb, 0, mb_len);
-
 	ms = (struct dlm_message *) mb;
 
 	ms->m_header.h_version = (DLM_HEADER_MAJOR | DLM_HEADER_MINOR);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index cef45239a5c0..e0e74ee82a21 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1336,7 +1336,7 @@ static struct writequeue_entry *new_writequeue_entry(struct connection *con,
 	if (!entry)
 		return NULL;
 
-	entry->page = alloc_page(allocation);
+	entry->page = alloc_page(allocation | __GFP_ZERO);
 	if (!entry->page) {
 		kfree(entry);
 		return NULL;
diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c
index 73ddee5159d7..f5b1bd65728d 100644
--- a/fs/dlm/rcom.c
+++ b/fs/dlm/rcom.c
@@ -41,7 +41,6 @@ static int create_rcom(struct dlm_ls *ls, int to_nodeid, int type, int len,
 			  to_nodeid, type, len);
 		return -ENOBUFS;
 	}
-	memset(mb, 0, mb_len);
 
 	rc = (struct dlm_rcom *) mb;
 
@@ -462,7 +461,6 @@ int dlm_send_ls_not_ready(int nodeid, struct dlm_rcom *rc_in)
 	mh = dlm_lowcomms_get_buffer(nodeid, mb_len, GFP_NOFS, &mb);
 	if (!mh)
 		return -ENOBUFS;
-	memset(mb, 0, mb_len);
 
 	rc = (struct dlm_rcom *) mb;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 08/12] fs: dlm: simplify writequeue handling
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (6 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 07/12] fs: dlm: use GFP_ZERO for page buffer Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 09/12] fs: dlm: check on minimum msglen size Alexander Aring
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch cleans up the current dlm sending allocator handling by using
some named macros, list functionality and removes some goto statements.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 83 ++++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index e0e74ee82a21..d9784ff0ca30 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -102,6 +102,9 @@ struct listen_connection {
 	struct work_struct rwork;
 };
 
+#define DLM_WQ_REMAIN_BYTES(e) (PAGE_SIZE - e->end)
+#define DLM_WQ_LENGTH_BYTES(e) (e->end - e->offset)
+
 /* An entry waiting to be sent */
 struct writequeue_entry {
 	struct list_head list;
@@ -1332,7 +1335,7 @@ static struct writequeue_entry *new_writequeue_entry(struct connection *con,
 {
 	struct writequeue_entry *entry;
 
-	entry = kmalloc(sizeof(struct writequeue_entry), allocation);
+	entry = kzalloc(sizeof(*entry), allocation);
 	if (!entry)
 		return NULL;
 
@@ -1342,20 +1345,48 @@ static struct writequeue_entry *new_writequeue_entry(struct connection *con,
 		return NULL;
 	}
 
-	entry->offset = 0;
-	entry->len = 0;
-	entry->end = 0;
-	entry->users = 0;
 	entry->con = con;
+	entry->users = 1;
 
 	return entry;
 }
 
+static struct writequeue_entry *new_wq_entry(struct connection *con, int len,
+					     gfp_t allocation, char **ppc)
+{
+	struct writequeue_entry *e;
+
+	spin_lock(&con->writequeue_lock);
+	if (!list_empty(&con->writequeue)) {
+		e = list_last_entry(&con->writequeue, struct writequeue_entry, list);
+		if (DLM_WQ_REMAIN_BYTES(e) >= len) {
+			*ppc = page_address(e->page) + e->end;
+			e->end += len;
+			e->users++;
+			spin_unlock(&con->writequeue_lock);
+
+			return e;
+		}
+	}
+	spin_unlock(&con->writequeue_lock);
+
+	e = new_writequeue_entry(con, allocation);
+	if (!e)
+		return NULL;
+
+	*ppc = page_address(e->page);
+	e->end += len;
+
+	spin_lock(&con->writequeue_lock);
+	list_add_tail(&e->list, &con->writequeue);
+	spin_unlock(&con->writequeue_lock);
+
+	return e;
+};
+
 void *dlm_lowcomms_get_buffer(int nodeid, int len, gfp_t allocation, char **ppc)
 {
 	struct connection *con;
-	struct writequeue_entry *e;
-	int offset = 0;
 
 	if (len > DEFAULT_BUFFER_SIZE ||
 	    len < sizeof(struct dlm_header)) {
@@ -1369,35 +1400,7 @@ void *dlm_lowcomms_get_buffer(int nodeid, int len, gfp_t allocation, char **ppc)
 	if (!con)
 		return NULL;
 
-	spin_lock(&con->writequeue_lock);
-	e = list_entry(con->writequeue.prev, struct writequeue_entry, list);
-	if ((&e->list == &con->writequeue) ||
-	    (PAGE_SIZE - e->end < len)) {
-		e = NULL;
-	} else {
-		offset = e->end;
-		e->end += len;
-		e->users++;
-	}
-	spin_unlock(&con->writequeue_lock);
-
-	if (e) {
-	got_one:
-		*ppc = page_address(e->page) + offset;
-		return e;
-	}
-
-	e = new_writequeue_entry(con, allocation);
-	if (e) {
-		spin_lock(&con->writequeue_lock);
-		offset = e->end;
-		e->end += len;
-		e->users++;
-		list_add_tail(&e->list, &con->writequeue);
-		spin_unlock(&con->writequeue_lock);
-		goto got_one;
-	}
-	return NULL;
+	return new_wq_entry(con, len, allocation, ppc);
 }
 
 void dlm_lowcomms_commit_buffer(void *mh)
@@ -1410,7 +1413,8 @@ void dlm_lowcomms_commit_buffer(void *mh)
 	users = --e->users;
 	if (users)
 		goto out;
-	e->len = e->end - e->offset;
+
+	e->len = DLM_WQ_LENGTH_BYTES(e);
 	spin_unlock(&con->writequeue_lock);
 
 	queue_work(send_workqueue, &con->swork);
@@ -1436,11 +1440,10 @@ static void send_to_sock(struct connection *con)
 
 	spin_lock(&con->writequeue_lock);
 	for (;;) {
-		e = list_entry(con->writequeue.next, struct writequeue_entry,
-			       list);
-		if ((struct list_head *) e == &con->writequeue)
+		if (list_empty(&con->writequeue))
 			break;
 
+		e = list_first_entry(&con->writequeue, struct writequeue_entry, list);
 		len = e->len;
 		offset = e->offset;
 		BUG_ON(len == 0 && e->users == 0);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 09/12] fs: dlm: check on minimum msglen size
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (7 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 08/12] fs: dlm: simplify writequeue handling Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 10/12] fs: dlm: remove unaligned memory access handling Alexander Aring
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch adds an additional check for minimum dlm header size which is
an invalid dlm message and signals a broken stream. A msglen field cannot
be less than the dlm header size because the field is inclusive header
lengths.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/midcomms.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index fde3a6afe4be..0bedfa8606a2 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -49,9 +49,10 @@ int dlm_process_incoming_buffer(int nodeid, unsigned char *buf, int len)
 		 * cannot deliver this message to upper layers
 		 */
 		msglen = get_unaligned_le16(&hd->h_length);
-		if (msglen > DEFAULT_BUFFER_SIZE) {
-			log_print("received invalid length header: %u, will abort message parsing",
-				  msglen);
+		if (msglen > DEFAULT_BUFFER_SIZE ||
+		    msglen < sizeof(struct dlm_header)) {
+			log_print("received invalid length header: %u from node %d, will abort message parsing",
+				  msglen, nodeid);
 			return -EBADMSG;
 		}
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 10/12] fs: dlm: remove unaligned memory access handling
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (8 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 09/12] fs: dlm: check on minimum msglen size Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 11/12] fs: dlm: flush swork on shutdown Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 12/12] fs: dlm: add shutdown hook Alexander Aring
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch removes unaligned memory access handling for receiving
midcomms messages. This handling will not fix the unaligned memory
access in general. All messages should be length aligned to 8 bytes,
there exists cases where this isn't the case. It's part of the sending
handling to not send such messages. As the sending handling itself, with
the internal allocator of page buffers, can occur in unaligned memory
access of dlm message fields we just ignore that problem for now as it
seems this code is used by architecture which can handle it.

This patch adds a comment to take care about that problem in a major
bump of dlm protocol.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/midcomms.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index 0bedfa8606a2..1c6654a21ec4 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -22,8 +22,6 @@
  * into packets and sends them to the comms layer.
  */
 
-#include <asm/unaligned.h>
-
 #include "dlm_internal.h"
 #include "lowcomms.h"
 #include "config.h"
@@ -45,10 +43,18 @@ int dlm_process_incoming_buffer(int nodeid, unsigned char *buf, int len)
 	while (len >= sizeof(struct dlm_header)) {
 		hd = (struct dlm_header *)ptr;
 
-		/* no message should be more than this otherwise we
-		 * cannot deliver this message to upper layers
+		/* no message should be more than DEFAULT_BUFFER_SIZE or
+		 * less than dlm_header size.
+		 *
+		 * Some messages does not have a 8 byte length boundary yet
+		 * which can occur in a unaligned memory access of some dlm
+		 * messages. However this problem need to be fixed at the
+		 * sending side, for now it seems nobody run into architecture
+		 * related issues yet but it slows down some processing.
+		 * Fixing this issue should be scheduled in future by doing
+		 * the next major version bump.
 		 */
-		msglen = get_unaligned_le16(&hd->h_length);
+		msglen = le16_to_cpu(hd->h_length);
 		if (msglen > DEFAULT_BUFFER_SIZE ||
 		    msglen < sizeof(struct dlm_header)) {
 			log_print("received invalid length header: %u from node %d, will abort message parsing",
@@ -85,15 +91,7 @@ int dlm_process_incoming_buffer(int nodeid, unsigned char *buf, int len)
 			goto skip;
 		}
 
-		/* for aligned memory access, we just copy current message
-		 * to begin of the buffer which contains already parsed buffer
-		 * data and should provide align access for upper layers
-		 * because the start address of the buffer has a aligned
-		 * address. This memmove can be removed when the upperlayer
-		 * is capable of unaligned memory access.
-		 */
-		memmove(buf, ptr, msglen);
-		dlm_receive_buffer((union dlm_packet *)buf, nodeid);
+		dlm_receive_buffer((union dlm_packet *)ptr, nodeid);
 
 skip:
 		ret += msglen;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 11/12] fs: dlm: flush swork on shutdown
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (9 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 10/12] fs: dlm: remove unaligned memory access handling Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 12/12] fs: dlm: add shutdown hook Alexander Aring
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch fixes the flushing of send work before shutdown. The function
cancel_work_sync() is not the right workqueue functionality to use here
as it would cancel the work if the work queues itself. In cases of
EAGAIN in send() for dlm message we need to be sure that everything is
send out before. The function flush_work() will ensure that every send
work is be done inclusive in EAGAIN cases.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lowcomms.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index d9784ff0ca30..e6652189a282 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -688,10 +688,7 @@ static void shutdown_connection(struct connection *con)
 {
 	int ret;
 
-	if (cancel_work_sync(&con->swork)) {
-		log_print("canceled swork for node %d", con->nodeid);
-		clear_bit(CF_WRITE_PENDING, &con->flags);
-	}
+	flush_work(&con->swork);
 
 	mutex_lock(&con->sock_mutex);
 	/* nothing to shutdown */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Cluster-devel] [PATCHv2 dlm/next 12/12] fs: dlm: add shutdown hook
  2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
                   ` (10 preceding siblings ...)
  2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 11/12] fs: dlm: flush swork on shutdown Alexander Aring
@ 2021-02-17  0:00 ` Alexander Aring
  11 siblings, 0 replies; 13+ messages in thread
From: Alexander Aring @ 2021-02-17  0:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch fixes issues which occurs when dlm lowcomms synchronize their
workqueues but dlm application layer already released the lockspace. In
such cases messages like:

dlm: gfs2: release_lockspace final free
dlm: invalid lockspace 3841231384 from 1 cmd 1 type 11

are printed on the kernel log. This patch is solving this issue by
introducing a new "shutdown" hook before calling "stop" hook when the
lockspace is going to be released finally. This should pretend any
dlm messages sitting in the workqueues during or after lockspace
removal.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/lockspace.c |  3 +++
 fs/dlm/lowcomms.c  | 42 +++++++++++++++++++++++-------------------
 fs/dlm/lowcomms.h  |  1 +
 3 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 561dcad08ad6..9d52384e6100 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -788,6 +788,9 @@ static int release_lockspace(struct dlm_ls *ls, int force)
 
 	dlm_recoverd_stop(ls);
 
+	if (ls_count == 1)
+		dlm_lowcomms_shutdown();
+
 	dlm_callback_stop(ls);
 
 	remove_lockspace(ls);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index e6652189a282..92b27da6f73e 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1593,6 +1593,29 @@ static int work_start(void)
 	return 0;
 }
 
+static void shutdown_conn(struct connection *con)
+{
+	if (con->shutdown_action)
+		con->shutdown_action(con);
+}
+
+void dlm_lowcomms_shutdown(void)
+{
+	/* Set all the flags to prevent any
+	 * socket activity.
+	 */
+	dlm_allow_conn = 0;
+
+	if (recv_workqueue)
+		flush_workqueue(recv_workqueue);
+	if (send_workqueue)
+		flush_workqueue(send_workqueue);
+
+	dlm_close_sock(&listen_con.sock);
+
+	foreach_conn(shutdown_conn);
+}
+
 static void _stop_conn(struct connection *con, bool and_other)
 {
 	mutex_lock(&con->sock_mutex);
@@ -1614,12 +1637,6 @@ static void stop_conn(struct connection *con)
 	_stop_conn(con, true);
 }
 
-static void shutdown_conn(struct connection *con)
-{
-	if (con->shutdown_action)
-		con->shutdown_action(con);
-}
-
 static void connection_release(struct rcu_head *rcu)
 {
 	struct connection *con = container_of(rcu, struct connection, rcu);
@@ -1676,19 +1693,6 @@ static void work_flush(void)
 
 void dlm_lowcomms_stop(void)
 {
-	/* Set all the flags to prevent any
-	   socket activity.
-	*/
-	dlm_allow_conn = 0;
-
-	if (recv_workqueue)
-		flush_workqueue(recv_workqueue);
-	if (send_workqueue)
-		flush_workqueue(send_workqueue);
-
-	dlm_close_sock(&listen_con.sock);
-
-	foreach_conn(shutdown_conn);
 	work_flush();
 	foreach_conn(free_conn);
 	work_stop();
diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h
index f74888ed43b4..98bf93ab1fdc 100644
--- a/fs/dlm/lowcomms.h
+++ b/fs/dlm/lowcomms.h
@@ -18,6 +18,7 @@
 extern int dlm_allow_conn;
 
 int dlm_lowcomms_start(void);
+void dlm_lowcomms_shutdown(void);
 void dlm_lowcomms_stop(void);
 void dlm_lowcomms_exit(void);
 int dlm_lowcomms_close(int nodeid);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-02-17  0:00 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-17  0:00 [Cluster-devel] [PATCHv2 dlm/next 00/12] fs: dlm: lowcomms and midcomms changes Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 01/12] fs: dlm: fix debugfs dump Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 02/12] fs: dlm: set connected bit after accept Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 03/12] fs: dlm: set subclass for othercon sock_mutex Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 04/12] fs: dlm: add errno handling to check callback Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 05/12] fs: dlm: add check if dlm is currently running Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 06/12] fs: dlm: change allocation limits Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 07/12] fs: dlm: use GFP_ZERO for page buffer Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 08/12] fs: dlm: simplify writequeue handling Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 09/12] fs: dlm: check on minimum msglen size Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 10/12] fs: dlm: remove unaligned memory access handling Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 11/12] fs: dlm: flush swork on shutdown Alexander Aring
2021-02-17  0:00 ` [Cluster-devel] [PATCHv2 dlm/next 12/12] fs: dlm: add shutdown hook Alexander Aring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.