* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
@ 2016-02-10 18:55 Bob Peterson
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 1/6] DLM: Don't create kernel socket until we have valid node address Bob Peterson
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

I've been doing a bunch of recovery testing with DLM and discovered some
issues. This collection of 6 patches addresses those issues. Some of them
are of my own making, introduced by the recent patches that made DLM
print socket connection errors and recover from those errors.

The first patch changes the TCP "connect to sock" function to more closely
match the SCTP version of the function. The idea is to not create a kernel
socket until we have a valid node address, like it does in the SCTP path.
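
As a rough illustration of that ordering (a userspace sketch, not code
from this series): lookup_addr() and connect_to_node() are hypothetical
stand-ins for nodeid_to_addr() and tcp_connect_to_sock(), and malloc()
stands in for sock_create_kern().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a connection; sock is NULL until created. */
struct conn {
	int nodeid;
	void *sock;
};

/* Stand-in for nodeid_to_addr(): pretend only positive IDs resolve. */
static int lookup_addr(int nodeid, int *addr)
{
	if (nodeid <= 0)
		return -1;
	*addr = 0x0a000000 + nodeid;
	return 0;
}

/* Returns 0 on success, a negative value on failure. */
int connect_to_node(struct conn *con)
{
	int addr;

	assert(con != NULL);
	if (con->sock)
		return 0;	/* already connected */

	/* Resolve the address first: nothing to release if this fails. */
	if (lookup_addr(con->nodeid, &addr) < 0)
		return -1;

	/* Only now allocate the resource (malloc models socket creation). */
	con->sock = malloc(16);
	if (!con->sock)
		return -2;
	return 0;
}
```

With this ordering, a failed address lookup leaves con->sock untouched,
so the error path has no half-created socket to clean up.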

The second patch removes a "return" from lowcomms_error_report that should
not be there. The return was causing it to bypass calling the original
error report code, thus skipping an important part in the reporting.

The third patch changes function tcp_create_listen_sock so that its
error path is consistent. Only one of its error paths was setting
con->sock to NULL, but it should be done in both cases.
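
The shape of that change can be sketched in userspace C as follows.
This is only an illustration under assumed names: create_listener(),
its fail_at knob and the released counter are hypothetical, and
malloc()/free() stand in for socket creation and sock_release().

```c
#include <assert.h>
#include <stdlib.h>

struct listener {
	int bound;
	int listening;
};

/*
 * fail_at is a hypothetical test knob: 0 = succeed, 1 = fail at the
 * "bind" step, 2 = fail at the "listen" step. *released counts how
 * many times the shared cleanup ran.
 */
struct listener *create_listener(int fail_at, int *released)
{
	struct listener *l;

	assert(released != NULL);
	l = malloc(sizeof(*l));
	if (!l)
		return NULL;
	l->bound = 0;
	l->listening = 0;

	if (fail_at == 1)
		goto out_err;	/* "bind" failed */
	l->bound = 1;

	if (fail_at == 2)
		goto out_err;	/* "listen" failed */
	l->listening = 1;

	return l;

out_err:
	/* One cleanup sequence shared by every failure point. */
	free(l);
	(*released)++;
	return NULL;
}
```

Because both failures jump to the same label, the two paths cannot
drift apart the way the original bind and listen branches did.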

The fourth patch eliminates a useless goto to make the code clearer.

The fifth patch adds a layer of locking by way of the sk->sk_callback_lock
which is needed to prevent multiple send/receive sockets from
interfering with one another when reporting the socket errors and
subsequent recovery. This makes it similar to how sunrpc handles errors.
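
A simplified userspace sketch of the error-report side of that locking
is shown here. It is an assumption-laden model, not the kernel code: a
pthread mutex stands in for the sk_callback_lock rwlock, and the struct
layouts and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sock;
typedef void (*report_fn)(struct sock *);

struct conn {
	report_fn orig_error_report;	/* handler installed before ours */
};

struct sock {
	pthread_mutex_t callback_lock;	/* models sk_callback_lock */
	struct conn *user_data;		/* models sk_user_data */
	int orig_calls;			/* how often the original handler ran */
};

/* Models the socket's original error-report callback. */
void orig_report(struct sock *sk)
{
	sk->orig_calls++;
}

/* Models the replacement handler that chains to the original one. */
void chained_error_report(struct sock *sk)
{
	report_fn orig = NULL;
	struct conn *con;

	assert(sk != NULL);

	/* Copy the saved pointer while holding the lock... */
	pthread_mutex_lock(&sk->callback_lock);
	con = sk->user_data;
	if (con != NULL)
		orig = con->orig_error_report;
	pthread_mutex_unlock(&sk->callback_lock);

	/* ...and call through the copy only after dropping it. */
	if (orig != NULL)
		orig(sk);
}
```

The handler tolerates user_data being NULL, which is the state a socket
is in once its connection has been detached.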

The sixth and final patch makes the socket error code save and restore
all four callbacks, whereas before we were only saving and restoring the
error report callback.
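
The save/restore idea can be sketched in userspace C like this. The
types below are hypothetical models of struct sock and struct
connection, locking is left out for brevity, and install_callbacks()
is an illustrative name rather than a function from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct sock;
typedef void (*sk_cb)(struct sock *);

struct sock {
	void *user_data;
	sk_cb data_ready, state_change, write_space, error_report;
};

struct conn {
	sk_cb orig_data_ready, orig_state_change;
	sk_cb orig_write_space, orig_error_report;
};

static int default_calls, dlm_calls;

/* Models whatever the network stack installed originally. */
void default_cb(struct sock *sk) { (void)sk; default_calls++; }

/* Models the DLM replacement callbacks. */
void dlm_cb(struct sock *sk) { (void)sk; dlm_calls += 2; }

/* Save all four originals, then install the replacements. */
void install_callbacks(struct conn *con, struct sock *sk)
{
	assert(con != NULL && sk != NULL);

	con->orig_data_ready = sk->data_ready;
	con->orig_state_change = sk->state_change;
	con->orig_write_space = sk->write_space;
	con->orig_error_report = sk->error_report;

	sk->user_data = con;
	sk->data_ready = dlm_cb;
	sk->state_change = dlm_cb;
	sk->write_space = dlm_cb;
	sk->error_report = dlm_cb;
}

/* Put the originals back and detach the connection. */
void restore_callbacks(struct conn *con, struct sock *sk)
{
	assert(con != NULL && sk != NULL);

	sk->user_data = NULL;
	sk->data_ready = con->orig_data_ready;
	sk->state_change = con->orig_state_change;
	sk->write_space = con->orig_write_space;
	sk->error_report = con->orig_error_report;
}
```

After restore_callbacks() the socket no longer points at the connection,
so a late callback cannot reach freed connection state through it.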

Bob Peterson (6):
  DLM: Don't create kernel socket until we have valid node address
  DLM: Call original error report when socket is NULL
  DLM: Make consistent error path through tcp_create_listen_sock
  DLM: Eliminate useless goto
  DLM: Add locking to protect save callback assignments
  DLM: save / restore all socket callbacks

 fs/dlm/lowcomms.c | 103 ++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 77 insertions(+), 26 deletions(-)

-- 
2.5.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Cluster-devel] [DLM PATCH 1/6] DLM: Don't create kernel socket until we have valid node address
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL Bob Peterson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The idea here is to not create a kernel socket until we know we
have a valid node address. That's how it's done in the sctp version.
This patch changes function tcp_connect_to_sock to match the sctp
function more closely.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index dc9ae6d..977c73b 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1087,12 +1087,6 @@ static void tcp_connect_to_sock(struct connection *con)
 	if (con->sock)
 		goto out;
 
-	/* Create a socket to communicate with */
-	result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
-				  SOCK_STREAM, IPPROTO_TCP, &sock);
-	if (result < 0)
-		goto out_err;
-
 	memset(&saddr, 0, sizeof(saddr));
 	result = nodeid_to_addr(con->nodeid, &saddr, NULL, false);
 	if (result < 0) {
@@ -1100,6 +1094,12 @@ static void tcp_connect_to_sock(struct connection *con)
 		goto out_err;
 	}
 
+	/* Create a socket to communicate with */
+	result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
+				  SOCK_STREAM, IPPROTO_TCP, &sock);
+	if (result < 0)
+		goto out_err;
+
 	sock->sk->sk_user_data = con;
 	con->rx_action = receive_from_sock;
 	con->connect_action = tcp_connect_to_sock;
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 1/6] DLM: Don't create kernel socket until we have valid node address Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-11 16:43   ` Andreas Gruenbacher
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock Bob Peterson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch removes a "return" statement from lowcomms_error_report.
The function needs to call the original error report on every path
through it.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 977c73b..e740326 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -478,7 +478,6 @@ static void lowcomms_error_report(struct sock *sk)
 				   "sk_err=%d/%d\n", dlm_our_nodeid(),
 				   con->nodeid, dlm_config.ci_tcp_port,
 				   sk->sk_err, sk->sk_err_soft);
-		return;
 	} else if (saddr.ss_family == AF_INET) {
 		struct sockaddr_in *sin4 = (struct sockaddr_in *)&saddr;
 
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 1/6] DLM: Don't create kernel socket until we have valid node address Bob Peterson
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-11 16:52   ` Andreas Gruenbacher
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto Bob Peterson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Function tcp_create_listen_sock has two error paths. One of them
was setting con->sock to NULL. The other was not. This patch changes
it to be consistent and do the same thing for both error paths.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index e740326..3b780f0 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1199,10 +1199,7 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
 	result = sock->ops->bind(sock, (struct sockaddr *) saddr, addr_len);
 	if (result < 0) {
 		log_print("Can't bind to port %d", dlm_config.ci_tcp_port);
-		sock_release(sock);
-		sock = NULL;
-		con->sock = NULL;
-		goto create_out;
+		goto out_err;
 	}
 	result = kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
 				 (char *)&one, sizeof(one));
@@ -1213,13 +1210,18 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
 	result = sock->ops->listen(sock, 5);
 	if (result < 0) {
 		log_print("Can't listen on port %d", dlm_config.ci_tcp_port);
-		sock_release(sock);
-		sock = NULL;
-		goto create_out;
+		goto out_err;
 	}
 
 create_out:
 	return sock;
+
+out_err:
+	sock_release(sock);
+	sock = NULL;
+	con->sock = NULL;
+
+	goto create_out;
 }
 
 /* Get local addresses */
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
                   ` (2 preceding siblings ...)
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-11 16:53   ` Andreas Gruenbacher
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments Bob Peterson
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch simply removes a goto from function sctp_listen_for_all.
The end result is the same, but the code is more readable.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 3b780f0..ec5087a 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1261,7 +1261,7 @@ static int sctp_listen_for_all(void)
 				  SOCK_STREAM, IPPROTO_SCTP, &sock);
 	if (result < 0) {
 		log_print("Can't create comms socket, check SCTP is loaded");
-		goto out;
+		return result;
 	}
 
 	result = kernel_setsockopt(sock, SOL_SOCKET, SO_RCVBUFFORCE,
@@ -1296,7 +1296,6 @@ static int sctp_listen_for_all(void)
 create_delsock:
 	sock_release(sock);
 	con->sock = NULL;
-out:
 	return result;
 }
 
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
                   ` (3 preceding siblings ...)
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-11 17:04   ` Andreas Gruenbacher
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks Bob Peterson
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This patch adds write_lock_bh locking to several places in the code
that save and restore the socket callbacks.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index ec5087a..4e82285 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -467,10 +467,17 @@ int dlm_lowcomms_connect_node(int nodeid)
 
 static void lowcomms_error_report(struct sock *sk)
 {
-	struct connection *con = sock2con(sk);
+	struct connection *con;
 	struct sockaddr_storage saddr;
 	int buflen;
+	void (*orig_report)(struct sock *) = NULL;
+
+	read_lock_bh(&sk->sk_callback_lock);
+	con = sock2con(sk);
+	if (con == NULL)
+		goto out;
 
+	orig_report = con->orig_error_report;
 	if (con->sock == NULL ||
 	    kernel_getpeername(con->sock, (struct sockaddr *)&saddr, &buflen)) {
 		printk_ratelimited(KERN_ERR "dlm: node %d: socket error "
@@ -500,22 +507,29 @@ static void lowcomms_error_report(struct sock *sk)
 				   dlm_config.ci_tcp_port, sk->sk_err,
 				   sk->sk_err_soft);
 	}
-	con->orig_error_report(sk);
+out:
+	read_unlock_bh(&sk->sk_callback_lock);
+	if (orig_report)
+		orig_report(sk);
 }
 
 /* Make a socket active */
 static void add_sock(struct socket *sock, struct connection *con)
 {
+	struct sock *sk = sock->sk;
+
+	write_lock_bh(&sk->sk_callback_lock);
 	con->sock = sock;
 
 	/* Install a data_ready callback */
-	con->sock->sk->sk_data_ready = lowcomms_data_ready;
-	con->sock->sk->sk_write_space = lowcomms_write_space;
-	con->sock->sk->sk_state_change = lowcomms_state_change;
-	con->sock->sk->sk_user_data = con;
-	con->sock->sk->sk_allocation = GFP_NOFS;
-	con->orig_error_report = con->sock->sk->sk_error_report;
-	con->sock->sk->sk_error_report = lowcomms_error_report;
+	sk->sk_data_ready = lowcomms_data_ready;
+	sk->sk_write_space = lowcomms_write_space;
+	sk->sk_state_change = lowcomms_state_change;
+	sk->sk_user_data = con;
+	sk->sk_allocation = GFP_NOFS;
+	con->orig_error_report = sk->sk_error_report;
+	sk->sk_error_report = lowcomms_error_report;
+	write_unlock_bh(&sk->sk_callback_lock);
 }
 
 /* Add the port number to an IPv6 or 4 sockaddr and return the address
@@ -1274,6 +1288,7 @@ static int sctp_listen_for_all(void)
 	if (result < 0)
 		log_print("Could not set SCTP NODELAY error %d\n", result);
 
+	write_lock_bh(&sock->sk->sk_callback_lock);
 	/* Init con struct */
 	sock->sk->sk_user_data = con;
 	con->sock = sock;
@@ -1281,6 +1296,8 @@ static int sctp_listen_for_all(void)
 	con->rx_action = sctp_accept_from_sock;
 	con->connect_action = sctp_connect_to_sock;
 
+	write_unlock_bh(&sock->sk->sk_callback_lock);
+
 	/* Bind to all addresses. */
 	if (sctp_bind_addrs(con, dlm_config.ci_tcp_port))
 		goto create_delsock;
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
                   ` (4 preceding siblings ...)
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments Bob Peterson
@ 2016-02-10 18:55 ` Bob Peterson
  2016-02-11 15:31   ` Steven Whitehouse
  2016-02-11 17:05 ` [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Andreas Gruenbacher
  2016-02-11 17:22 ` David Teigland
  7 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-10 18:55 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Before this patch, DLM was saving off the original error report
callback before setting its own, but it never restored it. Instead,
we should be saving off all four socket callbacks before changing
them, and then restore them once we're done.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/dlm/lowcomms.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 4e82285..c196c16 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -124,7 +124,10 @@ struct connection {
 	struct connection *othercon;
 	struct work_struct rwork; /* Receive workqueue */
 	struct work_struct swork; /* Send workqueue */
-	void (*orig_error_report)(struct sock *sk);
+	void (*orig_error_report)(struct sock *);
+	void (*orig_data_ready)(struct sock *);
+	void (*orig_state_change)(struct sock *);
+	void (*orig_write_space)(struct sock *);
 };
 #define sock2con(x) ((struct connection *)(x)->sk_user_data)
 
@@ -513,6 +516,34 @@ out:
 		orig_report(sk);
 }
 
+/* Note: sk_callback_lock must be locked before calling this function. */
+static void save_callbacks(struct connection *con, struct sock *sk)
+{
+	if (test_bit(CF_IS_OTHERCON, &con->flags))
+		return;
+	lock_sock(sk);
+	con->orig_data_ready = sk->sk_data_ready;
+	con->orig_state_change = sk->sk_state_change;
+	con->orig_write_space = sk->sk_write_space;
+	con->orig_error_report = sk->sk_error_report;
+	release_sock(sk);
+}
+
+static void restore_callbacks(struct connection *con, struct sock *sk)
+{
+	if (test_bit(CF_IS_OTHERCON, &con->flags))
+		return;
+	write_lock_bh(&sk->sk_callback_lock);
+	lock_sock(sk);
+	sk->sk_user_data = NULL;
+	sk->sk_data_ready = con->orig_data_ready;
+	sk->sk_state_change = con->orig_state_change;
+	sk->sk_write_space = con->orig_write_space;
+	sk->sk_error_report = con->orig_error_report;
+	release_sock(sk);
+	write_unlock_bh(&sk->sk_callback_lock);
+}
+
 /* Make a socket active */
 static void add_sock(struct socket *sock, struct connection *con)
 {
@@ -521,13 +552,13 @@ static void add_sock(struct socket *sock, struct connection *con)
 	write_lock_bh(&sk->sk_callback_lock);
 	con->sock = sock;
 
+	sk->sk_user_data = con;
+	save_callbacks(con, sk);
 	/* Install a data_ready callback */
 	sk->sk_data_ready = lowcomms_data_ready;
 	sk->sk_write_space = lowcomms_write_space;
 	sk->sk_state_change = lowcomms_state_change;
-	sk->sk_user_data = con;
 	sk->sk_allocation = GFP_NOFS;
-	con->orig_error_report = sk->sk_error_report;
 	sk->sk_error_report = lowcomms_error_report;
 	write_unlock_bh(&sk->sk_callback_lock);
 }
@@ -564,6 +595,7 @@ static void close_connection(struct connection *con, bool and_other,
 
 	mutex_lock(&con->sock_mutex);
 	if (con->sock) {
+		restore_callbacks(con, con->sock->sk);
 		sock_release(con->sock);
 		con->sock = NULL;
 	}
@@ -1205,6 +1237,8 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
 	if (result < 0) {
 		log_print("Failed to set SO_REUSEADDR on socket: %d", result);
 	}
+	sock->sk->sk_user_data = con;
+
 	con->rx_action = tcp_accept_from_sock;
 	con->connect_action = tcp_connect_to_sock;
 
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks Bob Peterson
@ 2016-02-11 15:31   ` Steven Whitehouse
  2016-02-11 16:43     ` [Cluster-devel] [DLM PATCH 6/6][try #2] " Bob Peterson
  0 siblings, 1 reply; 22+ messages in thread
From: Steven Whitehouse @ 2016-02-11 15:31 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 10/02/16 18:55, Bob Peterson wrote:
> Before this patch, DLM was saving off the original error report
> callback before setting its own, but it never restored it. Instead,
> we should be saving off all four socket callbacks before changing
> them, and then restore them once we're done.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/dlm/lowcomms.c | 40 +++++++++++++++++++++++++++++++++++++---
>   1 file changed, 37 insertions(+), 3 deletions(-)
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index 4e82285..c196c16 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -124,7 +124,10 @@ struct connection {
>   	struct connection *othercon;
>   	struct work_struct rwork; /* Receive workqueue */
>   	struct work_struct swork; /* Send workqueue */
> -	void (*orig_error_report)(struct sock *sk);
> +	void (*orig_error_report)(struct sock *);
> +	void (*orig_data_ready)(struct sock *);
> +	void (*orig_state_change)(struct sock *);
> +	void (*orig_write_space)(struct sock *);
>   };
>   #define sock2con(x) ((struct connection *)(x)->sk_user_data)
>   
> @@ -513,6 +516,34 @@ out:
>   		orig_report(sk);
>   }
>   
> +/* Note: sk_callback_lock must be locked before calling this function. */
> +static void save_callbacks(struct connection *con, struct sock *sk)
> +{
> +	if (test_bit(CF_IS_OTHERCON, &con->flags))
> +		return;
> +	lock_sock(sk);
> +	con->orig_data_ready = sk->sk_data_ready;
> +	con->orig_state_change = sk->sk_state_change;
> +	con->orig_write_space = sk->sk_write_space;
> +	con->orig_error_report = sk->sk_error_report;
> +	release_sock(sk);
> +}
> +
> +static void restore_callbacks(struct connection *con, struct sock *sk)
> +{
> +	if (test_bit(CF_IS_OTHERCON, &con->flags))
> +		return;
> +	write_lock_bh(&sk->sk_callback_lock);
> +	lock_sock(sk);
> +	sk->sk_user_data = NULL;
> +	sk->sk_data_ready = con->orig_data_ready;
> +	sk->sk_state_change = con->orig_state_change;
> +	sk->sk_write_space = con->orig_write_space;
> +	sk->sk_error_report = con->orig_error_report;
> +	release_sock(sk);
> +	write_unlock_bh(&sk->sk_callback_lock);
> +}
> +
Might be clearer to move the test for CF_IS_OTHERCON outside of these 
functions and into the callers?

Otherwise these patches look like a good set of clean ups,

Steve.

>   /* Make a socket active */
>   static void add_sock(struct socket *sock, struct connection *con)
>   {
> @@ -521,13 +552,13 @@ static void add_sock(struct socket *sock, struct connection *con)
>   	write_lock_bh(&sk->sk_callback_lock);
>   	con->sock = sock;
>   
> +	sk->sk_user_data = con;
> +	save_callbacks(con, sk);
>   	/* Install a data_ready callback */
>   	sk->sk_data_ready = lowcomms_data_ready;
>   	sk->sk_write_space = lowcomms_write_space;
>   	sk->sk_state_change = lowcomms_state_change;
> -	sk->sk_user_data = con;
>   	sk->sk_allocation = GFP_NOFS;
> -	con->orig_error_report = sk->sk_error_report;
>   	sk->sk_error_report = lowcomms_error_report;
>   	write_unlock_bh(&sk->sk_callback_lock);
>   }
> @@ -564,6 +595,7 @@ static void close_connection(struct connection *con, bool and_other,
>   
>   	mutex_lock(&con->sock_mutex);
>   	if (con->sock) {
> +		restore_callbacks(con, con->sock->sk);
>   		sock_release(con->sock);
>   		con->sock = NULL;
>   	}
> @@ -1205,6 +1237,8 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
>   	if (result < 0) {
>   		log_print("Failed to set SO_REUSEADDR on socket: %d", result);
>   	}
> +	sock->sk->sk_user_data = con;
> +
>   	con->rx_action = tcp_accept_from_sock;
>   	con->connect_action = tcp_connect_to_sock;
>   




* [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL Bob Peterson
@ 2016-02-11 16:43   ` Andreas Gruenbacher
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 16:43 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 7:55 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> This patch removes a "return" statement from lowcomms_error_report.
> The function needs to call the original error report on every path
> through it.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>

Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>




* [Cluster-devel] [DLM PATCH 6/6][try #2] DLM: save / restore all socket callbacks
  2016-02-11 15:31   ` Steven Whitehouse
@ 2016-02-11 16:43     ` Bob Peterson
  2016-02-11 17:10       ` Andreas Gruenbacher
  0 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-11 16:43 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi Steve,

----- Original Message -----
> Might be clearer to move the test for CF_IS_OTHERCON outside of these
> functions and into the callers?
> 
> Otherwise these patches look like a good set of clean ups,
> 
> Steve.

Good idea. Here's a replacement patch that implements your suggestion.

Regards,

Bob Peterson
Red Hat File Systems
---
DLM: save / restore all socket callbacks

Before this patch, DLM was saving off the original error report
callback before setting its own, but it never restored it. Instead,
we should be saving off all four socket callbacks before changing
them, and then restore them once we're done.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 4e82285..aa9371e 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -124,7 +124,10 @@ struct connection {
 	struct connection *othercon;
 	struct work_struct rwork; /* Receive workqueue */
 	struct work_struct swork; /* Send workqueue */
-	void (*orig_error_report)(struct sock *sk);
+	void (*orig_error_report)(struct sock *);
+	void (*orig_data_ready)(struct sock *);
+	void (*orig_state_change)(struct sock *);
+	void (*orig_write_space)(struct sock *);
 };
 #define sock2con(x) ((struct connection *)(x)->sk_user_data)
 
@@ -513,6 +516,30 @@ out:
 		orig_report(sk);
 }
 
+/* Note: sk_callback_lock must be locked before calling this function. */
+static void save_callbacks(struct connection *con, struct sock *sk)
+{
+	lock_sock(sk);
+	con->orig_data_ready = sk->sk_data_ready;
+	con->orig_state_change = sk->sk_state_change;
+	con->orig_write_space = sk->sk_write_space;
+	con->orig_error_report = sk->sk_error_report;
+	release_sock(sk);
+}
+
+static void restore_callbacks(struct connection *con, struct sock *sk)
+{
+	write_lock_bh(&sk->sk_callback_lock);
+	lock_sock(sk);
+	sk->sk_user_data = NULL;
+	sk->sk_data_ready = con->orig_data_ready;
+	sk->sk_state_change = con->orig_state_change;
+	sk->sk_write_space = con->orig_write_space;
+	sk->sk_error_report = con->orig_error_report;
+	release_sock(sk);
+	write_unlock_bh(&sk->sk_callback_lock);
+}
+
 /* Make a socket active */
 static void add_sock(struct socket *sock, struct connection *con)
 {
@@ -521,13 +548,14 @@ static void add_sock(struct socket *sock, struct connection *con)
 	write_lock_bh(&sk->sk_callback_lock);
 	con->sock = sock;
 
+	sk->sk_user_data = con;
+	if (!test_bit(CF_IS_OTHERCON, &con->flags))
+		save_callbacks(con, sk);
 	/* Install a data_ready callback */
 	sk->sk_data_ready = lowcomms_data_ready;
 	sk->sk_write_space = lowcomms_write_space;
 	sk->sk_state_change = lowcomms_state_change;
-	sk->sk_user_data = con;
 	sk->sk_allocation = GFP_NOFS;
-	con->orig_error_report = sk->sk_error_report;
 	sk->sk_error_report = lowcomms_error_report;
 	write_unlock_bh(&sk->sk_callback_lock);
 }
@@ -564,6 +592,8 @@ static void close_connection(struct connection *con, bool and_other,
 
 	mutex_lock(&con->sock_mutex);
 	if (con->sock) {
+		if (!test_bit(CF_IS_OTHERCON, &con->flags))
+			restore_callbacks(con, con->sock->sk);
 		sock_release(con->sock);
 		con->sock = NULL;
 	}
@@ -1205,6 +1235,8 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
 	if (result < 0) {
 		log_print("Failed to set SO_REUSEADDR on socket: %d", result);
 	}
+	sock->sk->sk_user_data = con;
+
 	con->rx_action = tcp_accept_from_sock;
 	con->connect_action = tcp_connect_to_sock;
 




* [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock Bob Peterson
@ 2016-02-11 16:52   ` Andreas Gruenbacher
  2016-02-11 17:59     ` Bob Peterson
  0 siblings, 1 reply; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 16:52 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 7:55 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> Function tcp_create_listen_sock has two error paths. One of them
> was setting con->sock to NULL. The other was not. This patch changes
> it to be consistent and do the same thing for both error paths.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>  fs/dlm/lowcomms.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index e740326..3b780f0 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -1199,10 +1199,7 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
>         result = sock->ops->bind(sock, (struct sockaddr *) saddr, addr_len);
>         if (result < 0) {
>                 log_print("Can't bind to port %d", dlm_config.ci_tcp_port);
> -               sock_release(sock);
> -               sock = NULL;
> -               con->sock = NULL;
> -               goto create_out;
> +               goto out_err;
>         }
>         result = kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
>                                  (char *)&one, sizeof(one));
> @@ -1213,13 +1210,18 @@ static struct socket *tcp_create_listen_sock(struct connection *con,
>         result = sock->ops->listen(sock, 5);
>         if (result < 0) {
>                 log_print("Can't listen on port %d", dlm_config.ci_tcp_port);
> -               sock_release(sock);
> -               sock = NULL;
> -               goto create_out;
> +               goto out_err;
>         }
>
>  create_out:
>         return sock;
> +
> +out_err:
> +       sock_release(sock);
> +       sock = NULL;
> +       con->sock = NULL;

Consolidating the error paths makes sense, but con->sock shouldn't be
set here at all; the caller does that in add_sock().

Thanks,
Andreas




* [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto Bob Peterson
@ 2016-02-11 16:53   ` Andreas Gruenbacher
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 16:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 7:55 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> This patch simply removes a goto from function sctp_listen_for_all.
> The end result is the same, but the code is more readable.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>  fs/dlm/lowcomms.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index 3b780f0..ec5087a 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -1261,7 +1261,7 @@ static int sctp_listen_for_all(void)
>                                   SOCK_STREAM, IPPROTO_SCTP, &sock);
>         if (result < 0) {
>                 log_print("Can't create comms socket, check SCTP is loaded");
> -               goto out;
> +               return result;
>         }
>
>         result = kernel_setsockopt(sock, SOL_SOCKET, SO_RCVBUFFORCE,
> @@ -1296,7 +1296,6 @@ static int sctp_listen_for_all(void)
>  create_delsock:
>         sock_release(sock);
>         con->sock = NULL;
> -out:
>         return result;
>  }
>
> --
> 2.5.0
>

This one is obviously correct.

Thanks,
Andreas




* [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments Bob Peterson
@ 2016-02-11 17:04   ` Andreas Gruenbacher
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 17:04 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 7:55 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> This patch adds write_lock_bh locking to several places in the code
> that save and restore the socket callbacks.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>  fs/dlm/lowcomms.c | 35 ++++++++++++++++++++++++++---------
>  1 file changed, 26 insertions(+), 9 deletions(-)
>
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index ec5087a..4e82285 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -467,10 +467,17 @@ int dlm_lowcomms_connect_node(int nodeid)
>
>  static void lowcomms_error_report(struct sock *sk)
>  {
> -       struct connection *con = sock2con(sk);
> +       struct connection *con;
>         struct sockaddr_storage saddr;
>         int buflen;
> +       void (*orig_report)(struct sock *) = NULL;
> +
> +       read_lock_bh(&sk->sk_callback_lock);
> +       con = sock2con(sk);
> +       if (con == NULL)
> +               goto out;
>
> +       orig_report = con->orig_error_report;
>         if (con->sock == NULL ||
>             kernel_getpeername(con->sock, (struct sockaddr *)&saddr, &buflen)) {
>                 printk_ratelimited(KERN_ERR "dlm: node %d: socket error "

This doesn't apply cleanly anymore, but it is trivial to adapt.

> @@ -500,22 +507,29 @@ static void lowcomms_error_report(struct sock *sk)
>                                    dlm_config.ci_tcp_port, sk->sk_err,
>                                    sk->sk_err_soft);
>         }
> -       con->orig_error_report(sk);
> +out:
> +       read_unlock_bh(&sk->sk_callback_lock);
> +       if (orig_report)
> +               orig_report(sk);
>  }
>
>  /* Make a socket active */
>  static void add_sock(struct socket *sock, struct connection *con)
>  {
> +       struct sock *sk = sock->sk;
> +
> +       write_lock_bh(&sk->sk_callback_lock);
>         con->sock = sock;
>
>         /* Install a data_ready callback */
> -       con->sock->sk->sk_data_ready = lowcomms_data_ready;
> -       con->sock->sk->sk_write_space = lowcomms_write_space;
> -       con->sock->sk->sk_state_change = lowcomms_state_change;
> -       con->sock->sk->sk_user_data = con;
> -       con->sock->sk->sk_allocation = GFP_NOFS;
> -       con->orig_error_report = con->sock->sk->sk_error_report;
> -       con->sock->sk->sk_error_report = lowcomms_error_report;
> +       sk->sk_data_ready = lowcomms_data_ready;
> +       sk->sk_write_space = lowcomms_write_space;
> +       sk->sk_state_change = lowcomms_state_change;
> +       sk->sk_user_data = con;
> +       sk->sk_allocation = GFP_NOFS;
> +       con->orig_error_report = sk->sk_error_report;
> +       sk->sk_error_report = lowcomms_error_report;
> +       write_unlock_bh(&sk->sk_callback_lock);
>  }
>
>  /* Add the port number to an IPv6 or 4 sockaddr and return the address
> @@ -1274,6 +1288,7 @@ static int sctp_listen_for_all(void)
>         if (result < 0)
>                 log_print("Could not set SCTP NODELAY error %d\n", result);
>
> +       write_lock_bh(&sock->sk->sk_callback_lock);
>         /* Init con struct */
>         sock->sk->sk_user_data = con;
>         con->sock = sock;
> @@ -1281,6 +1296,8 @@ static int sctp_listen_for_all(void)
>         con->rx_action = sctp_accept_from_sock;
>         con->connect_action = sctp_connect_to_sock;
>
> +       write_unlock_bh(&sock->sk->sk_callback_lock);
> +
>         /* Bind to all addresses. */
>         if (sctp_bind_addrs(con, dlm_config.ci_tcp_port))
>                 goto create_delsock;
> --
> 2.5.0
>

Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
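
For readers following along, the shape of the lowcomms_error_report change
quoted above can be modelled in userspace. This is only a sketch under stated
assumptions: the structures are cut down to the fields involved, and
toy_read_lock/toy_read_unlock are hypothetical stand-ins for
read_lock_bh/read_unlock_bh on sk->sk_callback_lock (a plain counter, not a
real lock). It shows the two properties the hunk is after: a NULL con is
tolerated, and the saved callback is invoked only after the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

struct sock;
typedef void (*report_fn)(struct sock *);

/* Cut-down stand-ins for the kernel structures (illustrative only). */
struct connection {
	report_fn orig_error_report;
};

struct sock {
	int callback_lock_depth;          /* toy stand-in for sk_callback_lock */
	struct connection *sk_user_data;
};

static int reports_seen;              /* times the saved callback ran */
static int depth_seen_by_orig = -1;   /* lock depth observed inside it */

/* Hypothetical helpers: a counter, not a real rwlock. */
static void toy_read_lock(struct sock *sk)
{
	sk->callback_lock_depth++;
}

static void toy_read_unlock(struct sock *sk)
{
	assert(sk->callback_lock_depth > 0);
	sk->callback_lock_depth--;
}

/* Plays the role of the socket's original sk_error_report. */
static void orig_error_report(struct sock *sk)
{
	reports_seen++;
	depth_seen_by_orig = sk->callback_lock_depth;
}

static void lowcomms_error_report(struct sock *sk)
{
	struct connection *con;
	report_fn orig_report = NULL;

	toy_read_lock(sk);
	con = sk->sk_user_data;
	if (con == NULL)
		goto out;
	/* Snapshot the saved callback while the lock is held. */
	orig_report = con->orig_error_report;
	/* The real function logs the peer address and sk_err here. */
out:
	toy_read_unlock(sk);
	/* Call it only after the lock has been dropped. */
	if (orig_report)
		orig_report(sk);
}

/* Raise one error with or without a connection attached. */
int error_report_demo(int with_con)
{
	struct connection con = { orig_error_report };
	struct sock sk = { 0, with_con ? &con : NULL };

	reports_seen = 0;
	depth_seen_by_orig = -1;
	lowcomms_error_report(&sk);
	return reports_seen;
}
```

Calling error_report_demo(1) runs the saved callback once with the toy lock
released; error_report_demo(0) models a socket whose sk_user_data has already
been cleared, in which case nothing is called.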




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
                   ` (5 preceding siblings ...)
  2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks Bob Peterson
@ 2016-02-11 17:05 ` Andreas Gruenbacher
  2016-02-11 17:22 ` David Teigland
  7 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 17:05 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 7:55 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> I've been doing a bunch of recovery testing with DLM and discovered some
> issues. This collection of 6 patches addresses those issues. Some of them
> are of my own making, introduced by the recent patches that made DLM
> print socket connection errors, and recovery from those errors.
>
> The first patch changes the TCP "connect to sock" function to more closely
> match the SCTP version of the function. The idea is to not create a kernel
> socket until we have a valid node address, like it does in the SCTP path.
>
> The second patch removes a "return" from lowcomms_error_report that should
> not be there. The return was causing it to bypass calling the original
> error report code, thus skipping an important part in the reporting.
>
> The third patch changes function tcp_create_listen_sock so that its
> error path is consistent. Only one of its error paths was setting
> con->sock to NULL, but it should be done in both cases.
>
> The fourth patch eliminates a useless goto, to make the code more clear.
>
> The fifth patch adds a layer of locking by way of the sk->sk_callback_lock
> which is needed to prevent multiple send/receive sockets from
> interfering with one another when reporting the socket errors and
> subsequent recovery. This makes it similar to how sunrpc handles errors.
>
> The sixth and final patch makes the socket error code save and restore
> all four callbacks, whereas before we were only saving and restoring the
> error report callback.

This patch set makes removing lockspaces a lot more robust for me. Without
these patches, one test case that regularly triggers NULL pointer
dereferences in callbacks from TCP to DLM is removing a lockspace on three
cluster nodes "simultaneously".

Thanks,
Andreas




* [Cluster-devel] [DLM PATCH 6/6][try #2] DLM: save / restore all socket callbacks
  2016-02-11 16:43     ` [Cluster-devel] [DLM PATCH 6/6][try #2] " Bob Peterson
@ 2016-02-11 17:10       ` Andreas Gruenbacher
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 17:10 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Feb 11, 2016 at 5:43 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> Hi Steve,
>
> ----- Original Message -----
>> Might be clearer to move the test for CF_IS_OTHERCON outside of these
>> functions and into the callers?
>>
>> Otherwise these patches look like a good set of clean ups,
>>
>> Steve.
>
> Good idea. Here's a replacement patch that implements your suggestion.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
> ---
> DLM: save / restore all socket callbacks
>
> Before this patch, DLM was saving off the original error report
> callback before setting its own, but it never restored it. Instead,
> we should be saving off all four socket callbacks before changing
> them, and then restore them once we're done.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>

Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
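
The save/restore idea in the commit message above can be sketched in
userspace as follows. This is a model, not the patch: the structures are cut
down, orig_error_report is the only field name taken from the existing code,
and the other three orig_* fields plus save_callbacks, install_callbacks and
restore_callbacks are hypothetical names chosen for illustration. The four
callbacks are the ones add_sock already touches (sk_data_ready,
sk_state_change, sk_write_space, sk_error_report).

```c
#include <assert.h>
#include <stddef.h>

struct sock;
typedef void (*sk_cb)(struct sock *);

/* Cut-down stand-ins for the kernel structures (illustrative only). */
struct sock {
	sk_cb sk_data_ready;
	sk_cb sk_state_change;
	sk_cb sk_write_space;
	sk_cb sk_error_report;
};

struct connection {
	struct sock *sk;
	sk_cb orig_error_report;   /* name taken from the existing code */
	sk_cb orig_data_ready;     /* hypothetical names for the other three */
	sk_cb orig_state_change;
	sk_cb orig_write_space;
};

static int hits[5];

/* Placeholders for the callbacks the socket had before DLM touched it. */
static void stock_data_ready(struct sock *sk)   { (void)sk; hits[0]++; }
static void stock_state_change(struct sock *sk) { (void)sk; hits[1]++; }
static void stock_write_space(struct sock *sk)  { (void)sk; hits[2]++; }
static void stock_error_report(struct sock *sk) { (void)sk; hits[3]++; }

/* Placeholder for DLM's replacements; one function is enough for the model. */
static void dlm_callback(struct sock *sk)       { (void)sk; hits[4]++; }

/* Remember all four callbacks before changing any of them. */
static void save_callbacks(struct connection *con, struct sock *sk)
{
	con->sk = sk;
	con->orig_data_ready = sk->sk_data_ready;
	con->orig_state_change = sk->sk_state_change;
	con->orig_write_space = sk->sk_write_space;
	con->orig_error_report = sk->sk_error_report;
}

static void install_callbacks(struct sock *sk)
{
	sk->sk_data_ready = dlm_callback;
	sk->sk_state_change = dlm_callback;
	sk->sk_write_space = dlm_callback;
	sk->sk_error_report = dlm_callback;
}

/* Put the socket back the way it was once DLM is done with it. */
static void restore_callbacks(struct connection *con)
{
	struct sock *sk = con->sk;

	assert(sk != NULL);
	sk->sk_data_ready = con->orig_data_ready;
	sk->sk_state_change = con->orig_state_change;
	sk->sk_write_space = con->orig_write_space;
	sk->sk_error_report = con->orig_error_report;
	con->sk = NULL;
}

/* Returns 1 if install took effect and restore undid all four changes. */
int callbacks_round_trip(void)
{
	struct sock sk = { stock_data_ready, stock_state_change,
			   stock_write_space, stock_error_report };
	struct connection con = { NULL, NULL, NULL, NULL, NULL };
	int installed, restored;

	save_callbacks(&con, &sk);
	install_callbacks(&sk);
	installed = sk.sk_data_ready == dlm_callback &&
		    sk.sk_state_change == dlm_callback &&
		    sk.sk_write_space == dlm_callback &&
		    sk.sk_error_report == dlm_callback;

	restore_callbacks(&con);
	restored = sk.sk_data_ready == stock_data_ready &&
		   sk.sk_state_change == stock_state_change &&
		   sk.sk_write_space == stock_write_space &&
		   sk.sk_error_report == stock_error_report;

	return installed && restored;
}
```

In the kernel the save, install and restore steps would presumably run under
sk->sk_callback_lock, as in patch 5/6; the model leaves locking out to keep
the round trip visible.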




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
                   ` (6 preceding siblings ...)
  2016-02-11 17:05 ` [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Andreas Gruenbacher
@ 2016-02-11 17:22 ` David Teigland
  2016-02-11 18:39   ` Bob Peterson
  7 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2016-02-11 17:22 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Feb 10, 2016 at 01:55:26PM -0500, Bob Peterson wrote:
> I've been doing a bunch of recovery testing with DLM and discovered some
> issues. This collection of 6 patches addresses those issues. Some of them
> are of my own making, introduced by the recent patches that made DLM
> print socket connection errors, and recovery from those errors.

Thanks Bob, perhaps I've not been paying close enough attention, but it's
unclear to me how this patch set relates to the most acute issues we have
at the moment, which are the problems introduced here:

  From b3a5bbfd780d9e9291f5f257be06e9ad6db11657 Mon Sep 17 00:00:00 2001
  From: Bob Peterson <rpeterso@redhat.com>
  Date: Thu, 27 Aug 2015 09:34:47 -0500
  Subject: [PATCH] dlm: print error from kernel_sendpage

  Print a dlm-specific error when a socket error occurs
  when sending a dlm message.

  Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  Signed-off-by: David Teigland <teigland@redhat.com>

Could we begin with one patch that's easy to track that directly resolves
the issues with that commit (perhaps even a revert if it's not simple to
fix directly)?  That brings us back to a known-good place, from which we
can look at cleanups and changes.




* [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock
  2016-02-11 16:52   ` Andreas Gruenbacher
@ 2016-02-11 17:59     ` Bob Peterson
  2016-02-11 21:09       ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path Andreas Gruenbacher
  0 siblings, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-11 17:59 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> > +out_err:
> > +       sock_release(sock);
> > +       sock = NULL;
> > +       con->sock = NULL;
> 
> Consolidating the error paths makes sense, but con->sock shouldn't be
> set here at all; the caller does that in add_sock().
> 
> Thanks,
> Andreas
> 

Hi Andreas,

I disagree.

The caller doesn't call add_sock() in the error case, which is where
we're setting con->sock to NULL.

Instead, it returns -EADDRINUSE to its caller, dlm_lowcomms_start, which
jumps to label fail_unlisten. That code looks up the con for nodeid 0 and,
if it finds one, calls close_connection for it, which does some things if
con->sock is not NULL. That path through the code is hard to follow, so
maybe I'm wrong.

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-11 17:22 ` David Teigland
@ 2016-02-11 18:39   ` Bob Peterson
  2016-02-11 18:59     ` David Teigland
  2016-02-15 21:16     ` Bob Peterson
  0 siblings, 2 replies; 22+ messages in thread
From: Bob Peterson @ 2016-02-11 18:39 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> On Wed, Feb 10, 2016 at 01:55:26PM -0500, Bob Peterson wrote:
> > I've been doing a bunch of recovery testing with DLM and discovered some
> > issues. This collection of 6 patches addresses those issues. Some of them
> > are of my own making, introduced by the recent patches that made DLM
> > print socket connection errors, and recovery from those errors.
> 
> Thanks Bob, perhaps I've not been paying close enough attention, but it's
> unclear to me how this patch set relates to the most acute issues we have
> at the moment, which are the problems introduced here:
> 
>   From b3a5bbfd780d9e9291f5f257be06e9ad6db11657 Mon Sep 17 00:00:00 2001
>   From: Bob Peterson <rpeterso@redhat.com>
>   Date: Thu, 27 Aug 2015 09:34:47 -0500
>   Subject: [PATCH] dlm: print error from kernel_sendpage
> 
>   Print a dlm-specific error when a socket error occurs
>   when sending a dlm message.
> 
>   Signed-off-by: Bob Peterson <rpeterso@redhat.com>
>   Signed-off-by: David Teigland <teigland@redhat.com>
> 
> Could we begin with one patch that's easy to track that directly resolves
> the issues with that commit (perhaps even a revert if it's not simple to
> fix directly)?  That brings us back to a known-good place, from which we
> can look at cleanups and changes.
> 
Hi Dave,

My goal has always been to attain stability, which I think I've finally
achieved.

The problem is: While testing the dlm in multiple recovery situations,
Nate and I discovered multiple problems. Until recently, no one has tried
to run recovery tests on an upstream DLM, so I think we're finding some
old bugs that have been there for a while, as well as bugs with b3a5bbfd,
which you mentioned.

I agree that some of these patches might be unnecessary improvements.
I'll try to pare them down to what is absolutely necessary and what
is not. I'll also document exactly why the necessary ones are needed.

I'll also try to post them in order of highest priority and repost
them as individual patches rather than a set.

The recovery tests are somewhat slow, so this will take some time.

BTW, Have you had a chance to look at the patch I posted on 18 January,
titled "DLM: Replace nodeid_to_addr with kernel_getpeername"?
That definitely fixes one bug in patch b3a5bbfd which you mentioned.

I assume you're not suggesting I combine that patch with other patches
to stabilize b3a5bbfd, right? As you well know, this is very touchy
code and it's easier to diagnose and debug a larger number of smaller
patches.

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-11 18:39   ` Bob Peterson
@ 2016-02-11 18:59     ` David Teigland
  2016-02-15 21:16     ` Bob Peterson
  1 sibling, 0 replies; 22+ messages in thread
From: David Teigland @ 2016-02-11 18:59 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Thu, Feb 11, 2016 at 01:39:09PM -0500, Bob Peterson wrote:
> The problem is: While testing the dlm in multiple recovery situations,
> Nate and I discovered multiple problems. Until recently, no one has tried
> to run recovery tests on an upstream DLM,

(Let's distinguish tcp connection testing/recovery vs locking
testing/recovery.  I agree we've never looked at the tcp connections too
much since the node is typically dead anyway.)

> I agree that some of these patches might be unnecessary improvements.
> I'll try to pare them down to what is absolutely necessary and what
> is not. I'll also document exactly why the necessary ones are needed.

Improvements are fine, I was just confused about which were fixes vs
cleanups.

> I'll also try to post them in order of highest priority and repost
> them as individual patches rather than a set.
> 
> The recovery tests are somewhat slow, so this will take some time.
> 
> BTW, Have you had a chance to look at the patch I posted on 18 January,
> titled "DLM: Replace nodeid_to_addr with kernel_getpeername"?
> That definitely fixes one bug in patch b3a5bbfd which you mentioned.

Great, thanks, that's the key one that I'd missed or forgotten.

> I assume you're not suggesting I combine that patch with other patches
> to stabilize b3a5bbfd, right? As you well know, this is very touchy
> code and it's easier to diagnose and debug a larger number of smaller
> patches.

No, I don't have any concerns with the other improvements/fixes you have
since the main issue was fixed in that nodeid_to_addr replacement.




* [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path
  2016-02-11 17:59     ` Bob Peterson
@ 2016-02-11 21:09       ` Andreas Gruenbacher
  0 siblings, 0 replies; 22+ messages in thread
From: Andreas Gruenbacher @ 2016-02-11 21:09 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Bob,

On Thu, Feb 11, 2016 at 6:59 PM, Bob Peterson <rpeterso@redhat.com> wrote:
> ----- Original Message -----
>> > +out_err:
>> > +       sock_release(sock);
>> > +       sock = NULL;
>> > +       con->sock = NULL;
>>
>> Consolidating the error paths makes sense, but con->sock shouldn't be
>> set here at all; the caller does that in add_sock().
>
> I disagree.
>
> The caller doesn't call add_sock() in the error case, which is where
> we're setting con->sock to NULL.

Well, the caller sets con->sock on success, so why doesn't it set it on
failure as well?

Andreas

---
 fs/dlm/lowcomms.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index ac7eae4..0d6e374 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1265,7 +1265,6 @@ create_out:
 out_err:
 	sock_release(sock);
 	sock = NULL;
-	con->sock = NULL;
 
 	goto create_out;
 }
@@ -1352,7 +1351,6 @@ static int tcp_listen_for_all(void)
 {
 	struct socket *sock = NULL;
 	struct connection *con = nodeid2con(0, GFP_NOFS);
-	int result = -EINVAL;
 
 	if (!con)
 		return -ENOMEM;
@@ -1369,13 +1367,11 @@ static int tcp_listen_for_all(void)
 	sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
 	if (sock) {
 		add_sock(sock, con);
-		result = 0;
-	}
-	else {
-		result = -EADDRINUSE;
+		return 0;
+	} else {
+		con->sock = NULL;
+		return -EADDRINUSE;
 	}
-
-	return result;
 }
 
 
-- 
2.5.0




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-11 18:39   ` Bob Peterson
  2016-02-11 18:59     ` David Teigland
@ 2016-02-15 21:16     ` Bob Peterson
  2016-02-15 21:24       ` David Teigland
  1 sibling, 1 reply; 22+ messages in thread
From: Bob Peterson @ 2016-02-15 21:16 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> ----- Original Message -----
> > On Wed, Feb 10, 2016 at 01:55:26PM -0500, Bob Peterson wrote:
> > > I've been doing a bunch of recovery testing with DLM and discovered some
> > > issues. This collection of 6 patches addresses those issues. Some of them
> > > are of my own making, introduced by the recent patches that made DLM
> > > print socket connection errors, and recovery from those errors.
> > 
> > Thanks Bob, perhaps I've not been paying close enough attention, but it's
> > unclear to me how this patch set relates to the most acute issues we have
> > at the moment, which are the problems introduced here:
> > 
> >   From b3a5bbfd780d9e9291f5f257be06e9ad6db11657 Mon Sep 17 00:00:00 2001
> >   From: Bob Peterson <rpeterso@redhat.com>
> >   Date: Thu, 27 Aug 2015 09:34:47 -0500
> >   Subject: [PATCH] dlm: print error from kernel_sendpage
> > 
> >   Print a dlm-specific error when a socket error occurs
> >   when sending a dlm message.
> > 
> >   Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> >   Signed-off-by: David Teigland <teigland@redhat.com>
> > 
> > Could we begin with one patch that's easy to track that directly resolves
> > the issues with that commit (perhaps even a revert if it's not simple to
> > fix directly)?  That brings us back to a known-good place, from which we
> > can look at cleanups and changes.
> > 
> Hi Dave,
> 
> My goal has always been to attain stability, which I think I've finally
> achieved.
> 
> The problem is: While testing the dlm in multiple recovery situations,
> Nate and I discovered multiple problems. Until recently, no one has tried
> to run recovery tests on an upstream DLM, so I think we're finding some
> old bugs that have been there for a while, as well as bugs with b3a5bbfd,
> which you mentioned.
> 
> I agree that some of these patches might be unnecessary improvements.
> I'll try to pare them down to what is absolutely necessary and what
> is not. I'll also document exactly why the necessary ones are needed.

Hi Dave,

Here is some more information on the set of DLM patches I recently posted,
and where things stand:

1. Patch: dlm: print error from kernel_sendpage
   Commit: b3a5bbfd780d9e9291f5f257be06e9ad6db11657
   Advantages: It allows dlm to report socket errors
   Disadvantages: It caused some major problems:
   Problem #1: nodeid_to_addr ends up occasionally being called from softirq
               context, which is a problem because it takes a spinlock.
   Problem #2: The first condition also does "return;" rather than calling
               the original error report. This is a problem because the
               original error report needs to be called to do socket cleanup.
               The sunrpc implementation avoids this by doing that socket
               cleanup manually inside its own error_report function.
   Problem #3: It saves off the sk_error_report callback, but it never
               restores the callback to its original value.
   Problem #4: It only saves off the sk_error_report callback, but not any
               of the other three callbacks. All four really ought to be saved
               and restored once dlm is done with the socket, like sunrpc does.
   Problem #5: If two competing socket errors occur, lowcomms_error_report
               could, in theory, be called twice, causing socket cleanup
               (from the original error_report function) to happen twice,
               which results in a kernel panic (the details of which escape me,
               but I could maybe recreate it).

2. Patch: DLM: Replace nodeid_to_addr with kernel_getpeername
   Advantages: It fixes problem #1 above.
   Disadvantages: It doesn't fix any of the other problems.

3. Patch: DLM: Call original error report when socket is NULL
   Advantages: It fixes problem #2 above.
   Disadvantages: It introduces a new problem below.
   Problem: Error report recursion problem: Depending on timing, if/when
            add_sock is called multiple times for the same socket, it saves
            off the original sk_error_report multiple times. The first time,
            it saves off the proper one and replaces it with lowcomms_error_report.
            The second time, it saves lowcomms_error_report, which means
            when lowcomms_error_report is called the next time, it calls
            itself recursively, without terminating, until the system
            crashes and is fenced.
   NOTE #1: This problem is, in fact, already in the code today, for the other
   two paths through lowcomms_error_report. This patch only makes the first
   path do the same thing. In other words, the problem is already there; this
   patch just makes it a lot more likely to happen.
   NOTE #2: There are two ways to fix it. The first is to make dlm do the
   socket cleanup, like sunrpc does. I don't like that because any cleanup
   introduced in the calling code needs to be echoed to dlm, and whoever
   makes that kind of change won't know to do it.
   The second is to clean up the socket code so it doesn't save itself as the
   original error_report callback, which is what subsequent patches do.

4. Patch: DLM: save / restore all socket callbacks
   Advantages: This tries to fix problems 3 and 4.
   Disadvantages: It has some sock-level locking, but not sk->sk_callback_lock
                  locking like sunrpc has, which means it does not fix
                  problem #5 above.

5. Patch: DLM: Add locking to protect save callback assignments
   Advantages: This tries to fix problem #5 above.
   Disadvantages: None.

6. Patch: DLM: Don't create kernel socket until we have valid node address
   This is a cleanup, unrelated to the others. This makes the TCP code path
   similar to the SCTP code path.

7. Patch: DLM: Make consistent error path through tcp_create_listen_sock
   This is a cleanup, unrelated to the others.

8. Patch: DLM: Eliminate useless goto
   This is a cleanup, unrelated to the others.
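
The error report recursion described under patch 3 in the list above can be
reproduced outside the kernel. The sketch below is a userspace model, not the
lowcomms code: struct sock and struct connection are cut down to the fields
involved, default_error_report stands in for whatever callback the socket
originally had, and MAX_DEPTH is an assumption added so the model terminates
(the kernel has no such cap). add_sock_guarded shows one possible way to
avoid saving our own function as the "original"; it is an illustration and
not necessarily how the later patches do it.

```c
#include <assert.h>
#include <stddef.h>

struct sock;
typedef void (*report_fn)(struct sock *);

/* Cut-down stand-ins for the kernel structures (illustrative only). */
struct connection {
	report_fn orig_error_report;
};

struct sock {
	report_fn sk_error_report;
	struct connection *sk_user_data;
};

static int default_calls;   /* times the "original" report ran */
static int lowcomms_calls;  /* times our replacement ran */

#define MAX_DEPTH 8         /* cap so the model terminates */

static void default_error_report(struct sock *sk)
{
	(void)sk;
	default_calls++;
}

static void lowcomms_error_report(struct sock *sk)
{
	struct connection *con = sk->sk_user_data;

	assert(con != NULL);
	if (++lowcomms_calls >= MAX_DEPTH)
		return;
	con->orig_error_report(sk);
}

/* Unguarded: a second call saves lowcomms_error_report as the "original". */
static void add_sock_unguarded(struct sock *sk, struct connection *con)
{
	sk->sk_user_data = con;
	con->orig_error_report = sk->sk_error_report;
	sk->sk_error_report = lowcomms_error_report;
}

/* Guarded: only save the callback if it is not already ours. */
static void add_sock_guarded(struct sock *sk, struct connection *con)
{
	sk->sk_user_data = con;
	if (sk->sk_error_report != lowcomms_error_report)
		con->orig_error_report = sk->sk_error_report;
	sk->sk_error_report = lowcomms_error_report;
}

/* Install twice, raise one error, return how often our report ran. */
int report_depth(int guarded)
{
	struct connection con = { NULL };
	struct sock sk = { default_error_report, NULL };
	void (*install)(struct sock *, struct connection *) =
		guarded ? add_sock_guarded : add_sock_unguarded;

	default_calls = 0;
	lowcomms_calls = 0;
	install(&sk, &con);
	install(&sk, &con);
	sk.sk_error_report(&sk);
	return lowcomms_calls;
}
```

With the unguarded version the replacement keeps calling itself until the
cap is hit and the original report never runs; with the guard it runs once
and hands off to the original.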

I think the "right thing to do" at this point is this:

1. Patch #1 is already upstream
2. Patch #2 stands on its own, so I think this should go forward.
3. Combine patches 3, 4 and 5, which ought to provide a comprehensive fix
   for the other problems listed in #1.
4. The rest of the patches, I can post as separate patches because they are
   code cleanups, not related to the original problems of #1.

Let me know your thoughts on the subject. If you like this plan, I can
re-test and post replacement patches tomorrow (hopefully).

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
  2016-02-15 21:16     ` Bob Peterson
@ 2016-02-15 21:24       ` David Teigland
  0 siblings, 0 replies; 22+ messages in thread
From: David Teigland @ 2016-02-15 21:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Feb 15, 2016 at 04:16:17PM -0500, Bob Peterson wrote:
> I think the "right thing to do" at this point is this:
> 
> 1. Patch #1 is already upstream
> 2. Patch #2 stands on its own, so I think this should go forward.
> 3. Combine patches 3, 4 and 5, which ought to provide a comprehensive fix
>    for the other problems listed in #1.
> 4. The rest of the patches, I can post as separate patches because they are
>    code cleanups, not related to the original problems of #1.
> 
> Let me know your thoughts on the subject. If you like this plan, I can
> re-test and post replacement patches tomorrow (hopefully).

That sounds good.




Thread overview: 22+ messages
2016-02-10 18:55 [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Bob Peterson
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 1/6] DLM: Don't create kernel socket until we have valid node address Bob Peterson
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 2/6] DLM: Call original error report when socket is NULL Bob Peterson
2016-02-11 16:43   ` Andreas Gruenbacher
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path through tcp_create_listen_sock Bob Peterson
2016-02-11 16:52   ` Andreas Gruenbacher
2016-02-11 17:59     ` Bob Peterson
2016-02-11 21:09       ` [Cluster-devel] [DLM PATCH 3/6] DLM: Make consistent error path Andreas Gruenbacher
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 4/6] DLM: Eliminate useless goto Bob Peterson
2016-02-11 16:53   ` Andreas Gruenbacher
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 5/6] DLM: Add locking to protect save callback assignments Bob Peterson
2016-02-11 17:04   ` Andreas Gruenbacher
2016-02-10 18:55 ` [Cluster-devel] [DLM PATCH 6/6] DLM: save / restore all socket callbacks Bob Peterson
2016-02-11 15:31   ` Steven Whitehouse
2016-02-11 16:43     ` [Cluster-devel] [DLM PATCH 6/6][try #2] " Bob Peterson
2016-02-11 17:10       ` Andreas Gruenbacher
2016-02-11 17:05 ` [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors Andreas Gruenbacher
2016-02-11 17:22 ` David Teigland
2016-02-11 18:39   ` Bob Peterson
2016-02-11 18:59     ` David Teigland
2016-02-15 21:16     ` Bob Peterson
2016-02-15 21:24       ` David Teigland
