[GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54]

* [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54]
@ 2007-02-05 14:07 Steven Whitehouse
  2007-02-05 14:09 ` [GFS2] don't try to lockfs after shutdown [1/54] Steven Whitehouse
                   ` (54 more replies)
  0 siblings, 55 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:07 UTC
  To: linux-kernel, cluster-devel

Hi,

Following this message are the patches which currently make up the GFS2
-nmw (next merge window) git tree. Mostly they are bug fixes and
cleanups, but there are one or two other changes, the highlights being:

 [26/54] Changes "writeback" mounts to not require buffer heads in
writepages and also results in fewer I/Os.
 [29/54] This is probably the most notable since it effectively shrinks
the amount of memory per struct gfs2_inode by half.
 [27/54] Removes the attempt at directory readahead in readdir and
results in a substantial speed improvement for readdir without noticeably
slowing readdir+stat performance.

Steve.


* [GFS2] don't try to lockfs after shutdown [1/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
@ 2007-02-05 14:09 ` Steven Whitehouse
  2007-02-05 14:09 ` [DLM] fix resend rcom lock [2/54] Steven Whitehouse
                   ` (53 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:09 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From d951e07bc1d2d3e43af5325d4cad19b41ae4d427 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 6 Dec 2006 11:46:33 -0600
Subject: [PATCH] [GFS2] don't try to lockfs after shutdown

If an fs has already been shut down, a lockfs callback should do nothing.
An fs that's been shut down can't acquire locks or do anything with
respect to the cluster.
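
As an illustration only (not part of the patch), the guard can be sketched
in user-space C. The struct, its fields and the function name are
hypothetical stand-ins for the superblock flags and the lockfs callback;
only the early-return shape is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct fs_sketch {
	bool shutdown;		/* set once the fs has been shut down */
	int freeze_calls;	/* how many times a freeze was attempted */
};

/* A lockfs-style callback: do nothing at all after shutdown. */
static void lockfs_sketch(struct fs_sketch *fs)
{
	assert(fs);
	if (fs->shutdown)
		return;		/* no cluster locks can be taken any more */
	fs->freeze_calls++;	/* stands in for the freeze attempt */
}
```

The point of checking first is that the freeze path would otherwise try to
acquire cluster locks that a shut-down fs can never be granted.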

Also, remove FIXME comment in withdraw function.  The missing bits of the
withdraw procedure are now all done by user space.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/lm.c b/fs/gfs2/lm.c
index effe4a3..e30673d 100644
--- a/fs/gfs2/lm.c
+++ b/fs/gfs2/lm.c
@@ -104,15 +104,9 @@ int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
 	vprintk(fmt, args);
 	va_end(args);
 
-	fs_err(sdp, "about to withdraw from the cluster\n");
+	fs_err(sdp, "about to withdraw this file system\n");
 	BUG_ON(sdp->sd_args.ar_debug);
 
-
-	fs_err(sdp, "waiting for outstanding I/O\n");
-
-	/* FIXME: suspend dm device so oustanding bio's complete
-	   and all further io requests fail */
-
 	fs_err(sdp, "telling LM to withdraw\n");
 	gfs2_withdraw_lockproto(&sdp->sd_lockstruct);
 	fs_err(sdp, "withdrawn\n");
diff --git a/fs/gfs2/ops_super.c b/fs/gfs2/ops_super.c
index 7685b46..b283783 100644
--- a/fs/gfs2/ops_super.c
+++ b/fs/gfs2/ops_super.c
@@ -173,6 +173,9 @@ static void gfs2_write_super_lockfs(struct super_block *sb)
 	struct gfs2_sbd *sdp = sb->s_fs_info;
 	int error;
 
+	if (test_bit(SDF_SHUTDOWN, &sdp->sd_flags))
+		return;
+
 	for (;;) {
 		error = gfs2_freeze_fs(sdp);
 		if (!error)
-- 
1.4.4.2





* [DLM] fix resend rcom lock [2/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
  2007-02-05 14:09 ` [GFS2] don't try to lockfs after shutdown [1/54] Steven Whitehouse
@ 2007-02-05 14:09 ` Steven Whitehouse
  2007-02-05 14:10 ` [DLM] fix old rcom messages [3/54] Steven Whitehouse
                   ` (52 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:09 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From 3f75e071fc3e5479800614656492230a1ce50182 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:36:37 -0600
Subject: [PATCH] [DLM] fix resend rcom lock

There's a chance the new master of a resource hasn't learned it's the new
master before another node sends it a lock during recovery.  The node
sending the lock needs to resend if this happens.

- A sends a master lookup for resource R to C
- B sends a master lookup for resource R to C
- C receives A's lookup, assigns A to be master of R and
  sends a reply back to A
- C receives B's lookup and sends a reply back to B saying
  that A is the master
- B receives lookup reply from C and sends its lock for R to A
- A receives lock from B, doesn't think it's the master of R
  and sends an error back to B
- A receives lookup reply from C and becomes master of R
- B gets error back from A and resends its lock to A
  (this resending is what this patch does)
- A receives lock from B, it now sees it's the master of R
  and takes the lock
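
The sequence above can be modelled with a small user-space sketch. It is
not the dlm code: the types, the error value and the retry loop are
assumptions made for illustration (in the patch the resend is triggered by
the -EBADR reply rather than by a loop), and learn_after simply stands in
for the moment at which C's lookup reply reaches A.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical error code playing the role of -EBADR in the patch. */
enum { ERR_NOT_MASTER = -1 };

/* Node A as seen by the sender B. */
struct node_sketch {
	bool is_master;	/* has A learned that it masters R? */
	int locks_held;	/* recovered locks A has accepted */
};

/* Master side: accept a recovered lock only once mastery is known. */
static int receive_lock(struct node_sketch *master)
{
	if (!master->is_master)
		return ERR_NOT_MASTER;
	master->locks_held++;
	return 0;
}

/*
 * Sender side: resend while the master answers "not ready".
 * Returns the attempt number that succeeded, or ERR_NOT_MASTER.
 */
static int send_lock_with_resend(struct node_sketch *master,
				 int learn_after, int max_tries)
{
	int tries;

	assert(master && max_tries > 0);
	for (tries = 1; tries <= max_tries; tries++) {
		if (receive_lock(master) == 0)
			return tries;
		/* C's lookup reply arrives at A between attempts. */
		if (tries >= learn_after)
			master->is_master = true;
	}
	return ERR_NOT_MASTER;
}
```

With learn_after set to 1 the first send is rejected and the second is
accepted, which matches the last three steps of the list.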

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 30878de..69ada58 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3571,6 +3571,14 @@ int dlm_recover_process_copy(struct dlm_ls *ls, struct dlm_rcom *rc)
 	lock_rsb(r);
 
 	switch (error) {
+	case -EBADR:
+		/* There's a chance the new master received our lock before
+		   dlm_recover_master_reply(), this wouldn't happen if we did
+		   a barrier between recover_masters and recover_locks. */
+		log_debug(ls, "master copy not ready %x r %lx %s", lkb->lkb_id,
+			  (unsigned long)r, r->res_name);
+		dlm_send_rcom_lock(r, lkb);
+		goto out;
 	case -EEXIST:
 		log_debug(ls, "master copy exists %x", lkb->lkb_id);
 		/* fall through */
@@ -3585,7 +3593,7 @@ int dlm_recover_process_copy(struct dlm_ls *ls, struct dlm_rcom *rc)
 	/* an ack for dlm_recover_locks() which waits for replies from
 	   all the locks it sends to new masters */
 	dlm_recovered_lock(r);
-
+ out:
 	unlock_rsb(r);
 	put_rsb(r);
 	dlm_put_lkb(lkb);
-- 
1.4.4.2





* [DLM] fix old rcom messages [3/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
  2007-02-05 14:09 ` [GFS2] don't try to lockfs after shutdown [1/54] Steven Whitehouse
  2007-02-05 14:09 ` [DLM] fix resend rcom lock [2/54] Steven Whitehouse
@ 2007-02-05 14:10 ` Steven Whitehouse
  2007-02-05 14:11 ` [DLM] add version check [4/54] Steven Whitehouse
                   ` (51 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:10 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From e9aa50aa89e1c5338abc0732e972a49be76f8003 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:37:16 -0600
Subject: [PATCH] [DLM] fix old rcom messages

A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery.  There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.

Each recovery sequence already has a locally unique sequence number
associated with it.  This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq.  When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply.  When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.
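
A minimal sketch of this request/reply matching, assuming simplified
stand-in types (struct rcom_sketch and the three helpers are hypothetical;
the real structure carries more fields and the real check reads
ls_recover_seq under a spinlock):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only the two sequence fields discussed above. */
struct rcom_sketch {
	uint64_t seq;		/* sender's recovery sequence number */
	uint64_t seq_reply;	/* copy of the request's seq, set in replies */
};

/* Requester stamps its current recovery sequence into the request. */
static struct rcom_sketch make_request(uint64_t local_recover_seq)
{
	struct rcom_sketch rc = { local_recover_seq, 0 };
	return rc;
}

/* Replier echoes the request's seq back in seq_reply. */
static struct rcom_sketch make_reply(const struct rcom_sketch *req,
				     uint64_t replier_seq)
{
	struct rcom_sketch rc;

	assert(req);
	rc.seq = replier_seq;
	rc.seq_reply = req->seq;
	return rc;
}

/* Requester drops replies that belong to an earlier recovery. */
static bool reply_is_old(const struct rcom_sketch *reply,
			 uint64_t local_recover_seq)
{
	return reply->seq_reply != local_recover_seq;
}
```

Because the sequence number is only locally unique, the comparison is
always made on the node that sent the request, never on the replier.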

An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.

The protocol version number is changed to reflect the different rcom
structures.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index 1ee8195..7185a13 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -309,8 +309,8 @@ static inline int rsb_flag(struct dlm_rsb *r, enum rsb_flags flag)
 
 /* dlm_header is first element of all structs sent between nodes */
 
-#define DLM_HEADER_MAJOR	0x00020000
-#define DLM_HEADER_MINOR	0x00000001
+#define DLM_HEADER_MAJOR	0x00030000
+#define DLM_HEADER_MINOR	0x00000000
 
 #define DLM_MSG			1
 #define DLM_RCOM		2
@@ -386,6 +386,8 @@ struct dlm_rcom {
 	uint32_t		rc_type;	/* DLM_RCOM_ */
 	int			rc_result;	/* multi-purpose */
 	uint64_t		rc_id;		/* match reply with request */
+	uint64_t		rc_seq;		/* sender's ls_recover_seq */
+	uint64_t		rc_seq_reply;	/* remote ls_recover_seq */
 	char			rc_buf[0];
 };
 
diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c
index 4cc31be..521ad9b 100644
--- a/fs/dlm/rcom.c
+++ b/fs/dlm/rcom.c
@@ -56,6 +56,10 @@ static int create_rcom(struct dlm_ls *ls, int to_nodeid, int type, int len,
 
 	rc->rc_type = type;
 
+	spin_lock(&ls->ls_recover_lock);
+	rc->rc_seq = ls->ls_recover_seq;
+	spin_unlock(&ls->ls_recover_lock);
+
 	*mh_ret = mh;
 	*rc_ret = rc;
 	return 0;
@@ -159,6 +163,7 @@ static void receive_rcom_status(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 	if (error)
 		return;
 	rc->rc_id = rc_in->rc_id;
+	rc->rc_seq_reply = rc_in->rc_seq;
 	rc->rc_result = dlm_recover_status(ls);
 	make_config(ls, (struct rcom_config *) rc->rc_buf);
 
@@ -224,21 +229,7 @@ static void receive_rcom_names(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 {
 	struct dlm_rcom *rc;
 	struct dlm_mhandle *mh;
-	int error, inlen, outlen;
-	int nodeid = rc_in->rc_header.h_nodeid;
-	uint32_t status = dlm_recover_status(ls);
-
-	/*
-	 * We can't run dlm_dir_rebuild_send (which uses ls_nodes) while
-	 * dlm_recoverd is running ls_nodes_reconfig (which changes ls_nodes).
-	 * It could only happen in rare cases where we get a late NAMES
-	 * message from a previous instance of recovery.
-	 */
-
-	if (!(status & DLM_RS_NODES)) {
-		log_debug(ls, "ignoring RCOM_NAMES from %u", nodeid);
-		return;
-	}
+	int error, inlen, outlen, nodeid;
 
 	nodeid = rc_in->rc_header.h_nodeid;
 	inlen = rc_in->rc_header.h_length - sizeof(struct dlm_rcom);
@@ -248,6 +239,7 @@ static void receive_rcom_names(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 	if (error)
 		return;
 	rc->rc_id = rc_in->rc_id;
+	rc->rc_seq_reply = rc_in->rc_seq;
 
 	dlm_copy_master_names(ls, rc_in->rc_buf, inlen, rc->rc_buf, outlen,
 			      nodeid);
@@ -294,6 +286,7 @@ static void receive_rcom_lookup(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 		ret_nodeid = error;
 	rc->rc_result = ret_nodeid;
 	rc->rc_id = rc_in->rc_id;
+	rc->rc_seq_reply = rc_in->rc_seq;
 
 	send_rcom(ls, mh, rc);
 }
@@ -375,20 +368,13 @@ static void receive_rcom_lock(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 
 	memcpy(rc->rc_buf, rc_in->rc_buf, sizeof(struct rcom_lock));
 	rc->rc_id = rc_in->rc_id;
+	rc->rc_seq_reply = rc_in->rc_seq;
 
 	send_rcom(ls, mh, rc);
 }
 
 static void receive_rcom_lock_reply(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 {
-	uint32_t status = dlm_recover_status(ls);
-
-	if (!(status & DLM_RS_DIR)) {
-		log_debug(ls, "ignoring RCOM_LOCK_REPLY from %u",
-			  rc_in->rc_header.h_nodeid);
-		return;
-	}
-
 	dlm_recover_process_copy(ls, rc_in);
 }
 
@@ -415,6 +401,7 @@ static int send_ls_not_ready(int nodeid, struct dlm_rcom *rc_in)
 
 	rc->rc_type = DLM_RCOM_STATUS_REPLY;
 	rc->rc_id = rc_in->rc_id;
+	rc->rc_seq_reply = rc_in->rc_seq;
 	rc->rc_result = -ESRCH;
 
 	rf = (struct rcom_config *) rc->rc_buf;
@@ -426,6 +413,31 @@ static int send_ls_not_ready(int nodeid, struct dlm_rcom *rc_in)
 	return 0;
 }
 
+static int is_old_reply(struct dlm_ls *ls, struct dlm_rcom *rc)
+{
+	uint64_t seq;
+	int rv = 0;
+
+	switch (rc->rc_type) {
+	case DLM_RCOM_STATUS_REPLY:
+	case DLM_RCOM_NAMES_REPLY:
+	case DLM_RCOM_LOOKUP_REPLY:
+	case DLM_RCOM_LOCK_REPLY:
+		spin_lock(&ls->ls_recover_lock);
+		seq = ls->ls_recover_seq;
+		spin_unlock(&ls->ls_recover_lock);
+		if (rc->rc_seq_reply != seq) {
+			log_error(ls, "ignoring old reply %x from %d "
+				      "seq_reply %llx expect %llx",
+				      rc->rc_type, rc->rc_header.h_nodeid,
+				      (unsigned long long)rc->rc_seq_reply,
+				      (unsigned long long)seq);
+			rv = 1;
+		}
+	}
+	return rv;
+}
+
 /* Called by dlm_recvd; corresponds to dlm_receive_message() but special
    recovery-only comms are sent through here. */
 
@@ -454,6 +466,9 @@ void dlm_receive_rcom(struct dlm_header *hd, int nodeid)
 		goto out;
 	}
 
+	if (is_old_reply(ls, rc))
+		goto out;
+
 	if (nodeid != rc->rc_header.h_nodeid) {
 		log_error(ls, "bad rcom nodeid %d from %d",
 			  rc->rc_header.h_nodeid, nodeid);
diff --git a/fs/dlm/util.c b/fs/dlm/util.c
index 767197d..963889c 100644
--- a/fs/dlm/util.c
+++ b/fs/dlm/util.c
@@ -134,6 +134,8 @@ void dlm_rcom_out(struct dlm_rcom *rc)
 	rc->rc_type		= cpu_to_le32(rc->rc_type);
 	rc->rc_result		= cpu_to_le32(rc->rc_result);
 	rc->rc_id		= cpu_to_le64(rc->rc_id);
+	rc->rc_seq		= cpu_to_le64(rc->rc_seq);
+	rc->rc_seq_reply	= cpu_to_le64(rc->rc_seq_reply);
 
 	if (type == DLM_RCOM_LOCK)
 		rcom_lock_out((struct rcom_lock *) rc->rc_buf);
@@ -151,6 +153,8 @@ void dlm_rcom_in(struct dlm_rcom *rc)
 	rc->rc_type		= le32_to_cpu(rc->rc_type);
 	rc->rc_result		= le32_to_cpu(rc->rc_result);
 	rc->rc_id		= le64_to_cpu(rc->rc_id);
+	rc->rc_seq		= le64_to_cpu(rc->rc_seq);
+	rc->rc_seq_reply	= le64_to_cpu(rc->rc_seq_reply);
 
 	if (rc->rc_type == DLM_RCOM_LOCK)
 		rcom_lock_in((struct rcom_lock *) rc->rc_buf);
-- 
1.4.4.2





* [DLM] add version check [4/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (2 preceding siblings ...)
  2007-02-05 14:10 ` [DLM] fix old rcom messages [3/54] Steven Whitehouse
@ 2007-02-05 14:11 ` Steven Whitehouse
  2007-02-05 14:12 ` [DLM] fix send_args() lvb copying [5/54] Steven Whitehouse
                   ` (50 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:11 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From c72ebab07b716be7dddfebff05f8a74b94bfebe2 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:37:55 -0600
Subject: [PATCH] [DLM] add version check

Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.
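
For illustration, the major-version comparison can be sketched in
user-space C. The macro and function names are hypothetical; the values
mirror the header constants introduced earlier in this series, and the
assumption (taken from the diff below) is that only the upper 16 bits
decide compatibility.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values follow the header constants in this series. */
#define SKETCH_MAJOR		0x00030000u
#define SKETCH_MINOR		0x00000000u
#define SKETCH_MAJOR_MASK	0xFFFF0000u

/* A peer is compatible when its major half matches ours exactly. */
static bool version_compatible(uint32_t remote_version)
{
	/* The major constant must not spill into the minor half. */
	assert((SKETCH_MAJOR & ~SKETCH_MAJOR_MASK) == 0);
	return (remote_version & SKETCH_MAJOR_MASK) == SKETCH_MAJOR;
}
```

A differing minor number passes this check, so minor bumps stay
interoperable while a major bump is rejected.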

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c
index 521ad9b..54fba9b 100644
--- a/fs/dlm/rcom.c
+++ b/fs/dlm/rcom.c
@@ -82,8 +82,17 @@ static void make_config(struct dlm_ls *ls, struct rcom_config *rf)
 	rf->rf_lsflags = ls->ls_exflags;
 }
 
-static int check_config(struct dlm_ls *ls, struct rcom_config *rf, int nodeid)
+static int check_config(struct dlm_ls *ls, struct dlm_rcom *rc, int nodeid)
 {
+	struct rcom_config *rf = (struct rcom_config *) rc->rc_buf;
+
+	if ((rc->rc_header.h_version & 0xFFFF0000) != DLM_HEADER_MAJOR) {
+		log_error(ls, "version mismatch: %x nodeid %d: %x",
+			  DLM_HEADER_MAJOR | DLM_HEADER_MINOR, nodeid,
+			  rc->rc_header.h_version);
+		return -EINVAL;
+	}
+
 	if (rf->rf_lvblen != ls->ls_lvblen ||
 	    rf->rf_lsflags != ls->ls_exflags) {
 		log_error(ls, "config mismatch: %d,%x nodeid %d: %d,%x",
@@ -145,8 +154,7 @@ int dlm_rcom_status(struct dlm_ls *ls, int nodeid)
 		log_debug(ls, "remote node %d not ready", nodeid);
 		rc->rc_result = 0;
 	} else
-		error = check_config(ls, (struct rcom_config *) rc->rc_buf,
-				     nodeid);
+		error = check_config(ls, rc, nodeid);
 	/* the caller looks at rc_result for the remote recovery status */
  out:
 	return error;
-- 
1.4.4.2





* [DLM] fix send_args() lvb copying [5/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (3 preceding siblings ...)
  2007-02-05 14:11 ` [DLM] add version check [4/54] Steven Whitehouse
@ 2007-02-05 14:12 ` Steven Whitehouse
  2007-02-05 14:13 ` [DLM] fix receive_request() lvb copying [6/54] Steven Whitehouse
                   ` (49 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:12 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From f9066217183174c56cf9992b8e2819702031aa85 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:38:45 -0600
Subject: [PATCH] [DLM] fix send_args() lvb copying

The send_args() function is used to copy parameters into a message for a
number of different message types.  Only some of those types are set up
beforehand (in create_message) to include space for sending lvb data.
send_args was wrongly copying the lvb for all message types as long as the
lock had an lvb.  This means that the lvb data was being written past the
end of the message into unknown space.
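
The rule being restored here is that the copy must be driven by the same
per-type decision that sized the message. A rough user-space sketch,
assuming a reduced, hypothetical set of message types and helper names
(none of these are the dlm's own):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced set of message types. */
enum msg_type { MSG_REQUEST, MSG_LOOKUP, MSG_CONVERT, MSG_UNLOCK, MSG_CANCEL };

/* Trailing bytes reserved for a message of type t at creation time. */
static size_t extra_len(enum msg_type t, size_t name_len, size_t lvb_len,
			int has_lvb)
{
	switch (t) {
	case MSG_REQUEST:
	case MSG_LOOKUP:
		return name_len;		/* carries the resource name */
	case MSG_CONVERT:
	case MSG_UNLOCK:
		return has_lvb ? lvb_len : 0;	/* carries the lvb, if any */
	default:
		return 0;			/* no trailing payload */
	}
}

/* Fill the trailing area; returns the number of bytes written. */
static size_t fill_extra(enum msg_type t, char *extra, size_t cap,
			 const char *name, size_t name_len,
			 const char *lvb, size_t lvb_len)
{
	size_t n = extra_len(t, name_len, lvb_len, lvb != NULL);

	assert(n <= cap);	/* never write past the reserved space */
	if (n == 0)
		return 0;
	memcpy(extra, (t == MSG_REQUEST || t == MSG_LOOKUP) ? name : lvb, n);
	return n;
}
```

Sharing one sizing function between creation and filling is a design
choice of this sketch; the patch instead keeps two switch statements and
notes in a comment that they must be compared.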

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 69ada58..cdf2cb9 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -2144,12 +2144,24 @@ static void send_args(struct dlm_rsb *r, struct dlm_lkb *lkb,
 	if (lkb->lkb_astaddr)
 		ms->m_asts |= AST_COMP;
 
-	if (ms->m_type == DLM_MSG_REQUEST || ms->m_type == DLM_MSG_LOOKUP)
-		memcpy(ms->m_extra, r->res_name, r->res_length);
+	/* compare with switch in create_message; send_remove() doesn't
+	   use send_args() */
 
-	else if (lkb->lkb_lvbptr)
+	switch (ms->m_type) {
+	case DLM_MSG_REQUEST:
+	case DLM_MSG_LOOKUP:
+		memcpy(ms->m_extra, r->res_name, r->res_length);
+		break;
+	case DLM_MSG_CONVERT:
+	case DLM_MSG_UNLOCK:
+	case DLM_MSG_REQUEST_REPLY:
+	case DLM_MSG_CONVERT_REPLY:
+	case DLM_MSG_GRANT:
+		if (!lkb->lkb_lvbptr)
+			break;
 		memcpy(ms->m_extra, lkb->lkb_lvbptr, r->res_ls->ls_lvblen);
-
+		break;
+	}
 }
 
 static int send_common(struct dlm_rsb *r, struct dlm_lkb *lkb, int mstype)
-- 
1.4.4.2





* [DLM] fix receive_request() lvb copying [6/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (4 preceding siblings ...)
  2007-02-05 14:12 ` [DLM] fix send_args() lvb copying [5/54] Steven Whitehouse
@ 2007-02-05 14:13 ` Steven Whitehouse
  2007-02-05 14:14 ` [DLM] fix lost flags in stub replies Steven Whitehouse
                   ` (48 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:13 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From f1d7674362afc47d0125e68dc7187e81f276a235 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:39:20 -0600
Subject: [PATCH] [DLM] fix receive_request() lvb copying

LVBs are not sent as part of new requests, but the code receiving the
request was copying data into the lvb anyway.  The space in the message
where it mistakenly thought the lvb lived actually contained the resource
name, so it wound up incorrectly copying this name data into the lvb.  The
fix is to just create the lvb, not copy junk into it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index cdf2cb9..d8e919b 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -2430,8 +2430,12 @@ static int receive_request_args(struct dlm_ls *ls, struct dlm_lkb *lkb,
 
 	DLM_ASSERT(is_master_copy(lkb), dlm_print_lkb(lkb););
 
-	if (receive_lvb(ls, lkb, ms))
-		return -ENOMEM;
+	if (lkb->lkb_exflags & DLM_LKF_VALBLK) {
+		/* lkb was just created so there won't be an lvb yet */
+		lkb->lkb_lvbptr = allocate_lvb(ls);
+		if (!lkb->lkb_lvbptr)
+			return -ENOMEM;
+	}
 
 	return 0;
 }
-- 
1.4.4.2





* [DLM] fix lost flags in stub replies
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (5 preceding siblings ...)
  2007-02-05 14:13 ` [DLM] fix receive_request() lvb copying [6/54] Steven Whitehouse
@ 2007-02-05 14:14 ` Steven Whitehouse
  2007-02-05 14:15 ` [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions [8/54] Steven Whitehouse
                   ` (47 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:14 UTC
  To: linux-kernel; +Cc: David Teigland, cluster-devel

From b08c5ad55e470d1cb05d1dc3e7159996903cc701 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 13 Dec 2006 10:40:26 -0600
Subject: [PATCH] [DLM] fix lost flags in stub replies

When the dlm fakes an unlock/cancel reply from a failed node using a stub
message struct, it wasn't setting the flags in the stub message.  So, in
the process of receiving the fake message the lkb flags would be
overwritten by the zero flags in the message.  The problem observed in
tests was the loss of the USER flag which caused the dlm to think a user
lock was a kernel lock and subsequently fail an assertion checking the
validity of the ast/callback field.
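
The effect can be reproduced with a toy model. Everything below is a
hypothetical stand-in: the flag bit value is invented, and the reply
handler is modelled as a plain overwrite of the lock's flags, which is a
simplifying assumption rather than the dlm's exact update rule.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_USER 0x1u	/* hypothetical bit value */

struct lkb_sketch { uint32_t flags; };
struct msg_sketch { uint32_t flags; int result; };

/* Receiving a reply takes the lock's flags from the message. */
static void receive_reply(struct lkb_sketch *lkb, const struct msg_sketch *ms)
{
	lkb->flags = ms->flags;
}

/* Fake a reply from a failed node using a reusable stub message. */
static void fake_reply(struct lkb_sketch *lkb, struct msg_sketch *stub,
		       int result, int copy_flags)
{
	assert(lkb && stub);
	stub->result = result;
	if (copy_flags)
		stub->flags = lkb->flags;	/* the step the fix adds */
	receive_reply(lkb, stub);
}
```

Without the copy, a zeroed stub wipes the USER bit; with it, the lock
comes out of the fake reply with the flags it went in with.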

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index d8e919b..ed52485 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3148,6 +3148,7 @@ static void recover_convert_waiter(struct dlm_ls *ls, struct dlm_lkb *lkb)
 	if (middle_conversion(lkb)) {
 		hold_lkb(lkb);
 		ls->ls_stub_ms.m_result = -EINPROGRESS;
+		ls->ls_stub_ms.m_flags = lkb->lkb_flags;
 		_remove_from_waiters(lkb);
 		_receive_convert_reply(lkb, &ls->ls_stub_ms);
 
@@ -3221,6 +3222,7 @@ void dlm_recover_waiters_pre(struct dlm_ls *ls)
 		case DLM_MSG_UNLOCK:
 			hold_lkb(lkb);
 			ls->ls_stub_ms.m_result = -DLM_EUNLOCK;
+			ls->ls_stub_ms.m_flags = lkb->lkb_flags;
 			_remove_from_waiters(lkb);
 			_receive_unlock_reply(lkb, &ls->ls_stub_ms);
 			dlm_put_lkb(lkb);
@@ -3229,6 +3231,7 @@ void dlm_recover_waiters_pre(struct dlm_ls *ls)
 		case DLM_MSG_CANCEL:
 			hold_lkb(lkb);
 			ls->ls_stub_ms.m_result = -DLM_ECANCEL;
+			ls->ls_stub_ms.m_flags = lkb->lkb_flags;
 			_remove_from_waiters(lkb);
 			_receive_cancel_reply(lkb, &ls->ls_stub_ms);
 			dlm_put_lkb(lkb);
-- 
1.4.4.2





* [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions [8/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (6 preceding siblings ...)
  2007-02-05 14:14 ` [DLM] fix lost flags in stub replies Steven Whitehouse
@ 2007-02-05 14:15 ` Steven Whitehouse
  2007-02-05 14:16 ` [GFS2] Fix DIO deadlock [9/54] Steven Whitehouse
                   ` (46 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:15 UTC
  To: linux-kernel; +Cc: Patrick Caulfield, Andrew Morton, Adrian Bunk, cluster-devel

From 3f2795b83b56726292c7bfd8664e6b4bff126079 Mon Sep 17 00:00:00 2001
From: Adrian Bunk <bunk@stusta.de>
Date: Tue, 19 Dec 2006 13:04:03 -0800
Subject: [PATCH] [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions

Remove the following unused functions:

- lowcomms_send_message()
- lowcomms_max_buffer_size()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 9be3a44..3b22473 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -880,22 +880,6 @@ out:
 	return -1;
 }
 
-/* API send message call, may queue the request */
-/* N.B. This is the old interface - use the new one for new calls */
-int lowcomms_send_message(int nodeid, char *buf, int len, gfp_t allocation)
-{
-	struct writequeue_entry *e;
-	char *b;
-
-	e = dlm_lowcomms_get_buffer(nodeid, len, allocation, &b);
-	if (e) {
-		memcpy(b, buf, len);
-		dlm_lowcomms_commit_buffer(e);
-		return 0;
-	}
-	return -ENOBUFS;
-}
-
 /* Look for activity on active sockets */
 static void process_sockets(void)
 {
@@ -1087,14 +1071,6 @@ static int daemons_start(void)
 	return 0;
 }
 
-/*
- * Return the largest buffer size we can cope with.
- */
-int lowcomms_max_buffer_size(void)
-{
-	return PAGE_CACHE_SIZE;
-}
-
 void dlm_lowcomms_stop(void)
 {
 	int i;
-- 
1.4.4.2





* [GFS2] Fix DIO deadlock [9/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (7 preceding siblings ...)
  2007-02-05 14:15 ` [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions [8/54] Steven Whitehouse
@ 2007-02-05 14:16 ` Steven Whitehouse
  2007-02-05 14:17 ` [GFS2] Fail over to readpage for stuffed files [10/54] Steven Whitehouse
                   ` (45 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:16 UTC
  To: linux-kernel; +Cc: Wendy Cheng, cluster-devel

From a22b0aa4d1fe6c359001e3a807d4916684bf862d Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Thu, 14 Dec 2006 18:24:26 +0000
Subject: [PATCH] [GFS2] Fix DIO deadlock

This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs
due to trying to take the i_mutex while holding a glock. The correct
locking order is defined as i_mutex -> glock in all cases.

I've left dealing with allocating writes for later. I know that we need to
do that, but for now this should do the trick. We don't need to take the
i_mutex on write, because the VFS has already taken it for us. On read
we don't need it since the glock is enough protection. The reason that
I've made some of the checks into a separate function is that we'll need
to do the checks again in the allocating write case eventually, so this
is partly in preparation for this. Likewise the return value test of !=
1 might look a bit odd, and that's because we'll need a third return value
in case of requiring an allocation.
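
The shape of that check can be sketched outside the kernel as follows.
The struct and both function names are hypothetical, the transfer itself
is reduced to returning the requested length, and no locking is modelled;
only the two current return values and the "!= 1" test come from the text
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode properties the check looks at. */
struct inode_sketch {
	bool jdata;	/* journaled data file */
	bool stuffed;	/* data lives inside the inode block */
	int64_t size;	/* current file size in bytes */
};

enum { DIO_FALLBACK = 0, DIO_OK = 1 };	/* a third value may follow later */

/* 0: ignore the request and fall back to buffered i/o; 1: accept it. */
static int ok_for_dio(const struct inode_sketch *ip, int64_t offset)
{
	if (ip->jdata || ip->stuffed)
		return DIO_FALLBACK;
	if (offset > ip->size)
		return DIO_FALLBACK;
	return DIO_OK;
}

/* Caller: anything other than DIO_OK is passed straight back. */
static int64_t direct_io_sketch(const struct inode_sketch *ip,
				int64_t offset, int64_t len)
{
	int rv = ok_for_dio(ip, offset);

	assert(len >= 0);
	if (rv != DIO_OK)
		return rv;	/* caller falls back to buffered i/o */
	return len;		/* stands in for the bytes transferred */
}
```

Testing against the "accept" value rather than against zero means a
future third return code needs no change to the fallback branch.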

I've made the change to deferred mode on the glock to ensure flushing
read caches on other nodes. I notice that (using blktrace to look at
what's going on) we appear to do a better job of large I/Os than ext3
after this patch (in terms of not splitting up the I/Os).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>

diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index d8d69a7..0118aa4 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -594,6 +594,36 @@ static void gfs2_invalidatepage(struct page *page, unsigned long offset)
 	return;
 }
 
+/**
+ * gfs2_ok_for_dio - check that dio is valid on this file
+ * @ip: The inode
+ * @rw: READ or WRITE
+ * @offset: The offset at which we are reading or writing
+ *
+ * Returns: 0 (to ignore the i/o request and thus fall back to buffered i/o)
+ *          1 (to accept the i/o request)
+ */
+static int gfs2_ok_for_dio(struct gfs2_inode *ip, int rw, loff_t offset)
+{
+	/*
+	 * Should we return an error here? I can't see that O_DIRECT for
+	 * a journaled file makes any sense. For now we'll silently fall
+	 * back to buffered I/O, likewise we do the same for stuffed
+	 * files since they are (a) small and (b) unaligned.
+	 */
+	if (gfs2_is_jdata(ip))
+		return 0;
+
+	if (gfs2_is_stuffed(ip))
+		return 0;
+
+	if (offset > i_size_read(&ip->i_inode))
+		return 0;
+	return 1;
+}
+
+
+
 static ssize_t gfs2_direct_IO(int rw, struct kiocb *iocb,
 			      const struct iovec *iov, loff_t offset,
 			      unsigned long nr_segs)
@@ -604,42 +634,28 @@ static ssize_t gfs2_direct_IO(int rw, struct kiocb *iocb,
 	struct gfs2_holder gh;
 	int rv;
 
-	if (rw == READ)
-		mutex_lock(&inode->i_mutex);
 	/*
-	 * Shared lock, even if its a write, since we do no allocation
-	 * on this path. All we need change is atime.
+	 * Deferred lock, even if its a write, since we do no allocation
+	 * on this path. All we need change is atime, and this lock mode
+	 * ensures that other nodes have flushed their buffered read caches
+	 * (i.e. their page cache entries for this inode). We do not,
+	 * unfortunately have the option of only flushing a range like
+	 * the VFS does.
 	 */
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, GL_ATIME, &gh);
+	gfs2_holder_init(ip->i_gl, LM_ST_DEFERRED, GL_ATIME, &gh);
 	rv = gfs2_glock_nq_atime(&gh);
 	if (rv)
-		goto out;
-
-	if (offset > i_size_read(inode))
-		goto out;
-
-	/*
-	 * Should we return an error here? I can't see that O_DIRECT for
-	 * a journaled file makes any sense. For now we'll silently fall
-	 * back to buffered I/O, likewise we do the same for stuffed
-	 * files since they are (a) small and (b) unaligned.
-	 */
-	if (gfs2_is_jdata(ip))
-		goto out;
-
-	if (gfs2_is_stuffed(ip))
-		goto out;
-
-	rv = blockdev_direct_IO_own_locking(rw, iocb, inode,
-					    inode->i_sb->s_bdev,
-					    iov, offset, nr_segs,
-					    gfs2_get_block_direct, NULL);
+		return rv;
+	rv = gfs2_ok_for_dio(ip, rw, offset);
+	if (rv != 1)
+		goto out; /* dio not valid, fall back to buffered i/o */
+
+	rv = blockdev_direct_IO_no_locking(rw, iocb, inode, inode->i_sb->s_bdev,
+					   iov, offset, nr_segs,
+					   gfs2_get_block_direct, NULL);
 out:
 	gfs2_glock_dq_m(1, &gh);
 	gfs2_holder_uninit(&gh);
-	if (rw == READ)
-		mutex_unlock(&inode->i_mutex);
-
 	return rv;
 }
 
-- 
1.4.4.2





* [GFS2] Fail over to readpage for stuffed files [10/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (8 preceding siblings ...)
  2007-02-05 14:16 ` [GFS2] Fix DIO deadlock [9/54] Steven Whitehouse
@ 2007-02-05 14:17 ` Steven Whitehouse
  2007-02-05 14:18 ` [GFS2] Fix change nlink deadlock [11/54] Steven Whitehouse
                   ` (44 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:17 UTC
  To: linux-kernel; +Cc: Russell Cattelan, cluster-devel

From 25a24a1f1feaf460920f2de7abf23e3e79258197 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Fri, 15 Dec 2006 16:49:51 -0500
Subject: [PATCH] [GFS2] Fail over to readpage for stuffed files

This is partially derived from a patch written by Russell Cattelan.
It fixes a bug where there is a race between readpages and truncate
by ignoring readpages for stuffed files. This is ok because a stuffed
file will never be more than one block (minus sizeof(struct gfs2_dinode))
in size and block size is never larger than page size, so we do not lose
anything efficiency-wise by not doing readahead for stuffed files. They
will have already been "read ahead" by the action of reading the inode
in, in the first place.
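
The size argument can be checked with a little arithmetic. The helpers
below are hypothetical, and the sizes passed to them in any example are
illustrative inputs rather than values taken from the GFS2 headers.

```c
#include <assert.h>
#include <stddef.h>

/* Largest payload a stuffed file can hold: one block minus the dinode. */
static size_t max_stuffed_bytes(size_t block_size, size_t dinode_size)
{
	assert(dinode_size < block_size);
	return block_size - dinode_size;
}

/* True when every stuffed file fits within a single page. */
static int stuffed_fits_in_one_page(size_t block_size, size_t dinode_size,
				    size_t page_size)
{
	return max_stuffed_bytes(block_size, dinode_size) <= page_size;
}
```

As long as the block size does not exceed the page size, a stuffed file
never spans more than one page, so there is nothing left for readahead to
fetch once the first page has been read.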

This is the remaining part of the fix for Red Hat bugzilla #218966
which had not yet made it upstream.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Russell Cattelan <cattelan@redhat.com>

diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index 0118aa4..37bfeb9 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -256,7 +256,7 @@ out_unlock:
  *    the page lock and the glock) and return having done no I/O. Its
  *    obviously not something we'd want to do on too regular a basis.
  *    Any I/O we ignore at this time will be done via readpage later.
- * 2. We have to handle stuffed files here too.
+ * 2. We don't handle stuffed files here we let readpage do the honours.
  * 3. mpage_readpages() does most of the heavy lifting in the common case.
  * 4. gfs2_get_block() is relied upon to set BH_Boundary in the right places.
  * 5. We use LM_FLAG_TRY_1CB here, effectively we then have lock-ahead as
@@ -269,8 +269,7 @@ static int gfs2_readpages(struct file *file, struct address_space *mapping,
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct gfs2_holder gh;
-	unsigned page_idx;
-	int ret;
+	int ret = 0;
 	int do_unlock = 0;
 
 	if (likely(file != &gfs2_internal_file_sentinel)) {
@@ -289,29 +288,8 @@ static int gfs2_readpages(struct file *file, struct address_space *mapping,
 			goto out_unlock;
 	}
 skip_lock:
-	if (gfs2_is_stuffed(ip)) {
-		struct pagevec lru_pvec;
-		pagevec_init(&lru_pvec, 0);
-		for (page_idx = 0; page_idx < nr_pages; page_idx++) {
-			struct page *page = list_entry(pages->prev, struct page, lru);
-			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (!add_to_page_cache(page, mapping,
-					       page->index, GFP_KERNEL)) {
-				ret = stuffed_readpage(ip, page);
-				unlock_page(page);
-				if (!pagevec_add(&lru_pvec, page))
-					 __pagevec_lru_add(&lru_pvec);
-			} else {
-				page_cache_release(page);
-			}
-		}
-		pagevec_lru_add(&lru_pvec);
-		ret = 0;
-	} else {
-		/* What we really want to do .... */
+	if (!gfs2_is_stuffed(ip))
 		ret = mpage_readpages(mapping, pages, nr_pages, gfs2_get_block);
-	}
 
 	if (do_unlock) {
 		gfs2_glock_dq_m(1, &gh);
-- 
1.4.4.2
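
The size argument in the changelog above can be sketched in plain C. This
is only an illustration, not GFS2 code: stuffed_capacity() and
wants_readahead() are hypothetical helpers, and the on-disk header size is
passed in as a parameter rather than taken from struct gfs2_dinode.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest payload that fits in the inode's own block (hypothetical
 * helper): the data shares the block with the on-disk inode header.
 */
static size_t stuffed_capacity(size_t block_size, size_t dinode_size)
{
	assert(dinode_size < block_size);
	return block_size - dinode_size;
}

/*
 * Nonzero when readahead is worth attempting. A stuffed file was
 * already read in together with its inode, so it is left to readpage.
 */
static int wants_readahead(int is_stuffed)
{
	return !is_stuffed;
}
```

For example, with a 4096-byte block and an assumed 232-byte header, the
stuffed payload is at most 3864 bytes, which fits in a single page.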




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix change nlink deadlock [11/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (9 preceding siblings ...)
  2007-02-05 14:17 ` [GFS2] Fail over to readpage for stuffed files [10/54] Steven Whitehouse
@ 2007-02-05 14:18 ` Steven Whitehouse
  2007-02-05 14:19 ` [DLM] Fix schedule() calls [12/54] Steven Whitehouse
                   ` (43 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: S. Wendy Cheng, cluster-devel

>From ca2e3f566e7db956c046f8737192c022302a84b4 Mon Sep 17 00:00:00 2001
From: S. Wendy Cheng <wcheng@redhat.com>
Date: Thu, 18 Jan 2007 15:56:34 -0500
Subject: [PATCH] [GFS2] Fix change nlink deadlock

Bugzilla 215088

Fix deadlock in gfs2_change_nlink() while installing RHEL5 onto a GFS2
partition. The gfs2_rename() apparently needs block allocation for the
new name (into the directory), which requires rg locks. At the same
time, while updating the nlink count for the replaced file,
gfs2_change_nlink() tries to return the inode meta-data back to the
resource group, where it needs rg locks too. Our logic doesn't allow the
same process (the RHEL installer) to acquire these locks recursively,
which results in a BUG call. This only happens within the rename code
path and only if the destination file exists before the rename operation.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index d122074..6bc4436 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -281,16 +281,14 @@ out:
 }
 
 /**
- * gfs2_change_nlink - Change nlink count on inode
+ * gfs2_change_nlink_i - Change nlink count on inode
  * @ip: The GFS2 inode
  * @diff: The change in the nlink count required
  *
  * Returns: errno
  */
-
-int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
+int gfs2_change_nlink_i(struct gfs2_inode *ip, int diff)
 {
-	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
 	struct buffer_head *dibh;
 	u32 nlink;
 	int error;
@@ -322,6 +320,20 @@ int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
 	brelse(dibh);
 	mark_inode_dirty(&ip->i_inode);
 
+	return error;
+}
+
+int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
+{
+	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
+	int error;
+
+	/* update the nlink */
+	error = gfs2_change_nlink_i(ip, diff);
+	if (error)
+		return error;
+
+	/* return meta data block back to rg */
 	if (ip->i_inode.i_nlink == 0) {
 		struct gfs2_rgrpd *rgd;
 		struct gfs2_holder ri_gh, rg_gh;
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index b57f448..85c67cb 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -40,6 +40,7 @@ int gfs2_inode_refresh(struct gfs2_inode *ip);
 
 int gfs2_dinode_dealloc(struct gfs2_inode *inode);
 int gfs2_change_nlink(struct gfs2_inode *ip, int diff);
+int gfs2_change_nlink_i(struct gfs2_inode *ip, int diff);
 struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 			   int is_root, struct nameidata *nd);
 struct inode *gfs2_createi(struct gfs2_holder *ghs, const struct qstr *name,
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 636dda4..919e894 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -553,6 +553,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 	int alloc_required;
 	unsigned int x;
 	int error;
+	struct gfs2_rgrpd *rgd;
 
 	if (ndentry->d_inode) {
 		nip = GFS2_I(ndentry->d_inode);
@@ -684,12 +685,12 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 		error = gfs2_trans_begin(sdp, sdp->sd_max_dirres +
 					 al->al_rgd->rd_ri.ri_length +
 					 4 * RES_DINODE + 4 * RES_LEAF +
-					 RES_STATFS + RES_QUOTA, 0);
+					 RES_STATFS + RES_QUOTA + 1, 0);
 		if (error)
 			goto out_ipreserv;
 	} else {
 		error = gfs2_trans_begin(sdp, 4 * RES_DINODE +
-					 5 * RES_LEAF, 0);
+					 5 * RES_LEAF + 1, 0);
 		if (error)
 			goto out_gunlock;
 	}
@@ -703,7 +704,25 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 			error = gfs2_dir_del(ndip, &ndentry->d_name);
 			if (error)
 				goto out_end_trans;
-			error = gfs2_change_nlink(nip, -1);
+			error = gfs2_change_nlink_i(nip, -1);
+			if ((!error) && (nip->i_inode.i_nlink == 0)) {
+				error = -EIO;
+				rgd = gfs2_blk2rgrpd(sdp, nip->i_num.no_addr);
+				if (rgd) {
+					struct gfs2_holder nlink_rg_gh;
+					if (rgd != nip->i_alloc.al_rgd)
+						error = gfs2_glock_nq_init(
+						rgd->rd_gl, LM_ST_EXCLUSIVE,
+						0, &nlink_rg_gh);
+					else
+						error = 0;
+                			if (!error) {
+						gfs2_unlink_di(&nip->i_inode);
+						if (rgd != nip->i_alloc.al_rgd)
+							gfs2_glock_dq_uninit(&nlink_rg_gh);
+					}
+				}
+			}
 		}
 		if (error)
 			goto out_end_trans;
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] Fix schedule() calls [12/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (10 preceding siblings ...)
  2007-02-05 14:18 ` [GFS2] Fix change nlink deadlock [11/54] Steven Whitehouse
@ 2007-02-05 14:19 ` Steven Whitehouse
  2007-02-05 14:19 ` [DLM] Fix spin lock already unlocked bug [13/54] Steven Whitehouse
                   ` (42 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From 7c571f88ff49e883ee57f0abf1e155f2d4965a75 Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Tue, 2 Jan 2007 17:01:05 +0000
Subject: [PATCH] [DLM] Fix schedule() calls

I was a little over-enthusiastic turning schedule() calls into cond_resched() when fixing the DLM for Andrew Morton.

These four should really be calls to schedule() or the dlm can busy-wait.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-sctp.c b/fs/dlm/lowcomms-sctp.c
index fe158d7..0940a80 100644
--- a/fs/dlm/lowcomms-sctp.c
+++ b/fs/dlm/lowcomms-sctp.c
@@ -1109,7 +1109,7 @@ static int dlm_recvd(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 		add_wait_queue(&lowcomms_recv_wait, &wait);
 		if (!test_bit(CF_READ_PENDING, &sctp_con.flags))
-			cond_resched();
+			schedule();
 		remove_wait_queue(&lowcomms_recv_wait, &wait);
 		set_current_state(TASK_RUNNING);
 
@@ -1141,7 +1141,7 @@ static int dlm_sendd(void *data)
 	while (!kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (write_list_empty())
-			cond_resched();
+			schedule();
 		set_current_state(TASK_RUNNING);
 
 		if (sctp_con.eagain_flag) {
diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 3b22473..18b91c6 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -996,7 +996,7 @@ static int dlm_recvd(void *data)
 	while (!kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (read_list_empty())
-			cond_resched();
+			schedule();
 		set_current_state(TASK_RUNNING);
 
 		process_sockets();
@@ -1030,7 +1030,7 @@ static int dlm_sendd(void *data)
 	while (!kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (write_and_state_lists_empty())
-			cond_resched();
+			schedule();
 		set_current_state(TASK_RUNNING);
 
 		process_state_queue();
-- 
1.4.4.2
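
Why the daemons can busy-wait with cond_resched() may be easier to see in
a toy model. The sketch below is plain userspace C, not kernel code, and
both function names are made up for illustration. It counts how often the
daemon loop body runs before work arrives, assuming the polling variant
returns immediately on every pass while the sleeping variant is only run
again once it has been woken.

```c
#include <assert.h>

/*
 * Model of a loop that only offers to yield (cond_resched() style):
 * the call returns straight away when nothing else needs the CPU, so
 * the loop body runs once per tick until work shows up.
 */
static int run_polling(int work_arrives_at)
{
	int iterations = 0;

	assert(work_arrives_at >= 0);
	for (int now = 0; ; now++) {
		iterations++;
		if (now >= work_arrives_at)
			break;
		/* nothing to do: the yield returns and we spin again */
	}
	return iterations;
}

/*
 * Model of a loop that really sleeps (schedule() style): the task is
 * taken off the CPU and only continues when it is woken for work.
 */
static int run_sleeping(int work_arrives_at)
{
	int iterations = 0;
	int now = 0;

	assert(work_arrives_at >= 0);
	for (;;) {
		iterations++;
		if (now >= work_arrives_at)
			break;
		now = work_arrives_at;	/* asleep until the wake-up */
	}
	return iterations;
}
```

In this model, work arriving after 1000 ticks costs 1001 passes through
the polling loop but only 2 through the sleeping one.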




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] Fix spin lock already unlocked bug [13/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (11 preceding siblings ...)
  2007-02-05 14:19 ` [DLM] Fix schedule() calls [12/54] Steven Whitehouse
@ 2007-02-05 14:19 ` Steven Whitehouse
  2007-02-05 14:20 ` [GFS2] Fix ordering of page disposal vs. glock_dq [14/54] Steven Whitehouse
                   ` (41 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From c239b99d90a2b667505cf5b397e8506be5dc3e9c Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Tue, 2 Jan 2007 17:08:54 +0000
Subject: [PATCH] [DLM] Fix spin lock already unlocked bug

I just noticed this message when testing some other changes I'd made to
lowcomms (to use workqueues) but the problem seems to be in the current
git trees too. I'm amazed no-one has seen it.

    BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 18b91c6..ce5e7cd 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -709,6 +709,7 @@ void *dlm_lowcomms_get_buffer(int nodeid, int len,
 	if (!con)
 		return NULL;
 
+	spin_lock(&con->writequeue_lock);
 	e = list_entry(con->writequeue.prev, struct writequeue_entry, list);
 	if ((&e->list == &con->writequeue) ||
 	    (PAGE_CACHE_SIZE - e->end < len)) {
@@ -747,6 +748,7 @@ void dlm_lowcomms_commit_buffer(void *mh)
 	struct connection *con = e->con;
 	int users;
 
+	spin_lock(&con->writequeue_lock);
 	users = --e->users;
 	if (users)
 		goto out;
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix ordering of page disposal vs. glock_dq [14/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (12 preceding siblings ...)
  2007-02-05 14:19 ` [DLM] Fix spin lock already unlocked bug [13/54] Steven Whitehouse
@ 2007-02-05 14:20 ` Steven Whitehouse
  2007-02-05 14:21 ` [GFS2] BZ 217008 fsfuzzer fix [15/54] Steven Whitehouse
                   ` (40 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From cb0d445259232c2ec717de827338e31650b7b07b Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 8 Jan 2007 14:31:40 +0000
Subject: [PATCH] [GFS2] Fix ordering of page disposal vs. glock_dq

In case of unlinked files with dirty pages GFS2 wasn't clearing
the pages in quite the right order. This patch clears the pages
earlier (before the glock_dq) to avoid the situation that the
release of the glock results in attempting to write back data that
has already been deallocated.

This fixes Red Hat bugzilla: #220117

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/ops_super.c b/fs/gfs2/ops_super.c
index b283783..c22738c 100644
--- a/fs/gfs2/ops_super.c
+++ b/fs/gfs2/ops_super.c
@@ -429,6 +429,12 @@ static void gfs2_delete_inode(struct inode *inode)
 	}
 
 	error = gfs2_dinode_dealloc(ip);
+	/*
+	 * Must do this before unlock to avoid trying to write back
+	 * potentially dirty data now that inode no longer exists
+	 * on disk.
+	 */
+	truncate_inode_pages(&inode->i_data, 0);
 
 out_unlock:
 	gfs2_glock_dq(&ip->i_iopen_gh);
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] BZ 217008 fsfuzzer fix [15/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (13 preceding siblings ...)
  2007-02-05 14:20 ` [GFS2] Fix ordering of page disposal vs. glock_dq [14/54] Steven Whitehouse
@ 2007-02-05 14:21 ` Steven Whitehouse
  2007-02-05 14:22 ` [GFS2] Fix gfs2_rename deadlock [16/54] Steven Whitehouse
                   ` (39 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Russell Cattelan, cluster-devel

>From 8e0a86bd172405ea03feb6bc81adf6bcc389e8a5 Mon Sep 17 00:00:00 2001
From: Russell Cattelan <cattelan@redhat.com>
Date: Mon, 8 Jan 2007 17:47:51 -0600
Subject: [PATCH] [GFS2] BZ 217008 fsfuzzer fix.

Update the quilt header comments to match the
code changes.

Change gfs2_lookup_simple to return an error in the case
of a NULL inode.
The callers of gfs2_lookup_simple do not check for NULL
in the no entry case and as such would end up dereferencing a NULL ptr.

This fixes:
http://projects.info-pull.com/mokb/MOKB-15-11-2006.html

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 6bc4436..bab338f 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -361,8 +361,18 @@ out:
 struct inode *gfs2_lookup_simple(struct inode *dip, const char *name)
 {
 	struct qstr qstr;
+	struct inode *inode;
 	gfs2_str2qstr(&qstr, name);
-	return gfs2_lookupi(dip, &qstr, 1, NULL);
+	inode = gfs2_lookupi(dip, &qstr, 1, NULL);
+	/* gfs2_lookupi has inconsistent callers: vfs
+	 * related routines expect NULL for no entry found,
+	 * gfs2_lookup_simple callers expect ENOENT
+	 * and do not check for NULL.
+	 */
+	if (inode == NULL)
+		return ERR_PTR(-ENOENT);
+	else
+		return inode;
 }
 
 
-- 
1.4.4.2
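
The NULL-to-ENOENT conversion in the patch above can be tried out in
userspace. This is a sketch only: ERR_PTR(), PTR_ERR() and IS_ERR() are
re-implemented here to mirror the kernel helpers of the same names, and
the table, raw_lookup() and lookup_simple() are hypothetical stand-ins for
gfs2_lookupi() and gfs2_lookup_simple().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct entry {
	const char *name;
	int ino;
};

static const struct entry table[] = {
	{ "jindex", 1 },
	{ "rindex", 2 },
};

/* VFS-style lookup: returns NULL when there is no such entry. */
static const struct entry *raw_lookup(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(table[i].name, name) == 0)
			return &table[i];
	return NULL;
}

/* Wrapper for callers that test IS_ERR() and never check for NULL. */
static const struct entry *lookup_simple(const char *name)
{
	const struct entry *e = raw_lookup(name);

	if (e == NULL)
		return ERR_PTR(-ENOENT);
	return e;
}
```

A caller that only tests IS_ERR() now sees the missing entry as an error
instead of going on to dereference NULL.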




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix gfs2_rename deadlock [16/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (14 preceding siblings ...)
  2007-02-05 14:21 ` [GFS2] BZ 217008 fsfuzzer fix [15/54] Steven Whitehouse
@ 2007-02-05 14:22 ` Steven Whitehouse
  2007-02-05 14:22 ` [DLM] change some log_error to log_debug [17/54] Steven Whitehouse
                   ` (38 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: S. Wendy Cheng, cluster-devel

>From 10728f781249cf02a1e3de8699744c7b48ae544f Mon Sep 17 00:00:00 2001
From: S. Wendy Cheng <wcheng@redhat.com>
Date: Thu, 18 Jan 2007 16:07:03 -0500
Subject: [PATCH] [GFS2] Fix gfs2_rename deadlock

Second round of gfs2_rename lock re-ordering to allow Anaconda to add a
root partition on top of gfs2. Prior to this patch the recursive
lock detector in glock.c can be triggered due to attempting to lock
the rgrp twice. This fixes it by checking to see whether the rgrp
is already locked.

This fixes Red Hat bugzilla #221237

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index bab338f..58c2ce7 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -281,13 +281,13 @@ out:
 }
 
 /**
- * gfs2_change_nlink_i - Change nlink count on inode
+ * gfs2_change_nlink - Change nlink count on inode
  * @ip: The GFS2 inode
  * @diff: The change in the nlink count required
  *
  * Returns: errno
  */
-int gfs2_change_nlink_i(struct gfs2_inode *ip, int diff)
+int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
 {
 	struct buffer_head *dibh;
 	u32 nlink;
@@ -320,40 +320,52 @@ int gfs2_change_nlink_i(struct gfs2_inode *ip, int diff)
 	brelse(dibh);
 	mark_inode_dirty(&ip->i_inode);
 
+	if (ip->i_inode.i_nlink == 0)
+		error = gfs2_change_nlink_i(ip);
+
 	return error;
 }
 
-int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
+int gfs2_change_nlink_i(struct gfs2_inode *ip)
 {
 	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
-	int error;
-
-	/* update the nlink */
-	error = gfs2_change_nlink_i(ip, diff);
-	if (error)
-		return error;
-
-	/* return meta data block back to rg */
-	if (ip->i_inode.i_nlink == 0) {
-		struct gfs2_rgrpd *rgd;
-		struct gfs2_holder ri_gh, rg_gh;
+	struct gfs2_inode *rindex = GFS2_I(sdp->sd_rindex);
+	struct gfs2_glock *ri_gl = rindex->i_gl;
+	struct gfs2_rgrpd *rgd;
+	struct gfs2_holder ri_gh, rg_gh;
+	int existing, error;
 
+	/* if we come from rename path, we could have the lock already */
+	existing = gfs2_glock_is_locked_by_me(ri_gl);
+	if (!existing) {
 		error = gfs2_rindex_hold(sdp, &ri_gh);
 		if (error)
 			goto out;
-		error = -EIO;
-		rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
-		if (!rgd)
-			goto out_norgrp;
+	}
+
+	/* find the matching rgd */
+	error = -EIO;
+	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
+	if (!rgd)
+		goto out_norgrp;
+
+	/*
+	 * Eventually we may want to move rgd(s) to a linked list
+	 * and piggyback the free logic into one of gfs2 daemons
+	 * to gain some performance.
+	 */
+	if (!rgd->rd_gl || !gfs2_glock_is_locked_by_me(rgd->rd_gl)) {
 		error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &rg_gh);
 		if (error)
 			goto out_norgrp;
 
 		gfs2_unlink_di(&ip->i_inode); /* mark inode unlinked */
 		gfs2_glock_dq_uninit(&rg_gh);
+	}
+
 out_norgrp:
+	if (!existing)
 		gfs2_glock_dq_uninit(&ri_gh);
-	}
 out:
 	return error;
 }
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 85c67cb..cee281b 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -40,7 +40,7 @@ int gfs2_inode_refresh(struct gfs2_inode *ip);
 
 int gfs2_dinode_dealloc(struct gfs2_inode *inode);
 int gfs2_change_nlink(struct gfs2_inode *ip, int diff);
-int gfs2_change_nlink_i(struct gfs2_inode *ip, int diff);
+int gfs2_change_nlink_i(struct gfs2_inode *ip);
 struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 			   int is_root, struct nameidata *nd);
 struct inode *gfs2_createi(struct gfs2_holder *ghs, const struct qstr *name,
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 919e894..b2a12f4 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -553,7 +553,6 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 	int alloc_required;
 	unsigned int x;
 	int error;
-	struct gfs2_rgrpd *rgd;
 
 	if (ndentry->d_inode) {
 		nip = GFS2_I(ndentry->d_inode);
@@ -685,12 +684,12 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 		error = gfs2_trans_begin(sdp, sdp->sd_max_dirres +
 					 al->al_rgd->rd_ri.ri_length +
 					 4 * RES_DINODE + 4 * RES_LEAF +
-					 RES_STATFS + RES_QUOTA + 1, 0);
+					 RES_STATFS + RES_QUOTA + 4, 0);
 		if (error)
 			goto out_ipreserv;
 	} else {
 		error = gfs2_trans_begin(sdp, 4 * RES_DINODE +
-					 5 * RES_LEAF + 1, 0);
+					 5 * RES_LEAF + 4, 0);
 		if (error)
 			goto out_gunlock;
 	}
@@ -704,25 +703,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 			error = gfs2_dir_del(ndip, &ndentry->d_name);
 			if (error)
 				goto out_end_trans;
-			error = gfs2_change_nlink_i(nip, -1);
-			if ((!error) && (nip->i_inode.i_nlink == 0)) {
-				error = -EIO;
-				rgd = gfs2_blk2rgrpd(sdp, nip->i_num.no_addr);
-				if (rgd) {
-					struct gfs2_holder nlink_rg_gh;
-					if (rgd != nip->i_alloc.al_rgd)
-						error = gfs2_glock_nq_init(
-						rgd->rd_gl, LM_ST_EXCLUSIVE,
-						0, &nlink_rg_gh);
-					else
-						error = 0;
-                			if (!error) {
-						gfs2_unlink_di(&nip->i_inode);
-						if (rgd != nip->i_alloc.al_rgd)
-							gfs2_glock_dq_uninit(&nlink_rg_gh);
-					}
-				}
-			}
+			error = gfs2_change_nlink(nip, -1);
 		}
 		if (error)
 			goto out_end_trans;
-- 
1.4.4.2
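
The "take the lock only if we do not hold it already" idea, as used for
the rindex lock via the "existing" flag in the patch above, can be
sketched with a toy lock. Everything here is hypothetical userspace code
(struct toy_lock, the integer owner ids, unlink_step()); it is not the
glock API and it ignores contention from other threads.

```c
#include <assert.h>
#include <errno.h>

struct toy_lock {
	int held;
	int owner;
};

static int toy_lock_is_locked_by_me(const struct toy_lock *l, int me)
{
	return l->held && l->owner == me;
}

/* A second acquire of a held lock is refused, as a recursion detector would. */
static int toy_lock_acquire(struct toy_lock *l, int me)
{
	if (l->held)
		return -EDEADLK;
	l->held = 1;
	l->owner = me;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/*
 * Do some work under the lock. If the caller already holds it (as the
 * rename path may), neither acquire nor release it here.
 */
static int unlink_step(struct toy_lock *rg, int me, int *unlinked)
{
	int existing = toy_lock_is_locked_by_me(rg, me);
	int error;

	if (!existing) {
		error = toy_lock_acquire(rg, me);
		if (error)
			return error;
	}

	*unlinked = 1;		/* the protected work */

	if (!existing)
		toy_lock_release(rg);
	return 0;
}
```

The function leaves the lock in the state it found it in: released if it
took the lock itself, still held if the caller owned it on entry.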




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] change some log_error to log_debug [17/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (15 preceding siblings ...)
  2007-02-05 14:22 ` [GFS2] Fix gfs2_rename deadlock [16/54] Steven Whitehouse
@ 2007-02-05 14:22 ` Steven Whitehouse
  2007-02-05 14:23 ` [DLM] rename dlm_config_info fields [18/54] Steven Whitehouse
                   ` (37 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From a9ade49e7bbf836feb4512c2f282faf51c486d78 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Tue, 9 Jan 2007 09:38:39 -0600
Subject: [PATCH] [DLM] change some log_error to log_debug

Some common, non-error messages should use log_debug instead of log_error
so they can be turned off.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c
index 54fba9b..2e246af 100644
--- a/fs/dlm/rcom.c
+++ b/fs/dlm/rcom.c
@@ -435,7 +435,7 @@ static int is_old_reply(struct dlm_ls *ls, struct dlm_rcom *rc)
 		seq = ls->ls_recover_seq;
 		spin_unlock(&ls->ls_recover_lock);
 		if (rc->rc_seq_reply != seq) {
-			log_error(ls, "ignoring old reply %x from %d "
+			log_debug(ls, "ignoring old reply %x from %d "
 				      "seq_reply %llx expect %llx",
 				      rc->rc_type, rc->rc_header.h_nodeid,
 				      (unsigned long long)rc->rc_seq_reply,
@@ -469,7 +469,7 @@ void dlm_receive_rcom(struct dlm_header *hd, int nodeid)
 	}
 
 	if (dlm_recovery_stopped(ls) && (rc->rc_type != DLM_RCOM_STATUS)) {
-		log_error(ls, "ignoring recovery message %x from %d",
+		log_debug(ls, "ignoring recovery message %x from %d",
 			  rc->rc_type, nodeid);
 		goto out;
 	}
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 650536a..3cb636d 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -77,7 +77,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 	error = dlm_recover_members(ls, rv, &neg);
 	if (error) {
-		log_error(ls, "recover_members failed %d", error);
+		log_debug(ls, "recover_members failed %d", error);
 		goto fail;
 	}
 	start = jiffies;
@@ -89,7 +89,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 	error = dlm_recover_directory(ls);
 	if (error) {
-		log_error(ls, "recover_directory failed %d", error);
+		log_debug(ls, "recover_directory failed %d", error);
 		goto fail;
 	}
 
@@ -99,7 +99,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 	error = dlm_recover_directory_wait(ls);
 	if (error) {
-		log_error(ls, "recover_directory_wait failed %d", error);
+		log_debug(ls, "recover_directory_wait failed %d", error);
 		goto fail;
 	}
 
@@ -129,7 +129,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 		error = dlm_recover_masters(ls);
 		if (error) {
-			log_error(ls, "recover_masters failed %d", error);
+			log_debug(ls, "recover_masters failed %d", error);
 			goto fail;
 		}
 
@@ -139,13 +139,13 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 		error = dlm_recover_locks(ls);
 		if (error) {
-			log_error(ls, "recover_locks failed %d", error);
+			log_debug(ls, "recover_locks failed %d", error);
 			goto fail;
 		}
 
 		error = dlm_recover_locks_wait(ls);
 		if (error) {
-			log_error(ls, "recover_locks_wait failed %d", error);
+			log_debug(ls, "recover_locks_wait failed %d", error);
 			goto fail;
 		}
 
@@ -166,7 +166,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 		error = dlm_recover_locks_wait(ls);
 		if (error) {
-			log_error(ls, "recover_locks_wait failed %d", error);
+			log_debug(ls, "recover_locks_wait failed %d", error);
 			goto fail;
 		}
 	}
@@ -184,7 +184,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 	dlm_set_recover_status(ls, DLM_RS_DONE);
 	error = dlm_recover_done_wait(ls);
 	if (error) {
-		log_error(ls, "recover_done_wait failed %d", error);
+		log_debug(ls, "recover_done_wait failed %d", error);
 		goto fail;
 	}
 
@@ -192,19 +192,19 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 
 	error = enable_locking(ls, rv->seq);
 	if (error) {
-		log_error(ls, "enable_locking failed %d", error);
+		log_debug(ls, "enable_locking failed %d", error);
 		goto fail;
 	}
 
 	error = dlm_process_requestqueue(ls);
 	if (error) {
-		log_error(ls, "process_requestqueue failed %d", error);
+		log_debug(ls, "process_requestqueue failed %d", error);
 		goto fail;
 	}
 
 	error = dlm_recover_waiters_post(ls);
 	if (error) {
-		log_error(ls, "recover_waiters_post failed %d", error);
+		log_debug(ls, "recover_waiters_post failed %d", error);
 		goto fail;
 	}
 
-- 
1.4.4.2
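
What "so they can be turned off" buys can be shown with a small userspace
sketch. The macros below only borrow the names log_error and log_debug;
unlike the DLM versions they take no lockspace argument, and the
debug_enabled switch and lines_emitted counter are assumptions made for
the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int debug_enabled;	/* hypothetical runtime switch */
static int lines_emitted;	/* counts messages actually printed */

static void log_emit(const char *level, const char *fmt, ...)
{
	va_list ap;

	assert(level != NULL && fmt != NULL);
	fprintf(stderr, "%s: ", level);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	lines_emitted++;
}

/* Errors are always printed. */
#define log_error(...) log_emit("error", __VA_ARGS__)

/* Debug messages are printed only while the switch is on. */
#define log_debug(...)					\
	do {						\
		if (debug_enabled)			\
			log_emit("debug", __VA_ARGS__);	\
	} while (0)
```

With debug_enabled left at 0, a call such as
log_debug("recover_members failed %d", error) prints nothing, while the
same message through log_error() would always reach the log.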




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] rename dlm_config_info fields [18/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (16 preceding siblings ...)
  2007-02-05 14:22 ` [DLM] change some log_error to log_debug [17/54] Steven Whitehouse
@ 2007-02-05 14:23 ` Steven Whitehouse
  2007-02-05 14:24 ` [DLM] add config entry to enable log_debug [16/54] Steven Whitehouse
                   ` (36 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:23 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 96eb818411ad5df07e221d854a55e833c92bc0d0 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Tue, 9 Jan 2007 09:41:48 -0600
Subject: [PATCH] [DLM] rename dlm_config_info fields

Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch).  No functional changes in this patch, just naming changes.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
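
One way a common field prefix helps macro-generated accessors is token
pasting: the macro receives the bare name, uses it for the function names
and pastes "ci_" in front to reach the struct field. The sketch below is
a guess at that idea in plain C, not the configfs code of the later
patch; CONFIG_ACCESSORS and the generated function names are hypothetical,
and only the scan_secs default of 5 is taken from this patch.

```c
#include <assert.h>

struct config_info {
	int ci_tcp_port;
	int ci_scan_secs;
};

static struct config_info config = {
	.ci_tcp_port = 0,	/* placeholder value for the example */
	.ci_scan_secs = 5,
};

/*
 * Generate a reader and a writer for one field. "name" becomes part of
 * the function names, while ci_##name selects the struct member.
 */
#define CONFIG_ACCESSORS(name)				\
	static int name##_read(void)			\
	{						\
		return config.ci_##name;		\
	}						\
	static void name##_write(int value)		\
	{						\
		assert(value >= 0);			\
		config.ci_##name = value;		\
	}

CONFIG_ACCESSORS(tcp_port)
CONFIG_ACCESSORS(scan_secs)
```

Adding another field then costs one struct member and one
CONFIG_ACCESSORS() line.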

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 8855305..958021f 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -777,13 +777,13 @@ int dlm_our_addr(struct sockaddr_storage *addr, int num)
 #define DEFAULT_SCAN_SECS          5
 
 struct dlm_config_info dlm_config = {
-	.tcp_port = DEFAULT_TCP_PORT,
-	.buffer_size = DEFAULT_BUFFER_SIZE,
-	.rsbtbl_size = DEFAULT_RSBTBL_SIZE,
-	.lkbtbl_size = DEFAULT_LKBTBL_SIZE,
-	.dirtbl_size = DEFAULT_DIRTBL_SIZE,
-	.recover_timer = DEFAULT_RECOVER_TIMER,
-	.toss_secs = DEFAULT_TOSS_SECS,
-	.scan_secs = DEFAULT_SCAN_SECS
+	.ci_tcp_port = DEFAULT_TCP_PORT,
+	.ci_buffer_size = DEFAULT_BUFFER_SIZE,
+	.ci_rsbtbl_size = DEFAULT_RSBTBL_SIZE,
+	.ci_lkbtbl_size = DEFAULT_LKBTBL_SIZE,
+	.ci_dirtbl_size = DEFAULT_DIRTBL_SIZE,
+	.ci_recover_timer = DEFAULT_RECOVER_TIMER,
+	.ci_toss_secs = DEFAULT_TOSS_SECS,
+	.ci_scan_secs = DEFAULT_SCAN_SECS
 };
 
diff --git a/fs/dlm/config.h b/fs/dlm/config.h
index 9da7839..ce603e1 100644
--- a/fs/dlm/config.h
+++ b/fs/dlm/config.h
@@ -17,14 +17,14 @@
 #define DLM_MAX_ADDR_COUNT 3
 
 struct dlm_config_info {
-	int tcp_port;
-	int buffer_size;
-	int rsbtbl_size;
-	int lkbtbl_size;
-	int dirtbl_size;
-	int recover_timer;
-	int toss_secs;
-	int scan_secs;
+	int ci_tcp_port;
+	int ci_buffer_size;
+	int ci_rsbtbl_size;
+	int ci_lkbtbl_size;
+	int ci_dirtbl_size;
+	int ci_recover_timer;
+	int ci_toss_secs;
+	int ci_scan_secs;
 };
 
 extern struct dlm_config_info dlm_config;
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index ed52485..5bac982 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -810,7 +810,7 @@ static int shrink_bucket(struct dlm_ls *ls, int b)
 		list_for_each_entry_reverse(r, &ls->ls_rsbtbl[b].toss,
 					    res_hashchain) {
 			if (!time_after_eq(jiffies, r->res_toss_time +
-					   dlm_config.toss_secs * HZ))
+					   dlm_config.ci_toss_secs * HZ))
 				continue;
 			found = 1;
 			break;
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 59012b0..f40817b 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -236,7 +236,7 @@ static int dlm_scand(void *data)
 	while (!kthread_should_stop()) {
 		list_for_each_entry(ls, &lslist, ls_list)
 			dlm_scan_rsbs(ls);
-		schedule_timeout_interruptible(dlm_config.scan_secs * HZ);
+		schedule_timeout_interruptible(dlm_config.ci_scan_secs * HZ);
 	}
 	return 0;
 }
@@ -422,7 +422,7 @@ static int new_lockspace(char *name, int namelen, void **lockspace,
 	ls->ls_count = 0;
 	ls->ls_flags = 0;
 
-	size = dlm_config.rsbtbl_size;
+	size = dlm_config.ci_rsbtbl_size;
 	ls->ls_rsbtbl_size = size;
 
 	ls->ls_rsbtbl = kmalloc(sizeof(struct dlm_rsbtable) * size, GFP_KERNEL);
@@ -434,7 +434,7 @@ static int new_lockspace(char *name, int namelen, void **lockspace,
 		rwlock_init(&ls->ls_rsbtbl[i].lock);
 	}
 
-	size = dlm_config.lkbtbl_size;
+	size = dlm_config.ci_lkbtbl_size;
 	ls->ls_lkbtbl_size = size;
 
 	ls->ls_lkbtbl = kmalloc(sizeof(struct dlm_lkbtable) * size, GFP_KERNEL);
@@ -446,7 +446,7 @@ static int new_lockspace(char *name, int namelen, void **lockspace,
 		ls->ls_lkbtbl[i].counter = 1;
 	}
 
-	size = dlm_config.dirtbl_size;
+	size = dlm_config.ci_dirtbl_size;
 	ls->ls_dirtbl_size = size;
 
 	ls->ls_dirtbl = kmalloc(sizeof(struct dlm_dirtable) * size, GFP_KERNEL);
@@ -489,7 +489,7 @@ static int new_lockspace(char *name, int namelen, void **lockspace,
 	mutex_init(&ls->ls_requestqueue_mutex);
 	mutex_init(&ls->ls_clear_proc_locks);
 
-	ls->ls_recover_buf = kmalloc(dlm_config.buffer_size, GFP_KERNEL);
+	ls->ls_recover_buf = kmalloc(dlm_config.ci_buffer_size, GFP_KERNEL);
 	if (!ls->ls_recover_buf)
 		goto out_dirfree;
 
diff --git a/fs/dlm/lowcomms-sctp.c b/fs/dlm/lowcomms-sctp.c
index 0940a80..5aeadad 100644
--- a/fs/dlm/lowcomms-sctp.c
+++ b/fs/dlm/lowcomms-sctp.c
@@ -635,7 +635,7 @@ static int add_bind_addr(struct sockaddr_storage *addr, int addr_len, int num)
 
 	if (result < 0)
 		log_print("Can't bind to port %d addr number %d",
-			  dlm_config.tcp_port, num);
+			  dlm_config.ci_tcp_port, num);
 
 	return result;
 }
@@ -711,7 +711,7 @@ static int init_sock(void)
 	/* Bind to all interfaces. */
 	for (i = 0; i < dlm_local_count; i++) {
 		memcpy(&localaddr, dlm_local_addr[i], sizeof(localaddr));
-		make_sockaddr(&localaddr, dlm_config.tcp_port, &addr_len);
+		make_sockaddr(&localaddr, dlm_config.ci_tcp_port, &addr_len);
 
 		result = add_bind_addr(&localaddr, addr_len, num);
 		if (result)
@@ -863,7 +863,7 @@ static void initiate_association(int nodeid)
 		return;
 	}
 
-	make_sockaddr(&rem_addr, dlm_config.tcp_port, &addrlen);
+	make_sockaddr(&rem_addr, dlm_config.ci_tcp_port, &addrlen);
 
 	outmessage.msg_name = &rem_addr;
 	outmessage.msg_namelen = addrlen;
diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index ce5e7cd..b4fb578 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -548,7 +548,7 @@ static void connect_to_sock(struct connection *con)
 	sock->sk->sk_user_data = con;
 	con->rx_action = receive_from_sock;
 
-	make_sockaddr(&saddr, dlm_config.tcp_port, &addr_len);
+	make_sockaddr(&saddr, dlm_config.ci_tcp_port, &addr_len);
 
 	add_sock(sock, con);
 
@@ -616,10 +616,10 @@ static struct socket *create_listen_sock(struct connection *con,
 	con->sock = sock;
 
 	/* Bind to our port */
-	make_sockaddr(saddr, dlm_config.tcp_port, &addr_len);
+	make_sockaddr(saddr, dlm_config.ci_tcp_port, &addr_len);
 	result = sock->ops->bind(sock, (struct sockaddr *) saddr, addr_len);
 	if (result < 0) {
-		printk("dlm: Can't bind to port %d\n", dlm_config.tcp_port);
+		printk("dlm: Can't bind to port %d\n", dlm_config.ci_tcp_port);
 		sock_release(sock);
 		sock = NULL;
 		con->sock = NULL;
@@ -638,7 +638,8 @@ static struct socket *create_listen_sock(struct connection *con,
 
 	result = sock->ops->listen(sock, 5);
 	if (result < 0) {
-		printk("dlm: Can't listen on port %d\n", dlm_config.tcp_port);
+		printk("dlm: Can't listen on port %d\n",
+		       dlm_config.ci_tcp_port);
 		sock_release(sock);
 		sock = NULL;
 		goto create_out;
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index c9b1c3d..a5126e0 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -82,7 +82,7 @@ int dlm_process_incoming_buffer(int nodeid, const void *base,
 		if (msglen < sizeof(struct dlm_header))
 			break;
 		err = -E2BIG;
-		if (msglen > dlm_config.buffer_size) {
+		if (msglen > dlm_config.ci_buffer_size) {
 			log_print("message size %d from %d too big, buf len %d",
 				  msglen, nodeid, len);
 			break;
@@ -103,7 +103,7 @@ int dlm_process_incoming_buffer(int nodeid, const void *base,
 
 		if (msglen > sizeof(__tmp) &&
 		    msg == (struct dlm_header *) __tmp) {
-			msg = kmalloc(dlm_config.buffer_size, GFP_KERNEL);
+			msg = kmalloc(dlm_config.ci_buffer_size, GFP_KERNEL);
 			if (msg == NULL)
 				return ret;
 		}
diff --git a/fs/dlm/rcom.c b/fs/dlm/rcom.c
index 2e246af..6bfbd61 100644
--- a/fs/dlm/rcom.c
+++ b/fs/dlm/rcom.c
@@ -138,7 +138,7 @@ int dlm_rcom_status(struct dlm_ls *ls, int nodeid)
 		goto out;
 
 	allow_sync_reply(ls, &rc->rc_id);
-	memset(ls->ls_recover_buf, 0, dlm_config.buffer_size);
+	memset(ls->ls_recover_buf, 0, dlm_config.ci_buffer_size);
 
 	send_rcom(ls, mh, rc);
 
@@ -213,7 +213,7 @@ int dlm_rcom_names(struct dlm_ls *ls, int nodeid, char *last_name, int last_len)
 	if (nodeid == dlm_our_nodeid()) {
 		dlm_copy_master_names(ls, last_name, last_len,
 		                      ls->ls_recover_buf + len,
-		                      dlm_config.buffer_size - len, nodeid);
+		                      dlm_config.ci_buffer_size - len, nodeid);
 		goto out;
 	}
 
@@ -223,7 +223,7 @@ int dlm_rcom_names(struct dlm_ls *ls, int nodeid, char *last_name, int last_len)
 	memcpy(rc->rc_buf, last_name, last_len);
 
 	allow_sync_reply(ls, &rc->rc_id);
-	memset(ls->ls_recover_buf, 0, dlm_config.buffer_size);
+	memset(ls->ls_recover_buf, 0, dlm_config.ci_buffer_size);
 
 	send_rcom(ls, mh, rc);
 
@@ -241,7 +241,7 @@ static void receive_rcom_names(struct dlm_ls *ls, struct dlm_rcom *rc_in)
 
 	nodeid = rc_in->rc_header.h_nodeid;
 	inlen = rc_in->rc_header.h_length - sizeof(struct dlm_rcom);
-	outlen = dlm_config.buffer_size - sizeof(struct dlm_rcom);
+	outlen = dlm_config.ci_buffer_size - sizeof(struct dlm_rcom);
 
 	error = create_rcom(ls, nodeid, DLM_RCOM_NAMES_REPLY, outlen, &rc, &mh);
 	if (error)
diff --git a/fs/dlm/recover.c b/fs/dlm/recover.c
index cf9f683..a7fa4cb 100644
--- a/fs/dlm/recover.c
+++ b/fs/dlm/recover.c
@@ -44,7 +44,7 @@
 static void dlm_wait_timer_fn(unsigned long data)
 {
 	struct dlm_ls *ls = (struct dlm_ls *) data;
-	mod_timer(&ls->ls_timer, jiffies + (dlm_config.recover_timer * HZ));
+	mod_timer(&ls->ls_timer, jiffies + (dlm_config.ci_recover_timer * HZ));
 	wake_up(&ls->ls_wait_general);
 }
 
@@ -55,7 +55,7 @@ int dlm_wait_function(struct dlm_ls *ls, int (*testfn) (struct dlm_ls *ls))
 	init_timer(&ls->ls_timer);
 	ls->ls_timer.function = dlm_wait_timer_fn;
 	ls->ls_timer.data = (long) ls;
-	ls->ls_timer.expires = jiffies + (dlm_config.recover_timer * HZ);
+	ls->ls_timer.expires = jiffies + (dlm_config.ci_recover_timer * HZ);
 	add_timer(&ls->ls_timer);
 
 	wait_event(ls->ls_wait_general, testfn(ls) || dlm_recovery_stopped(ls));
-- 
1.4.4.2





* [DLM] add config entry to enable log_debug [16/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (17 preceding siblings ...)
  2007-02-05 14:23 ` [DLM] rename dlm_config_info fields [18/54] Steven Whitehouse
@ 2007-02-05 14:24 ` Steven Whitehouse
  2007-02-05 14:25 ` [DLM] expose dlm_config_info fields in configfs [20/54] Steven Whitehouse
                   ` (35 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From a10e13fbd217194280b2e125cab087cfe828a09e Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Tue, 9 Jan 2007 09:44:01 -0600
Subject: [PATCH] [DLM] add config entry to enable log_debug

Add a new dlm_config_info field to enable log_debug output and change
log_debug() to use it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
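
The change described above swaps a compile-time switch for a run-time one.
A minimal userspace sketch of that idea follows. It assumes fprintf() in
place of printk(); config_info, emit() and log_debug_demo() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct dlm_config_info. */
struct config_info {
	int ci_log_debug;	/* 0 = debug output off, non-zero = on */
};

static struct config_info config = { .ci_log_debug = 0 };

/* Counts emitted lines so the effect of the switch can be observed. */
static int lines_emitted;

/* Print one prefixed line to stderr; stands in for printk(). */
static void emit(const char *name, const char *fmt, ...)
{
	va_list ap;

	fprintf(stderr, "dlm: %s: ", name);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	lines_emitted++;
}

/*
 * The flag is tested every time the macro runs, so output can be turned
 * on and off without rebuilding. The do { } while (0) wrapper keeps the
 * macro safe to use as a single statement, e.g. in an unbraced if/else.
 */
#define log_debug(name, ...) \
do { \
	if (config.ci_log_debug) \
		emit((name), __VA_ARGS__); \
} while (0)

/* Returns the number of lines emitted: one call is gated off, one is not. */
static int log_debug_demo(void)
{
	lines_emitted = 0;

	config.ci_log_debug = 0;
	log_debug("ls0", "suppressed %d", 1);

	config.ci_log_debug = 1;
	log_debug("ls0", "printed %d", 2);

	assert(lines_emitted == 1);
	return lines_emitted;
}
```

The cost of the run-time form is one load and branch per call site even
when debugging is off, in exchange for not needing a rebuild.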

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 958021f..7cf2020 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -775,6 +775,7 @@ int dlm_our_addr(struct sockaddr_storage *addr, int num)
 #define DEFAULT_RECOVER_TIMER      5
 #define DEFAULT_TOSS_SECS         10
 #define DEFAULT_SCAN_SECS          5
+#define DEFAULT_LOG_DEBUG          0
 
 struct dlm_config_info dlm_config = {
 	.ci_tcp_port = DEFAULT_TCP_PORT,
@@ -784,6 +785,7 @@ struct dlm_config_info dlm_config = {
 	.ci_dirtbl_size = DEFAULT_DIRTBL_SIZE,
 	.ci_recover_timer = DEFAULT_RECOVER_TIMER,
 	.ci_toss_secs = DEFAULT_TOSS_SECS,
-	.ci_scan_secs = DEFAULT_SCAN_SECS
+	.ci_scan_secs = DEFAULT_SCAN_SECS,
+	.ci_log_debug = DEFAULT_LOG_DEBUG
 };
 
diff --git a/fs/dlm/config.h b/fs/dlm/config.h
index ce603e1..1e97861 100644
--- a/fs/dlm/config.h
+++ b/fs/dlm/config.h
@@ -25,6 +25,7 @@ struct dlm_config_info {
 	int ci_recover_timer;
 	int ci_toss_secs;
 	int ci_scan_secs;
+	int ci_log_debug;
 };
 
 extern struct dlm_config_info dlm_config;
diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index 7185a13..ee993c5 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -41,6 +41,7 @@
 #include <asm/uaccess.h>
 
 #include <linux/dlm.h>
+#include "config.h"
 
 #define DLM_LOCKSPACE_LEN	64
 
@@ -69,12 +70,12 @@ struct dlm_mhandle;
 #define log_error(ls, fmt, args...) \
 	printk(KERN_ERR "dlm: %s: " fmt "\n", (ls)->ls_name , ##args)
 
-#define DLM_LOG_DEBUG
-#ifdef DLM_LOG_DEBUG
-#define log_debug(ls, fmt, args...) log_error(ls, fmt, ##args)
-#else
-#define log_debug(ls, fmt, args...)
-#endif
+#define log_debug(ls, fmt, args...) \
+do { \
+	if (dlm_config.ci_log_debug) \
+		printk(KERN_DEBUG "dlm: %s: " fmt "\n", \
+		       (ls)->ls_name , ##args); \
+} while (0)
 
 #define DLM_ASSERT(x, do) \
 { \
-- 
1.4.4.2





* [DLM] expose dlm_config_info fields in configfs [20/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (18 preceding siblings ...)
  2007-02-05 14:24 ` [DLM] add config entry to enable log_debug [16/54] Steven Whitehouse
@ 2007-02-05 14:25 ` Steven Whitehouse
  2007-02-05 14:26 ` [GFS2] gfs2 knows of directories which it chooses not to display [21/54] Steven Whitehouse
                   ` (34 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 006dc1bd2748f0423cbc6edcd43376ed6717b5ab Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Tue, 9 Jan 2007 09:46:02 -0600
Subject: [PATCH] [DLM] expose dlm_config_info fields in configfs

Make the dlm_config_info values readable and writeable via configfs
entries.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
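
The store path in this patch parses the written text as an unsigned
integer, optionally rejects zero, and then updates both the per-cluster
copy and the global value. A rough userspace sketch of that logic is given
here; it leaves out the CAP_SYS_ADMIN check, and cluster_model,
config_model, model_set() and model_selfcheck() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct cluster and dlm_config. */
struct cluster_model {
	unsigned int cl_buffer_size;
	unsigned int cl_log_debug;
};

struct config_model {
	unsigned int ci_buffer_size;
	unsigned int ci_log_debug;
};

/* Mirrors the shape of cluster_set(): returns len on success, -EINVAL on 0. */
static long model_set(unsigned int *cl_field, unsigned int *info_field,
		      int check_zero, const char *buf, size_t len)
{
	/* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
	unsigned int x = (unsigned int) strtoul(buf, NULL, 0);

	/* Non-numeric input also parses to 0, so it is rejected here too. */
	if (check_zero && !x)
		return -EINVAL;

	*cl_field = x;
	*info_field = x;

	/* Report the whole write as consumed. */
	return (long) len;
}

/* Exercises one accepted and one rejected write; returns the final value. */
static unsigned int model_selfcheck(void)
{
	struct cluster_model cl = { 4096, 0 };
	struct config_model ci = { 4096, 0 };

	assert(model_set(&cl.cl_buffer_size, &ci.ci_buffer_size,
			 1, "8192\n", 5) == 5);
	assert(model_set(&cl.cl_buffer_size, &ci.ci_buffer_size,
			 1, "0\n", 2) == -EINVAL);
	return ci.ci_buffer_size;
}
```

Only log_debug is declared with check_zero set to 0 in the patch, since
zero is its "off" value; every other attribute treats zero as invalid.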

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 7cf2020..8665c88 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -54,6 +54,11 @@ static struct config_item *make_node(struct config_group *, const char *);
 static void drop_node(struct config_group *, struct config_item *);
 static void release_node(struct config_item *);
 
+static ssize_t show_cluster(struct config_item *i, struct configfs_attribute *a,
+			    char *buf);
+static ssize_t store_cluster(struct config_item *i,
+			     struct configfs_attribute *a,
+			     const char *buf, size_t len);
 static ssize_t show_comm(struct config_item *i, struct configfs_attribute *a,
 			 char *buf);
 static ssize_t store_comm(struct config_item *i, struct configfs_attribute *a,
@@ -73,6 +78,101 @@ static ssize_t node_nodeid_write(struct node *nd, const char *buf, size_t len);
 static ssize_t node_weight_read(struct node *nd, char *buf);
 static ssize_t node_weight_write(struct node *nd, const char *buf, size_t len);
 
+struct cluster {
+	struct config_group group;
+	unsigned int cl_tcp_port;
+	unsigned int cl_buffer_size;
+	unsigned int cl_rsbtbl_size;
+	unsigned int cl_lkbtbl_size;
+	unsigned int cl_dirtbl_size;
+	unsigned int cl_recover_timer;
+	unsigned int cl_toss_secs;
+	unsigned int cl_scan_secs;
+	unsigned int cl_log_debug;
+};
+
+enum {
+	CLUSTER_ATTR_TCP_PORT = 0,
+	CLUSTER_ATTR_BUFFER_SIZE,
+	CLUSTER_ATTR_RSBTBL_SIZE,
+	CLUSTER_ATTR_LKBTBL_SIZE,
+	CLUSTER_ATTR_DIRTBL_SIZE,
+	CLUSTER_ATTR_RECOVER_TIMER,
+	CLUSTER_ATTR_TOSS_SECS,
+	CLUSTER_ATTR_SCAN_SECS,
+	CLUSTER_ATTR_LOG_DEBUG,
+};
+
+struct cluster_attribute {
+	struct configfs_attribute attr;
+	ssize_t (*show)(struct cluster *, char *);
+	ssize_t (*store)(struct cluster *, const char *, size_t);
+};
+
+static ssize_t cluster_set(struct cluster *cl, unsigned int *cl_field,
+			   unsigned int *info_field, int check_zero,
+			   const char *buf, size_t len)
+{
+	unsigned int x;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	x = simple_strtoul(buf, NULL, 0);
+
+	if (check_zero && !x)
+		return -EINVAL;
+
+	*cl_field = x;
+	*info_field = x;
+
+	return len;
+}
+
+#define __CONFIGFS_ATTR(_name,_mode,_read,_write) {                           \
+	.attr   = { .ca_name = __stringify(_name),                            \
+		    .ca_mode = _mode,                                         \
+		    .ca_owner = THIS_MODULE },                                \
+	.show   = _read,                                                      \
+	.store  = _write,                                                     \
+}
+
+#define CLUSTER_ATTR(name, check_zero)                                        \
+static ssize_t name##_write(struct cluster *cl, const char *buf, size_t len)  \
+{                                                                             \
+	return cluster_set(cl, &cl->cl_##name, &dlm_config.ci_##name,         \
+			   check_zero, buf, len);                             \
+}                                                                             \
+static ssize_t name##_read(struct cluster *cl, char *buf)                     \
+{                                                                             \
+	return snprintf(buf, PAGE_SIZE, "%u\n", cl->cl_##name);               \
+}                                                                             \
+static struct cluster_attribute cluster_attr_##name =                         \
+__CONFIGFS_ATTR(name, 0644, name##_read, name##_write)
+
+CLUSTER_ATTR(tcp_port, 1);
+CLUSTER_ATTR(buffer_size, 1);
+CLUSTER_ATTR(rsbtbl_size, 1);
+CLUSTER_ATTR(lkbtbl_size, 1);
+CLUSTER_ATTR(dirtbl_size, 1);
+CLUSTER_ATTR(recover_timer, 1);
+CLUSTER_ATTR(toss_secs, 1);
+CLUSTER_ATTR(scan_secs, 1);
+CLUSTER_ATTR(log_debug, 0);
+
+static struct configfs_attribute *cluster_attrs[] = {
+	[CLUSTER_ATTR_TCP_PORT] = &cluster_attr_tcp_port.attr,
+	[CLUSTER_ATTR_BUFFER_SIZE] = &cluster_attr_buffer_size.attr,
+	[CLUSTER_ATTR_RSBTBL_SIZE] = &cluster_attr_rsbtbl_size.attr,
+	[CLUSTER_ATTR_LKBTBL_SIZE] = &cluster_attr_lkbtbl_size.attr,
+	[CLUSTER_ATTR_DIRTBL_SIZE] = &cluster_attr_dirtbl_size.attr,
+	[CLUSTER_ATTR_RECOVER_TIMER] = &cluster_attr_recover_timer.attr,
+	[CLUSTER_ATTR_TOSS_SECS] = &cluster_attr_toss_secs.attr,
+	[CLUSTER_ATTR_SCAN_SECS] = &cluster_attr_scan_secs.attr,
+	[CLUSTER_ATTR_LOG_DEBUG] = &cluster_attr_log_debug.attr,
+	NULL,
+};
+
 enum {
 	COMM_ATTR_NODEID = 0,
 	COMM_ATTR_LOCAL,
@@ -152,10 +252,6 @@ struct clusters {
 	struct configfs_subsystem subsys;
 };
 
-struct cluster {
-	struct config_group group;
-};
-
 struct spaces {
 	struct config_group ss_group;
 };
@@ -197,6 +293,8 @@ static struct configfs_group_operations clusters_ops = {
 
 static struct configfs_item_operations cluster_ops = {
 	.release = release_cluster,
+	.show_attribute = show_cluster,
+	.store_attribute = store_cluster,
 };
 
 static struct configfs_group_operations spaces_ops = {
@@ -237,6 +335,7 @@ static struct config_item_type clusters_type = {
 
 static struct config_item_type cluster_type = {
 	.ct_item_ops = &cluster_ops,
+	.ct_attrs = cluster_attrs,
 	.ct_owner = THIS_MODULE,
 };
 
@@ -317,6 +416,16 @@ static struct config_group *make_cluster(struct config_group *g,
 	cl->group.default_groups[1] = &cms->cs_group;
 	cl->group.default_groups[2] = NULL;
 
+	cl->cl_tcp_port = dlm_config.ci_tcp_port;
+	cl->cl_buffer_size = dlm_config.ci_buffer_size;
+	cl->cl_rsbtbl_size = dlm_config.ci_rsbtbl_size;
+	cl->cl_lkbtbl_size = dlm_config.ci_lkbtbl_size;
+	cl->cl_dirtbl_size = dlm_config.ci_dirtbl_size;
+	cl->cl_recover_timer = dlm_config.ci_recover_timer;
+	cl->cl_toss_secs = dlm_config.ci_toss_secs;
+	cl->cl_scan_secs = dlm_config.ci_scan_secs;
+	cl->cl_log_debug = dlm_config.ci_log_debug;
+
 	space_list = &sps->ss_group;
 	comm_list = &cms->cs_group;
 	return &cl->group;
@@ -509,6 +618,25 @@ void dlm_config_exit(void)
  * Functions for user space to read/write attributes
  */
 
+static ssize_t show_cluster(struct config_item *i, struct configfs_attribute *a,
+			    char *buf)
+{
+	struct cluster *cl = to_cluster(i);
+	struct cluster_attribute *cla =
+			container_of(a, struct cluster_attribute, attr);
+	return cla->show ? cla->show(cl, buf) : 0;
+}
+
+static ssize_t store_cluster(struct config_item *i,
+			     struct configfs_attribute *a,
+			     const char *buf, size_t len)
+{
+	struct cluster *cl = to_cluster(i);
+	struct cluster_attribute *cla =
+		container_of(a, struct cluster_attribute, attr);
+	return cla->store ? cla->store(cl, buf, len) : -EINVAL;
+}
+
 static ssize_t show_comm(struct config_item *i, struct configfs_attribute *a,
 			 char *buf)
 {
-- 
1.4.4.2





* [GFS2] gfs2 knows of directories which it chooses not to display [21/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (19 preceding siblings ...)
  2007-02-05 14:25 ` [DLM] expose dlm_config_info fields in configfs [20/54] Steven Whitehouse
@ 2007-02-05 14:26 ` Steven Whitehouse
  2007-02-05 14:27 ` [GFS2] make gfs2_change_nlink_i() static [22/54] Steven Whitehouse
                   ` (33 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Robert Peterson, cluster-devel

>From 318bb21324745d6cd3575c43d64513809568443f Mon Sep 17 00:00:00 2001
From: Robert Peterson <rpeterso@redhat.com>
Date: Thu, 11 Jan 2007 13:25:00 -0600
Subject: [PATCH] [GFS2] gfs2 knows of directories which it chooses not to display

This is for Red Hat bugzilla bug bz #222302:

Moving a virtual IP from node to node between two NFS-over-GFS2
servers was causing one of the GFS2 servers to become confused and
reference a deleted inode.  The problem was due to vfs dentries that did
not reference the gfs2_dops and therefore didn't call the gfs2 revalidate
code to revalidate a dentry after a directory had been deleted & recreated.
This patch is a crosswrite from a RHEL4 bug found in GFS1 as
bz #190756 and it is against the latest -nmw git tree.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
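
The failure mode described above can be modelled in a few lines: a cached
entry that carries no operations table is never asked whether it is still
valid. This is only a simplified userspace sketch, not the VFS code;
dentry_model, dentry_ops_model and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct dentry_model;

/* Hypothetical stand-in for a dentry operations table. */
struct dentry_ops_model {
	/* Returns 1 if the cached entry is still valid, 0 if it is stale. */
	int (*d_revalidate)(struct dentry_model *d);
};

struct dentry_model {
	const struct dentry_ops_model *d_op;	/* NULL: no hooks installed */
	int stale;				/* set once the directory is gone */
};

static int model_revalidate(struct dentry_model *d)
{
	return !d->stale;
}

static const struct dentry_ops_model model_dops = {
	.d_revalidate = model_revalidate,
};

/* A lookup only consults the filesystem when a revalidate hook exists. */
static int cached_entry_usable(struct dentry_model *d)
{
	if (d->d_op && d->d_op->d_revalidate)
		return d->d_op->d_revalidate(d);
	return 1;	/* no hook: the cached entry is trusted as-is */
}

/* Returns 1 if installing the ops table makes the stale entry detectable. */
static int dentry_demo(void)
{
	struct dentry_model d = { NULL, 1 };
	int before = cached_entry_usable(&d);	/* stale entry slips through */

	d.d_op = &model_dops;			/* what the patch adds */
	return before == 1 && cached_entry_usable(&d) == 0;
}
```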

diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index b4e7b87..6ea979c 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -22,6 +22,7 @@
 #include "glock.h"
 #include "glops.h"
 #include "inode.h"
+#include "ops_dentry.h"
 #include "ops_export.h"
 #include "rgrp.h"
 #include "util.h"
@@ -189,6 +190,7 @@ static struct dentry *gfs2_get_parent(struct dentry *child)
 		return ERR_PTR(-ENOMEM);
 	}
 
+	dentry->d_op = &gfs2_dops;
 	return dentry;
 }
 
@@ -269,6 +271,7 @@ out_inode:
 		return ERR_PTR(-ENOMEM);
 	}
 
+	dentry->d_op = &gfs2_dops;
 	return dentry;
 
 fail_rgd:
-- 
1.4.4.2





* [GFS2] make gfs2_change_nlink_i() static [22/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (20 preceding siblings ...)
  2007-02-05 14:26 ` [GFS2] gfs2 knows of directories which it chooses not to display [21/54] Steven Whitehouse
@ 2007-02-05 14:27 ` Steven Whitehouse
  2007-02-05 14:28 ` [DLM] Use workqueues for dlm lowcomms [23/54] Steven Whitehouse
                   ` (32 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Adrian Bunk, cluster-devel

>From 5c98f86d6ac69688a53e96f294a1b7a8e33c6070 Mon Sep 17 00:00:00 2001
From: Adrian Bunk <bunk@stusta.de>
Date: Sat, 13 Jan 2007 10:56:41 +0100
Subject: [PATCH] [GFS2] make gfs2_change_nlink_i() static

On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc3-mm1:
>...
>  git-gfs2-nmw.patch
>...
>  git trees
>...

This patch makes the needlessly global gfs2_change_nlink_i() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 58c2ce7..2603169 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -280,6 +280,50 @@ out:
 	return error;
 }
 
+static int gfs2_change_nlink_i(struct gfs2_inode *ip)
+{
+	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
+	struct gfs2_inode *rindex = GFS2_I(sdp->sd_rindex);
+	struct gfs2_glock *ri_gl = rindex->i_gl;
+	struct gfs2_rgrpd *rgd;
+	struct gfs2_holder ri_gh, rg_gh;
+	int existing, error;
+
+	/* if we come from rename path, we could have the lock already */
+	existing = gfs2_glock_is_locked_by_me(ri_gl);
+	if (!existing) {
+		error = gfs2_rindex_hold(sdp, &ri_gh);
+		if (error)
+			goto out;
+	}
+
+	/* find the matching rgd */
+	error = -EIO;
+	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
+	if (!rgd)
+		goto out_norgrp;
+
+	/*
+	 * Eventually we may want to move rgd(s) to a linked list
+	 * and piggyback the free logic into one of gfs2 daemons
+	 * to gain some performance.
+	 */
+	if (!rgd->rd_gl || !gfs2_glock_is_locked_by_me(rgd->rd_gl)) {
+		error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &rg_gh);
+		if (error)
+			goto out_norgrp;
+
+		gfs2_unlink_di(&ip->i_inode); /* mark inode unlinked */
+		gfs2_glock_dq_uninit(&rg_gh);
+	}
+
+out_norgrp:
+	if (!existing)
+		gfs2_glock_dq_uninit(&ri_gh);
+out:
+	return error;
+}
+
 /**
  * gfs2_change_nlink - Change nlink count on inode
  * @ip: The GFS2 inode
@@ -326,50 +370,6 @@ int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
 	return error;
 }
 
-int gfs2_change_nlink_i(struct gfs2_inode *ip)
-{
-	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
-	struct gfs2_inode *rindex = GFS2_I(sdp->sd_rindex);
-	struct gfs2_glock *ri_gl = rindex->i_gl;
-	struct gfs2_rgrpd *rgd;
-	struct gfs2_holder ri_gh, rg_gh;
-	int existing, error;
-
-	/* if we come from rename path, we could have the lock already */
-	existing = gfs2_glock_is_locked_by_me(ri_gl);
-	if (!existing) {
-		error = gfs2_rindex_hold(sdp, &ri_gh);
-		if (error)
-			goto out;
-	}
-
-	/* find the matching rgd */
-	error = -EIO;
-	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
-	if (!rgd)
-		goto out_norgrp;
-
-	/*
-	 * Eventually we may want to move rgd(s) to a linked list
-	 * and piggyback the free logic into one of gfs2 daemons
-	 * to gain some performance.
-	 */
-	if (!rgd->rd_gl || !gfs2_glock_is_locked_by_me(rgd->rd_gl)) {
-		error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &rg_gh);
-		if (error)
-			goto out_norgrp;
-
-		gfs2_unlink_di(&ip->i_inode); /* mark inode unlinked */
-		gfs2_glock_dq_uninit(&rg_gh);
-	}
-
-out_norgrp:
-	if (!existing)
-		gfs2_glock_dq_uninit(&ri_gh);
-out:
-	return error;
-}
-
 struct inode *gfs2_lookup_simple(struct inode *dip, const char *name)
 {
 	struct qstr qstr;
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index cee281b..b57f448 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -40,7 +40,6 @@ int gfs2_inode_refresh(struct gfs2_inode *ip);
 
 int gfs2_dinode_dealloc(struct gfs2_inode *inode);
 int gfs2_change_nlink(struct gfs2_inode *ip, int diff);
-int gfs2_change_nlink_i(struct gfs2_inode *ip);
 struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 			   int is_root, struct nameidata *nd);
 struct inode *gfs2_createi(struct gfs2_holder *ghs, const struct qstr *name,
-- 
1.4.4.2





* [DLM] Use workqueues for dlm lowcomms [23/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (21 preceding siblings ...)
  2007-02-05 14:27 ` [GFS2] make gfs2_change_nlink_i() static [22/54] Steven Whitehouse
@ 2007-02-05 14:28 ` Steven Whitehouse
  2007-02-05 14:29 ` [DLM] fix user unlocking [24/54] Steven Whitehouse
                   ` (31 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From 8b3c1aacbf337ec3c9a354c9ad9a86f4d31e72fc Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Mon, 15 Jan 2007 14:33:34 +0000
Subject: [PATCH] [DLM] Use workqueues for dlm lowcomms

This patch converts the DLM TCP lowcomms to use workqueues rather than using its
own daemon functions, simultaneously removing a lot of code and making it more
scalable on multi-processor machines.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
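
The TCP callbacks in this patch queue a work item only when the pending
bit was previously clear, so repeated data-ready events between two runs
of the worker collapse into a single queued item. A minimal userspace
sketch of that pattern is shown here. It assumes C11 atomics in place of
the kernel's bit operations and a counter in place of queue_work();
conn_model and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define CF_READ_PENDING 0x1u	/* a bit mask here, not a bit number */

struct conn_model {
	atomic_uint flags;
	int times_queued;	/* stands in for calls to queue_work() */
};

static void conn_init(struct conn_model *con)
{
	atomic_init(&con->flags, 0u);
	con->times_queued = 0;
}

/* Atomically set the bit and report whether it was already set. */
static int test_and_set_flag(atomic_uint *flags, unsigned int bit)
{
	return (atomic_fetch_or(flags, bit) & bit) != 0;
}

/* Atomically clear the bit and report whether it was set. */
static int test_and_clear_flag(atomic_uint *flags, unsigned int bit)
{
	return (atomic_fetch_and(flags, ~bit) & bit) != 0;
}

/* Socket callback: queue work only on the clear -> set transition. */
static void data_ready(struct conn_model *con)
{
	if (!test_and_set_flag(&con->flags, CF_READ_PENDING))
		con->times_queued++;
}

/* Worker: returns 1 if there was pending work to handle, 0 otherwise. */
static int process_recv(struct conn_model *con)
{
	if (test_and_clear_flag(&con->flags, CF_READ_PENDING))
		return 1;	/* a real worker would drain the socket here */
	return 0;
}

/* Three events before the worker runs should queue exactly one item. */
static int coalesce_demo(void)
{
	struct conn_model con;

	conn_init(&con);
	data_ready(&con);
	data_ready(&con);
	data_ready(&con);
	assert(con.times_queued == 1);
	return con.times_queued;
}
```

Once the worker has cleared the bit, the next event queues work again, so
no wake-up is lost and no per-socket pending list is needed.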

diff --git a/fs/dlm/lowcomms-sctp.c b/fs/dlm/lowcomms-sctp.c
index 5aeadad..dc83a9d 100644
--- a/fs/dlm/lowcomms-sctp.c
+++ b/fs/dlm/lowcomms-sctp.c
@@ -72,6 +72,8 @@ struct nodeinfo {
 	struct list_head	writequeue; /* outgoing writequeue_entries */
 	spinlock_t		writequeue_lock;
 	int			nodeid;
+	struct work_struct      swork; /* Send workqueue */
+	struct work_struct      lwork; /* Locking workqueue */
 };
 
 static DEFINE_IDR(nodeinfo_idr);
@@ -96,6 +98,7 @@ struct connection {
 	atomic_t		waiting_requests;
 	struct cbuf		cb;
 	int                     eagain_flag;
+	struct work_struct      work; /* Send workqueue */
 };
 
 /* An entry waiting to be sent */
@@ -137,19 +140,23 @@ static void cbuf_eat(struct cbuf *cb, int n)
 static LIST_HEAD(write_nodes);
 static DEFINE_SPINLOCK(write_nodes_lock);
 
+
 /* Maximum number of incoming messages to process before
  * doing a schedule()
  */
 #define MAX_RX_MSG_COUNT 25
 
-/* Manage daemons */
-static struct task_struct *recv_task;
-static struct task_struct *send_task;
-static DECLARE_WAIT_QUEUE_HEAD(lowcomms_recv_wait);
+/* Work queues */
+static struct workqueue_struct *recv_workqueue;
+static struct workqueue_struct *send_workqueue;
+static struct workqueue_struct *lock_workqueue;
 
 /* The SCTP connection */
 static struct connection sctp_con;
 
+static void process_send_sockets(struct work_struct *work);
+static void process_recv_sockets(struct work_struct *work);
+static void process_lock_request(struct work_struct *work);
 
 static int nodeid_to_addr(int nodeid, struct sockaddr *retaddr)
 {
@@ -222,6 +229,8 @@ static struct nodeinfo *nodeid2nodeinfo(int nodeid, gfp_t alloc)
 	spin_lock_init(&ni->lock);
 	INIT_LIST_HEAD(&ni->writequeue);
 	spin_lock_init(&ni->writequeue_lock);
+	INIT_WORK(&ni->lwork, process_lock_request);
+	INIT_WORK(&ni->swork, process_send_sockets);
 	ni->nodeid = nodeid;
 
 	if (nodeid > max_nodeid)
@@ -249,11 +258,8 @@ static struct nodeinfo *assoc2nodeinfo(sctp_assoc_t assoc)
 /* Data or notification available on socket */
 static void lowcomms_data_ready(struct sock *sk, int count_unused)
 {
-	atomic_inc(&sctp_con.waiting_requests);
-	if (test_and_set_bit(CF_READ_PENDING, &sctp_con.flags))
-		return;
-
-	wake_up_interruptible(&lowcomms_recv_wait);
+	if (!test_and_set_bit(CF_READ_PENDING, &sctp_con.flags))
+		queue_work(recv_workqueue, &sctp_con.work);
 }
 
 
@@ -361,10 +367,10 @@ static void init_failed(void)
 				spin_lock_bh(&write_nodes_lock);
 				list_add_tail(&ni->write_list, &write_nodes);
 				spin_unlock_bh(&write_nodes_lock);
+				queue_work(send_workqueue, &ni->swork);
 			}
 		}
 	}
-	wake_up_process(send_task);
 }
 
 /* Something happened to an association */
@@ -446,8 +452,8 @@ static void process_sctp_notification(struct msghdr *msg, char *buf)
 				spin_lock_bh(&write_nodes_lock);
 				list_add_tail(&ni->write_list, &write_nodes);
 				spin_unlock_bh(&write_nodes_lock);
+				queue_work(send_workqueue, &ni->swork);
 			}
-			wake_up_process(send_task);
 		}
 		break;
 
@@ -580,8 +586,8 @@ static int receive_from_sock(void)
 				spin_lock_bh(&write_nodes_lock);
 				list_add_tail(&ni->write_list, &write_nodes);
 				spin_unlock_bh(&write_nodes_lock);
+				queue_work(send_workqueue, &ni->swork);
 			}
-			wake_up_process(send_task);
 		}
 	}
 
@@ -590,6 +596,7 @@ static int receive_from_sock(void)
 		return 0;
 
 	cbuf_add(&sctp_con.cb, ret);
+	// PJC: TODO: Add to node's workqueue....can we ??
 	ret = dlm_process_incoming_buffer(cpu_to_le32(sinfo->sinfo_ppid),
 					  page_address(sctp_con.rx_page),
 					  sctp_con.cb.base, sctp_con.cb.len,
@@ -820,7 +827,8 @@ void dlm_lowcomms_commit_buffer(void *arg)
 		spin_lock_bh(&write_nodes_lock);
 		list_add_tail(&ni->write_list, &write_nodes);
 		spin_unlock_bh(&write_nodes_lock);
-		wake_up_process(send_task);
+
+		queue_work(send_workqueue, &ni->swork);
 	}
 	return;
 
@@ -1088,101 +1096,75 @@ int dlm_lowcomms_close(int nodeid)
 	return 0;
 }
 
-static int write_list_empty(void)
+// PJC: The work queue function for receiving.
+static void process_recv_sockets(struct work_struct *work)
 {
-	int status;
+	if (test_and_clear_bit(CF_READ_PENDING, &sctp_con.flags)) {
+		int ret;
+		int count = 0;
 
-	spin_lock_bh(&write_nodes_lock);
-	status = list_empty(&write_nodes);
-	spin_unlock_bh(&write_nodes_lock);
+		do {
+			ret = receive_from_sock();
 
-	return status;
+			/* Don't starve out everyone else */
+			if (++count >= MAX_RX_MSG_COUNT) {
+				cond_resched();
+				count = 0;
+			}
+		} while (!kthread_should_stop() && ret >=0);
+	}
+	cond_resched();
 }
 
-static int dlm_recvd(void *data)
+// PJC: the work queue function for sending
+static void process_send_sockets(struct work_struct *work)
 {
-	DECLARE_WAITQUEUE(wait, current);
-
-	while (!kthread_should_stop()) {
-		int count = 0;
-
-		set_current_state(TASK_INTERRUPTIBLE);
-		add_wait_queue(&lowcomms_recv_wait, &wait);
-		if (!test_bit(CF_READ_PENDING, &sctp_con.flags))
-			schedule();
-		remove_wait_queue(&lowcomms_recv_wait, &wait);
-		set_current_state(TASK_RUNNING);
-
-		if (test_and_clear_bit(CF_READ_PENDING, &sctp_con.flags)) {
-			int ret;
-
-			do {
-				ret = receive_from_sock();
-
-				/* Don't starve out everyone else */
-				if (++count >= MAX_RX_MSG_COUNT) {
-					cond_resched();
-					count = 0;
-				}
-			} while (!kthread_should_stop() && ret >=0);
-		}
-		cond_resched();
+	if (sctp_con.eagain_flag) {
+		sctp_con.eagain_flag = 0;
+		refill_write_queue();
 	}
-
-	return 0;
+	process_output_queue();
 }
 
-static int dlm_sendd(void *data)
+// PJC: Process lock requests from a particular node.
+// TODO: can we optimise this out on UP ??
+static void process_lock_request(struct work_struct *work)
 {
-	DECLARE_WAITQUEUE(wait, current);
-
-	add_wait_queue(sctp_con.sock->sk->sk_sleep, &wait);
-
-	while (!kthread_should_stop()) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		if (write_list_empty())
-			schedule();
-		set_current_state(TASK_RUNNING);
-
-		if (sctp_con.eagain_flag) {
-			sctp_con.eagain_flag = 0;
-			refill_write_queue();
-		}
-		process_output_queue();
-	}
-
-	remove_wait_queue(sctp_con.sock->sk->sk_sleep, &wait);
-
-	return 0;
 }
 
 static void daemons_stop(void)
 {
-	kthread_stop(recv_task);
-	kthread_stop(send_task);
+	destroy_workqueue(recv_workqueue);
+	destroy_workqueue(send_workqueue);
+	destroy_workqueue(lock_workqueue);
 }
 
 static int daemons_start(void)
 {
-	struct task_struct *p;
 	int error;
+	recv_workqueue = create_workqueue("dlm_recv");
+	error = IS_ERR(recv_workqueue);
+	if (error) {
+		log_print("can't start dlm_recv %d", error);
+		return error;
+	}
 
-	p = kthread_run(dlm_recvd, NULL, "dlm_recvd");
-	error = IS_ERR(p);
+	send_workqueue = create_singlethread_workqueue("dlm_send");
+	error = IS_ERR(send_workqueue);
 	if (error) {
-		log_print("can't start dlm_recvd %d", error);
+		log_print("can't start dlm_send %d", error);
+		destroy_workqueue(recv_workqueue);
 		return error;
 	}
-	recv_task = p;
 
-	p = kthread_run(dlm_sendd, NULL, "dlm_sendd");
-	error = IS_ERR(p);
+	lock_workqueue = create_workqueue("dlm_rlock");
+	error = IS_ERR(lock_workqueue);
 	if (error) {
-		log_print("can't start dlm_sendd %d", error);
-		kthread_stop(recv_task);
+		log_print("can't start dlm_rlock %d", error);
+		destroy_workqueue(send_workqueue);
+		destroy_workqueue(recv_workqueue);
 		return error;
 	}
-	send_task = p;
 
 	return 0;
 }
@@ -1194,6 +1176,8 @@ int dlm_lowcomms_start(void)
 {
 	int error;
 
+	INIT_WORK(&sctp_con.work, process_recv_sockets);
+
 	error = init_sock();
 	if (error)
 		goto fail_sock;
@@ -1224,4 +1208,3 @@ void dlm_lowcomms_stop(void)
 	for (i = 0; i < dlm_local_count; i++)
 		kfree(dlm_local_addr[i]);
 }
-
diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index b4fb578..86e5f81 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -115,6 +115,8 @@ struct connection {
 	atomic_t waiting_requests;
 #define MAX_CONNECT_RETRIES 3
 	struct connection *othercon;
+	struct work_struct rwork; /* Receive workqueue */
+	struct work_struct swork; /* Send workqueue */
 };
 #define sock2con(x) ((struct connection *)(x)->sk_user_data)
 
@@ -131,14 +133,9 @@ struct writequeue_entry {
 
 static struct sockaddr_storage dlm_local_addr;
 
-/* Manage daemons */
-static struct task_struct *recv_task;
-static struct task_struct *send_task;
-
-static wait_queue_t lowcomms_send_waitq_head;
-static DECLARE_WAIT_QUEUE_HEAD(lowcomms_send_waitq);
-static wait_queue_t lowcomms_recv_waitq_head;
-static DECLARE_WAIT_QUEUE_HEAD(lowcomms_recv_waitq);
+/* Work queues */
+static struct workqueue_struct *recv_workqueue;
+static struct workqueue_struct *send_workqueue;
 
 /* An array of pointers to connections, indexed by NODEID */
 static struct connection **connections;
@@ -146,17 +143,8 @@ static DECLARE_MUTEX(connections_lock);
 static struct kmem_cache *con_cache;
 static int conn_array_size;
 
-/* List of sockets that have reads pending */
-static LIST_HEAD(read_sockets);
-static DEFINE_SPINLOCK(read_sockets_lock);
-
-/* List of sockets which have writes pending */
-static LIST_HEAD(write_sockets);
-static DEFINE_SPINLOCK(write_sockets_lock);
-
-/* List of sockets which have connects pending */
-static LIST_HEAD(state_sockets);
-static DEFINE_SPINLOCK(state_sockets_lock);
+static void process_recv_sockets(struct work_struct *work);
+static void process_send_sockets(struct work_struct *work);
 
 static struct connection *nodeid2con(int nodeid, gfp_t allocation)
 {
@@ -189,6 +177,8 @@ static struct connection *nodeid2con(int nodeid, gfp_t allocation)
 		init_rwsem(&con->sock_sem);
 		INIT_LIST_HEAD(&con->writequeue);
 		spin_lock_init(&con->writequeue_lock);
+		INIT_WORK(&con->swork, process_send_sockets);
+		INIT_WORK(&con->rwork, process_recv_sockets);
 
 		connections[nodeid] = con;
 	}
@@ -203,41 +193,22 @@ static void lowcomms_data_ready(struct sock *sk, int count_unused)
 {
 	struct connection *con = sock2con(sk);
 
-	atomic_inc(&con->waiting_requests);
-	if (test_and_set_bit(CF_READ_PENDING, &con->flags))
-		return;
-
-	spin_lock_bh(&read_sockets_lock);
-	list_add_tail(&con->read_list, &read_sockets);
-	spin_unlock_bh(&read_sockets_lock);
-
-	wake_up_interruptible(&lowcomms_recv_waitq);
+	if (!test_and_set_bit(CF_READ_PENDING, &con->flags))
+		queue_work(recv_workqueue, &con->rwork);
 }
 
 static void lowcomms_write_space(struct sock *sk)
 {
 	struct connection *con = sock2con(sk);
 
-	if (test_and_set_bit(CF_WRITE_PENDING, &con->flags))
-		return;
-
-	spin_lock_bh(&write_sockets_lock);
-	list_add_tail(&con->write_list, &write_sockets);
-	spin_unlock_bh(&write_sockets_lock);
-
-	wake_up_interruptible(&lowcomms_send_waitq);
+	if (!test_and_set_bit(CF_WRITE_PENDING, &con->flags))
+		queue_work(send_workqueue, &con->swork);
 }
 
 static inline void lowcomms_connect_sock(struct connection *con)
 {
-	if (test_and_set_bit(CF_CONNECT_PENDING, &con->flags))
-		return;
-
-	spin_lock_bh(&state_sockets_lock);
-	list_add_tail(&con->state_list, &state_sockets);
-	spin_unlock_bh(&state_sockets_lock);
-
-	wake_up_interruptible(&lowcomms_send_waitq);
+	if (!test_and_set_bit(CF_CONNECT_PENDING, &con->flags))
+		queue_work(send_workqueue, &con->swork);
 }
 
 static void lowcomms_state_change(struct sock *sk)
@@ -388,7 +359,8 @@ out:
 	return 0;
 
 out_resched:
-	lowcomms_data_ready(con->sock->sk, 0);
+	if (!test_and_set_bit(CF_READ_PENDING, &con->flags))
+		queue_work(recv_workqueue, &con->rwork);
 	up_read(&con->sock_sem);
 	cond_resched();
 	return 0;
@@ -477,6 +449,8 @@ static int accept_from_sock(struct connection *con)
 			othercon->nodeid = nodeid;
 			othercon->rx_action = receive_from_sock;
 			init_rwsem(&othercon->sock_sem);
+			INIT_WORK(&othercon->swork, process_send_sockets);
+			INIT_WORK(&othercon->rwork, process_recv_sockets);
 			set_bit(CF_IS_OTHERCON, &othercon->flags);
 			newcon->othercon = othercon;
 		}
@@ -498,7 +472,8 @@ static int accept_from_sock(struct connection *con)
 	 * beween processing the accept adding the socket
 	 * to the read_sockets list
 	 */
-	lowcomms_data_ready(newsock->sk, 0);
+	if (!test_and_set_bit(CF_READ_PENDING, &newcon->flags))
+		queue_work(recv_workqueue, &newcon->rwork);
 	up_read(&con->sock_sem);
 
 	return 0;
@@ -757,12 +732,8 @@ void dlm_lowcomms_commit_buffer(void *mh)
 	kunmap(e->page);
 	spin_unlock(&con->writequeue_lock);
 
-	if (test_and_set_bit(CF_WRITE_PENDING, &con->flags) == 0) {
-		spin_lock_bh(&write_sockets_lock);
-		list_add_tail(&con->write_list, &write_sockets);
-		spin_unlock_bh(&write_sockets_lock);
-
-		wake_up_interruptible(&lowcomms_send_waitq);
+	if (!test_and_set_bit(CF_WRITE_PENDING, &con->flags)) {
+		queue_work(send_workqueue, &con->swork);
 	}
 	return;
 
@@ -803,6 +774,7 @@ static void send_to_sock(struct connection *con)
 		offset = e->offset;
 		BUG_ON(len == 0 && e->users == 0);
 		spin_unlock(&con->writequeue_lock);
+		kmap(e->page);
 
 		ret = 0;
 		if (len) {
@@ -884,85 +856,29 @@ out:
 }
 
 /* Look for activity on active sockets */
-static void process_sockets(void)
+static void process_recv_sockets(struct work_struct *work)
 {
-	struct list_head *list;
-	struct list_head *temp;
-	int count = 0;
-
-	spin_lock_bh(&read_sockets_lock);
-	list_for_each_safe(list, temp, &read_sockets) {
-
-		struct connection *con =
-			list_entry(list, struct connection, read_list);
-		list_del(&con->read_list);
-		clear_bit(CF_READ_PENDING, &con->flags);
-
-		spin_unlock_bh(&read_sockets_lock);
+	struct connection *con = container_of(work, struct connection, rwork);
+	int err;
 
-		/* This can reach zero if we are processing requests
-		 * as they come in.
-		 */
-		if (atomic_read(&con->waiting_requests) == 0) {
-			spin_lock_bh(&read_sockets_lock);
-			continue;
-		}
-
-		do {
-			con->rx_action(con);
-
-			/* Don't starve out everyone else */
-			if (++count >= MAX_RX_MSG_COUNT) {
-				cond_resched();
-				count = 0;
-			}
-
-		} while (!atomic_dec_and_test(&con->waiting_requests) &&
-			 !kthread_should_stop());
-
-		spin_lock_bh(&read_sockets_lock);
-	}
-	spin_unlock_bh(&read_sockets_lock);
+	clear_bit(CF_READ_PENDING, &con->flags);
+	do {
+		err = con->rx_action(con);
+	} while (!err);
 }
 
-/* Try to send any messages that are pending
- */
-static void process_output_queue(void)
-{
-	struct list_head *list;
-	struct list_head *temp;
-
-	spin_lock_bh(&write_sockets_lock);
-	list_for_each_safe(list, temp, &write_sockets) {
-		struct connection *con =
-			list_entry(list, struct connection, write_list);
-		clear_bit(CF_WRITE_PENDING, &con->flags);
-		list_del(&con->write_list);
-
-		spin_unlock_bh(&write_sockets_lock);
-		send_to_sock(con);
-		spin_lock_bh(&write_sockets_lock);
-	}
-	spin_unlock_bh(&write_sockets_lock);
-}
 
-static void process_state_queue(void)
+static void process_send_sockets(struct work_struct *work)
 {
-	struct list_head *list;
-	struct list_head *temp;
-
-	spin_lock_bh(&state_sockets_lock);
-	list_for_each_safe(list, temp, &state_sockets) {
-		struct connection *con =
-			list_entry(list, struct connection, state_list);
-		list_del(&con->state_list);
-		clear_bit(CF_CONNECT_PENDING, &con->flags);
-		spin_unlock_bh(&state_sockets_lock);
+	struct connection *con = container_of(work, struct connection, swork);
 
+	if (test_and_clear_bit(CF_CONNECT_PENDING, &con->flags)) {
 		connect_to_sock(con);
-		spin_lock_bh(&state_sockets_lock);
 	}
-	spin_unlock_bh(&state_sockets_lock);
+
+	if (test_and_clear_bit(CF_WRITE_PENDING, &con->flags)) {
+		send_to_sock(con);
+	}
 }
 
 
@@ -979,97 +895,29 @@ static void clean_writequeues(void)
 	}
 }
 
-static int read_list_empty(void)
-{
-	int status;
-
-	spin_lock_bh(&read_sockets_lock);
-	status = list_empty(&read_sockets);
-	spin_unlock_bh(&read_sockets_lock);
-
-	return status;
-}
-
-/* DLM Transport comms receive daemon */
-static int dlm_recvd(void *data)
+static void work_stop(void)
 {
-	init_waitqueue_entry(&lowcomms_recv_waitq_head, current);
-	add_wait_queue(&lowcomms_recv_waitq, &lowcomms_recv_waitq_head);
-
-	while (!kthread_should_stop()) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		if (read_list_empty())
-			schedule();
-		set_current_state(TASK_RUNNING);
-
-		process_sockets();
-	}
-
-	return 0;
+	destroy_workqueue(recv_workqueue);
+	destroy_workqueue(send_workqueue);
 }
 
-static int write_and_state_lists_empty(void)
+static int work_start(void)
 {
-	int status;
-
-	spin_lock_bh(&write_sockets_lock);
-	status = list_empty(&write_sockets);
-	spin_unlock_bh(&write_sockets_lock);
-
-	spin_lock_bh(&state_sockets_lock);
-	if (list_empty(&state_sockets) == 0)
-		status = 0;
-	spin_unlock_bh(&state_sockets_lock);
-
-	return status;
-}
-
-/* DLM Transport send daemon */
-static int dlm_sendd(void *data)
-{
-	init_waitqueue_entry(&lowcomms_send_waitq_head, current);
-	add_wait_queue(&lowcomms_send_waitq, &lowcomms_send_waitq_head);
-
-	while (!kthread_should_stop()) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		if (write_and_state_lists_empty())
-			schedule();
-		set_current_state(TASK_RUNNING);
-
-		process_state_queue();
-		process_output_queue();
-	}
-
-	return 0;
-}
-
-static void daemons_stop(void)
-{
-	kthread_stop(recv_task);
-	kthread_stop(send_task);
-}
-
-static int daemons_start(void)
-{
-	struct task_struct *p;
 	int error;
-
-	p = kthread_run(dlm_recvd, NULL, "dlm_recvd");
-	error = IS_ERR(p);
+	recv_workqueue = create_workqueue("dlm_recv");
+	error = IS_ERR(recv_workqueue);
 	if (error) {
-		log_print("can't start dlm_recvd %d", error);
+		log_print("can't start dlm_recv %d", error);
 		return error;
 	}
-	recv_task = p;
 
-	p = kthread_run(dlm_sendd, NULL, "dlm_sendd");
-	error = IS_ERR(p);
+	send_workqueue = create_singlethread_workqueue("dlm_send");
+	error = IS_ERR(send_workqueue);
 	if (error) {
-		log_print("can't start dlm_sendd %d", error);
-		kthread_stop(recv_task);
+		log_print("can't start dlm_send %d", error);
+		destroy_workqueue(recv_workqueue);
 		return error;
 	}
-	send_task = p;
 
 	return 0;
 }
@@ -1086,7 +934,7 @@ void dlm_lowcomms_stop(void)
 			connections[i]->flags |= 0xFF;
 	}
 
-	daemons_stop();
+	work_stop();
 	clean_writequeues();
 
 	for (i = 0; i < conn_array_size; i++) {
@@ -1138,7 +986,7 @@ int dlm_lowcomms_start(void)
 	if (error)
 		goto fail_unlisten;
 
-	error = daemons_start();
+	error = work_start();
 	if (error)
 		goto fail_unlisten;
 
-- 
1.4.4.2
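
For readers skimming the lowcomms diff above, the scheduling pattern it converts to (set a pending bit, and queue the work item only if the bit was previously clear) can be sketched in userspace roughly as follows. This is not the kernel code: `conn_sketch`, `notify_readable` and `run_recv_work` are hypothetical names, a C11 atomic exchange stands in for test_and_set_bit(), and a counter stands in for queue_work().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a connection with one pending flag. */
struct conn_sketch {
	atomic_bool read_pending;	/* models CF_READ_PENDING */
	int times_queued;		/* how often work was actually queued */
	int times_run;			/* how often the handler ran */
};

static void conn_sketch_init(struct conn_sketch *c)
{
	atomic_init(&c->read_pending, false);
	c->times_queued = 0;
	c->times_run = 0;
}

/* Models the data-ready callback: queue only if the flag was clear. */
static bool notify_readable(struct conn_sketch *c)
{
	if (!atomic_exchange(&c->read_pending, true)) {
		c->times_queued++;	/* stands in for queue_work() */
		return true;
	}
	return false;			/* already pending, nothing to queue */
}

/* Models the work handler: clear the flag, then process the socket. */
static void run_recv_work(struct conn_sketch *c)
{
	atomic_store(&c->read_pending, false);
	c->times_run++;			/* stands in for the rx_action loop */
}
```

Repeated notifications that arrive before the handler runs collapse into a single queued item, which is presumably why the separate pending lists and their spinlocks are no longer needed.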





* [DLM] fix user unlocking [24/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (22 preceding siblings ...)
  2007-02-05 14:28 ` [DLM] Use workqueues for dlm lowcomms [23/54] Steven Whitehouse
@ 2007-02-05 14:29 ` Steven Whitehouse
  2007-02-05 14:29 ` [DLM] fix master recovery [25/54] Steven Whitehouse
                   ` (30 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 53c5a8421b306277af764de28a59a8da601c4d99 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Mon, 15 Jan 2007 10:34:52 -0600
Subject: [PATCH] [DLM] fix user unlocking

When a user process exits, we clear all the locks it holds.  There is a
problem, though, with locks that the process had begun unlocking before it
exited.  We couldn't find the lkb's that were in the process of being
unlocked remotely, to flag that they are DEAD.  To solve this, we move
lkb's being unlocked onto a new list in the per-process structure that
tracks what locks the process is holding.  We can then go through this
list to flag the necessary lkb's when clearing locks for a process when it
exits.
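
The idea described above can be sketched in userspace C along these lines. Everything here is a simplified assumption for illustration: `node`, `lkb_sketch`, `proc_sketch` and the helper functions are hypothetical stand-ins for the kernel's list_head, the lkb and the per-process structure, and locking and reference counting are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_empty(const struct node *n) { return n->next == n; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Hypothetical, simplified lock and per-process structures. */
struct lkb_sketch { struct node ownqueue; bool dead; };
struct proc_sketch { struct node locks, unlocking; };

static void proc_init(struct proc_sketch *p)
{
	node_init(&p->locks);
	node_init(&p->unlocking);
}

static void proc_add_lock(struct proc_sketch *p, struct lkb_sketch *l)
{
	node_init(&l->ownqueue);
	l->dead = false;
	node_add(&l->ownqueue, &p->locks);
}

/* Unlock started: keep the lock findable by moving it to "unlocking". */
static void begin_unlock(struct proc_sketch *p, struct lkb_sketch *l)
{
	if (!node_empty(&l->ownqueue)) {
		node_del_init(&l->ownqueue);
		node_add(&l->ownqueue, &p->unlocking);
	}
}

/* Process exit: flag every in-progress unlock as dead; returns the count. */
static int clear_unlocking(struct proc_sketch *p)
{
	int n = 0;

	while (!node_empty(&p->unlocking)) {
		struct node *first = p->unlocking.next;
		struct lkb_sketch *l = (struct lkb_sketch *)
			((char *)first - offsetof(struct lkb_sketch, ownqueue));

		node_del_init(first);
		l->dead = true;
		n++;
	}
	return n;
}
```

The point of the second list is only that an lkb with an unlock in flight stays reachable from the process, so the exit path has something to walk.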

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index ee993c5..61d9320 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -526,6 +526,7 @@ struct dlm_user_proc {
 	spinlock_t		asts_spin;
 	struct list_head	locks;
 	spinlock_t		locks_spin;
+	struct list_head	unlocking;
 	wait_queue_head_t	wait;
 };
 
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 5bac982..6ad2b8e 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3772,12 +3772,10 @@ int dlm_user_unlock(struct dlm_ls *ls, struct dlm_user_args *ua_tmp,
 		goto out_put;
 
 	spin_lock(&ua->proc->locks_spin);
-	list_del_init(&lkb->lkb_ownqueue);
+	/* dlm_user_add_ast() may have already taken lkb off the proc list */
+	if (!list_empty(&lkb->lkb_ownqueue))
+		list_move(&lkb->lkb_ownqueue, &ua->proc->unlocking);
 	spin_unlock(&ua->proc->locks_spin);
-
-	/* this removes the reference for the proc->locks list added by
-	   dlm_user_request */
-	unhold_lkb(lkb);
  out_put:
 	dlm_put_lkb(lkb);
  out:
@@ -3817,9 +3815,8 @@ int dlm_user_cancel(struct dlm_ls *ls, struct dlm_user_args *ua_tmp,
 	/* this lkb was removed from the WAITING queue */
 	if (lkb->lkb_grmode == DLM_LOCK_IV) {
 		spin_lock(&ua->proc->locks_spin);
-		list_del_init(&lkb->lkb_ownqueue);
+		list_move(&lkb->lkb_ownqueue, &ua->proc->unlocking);
 		spin_unlock(&ua->proc->locks_spin);
-		unhold_lkb(lkb);
 	}
  out_put:
 	dlm_put_lkb(lkb);
@@ -3880,11 +3877,6 @@ void dlm_clear_proc_locks(struct dlm_ls *ls, struct dlm_user_proc *proc)
 	mutex_lock(&ls->ls_clear_proc_locks);
 
 	list_for_each_entry_safe(lkb, safe, &proc->locks, lkb_ownqueue) {
-		if (lkb->lkb_ast_type) {
-			list_del(&lkb->lkb_astqueue);
-			unhold_lkb(lkb);
-		}
-
 		list_del_init(&lkb->lkb_ownqueue);
 
 		if (lkb->lkb_exflags & DLM_LKF_PERSISTENT) {
@@ -3901,6 +3893,20 @@ void dlm_clear_proc_locks(struct dlm_ls *ls, struct dlm_user_proc *proc)
 
 		dlm_put_lkb(lkb);
 	}
+
+	/* in-progress unlocks */
+	list_for_each_entry_safe(lkb, safe, &proc->unlocking, lkb_ownqueue) {
+		list_del_init(&lkb->lkb_ownqueue);
+		lkb->lkb_flags |= DLM_IFL_DEAD;
+		dlm_put_lkb(lkb);
+	}
+
+	list_for_each_entry_safe(lkb, safe, &proc->asts, lkb_astqueue) {
+		list_del(&lkb->lkb_astqueue);
+		dlm_put_lkb(lkb);
+	}
+
 	mutex_unlock(&ls->ls_clear_proc_locks);
 	unlock_recovery(ls);
 }
+
diff --git a/fs/dlm/user.c b/fs/dlm/user.c
index c37e93e..d378b7f 100644
--- a/fs/dlm/user.c
+++ b/fs/dlm/user.c
@@ -180,6 +180,14 @@ void dlm_user_add_ast(struct dlm_lkb *lkb, int type)
 	    ua->lksb.sb_status == -EAGAIN && !list_empty(&lkb->lkb_ownqueue))
 		remove_ownqueue = 1;
 
+	/* unlocks or cancels of waiting requests need to be removed from the
+	   proc's unlocking list, again there must be a better way...  */
+
+	if (ua->lksb.sb_status == -DLM_EUNLOCK ||
+	    (ua->lksb.sb_status == -DLM_ECANCEL &&
+	     lkb->lkb_grmode == DLM_LOCK_IV))
+		remove_ownqueue = 1;
+
 	/* We want to copy the lvb to userspace when the completion
 	   ast is read if the status is 0, the lock has an lvb and
 	   lvb_ops says we should.  We could probably have set_lvb_lock()
@@ -523,6 +531,7 @@ static int device_open(struct inode *inode, struct file *file)
 	proc->lockspace = ls->ls_local_handle;
 	INIT_LIST_HEAD(&proc->asts);
 	INIT_LIST_HEAD(&proc->locks);
+	INIT_LIST_HEAD(&proc->unlocking);
 	spin_lock_init(&proc->asts_spin);
 	spin_lock_init(&proc->locks_spin);
 	init_waitqueue_head(&proc->wait);
-- 
1.4.4.2





* [DLM] fix master recovery [25/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (23 preceding siblings ...)
  2007-02-05 14:29 ` [DLM] fix user unlocking [24/54] Steven Whitehouse
@ 2007-02-05 14:29 ` Steven Whitehouse
  2007-02-05 14:30 ` [GFS2] Add writepages for "data=writeback" mounts [26/54] Steven Whitehouse
                   ` (29 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 5581bdbb3858c4df26b88f2afa641b23833cbed1 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Mon, 15 Jan 2007 10:28:22 -0600
Subject: [PATCH] [DLM] fix master recovery

If master recovery happens on an rsb in one recovery sequence, and that
sequence is then aborted before lock recovery happens, then in the next
sequence we rely on the previous master recovery (which may now be
invalid due to another node ignoring a lookup result) and go on to do the
lock recovery, where we get stuck due to an invalid master value.

 recovery cycle begins: master of rsb X has left
 nodes A and B send node C an rcom lookup for X to find the new master
 C gets lookup from B first, sets B as new master, and sends reply back to B
 C gets lookup from A next, and sends reply back to A saying B is master
 A gets lookup reply from C and sets B as the new master in the rsb
 recovery cycle on A, B and C is aborted to start a new recovery
 B gets lookup reply from C and ignores it since there's a new recovery
 recovery cycle begins: some other node has joined
 B doesn't think it's the master of X so it doesn't rebuild it in the directory
 C looks up the master of X, no one is master, so it becomes new master
 B looks up the master of X, finds it's C
 A believes that B is the master of X, so it sends its lock to B
 B sends an error back to A
 A resends
 this repeats forever, the incorrect master value on A is never corrected

The fix is to do master recovery on an rsb that still has the NEW_MASTER
flag set from an earlier recovery sequence, and therefore didn't complete
lock recovery.
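
Reduced to a predicate, the change in condition looks roughly like the sketch below. The struct and function names are hypothetical, the three booleans stand in for is_master(), dlm_is_removed() and the NEW_MASTER rsb flag, and the dlm_no_directory() branch is ignored.

```c
#include <stdbool.h>

/* Hypothetical summary of the rsb state that the check looks at. */
struct rsb_sketch {
	bool is_master;		/* this node is the master of the rsb */
	bool master_removed;	/* recorded master has left the lockspace */
	bool new_master_flag;	/* NEW_MASTER left over from an aborted recovery */
};

/* Condition before the fix. */
static bool needs_master_recovery_old(const struct rsb_sketch *r)
{
	return !r->is_master && r->master_removed;
}

/* Condition after the fix: a stale NEW_MASTER flag also forces a redo. */
static bool needs_master_recovery_new(const struct rsb_sketch *r)
{
	return !r->is_master &&
	       (r->master_removed || r->new_master_flag);
}
```

For node A in the sequence above (not the master, recorded master B still present, NEW_MASTER still set) the old condition is false and the new one is true, so A looks up the master again instead of resending to B forever.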

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/recover.c b/fs/dlm/recover.c
index a7fa4cb..c2cc769 100644
--- a/fs/dlm/recover.c
+++ b/fs/dlm/recover.c
@@ -397,7 +397,9 @@ int dlm_recover_masters(struct dlm_ls *ls)
 
 		if (dlm_no_directory(ls))
 			count += recover_master_static(r);
-		else if (!is_master(r) && dlm_is_removed(ls, r->res_nodeid)) {
+		else if (!is_master(r) &&
+			 (dlm_is_removed(ls, r->res_nodeid) ||
+			  rsb_flag(r, RSB_NEW_MASTER))) {
 			recover_master(r);
 			count++;
 		}
-- 
1.4.4.2





* [GFS2] Add writepages for "data=writeback" mounts [26/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (24 preceding siblings ...)
  2007-02-05 14:29 ` [DLM] fix master recovery [25/54] Steven Whitehouse
@ 2007-02-05 14:30 ` Steven Whitehouse
  2007-02-05 14:31 ` [GFS2] Clean up/speed up readdir [27/54] Steven Whitehouse
                   ` (28 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From dd520e32b5028d05bbe034715c82eaf938d13d0a Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 15 Jan 2007 13:52:17 +0000
Subject: [PATCH] [GFS2] Add writepages for "data=writeback" mounts

It occurred to me that although a gfs2 specific writepages for ordered
writes and journaled data would be tricky, by hooking writepages only
for "data=writeback" mounts we could take advantage of not needing
buffer heads (we don't use them on the read side, nor have we for some
time) and create much larger I/Os for the block layer.

Using blktrace both before and after, it's possible to see that for large
I/Os, most of the requests generated through writepages are now 1024
sectors after this patch is applied as opposed to 8 sectors before.
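
To put those sector counts into bytes and request counts, a small worked example follows. It assumes 512-byte sectors (the usual block layer unit); the helper names and the 1 GiB write used in the note below are illustrative, not measurements from the test above.

```c
#include <stdint.h>

#define SECTOR_BYTES 512u	/* assumption: sectors counted in 512-byte units */

/* Size in bytes of one request covering the given number of sectors. */
static uint64_t request_bytes(uint64_t sectors)
{
	return sectors * SECTOR_BYTES;
}

/* Requests needed to write total_bytes, rounding up to whole requests. */
static uint64_t requests_needed(uint64_t total_bytes,
				uint64_t sectors_per_request)
{
	uint64_t req = request_bytes(sectors_per_request);

	return (total_bytes + req - 1) / req;
}
```

Under these assumptions 8 sectors is 4 KiB per request and 1024 sectors is 512 KiB per request, so a hypothetical 1 GiB write would go from 262144 requests to 2048.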

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c
index 37bfeb9..9ddf975 100644
--- a/fs/gfs2/ops_address.c
+++ b/fs/gfs2/ops_address.c
@@ -16,6 +16,7 @@
 #include <linux/pagevec.h>
 #include <linux/mpage.h>
 #include <linux/fs.h>
+#include <linux/writeback.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/lm_interface.h>
 
@@ -157,6 +158,31 @@ out_ignore:
 }
 
 /**
+ * gfs2_writepages - Write a bunch of dirty pages back to disk
+ * @mapping: The mapping to write
+ * @wbc: Write-back control
+ *
+ * For journaled files and/or ordered writes this just falls back to the
+ * kernel's default writepages path for now. We will probably want to change
+ * that eventually (i.e. when we look at allocate on flush).
+ *
+ * For the data=writeback case though we can already ignore buffer heads
+ * and write whole extents at once. This is a big reduction in the
+ * number of I/O requests we send and the bmap calls we make in this case.
+ */
+int gfs2_writepages(struct address_space *mapping, struct writeback_control *wbc)
+{
+	struct inode *inode = mapping->host;
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+
+	if (sdp->sd_args.ar_data == GFS2_DATA_WRITEBACK && !gfs2_is_jdata(ip))
+		return mpage_writepages(mapping, wbc, gfs2_get_block_noalloc);
+
+	return generic_writepages(mapping, wbc);
+}
+
+/**
  * stuffed_readpage - Fill in a Linux page with stuffed file data
  * @ip: the inode
  * @page: the page
@@ -757,6 +783,7 @@ out:
 
 const struct address_space_operations gfs2_file_aops = {
 	.writepage = gfs2_writepage,
+	.writepages = gfs2_writepages,
 	.readpage = gfs2_readpage,
 	.readpages = gfs2_readpages,
 	.sync_page = block_sync_page,
-- 
1.4.4.2





* [GFS2] Clean up/speed up readdir [27/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (25 preceding siblings ...)
  2007-02-05 14:30 ` [GFS2] Add writepages for "data=writeback" mounts [26/54] Steven Whitehouse
@ 2007-02-05 14:31 ` Steven Whitehouse
  2007-02-05 14:31 ` [GFS2] Remove max_atomic_write tunable [28/54] Steven Whitehouse
                   ` (27 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 61133418ca162e7685eb8f04b063852689963473 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Wed, 17 Jan 2007 15:09:20 +0000
Subject: [PATCH] [GFS2] Clean up/speed up readdir

This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurt performance badly in the
case that the getdents64/readdir call isn't being followed by
stat(), and it wasn't even getting it right all the time when it
was.
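
A rough userspace sketch of the simplified callback shape: the directory reader hands each entry's inode number straight to the caller's callback, with no wrapper in between. All names here (`filldir_sketch_t`, `dent_sketch`, `dir_read_sketch`, `find_name`) are hypothetical, and the callback signature is a cut-down assumption rather than the kernel's filldir_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: returns non-zero to stop the iteration. */
typedef int (*filldir_sketch_t)(void *opaque, const char *name, int length,
				uint64_t ino);

struct dent_sketch { const char *name; uint64_t ino; };

/* Walk the entries, passing each one directly to the caller's callback.
 * Returns the index at which iteration stopped (n if it ran to the end). */
static int dir_read_sketch(const struct dent_sketch *ents, int n,
			   void *opaque, filldir_sketch_t fill)
{
	int i;

	for (i = 0; i < n; i++)
		if (fill(opaque, ents[i].name, (int)strlen(ents[i].name),
			 ents[i].ino))
			break;
	return i;
}

/* Example callback: find the name that belongs to a given inode number. */
struct find_name { uint64_t want; char name[32]; int found; };

static int find_name_fill(void *opaque, const char *name, int length,
			  uint64_t ino)
{
	struct find_name *f = opaque;

	if (ino != f->want)
		return 0;		/* not ours, keep going */
	if (length >= (int)sizeof(f->name))
		length = (int)sizeof(f->name) - 1;
	memcpy(f->name, name, (size_t)length);
	f->name[length] = '\0';
	f->found = 1;
	return 1;			/* stop iterating */
}
```

The lookup callback compares plain inode numbers, much as get_name_filldir() does after this patch.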

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 and 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and thus the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.
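
The timing effect described in the last two paragraphs can be modelled with a small sketch. This is a deliberately simplified assumption, not the real demote_ok() logic: `glock_sketch`, `may_demote` and `ticks_after_eq` are hypothetical names, and the cached-inode case is reduced to a single boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is at or after b" for tick counters. */
static bool ticks_after_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Hypothetical, simplified view of a glock for this illustration. */
struct glock_sketch {
	bool prefetched;	/* lock was taken speculatively */
	bool inode_cached;	/* an in-memory inode still uses this lock */
	uint32_t stamp;		/* tick at which the lock was acquired */
};

/* May the lock be dropped at time "now", given a prefetch window? */
static bool may_demote(const struct glock_sketch *gl, uint32_t now,
		       uint32_t window)
{
	if (gl->inode_cached)
		return false;	/* pinned for as long as the inode is cached */
	if (gl->prefetched)
		return ticks_after_eq(now, gl->stamp + window);
	return true;
}
```

In this model a prefetched lock that nobody uses within the window becomes droppable and has to be fetched again on the next pass, while a lock whose inode has been cached (for instance by an earlier "ls -l") stays in place.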

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 0fdcb77..0eceb05 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1198,12 +1198,11 @@ static int compare_dents(const void *a, const void *b)
  */
 
 static int do_filldir_main(struct gfs2_inode *dip, u64 *offset,
-			   void *opaque, gfs2_filldir_t filldir,
+			   void *opaque, filldir_t filldir,
 			   const struct gfs2_dirent **darr, u32 entries,
 			   int *copied)
 {
 	const struct gfs2_dirent *dent, *dent_next;
-	struct gfs2_inum_host inum;
 	u64 off, off_next;
 	unsigned int x, y;
 	int run = 0;
@@ -1240,11 +1239,9 @@ static int do_filldir_main(struct gfs2_inode *dip, u64 *offset,
 			*offset = off;
 		}
 
-		gfs2_inum_in(&inum, (char *)&dent->de_inum);
-
 		error = filldir(opaque, (const char *)(dent + 1),
 				be16_to_cpu(dent->de_name_len),
-				off, &inum,
+				off, be64_to_cpu(dent->de_inum.no_addr),
 				be16_to_cpu(dent->de_type));
 		if (error)
 			return 1;
@@ -1262,8 +1259,8 @@ static int do_filldir_main(struct gfs2_inode *dip, u64 *offset,
 }
 
 static int gfs2_dir_read_leaf(struct inode *inode, u64 *offset, void *opaque,
-			      gfs2_filldir_t filldir, int *copied,
-			      unsigned *depth, u64 leaf_no)
+			      filldir_t filldir, int *copied, unsigned *depth,
+			      u64 leaf_no)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct buffer_head *bh;
@@ -1343,7 +1340,7 @@ out:
  */
 
 static int dir_e_read(struct inode *inode, u64 *offset, void *opaque,
-		      gfs2_filldir_t filldir)
+		      filldir_t filldir)
 {
 	struct gfs2_inode *dip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
@@ -1402,7 +1399,7 @@ out:
 }
 
 int gfs2_dir_read(struct inode *inode, u64 *offset, void *opaque,
-		  gfs2_filldir_t filldir)
+		  filldir_t filldir)
 {
 	struct gfs2_inode *dip = GFS2_I(inode);
 	struct dirent_gather g;
diff --git a/fs/gfs2/dir.h b/fs/gfs2/dir.h
index b21b336..48fe890 100644
--- a/fs/gfs2/dir.h
+++ b/fs/gfs2/dir.h
@@ -16,30 +16,13 @@ struct inode;
 struct gfs2_inode;
 struct gfs2_inum;
 
-/**
- * gfs2_filldir_t - Report a directory entry to the caller of gfs2_dir_read()
- * @opaque: opaque data used by the function
- * @name: the name of the directory entry
- * @length: the length of the name
- * @offset: the entry's offset in the directory
- * @inum: the inode number the entry points to
- * @type: the type of inode the entry points to
- *
- * Returns: 0 on success, 1 if buffer full
- */
-
-typedef int (*gfs2_filldir_t) (void *opaque,
-			      const char *name, unsigned int length,
-			      u64 offset,
-			      struct gfs2_inum_host *inum, unsigned int type);
-
 int gfs2_dir_search(struct inode *dir, const struct qstr *filename,
 		    struct gfs2_inum_host *inum, unsigned int *type);
 int gfs2_dir_add(struct inode *inode, const struct qstr *filename,
 		 const struct gfs2_inum_host *inum, unsigned int type);
 int gfs2_dir_del(struct gfs2_inode *dip, const struct qstr *filename);
-int gfs2_dir_read(struct inode *inode, u64 * offset, void *opaque,
-		  gfs2_filldir_t filldir);
+int gfs2_dir_read(struct inode *inode, u64 *offset, void *opaque,
+		  filldir_t filldir);
 int gfs2_dir_mvino(struct gfs2_inode *dip, const struct qstr *filename,
 		   struct gfs2_inum_host *new_inum, unsigned int new_type);
 
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 4381469..fb1960b 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -971,8 +971,6 @@ static void drop_bh(struct gfs2_glock *gl, unsigned int ret)
 	const struct gfs2_glock_operations *glops = gl->gl_ops;
 	struct gfs2_holder *gh = gl->gl_req_gh;
 
-	clear_bit(GLF_PREFETCH, &gl->gl_flags);
-
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
 	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
 	gfs2_assert_warn(sdp, !ret);
@@ -1227,8 +1225,6 @@ restart:
 		}
 	}
 
-	clear_bit(GLF_PREFETCH, &gl->gl_flags);
-
 	return error;
 }
 
@@ -1320,36 +1316,6 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 	spin_unlock(&gl->gl_spin);
 }
 
-/**
- * gfs2_glock_prefetch - Try to prefetch a glock
- * @gl: the glock
- * @state: the state to prefetch in
- * @flags: flags passed to go_xmote_th()
- *
- */
-
-static void gfs2_glock_prefetch(struct gfs2_glock *gl, unsigned int state,
-				int flags)
-{
-	const struct gfs2_glock_operations *glops = gl->gl_ops;
-
-	spin_lock(&gl->gl_spin);
-
-	if (test_bit(GLF_LOCK, &gl->gl_flags) || !list_empty(&gl->gl_holders) ||
-	    !list_empty(&gl->gl_waiters1) || !list_empty(&gl->gl_waiters2) ||
-	    !list_empty(&gl->gl_waiters3) ||
-	    relaxed_state_ok(gl->gl_state, state, flags)) {
-		spin_unlock(&gl->gl_spin);
-		return;
-	}
-
-	set_bit(GLF_PREFETCH, &gl->gl_flags);
-	set_bit(GLF_LOCK, &gl->gl_flags);
-	spin_unlock(&gl->gl_spin);
-
-	glops->go_xmote_th(gl, state, flags);
-}
-
 static void greedy_work(struct work_struct *work)
 {
 	struct greedy *gr = container_of(work, struct greedy, gr_work.work);
@@ -1618,34 +1584,6 @@ void gfs2_glock_dq_uninit_m(unsigned int num_gh, struct gfs2_holder *ghs)
 }
 
 /**
- * gfs2_glock_prefetch_num - prefetch a glock based on lock number
- * @sdp: the filesystem
- * @number: the lock number
- * @glops: the glock operations for the type of glock
- * @state: the state to acquire the glock in
- * @flags: modifier flags for the aquisition
- *
- * Returns: errno
- */
-
-void gfs2_glock_prefetch_num(struct gfs2_sbd *sdp, u64 number,
-			     const struct gfs2_glock_operations *glops,
-			     unsigned int state, int flags)
-{
-	struct gfs2_glock *gl;
-	int error;
-
-	if (atomic_read(&sdp->sd_reclaim_count) <
-	    gfs2_tune_get(sdp, gt_reclaim_limit)) {
-		error = gfs2_glock_get(sdp, number, glops, CREATE, &gl);
-		if (!error) {
-			gfs2_glock_prefetch(gl, state, flags);
-			gfs2_glock_put(gl);
-		}
-	}
-}
-
-/**
  * gfs2_lvb_hold - attach a LVB from a glock
  * @gl: The glock in question
  *
@@ -1781,15 +1719,11 @@ void gfs2_glock_cb(void *cb_data, unsigned int type, void *data)
 
 static int demote_ok(struct gfs2_glock *gl)
 {
-	struct gfs2_sbd *sdp = gl->gl_sbd;
 	const struct gfs2_glock_operations *glops = gl->gl_ops;
 	int demote = 1;
 
 	if (test_bit(GLF_STICKY, &gl->gl_flags))
 		demote = 0;
-	else if (test_bit(GLF_PREFETCH, &gl->gl_flags))
-		demote = time_after_eq(jiffies, gl->gl_stamp +
-				    gfs2_tune_get(sdp, gt_prefetch_secs) * HZ);
 	else if (glops->go_demote_ok)
 		demote = glops->go_demote_ok(gl);
 
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index fb39108..bde02a7 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -103,10 +103,6 @@ int gfs2_glock_nq_m(unsigned int num_gh, struct gfs2_holder *ghs);
 void gfs2_glock_dq_m(unsigned int num_gh, struct gfs2_holder *ghs);
 void gfs2_glock_dq_uninit_m(unsigned int num_gh, struct gfs2_holder *ghs);
 
-void gfs2_glock_prefetch_num(struct gfs2_sbd *sdp, u64 number,
-			     const struct gfs2_glock_operations *glops,
-			     unsigned int state, int flags);
-
 /**
  * gfs2_glock_nq_init - intialize a holder and enqueue it on a glock
  * @gl: the glock
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 734421e..8075870 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -147,7 +147,6 @@ struct gfs2_holder {
 enum {
 	GLF_LOCK		= 1,
 	GLF_STICKY		= 2,
-	GLF_PREFETCH		= 3,
 	GLF_DIRTY		= 5,
 	GLF_SKIP_WAITERS2	= 6,
 	GLF_GREEDY		= 7,
@@ -425,7 +424,6 @@ struct gfs2_tune {
 	unsigned int gt_complain_secs;
 	unsigned int gt_reclaim_limit; /* Max num of glocks in reclaim list */
 	unsigned int gt_entries_per_readdir;
-	unsigned int gt_prefetch_secs; /* Usage window for prefetched glocks */
 	unsigned int gt_greedy_default;
 	unsigned int gt_greedy_quantum;
 	unsigned int gt_greedy_max;
diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index 6ea979c..fbf5506 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -113,13 +113,12 @@ struct get_name_filldir {
 	char *name;
 };
 
-static int get_name_filldir(void *opaque, const char *name, unsigned int length,
-			    u64 offset, struct gfs2_inum_host *inum,
-			    unsigned int type)
+static int get_name_filldir(void *opaque, const char *name, int length,
+			    loff_t offset, u64 inum, unsigned int type)
 {
-	struct get_name_filldir *gnfd = (struct get_name_filldir *)opaque;
+	struct get_name_filldir *gnfd = opaque;
 
-	if (!gfs2_inum_equal(inum, &gnfd->inum))
+	if (inum != gnfd->inum.no_addr)
 		return 0;
 
 	memcpy(gnfd->name, name, length);
diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
index faa07e4..c996aa7 100644
--- a/fs/gfs2/ops_file.c
+++ b/fs/gfs2/ops_file.c
@@ -43,15 +43,6 @@
 #include "util.h"
 #include "eaops.h"
 
-/* For regular, non-NFS */
-struct filldir_reg {
-	struct gfs2_sbd *fdr_sbd;
-	int fdr_prefetch;
-
-	filldir_t fdr_filldir;
-	void *fdr_opaque;
-};
-
 /*
  * Most fields left uninitialised to catch anybody who tries to
  * use them. f_flags set to prevent file_accessed() from touching
@@ -128,41 +119,6 @@ static loff_t gfs2_llseek(struct file *file, loff_t offset, int origin)
 }
 
 /**
- * filldir_func - Report a directory entry to the caller of gfs2_dir_read()
- * @opaque: opaque data used by the function
- * @name: the name of the directory entry
- * @length: the length of the name
- * @offset: the entry's offset in the directory
- * @inum: the inode number the entry points to
- * @type: the type of inode the entry points to
- *
- * Returns: 0 on success, 1 if buffer full
- */
-
-static int filldir_func(void *opaque, const char *name, unsigned int length,
-			u64 offset, struct gfs2_inum_host *inum,
-			unsigned int type)
-{
-	struct filldir_reg *fdr = (struct filldir_reg *)opaque;
-	struct gfs2_sbd *sdp = fdr->fdr_sbd;
-	int error;
-
-	error = fdr->fdr_filldir(fdr->fdr_opaque, name, length, offset,
-				 inum->no_addr, type);
-	if (error)
-		return 1;
-
-	if (fdr->fdr_prefetch && !(length == 1 && *name == '.')) {
-		gfs2_glock_prefetch_num(sdp, inum->no_addr, &gfs2_inode_glops,
-				       LM_ST_SHARED, LM_FLAG_TRY | LM_FLAG_ANY);
-		gfs2_glock_prefetch_num(sdp, inum->no_addr, &gfs2_iopen_glops,
-				       LM_ST_SHARED, LM_FLAG_TRY);
-	}
-
-	return 0;
-}
-
-/**
  * gfs2_readdir - Read directory entries from a directory
  * @file: The directory to read from
  * @dirent: Buffer for dirents
@@ -175,16 +131,10 @@ static int gfs2_readdir(struct file *file, void *dirent, filldir_t filldir)
 {
 	struct inode *dir = file->f_mapping->host;
 	struct gfs2_inode *dip = GFS2_I(dir);
-	struct filldir_reg fdr;
 	struct gfs2_holder d_gh;
 	u64 offset = file->f_pos;
 	int error;
 
-	fdr.fdr_sbd = GFS2_SB(dir);
-	fdr.fdr_prefetch = 1;
-	fdr.fdr_filldir = filldir;
-	fdr.fdr_opaque = dirent;
-
 	gfs2_holder_init(dip->i_gl, LM_ST_SHARED, GL_ATIME, &d_gh);
 	error = gfs2_glock_nq_atime(&d_gh);
 	if (error) {
@@ -192,7 +142,7 @@ static int gfs2_readdir(struct file *file, void *dirent, filldir_t filldir)
 		return error;
 	}
 
-	error = gfs2_dir_read(dir, &offset, &fdr, filldir_func);
+	error = gfs2_dir_read(dir, &offset, dirent, filldir);
 
 	gfs2_glock_dq_uninit(&d_gh);
 
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 43a24f2..100852a 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -78,7 +78,6 @@ void gfs2_tune_init(struct gfs2_tune *gt)
 	gt->gt_complain_secs = 10;
 	gt->gt_reclaim_limit = 5000;
 	gt->gt_entries_per_readdir = 32;
-	gt->gt_prefetch_secs = 10;
 	gt->gt_greedy_default = HZ / 10;
 	gt->gt_greedy_quantum = HZ / 40;
 	gt->gt_greedy_max = HZ / 4;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 983eaf1..cd28f08 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -436,7 +436,6 @@ TUNE_ATTR(atime_quantum, 0);
 TUNE_ATTR(max_readahead, 0);
 TUNE_ATTR(complain_secs, 0);
 TUNE_ATTR(reclaim_limit, 0);
-TUNE_ATTR(prefetch_secs, 0);
 TUNE_ATTR(statfs_slow, 0);
 TUNE_ATTR(new_files_jdata, 0);
 TUNE_ATTR(new_files_directio, 0);
@@ -465,7 +464,6 @@ static struct attribute *tune_attrs[] = {
 	&tune_attr_max_readahead.attr,
 	&tune_attr_complain_secs.attr,
 	&tune_attr_reclaim_limit.attr,
-	&tune_attr_prefetch_secs.attr,
 	&tune_attr_statfs_slow.attr,
 	&tune_attr_quota_simul_sync.attr,
 	&tune_attr_quota_cache_secs.attr,
-- 
1.4.4.2





* [GFS2] Remove max_atomic_write tunable [28/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (26 preceding siblings ...)
  2007-02-05 14:31 ` [GFS2] Clean up/speed up readdir [27/54] Steven Whitehouse
@ 2007-02-05 14:31 ` Steven Whitehouse
  2007-02-05 14:32 ` [GFS2] Shrink gfs2_inode memory by half [29/54] Steven Whitehouse
                   ` (26 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

From d3601c594d21569e2963958a8d47df9eb7b55603 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 15 Jan 2007 16:36:26 -0500
Subject: [PATCH] [GFS2] Remove max_atomic_write tunable

This removes an unused sysfs tunable parameter.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 8075870..9114851 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -417,7 +417,6 @@ struct gfs2_tune {
 	unsigned int gt_atime_quantum; /* Min secs between atime updates */
 	unsigned int gt_new_files_jdata;
 	unsigned int gt_new_files_directio;
-	unsigned int gt_max_atomic_write; /* Split big writes into this size */
 	unsigned int gt_max_readahead; /* Max bytes to read-ahead from disk */
 	unsigned int gt_lockdump_size;
 	unsigned int gt_stall_secs; /* Detects trouble! */
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 100852a..3e17dcf 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -71,7 +71,6 @@ void gfs2_tune_init(struct gfs2_tune *gt)
 	gt->gt_atime_quantum = 3600;
 	gt->gt_new_files_jdata = 0;
 	gt->gt_new_files_directio = 0;
-	gt->gt_max_atomic_write = 4 << 20;
 	gt->gt_max_readahead = 1 << 18;
 	gt->gt_lockdump_size = 131072;
 	gt->gt_stall_secs = 600;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index cd28f08..1120611 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -441,7 +441,6 @@ TUNE_ATTR(new_files_jdata, 0);
 TUNE_ATTR(new_files_directio, 0);
 TUNE_ATTR(quota_simul_sync, 1);
 TUNE_ATTR(quota_cache_secs, 1);
-TUNE_ATTR(max_atomic_write, 1);
 TUNE_ATTR(stall_secs, 1);
 TUNE_ATTR(greedy_default, 1);
 TUNE_ATTR(greedy_quantum, 1);
@@ -467,7 +466,6 @@ static struct attribute *tune_attrs[] = {
 	&tune_attr_statfs_slow.attr,
 	&tune_attr_quota_simul_sync.attr,
 	&tune_attr_quota_cache_secs.attr,
-	&tune_attr_max_atomic_write.attr,
 	&tune_attr_stall_secs.attr,
 	&tune_attr_greedy_default.attr,
 	&tune_attr_greedy_quantum.attr,
-- 
1.4.4.2





* [GFS2] Shrink gfs2_inode memory by half [29/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (27 preceding siblings ...)
  2007-02-05 14:31 ` [GFS2] Remove max_atomic_write tunable [28/54] Steven Whitehouse
@ 2007-02-05 14:32 ` Steven Whitehouse
  2007-02-05 14:33 ` [GFS2] Remove the "greedy" function from glock.[ch] [30/54] Steven Whitehouse
                   ` (25 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

From 6be1a543bdaf360ab499e401ca5a0105a9923d75 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Wed, 17 Jan 2007 15:33:23 +0000
Subject: [PATCH] [GFS2] Shrink gfs2_inode memory by half

Here is something I spotted (while looking for something entirely
different) the other day.

Rather than using a completion in each and every struct gfs2_holder,
this removes it in favour of hashed wait queues, thus saving a
considerable amount of memory both on the stack (where a number of
gfs2_holder structures are allocated) and in particular in the
gfs2_inode which has 8 gfs2_holder structures embedded within it.
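
To illustrate the idea of hashed wait queues, here is a minimal
userspace sketch. It is not the kernel's implementation: the table
size, the hash constant and every sketch_* name are assumptions made
for illustration only. The point it shows is that the waiter is found
by hashing the address of the flags word together with the bit number
into a small shared table, so the object itself only has to carry the
bit (HIF_WAIT, bit 10 in this patch) rather than a whole completion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table size; the kernel sizes its own tables differently. */
#define SKETCH_TABLE_BITS 8
#define SKETCH_TABLE_SIZE (1u << SKETCH_TABLE_BITS)

/* Stand-in for a wait queue head; only the shared table holds these. */
struct sketch_waitqueue {
	unsigned int nr_waiters;
};

static struct sketch_waitqueue sketch_wait_table[SKETCH_TABLE_SIZE];

/* Example flags word, playing the role of gh_iflags in a holder. */
static unsigned long sketch_iflags;

/*
 * Map (address of flags word, bit number) to a shared wait queue.
 * Distinct words may collide on one bucket, which is why a real
 * waiter must re-check its own bit after being woken.
 */
static struct sketch_waitqueue *sketch_bit_waitqueue(const void *word,
						     unsigned int bit)
{
	uint64_t key = ((uint64_t)(uintptr_t)word << 6) | (bit & 63u);
	/* Multiplicative hash; keep the top SKETCH_TABLE_BITS bits. */
	uint64_t hash = key * UINT64_C(0x9E3779B97F4A7C15);

	return &sketch_wait_table[hash >> (64 - SKETCH_TABLE_BITS)];
}
```

Calling sketch_bit_waitqueue(&sketch_iflags, 10) twice returns the
same bucket, so a waker and a waiter that agree on the word and the
bit meet on the same queue without any per-holder storage.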

As a result, on x86_64 the gfs2_inode shrinks from 2488 bytes to
1912 bytes, a saving of 576 bytes per inode (no, that's not a typo!).
In practice the result is much better than that: now that a
gfs2_inode is under the 2048 byte barrier, two fit in each 4k slab
page, effectively halving the amount of memory required to store
gfs2_inodes.
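
The arithmetic behind the "halving" can be sketched as below. This is
a simplification which assumes one 4096 byte page per slab and ignores
slab metadata and alignment; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed slab page size for the illustration (4k, as stated above). */
#define SKETCH_PAGE_SIZE 4096u

/* Whole objects of obj_size bytes that fit in one page. */
static unsigned int objs_per_page(size_t obj_size)
{
	assert(obj_size > 0 && obj_size <= SKETCH_PAGE_SIZE);
	return (unsigned int)(SKETCH_PAGE_SIZE / obj_size);
}

/* Page memory effectively consumed by each object, in bytes. */
static unsigned int effective_bytes_per_obj(size_t obj_size)
{
	return SKETCH_PAGE_SIZE / objs_per_page(obj_size);
}
```

With the sizes quoted above, objs_per_page(2488) is 1 and
objs_per_page(1912) is 2, so the effective cost per inode drops from
4096 bytes to 2048 bytes even though the structure itself only shrank
by 576 bytes.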

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index fb1960b..5341e03 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -19,6 +19,7 @@
 #include <linux/gfs2_ondisk.h>
 #include <linux/list.h>
 #include <linux/lm_interface.h>
+#include <linux/wait.h>
 #include <asm/uaccess.h>
 
 #include "gfs2.h"
@@ -395,7 +396,6 @@ void gfs2_holder_init(struct gfs2_glock *gl, unsigned int state, unsigned flags,
 	gh->gh_flags = flags;
 	gh->gh_error = 0;
 	gh->gh_iflags = 0;
-	init_completion(&gh->gh_wait);
 
 	if (gh->gh_state == LM_ST_EXCLUSIVE)
 		gh->gh_flags |= GL_LOCAL_EXCL;
@@ -479,6 +479,29 @@ static void gfs2_holder_put(struct gfs2_holder *gh)
 	kfree(gh);
 }
 
+static void gfs2_holder_dispose_or_wake(struct gfs2_holder *gh)
+{
+	if (test_bit(HIF_DEALLOC, &gh->gh_iflags)) {
+		gfs2_holder_put(gh);
+		return;
+	}
+	clear_bit(HIF_WAIT, &gh->gh_iflags);
+	smp_mb();
+	wake_up_bit(&gh->gh_iflags, HIF_WAIT);
+}
+
+static int holder_wait(void *word)
+{
+        schedule();
+        return 0;
+}
+
+static void wait_on_holder(struct gfs2_holder *gh)
+{
+	might_sleep();
+	wait_on_bit(&gh->gh_iflags, HIF_WAIT, holder_wait, TASK_UNINTERRUPTIBLE);
+}
+
 /**
  * rq_mutex - process a mutex request in the queue
  * @gh: the glock holder
@@ -493,7 +516,9 @@ static int rq_mutex(struct gfs2_holder *gh)
 	list_del_init(&gh->gh_list);
 	/*  gh->gh_error never examined.  */
 	set_bit(GLF_LOCK, &gl->gl_flags);
-	complete(&gh->gh_wait);
+	clear_bit(HIF_WAIT, &gh->gh_iflags);
+	smp_mb();
+	wake_up_bit(&gh->gh_iflags, HIF_WAIT);
 
 	return 1;
 }
@@ -549,7 +574,7 @@ static int rq_promote(struct gfs2_holder *gh)
 	gh->gh_error = 0;
 	set_bit(HIF_HOLDER, &gh->gh_iflags);
 
-	complete(&gh->gh_wait);
+	gfs2_holder_dispose_or_wake(gh);
 
 	return 0;
 }
@@ -573,10 +598,7 @@ static int rq_demote(struct gfs2_holder *gh)
 		list_del_init(&gh->gh_list);
 		gh->gh_error = 0;
 		spin_unlock(&gl->gl_spin);
-		if (test_bit(HIF_DEALLOC, &gh->gh_iflags))
-			gfs2_holder_put(gh);
-		else
-			complete(&gh->gh_wait);
+		gfs2_holder_dispose_or_wake(gh);
 		spin_lock(&gl->gl_spin);
 	} else {
 		gl->gl_req_gh = gh;
@@ -684,6 +706,8 @@ static void gfs2_glmutex_lock(struct gfs2_glock *gl)
 
 	gfs2_holder_init(gl, 0, 0, &gh);
 	set_bit(HIF_MUTEX, &gh.gh_iflags);
+	if (test_and_set_bit(HIF_WAIT, &gh.gh_iflags))
+		BUG();
 
 	spin_lock(&gl->gl_spin);
 	if (test_and_set_bit(GLF_LOCK, &gl->gl_flags)) {
@@ -691,11 +715,13 @@ static void gfs2_glmutex_lock(struct gfs2_glock *gl)
 	} else {
 		gl->gl_owner = current;
 		gl->gl_ip = (unsigned long)__builtin_return_address(0);
-		complete(&gh.gh_wait);
+		clear_bit(HIF_WAIT, &gh.gh_iflags);
+		smp_mb();
+		wake_up_bit(&gh.gh_iflags, HIF_WAIT);
 	}
 	spin_unlock(&gl->gl_spin);
 
-	wait_for_completion(&gh.gh_wait);
+	wait_on_holder(&gh);
 	gfs2_holder_uninit(&gh);
 }
 
@@ -774,6 +800,7 @@ restart:
 			return;
 		set_bit(HIF_DEMOTE, &new_gh->gh_iflags);
 		set_bit(HIF_DEALLOC, &new_gh->gh_iflags);
+		set_bit(HIF_WAIT, &new_gh->gh_iflags);
 
 		goto restart;
 	}
@@ -908,12 +935,8 @@ static void xmote_bh(struct gfs2_glock *gl, unsigned int ret)
 
 	gfs2_glock_put(gl);
 
-	if (gh) {
-		if (test_bit(HIF_DEALLOC, &gh->gh_iflags))
-			gfs2_holder_put(gh);
-		else
-			complete(&gh->gh_wait);
-	}
+	if (gh)
+		gfs2_holder_dispose_or_wake(gh);
 }
 
 /**
@@ -999,12 +1022,8 @@ static void drop_bh(struct gfs2_glock *gl, unsigned int ret)
 
 	gfs2_glock_put(gl);
 
-	if (gh) {
-		if (test_bit(HIF_DEALLOC, &gh->gh_iflags))
-			gfs2_holder_put(gh);
-		else
-			complete(&gh->gh_wait);
-	}
+	if (gh)
+		gfs2_holder_dispose_or_wake(gh);
 }
 
 /**
@@ -1105,8 +1124,7 @@ static int glock_wait_internal(struct gfs2_holder *gh)
 	if (gh->gh_flags & LM_FLAG_PRIORITY)
 		do_cancels(gh);
 
-	wait_for_completion(&gh->gh_wait);
-
+	wait_on_holder(gh);
 	if (gh->gh_error)
 		return gh->gh_error;
 
@@ -1162,6 +1180,8 @@ static void add_to_queue(struct gfs2_holder *gh)
 	struct gfs2_holder *existing;
 
 	BUG_ON(!gh->gh_owner);
+	if (test_and_set_bit(HIF_WAIT, &gh->gh_iflags))
+		BUG();
 
 	existing = find_holder_by_owner(&gl->gl_holders, gh->gh_owner);
 	if (existing) {
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 9114851..a24c4af 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -128,6 +128,7 @@ enum {
 	HIF_HOLDER		= 6,
 	HIF_FIRST		= 7,
 	HIF_ABORTED		= 9,
+	HIF_WAIT		= 10,
 };
 
 struct gfs2_holder {
@@ -140,7 +141,6 @@ struct gfs2_holder {
 
 	int gh_error;
 	unsigned long gh_iflags;
-	struct completion gh_wait;
 	unsigned long gh_ip;
 };
 
-- 
1.4.4.2





* [GFS2] Remove the "greedy" function from glock.[ch] [30/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (28 preceding siblings ...)
  2007-02-05 14:32 ` [GFS2] Shrink gfs2_inode memory by half [29/54] Steven Whitehouse
@ 2007-02-05 14:33 ` Steven Whitehouse
  2007-02-05 14:34 ` [GFS2] Remove unused go_callback operation [31/54] Steven Whitehouse
                   ` (24 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

From 66089b47e0042bc5fc432017e2fd4abb519bec98 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Thu, 18 Jan 2007 17:44:20 +0000
Subject: [PATCH] [GFS2] Remove the "greedy" function from glock.[ch]

The "greedy" code was an attempt to retain glocks for a minimum length
of time when they relate to mmap()ed files. The current implementation
of this feature is not, however, ideal, in that it requires allocating
memory in order to do this and it's overly complicated.

It also misses the mark by ignoring the other I/O operations which are
just as likely to suffer from the same problem. So the plan is to remove
this now and then add the functionality back as part of the glock state
machine at a later date (and thus take into account all the possible
users of this feature).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5341e03..90847e0 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -34,11 +34,6 @@
 #include "super.h"
 #include "util.h"
 
-struct greedy {
-	struct gfs2_holder gr_gh;
-	struct delayed_work gr_work;
-};
-
 struct gfs2_gl_hash_bucket {
         struct hlist_head hb_list;
 };
@@ -618,30 +613,6 @@ static int rq_demote(struct gfs2_holder *gh)
 }
 
 /**
- * rq_greedy - process a queued request to drop greedy status
- * @gh: the glock holder
- *
- * Returns: 1 if the queue is blocked
- */
-
-static int rq_greedy(struct gfs2_holder *gh)
-{
-	struct gfs2_glock *gl = gh->gh_gl;
-
-	list_del_init(&gh->gh_list);
-	/*  gh->gh_error never examined.  */
-	clear_bit(GLF_GREEDY, &gl->gl_flags);
-	spin_unlock(&gl->gl_spin);
-
-	gfs2_holder_uninit(gh);
-	kfree(container_of(gh, struct greedy, gr_gh));
-
-	spin_lock(&gl->gl_spin);
-
-	return 0;
-}
-
-/**
  * run_queue - process holder structures on a glock
  * @gl: the glock
  *
@@ -671,8 +642,6 @@ static void run_queue(struct gfs2_glock *gl)
 
 			if (test_bit(HIF_DEMOTE, &gh->gh_iflags))
 				blocked = rq_demote(gh);
-			else if (test_bit(HIF_GREEDY, &gh->gh_iflags))
-				blocked = rq_greedy(gh);
 			else
 				gfs2_assert_warn(gl->gl_sbd, 0);
 
@@ -1336,68 +1305,6 @@ void gfs2_glock_dq(struct gfs2_holder *gh)
 	spin_unlock(&gl->gl_spin);
 }
 
-static void greedy_work(struct work_struct *work)
-{
-	struct greedy *gr = container_of(work, struct greedy, gr_work.work);
-	struct gfs2_holder *gh = &gr->gr_gh;
-	struct gfs2_glock *gl = gh->gh_gl;
-	const struct gfs2_glock_operations *glops = gl->gl_ops;
-
-	clear_bit(GLF_SKIP_WAITERS2, &gl->gl_flags);
-
-	if (glops->go_greedy)
-		glops->go_greedy(gl);
-
-	spin_lock(&gl->gl_spin);
-
-	if (list_empty(&gl->gl_waiters2)) {
-		clear_bit(GLF_GREEDY, &gl->gl_flags);
-		spin_unlock(&gl->gl_spin);
-		gfs2_holder_uninit(gh);
-		kfree(gr);
-	} else {
-		gfs2_glock_hold(gl);
-		list_add_tail(&gh->gh_list, &gl->gl_waiters2);
-		run_queue(gl);
-		spin_unlock(&gl->gl_spin);
-		gfs2_glock_put(gl);
-	}
-}
-
-/**
- * gfs2_glock_be_greedy -
- * @gl:
- * @time:
- *
- * Returns: 0 if go_greedy will be called, 1 otherwise
- */
-
-int gfs2_glock_be_greedy(struct gfs2_glock *gl, unsigned int time)
-{
-	struct greedy *gr;
-	struct gfs2_holder *gh;
-
-	if (!time || gl->gl_sbd->sd_args.ar_localcaching ||
-	    test_and_set_bit(GLF_GREEDY, &gl->gl_flags))
-		return 1;
-
-	gr = kmalloc(sizeof(struct greedy), GFP_KERNEL);
-	if (!gr) {
-		clear_bit(GLF_GREEDY, &gl->gl_flags);
-		return 1;
-	}
-	gh = &gr->gr_gh;
-
-	gfs2_holder_init(gl, 0, 0, gh);
-	set_bit(HIF_GREEDY, &gh->gh_iflags);
-	INIT_DELAYED_WORK(&gr->gr_work, greedy_work);
-
-	set_bit(GLF_SKIP_WAITERS2, &gl->gl_flags);
-	schedule_delayed_work(&gr->gr_work, time);
-
-	return 0;
-}
-
 /**
  * gfs2_glock_dq_uninit - dequeue a holder from a glock and initialize it
  * @gh: the holder structure
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index bde02a7..ddc56dc 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -92,8 +92,6 @@ int gfs2_glock_poll(struct gfs2_holder *gh);
 int gfs2_glock_wait(struct gfs2_holder *gh);
 void gfs2_glock_dq(struct gfs2_holder *gh);
 
-int gfs2_glock_be_greedy(struct gfs2_glock *gl, unsigned int time);
-
 void gfs2_glock_dq_uninit(struct gfs2_holder *gh);
 int gfs2_glock_nq_num(struct gfs2_sbd *sdp,
 		      u64 number, const struct gfs2_glock_operations *glops,
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index b068d10..e4da26f 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -319,39 +319,6 @@ static void inode_go_unlock(struct gfs2_holder *gh)
 }
 
 /**
- * inode_greedy -
- * @gl: the glock
- *
- */
-
-static void inode_greedy(struct gfs2_glock *gl)
-{
-	struct gfs2_sbd *sdp = gl->gl_sbd;
-	struct gfs2_inode *ip = gl->gl_object;
-	unsigned int quantum = gfs2_tune_get(sdp, gt_greedy_quantum);
-	unsigned int max = gfs2_tune_get(sdp, gt_greedy_max);
-	unsigned int new_time;
-
-	spin_lock(&ip->i_spin);
-
-	if (time_after(ip->i_last_pfault + quantum, jiffies)) {
-		new_time = ip->i_greedy + quantum;
-		if (new_time > max)
-			new_time = max;
-	} else {
-		new_time = ip->i_greedy - quantum;
-		if (!new_time || new_time > max)
-			new_time = 1;
-	}
-
-	ip->i_greedy = new_time;
-
-	spin_unlock(&ip->i_spin);
-
-	iput(&ip->i_inode);
-}
-
-/**
  * rgrp_go_demote_ok - Check to see if it's ok to unlock a RG's glock
  * @gl: the glock
  *
@@ -492,7 +459,6 @@ const struct gfs2_glock_operations gfs2_inode_glops = {
 	.go_demote_ok = inode_go_demote_ok,
 	.go_lock = inode_go_lock,
 	.go_unlock = inode_go_unlock,
-	.go_greedy = inode_greedy,
 	.go_type = LM_TYPE_INODE,
 };
 
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index a24c4af..dc024b1 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -111,7 +111,6 @@ struct gfs2_glock_operations {
 	int (*go_lock) (struct gfs2_holder *gh);
 	void (*go_unlock) (struct gfs2_holder *gh);
 	void (*go_callback) (struct gfs2_glock *gl, unsigned int state);
-	void (*go_greedy) (struct gfs2_glock *gl);
 	const int go_type;
 };
 
@@ -120,7 +119,6 @@ enum {
 	HIF_MUTEX		= 0,
 	HIF_PROMOTE		= 1,
 	HIF_DEMOTE		= 2,
-	HIF_GREEDY		= 3,
 
 	/* States */
 	HIF_ALLOCED		= 4,
@@ -149,7 +147,6 @@ enum {
 	GLF_STICKY		= 2,
 	GLF_DIRTY		= 5,
 	GLF_SKIP_WAITERS2	= 6,
-	GLF_GREEDY		= 7,
 };
 
 struct gfs2_glock {
@@ -166,7 +163,7 @@ struct gfs2_glock {
 	unsigned long gl_ip;
 	struct list_head gl_holders;
 	struct list_head gl_waiters1;	/* HIF_MUTEX */
-	struct list_head gl_waiters2;	/* HIF_DEMOTE, HIF_GREEDY */
+	struct list_head gl_waiters2;	/* HIF_DEMOTE */
 	struct list_head gl_waiters3;	/* HIF_PROMOTE */
 
 	const struct gfs2_glock_operations *gl_ops;
@@ -235,7 +232,6 @@ struct gfs2_inode {
 
 	spinlock_t i_spin;
 	struct rw_semaphore i_rw_mutex;
-	unsigned int i_greedy;
 	unsigned long i_last_pfault;
 
 	struct buffer_head *i_cache[GFS2_MAX_META_HEIGHT];
@@ -423,9 +419,6 @@ struct gfs2_tune {
 	unsigned int gt_complain_secs;
 	unsigned int gt_reclaim_limit; /* Max num of glocks in reclaim list */
 	unsigned int gt_entries_per_readdir;
-	unsigned int gt_greedy_default;
-	unsigned int gt_greedy_quantum;
-	unsigned int gt_greedy_max;
 	unsigned int gt_statfs_quantum;
 	unsigned int gt_statfs_slow;
 };
diff --git a/fs/gfs2/ops_super.c b/fs/gfs2/ops_super.c
index c22738c..47369d0 100644
--- a/fs/gfs2/ops_super.c
+++ b/fs/gfs2/ops_super.c
@@ -452,14 +452,12 @@ out:
 
 static struct inode *gfs2_alloc_inode(struct super_block *sb)
 {
-	struct gfs2_sbd *sdp = sb->s_fs_info;
 	struct gfs2_inode *ip;
 
 	ip = kmem_cache_alloc(gfs2_inode_cachep, GFP_KERNEL);
 	if (ip) {
 		ip->i_flags = 0;
 		ip->i_gl = NULL;
-		ip->i_greedy = gfs2_tune_get(sdp, gt_greedy_default);
 		ip->i_last_pfault = jiffies;
 	}
 	return &ip->i_inode;
diff --git a/fs/gfs2/ops_vm.c b/fs/gfs2/ops_vm.c
index 45a5f11..14b380f 100644
--- a/fs/gfs2/ops_vm.c
+++ b/fs/gfs2/ops_vm.c
@@ -28,34 +28,13 @@
 #include "trans.h"
 #include "util.h"
 
-static void pfault_be_greedy(struct gfs2_inode *ip)
-{
-	unsigned int time;
-
-	spin_lock(&ip->i_spin);
-	time = ip->i_greedy;
-	ip->i_last_pfault = jiffies;
-	spin_unlock(&ip->i_spin);
-
-	igrab(&ip->i_inode);
-	if (gfs2_glock_be_greedy(ip->i_gl, time))
-		iput(&ip->i_inode);
-}
-
 static struct page *gfs2_private_nopage(struct vm_area_struct *area,
 					unsigned long address, int *type)
 {
 	struct gfs2_inode *ip = GFS2_I(area->vm_file->f_mapping->host);
-	struct page *result;
 
 	set_bit(GIF_PAGED, &ip->i_flags);
-
-	result = filemap_nopage(area, address, type);
-
-	if (result && result != NOPAGE_OOM)
-		pfault_be_greedy(ip);
-
-	return result;
+	return filemap_nopage(area, address, type);
 }
 
 static int alloc_page_backing(struct gfs2_inode *ip, struct page *page)
@@ -167,7 +146,6 @@ static struct page *gfs2_sharewrite_nopage(struct vm_area_struct *area,
 		set_page_dirty(result);
 	}
 
-	pfault_be_greedy(ip);
 out:
 	gfs2_glock_dq_uninit(&i_gh);
 
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 3e17dcf..ce5353a 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -77,9 +77,6 @@ void gfs2_tune_init(struct gfs2_tune *gt)
 	gt->gt_complain_secs = 10;
 	gt->gt_reclaim_limit = 5000;
 	gt->gt_entries_per_readdir = 32;
-	gt->gt_greedy_default = HZ / 10;
-	gt->gt_greedy_quantum = HZ / 40;
-	gt->gt_greedy_max = HZ / 4;
 	gt->gt_statfs_quantum = 30;
 	gt->gt_statfs_slow = 0;
 }
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 1120611..d01f9f0 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -442,9 +442,6 @@ TUNE_ATTR(new_files_directio, 0);
 TUNE_ATTR(quota_simul_sync, 1);
 TUNE_ATTR(quota_cache_secs, 1);
 TUNE_ATTR(stall_secs, 1);
-TUNE_ATTR(greedy_default, 1);
-TUNE_ATTR(greedy_quantum, 1);
-TUNE_ATTR(greedy_max, 1);
 TUNE_ATTR(statfs_quantum, 1);
 TUNE_ATTR_DAEMON(scand_secs, scand_process);
 TUNE_ATTR_DAEMON(recoverd_secs, recoverd_process);
@@ -467,9 +464,6 @@ static struct attribute *tune_attrs[] = {
 	&tune_attr_quota_simul_sync.attr,
 	&tune_attr_quota_cache_secs.attr,
 	&tune_attr_stall_secs.attr,
-	&tune_attr_greedy_default.attr,
-	&tune_attr_greedy_quantum.attr,
-	&tune_attr_greedy_max.attr,
 	&tune_attr_statfs_quantum.attr,
 	&tune_attr_scand_secs.attr,
 	&tune_attr_recoverd_secs.attr,
-- 
1.4.4.2





* [GFS2] Remove unused go_callback operation [31/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (29 preceding siblings ...)
  2007-02-05 14:33 ` [GFS2] Remove the "greedy" function from glock.[ch] [30/54] Steven Whitehouse
@ 2007-02-05 14:34 ` Steven Whitehouse
  2007-02-05 14:34 ` [GFS2] Remove local exclusive glock mode [32/54] Steven Whitehouse
                   ` (23 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

From bb11b4f7b47bf5f2e293895926bd889cc188bf2e Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Fri, 19 Jan 2007 13:57:36 -0500
Subject: [PATCH] [GFS2] Remove unused go_callback operation

This is never used, so we might as well remove it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 90847e0..8e4b55a 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1568,8 +1568,6 @@ static void blocking_cb(struct gfs2_sbd *sdp, struct lm_lockname *name,
 	if (!gl)
 		return;
 
-	if (gl->gl_ops->go_callback)
-		gl->gl_ops->go_callback(gl, state);
 	handle_callback(gl, state);
 
 	spin_lock(&gl->gl_spin);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index dc024b1..1acbcc2 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -110,7 +110,6 @@ struct gfs2_glock_operations {
 	int (*go_demote_ok) (struct gfs2_glock *gl);
 	int (*go_lock) (struct gfs2_holder *gh);
 	void (*go_unlock) (struct gfs2_holder *gh);
-	void (*go_callback) (struct gfs2_glock *gl, unsigned int state);
 	const int go_type;
 };
 
-- 
1.4.4.2





* [GFS2] Remove local exclusive glock mode [32/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (30 preceding siblings ...)
  2007-02-05 14:34 ` [GFS2] Remove unused go_callback operation [31/54] Steven Whitehouse
@ 2007-02-05 14:34 ` Steven Whitehouse
  2007-02-05 14:35 ` [DLM] lowcomms tidy [33/54] Steven Whitehouse
                   ` (22 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

From aadcc809aad4d4bb8915384a3725097ee3e47405 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 22 Jan 2007 12:10:39 -0500
Subject: [PATCH] [GFS2] Remove local exclusive glock mode

Here is a patch for GFS2 to remove the local exclusive flag. In
the places it was used, mutexes are always held earlier in the
call path, so it appears redundant in the LM_ST_SHARED case.

Also, the GFS2 holders were setting local exclusive in any case where
the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock
code where the flag was tested have been replaced with tests for the
lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the
same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well
as globally exclusive).
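
The equivalence argued above can be sketched in isolation as follows.
This is only an illustration: the flag value is the one removed from
glock.h by this patch, but the state enum and the function names are
hypothetical stand-ins, not the real LM_ST_* definitions.

```c
#include <assert.h>

/* Value of GL_LOCAL_EXCL as removed from glock.h by this patch. */
#define SKETCH_GL_LOCAL_EXCL 0x00000020u

/* Hypothetical stand-ins for the lock states; real values not shown here. */
enum sketch_lm_state { SKETCH_ST_SHARED, SKETCH_ST_EXCLUSIVE };

/*
 * Old behaviour: holder init ORs the flag in for exclusive requests,
 * and later code tests the flag.
 */
static int old_is_local_excl(enum sketch_lm_state state, unsigned int flags)
{
	if (state == SKETCH_ST_EXCLUSIVE)
		flags |= SKETCH_GL_LOCAL_EXCL;
	return (flags & SKETCH_GL_LOCAL_EXCL) != 0;
}

/* New behaviour: later code tests the requested state directly. */
static int new_is_local_excl(enum sketch_lm_state state)
{
	return state == SKETCH_ST_EXCLUSIVE;
}
```

The two predicates agree whenever the caller does not pass the flag
explicitly. They differ only for a shared request that asked for the
flag by hand, which is the case the mutexes held earlier in the call
path are argued to cover.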

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 8e4b55a..1345c3d 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -391,10 +391,6 @@ void gfs2_holder_init(struct gfs2_glock *gl, unsigned int state, unsigned flags,
 	gh->gh_flags = flags;
 	gh->gh_error = 0;
 	gh->gh_iflags = 0;
-
-	if (gh->gh_state == LM_ST_EXCLUSIVE)
-		gh->gh_flags |= GL_LOCAL_EXCL;
-
 	gfs2_glock_hold(gl);
 }
 
@@ -412,9 +408,6 @@ void gfs2_holder_reinit(unsigned int state, unsigned flags, struct gfs2_holder *
 {
 	gh->gh_state = state;
 	gh->gh_flags = flags;
-	if (gh->gh_state == LM_ST_EXCLUSIVE)
-		gh->gh_flags |= GL_LOCAL_EXCL;
-
 	gh->gh_iflags &= 1 << HIF_ALLOCED;
 	gh->gh_ip = (unsigned long)__builtin_return_address(0);
 }
@@ -557,11 +550,11 @@ static int rq_promote(struct gfs2_holder *gh)
 		set_bit(GLF_LOCK, &gl->gl_flags);
 	} else {
 		struct gfs2_holder *next_gh;
-		if (gh->gh_flags & GL_LOCAL_EXCL)
+		if (gh->gh_state == LM_ST_EXCLUSIVE)
 			return 1;
 		next_gh = list_entry(gl->gl_holders.next, struct gfs2_holder,
 				     gh_list);
-		if (next_gh->gh_flags & GL_LOCAL_EXCL)
+		if (next_gh->gh_state == LM_ST_EXCLUSIVE)
 			 return 1;
 	}
 
@@ -1363,10 +1356,7 @@ static int glock_compare(const void *arg_a, const void *arg_b)
 		return 1;
 	if (a->ln_number < b->ln_number)
 		return -1;
-	if (gh_a->gh_state == LM_ST_SHARED && gh_b->gh_state == LM_ST_EXCLUSIVE)
-		return 1;
-	if (!(gh_a->gh_flags & GL_LOCAL_EXCL) && (gh_b->gh_flags & GL_LOCAL_EXCL))
-		return 1;
+	BUG_ON(gh_a->gh_gl->gl_ops->go_type == gh_b->gh_gl->gl_ops->go_type);
 	return 0;
 }
 
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index ddc56dc..1eaeacd 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -20,7 +20,6 @@
 #define LM_FLAG_ANY		0x00000008
 #define LM_FLAG_PRIORITY	0x00000010 */
 
-#define GL_LOCAL_EXCL		0x00000020
 #define GL_ASYNC		0x00000040
 #define GL_EXACT		0x00000080
 #define GL_SKIP			0x00000100
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index e4da26f..dda6858 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -295,7 +295,7 @@ static int inode_go_lock(struct gfs2_holder *gh)
 
 	if ((ip->i_di.di_flags & GFS2_DIF_TRUNC_IN_PROG) &&
 	    (gl->gl_state == LM_ST_EXCLUSIVE) &&
-	    (gh->gh_flags & GL_LOCAL_EXCL))
+	    (gh->gh_state == LM_ST_EXCLUSIVE))
 		error = gfs2_truncatei_resume(ip);
 
 	return error;
diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index fbf5506..4855e8c 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -216,8 +216,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb, void *inum_obj)
 	}
 
 	error = gfs2_glock_nq_num(sdp, inum->no_addr, &gfs2_inode_glops,
-				  LM_ST_SHARED, LM_FLAG_ANY | GL_LOCAL_EXCL,
-				  &i_gh);
+				  LM_ST_SHARED, LM_FLAG_ANY, &i_gh);
 	if (error)
 		return ERR_PTR(error);
 
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index ce5353a..70f424f 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -354,8 +354,7 @@ int gfs2_jindex_hold(struct gfs2_sbd *sdp, struct gfs2_holder *ji_gh)
 	mutex_lock(&sdp->sd_jindex_mutex);
 
 	for (;;) {
-		error = gfs2_glock_nq_init(dip->i_gl, LM_ST_SHARED,
-					   GL_LOCAL_EXCL, ji_gh);
+		error = gfs2_glock_nq_init(dip->i_gl, LM_ST_SHARED, 0, ji_gh);
 		if (error)
 			break;
 
@@ -524,8 +523,7 @@ int gfs2_make_fs_rw(struct gfs2_sbd *sdp)
 	struct gfs2_log_header_host head;
 	int error;
 
-	error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED,
-				   GL_LOCAL_EXCL, &t_gh);
+	error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED, 0, &t_gh);
 	if (error)
 		return error;
 
@@ -578,9 +576,8 @@ int gfs2_make_fs_ro(struct gfs2_sbd *sdp)
 	gfs2_quota_sync(sdp);
 	gfs2_statfs_sync(sdp);
 
-	error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED,
-				GL_LOCAL_EXCL | GL_NOCACHE,
-				&t_gh);
+	error = gfs2_glock_nq_init(sdp->sd_trans_gl, LM_ST_SHARED, GL_NOCACHE,
+				   &t_gh);
 	if (error && !test_bit(SDF_SHUTDOWN, &sdp->sd_flags))
 		return error;
 
-- 
1.4.4.2





* [DLM] lowcomms tidy [33/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (31 preceding siblings ...)
  2007-02-05 14:34 ` [GFS2] Remove local exclusive glock mode [32/54] Steven Whitehouse
@ 2007-02-05 14:35 ` Steven Whitehouse
  2007-02-05 14:35 ` [GFS2] Tidy up glops calls [34/54] Steven Whitehouse
                   ` (21 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

From f9e63b1a6a1af856ce6f82710368924386353196 Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Mon, 22 Jan 2007 14:50:10 +0000
Subject: [PATCH] [DLM] lowcomms tidy

This patch removes some redundant fields from the connection structure and adds
some lockdep annotation to remove spurious warnings.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 86e5f81..6e27201 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -97,9 +97,6 @@ struct connection {
 	struct socket *sock;	/* NULL if not connected */
 	uint32_t nodeid;	/* So we know who we are in the list */
 	struct rw_semaphore sock_sem; /* Stop connect races */
-	struct list_head read_list;   /* On this list when ready for reading */
-	struct list_head write_list;  /* On this list when ready for writing */
-	struct list_head state_list;  /* On this list when ready to connect */
 	unsigned long flags;	/* bit 1,2 = We are on the read/write lists */
 #define CF_READ_PENDING 1
 #define CF_WRITE_PENDING 2
@@ -391,7 +388,7 @@ static int accept_from_sock(struct connection *con)
 	if (result < 0)
 		return -ENOMEM;
 
-	down_read(&con->sock_sem);
+	down_read_nested(&con->sock_sem, 0);
 
 	result = -ENOTCONN;
 	if (con->sock == NULL)
@@ -434,7 +431,7 @@ static int accept_from_sock(struct connection *con)
 		result = -ENOMEM;
 		goto accept_err;
 	}
-	down_write(&newcon->sock_sem);
+	down_write_nested(&newcon->sock_sem, 1);
 	if (newcon->sock) {
 		struct connection *othercon = newcon->othercon;
 
-- 
1.4.4.2





* [GFS2] Tidy up glops calls [34/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (32 preceding siblings ...)
  2007-02-05 14:35 ` [DLM] lowcomms tidy [33/54] Steven Whitehouse
@ 2007-02-05 14:35 ` Steven Whitehouse
  2007-02-05 14:36 ` [DLM] fix lowcomms receiving [35/54] Steven Whitehouse
                   ` (20 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From dfa37e755f86caee54447efb24ffc4b3bcf4d837 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 22 Jan 2007 12:15:34 -0500
Subject: [PATCH] [GFS2] Tidy up glops calls

This patch doesn't make any changes to the ordering of the various
operations related to glocking, but it does tidy up the calls to the
glops.c functions to make the structure more obvious.

The two functions gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be
made static within glock.c since they are used for every set of glock
operations. The go_xmote_th and go_drop_th glock operations then become
optional: each is called, if it exists, from the corresponding one of
those two functions in glock.c.

The go_sync operation is also no longer needed, since it can easily be
replaced by calls made from the go_xmote_th and go_drop_th operations
respectively. As a result, glops.c no longer (confusingly) calls back
into routines in glock.c, and the glock operations shrink by one member.
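
As a rough user-space illustration of the calling convention described
above (not the kernel code itself), the following sketch shows a common
routine calling an optional per-type hook only when one is set. The names
struct glock, struct glock_ops, count_hook, xmote_th and drop_th are
simplified, hypothetical stand-ins rather than the GFS2 definitions.

```c
#include <assert.h>
#include <stddef.h>

struct glock;

/* Hypothetical, cut-down operations table: both hooks are optional. */
struct glock_ops {
	void (*go_xmote_th)(struct glock *gl);	/* may be NULL */
	void (*go_drop_th)(struct glock *gl);	/* may be NULL */
};

struct glock {
	const struct glock_ops *ops;
	int hook_calls;	/* times a type-specific hook ran */
	int requests;	/* times the common code ran */
};

/* Example hook: stands in for type-specific work such as syncing. */
void count_hook(struct glock *gl)
{
	gl->hook_calls++;
}

/* A lock type that needs extra work in both paths. */
const struct glock_ops ops_with_hooks = {
	.go_xmote_th = count_hook,
	.go_drop_th = count_hook,
};

/* A lock type that needs no extra work: both hooks left unset. */
const struct glock_ops ops_without_hooks = {
	.go_xmote_th = NULL,
	.go_drop_th = NULL,
};

/* Common code for a state change: run the hook if present, then go on. */
void xmote_th(struct glock *gl)
{
	if (gl->ops->go_xmote_th)
		gl->ops->go_xmote_th(gl);
	gl->requests++;
}

/* Common code for dropping a lock, following the same pattern. */
void drop_th(struct glock *gl)
{
	if (gl->ops->go_drop_th)
		gl->ops->go_drop_th(gl);
	gl->requests++;
}
```

In this arrangement a lock type with nothing extra to do simply leaves
both pointers unset, and the hooks never need to call back into the
common code.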

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 1345c3d..5b772bb 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -43,6 +43,8 @@ typedef void (*glock_examiner) (struct gfs2_glock * gl);
 static int gfs2_dump_lockstate(struct gfs2_sbd *sdp);
 static int dump_glock(struct gfs2_glock *gl);
 static int dump_inode(struct gfs2_inode *ip);
+static void gfs2_glock_xmote_th(struct gfs2_holder *gh);
+static void gfs2_glock_drop_th(struct gfs2_glock *gl);
 
 #define GFS2_GL_HASH_SHIFT      15
 #define GFS2_GL_HASH_SIZE       (1 << GFS2_GL_HASH_SHIFT)
@@ -524,7 +526,6 @@ static int rq_promote(struct gfs2_holder *gh)
 {
 	struct gfs2_glock *gl = gh->gh_gl;
 	struct gfs2_sbd *sdp = gl->gl_sbd;
-	const struct gfs2_glock_operations *glops = gl->gl_ops;
 
 	if (!relaxed_state_ok(gl->gl_state, gh->gh_state, gh->gh_flags)) {
 		if (list_empty(&gl->gl_holders)) {
@@ -539,7 +540,7 @@ static int rq_promote(struct gfs2_holder *gh)
 				gfs2_reclaim_glock(sdp);
 			}
 
-			glops->go_xmote_th(gl, gh->gh_state, gh->gh_flags);
+			gfs2_glock_xmote_th(gh);
 			spin_lock(&gl->gl_spin);
 		}
 		return 1;
@@ -577,7 +578,6 @@ static int rq_promote(struct gfs2_holder *gh)
 static int rq_demote(struct gfs2_holder *gh)
 {
 	struct gfs2_glock *gl = gh->gh_gl;
-	const struct gfs2_glock_operations *glops = gl->gl_ops;
 
 	if (!list_empty(&gl->gl_holders))
 		return 1;
@@ -595,9 +595,9 @@ static int rq_demote(struct gfs2_holder *gh)
 
 		if (gh->gh_state == LM_ST_UNLOCKED ||
 		    gl->gl_state != LM_ST_EXCLUSIVE)
-			glops->go_drop_th(gl);
+			gfs2_glock_drop_th(gl);
 		else
-			glops->go_xmote_th(gl, gh->gh_state, gh->gh_flags);
+			gfs2_glock_xmote_th(gh);
 
 		spin_lock(&gl->gl_spin);
 	}
@@ -909,23 +909,26 @@ static void xmote_bh(struct gfs2_glock *gl, unsigned int ret)
  *
  */
 
-void gfs2_glock_xmote_th(struct gfs2_glock *gl, unsigned int state, int flags)
+void gfs2_glock_xmote_th(struct gfs2_holder *gh)
 {
+	struct gfs2_glock *gl = gh->gh_gl;
 	struct gfs2_sbd *sdp = gl->gl_sbd;
+	int flags = gh->gh_flags;
+	unsigned state = gh->gh_state;
 	const struct gfs2_glock_operations *glops = gl->gl_ops;
 	int lck_flags = flags & (LM_FLAG_TRY | LM_FLAG_TRY_1CB |
 				 LM_FLAG_NOEXP | LM_FLAG_ANY |
 				 LM_FLAG_PRIORITY);
 	unsigned int lck_ret;
 
+	if (glops->go_xmote_th)
+		glops->go_xmote_th(gl);
+
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
 	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
 	gfs2_assert_warn(sdp, state != LM_ST_UNLOCKED);
 	gfs2_assert_warn(sdp, state != gl->gl_state);
 
-	if (gl->gl_state == LM_ST_EXCLUSIVE && glops->go_sync)
-		glops->go_sync(gl);
-
 	gfs2_glock_hold(gl);
 	gl->gl_req_bh = xmote_bh;
 
@@ -994,19 +997,19 @@ static void drop_bh(struct gfs2_glock *gl, unsigned int ret)
  *
  */
 
-void gfs2_glock_drop_th(struct gfs2_glock *gl)
+static void gfs2_glock_drop_th(struct gfs2_glock *gl)
 {
 	struct gfs2_sbd *sdp = gl->gl_sbd;
 	const struct gfs2_glock_operations *glops = gl->gl_ops;
 	unsigned int ret;
 
+	if (glops->go_drop_th)
+		glops->go_drop_th(gl);
+
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
 	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
 	gfs2_assert_warn(sdp, gl->gl_state != LM_ST_UNLOCKED);
 
-	if (gl->gl_state == LM_ST_EXCLUSIVE && glops->go_sync)
-		glops->go_sync(gl);
-
 	gfs2_glock_hold(gl);
 	gl->gl_req_bh = drop_bh;
 
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 1eaeacd..f50e40c 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -82,10 +82,6 @@ void gfs2_holder_init(struct gfs2_glock *gl, unsigned int state, unsigned flags,
 void gfs2_holder_reinit(unsigned int state, unsigned flags,
 			struct gfs2_holder *gh);
 void gfs2_holder_uninit(struct gfs2_holder *gh);
-
-void gfs2_glock_xmote_th(struct gfs2_glock *gl, unsigned int state, int flags);
-void gfs2_glock_drop_th(struct gfs2_glock *gl);
-
 int gfs2_glock_nq(struct gfs2_holder *gh);
 int gfs2_glock_poll(struct gfs2_holder *gh);
 int gfs2_glock_wait(struct gfs2_holder *gh);
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index dda6858..c4b0391 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -117,12 +117,14 @@ static void gfs2_pte_inval(struct gfs2_glock *gl)
 
 static void meta_go_sync(struct gfs2_glock *gl)
 {
+	if (gl->gl_state != LM_ST_EXCLUSIVE)
+		return;
+
 	if (test_and_clear_bit(GLF_DIRTY, &gl->gl_flags)) {
 		gfs2_log_flush(gl->gl_sbd, gl);
 		gfs2_meta_sync(gl);
 		gfs2_ail_empty_gl(gl);
 	}
-
 }
 
 /**
@@ -142,6 +144,37 @@ static void meta_go_inval(struct gfs2_glock *gl, int flags)
 }
 
 /**
+ * inode_go_sync - Sync the dirty data and/or metadata for an inode glock
+ * @gl: the glock protecting the inode
+ *
+ */
+
+static void inode_go_sync(struct gfs2_glock *gl)
+{
+	struct gfs2_inode *ip = gl->gl_object;
+
+	if (ip && !S_ISREG(ip->i_inode.i_mode))
+		ip = NULL;
+
+	if (test_bit(GLF_DIRTY, &gl->gl_flags)) {
+		gfs2_log_flush(gl->gl_sbd, gl);
+		if (ip)
+			filemap_fdatawrite(ip->i_inode.i_mapping);
+		gfs2_meta_sync(gl);
+		if (ip) {
+			struct address_space *mapping = ip->i_inode.i_mapping;
+			int error = filemap_fdatawait(mapping);
+			if (error == -ENOSPC)
+				set_bit(AS_ENOSPC, &mapping->flags);
+			else if (error)
+				set_bit(AS_EIO, &mapping->flags);
+		}
+		clear_bit(GLF_DIRTY, &gl->gl_flags);
+		gfs2_ail_empty_gl(gl);
+	}
+}
+
+/**
  * inode_go_xmote_th - promote/demote a glock
  * @gl: the glock
  * @state: the requested state
@@ -149,12 +182,12 @@ static void meta_go_inval(struct gfs2_glock *gl, int flags)
  *
  */
 
-static void inode_go_xmote_th(struct gfs2_glock *gl, unsigned int state,
-			      int flags)
+static void inode_go_xmote_th(struct gfs2_glock *gl)
 {
 	if (gl->gl_state != LM_ST_UNLOCKED)
 		gfs2_pte_inval(gl);
-	gfs2_glock_xmote_th(gl, state, flags);
+	if (gl->gl_state == LM_ST_EXCLUSIVE)
+		inode_go_sync(gl);
 }
 
 /**
@@ -189,38 +222,8 @@ static void inode_go_xmote_bh(struct gfs2_glock *gl)
 static void inode_go_drop_th(struct gfs2_glock *gl)
 {
 	gfs2_pte_inval(gl);
-	gfs2_glock_drop_th(gl);
-}
-
-/**
- * inode_go_sync - Sync the dirty data and/or metadata for an inode glock
- * @gl: the glock protecting the inode
- *
- */
-
-static void inode_go_sync(struct gfs2_glock *gl)
-{
-	struct gfs2_inode *ip = gl->gl_object;
-
-	if (ip && !S_ISREG(ip->i_inode.i_mode))
-		ip = NULL;
-
-	if (test_bit(GLF_DIRTY, &gl->gl_flags)) {
-		gfs2_log_flush(gl->gl_sbd, gl);
-		if (ip)
-			filemap_fdatawrite(ip->i_inode.i_mapping);
-		gfs2_meta_sync(gl);
-		if (ip) {
-			struct address_space *mapping = ip->i_inode.i_mapping;
-			int error = filemap_fdatawait(mapping);
-			if (error == -ENOSPC)
-				set_bit(AS_ENOSPC, &mapping->flags);
-			else if (error)
-				set_bit(AS_EIO, &mapping->flags);
-		}
-		clear_bit(GLF_DIRTY, &gl->gl_flags);
-		gfs2_ail_empty_gl(gl);
-	}
+	if (gl->gl_state == LM_ST_EXCLUSIVE)
+		inode_go_sync(gl);
 }
 
 /**
@@ -365,8 +368,7 @@ static void rgrp_go_unlock(struct gfs2_holder *gh)
  *
  */
 
-static void trans_go_xmote_th(struct gfs2_glock *gl, unsigned int state,
-			      int flags)
+static void trans_go_xmote_th(struct gfs2_glock *gl)
 {
 	struct gfs2_sbd *sdp = gl->gl_sbd;
 
@@ -375,8 +377,6 @@ static void trans_go_xmote_th(struct gfs2_glock *gl, unsigned int state,
 		gfs2_meta_syncfs(sdp);
 		gfs2_log_shutdown(sdp);
 	}
-
-	gfs2_glock_xmote_th(gl, state, flags);
 }
 
 /**
@@ -428,8 +428,6 @@ static void trans_go_drop_th(struct gfs2_glock *gl)
 		gfs2_meta_syncfs(sdp);
 		gfs2_log_shutdown(sdp);
 	}
-
-	gfs2_glock_drop_th(gl);
 }
 
 /**
@@ -445,8 +443,8 @@ static int quota_go_demote_ok(struct gfs2_glock *gl)
 }
 
 const struct gfs2_glock_operations gfs2_meta_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
+	.go_xmote_th = meta_go_sync,
+	.go_drop_th = meta_go_sync,
 	.go_type = LM_TYPE_META,
 };
 
@@ -454,7 +452,6 @@ const struct gfs2_glock_operations gfs2_inode_glops = {
 	.go_xmote_th = inode_go_xmote_th,
 	.go_xmote_bh = inode_go_xmote_bh,
 	.go_drop_th = inode_go_drop_th,
-	.go_sync = inode_go_sync,
 	.go_inval = inode_go_inval,
 	.go_demote_ok = inode_go_demote_ok,
 	.go_lock = inode_go_lock,
@@ -463,9 +460,6 @@ const struct gfs2_glock_operations gfs2_inode_glops = {
 };
 
 const struct gfs2_glock_operations gfs2_rgrp_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
-	.go_sync = meta_go_sync,
 	.go_inval = meta_go_inval,
 	.go_demote_ok = rgrp_go_demote_ok,
 	.go_lock = rgrp_go_lock,
@@ -481,33 +475,23 @@ const struct gfs2_glock_operations gfs2_trans_glops = {
 };
 
 const struct gfs2_glock_operations gfs2_iopen_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
 	.go_type = LM_TYPE_IOPEN,
 };
 
 const struct gfs2_glock_operations gfs2_flock_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
 	.go_type = LM_TYPE_FLOCK,
 };
 
 const struct gfs2_glock_operations gfs2_nondisk_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
 	.go_type = LM_TYPE_NONDISK,
 };
 
 const struct gfs2_glock_operations gfs2_quota_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
 	.go_demote_ok = quota_go_demote_ok,
 	.go_type = LM_TYPE_QUOTA,
 };
 
 const struct gfs2_glock_operations gfs2_journal_glops = {
-	.go_xmote_th = gfs2_glock_xmote_th,
-	.go_drop_th = gfs2_glock_drop_th,
 	.go_type = LM_TYPE_JOURNAL,
 };
 
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 1acbcc2..12c80fd 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -101,11 +101,10 @@ struct gfs2_bufdata {
 };
 
 struct gfs2_glock_operations {
-	void (*go_xmote_th) (struct gfs2_glock *gl, unsigned int state, int flags);
+	void (*go_xmote_th) (struct gfs2_glock *gl);
 	void (*go_xmote_bh) (struct gfs2_glock *gl);
 	void (*go_drop_th) (struct gfs2_glock *gl);
 	void (*go_drop_bh) (struct gfs2_glock *gl);
-	void (*go_sync) (struct gfs2_glock *gl);
 	void (*go_inval) (struct gfs2_glock *gl, int flags);
 	int (*go_demote_ok) (struct gfs2_glock *gl);
 	int (*go_lock) (struct gfs2_holder *gh);
-- 
1.4.4.2





* [DLM] fix lowcomms receiving [35/54]
From: Steven Whitehouse @ 2007-02-05 14:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From 8fc2b7f76db12f7f214948e01ebd9d36f25a873c Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Mon, 22 Jan 2007 14:51:33 +0000
Subject: [PATCH] [DLM] fix lowcomms receiving

This patch fixes a bug whereby data on a newly accepted connection would be
ignored if it arrived soon after the accept.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 6e27201..8e6a76c 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -327,6 +327,9 @@ static int receive_from_sock(struct connection *con)
 
 	if (ret <= 0)
 		goto out_close;
+	if (ret == -EAGAIN)
+		goto out_resched;
+
 	if (ret == len)
 		call_again_soon = 1;
 	cbuf_add(&con->cb, ret);
@@ -359,8 +362,7 @@ out_resched:
 	if (!test_and_set_bit(CF_READ_PENDING, &con->flags))
 		queue_work(recv_workqueue, &con->rwork);
 	up_read(&con->sock_sem);
-	cond_resched();
-	return 0;
+	return -EAGAIN;
 
 out_close:
 	up_read(&con->sock_sem);
@@ -381,6 +383,7 @@ static int accept_from_sock(struct connection *con)
 	int len;
 	int nodeid;
 	struct connection *newcon;
+	struct connection *addcon;
 
 	memset(&peeraddr, 0, sizeof(peeraddr));
 	result = sock_create_kern(dlm_local_addr.ss_family, SOCK_STREAM,
@@ -454,12 +457,13 @@ static int accept_from_sock(struct connection *con)
 		othercon->sock = newsock;
 		newsock->sk->sk_user_data = othercon;
 		add_sock(newsock, othercon);
+		addcon = othercon;
 	}
 	else {
 		newsock->sk->sk_user_data = newcon;
 		newcon->rx_action = receive_from_sock;
 		add_sock(newsock, newcon);
-
+		addcon = newcon;
 	}
 
 	up_write(&newcon->sock_sem);
@@ -469,8 +473,8 @@ static int accept_from_sock(struct connection *con)
 	 * beween processing the accept adding the socket
 	 * to the read_sockets list
 	 */
-	if (!test_and_set_bit(CF_READ_PENDING, &newcon->flags))
-		queue_work(recv_workqueue, &newcon->rwork);
+	if (!test_and_set_bit(CF_READ_PENDING, &addcon->flags))
+		queue_work(recv_workqueue, &addcon->rwork);
 	up_read(&con->sock_sem);
 
 	return 0;
@@ -610,8 +614,7 @@ static struct socket *create_listen_sock(struct connection *con,
 
 	result = sock->ops->listen(sock, 5);
 	if (result < 0) {
-		printk("dlm: Can't listen on port %d\n",
-		       dlm_config.ci_tcp_port);
+		printk("dlm: Can't listen on port %d\n", dlm_config.ci_tcp_port);
 		sock_release(sock);
 		sock = NULL;
 		goto create_out;
@@ -811,7 +814,7 @@ send_error:
 
 out_connect:
 	up_read(&con->sock_sem);
-	lowcomms_connect_sock(con);
+	connect_to_sock(con);
 	return;
 }
 
@@ -873,9 +876,8 @@ static void process_send_sockets(struct work_struct *work)
 		connect_to_sock(con);
 	}
 
-	if (test_and_clear_bit(CF_WRITE_PENDING, &con->flags)) {
-		send_to_sock(con);
-	}
+	clear_bit(CF_WRITE_PENDING, &con->flags);
+	send_to_sock(con);
 }
 
 
-- 
1.4.4.2





* [GFS2] Remove queue_empty() function [36/54]
From: Steven Whitehouse @ 2007-02-05 14:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 5eaeb9b4dc1299eba6065ab2e5c9e5b23e55a9f3 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 22 Jan 2007 13:09:04 -0500
Subject: [PATCH] [GFS2] Remove queue_empty() function

This function is no longer required since we do not do recursive
locking in the glock layer. As a result all its callers can be
replaced with list_empty() calls.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5b772bb..1509481 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -211,30 +211,6 @@ out:
 }
 
 /**
- * queue_empty - check to see if a glock's queue is empty
- * @gl: the glock
- * @head: the head of the queue to check
- *
- * This function protects the list in the event that a process already
- * has a holder on the list and is adding a second holder for itself.
- * The glmutex lock is what generally prevents processes from working
- * on the same glock at once, but the special case of adding a second
- * holder for yourself ("recursive" locking) doesn't involve locking
- * glmutex, making the spin lock necessary.
- *
- * Returns: 1 if the queue is empty
- */
-
-static inline int queue_empty(struct gfs2_glock *gl, struct list_head *head)
-{
-	int empty;
-	spin_lock(&gl->gl_spin);
-	empty = list_empty(head);
-	spin_unlock(&gl->gl_spin);
-	return empty;
-}
-
-/**
  * search_bucket() - Find struct gfs2_glock by lock number
  * @bucket: the bucket to search
  * @name: The lock name
@@ -814,7 +790,7 @@ static void xmote_bh(struct gfs2_glock *gl, unsigned int ret)
 	int op_done = 1;
 
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
-	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
+	gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
 	gfs2_assert_warn(sdp, !(ret & LM_OUT_ASYNC));
 
 	state_change(gl, ret & LM_OUT_ST_MASK);
@@ -925,7 +901,7 @@ void gfs2_glock_xmote_th(struct gfs2_holder *gh)
 		glops->go_xmote_th(gl);
 
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
-	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
+	gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
 	gfs2_assert_warn(sdp, state != LM_ST_UNLOCKED);
 	gfs2_assert_warn(sdp, state != gl->gl_state);
 
@@ -960,7 +936,7 @@ static void drop_bh(struct gfs2_glock *gl, unsigned int ret)
 	struct gfs2_holder *gh = gl->gl_req_gh;
 
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
-	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
+	gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
 	gfs2_assert_warn(sdp, !ret);
 
 	state_change(gl, LM_ST_UNLOCKED);
@@ -1007,7 +983,7 @@ static void gfs2_glock_drop_th(struct gfs2_glock *gl)
 		glops->go_drop_th(gl);
 
 	gfs2_assert_warn(sdp, test_bit(GLF_LOCK, &gl->gl_flags));
-	gfs2_assert_warn(sdp, queue_empty(gl, &gl->gl_holders));
+	gfs2_assert_warn(sdp, list_empty(&gl->gl_holders));
 	gfs2_assert_warn(sdp, gl->gl_state != LM_ST_UNLOCKED);
 
 	gfs2_glock_hold(gl);
@@ -1697,7 +1673,7 @@ void gfs2_reclaim_glock(struct gfs2_sbd *sdp)
 	atomic_inc(&sdp->sd_reclaimed);
 
 	if (gfs2_glmutex_trylock(gl)) {
-		if (queue_empty(gl, &gl->gl_holders) &&
+		if (list_empty(&gl->gl_holders) &&
 		    gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl))
 			handle_callback(gl, LM_ST_UNLOCKED);
 		gfs2_glmutex_unlock(gl);
@@ -1761,7 +1737,7 @@ static void scan_glock(struct gfs2_glock *gl)
 		return;
 
 	if (gfs2_glmutex_trylock(gl)) {
-		if (queue_empty(gl, &gl->gl_holders) &&
+		if (list_empty(&gl->gl_holders) &&
 		    gl->gl_state != LM_ST_UNLOCKED && demote_ok(gl))
 			goto out_schedule;
 		gfs2_glmutex_unlock(gl);
@@ -1810,7 +1786,7 @@ static void clear_glock(struct gfs2_glock *gl)
 	}
 
 	if (gfs2_glmutex_trylock(gl)) {
-		if (queue_empty(gl, &gl->gl_holders) &&
+		if (list_empty(gl, &gl->gl_holders) &&
 		    gl->gl_state != LM_ST_UNLOCKED)
 			handle_callback(gl, LM_ST_UNLOCKED);
 		gfs2_glmutex_unlock(gl);
-- 
1.4.4.2





* [GFS2] Compile fix for glock.c [37/54]
From: Steven Whitehouse @ 2007-02-05 14:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 1887e3bb2050c8ad515ff1ce2b4fec8f9b1c601b Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Tue, 23 Jan 2007 13:20:41 -0500
Subject: [PATCH] [GFS2] Compile fix for glock.c

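This one-line fix was missed from the previous patch: the list_empty()
call in clear_glock() still carried the extra gl argument left over from
queue_empty().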

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 1509481..f68582d 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1786,7 +1786,7 @@ static void clear_glock(struct gfs2_glock *gl)
 	}
 
 	if (gfs2_glmutex_trylock(gl)) {
-		if (list_empty(gl, &gl->gl_holders) &&
+		if (list_empty(&gl->gl_holders) &&
 		    gl->gl_state != LM_ST_UNLOCKED)
 			handle_callback(gl, LM_ST_UNLOCKED);
 		gfs2_glmutex_unlock(gl);
-- 
1.4.4.2





* [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 [38/54]
From: Steven Whitehouse @ 2007-02-05 14:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: Eric Sandeen, cluster-devel

>From 2b1e30982b8b4a276f18092f55e5ae80cd193414 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen@redhat.com>
Date: Thu, 18 Jan 2007 16:41:23 -0600
Subject: [PATCH] [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2

I was looking something else up and came across this...

I don't honestly have a good reason to change it other than to make it
like every other Linux filesystem in this regard.  ;-)  It doesn't
functionally change anything, but makes some lines shorter. :)

I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but
not in timespec_t format, and only stores second resolutions?  Seems like
you're halfway to sub-second resolutions already.

I suppose if that gets implemented then all of the below should
instead be CURRENT_TIME not CURRENT_TIME_SEC.
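
A small user-space sketch of the two assignment styles may make the
comparison concrete. It assumes that CURRENT_TIME_SEC yields a timespec
holding whole seconds with tv_nsec set to zero (an assumption here; check
linux/time.h in the tree in question). current_time_sec(), stamp_old()
and stamp_new() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <time.h>

/* Assumed analogue of CURRENT_TIME_SEC: whole seconds, tv_nsec zeroed. */
struct timespec current_time_sec(time_t now)
{
	struct timespec ts = { .tv_sec = now, .tv_nsec = 0 };
	return ts;
}

/* Old style: only tv_sec is written, tv_nsec keeps its previous value. */
void stamp_old(struct timespec *mt, struct timespec *ct, time_t now)
{
	mt->tv_sec = ct->tv_sec = now;
}

/* New style: the whole timespec is assigned in one go. */
void stamp_new(struct timespec *mt, struct timespec *ct, time_t now)
{
	*mt = *ct = current_time_sec(now);
}
```

In this sketch the only observable difference is that the whole-struct
form also resets tv_nsec. If tv_nsec is never set to a non-zero value
elsewhere, the two forms behave the same.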

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 8240c1f..113f6c9 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -773,7 +773,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 			gfs2_free_data(ip, bstart, blen);
 	}
 
-	ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 
 	gfs2_dinode_out(ip, dibh->b_data);
 
@@ -848,7 +848,7 @@ static int do_grow(struct gfs2_inode *ip, u64 size)
 	}
 
 	ip->i_di.di_size = size;
-	ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 
 	error = gfs2_meta_inode_buffer(ip, &dibh);
 	if (error)
@@ -963,7 +963,7 @@ static int trunc_start(struct gfs2_inode *ip, u64 size)
 
 	if (gfs2_is_stuffed(ip)) {
 		ip->i_di.di_size = size;
-		ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+		ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
 		gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + size);
@@ -975,7 +975,7 @@ static int trunc_start(struct gfs2_inode *ip, u64 size)
 
 		if (!error) {
 			ip->i_di.di_size = size;
-			ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+			ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 			ip->i_di.di_flags |= GFS2_DIF_TRUNC_IN_PROG;
 			gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 			gfs2_dinode_out(ip, dibh->b_data);
@@ -1048,7 +1048,7 @@ static int trunc_end(struct gfs2_inode *ip)
 			ip->i_num.no_addr;
 		gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode));
 	}
-	ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 	ip->i_di.di_flags &= ~GFS2_DIF_TRUNC_IN_PROG;
 
 	gfs2_trans_add_bh(ip->i_gl, dibh, 1);
diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 0eceb05..c93ca8f 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -131,7 +131,7 @@ static int gfs2_dir_write_stuffed(struct gfs2_inode *ip, const char *buf,
 	memcpy(dibh->b_data + offset + sizeof(struct gfs2_dinode), buf, size);
 	if (ip->i_di.di_size < offset + size)
 		ip->i_di.di_size = offset + size;
-	ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 	gfs2_dinode_out(ip, dibh->b_data);
 
 	brelse(dibh);
@@ -229,7 +229,7 @@ out:
 
 	if (ip->i_di.di_size < offset + copied)
 		ip->i_di.di_size = offset + copied;
-	ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 
 	gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 	gfs2_dinode_out(ip, dibh->b_data);
@@ -1565,7 +1565,7 @@ int gfs2_dir_add(struct inode *inode, const struct qstr *name,
 				break;
 			gfs2_trans_add_bh(ip->i_gl, bh, 1);
 			ip->i_di.di_entries++;
-			ip->i_inode.i_mtime.tv_sec = ip->i_inode.i_ctime.tv_sec = get_seconds();
+			ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 			gfs2_dinode_out(ip, bh->b_data);
 			brelse(bh);
 			error = 0;
@@ -1651,7 +1651,7 @@ int gfs2_dir_del(struct gfs2_inode *dip, const struct qstr *name)
 		gfs2_consist_inode(dip);
 	gfs2_trans_add_bh(dip->i_gl, bh, 1);
 	dip->i_di.di_entries--;
-	dip->i_inode.i_mtime.tv_sec = dip->i_inode.i_ctime.tv_sec = get_seconds();
+	dip->i_inode.i_mtime = dip->i_inode.i_ctime = CURRENT_TIME_SEC;
 	gfs2_dinode_out(dip, bh->b_data);
 	brelse(bh);
 	mark_inode_dirty(&dip->i_inode);
@@ -1699,7 +1699,7 @@ int gfs2_dir_mvino(struct gfs2_inode *dip, const struct qstr *filename,
 		gfs2_trans_add_bh(dip->i_gl, bh, 1);
 	}
 
-	dip->i_inode.i_mtime.tv_sec = dip->i_inode.i_ctime.tv_sec = get_seconds();
+	dip->i_inode.i_mtime = dip->i_inode.i_ctime = CURRENT_TIME_SEC;
 	gfs2_dinode_out(dip, bh->b_data);
 	brelse(bh);
 	return 0;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 2603169..f7c8d31 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -357,7 +357,7 @@ int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
 	else
 		drop_nlink(&ip->i_inode);
 
-	ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 
 	gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 	gfs2_dinode_out(ip, dibh->b_data);
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index b2a12f4..747c731 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -728,7 +728,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 		error = gfs2_meta_inode_buffer(ip, &dibh);
 		if (error)
 			goto out_end_trans;
-		ip->i_inode.i_ctime.tv_sec = get_seconds();
+		ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
 		brelse(dibh);
-- 
1.4.4.2





* [GFS2] Fix typo in glock.c [39/54]
From: Steven Whitehouse @ 2007-02-05 14:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 0eb01c633286ebb80f626846816284f241c15869 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Tue, 23 Jan 2007 16:56:36 -0500
Subject: [PATCH] [GFS2] Fix typo in glock.c

This fixes a one-letter typo in glock.c, spotted by Rob Kenna: rq_mutex()
cleared HIF_WAIT in gh_flags rather than gh_iflags.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index f68582d..c070ede 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -482,7 +482,7 @@ static int rq_mutex(struct gfs2_holder *gh)
 	list_del_init(&gh->gh_list);
 	/*  gh->gh_error never examined.  */
 	set_bit(GLF_LOCK, &gl->gl_flags);
-	clear_bit(HIF_WAIT, &gh->gh_flags);
+	clear_bit(HIF_WAIT, &gh->gh_iflags);
 	smp_mb();
 	wake_up_bit(&gh->gh_iflags, HIF_WAIT);
 
-- 
1.4.4.2





* [DLM] Make sock_sem into a mutex [40/54]
From: Steven Whitehouse @ 2007-02-05 14:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From efc9c1f41a1858b588727f3520579e01cd9ac914 Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Wed, 24 Jan 2007 11:17:59 +0000
Subject: [PATCH] [DLM] Make sock_sem into a mutex

Now that there can be multiple dlm_recv threads running we need to prevent two
recvs running for the same connection - it's unlikely but it can happen and it
causes message corruption.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 8e6a76c..18ade44 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -96,7 +96,7 @@ static bool cbuf_empty(struct cbuf *cb)
 struct connection {
 	struct socket *sock;	/* NULL if not connected */
 	uint32_t nodeid;	/* So we know who we are in the list */
-	struct rw_semaphore sock_sem; /* Stop connect races */
+	struct mutex sock_mutex;
 	unsigned long flags;	/* bit 1,2 = We are on the read/write lists */
 #define CF_READ_PENDING 1
 #define CF_WRITE_PENDING 2
@@ -171,7 +171,7 @@ static struct connection *nodeid2con(int nodeid, gfp_t allocation)
 			goto finish;
 
 		con->nodeid = nodeid;
-		init_rwsem(&con->sock_sem);
+		mutex_init(&con->sock_mutex);
 		INIT_LIST_HEAD(&con->writequeue);
 		spin_lock_init(&con->writequeue_lock);
 		INIT_WORK(&con->swork, process_send_sockets);
@@ -247,7 +247,7 @@ static void make_sockaddr(struct sockaddr_storage *saddr, uint16_t port,
 /* Close a remote connection and tidy up */
 static void close_connection(struct connection *con, bool and_other)
 {
-	down_write(&con->sock_sem);
+	mutex_lock(&con->sock_mutex);
 
 	if (con->sock) {
 		sock_release(con->sock);
@@ -262,7 +262,7 @@ static void close_connection(struct connection *con, bool and_other)
 		con->rx_page = NULL;
 	}
 	con->retries = 0;
-	up_write(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 }
 
 /* Data received from remote end */
@@ -276,7 +276,7 @@ static int receive_from_sock(struct connection *con)
 	int r;
 	int call_again_soon = 0;
 
-	down_read(&con->sock_sem);
+	mutex_lock(&con->sock_mutex);
 
 	if (con->sock == NULL)
 		goto out;
@@ -355,17 +355,17 @@ static int receive_from_sock(struct connection *con)
 out:
 	if (call_again_soon)
 		goto out_resched;
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	return 0;
 
 out_resched:
 	if (!test_and_set_bit(CF_READ_PENDING, &con->flags))
 		queue_work(recv_workqueue, &con->rwork);
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	return -EAGAIN;
 
 out_close:
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	if (ret != -EAGAIN && !test_bit(CF_IS_OTHERCON, &con->flags)) {
 		close_connection(con, false);
 		/* Reconnect when there is something to send */
@@ -391,7 +391,7 @@ static int accept_from_sock(struct connection *con)
 	if (result < 0)
 		return -ENOMEM;
 
-	down_read_nested(&con->sock_sem, 0);
+	mutex_lock_nested(&con->sock_mutex, 0);
 
 	result = -ENOTCONN;
 	if (con->sock == NULL)
@@ -417,7 +417,7 @@ static int accept_from_sock(struct connection *con)
 	if (dlm_addr_to_nodeid(&peeraddr, &nodeid)) {
 		printk("dlm: connect from non cluster node\n");
 		sock_release(newsock);
-		up_read(&con->sock_sem);
+		mutex_unlock(&con->sock_mutex);
 		return -1;
 	}
 
@@ -434,7 +434,7 @@ static int accept_from_sock(struct connection *con)
 		result = -ENOMEM;
 		goto accept_err;
 	}
-	down_write_nested(&newcon->sock_sem, 1);
+	mutex_lock_nested(&newcon->sock_mutex, 1);
 	if (newcon->sock) {
 		struct connection *othercon = newcon->othercon;
 
@@ -442,13 +442,13 @@ static int accept_from_sock(struct connection *con)
 			othercon = kmem_cache_zalloc(con_cache, GFP_KERNEL);
 			if (!othercon) {
 				printk("dlm: failed to allocate incoming socket\n");
-				up_write(&newcon->sock_sem);
+				mutex_unlock(&newcon->sock_mutex);
 				result = -ENOMEM;
 				goto accept_err;
 			}
 			othercon->nodeid = nodeid;
 			othercon->rx_action = receive_from_sock;
-			init_rwsem(&othercon->sock_sem);
+			mutex_init(&othercon->sock_mutex);
 			INIT_WORK(&othercon->swork, process_send_sockets);
 			INIT_WORK(&othercon->rwork, process_recv_sockets);
 			set_bit(CF_IS_OTHERCON, &othercon->flags);
@@ -466,7 +466,7 @@ static int accept_from_sock(struct connection *con)
 		addcon = newcon;
 	}
 
-	up_write(&newcon->sock_sem);
+	mutex_unlock(&newcon->sock_mutex);
 
 	/*
 	 * Add it to the active queue in case we got data
@@ -475,12 +475,12 @@ static int accept_from_sock(struct connection *con)
 	 */
 	if (!test_and_set_bit(CF_READ_PENDING, &addcon->flags))
 		queue_work(recv_workqueue, &addcon->rwork);
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 
 	return 0;
 
 accept_err:
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	sock_release(newsock);
 
 	if (result != -EAGAIN)
@@ -501,7 +501,7 @@ static void connect_to_sock(struct connection *con)
 		return;
 	}
 
-	down_write(&con->sock_sem);
+	mutex_lock(&con->sock_mutex);
 	if (con->retries++ > MAX_CONNECT_RETRIES)
 		goto out;
 
@@ -553,7 +553,7 @@ out_err:
 		result = 0;
 	}
 out:
-	up_write(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	return;
 }
 
@@ -757,7 +757,7 @@ static void send_to_sock(struct connection *con)
 	struct writequeue_entry *e;
 	int len, offset;
 
-	down_read(&con->sock_sem);
+	mutex_lock(&con->sock_mutex);
 	if (con->sock == NULL)
 		goto out_connect;
 
@@ -803,17 +803,17 @@ static void send_to_sock(struct connection *con)
 	}
 	spin_unlock(&con->writequeue_lock);
 out:
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	return;
 
 send_error:
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	close_connection(con, false);
 	lowcomms_connect_sock(con);
 	return;
 
 out_connect:
-	up_read(&con->sock_sem);
+	mutex_unlock(&con->sock_mutex);
 	connect_to_sock(con);
 	return;
 }
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] saved dlm message can be dropped [41/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (39 preceding siblings ...)
  2007-02-05 14:40 ` [DLM] Make sock_sem into a mutex [40/54] Steven Whitehouse
@ 2007-02-05 14:40 ` Steven Whitehouse
  2007-02-05 14:41 ` [DLM] can miss clearing resend flag Steven Whitehouse
                   ` (13 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 24b7195b114b8003274b02dd0c35fb73e75155d2 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 24 Jan 2007 10:11:45 -0600
Subject: [PATCH] [DLM] saved dlm message can be dropped

dlm_receive_message() returns 0 instead of returning 'error'.  What would
happen is that process_requestqueue would take a saved message off the
requestqueue and call receive_message on it.  receive_message would then
see that recovery had been aborted, set error to EINTR, and 'goto out',
expecting that the error would be returned.  Instead, 0 was always
returned, so process_requestqueue would think that the message had been
processed and delete it instead of saving it to process next time.  This
means the message (usually an unlock in my tests) would be lost.
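
The failure mode described above can be modelled in a few lines of
userspace C. This is only a sketch, not the DLM code: struct saved_msg,
handle_saved() and drain_queue() are hypothetical names, and the queue is
reduced to a plain array. It shows why the handler has to hand its error
back to the queue walker.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a message saved on the requestqueue. */
struct saved_msg {
	int processed;	/* set once the handler really ran */
};

/*
 * Model of the handler. When recovery has been aborted it reports
 * -EINTR so that the caller keeps the message for the next pass.
 */
static int handle_saved(struct saved_msg *m, int recovery_aborted)
{
	int error = 0;

	assert(m != NULL);
	if (recovery_aborted) {
		error = -EINTR;
		goto out;
	}
	m->processed = 1;
out:
	return error;	/* the bug was the equivalent of "return 0" here */
}

/*
 * Model of the queue walker: a message counts as removed only when the
 * handler reports success. Returns the number of messages removed.
 */
static size_t drain_queue(struct saved_msg *q, size_t n, int recovery_aborted)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (handle_saved(&q[i], recovery_aborted))
			break;	/* keep this message and everything after it */
		removed++;
	}
	return removed;
}
```

With "return 0" in place of "return error", drain_queue() would count an
unprocessed message as removed, which is the lost unlock described above.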

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 6ad2b8e..7c7ac2a 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3018,7 +3018,7 @@ int dlm_receive_message(struct dlm_header *hd, int nodeid, int recovery)
 {
 	struct dlm_message *ms = (struct dlm_message *) hd;
 	struct dlm_ls *ls;
-	int error;
+	int error = 0;
 
 	if (!recovery)
 		dlm_message_in(ms);
@@ -3135,7 +3135,7 @@ int dlm_receive_message(struct dlm_header *hd, int nodeid, int recovery)
  out:
 	dlm_put_lockspace(ls);
 	dlm_astd_wake();
-	return 0;
+	return error;
 }
 
 
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [DLM] can miss clearing resend flag
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (40 preceding siblings ...)
  2007-02-05 14:40 ` [DLM] saved dlm message can be dropped [41/54] Steven Whitehouse
@ 2007-02-05 14:41 ` Steven Whitehouse
  2007-02-05 14:41 ` [GFS2] Fix recursive locking attempt with NFS [43/54] Steven Whitehouse
                   ` (12 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 61a6511f5e603c0baadf542f632981248fdc7266 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 24 Jan 2007 10:21:33 -0600
Subject: [PATCH] [DLM] can miss clearing resend flag

A long, complicated sequence of events, beginning with the RESEND flag not
being cleared on an lkb, can result in an unlock never completing.

- lkb on waiters list for remote lookup
- the remote node is both the dir node and the master node, so
  it optimizes the lookup into a request and sends a request
  reply back
- the request reply is saved on the requestqueue to be processed
  after recovery
- recovery runs dlm_recover_waiters_pre() which sets RESEND flag
  so the lookup will be resent after recovery
- end of recovery: process_requestqueue takes saved request reply
  which removes the lkb off the waiters list, _without_ clearing
  the RESEND flag
- end of recovery: dlm_recover_waiters_post() doesn't do anything
  with the now completed lookup lkb (would usually clear RESEND)
- later, the node unmounts, unlocks this lkb that still has RESEND
  flag set
- the lkb is on the waiters list again, now for unlock, when recovery
  occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND
  set, doesn't do anything since the master still exists
- end of recovery: dlm_recover_waiters_post() takes this lkb off
  the waiters list because it has the RESEND flag set, then reports
  an error because unlocks are never supposed to be handled in
  recover_waiters_post().
- later, the unlock reply is received, doesn't find the lkb on
  the waiters list because recover_waiters_post() has wrongly
  removed it.
- the unlock operation has been lost, and we're left with a
  stray granted lock
- unmount spins waiting for the unlock to complete

The visible evidence of this problem will be a node where gfs umount is
spinning, the dlm waiters list will be empty, and the dlm locks list will
show a granted lock.

The fix is simply to clear the RESEND flag when taking an lkb off the
waiters list.
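
A minimal userspace sketch of the flag handling, under stated assumptions:
struct lkb_model, IFL_RESEND, recover_pre() and remove_from_waiters() are
hypothetical names that only mimic the shape of the real code, and the
clear_resend argument exists purely so that the old and new behaviour can
be compared.

```c
#include <assert.h>
#include <stddef.h>

#define IFL_RESEND 0x1u	/* hypothetical bit, modelled on DLM_IFL_RESEND */

struct lkb_model {
	unsigned int flags;
	int wait_type;	/* non-zero while on the waiters list */
};

/* Recovery marks a waiting lkb so that its request is resent later. */
static void recover_pre(struct lkb_model *lkb)
{
	assert(lkb != NULL);
	if (lkb->wait_type)
		lkb->flags |= IFL_RESEND;
}

/*
 * Take the lkb off the waiters list. clear_resend == 1 models the fix,
 * clear_resend == 0 models the old behaviour that left the flag behind.
 */
static int remove_from_waiters(struct lkb_model *lkb, int clear_resend)
{
	assert(lkb != NULL);
	if (!lkb->wait_type)
		return -1;	/* not waiting for anything */
	lkb->wait_type = 0;
	if (clear_resend)
		lkb->flags &= ~IFL_RESEND;
	return 0;
}
```

In the old behaviour the stale bit survives until the lkb goes back on the
waiters list for the unlock, which is where the sequence above goes wrong.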

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 7c7ac2a..c10257f 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -754,6 +754,11 @@ static void add_to_waiters(struct dlm_lkb *lkb, int mstype)
 	mutex_unlock(&ls->ls_waiters_mutex);
 }
 
+/* We clear the RESEND flag because we might be taking an lkb off the waiters
+   list as part of process_requestqueue (e.g. a lookup that has an optimized
+   request reply on the requestqueue) between dlm_recover_waiters_pre() which
+   set RESEND and dlm_recover_waiters_post() */
+
 static int _remove_from_waiters(struct dlm_lkb *lkb)
 {
 	int error = 0;
@@ -764,6 +769,7 @@ static int _remove_from_waiters(struct dlm_lkb *lkb)
 		goto out;
 	}
 	lkb->lkb_wait_type = 0;
+	lkb->lkb_flags &= ~DLM_IFL_RESEND;
 	list_del(&lkb->lkb_wait_reply);
 	unhold_lkb(lkb);
  out:
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix recursive locking attempt with NFS [43/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (41 preceding siblings ...)
  2007-02-05 14:41 ` [DLM] can miss clearing resend flag Steven Whitehouse
@ 2007-02-05 14:41 ` Steven Whitehouse
  2007-02-05 14:42 ` [GFS2] Fix list corruption in lops.c [44/54] Steven Whitehouse
                   ` (11 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 57c070ec9322815caaa33f4a669b86fac9002075 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Thu, 25 Jan 2007 17:14:59 +0000
Subject: [PATCH] [GFS2] Fix recursive locking attempt with NFS

In certain cases, it's possible for NFS to call the lookup code while
holding the glock (when doing a readdirplus operation) so we need to
check for that and not try to lock the glock twice. This also fixes a
typo in a previous NFS related GFS2 patch.
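
The "lock only if not already held, unlock only if we locked" pattern can
be sketched in userspace C as follows. This is an illustration under
assumptions, not the GFS2 code: struct glock_model and the helper names
are hypothetical, and the lock is a single-holder counter rather than a
real glock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-holder lock that counts how often it was taken. */
struct glock_model {
	int held;
	int acquisitions;
};

static int glock_model_held_by_me(const struct glock_model *gl)
{
	return gl->held;
}

static void glock_model_take(struct glock_model *gl)
{
	assert(!gl->held);	/* taking it twice is the recursion bug */
	gl->held = 1;
	gl->acquisitions++;
}

static void glock_model_drop(struct glock_model *gl)
{
	assert(gl->held);
	gl->held = 0;
}

/* Model of a lookup that may be entered with the lock already held. */
static int lookup_model(struct glock_model *gl)
{
	int unlock = 0;

	if (!glock_model_held_by_me(gl)) {
		glock_model_take(gl);
		unlock = 1;	/* remember that this call owns the lock */
	}

	/* ... the directory search would happen here ... */

	if (unlock)
		glock_model_drop(gl);
	return 0;
}
```

The unlock flag keeps the function from dropping a lock that its caller
still needs.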

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index f7c8d31..88fcfb4 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -395,8 +395,10 @@ struct inode *gfs2_lookup_simple(struct inode *dip, const char *name)
  * @is_root: If 1, ignore the caller's permissions
  * @i_gh: An uninitialized holder for the new inode glock
  *
- * There will always be a vnode (Linux VFS inode) for the d_gh inode unless
- * @is_root is true.
+ * This can be called via the VFS filldir function when NFS is doing
+ * a readdirplus and the inode which its intending to stat isn't
+ * already in cache. In this case we must not take the directory glock
+ * again, since the readdir call will have already taken that lock.
  *
  * Returns: errno
  */
@@ -409,8 +411,9 @@ struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 	struct gfs2_holder d_gh;
 	struct gfs2_inum_host inum;
 	unsigned int type;
-	int error = 0;
+	int error;
 	struct inode *inode = NULL;
+	int unlock = 0;
 
 	if (!name->len || name->len > GFS2_FNAMESIZE)
 		return ERR_PTR(-ENAMETOOLONG);
@@ -422,9 +425,12 @@ struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 		return dir;
 	}
 
-	error = gfs2_glock_nq_init(dip->i_gl, LM_ST_SHARED, 0, &d_gh);
-	if (error)
-		return ERR_PTR(error);
+	if (gfs2_glock_is_locked_by_me(dip->i_gl) == 0) {
+		error = gfs2_glock_nq_init(dip->i_gl, LM_ST_SHARED, 0, &d_gh);
+		if (error)
+			return ERR_PTR(error);
+		unlock = 1;
+	}
 
 	if (!is_root) {
 		error = permission(dir, MAY_EXEC, NULL);
@@ -439,10 +445,11 @@ struct inode *gfs2_lookupi(struct inode *dir, const struct qstr *name,
 	inode = gfs2_inode_lookup(sb, &inum, type);
 
 out:
-	gfs2_glock_dq_uninit(&d_gh);
+	if (unlock)
+		gfs2_glock_dq_uninit(&d_gh);
 	if (error == -ENOENT)
 		return NULL;
-	return inode;
+	return inode ? inode : ERR_PTR(error);
 }
 
 static int pick_formal_ino_1(struct gfs2_sbd *sdp, u64 *formal_ino)
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 747c731..5591f89 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -1018,7 +1018,7 @@ static int gfs2_getattr(struct vfsmount *mnt, struct dentry *dentry,
 	}
 
 	generic_fillattr(inode, stat);
-	if (unlock);
+	if (unlock)
 		gfs2_glock_dq_uninit(&gh);
 
 	return 0;
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix list corruption in lops.c [44/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (42 preceding siblings ...)
  2007-02-05 14:41 ` [GFS2] Fix recursive locking attempt with NFS [43/54] Steven Whitehouse
@ 2007-02-05 14:42 ` Steven Whitehouse
  2007-02-05 14:43 ` [GFS2] increase default lock limit [45/54] Steven Whitehouse
                   ` (10 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: cluster-devel

>From 6cdaba9600410b53607f7f6233956ae2b9d02013 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Thu, 25 Jan 2007 10:04:20 +0000
Subject: [PATCH] [GFS2] Fix list corruption in lops.c

The patch below appears to fix the list corruption that we are seeing on
occasion. Although the transaction structure is private to a single
thread, when the queued structures are dismantled during an in-core
commit, it's possible for a different thread to be trying to add the same
structure to another, new, transaction at the same time.

To avoid this, this patch takes the log spinlock during this operation.
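
As a rough userspace illustration of the idea (not the GFS2 code), the
sketch below puts both the "already on a list?" check and the list update
under one lock, and takes the same lock while dismantling the list. The
names are hypothetical, the list is a simple singly linked one, and a C11
atomic_flag stands in for the log spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for the log spinlock, built on a C11 atomic_flag. */
static atomic_flag log_lock = ATOMIC_FLAG_INIT;

static void log_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&log_lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void log_lock_release(void)
{
	atomic_flag_clear_explicit(&log_lock, memory_order_release);
}

struct buf_model {
	struct buf_model *next;
	int on_list;
};

struct trans_model {
	struct buf_model *head;
	int num_buf;
};

/* Add bd to tr unless it is already queued. Returns 1 if it was added. */
static int trans_add(struct trans_model *tr, struct buf_model *bd)
{
	log_lock_acquire();
	if (bd->on_list) {
		log_lock_release();
		return 0;
	}
	bd->next = tr->head;
	tr->head = bd;
	bd->on_list = 1;
	tr->num_buf++;
	log_lock_release();
	return 1;
}

/* Dismantle the list under the same lock that trans_add() takes. */
static void trans_commit(struct trans_model *tr)
{
	log_lock_acquire();
	while (tr->head) {
		struct buf_model *bd = tr->head;

		tr->head = bd->next;
		bd->next = NULL;
		bd->on_list = 0;
		tr->num_buf--;
	}
	log_lock_release();
	assert(tr->num_buf == 0);
}
```

Because the check and the update sit inside one critical section, a second
thread cannot see the buffer as "not queued" halfway through a commit.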

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 4d7f94d..16bb4b4 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -69,13 +69,16 @@ static void buf_lo_add(struct gfs2_sbd *sdp, struct gfs2_log_element *le)
 	struct gfs2_bufdata *bd = container_of(le, struct gfs2_bufdata, bd_le);
 	struct gfs2_trans *tr;
 
-	if (!list_empty(&bd->bd_list_tr))
+	gfs2_log_lock(sdp);
+	if (!list_empty(&bd->bd_list_tr)) {
+		gfs2_log_unlock(sdp);
 		return;
-
+	}
 	tr = current->journal_info;
 	tr->tr_touched = 1;
 	tr->tr_num_buf++;
 	list_add(&bd->bd_list_tr, &tr->tr_list_buf);
+	gfs2_log_unlock(sdp);
 
 	if (!list_empty(&le->le_list))
 		return;
@@ -84,7 +87,6 @@ static void buf_lo_add(struct gfs2_sbd *sdp, struct gfs2_log_element *le)
 
 	gfs2_meta_check(sdp, bd->bd_bh);
 	gfs2_pin(sdp, bd->bd_bh);
-
 	gfs2_log_lock(sdp);
 	sdp->sd_log_num_buf++;
 	list_add(&le->le_list, &sdp->sd_log_le_buf);
@@ -98,11 +100,13 @@ static void buf_lo_incore_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 	struct list_head *head = &tr->tr_list_buf;
 	struct gfs2_bufdata *bd;
 
+	gfs2_log_lock(sdp);
 	while (!list_empty(head)) {
 		bd = list_entry(head->next, struct gfs2_bufdata, bd_list_tr);
 		list_del_init(&bd->bd_list_tr);
 		tr->tr_num_buf--;
 	}
+	gfs2_log_unlock(sdp);
 	gfs2_assert_warn(sdp, !tr->tr_num_buf);
 }
 
@@ -462,13 +466,17 @@ static void databuf_lo_add(struct gfs2_sbd *sdp, struct gfs2_log_element *le)
 	struct address_space *mapping = bd->bd_bh->b_page->mapping;
 	struct gfs2_inode *ip = GFS2_I(mapping->host);
 
+	gfs2_log_lock(sdp);
 	tr->tr_touched = 1;
 	if (list_empty(&bd->bd_list_tr) &&
 	    (ip->i_di.di_flags & GFS2_DIF_JDATA)) {
 		tr->tr_num_buf++;
 		list_add(&bd->bd_list_tr, &tr->tr_list_buf);
+		gfs2_log_unlock(sdp);
 		gfs2_pin(sdp, bd->bd_bh);
 		tr->tr_num_buf_new++;
+	} else {
+		gfs2_log_unlock(sdp);
 	}
 	gfs2_trans_add_gl(bd->bd_gl);
 	gfs2_log_lock(sdp);
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] increase default lock limit [45/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (43 preceding siblings ...)
  2007-02-05 14:42 ` [GFS2] Fix list corruption in lops.c [44/54] Steven Whitehouse
@ 2007-02-05 14:43 ` Steven Whitehouse
  2007-02-05 14:44 ` [GFS2] make lock_dlm drop_count tunable in sysfs [46/54] Steven Whitehouse
                   ` (9 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From b3ffa7fa110a62b6e835016c944dbd687b21731e Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Thu, 25 Jan 2007 13:50:52 -0600
Subject: [PATCH] [GFS2] increase default lock limit

Increase the number of locks at which point the dlm begins asking gfs to
reduce its lock usage.  The default value is largely arbitrary, but the
current value of 50,000 ends up limiting performance unnecessarily for too
many users.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 33af707..a87c7bf 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -36,7 +36,7 @@
 
 #define GDLM_STRNAME_BYTES	24
 #define GDLM_LVB_SIZE		32
-#define GDLM_DROP_COUNT		50000
+#define GDLM_DROP_COUNT		200000
 #define GDLM_DROP_PERIOD	60
 #define GDLM_NAME_LEN		128
 
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] make lock_dlm drop_count tunable in sysfs [46/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (44 preceding siblings ...)
  2007-02-05 14:43 ` [GFS2] increase default lock limit [45/54] Steven Whitehouse
@ 2007-02-05 14:44 ` Steven Whitehouse
  2007-02-05 14:44 ` [GFS2/DLM] use sysfs Steven Whitehouse
                   ` (8 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 05acac8420a6e24fc892d0ae88b2f1e7606f8524 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Thu, 25 Jan 2007 14:24:04 -0600
Subject: [PATCH] [GFS2] make lock_dlm drop_count tunable in sysfs

We want to be able to change or disable the default drop_count (number at
which the dlm asks gfs to limit the number of locks it's holding).
Add it to the collection of sysfs tunables for an fs.
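
A userspace approximation of the show/store pair, for illustration only:
struct ls_model is a hypothetical stand-in for the lockspace structure,
and the C library's strtol() is used as an analogue of the kernel's
simple_strtol(). With base 0, both decimal and 0x-prefixed input parse.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-lockspace structure. */
struct ls_model {
	int drop_locks_count;
};

/* Format the current value the way a sysfs show method would. */
static int drop_count_show(const struct ls_model *ls, char *buf, size_t size)
{
	assert(ls != NULL && buf != NULL);
	return snprintf(buf, size, "%d\n", ls->drop_locks_count);
}

/* Parse the written text and report that all of it was consumed. */
static size_t drop_count_store(struct ls_model *ls, const char *buf,
			       size_t len)
{
	assert(ls != NULL && buf != NULL);
	ls->drop_locks_count = (int)strtol(buf, NULL, 0);
	return len;
}
```

Like the patch, this sketch does no range checking on the written value.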

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/locking/dlm/main.c b/fs/gfs2/locking/dlm/main.c
index 2194b1d..a0e7eda 100644
--- a/fs/gfs2/locking/dlm/main.c
+++ b/fs/gfs2/locking/dlm/main.c
@@ -11,9 +11,6 @@
 
 #include "lock_dlm.h"
 
-extern int gdlm_drop_count;
-extern int gdlm_drop_period;
-
 extern struct lm_lockops gdlm_ops;
 
 static int __init init_lock_dlm(void)
@@ -40,9 +37,6 @@ static int __init init_lock_dlm(void)
 		return error;
 	}
 
-	gdlm_drop_count = GDLM_DROP_COUNT;
-	gdlm_drop_period = GDLM_DROP_PERIOD;
-
 	printk(KERN_INFO
 	       "Lock_DLM (built %s %s) installed\n", __DATE__, __TIME__);
 	return 0;
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index cdd1694..1d8faa3 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -9,8 +9,6 @@
 
 #include "lock_dlm.h"
 
-int gdlm_drop_count;
-int gdlm_drop_period;
 const struct lm_lockops gdlm_ops;
 
 
@@ -24,8 +22,8 @@ static struct gdlm_ls *init_gdlm(lm_callback_t cb, struct gfs2_sbd *sdp,
 	if (!ls)
 		return NULL;
 
-	ls->drop_locks_count = gdlm_drop_count;
-	ls->drop_locks_period = gdlm_drop_period;
+	ls->drop_locks_count = GDLM_DROP_COUNT;
+	ls->drop_locks_period = GDLM_DROP_PERIOD;
 	ls->fscb = cb;
 	ls->sdp = sdp;
 	ls->fsflags = flags;
diff --git a/fs/gfs2/locking/dlm/sysfs.c b/fs/gfs2/locking/dlm/sysfs.c
index 29ae06f..4746b88 100644
--- a/fs/gfs2/locking/dlm/sysfs.c
+++ b/fs/gfs2/locking/dlm/sysfs.c
@@ -116,6 +116,17 @@ static ssize_t recover_status_show(struct gdlm_ls *ls, char *buf)
 	return sprintf(buf, "%d\n", ls->recover_jid_status);
 }
 
+static ssize_t drop_count_show(struct gdlm_ls *ls, char *buf)
+{
+	return sprintf(buf, "%d\n", ls->drop_locks_count);
+}
+
+static ssize_t drop_count_store(struct gdlm_ls *ls, const char *buf, size_t len)
+{
+	ls->drop_locks_count = simple_strtol(buf, NULL, 0);
+	return len;
+}
+
 struct gdlm_attr {
 	struct attribute attr;
 	ssize_t (*show)(struct gdlm_ls *, char *);
@@ -135,6 +146,7 @@ GDLM_ATTR(first_done,     0444, first_done_show,     NULL);
 GDLM_ATTR(recover,        0644, recover_show,        recover_store);
 GDLM_ATTR(recover_done,   0444, recover_done_show,   NULL);
 GDLM_ATTR(recover_status, 0444, recover_status_show, NULL);
+GDLM_ATTR(drop_count,     0644, drop_count_show,     drop_count_store);
 
 static struct attribute *gdlm_attrs[] = {
 	&gdlm_attr_proto_name.attr,
@@ -147,6 +159,7 @@ static struct attribute *gdlm_attrs[] = {
 	&gdlm_attr_recover.attr,
 	&gdlm_attr_recover_done.attr,
 	&gdlm_attr_recover_status.attr,
+	&gdlm_attr_drop_count.attr,
 	NULL,
 };
 
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2/DLM] use sysfs
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (45 preceding siblings ...)
  2007-02-05 14:44 ` [GFS2] make lock_dlm drop_count tunable in sysfs [46/54] Steven Whitehouse
@ 2007-02-05 14:44 ` Steven Whitehouse
  2007-02-05 14:45 ` [GFS2/DLM] fix GFS2 circular dependency [48/54] Steven Whitehouse
                   ` (7 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Randy Dunlap, cluster-devel

>From 2383862e70905f612bde464c441c717a01d2f0e8 Mon Sep 17 00:00:00 2001
From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Thu, 25 Jan 2007 18:42:39 -0800
Subject: [PATCH] [GFS2/DLM] use sysfs

With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build
fails with:

WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use
kernel_subsys, they should either DEPEND on it or SELECT it.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/Kconfig b/fs/dlm/Kconfig
index b5654a2..d1359b9 100644
--- a/fs/dlm/Kconfig
+++ b/fs/dlm/Kconfig
@@ -5,6 +5,7 @@ config DLM
 	tristate "Distributed Lock Manager (DLM)"
 	depends on IPV6 || IPV6=n
 	select CONFIGFS_FS
+	select SYSFS
 	select IP_SCTP if DLM_SCTP
 	help
 	A general purpose distributed lock manager for kernel or userspace
diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index 6a2ffa2..2c184a9 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -38,6 +38,7 @@ config GFS2_FS_LOCKING_DLM
 	select IP_SCTP if DLM_SCTP
 	select CONFIGFS_FS
 	select DLM
+	select SYSFS
 	help
 	Multiple node locking module for GFS2
 
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2/DLM] fix GFS2 circular dependency [48/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (46 preceding siblings ...)
  2007-02-05 14:44 ` [GFS2/DLM] use sysfs Steven Whitehouse
@ 2007-02-05 14:45 ` Steven Whitehouse
  2007-02-05 14:46 ` [GFS2] more CURRENT_TIME_SEC [49/54] Steven Whitehouse
                   ` (6 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Randy Dunlap, Adrian Bunk, cluster-devel

>From e0d8f8673de341dbd2db39b8990a405683a3338f Mon Sep 17 00:00:00 2001
From: Adrian Bunk <bunk@stusta.de>
Date: Sun, 28 Jan 2007 17:19:50 +0100
Subject: [PATCH] [GFS2/DLM] fix GFS2 circular dependency

On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote:
> Andrew Morton wrote:
> >Temporarily at
> >
> >	http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
>
> Unable to select IPV6. Menuconfig doesn't offer it when INET is selected.
> When it's not it appears in the menu, but after state change it gets away.
> The same behaviour in xconfig, gconfig.
>
> $ mkdir ../a/tst
> $ make O=../a/tst menuconfig
>   HOSTCC  scripts/basic/fixdep
> [...]
>   HOSTLD  scripts/kconfig/mconf
> scripts/kconfig/mconf arch/i386/Kconfig
> Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS
> OCFS2_FS INET
>
> Maybe this is the problem?

Yes, patch below.

> regards,

cu
Adrian

<--  snip  -->

This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM
and DLM depend on instead of select SYSFS.

Since SYSFS depends on EMBEDDED this change shouldn't cause any problems
for users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/Kconfig b/fs/dlm/Kconfig
index d1359b9..7e264b7 100644
--- a/fs/dlm/Kconfig
+++ b/fs/dlm/Kconfig
@@ -3,9 +3,8 @@ menu "Distributed Lock Manager"
 
 config DLM
 	tristate "Distributed Lock Manager (DLM)"
-	depends on IPV6 || IPV6=n
+	depends on SYSFS && (IPV6 || IPV6=n)
 	select CONFIGFS_FS
-	select SYSFS
 	select IP_SCTP if DLM_SCTP
 	help
 	A general purpose distributed lock manager for kernel or userspace
diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index 2c184a9..cbd5f33 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -34,11 +34,10 @@ config GFS2_FS_LOCKING_NOLOCK
 
 config GFS2_FS_LOCKING_DLM
 	tristate "GFS2 DLM locking module"
-	depends on GFS2_FS && NET && INET && (IPV6 || IPV6=n)
+	depends on GFS2_FS && SYSFS && NET && INET && (IPV6 || IPV6=n)
 	select IP_SCTP if DLM_SCTP
 	select CONFIGFS_FS
 	select DLM
-	select SYSFS
 	help
 	Multiple node locking module for GFS2
 
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] more CURRENT_TIME_SEC [49/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (47 preceding siblings ...)
  2007-02-05 14:45 ` [GFS2/DLM] fix GFS2 circular dependency [48/54] Steven Whitehouse
@ 2007-02-05 14:46 ` Steven Whitehouse
  2007-02-05 14:47 ` [GFS2] Put back semaphore to avoid umount proble Steven Whitehouse
                   ` (5 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: Eric Sandeen, cluster-devel

>From 9df7e962608d496427bb2259024dcff933e16c6d Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen@redhat.com>
Date: Mon, 29 Jan 2007 11:11:51 -0600
Subject: [PATCH] [GFS2] more CURRENT_TIME_SEC

Whoops, quilt user error, missed this one in the previous patch.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/eattr.c b/fs/gfs2/eattr.c
index ebebbdc..0c83c7f 100644
--- a/fs/gfs2/eattr.c
+++ b/fs/gfs2/eattr.c
@@ -301,7 +301,7 @@ static int ea_dealloc_unstuffed(struct gfs2_inode *ip, struct buffer_head *bh,
 
 	error = gfs2_meta_inode_buffer(ip, &dibh);
 	if (!error) {
-		ip->i_inode.i_ctime.tv_sec = get_seconds();
+		ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
 		brelse(dibh);
@@ -718,7 +718,7 @@ static int ea_alloc_skeleton(struct gfs2_inode *ip, struct gfs2_ea_request *er,
 					    (er->er_mode & S_IFMT));
 			ip->i_inode.i_mode = er->er_mode;
 		}
-		ip->i_inode.i_ctime.tv_sec = get_seconds();
+		ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
 		brelse(dibh);
@@ -853,7 +853,7 @@ static int ea_set_simple_noalloc(struct gfs2_inode *ip, struct buffer_head *bh,
 			(ip->i_inode.i_mode & S_IFMT) == (er->er_mode & S_IFMT));
 		ip->i_inode.i_mode = er->er_mode;
 	}
-	ip->i_inode.i_ctime.tv_sec = get_seconds();
+	ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 	gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 	gfs2_dinode_out(ip, dibh->b_data);
 	brelse(dibh);
@@ -1134,7 +1134,7 @@ static int ea_remove_stuffed(struct gfs2_inode *ip, struct gfs2_ea_location *el)
 
 	error = gfs2_meta_inode_buffer(ip, &dibh);
 	if (!error) {
-		ip->i_inode.i_ctime.tv_sec = get_seconds();
+		ip->i_inode.i_ctime = CURRENT_TIME_SEC;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
 		brelse(dibh);
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Put back semaphore to avoid umount proble
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (48 preceding siblings ...)
  2007-02-05 14:46 ` [GFS2] more CURRENT_TIME_SEC [49/54] Steven Whitehouse
@ 2007-02-05 14:47 ` Steven Whitehouse
  2007-02-05 14:47 ` [GFS2] Fix unlink deadlocks [51/54] Steven Whitehouse
                   ` (4 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From eeca71338ea3b4b3bcd829b998b35c015316b134 Mon Sep 17 00:00:00 2001
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 29 Jan 2007 11:51:45 +0000
Subject: [PATCH] [GFS2] Put back semaphore to avoid umount problem

Dave Teigland fixed this bug a while back, but I managed to mistakenly
remove the semaphore during later development. It is required to avoid
the list of inodes changing during an invalidate_inodes call. I have
made it an rwsem since the read side will be taken frequently during
normal filesystem operation. The write side will only be taken during
umount of the file system.

Also the bug only triggers when using the DLM lock manager and only then
under certain conditions as it's timing related.
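
The read/write split can be illustrated with a small userspace model. It
is a sketch under assumptions: struct rwsem_model and its helpers are
hypothetical, they only count holders and never sleep, and
callback_model() stands in for a lock callback running concurrently with
umount. The point it makes is that every exit path of the read side has
to release the semaphore, otherwise the writer can never get in.

```c
#include <assert.h>
#include <stddef.h>

/* Counting model of a reader/writer semaphore; it never blocks. */
struct rwsem_model {
	int readers;
	int writer;
};

static int try_down_read(struct rwsem_model *s)
{
	if (s->writer)
		return 0;	/* a real rwsem would sleep here */
	s->readers++;
	return 1;
}

static void up_read_model(struct rwsem_model *s)
{
	assert(s->readers > 0);
	s->readers--;
}

static int try_down_write(struct rwsem_model *s)
{
	if (s->writer || s->readers)
		return 0;	/* the writer needs exclusive access */
	s->writer = 1;
	return 1;
}

static void up_write_model(struct rwsem_model *s)
{
	assert(s->writer);
	s->writer = 0;
}

/* Model of a callback that looks an object up under the read side. */
static int callback_model(struct rwsem_model *s, const int *object)
{
	int ret = -1;

	if (!try_down_read(s))
		return -2;	/* flush in progress */
	if (object == NULL)
		goto out;	/* the early exit still drops the read side */
	ret = *object;
out:
	up_read_model(s);
	return ret;
}
```

Many callbacks can hold the read side at once, while the umount path
waits until all of them have left before it invalidates the inodes.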

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index c070ede..6618c11 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -20,6 +20,7 @@
 #include <linux/list.h>
 #include <linux/lm_interface.h>
 #include <linux/wait.h>
+#include <linux/rwsem.h>
 #include <asm/uaccess.h>
 
 #include "gfs2.h"
@@ -45,6 +46,7 @@ static int dump_glock(struct gfs2_glock *gl);
 static int dump_inode(struct gfs2_inode *ip);
 static void gfs2_glock_xmote_th(struct gfs2_holder *gh);
 static void gfs2_glock_drop_th(struct gfs2_glock *gl);
+static DECLARE_RWSEM(gfs2_umount_flush_sem);
 
 #define GFS2_GL_HASH_SHIFT      15
 #define GFS2_GL_HASH_SIZE       (1 << GFS2_GL_HASH_SHIFT)
@@ -1578,12 +1580,14 @@ void gfs2_glock_cb(void *cb_data, unsigned int type, void *data)
 		struct lm_async_cb *async = data;
 		struct gfs2_glock *gl;
 
+		down_read(&gfs2_umount_flush_sem);
 		gl = gfs2_glock_find(sdp, &async->lc_name);
 		if (gfs2_assert_warn(sdp, gl))
 			return;
 		if (!gfs2_assert_warn(sdp, gl->gl_req_bh))
 			gl->gl_req_bh(gl, async->lc_ret);
 		gfs2_glock_put(gl);
+		up_read(&gfs2_umount_flush_sem);
 		return;
 	}
 
@@ -1828,7 +1832,9 @@ void gfs2_gl_hash_clear(struct gfs2_sbd *sdp, int wait)
 			t = jiffies;
 		}
 
+		down_write(&gfs2_umount_flush_sem);
 		invalidate_inodes(sdp->sd_vfs);
+		up_write(&gfs2_umount_flush_sem);
 		msleep(10);
 	}
 }
-- 
1.4.4.2




^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [GFS2] Fix unlink deadlocks [51/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (49 preceding siblings ...)
  2007-02-05 14:47 ` [GFS2] Put back semaphore to avoid umount proble Steven Whitehouse
@ 2007-02-05 14:47 ` Steven Whitehouse
  2007-02-05 14:48 ` [DLM/GFS2] indent help text [52/54] Steven Whitehouse
                   ` (3 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Russell Cattelan, cluster-devel

>From 69a3a75c929f5c38cdc2dea773692aca02bbdb58 Mon Sep 17 00:00:00 2001
From: Russell Cattelan <cattelan@redhat.com>
Date: Mon, 29 Jan 2007 17:13:44 -0600
Subject: [PATCH] [GFS2] Fix unlink deadlocks

Move the glock acquisition to outside of the transactions.

Lock ordering must be preserved in order to prevent ABBA
deadlocks. The current gfs2_change_nlink code tries
to grab the glock after having started a transaction, and thus while holding
the log lock. This is inconsistent with other code paths in
gfs that grab the resource group glock prior to starting
a transaction.

One problem with this fix is that the resource group
lock is now always grabbed, even if the inode still has
a reference count and cannot be marked for unlink.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 88fcfb4..0d6831a 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -280,50 +280,6 @@ out:
 	return error;
 }
 
-static int gfs2_change_nlink_i(struct gfs2_inode *ip)
-{
-	struct gfs2_sbd *sdp = ip->i_inode.i_sb->s_fs_info;
-	struct gfs2_inode *rindex = GFS2_I(sdp->sd_rindex);
-	struct gfs2_glock *ri_gl = rindex->i_gl;
-	struct gfs2_rgrpd *rgd;
-	struct gfs2_holder ri_gh, rg_gh;
-	int existing, error;
-
-	/* if we come from rename path, we could have the lock already */
-	existing = gfs2_glock_is_locked_by_me(ri_gl);
-	if (!existing) {
-		error = gfs2_rindex_hold(sdp, &ri_gh);
-		if (error)
-			goto out;
-	}
-
-	/* find the matching rgd */
-	error = -EIO;
-	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
-	if (!rgd)
-		goto out_norgrp;
-
-	/*
-	 * Eventually we may want to move rgd(s) to a linked list
-	 * and piggyback the free logic into one of gfs2 daemons
-	 * to gain some performance.
-	 */
-	if (!rgd->rd_gl || !gfs2_glock_is_locked_by_me(rgd->rd_gl)) {
-		error = gfs2_glock_nq_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, &rg_gh);
-		if (error)
-			goto out_norgrp;
-
-		gfs2_unlink_di(&ip->i_inode); /* mark inode unlinked */
-		gfs2_glock_dq_uninit(&rg_gh);
-	}
-
-out_norgrp:
-	if (!existing)
-		gfs2_glock_dq_uninit(&ri_gh);
-out:
-	return error;
-}
-
 /**
  * gfs2_change_nlink - Change nlink count on inode
  * @ip: The GFS2 inode
@@ -365,7 +321,7 @@ int gfs2_change_nlink(struct gfs2_inode *ip, int diff)
 	mark_inode_dirty(&ip->i_inode);
 
 	if (ip->i_inode.i_nlink == 0)
-		error = gfs2_change_nlink_i(ip);
+		gfs2_unlink_di(&ip->i_inode); /* mark inode unlinked */
 
 	return error;
 }
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 5591f89..f40a848 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -264,13 +264,23 @@ static int gfs2_unlink(struct inode *dir, struct dentry *dentry)
 	struct gfs2_inode *dip = GFS2_I(dir);
 	struct gfs2_sbd *sdp = GFS2_SB(dir);
 	struct gfs2_inode *ip = GFS2_I(dentry->d_inode);
-	struct gfs2_holder ghs[2];
+	struct gfs2_holder ghs[3];
+	struct gfs2_rgrpd *rgd;
+	struct gfs2_holder ri_gh;
 	int error;
 
+	error = gfs2_rindex_hold(sdp, &ri_gh);
+	if (error)
+		return error;
+
 	gfs2_holder_init(dip->i_gl, LM_ST_EXCLUSIVE, 0, ghs);
-	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + 1);
+	gfs2_holder_init(ip->i_gl,  LM_ST_EXCLUSIVE, 0, ghs + 1);
 
-	error = gfs2_glock_nq_m(2, ghs);
+	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
+	gfs2_holder_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + 2);
+
+
+	error = gfs2_glock_nq_m(3, ghs);
 	if (error)
 		goto out;
 
@@ -291,10 +301,12 @@ static int gfs2_unlink(struct inode *dir, struct dentry *dentry)
 out_end_trans:
 	gfs2_trans_end(sdp);
 out_gunlock:
-	gfs2_glock_dq_m(2, ghs);
+	gfs2_glock_dq_m(3, ghs);
 out:
 	gfs2_holder_uninit(ghs);
 	gfs2_holder_uninit(ghs + 1);
+	gfs2_holder_uninit(ghs + 2);
+	gfs2_glock_dq_uninit(&ri_gh);
 	return error;
 }
 
@@ -449,13 +461,22 @@ static int gfs2_rmdir(struct inode *dir, struct dentry *dentry)
 	struct gfs2_inode *dip = GFS2_I(dir);
 	struct gfs2_sbd *sdp = GFS2_SB(dir);
 	struct gfs2_inode *ip = GFS2_I(dentry->d_inode);
-	struct gfs2_holder ghs[2];
+	struct gfs2_holder ghs[3];
+	struct gfs2_rgrpd *rgd;
+	struct gfs2_holder ri_gh;
 	int error;
 
+
+	error = gfs2_rindex_hold(sdp, &ri_gh);
+	if (error)
+		return error;
 	gfs2_holder_init(dip->i_gl, LM_ST_EXCLUSIVE, 0, ghs);
 	gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + 1);
 
-	error = gfs2_glock_nq_m(2, ghs);
+	rgd = gfs2_blk2rgrpd(sdp, ip->i_num.no_addr);
+	gfs2_holder_init(rgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + 2);
+
+	error = gfs2_glock_nq_m(3, ghs);
 	if (error)
 		goto out;
 
@@ -483,10 +504,12 @@ static int gfs2_rmdir(struct inode *dir, struct dentry *dentry)
 	gfs2_trans_end(sdp);
 
 out_gunlock:
-	gfs2_glock_dq_m(2, ghs);
+	gfs2_glock_dq_m(3, ghs);
 out:
 	gfs2_holder_uninit(ghs);
 	gfs2_holder_uninit(ghs + 1);
+	gfs2_holder_uninit(ghs + 2);
+	gfs2_glock_dq_uninit(&ri_gh);
 	return error;
 }
 
@@ -547,7 +570,8 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 	struct gfs2_inode *ip = GFS2_I(odentry->d_inode);
 	struct gfs2_inode *nip = NULL;
 	struct gfs2_sbd *sdp = GFS2_SB(odir);
-	struct gfs2_holder ghs[4], r_gh;
+	struct gfs2_holder ghs[5], r_gh;
+	struct gfs2_rgrpd *nrgd;
 	unsigned int num_gh;
 	int dir_rename = 0;
 	int alloc_required;
@@ -587,6 +611,13 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
 	if (nip) {
 		gfs2_holder_init(nip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh);
 		num_gh++;
+		/* grab the resource lock for unlink flag twiddling 
+		 * this is the case of the target file already existing
+		 * so we unlink before doing the rename
+		 */
+		nrgd = gfs2_blk2rgrpd(sdp, nip->i_num.no_addr);
+		if (nrgd)
+			gfs2_holder_init(nrgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh++);
 	}
 
 	error = gfs2_glock_nq_m(num_gh, ghs);
-- 
1.4.4.2
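
The rule this patch enforces (take every lock the operation needs, in one
consistent order, before the transaction starts) can be sketched as below.
This is a single-threaded model under stated assumptions, not GFS2 code:
held[], lock_all, unlock_all, trans_begin and trans_end are hypothetical
names, and sorting by id is just one common way to get a consistent order.
How gfs2_glock_nq_m orders its holders is not shown in this patch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LOCKS 8

/* Hypothetical lock table: held[i] is nonzero while lock i is held. */
static int held[MAX_LOCKS];
static int in_transaction;

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Acquire all locks in ascending id order; returns 0 or -1. */
static int lock_all(int *ids, size_t n)
{
	size_t i;

	/* Rule from the commit message: no new locks inside a transaction. */
	if (in_transaction)
		return -1;

	qsort(ids, n, sizeof(ids[0]), cmp_int);
	for (i = 0; i < n; i++) {
		assert(ids[i] >= 0 && ids[i] < MAX_LOCKS);
		if (held[ids[i]]) {
			/* Back out what was taken so far. */
			while (i-- > 0)
				held[ids[i]] = 0;
			return -1;
		}
		held[ids[i]] = 1;
	}
	return 0;
}

static void unlock_all(const int *ids, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		held[ids[i]] = 0;
}

static void trans_begin(void) { in_transaction = 1; }
static void trans_end(void) { in_transaction = 0; }
```

Because every caller sorts the same way, two callers that need locks 1 and
3 both take 1 first, so neither can hold 3 while waiting for 1.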





* [DLM/GFS2] indent help text [52/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (50 preceding siblings ...)
  2007-02-05 14:47 ` [GFS2] Fix unlink deadlocks [51/54] Steven Whitehouse
@ 2007-02-05 14:48 ` Steven Whitehouse
  2007-02-05 14:49 ` [DLM] zero new user lvbs [53/54] Steven Whitehouse
                   ` (2 subsequent siblings)
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: Randy Dunlap, cluster-devel

>From 10e3fe526f52312da2d4c7cb7ab5e6cc56017a4c Mon Sep 17 00:00:00 2001
From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Tue, 30 Jan 2007 14:30:08 -0800
Subject: [PATCH] [DLM/GFS2] indent help text

Indent help text as expected.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/Kconfig b/fs/dlm/Kconfig
index 7e264b7..6fa7b0d 100644
--- a/fs/dlm/Kconfig
+++ b/fs/dlm/Kconfig
@@ -7,17 +7,17 @@ config DLM
 	select CONFIGFS_FS
 	select IP_SCTP if DLM_SCTP
 	help
-	A general purpose distributed lock manager for kernel or userspace
-	applications.
+	  A general purpose distributed lock manager for kernel or userspace
+	  applications.
 
 choice
 	prompt "Select DLM communications protocol"
 	depends on DLM
 	default DLM_TCP
 	help
-	The DLM Can use TCP or SCTP for it's network communications.
-	SCTP supports multi-homed operations whereas TCP doesn't.
-	However, SCTP seems to have stability problems at the moment.
+	  The DLM Can use TCP or SCTP for it's network communications.
+	  SCTP supports multi-homed operations whereas TCP doesn't.
+	  However, SCTP seems to have stability problems at the moment.
 
 config DLM_TCP
 	bool "TCP/IP"
@@ -31,8 +31,8 @@ config DLM_DEBUG
 	bool "DLM debugging"
 	depends on DLM
 	help
-	Under the debugfs mount point, the name of each lockspace will
-	appear as a file in the "dlm" directory.  The output is the
-	list of resource and locks the local node knows about.
+	  Under the debugfs mount point, the name of each lockspace will
+	  appear as a file in the "dlm" directory.  The output is the
+	  list of resource and locks the local node knows about.
 
 endmenu
diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index cbd5f33..de8e64c 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -4,33 +4,33 @@ config GFS2_FS
 	select FS_POSIX_ACL
 	select CRC32
 	help
-	A cluster filesystem.
+	  A cluster filesystem.
 
-	Allows a cluster of computers to simultaneously use a block device
-	that is shared between them (with FC, iSCSI, NBD, etc...).  GFS reads
-	and writes to the block device like a local filesystem, but also uses
-	a lock module to allow the computers coordinate their I/O so
-	filesystem consistency is maintained.  One of the nifty features of
-	GFS is perfect consistency -- changes made to the filesystem on one
-	machine show up immediately on all other machines in the cluster.
+	  Allows a cluster of computers to simultaneously use a block device
+	  that is shared between them (with FC, iSCSI, NBD, etc...).  GFS reads
+	  and writes to the block device like a local filesystem, but also uses
+	  a lock module to allow the computers coordinate their I/O so
+	  filesystem consistency is maintained.  One of the nifty features of
+	  GFS is perfect consistency -- changes made to the filesystem on one
+	  machine show up immediately on all other machines in the cluster.
 
-	To use the GFS2 filesystem, you will need to enable one or more of
-	the below locking modules. Documentation and utilities for GFS2 can
-	be found here: http://sources.redhat.com/cluster
+	  To use the GFS2 filesystem, you will need to enable one or more of
+	  the below locking modules. Documentation and utilities for GFS2 can
+	  be found here: http://sources.redhat.com/cluster
 
 config GFS2_FS_LOCKING_NOLOCK
 	tristate "GFS2 \"nolock\" locking module"
 	depends on GFS2_FS
 	help
-	Single node locking module for GFS2.
+	  Single node locking module for GFS2.
 
-	Use this module if you want to use GFS2 on a single node without
-	its clustering features. You can still take advantage of the
-	large file support, and upgrade to running a full cluster later on
-	if required.
+	  Use this module if you want to use GFS2 on a single node without
+	  its clustering features. You can still take advantage of the
+	  large file support, and upgrade to running a full cluster later on
+	  if required.
 
-	If you will only be using GFS2 in cluster mode, you do not need this
-	module.
+	  If you will only be using GFS2 in cluster mode, you do not need this
+	  module.
 
 config GFS2_FS_LOCKING_DLM
 	tristate "GFS2 DLM locking module"
@@ -39,9 +39,8 @@ config GFS2_FS_LOCKING_DLM
 	select CONFIGFS_FS
 	select DLM
 	help
-	Multiple node locking module for GFS2
-
-	Most users of GFS2 will require this module. It provides the locking
-	interface between GFS2 and the DLM, which is required to use GFS2
-	in a cluster environment.
+	  Multiple node locking module for GFS2
 
+	  Most users of GFS2 will require this module. It provides the locking
+	  interface between GFS2 and the DLM, which is required to use GFS2
+	  in a cluster environment.
-- 
1.4.4.2





* [DLM] zero new user lvbs [53/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (51 preceding siblings ...)
  2007-02-05 14:48 ` [DLM/GFS2] indent help text [52/54] Steven Whitehouse
@ 2007-02-05 14:49 ` Steven Whitehouse
  2007-02-05 14:50 ` [DLM] fix softlockup in dlm_recv [54/54] Steven Whitehouse
  2007-02-07 13:20 ` [GFS2 & DLM] Pull request Steven Whitehouse
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Teigland, cluster-devel

>From 3c5b0f7925c4e4945f5589a6dd82c1eae81b17cc Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Wed, 31 Jan 2007 13:25:00 -0600
Subject: [PATCH] [DLM] zero new user lvbs

A new lvb for a userland lock wasn't being initialized to zero.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index c10257f..e725005 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -3643,7 +3643,7 @@ int dlm_user_request(struct dlm_ls *ls, struct dlm_user_args *ua,
 	}
 
 	if (flags & DLM_LKF_VALBLK) {
-		ua->lksb.sb_lvbptr = kmalloc(DLM_USER_LVB_LEN, GFP_KERNEL);
+		ua->lksb.sb_lvbptr = kzalloc(DLM_USER_LVB_LEN, GFP_KERNEL);
 		if (!ua->lksb.sb_lvbptr) {
 			kfree(ua);
 			__put_lkb(ls, lkb);
@@ -3712,7 +3712,7 @@ int dlm_user_convert(struct dlm_ls *ls, struct dlm_user_args *ua_tmp,
 	ua = (struct dlm_user_args *)lkb->lkb_astparam;
 
 	if (flags & DLM_LKF_VALBLK && !ua->lksb.sb_lvbptr) {
-		ua->lksb.sb_lvbptr = kmalloc(DLM_USER_LVB_LEN, GFP_KERNEL);
+		ua->lksb.sb_lvbptr = kzalloc(DLM_USER_LVB_LEN, GFP_KERNEL);
 		if (!ua->lksb.sb_lvbptr) {
 			error = -ENOMEM;
 			goto out_put;
-- 
1.4.4.2
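
The kmalloc to kzalloc change has a direct userspace analogue: malloc
returns uninitialized memory, while calloc returns zeroed memory. The
sketch below assumes a 32-byte buffer as a stand-in for DLM_USER_LVB_LEN
(the real value is not shown in this patch); alloc_lvb and lvb_is_zeroed
are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define USER_LVB_LEN 32	/* assumed size, stands in for DLM_USER_LVB_LEN */

/* Allocate a lock value block that is guaranteed to start out as zeros. */
static char *alloc_lvb(void)
{
	return calloc(1, USER_LVB_LEN);
}

/* Returns 1 if every byte of the buffer is zero, 0 otherwise. */
static int lvb_is_zeroed(const char *lvb)
{
	static const char zeros[USER_LVB_LEN];

	assert(lvb != NULL);
	return memcmp(lvb, zeros, USER_LVB_LEN) == 0;
}
```

With malloc in place of calloc the first check could pass or fail
depending on what the allocator hands back, which is the kind of
uninitialized-data exposure the patch removes.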





* [DLM] fix softlockup in dlm_recv [54/54]
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (52 preceding siblings ...)
  2007-02-05 14:49 ` [DLM] zero new user lvbs [53/54] Steven Whitehouse
@ 2007-02-05 14:50 ` Steven Whitehouse
  2007-02-07 13:20 ` [GFS2 & DLM] Pull request Steven Whitehouse
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-05 14:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: Patrick Caulfield, cluster-devel

>From aee6c8244dea3199ff3d17fc3167660ea95299ec Mon Sep 17 00:00:00 2001
From: Patrick Caulfield <pcaulfie@redhat.com>
Date: Thu, 1 Feb 2007 16:46:33 +0000
Subject: [PATCH] [DLM] fix softlockup in dlm_recv

This patch stops the dlm_recv workqueue from busy-waiting when a node
disconnects. The busy-wait can cause soft lockup errors on debug systems
and bad performance generally.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

diff --git a/fs/dlm/lowcomms-tcp.c b/fs/dlm/lowcomms-tcp.c
index 18ade44..f1efd17 100644
--- a/fs/dlm/lowcomms-tcp.c
+++ b/fs/dlm/lowcomms-tcp.c
@@ -2,7 +2,7 @@
 *******************************************************************************
 **
 **  Copyright (C) Sistina Software, Inc.  1997-2003  All rights reserved.
-**  Copyright (C) 2004-2006 Red Hat, Inc.  All rights reserved.
+**  Copyright (C) 2004-2007 Red Hat, Inc.  All rights reserved.
 **
 **  This copyrighted material is made available to anyone wishing to use,
 **  modify, copy, or redistribute it subject to the terms and conditions
@@ -109,7 +109,6 @@ struct connection {
 	struct page *rx_page;
 	struct cbuf cb;
 	int retries;
-	atomic_t waiting_requests;
 #define MAX_CONNECT_RETRIES 3
 	struct connection *othercon;
 	struct work_struct rwork; /* Receive workqueue */
@@ -278,8 +277,11 @@ static int receive_from_sock(struct connection *con)
 
 	mutex_lock(&con->sock_mutex);
 
-	if (con->sock == NULL)
-		goto out;
+	if (con->sock == NULL) {
+		ret = -EAGAIN;
+		goto out_close;
+	}
+
 	if (con->rx_page == NULL) {
 		/*
 		 * This doesn't need to be atomic, but I think it should
@@ -352,7 +354,6 @@ static int receive_from_sock(struct connection *con)
 		con->rx_page = NULL;
 	}
 
-out:
 	if (call_again_soon)
 		goto out_resched;
 	mutex_unlock(&con->sock_mutex);
@@ -370,6 +371,9 @@ out_close:
 		close_connection(con, false);
 		/* Reconnect when there is something to send */
 	}
+	/* Don't return success if we really got EOF */
+	if (ret == 0)
+		ret = -EAGAIN;
 
 	return ret;
 }
@@ -847,7 +851,6 @@ int dlm_lowcomms_close(int nodeid)
 	if (con) {
 		clean_one_writequeue(con);
 		close_connection(con, true);
-		atomic_set(&con->waiting_requests, 0);
 	}
 	return 0;
 
-- 
1.4.4.2
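
Why returning 0 at EOF leads to a busy-wait can be shown with a small
model. It assumes a caller that keeps calling the receive step for as long
as it returns 0; that loop is not part of the hunks above, so treat its
shape as an assumption. fake_conn, receive_step and run_worker are
hypothetical names, and MAX_SPINS exists only so the unfixed case
terminates in the demo.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_SPINS 1000	/* demo-only cap on the number of calls */

/* Fake connection: yields queued byte counts, then 0 (EOF) forever. */
struct fake_conn {
	const int *reads;
	int nreads;
	int pos;
};

/* One receive step: 0 means "call again", negative means "stop". */
static int receive_step(struct fake_conn *c, int treat_eof_as_error)
{
	int n;

	assert(c != NULL);
	n = (c->pos < c->nreads) ? c->reads[c->pos++] : 0;	/* 0 == EOF */
	if (n > 0)
		return 0;		/* data consumed, more may follow */
	if (n == 0 && treat_eof_as_error)
		return -EAGAIN;		/* the fix: EOF is not success */
	return n;
}

/* Worker loop: repeat while steps succeed; returns the number of calls. */
static int run_worker(struct fake_conn *c, int treat_eof_as_error)
{
	int calls = 0;
	int err;

	do {
		err = receive_step(c, treat_eof_as_error);
		calls++;
	} while (!err && calls < MAX_SPINS);

	return calls;
}
```

With two queued reads, the fixed variant stops on the third call, while
the unfixed variant keeps spinning on EOF until it hits the demo cap.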





* [GFS2 & DLM] Pull request
  2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
                   ` (53 preceding siblings ...)
  2007-02-05 14:50 ` [DLM] fix softlockup in dlm_recv [54/54] Steven Whitehouse
@ 2007-02-07 13:20 ` Steven Whitehouse
  54 siblings, 0 replies; 56+ messages in thread
From: Steven Whitehouse @ 2007-02-07 13:20 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel, cluster-devel

Hi,

Please consider pulling the following GFS2 & DLM changes. They are as per
the patches posted recently on lkml, except for three minor changes
(two small bug fixes and a function which should have been static) which
are marked with [*] in the list below. All the other patches have
been in -mm, most of them for a number of weeks. None of the patches touch
any code outside the gfs2 and dlm directories.

Steve.

The following changes since commit 62d0cfcb27cf755cebdc93ca95dabc83608007cd:
  Linus Torvalds (1):
        Linux 2.6.20

are found in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw.git

Adrian Bunk (4):
      [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions
      [GFS2] make gfs2_change_nlink_i() static
      [GFS2/DLM] fix GFS2 circular dependency
      [GFS2] make gfs2_writepages() static [*]

David Teigland (18):
      [GFS2] don't try to lockfs after shutdown
      [DLM] fix resend rcom lock
      [DLM] fix old rcom messages
      [DLM] add version check
      [DLM] fix send_args() lvb copying
      [DLM] fix receive_request() lvb copying
      [DLM] fix lost flags in stub replies
      [DLM] change some log_error to log_debug
      [DLM] rename dlm_config_info fields
      [DLM] add config entry to enable log_debug
      [DLM] expose dlm_config_info fields in configfs
      [DLM] fix user unlocking
      [DLM] fix master recovery
      [DLM] saved dlm message can be dropped
      [DLM] can miss clearing resend flag
      [GFS2] increase default lock limit
      [GFS2] make lock_dlm drop_count tunable in sysfs
      [DLM] zero new user lvbs

Eric Sandeen (2):
      [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2
      [GFS2] more CURRENT_TIME_SEC

Patrick Caulfield (7):
      [DLM] Fix schedule() calls
      [DLM] Fix spin lock already unlocked bug
      [DLM] Use workqueues for dlm lowcomms
      [DLM] lowcomms tidy
      [DLM] fix lowcomms receiving
      [DLM] Make sock_sem into a mutex
      [DLM] fix softlockup in dlm_recv

Randy Dunlap (2):
      [GFS2/DLM] use sysfs
      [DLM/GFS2] indent help text

Robert Peterson (1):
      [GFS2] gfs2 knows of directories which it chooses not to display

Russell Cattelan (2):
      [GFS2] BZ 217008 fsfuzzer fix.
      [GFS2] Fix unlink deadlocks

S. Wendy Cheng (2):
      [GFS2] Fix change nlink deadlock
      [GFS2] Fix gfs2_rename deadlock

Steven Whitehouse (18):
      [GFS2] Fix DIO deadlock
      [GFS2] Fail over to readpage for stuffed files
      [GFS2] Fix ordering of page disposal vs. glock_dq
      [GFS2] Add writepages for "data=writeback" mounts
      [GFS2] Clean up/speed up readdir
      [GFS2] Remove max_atomic_write tunable
      [GFS2] Shrink gfs2_inode memory by half
      [GFS2] Remove the "greedy" function from glock.[ch]
      [GFS2] Remove unused go_callback operation
      [GFS2] Remove local exclusive glock mode
      [GFS2] Tidy up glops calls
      [GFS2] Remove queue_empty() function
      [GFS2] Compile fix for glock.c
      [GFS2] Fix typo in glock.c
      [GFS2] Fix recursive locking attempt with NFS
      [GFS2] Fix list corruption in lops.c
      [GFS2] Put back semaphore to avoid umount problem
      [GFS2] Unlock page on prepare_write try lock failure [*]

Wendy Cheng (1):
      [GFS2] nfsd readdirplus assertion failure [*]

 fs/dlm/Kconfig                 |   18 +-
 fs/dlm/config.c                |  154 ++++++++++++++++--
 fs/dlm/config.h                |   17 +-
 fs/dlm/dlm_internal.h          |   20 ++-
 fs/dlm/lock.c                  |   87 +++++++---
 fs/dlm/lockspace.c             |   10 +-
 fs/dlm/lowcomms-sctp.c         |  151 ++++++++---------
 fs/dlm/lowcomms-tcp.c          |  361 +++++++++++-----------------------------
 fs/dlm/midcomms.c              |    4 +-
 fs/dlm/rcom.c                  |   85 ++++++----
 fs/dlm/recover.c               |    8 +-
 fs/dlm/recoverd.c              |   22 ++--
 fs/dlm/user.c                  |    9 +
 fs/dlm/util.c                  |    4 +
 fs/gfs2/Kconfig                |   47 +++---
 fs/gfs2/bmap.c                 |   10 +-
 fs/gfs2/dir.c                  |   25 ++--
 fs/gfs2/dir.h                  |   21 +--
 fs/gfs2/eattr.c                |    8 +-
 fs/gfs2/glock.c                |  316 +++++++++---------------------------
 fs/gfs2/glock.h                |   11 --
 fs/gfs2/glops.c                |  136 +++++----------
 fs/gfs2/incore.h               |   18 +--
 fs/gfs2/inode.c                |   61 ++++----
 fs/gfs2/lm.c                   |    8 +-
 fs/gfs2/locking/dlm/lock_dlm.h |    2 +-
 fs/gfs2/locking/dlm/main.c     |    6 -
 fs/gfs2/locking/dlm/mount.c    |    6 +-
 fs/gfs2/locking/dlm/sysfs.c    |   13 ++
 fs/gfs2/lops.c                 |   14 ++-
 fs/gfs2/ops_address.c          |  134 +++++++++------
 fs/gfs2/ops_dentry.c           |   16 ++-
 fs/gfs2/ops_export.c           |   15 +-
 fs/gfs2/ops_file.c             |   52 +------
 fs/gfs2/ops_inode.c            |   55 +++++--
 fs/gfs2/ops_super.c            |   11 +-
 fs/gfs2/ops_vm.c               |   24 +---
 fs/gfs2/super.c                |   16 +--
 fs/gfs2/sys.c                  |   10 -
 39 files changed, 866 insertions(+), 1119 deletions(-)




Thread overview: 56+ messages
2007-02-05 14:07 [GFS2 & DLM] Proposed patches for 2.6.20 merge window [0/54] Steven Whitehouse
2007-02-05 14:09 ` [GFS2] don't try to lockfs after shutdown [1/54] Steven Whitehouse
2007-02-05 14:09 ` [DLM] fix resend rcom lock [2/54] Steven Whitehouse
2007-02-05 14:10 ` [DLM] fix old rcom messages [3/54] Steven Whitehouse
2007-02-05 14:11 ` [DLM] add version check [4/54] Steven Whitehouse
2007-02-05 14:12 ` [DLM] fix send_args() lvb copying [5/54] Steven Whitehouse
2007-02-05 14:13 ` [DLM] fix receive_request() lvb copying [6/54] Steven Whitehouse
2007-02-05 14:14 ` [DLM] fix lost flags in stub replies Steven Whitehouse
2007-02-05 14:15 ` [DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions [8/54] Steven Whitehouse
2007-02-05 14:16 ` [GFS2] Fix DIO deadlock [9/54] Steven Whitehouse
2007-02-05 14:17 ` [GFS2] Fail over to readpage for stuffed files [10/54] Steven Whitehouse
2007-02-05 14:18 ` [GFS2] Fix change nlink deadlock [11/54] Steven Whitehouse
2007-02-05 14:19 ` [DLM] Fix schedule() calls [12/54] Steven Whitehouse
2007-02-05 14:19 ` [DLM] Fix spin lock already unlocked bug [13/54] Steven Whitehouse
2007-02-05 14:20 ` [GFS2] Fix ordering of page disposal vs. glock_dq [14/54] Steven Whitehouse
2007-02-05 14:21 ` [GFS2] BZ 217008 fsfuzzer fix [15/54] Steven Whitehouse
2007-02-05 14:22 ` [GFS2] Fix gfs2_rename deadlock [16/54] Steven Whitehouse
2007-02-05 14:22 ` [DLM] change some log_error to log_debug [17/54] Steven Whitehouse
2007-02-05 14:23 ` [DLM] rename dlm_config_info fields [18/54] Steven Whitehouse
2007-02-05 14:24 ` [DLM] add config entry to enable log_debug [16/54] Steven Whitehouse
2007-02-05 14:25 ` [DLM] expose dlm_config_info fields in configfs [20/54] Steven Whitehouse
2007-02-05 14:26 ` [GFS2] gfs2 knows of directories which it chooses not to display [21/54] Steven Whitehouse
2007-02-05 14:27 ` [GFS2] make gfs2_change_nlink_i() static [22/54] Steven Whitehouse
2007-02-05 14:28 ` [DLM] Use workqueues for dlm lowcomms [23/54] Steven Whitehouse
2007-02-05 14:29 ` [DLM] fix user unlocking [24/54] Steven Whitehouse
2007-02-05 14:29 ` [DLM] fix master recovery [25/54] Steven Whitehouse
2007-02-05 14:30 ` [GFS2] Add writepages for "data=writeback" mounts [26/54] Steven Whitehouse
2007-02-05 14:31 ` [GFS2] Clean up/speed up readdir [27/54] Steven Whitehouse
2007-02-05 14:31 ` [GFS2] Remove max_atomic_write tunable [28/54] Steven Whitehouse
2007-02-05 14:32 ` [GFS2] Shrink gfs2_inode memory by half [29/54] Steven Whitehouse
2007-02-05 14:33 ` [GFS2] Remove the "greedy" function from glock.[ch] [30/54] Steven Whitehouse
2007-02-05 14:34 ` [GFS2] Remove unused go_callback operation [31/54] Steven Whitehouse
2007-02-05 14:34 ` [GFS2] Remove local exclusive glock mode [32/54] Steven Whitehouse
2007-02-05 14:35 ` [DLM] lowcomms tidy [33/54] Steven Whitehouse
2007-02-05 14:35 ` [GFS2] Tidy up glops calls [34/54] Steven Whitehouse
2007-02-05 14:36 ` [DLM] fix lowcomms receiving [35/54] Steven Whitehouse
2007-02-05 14:37 ` [GFS2] Remove queue_empty() function [36/54] Steven Whitehouse
2007-02-05 14:37 ` [GFS2] Compile fix for glock.c [37/54] Steven Whitehouse
2007-02-05 14:38 ` [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 [38/54] Steven Whitehouse
2007-02-05 14:39 ` [GFS2] Fix typo in glock.c [39/54] Steven Whitehouse
2007-02-05 14:40 ` [DLM] Make sock_sem into a mutex [40/54] Steven Whitehouse
2007-02-05 14:40 ` [DLM] saved dlm message can be dropped [41/54] Steven Whitehouse
2007-02-05 14:41 ` [DLM] can miss clearing resend flag Steven Whitehouse
2007-02-05 14:41 ` [GFS2] Fix recursive locking attempt with NFS [43/54] Steven Whitehouse
2007-02-05 14:42 ` [GFS2] Fix list corruption in lops.c [44/54] Steven Whitehouse
2007-02-05 14:43 ` [GFS2] increase default lock limit [45/54] Steven Whitehouse
2007-02-05 14:44 ` [GFS2] make lock_dlm drop_count tunable in sysfs [46/54] Steven Whitehouse
2007-02-05 14:44 ` [GFS2/DLM] use sysfs Steven Whitehouse
2007-02-05 14:45 ` [GFS2/DLM] fix GFS2 circular dependency [48/54] Steven Whitehouse
2007-02-05 14:46 ` [GFS2] more CURRENT_TIME_SEC [49/54] Steven Whitehouse
2007-02-05 14:47 ` [GFS2] Put back semaphore to avoid umount proble Steven Whitehouse
2007-02-05 14:47 ` [GFS2] Fix unlink deadlocks [51/54] Steven Whitehouse
2007-02-05 14:48 ` [DLM/GFS2] indent help text [52/54] Steven Whitehouse
2007-02-05 14:49 ` [DLM] zero new user lvbs [53/54] Steven Whitehouse
2007-02-05 14:50 ` [DLM] fix softlockup in dlm_recv [54/54] Steven Whitehouse
2007-02-07 13:20 ` [GFS2 & DLM] Pull request Steven Whitehouse
