From: Fabian Frederick <fabf@skynet.be>
To: Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>, Ian Campbell <ian.campbell@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel <xen-devel@lists.xen.org>,
	Evgeniy Dushistov <dushistov@mail.ru>,
	Alexey Khoroshilov <khoroshilov@ispras.ru>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
Date: Fri, 5 Jun 2015 18:27:01 +0200 (CEST)
Message-ID: <1122467636.634568.1433521621076.open-xchange@webmail.nmp.proximus.be>
In-Reply-To: <20150604050123.GL7232@ZenIV.linux.org.uk>

> On 04 June 2015 at 07:01 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
>
> On Wed, May 27, 2015 at 02:57:35PM -0700, Andrew Morton wrote:
> > On Wed, 27 May 2015 21:15:30 +0200 Fabian Frederick <fabf@skynet.be> wrote:
> >
> > > This reverts commit 9ef7db7f38d0
> > > ("ufs: fix deadlocks introduced by sb mutex merge")
> > > That patch tried to solve
> > >     Commit 0244756edc4b98c
> > >     ("ufs: sb mutex merge + mutex_destroy")
> > > which is itself partially reverted due to multiple deadlocks
> >
> > This is all very vague.  The changelogs are missing any description of
> > the deadlocks: how they are triggered, why they occur.  And there's no
> > description of how the patches fix these deadlocks.  And as we're
> > reverting a bunch of things one wonders whether the problems which the
> > now-reverted patches fixed are being reintroduced.
> >
> > Has anyone (Ian?) confirmed that the fs works OK with these patches?
>
> Folks, how about we figure out what's really being protected by that
> mutex?  IIRC, the main irregularity about ufs is the need to deal with
> growing the partial final block - unlike e.g ext2, ufs has differently-sized
> blocks and fragments.  Basically, it's tail-packing - short files have
> the last used direct pointer refer to a group of adjacent fragments that
> doesn't have to be block-aligned or fill the entire block.  They can't
> cross the disk block boundary and write might have to reallocate the partial
> block.
>
> So we need
>       * per-page exclusion for reallocation time (normal page locks are
> doing that)
>       * per-fs exclusion for block and fragment allocations (->s_lock?)
>       * per-fs exclusion for inode allocations (->s_lock?)
>       * per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
>       * per-directory exclusion for contents access (->i_mutex gives that)
>
> Looks like we ought to add ->truncate_mutex and shove lock_ufs() calls
> all way down into balloc.c (and ialloc.c for inode allocations)...
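
To check my understanding of the layering described above, here is a rough
userspace sketch, not kernel code. pthread mutexes stand in for kernel
mutexes, and model_sb, model_inode, model_alloc() and model_extend() are
made-up names. It assumes the per-inode mutex is taken outside the per-fs
lock; that ordering is my assumption and is not stated above.

```c
/* Userspace model only: every type and function name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct model_sb {
	pthread_mutex_t s_lock;		/* per-fs: allocator state */
	uint64_t next_free;		/* toy allocator cursor shared by all inodes */
};

struct model_inode {
	pthread_mutex_t truncate_mutex;	/* per-inode: block mapping changes */
	struct model_sb *sb;
	uint64_t data_ptr;		/* toy stand-in for the last direct pointer */
	uint64_t lastfrag;		/* toy stand-in for i_lastfrag */
};

static void model_sb_init(struct model_sb *sb)
{
	pthread_mutex_init(&sb->s_lock, NULL);
	sb->next_free = 1;
}

static void model_inode_init(struct model_inode *inode, struct model_sb *sb)
{
	pthread_mutex_init(&inode->truncate_mutex, NULL);
	inode->sb = sb;
	inode->data_ptr = 0;
	inode->lastfrag = 0;
}

/* Allocator: touches only per-fs state, so it takes only the per-fs lock. */
static uint64_t model_alloc(struct model_sb *sb, unsigned count)
{
	uint64_t result;

	pthread_mutex_lock(&sb->s_lock);
	result = sb->next_free;
	sb->next_free += count;
	pthread_mutex_unlock(&sb->s_lock);
	return result;
}

/* Mapping change: per-inode mutex outside, per-fs lock nested inside. */
static uint64_t model_extend(struct model_inode *inode, unsigned count)
{
	uint64_t result;

	assert(count > 0);
	pthread_mutex_lock(&inode->truncate_mutex);
	result = model_alloc(inode->sb, count);
	inode->data_ptr = result;
	inode->lastfrag += count;
	pthread_mutex_unlock(&inode->truncate_mutex);
	return result;
}
```

In this model two inodes can change their own mappings independently, while
the shared allocator cursor is only ever touched under the per-fs lock.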

If we look at linux-next with the original mutex behavior restored,
the mutexes and the spinlock in struct ufs_sb_info are the following:

struct ufs_sb_info {

        struct mutex mutex;     /* taken via lock_ufs()/unlock_ufs() */
        struct mutex s_lock;    /* taken via mutex_lock()/mutex_unlock() */
        spinlock_t work_lock;   /* protects sync_work and work_queued */

};

You're asking to remove lock_ufs() in the allocation paths and replace it
with truncate_mutex. I guess you're talking about doing that on the current
rc (without s_lock restored).

I tried a quick patch on rc that converts the lock_ufs()/unlock_ufs()
calls to a per-inode truncate_mutex (see attachment).
Is it going in the right direction? That would involve dropping the two
linux-next reverts in ufs.

Regards,
Fabian

[-- Attachment #2: ufsmutex2 --]
[-- Type: application/octet-stream, Size: 7541 bytes --]

diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c
index 2c10360..23f9daf 100644
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -40,6 +40,7 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned cgno, bit, end_bit, bbase, blkmap, i;
 	u64 blkno;
 	
@@ -52,7 +53,7 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 	if (ufs_fragnum(fragment) + count > uspi->s_fpg)
 		ufs_error (sb, "ufs_free_fragments", "internal error");
 	
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	
 	cgno = ufs_dtog(uspi, fragment);
 	bit = ufs_dtogd(uspi, fragment);
@@ -116,12 +117,12 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 		ubh_sync_block(UCPI_UBH(ucpi));
 	ufs_mark_sb_dirty(sb);
 	
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 	return;
 
 failed:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT (FAILED)\n");
 	return;
 }
@@ -135,6 +136,7 @@ void ufs_free_blocks(struct inode *inode, u64 fragment, unsigned count)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned overflow, cgno, bit, end_bit, i;
 	u64 blkno;
 	
@@ -151,7 +153,7 @@ void ufs_free_blocks(struct inode *inode, u64 fragment, unsigned count)
 		goto failed;
 	}
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	
 do_more:
 	overflow = 0;
@@ -211,12 +213,12 @@ do_more:
 	}
 
 	ufs_mark_sb_dirty(sb);
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 	return;
 
 failed_unlock:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 failed:
 	UFSD("EXIT (FAILED)\n");
 	return;
@@ -345,6 +347,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	struct super_block * sb;
 	struct ufs_sb_private_info * uspi;
 	struct ufs_super_block_first * usb1;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned cgno, oldcount, newcount;
 	u64 tmp, request, result;
 	
@@ -357,7 +360,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	usb1 = ubh_get_usb_first(uspi);
 	*err = -ENOSPC;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	tmp = ufs_data_ptr_to_cpu(sb, p);
 
 	if (count + ufs_fragnum(fragment) > uspi->s_fpb) {
@@ -378,19 +381,19 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 				  "fragment %llu, tmp %llu\n",
 				  (unsigned long long)fragment,
 				  (unsigned long long)tmp);
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return INVBLOCK;
 		}
 		if (fragment < UFS_I(inode)->i_lastfrag) {
 			UFSD("EXIT (ALREADY ALLOCATED)\n");
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return 0;
 		}
 	}
 	else {
 		if (tmp) {
 			UFSD("EXIT (ALREADY ALLOCATED)\n");
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return 0;
 		}
 	}
@@ -399,7 +402,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	 * There is not enough space for user on the device
 	 */
 	if (!capable(CAP_SYS_RESOURCE) && ufs_freespace(uspi, UFS_MINFREE) <= 0) {
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		UFSD("EXIT (FAILED)\n");
 		return 0;
 	}
@@ -424,7 +427,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 			ufs_clear_frags(inode, result + oldcount,
 					newcount - oldcount, locked_page != NULL);
 		}
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		UFSD("EXIT, result %llu\n", (unsigned long long)result);
 		return result;
 	}
@@ -439,7 +442,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 						fragment + count);
 		ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
 				locked_page != NULL);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		UFSD("EXIT, result %llu\n", (unsigned long long)result);
 		return result;
 	}
@@ -477,7 +480,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 		*err = 0;
 		UFS_I(inode)->i_lastfrag = max(UFS_I(inode)->i_lastfrag,
 						fragment + count);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		if (newcount < request)
 			ufs_free_fragments (inode, result + newcount, request - newcount);
 		ufs_free_fragments (inode, tmp, oldcount);
@@ -485,7 +488,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 		return result;
 	}
 
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT (FAILED)\n");
 	return 0;
 }		
diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c
index 7caa016..49bd9c4 100644
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -59,6 +59,7 @@ void ufs_free_inode (struct inode * inode)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	int is_directory;
 	unsigned ino, cg, bit;
 	
@@ -69,11 +70,11 @@ void ufs_free_inode (struct inode * inode)
 	
 	ino = inode->i_ino;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 
 	if (!((ino > 1) && (ino < (uspi->s_ncg * uspi->s_ipg )))) {
 		ufs_warning(sb, "ufs_free_inode", "reserved inode or nonexistent inode %u\n", ino);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		return;
 	}
 	
@@ -81,7 +82,7 @@ void ufs_free_inode (struct inode * inode)
 	bit = ufs_inotocgoff (ino);
 	ucpi = ufs_load_cylinder (sb, cg);
 	if (!ucpi) {
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		return;
 	}
 	ucg = ubh_get_ucg(UCPI_UBH(ucpi));
@@ -115,7 +116,7 @@ void ufs_free_inode (struct inode * inode)
 		ubh_sync_block(UCPI_UBH(ucpi));
 	
 	ufs_mark_sb_dirty(sb);
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 }
 
@@ -176,6 +177,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode)
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
 	struct inode * inode;
+	struct ufs_inode_info *ei = UFS_I(dir);
 	unsigned cg, bit, i, j, start;
 	struct ufs_inode_info *ufsi;
 	int err = -ENOSPC;
@@ -193,7 +195,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode)
 	sbi = UFS_SB(sb);
 	uspi = sbi->s_uspi;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 
 	/*
 	 * Try to place the inode in its parent directory
@@ -331,21 +333,21 @@ cg_found:
 			sync_dirty_buffer(bh);
 		brelse(bh);
 	}
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 
 	UFSD("allocating inode %lu\n", inode->i_ino);
 	UFSD("EXIT\n");
 	return inode;
 
 fail_remove_inode:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	clear_nlink(inode);
 	unlock_new_inode(inode);
 	iput(inode);
 	UFSD("EXIT (FAILED): err %d\n", err);
 	return ERR_PTR(err);
 failed:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	make_bad_inode(inode);
 	iput (inode);
 	UFSD("EXIT (FAILED): err %d\n", err);
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index b3bc3e7..1a695a4 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1435,6 +1435,7 @@ static void init_once(void *foo)
 {
 	struct ufs_inode_info *ei = (struct ufs_inode_info *) foo;
 
+	mutex_init(&ei->truncate_mutex);
 	inode_init_once(&ei->vfs_inode);
 }
 
diff --git a/fs/ufs/ufs.h b/fs/ufs/ufs.h
index 2a07396..b530c2f 100644
--- a/fs/ufs/ufs.h
+++ b/fs/ufs/ufs.h
@@ -46,6 +46,7 @@ struct ufs_inode_info {
 	__u16	i_osync;
 	__u64	i_lastfrag;
 	__u32   i_dir_start_lookup;
+	struct mutex truncate_mutex;
 	struct inode vfs_inode;
 };
 
