All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
       [not found] ` <20150527145735.e3d1913bc66426038d53be32@linux-foundation.org>
@ 2015-06-04  5:01   ` Al Viro
  2015-06-04 22:22     ` Al Viro
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Al Viro @ 2015-06-04  5:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Evgeniy Dushistov, Ian Campbell, Ian Jackson, xen-devel,
	Fabian Frederick, Jan Kara, Alexey Khoroshilov, Roger Pau Monne

On Wed, May 27, 2015 at 02:57:35PM -0700, Andrew Morton wrote:
> On Wed, 27 May 2015 21:15:30 +0200 Fabian Frederick <fabf@skynet.be> wrote:
> 
> > This reverts commit 9ef7db7f38d0
> > ("ufs: fix deadlocks introduced by sb mutex merge")
> > That patch tried to solve
> >     Commit 0244756edc4b98c
> >     ("ufs: sb mutex merge + mutex_destroy")
> > which is itself partially reverted due to multiple deadlocks
> 
> This is all very vague.  The changelogs are missing any description of
> the deadlocks: how they are triggered, why they occur.  And there's no
> description of how the patches fix these deadlocks.  And as we're
> reverting a bunch of things one wonders whether the problems which the
> now-reverted patches fixed are being reintroduced.
> 
> Has anyone (Ian?) confirmed that the fs works OK with these patches?

Folks, how about we figure out what's really being protected by that
mutex?  IIRC, the main irregularity about ufs is the need to deal with
growing the partial final block - unlike e.g ext2, ufs has differently-sized
blocks and fragments.  Basically, it's tail-packing - short files have
the last used direct pointer refer to a group of adjacent fragments that
doesn't have to be block-aligned or fill the entire block.  They can't
cross the disk block boundary and write might have to reallocate the partial
block.

So we need
	* per-page exclusion for reallocation time (normal page locks are
doing that)
	* per-fs exclusion for block and fragment allocations (->s_lock?)
	* per-fs exclusion for inode allocations (->s_lock?)
	* per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
	* per-directory exclusion for contents access (->i_mutex gives that)

Looks like we ought to add ->truncate_mutex and shove lock_ufs() calls
all way down into balloc.c (and ialloc.c for inode allocations)...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-04  5:01   ` [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge" Al Viro
  2015-06-04 22:22     ` Al Viro
@ 2015-06-04 22:22     ` Al Viro
  2015-06-05 16:27     ` Fabian Frederick
  2 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-04 22:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Fabian Frederick, Ian Campbell, Evgeniy Dushistov,
	Alexey Khoroshilov, Roger Pau Monne, Ian Jackson, xen-devel,
	Jan Kara, linux-fsdevel

On Thu, Jun 04, 2015 at 06:01:23AM +0100, Al Viro wrote:

> So we need
> 	* per-page exclusion for reallocation time (normal page locks are
> doing that)
> 	* per-fs exclusion for block and fragment allocations (->s_lock?)
> 	* per-fs exclusion for inode allocations (->s_lock?)
> 	* per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
> 	* per-directory exclusion for contents access (->i_mutex gives that)
> 
> Looks like we ought to add ->truncate_mutex and shove lock_ufs() calls
> all way down into balloc.c (and ialloc.c for inode allocations)...

After looking through that code, that appears to be feasible (and would
simplify the hell out of truncate side of things).  However, there's
a problem with the way we do ->write_begin() and ->writepage() there - it's
quite suboptimal for short files.  Suppose we start with an empty file on
e.g. a filesystem with 1Kb fragments and 4Kb blocks.  We grab page 0 and
call ->write_begin() on it, offset range 0 to 4095.  OK, that'll be
block_write_begin(), with ufs_getfrag_block as callback.  And it'll proceed
to call it 4 times, one for each kilobyte.  In ascending order.  If we are
lucky, it'll allocate the first fragment of an otherwise empty block and
then extend it 3 times until we get the full block.  If we are *not* that
lucky, and the first fragment is grabbed in already partially used block,
we'll end up doing reallocations (and copying, while we are at it, hopefully
without bogus uptodate flags set anywhere in process).  The same if somebody
else grabs a fragment in the middle of block we are growing into.

A part of that is the lack of prealloc in fs/ufs; that would certainly
improve things, but I really wonder if we are doing the handling of
partial blocks in the wrong place.  As it is, that happens fairly deep
in call chain - ufs_write_begin() -> block_write_begin() ->
__block_write_begin() -> ufs_getfrag_block() -> ufs_inode_getfrag() ->
ufs_new_fragments() -> ufs_change_blocknr()) and we don't notice
the partial blocks until we get to ufs_new_fragments(); in reality,
we can tell if there's a partial block from the very beginning, just by
looking at inode size and checking if the direct pointer in question
is not zero.

Do we really need it done that deep in chain?  Note that there's another
long-standing problem - we map the things fragment-by-fragment, which is
seriously redundant; within the same logical block we have either
	* hole - no fragments present, or
	* partial block at the end of short file - known number of adjacent
fragments present, or
	* full block - all fragments are present and adjacent
Trying to map fragments one by one is completely pointless.  Which makes
the use fs/buffer.c helpers dubious.  Besides, we pay for doing it that
deep by having to pass the page all way down into block allocator.

AFAICS, writepage cares about the partial blocks only when EOF is right
inside the page - pages past EOF shouldn't be faulted in and truncate_setsize()
should've killed everything off before lowering ->i_size.  And since truncate
and ->write_begin are serialized on ->i_mutex, we are down to
	* write_iter starting past the EOF should start with dummy extending
truncate (and truncate back to original size if nothing actually gets
written)
	* extending truncate() of a file with partial block should grab the
page containing EOF and expand it (with possible reallocation).
	* truncate() shrinking a file to something with partial block should
do nothing special, other than releasing only part of fragments for that
block.
	* write_begin and writepage should check if they are dealing with
a page covering a partial block and start with extending it - they know how
far to go and they are serialized by page lock.

Does anybody see holes in that?  This way we can handle the partial
blocks higher in the call chain, with a lot less PITA...  And with that
done, we can have ufs_block_getfrag() just check if the previous bh in
the page falls into the same block and is already mapped and map ours to
the next fragment if that's the case - allocation would either happen
for entire block or page, whichever is smaller, or be already done by caller
(UFS blocks can be bigger than a page).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-04  5:01   ` [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge" Al Viro
@ 2015-06-04 22:22     ` Al Viro
  2015-06-04 22:22     ` Al Viro
  2015-06-05 16:27     ` Fabian Frederick
  2 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-04 22:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Evgeniy Dushistov, Ian Campbell, Ian Jackson, xen-devel,
	Fabian Frederick, linux-fsdevel, Jan Kara, Alexey Khoroshilov,
	Roger Pau Monne

On Thu, Jun 04, 2015 at 06:01:23AM +0100, Al Viro wrote:

> So we need
> 	* per-page exclusion for reallocation time (normal page locks are
> doing that)
> 	* per-fs exclusion for block and fragment allocations (->s_lock?)
> 	* per-fs exclusion for inode allocations (->s_lock?)
> 	* per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
> 	* per-directory exclusion for contents access (->i_mutex gives that)
> 
> Looks like we ought to add ->truncate_mutex and shove lock_ufs() calls
> all way down into balloc.c (and ialloc.c for inode allocations)...

After looking through that code, that appears to be feasible (and would
simplify the hell out of truncate side of things).  However, there's
a problem with the way we do ->write_begin() and ->writepage() there - it's
quite suboptimal for short files.  Suppose we start with an empty file on
e.g. a filesystem with 1Kb fragments and 4Kb blocks.  We grab page 0 and
call ->write_begin() on it, offset range 0 to 4095.  OK, that'll be
block_write_begin(), with ufs_getfrag_block as callback.  And it'll proceed
to call it 4 times, one for each kilobyte.  In ascending order.  If we are
lucky, it'll allocate the first fragment of an otherwise empty block and
then extend it 3 times until we get the full block.  If we are *not* that
lucky, and the first fragment is grabbed in already partially used block,
we'll end up doing reallocations (and copying, while we are at it, hopefully
without bogus uptodate flags set anywhere in process).  The same if somebody
else grabs a fragment in the middle of block we are growing into.

A part of that is the lack of prealloc in fs/ufs; that would certainly
improve things, but I really wonder if we are doing the handling of
partial blocks in the wrong place.  As it is, that happens fairly deep
in call chain - ufs_write_begin() -> block_write_begin() ->
__block_write_begin() -> ufs_getfrag_block() -> ufs_inode_getfrag() ->
ufs_new_fragments() -> ufs_change_blocknr()) and we don't notice
the partial blocks until we get to ufs_new_fragments(); in reality,
we can tell if there's a partial block from the very beginning, just by
looking at inode size and checking if the direct pointer in question
is not zero.

Do we really need it done that deep in chain?  Note that there's another
long-standing problem - we map the things fragment-by-fragment, which is
seriously redundant; within the same logical block we have either
	* hole - no fragments present, or
	* partial block at the end of short file - known number of adjacent
fragments present, or
	* full block - all fragments are present and adjacent
Trying to map fragments one by one is completely pointless.  Which makes
the use fs/buffer.c helpers dubious.  Besides, we pay for doing it that
deep by having to pass the page all way down into block allocator.

AFAICS, writepage cares about the partial blocks only when EOF is right
inside the page - pages past EOF shouldn't be faulted in and truncate_setsize()
should've killed everything off before lowering ->i_size.  And since truncate
and ->write_begin are serialized on ->i_mutex, we are down to
	* write_iter starting past the EOF should start with dummy extending
truncate (and truncate back to original size if nothing actually gets
written)
	* extending truncate() of a file with partial block should grab the
page containing EOF and expand it (with possible reallocation).
	* truncate() shrinking a file to something with partial block should
do nothing special, other than releasing only part of fragments for that
block.
	* write_begin and writepage should check if they are dealing with
a page covering a partial block and start with extending it - they know how
far to go and they are serialized by page lock.

Does anybody see holes in that?  This way we can handle the partial
blocks higher in the call chain, with a lot less PITA...  And with that
done, we can have ufs_block_getfrag() just check if the previous bh in
the page falls into the same block and is already mapped and map ours to
the next fragment if that's the case - allocation would either happen
for entire block or page, whichever is smaller, or be already done by caller
(UFS blocks can be bigger than a page).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-04  5:01   ` [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge" Al Viro
  2015-06-04 22:22     ` Al Viro
  2015-06-04 22:22     ` Al Viro
@ 2015-06-05 16:27     ` Fabian Frederick
  2015-06-05 18:50       ` Al Viro
  2 siblings, 1 reply; 18+ messages in thread
From: Fabian Frederick @ 2015-06-05 16:27 UTC (permalink / raw)
  To: Andrew Morton, Al Viro
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel,
	Evgeniy Dushistov, Alexey Khoroshilov, Roger Pau Monne

[-- Attachment #1: Type: text/plain, Size: 2821 bytes --]



> On 04 June 2015 at 07:01 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
>
> On Wed, May 27, 2015 at 02:57:35PM -0700, Andrew Morton wrote:
> > On Wed, 27 May 2015 21:15:30 +0200 Fabian Frederick <fabf@skynet.be> wrote:
> >
> > > This reverts commit 9ef7db7f38d0
> > > ("ufs: fix deadlocks introduced by sb mutex merge")
> > > That patch tried to solve
> > >     Commit 0244756edc4b98c
> > >     ("ufs: sb mutex merge + mutex_destroy")
> > > which is itself partially reverted due to multiple deadlocks
> >
> > This is all very vague.  The changelogs are missing any description of
> > the deadlocks: how they are triggered, why they occur.  And there's no
> > description of how the patches fix these deadlocks.  And as we're
> > reverting a bunch of things one wonders whether the problems which the
> > now-reverted patches fixed are being reintroduced.
> >
> > Has anyone (Ian?) confirmed that the fs works OK with these patches?
>
> Folks, how about we figure out what's really being protected by that
> mutex?  IIRC, the main irregularity about ufs is the need to deal with
> growing the partial final block - unlike e.g ext2, ufs has differently-sized
> blocks and fragments.  Basically, it's tail-packing - short files have
> the last used direct pointer refer to a group of adjacent fragments that
> doesn't have to be block-aligned or fill the entire block.  They can't
> cross the disk block boundary and write might have to reallocate the partial
> block.
>
> So we need
>       * per-page exclusion for reallocation time (normal page locks are
> doing that)
>       * per-fs exclusion for block and fragment allocations (->s_lock?)
>       * per-fs exclusion for inode allocations (->s_lock?)
>       * per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
>       * per-directory exclusion for contents access (->i_mutex gives that)
>
> Looks like we ought to add ->truncate_mutex and shove lock_ufs() calls
> all way down into balloc.c (and ialloc.c for inode allocations)...

If we look at linux-next with the original mutex behavior restored,
mutex/spinlocks are the following:

struct ufs_sb_info{

        struct mutex mutex; (lock_ufs()/unlock_ufs())
        struct mutex s_lock; (mutex_lock()/mutex_unlock())
        spinlock_t work_lock; /* protects sync_work and work_queued */

}

You're asking to remove lock_ufs() in allocation and replace it by
truncate_mutex. I guess you're talking about doing that on current rc
(without s_lock restored).

I tried a quick patch on rc trying to convert lock_ufs()/unlock_ufs()
with per inode truncate_mutex (see attachment).
Is it going the right direction ? That would involve dropping the two linux-next
reverts in ufs.

Regards,
Fabian

[-- Attachment #2: ufsmutex2 --]
[-- Type: application/octet-stream, Size: 7541 bytes --]

diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c
index 2c10360..23f9daf 100644
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -40,6 +40,7 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned cgno, bit, end_bit, bbase, blkmap, i;
 	u64 blkno;
 	
@@ -52,7 +53,7 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 	if (ufs_fragnum(fragment) + count > uspi->s_fpg)
 		ufs_error (sb, "ufs_free_fragments", "internal error");
 	
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	
 	cgno = ufs_dtog(uspi, fragment);
 	bit = ufs_dtogd(uspi, fragment);
@@ -116,12 +117,12 @@ void ufs_free_fragments(struct inode *inode, u64 fragment, unsigned count)
 		ubh_sync_block(UCPI_UBH(ucpi));
 	ufs_mark_sb_dirty(sb);
 	
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 	return;
 
 failed:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT (FAILED)\n");
 	return;
 }
@@ -135,6 +136,7 @@ void ufs_free_blocks(struct inode *inode, u64 fragment, unsigned count)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned overflow, cgno, bit, end_bit, i;
 	u64 blkno;
 	
@@ -151,7 +153,7 @@ void ufs_free_blocks(struct inode *inode, u64 fragment, unsigned count)
 		goto failed;
 	}
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	
 do_more:
 	overflow = 0;
@@ -211,12 +213,12 @@ do_more:
 	}
 
 	ufs_mark_sb_dirty(sb);
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 	return;
 
 failed_unlock:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 failed:
 	UFSD("EXIT (FAILED)\n");
 	return;
@@ -345,6 +347,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	struct super_block * sb;
 	struct ufs_sb_private_info * uspi;
 	struct ufs_super_block_first * usb1;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	unsigned cgno, oldcount, newcount;
 	u64 tmp, request, result;
 	
@@ -357,7 +360,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	usb1 = ubh_get_usb_first(uspi);
 	*err = -ENOSPC;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 	tmp = ufs_data_ptr_to_cpu(sb, p);
 
 	if (count + ufs_fragnum(fragment) > uspi->s_fpb) {
@@ -378,19 +381,19 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 				  "fragment %llu, tmp %llu\n",
 				  (unsigned long long)fragment,
 				  (unsigned long long)tmp);
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return INVBLOCK;
 		}
 		if (fragment < UFS_I(inode)->i_lastfrag) {
 			UFSD("EXIT (ALREADY ALLOCATED)\n");
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return 0;
 		}
 	}
 	else {
 		if (tmp) {
 			UFSD("EXIT (ALREADY ALLOCATED)\n");
-			unlock_ufs(sb);
+			mutex_unlock(&ei->truncate_mutex);
 			return 0;
 		}
 	}
@@ -399,7 +402,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 	 * There is not enough space for user on the device
 	 */
 	if (!capable(CAP_SYS_RESOURCE) && ufs_freespace(uspi, UFS_MINFREE) <= 0) {
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		UFSD("EXIT (FAILED)\n");
 		return 0;
 	}
@@ -424,7 +427,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 			ufs_clear_frags(inode, result + oldcount,
 					newcount - oldcount, locked_page != NULL);
 		}
-		unlock_ufs(sb);
+		mutex_lock(&ei->truncate_mutex);
 		UFSD("EXIT, result %llu\n", (unsigned long long)result);
 		return result;
 	}
@@ -439,7 +442,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 						fragment + count);
 		ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
 				locked_page != NULL);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		UFSD("EXIT, result %llu\n", (unsigned long long)result);
 		return result;
 	}
@@ -477,7 +480,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 		*err = 0;
 		UFS_I(inode)->i_lastfrag = max(UFS_I(inode)->i_lastfrag,
 						fragment + count);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		if (newcount < request)
 			ufs_free_fragments (inode, result + newcount, request - newcount);
 		ufs_free_fragments (inode, tmp, oldcount);
@@ -485,7 +488,7 @@ u64 ufs_new_fragments(struct inode *inode, void *p, u64 fragment,
 		return result;
 	}
 
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT (FAILED)\n");
 	return 0;
 }		
diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c
index 7caa016..49bd9c4 100644
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -59,6 +59,7 @@ void ufs_free_inode (struct inode * inode)
 	struct ufs_sb_private_info * uspi;
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
+	struct ufs_inode_info *ei = UFS_I(inode);
 	int is_directory;
 	unsigned ino, cg, bit;
 	
@@ -69,11 +70,11 @@ void ufs_free_inode (struct inode * inode)
 	
 	ino = inode->i_ino;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 
 	if (!((ino > 1) && (ino < (uspi->s_ncg * uspi->s_ipg )))) {
 		ufs_warning(sb, "ufs_free_inode", "reserved inode or nonexistent inode %u\n", ino);
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		return;
 	}
 	
@@ -81,7 +82,7 @@ void ufs_free_inode (struct inode * inode)
 	bit = ufs_inotocgoff (ino);
 	ucpi = ufs_load_cylinder (sb, cg);
 	if (!ucpi) {
-		unlock_ufs(sb);
+		mutex_unlock(&ei->truncate_mutex);
 		return;
 	}
 	ucg = ubh_get_ucg(UCPI_UBH(ucpi));
@@ -115,7 +116,7 @@ void ufs_free_inode (struct inode * inode)
 		ubh_sync_block(UCPI_UBH(ucpi));
 	
 	ufs_mark_sb_dirty(sb);
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	UFSD("EXIT\n");
 }
 
@@ -176,6 +177,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode)
 	struct ufs_cg_private_info * ucpi;
 	struct ufs_cylinder_group * ucg;
 	struct inode * inode;
+	struct ufs_inode_info *ei = UFS_I(dir);
 	unsigned cg, bit, i, j, start;
 	struct ufs_inode_info *ufsi;
 	int err = -ENOSPC;
@@ -193,7 +195,7 @@ struct inode *ufs_new_inode(struct inode *dir, umode_t mode)
 	sbi = UFS_SB(sb);
 	uspi = sbi->s_uspi;
 
-	lock_ufs(sb);
+	mutex_lock(&ei->truncate_mutex);
 
 	/*
 	 * Try to place the inode in its parent directory
@@ -331,21 +333,21 @@ cg_found:
 			sync_dirty_buffer(bh);
 		brelse(bh);
 	}
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 
 	UFSD("allocating inode %lu\n", inode->i_ino);
 	UFSD("EXIT\n");
 	return inode;
 
 fail_remove_inode:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	clear_nlink(inode);
 	unlock_new_inode(inode);
 	iput(inode);
 	UFSD("EXIT (FAILED): err %d\n", err);
 	return ERR_PTR(err);
 failed:
-	unlock_ufs(sb);
+	mutex_unlock(&ei->truncate_mutex);
 	make_bad_inode(inode);
 	iput (inode);
 	UFSD("EXIT (FAILED): err %d\n", err);
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index b3bc3e7..1a695a4 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1435,6 +1435,7 @@ static void init_once(void *foo)
 {
 	struct ufs_inode_info *ei = (struct ufs_inode_info *) foo;
 
+	mutex_init(&ei->truncate_mutex);
 	inode_init_once(&ei->vfs_inode);
 }
 
diff --git a/fs/ufs/ufs.h b/fs/ufs/ufs.h
index 2a07396..b530c2f 100644
--- a/fs/ufs/ufs.h
+++ b/fs/ufs/ufs.h
@@ -46,6 +46,7 @@ struct ufs_inode_info {
 	__u16	i_osync;
 	__u64	i_lastfrag;
 	__u32   i_dir_start_lookup;
+	struct mutex truncate_mutex;
 	struct inode vfs_inode;
 };
 

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 16:27     ` Fabian Frederick
@ 2015-06-05 18:50       ` Al Viro
  2015-06-05 22:03         ` Al Viro
                           ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Al Viro @ 2015-06-05 18:50 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel,
	Evgeniy Dushistov, Andrew Morton, Alexey Khoroshilov,
	Roger Pau Monne

On Fri, Jun 05, 2015 at 06:27:01PM +0200, Fabian Frederick wrote:

> You're asking to remove lock_ufs() in allocation and replace it by
> truncate_mutex. I guess you're talking about doing that on current rc
> (without s_lock restored).
> 
> I tried a quick patch on rc trying to convert lock_ufs()/unlock_ufs()
> with per inode truncate_mutex (see attachment).
> Is it going the right direction ?

Nope.  First of all, allocation should be protected by fs-wide mutex, for
obvious reasons.  What ->truncate_mutex is protecting (outside of that
one) is modifications of block pointers in inode and in indirect blocks.

Check what ext2 is doing.  It tries to follow pointers without taking
a lock.  If it succeeds, fine, no allocation is needed, just map it.
If it fails, it grabs ->truncate_mutex, follows pointers again (checking
the chain it followed before grabbing the lock first, in hope it has
remained valid) and does allocations and block pointer modifications,
with ->truncate_mutex making sure that nobody else would touch those
under it.  And on truncate side it deals with the page into which EOF
will fall, then (still holding ->i_mutex) does setting ->i_size and
eviction of pages beyond EOF (by truncate_setsize()), then does freeing
the blocks past the new EOF.  Which is where it grabs ->truncate_mutex
(inside __ext2_truncate_blocks()).  And it doesn't need to do those
insane retries - on get_block side we need to redo the pointer chasing
after having taken ->truncate_mutex (when it decides that it might need
to do allocation, after all), but that's done once; truncate side just
grabs ->truncate_mutex as soon as it gets to __ext2_truncate_blocks(),
which means that nobody can change anything under it.

Actual allocation/freeing of blocks (as well as that of inodes) is done
under fs-wide mutex, nesting inside the rest.

Basically, we have
	i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
	truncate_mutex: block pointers changes.  Per-inode.
	s_lock:	block and inode bitmaps changes.  Per-filesystem.

For UFS it's slightly more complicated due to tail packing they are doing for
short files, but most of that complexity is due to having that stuff handled
way too deep in call chain.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 18:50       ` Al Viro
@ 2015-06-05 22:03         ` Al Viro
  2015-06-17  8:57           ` Jan Kara
  2015-06-17  8:57           ` Jan Kara
  2015-06-05 22:03         ` Al Viro
  2015-06-06  8:04         ` Fabian Frederick
  2 siblings, 2 replies; 18+ messages in thread
From: Al Viro @ 2015-06-05 22:03 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: Andrew Morton, Alexey Khoroshilov, Ian Campbell, Roger Pau Monne,
	Ian Jackson, Jan Kara, xen-devel, Evgeniy Dushistov,
	linux-fsdevel

On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote:
> Basically, we have
> 	i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
> 	truncate_mutex: block pointers changes.  Per-inode.
> 	s_lock:	block and inode bitmaps changes.  Per-filesystem.
> 
> For UFS it's slightly more complicated due to tail packing they are doing for
> short files, but most of that complexity is due to having that stuff handled
> way too deep in call chain.

Oh, lovely... commit 10e5dc
Author: Evgeniy Dushistov <dushistov@mail.ru>
Date:   Sat Jul 1 04:36:24 2006 -0700

    [PATCH] ufs: truncate should allocate block for last byte

had removed ->truncate() method and missed the fact that vmtrucate() had
callers outside of ->setattr(), such as handling of ->prepare_write() partial
failures and short copies on write(2) in general.

Then we had a long and convoluted series of conversions that ended up with
vmtruncate() lifted into ufs_write_failed() and replaced with
truncate_pagecache() in there.

Through all that, everybody (me included) had not noticed that we *still*
do not free blocks allocated by ufs_write_begin() failing halfway through.
While we are at it, ufs_write_end() ought to call ufs_write_failed() in
case when we'd been called after a short copy (and do the same block freeing).

Joy...  Folks, is anybody actively maintaining fs/ufs these days?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 18:50       ` Al Viro
  2015-06-05 22:03         ` Al Viro
@ 2015-06-05 22:03         ` Al Viro
  2015-06-06  8:04         ` Fabian Frederick
  2 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-05 22:03 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel, linux-fsdevel,
	Evgeniy Dushistov, Andrew Morton, Alexey Khoroshilov,
	Roger Pau Monne

On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote:
> Basically, we have
> 	i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
> 	truncate_mutex: block pointers changes.  Per-inode.
> 	s_lock:	block and inode bitmaps changes.  Per-filesystem.
> 
> For UFS it's slightly more complicated due to tail packing they are doing for
> short files, but most of that complexity is due to having that stuff handled
> way too deep in call chain.

Oh, lovely... commit 10e5dc
Author: Evgeniy Dushistov <dushistov@mail.ru>
Date:   Sat Jul 1 04:36:24 2006 -0700

    [PATCH] ufs: truncate should allocate block for last byte

had removed ->truncate() method and missed the fact that vmtrucate() had
callers outside of ->setattr(), such as handling of ->prepare_write() partial
failures and short copies on write(2) in general.

Then we had a long and convoluted series of conversions that ended up with
vmtruncate() lifted into ufs_write_failed() and replaced with
truncate_pagecache() in there.

Through all that, everybody (me included) had not noticed that we *still*
do not free blocks allocated by ufs_write_begin() failing halfway through.
While we are at it, ufs_write_end() ought to call ufs_write_failed() in
case when we'd been called after a short copy (and do the same block freeing).

Joy...  Folks, is anybody actively maintaining fs/ufs these days?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 18:50       ` Al Viro
  2015-06-05 22:03         ` Al Viro
  2015-06-05 22:03         ` Al Viro
@ 2015-06-06  8:04         ` Fabian Frederick
  2 siblings, 0 replies; 18+ messages in thread
From: Fabian Frederick @ 2015-06-06  8:04 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel,
	Evgeniy Dushistov, Andrew Morton, Alexey Khoroshilov,
	Roger Pau Monne



> On 05 June 2015 at 20:50 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
>
> On Fri, Jun 05, 2015 at 06:27:01PM +0200, Fabian Frederick wrote:
>
> > You're asking to remove lock_ufs() in allocation and replace it by
> > truncate_mutex. I guess you're talking about doing that on current rc
> > (without s_lock restored).
> >
> > I tried a quick patch on rc trying to convert lock_ufs()/unlock_ufs()
> > with per inode truncate_mutex (see attachment).
> > Is it going the right direction ?
>
> Nope.  First of all, allocation should be protected by fs-wide mutex, for
> obvious reasons.  What ->truncate_mutex is protecting (outside of that
> one) is modifications of block pointers in inode and in indirect blocks.
>
> Check what ext2 is doing.  It tries to follow pointers without taking
> a lock.  If it succeeds, fine, no allocation is needed, just map it.
> If it fails, it grabs ->truncate_mutex, follows pointers again (checking
> the chain it followed before grabbing the lock first, in hope it has
> remained valid) and does allocations and block pointer modifications,
> with ->truncate_mutex making sure that nobody else would touch those
> under it.  And on truncate side it deals with the page into which EOF
> will fall, then (still holding ->i_mutex) does setting ->i_size and
> eviction of pages beyond EOF (by truncate_setsize()), then does freeing
> the blocks past the new EOF.  Which is where it grabs ->truncate_mutex
> (inside __ext2_truncate_blocks()).  And it doesn't need to do those
> insane retries - on get_block side we need to redo the pointer chasing
> after having taken ->truncate_mutex (when it decides that it might need
> to do allocation, after all), but that's done once; truncate side just
> grabs ->truncate_mutex as soon as it gets to __ext2_truncate_blocks(),
> which means that nobody can change anything under it.
>
> Actual allocation/freeing of blocks (as well as that of inodes) is done
> under fs-wide mutex, nesting inside the rest.
>
> Basically, we have
>       i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
>       truncate_mutex: block pointers changes.  Per-inode.
>       s_lock: block and inode bitmaps changes.  Per-filesystem.
>
> For UFS it's slightly more complicated due to tail packing they are doing for
> short files, but most of that complexity is due to having that stuff handled
> way too deep in call chain.

In your two explanations, there's only a place for one sb mutex:
"
i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
truncate_mutex: block pointers changes.  Per-inode.
s_lock: block and inode bitmaps changes.  Per-filesystem.
"
"
per-page exclusion for reallocation time (normal page locks are doing that)
per-fs exclusion for block and fragment allocations (->s_lock?)
per-fs exclusion for inode allocations (->s_lock?)
per-inode exclusion for mapping changes (a-la ext2 truncate_mutex)
per-directory exclusion for contents access (->i_mutex gives that)       
"

Meanwhile current linux-next has 2 in fs/ufs/ufs.h: struct ufs_sb_info

struct mutex mutex
struct mutex s_lock

So commit 0244756edc4b98c129e "ufs: sb mutex merge + mutex_destroy")
was finally not a bad thing but maybe it removed the bad one ?

Regards,
Fabian

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 22:03         ` Al Viro
  2015-06-17  8:57           ` Jan Kara
@ 2015-06-17  8:57           ` Jan Kara
  2015-06-17 20:31             ` Al Viro
  2015-06-17 20:31             ` Al Viro
  1 sibling, 2 replies; 18+ messages in thread
From: Jan Kara @ 2015-06-17  8:57 UTC (permalink / raw)
  To: Al Viro
  Cc: Fabian Frederick, Andrew Morton, Alexey Khoroshilov,
	Ian Campbell, Roger Pau Monne, Ian Jackson, Jan Kara, xen-devel,
	Evgeniy Dushistov, linux-fsdevel

On Fri 05-06-15 23:03:48, Al Viro wrote:
> On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote:
> > Basically, we have
> > 	i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
> > 	truncate_mutex: block pointers changes.  Per-inode.
> > 	s_lock:	block and inode bitmaps changes.  Per-filesystem.
> > 
> > For UFS it's slightly more complicated due to tail packing they are doing for
> > short files, but most of that complexity is due to having that stuff handled
> > way too deep in call chain.
> 
> Oh, lovely... commit 10e5dc
> Author: Evgeniy Dushistov <dushistov@mail.ru>
> Date:   Sat Jul 1 04:36:24 2006 -0700
> 
>     [PATCH] ufs: truncate should allocate block for last byte
> 
> had removed ->truncate() method and missed the fact that vmtrucate() had
> callers outside of ->setattr(), such as handling of ->prepare_write() partial
> failures and short copies on write(2) in general.
> 
> Then we had a long and convoluted series of conversions that ended up with
> vmtruncate() lifted into ufs_write_failed() and replaced with
> truncate_pagecache() in there.
> 
> Through all that, everybody (me included) had not noticed that we *still*
> do not free blocks allocated by ufs_write_begin() failing halfway through.
> While we are at it, ufs_write_end() ought to call ufs_write_failed() in
> case when we'd been called after a short copy (and do the same block freeing).
> 
> Joy...  Folks, is anybody actively maintaining fs/ufs these days?

Looking into the changelog there wasn't anyone seriously looking into UFS
for at least 5-6 years... Fabian did some cleanups recently but they were
mostly cosmetic. So I don't think it's really maintained. OTOH clearly
there are some users since occasionally someone comes with a bug report.
Welcome to the world where we have ~50 on disk / network filesystems :-|

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-05 22:03         ` Al Viro
@ 2015-06-17  8:57           ` Jan Kara
  2015-06-17  8:57           ` Jan Kara
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Kara @ 2015-06-17  8:57 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel, Fabian Frederick,
	linux-fsdevel, Evgeniy Dushistov, Andrew Morton,
	Alexey Khoroshilov, Roger Pau Monne

On Fri 05-06-15 23:03:48, Al Viro wrote:
> On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote:
> > Basically, we have
> > 	i_mutex: file size changes, contents-affecting syscalls.  Per-inode.
> > 	truncate_mutex: block pointers changes.  Per-inode.
> > 	s_lock:	block and inode bitmaps changes.  Per-filesystem.
> > 
> > For UFS it's slightly more complicated due to tail packing they are doing for
> > short files, but most of that complexity is due to having that stuff handled
> > way too deep in call chain.
> 
> Oh, lovely... commit 10e5dc
> Author: Evgeniy Dushistov <dushistov@mail.ru>
> Date:   Sat Jul 1 04:36:24 2006 -0700
> 
>     [PATCH] ufs: truncate should allocate block for last byte
> 
> had removed ->truncate() method and missed the fact that vmtrucate() had
> callers outside of ->setattr(), such as handling of ->prepare_write() partial
> failures and short copies on write(2) in general.
> 
> Then we had a long and convoluted series of conversions that ended up with
> vmtruncate() lifted into ufs_write_failed() and replaced with
> truncate_pagecache() in there.
> 
> Through all that, everybody (me included) had not noticed that we *still*
> do not free blocks allocated by ufs_write_begin() failing halfway through.
> While we are at it, ufs_write_end() ought to call ufs_write_failed() in
> case when we'd been called after a short copy (and do the same block freeing).
> 
> Joy...  Folks, is anybody actively maintaining fs/ufs these days?

Looking into the changelog there wasn't anyone seriously looking into UFS
for at least 5-6 years... Fabian did some cleanups recently but they were
mostly cosmetic. So I don't think it's really maintained. OTOH clearly
there are some users since occasionally someone comes with a bug report.
Welcome to the world where we have ~50 on disk / network filesystems :-|

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-17  8:57           ` Jan Kara
  2015-06-17 20:31             ` Al Viro
@ 2015-06-17 20:31             ` Al Viro
  2015-06-19 23:07               ` Al Viro
  2015-06-19 23:07               ` Al Viro
  1 sibling, 2 replies; 18+ messages in thread
From: Al Viro @ 2015-06-17 20:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: Fabian Frederick, Andrew Morton, Alexey Khoroshilov,
	Ian Campbell, Roger Pau Monne, Ian Jackson, xen-devel,
	Evgeniy Dushistov, linux-fsdevel

On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> 
> Looking into the changelog there wasn't anyone seriously looking into UFS
> for at least 5-6 years... Fabian did some cleanups recently but they were
> mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> there are some users since occasionally someone comes with a bug report.
> Welcome to the world where we have ~50 on disk / network filesystems :-|

FWIW, I've a partial rewrite of that area (completely untested, etc.)
in vfs.git#ufs; it's still in progress, but hopefully it'll be done
by tomorrow.  Basically, it switches the sucker to locking scheme
similar to ext2 (modulo seqlock instead of rwlock for protection of
block pointers, having to be more careful due to 64bit assignments not
being atomic and being unable to fall back to ->truncate_mutex in
case of race on the read side) and starts cleaning the hell out of
the truncate (and eventually create side of get_block as well).

As far as I can see, the root of the problems with that sucker is the
misplaced handling of tail-packing logics.  That had ballooned into
a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
painful to massage into saner shape and the damn thing kept missing
cleanups that were done to similar filesystems due to that...

One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
handling in there provoked exactly that, but it might have been better to
go the other way round; see e.g. fs/minix/itree*.c for opposite approach.

All this stuff is currently completely untested; I'll be using generic
parts of xfstests for beating it up, but any assistance would be welcome.
Note: all testing I'll be realistically able to do will be for FreeBSD
UFS variants - I don't have Solaris left on any boxen, to start with...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-17  8:57           ` Jan Kara
@ 2015-06-17 20:31             ` Al Viro
  2015-06-17 20:31             ` Al Viro
  1 sibling, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-17 20:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: Evgeniy Dushistov, Ian Campbell, Ian Jackson, xen-devel,
	Fabian Frederick, linux-fsdevel, Andrew Morton,
	Alexey Khoroshilov, Roger Pau Monne

On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> 
> Looking into the changelog there wasn't anyone seriously looking into UFS
> for at least 5-6 years... Fabian did some cleanups recently but they were
> mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> there are some users since occasionally someone comes with a bug report.
> Welcome to the world where we have ~50 on disk / network filesystems :-|

FWIW, I've a partial rewrite of that area (completely untested, etc.)
in vfs.git#ufs; it's still in progress, but hopefully it'll be done
by tomorrow.  Basically, it switches the sucker to locking scheme
similar to ext2 (modulo seqlock instead of rwlock for protection of
block pointers, having to be more careful due to 64bit assignments not
being atomic and being unable to fall back to ->truncate_mutex in
case of race on the read side) and starts cleaning the hell out of
the truncate (and eventually create side of get_block as well).

As far as I can see, the root of the problems with that sucker is the
misplaced handling of tail-packing logics.  That had ballooned into
a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
painful to massage into saner shape and the damn thing kept missing
cleanups that were done to similar filesystems due to that...

One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
handling in there provoked exactly that, but it might have been better to
go the other way round; see e.g. fs/minix/itree*.c for opposite approach.

All this stuff is currently completely untested; I'll be using generic
parts of xfstests for beating it up, but any assistance would be welcome.
Note: all testing I'll be realistically able to do will be for FreeBSD
UFS variants - I don't have Solaris left on any boxen, to start with...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-17 20:31             ` Al Viro
  2015-06-19 23:07               ` Al Viro
@ 2015-06-19 23:07               ` Al Viro
  2015-06-23 16:46                 ` Jan Kara
  2015-06-23 16:46                 ` Jan Kara
  1 sibling, 2 replies; 18+ messages in thread
From: Al Viro @ 2015-06-19 23:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Fabian Frederick, Andrew Morton, Alexey Khoroshilov,
	Ian Campbell, Roger Pau Monne, Ian Jackson, xen-devel,
	Evgeniy Dushistov, linux-fsdevel

On Wed, Jun 17, 2015 at 09:31:16PM +0100, Al Viro wrote:
> On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> > 
> > Looking into the changelog there wasn't anyone seriously looking into UFS
> > for at least 5-6 years... Fabian did some cleanups recently but they were
> > mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> > there are some users since occasionally someone comes with a bug report.
> > Welcome to the world where we have ~50 on disk / network filesystems :-|
> 
> FWIW, I've a partial rewrite of that area (completely untested, etc.)
> in vfs.git#ufs; it's still in progress, but hopefully it'll be done
> by tomorrow.  Basically, it switches the sucker to locking scheme
> similar to ext2 (modulo seqlock instead of rwlock for protection of
> block pointers, having to be more careful due to 64bit assignments not
> being atomic and being unable to fall back to ->truncate_mutex in
> case of race on the read side) and starts cleaning the hell out of
> the truncate (and eventually create side of get_block as well).
> 
> As far as I can see, the root of the problems with that sucker is the
> misplaced handling of tail-packing logics.  That had ballooned into
> a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
> painful to massage into saner shape and the damn thing kept missing
> cleanups that were done to similar filesystems due to that...
> 
> One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
> UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
> handling in there provoked exactly that, but it might have been better to
> go the other way round; see e.g. fs/minix/itree*.c for opposite approach.
> 
> All this stuff is currently completely untested; I'll be using generic
> parts of xfstests for beating it up, but any assistance would be welcome.
> Note: all testing I'll be realistically able to do will be for FreeBSD
> UFS variants - I don't have Solaris left on any boxen, to start with...

FWIW, I've got to one bloody unpleasant part of UFS we'd never managed to
get right and it seems that I have a potential solution.

Problem: UFS has larger-than-page allocation units.  And when we allocate
a block in a hole, we *must* zero the whole thing out.  The trouble being,
we notice that inside writepage(), where we are already holding a lock on
our page and where we can't lock other pages without risking a deadlock.

What's more, as soon as we'd inserted a reference to that block into inode,
we are open to readpage on another page backed by the same block coming
and seeing a normal mapped area.  So UFS block allocator ends up zeroing
them out via _buffer_ cache, waiting for writes to finish before returning
the block to caller.  It doesn't wait for metadata blocks (they are accessed
via buffer cache), but for data we end up with zeroes being written to disk,
no matter what.

That works, but it obviously hurts the performance.  Badly.  Writes into
a hole are relatively rare, but the same thing hurts normal sequential
writes to the end of file, which is not rare at all.

How about the following approach:
	* Let allocation of data block at the EOF return the sucker without
writing zeroes out.
	* Let ->write_iter() check if it's about to create a hole at the
(current) EOF.  In such case start with writing zeroes (in normal fashion,
via ->begin_write()/memset()/->end_write() through the page cache) from the
EOF to block boundary (or beginning of the area we were going to write to,
whichever's closer).  Then proceed in normal fashion.  Incidentally, it will
deal with tail unpacking for free.
	* Do the same upon extending truncate.

Does anybody see holes in that?  readpage() crossing the EOF isn't a problem -
it will not even attempt to map past the current ->i_size and will zero the
page tail out just fine.  We get junk past the EOF on disk, but we zero it
out (or overwrite with our data) before increasing ->i_size.

Dealing with sparse files is also possible (taking allocations from
->writepage() to ->page_mkwrite(), where we are not holding a page lock
and thus can grab all affected pages), but it would be bigger, trickier
and benefit only a relatively rare case.

Comments?

PS: UFS tail unpacking is broken - in some cases it can be tricked into
trying to extend an indirect block, of all things, and that's not the
only fishy case in there.  I think I've fixed that crap in the local tree...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-17 20:31             ` Al Viro
@ 2015-06-19 23:07               ` Al Viro
  2015-06-19 23:07               ` Al Viro
  1 sibling, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-19 23:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Evgeniy Dushistov, Ian Campbell, Ian Jackson, xen-devel,
	Fabian Frederick, linux-fsdevel, Andrew Morton,
	Alexey Khoroshilov, Roger Pau Monne

On Wed, Jun 17, 2015 at 09:31:16PM +0100, Al Viro wrote:
> On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> > 
> > Looking into the changelog there wasn't anyone seriously looking into UFS
> > for at least 5-6 years... Fabian did some cleanups recently but they were
> > mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> > there are some users since occasionally someone comes with a bug report.
> > Welcome to the world where we have ~50 on disk / network filesystems :-|
> 
> FWIW, I've a partial rewrite of that area (completely untested, etc.)
> in vfs.git#ufs; it's still in progress, but hopefully it'll be done
> by tomorrow.  Basically, it switches the sucker to locking scheme
> similar to ext2 (modulo seqlock instead of rwlock for protection of
> block pointers, having to be more careful due to 64bit assignments not
> being atomic and being unable to fall back to ->truncate_mutex in
> case of race on the read side) and starts cleaning the hell out of
> the truncate (and eventually create side of get_block as well).
> 
> As far as I can see, the root of the problems with that sucker is the
> misplaced handling of tail-packing logics.  That had ballooned into
> a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
> painful to massage into saner shape and the damn thing kept missing
> cleanups that were done to similar filesystems due to that...
> 
> One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
> UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
> handling in there provoked exactly that, but it might have been better to
> go the other way round; see e.g. fs/minix/itree*.c for opposite approach.
> 
> All this stuff is currently completely untested; I'll be using generic
> parts of xfstests for beating it up, but any assistance would be welcome.
> Note: all testing I'll be realistically able to do will be for FreeBSD
> UFS variants - I don't have Solaris left on any boxen, to start with...

FWIW, I've got to one bloody unpleasant part of UFS we'd never managed to
get right and it seems that I have a potential solution.

Problem: UFS has larger-than-page allocation units.  And when we allocate
a block in a hole, we *must* zero the whole thing out.  The trouble being,
we notice that inside writepage(), where we are already holding a lock on
our page and where we can't lock other pages without risking a deadlock.

What's more, as soon as we'd inserted a reference to that block into inode,
we are open to readpage on another page backed by the same block coming
and seeing a normal mapped area.  So UFS block allocator ends up zeroing
them out via _buffer_ cache, waiting for writes to finish before returning
the block to caller.  It doesn't wait for metadata blocks (they are accessed
via buffer cache), but for data we end up with zeroes being written to disk,
no matter what.

That works, but it obviously hurts the performance.  Badly.  Writes into
a hole are relatively rare, but the same thing hurts normal sequential
writes to the end of file, which is not rare at all.

How about the following approach:
	* Let allocation of data block at the EOF return the sucker without
writing zeroes out.
	* Let ->write_iter() check if it's about to create a hole at the
(current) EOF.  In such case start with writing zeroes (in normal fashion,
via ->begin_write()/memset()/->end_write() through the page cache) from the
EOF to block boundary (or beginning of the area we were going to write to,
whichever's closer).  Then proceed in normal fashion.  Incidentally, it will
deal with tail unpacking for free.
	* Do the same upon extending truncate.

Does anybody see holes in that?  readpage() crossing the EOF isn't a problem -
it will not even attempt to map past the current ->i_size and will zero the
page tail out just fine.  We get junk past the EOF on disk, but we zero it
out (or overwrite with our data) before increasing ->i_size.

Dealing with sparse files is also possible (taking allocations from
->writepage() to ->page_mkwrite(), where we are not holding a page lock
and thus can grab all affected pages), but it would be bigger, trickier
and benefit only a relatively rare case.

Comments?

PS: UFS tail unpacking is broken - in some cases it can be tricked into
trying to extend an indirect block, of all things, and that's not the
only fishy case in there.  I think I've fixed that crap in the local tree...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-19 23:07               ` Al Viro
  2015-06-23 16:46                 ` Jan Kara
@ 2015-06-23 16:46                 ` Jan Kara
  2015-06-23 21:56                   ` Al Viro
  2015-06-23 21:56                   ` Al Viro
  1 sibling, 2 replies; 18+ messages in thread
From: Jan Kara @ 2015-06-23 16:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Fabian Frederick, Andrew Morton, Alexey Khoroshilov,
	Ian Campbell, Roger Pau Monne, Ian Jackson, xen-devel,
	Evgeniy Dushistov, linux-fsdevel

On Sat 20-06-15 00:07:39, Al Viro wrote:
> On Wed, Jun 17, 2015 at 09:31:16PM +0100, Al Viro wrote:
> > On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > > > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> > > 
> > > Looking into the changelog there wasn't anyone seriously looking into UFS
> > > for at least 5-6 years... Fabian did some cleanups recently but they were
> > > mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> > > there are some users since occasionally someone comes with a bug report.
> > > Welcome to the world where we have ~50 on disk / network filesystems :-|
> > 
> > FWIW, I've a partial rewrite of that area (completely untested, etc.)
> > in vfs.git#ufs; it's still in progress, but hopefully it'll be done
> > by tomorrow.  Basically, it switches the sucker to locking scheme
> > similar to ext2 (modulo seqlock instead of rwlock for protection of
> > block pointers, having to be more careful due to 64bit assignments not
> > being atomic and being unable to fall back to ->truncate_mutex in
> > case of race on the read side) and starts cleaning the hell out of
> > the truncate (and eventually create side of get_block as well).
> > 
> > As far as I can see, the root of the problems with that sucker is the
> > misplaced handling of tail-packing logics.  That had ballooned into
> > a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
> > painful to massage into saner shape and the damn thing kept missing
> > cleanups that were done to similar filesystems due to that...
> > 
> > One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
> > UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
> > handling in there provoked exactly that, but it might have been better to
> > go the other way round; see e.g. fs/minix/itree*.c for opposite approach.
> > 
> > All this stuff is currently completely untested; I'll be using generic
> > parts of xfstests for beating it up, but any assistance would be welcome.
> > Note: all testing I'll be realistically able to do will be for FreeBSD
> > UFS variants - I don't have Solaris left on any boxen, to start with...
> 
> FWIW, I've got to one bloody unpleasant part of UFS we'd never managed to
> get right and it seems that I have a potential solution.
> 
> Problem: UFS has larger-than-page allocation units.  And when we allocate
> a block in a hole, we *must* zero the whole thing out.  The trouble being,
> we notice that inside writepage(), where we are already holding a lock on
> our page and where we can't lock other pages without risking a deadlock.
> 
> What's more, as soon as we'd inserted a reference to that block into inode,
> we are open to readpage on another page backed by the same block coming
> and seeing a normal mapped area.  So UFS block allocator ends up zeroing
> them out via _buffer_ cache, waiting for writes to finish before returning
> the block to caller.  It doesn't wait for metadata blocks (they are accessed
> via buffer cache), but for data we end up with zeroes being written to disk,
> no matter what.
> 
> That works, but it obviously hurts the performance.  Badly.  Writes into
> a hole are relatively rare, but the same thing hurts normal sequential
> writes to the end of file, which is not rare at all.
> 
> How about the following approach:
> 	* Let allocation of data block at the EOF return the sucker without
> writing zeroes out.
> 	* Let ->write_iter() check if it's about to create a hole at the
> (current) EOF.  In such case start with writing zeroes (in normal fashion,
> via ->begin_write()/memset()/->end_write() through the page cache) from the
> EOF to block boundary (or beginning of the area we were going to write to,
> whichever's closer).  Then proceed in normal fashion.  Incidentally, it will
> deal with tail unpacking for free.
> 	* Do the same upon extending truncate.
> 
> Does anybody see holes in that?  readpage() crossing the EOF isn't a problem -
> it will not even attempt to map past the current ->i_size and will zero the
> page tail out just fine.  We get junk past the EOF on disk, but we zero it
> out (or overwrite with our data) before increasing ->i_size.
> 
> Dealing with sparse files is also possible (taking allocations from
> ->writepage() to ->page_mkwrite(), where we are not holding a page lock
> and thus can grab all affected pages), but it would be bigger, trickier
> and benefit only a relatively rare case.
> 
> Comments?

Looks good to me. BTW also ext4 (with BIGALLOC feature) and OCFS2 can have
block allocation unit (called cluster) larger than page size. However the
block size of both filesystems is still <= page size. So at least ext4
handles fun with partially initialized clusters by just marking parts
of the cluster as uninitialized in the extent tree. But the code is still
pretty messy to be honest.

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-19 23:07               ` Al Viro
@ 2015-06-23 16:46                 ` Jan Kara
  2015-06-23 16:46                 ` Jan Kara
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Kara @ 2015-06-23 16:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Ian Campbell, Ian Jackson, xen-devel, Fabian Frederick,
	linux-fsdevel, Evgeniy Dushistov, Andrew Morton,
	Alexey Khoroshilov, Roger Pau Monne

On Sat 20-06-15 00:07:39, Al Viro wrote:
> On Wed, Jun 17, 2015 at 09:31:16PM +0100, Al Viro wrote:
> > On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> > > > Joy...  Folks, is anybody actively maintaining fs/ufs these days?
> > > 
> > > Looking into the changelog there wasn't anyone seriously looking into UFS
> > > for at least 5-6 years... Fabian did some cleanups recently but they were
> > > mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> > > there are some users since occasionally someone comes with a bug report.
> > > Welcome to the world where we have ~50 on disk / network filesystems :-|
> > 
> > FWIW, I've a partial rewrite of that area (completely untested, etc.)
> > in vfs.git#ufs; it's still in progress, but hopefully it'll be done
> > by tomorrow.  Basically, it switches the sucker to locking scheme
> > similar to ext2 (modulo seqlock instead of rwlock for protection of
> > block pointers, having to be more careful due to 64bit assignments not
> > being atomic and being unable to fall back to ->truncate_mutex in
> > case of race on the read side) and starts cleaning the hell out of
> > the truncate (and eventually create side of get_block as well).
> > 
> > As far as I can see, the root of the problems with that sucker is the
> > misplaced handling of tail-packing logics.  That had ballooned into
> > a lot of convoluted codepaths ;-/  I'm not blaming Evgeniy - it *is*
> > painful to massage into saner shape and the damn thing kept missing
> > cleanups that were done to similar filesystems due to that...
> > 
> > One thing I'm somewhat unhappy about in what Evgeniy had done is dealing with
> > UFS vs. UFS2 differences at such a low level.  Sure, the way we did endianness
> > handling in there provoked exactly that, but it might have been better to
> > go the other way round; see e.g. fs/minix/itree*.c for opposite approach.
> > 
> > All this stuff is currently completely untested; I'll be using generic
> > parts of xfstests for beating it up, but any assistance would be welcome.
> > Note: all testing I'll be realistically able to do will be for FreeBSD
> > UFS variants - I don't have Solaris left on any boxen, to start with...
> 
> FWIW, I've got to one bloody unpleasant part of UFS we'd never managed to
> get right and it seems that I have a potential solution.
> 
> Problem: UFS has larger-than-page allocation units.  And when we allocate
> a block in a hole, we *must* zero the whole thing out.  The trouble being,
> we notice that inside writepage(), where we are already holding a lock on
> our page and where we can't lock other pages without risking a deadlock.
> 
> What's more, as soon as we'd inserted a reference to that block into inode,
> we are open to readpage on another page backed by the same block coming
> and seeing a normal mapped area.  So UFS block allocator ends up zeroing
> them out via _buffer_ cache, waiting for writes to finish before returning
> the block to caller.  It doesn't wait for metadata blocks (they are accessed
> via buffer cache), but for data we end up with zeroes being written to disk,
> no matter what.
> 
> That works, but it obviously hurts the performance.  Badly.  Writes into
> a hole are relatively rare, but the same thing hurts normal sequential
> writes to the end of file, which is not rare at all.
> 
> How about the following approach:
> 	* Let allocation of data block at the EOF return the sucker without
> writing zeroes out.
> 	* Let ->write_iter() check if it's about to create a hole at the
> (current) EOF.  In such case start with writing zeroes (in normal fashion,
> via ->begin_write()/memset()/->end_write() through the page cache) from the
> EOF to block boundary (or beginning of the area we were going to write to,
> whichever's closer).  Then proceed in normal fashion.  Incidentally, it will
> deal with tail unpacking for free.
> 	* Do the same upon extending truncate.
> 
> Does anybody see holes in that?  readpage() crossing the EOF isn't a problem -
> it will not even attempt to map past the current ->i_size and will zero the
> page tail out just fine.  We get junk past the EOF on disk, but we zero it
> out (or overwrite with our data) before increasing ->i_size.
> 
> Dealing with sparse files is also possible (taking allocations from
> ->writepage() to ->page_mkwrite(), where we are not holding a page lock
> and thus can grab all affected pages), but it would be bigger, trickier
> and benefit only a relatively rare case.
> 
> Comments?

Looks good to me. BTW also ext4 (with BIGALLOC feature) and OCFS2 can have
block allocation unit (called cluster) larger than page size. However the
block size of both filesystems is still <= page size. So at least ext4
handles fun with partially initialized clusters by just marking parts
of the cluster as uninitialized in the extent tree. But the code is still
pretty messy to be honest.

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-23 16:46                 ` Jan Kara
  2015-06-23 21:56                   ` Al Viro
@ 2015-06-23 21:56                   ` Al Viro
  1 sibling, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-23 21:56 UTC (permalink / raw)
  To: Jan Kara
  Cc: Fabian Frederick, Andrew Morton, Alexey Khoroshilov,
	Ian Campbell, Roger Pau Monne, Ian Jackson, xen-devel,
	Evgeniy Dushistov, linux-fsdevel

On Tue, Jun 23, 2015 at 06:46:08PM +0200, Jan Kara wrote:

> Looks good to me. BTW also ext4 (with BIGALLOC feature) and OCFS2 can have
> block allocation unit (called cluster) larger than page size. However the
> block size of both filesystems is still <= page size. So at least ext4
> handles fun with partially initialized clusters by just marking parts
> of the cluster as uninitialized in the extent tree. But the code is still
> pretty messy to be honest.

Well, with UFS there's no place on disk to store such "this block is
uninitialized" marks - it uses a bog-standard Unix inode structure.

There are two units - fragments and blocks.  Block is an aligned group
of adjacent fragments; normal ratio is 8:1.  Block is at least 4Kb
(and always a power of two), fragment is at least a one sector and
block:fragment ratio is at most 8:1.  Inode structure is normal for a Unix
filesystem (12 direct + indirect + double indirect + triple indirect).
Each reference covers a block worth of file offsets and almost all
of them point to full blocks.  Indirects are full blocks as well.
Reference to a block is represented as the number of the first fragment
in it (i.e. with normal parameters bits 0..2 are clear).  Block bitmap is
actually a fragment bitmap (i.e. bit per fragment).  The only situation
when a reference is *not* to a full block is the last reference in a
file shorter than 12*block size (i.e. not requiring indirects at all).
In that case the last direct reference points to less than a full block
(unless the size in fragments is a multiple of block:fragment ratio,
that is).  One unusual thing is that holes can't extend to EOF - the last
byte *must* be allocated.  (BTW, the only difference between UFS2 and
UFS1 in that area is that fragment numbers are 64bit now.  There had been
talk about turning block:fragment ratio into a per-inode value, but so far
nobody has implemented that - ->di_blksize is there, but it's never used).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
  2015-06-23 16:46                 ` Jan Kara
@ 2015-06-23 21:56                   ` Al Viro
  2015-06-23 21:56                   ` Al Viro
  1 sibling, 0 replies; 18+ messages in thread
From: Al Viro @ 2015-06-23 21:56 UTC (permalink / raw)
  To: Jan Kara
  Cc: Evgeniy Dushistov, Ian Campbell, Ian Jackson, xen-devel,
	Fabian Frederick, linux-fsdevel, Andrew Morton,
	Alexey Khoroshilov, Roger Pau Monne

On Tue, Jun 23, 2015 at 06:46:08PM +0200, Jan Kara wrote:

> Looks good to me. BTW also ext4 (with BIGALLOC feature) and OCFS2 can have
> block allocation unit (called cluster) larger than page size. However the
> block size of both filesystems is still <= page size. So at least ext4
> handles fun with partially initialized clusters by just marking parts
> of the cluster as uninitialized in the extent tree. But the code is still
> pretty messy to be honest.

Well, with UFS there's no place on disk to store such "this block is
uninitialized" marks - it uses a bog-standard Unix inode structure.

There are two units - fragments and blocks.  Block is an aligned group
of adjacent fragments; normal ratio is 8:1.  Block is at least 4Kb
(and always a power of two), fragment is at least a one sector and
block:fragment ratio is at most 8:1.  Inode structure is normal for a Unix
filesystem (12 direct + indirect + double indirect + triple indirect).
Each reference covers a block worth of file offsets and almost all
of them point to full blocks.  Indirects are full blocks as well.
Reference to a block is represented as the number of the first fragment
in it (i.e. with normal parameters bits 0..2 are clear).  Block bitmap is
actually a fragment bitmap (i.e. bit per fragment).  The only situation
when a reference is *not* to a full block is the last reference in a
file shorter than 12*block size (i.e. not requiring indirects at all).
In that case the last direct reference points to less than a full block
(unless the size in fragments is a multiple of block:fragment ratio,
that is).  One unusual thing is that holes can't extend to EOF - the last
byte *must* be allocated.  (BTW, the only difference between UFS2 and
UFS1 in that area is that fragment numbers are 64bit now.  There had been
talk about turning block:fragment ratio into a per-inode value, but so far
nobody has implemented that - ->di_blksize is there, but it's never used).

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-06-23 21:56 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1432754131-27425-1-git-send-email-fabf@skynet.be>
     [not found] ` <20150527145735.e3d1913bc66426038d53be32@linux-foundation.org>
2015-06-04  5:01   ` [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge" Al Viro
2015-06-04 22:22     ` Al Viro
2015-06-04 22:22     ` Al Viro
2015-06-05 16:27     ` Fabian Frederick
2015-06-05 18:50       ` Al Viro
2015-06-05 22:03         ` Al Viro
2015-06-17  8:57           ` Jan Kara
2015-06-17  8:57           ` Jan Kara
2015-06-17 20:31             ` Al Viro
2015-06-17 20:31             ` Al Viro
2015-06-19 23:07               ` Al Viro
2015-06-19 23:07               ` Al Viro
2015-06-23 16:46                 ` Jan Kara
2015-06-23 16:46                 ` Jan Kara
2015-06-23 21:56                   ` Al Viro
2015-06-23 21:56                   ` Al Viro
2015-06-05 22:03         ` Al Viro
2015-06-06  8:04         ` Fabian Frederick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.