git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used
@ 2009-05-11 17:52 Shawn O. Pearce
  2009-05-11 18:59 ` Robin Rosenberg
  0 siblings, 1 reply; 4+ messages in thread
From: Shawn O. Pearce @ 2009-05-11 17:52 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Mark Struberg, git

When a DirCacheIterator was wrapped in an EmptyTreeIterator (such
as during a parallel TreeWalk where the other trees contain a
path that does not appear in the DirCachIterator) we corrupted
the DirCacheEntry path buffers by overwriting part of file name
components with '/' to match the other tree iterators' path length.

This happened because DirCacheIterator violated the iterator
assumption that the path buffer is always mutable.  Instead of
creating a mutable path, DirCacheIterator reused the path buffer
from the DirCacheEntry or the DirCacheTree.  These reused byte
arrays aren't mutable.

By delegating the EmptyTreeIterator creation to each iterator type
we can permit DirCacheIterator to control how it builds the empty
tree for the caller, ensuring that it copies the path buffer before
writing the '/' suffix onto it.

When EmptyTreeIterator has to grow the path buffer to create a new
iterator around itself, we can't just blindly replace every parent
iterator buffer with the larger path buffer.  DirCacheIterators will
be using a different path buffer, and they want to retain their
old path buffer, not the new larger buffer.

Noticed-by: Mark Struberg <struberg@yahoo.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---

 So uhm, test cases...  I can't come up with a simple test case
 which causes the data corruption, but Mark found a very complex
 one in egit.git.

 Now that I understand what the heck was wrong, its obvious we
 need to do this, as DirCacheIterator was breaking the contract.

 I'd love to have a test case, but I just spent an hour trying to
 come up with one and failed.  I'd rather have the data corruption
 fix without tests than not at all, since its obviously wrong as-is.
 But if you want a test, propose one.  I'm just not seeing the
 magic necessary to get the corruption to trigger.

 .../spearce/jgit/dircache/DirCacheIterator.java    |    9 ++++++++
 .../jgit/treewalk/AbstractTreeIterator.java        |   19 +++++++++++++---
 .../spearce/jgit/treewalk/EmptyTreeIterator.java   |   22 ++++++++++++++++++++
 .../src/org/spearce/jgit/treewalk/TreeWalk.java    |    2 +-
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCacheIterator.java b/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCacheIterator.java
index 356d735..6fb9510 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCacheIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/dircache/DirCacheIterator.java
@@ -45,6 +45,7 @@
 import org.spearce.jgit.lib.FileMode;
 import org.spearce.jgit.lib.Repository;
 import org.spearce.jgit.treewalk.AbstractTreeIterator;
+import org.spearce.jgit.treewalk.EmptyTreeIterator;
 
 /**
  * Iterate a {@link DirCache} as part of a <code>TreeWalk</code>.
@@ -126,6 +127,14 @@ public AbstractTreeIterator createSubtreeIterator(final Repository repo)
 	}
 
 	@Override
+	public EmptyTreeIterator createEmptyTreeIterator() {
+		final byte[] n = new byte[Math.max(pathLen + 1, DEFAULT_PATH_SIZE)];
+		System.arraycopy(path, 0, n, 0, pathLen);
+		n[pathLen] = '/';
+		return new EmptyTreeIterator(this, n, pathLen + 1);
+	}
+
+	@Override
 	public byte[] idBuffer() {
 		if (currentSubtree != null)
 			return subtreeId;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
index 2ff3b99..057250e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
@@ -73,7 +73,8 @@
  * @see CanonicalTreeParser
  */
 public abstract class AbstractTreeIterator {
-	static final int DEFAULT_PATH_SIZE = 128;
+	/** Default size for the {@link #path} buffer. */
+	protected static final int DEFAULT_PATH_SIZE = 128;
 
 	/** A dummy object id buffer that matches the zero ObjectId. */
 	protected static final byte[] zeroid = new byte[Constants.OBJECT_ID_LENGTH];
@@ -251,9 +252,10 @@ protected AbstractTreeIterator(final AbstractTreeIterator p,
 	 *            be moved into the larger buffer.
 	 */
 	protected void growPath(final int len) {
-		final byte[] n = new byte[path.length << 1];
-		System.arraycopy(path, 0, n, 0, len);
-		for (AbstractTreeIterator p = this; p != null; p = p.parent)
+		final byte[] o = path;
+		final byte[] n = new byte[o.length << 1];
+		System.arraycopy(o, 0, n, 0, len);
+		for (AbstractTreeIterator p = this; p != null && p.path == o; p = p.parent)
 			p.path = n;
 	}
 
@@ -400,6 +402,15 @@ public abstract AbstractTreeIterator createSubtreeIterator(Repository repo)
 			throws IncorrectObjectTypeException, IOException;
 
 	/**
+	 * Create a new iterator as though the current entry were a subtree.
+	 *
+	 * @return a new empty tree iterator.
+	 */
+	public EmptyTreeIterator createEmptyTreeIterator() {
+		return new EmptyTreeIterator(this);
+	}
+
+	/**
 	 * Create a new iterator for the current entry's subtree.
 	 * <p>
 	 * The parent reference of the iterator must be <code>this</code>, otherwise
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/EmptyTreeIterator.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/EmptyTreeIterator.java
index eaca04e..cc5ea99 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/EmptyTreeIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/EmptyTreeIterator.java
@@ -57,6 +57,28 @@ EmptyTreeIterator(final AbstractTreeIterator p) {
 		pathLen = pathOffset;
 	}
 
+	/**
+	 * Create an iterator for a subtree of an existing iterator.
+	 * <p>
+	 * The caller is responsible for setting up the path of the child iterator.
+	 *
+	 * @param p
+	 *            parent tree iterator.
+	 * @param childPath
+	 *            path array to be used by the child iterator. This path must
+	 *            contain the path from the top of the walk to the first child
+	 *            and must end with a '/'.
+	 * @param childPathOffset
+	 *            position within <code>childPath</code> where the child can
+	 *            insert its data. The value at
+	 *            <code>childPath[childPathOffset-1]</code> must be '/'.
+	 */
+	public EmptyTreeIterator(final AbstractTreeIterator p,
+			final byte[] childPath, final int childPathOffset) {
+		super(p, childPath, childPathOffset);
+		pathLen = childPathOffset - 1;
+	}
+
 	@Override
 	public AbstractTreeIterator createSubtreeIterator(final Repository repo)
 			throws IncorrectObjectTypeException, IOException {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index a41ca58..250b213 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -800,7 +800,7 @@ public void enterSubtree() throws MissingObjectException,
 			if (t.matches == ch && !t.eof() && FileMode.TREE.equals(t.mode))
 				n = t.createSubtreeIterator(db, idBuffer, curs);
 			else
-				n = new EmptyTreeIterator(t);
+				n = t.createEmptyTreeIterator();
 			tmp[i] = n;
 		}
 		depth++;
-- 
1.6.3.48.g99c76

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used
  2009-05-11 17:52 [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used Shawn O. Pearce
@ 2009-05-11 18:59 ` Robin Rosenberg
  2009-05-11 19:44   ` Shawn O. Pearce
  0 siblings, 1 reply; 4+ messages in thread
From: Robin Rosenberg @ 2009-05-11 18:59 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Mark Struberg, git


Shouldn't we update some unit tests too? The dircache must be reliable.

-- robin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used
  2009-05-11 18:59 ` Robin Rosenberg
@ 2009-05-11 19:44   ` Shawn O. Pearce
  0 siblings, 0 replies; 4+ messages in thread
From: Shawn O. Pearce @ 2009-05-11 19:44 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Mark Struberg, git

Robin Rosenberg <robin.rosenberg.lists@dewire.com> wrote:
> 
> Shouldn't we update some unit tests too? The dircache must be reliable.

I see you ignored the remark I put in the cover letter below the ---.

I think we should have unit tests too.  I just can't come up with
a way test the fix to prevent regression.  And I don't have any
more time for it probably the rest of this week.  I'd rather have
the corruption fix in the tree with no test than continue to have
surprising corruption under certain cases.  But if you or Mark can
offer a test, I'd love to have it.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used
@ 2009-05-12  0:04 Mark Struberg
  0 siblings, 0 replies; 4+ messages in thread
From: Mark Struberg @ 2009-05-12  0:04 UTC (permalink / raw)
  To: Robin Rosenberg, Shawn O. Pearce; +Cc: git


Patch tested, jgit status now works on a repo which previously caused the out of bounds exception.

txs and LieGrue,
strub

--- Shawn O. Pearce <spearce@spearce.org> schrieb am Mo, 11.5.2009:

> Von: Shawn O. Pearce <spearce@spearce.org>
> Betreff: Re: [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used
> An: "Robin Rosenberg" <robin.rosenberg.lists@dewire.com>
> CC: "Mark Struberg" <struberg@yahoo.de>, git@vger.kernel.org
> Datum: Montag, 11. Mai 2009, 21:44
> Robin Rosenberg <robin.rosenberg.lists@dewire.com>
> wrote:
> > 
> > Shouldn't we update some unit tests too? The dircache
> must be reliable.
> 
> I see you ignored the remark I put in the cover letter
> below the ---.
> 
> I think we should have unit tests too.  I just can't
> come up with
> a way test the fix to prevent regression.  And I don't
> have any
> more time for it probably the rest of this week.  I'd
> rather have
> the corruption fix in the tree with no test than continue
> to have
> surprising corruption under certain cases.  But if you
> or Mark can
> offer a test, I'd love to have it.
> 
> -- 
> Shawn.
> --
> To unsubscribe from this list: send the line "unsubscribe
> git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


      

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-05-12  0:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-11 17:52 [JGIT PATCH] Fix data corruption in DirCacheIterator when EmptyTreeIterator is used Shawn O. Pearce
2009-05-11 18:59 ` Robin Rosenberg
2009-05-11 19:44   ` Shawn O. Pearce
2009-05-12  0:04 Mark Struberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).