All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sergei Antonov <saproj@gmail.com>
To: "htl10@users.sourceforge.net" <htl10@users.sourceforge.net>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: A few thoughts on hfsplus dev (Re: [PATCH v2] hfsplus: add journal replay)
Date: Tue, 25 Mar 2014 02:37:03 +0100	[thread overview]
Message-ID: <CAE71095-515F-4293-9CDB-DEF277EC9A7A@gmail.com> (raw)
In-Reply-To: <1395616538.43957.YahooMailBasic@web172305.mail.ir2.yahoo.com>

> Am 24.03.2014 um 00:15 schrieb Hin-Tak Leung <htl10@users.sourceforge.net>:
> 
> Hi Vyacheslav and Sergei,
> 
> I was hoping not to feed into the issues. I believe the bottom line is that, there are
> a few interesting and challenging issues with HFS+ support, and there are simply not
> enough people who are both knowledgeable and motivated to improve on it. So we
> should try to work together. Does it sound reasonable?

Yes, it does.

> 
> Sergei: - I appreciate that you are the first to point out the endian issue with the journal,
> and been working on other aspects, such as the foldercount issue. I would be happy
> to see more work from you, and continue to be willing to have a look and review, where
> time permits. However, I do see a few problems with the manner in which you approach
> this work. From my personal point of view (and I cannot speak for others), file system
> work is data-safety first.

Let me inject a remark here.
A reader of my discussion with Vyacheslav here may get an impression that my code cares less about data safety. This impression is false. To make a long story short: replaying+removing transactions one by one is as safe as replaying them all at once and then removing them alltogether. Maybe it is not so for other journal formats, for HFS+ journal it is.

> So while Vyacheslav's work might be on the verbose side,
> it is re-assuring to read, maybe repeatedly, about the details, than skipping on boring
> details. I don't think length of code should be a criteria, and certainly not "elegance",
> or cleverness. Here is a widely quoted saying from Brian Kernighan:
> "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the
> code as cleverly as possible, you are, by definition, not smart enough to debug it."
> 
> - Also, if you deviate from how HFS+ works either at the specification level, the behavior level
> or the code level, you are just implementing your own journalling work.

Sigh. You have been misled. That's probably because claims like "you do not follow this excerpt from the spec" are more impressive and easier to remember than my following explanations why I am not.

> I would also have
> preferred that you point out exactly where Vyacheslav's work went wrong, and supply
> a patch on top, instead of starting your own. If you want credit for it, I hope you can
> arrange with Vyacheslav to swap the order of signed-off, for one or more of the merged patches,
> for example. In any case, while I am happy to see your work, and am willing to review it,
> you have not convinced me to try your work out yet. So.

I am ready for further convincing efforts.
Though now that Vyacheslav has stopped replying in the thread there is no progress. Two compromises have been suggested by me in the thread, I'll recap them here for you or anyone who wasn't following.
1. Since collecting all transactions requires more memory (up to 12 MB in my tests, but theoretically more), I suggested a threshold. Let's say it will be 1 MB. Once collected data reaches one meg, flush it, free the buffers and go on collecting.
2. A very simple suggestions that makes my logic closer to Vyacheslav's one-by-one transaction processing approach. In case we encounter a corrupted transaction, replay (good) transactions collected so far. Corrupted transaction is a rare thing (caused by cosmic rays!), but partial replay looks sane and not a big change to the code.

> Vyacheslav: Thanks for the continuous good work. While your adherence to TN1150 is
> good, I have noticed a few times that you have not paid attention to Apple's
> implementation of it - xnu's kernel code and diskdev_cmds (apple's mkfs/fsck). It is apparent
> in some of the discussion between you and Sergei that he has read some of those but
> you haven't. This is shown in the discussion about meaning and naming of variables,
> and interpretation of reserved fields. About TN1150, I think there is one instance about
> journalling support (a block count?) where it is ambiguous and contradicctory about a +1 offset.
> Anyway, I think reference to the actual implementations are welcomed in resolving
> these ambiguities and clarifying issues.
> 
> To the both of you, looking forward, and a few technical issues:
> 
> - I think fsck.hfsplus must manipulate the journal - fs with a non-empty journal is by definition
> unclean; but fsck may or may not be doing journal replay. It probably does something else,
> like just zero'ing the journal and fixes inconsistency in the rest. At some point we should find
> out what fsck does when it fixes an unclean fs with non-empty journal.

Easy. There is no non-empty journal by the time there is something to fix. It is replayed or discarded before the check.

> - I have a more re-produce'able recipe of making disk images with strange 
> location of 2nd volume header using hdiutil (Apple's loopback mounting, etc disk image
> manipulation tool). At some point we should re-visit this issue.

Well, present your recepie. But! There is only one proper placement for this header. All other placements are improper, and the spec even explicitly gives one example of an improper placement. I imagine no ways to get it wrong. I see no need for any elaborate code for finding the 2nd header.

> - on the many-year-old issue of the driver getting confused just running "du" recursing
> down large directories, I have managed to make it happen under Ftrace the Linux
> kernel internal tracer, and even with code mod's, but the resulting log is too
> large and events are being dropped. So I need to do some filtering to get
> events down to a fully-recordable level. I.e. it would help to confirm issues
> if I know where to look...  
> 
> Hin-Tak
> 

  parent reply	other threads:[~2014-03-25  1:37 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-24 10:05 [PATCH v2] hfsplus: add journal replay Sergei Antonov
2014-02-24 10:21 ` Vyacheslav Dubeyko
2014-03-09 16:36 ` Vyacheslav Dubeyko
2014-03-09 22:50   ` Sergei Antonov
2014-03-13 10:20     ` Vyacheslav Dubeyko
2014-03-13 20:04       ` Sergei Antonov
2014-03-23 23:15         ` A few thoughts on hfsplus dev (Re: [PATCH v2] hfsplus: add journal replay) Hin-Tak Leung
2014-03-24  6:56           ` Vyacheslav Dubeyko
2014-03-25  1:37           ` Sergei Antonov [this message]
2014-03-25  6:30             ` Vyacheslav Dubeyko
2014-03-28 13:42               ` Sergei Antonov
2014-03-28 15:09                 ` Vyacheslav Dubeyko
2014-03-28 20:04                   ` Sergei Antonov
2014-04-04 23:35 Hin-Tak Leung
2014-04-06 20:59 ` Sergei Antonov
2014-04-05  1:02 Hin-Tak Leung
2014-04-06 16:01 ` Sergei Antonov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAE71095-515F-4293-9CDB-DEF277EC9A7A@gmail.com \
    --to=saproj@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=htl10@users.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=slava@dubeyko.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.