linux-fsdevel.vger.kernel.org archive mirror
From: Hin-Tak Leung <htl10@users.sourceforge.net>
To: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Michael Fox <415fox@gmail.com>, linux-fsdevel@vger.kernel.org
Subject: Re: hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'
Date: Sun, 7 Apr 2013 14:41:08 +0100 (BST)
Message-ID: <1365342068.96252.YahooMailClassic@web172304.mail.ir2.yahoo.com>
In-Reply-To: <2A43A957-0651-4262-8E7A-2475A594820F@dubeyko.com>

--- On Sun, 7/4/13, Vyacheslav Dubeyko <slava@dubeyko.com> wrote:

> Hi Hin-Tak,
> 
> On Apr 5, 2013, at 12:57 AM, Hin-Tak Leung wrote:
> 
> > Hi Michael,
> > 
> > Argh, that looks suspiciously like the recurring problem I have
> > been trying to pin down for much of the last year. My current
> > thinking is that one of the patches posted a couple of weeks ago
> > might help.
> 
> As I remember, you can easily reproduce the issue that you
> are investigating. Is the issue reproducible with debug output
> enabled? Can you reproduce it with debug output fully enabled
> (I mean with all debug flags enabled)? If so, could you
> share this debug output with me?

That's correct - I can trigger the error condition with debug enabled reasonably "reliably". I remember doing that once, I think with catalog and extent debugging on. The problem was that it generated too much information: I need to run "du" on a large directory (~million files) to trigger the condition, the catalog debugging output is a few lines per file, and "du" visits every one of the ~million files, so we are talking about dumping a few hundred MBs into /var/log/messages :-(.
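
As a rough sanity check on that "few hundred MBs" figure, here is a back-of-envelope sketch. The 3 lines per file and 100 bytes per line are assumptions picked for illustration (I did not measure them); only the ~million file count comes from the actual test.

```python
def estimate_log_bytes(num_files, lines_per_file, bytes_per_line):
    """Estimate the total size of debug output written while walking a tree."""
    return num_files * lines_per_file * bytes_per_line


# ~1 million files as in the "du" run; the other two values are assumed.
total = estimate_log_bytes(num_files=1_000_000, lines_per_file=3, bytes_per_line=100)
print(f"{total / 1e6:.0f} MB")  # prints "300 MB"
```

Changing either assumed value by a factor of two still leaves the total in the hundreds of MBs, which is why the log becomes unwieldy.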

Hence another reason to switch to dynamic debugging as well - so that one can switch individual debugging lines on and off. Even that is not ideal.
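
To make the per-line switching concrete, here is a minimal sketch of building queries for the kernel's dynamic debug control file. It assumes a kernel built with CONFIG_DYNAMIC_DEBUG, debugfs mounted at /sys/kernel/debug, and an hfsplus driver whose debug statements have been converted to pr_debug(); none of that is confirmed for the kernel under discussion. The helper names and the line range are hypothetical.

```python
# Usual location when debugfs is mounted at /sys/kernel/debug (an assumption).
CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def dyndbg_query(enable, module=None, file=None, func=None, line=None):
    """Build one dynamic debug query such as 'module hfsplus +p'."""
    parts = []
    if module:
        parts += ["module", module]
    if file:
        parts += ["file", file]
    if func:
        parts += ["func", func]
    if line is not None:
        parts += ["line", str(line)]  # a single number or a range like "100-200"
    if not parts:
        raise ValueError("at least one match criterion is required")
    parts.append("+p" if enable else "-p")  # +p enables printing, -p disables it
    return " ".join(parts)


def apply_query(query, control=CONTROL):
    """Write a query to the control file (needs root and a mounted debugfs)."""
    with open(control, "w") as f:
        f.write(query + "\n")


if __name__ == "__main__":
    # Only print the queries; call apply_query() to actually apply them.
    print(dyndbg_query(True, module="hfsplus"))
    print(dyndbg_query(False, file="catalog.c", line="100-200"))  # hypothetical range
```

Enabling the whole module and then switching off the noisiest lines would be one way to keep the log volume manageable, though it still relies on knowing in advance which lines matter.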

> Thanks,
> Vyacheslav Dubeyko.
> 
> > That patch addresses out-of-memory conditions in caching of
> > metadata, in a nutshell. I think if (1) the system is under
> > memory stress, (2) one is doing something which traverses the
> > file system very quickly, (3) on a multi-CPU/core system, it is
> > possible to run some mutexed non-re-entrant code in the hfsplus
> > driver simultaneously without a mutex lock, and therefore get it
> > a bit confused. This idea at least explains why (1) adding an
> > inner mutex lock can delay the problem, although supposedly the
> > outer mutex should have prevented more than one copy of the
> > non-re-entrant code from being run and the inner mutex lock
> > should have no effect at all, and (2) the on-disk data always
> > fsck's okay - it is just the driver itself getting confused.
> > 
> > So I have a few questions for you:
> > 
> > 1. You are on a quad-core system, correct? This is according to
> > your /proc/cpuinfo below.
> > 
> > 2. You are certainly doing fast file system traversal
> > (updatedb), but are you actually doing it *on top of the
> > hfsplus* file system? I am asking this because updatedb is
> > usually configured not to index removable media under /mnt
> > or /media. But you mentioned you have the hfsplus system
> > mounted under /home - please confirm that and include some
> > more details if you can.
> > 
> > 3. How full and populous is that hfs+ file system? i.e. the
> > output of both "df" and "df -i" while it is mounted. Is this
> > your Mac OS X system (root / ) disk?
> > 
> > 4. Is your system under memory stress at the moment the problem
> > happens - e.g. you have a web browser with a few hundred tabs
> > open?
> > 
> > Hin-Tak
> > 
> > --- On Thu, 4/4/13, Vyacheslav Dubeyko <slava@dubeyko.com> wrote:
> > 
> 
> 
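
The hypothesis quoted above (non-re-entrant code running in two contexts at once and confusing only the in-memory state) can be illustrated with a toy model. This is not hfsplus code: the "cache", the key name and the two-step update are all made up for illustration, and generators stand in for threads so that the interleaving is deterministic.

```python
def update(cache, key):
    """Non-re-entrant read-modify-write, split into steps so it can be interleaved."""
    seen = cache.get(key, 0)  # step 1: read the current value
    yield
    cache[key] = seen + 1     # step 2: write back, based on a possibly stale read
    yield


def run_interleaved(tasks):
    """Advance every task one step per round, as if no lock were held."""
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)


def run_serialized(tasks):
    """Run each task to completion before the next, as a mutex would enforce."""
    for task in tasks:
        for _ in task:
            pass


unlocked = {}
run_interleaved([update(unlocked, "node"), update(unlocked, "node")])

locked = {}
run_serialized([update(locked, "node"), update(locked, "node")])

print(unlocked["node"], locked["node"])  # prints "1 2": one update is lost without the lock
```

In the unlocked run both updates read the old value before either writes, so one of them is lost; nothing outside the in-memory structure is touched, which fits with the on-disk data still passing fsck.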
