From: Jakob Oestergaard <jakob@unthought.net>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@zip.com.au>
Subject: jbd bug(s) (?)
Date: Tue, 24 Sep 2002 09:21:17 +0200
Message-ID: <20020924072117.GD2442@unthought.net>


First:

I was wondering about the following in Linux 2.4.19.

In fs/jbd/commit.c:583, we find:
 /* AKPM: buglet - add `i' to tmp! */
 for (i = 0; i < jh2bh(descriptor)->b_size; i += 512) {
         journal_header_t *tmp =
                 (journal_header_t*)jh2bh(descriptor)->b_data;
         tmp->h_magic = htonl(JFS_MAGIC_NUMBER);
         tmp->h_blocktype = htonl(JFS_COMMIT_BLOCK);
         tmp->h_sequence = htonl(commit_transaction->t_tid);
 }


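As I see it, 'i' is never used inside the loop, so every iteration
rewrites the same header at the start of the block. This means that
jbd-using filesystems (ext3) will only remember writing *ONE* entry
from the journal.

Isn't this a problem?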
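
For illustration, here is a minimal userspace sketch of what the loop
would look like with 'i' added to the pointer, as the AKPM comment
suggests. It is not the kernel code: the constants are stand-ins whose
values are assumed here, stamp_commit_headers() is a hypothetical helper,
and a plain char buffer replaces jh2bh(descriptor)->b_data.

```c
#include <arpa/inet.h>  /* htonl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins for the jbd definitions; values are assumed. */
#define JFS_MAGIC_NUMBER 0xc03b3998U
#define JFS_COMMIT_BLOCK 2
#define SECTOR_SIZE      512

typedef struct journal_header_s {
        uint32_t h_magic;
        uint32_t h_blocktype;
        uint32_t h_sequence;
} journal_header_t;

/*
 * Hypothetical helper: write a commit header at the start of every
 * 512-byte sector of the block, i.e. at data + i rather than at data.
 */
static void stamp_commit_headers(char *data, size_t size, uint32_t tid)
{
        size_t i;

        assert(size % SECTOR_SIZE == 0);
        for (i = 0; i < size; i += SECTOR_SIZE) {
                journal_header_t hdr;

                hdr.h_magic = htonl(JFS_MAGIC_NUMBER);
                hdr.h_blocktype = htonl(JFS_COMMIT_BLOCK);
                hdr.h_sequence = htonl(tid);
                /* memcpy sidesteps alignment concerns in userspace */
                memcpy(data + i, &hdr, sizeof(hdr));
        }
}
```

With a 1024-byte block this stamps two headers, one at offset 0 and one
at offset 512, where the quoted loop would write offset 0 twice.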

Second:

The jbd superblock contains an index into the journal for the first
transaction - but there is only *one* copy of the index, and there is no
reasonable way to detect if it got written correctly to disk.

If the system loses power while updating the superblock, and only *half*
of this index is written correctly, we have a journal which we cannot
reach.

That sort of removes the point of having the journal in the first place
(if my above assertion is true).

As far as I know, Tux2 solves this problem by keeping multiple indexes
(yes it uses phase trees and not a journal, but Tux2 root nodes and the
journal index are identical wrt. this problem).

If one keeps two blocks holding:
  index
  timestamp
  CRC
one can consider the two blocks and disregard any with an invalid CRC.
This leaves us with one or two blocks - we then pick the one with
the highest timestamp - and we are then guaranteed to *always* have a
valid index.

(The above works when the timestamp is incremented for every write,
index updates are written alternating between the two blocks, and the
complete block is sync()ed before the other is written to.)
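
The read side of the scheme above can be sketched in a few lines of C.
This is only an illustration under assumptions: struct index_block, its
32-bit field widths, seal_block() and pick_index() are all hypothetical
names, the CRC is a plain bitwise CRC-32, and timestamp wraparound is
ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one of the two index blocks. */
struct index_block {
        uint32_t index;      /* journal position of the first transaction */
        uint32_t timestamp;  /* incremented on every index update */
        uint32_t crc;        /* CRC-32 over index and timestamp */
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bytes(const unsigned char *p, size_t len)
{
        uint32_t crc = 0xFFFFFFFFu;
        size_t i;
        int bit;

        for (i = 0; i < len; i++) {
                crc ^= p[i];
                for (bit = 0; bit < 8; bit++)
                        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u
                                         : (crc >> 1);
        }
        return ~crc;
}

/* CRC over index and timestamp, serialized byte by byte (LSB first). */
static uint32_t block_crc(const struct index_block *b)
{
        unsigned char raw[8];
        int i;

        for (i = 0; i < 4; i++) {
                raw[i]     = (unsigned char)(b->index >> (8 * i));
                raw[4 + i] = (unsigned char)(b->timestamp >> (8 * i));
        }
        return crc32_bytes(raw, sizeof(raw));
}

/* Fill in a block and compute its CRC, as a writer would before sync. */
static void seal_block(struct index_block *b, uint32_t index,
                       uint32_t timestamp)
{
        b->index = index;
        b->timestamp = timestamp;
        b->crc = block_crc(b);
}

/*
 * Discard blocks whose CRC does not match, then return the survivor
 * with the highest timestamp. Returns NULL if neither block is valid.
 */
static const struct index_block *pick_index(const struct index_block *a,
                                            const struct index_block *b)
{
        int a_ok, b_ok;

        assert(a != NULL && b != NULL);
        a_ok = (a->crc == block_crc(a));
        b_ok = (b->crc == block_crc(b));
        if (a_ok && b_ok)
                return a->timestamp > b->timestamp ? a : b;
        if (a_ok)
                return a;
        if (b_ok)
                return b;
        return NULL;
}
```

If a write to one block is torn, its CRC no longer matches and
pick_index() falls back to the other block, which still holds the
previous, fully written index.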

Wouldn't something like this be required for a journalling fs to be
worth anything?

I know the window is rather small for half an index to be written - but
that doesn't mean it can't happen.


-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
