From: "linux-os (Dick Johnson)" <linux-os@analogic.com>
To: "Xin Zhao" <uszhaoxin@gmail.com>
Cc: "linux-kernel" <linux-kernel@vger.kernel.org>,
	<linux-fsdevel@vger.kernel.org>
Subject: Re: How to know when file data has been flushed into disk?
Date: Fri, 7 Apr 2006 12:55:33 -0400
Message-ID: <Pine.LNX.4.61.0604071236150.12420@chaos.analogic.com>
In-Reply-To: <4ae3c140604070904j51d1b968l2f62a1de647c0b02@mail.gmail.com>


On Fri, 7 Apr 2006, Xin Zhao wrote:

> Thanks for your reply.
>
> That makes sense. But at least ext3 needs to know when all data has
> been flushed so that it can commit the metadata. The question is how
> ext3 knows that. The data flushing is done by the flush daemon. There
> has to be some way to notify ext3 that the data is flushed. Where is
> this part of the code in the ext3 module?
>
> Xin
>
> On 4/7/06, Douglas McNaught <doug@mcnaught.org> wrote:
>> "Xin Zhao" <uszhaoxin@gmail.com> writes:
>>
>>> 3. Does sys_close() have to block until all data and metadata
>>> are committed? If not, sys_close() may give the application the illusion
>>> that the file was successfully written, which can cause the application
>>> to take subsequent actions. However, the data flush could fail. In
>>> this case, the file system seems to mislead the application. Is this true?
>>> If so, any solutions?
>>
>> The fsync() call is the way to make sure written data has hit the
>> disk.  close() doesn't guarantee that.
>>
>> -Doug
>>
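
For reference, the fsync()-before-close() pattern Doug describes
might look roughly like the sketch below. It is written in Python
for brevity; the helper name write_durably and the file name are
made up for illustration. A successful fsync() only reflects what
the kernel and driver report, so treat this as a way to surface
flush errors to the application, not as proof of anything more.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data, then ask the kernel to flush it before closing.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Raises OSError if the kernel reports that the flush failed.
        os.fsync(fd)
    finally:
        # close() by itself does not imply the data was flushed.
        os.close(fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.dat")
    write_durably(target, b"hello\n")
```

The point of the ordering is that an I/O error is raised while the
application can still react to it, instead of being lost after
close() has already returned.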

In principle, you __never__ know that the data got to the
disk platter(s). Any database that thinks differently is
broken by design. You need transaction processing to be
assured that you have all the (correct) data available
in the database. Transaction processing provides atomic
stepping stones so that, in the event of a failure, the
transactions can be rolled back to the last complete one
and then restarted.

The simplest example is the use of a number of journal
files, each containing a record of the previous
transactions and enough information to roll back the
database to the point at which these files were saved.
These files are checksummed and saved in order. In the
event of a crash, these files are read to find the latest
readable one that has a correct checksum. The database
manager uses the information in that file to roll back
the main database to its exact content at the time the
journal file was saved.
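
A minimal sketch of the checksummed-journal idea, in Python. The
on-disk layout (a length and a CRC32 in front of the payload), the
journal.NNNNNNNN naming, and all function names are assumptions made
up for this illustration; a real database manager would store actual
roll-back records instead of an opaque payload.

```python
import os
import struct
import tempfile
import zlib

# Hypothetical header: payload length and CRC32, both little-endian u32.
HEADER = struct.Struct("<II")


def save_journal(directory: str, seq: int, payload: bytes) -> str:
    """Save one journal file; zero-padded names keep them in order."""
    path = os.path.join(directory, f"journal.{seq:08d}")
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    return path


def read_journal(path: str):
    """Return the payload, or None if the file is unreadable or corrupt."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    if len(raw) < HEADER.size:
        return None
    length, crc = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload


def latest_good_journal(directory: str):
    """Scan newest to oldest; return (path, payload) of the first valid file."""
    names = sorted(
        (n for n in os.listdir(directory) if n.startswith("journal.")),
        reverse=True,
    )
    for name in names:
        path = os.path.join(directory, name)
        payload = read_journal(path)
        if payload is not None:
            return path, payload
    return None


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    save_journal(d, 1, b"state-1")
    save_journal(d, 2, b"state-2")
    torn = save_journal(d, 3, b"state-3")
    with open(torn, "r+b") as f:
        f.truncate(HEADER.size + 2)  # simulate a crash in mid-write
    print(latest_good_journal(d))
```

After the simulated crash, the scan skips the torn third file and
settles on the second one, which is the file the database manager
would then use for the roll-back.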

Once the database is restarted, any previous journal
files can be deleted as well as the bad ones that followed.
However, the journal file that was used to restart the
database is never deleted until it has been superseded
by another that worked in a database restart. That way,
there is always a way to get back to a clean database.
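
The clean-up rule can be sketched as follows, again in Python. The
function name prune_journals and the journal.NNNNNNNN file naming
are assumptions for illustration only. The one invariant it tries
to show is that the journal used for the last successful restart is
never removed by the clean-up.

```python
import os
import tempfile


def prune_journals(directory: str, restart_name: str) -> list:
    """Delete every journal file except the one used for the restart.

    Older journals and the bad ones that followed are both removed.
    Returns the names that were deleted, in sorted order.
    """
    keep = os.path.join(directory, restart_name)
    if not os.path.isfile(keep):
        # Refuse to delete anything if the known-good journal is missing.
        raise FileNotFoundError(keep)
    removed = []
    for name in sorted(os.listdir(directory)):
        if name.startswith("journal.") and name != restart_name:
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    for n in ("journal.00000001", "journal.00000002", "journal.00000003"):
        open(os.path.join(d, n), "wb").close()
    # Suppose the restart succeeded from journal 2 and journal 3 was bad.
    print(prune_journals(d, "journal.00000002"))
    print(os.listdir(d))
```

The surviving file is only replaced later, once a newer journal has
itself been proven by a successful restart.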

Cheers,
Dick Johnson
Penguin : Linux version 2.6.15.4 on an i686 machine (5589.42 BogoMips).
Warning : 98.36% of all statistics are fiction, book release in April.

