From: Theodore Ts'o <tytso@mit.edu>
To: Martin Steigerwald <Martin@lichtvoll.de>
Cc: Subranshu Patel <spatel.ml@gmail.com>, linux-ext4@vger.kernel.org
Subject: Re: Large buffer cache in EXT4
Date: Sun, 17 Feb 2013 23:35:17 -0500	[thread overview]
Message-ID: <20130218043517.GB10361@thunk.org> (raw)
In-Reply-To: <201302171125.40116.Martin@lichtvoll.de>

On Sun, Feb 17, 2013 at 11:25:39AM +0100, Martin Steigerwald wrote:
> 
> What I never really understand was what is the clear distinction between 
> dirty pages and disk block buffers. Why isn´t anything that is about to be 
> written to disk in one cache?

The buffer cache is indexed by physical block number, and each buffer
in the buffer cache is the size of the block size used for I/O to the
device.
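
Just to make the indexing concrete, here is a minimal sketch (mine, not
code lifted from ext4) of a file system pulling one metadata block in
through the buffer cache.  sb_bread() is keyed by the physical block
number, and the buffer_head it returns covers s_blocksize bytes:

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    static int read_metadata_block(struct super_block *sb, sector_t blocknr)
    {
            struct buffer_head *bh;

            bh = sb_bread(sb, blocknr);   /* keyed by device + physical block */
            if (!bh)
                    return -EIO;

            /* bh->b_data holds sb->s_blocksize bytes of that block */

            brelse(bh);                   /* drop the buffer cache reference */
            return 0;
    }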

The page cache is indexed by <inode, page frame number>, and each page
is the size of a VM page (i.e., 4k for x86 systems, 16k for Power
systems, etc.)
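
By contrast, a page cache lookup goes through the inode's address_space
plus a page index, with no block number in sight.  Again just an
illustrative sketch, not a real call site:

    #include <linux/fs.h>
    #include <linux/pagemap.h>

    static struct page *peek_cached_page(struct inode *inode, pgoff_t index)
    {
            /* lookup is keyed by <inode->i_mapping, page index> */
            struct page *page = find_get_page(inode->i_mapping, index);

            /* may be NULL; if not, the caller must drop the ref with put_page() */
            return page;
    }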

Certain file systems, including ext3, ext4, and ocfs2, use the jbd or
jbd2 layer to handle their physical block journalling, and this layer
fundamentally uses the buffer cache, since it is concerned with
controlling when specific file system blocks are allowed to be
written back to the hard drive.
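
The usual jbd2 pattern looks roughly like this (heavily simplified,
most error handling trimmed).  Note that every step operates on a
buffer_head, which is why the layer is wedded to the buffer cache:

    #include <linux/err.h>
    #include <linux/jbd2.h>
    #include <linux/buffer_head.h>

    static int journal_one_block(journal_t *journal, struct buffer_head *bh)
    {
            handle_t *handle;
            int err;

            handle = jbd2_journal_start(journal, 1);  /* reserve one credit */
            if (IS_ERR(handle))
                    return PTR_ERR(handle);

            err = jbd2_journal_get_write_access(handle, bh);
            if (!err) {
                    /* ... modify bh->b_data ... */
                    err = jbd2_journal_dirty_metadata(handle, bh);
            }
            jbd2_journal_stop(handle);
            return err;
    }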

Other file systems may not support file system blocks smaller than 4k.
This may make it easier for them to use the page cache for their
metadata blocks, although I don't know what happens if you try to
mount a btrfs file system formatted with 4k blocks on an architecture
such as Power which has 16k pages.  I don't know if it will work, or
blow up in a spectacular display of sparks.  :-)

In practice, it really doesn't matter.  The actual data storage for
the buffer cache (i.e., where the b_data field points to in the struct
buffer_head) is actually in the page cache, so from a space
perspective it doesn't really matter.  File systems like ext3 and ext4
which use the buffer cache for metadata blocks need to be careful
that when a directory (which is metadata) is deleted, the blocks in
the buffer cache are zapped, so that if the space on disk is reused
for a data file (which is cached in the page cache), the stale entries
in the buffer cache aren't at risk of being written back to the disk.
But that's just a tiny implementation detail....
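
For the curious, the helper that does that zapping in kernels of this
vintage is unmap_underlying_metadata().  A sketch of the shape of the
call (not the actual ext4 call site):

    #include <linux/buffer_head.h>

    static void discard_stale_metadata(struct block_device *bdev, sector_t block)
    {
            /*
             * Throw away any buffer_head aliased to this physical block, so a
             * stale metadata copy can't get written over the new data block.
             */
            unmap_underlying_metadata(bdev, block);
    }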

							- Ted
