From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Duncan <1i5t5.duncan@cox.net>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs on whole disk (no partitions)
Date: Mon, 23 Jun 2014 08:24:04 -0400
Message-ID: <yq1bntjizkr.fsf@sermon.lab.mkp.net>
In-Reply-To: <pan$600a2$3927cc39$ae007590$e90b8415@cox.net> (Duncan's message of "Mon, 23 Jun 2014 02:10:03 +0000 (UTC)")

>>>>> "Duncan" == Duncan  <1i5t5.duncan@cox.net> writes:

Duncan> Tho as you point out elsewhere, levels under the filesystem
Duncan> layer may split the btrfs 4096 byte block size into 512 byte
Duncan> logical sector sizes if appropriate, but that has nothing to do
Duncan> with btrfs except that it operates on top of that.

The notion of "splitting into a different block size" is a bit
confusing. The filesystem submits an N-byte I/O, and we transfer N bytes
of data whether the logical block size is 512 or 4096. The only thing
the logical block size really affects is how we calculate the LBA and
block count in the command we send to the device. If N is not a multiple
of the device's logical block size we simply reject the I/O. If an I/O
is misaligned or not a multiple of the physical block size, we let the
drive do a read-modify-write (RMW) cycle. So there isn't any "splitting"
going on.
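
As a rough illustration of the arithmetic above (not kernel code), the
sketch below turns a byte-addressed I/O into a starting LBA and block
count, and flags writes that would leave the drive doing RMW. The
function names are hypothetical, and rejecting a misaligned start offset
as well as a misaligned length is an assumption of this sketch.

```python
def to_lba_and_count(byte_offset, nbytes, logical_block_size):
    """Translate a byte-addressed I/O into (starting LBA, block count).

    Raises ValueError when the I/O is not a multiple of the logical
    block size, which stands in for "reject the I/O". Checking the
    start offset too is an assumption made for this sketch.
    """
    if logical_block_size <= 0:
        raise ValueError("logical block size must be positive")
    if byte_offset % logical_block_size or nbytes % logical_block_size:
        raise ValueError("I/O is not a multiple of the logical block size")
    return byte_offset // logical_block_size, nbytes // logical_block_size


def write_needs_rmw(byte_offset, nbytes, physical_block_size):
    """True if a write starts or ends inside a physical block."""
    return (byte_offset % physical_block_size != 0
            or nbytes % physical_block_size != 0)


# The same 4096-byte I/O at byte offset 8192, addressed two ways.
lba_512 = to_lba_and_count(8192, 4096, 512)    # 512-byte logical blocks
lba_4k = to_lba_and_count(8192, 4096, 4096)    # 4096-byte logical blocks
```

In both cases the amount of data transferred is identical; only the
numbers that go into the command differ.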

An I/O may be split if MD or DM is involved and the request straddles a
stripe chunk boundary. Because Linux generally does all I/O in terms of
4K pages, sub-page size splits are rare. Pretty much all the other cases
that would force us to split an I/O (typically controller DMA
constraints) operate on a page boundary.
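
To make the stripe chunk case concrete, here is a minimal sketch of
cutting a request wherever it crosses a chunk boundary. It is not how MD
or DM implement it; the function name and the 64 KiB chunk size in the
example are assumptions chosen for illustration.

```python
def split_at_chunk_boundaries(byte_offset, nbytes, chunk_size):
    """Return (offset, length) pieces, none of which crosses a chunk boundary."""
    if chunk_size <= 0:
        raise ValueError("chunk size must be positive")
    pieces = []
    end = byte_offset + nbytes
    while byte_offset < end:
        # First chunk boundary strictly after the current offset.
        chunk_end = (byte_offset // chunk_size + 1) * chunk_size
        piece_end = min(end, chunk_end)
        pieces.append((byte_offset, piece_end - byte_offset))
        byte_offset = piece_end
    return pieces


KIB = 1024
# An 8 KiB request starting 4 KiB before a 64 KiB boundary is cut in two.
straddling = split_at_chunk_boundaries(60 * KIB, 8 * KIB, 64 * KIB)
# A 4 KiB request inside one chunk is left alone.
contained = split_at_chunk_boundaries(0, 4 * KIB, 64 * KIB)
```

With page-sized requests and chunk sizes that are multiples of the page
size, every cut lands on a page boundary, which is why sub-page splits
are rare.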

To avoid forcing the drive to do RMW on the head and tail of a
misaligned I/O, it is imperative that filesystems be aligned to the
physical block size of the underlying device. As has been pointed out,
the partitioning utilities generally make sure that's the case. If there
are no partitions then you're by definition aligned, unless the drive has
the infamous Windows XP jumper installed.
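
A quick way to reason about the alignment requirement is to check the
filesystem's starting byte offset against the physical block size. The
sketch below assumes start positions are given in 512-byte sectors, as
partitioning tools commonly report them. The function name is
hypothetical, and it ignores any offset introduced by the drive itself
(such as the jumper mentioned above).

```python
SECTOR_BYTES = 512  # unit assumed for start_sector in this sketch


def fs_start_is_aligned(start_sector, physical_block_size):
    """True if a filesystem starting at start_sector begins on a
    physical block boundary."""
    if physical_block_size <= 0:
        raise ValueError("physical block size must be positive")
    return (start_sector * SECTOR_BYTES) % physical_block_size == 0


# On a drive with 4096-byte physical blocks:
whole_disk = fs_start_is_aligned(0, 4096)         # no partition table
modern_part = fs_start_is_aligned(2048, 4096)     # 1 MiB partition start
legacy_part = fs_start_is_aligned(63, 4096)       # old DOS-style start
```

Sector 63 works out to byte 32256, which is not a multiple of 4096, so
a filesystem starting there would have every 4K block straddle two
physical blocks.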

Anyway. The short answer is that Linux will pretty much always do I/O in
multiples of the system page size regardless of the logical block size
of the underlying device. There are a few exceptions to this, such as
direct I/O, legacy filesystems using buffer heads, and raw block device
access.

-- 
Martin K. Petersen	Oracle Linux Engineering
