All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfsprogs: mkfs: don't default to the physical sector size if it's bigger than XFS_MAX_SECTORSIZE
Date: Wed, 29 Jan 2020 10:53:30 +1100	[thread overview]
Message-ID: <20200128235330.GG18610@dread.disaster.area> (raw)
In-Reply-To: <x49zhe7serq.fsf@segfault.boston.devel.redhat.com>

On Tue, Jan 28, 2020 at 11:54:17AM -0500, Jeff Moyer wrote:
> "Darrick J. Wong" <darrick.wong@oracle.com> writes:
> 
> > Do we need to check that ft->lsectorsize <= XFS_MAX_SECTORSIZE too?
> 
> We can.  What would be the correct response to such a situation?

Fail. The device is outside of the bounds of the validated on-disk
format at that point.

Historically speaking, the on-disk format only supports up to 32kB
sector sizes and the limitation is a journal format limitation. That
is, the journal implementation is based around 32kB buffers and
journal writes requiring atomic sector writes. Even though v2 logs
can support up to 256kB buffers, the format is still based around
32kB "segments" within the larger buffer.

IIRC, this limitation is due to the built-in sector-based "whole
buffer written" validation mechanism built into the log headers.
The v1 format has an array that can store 64 individual sector
indexes, so for 512 byte sectors the max log buffer size is 32kB.
As long as that first sector of the log buffer is written
atomically, it doesn't matter what the sector size actually is.

However, we cannot do 32kB writes to a 64kB physical sector device,
and hence the v1 format icannot support log records larger than
32kB, so the format is limited to 32kB sector sizes.

The v2 log format does increase the size of the log buffer, but it
does so by adding extension header sectors that contain the index
mapping for each 32kB sector. It works under the same principle as
the v1 log - as long as the first sector write is atomic, we can
validate the extension headers are intact and so validate all 32kB
segments of the larger buffer.

The v2 log format also allows for stripe unit padding of writes,
such that it will always write entire stripe units even when the log
buffer is not full. Hence it supports writing aligned sizes larger
than 32kB, even up to 256kB. However, it still relies on atomic
sector writes for validation.

When we moved to the v5 on-disk format, we also turned on CRC
checking of the log records. THis means we are no longer dependent
on atomic sector writes to detect torn log writes - if the first
512 byte sector is not written atomically, we can detect that in
recovery on v5 filesystems. That means we don't really care what the
sector size used by the log is anymore. We can do aligned log writes
up to 256kB, and we can detect torn single sector writes.

Hence I suspect XFS_MAX_SECTORSIZE is no longer relevant to v5
filesystems and we may be able to increase it to 64kB without any
issues. We'd need to do quite a bit more code spelunking than I've
just done, not to mention validation testing, before we can do such
a thing, though....

> > (Not that I really expect disks with 64k LBA units...)
> 
> It looks like you can tell loop to do that, but I wasn't able to make
> that happen in practice.  I'm not quite sure why, though.

IIRC, it depends on what IO alignment capability the underlying
filesystem provides via it's sector/block sizes.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-01-28 23:53 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-28 16:07 [PATCH] xfsprogs: mkfs: don't default to the physical sector size if it's bigger than XFS_MAX_SECTORSIZE Jeff Moyer
2020-01-28 16:14 ` Darrick J. Wong
2020-01-28 16:54   ` Jeff Moyer
2020-01-28 23:53     ` Dave Chinner [this message]
2020-02-04 18:38   ` Jeff Moyer
2021-04-16 17:39 ` Eric Sandeen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200128235330.GG18610@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.