From: Keith Busch <kbusch@kernel.org>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Javier González" <javier.gonz@samsung.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Theodore Ts'o" <tytso@mit.edu>, "Hannes Reinecke" <hare@suse.de>,
	"Luis Chamberlain" <mcgrof@kernel.org>,
	"Pankaj Raghav" <p.raghav@samsung.com>,
	"Daniel Gomez" <da.gomez@samsung.com>,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
Date: Thu, 9 Mar 2023 07:05:34 -0700
Message-ID: <ZAnnrhUG1B+r1nd0@kbusch-mbp.dhcp.thefacebook.com>
In-Reply-To: <260064c68b61f4a7bc49f09499e1c107e2a28f31.camel@HansenPartnership.com>

On Thu, Mar 09, 2023 at 08:11:35AM -0500, James Bottomley wrote:
> On Thu, 2023-03-09 at 09:04 +0100, Javier González wrote:
> > FTL designs are complex. We have ways to maintain sector sizes under
> > 64 bits, but this is a common industry problem.
> > 
> > The media itself does not normally operate at 4K. Page sizes can be
> > 16K, 32K, etc.
> 
> Right, and we've always said if we knew what this size was we could
> make better block write decisions.  However, today if you look at what
> most NVMe devices are reporting, it's a bit sub-optimal:

Your sample size may be off if your impression is that "most" NVMe drives
report themselves this way. :)
 
> jejb@lingrow:/sys/block/nvme1n1/queue> cat logical_block_size 
> 512
> jejb@lingrow:/sys/block/nvme1n1/queue> cat physical_block_size 
> 512
> jejb@lingrow:/sys/block/nvme1n1/queue> cat optimal_io_size 
> 0
> 
> If we do get Linux to support large block sizes, are we actually going
> to get better information out of the devices?
> 
> > Increasing the block size would allow for better host/device
> > cooperation. As Ted mentions, this has been a requirement for HDD and
> > SSD vendors for years. It seems to us that the time is right now and
> > that we have mechanisms in Linux to do the plumbing. Folios are
> > obviously a big part of this.
> 
> Well a decade ago we did a lot of work to support 4k sector devices.
> Ultimately the industry went with 512 logical/4k physical devices
> because of problems with non-Linux proprietary OSs but you could still
> use 4k today if you wanted (I've actually still got a working 4k SCSI
> drive), so why is no NVMe device doing that?

In my experience, all but the cheapest consumer-grade NVMe devices report 4k
logical blocks. They all support an option to emulate 512b if you really want
that, but the more optimal 4k is the most common default for server-grade NVMe.
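
For anyone who wants to check what their own drives report, here is a minimal
sketch that reads the same three queue attributes quoted above from sysfs. The
function name read_queue_limits and its sysfs_root parameter are illustrative
assumptions, not part of any kernel or library API; only the attribute file
names come from the quoted output.

```python
from pathlib import Path

# The three sysfs queue attributes shown in the quoted `cat` output.
QUEUE_ATTRS = ("logical_block_size", "physical_block_size", "optimal_io_size")


def read_queue_limits(device, sysfs_root="/sys/block"):
    """Return {attribute: int or None} for one block device.

    `sysfs_root` is a hypothetical parameter so the function can be pointed
    at a test directory; on a real system the default is what you want.
    """
    queue_dir = Path(sysfs_root) / device / "queue"
    limits = {}
    for attr in QUEUE_ATTRS:
        try:
            limits[attr] = int((queue_dir / attr).read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable: record it as unknown.
            limits[attr] = None
    return limits


if __name__ == "__main__":
    # Print the reported limits for every NVMe namespace found, if any.
    for dev in sorted(Path("/sys/block").glob("nvme*")):
        print(dev.name, read_queue_limits(dev.name))
```

A device that prints 512 for logical_block_size is either a 512b-native drive
or one formatted with the 512b emulation option mentioned above.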
