From: Keith Busch <kbusch@kernel.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
Date: Thu, 2 Mar 2023 18:58:58 -0700
Message-ID: <ZAFUYqAcPmRPLjET@kbusch-mbp.dhcp.thefacebook.com>
In-Reply-To: <Y/7L74P6jSWwOvWt@mit.edu>

On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
> Emulated block devices offered by cloud VM’s can provide functionality
> to guest kernels and applications that traditionally have not been
> available to users of consumer-grade HDD and SSD’s.  For example,
> today it’s possible to create a block device in Google’s Persistent
> Disk with a 16k physical sector size, which promises that aligned 16k
> writes will be atomic.  With NVMe, it is possible for a storage
> device to promise this without requiring read-modify-write updates for
> sub-16k writes. 

I'm not sure it does. The NVMe spec doesn't say that writes within AWUN are
never an RMW operation. It suggests that aligning to NPWA is the best way to
avoid RMW, but it doesn't guarantee that, nor does it require that this limit
align with atomic boundaries. NVMe provides a lot of hints, but stops short of
promises. Vendors can promise whatever they want, but that's outside the spec.
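
To make the distinction concrete, here is a rough sketch of the two checks a
host could derive from those fields. It assumes AWUN and NPWA are kept in their
raw 0's based form and counted in logical blocks; the struct and function names
are made up for illustration and are not a kernel or nvme-cli interface. A
write that passes both checks is only following the device's hints: nothing
here shows that the device avoids RMW internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for two reported limits. Both are assumed to be
 * stored in their raw 0's based form, in units of logical blocks. */
struct nvme_write_hints {
	uint16_t awun;	/* Atomic Write Unit Normal, 0's based */
	uint16_t npwa;	/* Namespace Preferred Write Alignment, 0's based */
};

/* True if a write of nlb logical blocks is no larger than the atomic unit. */
bool fits_awun(const struct nvme_write_hints *h, uint32_t nlb)
{
	assert(nlb > 0);
	return nlb <= (uint32_t)h->awun + 1;
}

/* True if the starting LBA sits on the preferred write alignment. */
bool aligned_to_npwa(const struct nvme_write_hints *h, uint64_t slba)
{
	return slba % ((uint64_t)h->npwa + 1) == 0;
}
```

With 512-byte logical blocks, awun = 31 would describe a 16k atomic unit and
npwa = 7 a 4k preferred alignment. The two values are independent, which is
the point: the spec does not tie one to the other.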

> All that is necessary are some changes in the block
> layer so that the kernel does not inadvertently tear a write request
> when splitting a bio because it is too large (perhaps because it got
> merged with some other request, and then it gets split at an
> inconvenient boundary).

All the limits needed to optimally split on physical boundaries exist, so I
hope we're using them correctly via get_max_io_size().
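
For illustration only, the idea of splitting on a physical boundary can be
sketched as below. This is not the kernel's get_max_io_size(); the function
name and parameters are hypothetical, all sizes are in 512-byte sectors, and
pbs (the physical block size) is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many sectors the first chunk of an I/O should cover so that
 * the split point lands on a physical block boundary where possible.
 *
 *   start       - first sector of the I/O
 *   len         - total length of the I/O in sectors
 *   max_sectors - largest chunk the device accepts, in sectors
 *   pbs         - physical block size in sectors, a power of two
 */
uint32_t first_chunk_sectors(uint64_t start, uint32_t len,
			     uint32_t max_sectors, uint32_t pbs)
{
	uint64_t offset, end;

	assert(pbs != 0 && (pbs & (pbs - 1)) == 0);
	assert(max_sectors != 0);

	if (len <= max_sectors)
		return len;		/* fits in one chunk, no split */

	/* Offset of the start within its physical block. */
	offset = start & (pbs - 1);
	/* Furthest allowed end, rounded down to a physical block boundary. */
	end = (offset + max_sectors) & ~(uint64_t)(pbs - 1);

	if (end > offset)
		return (uint32_t)(end - offset);

	/* The limit does not reach the next boundary: cannot align. */
	return max_sectors;
}
```

For example, with a 16k physical block (pbs = 32), an I/O starting at sector 8
with a 100-sector limit gets a first chunk of 88 sectors, so the next chunk
begins at sector 96, which is on a physical block boundary.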

That said, I was hoping you were going to suggest supporting 16k logical block
sizes. That's not a problem on some architectures, but it's still problematic
when PAGE_SIZE is 4k. :)


