linux-fsdevel.vger.kernel.org archive mirror
From: Pankaj Raghav <p.raghav@samsung.com>
To: <hare@suse.de>, <willy@infradead.org>, <david@fromorbit.com>
Cc: <gost.dev@samsung.com>, <mcgrof@kernel.org>, <hch@lst.de>,
	<jwong@kernel.org>, <linux-fsdevel@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	Pankaj Raghav <p.raghav@samsung.com>
Subject: [RFC 0/4] minimum folio order support in filemap
Date: Wed, 21 Jun 2023 10:38:19 +0200
Message-ID: <20230621083823.1724337-1-p.raghav@samsung.com>
In-Reply-To: CGME20230621083825eucas1p1b05a6d7e0bf90e7a3d8e621f6578ff0a@eucas1p1.samsung.com

There has been a lot of discussion recently about supporting devices and
filesystems with a block size larger than the page size (bs > ps). One of
the main pieces of plumbing needed for buffered IO in that case is a minimum
order when allocating folios in the page cache.
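
As a rough illustration of what "minimum order" means here, the sketch
below computes the smallest folio order that covers one block. It is a
userspace approximation, not code from this series: the ex_ names are
hypothetical and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages, i.e. a page shift of 12. */
#define EX_PAGE_SHIFT 12u

/* Floor of log2(v); v must be non-zero. Hypothetical helper. */
static unsigned int ex_ilog2(size_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Smallest folio order whose size covers one block.
 * block_size must be a power of two. Blocks that fit in a
 * single page need no minimum, so the result is order 0.
 */
static unsigned int ex_min_folio_order(size_t block_size)
{
	unsigned int blkbits = ex_ilog2(block_size);

	return blkbits > EX_PAGE_SHIFT ? blkbits - EX_PAGE_SHIFT : 0;
}
```

Under these assumptions, a 16 KiB logical block size gives a minimum order
of 2 (four 4 KiB pages per folio), and any block size of 4 KiB or less gives
order 0.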

Hannes recently sent a series[1] in which he deduces the minimum folio
order from i_blkbits in struct inode. This series takes a different
approach, based on the discussion in that thread, in which the minimum and
maximum folio order can be set individually per inode.
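
The idea of per-inode limits can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions: the struct
and the ex_ helpers are hypothetical and are not the interface added by
this series. It shows limits being recorded per mapping, a requested
allocation order being clamped into that range, and a page index being
rounded down to a minimum-order boundary, since folios in the page cache
are naturally aligned.

```c
#include <assert.h>

/* Hypothetical per-inode (per-mapping) folio order limits. */
struct ex_mapping {
	unsigned int min_order;
	unsigned int max_order;
};

/* Record the limits; returns -1 if the range is invalid, 0 otherwise. */
static int ex_set_folio_orders(struct ex_mapping *m,
			       unsigned int min, unsigned int max)
{
	if (min > max)
		return -1;
	m->min_order = min;
	m->max_order = max;
	return 0;
}

/* Clamp a requested allocation order into [min_order, max_order]. */
static unsigned int ex_clamp_order(const struct ex_mapping *m,
				   unsigned int order)
{
	if (order < m->min_order)
		return m->min_order;
	if (order > m->max_order)
		return m->max_order;
	return order;
}

/* Round a page index down to the start of a min_order-sized folio. */
static unsigned long ex_align_index(const struct ex_mapping *m,
				    unsigned long index)
{
	return index & ~((1UL << m->min_order) - 1);
}
```

With min_order = 2 and max_order = 9, a request for order 0 is raised to
order 2, a request for order 12 is capped at 9, and page index 7 is rounded
down to 4.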

This series is based on top of Christoph's patches to have iomap aops
for the block cache[2]. I rebased his remaining patches to
next-20230621. The whole tree can be found here[3].

After compiling the tree with CONFIG_BUFFER_HEAD=n, I am able to do buffered
IO on an NVMe drive with bs>ps in QEMU without any issues:

[root@archlinux ~]# cat /sys/block/nvme0n2/queue/logical_block_size
16384
[root@archlinux ~]# fio -bs=16k -iodepth=8 -rw=write -ioengine=io_uring -size=500M \
		    -name=io_uring_1 -filename=/dev/nvme0n2 -verify=md5
io_uring_1: (g=0): rw=write, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=io_uring, iodepth=8
fio-3.34
Starting 1 process
Jobs: 1 (f=1): [V(1)][100.0%][r=336MiB/s][r=21.5k IOPS][eta 00m:00s]
io_uring_1: (groupid=0, jobs=1): err= 0: pid=285: Wed Jun 21 07:58:29 2023
  read: IOPS=27.3k, BW=426MiB/s (447MB/s)(500MiB/1174msec)
  <snip>
Run status group 0 (all jobs):
   READ: bw=426MiB/s (447MB/s), 426MiB/s-426MiB/s (447MB/s-447MB/s), io=500MiB (524MB), run=1174-1174msec
  WRITE: bw=198MiB/s (207MB/s), 198MiB/s-198MiB/s (207MB/s-207MB/s), io=500MiB (524MB), run=2527-2527msec

Disk stats (read/write):
  nvme0n2: ios=35614/4297, merge=0/0, ticks=11283/1441, in_queue=12725, util=96.27%

One of the main dependencies for working with a block device with bs>ps is
Christoph's work on converting the block device aops to use iomap.

[1] https://lwn.net/Articles/934651/
[2] https://lwn.net/ml/linux-kernel/20230424054926.26927-1-hch@lst.de/
[3] https://github.com/Panky-codes/linux/tree/next-20230523-filemap-order-generic-v1

Luis Chamberlain (1):
  block: set mapping order for the block cache in set_init_blocksize

Matthew Wilcox (Oracle) (1):
  fs: Allow fine-grained control of folio sizes

Pankaj Raghav (2):
  filemap: use minimum order while allocating folios
  nvme: enable logical block size > PAGE_SIZE

 block/bdev.c             |  9 ++++++++
 drivers/nvme/host/core.c |  2 +-
 include/linux/pagemap.h  | 46 ++++++++++++++++++++++++++++++++++++----
 mm/filemap.c             |  9 +++++---
 mm/readahead.c           | 34 ++++++++++++++++++++---------
 5 files changed, 82 insertions(+), 18 deletions(-)

-- 
2.39.2



Thread overview: 22+ messages
     [not found] <CGME20230621083825eucas1p1b05a6d7e0bf90e7a3d8e621f6578ff0a@eucas1p1.samsung.com>
2023-06-21  8:38 ` Pankaj Raghav [this message]
     [not found]   ` <CGME20230621083826eucas1p11fc8d3e023caafa8b30fd04c66c9c7d0@eucas1p1.samsung.com>
2023-06-21  8:38     ` [RFC 1/4] fs: Allow fine-grained control of folio sizes Pankaj Raghav
2023-06-21  9:02       ` Hannes Reinecke
     [not found]   ` <CGME20230621083827eucas1p2948b4efaf55064c3761c924b5b049219@eucas1p2.samsung.com>
2023-06-21  8:38     ` [RFC 2/4] filemap: use minimum order while allocating folios Pankaj Raghav
2023-06-21  8:59       ` Hannes Reinecke
2023-06-21 10:25         ` Pankaj Raghav
     [not found]   ` <CGME20230621083828eucas1p23222cae535297f9536f12dddd485f97b@eucas1p2.samsung.com>
2023-06-21  8:38     ` [RFC 3/4] block: set mapping order for the block cache in set_init_blocksize Pankaj Raghav
2023-06-21  9:05       ` Hannes Reinecke
2023-06-21 10:42         ` Pankaj Raghav
2023-06-21 11:02           ` Hannes Reinecke
2023-06-21 12:02             ` Pankaj Raghav
2023-06-24  8:35       ` kernel test robot
     [not found]   ` <CGME20230621083830eucas1p1c7e6ea9e23949a9688aac6f9f3ea25fb@eucas1p1.samsung.com>
2023-06-21  8:38     ` [RFC 4/4] nvme: enable logical block size > PAGE_SIZE Pankaj Raghav
2023-06-21  9:07       ` Hannes Reinecke
2023-06-21 10:47         ` Pankaj Raghav
2023-06-21  9:00   ` [RFC 0/4] minimum folio order support in filemap Hannes Reinecke
2023-06-21 22:07     ` Dave Chinner
2023-06-22  5:51       ` Hannes Reinecke
2023-06-22  6:50         ` Hannes Reinecke
2023-06-22 10:20           ` Dave Chinner
2023-06-22 10:23             ` Hannes Reinecke
2023-06-22 22:33               ` Dave Chinner
