From: Bart Van Assche <bart.vanassche@sandisk.com>
To: Damien Le Moal <Damien.LeMoal@hgst.com>,
"lsf-pc@lists.linuxfoundation.org"
<lsf-pc@lists.linuxfoundation.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
Matias Bjorling <m@bjorling.me>
Subject: Re: [LSF/MM ATTEND] Online Logical Head Depop and SMR disks chunked writepages
Date: Mon, 22 Feb 2016 19:56:40 -0800
Message-ID: <56CBD878.4070600@sandisk.com>
In-Reply-To: <05353ADC-601C-412D-80E2-1F1972324A37@hgst.com>
On 02/22/16 18:56, Damien Le Moal wrote:
> 2) Write back of dirty pages to SMR block devices:
>
> Dirty pages of a block device inode are currently processed using the
> generic_writepages function, which can be executed simultaneously
> by multiple contexts (e.g. sync, fsync, msync, sync_file_range, etc).
> Mutual exclusion of the dirty page processing being achieved only at
> the page level (page lock & page writeback flag), multiple processes
> executing a "sync" of overlapping block ranges over the same zone of
> an SMR disk can cause an out-of-LBA-order sequence of write requests
> being sent to the underlying device. On a host managed SMR disk, where
> sequential write to disk zones is mandatory, this results in errors and
> the impossibility for an application using raw sequential disk write
> accesses to be guaranteed successful completion of its write or fsync
> requests.
>
> Using the zone information attached to the SMR block device queue
> (introduced by Hannes), calls to the generic_writepages function can
> be made mutually exclusive on a per zone basis by locking the zones.
> This guarantees sequential request generation for each zone and avoids
> write errors without any modification to the generic code implementing
> generic_writepages.
>
> This is but one possible solution for supporting SMR host-managed
> devices without any major rewrite of page cache management and
> write-back processing. The opinion of the audience regarding this
> solution and discussing other potential solutions would be greatly
> appreciated.
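
The per-zone locking proposed in the quoted text can be illustrated with
a small user-space model. This is only a sketch under stated assumptions,
not kernel code: the names Zone, writeback_range and run_demo are
hypothetical, the dirty pages are modeled as a plain set of LBAs, and
Python threads stand in for the contexts that call sync or fsync at the
same time.

```python
import threading


class Zone:
    """Toy model of one sequential-write-required zone (hypothetical)."""

    def __init__(self, start, length):
        self.start = start
        self.end = start + length
        self.write_pointer = start      # next LBA the zone will accept
        self.lock = threading.Lock()    # per-zone write-back lock
        self.written = []               # LBAs in the order they were accepted
        self.errors = []                # LBAs rejected as out of order

    def write(self, lba):
        """Accept a write only if it lands exactly on the write pointer."""
        if lba != self.write_pointer or lba >= self.end:
            self.errors.append(lba)
            return False
        self.written.append(lba)
        self.write_pointer += 1
        return True


def writeback_range(zone, dirty, first, last):
    """Write back dirty pages with first <= lba < last in LBA order.

    The whole pass runs with the zone lock held, so two contexts can
    never interleave their requests within the same zone.
    """
    with zone.lock:
        for lba in sorted(p for p in dirty if first <= p < last):
            if zone.write(lba):
                dirty.discard(lba)


def run_demo():
    """Two contexts sync overlapping ranges of the same zone."""
    zone = Zone(start=0, length=16)
    dirty = set(range(8))               # pages 0..7 were written sequentially
    ranges = [(0, 8), (0, 6)]           # overlapping sync requests
    threads = [
        threading.Thread(target=writeback_range, args=(zone, dirty, first, last))
        for first, last in ranges
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return zone, dirty
```

Whichever context takes the lock first, the modeled zone receives LBAs 0
through 7 in order and rejects nothing. Without the lock, the two passes
could interleave and issue a later LBA before an earlier one.
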
Hello Damien,

Is it sufficient to support filesystems like BTRFS on top of SMR drives,
or would you also like filesystems like ext4 to be able to use SMR
drives? In the latter case: the behavior of SMR drives differs so
significantly from that of other block devices that I'm not sure we
should try to support them directly from infrastructure like the page
cache. If we look at NAND SSDs, for example, the characteristics of NAND
do not match what filesystems expect (e.g. large erase blocks). That is
why every SSD vendor provides an FTL (Flash Translation Layer), either
inside the SSD or as a separate software driver. An FTL implements a
so-called LFS (log-structured filesystem). From what I know about SMR,
this technology also looks suitable for implementing an LFS. Has
implementing an LFS driver for SMR drives already been considered? That
would make it possible for any filesystem to access an SMR drive like any
other block device. I'm not sure about this, but it may be possible to
share some infrastructure with the LightNVM driver (directory
drivers/lightnvm in the Linux kernel tree), since that driver implements
an FTL.
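
To make the idea concrete, here is a minimal sketch of a log-structured
translation layer on top of append-only zones. It is an illustration
only and does not reflect the LightNVM code or any existing driver: the
names SequentialZone and LogStructuredLayer are hypothetical, blocks are
held in memory, and garbage collection of stale blocks is left out.

```python
class SequentialZone:
    """Append-only zone: data can only be added at the write pointer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []                # len(self.blocks) is the write pointer

    def is_full(self):
        return len(self.blocks) >= self.capacity

    def append(self, data):
        """Store data at the write pointer and return its offset."""
        if self.is_full():
            raise IOError("zone is full")
        self.blocks.append(data)
        return len(self.blocks) - 1


class LogStructuredLayer:
    """Presents random-write logical blocks on top of sequential zones."""

    def __init__(self, num_zones, zone_capacity):
        self.zones = [SequentialZone(zone_capacity) for _ in range(num_zones)]
        self.open_zone = 0              # index of the zone receiving appends
        self.mapping = {}               # logical block -> (zone index, offset)

    def write(self, lba, data):
        """Append data to the log and point the logical block at it."""
        if self.zones[self.open_zone].is_full():
            if self.open_zone + 1 == len(self.zones):
                # A real layer would reclaim zones holding stale blocks here.
                raise IOError("no free zone left (no garbage collection)")
            self.open_zone += 1
        offset = self.zones[self.open_zone].append(data)
        # An overwrite only updates the map; the old copy becomes stale.
        self.mapping[lba] = (self.open_zone, offset)

    def read(self, lba):
        """Follow the map to the most recent copy of the logical block."""
        zone_index, offset = self.mapping[lba]
        return self.zones[zone_index].blocks[offset]
```

A filesystem sitting on such a layer can write logical blocks in any
order, for example block 7, then block 3, then block 7 again, while each
zone underneath only ever sees appends at its write pointer.
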
Bart.