[LSF/MM ATTEND] Online Logical Head Depop and SMR disks chunked writepages

* [LSF/MM ATTEND] Online Logical Head Depop and SMR disks chunked writepages
@ 2016-02-23  2:56 Damien Le Moal
  2016-02-23  3:56 ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Damien Le Moal @ 2016-02-23  2:56 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-block, linux-scsi

Hello,

I would like to attend LSF/MM 2016 to discuss the following topics.

1) Online Logical Head Depop

Some disk drives available on the market already provide a "logical
depop" function which allows a system to decommission a defective
disk head, reformat the disk and continue using this same disk with
a reduced capacity. Such feature can allow reduced operation costs
(delayed HDD replacement) but has the drawback of a data loss (data
under the remaining valid heads) and disk downtime during re-formating.

Online logical head depop is a proposed new feature allowing retaining
the disk valid data and eliminating the need for a disk re-format.
The basic idea is to introduce new commands for the host to discover
the ranges of LBAs impacted by a defective head. Using this information,
the host can take actions when a disk head failure event is suspected
or reported:
(a) The impacted LBAs can be depopulated, resulting in the disk
operating as a “thin provisioned” device.
(b) The impacted LBAs can be amputated, resulting in error for all
subsequent accesses to the LBAs under the defective head.
(c) Optionally, a host may decide to reformat (compact) the disk to
restore operation as a fully-provisioned device with a lower capacity.

The goal of the discussion would be to gather the opinion of the
developers for drafting a command standard minimizing the impact of this
feature on the block I/O stack as well as allowing a simple use of this
feature by file systems and device mapper drivers (including logical
volume manager).

2) Write back of dirty pages to SMR block devices:

Dirty pages of a block device inode are currently processed using the
generic_writepages function, which can be executed simultaneously
by multiple contexts (e.g sync, fsync, msync, sync_file_range, etc).
Mutual exclusion of the dirty page processing being achieved only at
the page level (page lock & page writeback flag), multiple processes
executing a "sync" of overlapping block ranges over the same zone of
an SMR disk can cause an out-of-LBA-order sequence of write requests
being sent to the underlying device. On a host managed SMR disk, where
sequential write to disk zones is mandatory, this result in errors and
the impossibility for an application using raw sequential disk write
accesses to be guaranteed successful completion of its write or fsync
requests.

Using the zone information attached to the SMR block device queue
(introduced by Hannes), calls to the generic_writepages function can
be made mutually exclusive on a per zone basis by locking the zones.
This guarantees sequential request generation for each zone and avoid
write errors without any modification to the generic code implementing
generic_writepages.

This is but one possible solution for supporting SMR host-managed
devices without any major rewrite of page cache management and
write-back processing. The opinion of the audience regarding this
solution and discussing other potential solutions would be greatly
appreciated.

Thank you.

Best regards.

------------------------
Damien Le Moal, Ph.D.
Sr. Manager, System Software Group, HGST Research,
HGST, a Western Digital company
Damien.LeMoal@hgst.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa, 
Kanagawa, 252-0888 Japan
www.hgst.com 
Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

^ permalink raw reply	[flat|nested] 13+ messages in thread