linux-kernel.vger.kernel.org archive mirror
From: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
To: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Simon Arlott <simon@octiron.net>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH] scsi: sd: stop SSD (non-rotational) disks before reboot
Date: Tue, 23 Jun 2020 17:42:34 -0300
Message-ID: <20200623204234.GA16156@khazad-dum.debian.net>
In-Reply-To: <CY4PR04MB37511505492E9EC6A245CFB1E79B0@CY4PR04MB3751.namprd04.prod.outlook.com>

On Thu, 18 Jun 2020, Damien Le Moal wrote:
> Are you experiencing data loss or corruption ? If yes, since a clean reboot or
> shutdown issues a synchronize cache to all devices, a corruption would mean that
> your SSD is probably not correctly processing flush cache commands.

Cache flushes do not matter that much when SSDs and sudden power cuts
are involved.  Power cuts at the wrong time harm the FLASH itself; it is
not about still-in-flight data.

Keep in mind that SSDs do a _lot_ of background writing, and power cuts
during a FLASH write or erase can cause anything from weakened cells to
much larger damage.  It is possible to harden the chip or the design
against this, but it is *expensive*.  And even if hardening wards off
FLASH damage, an erase/program cycle must still be done on the whole
erase block to clean up the incomplete program cycle.

Due to this background activity, an unexpected power cut could damage
data *anywhere* in an SSD: it could hit some filesystem area that was
being scrubbed in background by the SSD, or internal SSD metadata.

So, you want that SSD to know it must be quiescent-for-poweroff for
*real* before you allow the system to do anything that could power it
off.

And, as I found out the hard way years ago, you also want to give
the SSD enough *extra* time to actually quiesce, even if it claims to be
already prepared for poweroff [1].

When you do not follow these rules, well, excellent datacenter-class
SSDs have super-capacitor power banks that actually work.  Most SSDs do
not, although they have hopefully come a long way, and modern SSDs are
hopefully not as easy to brick as they were reported to be three or
four years ago.


[1] I have long lost the will and energy to pursue this, so *this* is a
throw-away anecdote for anyone who cares: I reported here a few years
ago that many models of *SATA*-based SSDs from Crucial/Micron, Samsung
and Intel were complaining (through their SMART attributes) that Linux
was causing unsafe shutdowns.

https://lkml.org/lkml/2017/4/10/1181

TL;DR: wait one *extra* second after the SSD has acknowledged the STOP
command as complete before you trust that the SSD is safe to be
powered down (i.e. before reboot, suspend, poweroff/shutdown, and device
removal/detach).  This worked around the issue for every vendor and
model of SSD we tested.
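
As a rough illustration only (this is not kernel code, and not the
workaround as it was actually implemented), the ordering described
above could be sketched in userspace Python as follows.  The names
stop_and_settle, send_stop and EXTRA_QUIESCE_SECONDS are hypothetical;
send_stop stands in for whatever actually issues the STOP command and
reports whether the device acknowledged it.

```python
import time
from typing import Callable

# Assumption: the one *extra* second mentioned in the TL;DR.
EXTRA_QUIESCE_SECONDS = 1.0


def stop_and_settle(send_stop: Callable[[], bool],
                    extra_delay: float = EXTRA_QUIESCE_SECONDS,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Issue STOP, then wait extra_delay before reporting "safe".

    send_stop is a hypothetical callable returning True once the device
    has acknowledged the STOP command as complete.
    """
    if not send_stop():
        # STOP was not acknowledged: the device is not known to be
        # quiescent, so do not claim it is safe to power down.
        return False
    # The acknowledgement alone is not trusted; give the SSD extra time
    # to actually quiesce before anything can cut its power.
    sleep(extra_delay)
    return True
```

The sleep function is a parameter only so that the ordering can be
checked without really waiting; a caller would proceed to reboot,
suspend or poweroff only after this returns True.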

-- 
  Henrique Holschuh
