From: Pavel Machek <pavel@ucw.cz>
To: Tejun Heo <tj@kernel.org>
Cc: boris.brezillon@free-electrons.com, linux-scsi@vger.kernel.org, Hans de Goede <hdegoede@redhat.com>, linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org, linux-mtd@lists.infradead.org, Henrique de Moraes Holschuh <hmh@hmh.eng.br>, dwmw2@infradead.org
Subject: Re: Race to power off harming SATA SSDs
Date: Sun, 7 May 2017 22:40:07 +0200
Message-ID: <20170507204007.GA25628@atrey.karlin.mff.cuni.cz>
In-Reply-To: <20170410235206.GA28603@wtj.duckdns.org>

Hi!

> > However, *IN PRACTICE*, SATA STANDBY IMMEDIATE command completion
> > [often?] only indicates that the device is now switching to the target
> > power management state, not that it has reached the target state. Any
> > further device status inquires would return that it is in STANDBY mode,
> > even if it is still entering that state.
> >
> > The kernel then continues the shutdown path while the SSD is still
> > preparing itself to be powered off, and it becomes a race. When the
> > kernel + firmware wins, platform power is cut before the SSD has
> > finished (i.e. the SSD is subject to an unclean power-off).
>
> At that point, the device is fully flushed and in terms of data
> integrity should be fine with losing power at any point anyway.

Actually, no, that is not how it works. "Fully flushed" is one thing,
surviving power loss is different. Explanation below.

> > NOTE: unclean SSD power-offs are dangerous and may brick the device in
> > the worst case, or otherwise harm it (reduce longevity, damage flash
> > blocks). It is also not impossible to get data corruption.
>
> I get that the incrementing counters might not be pretty but I'm a bit
> skeptical about this being an actual issue. Because if that were
> true, the device would be bricking itself from any sort of power
> losses be that an actual power loss, battery rundown or hard power off
> after crash.

And that's exactly what users see.
If you do enough power fails on an SSD, you usually brick it; some die
sooner than others. Some test results have been published, some of
them are here: http://lkcl.net/reports/ssd_analysis.html, and I believe
I have seen some others too.

It is very hard for NAND to work reliably in the face of power
failures. In fact, not even Linux MTD + UBIFS works well in that
regard. See http://www.linux-mtd.infradead.org/faq/ubi.html.
(Unfortunately, it's down now?!)

If we can't get it right, do you believe SSD manufacturers do?

[The issue is, if you power down during an erase, you get a "weakly
erased" page, which will contain the expected 0xff's, but you'll get
bitflips there quickly. A similar issue exists for writes. It is
solvable in software, just hard and slow... and we don't do it.]

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
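
The "weakly erased" page problem from the bracketed note can be illustrated with a toy model. This is only a sketch under loud assumptions: the threshold, the drift value and all function names are invented for illustration and do not describe any real NAND part, SSD firmware, or the MTD/UBI code. It just shows why a page that reads back as 0xff right after an interrupted erase may still lose bits shortly afterwards.

```python
# Toy model only: all numbers and names below are hypothetical.
READ_THRESHOLD = 50   # a cell at or above this margin reads as an erased bit (1)
FULL_ERASE = 100      # margin reached by an erase that ran to completion
DRIFT = 10            # assumed margin lost over time (retention, read disturb)


def erase(cells, steps_done, steps_total=10):
    """Erase 8 cells; steps_done < steps_total models power loss mid-erase."""
    level = FULL_ERASE * steps_done // steps_total
    return [max(c, level) for c in cells]


def age(cells):
    """Apply the assumed drift to every cell."""
    return [c - DRIFT for c in cells]


def read_byte(cells):
    """Read 8 cells as one byte; a cell reads 1 if it is at or above threshold."""
    value = 0
    for c in cells:
        value = (value << 1) | (1 if c >= READ_THRESHOLD else 0)
    return value


programmed = [0] * 8                      # page holding 0x00 before the erase
weak = erase(programmed, steps_done=5)    # power cut halfway through
full = erase(programmed, steps_done=10)   # erase allowed to finish

print(hex(read_byte(weak)), hex(read_byte(full)))            # both look erased
print(hex(read_byte(age(weak))), hex(read_byte(age(full))))  # only one still does
```

In this model the interrupted erase is indistinguishable from a complete one at read time, which is why simply checking for 0xff after power-up is not enough.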