linux-hwmon.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
To: Jean Delvare <jdelvare@suse.com>, Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drives
Date: Fri, 10 Jul 2020 20:10:03 +0200	[thread overview]
Message-ID: <42108b47d0e3d64c6d36618425c9f920ff469600.1594404501.git.mail@maciej.szmigiero.name> (raw)

It has been observed that Toshiba DT01ACA family drives have
WRITE FPDMA QUEUED command timeouts and sometimes just freeze until
power-cycled under heavy write loads when their temperature is getting
polled in SCT mode. The SMART mode seems to be fine, though.

Let's make sure we don't use SCT mode for these drives then.

While only the 3 TB model was actually caught exhibiting the problem let's
play safe here to avoid data corruption and extend the ban to the whole
family.

Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
---
Sending again since the previous message bounced for most recipients.

Notes:
    This behavior was observed on two different DT01ACA3 drives.
    
    Usually, a series of queued WRITE FPDMA QUEUED commands just time out,
    but sometimes the whole drive freezes. Merely disconnecting and
    reconnecting SATA interface cable then does not unfreeze the drive.
    
    One has to disconnect and reconnect the drive power connector for the
    drive to be detected again (suggesting the drive firmware itself has
    crashed).
    
    This only happens when the drive temperature is polled very often (like
    every second), so occasional SCT usage via smartmontools is probably
    safe.

 drivers/hwmon/drivetemp.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c
index 0d4f3d97ffc6..4fd51fa8c6e3 100644
--- a/drivers/hwmon/drivetemp.c
+++ b/drivers/hwmon/drivetemp.c
@@ -285,6 +285,36 @@ static int drivetemp_get_scttemp(struct drivetemp_data *st, u32 attr, long *val)
 	return err;
 }
 
+static const char * const sct_blacklist_models[] = {
+/*
+ * These drives will have WRITE FPDMA QUEUED command timeouts and sometimes just
+ * freeze until power-cycled under heavy write loads when their temperature is
+ * getting polled in SCT mode. The SMART mode seems to be fine, though.
+ *
+ * While only the 3 TB model was actually caught exhibiting the problem
+ * let's play safe here to avoid data corruption and ban the whole family.
+ */
+	"TOSHIBA DT01ACA0",
+	"TOSHIBA DT01ACA1",
+	"TOSHIBA DT01ACA2",
+	"TOSHIBA DT01ACA3",
+};
+
+static bool drivetemp_sct_blacklisted(struct drivetemp_data *st)
+{
+	struct scsi_device *sdev = st->sdev;
+	unsigned int ctr;
+
+	if (!sdev->model)
+		return false;
+
+	for (ctr = 0; ctr < ARRAY_SIZE(sct_blacklist_models); ctr++)
+		if (strncmp(sdev->model, sct_blacklist_models[ctr], 16) == 0)
+			return true;
+
+	return false;
+}
+
 static int drivetemp_identify_sata(struct drivetemp_data *st)
 {
 	struct scsi_device *sdev = st->sdev;
@@ -326,6 +356,13 @@ static int drivetemp_identify_sata(struct drivetemp_data *st)
 	/* bail out if this is not a SATA device */
 	if (!is_ata || !is_sata)
 		return -ENODEV;
+
+	if (have_sct && drivetemp_sct_blacklisted(st)) {
+		dev_notice(&sdev->sdev_gendev,
+			   "will avoid using SCT for temperature monitoring\n");
+		have_sct = false;
+	}
+
 	if (!have_sct)
 		goto skip_sct;
 

             reply	other threads:[~2020-07-10 18:31 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-10 18:10 Maciej S. Szmigiero [this message]
2020-07-11 18:19 ` [PATCH] hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drives Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42108b47d0e3d64c6d36618425c9f920ff469600.1594404501.git.mail@maciej.szmigiero.name \
    --to=mail@maciej.szmigiero.name \
    --cc=jdelvare@suse.com \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).