Linux-ide Archive on lore.kernel.org
* Questions (and a possible bug) regarding the ata_device_blacklist and ATA_HORKAGE_ZERO_AFTER_TRIM
@ 2019-09-26 13:04 Stefan Tauner
  2019-09-26 22:01 ` Martin K. Petersen
  0 siblings, 1 reply; 2+ messages in thread
From: Stefan Tauner @ 2019-09-26 13:04 UTC
  To: linux-ide

Hi,

I am running an MD RAID5 with 3 SSDs (2x Crucial CT500MX500 (FW
M3CR022), 1x Samsung 860 EVO (because it was/is the only decent SSD
with msata; FW RVT41B6Q)) on Debian Buster (with an ancient^Wstable
4.19 kernel). Before this setup I was using a RAID1 and weekly fstrim
runs to issue discards.

Today I became aware of the RAID5 discard issue that led to trim being
disabled on such arrays:
https://github.com/torvalds/linux/commit/8e0e99ba64c7ba46133a7c8a3e3f7de01f23bd93

While researching whether my drives handle this correctly and whether
I should enable it, I came across the black (actually white) list in
the libata code. My question concerns how the model_num field in the
black list is matched. The name and the use of ATA_ID_PROD seem to
indicate that a single ATA "property" is used as the string to match.
I don't know the technical details of how the drive communicates this,
but I assume it's the same thing that hdparm and smartctl output as
"Model Number" and "Device Model" respectively.

If this is correct (is it?) then there is a problem with the list
AFAICT because the Crucial SSD I have reports this field simply as
"CT500MX500SSD4" but the kernel expects "Crucial" at the beginning of
almost all Crucial drives (line 4523+) including the vendor wildcard at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c#n4586
Interestingly, in line 4520 there is an entry for the CT500BX100SSD1
that does not start with "Crucial".

After looking into smartctl's drive database I guess the MX500 [2] (as
well as BX100, BX200, BX300 and BX500 [1]) series stand out in this
regard. This means that all of them do *not* get the
ATA_HORKAGE_ZERO_AFTER_TRIM flag set because they are not matched by
any of the model-specific entries nor the cumulative "Crucial*" vendor
entry.
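
The mismatch described above can be reproduced outside the kernel. The
following is a minimal sketch, assuming the list is matched with
shell-style glob patterns, for which Python's fnmatchcase() is only a
stand-in. The two patterns are the ones quoted in this thread, not the
full table, and their flags are left out. "Crucial_CT512MX100SSD1" is a
hypothetical model string used only to show a positive vendor match.

```python
from fnmatch import fnmatchcase

# Illustrative subset: only the two patterns quoted in this thread,
# not the full table from drivers/ata/libata-core.c. Flags are omitted.
PATTERNS = [
    "CT500BX100SSD1",  # model-specific entry without the vendor prefix
    "Crucial*",        # cumulative vendor wildcard
]


def first_match(model_num, patterns=PATTERNS):
    """Return the first pattern that matches model_num, or None."""
    for pattern in patterns:
        # Case-sensitive glob match; a stand-in for the kernel's matcher.
        if fnmatchcase(model_num, pattern):
            return pattern
    return None


if __name__ == "__main__":
    # The last model string is hypothetical and only illustrates a hit.
    for model in ("CT500MX500SSD4", "CT500BX100SSD1", "Crucial_CT512MX100SSD1"):
        print(model, "->", first_match(model))
```

With these two patterns, a model string that lacks the "Crucial" prefix
only matches if it has its own exact entry.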

I have not tested whether my drive actually returns zeros after
trimming, but from the kernel code I would assume that the intent is to
match all Crucial SSDs and thus it is a bug that mine is not matched.
If someone tells me the preferred method to test it I am happy to do
this. If need be I can also submit a patch (just for MX500? all of the
above?).

Is there any way to see which flags the kernel applies to a drive?
Interestingly, "lsblk -D" only shows "0" for the Samsung device
(although AFAICT it is matched by the white list AND reports
"Deterministic read ZEROs after TRIM" according to hdparm). But I don't
know what lsblk actually looks at...?

[1] https://www.smartmontools.org/changeset/4776
[2] https://www.smartmontools.org/browser/trunk/smartmontools/drivedb.h?#L1906

(Please CC since I am not subscribed)

KR
-- 
Dipl.-Ing. Stefan Tauner
Lecturer and former researcher
Embedded Systems Department

University of Applied Sciences Technikum Wien
Hoechstaedtplatz 6, 1200 Vienna, Austria
E: stefan.tauner@technikum-wien.at
I: embsys.technikum-wien.at
I: www.technikum-wien.at


* Re: Questions (and a possible bug) regarding the ata_device_blacklist and ATA_HORKAGE_ZERO_AFTER_TRIM
  2019-09-26 13:04 Questions (and a possible bug) regarding the ata_device_blacklist and ATA_HORKAGE_ZERO_AFTER_TRIM Stefan Tauner
@ 2019-09-26 22:01 ` Martin K. Petersen
  0 siblings, 0 replies; 2+ messages in thread
From: Martin K. Petersen @ 2019-09-26 22:01 UTC
  To: Stefan Tauner; +Cc: linux-ide


Stefan,

> I don't know the technical details of how the drive communicates this,
> but I assume it's the same thing that hdparm and smartctl output as
> "Model Number" and "Device Model" respectively.

Yes.

> If this is correct (is it?) then there is a problem with the list
> AFAICT because the Crucial SSD I have reports this field simply as
> "CT500MX500SSD4" but the kernel expects "Crucial" at the beginning of
> almost all Crucial drives (line 4523+) including the vendor wildcard at
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c#n4586
> Interestingly, in line 4520 there is an entry for the CT500BX100SSD1
> that does not start with "Crucial".

With a few exceptions, the entries in the libata white/blacklist were
submitted by Crucial/Micron themselves. But it's possible that they
changed their naming scheme.

> After looking into smartctl's drive database I guess the MX500 [2] (as
> well as BX100, BX200, BX300 and BX500 [1]) series stand out in this
> regard. This means that all of them do *not* get the
> ATA_HORKAGE_ZERO_AFTER_TRIM flag set because they are not matched by
> any of the model-specific entries nor the cumulative "Crucial*" vendor
> entry.

The newest drives I have are M550 models.

> I have not tested whether my drive actually returns zeros after
> trimming, but from the kernel code I would assume that the intent is to
> match all Crucial SSDs and thus it is a bug that mine is not matched.
> If someone tells me the preferred method to test it I am happy to do
> this. If need be I can also submit a patch (just for MX500? all of the
> above?).

There's no way to exhaustively test. Many drives will return zeroes most
of the time but can have corner conditions that cause them to ignore
TRIM commands.

The ones we whitelisted were added as a result of feedback from the
vendors themselves (thanks to an advertised qualification for use with
hardware RAID5 controllers). As you know, there is no way for a drive to
express this capability/guarantee in the ATA protocol.
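
A spot check can still falsify the behaviour for one range, even though
it cannot prove the guarantee. The sketch below covers only the
read-back half: it reports whether a byte range reads as all zeros. The
helper name range_is_zero and its parameters are hypothetical. Issuing
the discard itself is destructive and deliberately left out.

```python
def range_is_zero(path, offset, length, chunk_size=1 << 20):
    """Return True if `length` bytes at `offset` in `path` are all zero.

    Hypothetical helper for a spot check. A True result says nothing
    about other ranges or about later reads of the same range.
    """
    with open(path, "rb") as fh:
        fh.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = fh.read(min(chunk_size, remaining))
            if not chunk:
                # Short read: the requested range is not fully available.
                return False
            if chunk.count(0) != len(chunk):
                # At least one non-zero byte in this chunk.
                return False
            remaining -= len(chunk)
    return True
```

When this is pointed at a block device, reads may be served from the
page cache rather than from the drive, so caches would need to be
dropped (or the device opened with direct I/O) before the result means
anything.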

> Is there any way to see which flags the kernel applies to a drive?

# grep . /sys/class/ata_device/*/trim
/sys/class/ata_device/dev1.0/trim:unqueued
/sys/class/ata_device/dev2.0/trim:queued
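
The same lookup can be scripted. This is a minimal sketch that reads the
trim attribute shown in the grep output above. The function name
read_trim_modes and the sysfs_root parameter are hypothetical; the
parameter exists only so the function can be pointed at a test
directory. Note that this attribute reports the TRIM mode, not the full
set of flags.

```python
import glob
import os


def read_trim_modes(sysfs_root="/sys"):
    """Map each ATA device name (e.g. 'dev1.0') to its trim attribute."""
    modes = {}
    pattern = os.path.join(sysfs_root, "class", "ata_device", "*", "trim")
    for attr in sorted(glob.glob(pattern)):
        # The directory containing the attribute is named after the device.
        dev = os.path.basename(os.path.dirname(attr))
        with open(attr) as fh:
            modes[dev] = fh.read().strip()
    return modes


if __name__ == "__main__":
    for dev, mode in read_trim_modes().items():
        print(f"{dev}: {mode}")
```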

> Interestingly, "lsblk -D" only shows "0" for the Samsung device
> (although AFAICT it is matched by the white list AND reports
> "Deterministic read ZEROs after TRIM" according to hdparm). But I don't
> know what lsblk actually looks at...?

lsblk looks at /sys/block/*/queue/discard*

You get "0" for the discard granularity on the Samsung?
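
To see the raw values behind the lsblk columns, the discard attributes
can be read directly. A minimal sketch follows. The attribute names
discard_granularity, discard_max_bytes and discard_zeroes_data are
assumed from the block layer's sysfs documentation, and
read_discard_limits with its sysfs_root parameter is a hypothetical
helper. As far as the kernel documentation goes, discard_zeroes_data is
deprecated and always reads 0 on recent kernels, which may be where a
"0" in the DISC-ZERO column comes from; treat that as an assumption to
verify on the system in question.

```python
import glob
import os

# Assumed attribute names under /sys/block/<dev>/queue/.
DISCARD_ATTRS = ("discard_granularity", "discard_max_bytes", "discard_zeroes_data")


def read_discard_limits(sysfs_root="/sys"):
    """Map each block device to the discard attributes found for it."""
    limits = {}
    pattern = os.path.join(sysfs_root, "block", "*", "queue")
    for queue in sorted(glob.glob(pattern)):
        dev = os.path.basename(os.path.dirname(queue))
        values = {}
        for name in DISCARD_ATTRS:
            path = os.path.join(queue, name)
            # Skip attributes that this kernel does not expose.
            if os.path.exists(path):
                with open(path) as fh:
                    values[name] = int(fh.read().strip())
        limits[dev] = values
    return limits


if __name__ == "__main__":
    for dev, values in read_discard_limits().items():
        print(dev, values)
```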

-- 
Martin K. Petersen	Oracle Linux Engineering

