failed command: WRITE FPDMA QUEUED with Samsung 860 EVO

* failed command: WRITE FPDMA QUEUED with Samsung 860 EVO
@ 2019-01-02 15:25 Sitsofe Wheeler
  2019-01-02 15:29 ` Sitsofe Wheeler
  0 siblings, 1 reply; 13+ messages in thread
From: Sitsofe Wheeler @ 2019-01-02 15:25 UTC (permalink / raw)
  To: linux-block

Hi,

I recently purchased a SATA Samsung 860 EVO SSD and put it in an old
HP microserver (which has an AMD N36L). By default, when the disk load
becomes a little heavy e.g. by running a job like

fio --name=test --readonly --rw=randread --filename /dev/sdb --bs=32k \
    --ioengine=libaio --iodepth=32 --direct=1 --runtime=10m --time_based=1

the kernel starts repeatedly producing error messages like:

[ 1177.729912] ata2.00: exception Emask 0x10 SAct 0x3c000 SErr 0x0
action 0x6 frozen
[ 1177.729931] ata2.00: irq_stat 0x08000000, interface fatal error
[ 1177.729943] ata2.00: failed command: WRITE FPDMA QUEUED
[ 1177.729962] ata2.00: cmd 61/80:70:80:50:e6/06:00:00:00:00/40 tag 14
ncq dma 851968 out
[ 1177.729962]          res 40/00:80:00:5a:e6/00:00:00:00:00/40 Emask
0x10 (ATA bus error)
[ 1177.729978] ata2.00: status: { DRDY }
[ 1177.729986] ata2.00: failed command: WRITE FPDMA QUEUED
[ 1177.730002] ata2.00: cmd 61/00:78:00:57:e6/03:00:00:00:00/40 tag 15
ncq dma 393216 out
[ 1177.730002]          res 40/00:80:00:5a:e6/00:00:00:00:00/40 Emask
0x10 (ATA bus error)
[ 1177.730017] ata2.00: status: { DRDY }
[ 1177.730024] ata2.00: failed command: WRITE FPDMA QUEUED
[ 1177.730039] ata2.00: cmd 61/00:80:00:5a:e6/05:00:00:00:00/40 tag 16
ncq dma 655360 out
[ 1177.730039]          res 40/00:80:00:5a:e6/00:00:00:00:00/40 Emask
0x10 (ATA bus error)
[ 1177.730053] ata2.00: status: { DRDY }
[ 1177.730060] ata2.00: failed command: WRITE FPDMA QUEUED
[ 1177.730078] ata2.00: cmd 61/00:88:00:5f:e6/01:00:00:00:00/40 tag 17
ncq dma 131072 out
[ 1177.730078]          res 40/00:80:00:5a:e6/00:00:00:00:00/40 Emask
0x10 (ATA bus error)
[ 1177.730096] ata2.00: status: { DRDY }
[ 1177.730108] ata2: hard resetting link
[ 1178.205831] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1178.206165] ata2.00: supports DRM functions and may not be fully accessible
[ 1178.209743] ata2.00: supports DRM functions and may not be fully accessible
[ 1178.212786] ata2.00: configured for UDMA/133
[ 1178.212826] ata2: EH complete
[ 1178.212988] ata2.00: Enabling discard_zeroes_data

I tried moving the SSD to another caddy and bay but the issue
persists. None of the regular hard disks (a Western Digital and a
Seagate) nor the other SSD (a Crucial MX500) already in the system
trigger the issue the Samsung 860 EVO does. Adding

libata.force=2.00:noncq

seems to make the issue go away but seemingly at some speed cost (at
least compared to what the MX500 achieves). The OS in use is Ubuntu
18.04 with a 4.15.0-43-generic kernel but even a 4.18.0-13-generic had
the same issue.

Is there anything software-wise that might need investigating that
would allow NCQ to work and a better speed to be reached?

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 13+ messages in thread