All of lore.kernel.org
 help / color / mirror / Atom feed
* libata-scsi: ata_to_sense_error handling status 0x40
@ 2022-08-26 12:00 Peter Fröhlich
  2022-08-28 23:20 ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-08-26 12:00 UTC (permalink / raw)
  To: linux-ide

Apologies if this is the wrong place.

In commit 8ae720449fca4b1d0294c0a0204c0c45556a3e61 "libata: whitespace
fixes in ata_to_sense_error()" we find, among many actual whitespace
changes, another change that adds an entry for 0x40 to the stat_table.

I thought that 0x40 is bit 6 of the ATA status which means (I think)
"drive ready" and not, as we translate here, "unaligned write
command". I may be very much confused about this, but if so, please
tell me how.

(I've been tracking another, presumably unrelated, error and have the
impression that this "unaligned write" message has been leading me in
the wrong direction.)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-26 12:00 libata-scsi: ata_to_sense_error handling status 0x40 Peter Fröhlich
@ 2022-08-28 23:20 ` Damien Le Moal
  2022-08-29  6:04   ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-08-28 23:20 UTC (permalink / raw)
  To: Peter Fröhlich, linux-ide

On 2022/08/26 21:00, Peter Fröhlich wrote:
> Apologies if this is the wrong place.
> 
> In commit 8ae720449fca4b1d0294c0a0204c0c45556a3e61 "libata: whitespace
> fixes in ata_to_sense_error()" we find, among many actual whitespace
> changes, another change that adds an entry for 0x40 to the stat_table.
> 
> I thought that 0x40 is bit 6 of the ATA status which means (I think)
> "drive ready" and not, as we translate here, "unaligned write
> command". I may be very much confused about this, but if so, please
> tell me how.

See the code below that table definition. You can see this hunk:

	if (drv_err) {
                /* Look for drv_err */
                for (i = 0; sense_table[i][0] != 0xFF; i++) {
                        /* Look for best matches first */
                        if ((sense_table[i][0] & drv_err) ==
                            sense_table[i][0]) {
                                *sk = sense_table[i][1];
                                *asc = sense_table[i][2];
                                *ascq = sense_table[i][3];
                                goto translate_done;
                        }
                }
        }

So the first field of the table defines the *error* bits, not the status. You
can check the ACS specs (Section 10.3 of ACS6-r1) to see that bit 6 of the error
field for an error output is "UNCORRECTABLE ERROR bit" for the commands READ DMA
Ext, Read Log Ext, Read PIO, Read Stream, SMART Read log, Read PIO Extended and
NCQ Read.

So this code is per specs.

> (I've been tracking another, presumably unrelated, error and have the
> impression that this "unaligned write" message has been leading me in
> the wrong direction.)

Are you using an SMR drive ? Getting an aligned write should not happen for a
regular disk as the kernel ensure alignments of IO to LBAs. But for SMR, it is
possible to send a write command that is not aligned to a zone write pointer
position, resulting in an unaligned write error.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-28 23:20 ` Damien Le Moal
@ 2022-08-29  6:04   ` Peter Fröhlich
  2022-08-29 23:26     ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-08-29  6:04 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On Mon, Aug 29, 2022 at 1:20 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> On 2022/08/26 21:00, Peter Fröhlich wrote:
> > In commit 8ae720449fca4b1d0294c0a0204c0c45556a3e61 "libata: whitespace
> > fixes in ata_to_sense_error()" we find, among many actual whitespace
> > changes, another change that adds an entry for 0x40 to the stat_table.
> > ...
>
> See the code below that table definition. You can see this hunk:
>
>         if (drv_err) {
>                 /* Look for drv_err */
>                 for (i = 0; sense_table[i][0] != 0xFF; i++) {
>                         /* Look for best matches first */
>                         if ((sense_table[i][0] & drv_err) ==
>                             sense_table[i][0]) {
>                                 *sk = sense_table[i][1];
>                                 *asc = sense_table[i][2];
>                                 *ascq = sense_table[i][3];
>                                 goto translate_done;
>                         }
>                 }
>         }

That's the sense_table, I was referring to the stat_table. That table
is consulted when we fail to convert via the sense_table.

> Are you using an SMR drive ? Getting an aligned write should not happen for a
> regular disk as the kernel ensure alignments of IO to LBAs. But for SMR, it is
> possible to send a write command that is not aligned to a zone write pointer
> position, resulting in an unaligned write error.

Which is why I am pretty sure that the "unaligned write" message is
spurious since I am writing to a plain old SSD. It's going to be hard
for a userspace program to generate a write that is no properly
aligned for the SSD.

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-29  6:04   ` Peter Fröhlich
@ 2022-08-29 23:26     ` Damien Le Moal
  2022-08-30  7:02       ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-08-29 23:26 UTC (permalink / raw)
  To: peter.hans.froehlich; +Cc: linux-ide

On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
> On Mon, Aug 29, 2022 at 1:20 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
> > On 2022/08/26 21:00, Peter Fröhlich wrote:
> > > In commit 8ae720449fca4b1d0294c0a0204c0c45556a3e61 "libata:
> > > whitespace
> > > fixes in ata_to_sense_error()" we find, among many actual
> > > whitespace
> > > changes, another change that adds an entry for 0x40 to the
> > > stat_table.
> > > ...
> > 
> > See the code below that table definition. You can see this hunk:
> > 
> >         if (drv_err) {
> >                 /* Look for drv_err */
> >                 for (i = 0; sense_table[i][0] != 0xFF; i++) {
> >                         /* Look for best matches first */
> >                         if ((sense_table[i][0] & drv_err) ==
> >                             sense_table[i][0]) {
> >                                 *sk = sense_table[i][1];
> >                                 *asc = sense_table[i][2];
> >                                 *ascq = sense_table[i][3];
> >                                 goto translate_done;
> >                         }
> >                 }
> >         }
> 
> That's the sense_table, I was referring to the stat_table. That table
> is consulted when we fail to convert via the sense_table.

Right... Reading emails on Monday morning without enough coffee was a
bad idea :)

So looking at the right code again, this is all very strange. E.g. the
ACS specs define bit 5 of the status field as the "device fault" bit,
but the code looks at 0x08, so bit 3. For write command, the definition
is:

STATUS
Bit Description
7:6 Transport Dependent – See 6.2.11
5 DEVICE FAULT bit – See 6.2.6
4 N/A
3 Transport Dependent – See 6.2.11
2 N/A
1 SENSE DATA AVAILABLE bit – See 6.2.9
0 ERROR bit – See 6.2.8

And the code is:

        static const unsigned char stat_table[][4] = {                
                /* Must be first because BUSY means no other bits valid
*/      
                {0x80,          ABORTED_COMMAND, 0x47, 0x00},         
                // Busy, fake parity for now                          
                {0x40,          ILLEGAL_REQUEST, 0x21, 0x04},         
                // Device ready, unaligned write command              
                {0x20,          HARDWARE_ERROR,  0x44, 0x00},         
                // Device fault, internal target failure              
                {0x08,          ABORTED_COMMAND, 0x47, 0x00},         
                // Timed out in xfer, fake parity for now             
                {0x04,          RECOVERED_ERROR, 0x11, 0x00},         
                // Recovered ECC error    Medium error, recovered     
                {0xFF, 0xFF, 0xFF, 0xFF}, // END mark                 
        };

So this does not match at all. Something wrong here, or, the "status"
field being observed here is not the one I am thinking of. Checking
AHCI & SATA-IO specs, I do not see anything matching there either.

Hu... Need to dig into this further. Missing something here.

> 
> > Are you using an SMR drive ? Getting an aligned write should not
> > happen for a
> > regular disk as the kernel ensure alignments of IO to LBAs. But for
> > SMR, it is
> > possible to send a write command that is not aligned to a zone
> > write pointer
> > position, resulting in an unaligned write error.
> 
> Which is why I am pretty sure that the "unaligned write" message is
> spurious since I am writing to a plain old SSD. It's going to be hard
> for a userspace program to generate a write that is no properly
> aligned for the SSD.

Unless your SSD is really buggy and throws strange errors, which is
always a possibility. Do you have a good reproducer of the problem ?

> 
> Cheers,
> Peter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-29 23:26     ` Damien Le Moal
@ 2022-08-30  7:02       ` Peter Fröhlich
  2022-08-31  1:40         ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-08-30  7:02 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On Tue, Aug 30, 2022 at 1:26 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
> > That's the sense_table, I was referring to the stat_table. That table
> > is consulted when we fail to convert via the sense_table.
> ...
> So looking at the right code again, this is all very strange. E.g. the
> ACS specs define bit 5 of the status field as the "device fault" bit,
> but the code looks at 0x08, so bit 3. For write command, the definition
> is:
>
> STATUS
> Bit Description
> 7:6 Transport Dependent – See 6.2.11
> 5 DEVICE FAULT bit – See 6.2.6
> 4 N/A
> 3 Transport Dependent – See 6.2.11
> 2 N/A
> 1 SENSE DATA AVAILABLE bit – See 6.2.9
> 0 ERROR bit – See 6.2.8
>
> And the code is:
>
>         static const unsigned char stat_table[][4] = {
>                 /* Must be first because BUSY means no other bits valid
> */
>                 {0x80,          ABORTED_COMMAND, 0x47, 0x00},
>                 // Busy, fake parity for now
>                 {0x40,          ILLEGAL_REQUEST, 0x21, 0x04},
>                 // Device ready, unaligned write command
>                 {0x20,          HARDWARE_ERROR,  0x44, 0x00},
>                 // Device fault, internal target failure
>                 {0x08,          ABORTED_COMMAND, 0x47, 0x00},
>                 // Timed out in xfer, fake parity for now
>                 {0x04,          RECOVERED_ERROR, 0x11, 0x00},
>                 // Recovered ECC error    Medium error, recovered
>                 {0xFF, 0xFF, 0xFF, 0xFF}, // END mark
>         };
>
> So this does not match at all. Something wrong here, or, the "status"
> field being observed here is not the one I am thinking of. Checking
> AHCI & SATA-IO specs, I do not see anything matching there either.

Thank you for confirming that this section *is* confusing. I was down
the same rabbit-hole checking "status" in whatever ATA spec I could
get my hands on, and it didn't help. Specifically for "WRITE DMA"
where I usually see the error, it seems that bit 6 has no other
meaning than "transport dependent" which for SATA means (I believe)
"drive ready" as it's always been. But that 0x40 is *not* an
"unaligned write" whatever else it may be. My suspicion is that maybe
it went in by accident since it's also in a "whitespace" commit. On
the other hand, it has an explicit comment. I wasn't going to bother
the original author before, but maybe now I should?

> > Which is why I am pretty sure that the "unaligned write" message is
> > spurious since I am writing to a plain old SSD. It's going to be hard
> > for a userspace program to generate a write that is no properly
> > aligned for the SSD.
>
> Unless your SSD is really buggy and throws strange errors, which is
> always a possibility. Do you have a good reproducer of the problem ?

Not really, sadly. For me it happens with SSDs behind a Marvell SATA
controller but it doesn't happen when the same SSD goes behind a
fancier SAS controller. This is what led me into the ATA/SCSI layer as
the possible culprit because on the SAS boxes that layer is not used.
BTW there's another "strange" effect that sometimes seems to lose the
LBA flag on the ATA taskfile struct resulting in an obscure error
message about failed CHS addressing. In that case I suspect an
initialization gone wrong or maybe a race condition somewhere, but
it's been a real pain to track down further. If I ever get a better
handle on how to repro this stuff, I certainly will share.

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-30  7:02       ` Peter Fröhlich
@ 2022-08-31  1:40         ` Damien Le Moal
  2022-08-31  7:15           ` Hannes Reinecke
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-08-31  1:40 UTC (permalink / raw)
  To: Peter Fröhlich, Hannes Reinecke; +Cc: linux-ide

On 2022/08/30 16:02, Peter Fröhlich wrote:
> On Tue, Aug 30, 2022 at 1:26 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>> On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
>>> That's the sense_table, I was referring to the stat_table. That table
>>> is consulted when we fail to convert via the sense_table.
>> ...
>> So looking at the right code again, this is all very strange. E.g. the
>> ACS specs define bit 5 of the status field as the "device fault" bit,
>> but the code looks at 0x08, so bit 3. For write command, the definition
>> is:
>>
>> STATUS
>> Bit Description
>> 7:6 Transport Dependent – See 6.2.11
>> 5 DEVICE FAULT bit – See 6.2.6
>> 4 N/A
>> 3 Transport Dependent – See 6.2.11
>> 2 N/A
>> 1 SENSE DATA AVAILABLE bit – See 6.2.9
>> 0 ERROR bit – See 6.2.8
>>
>> And the code is:
>>
>>         static const unsigned char stat_table[][4] = {
>>                 /* Must be first because BUSY means no other bits valid
>> */
>>                 {0x80,          ABORTED_COMMAND, 0x47, 0x00},
>>                 // Busy, fake parity for now
>>                 {0x40,          ILLEGAL_REQUEST, 0x21, 0x04},
>>                 // Device ready, unaligned write command
>>                 {0x20,          HARDWARE_ERROR,  0x44, 0x00},
>>                 // Device fault, internal target failure
>>                 {0x08,          ABORTED_COMMAND, 0x47, 0x00},
>>                 // Timed out in xfer, fake parity for now
>>                 {0x04,          RECOVERED_ERROR, 0x11, 0x00},
>>                 // Recovered ECC error    Medium error, recovered
>>                 {0xFF, 0xFF, 0xFF, 0xFF}, // END mark
>>         };
>>
>> So this does not match at all. Something wrong here, or, the "status"
>> field being observed here is not the one I am thinking of. Checking
>> AHCI & SATA-IO specs, I do not see anything matching there either.
> 
> Thank you for confirming that this section *is* confusing. I was down
> the same rabbit-hole checking "status" in whatever ATA spec I could
> get my hands on, and it didn't help. Specifically for "WRITE DMA"
> where I usually see the error, it seems that bit 6 has no other
> meaning than "transport dependent" which for SATA means (I believe)
> "drive ready" as it's always been. But that 0x40 is *not* an
> "unaligned write" whatever else it may be. My suspicion is that maybe
> it went in by accident since it's also in a "whitespace" commit. On
> the other hand, it has an explicit comment. I wasn't going to bother
> the original author before, but maybe now I should?

+Hannes

Except for bit 0x20 (device fault), the other bits do not match anything
sensible either. So I wonder what specs this is against. Hannes ? 7-years old
patch... I am sure your memory is very fresh about this one :)

>>> Which is why I am pretty sure that the "unaligned write" message is
>>> spurious since I am writing to a plain old SSD. It's going to be hard
>>> for a userspace program to generate a write that is no properly
>>> aligned for the SSD.
>>
>> Unless your SSD is really buggy and throws strange errors, which is
>> always a possibility. Do you have a good reproducer of the problem ?
> 
> Not really, sadly. For me it happens with SSDs behind a Marvell SATA
> controller but it doesn't happen when the same SSD goes behind a
> fancier SAS controller. This is what led me into the ATA/SCSI layer as
> the possible culprit because on the SAS boxes that layer is not used.

Yes, with a SAS HBA that has SAT implemented in FW, the HBA FW will do the
conversion to sense data for failed commands. No way of knowing how that is done
there.

> BTW there's another "strange" effect that sometimes seems to lose the
> LBA flag on the ATA taskfile struct resulting in an obscure error
> message about failed CHS addressing. In that case I suspect an
> initialization gone wrong or maybe a race condition somewhere, but
> it's been a real pain to track down further. If I ever get a better
> handle on how to repro this stuff, I certainly will share.

Yes, that type of error generally means something goes badly during scanning or
revalidate, e.g. access to a log page failing. That is a fairly common problems
on many drives (e.g. drives advertising support for READ LOG DMA EXT but that
command in fact not working). Your drive may need some quirks to get a reliable
scan.

Have you checked if your drive already has some entry in ata_device_blacklist
(in libata-core.c) ?

> 
> Cheers,
> Peter

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31  1:40         ` Damien Le Moal
@ 2022-08-31  7:15           ` Hannes Reinecke
  2022-08-31  7:48             ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2022-08-31  7:15 UTC (permalink / raw)
  To: Damien Le Moal, Peter Fröhlich; +Cc: linux-ide

On 8/31/22 03:40, Damien Le Moal wrote:
> On 2022/08/30 16:02, Peter Fröhlich wrote:
>> On Tue, Aug 30, 2022 at 1:26 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>>> On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
>>>> That's the sense_table, I was referring to the stat_table. That table
>>>> is consulted when we fail to convert via the sense_table.
>>> ...
>>> So looking at the right code again, this is all very strange. E.g. the
>>> ACS specs define bit 5 of the status field as the "device fault" bit,
>>> but the code looks at 0x08, so bit 3. For write command, the definition
>>> is:
>>>
>>> STATUS
>>> Bit Description
>>> 7:6 Transport Dependent – See 6.2.11
>>> 5 DEVICE FAULT bit – See 6.2.6
>>> 4 N/A
>>> 3 Transport Dependent – See 6.2.11
>>> 2 N/A
>>> 1 SENSE DATA AVAILABLE bit – See 6.2.9
>>> 0 ERROR bit – See 6.2.8
>>>
>>> And the code is:
>>>
>>>          static const unsigned char stat_table[][4] = {
>>>                  /* Must be first because BUSY means no other bits valid
>>> */
>>>                  {0x80,          ABORTED_COMMAND, 0x47, 0x00},
>>>                  // Busy, fake parity for now
>>>                  {0x40,          ILLEGAL_REQUEST, 0x21, 0x04},
>>>                  // Device ready, unaligned write command
>>>                  {0x20,          HARDWARE_ERROR,  0x44, 0x00},
>>>                  // Device fault, internal target failure
>>>                  {0x08,          ABORTED_COMMAND, 0x47, 0x00},
>>>                  // Timed out in xfer, fake parity for now
>>>                  {0x04,          RECOVERED_ERROR, 0x11, 0x00},
>>>                  // Recovered ECC error    Medium error, recovered
>>>                  {0xFF, 0xFF, 0xFF, 0xFF}, // END mark
>>>          };
>>>
>>> So this does not match at all. Something wrong here, or, the "status"
>>> field being observed here is not the one I am thinking of. Checking
>>> AHCI & SATA-IO specs, I do not see anything matching there either.
>>
>> Thank you for confirming that this section *is* confusing. I was down
>> the same rabbit-hole checking "status" in whatever ATA spec I could
>> get my hands on, and it didn't help. Specifically for "WRITE DMA"
>> where I usually see the error, it seems that bit 6 has no other
>> meaning than "transport dependent" which for SATA means (I believe)
>> "drive ready" as it's always been. But that 0x40 is *not* an
>> "unaligned write" whatever else it may be. My suspicion is that maybe
>> it went in by accident since it's also in a "whitespace" commit. On
>> the other hand, it has an explicit comment. I wasn't going to bother
>> the original author before, but maybe now I should?
> 
> +Hannes
> 
> Except for bit 0x20 (device fault), the other bits do not match anything
> sensible either. So I wonder what specs this is against. Hannes ? 7-years old
> patch... I am sure your memory is very fresh about this one :)
> 
Oh, of course :-)
That was when doing SMR support for libata.
I dimly remember that some pre-spec drives had been using the DRDY bit 
to signal an unaligned write. Which never made it into the spec, but the 
decoding stayed.

Will be sending a patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31  7:15           ` Hannes Reinecke
@ 2022-08-31  7:48             ` Damien Le Moal
  2022-08-31 10:21               ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-08-31  7:48 UTC (permalink / raw)
  To: Hannes Reinecke, Peter Fröhlich; +Cc: linux-ide

On 8/31/22 16:15, Hannes Reinecke wrote:
> On 8/31/22 03:40, Damien Le Moal wrote:
>> On 2022/08/30 16:02, Peter Fröhlich wrote:
>>> On Tue, Aug 30, 2022 at 1:26 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>>>> On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
>>>>> That's the sense_table, I was referring to the stat_table. That table
>>>>> is consulted when we fail to convert via the sense_table.
>>>> ...
>>>> So looking at the right code again, this is all very strange. E.g. the
>>>> ACS specs define bit 5 of the status field as the "device fault" bit,
>>>> but the code looks at 0x08, so bit 3. For write command, the definition
>>>> is:
>>>>
>>>> STATUS
>>>> Bit Description
>>>> 7:6 Transport Dependent – See 6.2.11
>>>> 5 DEVICE FAULT bit – See 6.2.6
>>>> 4 N/A
>>>> 3 Transport Dependent – See 6.2.11
>>>> 2 N/A
>>>> 1 SENSE DATA AVAILABLE bit – See 6.2.9
>>>> 0 ERROR bit – See 6.2.8
>>>>
>>>> And the code is:
>>>>
>>>>          static const unsigned char stat_table[][4] = {
>>>>                  /* Must be first because BUSY means no other bits valid
>>>> */
>>>>                  {0x80,          ABORTED_COMMAND, 0x47, 0x00},
>>>>                  // Busy, fake parity for now
>>>>                  {0x40,          ILLEGAL_REQUEST, 0x21, 0x04},
>>>>                  // Device ready, unaligned write command
>>>>                  {0x20,          HARDWARE_ERROR,  0x44, 0x00},
>>>>                  // Device fault, internal target failure
>>>>                  {0x08,          ABORTED_COMMAND, 0x47, 0x00},
>>>>                  // Timed out in xfer, fake parity for now
>>>>                  {0x04,          RECOVERED_ERROR, 0x11, 0x00},
>>>>                  // Recovered ECC error    Medium error, recovered
>>>>                  {0xFF, 0xFF, 0xFF, 0xFF}, // END mark
>>>>          };
>>>>
>>>> So this does not match at all. Something wrong here, or, the "status"
>>>> field being observed here is not the one I am thinking of. Checking
>>>> AHCI & SATA-IO specs, I do not see anything matching there either.
>>>
>>> Thank you for confirming that this section *is* confusing. I was down
>>> the same rabbit-hole checking "status" in whatever ATA spec I could
>>> get my hands on, and it didn't help. Specifically for "WRITE DMA"
>>> where I usually see the error, it seems that bit 6 has no other
>>> meaning than "transport dependent" which for SATA means (I believe)
>>> "drive ready" as it's always been. But that 0x40 is *not* an
>>> "unaligned write" whatever else it may be. My suspicion is that maybe
>>> it went in by accident since it's also in a "whitespace" commit. On
>>> the other hand, it has an explicit comment. I wasn't going to bother
>>> the original author before, but maybe now I should?
>>
>> +Hannes
>>
>> Except for bit 0x20 (device fault), the other bits do not match anything
>> sensible either. So I wonder what specs this is against. Hannes ? 7-years old
>> patch... I am sure your memory is very fresh about this one :)
>>
> Oh, of course :-)
> That was when doing SMR support for libata.
> I dimly remember that some pre-spec drives had been using the DRDY bit 
> to signal an unaligned write. Which never made it into the spec, but the 
> decoding stayed.

Any idea where the other bits come from ? Except for bit 5 (device fault),
I do not see anything else in the specs that mandate these definitions...

> Will be sending a patch.

Thanks !

> 
> Cheers,
> 
> Hannes


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31  7:48             ` Damien Le Moal
@ 2022-08-31 10:21               ` Peter Fröhlich
  2022-08-31 13:30                 ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-08-31 10:21 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Hannes Reinecke, linux-ide

On Wed, Aug 31, 2022 at 9:48 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> On 8/31/22 16:15, Hannes Reinecke wrote:
> > Oh, of course :-)
> > That was when doing SMR support for libata.
> > I dimly remember that some pre-spec drives had been using the DRDY bit
> > to signal an unaligned write. Which never made it into the spec, but the
> > decoding stayed.
>
> Any idea where the other bits come from ? Except for bit 5 (device fault),
> I do not see anything else in the specs that mandate these definitions...

I have since discovered the "SCSI to ATA" specification which has two
tables about mapping ATA errors to SCSI errors. Among those I was able
to find an "unaligned write" case as well, but I cannot properly parse
the rest of the two tables yet. They are in sections 11.6 and 11.7 of
that document.

> > Will be sending a patch.
> Thanks !

I concur: Thanks!

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31 10:21               ` Peter Fröhlich
@ 2022-08-31 13:30                 ` Peter Fröhlich
  2022-08-31 22:54                   ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-08-31 13:30 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Hannes Reinecke, linux-ide

Sorry for spamming replies and quoting myself.

On Wed, Aug 31, 2022 at 12:21 PM Peter Fröhlich
<peter.hans.froehlich@gmail.com> wrote:
> On Wed, Aug 31, 2022 at 9:48 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
> > On 8/31/22 16:15, Hannes Reinecke wrote:
> > > Oh, of course :-)
> > > That was when doing SMR support for libata.
> > > I dimly remember that some pre-spec drives had been using the DRDY bit
> > > to signal an unaligned write. Which never made it into the spec, but the
> > > decoding stayed.
> >
> > Any idea where the other bits come from ? Except for bit 5 (device fault),
> > I do not see anything else in the specs that mandate these definitions...
>
> I have since discovered the "SCSI to ATA" specification which has two
> tables about mapping ATA errors to SCSI errors. Among those I was able
> to find an "unaligned write" case as well, but I cannot properly parse
> the rest of the two tables yet. They are in sections 11.6 and 11.7 of
> that document.

So I've re-read everything I can get my hands on and from what I can
tell the overall "flow" of ata_to_sense_error() is not what the
specifications would imply. For example we look at BSY on entry and
then say "ah, it's set, then let's ignore the error field" when the
specification (the way I read it) instead says "BSY is transport
dependent, so we say nothing about it here; but check the error bit in
status, if it is set, interpret the error field, otherwise there's
nothing for you in the error field". Of course I am a complete noob
when it comes to this ATA/SATA/SCSI/AHCI stuff, so please divide by at
least two. Sorry if this adds more confusion on top.

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31 13:30                 ` Peter Fröhlich
@ 2022-08-31 22:54                   ` Damien Le Moal
  2022-09-01  6:10                     ` Peter Fröhlich
  2022-09-01  6:13                     ` Hannes Reinecke
  0 siblings, 2 replies; 17+ messages in thread
From: Damien Le Moal @ 2022-08-31 22:54 UTC (permalink / raw)
  To: Peter Fröhlich; +Cc: Hannes Reinecke, linux-ide

On 8/31/22 22:30, Peter Fröhlich wrote:
> Sorry for spamming replies and quoting myself.
> 
> On Wed, Aug 31, 2022 at 12:21 PM Peter Fröhlich
> <peter.hans.froehlich@gmail.com> wrote:
>> On Wed, Aug 31, 2022 at 9:48 AM Damien Le Moal
>> <damien.lemoal@opensource.wdc.com> wrote:
>>> On 8/31/22 16:15, Hannes Reinecke wrote:
>>>> Oh, of course :-)
>>>> That was when doing SMR support for libata.
>>>> I dimly remember that some pre-spec drives had been using the DRDY bit
>>>> to signal an unaligned write. Which never made it into the spec, but the
>>>> decoding stayed.
>>>
>>> Any idea where the other bits come from ? Except for bit 5 (device fault),
>>> I do not see anything else in the specs that mandate these definitions...
>>
>> I have since discovered the "SCSI to ATA" specification which has two
>> tables about mapping ATA errors to SCSI errors. Among those I was able
>> to find an "unaligned write" case as well, but I cannot properly parse
>> the rest of the two tables yet. They are in sections 11.6 and 11.7 of
>> that document.
> 
> So I've re-read everything I can get my hands on and from what I can
> tell the overall "flow" of ata_to_sense_error() is not what the
> specifications would imply. For example we look at BSY on entry and
> then say "ah, it's set, then let's ignore the error field" when the
> specification (the way I read it) instead says "BSY is transport
> dependent, so we say nothing about it here; but check the error bit in
> status, if it is set, interpret the error field, otherwise there's
> nothing for you in the error field". Of course I am a complete noob
> when it comes to this ATA/SATA/SCSI/AHCI stuff, so please divide by at
> least two. Sorry if this adds more confusion on top.

I had a quick look at the specs again. I already spotted an error: when
the status device fault bit is set, the sense should be HARDWARE ERROR /
INTERNAL TARGET FAILURE and not ABORTED COMMAND / 0x47 like now. That is
according to SAT-5. But looking at ACS-5, sections 6 and 7.1.6, there are
*a lot* of cases that need to be taken care of. It looks like the
sense_table does that, but need to cross check.

As for the stat_table, except for the first buggy entry as mentioned
above, I have no clue where these come from. SAT only defines the HARDWARE
ERROR / INTERNAL TARGET FAILURE for when the status field device fault bit
is set. Need to dig further, but I am afraid this code may be due to years
of supporting drives returning weird errors that got mapped to sensible
sense codes instead of a pure implementation of the specs...

I need to spend some quality time with ACS and SAT documents to sort out
this one... And lots of coffee too probably :)

> 
> Cheers,
> Peter


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31 22:54                   ` Damien Le Moal
@ 2022-09-01  6:10                     ` Peter Fröhlich
  2022-09-01  6:13                     ` Hannes Reinecke
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Fröhlich @ 2022-09-01  6:10 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Hannes Reinecke, linux-ide

On Thu, Sep 1, 2022 at 12:55 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> As for the stat_table, except for the first buggy entry as mentioned
> above, I have no clue where these come from. SAT only defines the HARDWARE
> ERROR / INTERNAL TARGET FAILURE for when the status field device fault bit
> is set. Need to dig further, but I am afraid this code may be due to years
> of supporting drives returning weird errors that got mapped to sensible
> sense codes instead of a pure implementation of the specs...

That last sentence is probably the key to a lot of this, in which case
one would wish for a comment or two as to why. I was trying to do some
more digging in the git history, but it seems that interesting bits
predate the 2005 git import date of the kernel sources and I am not
sure how to look further back. Also it might be that by now
"consumers" of these error codes may require them to stay the way they
are, i.e. even if something turns out to be "wrong" it might be
"unfixable" to not break userland?

> I need to spend some quality time with ACS and SAT documents to sort out
> this one... And lots of coffee too probably :)

I am starting to feel bad for kicking a hornet's nest in the shape of
a tiny function. :-D Thank you so much, both of you, for spending time
on this.

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-08-31 22:54                   ` Damien Le Moal
  2022-09-01  6:10                     ` Peter Fröhlich
@ 2022-09-01  6:13                     ` Hannes Reinecke
  2022-09-02  2:35                       ` Damien Le Moal
  1 sibling, 1 reply; 17+ messages in thread
From: Hannes Reinecke @ 2022-09-01  6:13 UTC (permalink / raw)
  To: Damien Le Moal, Peter Fröhlich; +Cc: linux-ide

On 9/1/22 00:54, Damien Le Moal wrote:
> On 8/31/22 22:30, Peter Fröhlich wrote:
>> Sorry for spamming replies and quoting myself.
>>
>> On Wed, Aug 31, 2022 at 12:21 PM Peter Fröhlich
>> <peter.hans.froehlich@gmail.com> wrote:
>>> On Wed, Aug 31, 2022 at 9:48 AM Damien Le Moal
>>> <damien.lemoal@opensource.wdc.com> wrote:
>>>> On 8/31/22 16:15, Hannes Reinecke wrote:
>>>>> Oh, of course :-)
>>>>> That was when doing SMR support for libata.
>>>>> I dimly remember that some pre-spec drives had been using the DRDY bit
>>>>> to signal an unaligned write. Which never made it into the spec, but the
>>>>> decoding stayed.
>>>>
>>>> Any idea where the other bits come from ? Except for bit 5 (device fault),
>>>> I do not see anything else in the specs that mandate these definitions...
>>>
>>> I have since discovered the "SCSI to ATA" specification which has two
>>> tables about mapping ATA errors to SCSI errors. Among those I was able
>>> to find an "unaligned write" case as well, but I cannot properly parse
>>> the rest of the two tables yet. They are in sections 11.6 and 11.7 of
>>> that document.
>>
>> So I've re-read everything I can get my hands on and from what I can
>> tell the overall "flow" of ata_to_sense_error() is not what the
>> specifications would imply. For example we look at BSY on entry and
>> then say "ah, it's set, then let's ignore the error field" when the
>> specification (the way I read it) instead says "BSY is transport
>> dependent, so we say nothing about it here; but check the error bit in
>> status, if it is set, interpret the error field, otherwise there's
>> nothing for you in the error field". Of course I am a complete noob
>> when it comes to this ATA/SATA/SCSI/AHCI stuff, so please divide by at
>> least two. Sorry if this adds more confusion on top.
> 
> I had a quick look at the specs again. I already spotted an error: when
> the status device fault bit is set, the sense should be HARDWARE ERROR /
> INTERNAL TARGET FAILURE and not ABORTED COMMAND / 0x47 like now. That is
> according to SAT-5. But looking at ACS-5, sections 6 and 7.1.6, there are
> *a lot* of cases that need to be taken care of. It looks like the
> sense_table does that, but need to cross check.
> 
> As for the stat_table, except for the first buggy entry as mentioned
> above, I have no clue where these come from. SAT only defines the HARDWARE
> ERROR / INTERNAL TARGET FAILURE for when the status field device fault bit
> is set. Need to dig further, but I am afraid this code may be due to years
> of supporting drives returning weird errors that got mapped to sensible
> sense codes instead of a pure implementation of the specs...
> 
> I need to spend some quality time with ACS and SAT documents to sort out
> this one... And lots of coffee too probably :)
> 
And, to make matters ever more complicated, the error and status bits 
changed over time. And even the SAT translation changed between versions.
So there really is no clear "that's the way to go" style of thing; if we 
want to be correct we would need to evaluate the ATA version for that 
device, and have different translation tables depending on the version.

Not sure if it's worth it, though; in the end it's just an error 
description which will get changed. Commands will be aborted in either 
case, so the net result is close to zero :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-09-01  6:13                     ` Hannes Reinecke
@ 2022-09-02  2:35                       ` Damien Le Moal
  2022-09-02  6:34                         ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-09-02  2:35 UTC (permalink / raw)
  To: Hannes Reinecke, Peter Fröhlich; +Cc: linux-ide

On 9/1/22 15:13, Hannes Reinecke wrote:
> On 9/1/22 00:54, Damien Le Moal wrote:
>> On 8/31/22 22:30, Peter Fröhlich wrote:
>>> Sorry for spamming replies and quoting myself.
>>>
>>> On Wed, Aug 31, 2022 at 12:21 PM Peter Fröhlich
>>> <peter.hans.froehlich@gmail.com> wrote:
>>>> On Wed, Aug 31, 2022 at 9:48 AM Damien Le Moal
>>>> <damien.lemoal@opensource.wdc.com> wrote:
>>>>> On 8/31/22 16:15, Hannes Reinecke wrote:
>>>>>> Oh, of course :-)
>>>>>> That was when doing SMR support for libata.
>>>>>> I dimly remember that some pre-spec drives had been using the DRDY bit
>>>>>> to signal an unaligned write. Which never made it into the spec, but the
>>>>>> decoding stayed.
>>>>>
>>>>> Any idea where the other bits come from ? Except for bit 5 (device fault),
>>>>> I do not see anything else in the specs that mandate these definitions...
>>>>
>>>> I have since discovered the "SCSI to ATA" specification which has two
>>>> tables about mapping ATA errors to SCSI errors. Among those I was able
>>>> to find an "unaligned write" case as well, but I cannot properly parse
>>>> the rest of the two tables yet. They are in sections 11.6 and 11.7 of
>>>> that document.
>>>
>>> So I've re-read everything I can get my hands on and from what I can
>>> tell the overall "flow" of ata_to_sense_error() is not what the
>>> specifications would imply. For example we look at BSY on entry and
>>> then say "ah, it's set, then let's ignore the error field" when the
>>> specification (the way I read it) instead says "BSY is transport
>>> dependent, so we say nothing about it here; but check the error bit in
>>> status, if it is set, interpret the error field, otherwise there's
>>> nothing for you in the error field". Of course I am a complete noob
>>> when it comes to this ATA/SATA/SCSI/AHCI stuff, so please divide by at
>>> least two. Sorry if this adds more confusion on top.
>>
>> I had a quick look at the specs again. I already spotted an error: when
>> the status device fault bit is set, the sense should be HARDWARE ERROR /
>> INTERNAL TARGET FAILURE and not ABORTED COMMAND / 0x47 like now. That is
>> according to SAT-5. But looking at ACS-5, sections 6 and 7.1.6, there are
>> *a lot* of cases that need to be taken care of. It looks like the
>> sense_table does that, but need to cross check.
>>
>> As for the stat_table, except for the first buggy entry as mentioned
>> above, I have no clue where these come from. SAT only defines the HARDWARE
>> ERROR / INTERNAL TARGET FAILURE for when the status field device fault bit
>> is set. Need to dig further, but I am afraid this code may be due to years
>> of supporting drives returning weird errors that got mapped to sensible
>> sense codes instead of a pure implementation of the specs...
>>
>> I need to spend some quality time with ACS and SAT documents to sort out
>> this one... And lots of coffee too probably :)
>>
> And, to make matters ever more complicated, the error and status bits 
> changed over time. And even the SAT translation changed between versions.

I checked all SAT docs from v1 up to v5 and all of them define the same
for the device fault status bit.

1) If status device fault bit is set, ignore error and translate to
HARDWARE ERROR / INTERNAL TARGET FAILURE. So this is wrong in the current
code which returns ABORTED COMMAND / SCSI PARITY ERROR, which is a little
silly. We could fix this, but urgency for the fix seems to be non-existent
since no-one complained about that one. I suspect this is because this
stuff only matters for IDE drives since most NCQ drives will get sense
data from the read log 10h anyway. So that weird stat_table is likely
never used, even for IDE drives as the sense_table gets a hit all the time
first.

3) When the status device fault bit is not set and the error bit is set,
then the error bits are defined differently across SAT revision, but they
all look backward compatible though.

3) None of the SAT specs have anything about "unaligned" error defined. I
think it is safe to remove that one as a fix. Will you send a patch ?

Peter,

Your drive seems to be an exception to my (1) statement and the error it
returns seems weird enough that the stat_table ends up being used.
Could you send a dmesg output of a failed command so that we can see the
err_mask etc info for the failed command ? And it would be good to add a
print of the drv_stat and drv_err parameters passed to
ata_to_sense_error() for the failures you are seeing. That would help
trying to figure out what your drive is attempting to signal.

Also, please send the output of "hdparm -I" for that SSD please, so that
we have information about what standard it is (supposedly) following.

> So there really is no clear "that's the way to go" style of thing; if we 
> want to be correct we would need to evaluate the ATA version for that 
> device, and have different translation tables depending on the version.
> 
> Not sure if it's worth it, though; in the end it's just an error 
> description which will get changed. Commands will be aborted in either 
> case, so the net result is close to zero :-)
> 
> Cheers,
> 
> Hannes

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-09-02  2:35                       ` Damien Le Moal
@ 2022-09-02  6:34                         ` Peter Fröhlich
  2022-09-02  8:41                           ` Damien Le Moal
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Fröhlich @ 2022-09-02  6:34 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Hannes Reinecke, linux-ide

On Fri, Sep 2, 2022 at 4:35 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> Your drive seems to be an exception to my (1) statement and the error it
> returns seems weird enough that the stat_table ends up being used.
> Could you send a dmesg output of a failed command so that we can see the
> err_mask etc info for the failed command ? And it would be good to add a
> print of the drv_stat and drv_err parameters passed to
> ata_to_sense_error() for the failures you are seeing. That would help
> trying to figure out what your drive is attempting to signal.

I don't think the drive wants to "signal" anything, instead it simply
"disappears" at some point. The "original" error is "Emask 0x4
(timeout)". So here's an example from early on when I had not made
many kernel changes yet:

-----CUT-----
...
[  516.296397] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  516.296399] ata9.00: failed command: WRITE DMA
[  516.296402] ata9.00: cmd ca/00:23:51:03:4b/00:00:00:00:00/ed tag 4
dma 17920 out
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[  516.296403] ata9.00: status: { DRDY }
...
[  516.761214] ata9: translated ATA stat/err 0x40/00 to SCSI
SK/ASC/ASCQ 0x5/21/04
[  516.761215] ata9.00: device reported invalid CHS sector 0
[  516.761220] sd 8:0:0:0: [sdk] tag#4 scsi_eh_8: flush finish cmd
[  516.761224] sd 8:0:0:0: [sdk] tag#4 FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[  516.761226] sd 8:0:0:0: [sdk] tag#4 Sense Key : Illegal Request [current]
[  516.761228] sd 8:0:0:0: [sdk] tag#4 Add. Sense: Unaligned write command
[  516.761229] sd 8:0:0:0: [sdk] tag#4 CDB: Write(16) 8a 00 00 00 00
00 0d 4b 03 51 00 00 00 23 00 00
...
-----CUT-----

That "translated" line is only output because I changed "if (verbose)"
to "if (1)" in that kernel. Also note the bizarre "CHS" error which
only happens on some of these, not all; I had mentioned before that I
am trying to track down how it happens that the LBA bit suddenly
disappears (it might have to do with the hardreset being in process at
this point and this message racing against the new IDENTIFY?). Using
the trace facility I can *sometimes* see the command being issued and
then 30 seconds later the timeout happening; sometimes I just get the
timeout and I *cannot* find when the command was issued in the trace,
another thing that seems bizarre to me.

Note that I didn't ask for help with that intentionally, I still think
that I am too far away from a proper diagnosis to have a fruitful
conversation about where the timeouts originate and why. We've checked
against power issues and the like, and again, this happens only when
the drive sits behind a SATA controller, not when it's behind a SAS
controller.

> Also, please send the output of "hdparm -I" for that SSD please, so that
> we have information about what standard it is (supposedly) following.

See below, but I don't think the specific drive is relevant. The same
"problem" shows up with a different brand/model as well, again only in
the SATA context, not for SAS.

Cheers,
Peter

-----CUT-----
/dev/sde:

ATA device, with non-removable media
    Model Number:       WDC  WDS400T2B0A-00SM50
    Serial Number:      2113CN420743
    Firmware Revision:  415020WD
    Media Serial Num:
    Media Manufacturer:
    Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II
Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x005e)
    Supported: 11 10 9 8 7 6 5
    Likely used: 11
Configuration:
    Logical        max    current
    cylinders    16383    0
    heads        16    0
    sectors/track    63    0
    --
    LBA    user addressable sectors:   268435455
    LBA48  user addressable sectors:  7814037168
    Logical  Sector size:                   512 bytes
    Physical Sector size:                   512 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:     3815447 MBytes
    device size with M = 1000*1000:     4000787 MBytes (4000 GB)
    cache/buffer size  = unknown
    Form Factor: 2.5 inch
    Nominal Media Rotation Rate: Solid State Device
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 1    Current = 1
    Advanced power management level: disabled
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled    Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    DOWNLOAD_MICROCODE
            Advanced Power Management feature set
       *    48-bit Address feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    64-bit World wide name
       *    WRITE_UNCORRECTABLE_EXT command
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
            unknown 119[8]
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Gen3 signaling speed (6.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Phy event counters
       *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
            DMA Setup Auto-Activate optimization
            Device-initiated interface power management
            Asynchronous notification (eg. media change)
       *    Software settings preservation
            Device Sleep (DEVSLP)
       *    SANITIZE feature set
       *    BLOCK_ERASE_EXT command
       *    DOWNLOAD MICROCODE DMA command
       *    WRITE BUFFER DMA command
       *    READ BUFFER DMA command
       *    Data Set Management TRIM supported (limit 8 blocks)
       *    Deterministic read ZEROs after TRIM
Security:
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
        supported: enhanced erase
    2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5001b444a70c2c64
    NAA        : 5
    IEEE OUI    : 001b44
    Unique ID    : 4a70c2c64
Device Sleep:
    DEVSLP Exit Timeout (DETO): 30 ms (drive)
    Minimum DEVSLP Assertion Time (MDAT): 30 ms (drive)
Checksum: correct
-----CUT-----

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-09-02  6:34                         ` Peter Fröhlich
@ 2022-09-02  8:41                           ` Damien Le Moal
  2022-09-12  7:52                             ` Peter Fröhlich
  0 siblings, 1 reply; 17+ messages in thread
From: Damien Le Moal @ 2022-09-02  8:41 UTC (permalink / raw)
  To: Peter Fröhlich; +Cc: Hannes Reinecke, linux-ide

On 9/2/22 15:34, Peter Fröhlich wrote:
> On Fri, Sep 2, 2022 at 4:35 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>> Your drive seems to be an exception to my (1) statement and the error it
>> returns seems weird enough that the stat_table ends up being used.
>> Could you send a dmesg output of a failed command so that we can see the
>> err_mask etc info for the failed command ? And it would be good to add a
>> print of the drv_stat and drv_err parameters passed to
>> ata_to_sense_error() for the failures you are seeing. That would help
>> trying to figure out what your drive is attempting to signal.
> 
> I don't think the drive wants to "signal" anything, instead it simply
> "disappears" at some point. The "original" error is "Emask 0x4
> (timeout)". So here's an example from early on when I had not made
> many kernel changes yet:

Sounds like the drive FW is crashing...

> -----CUT-----
> ...
> [  516.296397] ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [  516.296399] ata9.00: failed command: WRITE DMA

Are you running this drive with device/queue_depth set to 1 ? What is
issuing a WRITE DMA instead of the NCQ equivalent ? Is this a passthrough
command ?

> [  516.296402] ata9.00: cmd ca/00:23:51:03:4b/00:00:00:00:00/ed tag 4
> dma 17920 out
>                         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
> 0x4 (timeout)
> [  516.296403] ata9.00: status: { DRDY }
> ...
> [  516.761214] ata9: translated ATA stat/err 0x40/00 to SCSI
> SK/ASC/ASCQ 0x5/21/04
> [  516.761215] ata9.00: device reported invalid CHS sector 0

Yeah... An unaligned write error should normally also signal the LBA that
was being accessed when the error occurred. ata_tf_read_block() does not
see the LBA flag set, thinks it is CHS and ends up with garbage
information. We can ignored this one. The problem is the bogus unaligned
write error in the first place.

> [  516.761220] sd 8:0:0:0: [sdk] tag#4 scsi_eh_8: flush finish cmd
> [  516.761224] sd 8:0:0:0: [sdk] tag#4 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [  516.761226] sd 8:0:0:0: [sdk] tag#4 Sense Key : Illegal Request [current]
> [  516.761228] sd 8:0:0:0: [sdk] tag#4 Add. Sense: Unaligned write command
> [  516.761229] sd 8:0:0:0: [sdk] tag#4 CDB: Write(16) 8a 00 00 00 00
> 00 0d 4b 03 51 00 00 00 23 00 00
> ...
> -----CUT-----
> 
> That "translated" line is only output because I changed "if (verbose)"
> to "if (1)" in that kernel. Also note the bizarre "CHS" error which
> only happens on some of these, not all; I had mentioned before that I
> am trying to track down how it happens that the LBA bit suddenly
> disappears (it might have to do with the hardreset being in process at
> this point and this message racing against the new IDENTIFY?). Using

See above for the explanation. That message is bogus because the error is
bogus too.

> the trace facility I can *sometimes* see the command being issued and
> then 30 seconds later the timeout happening; sometimes I just get the
> timeout and I *cannot* find when the command was issued in the trace,
> another thing that seems bizarre to me.

That really sound like a device FW crash (the device stops responding), but...

> Note that I didn't ask for help with that intentionally, I still think
> that I am too far away from a proper diagnosis to have a fruitful
> conversation about where the timeouts originate and why. We've checked
> against power issues and the like, and again, this happens only when
> the drive sits behind a SATA controller, not when it's behind a SAS
> controller.
> 
>> Also, please send the output of "hdparm -I" for that SSD please, so that
>> we have information about what standard it is (supposedly) following.
> 
> See below, but I don't think the specific drive is relevant. The same
> "problem" shows up with a different brand/model as well, again only in
> the SATA context, not for SAS.

...since it happens with other drives, it may be something to do with the
host AHCI adapter. What are you using ? Do you get the same behaviour if
you use a different host with a different AHCI adapter ?

> 
> Cheers,
> Peter
> 
> -----CUT-----
> /dev/sde:
> 
> ATA device, with non-removable media
>     Model Number:       WDC  WDS400T2B0A-00SM50

I know this vendor well :)

>     Serial Number:      2113CN420743
>     Firmware Revision:  415020WD
>     Media Serial Num:
>     Media Manufacturer:
>     Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II
> Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
> Standards:
>     Used: unknown (minor revision code 0x005e)
>     Supported: 11 10 9 8 7 6 5
>     Likely used: 11
> Configuration:
>     Logical        max    current
>     cylinders    16383    0
>     heads        16    0
>     sectors/track    63    0
>     --
>     LBA    user addressable sectors:   268435455
>     LBA48  user addressable sectors:  7814037168
>     Logical  Sector size:                   512 bytes
>     Physical Sector size:                   512 bytes
>     Logical Sector-0 offset:                  0 bytes
>     device size with M = 1024*1024:     3815447 MBytes
>     device size with M = 1000*1000:     4000787 MBytes (4000 GB)
>     cache/buffer size  = unknown
>     Form Factor: 2.5 inch
>     Nominal Media Rotation Rate: Solid State Device
> Capabilities:
>     LBA, IORDY(can be disabled)
>     Queue depth: 32
>     Standby timer values: spec'd by Standard, no device specific minimum
>     R/W multiple sector transfer: Max = 1    Current = 1
>     Advanced power management level: disabled
>     DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
>          Cycle time: min=120ns recommended=120ns
>     PIO: pio0 pio1 pio2 pio3 pio4
>          Cycle time: no flow control=120ns  IORDY flow control=120ns
> Commands/features:
>     Enabled    Supported:
>        *    SMART feature set
>             Security Mode feature set
>        *    Power Management feature set
>        *    Write cache
>        *    Look-ahead
>        *    WRITE_BUFFER command
>        *    READ_BUFFER command
>        *    DOWNLOAD_MICROCODE
>             Advanced Power Management feature set
>        *    48-bit Address feature set
>        *    Mandatory FLUSH_CACHE
>        *    FLUSH_CACHE_EXT
>        *    SMART error logging
>        *    SMART self-test
>        *    General Purpose Logging feature set
>        *    64-bit World wide name
>        *    WRITE_UNCORRECTABLE_EXT command
>        *    {READ,WRITE}_DMA_EXT_GPL commands
>        *    Segmented DOWNLOAD_MICROCODE
>             unknown 119[8]
>        *    Gen1 signaling speed (1.5Gb/s)
>        *    Gen2 signaling speed (3.0Gb/s)
>        *    Gen3 signaling speed (6.0Gb/s)
>        *    Native Command Queueing (NCQ)
>        *    Phy event counters
>        *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
>             DMA Setup Auto-Activate optimization
>             Device-initiated interface power management
>             Asynchronous notification (eg. media change)
>        *    Software settings preservation
>             Device Sleep (DEVSLP)
>        *    SANITIZE feature set
>        *    BLOCK_ERASE_EXT command
>        *    DOWNLOAD MICROCODE DMA command
>        *    WRITE BUFFER DMA command
>        *    READ BUFFER DMA command
>        *    Data Set Management TRIM supported (limit 8 blocks)
>        *    Deterministic read ZEROs after TRIM
> Security:
>     Master password revision code = 65534
>         supported
>     not    enabled
>     not    locked
>     not    frozen
>     not    expired: security count
>         supported: enhanced erase
>     2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
> Logical Unit WWN Device Identifier: 5001b444a70c2c64
>     NAA        : 5
>     IEEE OUI    : 001b44
>     Unique ID    : 4a70c2c64
> Device Sleep:
>     DEVSLP Exit Timeout (DETO): 30 ms (drive)
>     Minimum DEVSLP Assertion Time (MDAT): 30 ms (drive)
> Checksum: correct
> -----CUT-----

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: libata-scsi: ata_to_sense_error handling status 0x40
  2022-09-02  8:41                           ` Damien Le Moal
@ 2022-09-12  7:52                             ` Peter Fröhlich
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Fröhlich @ 2022-09-12  7:52 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Hannes Reinecke, linux-ide

Apologies everybody, I dropped the ball on this a little, see below.

On Fri, Sep 2, 2022 at 10:41 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
> On 9/2/22 15:34, Peter Fröhlich wrote:
> > I don't think the drive wants to "signal" anything, instead it simply
> > "disappears" at some point. The "original" error is "Emask 0x4
> > (timeout)". So here's an example from early on when I had not made
> > many kernel changes yet:
>
> Sounds like the drive FW is crashing...

That, or maybe the interaction between the SATA controller and the
drive not being the most awesome. As I hinted before, we've had this
with different disks, both WD and Samsung disks, but the controller is
the same. Which, BTW, is this one:

02:00.0 SATA controller: Marvell Technology Group Ltd. Device 9215 (rev 11)

As far as I can tell, no major ATA quirks in the driver for that
thing, just a general PCI quirk that seems to apply to a bunch of
these Marvell chips.

> Are you running this drive with device/queue_depth set to 1 ? What is
> issuing a WRITE DMA instead of the NCQ equivalent ? Is this a passthrough
> command ?

The NCQ feature is indeed switched off because we've had problems with
other disks (spinning rust IIRC) crashing due to their NCQ
implementation being buggy. That's a different problem and has, to my
knowledge, nothing to do with the stuff here. Except that we're not
using "MULTI" commands without NCQ if I understand it correctly.

Here, finally, is why I "dropped the ball" on this thread. I played
with kernel command line parameters ON A LARK and it turns out that if
I say "libata.force=pio4" then for whatever reason all these issues go
away, I can no longer reproduce the timeouts or the attendant "wrong
error message" that made me post here originally. From what I gather
(and I may be very wrong here) forcing "pio4" makes the driver use yet
another set of commands, and THOSE commands seem to confuse neither
the SATA controller nor the disk anymore. Quick benchmarks showed some
loss in speed and we're still trying to figure out more of the
details, but again, the timeouts disappeared.

I don't like the fact that I am no closer to understanding what is
actually wrong here, but maybe this data point helps someone else
formulate a new theory of what's happening. BTW, despite "pio4" when
you ask the disks what they are using, they keep saying udma5. Another
thing I don't quite understand, but again, someone more knowledgeable
might.

Cheers,
Peter

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-09-12  7:53 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-26 12:00 libata-scsi: ata_to_sense_error handling status 0x40 Peter Fröhlich
2022-08-28 23:20 ` Damien Le Moal
2022-08-29  6:04   ` Peter Fröhlich
2022-08-29 23:26     ` Damien Le Moal
2022-08-30  7:02       ` Peter Fröhlich
2022-08-31  1:40         ` Damien Le Moal
2022-08-31  7:15           ` Hannes Reinecke
2022-08-31  7:48             ` Damien Le Moal
2022-08-31 10:21               ` Peter Fröhlich
2022-08-31 13:30                 ` Peter Fröhlich
2022-08-31 22:54                   ` Damien Le Moal
2022-09-01  6:10                     ` Peter Fröhlich
2022-09-01  6:13                     ` Hannes Reinecke
2022-09-02  2:35                       ` Damien Le Moal
2022-09-02  6:34                         ` Peter Fröhlich
2022-09-02  8:41                           ` Damien Le Moal
2022-09-12  7:52                             ` Peter Fröhlich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.