All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
@ 2022-02-02 10:05 Anton Lundin
  2022-02-02 10:45 ` Damien Le Moal
  2022-02-02 12:07 ` Damien Le Moal
  0 siblings, 2 replies; 15+ messages in thread
From: Anton Lundin @ 2022-02-02 10:05 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide; +Cc: Anton Lundin, stable

Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
a read of ATA_LOG_DIRECTORY page was added. This caused the
SATADOM-ML 3ME to lock up.

In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
a flag was added to cache if a device supports this or not.

This adds a blacklist entry which flags that these devices doesn't
support that call and shouldn't be issued that call.

Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Anton Lundin <glance@acc.umu.se>
Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
---
 drivers/ata/libata-core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 87d36b29ca5f..e024af9f33d0 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
 	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
 
+	/*
+	 * This sata dom goes on a walkabout when it sees the
+	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
+	 * request to these devices.
+	 */
+	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
+
 	/* End Marker */
 	{ }
 };
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 10:05 [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME Anton Lundin
@ 2022-02-02 10:45 ` Damien Le Moal
  2022-02-02 12:25   ` Anton Lundin
  2022-02-02 12:07 ` Damien Le Moal
  1 sibling, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-02 10:45 UTC (permalink / raw)
  To: Anton Lundin, linux-ide

On 2/2/22 19:05, Anton Lundin wrote:
> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> a read of ATA_LOG_DIRECTORY page was added. This caused the
> SATADOM-ML 3ME to lock up.
> 
> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> a flag was added to cache if a device supports this or not.
> 
> This adds a blacklist entry which flags that these devices doesn't
> support that call and shouldn't be issued that call.
> 
> Cc: stable@vger.kernel.org # v5.10+
> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")

I do not think so. See below.

> ---
>  drivers/ata/libata-core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 87d36b29ca5f..e024af9f33d0 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>  
> +	/*
> +	 * This sata dom goes on a walkabout when it sees the
> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> +	 * request to these devices.
> +	 */
> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },

This flag only disables trying to access the identify device log page,
it does *not* avoid access to the log directory log page in general. The
log directory will still be consulted for other log pages beside the
identify device log page, from any function that calls
ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
ata_dev_config_ncq_non_data())

So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
and test for it in ata_log_supported(), completely preventing any access
to the log directory page for this drive type.

> +
>  	/* End Marker */
>  	{ }
>  };

Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
and a Fixes tag.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 10:05 [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME Anton Lundin
  2022-02-02 10:45 ` Damien Le Moal
@ 2022-02-02 12:07 ` Damien Le Moal
  2022-02-02 13:12   ` Anton Lundin
  1 sibling, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-02 12:07 UTC (permalink / raw)
  To: Anton Lundin, linux-ide; +Cc: stable

[-- Attachment #1: Type: text/plain, Size: 1839 bytes --]

On 2/2/22 19:05, Anton Lundin wrote:
> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> a read of ATA_LOG_DIRECTORY page was added. This caused the
> SATADOM-ML 3ME to lock up.
> 
> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> a flag was added to cache if a device supports this or not.
> 
> This adds a blacklist entry which flags that these devices doesn't
> support that call and shouldn't be issued that call.
> 
> Cc: stable@vger.kernel.org # v5.10+
> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> ---
>  drivers/ata/libata-core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 87d36b29ca5f..e024af9f33d0 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>  
> +	/*
> +	 * This sata dom goes on a walkabout when it sees the
> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> +	 * request to these devices.
> +	 */
> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> +
>  	/* End Marker */
>  	{ }
>  };

Can you try the attached patch ?

I think it is important to confirm if the lockup on your drive happens
due to reads of the log directory log page or due to reads of the
identify device log page. The attached patch prevents the former, your
patch prevents the latter. If your patch is all that is needed, then it
is good, modulo some rephrasing of the commit message and comments.

-- 
Damien Le Moal
Western Digital Research

[-- Attachment #2: 0001-ata-libata-core-Introduce-ATA_HORKAGE_NO_LOG_DIR-hor.patch --]
[-- Type: text/x-patch, Size: 2627 bytes --]

From 54e566d4233d9e62735fd89837db0f5f4dea99e2 Mon Sep 17 00:00:00 2001
From: Anton Lundin <glance@acc.umu.se>
Date: Wed, 2 Feb 2022 20:47:40 +0900
Subject: [PATCH] ata: libata-core: Introduce ATA_HORKAGE_NO_LOG_DIR horkage

06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
introduced additional calls to ata_identify_page_supported(), thus also
adding indirectly accesses to the device log directory log page through
ata_log_supported(). Reading this log page causes SATADOM-ML 3ME devices
to lock up.

Introduce the horkage flag ATA_HORKAGE_NO_LOG_DIR to prevent accesses to
the log directory in ata_log_supported() and add a blacklist entry
with this flag for "SATADOM-ML 3ME" devices.

Fixes: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Anton Lundin <glance@acc.umu.se>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/ata/libata-core.c | 10 ++++++++++
 include/linux/libata.h    |  1 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 67f88027680a..e1b1dd215267 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2007,6 +2007,9 @@ static bool ata_log_supported(struct ata_device *dev, u8 log)
 {
 	struct ata_port *ap = dev->link->ap;
 
+	if (dev->horkage & ATA_HORKAGE_NO_LOG_DIR)
+		return false;
+
 	if (ata_read_log_page(dev, ATA_LOG_DIRECTORY, 0, ap->sector_buf, 1))
 		return false;
 	return get_unaligned_le16(&ap->sector_buf[log * 2]) ? true : false;
@@ -4073,6 +4076,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
 	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
 
+	/*
+	 * This sata dom device goes on a walkabout when the ATA_LOG_DIRECTORY
+	 * log page is accessed. Ensure we never ask for this log page with
+	 * these devices.
+	 */
+	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_LOG_DIR },
+
 	/* End Marker */
 	{ }
 };
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 605756f645be..7f99b4d78822 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -380,6 +380,7 @@ enum {
 	ATA_HORKAGE_MAX_TRIM_128M = (1 << 26),	/* Limit max trim size to 128M */
 	ATA_HORKAGE_NO_NCQ_ON_ATI = (1 << 27),	/* Disable NCQ on ATI chipset */
 	ATA_HORKAGE_NO_ID_DEV_LOG = (1 << 28),	/* Identify device log missing */
+	ATA_HORKAGE_NO_LOG_DIR	= (1 << 29),	/* Do not read log directory */
 
 	 /* DMA mask for user DMA control: User visible values; DO NOT
 	    renumber */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 10:45 ` Damien Le Moal
@ 2022-02-02 12:25   ` Anton Lundin
  2022-02-02 12:48     ` Damien Le Moal
  0 siblings, 1 reply; 15+ messages in thread
From: Anton Lundin @ 2022-02-02 12:25 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 02 February, 2022 - Damien Le Moal wrote:

> On 2/2/22 19:05, Anton Lundin wrote:
> > Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> > a read of ATA_LOG_DIRECTORY page was added. This caused the
> > SATADOM-ML 3ME to lock up.
> > 
> > In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > a flag was added to cache if a device supports this or not.
> > 
> > This adds a blacklist entry which flags that these devices doesn't
> > support that call and shouldn't be issued that call.
> > 
> > Cc: stable@vger.kernel.org # v5.10+
> > Signed-off-by: Anton Lundin <glance@acc.umu.se>
> > Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> 
> I do not think so. See below.
> 
> > ---
> >  drivers/ata/libata-core.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> > index 87d36b29ca5f..e024af9f33d0 100644
> > --- a/drivers/ata/libata-core.c
> > +++ b/drivers/ata/libata-core.c
> > @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >  
> > +	/*
> > +	 * This sata dom goes on a walkabout when it sees the
> > +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> > +	 * request to these devices.
> > +	 */
> > +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> 
> This flag only disables trying to access the identify device log page,
> it does *not* avoid access to the log directory log page in general. The
> log directory will still be consulted for other log pages beside the
> identify device log page, from any function that calls
> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> ata_dev_config_ncq_non_data())

Non of those code paths are called for this device, probably due to some
other flag disqualifying them.
 
> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> and test for it in ata_log_supported(), completely preventing any access
> to the log directory page for this drive type.

That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
which was in the calling path that actually triggered this issue.

But, yes, I totally agree that's a more solid solution preventing this
kind of issue to crop up again.

> > +
> >  	/* End Marker */
> >  	{ }
> >  };
> 
> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> and a Fixes tag.

I'd think it's fitting to send it to linux-stable, because it prevents
those DOM's from working in v5.15.5+.

Ok. I must have missed that part when I read submitting-patches doc.

I'll rework and re-submit the patch.


//Anton

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 12:25   ` Anton Lundin
@ 2022-02-02 12:48     ` Damien Le Moal
  2022-02-02 13:09       ` Anton Lundin
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-02 12:48 UTC (permalink / raw)
  To: Anton Lundin; +Cc: linux-ide

On 2022/02/02 21:25, Anton Lundin wrote:
> On 02 February, 2022 - Damien Le Moal wrote:
> 
>> On 2/2/22 19:05, Anton Lundin wrote:
>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
>>> SATADOM-ML 3ME to lock up.
>>>
>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>> a flag was added to cache if a device supports this or not.
>>>
>>> This adds a blacklist entry which flags that these devices doesn't
>>> support that call and shouldn't be issued that call.
>>>
>>> Cc: stable@vger.kernel.org # v5.10+
>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>
>> I do not think so. See below.
>>
>>> ---
>>>  drivers/ata/libata-core.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>> index 87d36b29ca5f..e024af9f33d0 100644
>>> --- a/drivers/ata/libata-core.c
>>> +++ b/drivers/ata/libata-core.c
>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>  
>>> +	/*
>>> +	 * This sata dom goes on a walkabout when it sees the
>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
>>> +	 * request to these devices.
>>> +	 */
>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
>>
>> This flag only disables trying to access the identify device log page,
>> it does *not* avoid access to the log directory log page in general. The
>> log directory will still be consulted for other log pages beside the
>> identify device log page, from any function that calls
>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
>> ata_dev_config_ncq_non_data())
> 
> Non of those code paths are called for this device, probably due to some
> other flag disqualifying them.
>  
>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
>> and test for it in ata_log_supported(), completely preventing any access
>> to the log directory page for this drive type.
> 
> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> which was in the calling path that actually triggered this issue.
> 
> But, yes, I totally agree that's a more solid solution preventing this
> kind of issue to crop up again.
> 
>>> +
>>>  	/* End Marker */
>>>  	{ }
>>>  };
>>
>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
>> and a Fixes tag.
> 
> I'd think it's fitting to send it to linux-stable, because it prevents
> those DOM's from working in v5.15.5+.
> 
> Ok. I must have missed that part when I read submitting-patches doc.
> 
> I'll rework and re-submit the patch.

I sent you a draft patch. Please try it.

Also, to confirm if the log directory log page is indeed the page that locks up
the drive, can you try this command:

sg_sat_read_gplog --dma --log=0 --page=0 --readonly

and

sg_sat_read_gplog --log=0 --page=0 --readonly

Beware that if the log drectory is the problem, this will lockup your drive !
If that is OK and works, then please try:

sg_sat_read_gplog --dma --log=48 --page=0 --readonly

and

sg_sat_read_gplog --log=48 --page=0 --readonly

log=48 (0x30) is the identify device log page. log=0 is the log directory log page.

> 
> 
> //Anton


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 12:48     ` Damien Le Moal
@ 2022-02-02 13:09       ` Anton Lundin
  2022-02-02 13:54         ` Anton Lundin
  2022-02-02 22:04         ` Damien Le Moal
  0 siblings, 2 replies; 15+ messages in thread
From: Anton Lundin @ 2022-02-02 13:09 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 02 February, 2022 - Damien Le Moal wrote:

> On 2022/02/02 21:25, Anton Lundin wrote:
> > On 02 February, 2022 - Damien Le Moal wrote:
> > 
> >> On 2/2/22 19:05, Anton Lundin wrote:
> >>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> >>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> >>> SATADOM-ML 3ME to lock up.
> >>>
> >>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>> a flag was added to cache if a device supports this or not.
> >>>
> >>> This adds a blacklist entry which flags that these devices doesn't
> >>> support that call and shouldn't be issued that call.
> >>>
> >>> Cc: stable@vger.kernel.org # v5.10+
> >>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> >>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>
> >> I do not think so. See below.
> >>
> >>> ---
> >>>  drivers/ata/libata-core.c | 7 +++++++
> >>>  1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>> index 87d36b29ca5f..e024af9f33d0 100644
> >>> --- a/drivers/ata/libata-core.c
> >>> +++ b/drivers/ata/libata-core.c
> >>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>  
> >>> +	/*
> >>> +	 * This sata dom goes on a walkabout when it sees the
> >>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> >>> +	 * request to these devices.
> >>> +	 */
> >>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> >>
> >> This flag only disables trying to access the identify device log page,
> >> it does *not* avoid access to the log directory log page in general. The
> >> log directory will still be consulted for other log pages beside the
> >> identify device log page, from any function that calls
> >> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> >> ata_dev_config_ncq_non_data())
> > 
> > Non of those code paths are called for this device, probably due to some
> > other flag disqualifying them.
> >  
> >> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> >> and test for it in ata_log_supported(), completely preventing any access
> >> to the log directory page for this drive type.
> > 
> > That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> > which was in the calling path that actually triggered this issue.
> > 
> > But, yes, I totally agree that's a more solid solution preventing this
> > kind of issue to crop up again.
> > 
> >>> +
> >>>  	/* End Marker */
> >>>  	{ }
> >>>  };
> >>
> >> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> >> and a Fixes tag.
> > 
> > I'd think it's fitting to send it to linux-stable, because it prevents
> > those DOM's from working in v5.15.5+.
> > 
> > Ok. I must have missed that part when I read submitting-patches doc.
> > 
> > I'll rework and re-submit the patch.
> 
> I sent you a draft patch. Please try it.

Works like a charm.

> Also, to confirm if the log directory log page is indeed the page that locks up
> the drive, can you try this command:
> 
> sg_sat_read_gplog --dma --log=0 --page=0 --readonly

# sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
ATA PASS-THROUGH (16), bad field in cdb
sg_sat_read_gplog failed: Illegal request

> and
> 
> sg_sat_read_gplog --log=0 --page=0 --readonly

# sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
 00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
 f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 

Mind you, this with a patched kernel, if that affects anything.
 
> Beware that if the log drectory is the problem, this will lockup your drive !
> If that is OK and works, then please try:
> 
> sg_sat_read_gplog --dma --log=48 --page=0 --readonly

# sg_sat_read_gplog --dma --log=48 --page=0 --readonly /dev/sda
ATA PASS-THROUGH (16), bad field in cdb
sg_sat_read_gplog failed: Illegal request

> and
> 
> sg_sat_read_gplog --log=48 --page=0 --readonly

sg_sat_read_gplog --log=48 --page=0 --readonly /dev/sda
ATA PASS-THROUGH (16), bad field in cdb
sg_sat_read_gplog failed: Illegal request
 
> log=48 (0x30) is the identify device log page. log=0 is the log directory log page.
 
So, does this means that it might be where in the init of the device the
directory log page is accessed that causes the device to fault?

A snippet from a failed boot before the patch looks like:
ata1.00: qc timeout (cmd 0x2f)
ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
ata1.00: ATA Identify Device Log not supported
ata1.00: failed to set xfermode (err_mask=0x40)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
ata1.00: revalidation failed (errno=-5)
ata1: limiting SATA link speed to 3.0 Gbps
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata1.00: qc timeout (cmd 0x2f)
ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
ata1.00: ATA Identify Device Log not supported
ata1.00: failed to set xfermode (err_mask=0x40)
ata1.00: disabled
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)


When I messed around trying to figure out what the fault was I saw a
couple of different Emask, 0x4, 0x40 and 0x80.

I also blindly played around with adding a ata_msleep(ap, 500); after
the failed attempt to access the log directory log page, but that didn't
help.

//Anton

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 12:07 ` Damien Le Moal
@ 2022-02-02 13:12   ` Anton Lundin
  0 siblings, 0 replies; 15+ messages in thread
From: Anton Lundin @ 2022-02-02 13:12 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide, stable

On 02 February, 2022 - Damien Le Moal wrote:

> On 2/2/22 19:05, Anton Lundin wrote:
> > Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> > a read of ATA_LOG_DIRECTORY page was added. This caused the
> > SATADOM-ML 3ME to lock up.
> > 
> > In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > a flag was added to cache if a device supports this or not.
> > 
> > This adds a blacklist entry which flags that these devices doesn't
> > support that call and shouldn't be issued that call.
> > 
> > Cc: stable@vger.kernel.org # v5.10+
> > Signed-off-by: Anton Lundin <glance@acc.umu.se>
> > Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > ---
> >  drivers/ata/libata-core.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> > index 87d36b29ca5f..e024af9f33d0 100644
> > --- a/drivers/ata/libata-core.c
> > +++ b/drivers/ata/libata-core.c
> > @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >  
> > +	/*
> > +	 * This sata dom goes on a walkabout when it sees the
> > +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> > +	 * request to these devices.
> > +	 */
> > +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> > +
> >  	/* End Marker */
> >  	{ }
> >  };
> 
> Can you try the attached patch ?
> 
> I think it is important to confirm if the lockup on your drive happens
> due to reads of the log directory log page or due to reads of the
> identify device log page. The attached patch prevents the former, your
> patch prevents the latter. If your patch is all that is needed, then it
> is good, modulo some rephrasing of the commit message and comments.
 
This patch also fixes the issue with these devices. From what I've
previously concluded is that nothing else requests the log directory log
page on these devices.


//Anton

-- 
Anton Lundin	+46702-161604

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 13:09       ` Anton Lundin
@ 2022-02-02 13:54         ` Anton Lundin
  2022-02-02 22:00           ` Damien Le Moal
  2022-02-02 22:04         ` Damien Le Moal
  1 sibling, 1 reply; 15+ messages in thread
From: Anton Lundin @ 2022-02-02 13:54 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 02 February, 2022 - Anton Lundin wrote:

> On 02 February, 2022 - Damien Le Moal wrote:
> 
> > On 2022/02/02 21:25, Anton Lundin wrote:
> > > On 02 February, 2022 - Damien Le Moal wrote:
> > > 
> > >> On 2/2/22 19:05, Anton Lundin wrote:
> > >>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> > >>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> > >>> SATADOM-ML 3ME to lock up.
> > >>>
> > >>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > >>> a flag was added to cache if a device supports this or not.
> > >>>
> > >>> This adds a blacklist entry which flags that these devices doesn't
> > >>> support that call and shouldn't be issued that call.
> > >>>
> > >>> Cc: stable@vger.kernel.org # v5.10+
> > >>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> > >>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> > >>
> > >> I do not think so. See below.
> > >>
> > >>> ---
> > >>>  drivers/ata/libata-core.c | 7 +++++++
> > >>>  1 file changed, 7 insertions(+)
> > >>>
> > >>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> > >>> index 87d36b29ca5f..e024af9f33d0 100644
> > >>> --- a/drivers/ata/libata-core.c
> > >>> +++ b/drivers/ata/libata-core.c
> > >>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> > >>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> > >>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> > >>>  
> > >>> +	/*
> > >>> +	 * This sata dom goes on a walkabout when it sees the
> > >>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> > >>> +	 * request to these devices.
> > >>> +	 */
> > >>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> > >>
> > >> This flag only disables trying to access the identify device log page,
> > >> it does *not* avoid access to the log directory log page in general. The
> > >> log directory will still be consulted for other log pages beside the
> > >> identify device log page, from any function that calls
> > >> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> > >> ata_dev_config_ncq_non_data())
> > > 
> > > Non of those code paths are called for this device, probably due to some
> > > other flag disqualifying them.
> > >  
> > >> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> > >> and test for it in ata_log_supported(), completely preventing any access
> > >> to the log directory page for this drive type.
> > > 
> > > That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> > > which was in the calling path that actually triggered this issue.
> > > 
> > > But, yes, I totally agree that's a more solid solution preventing this
> > > kind of issue to crop up again.
> > > 
> > >>> +
> > >>>  	/* End Marker */
> > >>>  	{ }
> > >>>  };
> > >>
> > >> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> > >> and a Fixes tag.
> > > 
> > > I'd think it's fitting to send it to linux-stable, because it prevents
> > > those DOM's from working in v5.15.5+.
> > > 
> > > Ok. I must have missed that part when I read submitting-patches doc.
> > > 
> > > I'll rework and re-submit the patch.
> > 
> > I sent you a draft patch. Please try it.
> 
> Works like a charm.
> 
> > Also, to confirm if the log directory log page is indeed the page that locks up
> > the drive, can you try this command:
> > 
> > sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
> > and
> > 
> > sg_sat_read_gplog --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> 
> Mind you, this with a patched kernel, if that affects anything.

Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
hangs for a while:

# time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
ATA PASS-THROUGH (16), bad field in cdb
sg_sat_read_gplog failed: Illegal request

real	0m28.337s
user	0m0.000s
sys	0m0.001s

and logs the following in the kernel log:

ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
ata1.00: revalidation failed (errno=-5)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: configured for UDMA/133


But after that 30s walkabout and whatever the kernel does the device
starts functioning again.

//Anton

  
> > Beware that if the log drectory is the problem, this will lockup your drive !
> > If that is OK and works, then please try:
> > 
> > sg_sat_read_gplog --dma --log=48 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
> > and
> > 
> > sg_sat_read_gplog --log=48 --page=0 --readonly
> 
> sg_sat_read_gplog --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
>  
> > log=48 (0x30) is the identify device log page. log=0 is the log directory log page.
>  
> So, does this means that it might be where in the init of the device the
> directory log page is accessed that causes the device to fault?
> 
> A snippet from a failed boot before the patch looks like:
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> ata1.00: revalidation failed (errno=-5)
> ata1: limiting SATA link speed to 3.0 Gbps
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1.00: disabled
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> 
> 
> When I messed around trying to figure out what the fault was I saw a
> couple of different Emask, 0x4, 0x40 and 0x80.
> 
> I also blindly played around with adding a ata_msleep(ap, 500); after
> the failed attempt to access the log directory log page, but that didn't
> help.
> 
> //Anton

-- 
Anton Lundin	+46702-161604

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 13:54         ` Anton Lundin
@ 2022-02-02 22:00           ` Damien Le Moal
  2022-02-03  8:32             ` Anton Lundin
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-02 22:00 UTC (permalink / raw)
  To: Anton Lundin; +Cc: linux-ide

On 2022/02/02 22:54, Anton Lundin wrote:
> On 02 February, 2022 - Anton Lundin wrote:
> 
>> On 02 February, 2022 - Damien Le Moal wrote:
>>
>>> On 2022/02/02 21:25, Anton Lundin wrote:
>>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>>
>>>>> On 2/2/22 19:05, Anton Lundin wrote:
>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
>>>>>> SATADOM-ML 3ME to lock up.
>>>>>>
>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>> a flag was added to cache if a device supports this or not.
>>>>>>
>>>>>> This adds a blacklist entry which flags that these devices doesn't
>>>>>> support that call and shouldn't be issued that call.
>>>>>>
>>>>>> Cc: stable@vger.kernel.org # v5.10+
>>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>
>>>>> I do not think so. See below.
>>>>>
>>>>>> ---
>>>>>>  drivers/ata/libata-core.c | 7 +++++++
>>>>>>  1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>>>>> index 87d36b29ca5f..e024af9f33d0 100644
>>>>>> --- a/drivers/ata/libata-core.c
>>>>>> +++ b/drivers/ata/libata-core.c
>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>  
>>>>>> +	/*
>>>>>> +	 * This sata dom goes on a walkabout when it sees the
>>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
>>>>>> +	 * request to these devices.
>>>>>> +	 */
>>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
>>>>>
>>>>> This flag only disables trying to access the identify device log page,
>>>>> it does *not* avoid access to the log directory log page in general. The
>>>>> log directory will still be consulted for other log pages beside the
>>>>> identify device log page, from any function that calls
>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
>>>>> ata_dev_config_ncq_non_data())
>>>>
>>>> Non of those code paths are called for this device, probably due to some
>>>> other flag disqualifying them.
>>>>  
>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
>>>>> and test for it in ata_log_supported(), completely preventing any access
>>>>> to the log directory page for this drive type.
>>>>
>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
>>>> which was in the calling path that actually triggered this issue.
>>>>
>>>> But, yes, I totally agree that's a more solid solution preventing this
>>>> kind of issue to crop up again.
>>>>
>>>>>> +
>>>>>>  	/* End Marker */
>>>>>>  	{ }
>>>>>>  };
>>>>>
>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
>>>>> and a Fixes tag.
>>>>
>>>> I'd think it's fitting to send it to linux-stable, because it prevents
>>>> those DOM's from working in v5.15.5+.
>>>>
>>>> Ok. I must have missed that part when I read submitting-patches doc.
>>>>
>>>> I'll rework and re-submit the patch.
>>>
>>> I sent you a draft patch. Please try it.
>>
>> Works like a charm.
>>
>>> Also, to confirm if the log directory log page is indeed the page that locks up
>>> the drive, can you try this command:
>>>
>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
>>
>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
>> ATA PASS-THROUGH (16), bad field in cdb
>> sg_sat_read_gplog failed: Illegal request
>>
>>> and
>>>
>>> sg_sat_read_gplog --log=0 --page=0 --readonly
>>
>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>
>> Mind you, this with a patched kernel, if that affects anything.
> 
> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
> hangs for a while:
> 
> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
> real	0m28.337s
> user	0m0.000s
> sys	0m0.001s
> 
> and logs the following in the kernel log:
> 
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> ata1.00: revalidation failed (errno=-5)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: configured for UDMA/133
> 
> 
> But after that 30s walkabout and whatever the kernel does the device
> starts functioning again.

The 30s "hang" is the default command timeout: your drive is not responding to
the DMA version of READ LOG EXT command. There are some drives out there like
that. So instead of completely disabling access to the log directory, we should
simply force the use of READ LOG EXT. And for that, there is the horkage flag
ATA_HORKAGE_NO_DMA_LOG.

Can you try using that one without the patch I sent ?




-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 13:09       ` Anton Lundin
  2022-02-02 13:54         ` Anton Lundin
@ 2022-02-02 22:04         ` Damien Le Moal
  1 sibling, 0 replies; 15+ messages in thread
From: Damien Le Moal @ 2022-02-02 22:04 UTC (permalink / raw)
  To: Anton Lundin; +Cc: linux-ide

On 2022/02/02 22:09, Anton Lundin wrote:
> On 02 February, 2022 - Damien Le Moal wrote:
> 
>> On 2022/02/02 21:25, Anton Lundin wrote:
>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>
>>>> On 2/2/22 19:05, Anton Lundin wrote:
>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
>>>>> SATADOM-ML 3ME to lock up.
>>>>>
>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>> a flag was added to cache if a device supports this or not.
>>>>>
>>>>> This adds a blacklist entry which flags that these devices doesn't
>>>>> support that call and shouldn't be issued that call.
>>>>>
>>>>> Cc: stable@vger.kernel.org # v5.10+
>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>
>>>> I do not think so. See below.
>>>>
>>>>> ---
>>>>>  drivers/ata/libata-core.c | 7 +++++++
>>>>>  1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>>>> index 87d36b29ca5f..e024af9f33d0 100644
>>>>> --- a/drivers/ata/libata-core.c
>>>>> +++ b/drivers/ata/libata-core.c
>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>  
>>>>> +	/*
>>>>> +	 * This sata dom goes on a walkabout when it sees the
>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
>>>>> +	 * request to these devices.
>>>>> +	 */
>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
>>>>
>>>> This flag only disables trying to access the identify device log page,
>>>> it does *not* avoid access to the log directory log page in general. The
>>>> log directory will still be consulted for other log pages beside the
>>>> identify device log page, from any function that calls
>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
>>>> ata_dev_config_ncq_non_data())
>>>
>>> Non of those code paths are called for this device, probably due to some
>>> other flag disqualifying them.
>>>  
>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
>>>> and test for it in ata_log_supported(), completely preventing any access
>>>> to the log directory page for this drive type.
>>>
>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
>>> which was in the calling path that actually triggered this issue.
>>>
>>> But, yes, I totally agree that's a more solid solution preventing this
>>> kind of issue to crop up again.
>>>
>>>>> +
>>>>>  	/* End Marker */
>>>>>  	{ }
>>>>>  };
>>>>
>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
>>>> and a Fixes tag.
>>>
>>> I'd think it's fitting to send it to linux-stable, because it prevents
>>> those DOM's from working in v5.15.5+.
>>>
>>> Ok. I must have missed that part when I read submitting-patches doc.
>>>
>>> I'll rework and re-submit the patch.
>>
>> I sent you a draft patch. Please try it.
> 
> Works like a charm.
> 
>> Also, to confirm if the log directory log page is indeed the page that locks up
>> the drive, can you try this command:
>>
>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
>> and
>>
>> sg_sat_read_gplog --log=0 --page=0 --readonly
> 
> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> 
> Mind you, this with a patched kernel, if that affects anything.

So your drive does support the log diretory, but it definitely does not like the
DMA version of READ LOG EXT command. And from the result above, it looks like
the only log page that the drive support is the log directory :)

As mentioned in my other reply, please try the ATA_HORKAGE_NO_DMA_LOG flag.

>  
>> Beware that if the log drectory is the problem, this will lockup your drive !
>> If that is OK and works, then please try:
>>
>> sg_sat_read_gplog --dma --log=48 --page=0 --readonly
> 
> # sg_sat_read_gplog --dma --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
> 
>> and
>>
>> sg_sat_read_gplog --log=48 --page=0 --readonly
> 
> sg_sat_read_gplog --log=48 --page=0 --readonly /dev/sda
> ATA PASS-THROUGH (16), bad field in cdb
> sg_sat_read_gplog failed: Illegal request
>  
>> log=48 (0x30) is the identify device log page. log=0 is the log directory log page.
>  
> So, does this means that it might be where in the init of the device the
> directory log page is accessed that causes the device to fault?
> 
> A snippet from a failed boot before the patch looks like:
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> ata1.00: revalidation failed (errno=-5)
> ata1: limiting SATA link speed to 3.0 Gbps
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1.00: disabled
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> 
> 
> When I messed around trying to figure out what the fault was I saw a
> couple of different Emask, 0x4, 0x40 and 0x80.
> 
> I also blindly played around with adding a ata_msleep(ap, 500); after
> the failed attempt to access the log directory log page, but that didn't
> help.
> 
> //Anton


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-02 22:00           ` Damien Le Moal
@ 2022-02-03  8:32             ` Anton Lundin
  2022-02-03  8:40               ` Damien Le Moal
  0 siblings, 1 reply; 15+ messages in thread
From: Anton Lundin @ 2022-02-03  8:32 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 03 February, 2022 - Damien Le Moal wrote:

> On 2022/02/02 22:54, Anton Lundin wrote:
> > On 02 February, 2022 - Anton Lundin wrote:
> > 
> >> On 02 February, 2022 - Damien Le Moal wrote:
> >>
> >>> On 2022/02/02 21:25, Anton Lundin wrote:
> >>>> On 02 February, 2022 - Damien Le Moal wrote:
> >>>>
> >>>>> On 2/2/22 19:05, Anton Lundin wrote:
> >>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> >>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> >>>>>> SATADOM-ML 3ME to lock up.
> >>>>>>
> >>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>> a flag was added to cache if a device supports this or not.
> >>>>>>
> >>>>>> This adds a blacklist entry which flags that these devices doesn't
> >>>>>> support that call and shouldn't be issued that call.
> >>>>>>
> >>>>>> Cc: stable@vger.kernel.org # v5.10+
> >>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> >>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>
> >>>>> I do not think so. See below.
> >>>>>
> >>>>>> ---
> >>>>>>  drivers/ata/libata-core.c | 7 +++++++
> >>>>>>  1 file changed, 7 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>>>>> index 87d36b29ca5f..e024af9f33d0 100644
> >>>>>> --- a/drivers/ata/libata-core.c
> >>>>>> +++ b/drivers/ata/libata-core.c
> >>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>  
> >>>>>> +	/*
> >>>>>> +	 * This sata dom goes on a walkabout when it sees the
> >>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> >>>>>> +	 * request to these devices.
> >>>>>> +	 */
> >>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> >>>>>
> >>>>> This flag only disables trying to access the identify device log page,
> >>>>> it does *not* avoid access to the log directory log page in general. The
> >>>>> log directory will still be consulted for other log pages beside the
> >>>>> identify device log page, from any function that calls
> >>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> >>>>> ata_dev_config_ncq_non_data())
> >>>>
> >>>> Non of those code paths are called for this device, probably due to some
> >>>> other flag disqualifying them.
> >>>>  
> >>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> >>>>> and test for it in ata_log_supported(), completely preventing any access
> >>>>> to the log directory page for this drive type.
> >>>>
> >>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> >>>> which was in the calling path that actually triggered this issue.
> >>>>
> >>>> But, yes, I totally agree that's a more solid solution preventing this
> >>>> kind of issue to crop up again.
> >>>>
> >>>>>> +
> >>>>>>  	/* End Marker */
> >>>>>>  	{ }
> >>>>>>  };
> >>>>>
> >>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> >>>>> and a Fixes tag.
> >>>>
> >>>> I'd think it's fitting to send it to linux-stable, because it prevents
> >>>> those DOM's from working in v5.15.5+.
> >>>>
> >>>> Ok. I must have missed that part when I read submitting-patches doc.
> >>>>
> >>>> I'll rework and re-submit the patch.
> >>>
> >>> I sent you a draft patch. Please try it.
> >>
> >> Works like a charm.
> >>
> >>> Also, to confirm if the log directory log page is indeed the page that locks up
> >>> the drive, can you try this command:
> >>>
> >>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> >>
> >> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> >> ATA PASS-THROUGH (16), bad field in cdb
> >> sg_sat_read_gplog failed: Illegal request
> >>
> >>> and
> >>>
> >>> sg_sat_read_gplog --log=0 --page=0 --readonly
> >>
> >> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> >>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>
> >> Mind you, this with a patched kernel, if that affects anything.
> > 
> > Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
> > hangs for a while:
> > 
> > # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> > ATA PASS-THROUGH (16), bad field in cdb
> > sg_sat_read_gplog failed: Illegal request
> > 
> > real	0m28.337s
> > user	0m0.000s
> > sys	0m0.001s
> > 
> > and logs the following in the kernel log:
> > 
> > ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> > ata1.00: revalidation failed (errno=-5)
> > ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > ata1.00: configured for UDMA/133
> > 
> > 
> > But after that 30s walkabout and whatever the kernel does the device
> > starts functioning again.
> 
> The 30s "hang" is the default command timeout: your drive is not responding to
> the DMA version of READ LOG EXT command. There are some drives out there like
> that. So instead of completely disabling access to the log directory, we should
> simply force the use of READ LOG EXT. And for that, there is the horkage flag
> ATA_HORKAGE_NO_DMA_LOG.
> 
> Can you try using that one without the patch I sent ?

I think I tested that before submitting the patch, but re-tested it now
and that doesn't work:

This is on a 5.15 kernel:

ata1.00: qc timeout (cmd 0x2f)
ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
ata1.00: ATA Identify Device Log not supported
ata1.00: failed to set xfermode (err_mask=0x40)
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
ata1.00: revalidation failed (errno=-5)
ata1: limiting SATA link speed to 3.0 Gbps
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata1.00: qc timeout (cmd 0x2f)
ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
ata1.00: ATA Identify Device Log not supported
ata1.00: failed to set xfermode (err_mask=0x40)
ata1.00: disabled

On a 5.16+ with the "libata: add horkage for missing Identify Device
log" patch, the machine will boot because it won't re-try to access the
ata log directory after the reset to 3.0 Gbps sata.


//Anton

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-03  8:32             ` Anton Lundin
@ 2022-02-03  8:40               ` Damien Le Moal
  2022-02-03  9:11                 ` Anton Lundin
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-03  8:40 UTC (permalink / raw)
  To: Anton Lundin; +Cc: linux-ide

On 2/3/22 17:32, Anton Lundin wrote:
> On 03 February, 2022 - Damien Le Moal wrote:
> 
>> On 2022/02/02 22:54, Anton Lundin wrote:
>>> On 02 February, 2022 - Anton Lundin wrote:
>>>
>>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>>
>>>>> On 2022/02/02 21:25, Anton Lundin wrote:
>>>>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>>>>
>>>>>>> On 2/2/22 19:05, Anton Lundin wrote:
>>>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
>>>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
>>>>>>>> SATADOM-ML 3ME to lock up.
>>>>>>>>
>>>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>>>> a flag was added to cache if a device supports this or not.
>>>>>>>>
>>>>>>>> This adds a blacklist entry which flags that these devices doesn't
>>>>>>>> support that call and shouldn't be issued that call.
>>>>>>>>
>>>>>>>> Cc: stable@vger.kernel.org # v5.10+
>>>>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
>>>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>>>
>>>>>>> I do not think so. See below.
>>>>>>>
>>>>>>>> ---
>>>>>>>>  drivers/ata/libata-core.c | 7 +++++++
>>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>>>>>>> index 87d36b29ca5f..e024af9f33d0 100644
>>>>>>>> --- a/drivers/ata/libata-core.c
>>>>>>>> +++ b/drivers/ata/libata-core.c
>>>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>>>>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>>>  
>>>>>>>> +	/*
>>>>>>>> +	 * This sata dom goes on a walkabout when it sees the
>>>>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
>>>>>>>> +	 * request to these devices.
>>>>>>>> +	 */
>>>>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
>>>>>>>
>>>>>>> This flag only disables trying to access the identify device log page,
>>>>>>> it does *not* avoid access to the log directory log page in general. The
>>>>>>> log directory will still be consulted for other log pages beside the
>>>>>>> identify device log page, from any function that calls
>>>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
>>>>>>> ata_dev_config_ncq_non_data())
>>>>>>
>>>>>> Non of those code paths are called for this device, probably due to some
>>>>>> other flag disqualifying them.
>>>>>>  
>>>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
>>>>>>> and test for it in ata_log_supported(), completely preventing any access
>>>>>>> to the log directory page for this drive type.
>>>>>>
>>>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
>>>>>> which was in the calling path that actually triggered this issue.
>>>>>>
>>>>>> But, yes, I totally agree that's a more solid solution preventing this
>>>>>> kind of issue to crop up again.
>>>>>>
>>>>>>>> +
>>>>>>>>  	/* End Marker */
>>>>>>>>  	{ }
>>>>>>>>  };
>>>>>>>
>>>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
>>>>>>> and a Fixes tag.
>>>>>>
>>>>>> I'd think it's fitting to send it to linux-stable, because it prevents
>>>>>> those DOM's from working in v5.15.5+.
>>>>>>
>>>>>> Ok. I must have missed that part when I read submitting-patches doc.
>>>>>>
>>>>>> I'll rework and re-submit the patch.
>>>>>
>>>>> I sent you a draft patch. Please try it.
>>>>
>>>> Works like a charm.
>>>>
>>>>> Also, to confirm if the log directory log page is indeed the page that locks up
>>>>> the drive, can you try this command:
>>>>>
>>>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
>>>>
>>>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
>>>> ATA PASS-THROUGH (16), bad field in cdb
>>>> sg_sat_read_gplog failed: Illegal request
>>>>
>>>>> and
>>>>>
>>>>> sg_sat_read_gplog --log=0 --page=0 --readonly
>>>>
>>>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>>>>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>
>>>> Mind you, this with a patched kernel, if that affects anything.
>>>
>>> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
>>> hangs for a while:
>>>
>>> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>>> ATA PASS-THROUGH (16), bad field in cdb
>>> sg_sat_read_gplog failed: Illegal request
>>>
>>> real	0m28.337s
>>> user	0m0.000s
>>> sys	0m0.001s
>>>
>>> and logs the following in the kernel log:
>>>
>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
>>> ata1.00: revalidation failed (errno=-5)
>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>> ata1.00: configured for UDMA/133
>>>
>>>
>>> But after that 30s walkabout and whatever the kernel does the device
>>> starts functioning again.
>>
>> The 30s "hang" is the default command timeout: your drive is not responding to
>> the DMA version of READ LOG EXT command. There are some drives out there like
>> that. So instead of completely disabling access to the log directory, we should
>> simply force the use of READ LOG EXT. And for that, there is the horkage flag
>> ATA_HORKAGE_NO_DMA_LOG.
>>
>> Can you try using that one without the patch I sent ?
> 
> I think I tested that before submitting the patch, but re-tested it now
> and that doesn't work:

Your original patch used the ATA_HORKAGE_NO_ID_DEV_LOG flag. What
ATA_HORKAGE_NO_DMA_LOG does is prevent the use of the READ LOG DMA EXT
command. READ LOG EXT command (PIO) will be used. And the sg commands I
sent you and that you tried show that it works:

sg_sat_read_gplog --log=0 --page=0 --readonly

worked perfectly, while:

sg_sat_read_gplog --dma --log=0 --page=0 --readonly

failed.

So by simply preventing the use of READ LOG DMA EXT for your drive, the
kernel will be able to safely get the log directory and test for other
page support.

In your original patch, try replacing ATA_HORKAGE_NO_ID_DEV_LOG with
ATA_HORKAGE_NO_DMA_LOG.

> 
> This is on a 5.15 kernel:
> 
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> ata1.00: revalidation failed (errno=-5)
> ata1: limiting SATA link speed to 3.0 Gbps
> ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> ata1.00: qc timeout (cmd 0x2f)
> ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> ata1.00: ATA Identify Device Log not supported
> ata1.00: failed to set xfermode (err_mask=0x40)
> ata1.00: disabled
> 
> On a 5.16+ with the "libata: add horkage for missing Identify Device
> log" patch, the machine will boot because it won't re-try to access the
> ata log directory after the reset to 3.0 Gbps sata.
> 
> 
> //Anton


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-03  8:40               ` Damien Le Moal
@ 2022-02-03  9:11                 ` Anton Lundin
  2022-02-03  9:21                   ` Damien Le Moal
  0 siblings, 1 reply; 15+ messages in thread
From: Anton Lundin @ 2022-02-03  9:11 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 03 February, 2022 - Damien Le Moal wrote:

> On 2/3/22 17:32, Anton Lundin wrote:
> > On 03 February, 2022 - Damien Le Moal wrote:
> > 
> >> On 2022/02/02 22:54, Anton Lundin wrote:
> >>> On 02 February, 2022 - Anton Lundin wrote:
> >>>
> >>>> On 02 February, 2022 - Damien Le Moal wrote:
> >>>>
> >>>>> On 2022/02/02 21:25, Anton Lundin wrote:
> >>>>>> On 02 February, 2022 - Damien Le Moal wrote:
> >>>>>>
> >>>>>>> On 2/2/22 19:05, Anton Lundin wrote:
> >>>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> >>>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> >>>>>>>> SATADOM-ML 3ME to lock up.
> >>>>>>>>
> >>>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>>>> a flag was added to cache if a device supports this or not.
> >>>>>>>>
> >>>>>>>> This adds a blacklist entry which flags that these devices doesn't
> >>>>>>>> support that call and shouldn't be issued that call.
> >>>>>>>>
> >>>>>>>> Cc: stable@vger.kernel.org # v5.10+
> >>>>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> >>>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>>>
> >>>>>>> I do not think so. See below.
> >>>>>>>
> >>>>>>>> ---
> >>>>>>>>  drivers/ata/libata-core.c | 7 +++++++
> >>>>>>>>  1 file changed, 7 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>>>>>>> index 87d36b29ca5f..e024af9f33d0 100644
> >>>>>>>> --- a/drivers/ata/libata-core.c
> >>>>>>>> +++ b/drivers/ata/libata-core.c
> >>>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >>>>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>>>  
> >>>>>>>> +	/*
> >>>>>>>> +	 * This sata dom goes on a walkabout when it sees the
> >>>>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> >>>>>>>> +	 * request to these devices.
> >>>>>>>> +	 */
> >>>>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> >>>>>>>
> >>>>>>> This flag only disables trying to access the identify device log page,
> >>>>>>> it does *not* avoid access to the log directory log page in general. The
> >>>>>>> log directory will still be consulted for other log pages beside the
> >>>>>>> identify device log page, from any function that calls
> >>>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> >>>>>>> ata_dev_config_ncq_non_data())
> >>>>>>
> >>>>>> Non of those code paths are called for this device, probably due to some
> >>>>>> other flag disqualifying them.
> >>>>>>  
> >>>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> >>>>>>> and test for it in ata_log_supported(), completely preventing any access
> >>>>>>> to the log directory page for this drive type.
> >>>>>>
> >>>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> >>>>>> which was in the calling path that actually triggered this issue.
> >>>>>>
> >>>>>> But, yes, I totally agree that's a more solid solution preventing this
> >>>>>> kind of issue to crop up again.
> >>>>>>
> >>>>>>>> +
> >>>>>>>>  	/* End Marker */
> >>>>>>>>  	{ }
> >>>>>>>>  };
> >>>>>>>
> >>>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> >>>>>>> and a Fixes tag.
> >>>>>>
> >>>>>> I'd think it's fitting to send it to linux-stable, because it prevents
> >>>>>> those DOM's from working in v5.15.5+.
> >>>>>>
> >>>>>> Ok. I must have missed that part when I read submitting-patches doc.
> >>>>>>
> >>>>>> I'll rework and re-submit the patch.
> >>>>>
> >>>>> I sent you a draft patch. Please try it.
> >>>>
> >>>> Works like a charm.
> >>>>
> >>>>> Also, to confirm if the log directory log page is indeed the page that locks up
> >>>>> the drive, can you try this command:
> >>>>>
> >>>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> >>>>
> >>>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> >>>> ATA PASS-THROUGH (16), bad field in cdb
> >>>> sg_sat_read_gplog failed: Illegal request
> >>>>
> >>>>> and
> >>>>>
> >>>>> sg_sat_read_gplog --log=0 --page=0 --readonly
> >>>>
> >>>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> >>>>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>
> >>>> Mind you, this with a patched kernel, if that affects anything.
> >>>
> >>> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
> >>> hangs for a while:
> >>>
> >>> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> >>> ATA PASS-THROUGH (16), bad field in cdb
> >>> sg_sat_read_gplog failed: Illegal request
> >>>
> >>> real	0m28.337s
> >>> user	0m0.000s
> >>> sys	0m0.001s
> >>>
> >>> and logs the following in the kernel log:
> >>>
> >>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> >>> ata1.00: revalidation failed (errno=-5)
> >>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>> ata1.00: configured for UDMA/133
> >>>
> >>>
> >>> But after that 30s walkabout and whatever the kernel does the device
> >>> starts functioning again.
> >>
> >> The 30s "hang" is the default command timeout: your drive is not responding to
> >> the DMA version of READ LOG EXT command. There are some drives out there like
> >> that. So instead of completely disabling access to the log directory, we should
> >> simply force the use of READ LOG EXT. And for that, there is the horkage flag
> >> ATA_HORKAGE_NO_DMA_LOG.
> >>
> >> Can you try using that one without the patch I sent ?
> > 
> > I think I tested that before submitting the patch, but re-tested it now
> > and that doesn't work:
> 
> Your original patch used the ATA_HORKAGE_NO_ID_DEV_LOG flag. What
> ATA_HORKAGE_NO_DMA_LOG does is prevent the use of the READ LOG DMA EXT
> command. READ LOG EXT command (PIO) will be used. And the sg commands I
> sent you and that you tried show that it works:
> 
> sg_sat_read_gplog --log=0 --page=0 --readonly
> 
> worked perfectly, while:
> 
> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> 
> failed.
> 
> So by simply preventing the use of READ LOG DMA EXT for your drive, the
> kernel will be able to safely get the log directory and test for other
> page support.
> 
> In your original patch, try replacing ATA_HORKAGE_NO_ID_DEV_LOG with
> ATA_HORKAGE_NO_DMA_LOG.

I could have bin clearer. This log snippet below is a 5.15.10 with
ATA_HORKAGE_NO_DMA_LOG set for this device.

Shure, sg_sat_read_gplog --log=0 --page=0 --readonly worked when booted,
but as it looks to me, it doesn't work during boot.


//Anton - It's a mystery wrapped in a enigma

 
> > 
> > This is on a 5.15 kernel:
> > 
> > ata1.00: qc timeout (cmd 0x2f)
> > ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> > ata1.00: ATA Identify Device Log not supported
> > ata1.00: failed to set xfermode (err_mask=0x40)
> > ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> > ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> > ata1.00: revalidation failed (errno=-5)
> > ata1: limiting SATA link speed to 3.0 Gbps
> > ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> > ata1.00: qc timeout (cmd 0x2f)
> > ata1.00: Read log 0x00 page 0x00 failed, Emask 0x4
> > ata1.00: ATA Identify Device Log not supported
> > ata1.00: failed to set xfermode (err_mask=0x40)
> > ata1.00: disabled
> > 
> > On a 5.16+ with the "libata: add horkage for missing Identify Device
> > log" patch, the machine will boot because it won't re-try to access the
> > ata log directory after the reset to 3.0 Gbps sata.
> > 
> > 
> > //Anton
> 
> 
> -- 
> Damien Le Moal
> Western Digital Research

-- 
Anton Lundin	+46702-161604

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-03  9:11                 ` Anton Lundin
@ 2022-02-03  9:21                   ` Damien Le Moal
  2022-02-03  9:44                     ` Anton Lundin
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2022-02-03  9:21 UTC (permalink / raw)
  To: Anton Lundin; +Cc: linux-ide

On 2/3/22 18:11, Anton Lundin wrote:
> On 03 February, 2022 - Damien Le Moal wrote:
> 
>> On 2/3/22 17:32, Anton Lundin wrote:
>>> On 03 February, 2022 - Damien Le Moal wrote:
>>>
>>>> On 2022/02/02 22:54, Anton Lundin wrote:
>>>>> On 02 February, 2022 - Anton Lundin wrote:
>>>>>
>>>>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>>>>
>>>>>>> On 2022/02/02 21:25, Anton Lundin wrote:
>>>>>>>> On 02 February, 2022 - Damien Le Moal wrote:
>>>>>>>>
>>>>>>>>> On 2/2/22 19:05, Anton Lundin wrote:
>>>>>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
>>>>>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
>>>>>>>>>> SATADOM-ML 3ME to lock up.
>>>>>>>>>>
>>>>>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>>>>>> a flag was added to cache if a device supports this or not.
>>>>>>>>>>
>>>>>>>>>> This adds a blacklist entry which flags that these devices doesn't
>>>>>>>>>> support that call and shouldn't be issued that call.
>>>>>>>>>>
>>>>>>>>>> Cc: stable@vger.kernel.org # v5.10+
>>>>>>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
>>>>>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
>>>>>>>>>
>>>>>>>>> I do not think so. See below.
>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>  drivers/ata/libata-core.c | 7 +++++++
>>>>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>>>>>>>>>> index 87d36b29ca5f..e024af9f33d0 100644
>>>>>>>>>> --- a/drivers/ata/libata-core.c
>>>>>>>>>> +++ b/drivers/ata/libata-core.c
>>>>>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>>>>>>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
>>>>>>>>>>  
>>>>>>>>>> +	/*
>>>>>>>>>> +	 * This sata dom goes on a walkabout when it sees the
>>>>>>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
>>>>>>>>>> +	 * request to these devices.
>>>>>>>>>> +	 */
>>>>>>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
>>>>>>>>>
>>>>>>>>> This flag only disables trying to access the identify device log page,
>>>>>>>>> it does *not* avoid access to the log directory log page in general. The
>>>>>>>>> log directory will still be consulted for other log pages beside the
>>>>>>>>> identify device log page, from any function that calls
>>>>>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
>>>>>>>>> ata_dev_config_ncq_non_data())
>>>>>>>>
>>>>>>>> Non of those code paths are called for this device, probably due to some
>>>>>>>> other flag disqualifying them.
>>>>>>>>  
>>>>>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
>>>>>>>>> and test for it in ata_log_supported(), completely preventing any access
>>>>>>>>> to the log directory page for this drive type.
>>>>>>>>
>>>>>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
>>>>>>>> which was in the calling path that actually triggered this issue.
>>>>>>>>
>>>>>>>> But, yes, I totally agree that's a more solid solution preventing this
>>>>>>>> kind of issue to crop up again.
>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>>  	/* End Marker */
>>>>>>>>>>  	{ }
>>>>>>>>>>  };
>>>>>>>>>
>>>>>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
>>>>>>>>> and a Fixes tag.
>>>>>>>>
>>>>>>>> I'd think it's fitting to send it to linux-stable, because it prevents
>>>>>>>> those DOM's from working in v5.15.5+.
>>>>>>>>
>>>>>>>> Ok. I must have missed that part when I read submitting-patches doc.
>>>>>>>>
>>>>>>>> I'll rework and re-submit the patch.
>>>>>>>
>>>>>>> I sent you a draft patch. Please try it.
>>>>>>
>>>>>> Works like a charm.
>>>>>>
>>>>>>> Also, to confirm if the log directory log page is indeed the page that locks up
>>>>>>> the drive, can you try this command:
>>>>>>>
>>>>>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
>>>>>>
>>>>>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
>>>>>> ATA PASS-THROUGH (16), bad field in cdb
>>>>>> sg_sat_read_gplog failed: Illegal request
>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> sg_sat_read_gplog --log=0 --page=0 --readonly
>>>>>>
>>>>>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>>>>>>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
>>>>>>
>>>>>> Mind you, this with a patched kernel, if that affects anything.
>>>>>
>>>>> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
>>>>> hangs for a while:
>>>>>
>>>>> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
>>>>> ATA PASS-THROUGH (16), bad field in cdb
>>>>> sg_sat_read_gplog failed: Illegal request
>>>>>
>>>>> real	0m28.337s
>>>>> user	0m0.000s
>>>>> sys	0m0.001s
>>>>>
>>>>> and logs the following in the kernel log:
>>>>>
>>>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
>>>>> ata1.00: revalidation failed (errno=-5)
>>>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
>>>>> ata1.00: configured for UDMA/133
>>>>>
>>>>>
>>>>> But after that 30s walkabout and whatever the kernel does the device
>>>>> starts functioning again.
>>>>
>>>> The 30s "hang" is the default command timeout: your drive is not responding to
>>>> the DMA version of READ LOG EXT command. There are some drives out there like
>>>> that. So instead of completely disabling access to the log directory, we should
>>>> simply force the use of READ LOG EXT. And for that, there is the horkage flag
>>>> ATA_HORKAGE_NO_DMA_LOG.
>>>>
>>>> Can you try using that one without the patch I sent ?
>>>
>>> I think I tested that before submitting the patch, but re-tested it now
>>> and that doesn't work:
>>
>> Your original patch used the ATA_HORKAGE_NO_ID_DEV_LOG flag. What
>> ATA_HORKAGE_NO_DMA_LOG does is prevent the use of the READ LOG DMA EXT
>> command. READ LOG EXT command (PIO) will be used. And the sg commands I
>> sent you and that you tried show that it works:
>>
>> sg_sat_read_gplog --log=0 --page=0 --readonly
>>
>> worked perfectly, while:
>>
>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
>>
>> failed.
>>
>> So by simply preventing the use of READ LOG DMA EXT for your drive, the
>> kernel will be able to safely get the log directory and test for other
>> page support.
>>
>> In your original patch, try replacing ATA_HORKAGE_NO_ID_DEV_LOG with
>> ATA_HORKAGE_NO_DMA_LOG.
> 
> I could have bin clearer. This log snippet below is a 5.15.10 with
> ATA_HORKAGE_NO_DMA_LOG set for this device.
> 
> Shure, sg_sat_read_gplog --log=0 --page=0 --readonly worked when booted,
> but as it looks to me, it doesn't work during boot.

Very weird... So I guess the big-hammer patch introducing
ATA_HORKAGE_NO_LOG_DIR seems like the best solution for your drive.
Please review and test again the patch I sent and post it to linux-ide
for review.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME
  2022-02-03  9:21                   ` Damien Le Moal
@ 2022-02-03  9:44                     ` Anton Lundin
  0 siblings, 0 replies; 15+ messages in thread
From: Anton Lundin @ 2022-02-03  9:44 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-ide

On 03 February, 2022 - Damien Le Moal wrote:

> On 2/3/22 18:11, Anton Lundin wrote:
> > On 03 February, 2022 - Damien Le Moal wrote:
> > 
> >> On 2/3/22 17:32, Anton Lundin wrote:
> >>> On 03 February, 2022 - Damien Le Moal wrote:
> >>>
> >>>> On 2022/02/02 22:54, Anton Lundin wrote:
> >>>>> On 02 February, 2022 - Anton Lundin wrote:
> >>>>>
> >>>>>> On 02 February, 2022 - Damien Le Moal wrote:
> >>>>>>
> >>>>>>> On 2022/02/02 21:25, Anton Lundin wrote:
> >>>>>>>> On 02 February, 2022 - Damien Le Moal wrote:
> >>>>>>>>
> >>>>>>>>> On 2/2/22 19:05, Anton Lundin wrote:
> >>>>>>>>>> Back in 06f6c4c6c3e8 ("ata: libata: add missing ata_identify_page_supported() calls")
> >>>>>>>>>> a read of ATA_LOG_DIRECTORY page was added. This caused the
> >>>>>>>>>> SATADOM-ML 3ME to lock up.
> >>>>>>>>>>
> >>>>>>>>>> In 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>>>>>> a flag was added to cache if a device supports this or not.
> >>>>>>>>>>
> >>>>>>>>>> This adds a blacklist entry which flags that these devices doesn't
> >>>>>>>>>> support that call and shouldn't be issued that call.
> >>>>>>>>>>
> >>>>>>>>>> Cc: stable@vger.kernel.org # v5.10+
> >>>>>>>>>> Signed-off-by: Anton Lundin <glance@acc.umu.se>
> >>>>>>>>>> Depends-on: 636f6e2af4fb ("libata: add horkage for missing Identify Device log")
> >>>>>>>>>
> >>>>>>>>> I do not think so. See below.
> >>>>>>>>>
> >>>>>>>>>> ---
> >>>>>>>>>>  drivers/ata/libata-core.c | 7 +++++++
> >>>>>>>>>>  1 file changed, 7 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>>>>>>>>> index 87d36b29ca5f..e024af9f33d0 100644
> >>>>>>>>>> --- a/drivers/ata/libata-core.c
> >>>>>>>>>> +++ b/drivers/ata/libata-core.c
> >>>>>>>>>> @@ -4070,6 +4070,13 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> >>>>>>>>>>  	{ "WDC WD3000JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>>>>>  	{ "WDC WD3200JD-*",		NULL,	ATA_HORKAGE_WD_BROKEN_LPM },
> >>>>>>>>>>  
> >>>>>>>>>> +	/*
> >>>>>>>>>> +	 * This sata dom goes on a walkabout when it sees the
> >>>>>>>>>> +	 * ATA_LOG_DIRECTORY read request so ensure we don't issue such a
> >>>>>>>>>> +	 * request to these devices.
> >>>>>>>>>> +	 */
> >>>>>>>>>> +	{ "SATADOM-ML 3ME",		NULL,	ATA_HORKAGE_NO_ID_DEV_LOG },
> >>>>>>>>>
> >>>>>>>>> This flag only disables trying to access the identify device log page,
> >>>>>>>>> it does *not* avoid access to the log directory log page in general. The
> >>>>>>>>> log directory will still be consulted for other log pages beside the
> >>>>>>>>> identify device log page, from any function that calls
> >>>>>>>>> ata_log_supported() (e.g. ata_dev_config_ncq_send_recv() and
> >>>>>>>>> ata_dev_config_ncq_non_data())
> >>>>>>>>
> >>>>>>>> Non of those code paths are called for this device, probably due to some
> >>>>>>>> other flag disqualifying them.
> >>>>>>>>  
> >>>>>>>>> So it will be a lot more solid to define a ATA_HORKAGE_NO_LOG_DIR flag
> >>>>>>>>> and test for it in ata_log_supported(), completely preventing any access
> >>>>>>>>> to the log directory page for this drive type.
> >>>>>>>>
> >>>>>>>> That was my first thought but then I found ATA_HORKAGE_NO_ID_DEV_LOG
> >>>>>>>> which was in the calling path that actually triggered this issue.
> >>>>>>>>
> >>>>>>>> But, yes, I totally agree that's a more solid solution preventing this
> >>>>>>>> kind of issue to crop up again.
> >>>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>>  	/* End Marker */
> >>>>>>>>>>  	{ }
> >>>>>>>>>>  };
> >>>>>>>>>
> >>>>>>>>> Note: if you need this fix sent to linux-stable, add "Cc: stable@..."
> >>>>>>>>> and a Fixes tag.
> >>>>>>>>
> >>>>>>>> I'd think it's fitting to send it to linux-stable, because it prevents
> >>>>>>>> those DOM's from working in v5.15.5+.
> >>>>>>>>
> >>>>>>>> Ok. I must have missed that part when I read submitting-patches doc.
> >>>>>>>>
> >>>>>>>> I'll rework and re-submit the patch.
> >>>>>>>
> >>>>>>> I sent you a draft patch. Please try it.
> >>>>>>
> >>>>>> Works like a charm.
> >>>>>>
> >>>>>>> Also, to confirm if the log directory log page is indeed the page that locks up
> >>>>>>> the drive, can you try this command:
> >>>>>>>
> >>>>>>> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> >>>>>>
> >>>>>> # sg_sat_read_gplog --dma --log=0 --page=0 --readonly /dev/sda
> >>>>>> ATA PASS-THROUGH (16), bad field in cdb
> >>>>>> sg_sat_read_gplog failed: Illegal request
> >>>>>>
> >>>>>>> and
> >>>>>>>
> >>>>>>> sg_sat_read_gplog --log=0 --page=0 --readonly
> >>>>>>
> >>>>>> # sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> >>>>>>  00     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  08     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  10     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  18     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  20     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  28     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  30     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  38     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  40     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  48     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  50     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  58     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  60     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  68     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  70     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  78     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  80     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  88     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  90     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  98     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  a0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  a8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  b0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  b8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  c0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  c8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  d0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  d8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  e0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  e8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  f0     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>  f8     0000 0000 0000 0000 0000 0000 0000 0000     .. .. .. .. .. .. .. .. 
> >>>>>>
> >>>>>> Mind you, this with a patched kernel, if that affects anything.
> >>>>>
> >>>>> Without a patched kernel, (grabbed os with a 4.19.94 kernel) the command
> >>>>> hangs for a while:
> >>>>>
> >>>>> # time sg_sat_read_gplog --log=0 --page=0 --readonly /dev/sda
> >>>>> ATA PASS-THROUGH (16), bad field in cdb
> >>>>> sg_sat_read_gplog failed: Illegal request
> >>>>>
> >>>>> real	0m28.337s
> >>>>> user	0m0.000s
> >>>>> sys	0m0.001s
> >>>>>
> >>>>> and logs the following in the kernel log:
> >>>>>
> >>>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>>>> ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
> >>>>> ata1.00: revalidation failed (errno=-5)
> >>>>> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>>>> ata1.00: configured for UDMA/133
> >>>>>
> >>>>>
> >>>>> But after that 30s walkabout and whatever the kernel does the device
> >>>>> starts functioning again.
> >>>>
> >>>> The 30s "hang" is the default command timeout: your drive is not responding to
> >>>> the DMA version of READ LOG EXT command. There are some drives out there like
> >>>> that. So instead of completely disabling access to the log directory, we should
> >>>> simply force the use of READ LOG EXT. And for that, there is the horkage flag
> >>>> ATA_HORKAGE_NO_DMA_LOG.
> >>>>
> >>>> Can you try using that one without the patch I sent ?
> >>>
> >>> I think I tested that before submitting the patch, but re-tested it now
> >>> and that doesn't work:
> >>
> >> Your original patch used the ATA_HORKAGE_NO_ID_DEV_LOG flag. What
> >> ATA_HORKAGE_NO_DMA_LOG does is prevent the use of the READ LOG DMA EXT
> >> command. READ LOG EXT command (PIO) will be used. And the sg commands I
> >> sent you and that you tried show that it works:
> >>
> >> sg_sat_read_gplog --log=0 --page=0 --readonly
> >>
> >> worked perfectly, while:
> >>
> >> sg_sat_read_gplog --dma --log=0 --page=0 --readonly
> >>
> >> failed.
> >>
> >> So by simply preventing the use of READ LOG DMA EXT for your drive, the
> >> kernel will be able to safely get the log directory and test for other
> >> page support.
> >>
> >> In your original patch, try replacing ATA_HORKAGE_NO_ID_DEV_LOG with
> >> ATA_HORKAGE_NO_DMA_LOG.
> > 
> > I could have bin clearer. This log snippet below is a 5.15.10 with
> > ATA_HORKAGE_NO_DMA_LOG set for this device.
> > 
> > Shure, sg_sat_read_gplog --log=0 --page=0 --readonly worked when booted,
> > but as it looks to me, it doesn't work during boot.
> 
> Very weird... So I guess the big-hammer patch introducing
> ATA_HORKAGE_NO_LOG_DIR seems like the best solution for your drive.
> Please review and test again the patch I sent and post it to linux-ide
> for review.

Indeed. My guess would be that it depends on where in the init process
the drive is how it behaves.

Patch is re-tested and re-submitted. Thanks for all your help figuring
this out and getting this on the way upstream.


//Anton

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-02-03  9:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-02 10:05 [PATCH] libata: Don't issue ATA_LOG_DIRECTORY to SATADOM-ML 3ME Anton Lundin
2022-02-02 10:45 ` Damien Le Moal
2022-02-02 12:25   ` Anton Lundin
2022-02-02 12:48     ` Damien Le Moal
2022-02-02 13:09       ` Anton Lundin
2022-02-02 13:54         ` Anton Lundin
2022-02-02 22:00           ` Damien Le Moal
2022-02-03  8:32             ` Anton Lundin
2022-02-03  8:40               ` Damien Le Moal
2022-02-03  9:11                 ` Anton Lundin
2022-02-03  9:21                   ` Damien Le Moal
2022-02-03  9:44                     ` Anton Lundin
2022-02-02 22:04         ` Damien Le Moal
2022-02-02 12:07 ` Damien Le Moal
2022-02-02 13:12   ` Anton Lundin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.