All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
@ 2022-03-18  9:03 Christian Lamparter
  2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Christian Lamparter @ 2022-03-18  9:03 UTC (permalink / raw)
  To: linux-ide; +Cc: Damien Le Moal, Jens Axboe, Tejun Heo, Andy Shevchenko

the driver uses libata's "tag" values from in various arrays.
Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.

Otherwise ATA_TAG_INTERNAL cause a crash like this:

| BUG: Kernel NULL pointer dereference at 0x00000000
| Faulting instruction address: 0xc03ed4b8
| Oops: Kernel access of bad area, sig: 11 [#1]
| BE PAGE_SIZE=4K PowerPC 44x Platform
| CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
| NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
| REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
| MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
| DEAR: 00000000 ESR: 00000000
| GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
| [..]
| NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
| LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
| Call Trace:
| [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
| [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
| [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
| [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
| [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
| [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
| [...]

this is because sata_dwc_dma_xfer_complete() NULLs the
dma_pending's next neighbour "chan" (a *dma_chan struct) in
this '32' case right here (line ~735):
> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;

Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
causes the crash.

Reported-by: ticerex (OpenWrt Forum)
Fixes: 28361c403683c ("libata: add extra internal command")
Cc: stable@kernel.org # 4.18+
Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
ticerex said when I've asked him about his real name+email for the patches:
"Please use my forum nick."
<https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
(I know checkpatch.pl complains about that. But what can you do?)
---
 drivers/ata/sata_dwc_460ex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/sata_dwc_460ex.c b/drivers/ata/sata_dwc_460ex.c
index bec33d781ae0..061b27584667 100644
--- a/drivers/ata/sata_dwc_460ex.c
+++ b/drivers/ata/sata_dwc_460ex.c
@@ -137,7 +137,7 @@ struct sata_dwc_device {
 #endif
 };
 
-#define SATA_DWC_QCMD_MAX	32
+#define SATA_DWC_QCMD_MAX	33
 
 struct sata_dwc_device_port {
 	struct sata_dwc_device	*hsdev;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
  2022-03-18  9:03 [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Christian Lamparter
@ 2022-03-18  9:03 ` Christian Lamparter
  2022-03-18  9:47   ` Damien Le Moal
  2022-03-18 14:20   ` Andy Shevchenko
  2022-03-18  9:34 ` [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Damien Le Moal
  2022-03-18 14:16 ` Andy Shevchenko
  2 siblings, 2 replies; 11+ messages in thread
From: Christian Lamparter @ 2022-03-18  9:03 UTC (permalink / raw)
  To: linux-ide; +Cc: Damien Le Moal, Jens Axboe, Tejun Heo, Andy Shevchenko

Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
the a message: "READ LOG DMA EXT failed, trying PIO" during boot.

Initially this was discovered because it caused a crash
with the sata_dwc_460ex controller on a WD MyBook Live DUO [1].

The reporter "ticerex" which has the unique opportunity that he
has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
which boots fine. But the newer/latest firmware "EXT0DB6Q" caused
the headaches.

This failure is not limited to sata_dwc_460ex, it also happens on
ahci controllers (Asmedia) as found in this forum unraid thread [2].
(This was with a "Samsung SSD 840 EVO 120GB" with firmware "EXT0BB6Q")

[1] <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/11>
[2] <https://forums.unraid.net/topic/77007-disks-not-showing-up/#comment-711295>

Cc: <stable@kernel.org> # 5.10+
Reported-by: ticerex (OpenWrt Forum)
Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
"Please use my forum nick."
<https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>

The other patch in this series addresses the drivers crash.

Note: This "READ LOG DMA EXT failed" also happens on the newer
Samsung 870 EVOs too:
<https://bugzilla.kernel.org/show_bug.cgi?id=201693#c29>.
I guess there needs to be a database maybe based on the firmware
revision? For now I've settled with the "840 EVO" since I have
found a old, but updated Samsung 840 EVO 120GB in my spare-parts
box that I can test with.
---
 drivers/ata/libata-core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0c854aebfe0b..760c0d81d148 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4014,6 +4014,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
+	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
+						ATA_HORKAGE_NO_DMA_LOG |
+						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18  9:03 [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Christian Lamparter
  2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
@ 2022-03-18  9:34 ` Damien Le Moal
  2022-03-18 11:43   ` Christian Lamparter
  2022-03-18 14:16 ` Andy Shevchenko
  2 siblings, 1 reply; 11+ messages in thread
From: Damien Le Moal @ 2022-03-18  9:34 UTC (permalink / raw)
  To: Christian Lamparter, linux-ide; +Cc: Jens Axboe, Tejun Heo, Andy Shevchenko

On 3/18/22 18:03, Christian Lamparter wrote:
> the driver uses libata's "tag" values from in various arrays.
> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
> 
> Otherwise ATA_TAG_INTERNAL cause a crash like this:
> 
> | BUG: Kernel NULL pointer dereference at 0x00000000
> | Faulting instruction address: 0xc03ed4b8
> | Oops: Kernel access of bad area, sig: 11 [#1]
> | BE PAGE_SIZE=4K PowerPC 44x Platform
> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
> | DEAR: 00000000 ESR: 00000000
> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
> | [..]
> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
> | Call Trace:
> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
> | [...]
> 
> this is because sata_dwc_dma_xfer_complete() NULLs the
> dma_pending's next neighbour "chan" (a *dma_chan struct) in
> this '32' case right here (line ~735):
>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
> 
> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
> causes the crash.
> 
> Reported-by: ticerex (OpenWrt Forum)

I would remove this since you have the link to the bug report below.
Without a real name & email, this does not make much sense.

> Fixes: 28361c403683c ("libata: add extra internal command")
> Cc: stable@kernel.org # 4.18+
> Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> ---
> ticerex said when I've asked him about his real name+email for the patches:
> "Please use my forum nick."
> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
> (I know checkpatch.pl complains about that. But what can you do?)
> ---
>  drivers/ata/sata_dwc_460ex.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/ata/sata_dwc_460ex.c b/drivers/ata/sata_dwc_460ex.c
> index bec33d781ae0..061b27584667 100644
> --- a/drivers/ata/sata_dwc_460ex.c
> +++ b/drivers/ata/sata_dwc_460ex.c
> @@ -137,7 +137,7 @@ struct sata_dwc_device {
>  #endif
>  };
>  
> -#define SATA_DWC_QCMD_MAX	32
> +#define SATA_DWC_QCMD_MAX	33

This is really ugly, but I do not see a better way to do it simply.
But at least, let's do something like this to avoid confusion and to show
that this driver is not doing some black magic with ATA drives:

/*
 * Allow one extra special slot for commands and DMA management to
 * account for libata internal commands.
 */
#define SATA_DWC_QCMD_MAX	(ATA_MAX_QUEUE + 1)

Thoughts ?

>  
>  struct sata_dwc_device_port {
>  	struct sata_dwc_device	*hsdev;


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
  2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
@ 2022-03-18  9:47   ` Damien Le Moal
  2022-03-18 14:20   ` Andy Shevchenko
  1 sibling, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2022-03-18  9:47 UTC (permalink / raw)
  To: Christian Lamparter, linux-ide; +Cc: Jens Axboe, Tejun Heo, Andy Shevchenko

On 3/18/22 18:03, Christian Lamparter wrote:
> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> Initially this was discovered because it caused a crash
> with the sata_dwc_460ex controller on a WD MyBook Live DUO [1].
> 
> The reporter "ticerex" which has the unique opportunity that he
> has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
> which boots fine. But the newer/latest firmware "EXT0DB6Q" caused
> the headaches.
> 
> This failure is not limited to sata_dwc_460ex, it also happens on
> ahci controllers (Asmedia) as found in this forum unraid thread [2].
> (This was with a "Samsung SSD 840 EVO 120GB" with firmware "EXT0BB6Q")
> 
> [1] <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/11>
> [2] <https://forums.unraid.net/topic/77007-disks-not-showing-up/#comment-711295>
> 
> Cc: <stable@kernel.org> # 5.10+
> Reported-by: ticerex (OpenWrt Forum)

Let's drop this. You mentioned the reporter in the commit message, that is
good enough and there is the forum link for reference.

> Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> ---
> "Please use my forum nick."
> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
> 
> The other patch in this series addresses the drivers crash.
> 
> Note: This "READ LOG DMA EXT failed" also happens on the newer
> Samsung 870 EVOs too:
> <https://bugzilla.kernel.org/show_bug.cgi?id=201693#c29>.
> I guess there needs to be a database maybe based on the firmware
> revision? For now I've settled with the "840 EVO" since I have
> found a old, but updated Samsung 840 EVO 120GB in my spare-parts
> box that I can test with.

If we go down to the FW revision, the blacklist size could get very big
very quickly... The difference between READ LOG DMA EXT and READ LOG EXT
is negligible anyway and that is not the hot path. So no worries.

> ---
>  drivers/ata/libata-core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 0c854aebfe0b..760c0d81d148 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4014,6 +4014,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>  	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
> +	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
> +						ATA_HORKAGE_NO_DMA_LOG |
> +						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>  	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>  	{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18  9:34 ` [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Damien Le Moal
@ 2022-03-18 11:43   ` Christian Lamparter
  2022-03-19  0:04     ` Damien Le Moal
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Lamparter @ 2022-03-18 11:43 UTC (permalink / raw)
  To: Damien Le Moal, linux-ide; +Cc: Jens Axboe, Tejun Heo, Andy Shevchenko

On 18/03/2022 10:34, Damien Le Moal wrote:
> On 3/18/22 18:03, Christian Lamparter wrote:
>> the driver uses libata's "tag" values from in various arrays.
>> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
>> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
>>
>> Otherwise ATA_TAG_INTERNAL cause a crash like this:
>>
>> | BUG: Kernel NULL pointer dereference at 0x00000000
>> | Faulting instruction address: 0xc03ed4b8
>> | Oops: Kernel access of bad area, sig: 11 [#1]
>> | BE PAGE_SIZE=4K PowerPC 44x Platform
>> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
>> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
>> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
>> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
>> | DEAR: 00000000 ESR: 00000000
>> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
>> | [..]
>> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
>> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>> | Call Trace:
>> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
>> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
>> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
>> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
>> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
>> | [...]
>>
>> this is because sata_dwc_dma_xfer_complete() NULLs the
>> dma_pending's next neighbour "chan" (a *dma_chan struct) in
>> this '32' case right here (line ~735):
>>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
>>
>> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
>> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
>> causes the crash.
>>
>> Reported-by: ticerex (OpenWrt Forum)
> 
> I would remove this since you have the link to the bug report below.
> Without a real name & email, this does not make much sense.
Ok.

> 
>> Fixes: 28361c403683c ("libata: add extra internal command")
>> Cc: stable@kernel.org # 4.18+
>> Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
>> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
>> ---
>> ticerex said when I've asked him about his real name+email for the patches:
>> "Please use my forum nick."
>> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
>> (I know checkpatch.pl complains about that. But what can you do?)
>> ---
>>   drivers/ata/sata_dwc_460ex.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/ata/sata_dwc_460ex.c b/drivers/ata/sata_dwc_460ex.c
>> index bec33d781ae0..061b27584667 100644
>> --- a/drivers/ata/sata_dwc_460ex.c
>> +++ b/drivers/ata/sata_dwc_460ex.c
>> @@ -137,7 +137,7 @@ struct sata_dwc_device {
>>   #endif
>>   };
>>   
>> -#define SATA_DWC_QCMD_MAX	32
>> +#define SATA_DWC_QCMD_MAX	33
> 
> This is really ugly, but I do not see a better way to do it simply.
> But at least, let's do something like this to avoid confusion and to show
> that this driver is not doing some black magic with ATA drives:
> 
> /*
>   * Allow one extra special slot for commands and DMA management to
>   * account for libata internal commands.
>   */
> #define SATA_DWC_QCMD_MAX	(ATA_MAX_QUEUE + 1)
> 
> Thoughts ?

Yes, this works. That ATA_TAG_INTERNAL value has remained unchanged
since Jens' change from 2018. How do you want to proceed?

Should I make a v2, or do you update the patch locally?

The way I understand it. this ATA_TAG_INTERNAL has the special MAGIC
value so DMA "qc issues" do not interfere with possibly concurrent NCQ
"qc issues" on the involved controllers.

sata_dwc_460ex's NCQ is disabled/gimped by line 1093:
| /* .can_queue           = ATA_MAX_QUEUE, */)

Reassiging the ATA_TAG_INTERNAL could have been done.
But just increasing the arrays worked :-).

Cheers,
Christian

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18  9:03 [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Christian Lamparter
  2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
  2022-03-18  9:34 ` [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Damien Le Moal
@ 2022-03-18 14:16 ` Andy Shevchenko
  2022-03-18 18:06   ` Christian Lamparter
  2 siblings, 1 reply; 11+ messages in thread
From: Andy Shevchenko @ 2022-03-18 14:16 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-ide, Damien Le Moal, Jens Axboe, Tejun Heo

On Fri, Mar 18, 2022 at 10:03:11AM +0100, Christian Lamparter wrote:
> the driver uses libata's "tag" values from in various arrays.
> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
> 
> Otherwise ATA_TAG_INTERNAL cause a crash like this:
> 
> | BUG: Kernel NULL pointer dereference at 0x00000000
> | Faulting instruction address: 0xc03ed4b8
> | Oops: Kernel access of bad area, sig: 11 [#1]
> | BE PAGE_SIZE=4K PowerPC 44x Platform
> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
> | DEAR: 00000000 ESR: 00000000
> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
> | [..]
> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
> | Call Trace:
> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
> | [...]
> 
> this is because sata_dwc_dma_xfer_complete() NULLs the
> dma_pending's next neighbour "chan" (a *dma_chan struct) in
> this '32' case right here (line ~735):
> > hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
> 
> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
> causes the crash.

...

> ticerex said when I've asked him about his real name+email for the patches:
> "Please use my forum nick."
> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
> (I know checkpatch.pl complains about that. But what can you do?)

I think Reported-by: is fine to have any kind of reference to the reporter.
I can consider it false positive.

...

> -#define SATA_DWC_QCMD_MAX	32
> +#define SATA_DWC_QCMD_MAX	33

Can't we use 

#define SATA_DWC_QCMD_MAX	(ATA_TAG_INTERNAL + 1)

instead?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
  2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
  2022-03-18  9:47   ` Damien Le Moal
@ 2022-03-18 14:20   ` Andy Shevchenko
  1 sibling, 0 replies; 11+ messages in thread
From: Andy Shevchenko @ 2022-03-18 14:20 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-ide, Damien Le Moal, Jens Axboe, Tejun Heo

On Fri, Mar 18, 2022 at 10:03:12AM +0100, Christian Lamparter wrote:
> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> Initially this was discovered because it caused a crash
> with the sata_dwc_460ex controller on a WD MyBook Live DUO [1].
> 
> The reporter "ticerex" which has the unique opportunity that he
> has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
> which boots fine. But the newer/latest firmware "EXT0DB6Q" caused
> the headaches.
> 
> This failure is not limited to sata_dwc_460ex, it also happens on
> ahci controllers (Asmedia) as found in this forum unraid thread [2].
> (This was with a "Samsung SSD 840 EVO 120GB" with firmware "EXT0BB6Q")
> 
> [1] <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/11>
> [2] <https://forums.unraid.net/topic/77007-disks-not-showing-up/#comment-711295>

Can it be converted to BugLink tag(s)?
You may `git log --grep BugLink` to see examples.


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18 14:16 ` Andy Shevchenko
@ 2022-03-18 18:06   ` Christian Lamparter
  2022-03-19  0:00     ` Damien Le Moal
  2022-03-19  0:13     ` Damien Le Moal
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Lamparter @ 2022-03-18 18:06 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-ide, Damien Le Moal, Jens Axboe, Tejun Heo

On 18/03/2022 15:16, Andy Shevchenko wrote:
> On Fri, Mar 18, 2022 at 10:03:11AM +0100, Christian Lamparter wrote:
>> the driver uses libata's "tag" values from in various arrays.
>> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
>> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
>>
>> Otherwise ATA_TAG_INTERNAL cause a crash like this:
>>
>> | BUG: Kernel NULL pointer dereference at 0x00000000
>> | Faulting instruction address: 0xc03ed4b8
>> | Oops: Kernel access of bad area, sig: 11 [#1]
>> | BE PAGE_SIZE=4K PowerPC 44x Platform
>> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
>> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
>> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
>> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
>> | DEAR: 00000000 ESR: 00000000
>> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
>> | [..]
>> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
>> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>> | Call Trace:
>> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
>> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
>> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
>> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
>> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
>> | [...]
>>
>> this is because sata_dwc_dma_xfer_complete() NULLs the
>> dma_pending's next neighbour "chan" (a *dma_chan struct) in
>> this '32' case right here (line ~735):
>>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
>>
>> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
>> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
>> causes the crash.
> 
> ...
> 
>> ticerex said when I've asked him about his real name+email for the patches:
>> "Please use my forum nick."
>> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
>> (I know checkpatch.pl complains about that. But what can you do?)
> 
> I think Reported-by: is fine to have any kind of reference to the reporter.
> I can consider it false positive.
> 
> ...

K, I've reported this back to the reporter ;).

Documentation/process/maintainer-tip.rst and process/5.Posting.rst
provided some hints:

"Please note that if the bug was reported in private, then ask for
permission first before using the Reported-by tag."

and maintainer-tip.rst, the format should be:
``Reported-by: ``Reporter <reporter@mail>``

(My goal here is to get "a fix" merged, so conformance is key. ;-))

>> -#define SATA_DWC_QCMD_MAX	32
>> +#define SATA_DWC_QCMD_MAX	33
> 
> Can't we use
> 
> #define SATA_DWC_QCMD_MAX	(ATA_TAG_INTERNAL + 1)
> 

I've looked around a bit.

include/linux/libata.h itself has the following related definitions:

| enum {
|        ATA_MAX_QUEUE           = 32,
|        ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
| [..]

| struct ata_port {
|	[...]
| 	struct ata_queued_cmd   qcmd[ATA_MAX_QUEUE + 1];
|
| }

ATA_MAX_QUEUE seems to be the "size" value. Whereas ATA_TAG_INTERNAL
is used as (tag == ATA_TAG_INTERNAL) more often.


I came up with a viable compromise:

Would it be "OK" to define a "new" ATA_MAX_TAG?

This could be either set ATA_MAX_TAG = ATA_MAX_QUEUE + 1

or let it be assigned automatically, if it's placed after ATA_TAG_INTERNAL
in the libata.h's enum like this: (I prefer this, but it being "33" is not
obvious if you don't dabble in C all the time)

| enum {
|	[...]
|       ATA_MAX_QUEUE           = 32,
|       ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
|	/*
|	 * ATA_MAX_TAG accounts for ATA_MAX_QUEUE TAGs + 1 special TAG/slot
|	 * for ATA_TAG_INTERNAL (libata's internal commands/DMA management).
|	 * This needs to be it in this location.
|	 */
|	ATA_MAX_TAG,
|       [...]

This ATA_MAX_TAG could then be used for ata_port's qcmd (from above) and
sata_dwc_460ex's SATA_DWC_QCMD_MAX.

For good measure:

BUILD_BUG_ON(ATA_TAG_INTERNAL >= ATA_MAX_TAG);

could be added into libata.h's __ata_qc_from_tag().
It access the qcmd array, so anyone using libata's accessors will catch
future updates.

Is this fine or getting to close to being overbuild?

Thank you both for your insights,
Christian

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18 18:06   ` Christian Lamparter
@ 2022-03-19  0:00     ` Damien Le Moal
  2022-03-19  0:13     ` Damien Le Moal
  1 sibling, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2022-03-19  0:00 UTC (permalink / raw)
  To: Christian Lamparter, Andy Shevchenko; +Cc: linux-ide, Jens Axboe, Tejun Heo

On 3/19/22 03:06, Christian Lamparter wrote:
> On 18/03/2022 15:16, Andy Shevchenko wrote:
>> On Fri, Mar 18, 2022 at 10:03:11AM +0100, Christian Lamparter wrote:
>>> the driver uses libata's "tag" values from in various arrays.
>>> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
>>> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
>>>
>>> Otherwise ATA_TAG_INTERNAL cause a crash like this:
>>>
>>> | BUG: Kernel NULL pointer dereference at 0x00000000
>>> | Faulting instruction address: 0xc03ed4b8
>>> | Oops: Kernel access of bad area, sig: 11 [#1]
>>> | BE PAGE_SIZE=4K PowerPC 44x Platform
>>> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
>>> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
>>> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
>>> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
>>> | DEAR: 00000000 ESR: 00000000
>>> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
>>> | [..]
>>> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
>>> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | Call Trace:
>>> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
>>> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
>>> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
>>> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
>>> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
>>> | [...]
>>>
>>> this is because sata_dwc_dma_xfer_complete() NULLs the
>>> dma_pending's next neighbour "chan" (a *dma_chan struct) in
>>> this '32' case right here (line ~735):
>>>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
>>>
>>> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
>>> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
>>> causes the crash.
>>
>> ...
>>
>>> ticerex said when I've asked him about his real name+email for the patches:
>>> "Please use my forum nick."
>>> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
>>> (I know checkpatch.pl complains about that. But what can you do?)
>>
>> I think Reported-by: is fine to have any kind of reference to the reporter.
>> I can consider it false positive.
>>
>> ...
> 
> K, I've reported this back to the reporter ;).
> 
> Documentation/process/maintainer-tip.rst and process/5.Posting.rst
> provided some hints:
> 
> "Please note that if the bug was reported in private, then ask for
> permission first before using the Reported-by tag."
> 
> and maintainer-tip.rst, the format should be:
> ``Reported-by: ``Reporter <reporter@mail>``
> 
> (My goal here is to get "a fix" merged, so conformance is key. ;-))
> 
>>> -#define SATA_DWC_QCMD_MAX	32
>>> +#define SATA_DWC_QCMD_MAX	33
>>
>> Can't we use
>>
>> #define SATA_DWC_QCMD_MAX	(ATA_TAG_INTERNAL + 1)
>>
> 
> I've looked around a bit.
> 
> include/linux/libata.h itself has the following related definitions:
> 
> | enum {
> |        ATA_MAX_QUEUE           = 32,
> |        ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
> | [..]
> 
> | struct ata_port {
> |	[...]
> | 	struct ata_queued_cmd   qcmd[ATA_MAX_QUEUE + 1];
> |
> | }
> 
> ATA_MAX_QUEUE seems to be the "size" value. Whereas ATA_TAG_INTERNAL
> is used as (tag == ATA_TAG_INTERNAL) more often.
> 
> 
> I came up with a viable compromise:
> 
> Would it be "OK" to define a "new" ATA_MAX_TAG?
> 
> This could be either set ATA_MAX_TAG = ATA_MAX_QUEUE + 1

That will be confusing !

ATA physically allows only up to 32 tags (0 to 31) because the command tag
field is only 5 bits. So we should not define anything like this defining
a tag value that is invalid.

ATA_TAG_INTERNAL is special and used only to identify that the command is
a libata internal command, either for device scan/revalidate or for error
handling. This value is *never* used as a tag for an NCQ command because
*all* internal commands are in fact not NCQ ! The internal commands from
libata are all non queueable commands. Meaning that if a device driver
sees a qc with its tag set to ATA_TAG_INTERNAL, it means that there are no
other commands on-going and all tags should be unused in the driver.

In the case of the dw driver, it means that we could arbitrarily use any
of the valid tag values for managing that internal command without any
issue. But having looked at the driver, as I said, it is a bigger change
than just faking a 33rd "tag" that is in fact not a command tag at all.

For these reasons, I really prefer defining SATA_DWC_QCMD_MAX as
(ATA_MAX_QUEUE + 1) with the comment above it clarifying why it is defined
like that.

> 
> or let it be assigned automatically, if it's placed after ATA_TAG_INTERNAL
> in the libata.h's enum like this: (I prefer this, but it being "33" is not
> obvious if you don't dabble in C all the time)
> 
> | enum {
> |	[...]
> |       ATA_MAX_QUEUE           = 32,
> |       ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
> |	/*
> |	 * ATA_MAX_TAG accounts for ATA_MAX_QUEUE TAGs + 1 special TAG/slot
> |	 * for ATA_TAG_INTERNAL (libata's internal commands/DMA management).
> |	 * This needs to be it in this location.
> |	 */
> |	ATA_MAX_TAG,
> |       [...]
> 
> This ATA_MAX_TAG could then be used for ata_port's qcmd (from above) and
> sata_dwc_460ex's SATA_DWC_QCMD_MAX.

I understand your point. But again, given that ATA_TAG_INTERNAL is only a
special tag value for libata and cannot be used by devices, I would rather
not define this ATA_MAX_TAG macro, to avoid confusions.

> 
> For good measure:
> 
> BUILD_BUG_ON(ATA_TAG_INTERNAL >= ATA_MAX_TAG);
> 
> could be added into libata.h's __ata_qc_from_tag().
> It access the qcmd array, so anyone using libata's accessors will catch
> future updates.
> 
> Is this fine or getting to close to being overbuild?

Not overbuilt, but not fine either. In my opinion, the root cause of the
issue is that the DW driver blindly uses the qc->tag value regardless of
if the command is NCQ or not. For non-ncq commands, tag 0 could be used
always for the command management arrays index, as by definition, non NCQ
commands imply QD=1 operation.

As I mentioned, this is a slightly bigger fix though. So I am OK with
simply changing the macro definition to have the +1 for now. But I am
certainly not against getting this driver correctly fixed to properly
handle command types and tags :)


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18 11:43   ` Christian Lamparter
@ 2022-03-19  0:04     ` Damien Le Moal
  0 siblings, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2022-03-19  0:04 UTC (permalink / raw)
  To: Christian Lamparter, linux-ide; +Cc: Jens Axboe, Tejun Heo, Andy Shevchenko

On 3/18/22 20:43, Christian Lamparter wrote:
> On 18/03/2022 10:34, Damien Le Moal wrote:
>> On 3/18/22 18:03, Christian Lamparter wrote:
>>> the driver uses libata's "tag" values from in various arrays.
>>> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
>>> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
>>>
>>> Otherwise ATA_TAG_INTERNAL cause a crash like this:
>>>
>>> | BUG: Kernel NULL pointer dereference at 0x00000000
>>> | Faulting instruction address: 0xc03ed4b8
>>> | Oops: Kernel access of bad area, sig: 11 [#1]
>>> | BE PAGE_SIZE=4K PowerPC 44x Platform
>>> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
>>> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
>>> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
>>> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
>>> | DEAR: 00000000 ESR: 00000000
>>> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
>>> | [..]
>>> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
>>> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | Call Trace:
>>> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
>>> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
>>> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
>>> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
>>> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
>>> | [...]
>>>
>>> this is because sata_dwc_dma_xfer_complete() NULLs the
>>> dma_pending's next neighbour "chan" (a *dma_chan struct) in
>>> this '32' case right here (line ~735):
>>>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
>>>
>>> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
>>> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
>>> causes the crash.
>>>
>>> Reported-by: ticerex (OpenWrt Forum)
>>
>> I would remove this since you have the link to the bug report below.
>> Without a real name & email, this does not make much sense.
> Ok.
> 
>>
>>> Fixes: 28361c403683c ("libata: add extra internal command")
>>> Cc: stable@kernel.org # 4.18+
>>> Link: https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464
>>> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
>>> ---
>>> ticerex said when I've asked him about his real name+email for the patches:
>>> "Please use my forum nick."
>>> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
>>> (I know checkpatch.pl complains about that. But what can you do?)
>>> ---
>>>   drivers/ata/sata_dwc_460ex.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/ata/sata_dwc_460ex.c b/drivers/ata/sata_dwc_460ex.c
>>> index bec33d781ae0..061b27584667 100644
>>> --- a/drivers/ata/sata_dwc_460ex.c
>>> +++ b/drivers/ata/sata_dwc_460ex.c
>>> @@ -137,7 +137,7 @@ struct sata_dwc_device {
>>>   #endif
>>>   };
>>>   
>>> -#define SATA_DWC_QCMD_MAX	32
>>> +#define SATA_DWC_QCMD_MAX	33
>>
>> This is really ugly, but I do not see a better way to do it simply.
>> But at least, let's do something like this to avoid confusion and to show
>> that this driver is not doing some black magic with ATA drives:
>>
>> /*
>>   * Allow one extra special slot for commands and DMA management to
>>   * account for libata internal commands.
>>   */
>> #define SATA_DWC_QCMD_MAX	(ATA_MAX_QUEUE + 1)
>>
>> Thoughts ?
> 
> Yes, this works. That ATA_TAG_INTERNAL value has remained unchanged
> since Jens' change from 2018. How do you want to proceed?
> 
> Should I make a v2, or do you update the patch locally?

Please send a v2. Or send another patch properly fixing the driver tag
handling as explained in my previous email.

> 
> The way I understand it. this ATA_TAG_INTERNAL has the special MAGIC
> value so DMA "qc issues" do not interfere with possibly concurrent NCQ
> "qc issues" on the involved controllers.
> 
> sata_dwc_460ex's NCQ is disabled/gimped by line 1093:
> | /* .can_queue           = ATA_MAX_QUEUE, */)
> 
> Reassiging the ATA_TAG_INTERNAL could have been done.
> But just increasing the arrays worked :-).

No ! changing can_queue to something > 32 would be very wrong ! This
really defines the number of tags that can be used, so the device maximum
queue depth. And ATA maximum queue depth is 32 with NCQ commands. It
cannot be anything larger than 32.

> 
> Cheers,
> Christian


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write
  2022-03-18 18:06   ` Christian Lamparter
  2022-03-19  0:00     ` Damien Le Moal
@ 2022-03-19  0:13     ` Damien Le Moal
  1 sibling, 0 replies; 11+ messages in thread
From: Damien Le Moal @ 2022-03-19  0:13 UTC (permalink / raw)
  To: Christian Lamparter, Andy Shevchenko; +Cc: linux-ide, Jens Axboe, Tejun Heo

On 3/19/22 03:06, Christian Lamparter wrote:
> On 18/03/2022 15:16, Andy Shevchenko wrote:
>> On Fri, Mar 18, 2022 at 10:03:11AM +0100, Christian Lamparter wrote:
>>> the driver uses libata's "tag" values from in various arrays.
>>> Since the mentioned patch bumped the ATA_TAG_INTERNAL to 32,
>>> the value of the SATA_DWC_QCMD_MAX needs to be bumped to 33.
>>>
>>> Otherwise ATA_TAG_INTERNAL cause a crash like this:
>>>
>>> | BUG: Kernel NULL pointer dereference at 0x00000000
>>> | Faulting instruction address: 0xc03ed4b8
>>> | Oops: Kernel access of bad area, sig: 11 [#1]
>>> | BE PAGE_SIZE=4K PowerPC 44x Platform
>>> | CPU: 0 PID: 362 Comm: scsi_eh_1 Not tainted 5.4.163 #0
>>> | NIP:  c03ed4b8 LR: c03d27e8 CTR: c03ed36c
>>> | REGS: cfa59950 TRAP: 0300   Not tainted  (5.4.163)
>>> | MSR:  00021000 <CE,ME>  CR: 42000222  XER: 00000000
>>> | DEAR: 00000000 ESR: 00000000
>>> | GPR00: c03d27e8 cfa59a08 cfa55fe0 00000000 0fa46bc0 [...]
>>> | [..]
>>> | NIP [c03ed4b8] sata_dwc_qc_issue+0x14c/0x254
>>> | LR [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | Call Trace:
>>> | [cfa59a08] [c003f4e0] __cancel_work_timer+0x124/0x194 (unreliable)
>>> | [cfa59a78] [c03d27e8] ata_qc_issue+0x1c8/0x2dc
>>> | [cfa59a98] [c03d2b3c] ata_exec_internal_sg+0x240/0x524
>>> | [cfa59b08] [c03d2e98] ata_exec_internal+0x78/0xe0
>>> | [cfa59b58] [c03d30fc] ata_read_log_page.part.38+0x1dc/0x204
>>> | [cfa59bc8] [c03d324c] ata_identify_page_supported+0x68/0x130
>>> | [...]
>>>
>>> this is because sata_dwc_dma_xfer_complete() NULLs the
>>> dma_pending's next neighbour "chan" (a *dma_chan struct) in
>>> this '32' case right here (line ~735):
>>>> hsdevp->dma_pending[tag] = SATA_DWC_DMA_PENDING_NONE;
>>>
>>> Then the next time, a dma gets issued; dma_dwc_xfer_setup() passes
>>> the NULL'd hsdevp->chan to the dmaengine_slave_config() which then
>>> causes the crash.
>>
>> ...
>>
>>> ticerex said when I've asked him about his real name+email for the patches:
>>> "Please use my forum nick."
>>> <https://forum.openwrt.org/t/my-book-live-duo-reboot-loop/122464/14>
>>> (I know checkpatch.pl complains about that. But what can you do?)
>>
>> I think Reported-by: is fine to have any kind of reference to the reporter.
>> I can consider it false positive.
>>
>> ...
> 
> K, I've reported this back to the reporter ;).
> 
> Documentation/process/maintainer-tip.rst and process/5.Posting.rst
> provided some hints:
> 
> "Please note that if the bug was reported in private, then ask for
> permission first before using the Reported-by tag."
> 
> and maintainer-tip.rst, the format should be:
> ``Reported-by: ``Reporter <reporter@mail>``
> 
> (My goal here is to get "a fix" merged, so conformance is key. ;-))

No worries, a fix will go in :)

Since the person did not want to give his/her/they real name & email, we
could consider this as that person not accepting to be mentioned in a
reported-by tag. So dropping the tag is I think preferred. Keeping it as
you had it is fine too. No big deal.

> 
>>> -#define SATA_DWC_QCMD_MAX	32
>>> +#define SATA_DWC_QCMD_MAX	33
>>
>> Can't we use
>>
>> #define SATA_DWC_QCMD_MAX	(ATA_TAG_INTERNAL + 1)
>>
> 
> I've looked around a bit.
> 
> include/linux/libata.h itself has the following related definitions:
> 
> | enum {
> |        ATA_MAX_QUEUE           = 32,
> |        ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
> | [..]
> 
> | struct ata_port {
> |	[...]
> | 	struct ata_queued_cmd   qcmd[ATA_MAX_QUEUE + 1];
> |
> | }
> 
> ATA_MAX_QUEUE seems to be the "size" value. Whereas ATA_TAG_INTERNAL
> is used as (tag == ATA_TAG_INTERNAL) more often.
> 
> 
> I came up with a viable compromise:
> 
> Would it be "OK" to define a "new" ATA_MAX_TAG?
> 
> This could be either set ATA_MAX_TAG = ATA_MAX_QUEUE + 1
> 
> or let it be assigned automatically, if it's placed after ATA_TAG_INTERNAL
> in the libata.h's enum like this: (I prefer this, but it being "33" is not
> obvious if you don't dabble in C all the time)
> 
> | enum {
> |	[...]
> |       ATA_MAX_QUEUE           = 32,
> |       ATA_TAG_INTERNAL        = ATA_MAX_QUEUE,
> |	/*
> |	 * ATA_MAX_TAG accounts for ATA_MAX_QUEUE TAGs + 1 special TAG/slot
> |	 * for ATA_TAG_INTERNAL (libata's internal commands/DMA management).
> |	 * This needs to be it in this location.
> |	 */
> |	ATA_MAX_TAG,
> |       [...]
> 
> This ATA_MAX_TAG could then be used for ata_port's qcmd (from above) and
> sata_dwc_460ex's SATA_DWC_QCMD_MAX.
> 
> For good measure:
> 
> BUILD_BUG_ON(ATA_TAG_INTERNAL >= ATA_MAX_TAG);
> 
> could be added into libata.h's __ata_qc_from_tag().
> It access the qcmd array, so anyone using libata's accessors will catch
> future updates.
> 
> Is this fine or getting to close to being overbuild?
> 
> Thank you both for your insights,
> Christian


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-19  0:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-18  9:03 [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Christian Lamparter
2022-03-18  9:03 ` [PATCH v1 2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs Christian Lamparter
2022-03-18  9:47   ` Damien Le Moal
2022-03-18 14:20   ` Andy Shevchenko
2022-03-18  9:34 ` [PATCH v1 1/2] ata: sata_dwc_460ex: Fix crash due to OOB write Damien Le Moal
2022-03-18 11:43   ` Christian Lamparter
2022-03-19  0:04     ` Damien Le Moal
2022-03-18 14:16 ` Andy Shevchenko
2022-03-18 18:06   ` Christian Lamparter
2022-03-19  0:00     ` Damien Le Moal
2022-03-19  0:13     ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.