All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/2] Improve MTK ECC Engine driver
@ 2017-10-23  8:21 ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon, computersforpeace
  Cc: dwmw2, linux-mtd, linux-mediatek, rogercc.lin, srv_heupstream,
	xiaolei.li

This patch-set is mainly to improve MTK ECC Engine driver, include
- remove dummy ECC IRQ disable setting
- fix infinite ECC decode IRQ issue

Changes relative to:
--------------------

tree    : https://github.com/bbrezillon/linux-0day
branch  : nand/next
commit  :
        'commit 1f3df4dc088d927683b292118cd8b4eaaf1af573
        Author: Sascha Hauer <s.hauer@pengutronix.de>
        Date:   Mon Oct 16 11:51:55 2017 +0200'

Tests:
------

* ubifs and jffs2 are validated on NAND device MT29F16G08ADBCA
  by 'dd' command.
* all drivers/mtd/tests/* pass.

Xiaolei Li (2):
  mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
  mtd: nand: mtk: fix infinite ECC decode IRQ issue

 drivers/mtd/nand/mtk_ecc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--
1.9.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 0/2] Improve MTK ECC Engine driver
@ 2017-10-23  8:21 ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon, computersforpeace
  Cc: srv_heupstream, linux-mtd, linux-mediatek, xiaolei.li, dwmw2,
	rogercc.lin

This patch-set is mainly to improve MTK ECC Engine driver, include
- remove dummy ECC IRQ disable setting
- fix infinite ECC decode IRQ issue

Changes relative to:
--------------------

tree    : https://github.com/bbrezillon/linux-0day
branch  : nand/next
commit  :
        'commit 1f3df4dc088d927683b292118cd8b4eaaf1af573
        Author: Sascha Hauer <s.hauer@pengutronix.de>
        Date:   Mon Oct 16 11:51:55 2017 +0200'

Tests:
------

* ubifs and jffs2 are validated on NAND device MT29F16G08ADBCA
  by 'dd' command.
* all drivers/mtd/tests/* pass.

Xiaolei Li (2):
  mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
  mtd: nand: mtk: fix infinite ECC decode IRQ issue

 drivers/mtd/nand/mtk_ecc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--
1.9.1

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 1/2] mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
@ 2017-10-23  8:21   ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon, computersforpeace
  Cc: dwmw2, linux-mtd, linux-mediatek, rogercc.lin, srv_heupstream,
	xiaolei.li

There will disable ECC IRQ in ECC IRQ handle function, so no need to
disable it again in the function mtk_ecc_disable.

Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>
---
 drivers/mtd/nand/mtk_ecc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
index 7f3b065..82aa6f2 100644
--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -307,7 +307,6 @@ void mtk_ecc_disable(struct mtk_ecc *ecc)
 
 	/* disable it */
 	mtk_ecc_wait_idle(ecc, op);
-	writew(0, ecc->regs + ECC_IRQ_REG(op));
 	writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
 
 	mutex_unlock(&ecc->lock);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 1/2] mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
@ 2017-10-23  8:21   ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	xiaolei.li-NuS5LvNUpcJWk0Htik3J/w, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

There will disable ECC IRQ in ECC IRQ handle function, so no need to
disable it again in the function mtk_ecc_disable.

Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
---
 drivers/mtd/nand/mtk_ecc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
index 7f3b065..82aa6f2 100644
--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -307,7 +307,6 @@ void mtk_ecc_disable(struct mtk_ecc *ecc)
 
 	/* disable it */
 	mtk_ecc_wait_idle(ecc, op);
-	writew(0, ecc->regs + ECC_IRQ_REG(op));
 	writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
 
 	mutex_unlock(&ecc->lock);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-23  8:21   ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon, computersforpeace
  Cc: dwmw2, linux-mtd, linux-mediatek, rogercc.lin, srv_heupstream,
	xiaolei.li

For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
during long time burn test on some platforms. Once this issue occurred,
the ECC decode IRQ status cannot be cleared in the IRQ handler function,
and threads cannot be scheduled.

ECC HW generates decode IRQ each sector, so there will have more than one
decode IRQ if read one page of large page NAND.

Currently, ECC IRQ handle flow is that we will check whether it is decode
IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
cleared at the same time.
Secondly, we will check whether all sectors are decoded by reading the
register ECC_DECDONE. This is because the current IRQ may be not dealed
in time, and the next sectors have been decoded before reading the
register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
be generated.
Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
next ECC IRQ.

But, there is a timing issue between step one and two. When we read the
reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
and the ECC IRQ signal is cleared. But the last sector is decoded before
reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
it means we will receive one extra ECC IRQ later. In step three, we will
find that all sectors were decoded, then disable ECC IRQ and return.
When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
anymore. That is because the register ECC_DECIRQ_STA can only be cleared
when the register ECC_IRQ_REG(op) is enabled. But actually we have
disabled ECC IRQ in the previous ECC IRQ handle. So, there will
keep receiving ECC decode IRQ.

Now, we read the register ECC_DECIRQ_STA once again before completing the
ecc done event. This ensures that there will be no extra ECC decode IRQ.

MT2712 ECC HW is designed to generate only one decode IRQ each page, so
there is no this problem on MT2712 platforms.

Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>
---
 drivers/mtd/nand/mtk_ecc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
index 82aa6f2..0e04323 100644
--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
 		op = ECC_DECODE;
 		dec = readw(ecc->regs + ECC_DECDONE);
 		if (dec & ecc->sectors) {
+			/**
+			 * Clear decode IRQ status once again to ensure that
+			 * there will be no extra IRQ.
+			 */
+			dec = readw(ecc->regs + ECC_DECIRQ_STA);
 			ecc->sectors = 0;
 			complete(&ecc->done);
 		} else {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-23  8:21   ` Xiaolei Li
  0 siblings, 0 replies; 14+ messages in thread
From: Xiaolei Li @ 2017-10-23  8:21 UTC (permalink / raw)
  To: boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	xiaolei.li-NuS5LvNUpcJWk0Htik3J/w, dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
during long time burn test on some platforms. Once this issue occurred,
the ECC decode IRQ status cannot be cleared in the IRQ handler function,
and threads cannot be scheduled.

ECC HW generates decode IRQ each sector, so there will have more than one
decode IRQ if read one page of large page NAND.

Currently, ECC IRQ handle flow is that we will check whether it is decode
IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
cleared at the same time.
Secondly, we will check whether all sectors are decoded by reading the
register ECC_DECDONE. This is because the current IRQ may be not dealed
in time, and the next sectors have been decoded before reading the
register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
be generated.
Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
next ECC IRQ.

But, there is a timing issue between step one and two. When we read the
reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
and the ECC IRQ signal is cleared. But the last sector is decoded before
reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
it means we will receive one extra ECC IRQ later. In step three, we will
find that all sectors were decoded, then disable ECC IRQ and return.
When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
anymore. That is because the register ECC_DECIRQ_STA can only be cleared
when the register ECC_IRQ_REG(op) is enabled. But actually we have
disabled ECC IRQ in the previous ECC IRQ handle. So, there will
keep receiving ECC decode IRQ.

Now, we read the register ECC_DECIRQ_STA once again before completing the
ecc done event. This ensures that there will be no extra ECC decode IRQ.

MT2712 ECC HW is designed to generate only one decode IRQ each page, so
there is no this problem on MT2712 platforms.

Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
---
 drivers/mtd/nand/mtk_ecc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
index 82aa6f2..0e04323 100644
--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
 		op = ECC_DECODE;
 		dec = readw(ecc->regs + ECC_DECDONE);
 		if (dec & ecc->sectors) {
+			/**
+			 * Clear decode IRQ status once again to ensure that
+			 * there will be no extra IRQ.
+			 */
+			dec = readw(ecc->regs + ECC_DECIRQ_STA);
 			ecc->sectors = 0;
 			complete(&ecc->done);
 		} else {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-27 12:44     ` Boris Brezillon
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-27 12:44 UTC (permalink / raw)
  To: Xiaolei Li
  Cc: computersforpeace, srv_heupstream, linux-mtd, linux-mediatek,
	dwmw2, rogercc.lin

Hi Xiaolei,

On Mon, 23 Oct 2017 16:21:37 +0800
Xiaolei Li <xiaolei.li@mediatek.com> wrote:

> For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> during long time burn test on some platforms. Once this issue occurred,
> the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> and threads cannot be scheduled.
> 
> ECC HW generates decode IRQ each sector, so there will have more than one
> decode IRQ if read one page of large page NAND.
> 
> Currently, ECC IRQ handle flow is that we will check whether it is decode
> IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> cleared at the same time.
> Secondly, we will check whether all sectors are decoded by reading the
> register ECC_DECDONE. This is because the current IRQ may be not dealed
> in time, and the next sectors have been decoded before reading the
> register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> be generated.
> Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> next ECC IRQ.
> 
> But, there is a timing issue between step one and two. When we read the
> reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> and the ECC IRQ signal is cleared. But the last sector is decoded before
> reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> it means we will receive one extra ECC IRQ later. In step three, we will
> find that all sectors were decoded, then disable ECC IRQ and return.
> When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> when the register ECC_IRQ_REG(op) is enabled. But actually we have
> disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> keep receiving ECC decode IRQ.
> 
> Now, we read the register ECC_DECIRQ_STA once again before completing the
> ecc done event. This ensures that there will be no extra ECC decode IRQ.

Why don't you remove the

	writel(0, ecc->regs + ECC_IRQ_REG(op));

line in the irq handler and add

	readw(ecc->regs + ECC_DECIRQ_STA);

just before disabling the irq in mtk_ecc_disable().

It really looks weird to have the IRQ handler disable the IRQ on its
own. It's usually the caller who decides when the IRQ (or set of IRQs)
should be enabled/disabled. Moreover, I'm not sure you're guaranteed
that no new interrupts will come in between the ECC_DECIRQ_STA read
and the ECC_IRQ_REG() write you're doing in your irq handler, while,
doing that after the engine has been disabled (in mtk_ecc_disable())
should guarantee that nothing can happen after you have read the status
reg.

> 
> MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> there is no this problem on MT2712 platforms.
> 
> Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>

You miss a Fixes tag, and you might want to add a Cc-stable tag to
backport the fix.

Regards,

Boris

> ---
>  drivers/mtd/nand/mtk_ecc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> index 82aa6f2..0e04323 100644
> --- a/drivers/mtd/nand/mtk_ecc.c
> +++ b/drivers/mtd/nand/mtk_ecc.c
> @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
>  		op = ECC_DECODE;
>  		dec = readw(ecc->regs + ECC_DECDONE);
>  		if (dec & ecc->sectors) {
> +			/**
> +			 * Clear decode IRQ status once again to ensure that
> +			 * there will be no extra IRQ.
> +			 */
> +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
>  			ecc->sectors = 0;
>  			complete(&ecc->done);
>  		} else {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-27 12:44     ` Boris Brezillon
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-27 12:44 UTC (permalink / raw)
  To: Xiaolei Li
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

Hi Xiaolei,

On Mon, 23 Oct 2017 16:21:37 +0800
Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> wrote:

> For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> during long time burn test on some platforms. Once this issue occurred,
> the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> and threads cannot be scheduled.
> 
> ECC HW generates decode IRQ each sector, so there will have more than one
> decode IRQ if read one page of large page NAND.
> 
> Currently, ECC IRQ handle flow is that we will check whether it is decode
> IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> cleared at the same time.
> Secondly, we will check whether all sectors are decoded by reading the
> register ECC_DECDONE. This is because the current IRQ may be not dealed
> in time, and the next sectors have been decoded before reading the
> register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> be generated.
> Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> next ECC IRQ.
> 
> But, there is a timing issue between step one and two. When we read the
> reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> and the ECC IRQ signal is cleared. But the last sector is decoded before
> reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> it means we will receive one extra ECC IRQ later. In step three, we will
> find that all sectors were decoded, then disable ECC IRQ and return.
> When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> when the register ECC_IRQ_REG(op) is enabled. But actually we have
> disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> keep receiving ECC decode IRQ.
> 
> Now, we read the register ECC_DECIRQ_STA once again before completing the
> ecc done event. This ensures that there will be no extra ECC decode IRQ.

Why don't you remove the

	writel(0, ecc->regs + ECC_IRQ_REG(op));

line in the irq handler and add

	readw(ecc->regs + ECC_DECIRQ_STA);

just before disabling the irq in mtk_ecc_disable().

It really looks weird to have the IRQ handler disable the IRQ on its
own. It's usually the caller who decides when the IRQ (or set of IRQs)
should be enabled/disabled. Moreover, I'm not sure you're guaranteed
that no new interrupts will come in between the ECC_DECIRQ_STA read
and the ECC_IRQ_REG() write you're doing in your irq handler, while,
doing that after the engine has been disabled (in mtk_ecc_disable())
should guarantee that nothing can happen after you have read the status
reg.

> 
> MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> there is no this problem on MT2712 platforms.
> 
> Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>

You miss a Fixes tag, and you might want to add a Cc-stable tag to
backport the fix.

Regards,

Boris

> ---
>  drivers/mtd/nand/mtk_ecc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> index 82aa6f2..0e04323 100644
> --- a/drivers/mtd/nand/mtk_ecc.c
> +++ b/drivers/mtd/nand/mtk_ecc.c
> @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
>  		op = ECC_DECODE;
>  		dec = readw(ecc->regs + ECC_DECDONE);
>  		if (dec & ecc->sectors) {
> +			/**
> +			 * Clear decode IRQ status once again to ensure that
> +			 * there will be no extra IRQ.
> +			 */
> +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
>  			ecc->sectors = 0;
>  			complete(&ecc->done);
>  		} else {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 1/2] mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
@ 2017-10-27 12:45     ` Boris Brezillon
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-27 12:45 UTC (permalink / raw)
  To: Xiaolei Li
  Cc: computersforpeace, srv_heupstream, linux-mtd, linux-mediatek,
	dwmw2, rogercc.lin

On Mon, 23 Oct 2017 16:21:36 +0800
Xiaolei Li <xiaolei.li@mediatek.com> wrote:

> There will disable ECC IRQ in ECC IRQ handle function, so no need to
> disable it again in the function mtk_ecc_disable.
> 
> Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>
> ---
>  drivers/mtd/nand/mtk_ecc.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> index 7f3b065..82aa6f2 100644
> --- a/drivers/mtd/nand/mtk_ecc.c
> +++ b/drivers/mtd/nand/mtk_ecc.c
> @@ -307,7 +307,6 @@ void mtk_ecc_disable(struct mtk_ecc *ecc)
>  
>  	/* disable it */
>  	mtk_ecc_wait_idle(ecc, op);
> -	writew(0, ecc->regs + ECC_IRQ_REG(op));

As explained in my review of patch 2, I would do it the other way
around: remove this write from the irq handler and keep it here.

>  	writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
>  
>  	mutex_unlock(&ecc->lock);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 1/2] mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable
@ 2017-10-27 12:45     ` Boris Brezillon
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-27 12:45 UTC (permalink / raw)
  To: Xiaolei Li
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

On Mon, 23 Oct 2017 16:21:36 +0800
Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> wrote:

> There will disable ECC IRQ in ECC IRQ handle function, so no need to
> disable it again in the function mtk_ecc_disable.
> 
> Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/mtd/nand/mtk_ecc.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> index 7f3b065..82aa6f2 100644
> --- a/drivers/mtd/nand/mtk_ecc.c
> +++ b/drivers/mtd/nand/mtk_ecc.c
> @@ -307,7 +307,6 @@ void mtk_ecc_disable(struct mtk_ecc *ecc)
>  
>  	/* disable it */
>  	mtk_ecc_wait_idle(ecc, op);
> -	writew(0, ecc->regs + ECC_IRQ_REG(op));

As explained in my review of patch 2, I would do it the other way
around: remove this write from the irq handler and keep it here.

>  	writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
>  
>  	mutex_unlock(&ecc->lock);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
  2017-10-27 12:44     ` Boris Brezillon
@ 2017-10-28  4:05       ` xiaolei li
  -1 siblings, 0 replies; 14+ messages in thread
From: xiaolei li @ 2017-10-28  4:05 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: computersforpeace, srv_heupstream, linux-mtd, linux-mediatek,
	dwmw2, rogercc.lin

Hi Boris,

On Fri, 2017-10-27 at 14:44 +0200, Boris Brezillon wrote:
> Hi Xiaolei,
> 
> On Mon, 23 Oct 2017 16:21:37 +0800
> Xiaolei Li <xiaolei.li@mediatek.com> wrote:
> 
> > For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> > during long time burn test on some platforms. Once this issue occurred,
> > the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> > and threads cannot be scheduled.
> > 
> > ECC HW generates decode IRQ each sector, so there will have more than one
> > decode IRQ if read one page of large page NAND.
> > 
> > Currently, ECC IRQ handle flow is that we will check whether it is decode
> > IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> > type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> > cleared at the same time.
> > Secondly, we will check whether all sectors are decoded by reading the
> > register ECC_DECDONE. This is because the current IRQ may be not dealed
> > in time, and the next sectors have been decoded before reading the
> > register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> > be generated.
> > Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> > will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> > programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> > next ECC IRQ.
> > 
> > But, there is a timing issue between step one and two. When we read the
> > reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> > and the ECC IRQ signal is cleared. But the last sector is decoded before
> > reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> > it means we will receive one extra ECC IRQ later. In step three, we will
> > find that all sectors were decoded, then disable ECC IRQ and return.
> > When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> > anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> > when the register ECC_IRQ_REG(op) is enabled. But actually we have
> > disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> > keep receiving ECC decode IRQ.
> > 
> > Now, we read the register ECC_DECIRQ_STA once again before completing the
> > ecc done event. This ensures that there will be no extra ECC decode IRQ.
> 
> Why don't you remove the
> 
> 	writel(0, ecc->regs + ECC_IRQ_REG(op));
> 
> line in the irq handler and add
> 
> 	readw(ecc->regs + ECC_DECIRQ_STA);
> 
> just before disabling the irq in mtk_ecc_disable().
> 
> It really looks weird to have the IRQ handler disable the IRQ on its
> own. It's usually the caller who decides when the IRQ (or set of IRQs)
> should be enabled/disabled. 

It is reasonable. I will remove
 writel(0, ecc->regs + ECC_IRQ_REG(op));
from irq handler, and keep it in mtk_ecc_disable().

But I thinks it is better to add
 readw(ecc->regs + ECC_DECIRQ_STA);
before completing ecc done event in the irq handler than in
mtk_ecc_disable().

Please let me show some backgrounds at first:
1. ECC irq signal is low level vaild.
2. Reading ECC_DECIRQ_STA can pull ecc irq signal high, but it is just
valid if ECC_IRQ_REG(op) is enabled.

At the beginning of irq handler, I read ECC_DECIRQ_STA and this pulls
ecc irq signal high. But ecc irq signal may be pulled down again between
ECC_DECIRQ_STA read and ECC_DECDONE read, if one sector is decoded done
between them. And this means that one IRQ will come in later. But if all
sectors are decoded done now, we will complete ecc done event, and there
is really no need to deal with this extra IRQ.
So, read ECC_DECIRQ_STA before completing ecc done event, this can
guarantee that ecc irq signal is always high and no extra IRQ later.

We can keep mtk_ecc_disable() the same. Because all irqs have been
handled.

I am not sure whether this explanation is clear?

> Moreover, I'm not sure you're guaranteed
> that no new interrupts will come in between the ECC_DECIRQ_STA read
> and the ECC_IRQ_REG() write you're doing in your irq handler, while,
> doing that after the engine has been disabled (in mtk_ecc_disable())
> should guarantee that nothing can happen after you have read the status
> reg.
> 
> > 
> > MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> > there is no this problem on MT2712 platforms.
> > 
> > Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>
> 
> You miss a Fixes tag, and you might want to add a Cc-stable tag to
> backport the fix.
> 
OK. Thanks.

> Regards,
> 
> Boris
> 
> > ---
> >  drivers/mtd/nand/mtk_ecc.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> > index 82aa6f2..0e04323 100644
> > --- a/drivers/mtd/nand/mtk_ecc.c
> > +++ b/drivers/mtd/nand/mtk_ecc.c
> > @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
> >  		op = ECC_DECODE;
> >  		dec = readw(ecc->regs + ECC_DECDONE);
> >  		if (dec & ecc->sectors) {
> > +			/**
> > +			 * Clear decode IRQ status once again to ensure that
> > +			 * there will be no extra IRQ.
> > +			 */
> > +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
> >  			ecc->sectors = 0;
> >  			complete(&ecc->done);
> >  		} else {
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-28  4:05       ` xiaolei li
  0 siblings, 0 replies; 14+ messages in thread
From: xiaolei li @ 2017-10-28  4:05 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

Hi Boris,

On Fri, 2017-10-27 at 14:44 +0200, Boris Brezillon wrote:
> Hi Xiaolei,
> 
> On Mon, 23 Oct 2017 16:21:37 +0800
> Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> wrote:
> 
> > For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> > during long time burn test on some platforms. Once this issue occurred,
> > the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> > and threads cannot be scheduled.
> > 
> > ECC HW generates decode IRQ each sector, so there will have more than one
> > decode IRQ if read one page of large page NAND.
> > 
> > Currently, ECC IRQ handle flow is that we will check whether it is decode
> > IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> > type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> > cleared at the same time.
> > Secondly, we will check whether all sectors are decoded by reading the
> > register ECC_DECDONE. This is because the current IRQ may be not dealed
> > in time, and the next sectors have been decoded before reading the
> > register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> > be generated.
> > Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> > will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> > programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> > next ECC IRQ.
> > 
> > But, there is a timing issue between step one and two. When we read the
> > reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> > and the ECC IRQ signal is cleared. But the last sector is decoded before
> > reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> > it means we will receive one extra ECC IRQ later. In step three, we will
> > find that all sectors were decoded, then disable ECC IRQ and return.
> > When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> > anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> > when the register ECC_IRQ_REG(op) is enabled. But actually we have
> > disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> > keep receiving ECC decode IRQ.
> > 
> > Now, we read the register ECC_DECIRQ_STA once again before completing the
> > ecc done event. This ensures that there will be no extra ECC decode IRQ.
> 
> Why don't you remove the
> 
> 	writel(0, ecc->regs + ECC_IRQ_REG(op));
> 
> line in the irq handler and add
> 
> 	readw(ecc->regs + ECC_DECIRQ_STA);
> 
> just before disabling the irq in mtk_ecc_disable().
> 
> It really looks weird to have the IRQ handler disable the IRQ on its
> own. It's usually the caller who decides when the IRQ (or set of IRQs)
> should be enabled/disabled. 

It is reasonable. I will remove
 writel(0, ecc->regs + ECC_IRQ_REG(op));
from irq handler, and keep it in mtk_ecc_disable().

But I thinks it is better to add
 readw(ecc->regs + ECC_DECIRQ_STA);
before completing ecc done event in the irq handler than in
mtk_ecc_disable().

Please let me show some backgrounds at first:
1. ECC irq signal is low level vaild.
2. Reading ECC_DECIRQ_STA can pull ecc irq signal high, but it is just
valid if ECC_IRQ_REG(op) is enabled.

At the beginning of irq handler, I read ECC_DECIRQ_STA and this pulls
ecc irq signal high. But ecc irq signal may be pulled down again between
ECC_DECIRQ_STA read and ECC_DECDONE read, if one sector is decoded done
between them. And this means that one IRQ will come in later. But if all
sectors are decoded done now, we will complete ecc done event, and there
is really no need to deal with this extra IRQ.
So, read ECC_DECIRQ_STA before completing ecc done event, this can
guarantee that ecc irq signal is always high and no extra IRQ later.

We can keep mtk_ecc_disable() the same. Because all irqs have been
handled.

I am not sure whether this explanation is clear?

> Moreover, I'm not sure you're guaranteed
> that no new interrupts will come in between the ECC_DECIRQ_STA read
> and the ECC_IRQ_REG() write you're doing in your irq handler, while,
> doing that after the engine has been disabled (in mtk_ecc_disable())
> should guarantee that nothing can happen after you have read the status
> reg.
> 
> > 
> > MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> > there is no this problem on MT2712 platforms.
> > 
> > Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>
> 
> You miss a Fixes tag, and you might want to add a Cc-stable tag to
> backport the fix.
> 
OK. Thanks.

> Regards,
> 
> Boris
> 
> > ---
> >  drivers/mtd/nand/mtk_ecc.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> > index 82aa6f2..0e04323 100644
> > --- a/drivers/mtd/nand/mtk_ecc.c
> > +++ b/drivers/mtd/nand/mtk_ecc.c
> > @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
> >  		op = ECC_DECODE;
> >  		dec = readw(ecc->regs + ECC_DECDONE);
> >  		if (dec & ecc->sectors) {
> > +			/**
> > +			 * Clear decode IRQ status once again to ensure that
> > +			 * there will be no extra IRQ.
> > +			 */
> > +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
> >  			ecc->sectors = 0;
> >  			complete(&ecc->done);
> >  		} else {
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
  2017-10-28  4:05       ` xiaolei li
@ 2017-10-29 19:47         ` Boris Brezillon
  -1 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-29 19:47 UTC (permalink / raw)
  To: xiaolei li
  Cc: computersforpeace, srv_heupstream, linux-mtd, linux-mediatek,
	dwmw2, rogercc.lin

On Sat, 28 Oct 2017 12:05:57 +0800
xiaolei li <xiaolei.li@mediatek.com> wrote:

> Hi Boris,
> 
> On Fri, 2017-10-27 at 14:44 +0200, Boris Brezillon wrote:
> > Hi Xiaolei,
> > 
> > On Mon, 23 Oct 2017 16:21:37 +0800
> > Xiaolei Li <xiaolei.li@mediatek.com> wrote:
> >   
> > > For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> > > during long time burn test on some platforms. Once this issue occurred,
> > > the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> > > and threads cannot be scheduled.
> > > 
> > > ECC HW generates decode IRQ each sector, so there will have more than one
> > > decode IRQ if read one page of large page NAND.
> > > 
> > > Currently, ECC IRQ handle flow is that we will check whether it is decode
> > > IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> > > type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> > > cleared at the same time.
> > > Secondly, we will check whether all sectors are decoded by reading the
> > > register ECC_DECDONE. This is because the current IRQ may be not dealed
> > > in time, and the next sectors have been decoded before reading the
> > > register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> > > be generated.
> > > Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> > > will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> > > programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> > > next ECC IRQ.
> > > 
> > > But, there is a timing issue between step one and two. When we read the
> > > reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> > > and the ECC IRQ signal is cleared. But the last sector is decoded before
> > > reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> > > it means we will receive one extra ECC IRQ later. In step three, we will
> > > find that all sectors were decoded, then disable ECC IRQ and return.
> > > When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> > > anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> > > when the register ECC_IRQ_REG(op) is enabled. But actually we have
> > > disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> > > keep receiving ECC decode IRQ.
> > > 
> > > Now, we read the register ECC_DECIRQ_STA once again before completing the
> > > ecc done event. This ensures that there will be no extra ECC decode IRQ.  
> > 
> > Why don't you remove the
> > 
> > 	writel(0, ecc->regs + ECC_IRQ_REG(op));
> > 
> > line in the irq handler and add
> > 
> > 	readw(ecc->regs + ECC_DECIRQ_STA);
> > 
> > just before disabling the irq in mtk_ecc_disable().
> > 
> > It really looks weird to have the IRQ handler disable the IRQ on its
> > own. It's usually the caller who decides when the IRQ (or set of IRQs)
> > should be enabled/disabled.   
> 
> It is reasonable. I will remove
>  writel(0, ecc->regs + ECC_IRQ_REG(op));
> from irq handler, and keep it in mtk_ecc_disable().
> 
> But I thinks it is better to add
>  readw(ecc->regs + ECC_DECIRQ_STA);
> before completing ecc done event in the irq handler than in
> mtk_ecc_disable().
> 
> Please let me show some backgrounds at first:
> 1. ECC irq signal is low level vaild.
> 2. Reading ECC_DECIRQ_STA can pull ecc irq signal high, but it is just
> valid if ECC_IRQ_REG(op) is enabled.
> 
> At the beginning of irq handler, I read ECC_DECIRQ_STA and this pulls
> ecc irq signal high. But ecc irq signal may be pulled down again between
> ECC_DECIRQ_STA read and ECC_DECDONE read, if one sector is decoded done
> between them. And this means that one IRQ will come in later. But if all
> sectors are decoded done now, we will complete ecc done event, and there
> is really no need to deal with this extra IRQ.
> So, read ECC_DECIRQ_STA before completing ecc done event, this can
> guarantee that ecc irq signal is always high and no extra IRQ later.
> 
> We can keep mtk_ecc_disable() the same. Because all irqs have been
> handled.
> 
> I am not sure whether this explanation is clear?

I'd say that it's safer to read ECC_DECIRQ_STA both in the irq
handler (where you want to add it) and in mtk_ecc_disable(), just in
case you had a timeout on the wait_for_completion_timeout() call (not
sure the one in mtk_ecc_disable() is necessary though).

> 
> > Moreover, I'm not sure you're guaranteed
> > that no new interrupts will come in between the ECC_DECIRQ_STA read
> > and the ECC_IRQ_REG() write you're doing in your irq handler, while,
> > doing that after the engine has been disabled (in mtk_ecc_disable())
> > should guarantee that nothing can happen after you have read the status
> > reg.
> >   
> > > 
> > > MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> > > there is no this problem on MT2712 platforms.
> > > 
> > > Signed-off-by: Xiaolei Li <xiaolei.li@mediatek.com>  
> > 
> > You miss a Fixes tag, and you might want to add a Cc-stable tag to
> > backport the fix.
> >   
> OK. Thanks.
> 
> > Regards,
> > 
> > Boris
> >   
> > > ---
> > >  drivers/mtd/nand/mtk_ecc.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> > > index 82aa6f2..0e04323 100644
> > > --- a/drivers/mtd/nand/mtk_ecc.c
> > > +++ b/drivers/mtd/nand/mtk_ecc.c
> > > @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
> > >  		op = ECC_DECODE;
> > >  		dec = readw(ecc->regs + ECC_DECDONE);
> > >  		if (dec & ecc->sectors) {
> > > +			/**
> > > +			 * Clear decode IRQ status once again to ensure that
> > > +			 * there will be no extra IRQ.
> > > +			 */
> > > +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
> > >  			ecc->sectors = 0;
> > >  			complete(&ecc->done);
> > >  		} else {  
> >   
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue
@ 2017-10-29 19:47         ` Boris Brezillon
  0 siblings, 0 replies; 14+ messages in thread
From: Boris Brezillon @ 2017-10-29 19:47 UTC (permalink / raw)
  To: xiaolei li
  Cc: srv_heupstream-NuS5LvNUpcJWk0Htik3J/w,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	computersforpeace-Re5JQEeQqe8AvxtiuMwx3w,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ, rogercc.lin-NuS5LvNUpcJWk0Htik3J/w

On Sat, 28 Oct 2017 12:05:57 +0800
xiaolei li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> wrote:

> Hi Boris,
> 
> On Fri, 2017-10-27 at 14:44 +0200, Boris Brezillon wrote:
> > Hi Xiaolei,
> > 
> > On Mon, 23 Oct 2017 16:21:37 +0800
> > Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org> wrote:
> >   
> > > For MT2701 NAND Controller, there may generate infinite ECC decode IRQ
> > > during long time burn test on some platforms. Once this issue occurred,
> > > the ECC decode IRQ status cannot be cleared in the IRQ handler function,
> > > and threads cannot be scheduled.
> > > 
> > > ECC HW generates decode IRQ each sector, so there will have more than one
> > > decode IRQ if read one page of large page NAND.
> > > 
> > > Currently, ECC IRQ handle flow is that we will check whether it is decode
> > > IRQ at first by reading the register ECC_DECIRQ_STA. This is a read-clear
> > > type register. If this IRQ is decode IRQ, then the ECC IRQ signal will be
> > > cleared at the same time.
> > > Secondly, we will check whether all sectors are decoded by reading the
> > > register ECC_DECDONE. This is because the current IRQ may be not dealed
> > > in time, and the next sectors have been decoded before reading the
> > > register ECC_DECIRQ_STA. Then, the next sectors's decode IRQs will not
> > > be generated.
> > > Thirdly, if all sectors are decoded by comparing with ecc->sectors, then we
> > > will complete ecc->done, set ecc->sectors as 0, and disable ECC IRQ by
> > > programming the register ECC_IRQ_REG(op) as 0. Otherwise, wait for the
> > > next ECC IRQ.
> > > 
> > > But, there is a timing issue between step one and two. When we read the
> > > reigster ECC_DECIRQ_STA, all sectors are decoded except the last sector,
> > > and the ECC IRQ signal is cleared. But the last sector is decoded before
> > > reading ECC_DECDONE, so the ECC IRQ signal is enabled again by ECC HW, and
> > > it means we will receive one extra ECC IRQ later. In step three, we will
> > > find that all sectors were decoded, then disable ECC IRQ and return.
> > > When deal with the extra ECC IRQ, the ECC IRQ status cannot be cleared
> > > anymore. That is because the register ECC_DECIRQ_STA can only be cleared
> > > when the register ECC_IRQ_REG(op) is enabled. But actually we have
> > > disabled ECC IRQ in the previous ECC IRQ handle. So, there will
> > > keep receiving ECC decode IRQ.
> > > 
> > > Now, we read the register ECC_DECIRQ_STA once again before completing the
> > > ecc done event. This ensures that there will be no extra ECC decode IRQ.  
> > 
> > Why don't you remove the
> > 
> > 	writel(0, ecc->regs + ECC_IRQ_REG(op));
> > 
> > line in the irq handler and add
> > 
> > 	readw(ecc->regs + ECC_DECIRQ_STA);
> > 
> > just before disabling the irq in mtk_ecc_disable().
> > 
> > It really looks weird to have the IRQ handler disable the IRQ on its
> > own. It's usually the caller who decides when the IRQ (or set of IRQs)
> > should be enabled/disabled.   
> 
> It is reasonable. I will remove
>  writel(0, ecc->regs + ECC_IRQ_REG(op));
> from irq handler, and keep it in mtk_ecc_disable().
> 
> But I thinks it is better to add
>  readw(ecc->regs + ECC_DECIRQ_STA);
> before completing ecc done event in the irq handler than in
> mtk_ecc_disable().
> 
> Please let me show some backgrounds at first:
> 1. ECC irq signal is low level vaild.
> 2. Reading ECC_DECIRQ_STA can pull ecc irq signal high, but it is just
> valid if ECC_IRQ_REG(op) is enabled.
> 
> At the beginning of irq handler, I read ECC_DECIRQ_STA and this pulls
> ecc irq signal high. But ecc irq signal may be pulled down again between
> ECC_DECIRQ_STA read and ECC_DECDONE read, if one sector is decoded done
> between them. And this means that one IRQ will come in later. But if all
> sectors are decoded done now, we will complete ecc done event, and there
> is really no need to deal with this extra IRQ.
> So, read ECC_DECIRQ_STA before completing ecc done event, this can
> guarantee that ecc irq signal is always high and no extra IRQ later.
> 
> We can keep mtk_ecc_disable() the same. Because all irqs have been
> handled.
> 
> I am not sure whether this explanation is clear?

I'd say that it's safer to read ECC_DECIRQ_STA both in the irq
handler (where you want to add it) and in mtk_ecc_disable(), just in
case you had a timeout on the wait_for_completion_timeout() call (not
sure the one in mtk_ecc_disable() is necessary though).

> 
> > Moreover, I'm not sure you're guaranteed
> > that no new interrupts will come in between the ECC_DECIRQ_STA read
> > and the ECC_IRQ_REG() write you're doing in your irq handler, while,
> > doing that after the engine has been disabled (in mtk_ecc_disable())
> > should guarantee that nothing can happen after you have read the status
> > reg.
> >   
> > > 
> > > MT2712 ECC HW is designed to generate only one decode IRQ each page, so
> > > there is no this problem on MT2712 platforms.
> > > 
> > > Signed-off-by: Xiaolei Li <xiaolei.li-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>  
> > 
> > You miss a Fixes tag, and you might want to add a Cc-stable tag to
> > backport the fix.
> >   
> OK. Thanks.
> 
> > Regards,
> > 
> > Boris
> >   
> > > ---
> > >  drivers/mtd/nand/mtk_ecc.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
> > > index 82aa6f2..0e04323 100644
> > > --- a/drivers/mtd/nand/mtk_ecc.c
> > > +++ b/drivers/mtd/nand/mtk_ecc.c
> > > @@ -115,6 +115,11 @@ static irqreturn_t mtk_ecc_irq(int irq, void *id)
> > >  		op = ECC_DECODE;
> > >  		dec = readw(ecc->regs + ECC_DECDONE);
> > >  		if (dec & ecc->sectors) {
> > > +			/**
> > > +			 * Clear decode IRQ status once again to ensure that
> > > +			 * there will be no extra IRQ.
> > > +			 */
> > > +			dec = readw(ecc->regs + ECC_DECIRQ_STA);
> > >  			ecc->sectors = 0;
> > >  			complete(&ecc->done);
> > >  		} else {  
> >   
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-10-29 19:47 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-23  8:21 [PATCH v1 0/2] Improve MTK ECC Engine driver Xiaolei Li
2017-10-23  8:21 ` Xiaolei Li
2017-10-23  8:21 ` [PATCH v1 1/2] mtd: nand: mtk: do not disable ECC IRQ in the function mtk_ecc_disable Xiaolei Li
2017-10-23  8:21   ` Xiaolei Li
2017-10-27 12:45   ` Boris Brezillon
2017-10-27 12:45     ` Boris Brezillon
2017-10-23  8:21 ` [PATCH v1 2/2] mtd: nand: mtk: fix infinite ECC decode IRQ issue Xiaolei Li
2017-10-23  8:21   ` Xiaolei Li
2017-10-27 12:44   ` Boris Brezillon
2017-10-27 12:44     ` Boris Brezillon
2017-10-28  4:05     ` xiaolei li
2017-10-28  4:05       ` xiaolei li
2017-10-29 19:47       ` Boris Brezillon
2017-10-29 19:47         ` Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.