All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robert Jarzmik <robert.jarzmik@free.fr>
To: Petr Cvek <petr.cvek@tul.cz>
Cc: ulf.hansson@linaro.org, linux-mmc@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 4/4] mmc: pxamci: Fix race condition between pxamci_dma_irq() and pxamci_irq()
Date: Wed, 19 Apr 2017 21:22:32 +0200	[thread overview]
Message-ID: <87fuh46w3r.fsf@belgarion.home> (raw)
In-Reply-To: <31e332fa-f152-1eff-39fb-91f332b84757@tul.cz> (Petr Cvek's message of "Wed, 19 Apr 2017 01:18:00 +0200")

Petr Cvek <petr.cvek@tul.cz> writes:

> The data write requests may require an FIFO flush when the DMA transaction
> ends. This is handled by a DMA callback pxamci_dma_irq(). After flushing
> the FIFO the MCI controller generates the DATA_TRAN_DONE interrupt.
>
> Problem is the DATA_TRAN_DONE interrupt will be generated when the write
> data length is divisible by the FIFO size (no flush is required). And in
> this case the DMA callback can be called long time after the
> DATA_TRAN_DONE interrupt (as the DMA callback is realised by a tasklet,
> it can even stack). When the DMA callback is finally called there can
> already be a different type of the transaction (another data read or write
> request).
>
> The dmaengine_tx_status() will be called for a wrong DMA transaction and
> in some case it returns DMA_IN_PROGRESS, which the code recognize as
> an error and ends a running DMA and halts the MCI controller.
>
> The problem presents itself under heavy (interrupt) load with a high MCI
> traffic with this message:
>
> 	mmc0: DMA error on tx channel
>
> The fix must obey these situations:
>  - Any command will erase the FIFO
>  - Data writes divisible by the FIFO size will (probably) automatically
>    generate a DATA_TRAN_DONE interrupt
>  - Data writes with a nonzero FIFO remainder must be flushed and then MCI
>    generates a DATA_TRAN_DONE interrupt
>  - Data reads do not require a flush but they will generate
>    a DATA_TRAN_DONE interrupt
>
> The fix changes the DATA_TRAN_DONE interrupt enable from read/write
> requests to read requests. The DATA_TRAN_DONE interrupt for a write
> request is enabled in the DMA callback, this assures  a DATA_TRAN_DONE
> interrupt will be always called after a callback (with or without an FIFO
> flush).

I'm a bit concerned with the way this patch works.
What bothers me is the re-enabling of the interrupt source in the DMA completion
path, ie. in pxamci_dma_irq().
For example, imagine :
 - the tran_done bit is left set (for whatever reason)
 - a new transation is queued
 - the DMA finishes, but not the last request
 - the pxamci_enable_irq() enables the interrupt, which fires right away even if
   the tran_done for this interrupt wasn't yet set

I will need a bit more time to think this one through, as I'm not yet set about
all the consequences. That shouldn't prevent you from pushing for reviews of
these patches of course, as I think this serie (or an equivalent) is required to
fix the current race condition.

As this is the last patch, I wonder if this serie is bisectable, especially is
patch 1/4 self contained ?

Cheers.

--
Robert

WARNING: multiple messages have this Message-ID (diff)
From: robert.jarzmik@free.fr (Robert Jarzmik)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 4/4] mmc: pxamci: Fix race condition between pxamci_dma_irq() and pxamci_irq()
Date: Wed, 19 Apr 2017 21:22:32 +0200	[thread overview]
Message-ID: <87fuh46w3r.fsf@belgarion.home> (raw)
In-Reply-To: <31e332fa-f152-1eff-39fb-91f332b84757@tul.cz> (Petr Cvek's message of "Wed, 19 Apr 2017 01:18:00 +0200")

Petr Cvek <petr.cvek@tul.cz> writes:

> The data write requests may require an FIFO flush when the DMA transaction
> ends. This is handled by a DMA callback pxamci_dma_irq(). After flushing
> the FIFO the MCI controller generates the DATA_TRAN_DONE interrupt.
>
> Problem is the DATA_TRAN_DONE interrupt will be generated when the write
> data length is divisible by the FIFO size (no flush is required). And in
> this case the DMA callback can be called long time after the
> DATA_TRAN_DONE interrupt (as the DMA callback is realised by a tasklet,
> it can even stack). When the DMA callback is finally called there can
> already be a different type of the transaction (another data read or write
> request).
>
> The dmaengine_tx_status() will be called for a wrong DMA transaction and
> in some case it returns DMA_IN_PROGRESS, which the code recognize as
> an error and ends a running DMA and halts the MCI controller.
>
> The problem presents itself under heavy (interrupt) load with a high MCI
> traffic with this message:
>
> 	mmc0: DMA error on tx channel
>
> The fix must obey these situations:
>  - Any command will erase the FIFO
>  - Data writes divisible by the FIFO size will (probably) automatically
>    generate a DATA_TRAN_DONE interrupt
>  - Data writes with a nonzero FIFO remainder must be flushed and then MCI
>    generates a DATA_TRAN_DONE interrupt
>  - Data reads do not require a flush but they will generate
>    a DATA_TRAN_DONE interrupt
>
> The fix changes the DATA_TRAN_DONE interrupt enable from read/write
> requests to read requests. The DATA_TRAN_DONE interrupt for a write
> request is enabled in the DMA callback, this assures  a DATA_TRAN_DONE
> interrupt will be always called after a callback (with or without an FIFO
> flush).

I'm a bit concerned with the way this patch works.
What bothers me is the re-enabling of the interrupt source in the DMA completion
path, ie. in pxamci_dma_irq().
For example, imagine :
 - the tran_done bit is left set (for whatever reason)
 - a new transation is queued
 - the DMA finishes, but not the last request
 - the pxamci_enable_irq() enables the interrupt, which fires right away even if
   the tran_done for this interrupt wasn't yet set

I will need a bit more time to think this one through, as I'm not yet set about
all the consequences. That shouldn't prevent you from pushing for reviews of
these patches of course, as I think this serie (or an equivalent) is required to
fix the current race condition.

As this is the last patch, I wonder if this serie is bisectable, especially is
patch 1/4 self contained ?

Cheers.

--
Robert

  reply	other threads:[~2017-04-19 19:22 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1492492523.git.petr.cvek@tul.cz>
2017-04-18 23:16 ` [PATCH 1/4] mmc: pxamci: Use the right flags for DMA callback init Petr Cvek
2017-04-18 23:16   ` Petr Cvek
2017-04-21  0:31   ` Petr Cvek
2017-04-21  0:31     ` Petr Cvek
2017-04-18 23:17 ` [PATCH 2/4] mmc: pxamci: Enhance error checking Petr Cvek
2017-04-18 23:17   ` Petr Cvek
2017-04-19 19:12   ` Robert Jarzmik
2017-04-19 19:12     ` Robert Jarzmik
2017-04-18 23:17 ` [PATCH 3/4] mmc: pxamci: Disable DATA_TRAN_DONE interrupt sooner Petr Cvek
2017-04-18 23:17   ` Petr Cvek
2017-04-19 19:14   ` Robert Jarzmik
2017-04-19 19:14     ` Robert Jarzmik
2017-04-20 23:37     ` Petr Cvek
2017-04-20 23:37       ` Petr Cvek
2017-04-18 23:18 ` [PATCH 4/4] mmc: pxamci: Fix race condition between pxamci_dma_irq() and pxamci_irq() Petr Cvek
2017-04-18 23:18   ` Petr Cvek
2017-04-19 19:22   ` Robert Jarzmik [this message]
2017-04-19 19:22     ` Robert Jarzmik
2017-04-21  1:30     ` Petr Cvek
2017-04-21  1:30       ` Petr Cvek
2017-04-27 13:14       ` Robert Jarzmik
2017-04-27 13:14         ` Robert Jarzmik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fuh46w3r.fsf@belgarion.home \
    --to=robert.jarzmik@free.fr \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=petr.cvek@tul.cz \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.