All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frank Mori Hess <fmh6jj@gmail.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Vinod Koul <vkoul@kernel.org>,
	dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	r.baldyga@hackerion.com, Krzysztof Kozlowski <krzk@kernel.org>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Linux Samsung SOC <linux-samsung-soc@vger.kernel.org>
Subject: Revert "dmaengine: pl330: add DMA_PAUSE feature"
Date: Thu, 10 May 2018 12:04:08 -0400	[thread overview]
Message-ID: <CAJz5Opfd9zeEAShFtUNhW4yLfVykef=2QdWWzq88Dtddn4eymQ@mail.gmail.com> (raw)

On Thu, May 10, 2018 at 4:31 AM, Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> Hi Frank,
>
> On 2018-05-09 19:48, Frank Mori Hess wrote:
>> On Wed, May 9, 2018 at 9:19 AM, Marek Szyprowski
>> <m.szyprowski@samsung.com> wrote:
>>> I understand that pl330 doesn't support real PAUSE, but as it has been
>>> implemented since 2015 the limited support is perfectly possible - just
>>> to let serial driver to read the amount of data transferred.
>>>
>>> Please note that DMA engine documentation clearly states that the best
>>> residue granularity the driver might expect is BURST granularity. There
>>> is no way to get the information about the data which still sits in the
>>> DMA engine fifo when transfer is paused/terminated. This means that
>>> the serial driver should set maxburst = 1 if it is interested in
>>> getting exact number of bytes received/sent. With maxburst = 1 there
>>> is no such thing as data loose in the DMA engine fifo.
>> That is not my understanding of the dmaengine pause requirements, but
>> of course Vinod can speak authoritatively on this.
>
> Basing on the comments in dma_residue_granularity enum documentation (see
> include/linux/dmaengine.h) there is no better granularity of residue
> reporting than BURST units. If driver needs byte accuracy, then it should
> use MAXBURST=1 and DMA_SLAVE_BUSWIDTH_1_BYTE configuration.

That's an interesting line of thought.  The 8250 serial driver clearly
assumes it can do the sequence

dma pause
read residue
dma terminate

without data loss.  It checks for the capabilities

cmd_pause
cmd_terminate
residue_granularity != DMA_RESIDUE_GRANULARITY_DESCRIPTOR

and some of the 8250 drivers like 8250_dw.c set a maxburst > 1.  If it
can't count on the pause/residue/terminate working without data loss
then it is just broken.  As is the 8250_omap.c driver.  Is the
description of dmaengine_pause in the documentation of "This pauses
activity on the DMA channel without data loss" to be interpreted as
"as long as you resume afterwards"?

>> I also don't agree
>> that data loss cannot happen for single transfers.  It only becomes
>> less likely since there are fewer bytes in the dma controller fifo and
>> they stay there for a shorter period of time.  But even so, there is
>> nothing stopping the DMAKILL from killing the transfer thread after it
>> does a single byte load but before it does the associated single byte
>> store, as they are performed by separate instructions.  As far as your
>> serial port goes, the byte has been read by the host, even though it
>> never made it to memory, so it is gone forever.  The dma-330 does have
>> a load lock which prevents some operations from interrupting a
>> load/store combination, but in my observations DMAKILL does not
>> respect the load lock.
>
> For the last 3 years no one observed any issue with the current design
> (single transfers with DMA_SLAVE_BUSWIDTH_1_BYTE). By removing this
> feature we will loose ability to use DMA in the serial drivers, what is
> mainly useful for low-power bluetooth operation (serial console is really
> negligible case).

It's not surprising it hasn't been reported.  It is a race that
requires a DMAKILL to be issued just as a byte is in-flight through
the dma controller.  The only reason a driver would pause the
un-resumeable pl330 would be because the driver either knows or
suspects no more data will be arriving and it gives up on the
transfer.  The only reason I noticed is we had a test which sent data
to a serial port, waited just long enough for the serial port rx to
timeout, then sent more data just as the pause was issuing DMAKILL.
And then the test did this a few hundred thousand times until it
noticed bad data.  Also, given the way 8250 rx timeouts work, a person
typing into a serial console wouldn't type fast enough to trigger the
bug unless the baud rate was set extremely low.  And if someone lost a
keystroke every 10^5 bytes, who would blame the dma controller?

I do admit I didn't do this test using single transfers, because our
goal was getting bursts to work.  Unfortunately, the test program
relies on some custom hardware I no longer have access to so I can't
easily re-run the test now.  However, since the DMA-330 manual
documents the DMAKILL instruction to:

"Remove all entries in the MFIFO for the DMA channel."
"Remove all entries in the read buffer queue and write buffer queue
for the DMA channel."

Relying on it to work as a safe pause for single transfers seems like
wishful thinking.  I sure didn't want to believe the DMA-330 couldn't
be safely paused, but I eventually had to resign myself to it.
---
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: Frank Mori Hess <fmh6jj@gmail.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Vinod Koul <vkoul@kernel.org>,
	dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	r.baldyga@hackerion.com, Krzysztof Kozlowski <krzk@kernel.org>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>,
	Linux Samsung SOC <linux-samsung-soc@vger.kernel.org>
Subject: Re: Revert "dmaengine: pl330: add DMA_PAUSE feature"
Date: Thu, 10 May 2018 12:04:08 -0400	[thread overview]
Message-ID: <CAJz5Opfd9zeEAShFtUNhW4yLfVykef=2QdWWzq88Dtddn4eymQ@mail.gmail.com> (raw)
In-Reply-To: <eb17d6a7-a188-3ef4-2641-5fd588b16431@samsung.com>

On Thu, May 10, 2018 at 4:31 AM, Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> Hi Frank,
>
> On 2018-05-09 19:48, Frank Mori Hess wrote:
>> On Wed, May 9, 2018 at 9:19 AM, Marek Szyprowski
>> <m.szyprowski@samsung.com> wrote:
>>> I understand that pl330 doesn't support real PAUSE, but as it has been
>>> implemented since 2015 the limited support is perfectly possible - just
>>> to let serial driver to read the amount of data transferred.
>>>
>>> Please note that DMA engine documentation clearly states that the best
>>> residue granularity the driver might expect is BURST granularity. There
>>> is no way to get the information about the data which still sits in the
>>> DMA engine fifo when transfer is paused/terminated. This means that
>>> the serial driver should set maxburst = 1 if it is interested in
>>> getting exact number of bytes received/sent. With maxburst = 1 there
>>> is no such thing as data loose in the DMA engine fifo.
>> That is not my understanding of the dmaengine pause requirements, but
>> of course Vinod can speak authoritatively on this.
>
> Basing on the comments in dma_residue_granularity enum documentation (see
> include/linux/dmaengine.h) there is no better granularity of residue
> reporting than BURST units. If driver needs byte accuracy, then it should
> use MAXBURST=1 and DMA_SLAVE_BUSWIDTH_1_BYTE configuration.

That's an interesting line of thought.  The 8250 serial driver clearly
assumes it can do the sequence

dma pause
read residue
dma terminate

without data loss.  It checks for the capabilities

cmd_pause
cmd_terminate
residue_granularity != DMA_RESIDUE_GRANULARITY_DESCRIPTOR

and some of the 8250 drivers like 8250_dw.c set a maxburst > 1.  If it
can't count on the pause/residue/terminate working without data loss
then it is just broken.  As is the 8250_omap.c driver.  Is the
description of dmaengine_pause in the documentation of "This pauses
activity on the DMA channel without data loss" to be interpreted as
"as long as you resume afterwards"?

>> I also don't agree
>> that data loss cannot happen for single transfers.  It only becomes
>> less likely since there are fewer bytes in the dma controller fifo and
>> they stay there for a shorter period of time.  But even so, there is
>> nothing stopping the DMAKILL from killing the transfer thread after it
>> does a single byte load but before it does the associated single byte
>> store, as they are performed by separate instructions.  As far as your
>> serial port goes, the byte has been read by the host, even though it
>> never made it to memory, so it is gone forever.  The dma-330 does have
>> a load lock which prevents some operations from interrupting a
>> load/store combination, but in my observations DMAKILL does not
>> respect the load lock.
>
> For the last 3 years no one observed any issue with the current design
> (single transfers with DMA_SLAVE_BUSWIDTH_1_BYTE). By removing this
> feature we will loose ability to use DMA in the serial drivers, what is
> mainly useful for low-power bluetooth operation (serial console is really
> negligible case).

It's not surprising it hasn't been reported.  It is a race that
requires a DMAKILL to be issued just as a byte is in-flight through
the dma controller.  The only reason a driver would pause the
un-resumeable pl330 would be because the driver either knows or
suspects no more data will be arriving and it gives up on the
transfer.  The only reason I noticed is we had a test which sent data
to a serial port, waited just long enough for the serial port rx to
timeout, then sent more data just as the pause was issuing DMAKILL.
And then the test did this a few hundred thousand times until it
noticed bad data.  Also, given the way 8250 rx timeouts work, a person
typing into a serial console wouldn't type fast enough to trigger the
bug unless the baud rate was set extremely low.  And if someone lost a
keystroke every 10^5 bytes, who would blame the dma controller?

I do admit I didn't do this test using single transfers, because our
goal was getting bursts to work.  Unfortunately, the test program
relies on some custom hardware I no longer have access to so I can't
easily re-run the test now.  However, since the DMA-330 manual
documents the DMAKILL instruction to:

"Remove all entries in the MFIFO for the DMA channel."
"Remove all entries in the read buffer queue and write buffer queue
for the DMA channel."

Relying on it to work as a safe pause for single transfers seems like
wishful thinking.  I sure didn't want to believe the DMA-330 couldn't
be safely paused, but I eventually had to resign myself to it.

             reply	other threads:[~2018-05-10 16:04 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-10 16:04 Frank Mori Hess [this message]
2018-05-10 16:04 ` Revert "dmaengine: pl330: add DMA_PAUSE feature" Frank Mori Hess
  -- strict thread matches above, loose matches on Subject: below --
2018-05-29  7:09 Vinod Koul
2018-05-29  7:09 ` Vinod
2018-05-29  5:17 Marek Szyprowski
2018-05-29  5:17 ` Marek Szyprowski
2018-05-23  5:39 Vinod Koul
2018-05-23  5:39 ` Vinod
2018-05-22 14:27 Frank Mori Hess
2018-05-22 14:27 ` Frank Mori Hess
2018-05-22  3:37 Vinod Koul
2018-05-22  3:37 ` Vinod Koul
2018-05-22  0:56 Frank Mori Hess
2018-05-22  0:56 ` Frank Mori Hess
2018-05-21  9:16 Vinod Koul
2018-05-21  9:16 ` Vinod Koul
2018-05-18 19:01 Frank Mori Hess
2018-05-18 19:01 ` Frank Mori Hess
2018-05-18 18:56 Frank Mori Hess
2018-05-18 18:56 ` Frank Mori Hess
2018-05-18  7:21 Vinod Koul
2018-05-18  7:21 ` Vinod
2018-05-18  6:28 Marek Szyprowski
2018-05-18  6:28 ` Marek Szyprowski
2018-05-18  4:03 Vinod Koul
2018-05-18  4:03 ` Vinod
2018-05-17 17:22 Lars-Peter Clausen
2018-05-17 17:22 ` Lars-Peter Clausen
2018-05-17 16:20 Frank Mori Hess
2018-05-17 16:20 ` Frank Mori Hess
2018-05-17  4:19 Vinod Koul
2018-05-17  4:19 ` Vinod Koul
2018-05-15 15:50 Frank Mori Hess
2018-05-15 15:50 ` Frank Mori Hess
2018-05-15 13:45 Vinod Koul
2018-05-15 13:45 ` Vinod
2018-05-15 12:24 Marek Szyprowski
2018-05-15 12:24 ` Marek Szyprowski
2018-05-15  6:21 Vinod Koul
2018-05-15  6:21 ` Vinod
2018-05-11 15:57 Frank Mori Hess
2018-05-11 15:57 ` Frank Mori Hess
2018-05-11 12:57 Marek Szyprowski
2018-05-11 12:57 ` Marek Szyprowski
2018-05-10  8:31 Marek Szyprowski
2018-05-10  8:31 ` Marek Szyprowski
2018-05-09 17:48 Frank Mori Hess
2018-05-09 17:48 ` Frank Mori Hess
2018-05-09 13:19 Marek Szyprowski
2018-05-09 13:19 ` Marek Szyprowski
2018-05-09  6:59 Vinod Koul
2018-05-09  6:59 ` Vinod Koul
2018-05-08 14:36 Frank Mori Hess
2018-05-08 14:36 ` Frank Mori Hess
2018-05-08  9:04 Marek Szyprowski
2018-05-08  9:04 ` Marek Szyprowski
2018-05-03 16:35 Vinod Koul
2018-05-03 16:35 ` [PATCH] " Vinod Koul
2018-05-03  9:01 Vinod Koul
2018-05-03  9:01 ` [PATCH] " Vinod Koul
2018-05-02 14:37 Frank Mori Hess
2018-05-02 14:37 ` [PATCH] " Frank Mori Hess
2018-05-02  4:32 Vinod Koul
2018-05-02  4:32 ` [PATCH] " Vinod Koul
2018-04-28 21:50 Frank Mori Hess
2018-04-28 21:50 ` [PATCH] " Frank Mori Hess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJz5Opfd9zeEAShFtUNhW4yLfVykef=2QdWWzq88Dtddn4eymQ@mail.gmail.com' \
    --to=fmh6jj@gmail.com \
    --cc=b.zolnierkie@samsung.com \
    --cc=dan.j.williams@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=krzk@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=r.baldyga@hackerion.com \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.