From: Dmitry Osipenko <digetx@gmail.com>
To: Ben Dooks <ben.dooks@codethink.co.uk>,
	linux-kernel@lists.codethink.co.uk
Cc: Laxman Dewangan <ldewangan@nvidia.com>,
	Jon Hunter <jonathanh@nvidia.com>, Vinod Koul <vkoul@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Thierry Reding <thierry.reding@gmail.com>,
	dmaengine@vger.kernel.org, linux-tegra@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] dma: tegra: add accurate reporting of dma state
Date: Wed, 12 Jun 2019 21:57:21 +0300
Message-ID: <95a7b8e9-0638-a548-a907-ec80d415d7a3@gmail.com>
In-Reply-To: <6cceabe0-ecfa-e241-a937-5a7c9761820a@gmail.com>

05.05.2019 16:39, Dmitry Osipenko wrote:
> 04.05.2019 19:06, Dmitry Osipenko wrote:
>> 01.05.2019 11:58, Ben Dooks wrote:
>>> On 24/04/2019 19:17, Dmitry Osipenko wrote:
>>>> 24.04.2019 19:23, Ben Dooks wrote:
>>>>> The tx_status callback does not report the state of the transfer
>>>>> beyond complete segments. This causes problems with users such as
>>>>> ALSA when applications want to know accurately how much data has
>>>>> been moved.
>>>>>
>>>>> This patch adds a function tegra_dma_update_residual() to query
>>>>> the hardware and modify the residual information accordingly. It
>>>>> takes into account any hardware issues when trying to read the
>>>>> state, such as delays between finishing a buffer and signalling
>>>>> the interrupt.
>>>>>
>>>>> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
>>>>
>>>> Hello Ben,
>>>>
>>>> Thank you very much for keeping it up. I have a couple of comments,
>>>> please see them below.
>>>>
>>>>> Cc: Dmitry Osipenko <digetx@gmail.com>
>>>>> Cc: Laxman Dewangan <ldewangan@nvidia.com> (supporter:TEGRA DMA DRIVERS)
>>>>> Cc: Jon Hunter <jonathanh@nvidia.com> (supporter:TEGRA DMA DRIVERS)
>>>>> Cc: Vinod Koul <vkoul@kernel.org> (maintainer:DMA GENERIC OFFLOAD
>>>>> ENGINE SUBSYSTEM)
>>>>> Cc: Dan Williams <dan.j.williams@intel.com> (reviewer:ASYNCHRONOUS
>>>>> TRANSFERS/TRANSFORMS (IOAT) API)
>>>>> Cc: Thierry Reding <thierry.reding@gmail.com> (supporter:TEGRA
>>>>> ARCHITECTURE SUPPORT)
>>>>> Cc: dmaengine@vger.kernel.org (open list:DMA GENERIC OFFLOAD ENGINE
>>>>> SUBSYSTEM)
>>>>> Cc: linux-tegra@vger.kernel.org (open list:TEGRA ARCHITECTURE SUPPORT)
>>>>> Cc: linux-kernel@vger.kernel.org (open list)
>>>>> ---
>>>>>   drivers/dma/tegra20-apb-dma.c | 92 ++++++++++++++++++++++++++++++++---
>>>>>   1 file changed, 86 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma/tegra20-apb-dma.c
>>>>> b/drivers/dma/tegra20-apb-dma.c
>>>>> index cf462b1abc0b..544e7273e741 100644
>>>>> --- a/drivers/dma/tegra20-apb-dma.c
>>>>> +++ b/drivers/dma/tegra20-apb-dma.c
>>>>> @@ -808,6 +808,90 @@ static int tegra_dma_terminate_all(struct
>>>>> dma_chan *dc)
>>>>>       return 0;
>>>>>   }
>>>>>   +static unsigned int tegra_dma_update_residual(struct
>>>>> tegra_dma_channel *tdc,
>>>>> +                          struct tegra_dma_sg_req *sg_req,
>>>>> +                          struct tegra_dma_desc *dma_desc,
>>>>> +                          unsigned int residual)
>>>>> +{
>>>>> +    unsigned long status = 0x0;
>>>>> +    unsigned long wcount;
>>>>> +    unsigned long ahbptr;
>>>>> +    unsigned long tmp = 0x0;
>>>>> +    unsigned int result;
>>>>
>>>> You could pre-assign ahbptr=0xffffffff and result=residual here, then
>>>> you could remove all the duplicated assigns below.
>>>
>>> ok, ta.
>>>
>>>>> +    int retries = TEGRA_APBDMA_BURST_COMPLETE_TIME * 10;
>>>>> +    int done;
>>>>> +
>>>>> +    /* if we're not the current request, then don't alter the
>>>>> residual */
>>>>> +    if (sg_req != list_first_entry(&tdc->pending_sg_req,
>>>>> +                       struct tegra_dma_sg_req, node)) {
>>>>> +        result = residual;
>>>>> +        ahbptr = 0xffffffff;
>>>>> +        goto done;
>>>>> +    }
>>>>> +
>>>>> +    /* loop until we have a reliable result for residual */
>>>>> +    do {
>>>>> +        ahbptr = tdc_read(tdc, TEGRA_APBDMA_CHAN_AHBPTR);
>>>>> +        status = tdc_read(tdc, TEGRA_APBDMA_CHAN_STATUS);
>>>>> +        tmp =  tdc_read(tdc, 0x08);    /* total count for debug */
>>>>
>>>> The "tmp" variable isn't used anywhere in the code, please remove it.
>>>
>>> must have been left over.
>>>
>>>>> +
>>>>> +        /* check status, if channel isn't busy then skip */
>>>>> +        if (!(status & TEGRA_APBDMA_STATUS_BUSY)) {
>>>>> +            result = residual;
>>>>> +            break;
>>>>> +        }
>>>>
>>>> This doesn't look correct because TRM says "Busy bit gets set as soon
>>>> as a channel is enabled and gets cleared after transfer completes",
>>>> hence a cleared BUSY bit means that all transfers are completed and
>>>> result=residual is incorrect here. Given that there is a check for EOC
>>>> bit being set below, this hunk should be removed.
>>>
>>> I'll check notes, but see below.
>>>
>>>>> +
>>>>> +        /* if we've got an interrupt pending on the channel, don't
>>>>> +         * try and deal with the residue as the hardware has likely
>>>>> +         * moved on to the next buffer. return all data moved.
>>>>> +         */
>>>>> +        if (status & TEGRA_APBDMA_STATUS_ISE_EOC) {
>>>>> +            result = residual - sg_req->req_len;
>>>>> +            break;
>>>>> +        }
>>>>> +
>>>>> +        if (tdc->tdma->chip_data->support_separate_wcount_reg)
>>>>> +            wcount = tdc_read(tdc, TEGRA_APBDMA_CHAN_WORD_TRANSFER);
>>>>> +        else
>>>>> +            wcount = status;
>>>>> +
>>>>> +        /* If the request is at the full point, then there is a
>>>>> +         * chance that we have read the status register in the
>>>>> +         * middle of the hardware reloading the next buffer.
>>>>> +         *
>>>>> +         * The sequence seems to be at the end of the buffer, to
>>>>> +         * load the new word count before raising the EOC flag (or
>>>>> +         * changing the ping-pong flag which could have also been
>>>>> +         * used to determine a new buffer). This  means there is a
>>>>> +         * small window where we cannot determine zero-done for the
>>>>> +         * current buffer, or moved to next buffer.
>>>>> +         *
>>>>> +         * If done shows 0, then retry the load, as it may hit the
>>>>> +         * above hardware race. We will either get a new value which
>>>>> +         * is from the first buffer, or we get an EOC (new buffer)
>>>>> +         * or both a new value and an EOC...
>>>>> +         */
>>>>> +        done = get_current_xferred_count(tdc, sg_req, wcount);
>>>>> +        if (done != 0) {
>>>>> +            result = residual - done;
>>>>> +            break;
>>>>> +        }
>>>>> +
>>>>> +        ndelay(100);
>>>>
>>>> Please use udelay(1) because there is no ndelay on arm32 and
>>>> ndelay(100) is getting rounded up to 1usec. AFAIK, arm64 doesn't have
>>>> reliable ndelay on Tegra either because timer rate changes with the
>>>> CPU frequency scaling.
>>>
>>> I'll check, but last time it was implemented. This seems a backwards step.
>>>
>>>> Secondly, done=0 isn't an error case; technically this could be the case
>>>> when tegra_dma_update_residual() is invoked just after starting the
>>>> transfer. Hence I think this do-while loop and timeout checking aren't
>>>> needed at all since done=0 is a perfectly valid case.
>>>
>>> this is not checking for an error, it's checking for a possible
>>> inaccurate reading.
>>
>> If you change the reading order of the status / word-count registers as I
>> suggested, then there won't be a case for the inaccuracy.
>>
>> The EOC bit should be set atomically once transfer is finished, you
>> can't get wrapped around words count and EOC bit not being set.
>>
>> For oneshot transfer that runs with interrupt being disabled, the words
>> counter will stop at 0 and the unset BUSY bit will indicate that the
>> transfer is completed.
>>
>>>>
>>>> Altogether seems the tegra_dma_update_residual() could be reduced to:
>>>>
>>>> static unsigned int tegra_dma_update_residual(struct tegra_dma_channel
>>>> *tdc,
>>>>                           struct tegra_dma_sg_req *sg_req,
>>>>                           struct tegra_dma_desc *dma_desc,
>>>>                           unsigned int residual) 
>>>> {
>>>>     unsigned long status, wcount;
>>>>
>>>>     if (!list_is_first(&sg_req->node, &tdc->pending_sg_req))
>>>>         return residual;
>>>>
>>>>     if (tdc->tdma->chip_data->support_separate_wcount_reg)
>>>>         wcount = tdc_read(tdc, TEGRA_APBDMA_CHAN_WORD_TRANSFER);
>>>>
>>>>     status = tdc_read(tdc, TEGRA_APBDMA_CHAN_STATUS);
>>>>
>>>>     if (!tdc->tdma->chip_data->support_separate_wcount_reg)
>>>>         wcount = status;
>>>>
>>>>     if (status & TEGRA_APBDMA_STATUS_ISE_EOC)
>>>>         return residual - sg_req->req_len;
>>>>
>>>>     return residual - get_current_xferred_count(tdc, sg_req, wcount);
>>>> }
>>>
>>> I'm not sure if that will work all the time. It took days of testing to
>>> get reliable error data for the cases we're looking for here.
>>
>> Could you please tell exactly what those cases are. I don't see when the
>> simplified variant could fail, but maybe I already forgot some extra
>> details about how APB DMA works.
>>
>> I tested the variant I'm suggesting (with the typos fixed and a check
>> for the BUSY bit added) and it works absolutely fine: the audio
>> stuttering issue is fixed and everything else works too. Please consider
>> using it for the next version of the patch if there are no objections.
>>
> 
> Actually the BUSY bit checking shouldn't be needed. I think it's a bug
> in the driver that it may not enable the EOC interrupt, and I will send
> a patch to fix it.
> 

Hello Ben,

I'm going to post a reduced version of the patch that I was suggesting
here since it fixes a longstanding problem that I'm experiencing. Any
other changes could be made on top of it later on if needed. Please let
me know if you have any objections, I can wait a bit longer if you're
going to send an updated version of the patch that addresses all of the
comments anytime soon.
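
For reference, the reduced residual calculation discussed in the quoted
text can be modelled in user space roughly as follows. This is only a
sketch under stated assumptions, not driver code: every mock_* name and
MOCK_STATUS_ISE_EOC are hypothetical stand-ins, the bit position is
chosen for illustration, and the bytes_done field stands in for whatever
get_current_xferred_count() would return for the current word count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the EOC status flag; illustrative bit only. */
#define MOCK_STATUS_ISE_EOC (1u << 30)

/* Mocked snapshot of the channel state (not the driver's structures). */
struct mock_chan {
    uint32_t status;     /* mocked STATUS register value */
    uint32_t bytes_done; /* assumed output of the word-count conversion */
};

struct mock_sg_req {
    uint32_t req_len;      /* length of this segment in bytes */
    bool is_first_pending; /* true if at the head of the pending list */
};

/*
 * Return the residual adjusted for progress within the current segment.
 * Mirrors the three cases from the thread:
 *  - segment is not the one in flight: leave the residual untouched;
 *  - EOC pending: the whole segment has been transferred;
 *  - otherwise: subtract the bytes transferred so far.
 */
static uint32_t mock_update_residual(const struct mock_chan *ch,
                                     const struct mock_sg_req *sg,
                                     uint32_t residual)
{
    if (!sg->is_first_pending)
        return residual;

    if (ch->status & MOCK_STATUS_ISE_EOC)
        return residual - sg->req_len;

    return residual - ch->bytes_done;
}
```

Note that a bytes_done value of 0 simply leaves the residual unchanged,
which matches the point above that done=0 is a valid state right after a
transfer starts rather than an error.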
