linux-iio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexandru Ardelean <ardeleanalex@gmail.com>
To: Eugen.Hristev@microchip.com
Cc: Alexandru Ardelean <alexandru.ardelean@analog.com>,
	Jonathan Cameron <jic23@kernel.org>,
	linux-iio <linux-iio@vger.kernel.org>
Subject: Re: at91-sama5d2_adc crash
Date: Mon, 14 Sep 2020 15:37:54 +0300	[thread overview]
Message-ID: <CA+U=DsqrFziYRjVfGnfBe0nKfxs0Le3pB0iYe=-7nDsLNZfXYw@mail.gmail.com> (raw)
In-Reply-To: <e0822209-0010-f314-39eb-4fae33fb6661@microchip.com>

On Mon, Sep 14, 2020 at 2:33 PM <Eugen.Hristev@microchip.com> wrote:
>
> Hello Alex,
>
> Sorry to disturb but we have issues again with this patch :
>
>

firstly, many apologies for the breakage;

>
> f3c034f61775 ("iio: at91-sama5d2_adc: adjust
> iio_triggered_buffer_{predisable,postenable} positions")
>
> I recently discovered a crash when using buffered trigger with DMA with
> this driver:
>
> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/length
> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/watermark
> # echo 1 > /sys/bus/iio/devices/iio\:device0/scan_elements/in_voltage4_en
> # iio_generic_buffer -n fc030000.adc -t
> fc030000.adc-dev0-external_rising -c 5
> iio device number being used is 0
> iio trigger number being used is 0
> /sys/bus/iio/devicii/iio:device0 fc030000.adc-dev0-external_risingo
> iio:device0: using dma0chan10 for rx DMA transfers
>
> Division by zero in kernel.
> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
> Hardware name: Atmel SAMA5
> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
> [<c010a034>] (show_stack) from [<c03a892c>] (Ldiv0+0x8/0x10)
> [<c03a892c>] (Ldiv0) from [<c03a88fc>] (__aeabi_uidivmod+0x8/0x18)
> [<c03a88fc>] (__aeabi_uidivmod) from [<c03592d4>] (div_s64_rem+0x3c/0xc4)
> [<c03592d4>] (div_s64_rem) from [<c05ed344>]

which of these lines is the issue?

        sample_count = div_s64(transferred_len, sample_size);

        /*
         * interval between samples is total time since last transfer handling
         * divided by the number of samples (total size divided by sample size)
         */
        interval = div_s64((ns - st->dma_st.dma_ts), sample_count);

would a simple return like below make sense?
if (!sample_size)    // or  if (!sample_count)
    return;

Is at91_adc_dma_size_done() returning 0?
Or more closely, which variable is zero?

> (at91_adc_trigger_handler+0xcc/0x494)
> [<c05ed344>] (at91_adc_trigger_handler) from [<c014944c>]
> (irq_thread_fn+0x1c/0x78)
> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
> Exception stack(0xde49dfb0 to 0xde49dff8)
> dfa0:                                     00000000 00000000 00000000
> 00000000
> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000
> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> Division by zero in kernel.
> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
> Hardware name: Atmel SAMA5
> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
> [<c010a034>] (show_stack) from [<c03a7e24>] (Ldiv0_64+0x8/0x18)
> [<c03a7e24>] (Ldiv0_64) from [<c0359344>] (div_s64_rem+0xac/0xc4)
> [<c0359344>] (div_s64_rem) from [<c05ed360>]
> (at91_adc_trigger_handler+0xe8/0x494)
> [<c05ed360>] (at91_adc_trigger_handler) from [<c014944c>]
> (irq_thread_fn+0x1c/0x78)
> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
> Exception stack(0xde49dfb0 to 0xde49dff8)
> dfa0:                                     00000000 00000000 00000000
> 00000000
> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000
> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> sched: RT throttling activated
>
>
> It looks like crash is there since 5.8-rc1 introduced by that patch
>
> I looked in the code and it looks something is zero, probably the
> received buffer size. It is likely that the DMA starts too soon before
> the buffer is properly setup ?

I am not sure about this. See [1]

>
> Can you help with fixing this ? or you know how we can do it ?
> Also could you remind me why we enable and start the DMA on pre-enable
> of the buffer instead of post-enable of the buffer ?

[1]
So, in iio_enable_buffers(), the order is:
preenable()
update_scan_mode()
hwfifo_set_watermark()
iio_buffer_enable()    <-  this can also enable the DMA buffer if
using the IIO DMA buffer framework
[2] IIO-core-now-attaches-pollfunc via iio_trigger_attach_poll_func()
(if current mode is INDIO_BUFFER_TRIGGERED)
postenable()

I could guess maybe that hwfifo_set_watermark() is called too late?
Ideally, the driver would use the IIO DMA buffer hook to enable the
buffer via iio_buffer_enable()
That would be the ideal way to do it, and not enable DMA in preenable.

Regarding [2] it was discussed that attach/detach should be done in
that order, and it should be done in IIO core.
So, all drivers were cleaned up to respect that order.
The idea is more in the lines that repetitive/duplicated things should
be done by the IIO framework, and attach/detach-of-pollfunc is one of
them.

> In pre-enable, do we have everything ready inside IIO to be able to
> start the DMA? Or it's better to have it at post-enable time ?

This really depends on each driver.

>
> I know you want to ditch the post-enable and pre-disable hooks, but it
> looks this driver needs them, or we need to find a way to make it work
> properly

These hooks have already been dropped in favor of the IIO core doing
attach/detach.
At this point, it may be a bit more difficult to go back, than to move forward.

I think this crash should be fixable to function with this IIO core change.

Again, apologies for the breakage.

Thanks
Alex

>
> Thanks !
> Eugen

  reply	other threads:[~2020-09-14 12:40 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-14 11:30 at91-sama5d2_adc crash Eugen.Hristev
2020-09-14 12:37 ` Alexandru Ardelean [this message]
2020-09-15 16:30   ` Eugen.Hristev
2020-09-16  6:17     ` Alexandru Ardelean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+U=DsqrFziYRjVfGnfBe0nKfxs0Le3pB0iYe=-7nDsLNZfXYw@mail.gmail.com' \
    --to=ardeleanalex@gmail.com \
    --cc=Eugen.Hristev@microchip.com \
    --cc=alexandru.ardelean@analog.com \
    --cc=jic23@kernel.org \
    --cc=linux-iio@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).