From: <Eugen.Hristev@microchip.com>
To: <ardeleanalex@gmail.com>
Cc: <alexandru.ardelean@analog.com>, <jic23@kernel.org>,
<linux-iio@vger.kernel.org>
Subject: Re: at91-sama5d2_adc crash
Date: Tue, 15 Sep 2020 16:30:23 +0000 [thread overview]
Message-ID: <a725eff5-c438-5d10-4090-15c49d52a91f@microchip.com> (raw)
In-Reply-To: <CA+U=DsqrFziYRjVfGnfBe0nKfxs0Le3pB0iYe=-7nDsLNZfXYw@mail.gmail.com>
On 14.09.2020 15:37, Alexandru Ardelean wrote:
> On Mon, Sep 14, 2020 at 2:33 PM <Eugen.Hristev@microchip.com> wrote:
>>
>> Hello Alex,
>>
>> Sorry to disturb but we have issues again with this patch :
>>
>>
>
> firstly, many apologies for the breakage;
>
>>
>> f3c034f61775 ("iio: at91-sama5d2_adc: adjust
>> iio_triggered_buffer_{predisable,postenable} positions")
>>
>> I recently discovered a crash when using buffered trigger with DMA with
>> this driver:
>>
>> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/length
>> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/watermark
>> # echo 1 > /sys/bus/iio/devices/iio\:device0/scan_elements/in_voltage4_en
>> # iio_generic_buffer -n fc030000.adc -t
>> fc030000.adc-dev0-external_rising -c 5
>> iio device number being used is 0
>> iio trigger number being used is 0
>> /sys/bus/iio/devicii/iio:device0 fc030000.adc-dev0-external_risingo
>> iio:device0: using dma0chan10 for rx DMA transfers
>>
>> Division by zero in kernel.
>> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
>> Hardware name: Atmel SAMA5
>> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
>> [<c010a034>] (show_stack) from [<c03a892c>] (Ldiv0+0x8/0x10)
>> [<c03a892c>] (Ldiv0) from [<c03a88fc>] (__aeabi_uidivmod+0x8/0x18)
>> [<c03a88fc>] (__aeabi_uidivmod) from [<c03592d4>] (div_s64_rem+0x3c/0xc4)
>> [<c03592d4>] (div_s64_rem) from [<c05ed344>]
>
> which of these lines is the issue?
>
> sample_count = div_s64(transferred_len, sample_size);
>
> /*
> * interval between samples is total time since last transfer handling
> * divided by the number of samples (total size divided by sample size)
> */
> interval = div_s64((ns - st->dma_st.dma_ts), sample_count);
>
> would a simple return like below make sense?
> if (!sample_size) // or if (!sample_count)
> return;
>
> Is at91_adc_dma_size_done() returning 0?
> Or more closely, which variable is zero?
>
>> (at91_adc_trigger_handler+0xcc/0x494)
>> [<c05ed344>] (at91_adc_trigger_handler) from [<c014944c>]
>> (irq_thread_fn+0x1c/0x78)
>> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
>> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
>> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
>> Exception stack(0xde49dfb0 to 0xde49dff8)
>> dfa0: 00000000 00000000 00000000
>> 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> Division by zero in kernel.
>> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
>> Hardware name: Atmel SAMA5
>> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
>> [<c010a034>] (show_stack) from [<c03a7e24>] (Ldiv0_64+0x8/0x18)
>> [<c03a7e24>] (Ldiv0_64) from [<c0359344>] (div_s64_rem+0xac/0xc4)
>> [<c0359344>] (div_s64_rem) from [<c05ed360>]
>> (at91_adc_trigger_handler+0xe8/0x494)
>> [<c05ed360>] (at91_adc_trigger_handler) from [<c014944c>]
>> (irq_thread_fn+0x1c/0x78)
>> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
>> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
>> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
>> Exception stack(0xde49dfb0 to 0xde49dff8)
>> dfa0: 00000000 00000000 00000000
>> 00000000
>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> 00000000
>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> sched: RT throttling activated
>>
>>
>> It looks like crash is there since 5.8-rc1 introduced by that patch
>>
>> I looked in the code and it looks something is zero, probably the
>> received buffer size. It is likely that the DMA starts too soon before
>> the buffer is properly setup ?
>
> I am not sure about this. See [1]
>
>>
>> Can you help with fixing this ? or you know how we can do it ?
>> Also could you remind me why we enable and start the DMA on pre-enable
>> of the buffer instead of post-enable of the buffer ?
>
> [1]
> So, in iio_enable_buffers(), the order is:
> preenable()
> update_scan_mode()
> hwfifo_set_watermark()
> iio_buffer_enable() <- this can also enable the DMA buffer if
> using the IIO DMA buffer framework
> [2] IIO-core-now-attaches-pollfunc via iio_trigger_attach_poll_func()
> (if current mode is INDIO_BUFFER_TRIGGERED)
> postenable()
>
> I could guess maybe that hwfifo_set_watermark() is called too late?
> Ideally, the driver would use the IIO DMA buffer hook to enable the
> buffer via iio_buffer_enable()
> That would be the ideal way to do it, and not enable DMA in preenable.
Hi,
you are correct. The set watermark is being set too late. And the
dma_init is called when a watermark higher than 1 is being set. In
consequence, DMA is not initialized at the moment of dma_start (which is
now in preenable, which is before the set_watermark).
Do you have any suggestion about how to fix this ? Move the start dma to
set_watermark maybe ?
Thanks,
Eugen
>
> Regarding [2] it was discussed that attach/detach should be done in
> that order, and it should be done in IIO core.
> So, all drivers were cleaned up to respect that order.
> The idea is more in the lines that repetitive/duplicated things should
> be done by the IIO framework, and attach/detach-of-pollfunc is one of
> them.
>
>> In pre-enable, do we have everything ready inside IIO to be able to
>> start the DMA? Or it's better to have it at post-enable time ?
>
> This really depends on each driver.
>
>>
>> I know you want to ditch the post-enable and pre-disable hooks, but it
>> looks this driver needs them, or we need to find a way to make it work
>> properly
>
> These hooks have already been dropped in favor of the IIO core doing
> attach/detach.
> At this point, it may be a bit more difficult to go back, than to move forward.
>
> I think this crash should be fixable to function with this IIO core change.
>
> Again, apologies for the breakage.
>
> Thanks
> Alex
>
>>
>> Thanks !
>> Eugen
next prev parent reply other threads:[~2020-09-15 22:19 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-14 11:30 at91-sama5d2_adc crash Eugen.Hristev
2020-09-14 12:37 ` Alexandru Ardelean
2020-09-15 16:30 ` Eugen.Hristev [this message]
2020-09-16 6:17 ` Alexandru Ardelean
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a725eff5-c438-5d10-4090-15c49d52a91f@microchip.com \
--to=eugen.hristev@microchip.com \
--cc=alexandru.ardelean@analog.com \
--cc=ardeleanalex@gmail.com \
--cc=jic23@kernel.org \
--cc=linux-iio@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).