Linux-Devicetree Archive on lore.kernel.org
 help / color / Atom feed
From: Sowjanya Komatineni <skomatineni@nvidia.com>
To: Hans Verkuil <hverkuil@xs4all.nl>, <thierry.reding@gmail.com>,
	<jonathanh@nvidia.com>, <frankc@nvidia.com>,
	<helen.koike@collabora.com>, <sboyd@kernel.org>
Cc: <linux-media@vger.kernel.org>, <devicetree@vger.kernel.org>,
	<linux-clk@vger.kernel.org>, <linux-tegra@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH v3 4/6] media: tegra: Add Tegra210 Video input driver
Date: Mon, 17 Feb 2020 19:19:15 -0800
Message-ID: <32ebc124-cb2d-f545-a5a0-d71192af8219@nvidia.com> (raw)
In-Reply-To: <87c1c97d-abd7-8ca5-0709-d7c64a7d7b39@nvidia.com>


On 2/17/20 5:04 PM, Sowjanya Komatineni wrote:
>
> On 2/17/20 4:59 PM, Sowjanya Komatineni wrote:
>>
>> On 2/17/20 12:04 AM, Hans Verkuil wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 2/16/20 9:22 PM, Sowjanya Komatineni wrote:
>>>> On 2/16/20 12:11 PM, Sowjanya Komatineni wrote:
>>>>> On 2/16/20 11:54 AM, Sowjanya Komatineni wrote:
>>>>>> On 2/16/20 3:03 AM, Hans Verkuil wrote:
>>>>>>> External email: Use caution opening links or attachments
>>>>>>>
>>>>>>>
>>>>>>> On 2/14/20 7:23 PM, Sowjanya Komatineni wrote:
>>>>>>>> Tegra210 contains a powerful Video Input (VI) hardware controller
>>>>>>>> which can support up to 6 MIPI CSI camera sensors.
>>>>>>>>
>>>>>>>> Each Tegra CSI port can be one-to-one mapped to VI channel and can
>>>>>>>> capture from an external camera sensor connected to CSI or from
>>>>>>>> built-in test pattern generator.
>>>>>>>>
>>>>>>>> Tegra210 supports built-in test pattern generator from CSI to VI.
>>>>>>>>
>>>>>>>> This patch adds a V4L2 media controller and capture driver support
>>>>>>>> for Tegra210 built-in CSI to VI test pattern generator.
>>>>>>>>
>>>>>>>> Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
>>>>>>>> ---
>>>>>>>>    drivers/staging/media/Kconfig              |    2 +
>>>>>>>>    drivers/staging/media/Makefile             |    1 +
>>>>>>>>    drivers/staging/media/tegra/Kconfig        |   10 +
>>>>>>>>    drivers/staging/media/tegra/Makefile       |    8 +
>>>>>>>>    drivers/staging/media/tegra/TODO           |   10 +
>>>>>>>>    drivers/staging/media/tegra/tegra-common.h |  239 +++++++
>>>>>>>>    drivers/staging/media/tegra/tegra-csi.c    |  374 ++++++++++
>>>>>>>>    drivers/staging/media/tegra/tegra-csi.h    |  115 ++++
>>>>>>>>    drivers/staging/media/tegra/tegra-vi.c     | 1019
>>>>>>>> ++++++++++++++++++++++++++++
>>>>>>>>    drivers/staging/media/tegra/tegra-vi.h     |   79 +++
>>>>>>>>    drivers/staging/media/tegra/tegra-video.c  |  118 ++++
>>>>>>>>    drivers/staging/media/tegra/tegra-video.h  |   32 +
>>>>>>>>    drivers/staging/media/tegra/tegra210.c     |  767
>>>>>>>> +++++++++++++++++++++
>>>>>>>>    drivers/staging/media/tegra/tegra210.h     |  190 ++++++
>>>>>>>>    14 files changed, 2964 insertions(+)
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/Kconfig
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/Makefile
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/TODO
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-common.h
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-csi.c
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-csi.h
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-vi.c
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-vi.h
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-video.c
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra-video.h
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra210.c
>>>>>>>>    create mode 100644 drivers/staging/media/tegra/tegra210.h
>>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>>> +/*
>>>>>>>> + * videobuf2 queue operations
>>>>>>>> + */
>>>>>>>> +static int tegra_channel_queue_setup(struct vb2_queue *vq,
>>>>>>>> +                                  unsigned int *nbuffers,
>>>>>>>> +                                  unsigned int *nplanes,
>>>>>>>> +                                  unsigned int sizes[],
>>>>>>>> +                                  struct device *alloc_devs[])
>>>>>>>> +{
>>>>>>>> +     struct tegra_vi_channel *chan = vb2_get_drv_priv(vq);
>>>>>>>> +
>>>>>>>> +     if (*nplanes)
>>>>>>>> +             return sizes[0] < chan->format.sizeimage ? 
>>>>>>>> -EINVAL : 0;
>>>>>>>> +
>>>>>>>> +     *nplanes = 1;
>>>>>>>> +     sizes[0] = chan->format.sizeimage;
>>>>>>>> +     alloc_devs[0] = chan->vi->dev;
>>>>>>>> +
>>>>>>>> +     /*
>>>>>>>> +      * allocate min 3 buffers in queue to avoid race between DMA
>>>>>>>> +      * writes and userspace reads.
>>>>>>>> +      */
>>>>>>>> +     if (*nbuffers < 3)
>>>>>>>> +             *nbuffers = 3;
>>>>>>> First of all, don't check this here, instead set the struct
>>>>>>> vb2_queue field
>>>>>>> 'min_buffers_needed' to 3 instead.
>>>>>>>
>>>>>>> But the reason given for this check is peculiar: there should 
>>>>>>> not be
>>>>>>> any
>>>>>>> race at all. Usually the reason for requiring a specific minimum
>>>>>>> number of
>>>>>>> buffers is that the DMA engine needs at least 2 buffers before it
>>>>>>> can start
>>>>>>> streaming: it can't give back a buffer to userspace 
>>>>>>> (vb2_buffer_done())
>>>>>>> unless there is a second buffer it can start to capture to next. So
>>>>>>> for many
>>>>>>> DMA implementations you need a minimum of 2 buffers: two buffers 
>>>>>>> for
>>>>>>> the
>>>>>>> DMA engine, one buffer being processed by userspace.
>>>>>>>
>>>>>>> If the driver is starved of buffers it will typically keep 
>>>>>>> capturing to
>>>>>>> the last buffer until a new buffer is queued.
>>>>>>>
>>>>>>> In any case, once the driver releases a buffer via 
>>>>>>> vb2_buffer_done()
>>>>>>> the
>>>>>>> buffer memory is no longer owned by the driver.
>>>>>>>
>>>>>>> To be precise, buffer ownership is as follows:
>>>>>>>
>>>>>>> userspace -> VIDIOC_QBUF -> vb2 -> buf_queue -> driver ->
>>>>>>> vb2_buffer_done() -> vb2 -> VIDIOC_DQBUF -> userspace
>>>>>>>
>>>>>>> (vb2 == videobuf2 framework)
>>>>>>>
>>>>>>> Note that vb2 never touches the buffer memory.
>>>>>>>
>>>>>>> So if you get a race condition in this driver, then there is 
>>>>>>> something
>>>>>>> strange going on. It looks like vb2_buffer_done() is called while
>>>>>>> DMA is
>>>>>>> still ongoing, or because the driver really needs to keep one 
>>>>>>> buffer
>>>>>>> available at all times.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>>           Hans
>>>>>> Thanks Hans.
>>>>>>
>>>>>> On running v4l2-compliance streaming tests for longer run, I noticed
>>>>>> kernel reporting unable to write to read-only memory and with debugs
>>>>>> I observed when this error was reported, I see 2 buffers queued and
>>>>>> both using same address.
>>>>>>
>>>>>> for first buffer capture start thread initiates capture and wakes
>>>>>> done thread to wait for memory write ack and once its done buffer is
>>>>>> released to user space but I see upon buffer released to user space
>>>>>> immediate next buffer capture single shot gets issued (as soon as
>>>>>> single shot is issued frame capture data is written to memory by 
>>>>>> DMA)
>>>>>> and I see this kernel error of unable to write to read-only memory.
>>>>>>
>>>>>> This error happens rare and happens on long run and all the times of
>>>>>> repro's, I see when other thread releases buffer immediate I see
>>>>>> single shot gets issued as 2 buffers are queued up at the same time
>>>>>> with same DMA address.
>>>>>>
>>>>> Just to be clear, I meant all the times when kernel reports error
>>>>> unable to write to read-only memory, I see 2 buffers gets queued and
>>>>> as the capture start thread and done thread are parallel and when
>>>>> capture thread wakes done thread on receiving FS event, done thread
>>>>> for waiting for memory write happens parallel to next frame capture
>>>>> and I see while vb2_buffer_done happens in done thread next frame
>>>>> single shot has been issues by capture start thread in parallel when
>>>>> it hits this error.
>>>> For low latency, we use 2 threads one thread for capture and wait 
>>>> for FS
>>>> and on receiving FS even wakes other done thread to wait for memory
>>>> write to finish.
>>>>
>>>> While other done thread waits for memory write to finish, capture 
>>>> thread
>>>> can start capture for next frame and as soon as single shot is issued
>>>> capture frame is written to memory and as this thread runs in parallel
>>>> to done thread
>>>>
>>>> there is a possibility vb2_buffer_done being called by
>>>> kthread_capture_done while DMA is ongoing by kthread_capture_start 
>>>> and I
>>>> observed same DMA address being used got both buffers that got 
>>>> queued at
>>>> same time when it hits this error.
>>> "buffers that got queued": you mean that 
>>> tegra_channel_buffer_queue() is
>>> called twice with different buffers (i.e. with different buffer 
>>> index values)
>>> but with the same DMA address?
>>
>> yes I see buf_queue invoked twice and both using same DMA address.
>>
>> I did not monitored buf index when I repro'd with debugs.
>>
>>> That should not happen (unless the first buffer was returned with
>>> vb2_buffer_done() before the second buffer was queued).
>>>
>>> Can you provide more details? E.g. buffer index, memory model used when
>>> streaming, total number of buffers allocated by REQBUFS.
>>
>> memory type is MMAP. buffer count is 1. Didn't monitored buf index 
>> but when this happened as count is 1 so I think it might be index 0.
>>
>> But what I noticed is on running long hours rarely this repro's and 
>> at the time of repro, I see 2 immediate buf_queue with same DMA address.
>>
>> 1st buffer capture starts and received FS and wakes other thread to 
>> wait for MW_ACK_DONE and when capture thread executes for 2nd buffer 
>> right as soon as single shot bit is set which writes captured data to 
>> memory kernel error of unable to write to read-only memory happens.
>>
>> I couldn't add more debugs to confirm if 1st buffer release finished 
>> or not by the time 2nd buffer single shot is issued but I see 
>> MW_ACK_DONE event received for 1st buffer.
>>
>> Adding more debugs does not repro this kernel error and that could be 
>> because delays with debug messages helps time interval b/w buffer 
>> release and next buffer single shot.
>>
>>>
>>> I would like to fully understand this. Just increasing the minimum 
>>> number
>>> of buffers, while reasonable by itself, *does* feel like it is just
>>> hiding the symptoms.
>>>
>>> Regards,
>>>
>>>          Hans
>>
>> kthread for capture start and kthread for done which waiting for 
>> memory write to happen runs in parallel.
>>
>> Its hard to confirm if buffer done is invoked by kthread_done at same 
>> time of kthread_capture DMA write of next frame as adding debugs has 
>> impact on delay and doesnt repro with more debugs.
>>
>> But just increasing no.of min buffers to 3 doesnt repro at all and I 
>> never see 2 buffer queue requests with same DMA address as min 
>> buffers are 3.
>>
> Will add debugs and monitor with index again and check for repro by 
> removing min buffers to re-confirm ...

Without specifying min buffers needed, I am able to repro again.

Noticed the buf that got released previously is the one that gets next 
into the queue. So I don't see same buffer being used b/w captures. 
Somehow I might have misinterpreted timing from logs earlier.

This time repro happened after streaming stop and on next streaming 
buf_prepare call back (with in tegra_channel_buffer_prepare)

Earlier repro's call trace also showed all the time it happened in 
tegra_channel_buffer_prepare but I noticed that happened when capture 
start thread was running in parallel and right on single shot issued.

So, looks like I misunderstood that it happened b/w issuing single shot 
and releasing buffer but could be its being reported during buffer 
prepare itself.

All the times of repro, call trace shows it happens during buf_prepare.

[   41.213146] Unable to handle kernel write to read-only memory at 
virtual address ffff0000f5c3fff8
[   41.222012] Mem abort info:
[   41.224807]   ESR = 0x9600004f
[   41.227852]   EC = 0x25: DABT (current EL), IL = 32 bits
[   41.233160]   SET = 0, FnV = 0
[   41.236204]   EA = 0, S1PTW = 0
[   41.239344] Data abort info:
[   41.242225]   ISV = 0, ISS = 0x0000004f
[   41.246058]   CM = 0, WnR = 1
[   41.249026] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081498000
[   41.255733] [ffff0000f5c3fff8] pgd=000000017f1f8003, 
pud=000000017ec06003, pmd=000000017ea57003, pte=0060000175c3f793
[   41.266345] Internal error: Oops: 9600004f [#1] PREEMPT SMP
[   41.271905] Modules linked in: panel_simple tegra_drm 
snd_hda_codec_hdmi snd_hda_tegra crct10dif_ce snd_hda_codec cec 
drm_kms_helper snd_hda_core lp855x_bl drm pwm_tegra ip_tables x_tables 
ipv6 nf_defrag_ipv6
[   41.290401] CPU: 3 PID: 532 Comm: v4l2-compliance Tainted: G        
W         5.6.0-rc1-00035-g6a105c1c479a-dirty #1
[   41.300902] Hardware name: NVIDIA Jetson TX1 Developer Kit (DT)
[   41.306807] pstate: 60000005 (nZCv daif -PAN -UAO)
[   41.311593] pc : tegra_channel_buffer_prepare+0x34/0x88
[   41.316807] lr : __buf_prepare+0x1c4/0x230
[   41.320891] sp : ffff800011f5baa0
[   41.324195] x29: ffff800011f5baa0 x28: ffff0000f58cc100
[   41.329494] x27: ffff800011f5bc58 x26: ffff80001100b780
[   41.334792] x25: ffff0000f81f1608 x24: ffff0000f7be7c00
[   41.340091] x23: 00000000c058565d x22: 0000000000000000
[   41.345390] x21: ffff0000f81f16e8 x20: 0000000000000000
[   41.350688] x19: ffff0000f5c40000 x18: 0000000000000000
[   41.355986] x17: 0000000000000000 x16: 0000000000000000
[   41.361285] x15: ffff0000f8553800 x14: 0000000000000000
[   41.366583] x13: 003f480000000000 x12: 003f500000000000
[   41.371881] x11: 0000000100000000 x10: 0000000000000000
[   41.377180] x9 : 0000000000000000 x8 : ffff0000f5c40258
[   41.382478] x7 : 0000000000000030 x6 : 0000000000000001
[   41.387776] x5 : 0000000000000000 x4 : 00000000003f4800
[   41.393074] x3 : 00000000003f4800 x2 : 00000000003f4800
[   41.398373] x1 : ffff0000f81f1080 x0 : ffff0000f5c40000
[   41.403671] Call trace:
[   41.406109]  tegra_channel_buffer_prepare+0x34/0x88
[   41.410974]  __buf_prepare+0x1c4/0x230
[   41.414713]  vb2_core_prepare_buf+0x94/0x110
[   41.418971]  vb2_prepare_buf+0x74/0xa8
[   41.422710]  vb2_ioctl_prepare_buf+0x54/0x60
[   41.426970]  v4l_prepare_buf+0x44/0x58
[   41.430707]  __video_do_ioctl+0x228/0x3e8
[   41.434705]  video_usercopy+0x1cc/0x4d0
[   41.438531]  video_ioctl2+0x14/0x20
[   41.442010]  v4l2_ioctl+0x44/0x68
[   41.445316]  v4l2_compat_ioctl32+0x21c/0x1420
[   41.449665]  __arm64_compat_sys_ioctl+0xc8/0x108
[   41.454273]  el0_svc_common.constprop.2+0xb0/0x168
[   41.459051]  do_el0_svc_compat+0x18/0x38
[   41.462964]  el0_sync_compat_handler+0x13c/0x194
[   41.467570]  el0_sync_compat+0x144/0x180
[   41.471483] Code: b9407802 eb02007f 540001e8 b9007404 (f81f8001)
[   41.477563] ---[ end trace 051c84051f60870a ]---

>>
>>>>>> With using minimum 3 buffers, this issue doesnt happen at all from
>>>>>> almost 72 hours of testing.
>>>>>>
>>>>>>
>>>>>> Will try with setting vb2 queue field min_buffers_needed as 3 
>>>>>> instead
>>>>>> of adding check in queue setup.
>>>>>>
>>>>>>>> +
>>>>>>>> +     return 0;
>>>>>>>> +}

  reply index

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-14 18:23 [RFC PATCH v3 0/6] Add Tegra driver for video capture Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 1/6] dt-bindings: clock: tegra: Add clk id for CSI TPG clock Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 2/6] clk: tegra: Add Tegra210 CSI TPG clock gate Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 3/6] dt-binding: tegra: Add VI and CSI bindings Sowjanya Komatineni
2020-02-18 23:15   ` Rob Herring
2020-02-19  3:28     ` Sowjanya Komatineni
2020-02-20 19:45       ` Rob Herring
2020-02-20 20:39         ` Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 4/6] media: tegra: Add Tegra210 Video input driver Sowjanya Komatineni
2020-02-16 11:03   ` Hans Verkuil
2020-02-16 19:54     ` Sowjanya Komatineni
2020-02-16 20:11       ` Sowjanya Komatineni
2020-02-16 20:22         ` Sowjanya Komatineni
2020-02-17  8:04           ` Hans Verkuil
2020-02-18  0:59             ` Sowjanya Komatineni
2020-02-18  1:04               ` Sowjanya Komatineni
2020-02-18  3:19                 ` Sowjanya Komatineni [this message]
2020-02-19 15:08                   ` Hans Verkuil
2020-02-19 16:22                     ` Sowjanya Komatineni
2020-02-20  0:09                       ` Sowjanya Komatineni
2020-02-20  9:29                         ` Hans Verkuil
2020-02-20 16:21                           ` Sowjanya Komatineni
2020-02-20 12:44   ` Hans Verkuil
2020-02-20 13:33     ` Hans Verkuil
2020-02-20 17:29       ` Sowjanya Komatineni
2020-02-20 19:11       ` Sowjanya Komatineni
2020-02-24  4:45         ` Sowjanya Komatineni
2020-03-18 11:48           ` Hans Verkuil
2020-03-18 16:14             ` Sowjanya Komatineni
2020-03-18 16:25               ` Sowjanya Komatineni
2020-03-18 17:17                 ` Sowjanya Komatineni
2020-03-19 14:29                   ` Hans Verkuil
2020-03-19 18:49                     ` Sowjanya Komatineni
2020-02-26  4:49     ` Sowjanya Komatineni
2020-02-26  5:50       ` Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 5/6] MAINTAINERS: Add Tegra Video driver section Sowjanya Komatineni
2020-02-14 18:23 ` [RFC PATCH v3 6/6] arm64: tegra: Add Tegra VI CSI support in device tree Sowjanya Komatineni
2020-02-17 14:26 ` [RFC PATCH v3 0/6] Add Tegra driver for video capture Hans Verkuil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32ebc124-cb2d-f545-a5a0-d71192af8219@nvidia.com \
    --to=skomatineni@nvidia.com \
    --cc=devicetree@vger.kernel.org \
    --cc=frankc@nvidia.com \
    --cc=helen.koike@collabora.com \
    --cc=hverkuil@xs4all.nl \
    --cc=jonathanh@nvidia.com \
    --cc=linux-clk@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=sboyd@kernel.org \
    --cc=thierry.reding@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Devicetree Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-devicetree/0 linux-devicetree/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-devicetree linux-devicetree/ https://lore.kernel.org/linux-devicetree \
		devicetree@vger.kernel.org
	public-inbox-index linux-devicetree

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-devicetree


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git