dmaengine Archive on lore.kernel.org
From: Logan Gunthorpe <logang@deltatee.com>
To: Vinod Koul <vkoul@kernel.org>
Cc: linux-kernel@vger.kernel.org, dmaengine@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH 3/5] dmaengine: plx-dma: Introduce PLX DMA engine PCI driver skeleton
Date: Tue, 12 Nov 2019 10:22:48 -0700
Message-ID: <fc86a9de-a816-4bea-081e-bd106b945dbe@deltatee.com>
In-Reply-To: <20191112060919.GZ952516@vkoul-mobl>



On 2019-11-11 11:09 p.m., Vinod Koul wrote:
> On 11-11-19, 10:50, Logan Gunthorpe wrote:
>>
>>
>> On 2019-11-09 10:35 a.m., Vinod Koul wrote:
>>> On 22-10-19, 15:46, Logan Gunthorpe wrote:
>>>> +static irqreturn_t plx_dma_isr(int irq, void *devid)
>>>> +{
>>>> +	return IRQ_HANDLED;
>>>
>>> ??
>>
>> Yes, sorry this is more of an artifact of how I chose to split the
>> patches up. The ISR is filled-in in patch 4.
> 
> lets move this code in all including isr registration in patch 4 then :)

Ok, I'll rework that for the next submission.

>>>> +	 */
>>>> +	schedule_work(&plxdev->release_work);
>>>> +}
>>>> +
>>>> +static void plx_dma_put(struct plx_dma_dev *plxdev)
>>>> +{
>>>> +	kref_put(&plxdev->ref, plx_dma_release);
>>>> +}
>>>> +
>>>> +static int plx_dma_alloc_chan_resources(struct dma_chan *chan)
>>>> +{
>>>> +	struct plx_dma_dev *plxdev = chan_to_plx_dma_dev(chan);
>>>> +
>>>> +	kref_get(&plxdev->ref);
>>>
>>> why do you need to do this?
>>
>> This has to do with being able to properly unbind while a channel is in
>> use. If we don't hold a reference to the struct plx_dma_dev between
>> alloc_chan_resources() and free_chan_resources() then it will panic if a
>> call back is called after plx_dma_remove(). The way I've done it, once a
> 
> which callback?

Any callback that tries to obtain the freed plx_dma_dev structure. (So
plx_dma_free_chan_resources(), plx_dma_prep_memcpy(),
plx_dma_issue_pending(), plx_dma_tx_status()).

>> device is removed, subsequent calls to dma_prep_memcpy() will fail (see
>> ring_active).
>>
>> struct plx_dma_dev needs to be alive between plx_dma_probe() and
>> plx_dma_remove(), and between calls to alloc_chan_resources() and
>> free_chan_resources(). So we use a reference count to ensure this.
> 
> and that is why we hold module reference so we don't go away without
> cleanup

No, that's wrong. The module reference prevents the module and the
functions within it from going away. It doesn't prevent the driver from
being unbound, which normally causes the device's structure to be
freed. Most drivers free the structure containing the DMA engine in
the remove() call, so even if the module is still around, its functions
will still be called with a freed pointer. We take a reference on
the structure to ensure it's not freed while dmaengine users still have a
reference to it.
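
To make the lifetime rule concrete, here is a minimal userspace sketch of
the same idea. It is not the driver code: struct fake_dev and every
fake_*() helper are hypothetical names invented for illustration, a plain
int stands in for the kref, and the structure is freed directly rather
than through a work item. It only shows that the structure outlives
remove() while a channel user still holds a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct plx_dma_dev; not the real layout. */
struct fake_dev {
	int refcount;      /* models the kref */
	bool ring_active;  /* cleared once the driver is unbound */
	bool *freed_flag;  /* lets the caller observe when the release ran */
};

/* probe(): allocate the structure holding one initial reference. */
static struct fake_dev *fake_probe(bool *freed_flag)
{
	struct fake_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	dev->ring_active = true;
	dev->freed_flag = freed_flag;
	*freed_flag = false;
	return dev;
}

static void fake_get(struct fake_dev *dev)
{
	assert(dev->refcount > 0);
	dev->refcount++;
}

/* Drop one reference; the structure is freed only when the last one goes. */
static void fake_put(struct fake_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		*dev->freed_flag = true;
		free(dev);
	}
}

/* alloc_chan_resources(): the channel user pins the structure. */
static void fake_alloc_chan_resources(struct fake_dev *dev)
{
	fake_get(dev);
}

/* free_chan_resources(): the channel user drops its pin. */
static void fake_free_chan_resources(struct fake_dev *dev)
{
	fake_put(dev);
}

/* remove(): mark the ring inactive, then drop the probe() reference. */
static void fake_remove(struct fake_dev *dev)
{
	dev->ring_active = false;
	fake_put(dev);
}
```

With a channel allocated, fake_remove() leaves the structure allocated
and only fake_free_chan_resources() releases it. A module reference
alone would not give this guarantee, because it says nothing about when
remove() runs.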

>>>> +static void plx_dma_release_work(struct work_struct *work)
>>>> +{
>>>> +	struct plx_dma_dev *plxdev = container_of(work, struct plx_dma_dev,
>>>> +						  release_work);
>>>> +
>>>> +	dma_async_device_unregister(&plxdev->dma_dev);
>>>> +	put_device(plxdev->dma_dev.dev);
>>>> +	kfree(plxdev);
>>>> +}
>>>> +
>>>> +static void plx_dma_release(struct kref *ref)
>>>> +{
>>>> +	struct plx_dma_dev *plxdev = container_of(ref, struct plx_dma_dev, ref);
>>>> +
>>>> +	/*
>>>> +	 * The dmaengine reference counting and locking is a bit of a
>>>> +	 * mess so we have to work around it a bit here. We might put
>>>> +	 * the reference while the dmaengine holds the dma_list_mutex
>>>> +	 * which means we can't call dma_async_device_unregister() directly
>>>> +	 * here and it must be delayed.
>>>
>>> why is that, i have not heard any complaints about locking, can you
>>> elaborate on why you need to do this?
>>
>> Per the above explanation, we need to call plx_dma_put() in
>> plx_dma_free_chan_resources(); and plx_dma_release() is when we can call
>> dma_async_device_unregister() (seeing that's when we know there are no
>> longer any active channels).
>>
>> However, dma_chan_put() (which calls device_free_chan_resources()) holds
>> the dma_list_mutex and dma_async_device_unregister() tries to take the
>> dma_list_mutex so, if we call unregister inside free_chan_resources we
>> would deadlock.
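
The locking problem quoted above can be modelled in a few lines of
userspace C. This is only a sketch under loud assumptions: a bool stands
in for dma_list_mutex, a single function pointer stands in for the
workqueue, and all model_*() names are hypothetical. It shows why the
release path only queues work and leaves the unregister call for later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool list_lock_held;         /* models dma_list_mutex */
static bool unregistered;           /* set once "unregister" has run */
static void (*pending_work)(void);  /* models a one-slot work queue */

/* Models dma_async_device_unregister(), which needs the list lock. */
static bool model_unregister(void)
{
	if (list_lock_held)
		return false;  /* a real mutex_lock() would block forever here */
	list_lock_held = true;
	unregistered = true;
	list_lock_held = false;
	return true;
}

/* Work item: runs later, outside the list lock. */
static void model_release_work(void)
{
	(void)model_unregister();
}

/* Models the kref release callback: only queue the work. */
static void model_release(void)
{
	pending_work = model_release_work;
}

/* Models dma_chan_put() dropping the last reference under the lock. */
static void model_chan_put(void)
{
	list_lock_held = true;
	model_release();
	list_lock_held = false;
}

/* Models the worker thread picking up the queued work. */
static void model_run_pending_work(void)
{
	if (pending_work) {
		void (*work)(void) = pending_work;

		pending_work = NULL;
		work();
	}
}
```

Calling model_unregister() while the lock is held fails in this model,
which corresponds to the deadlock. Going through model_chan_put() and
then model_run_pending_work() completes the unregistration only after
the lock has been dropped.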
> 
> yes as we are not expecting someone to unregister in
> device_free_chan_resources(), that is for freeing up resources.
> 
> You are expected to unregister in .remove!
> 
> Can you explain to me why unregister can't be done in remove? I think I am
> still missing some detail for this case.

Because, if the user unbinds while there's a client of the DMA channel,
the kernel panics. First there's the warning[1] I pointed out
previously, then the DMA channel users will cause a use-after-free
when they continue, unaware that the memory they are using has
been freed.

For an example from a random driver:

1) owl_dma_probe() allocates its struct owl_dma with devm_kzalloc()
2) Another driver calls dma_find_channel() and obtains a reference to
one of the channels
3) Asynchronously, the user unbinds the owl_dma driver using the sysfs
interface
4) owl_dma_remove() is called which calls dma_async_device_unregister()
which produces a WARN_ON because a channel is in use
5) The devm stack for this driver instance unwinds and the struct
owl_dma is freed
6) The client driver then calls dmaengine_prep_dma_memcpy() which calls
owl_dma_prep_memcpy(). The first thing that driver does is convert the
now invalid channel reference to an invalid struct owl_dma reference and
shortly thereafter dereferences the now freed memory. If KASAN is
enabled, the user will get a big use-after-free bug panic. If not, the
driver will read and write memory that may be used by some other random
process eventually causing other random fatal errors in the system. The
best case scenario is the process that allocates the already freed
memory zeros it, and thus the client driver would panic on a NULL
pointer dereference.
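
For contrast, here is a rough userspace sketch of how the same six steps
play out when the provider reference-counts its structure and checks an
"active" flag, as described earlier in this mail. None of this is owl-dma
or plx-dma code: struct fake_dma_dev, struct fake_chan and the fake_*()
functions are hypothetical, chan_to_dev() is a hand-written equivalent of
container_of(), and the boolean return from fake_prep_memcpy() is a
simplification of the real prep callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_chan {
	int id;
};

struct fake_dma_dev {
	int refcount;           /* models the kref */
	bool ring_active;       /* false once the driver is unbound */
	struct fake_chan chan;  /* channel embedded in the device structure */
};

/* Userspace equivalent of container_of() for this one member. */
static struct fake_dma_dev *chan_to_dev(struct fake_chan *chan)
{
	return (struct fake_dma_dev *)((char *)chan -
				       offsetof(struct fake_dma_dev, chan));
}

/* Drop a reference; returns true if this call freed the structure. */
static bool fake_put(struct fake_dma_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount > 0)
		return false;
	free(dev);
	return true;
}

/* Step 1: probe allocates the structure with one reference. */
static struct fake_dma_dev *fake_dma_probe(void)
{
	struct fake_dma_dev *dev = calloc(1, sizeof(*dev));

	if (dev) {
		dev->refcount = 1;
		dev->ring_active = true;
	}
	return dev;
}

/* Step 2: a client obtains a channel, which pins the structure. */
static struct fake_chan *fake_request_chan(struct fake_dma_dev *dev)
{
	dev->refcount++;
	return &dev->chan;
}

/* Steps 3-5: unbind marks the device inactive and drops one reference. */
static bool fake_dma_remove(struct fake_dma_dev *dev)
{
	dev->ring_active = false;
	return fake_put(dev);
}

/* Step 6: the client's prep call is refused instead of touching freed memory. */
static bool fake_prep_memcpy(struct fake_chan *chan)
{
	struct fake_dma_dev *dev = chan_to_dev(chan);

	return dev->ring_active;
}

/* The client finally releases the channel; the last put frees the memory. */
static bool fake_release_chan(struct fake_chan *chan)
{
	return fake_put(chan_to_dev(chan));
}
```

In this model the prep call after removal reads valid memory and simply
reports failure, and the structure is freed only when the client
releases its channel.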

I think it is unacceptable for a driver to let this happen, and that's why
I wrote the plx driver such that it is not possible. This is especially
important for the PLX driver because it is a PCI device which can be
hotplugged, so users may actually be randomly trying to unbind it.

Logan


[1]
https://elixir.bootlin.com/linux/latest/source/drivers/dma/dmaengine.c#L1119
