From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1030823AbbJ3KZZ (ORCPT ); Fri, 30 Oct 2015 06:25:25 -0400 Received: from mout.kundenserver.de ([212.227.17.24]:50541 "EHLO mout.kundenserver.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030730AbbJ3KZW (ORCPT ); Fri, 30 Oct 2015 06:25:22 -0400 From: Arnd Bergmann To: Sinan Kaya Cc: dmaengine@vger.kernel.org, timur@codeaurora.org, cov@codeaurora.org, jcm@redhat.com, Rob Herring , Pawel Moll , Mark Rutland , Ian Campbell , Kumar Gala , Vinod Koul , Dan Williams , devicetree@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] dma: add Qualcomm Technologies HIDMA channel driver Date: Fri, 30 Oct 2015 11:24:49 +0100 Message-ID: <5741892.rRLhcI7KQG@wuerfel> User-Agent: KMail/4.11.5 (Linux/3.16.0-10-generic; KDE/4.11.5; x86_64; ; ) In-Reply-To: <1446174501-8870-2-git-send-email-okaya@codeaurora.org> References: <1446174501-8870-1-git-send-email-okaya@codeaurora.org> <1446174501-8870-2-git-send-email-okaya@codeaurora.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Provags-ID: V03:K0:rHem3VIbxqnfBcbHk2V48H1A9SzHt2ab8FTWjcJmFog3uSt+ZNE O19el4b/yiOt5W0iLu+XDoSZ3iApG3JxVjoMR/3HNde+Xw4W9r9WOWxy+VwVREBrZXh3LWX 7dbckv0IEyVthqGfcfAofs3vnzGzsJb5OtDrxtBxKf765VTn+D/BVKU83WyQGO+7mprf7xN A4J0RgtcyP8K+2EoQCE+Q== X-UI-Out-Filterresults: notjunk:1;V01:K0:o7P49E3m18E=:lKDCWGDvfxSlDbtQa0GZ/D zv+n8n4sqoKDETljWE9iWZj1Zg1i9eHDFvmSPO1Q0KSOWEVLoT6rZDRuiDzxEWszjhk2w+3j3 HxffuEGTVqSSgs6QChttijHYCModySxjfjnJgV48EJRywpLVh6OeMZsfqHoR/swJYpB2BW93c wpYYap70UyVWMAQOM2MXlSXOWkzRkLS3Px2gsKk1JHdFnVkABiSa3/QIGER8QgSoEf2is6Eqn 9Pu9N0vXHU6cnYx9JIYIFcEiaSrAL6REtaY58icjIp/vA8D2lPBQnXWrrDXcuVxTPPqBbJS+u t9WvxSh9j9+UgVZ69VqBVcxrRTkii7j79I/U6ITsFsqTTT13RFcGw22RgwjfV1Oh5rul/QiVH e9QCv7nCmHIbN9evI/Ddmsj/SRx1sPRTxYWIiCQCAE9ncKWgBPqEcSq0FQht/oNJXodb6SuQQ +zGBgFS0m068ldxXNDMyx4waPm0Zmov8xiatp/oI9dSKDztU68DyU8MYreurpB4bKDXJQjPaX dUEv4MSV6OHd4zwJS3bXBtLYp7dvik6qsgbsG05cvR60URfoE3SYj6z9iv+/2cCge6tKnCMl4 eHNM+iybVJK1eowZViTILc8Z0uNhOv8qZuP8H9I3Xahh+1e8Ox/d4x3FBMj9G0gKxmRjevFCn rp+5kRnCutikNZd1gKGTth37WHmJVvDlZ6ol3h0QGKnqO+1Luv3oQj+BvdvoG0ty7Z4S3CyBA +OSk2MQblM9m07tO Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thursday 29 October 2015 23:08:13 Sinan Kaya wrote: > This patch adds support for hidma engine. The driver > consists of two logical blocks. The DMA engine interface > and the low-level interface. This version of the driver > does not support virtualization on this release and only > memcpy interface support is included. Does that mean you can have slave device support eventually? If so, please put that into the binding document already. > Signed-off-by: Sinan Kaya > --- > .../devicetree/bindings/dma/qcom_hidma.txt | 18 + > drivers/dma/Kconfig | 11 + > drivers/dma/Makefile | 4 + > drivers/dma/qcom_hidma.c | 1717 ++++++++++++++++++++ > drivers/dma/qcom_hidma.h | 44 + > drivers/dma/qcom_hidma_ll.c | 1132 +++++++++++++ > 6 files changed, 2926 insertions(+) > create mode 100644 Documentation/devicetree/bindings/dma/qcom_hidma.txt > create mode 100644 drivers/dma/qcom_hidma.c > create mode 100644 drivers/dma/qcom_hidma.h > create mode 100644 drivers/dma/qcom_hidma_ll.c > > diff --git a/Documentation/devicetree/bindings/dma/qcom_hidma.txt b/Documentation/devicetree/bindings/dma/qcom_hidma.txt > new file mode 100644 > index 0000000..9a01635 > --- /dev/null > +++ b/Documentation/devicetree/bindings/dma/qcom_hidma.txt > @@ -0,0 +1,18 @@ > +Qualcomm Technologies HIDMA Channel driver > + > +Required properties: > +- compatible: must contain "qcom,hidma" Be more specific here. Also, should this be 'hisilicon,hidma-1.0' rather than 'qcom'? I'm guessing from the name that this is a device you licensed from them. > +- reg: Addresses for the transfer and event channel > +- interrupts: Should contain the event interrupt > +- desc-count: Number of asynchronous requests this channel can handle > +- event-channel: The HW event channel completions will be delivered. > +Example: > + > + hidma_24: hidma@0x5c050000 { Name should be 'dma-controller' in DT, not 'hidma'. > + compatible = "qcom,hidma"; > + reg = <0 0x5c050000 0x0 0x1000>, > + <0 0x5c0b0000 0x0 0x1000>; > + interrupts = <0 389 0>; > + desc-count = <10>; > + event-channel = /bits/ 8 <4>; > + }; Remove the /bits/. > diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig > index 76a5a5e..2645185 100644 > --- a/drivers/dma/Kconfig > +++ b/drivers/dma/Kconfig > @@ -512,6 +512,17 @@ config QCOM_HIDMA_MGMT > OS would run QCOM_HIDMA driver and the hypervisor would run > the QCOM_HIDMA_MGMT driver. > > +config QCOM_HIDMA > + tristate "Qualcomm Technologies HIDMA support" > + select DMA_ENGINE > + select DMA_VIRTUAL_CHANNELS > + help > + Enable support for the Qualcomm Technologies HIDMA controller. > + The HIDMA controller supports optimized buffer copies > + (user to kernel, kernel to kernel, etc.). It only supports > + memcpy/memset interfaces. The core is not intended for general > + purpose slave DMA. > + > config XILINX_VDMA > tristate "Xilinx AXI VDMA Engine" > depends on (ARCH_ZYNQ || MICROBLAZE) > diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile > index 3d25ffd..5665df2 100644 > --- a/drivers/dma/Makefile > +++ b/drivers/dma/Makefile > @@ -53,6 +53,10 @@ obj-$(CONFIG_PL330_DMA) += pl330.o > obj-$(CONFIG_PPC_BESTCOMM) += bestcomm/ > obj-$(CONFIG_PXA_DMA) += pxa_dma.o > obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o > +obj-$(CONFIG_QCOM_HIDMA) += qcom_hdma.o > +qcom_hdma-objs := qcom_hidma_ll.o qcom_hidma.o > + The driver is linked into a single module, so all the EXPORT_SYMBOL statements can be dropped. > +/* Default idle time is 2 seconds. This parameter can > + * be overridden by changing the following > + * /sys/bus/platform/devices/QCOM8061:/power/autosuspend_delay_ms > + * during kernel boot. > + */ > +#define AUTOSUSPEND_TIMEOUT 2000 > +#define HIDMA_DEFAULT_DESCRIPTOR_COUNT 16 > +#define MODULE_NAME "hidma" MODULE_NAME and HIDMA_DEFAULT_DESCRIPTOR_COUNT can be dropped > +#define HIDMA_RUNTIME_GET(dmadev) \ > +do { \ > + atomic_inc(&(dmadev)->pm_counter); \ > + TRC_PM((dmadev)->ddev.dev, \ > + "%s:%d pm_runtime_get %d\n", __func__, __LINE__,\ > + atomic_read(&(dmadev)->pm_counter)); \ > + pm_runtime_get_sync((dmadev)->ddev.dev); \ > +} while (0) > + > +#define HIDMA_RUNTIME_SET(dmadev) \ > +do { \ > + atomic_dec(&(dmadev)->pm_counter); \ > + TRC_PM((dmadev)->ddev.dev, \ > + "%s:%d pm_runtime_put_autosuspend:%d\n", \ > + __func__, __LINE__, \ > + atomic_read(&(dmadev)->pm_counter)); \ > + pm_runtime_mark_last_busy((dmadev)->ddev.dev); \ > + pm_runtime_put_autosuspend((dmadev)->ddev.dev); \ > +} while (0) Inline functions. > +struct hidma_test_sync { > + atomic_t counter; > + wait_queue_head_t wq; > +}; Let me guess, you converted this from a semaphore? ;-) Just put the two members into the containing structure and relete it here. > +struct hidma_dev { > + u8 evridx; > + u32 nr_descriptors; > + > + void *lldev; > + void __iomem *dev_trca; > + void __iomem *dev_evca; > + int (*self_test)(struct hidma_dev *device); > + struct dentry *debugfs; > + struct dentry *stats; > + > + /* used to protect the pending channel list*/ > + spinlock_t lock; > + dma_addr_t dev_trca_phys; > + struct dma_device ddev; > + struct tasklet_struct tasklet; workqueue maybe? > + resource_size_t dev_trca_size; > + dma_addr_t dev_evca_phys; > + resource_size_t dev_evca_size; All three look unused and can be removed. > +static unsigned int debug_pm; > +module_param(debug_pm, uint, 0644); > +MODULE_PARM_DESC(debug_pm, > + "debug runtime power management transitions (default: 0)"); > + > +#define TRC_PM(...) do { \ > + if (debug_pm) \ > + dev_info(__VA_ARGS__); \ > + } while (0) Again, remove these after you are done debugging the problem at hand, we don't need to clutter up the upstream version. > + /* > + * It is assumed that the hardware can move the data within 1s > + * and signal the OS of the completion > + */ > + ret = wait_event_interruptible_timeout(dmadev->test_result.wq, > + atomic_read(&dmadev->test_result.counter) == (map_count), > + msecs_to_jiffies(10000)); > + > + if (ret <= 0) { > + dev_err(dmadev->ddev.dev, > + "Self-test sg copy timed out, disabling\n"); > + err = -ENODEV; > + goto tx_status; > + } Why ENODEV? Could you make this handle restarted system calls? > + > +/* > + * Perform a streaming transaction to verify the HW works. > + */ > +static int hidma_selftest_streaming(struct hidma_dev *dmadev, > + struct dma_chan *dma_chanptr, u64 size, > + unsigned long flags) > +{ You have a lot of selftest code here. Can you try to move that into a file that works for all DMA engines? It feels like this should not be part of a driver. > + > +#if IS_ENABLED(CONFIG_DEBUG_FS) > + > +#define SIER_CHAN_SHOW(chan, name) \ > + seq_printf(s, #name "=%u\n", chan->name) can be open-coded for clarity. > + ret = dma_mapping_error(dev, dma_src); > + if (ret) { > + dev_err(dev, "dma_mapping_error with ret:%d\n", ret); > + ret = -ENOMEM; > + } else { > + phys_addr_t phys; > + > + phys = dma_to_phys(dev, dma_src); > + if (strcmp(__va(phys), "hello world") != 0) { > + dev_err(dev, "memory content mismatch\n"); > + ret = -EINVAL; > + } else { > + dev_dbg(dev, "mapsingle:dma_map_single works\n"); > + } > + dma_unmap_single(dev, dma_src, buf_size, DMA_TO_DEVICE); > + } dma_to_phys() is architecture specific and does not work if you have an IOMMU. Just use the virtual address you passed into dma_map_*. > +/** > + * hidma_dma_info: display HIDMA device info > + * > + * Display the info for the current HIDMA device. > + */ > +static int hidma_dma_info(struct seq_file *s, void *unused) > +{ > + struct hidma_dev *dmadev = s->private; > + struct dma_device *dma = &dmadev->ddev; > + > + seq_printf(s, "nr_descriptors=%d\n", dmadev->nr_descriptors); > + seq_printf(s, "dev_trca=%p\n", &dmadev->dev_trca); > + seq_printf(s, "dev_trca_phys=%pa\n", &dmadev->dev_trca_phys); > + seq_printf(s, "dev_trca_size=%pa\n", &dmadev->dev_trca_size); > + seq_printf(s, "dev_evca=%p\n", &dmadev->dev_evca); > + seq_printf(s, "dev_evca_phys=%pa\n", &dmadev->dev_evca_phys); > + seq_printf(s, "dev_evca_size=%pa\n", &dmadev->dev_evca_size); Don't print pointers here. > + seq_printf(s, "self_test=%u\n", > + atomic_read(&dmadev->test_result.counter)); > + > + seq_printf(s, "copy%s%s%s%s%s%s%s%s%s%s%s\n", > + dma_has_cap(DMA_PQ, dma->cap_mask) ? " pq" : "", > + dma_has_cap(DMA_PQ_VAL, dma->cap_mask) ? " pq_val" : "", > + dma_has_cap(DMA_XOR, dma->cap_mask) ? " xor" : "", > + dma_has_cap(DMA_XOR_VAL, dma->cap_mask) ? " xor_val" : "", > + dma_has_cap(DMA_INTERRUPT, dma->cap_mask) ? " intr" : "", > + dma_has_cap(DMA_SG, dma->cap_mask) ? " sg" : "", > + dma_has_cap(DMA_ASYNC_TX, dma->cap_mask) ? " async" : "", > + dma_has_cap(DMA_SLAVE, dma->cap_mask) ? " slave" : "", > + dma_has_cap(DMA_CYCLIC, dma->cap_mask) ? " cyclic" : "", > + dma_has_cap(DMA_INTERLEAVE, dma->cap_mask) ? " intl" : "", > + dma_has_cap(DMA_MEMCPY, dma->cap_mask) ? " memcpy" : ""); > + > + return 0; > +} The selftest part and the features could just be separate files in the core dmaengine code, it doesn't really belong here. > +} > +#else > +static void hidma_debug_uninit(struct hidma_dev *dmadev) > +{ > +} > +static int hidma_debug_init(struct hidma_dev *dmadev) > +{ > + return 0; > +} > +#endif You can remove the #ifdef here. The debugfs code already stubs out all debugfs_create_*() and turns it all into nops when it is disabled. > + dma_cap_set(DMA_MEMCPY, dmadev->ddev.cap_mask); > + /* Apply default dma_mask if needed */ > + if (!pdev->dev.dma_mask) { > + pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask; > + pdev->dev.coherent_dma_mask = DMA_BIT_MASK(64); > + } > + remove the check, or use if (WARN_ON(!pdev->dev.dma_mask)) return -ENXIO; The dma mask has to always be set by the platform code before probe() is called. If it is not set, you are not allowed to perform DMA. > + dmadev->dev_evca_phys = evca_resource->start; > + dmadev->dev_evca_size = resource_size(evca_resource); > + > + dev_dbg(&pdev->dev, "dev_evca_phys:%pa\n", &dmadev->dev_evca_phys); > + dev_dbg(&pdev->dev, "dev_evca_size:%pa\n", &dmadev->dev_evca_size); > + > + dev_dbg(&pdev->dev, "qcom_hidma: mapped EVCA %pa to %p\n", > + &dmadev->dev_evca_phys, dmadev->dev_evca); > + dmadev->dev_trca_phys = trca_resource->start; > + dmadev->dev_trca_size = resource_size(trca_resource); > + > + dev_dbg(&pdev->dev, "dev_trca_phys:%pa\n", &dmadev->dev_trca_phys); > + dev_dbg(&pdev->dev, "dev_trca_size:%pa\n", &dmadev->dev_trca_size); Don't print pointers. > + rc = devm_request_irq(&pdev->dev, chirq, hidma_chirq_handler, 0, > + "qcom-hidma", &dmadev->lldev); > + if (rc) { > + dev_err(&pdev->dev, "chirq registration failed: %d\n", chirq); > + goto chirq_request_failed; > + } > + > + dev_dbg(&pdev->dev, "initializing DMA channels\n"); > + INIT_LIST_HEAD(&dmadev->ddev.channels); > + rc = hidma_chan_init(dmadev, 0); > + if (rc) { > + dev_err(&pdev->dev, "probe:channel init failed\n"); > + goto channel_init_failed; > + } > + dev_dbg(&pdev->dev, "HI-DMA engine driver starting self test\n"); > + rc = dmadev->self_test(dmadev); > + if (rc) { > + dev_err(&pdev->dev, "probe: self test failed: %d\n", rc); > + goto self_test_failed; > + } > + dev_info(&pdev->dev, "probe: self test succeeded.\n"); > + > + dev_dbg(&pdev->dev, "calling dma_async_device_register\n"); > + rc = dma_async_device_register(&dmadev->ddev); > + if (rc) { > + dev_err(&pdev->dev, > + "probe: failed to register slave DMA: %d\n", rc); > + goto device_register_failed; > + } > + dev_dbg(&pdev->dev, "probe: dma_async_device_register done\n"); > + > + rc = hidma_debug_init(dmadev); > + if (rc) { > + dev_err(&pdev->dev, > + "probe: failed to init debugfs: %d\n", rc); > + goto debug_init_failed; > + } > + > + dev_info(&pdev->dev, "HI-DMA engine driver registration complete\n"); > + platform_set_drvdata(pdev, dmadev); > + HIDMA_RUNTIME_SET(dmadev); > + return 0; Remove the debug prints when you are done debugging. > +debug_init_failed: > +device_register_failed: > +self_test_failed: > +channel_init_failed: > +chirq_request_failed: > + hidma_ll_uninit(dmadev->lldev); > +ll_init_failed: > +evridx_failed: > +remap_trca_failed: > +remap_evca_failed: > + if (dmadev) > + hidma_free(dmadev); Rename the labels according to what you do at in the failure case and remove most of them. This is 'goto', not 'comefrom' ;-) > +static struct platform_driver hidma_driver = { > + .probe = hidma_probe, > + .remove = hidma_remove, > + .driver = { > + .name = MODULE_NAME, > + .owner = THIS_MODULE, > + .of_match_table = of_match_ptr(hidma_match), > + .acpi_match_table = ACPI_PTR(hidma_acpi_ids), Remove .owner and of_match_ptr(). > + }, > +}; > + > +static int __init hidma_init(void) > +{ > + return platform_driver_register(&hidma_driver); > +} > +late_initcall(hidma_init); > + > +static void __exit hidma_exit(void) > +{ > + platform_driver_unregister(&hidma_driver); > +} > +module_exit(hidma_exit); module_platform_driver() > + > + if (unlikely(tre_ch >= lldev->nr_tres)) { > + dev_err(lldev->dev, "invalid TRE number in free:%d", tre_ch); > + return; > + } > + > + tre = &lldev->trepool[tre_ch]; > + if (unlikely(atomic_read(&tre->allocated) != true)) { > + dev_err(lldev->dev, "trying to free an unused TRE:%d", > + tre_ch); > + return; > + } Remove the 'unlikely' and the redundant '!= true'. Only use 'likely' or 'unlikely' if you can measure a difference. > +static int hidma_ll_reset(struct hidma_lldev *lldev) > +{ > + u32 val; > + int count; > + > + val = readl_relaxed(lldev->trca + TRCA_CTRLSTS_OFFSET); > + val = val & ~(CH_CONTROL_MASK << 16); > + val = val | (CH_RESET << 16); > + writel_relaxed(val, lldev->trca + TRCA_CTRLSTS_OFFSET); > + > + /* wait until the reset is performed */ > + wmb(); > + > + /* Delay 10ms after reset to allow DMA logic to quiesce.*/ > + for (count = 0; count < 10; count++) { > + val = readl_relaxed(lldev->trca + TRCA_CTRLSTS_OFFSET); > + lldev->trch_state = (val >> CH_STATE_BIT_POS) > + & CH_STATE_MASK; > + if (lldev->trch_state == CH_DISABLED) > + break; > + mdelay(1); > + } > + val = readl_relaxed(lldev->trca + TRCA_CTRLSTS_OFFSET); > + lldev->trch_state = (val >> CH_STATE_BIT_POS) & CH_STATE_MASK; > + if (lldev->trch_state != CH_DISABLED) { > + dev_err(lldev->dev, > + "transfer channel did not reset\n"); > + return -ENODEV; > + } > + > + val = readl_relaxed(lldev->evca + EVCA_CTRLSTS_OFFSET); > + val = val & ~(CH_CONTROL_MASK << 16); > + val = val | (CH_RESET << 16); > + writel_relaxed(val, lldev->evca + EVCA_CTRLSTS_OFFSET); > + > + /* wait until the reset is performed */ > + wmb(); > + > + /* Delay 10ms after reset to allow DMA logic to quiesce.*/ > + for (count = 0; count < 10; count++) { > + val = readl_relaxed(lldev->evca + EVCA_CTRLSTS_OFFSET); > + lldev->evch_state = (val >> CH_STATE_BIT_POS) > + & CH_STATE_MASK; > + if (lldev->evch_state == CH_DISABLED) > + break; > + mdelay(1); > + } Try using a workqueue to get into a state where you can call msleep() instead of mdelay(). Also, if you waste CPU cycles for hundreds of milliseconds, it's unlikely that the function is so performance critical that it requires writel_relaxed(). Just use writel() here. > +/* > + * The interrupt handler for HIDMA will try to consume as many pending > + * EVRE from the event queue as possible. Each EVRE has an associated > + * TRE that holds the user interface parameters. EVRE reports the > + * result of the transaction. Hardware guarantees ordering between EVREs > + * and TREs. We use last processed offset to figure out which TRE is > + * associated with which EVRE. If two TREs are consumed by HW, the EVREs > + * are in order in the event ring. > + * This handler will do a one pass for consuming EVREs. Other EVREs may > + * be delivered while we are working. It will try to consume incoming > + * EVREs one more time and return. > + * For unprocessed EVREs, hardware will trigger another interrupt until > + * all the interrupt bits are cleared. > + */ > +static void hidma_ll_int_handler_internal(struct hidma_lldev *lldev) > +{ > + u32 status; > + u32 enable; > + u32 cause; > + int repeat = 2; > + unsigned long timeout; > + > + status = readl_relaxed(lldev->evca + EVCA_IRQ_STAT_OFFSET); > + enable = readl_relaxed(lldev->evca + EVCA_IRQ_EN_OFFSET); > + cause = status & enable; > + Reading the status probably requires a readl() rather than readl_relaxed() to guarantee that the DMA data has arrived in memory by the time that the register data is seen by the CPU. If using readl_relaxed() here is a valid and required optimization, please add a comment to explain why it works and how much you gain. > + /* Another interrupt might have arrived while we are > + * processing this one. Read the new cause. > + */ > + status = readl_relaxed(lldev->evca + EVCA_IRQ_STAT_OFFSET); > + enable = readl_relaxed(lldev->evca + EVCA_IRQ_EN_OFFSET); > + cause = status & enable; > + > + repeat--; > + } Same here. > +} > + > + > +static int hidma_ll_enable(struct hidma_lldev *lldev) > +{ > + u32 val; > + > + val = readl_relaxed(lldev->evca + EVCA_CTRLSTS_OFFSET); > + val &= ~(CH_CONTROL_MASK << 16); > + val |= (CH_ENABLE << 16); > + > + writel_relaxed(val, lldev->evca + EVCA_CTRLSTS_OFFSET); > + > + /* wait until channel is enabled */ > + wmb(); > + > + mdelay(1); > + > + val = readl_relaxed(lldev->evca + EVCA_CTRLSTS_OFFSET); > + lldev->evch_state = (val >> CH_STATE_BIT_POS) & CH_STATE_MASK; > + if ((lldev->evch_state != CH_ENABLED) && > + (lldev->evch_state != CH_RUNNING)) { > + dev_err(lldev->dev, > + "event channel did not get enabled\n"); > + return -ENODEV; > + } > + > + val = readl_relaxed(lldev->trca + TRCA_CTRLSTS_OFFSET); > + val = val & ~(CH_CONTROL_MASK << 16); > + val = val | (CH_ENABLE << 16); > + writel_relaxed(val, lldev->trca + TRCA_CTRLSTS_OFFSET); > + > + /* wait until channel is enabled */ > + wmb(); > + > + mdelay(1); Another workqueue? You should basically never call mdelay(). > +static int hidma_ll_hw_start(void *llhndl) > +{ > + int rc = 0; > + struct hidma_lldev *lldev = llhndl; > + unsigned long irqflags; > + > + spin_lock_irqsave(&lldev->lock, irqflags); > + writel_relaxed(lldev->tre_write_offset, > + lldev->trca + TRCA_DOORBELL_OFFSET); > + spin_unlock_irqrestore(&lldev->lock, irqflags); How does this work? The writel_relaxed() won't synchronize with either the DMA data or the spinlock. > +int hidma_ll_init(void **lldevp, struct device *dev, u32 nr_tres, > + void __iomem *trca, void __iomem *evca, > + u8 evridx) How about returning the pointer rather than passing in an indirect pointer? Also, your abstraction seem to go a little too far if the upper driver doesn't know what the lower driver calls its main device structure. Or you can go further and just embed the struct hidma_lldev within the struct hidma_dev to save one? > +void hidma_ll_chstats(struct seq_file *s, void *llhndl, u32 tre_ch) > +{ > + struct hidma_lldev *lldev = llhndl; > + struct hidma_tre *tre; > + u32 length; > + dma_addr_t src_start; > + dma_addr_t dest_start; > + u32 *tre_local; > + > + if (unlikely(tre_ch >= lldev->nr_tres)) { > + dev_err(lldev->dev, "invalid TRE number in chstats:%d", > + tre_ch); > + return; > + } > + tre = &lldev->trepool[tre_ch]; > + seq_printf(s, "------Channel %d -----\n", tre_ch); > + seq_printf(s, "allocated=%d\n", atomic_read(&tre->allocated)); > + HIDMA_CHAN_SHOW(tre, queued); > + seq_printf(s, "err_info=0x%x\n", > + lldev->tx_status_list[tre->chidx].err_info); > + seq_printf(s, "err_code=0x%x\n", > + lldev->tx_status_list[tre->chidx].err_code); > + HIDMA_CHAN_SHOW(tre, status); > + HIDMA_CHAN_SHOW(tre, chidx); > + HIDMA_CHAN_SHOW(tre, dma_sig); > + seq_printf(s, "dev_name=%s\n", tre->dev_name); > + seq_printf(s, "callback=%p\n", tre->callback); > + seq_printf(s, "data=%p\n", tre->data); > + HIDMA_CHAN_SHOW(tre, tre_index); > + > + tre_local = &tre->tre_local[0]; > + src_start = tre_local[TRE_SRC_LOW_IDX]; > + src_start = ((u64)(tre_local[TRE_SRC_HI_IDX]) << 32) + src_start; > + dest_start = tre_local[TRE_DEST_LOW_IDX]; > + dest_start += ((u64)(tre_local[TRE_DEST_HI_IDX]) << 32); > + length = tre_local[TRE_LEN_IDX]; > + > + seq_printf(s, "src=%pap\n", &src_start); > + seq_printf(s, "dest=%pap\n", &dest_start); > + seq_printf(s, "length=0x%x\n", length); > +} > +EXPORT_SYMBOL_GPL(hidma_ll_chstats); Remove all the pointers here. I guess you can remove the entire debugfs file really ;-) This looks like it is better done using ftrace for the low-level internals of the driver. Arnd