* [PATCH v8 0/3] DMA Engine: switch PL330 driver to non-irq-safe runtime PM
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Marek Szyprowski, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Vinod Koul, Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

Hello,

This patchset changes the way runtime PM is implemented in the PL330 DMA
engine driver. The main goal of this change is to add support for the audio
power domain on Exynos5 SoCs (5250, 542x, 5433, probably others) and let
it be properly turned off when no audio is being used. Switching to
non-irq-safe runtime PM is required to let the power domain be turned off
(irq-safe runtime PM keeps the power domain turned on all the time) and to
integrate with the clock controller's runtime PM (this cannot be worked
around in any other way, because PL330 uses clocks from the controller,
which belongs to the same power domain).

For more details of the proposed change to the PL330 driver see patch #3.

The audio power domain on Exynos5 SoCs contains the following hardware modules:
1. clock controller
2. pin controller
3. PL330 DMA controller
4. I2S audio controller
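
To illustrate why a single irq-safe device matters for the whole domain, here
is a minimal userspace sketch. It is a toy model, not kernel code: the names
model_dev, model_domain_can_power_off() and model_audio_domain_idle_off() are
hypothetical, and the rule it encodes (an irq-safe device keeps the domain on
even while suspended) is simply the behaviour described in this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct model_dev {
	const char *name;
	bool irq_safe;	/* driver asked for irq-safe runtime PM */
	bool active;	/* device is currently runtime-resumed */
};

/*
 * Assumed rule, taken from the cover letter: an irq-safe device pins the
 * domain on even while it is runtime-suspended.
 */
bool model_domain_can_power_off(const struct model_dev *devs, size_t n)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].active || devs[i].irq_safe)
			return false;
	}
	return true;
}

/* All four audio-domain modules are idle; only the PL330 mode varies. */
bool model_audio_domain_idle_off(bool pl330_irq_safe)
{
	const struct model_dev devs[] = {
		{ "clock controller", false, false },
		{ "pin controller", false, false },
		{ "PL330 DMA controller", pl330_irq_safe, false },
		{ "I2S audio controller", false, false },
	};

	return model_domain_can_power_off(devs, sizeof(devs) / sizeof(devs[0]));
}
```

In this model the idle audio domain can be powered off only when
model_audio_domain_idle_off() is called with pl330_irq_safe set to false,
which is the situation this patchset aims for.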

Patches adding or fixing runtime PM for each of the above devices are
handled separately.

Runtime PM support for clock controllers is possible and has been proposed
in the following thread (pending review): "[PATCH v4 0/4] Add runtime PM
support for clocks (on Exynos SoC example)",
http://www.spinics.net/lists/arm-kernel/msg550747.html

Runtime PM support for Exynos pin controller has been posted in the
following thread: "[PATCH 0/9] Runtime PM for Exynos pin controller driver",
http://www.spinics.net/lists/arm-kernel/msg550161.html

Exynos I2S driver supports runtime PM, but some fixes were needed for it
and they are already queued to linux-next.

This patchset is based on linux-next from 9th February 2017.

Best regards
Marek Szyprowski
Samsung R&D Institute Poland


Changelog:

v8:
- reworked slave device assignment: it is now done in separate callbacks, as
  requested by Lars-Peter Clausen, so this approach no longer needs changes
  to the of_xlate callback in every DMA engine driver
- reworked pl330 patch to use new device_{set,release}_slave callbacks
- dropped tags because of the code changes
- rebased onto linux next-20170209

v7: https://www.spinics.net/lists/arm-kernel/msg557696.html
- added missing of_dma_request_slave_channel API change to sound/soc/sh/rcar
  driver
- extended commit message with information about drawbacks of irq-safe
  runtime pm
- added Ulf's reviewed-by tags

v6: https://www.spinics.net/lists/arm-kernel/msg557377.html
- fixed pl330 system sleep suspend/resume callbacks: the previous
  implementation incorrectly tried to unprepare clocks unconditionally; after
  the fix, the pl330 suspend/resume callbacks can simply be replaced by the
  generic pm_runtime_force_{suspend,resume} helpers, which simplifies the
  code even more

v5: https://www.spinics.net/lists/arm-kernel/msg555001.html
- added Acks
- additional mutex is indeed not needed, rely on dma_list_mutex in dmaengine
  core, added comment about locking

v4: http://www.spinics.net/lists/dmaengine/msg12329.html
- rebased onto "dmaengine: pl330: fix double lock" patch:
  http://www.spinics.net/lists/dmaengine/msg12289.html
- added a mutex to protect runtime PM links creation/removal to avoid races
- moved mem2mem channel case handling to pl330_{add,del}_slave_pm_link
  functions to simplify code and error paths

v3: http://www.spinics.net/lists/dmaengine/msg12245.html
- removed pl330_filter function as suggested by Arnd Bergmann
- removed pl330.h from arch/arm/plat-samsung/devs.c
- fixed some minor style issues pointed out by Krzysztof Kozlowski

v2: https://www.spinics.net/lists/arm-kernel/msg552772.html
- rebased onto linux next-20170109
- improved patch description
- separated patch #3 from #4 (storing a pointer to slave device for each
  DMA channel) as requested by Krzysztof Kozlowski

v1: https://www.spinics.net/lists/arm-kernel/msg550008.html
- initial version


Patch summary:

Marek Szyprowski (3):
  dmaengine: Add new device_{set,release}_slave callbacks
  dmaengine: pl330: remove pdata based initialization
  dmaengine: pl330: Don't require irq-safe runtime PM

 arch/arm/plat-samsung/devs.c |   1 -
 drivers/dma/dmaengine.c      |  27 +++++-
 drivers/dma/pl330.c          | 219 +++++++++++++++++--------------------------
 include/linux/amba/pl330.h   |  35 -------
 include/linux/dmaengine.h    |  10 ++
 5 files changed, 119 insertions(+), 173 deletions(-)
 delete mode 100644 include/linux/amba/pl330.h

-- 
1.9.1


* [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Marek Szyprowski, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Vinod Koul, Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

Add two new callbacks to the DMA engine device. They will be used to give
the DMA engine driver access to the slave device (the device which requested
the given DMA channel). Access to the slave device might be useful, for
example, for implementing advanced runtime power management.

DMA slave channels are exclusive, so only one slave device can be set
for a given DMA slave channel.

device_set_slave() will be called after the device_alloc_chan_resources()
and device_release_slave() before the device_free_chan_resources().
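
The ordering described above can be sketched in plain userspace C. This is
only a mock under stated assumptions: the structures copy the callback names
from this patch, but mock_request_chan(), mock_release_channel(), mock_run(),
the trace buffer and the set_slave_error field are hypothetical helpers, and
the real core allocates channel resources elsewhere in the request path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace mock: the types only mimic the shape of the kernel structures. */
struct device { const char *name; };
struct dma_chan;

struct dma_device {
	int (*device_alloc_chan_resources)(struct dma_chan *chan);
	void (*device_free_chan_resources)(struct dma_chan *chan);
	int (*device_set_slave)(struct dma_chan *chan, struct device *slave);
	void (*device_release_slave)(struct dma_chan *chan);
};

struct dma_chan {
	struct dma_device *device;
	struct device *slave;
	int set_slave_error;	/* hypothetical: value mock_set_slave() returns */
	char trace[8];		/* hypothetical: one letter per callback */
	size_t ntrace;
};

static void mock_trace(struct dma_chan *chan, char c)
{
	assert(chan->ntrace < sizeof(chan->trace) - 1);
	chan->trace[chan->ntrace++] = c;
	chan->trace[chan->ntrace] = '\0';
}

static int mock_alloc(struct dma_chan *chan) { mock_trace(chan, 'A'); return 0; }
static void mock_free(struct dma_chan *chan) { mock_trace(chan, 'F'); }
static void mock_release_slave(struct dma_chan *chan) { mock_trace(chan, 'R'); }

static int mock_set_slave(struct dma_chan *chan, struct device *slave)
{
	(void)slave;
	mock_trace(chan, 'S');
	return chan->set_slave_error;
}

/* Mirrors the release path of the patch: drop the slave, then the resources. */
void mock_release_channel(struct dma_chan *chan)
{
	if (chan->slave) {
		if (chan->device->device_release_slave)
			chan->device->device_release_slave(chan);
		chan->slave = NULL;
	}
	chan->device->device_free_chan_resources(chan);
}

/* Mirrors the "found:" path of the patch once a channel has been picked. */
int mock_request_chan(struct dma_chan *chan, struct device *dev)
{
	int ret = chan->device->device_alloc_chan_resources(chan);

	if (ret < 0)
		return ret;
	if (chan->device->device_set_slave) {
		chan->slave = dev;
		ret = chan->device->device_set_slave(chan, dev);
		if (ret) {
			chan->slave = NULL;
			mock_release_channel(chan);
			return ret;
		}
	}
	return 0;
}

/* Run one request/release cycle and return the recorded callback order. */
const char *mock_run(int set_slave_error)
{
	static struct dma_chan chan;
	static struct device client = { "i2s" };
	static struct dma_device dmac = {
		.device_alloc_chan_resources = mock_alloc,
		.device_free_chan_resources = mock_free,
		.device_set_slave = mock_set_slave,
		.device_release_slave = mock_release_slave,
	};

	memset(&chan, 0, sizeof(chan));
	chan.device = &dmac;
	chan.set_slave_error = set_slave_error;
	if (mock_request_chan(&chan, &client) == 0)
		mock_release_channel(&chan);
	return chan.trace;
}
```

In the mock, a successful cycle records alloc, set_slave, release_slave, free.
When device_set_slave() fails, release_slave is skipped, because the slave
pointer is cleared before the channel is released.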

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
 include/linux/dmaengine.h | 10 ++++++++++
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 24e0221fd66d..5b7089d8be4d 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
 {
 	struct dma_device *d, *_d;
 	struct dma_chan *chan = NULL;
+	int ret;
 
 	/* If device-tree is present get slave info from here */
 	if (dev->of_node)
@@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
 		chan = acpi_dma_request_slave_chan_by_name(dev, name);
 
 	if (chan) {
-		/* Valid channel found or requester need to be deferred */
-		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
+		if (!IS_ERR(chan))
+			goto found;
+		if (PTR_ERR(chan) == -EPROBE_DEFER)
 			return chan;
 	}
 
@@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
 	}
 	mutex_unlock(&dma_list_mutex);
 
-	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
+	if (!chan)
+		return ERR_PTR(-EPROBE_DEFER);
+	if (IS_ERR(chan))
+		return chan;
+found:
+	if (chan->device->device_set_slave) {
+		chan->slave = dev;
+		ret = chan->device->device_set_slave(chan, dev);
+		if (ret) {
+			chan->slave = NULL;
+			dma_release_channel(chan);
+			chan = ERR_PTR(ret);
+		}
+	}
+	return chan;
 }
 EXPORT_SYMBOL_GPL(dma_request_chan);
 
@@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
 	mutex_lock(&dma_list_mutex);
 	WARN_ONCE(chan->client_count != 1,
 		  "chan reference count %d != 1\n", chan->client_count);
+	if (chan->slave) {
+		if (chan->device->device_release_slave)
+			chan->device->device_release_slave(chan);
+		chan->slave = NULL;
+	}
 	dma_chan_put(chan);
 	/* drop PRIVATE cap enabled by __dma_request_channel() */
 	if (--chan->device->privatecnt == 0)
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 533680860865..d22299e37e69 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -277,6 +277,9 @@ struct dma_chan {
 	struct dma_router *router;
 	void *route_data;
 
+	/* Only for SLAVE channels */
+	struct device *slave;
+
 	void *private;
 };
 
@@ -686,6 +689,10 @@ struct dma_filter {
  * @device_alloc_chan_resources: allocate resources and return the
  *	number of allocated descriptors
  * @device_free_chan_resources: release DMA channel's resources
+ * @device_set_slave: provide access to the slave device, which requested
+ *	given DMA channel, called after @device_alloc_chan_resources
+ * @device_release_slave: finishes access to the slave device, called
+ *	before @device_free_chan_resources
  * @device_prep_dma_memcpy: prepares a memcpy operation
  * @device_prep_dma_xor: prepares a xor operation
  * @device_prep_dma_xor_val: prepares a xor validation operation
@@ -746,6 +753,9 @@ struct dma_device {
 	int (*device_alloc_chan_resources)(struct dma_chan *chan);
 	void (*device_free_chan_resources)(struct dma_chan *chan);
 
+	int (*device_set_slave)(struct dma_chan *chan, struct device *slave);
+	void (*device_release_slave)(struct dma_chan *chan);
+
 	struct dma_async_tx_descriptor *(*device_prep_dma_memcpy)(
 		struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
 		size_t len, unsigned long flags);
-- 
1.9.1


* [PATCH v8 2/3] dmaengine: pl330: remove pdata based initialization
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Marek Szyprowski, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Vinod Koul, Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

This driver is now used only on platforms which support device tree, so
it is safe to remove legacy platform data based initialization code.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
For plat-samsung:
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
---
 arch/arm/plat-samsung/devs.c |  1 -
 drivers/dma/pl330.c          | 42 ++++++++----------------------------------
 include/linux/amba/pl330.h   | 35 -----------------------------------
 3 files changed, 8 insertions(+), 70 deletions(-)
 delete mode 100644 include/linux/amba/pl330.h

diff --git a/arch/arm/plat-samsung/devs.c b/arch/arm/plat-samsung/devs.c
index 03fac123676d..dc269d9143bc 100644
--- a/arch/arm/plat-samsung/devs.c
+++ b/arch/arm/plat-samsung/devs.c
@@ -10,7 +10,6 @@
  * published by the Free Software Foundation.
 */
 
-#include <linux/amba/pl330.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/interrupt.h>
diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index f37f4978dabb..8b0da7fa520d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -22,7 +22,6 @@
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <linux/amba/bus.h>
-#include <linux/amba/pl330.h>
 #include <linux/scatterlist.h>
 #include <linux/of.h>
 #include <linux/of_dma.h>
@@ -2077,18 +2076,6 @@ static void pl330_tasklet(unsigned long data)
 	}
 }
 
-bool pl330_filter(struct dma_chan *chan, void *param)
-{
-	u8 *peri_id;
-
-	if (chan->device->dev->driver != &pl330_driver.drv)
-		return false;
-
-	peri_id = chan->private;
-	return *peri_id == (unsigned long)param;
-}
-EXPORT_SYMBOL(pl330_filter);
-
 static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
 						struct of_dma *ofdma)
 {
@@ -2833,7 +2820,6 @@ static int __maybe_unused pl330_resume(struct device *dev)
 static int
 pl330_probe(struct amba_device *adev, const struct amba_id *id)
 {
-	struct dma_pl330_platdata *pdat;
 	struct pl330_config *pcfg;
 	struct pl330_dmac *pl330;
 	struct dma_pl330_chan *pch, *_p;
@@ -2843,8 +2829,6 @@ static int __maybe_unused pl330_resume(struct device *dev)
 	int num_chan;
 	struct device_node *np = adev->dev.of_node;
 
-	pdat = dev_get_platdata(&adev->dev);
-
 	ret = dma_set_mask_and_coherent(&adev->dev, DMA_BIT_MASK(32));
 	if (ret)
 		return ret;
@@ -2857,7 +2841,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 	pd = &pl330->ddma;
 	pd->dev = &adev->dev;
 
-	pl330->mcbufsz = pdat ? pdat->mcbuf_sz : 0;
+	pl330->mcbufsz = 0;
 
 	/* get quirk */
 	for (i = 0; i < ARRAY_SIZE(of_quirks); i++)
@@ -2901,10 +2885,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 	INIT_LIST_HEAD(&pd->channels);
 
 	/* Initialize channel parameters */
-	if (pdat)
-		num_chan = max_t(int, pdat->nr_valid_peri, pcfg->num_chan);
-	else
-		num_chan = max_t(int, pcfg->num_peri, pcfg->num_chan);
+	num_chan = max_t(int, pcfg->num_peri, pcfg->num_chan);
 
 	pl330->num_peripherals = num_chan;
 
@@ -2916,11 +2897,8 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	for (i = 0; i < num_chan; i++) {
 		pch = &pl330->peripherals[i];
-		if (!adev->dev.of_node)
-			pch->chan.private = pdat ? &pdat->peri_id[i] : NULL;
-		else
-			pch->chan.private = adev->dev.of_node;
 
+		pch->chan.private = adev->dev.of_node;
 		INIT_LIST_HEAD(&pch->submitted_list);
 		INIT_LIST_HEAD(&pch->work_list);
 		INIT_LIST_HEAD(&pch->completed_list);
@@ -2933,15 +2911,11 @@ static int __maybe_unused pl330_resume(struct device *dev)
 		list_add_tail(&pch->chan.device_node, &pd->channels);
 	}
 
-	if (pdat) {
-		pd->cap_mask = pdat->cap_mask;
-	} else {
-		dma_cap_set(DMA_MEMCPY, pd->cap_mask);
-		if (pcfg->num_peri) {
-			dma_cap_set(DMA_SLAVE, pd->cap_mask);
-			dma_cap_set(DMA_CYCLIC, pd->cap_mask);
-			dma_cap_set(DMA_PRIVATE, pd->cap_mask);
-		}
+	dma_cap_set(DMA_MEMCPY, pd->cap_mask);
+	if (pcfg->num_peri) {
+		dma_cap_set(DMA_SLAVE, pd->cap_mask);
+		dma_cap_set(DMA_CYCLIC, pd->cap_mask);
+		dma_cap_set(DMA_PRIVATE, pd->cap_mask);
 	}
 
 	pd->device_alloc_chan_resources = pl330_alloc_chan_resources;
diff --git a/include/linux/amba/pl330.h b/include/linux/amba/pl330.h
deleted file mode 100644
index fe93758e8403..000000000000
--- a/include/linux/amba/pl330.h
+++ /dev/null
@@ -1,35 +0,0 @@
-/* linux/include/linux/amba/pl330.h
- *
- * Copyright (C) 2010 Samsung Electronics Co. Ltd.
- *	Jaswinder Singh <jassi.brar@samsung.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#ifndef	__AMBA_PL330_H_
-#define	__AMBA_PL330_H_
-
-#include <linux/dmaengine.h>
-
-struct dma_pl330_platdata {
-	/*
-	 * Number of valid peripherals connected to DMAC.
-	 * This may be different from the value read from
-	 * CR0, as the PL330 implementation might have 'holes'
-	 * in the peri list or the peri could also be reached
-	 * from another DMAC which the platform prefers.
-	 */
-	u8 nr_valid_peri;
-	/* Array of valid peripherals */
-	u8 *peri_id;
-	/* Operational capabilities */
-	dma_cap_mask_t cap_mask;
-	/* Bytes to allocate for MC buffer */
-	unsigned mcbuf_sz;
-};
-
-extern bool pl330_filter(struct dma_chan *chan, void *param);
-#endif	/* __AMBA_PL330_H_ */
-- 
1.9.1


* [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
       [not found]   ` <CGME20170209142309eucas1p2b1277d96139eafc0d1dcc14145600476@eucas1p2.samsung.com>
  2017-02-09 14:22       ` Marek Szyprowski
@ 2017-02-09 14:22       ` Marek Szyprowski
  0 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Marek Szyprowski, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Vinod Koul, Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

This patch replaces irq-safe runtime PM with a non-irq-safe version based on
a new approach. The existing irq-safe runtime PM implementation for PL330 did
not bring much benefit of its own: only clocks were enabled/disabled.

Another limitation of irq-safe runtime PM is that it may prevent the generic
PM domain (genpd) from being powered off, particularly when the genpd doesn't
have GENPD_FLAG_IRQ_SAFE set.
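
The limitation described above can be sketched as a small userspace model.
This is only an illustration, not the genpd code: struct toy_pd_dev and
toy_domain_can_power_off() are hypothetical names, and the rule is a
simplified reading of the paragraph above (an irq-safe device pins a domain
that is not itself marked irq-safe).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one device attached to a PM domain. */
struct toy_pd_dev {
	bool suspended; /* device is runtime suspended */
	bool irq_safe;  /* device uses irq-safe runtime PM */
};

/*
 * Toy power-off check (assumption, not the real genpd logic): the domain
 * may be powered off only when every device is suspended, and an irq-safe
 * device blocks power-off of a domain that is not marked irq-safe.
 */
static bool toy_domain_can_power_off(const struct toy_pd_dev *devs, size_t n,
				     bool domain_irq_safe)
{
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!devs[i].suspended)
			return false;
		if (devs[i].irq_safe && !domain_irq_safe)
			return false;
	}
	return true;
}
```

In this model a fully suspended domain still stays powered as long as one
of its devices is irq-safe and the domain is not, which is the situation
the audio power domain is in before this patch.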

Until now, a non-irq-safe runtime PM implementation was only possible by
calling pm_runtime_get/put functions from alloc/free_chan_resources. All
other DMA engine API functions may be called from atomic context, where
sleeping is not permitted. In practice, such an implementation would keep
the DMA controller's device active almost all the time, because most slave
device drivers (DMA engine clients) acquire a DMA channel in their probe()
function and release it during driver removal.
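
A minimal userspace sketch of why that policy keeps the controller active:
the reference is held for the whole channel lifetime, not for the duration
of a transfer. All toy_* names are hypothetical and only the usage counter
is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy DMA controller: only the runtime PM usage counter is modelled. */
struct toy_dmac {
	int usage_count;
};

static bool toy_dmac_active(const struct toy_dmac *dmac)
{
	return dmac->usage_count > 0;
}

/* Stand-in for alloc_chan_resources(): may sleep, so it can take a ref. */
static void toy_alloc_chan(struct toy_dmac *dmac)
{
	dmac->usage_count++;
}

/* Stand-in for free_chan_resources(): drops the ref taken at alloc time. */
static void toy_free_chan(struct toy_dmac *dmac)
{
	assert(dmac->usage_count > 0);
	dmac->usage_count--;
}

/* Hypothetical client that holds its channel from probe() to remove(). */
static void toy_client_probe(struct toy_dmac *dmac)
{
	toy_alloc_chan(dmac);
}

static void toy_client_remove(struct toy_dmac *dmac)
{
	toy_free_chan(dmac);
}
```

Between toy_client_probe() and toy_client_remove() the controller counts as
active even if the client never starts a transfer.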

This patch provides a different approach. It is based on the observation
that only one slave device can use each DMA channel; PL330 hardware always
has dedicated channels for each peripheral device. Using the recently
introduced device dependencies (links) infrastructure, one can ensure the
proper runtime PM state of the PL330 DMA controller based on the runtime PM
state of the slave device.

In this approach, the pl330_set_slave() function creates a new dependency
between the PL330 DMA controller device (as a supplier) and the given slave
device (as a consumer). This way the PL330 DMA controller device's runtime
active counter is increased when the slave device is resumed and decreased
when the slave device is suspended. This keeps the PL330 DMA controller
runtime active as long as there is an active user of any of its DMA
channels. This is similar to what has already been implemented in the Exynos
IOMMU driver in commit 2f5f44f205cc95 ("iommu/exynos: Use device dependency
links to control runtime pm").
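
The supplier/consumer behaviour described above can be sketched as a toy
userspace model. It is an assumption-laden simplification: struct toy_dev,
toy_get() and toy_put() are hypothetical, suspend happens synchronously
when the counter drops to zero, and none of this is the PM core's actual
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct device: only the runtime PM bits we model. */
struct toy_dev {
	int usage_count;          /* runtime PM usage counter */
	bool active;              /* true while runtime resumed */
	struct toy_dev *supplier; /* supplier of a PM link, may be NULL */
};

/* Take a reference; on the 0 -> 1 transition resume the supplier first. */
static void toy_get(struct toy_dev *dev)
{
	if (dev->usage_count++ == 0) {
		if (dev->supplier)
			toy_get(dev->supplier);
		dev->active = true;
	}
}

/* Drop a reference; on the 1 -> 0 transition suspend, then release supplier. */
static void toy_put(struct toy_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0) {
		dev->active = false;
		if (dev->supplier)
			toy_put(dev->supplier);
	}
}
```

With a consumer (for example an I2S device) whose supplier points at the
DMA controller, the controller becomes active on the consumer's first
toy_get() and inactive again after its last toy_put(), without the
controller's driver making any runtime PM call of its own.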

If the slave device doesn't implement runtime PM or keeps itself runtime
active all the time, then the PL330 DMA controller will be runtime active
for as long as the channel is allocated.

If one requests a memory-to-memory channel, the runtime active counter is
increased unconditionally. This might be a drawback of this approach, but
PL330 is not really used for memory-to-memory operations due to its poor
performance in such operations compared to the CPU.
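
The difference between slave and memory-to-memory channels can be sketched
by counting only the driver's explicit references, following the get/put
calls this patch adds to pl330_alloc_chan_resources(), pl330_set_slave(),
pl330_release_slave() and pl330_free_chan_resources(). The toy_* names are
hypothetical, and the reference held by the device link itself is
deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy controller: counts only references taken explicitly by the driver. */
struct toy_dmac {
	int usage_count;
};

struct toy_chan {
	struct toy_dmac *dmac;
	bool has_slave_link; /* set while a slave PM link exists */
};

/* Like pl330_alloc_chan_resources(): take a reference. */
static void toy_alloc_chan(struct toy_chan *ch)
{
	ch->dmac->usage_count++;
}

/* Like pl330_set_slave(): the link takes over, drop the alloc-time ref. */
static void toy_set_slave(struct toy_chan *ch)
{
	assert(ch->dmac->usage_count > 0);
	ch->has_slave_link = true;
	ch->dmac->usage_count--;
}

/* Like pl330_release_slave(): re-take the reference, then drop the link. */
static void toy_release_slave(struct toy_chan *ch)
{
	ch->dmac->usage_count++;
	ch->has_slave_link = false;
}

/* Like pl330_free_chan_resources(): drop the reference. */
static void toy_free_chan(struct toy_chan *ch)
{
	assert(ch->dmac->usage_count > 0);
	ch->dmac->usage_count--;
}
```

A slave channel ends up with no explicit reference after toy_set_slave(),
so the controller's state follows the slave device. A memory-to-memory
channel never reaches toy_set_slave() and keeps its reference until
toy_free_chan().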

Removal of irq-safe runtime PM is based on the revert of the following
commits:
1. commit 5c9e6c2b2ba3 "dmaengine: pl330: fix runtime pm support"
2. commit 81cc6edc0870 "dmaengine: pl330: Fix hang on dmaengine_terminate_all
   on certain boards"
3. commit ae43b3289186 "ARM: 8202/1: dmaengine: pl330: Add runtime Power
   Management support v12"

Introducing non-irq-safe runtime power management finally allows the audio
power domain on Exynos5 SoCs to be turned off.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/dma/pl330.c | 177 +++++++++++++++++++++++-----------------------------
 1 file changed, 77 insertions(+), 100 deletions(-)

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 8b0da7fa520d..17efad418faa 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -22,6 +22,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <linux/amba/bus.h>
+#include <linux/mutex.h>
 #include <linux/scatterlist.h>
 #include <linux/of.h>
 #include <linux/of_dma.h>
@@ -268,9 +269,6 @@ enum pl330_byteswap {
 
 #define NR_DEFAULT_DESC	16
 
-/* Delay for runtime PM autosuspend, ms */
-#define PL330_AUTOSUSPEND_DELAY 20
-
 /* Populated by the PL330 core driver for DMA API driver's info */
 struct pl330_config {
 	u32	periph_id;
@@ -449,7 +447,7 @@ struct dma_pl330_chan {
 	bool cyclic;
 
 	/* for runtime pm tracking */
-	bool active;
+	struct device_link *slave_link;
 };
 
 struct pl330_dmac {
@@ -463,6 +461,8 @@ struct pl330_dmac {
 	struct list_head desc_pool;
 	/* To protect desc_pool manipulation */
 	spinlock_t pool_lock;
+	/* For management of slave PM links */
+	struct mutex rpm_lock;
 
 	/* Size of MicroCode buffers for each channel. */
 	unsigned mcbufsz;
@@ -2008,7 +2008,6 @@ static void pl330_tasklet(unsigned long data)
 	struct dma_pl330_chan *pch = (struct dma_pl330_chan *)data;
 	struct dma_pl330_desc *desc, *_dt;
 	unsigned long flags;
-	bool power_down = false;
 
 	spin_lock_irqsave(&pch->lock, flags);
 
@@ -2023,18 +2022,10 @@ static void pl330_tasklet(unsigned long data)
 	/* Try to submit a req imm. next to the last completed cookie */
 	fill_queue(pch);
 
-	if (list_empty(&pch->work_list)) {
-		spin_lock(&pch->thread->dmac->lock);
-		_stop(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-		power_down = true;
-		pch->active = false;
-	} else {
-		/* Make sure the PL330 Channel thread is active */
-		spin_lock(&pch->thread->dmac->lock);
-		_start(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-	}
+	/* Make sure the PL330 Channel thread is active */
+	spin_lock(&pch->thread->dmac->lock);
+	_start(pch->thread);
+	spin_unlock(&pch->thread->dmac->lock);
 
 	while (!list_empty(&pch->completed_list)) {
 		struct dmaengine_desc_callback cb;
@@ -2047,13 +2038,6 @@ static void pl330_tasklet(unsigned long data)
 		if (pch->cyclic) {
 			desc->status = PREP;
 			list_move_tail(&desc->node, &pch->work_list);
-			if (power_down) {
-				pch->active = true;
-				spin_lock(&pch->thread->dmac->lock);
-				_start(pch->thread);
-				spin_unlock(&pch->thread->dmac->lock);
-				power_down = false;
-			}
 		} else {
 			desc->status = FREE;
 			list_move_tail(&desc->node, &pch->dmac->desc_pool);
@@ -2068,12 +2052,6 @@ static void pl330_tasklet(unsigned long data)
 		}
 	}
 	spin_unlock_irqrestore(&pch->lock, flags);
-
-	/* If work list empty, power down */
-	if (power_down) {
-		pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-		pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
-	}
 }
 
 static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
@@ -2096,11 +2074,68 @@ static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
 	return dma_get_slave_channel(&pl330->peripherals[chan_id].chan);
 }
 
+static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	int i;
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++) {
+		if (pl330->peripherals[i].chan.slave == slave &&
+		    pl330->peripherals[i].slave_link) {
+			pch->slave_link = pl330->peripherals[i].slave_link;
+			goto done;
+		}
+	}
+
+	pch->slave_link = device_link_add(slave, pl330->ddma.dev,
+				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
+	if (!pch->slave_link) {
+		mutex_unlock(&pl330->rpm_lock);
+		return -ENODEV;
+	}
+done:
+	mutex_unlock(&pl330->rpm_lock);
+
+	pm_runtime_put(pl330->ddma.dev);
+
+	return 0;
+}
+
+static void pl330_release_slave(struct dma_chan *chan)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	struct device_link *link = pch->slave_link;
+	int i, count = 0;
+
+	pm_runtime_get_sync(pl330->ddma.dev);
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++)
+		if (pl330->peripherals[i].slave_link == link)
+			count++;
+
+	pch->slave_link = NULL;
+	if (count == 1)
+		device_link_del(link);
+
+	mutex_unlock(&pl330->rpm_lock);
+}
+
 static int pl330_alloc_chan_resources(struct dma_chan *chan)
 {
 	struct dma_pl330_chan *pch = to_pchan(chan);
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
+	int ret;
+
+	ret = pm_runtime_get_sync(pl330->ddma.dev);
+	if (ret < 0)
+		return ret;
 
 	spin_lock_irqsave(&pl330->lock, flags);
 
@@ -2110,6 +2145,7 @@ static int pl330_alloc_chan_resources(struct dma_chan *chan)
 	pch->thread = pl330_request_channel(pl330);
 	if (!pch->thread) {
 		spin_unlock_irqrestore(&pl330->lock, flags);
+		pm_runtime_put(pl330->ddma.dev);
 		return -ENOMEM;
 	}
 
@@ -2151,9 +2187,7 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	unsigned long flags;
 	struct pl330_dmac *pl330 = pch->dmac;
 	LIST_HEAD(list);
-	bool power_down = false;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 	spin_lock(&pl330->lock);
 	_stop(pch->thread);
@@ -2162,8 +2196,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	pch->thread->req[0].desc = NULL;
 	pch->thread->req[1].desc = NULL;
 	pch->thread->req_running = -1;
-	power_down = pch->active;
-	pch->active = false;
 
 	/* Mark all desc done */
 	list_for_each_entry(desc, &pch->submitted_list, node) {
@@ -2180,10 +2212,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	list_splice_tail_init(&pch->work_list, &pl330->desc_pool);
 	list_splice_tail_init(&pch->completed_list, &pl330->desc_pool);
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	if (power_down)
-		pm_runtime_put_autosuspend(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2201,7 +2229,6 @@ static int pl330_pause(struct dma_chan *chan)
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 
 	spin_lock(&pl330->lock);
@@ -2209,8 +2236,6 @@ static int pl330_pause(struct dma_chan *chan)
 	spin_unlock(&pl330->lock);
 
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2223,7 +2248,6 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 
 	tasklet_kill(&pch->task);
 
-	pm_runtime_get_sync(pch->dmac->ddma.dev);
 	spin_lock_irqsave(&pl330->lock, flags);
 
 	pl330_release_channel(pch->thread);
@@ -2233,19 +2257,17 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 		list_splice_tail_init(&pch->work_list, &pch->dmac->desc_pool);
 
 	spin_unlock_irqrestore(&pl330->lock, flags);
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
+
+	pm_runtime_put(pl330->ddma.dev);
 }
 
 static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 					   struct dma_pl330_desc *desc)
 {
 	struct pl330_thread *thrd = pch->thread;
-	struct pl330_dmac *pl330 = pch->dmac;
 	void __iomem *regs = thrd->dmac->base;
 	u32 val, addr;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	val = addr = 0;
 	if (desc->rqcfg.src_inc) {
 		val = readl(regs + SA(thrd->id));
@@ -2254,8 +2276,6 @@ static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 		val = readl(regs + DA(thrd->id));
 		addr = desc->px.dst_addr;
 	}
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	/* If DMAMOV hasn't finished yet, SAR/DAR can be zero */
 	if (!val)
@@ -2341,16 +2361,6 @@ static void pl330_issue_pending(struct dma_chan *chan)
 	unsigned long flags;
 
 	spin_lock_irqsave(&pch->lock, flags);
-	if (list_empty(&pch->work_list)) {
-		/*
-		 * Warn on nothing pending. Empty submitted_list may
-		 * break our pm_runtime usage counter as it is
-		 * updated on work_list emptiness status.
-		 */
-		WARN_ON(list_empty(&pch->submitted_list));
-		pch->active = true;
-		pm_runtime_get_sync(pch->dmac->ddma.dev);
-	}
 	list_splice_tail_init(&pch->submitted_list, &pch->work_list);
 	spin_unlock_irqrestore(&pch->lock, flags);
 
@@ -2778,44 +2788,12 @@ static irqreturn_t pl330_irq_handler(int irq, void *data)
 	BIT(DMA_SLAVE_BUSWIDTH_8_BYTES)
 
 /*
- * Runtime PM callbacks are provided by amba/bus.c driver.
- *
- * It is assumed here that IRQ safe runtime PM is chosen in probe and amba
- * bus driver will only disable/enable the clock in runtime PM callbacks.
+ * Runtime PM callbacks are provided by amba/bus.c driver, system sleep
+ * suspend/resume is implemented by generic helpers, which use existing
+ * runtime PM callbacks.
  */
-static int __maybe_unused pl330_suspend(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-
-	pm_runtime_disable(dev);
-
-	if (!pm_runtime_status_suspended(dev)) {
-		/* amba did not disable the clock */
-		amba_pclk_disable(pcdev);
-	}
-	amba_pclk_unprepare(pcdev);
-
-	return 0;
-}
-
-static int __maybe_unused pl330_resume(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-	int ret;
-
-	ret = amba_pclk_prepare(pcdev);
-	if (ret)
-		return ret;
-
-	if (!pm_runtime_status_suspended(dev))
-		ret = amba_pclk_enable(pcdev);
-
-	pm_runtime_enable(dev);
-
-	return ret;
-}
-
-static SIMPLE_DEV_PM_OPS(pl330_pm, pl330_suspend, pl330_resume);
+static SIMPLE_DEV_PM_OPS(pl330_pm, pm_runtime_force_suspend,
+			 pm_runtime_force_resume);
 
 static int
 pl330_probe(struct amba_device *adev, const struct amba_id *id)
@@ -2877,6 +2855,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	INIT_LIST_HEAD(&pl330->desc_pool);
 	spin_lock_init(&pl330->pool_lock);
+	mutex_init(&pl330->rpm_lock);
 
 	/* Create a descriptor pool of default size */
 	if (!add_desc(pl330, GFP_KERNEL, NR_DEFAULT_DESC))
@@ -2920,6 +2899,8 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	pd->device_alloc_chan_resources = pl330_alloc_chan_resources;
 	pd->device_free_chan_resources = pl330_free_chan_resources;
+	pd->device_set_slave = pl330_set_slave;
+	pd->device_release_slave = pl330_release_slave;
 	pd->device_prep_dma_memcpy = pl330_prep_dma_memcpy;
 	pd->device_prep_dma_cyclic = pl330_prep_dma_cyclic;
 	pd->device_tx_status = pl330_tx_status;
@@ -2968,11 +2949,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 		pcfg->data_buf_dep, pcfg->data_bus_width / 8, pcfg->num_chan,
 		pcfg->num_peri, pcfg->num_events);
 
-	pm_runtime_irq_safe(&adev->dev);
-	pm_runtime_use_autosuspend(&adev->dev);
-	pm_runtime_set_autosuspend_delay(&adev->dev, PL330_AUTOSUSPEND_DELAY);
-	pm_runtime_mark_last_busy(&adev->dev);
-	pm_runtime_put_autosuspend(&adev->dev);
+	pm_runtime_put(&adev->dev);
 
 	return 0;
 probe_err3:
-- 
1.9.1


* [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
@ 2017-02-09 14:22       ` Marek Szyprowski
  0 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Ulf Hansson, Lars-Peter Clausen, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Vinod Koul, Rafael J. Wysocki,
	Krzysztof Kozlowski, Inki Dae, Marek Szyprowski

This patch replaces irq-safe runtime PM with a non-irq-safe version based on
a new approach. The existing irq-safe runtime PM implementation for PL330 did
not bring much benefit of its own: only clocks were enabled/disabled.

Another limitation of irq-safe runtime PM is that it may prevent the generic
PM domain (genpd) from being powered off, particularly when the genpd doesn't
have GENPD_FLAG_IRQ_SAFE set.

Until now, a non-irq-safe runtime PM implementation was only possible by
calling pm_runtime_get/put functions from alloc/free_chan_resources. All
other DMA engine API functions may be called from atomic context, where
sleeping is not permitted. In practice, such an implementation would keep
the DMA controller's device active almost all the time, because most slave
device drivers (DMA engine clients) acquire a DMA channel in their probe()
function and release it during driver removal.

This patch provides a different approach. It is based on the observation
that only one slave device can use each DMA channel; PL330 hardware always
has dedicated channels for each peripheral device. Using the recently
introduced device dependencies (links) infrastructure, one can ensure the
proper runtime PM state of the PL330 DMA controller based on the runtime PM
state of the slave device.

In this approach, the pl330_set_slave() function creates a new dependency
between the PL330 DMA controller device (as a supplier) and the given slave
device (as a consumer). This way the PL330 DMA controller device's runtime
active counter is increased when the slave device is resumed and decreased
when the slave device is suspended. This keeps the PL330 DMA controller
runtime active as long as there is an active user of any of its DMA
channels. This is similar to what has already been implemented in the Exynos
IOMMU driver in commit 2f5f44f205cc95 ("iommu/exynos: Use device dependency
links to control runtime pm").

If the slave device doesn't implement runtime PM or keeps itself runtime
active all the time, then the PL330 DMA controller will be runtime active
for as long as the channel is allocated.

If one requests a memory-to-memory channel, the runtime active counter is
increased unconditionally. This might be a drawback of this approach, but
PL330 is not really used for memory-to-memory operations due to its poor
performance in such operations compared to the CPU.

Removal of irq-safe runtime PM is based on the revert of the following
commits:
1. commit 5c9e6c2b2ba3 "dmaengine: pl330: fix runtime pm support"
2. commit 81cc6edc0870 "dmaengine: pl330: Fix hang on dmaengine_terminate_all
   on certain boards"
3. commit ae43b3289186 "ARM: 8202/1: dmaengine: pl330: Add runtime Power
   Management support v12"

Introducing non-irq-safe runtime power management finally allows the audio
power domain on Exynos5 SoCs to be turned off.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/dma/pl330.c | 177 +++++++++++++++++++++++-----------------------------
 1 file changed, 77 insertions(+), 100 deletions(-)

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 8b0da7fa520d..17efad418faa 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -22,6 +22,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <linux/amba/bus.h>
+#include <linux/mutex.h>
 #include <linux/scatterlist.h>
 #include <linux/of.h>
 #include <linux/of_dma.h>
@@ -268,9 +269,6 @@ enum pl330_byteswap {
 
 #define NR_DEFAULT_DESC	16
 
-/* Delay for runtime PM autosuspend, ms */
-#define PL330_AUTOSUSPEND_DELAY 20
-
 /* Populated by the PL330 core driver for DMA API driver's info */
 struct pl330_config {
 	u32	periph_id;
@@ -449,7 +447,7 @@ struct dma_pl330_chan {
 	bool cyclic;
 
 	/* for runtime pm tracking */
-	bool active;
+	struct device_link *slave_link;
 };
 
 struct pl330_dmac {
@@ -463,6 +461,8 @@ struct pl330_dmac {
 	struct list_head desc_pool;
 	/* To protect desc_pool manipulation */
 	spinlock_t pool_lock;
+	/* For management of slave PM links */
+	struct mutex rpm_lock;
 
 	/* Size of MicroCode buffers for each channel. */
 	unsigned mcbufsz;
@@ -2008,7 +2008,6 @@ static void pl330_tasklet(unsigned long data)
 	struct dma_pl330_chan *pch = (struct dma_pl330_chan *)data;
 	struct dma_pl330_desc *desc, *_dt;
 	unsigned long flags;
-	bool power_down = false;
 
 	spin_lock_irqsave(&pch->lock, flags);
 
@@ -2023,18 +2022,10 @@ static void pl330_tasklet(unsigned long data)
 	/* Try to submit a req imm. next to the last completed cookie */
 	fill_queue(pch);
 
-	if (list_empty(&pch->work_list)) {
-		spin_lock(&pch->thread->dmac->lock);
-		_stop(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-		power_down = true;
-		pch->active = false;
-	} else {
-		/* Make sure the PL330 Channel thread is active */
-		spin_lock(&pch->thread->dmac->lock);
-		_start(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-	}
+	/* Make sure the PL330 Channel thread is active */
+	spin_lock(&pch->thread->dmac->lock);
+	_start(pch->thread);
+	spin_unlock(&pch->thread->dmac->lock);
 
 	while (!list_empty(&pch->completed_list)) {
 		struct dmaengine_desc_callback cb;
@@ -2047,13 +2038,6 @@ static void pl330_tasklet(unsigned long data)
 		if (pch->cyclic) {
 			desc->status = PREP;
 			list_move_tail(&desc->node, &pch->work_list);
-			if (power_down) {
-				pch->active = true;
-				spin_lock(&pch->thread->dmac->lock);
-				_start(pch->thread);
-				spin_unlock(&pch->thread->dmac->lock);
-				power_down = false;
-			}
 		} else {
 			desc->status = FREE;
 			list_move_tail(&desc->node, &pch->dmac->desc_pool);
@@ -2068,12 +2052,6 @@ static void pl330_tasklet(unsigned long data)
 		}
 	}
 	spin_unlock_irqrestore(&pch->lock, flags);
-
-	/* If work list empty, power down */
-	if (power_down) {
-		pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-		pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
-	}
 }
 
 static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
@@ -2096,11 +2074,68 @@ static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
 	return dma_get_slave_channel(&pl330->peripherals[chan_id].chan);
 }
 
+static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	int i;
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++) {
+		if (pl330->peripherals[i].chan.slave == slave &&
+		    pl330->peripherals[i].slave_link) {
+			pch->slave_link = pl330->peripherals[i].slave_link;
+			goto done;
+		}
+	}
+
+	pch->slave_link = device_link_add(slave, pl330->ddma.dev,
+				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
+	if (!pch->slave_link) {
+		mutex_unlock(&pl330->rpm_lock);
+		return -ENODEV;
+	}
+done:
+	mutex_unlock(&pl330->rpm_lock);
+
+	pm_runtime_put(pl330->ddma.dev);
+
+	return 0;
+}
+
+static void pl330_release_slave(struct dma_chan *chan)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	struct device_link *link = pch->slave_link;
+	int i, count = 0;
+
+	pm_runtime_get_sync(pl330->ddma.dev);
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++)
+		if (pl330->peripherals[i].slave_link == link)
+			count++;
+
+	pch->slave_link = NULL;
+	if (count == 1)
+		device_link_del(link);
+
+	mutex_unlock(&pl330->rpm_lock);
+}
+
 static int pl330_alloc_chan_resources(struct dma_chan *chan)
 {
 	struct dma_pl330_chan *pch = to_pchan(chan);
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
+	int ret;
+
+	ret = pm_runtime_get_sync(pl330->ddma.dev);
+	if (ret < 0)
+		return ret;
 
 	spin_lock_irqsave(&pl330->lock, flags);
 
@@ -2110,6 +2145,7 @@ static int pl330_alloc_chan_resources(struct dma_chan *chan)
 	pch->thread = pl330_request_channel(pl330);
 	if (!pch->thread) {
 		spin_unlock_irqrestore(&pl330->lock, flags);
+		pm_runtime_put(pl330->ddma.dev);
 		return -ENOMEM;
 	}
 
@@ -2151,9 +2187,7 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	unsigned long flags;
 	struct pl330_dmac *pl330 = pch->dmac;
 	LIST_HEAD(list);
-	bool power_down = false;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 	spin_lock(&pl330->lock);
 	_stop(pch->thread);
@@ -2162,8 +2196,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	pch->thread->req[0].desc = NULL;
 	pch->thread->req[1].desc = NULL;
 	pch->thread->req_running = -1;
-	power_down = pch->active;
-	pch->active = false;
 
 	/* Mark all desc done */
 	list_for_each_entry(desc, &pch->submitted_list, node) {
@@ -2180,10 +2212,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	list_splice_tail_init(&pch->work_list, &pl330->desc_pool);
 	list_splice_tail_init(&pch->completed_list, &pl330->desc_pool);
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	if (power_down)
-		pm_runtime_put_autosuspend(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2201,7 +2229,6 @@ static int pl330_pause(struct dma_chan *chan)
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 
 	spin_lock(&pl330->lock);
@@ -2209,8 +2236,6 @@ static int pl330_pause(struct dma_chan *chan)
 	spin_unlock(&pl330->lock);
 
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2223,7 +2248,6 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 
 	tasklet_kill(&pch->task);
 
-	pm_runtime_get_sync(pch->dmac->ddma.dev);
 	spin_lock_irqsave(&pl330->lock, flags);
 
 	pl330_release_channel(pch->thread);
@@ -2233,19 +2257,17 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 		list_splice_tail_init(&pch->work_list, &pch->dmac->desc_pool);
 
 	spin_unlock_irqrestore(&pl330->lock, flags);
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
+
+	pm_runtime_put(pl330->ddma.dev);
 }
 
 static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 					   struct dma_pl330_desc *desc)
 {
 	struct pl330_thread *thrd = pch->thread;
-	struct pl330_dmac *pl330 = pch->dmac;
 	void __iomem *regs = thrd->dmac->base;
 	u32 val, addr;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	val = addr = 0;
 	if (desc->rqcfg.src_inc) {
 		val = readl(regs + SA(thrd->id));
@@ -2254,8 +2276,6 @@ static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 		val = readl(regs + DA(thrd->id));
 		addr = desc->px.dst_addr;
 	}
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	/* If DMAMOV hasn't finished yet, SAR/DAR can be zero */
 	if (!val)
@@ -2341,16 +2361,6 @@ static void pl330_issue_pending(struct dma_chan *chan)
 	unsigned long flags;
 
 	spin_lock_irqsave(&pch->lock, flags);
-	if (list_empty(&pch->work_list)) {
-		/*
-		 * Warn on nothing pending. Empty submitted_list may
-		 * break our pm_runtime usage counter as it is
-		 * updated on work_list emptiness status.
-		 */
-		WARN_ON(list_empty(&pch->submitted_list));
-		pch->active = true;
-		pm_runtime_get_sync(pch->dmac->ddma.dev);
-	}
 	list_splice_tail_init(&pch->submitted_list, &pch->work_list);
 	spin_unlock_irqrestore(&pch->lock, flags);
 
@@ -2778,44 +2788,12 @@ static irqreturn_t pl330_irq_handler(int irq, void *data)
 	BIT(DMA_SLAVE_BUSWIDTH_8_BYTES)
 
 /*
- * Runtime PM callbacks are provided by amba/bus.c driver.
- *
- * It is assumed here that IRQ safe runtime PM is chosen in probe and amba
- * bus driver will only disable/enable the clock in runtime PM callbacks.
+ * Runtime PM callbacks are provided by amba/bus.c driver, system sleep
+ * suspend/resume is implemented by generic helpers, which use existing
+ * runtime PM callbacks.
  */
-static int __maybe_unused pl330_suspend(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-
-	pm_runtime_disable(dev);
-
-	if (!pm_runtime_status_suspended(dev)) {
-		/* amba did not disable the clock */
-		amba_pclk_disable(pcdev);
-	}
-	amba_pclk_unprepare(pcdev);
-
-	return 0;
-}
-
-static int __maybe_unused pl330_resume(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-	int ret;
-
-	ret = amba_pclk_prepare(pcdev);
-	if (ret)
-		return ret;
-
-	if (!pm_runtime_status_suspended(dev))
-		ret = amba_pclk_enable(pcdev);
-
-	pm_runtime_enable(dev);
-
-	return ret;
-}
-
-static SIMPLE_DEV_PM_OPS(pl330_pm, pl330_suspend, pl330_resume);
+static SIMPLE_DEV_PM_OPS(pl330_pm, pm_runtime_force_suspend,
+			 pm_runtime_force_resume);
 
 static int
 pl330_probe(struct amba_device *adev, const struct amba_id *id)
@@ -2877,6 +2855,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	INIT_LIST_HEAD(&pl330->desc_pool);
 	spin_lock_init(&pl330->pool_lock);
+	mutex_init(&pl330->rpm_lock);
 
 	/* Create a descriptor pool of default size */
 	if (!add_desc(pl330, GFP_KERNEL, NR_DEFAULT_DESC))
@@ -2920,6 +2899,8 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	pd->device_alloc_chan_resources = pl330_alloc_chan_resources;
 	pd->device_free_chan_resources = pl330_free_chan_resources;
+	pd->device_set_slave = pl330_set_slave;
+	pd->device_release_slave = pl330_release_slave;
 	pd->device_prep_dma_memcpy = pl330_prep_dma_memcpy;
 	pd->device_prep_dma_cyclic = pl330_prep_dma_cyclic;
 	pd->device_tx_status = pl330_tx_status;
@@ -2968,11 +2949,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 		pcfg->data_buf_dep, pcfg->data_bus_width / 8, pcfg->num_chan,
 		pcfg->num_peri, pcfg->num_events);
 
-	pm_runtime_irq_safe(&adev->dev);
-	pm_runtime_use_autosuspend(&adev->dev);
-	pm_runtime_set_autosuspend_delay(&adev->dev, PL330_AUTOSUSPEND_DELAY);
-	pm_runtime_mark_last_busy(&adev->dev);
-	pm_runtime_put_autosuspend(&adev->dev);
+	pm_runtime_put(&adev->dev);
 
 	return 0;
 probe_err3:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
@ 2017-02-09 14:22       ` Marek Szyprowski
  0 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-09 14:22 UTC (permalink / raw)
  To: linux-arm-kernel

This patch replaces irq-safe runtime PM with a non-irq-safe version based on
a new approach. The existing irq-safe runtime PM implementation for PL330
brought little benefit of its own: only clocks were enabled/disabled.

Another limitation of irq-safe runtime PM is that it may prevent the generic
PM domain (genpd) from being powered off, particularly when the genpd doesn't
have GENPD_FLAG_IRQ_SAFE set.

Until now, a non-irq-safe runtime PM implementation was only possible by
calling pm_runtime_get/put functions from alloc/free_chan_resources. All
other DMA engine API functions may be called from atomic context, where
sleeping is not permitted. In practice such an implementation would keep the
DMA controller's device active almost all the time, because most slave
device drivers (DMA engine clients) acquire the DMA channel in their probe()
function and release it during driver removal.

This patch provides a new, different approach. It is based on the observation
that only one slave device can use each DMA channel: PL330 hardware always
has dedicated channels for each peripheral device. Using the recently
introduced device dependencies (links) infrastructure, one can ensure the
proper runtime PM state of the PL330 DMA controller based on the runtime PM
state of the slave device.
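
As a rough illustration of the idea (a userspace toy model, not kernel code;
every toy_* name is made up for this sketch), a runtime-PM link can be
thought of as the consumer taking a usage reference on its supplier for as
long as the consumer itself is runtime active. The DL_FLAG_RPM_ACTIVE
handling at link creation time is deliberately left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code: all names are hypothetical. */
struct toy_dev {
	int usage_count;		/* models the runtime PM usage counter */
	bool active;			/* true while "runtime resumed" */
	struct toy_dev *supplier;	/* set when a PM-runtime link exists */
};

/* Take a reference; the first one resumes the supplier, then the device. */
static void toy_get(struct toy_dev *dev)
{
	if (dev->usage_count++ == 0) {
		if (dev->supplier)
			toy_get(dev->supplier);
		dev->active = true;
	}
}

/* Drop a reference; the last one suspends the device, then the supplier. */
static void toy_put(struct toy_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0) {
		dev->active = false;
		if (dev->supplier)
			toy_put(dev->supplier);
	}
}

/* Record the dependency; the consumer is assumed to be suspended here. */
static void toy_link(struct toy_dev *consumer, struct toy_dev *supplier)
{
	assert(!consumer->active);
	consumer->supplier = supplier;
}
```

With a DMA controller as supplier and an I2S device as consumer, toy_get()
on the I2S device also activates the DMA controller, and the matching
toy_put() lets both go idle again.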

In this approach, the pl330_set_slave() function creates a new dependency
between the PL330 DMA controller device (as a supplier) and the given slave
device (as a consumer). This way the PL330 DMA controller device's runtime
active counter is increased when the slave device is resumed and decreased
when the slave device is suspended. This ensures that the PL330 DMA
controller stays runtime active while there is an active user of any of its
DMA channels. This is similar to what has already been implemented in the
Exynos IOMMU driver in commit 2f5f44f205cc95 ("iommu/exynos: Use device
dependency links to control runtime pm").
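
The diff below also lets several channels requested by the same slave share
a single link, and deletes that link only when the last such channel is
released. A minimal userspace sketch of that bookkeeping follows (toy code
only; the toy_* names and the fixed-size arrays are assumptions made for
illustration, and no real device links are involved):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_CHANS 4

/* Toy userspace model, not kernel code: all names are hypothetical. */
struct toy_link {
	const void *slave;	/* consumer this link was created for */
	int alive;		/* 1 until the link is "deleted" */
};

struct toy_chan {
	const void *slave;
	struct toy_link *link;
};

struct toy_dmac {
	struct toy_chan chans[TOY_NUM_CHANS];
	struct toy_link links[TOY_NUM_CHANS];	/* backing storage for links */
	int links_created;
};

static void toy_set_slave(struct toy_dmac *d, int ch, const void *slave)
{
	int i;

	d->chans[ch].slave = slave;

	/* Reuse a link that another channel of the same slave already holds. */
	for (i = 0; i < TOY_NUM_CHANS; i++) {
		if (d->chans[i].slave == slave && d->chans[i].link) {
			d->chans[ch].link = d->chans[i].link;
			return;
		}
	}

	/* Otherwise create a fresh link for this slave. */
	assert(d->links_created < TOY_NUM_CHANS);
	d->chans[ch].link = &d->links[d->links_created++];
	d->chans[ch].link->slave = slave;
	d->chans[ch].link->alive = 1;
}

static void toy_release_slave(struct toy_dmac *d, int ch)
{
	struct toy_link *link = d->chans[ch].link;
	int i, count = 0;

	if (!link)
		return;

	/* Count how many channels still reference this link. */
	for (i = 0; i < TOY_NUM_CHANS; i++)
		if (d->chans[i].link == link)
			count++;

	d->chans[ch].link = NULL;
	d->chans[ch].slave = NULL;

	/* Only the last user tears the link down. */
	if (count == 1)
		link->alive = 0;
}
```

Two channels taken by the same slave end up pointing at one toy_link, which
stays alive until the second of them is released.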

If the slave device doesn't implement runtime PM or keeps itself runtime
active all the time, then the PL330 DMA controller will be runtime active for
as long as the channel is allocated.

If one requests a memory-to-memory channel, the runtime active counter is
increased unconditionally. This might be a drawback of this approach, but
PL330 is not really used for memory-to-memory operations due to its poor
performance in such operations compared to the CPU.

Removal of irq-safe runtime PM is based on the revert of the following
commits:
1. commit 5c9e6c2b2ba3 "dmaengine: pl330: fix runtime pm support"
2. commit 81cc6edc0870 "dmaengine: pl330: Fix hang on dmaengine_terminate_all
   on certain boards"
3. commit ae43b3289186 "ARM: 8202/1: dmaengine: pl330: Add runtime Power
   Management support v12"

Introducing non-irq-safe runtime power management finally allows the audio
power domain on Exynos5 SoCs to be turned off.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/dma/pl330.c | 177 +++++++++++++++++++++++-----------------------------
 1 file changed, 77 insertions(+), 100 deletions(-)

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 8b0da7fa520d..17efad418faa 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -22,6 +22,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <linux/amba/bus.h>
+#include <linux/mutex.h>
 #include <linux/scatterlist.h>
 #include <linux/of.h>
 #include <linux/of_dma.h>
@@ -268,9 +269,6 @@ enum pl330_byteswap {
 
 #define NR_DEFAULT_DESC	16
 
-/* Delay for runtime PM autosuspend, ms */
-#define PL330_AUTOSUSPEND_DELAY 20
-
 /* Populated by the PL330 core driver for DMA API driver's info */
 struct pl330_config {
 	u32	periph_id;
@@ -449,7 +447,7 @@ struct dma_pl330_chan {
 	bool cyclic;
 
 	/* for runtime pm tracking */
-	bool active;
+	struct device_link *slave_link;
 };
 
 struct pl330_dmac {
@@ -463,6 +461,8 @@ struct pl330_dmac {
 	struct list_head desc_pool;
 	/* To protect desc_pool manipulation */
 	spinlock_t pool_lock;
+	/* For management of slave PM links */
+	struct mutex rpm_lock;
 
 	/* Size of MicroCode buffers for each channel. */
 	unsigned mcbufsz;
@@ -2008,7 +2008,6 @@ static void pl330_tasklet(unsigned long data)
 	struct dma_pl330_chan *pch = (struct dma_pl330_chan *)data;
 	struct dma_pl330_desc *desc, *_dt;
 	unsigned long flags;
-	bool power_down = false;
 
 	spin_lock_irqsave(&pch->lock, flags);
 
@@ -2023,18 +2022,10 @@ static void pl330_tasklet(unsigned long data)
 	/* Try to submit a req imm. next to the last completed cookie */
 	fill_queue(pch);
 
-	if (list_empty(&pch->work_list)) {
-		spin_lock(&pch->thread->dmac->lock);
-		_stop(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-		power_down = true;
-		pch->active = false;
-	} else {
-		/* Make sure the PL330 Channel thread is active */
-		spin_lock(&pch->thread->dmac->lock);
-		_start(pch->thread);
-		spin_unlock(&pch->thread->dmac->lock);
-	}
+	/* Make sure the PL330 Channel thread is active */
+	spin_lock(&pch->thread->dmac->lock);
+	_start(pch->thread);
+	spin_unlock(&pch->thread->dmac->lock);
 
 	while (!list_empty(&pch->completed_list)) {
 		struct dmaengine_desc_callback cb;
@@ -2047,13 +2038,6 @@ static void pl330_tasklet(unsigned long data)
 		if (pch->cyclic) {
 			desc->status = PREP;
 			list_move_tail(&desc->node, &pch->work_list);
-			if (power_down) {
-				pch->active = true;
-				spin_lock(&pch->thread->dmac->lock);
-				_start(pch->thread);
-				spin_unlock(&pch->thread->dmac->lock);
-				power_down = false;
-			}
 		} else {
 			desc->status = FREE;
 			list_move_tail(&desc->node, &pch->dmac->desc_pool);
@@ -2068,12 +2052,6 @@ static void pl330_tasklet(unsigned long data)
 		}
 	}
 	spin_unlock_irqrestore(&pch->lock, flags);
-
-	/* If work list empty, power down */
-	if (power_down) {
-		pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-		pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
-	}
 }
 
 static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
@@ -2096,11 +2074,68 @@ static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
 	return dma_get_slave_channel(&pl330->peripherals[chan_id].chan);
 }
 
+static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	int i;
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++) {
+		if (pl330->peripherals[i].chan.slave == slave &&
+		    pl330->peripherals[i].slave_link) {
+			pch->slave_link = pl330->peripherals[i].slave_link;
+			goto done;
+		}
+	}
+
+	pch->slave_link = device_link_add(slave, pl330->ddma.dev,
+				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
+	if (!pch->slave_link) {
+		mutex_unlock(&pl330->rpm_lock);
+		return -ENODEV;
+	}
+done:
+	mutex_unlock(&pl330->rpm_lock);
+
+	pm_runtime_put(pl330->ddma.dev);
+
+	return 0;
+}
+
+static void pl330_release_slave(struct dma_chan *chan)
+{
+	struct dma_pl330_chan *pch = to_pchan(chan);
+	struct pl330_dmac *pl330 = pch->dmac;
+	struct device_link *link = pch->slave_link;
+	int i, count = 0;
+
+	pm_runtime_get_sync(pl330->ddma.dev);
+
+	mutex_lock(&pl330->rpm_lock);
+
+	for (i = 0; i < pl330->num_peripherals; i++)
+		if (pl330->peripherals[i].slave_link == link)
+			count++;
+
+	pch->slave_link = NULL;
+	if (count == 1)
+		device_link_del(link);
+
+	mutex_unlock(&pl330->rpm_lock);
+}
+
 static int pl330_alloc_chan_resources(struct dma_chan *chan)
 {
 	struct dma_pl330_chan *pch = to_pchan(chan);
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
+	int ret;
+
+	ret = pm_runtime_get_sync(pl330->ddma.dev);
+	if (ret < 0)
+		return ret;
 
 	spin_lock_irqsave(&pl330->lock, flags);
 
@@ -2110,6 +2145,7 @@ static int pl330_alloc_chan_resources(struct dma_chan *chan)
 	pch->thread = pl330_request_channel(pl330);
 	if (!pch->thread) {
 		spin_unlock_irqrestore(&pl330->lock, flags);
+		pm_runtime_put(pl330->ddma.dev);
 		return -ENOMEM;
 	}
 
@@ -2151,9 +2187,7 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	unsigned long flags;
 	struct pl330_dmac *pl330 = pch->dmac;
 	LIST_HEAD(list);
-	bool power_down = false;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 	spin_lock(&pl330->lock);
 	_stop(pch->thread);
@@ -2162,8 +2196,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	pch->thread->req[0].desc = NULL;
 	pch->thread->req[1].desc = NULL;
 	pch->thread->req_running = -1;
-	power_down = pch->active;
-	pch->active = false;
 
 	/* Mark all desc done */
 	list_for_each_entry(desc, &pch->submitted_list, node) {
@@ -2180,10 +2212,6 @@ static int pl330_terminate_all(struct dma_chan *chan)
 	list_splice_tail_init(&pch->work_list, &pl330->desc_pool);
 	list_splice_tail_init(&pch->completed_list, &pl330->desc_pool);
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	if (power_down)
-		pm_runtime_put_autosuspend(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2201,7 +2229,6 @@ static int pl330_pause(struct dma_chan *chan)
 	struct pl330_dmac *pl330 = pch->dmac;
 	unsigned long flags;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	spin_lock_irqsave(&pch->lock, flags);
 
 	spin_lock(&pl330->lock);
@@ -2209,8 +2236,6 @@ static int pl330_pause(struct dma_chan *chan)
 	spin_unlock(&pl330->lock);
 
 	spin_unlock_irqrestore(&pch->lock, flags);
-	pm_runtime_mark_last_busy(pl330->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	return 0;
 }
@@ -2223,7 +2248,6 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 
 	tasklet_kill(&pch->task);
 
-	pm_runtime_get_sync(pch->dmac->ddma.dev);
 	spin_lock_irqsave(&pl330->lock, flags);
 
 	pl330_release_channel(pch->thread);
@@ -2233,19 +2257,17 @@ static void pl330_free_chan_resources(struct dma_chan *chan)
 		list_splice_tail_init(&pch->work_list, &pch->dmac->desc_pool);
 
 	spin_unlock_irqrestore(&pl330->lock, flags);
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pch->dmac->ddma.dev);
+
+	pm_runtime_put(pl330->ddma.dev);
 }
 
 static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 					   struct dma_pl330_desc *desc)
 {
 	struct pl330_thread *thrd = pch->thread;
-	struct pl330_dmac *pl330 = pch->dmac;
 	void __iomem *regs = thrd->dmac->base;
 	u32 val, addr;
 
-	pm_runtime_get_sync(pl330->ddma.dev);
 	val = addr = 0;
 	if (desc->rqcfg.src_inc) {
 		val = readl(regs + SA(thrd->id));
@@ -2254,8 +2276,6 @@ static int pl330_get_current_xferred_count(struct dma_pl330_chan *pch,
 		val = readl(regs + DA(thrd->id));
 		addr = desc->px.dst_addr;
 	}
-	pm_runtime_mark_last_busy(pch->dmac->ddma.dev);
-	pm_runtime_put_autosuspend(pl330->ddma.dev);
 
 	/* If DMAMOV hasn't finished yet, SAR/DAR can be zero */
 	if (!val)
@@ -2341,16 +2361,6 @@ static void pl330_issue_pending(struct dma_chan *chan)
 	unsigned long flags;
 
 	spin_lock_irqsave(&pch->lock, flags);
-	if (list_empty(&pch->work_list)) {
-		/*
-		 * Warn on nothing pending. Empty submitted_list may
-		 * break our pm_runtime usage counter as it is
-		 * updated on work_list emptiness status.
-		 */
-		WARN_ON(list_empty(&pch->submitted_list));
-		pch->active = true;
-		pm_runtime_get_sync(pch->dmac->ddma.dev);
-	}
 	list_splice_tail_init(&pch->submitted_list, &pch->work_list);
 	spin_unlock_irqrestore(&pch->lock, flags);
 
@@ -2778,44 +2788,12 @@ static irqreturn_t pl330_irq_handler(int irq, void *data)
 	BIT(DMA_SLAVE_BUSWIDTH_8_BYTES)
 
 /*
- * Runtime PM callbacks are provided by amba/bus.c driver.
- *
- * It is assumed here that IRQ safe runtime PM is chosen in probe and amba
- * bus driver will only disable/enable the clock in runtime PM callbacks.
+ * Runtime PM callbacks are provided by amba/bus.c driver, system sleep
+ * suspend/resume is implemented by generic helpers, which use existing
+ * runtime PM callbacks.
  */
-static int __maybe_unused pl330_suspend(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-
-	pm_runtime_disable(dev);
-
-	if (!pm_runtime_status_suspended(dev)) {
-		/* amba did not disable the clock */
-		amba_pclk_disable(pcdev);
-	}
-	amba_pclk_unprepare(pcdev);
-
-	return 0;
-}
-
-static int __maybe_unused pl330_resume(struct device *dev)
-{
-	struct amba_device *pcdev = to_amba_device(dev);
-	int ret;
-
-	ret = amba_pclk_prepare(pcdev);
-	if (ret)
-		return ret;
-
-	if (!pm_runtime_status_suspended(dev))
-		ret = amba_pclk_enable(pcdev);
-
-	pm_runtime_enable(dev);
-
-	return ret;
-}
-
-static SIMPLE_DEV_PM_OPS(pl330_pm, pl330_suspend, pl330_resume);
+static SIMPLE_DEV_PM_OPS(pl330_pm, pm_runtime_force_suspend,
+			 pm_runtime_force_resume);
 
 static int
 pl330_probe(struct amba_device *adev, const struct amba_id *id)
@@ -2877,6 +2855,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	INIT_LIST_HEAD(&pl330->desc_pool);
 	spin_lock_init(&pl330->pool_lock);
+	mutex_init(&pl330->rpm_lock);
 
 	/* Create a descriptor pool of default size */
 	if (!add_desc(pl330, GFP_KERNEL, NR_DEFAULT_DESC))
@@ -2920,6 +2899,8 @@ static int __maybe_unused pl330_resume(struct device *dev)
 
 	pd->device_alloc_chan_resources = pl330_alloc_chan_resources;
 	pd->device_free_chan_resources = pl330_free_chan_resources;
+	pd->device_set_slave = pl330_set_slave;
+	pd->device_release_slave = pl330_release_slave;
 	pd->device_prep_dma_memcpy = pl330_prep_dma_memcpy;
 	pd->device_prep_dma_cyclic = pl330_prep_dma_cyclic;
 	pd->device_tx_status = pl330_tx_status;
@@ -2968,11 +2949,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
 		pcfg->data_buf_dep, pcfg->data_bus_width / 8, pcfg->num_chan,
 		pcfg->num_peri, pcfg->num_events);
 
-	pm_runtime_irq_safe(&adev->dev);
-	pm_runtime_use_autosuspend(&adev->dev);
-	pm_runtime_set_autosuspend_delay(&adev->dev, PL330_AUTOSUSPEND_DELAY);
-	pm_runtime_mark_last_busy(&adev->dev);
-	pm_runtime_put_autosuspend(&adev->dev);
+	pm_runtime_put(&adev->dev);
 
 	return 0;
 probe_err3:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
  2017-02-09 14:22       ` Marek Szyprowski
@ 2017-02-10  4:34         ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-10  4:34 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm,
	linux-kernel, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

On Thu, Feb 09, 2017 at 03:22:49PM +0100, Marek Szyprowski wrote:
> Add two new callbacks to DMA engine device. They will used to provide
> access to slave device (the device which requested given DMA channel)

You mean access to client devices?

> for DMA engine driver. Access to slave device might be useful for example
> for implementing advanced runtime power management.
> 
> DMA slave channels are exclusive, so only one slave device can be set
> for a given DMA slave channel.

That is not a right assumption, and it is my worry here. With virt-dma we
don't really assume that a hardware channel is exclusive. Certain
implementations may do that, but the framework cannot assume it.

> device_set_slave() will be called after the device_alloc_chan_resources()
> and device_release_slave() before the device_free_chan_resources().

Okay, I had to relook at the series to get around this part. Sorry, but we
can't call it set_slave; it is actually set_client/consumer.

In our context, slave means a dmaengine slave device, aka the provider.
The client would be the consumer and not the slave.

> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>  drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
>  include/linux/dmaengine.h | 10 ++++++++++
>  2 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index 24e0221fd66d..5b7089d8be4d 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>  {
>  	struct dma_device *d, *_d;
>  	struct dma_chan *chan = NULL;
> +	int ret;
>  
>  	/* If device-tree is present get slave info from here */
>  	if (dev->of_node)
> @@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>  		chan = acpi_dma_request_slave_chan_by_name(dev, name);
>  
>  	if (chan) {
> -		/* Valid channel found or requester need to be deferred */
> -		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
> +		if (!IS_ERR(chan))
> +			goto found;
> +		if (PTR_ERR(chan) == -EPROBE_DEFER)
>  			return chan;
>  	}
>  
> @@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>  	}
>  	mutex_unlock(&dma_list_mutex);
>  
> -	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
> +	if (!chan)
> +		return ERR_PTR(-EPROBE_DEFER);
> +	if (IS_ERR(chan))
> +		return chan;
> +found:
> +	if (chan->device->device_set_slave) {
> +		chan->slave = dev;
> +		ret = chan->device->device_set_slave(chan, dev);
> +		if (ret) {
> +			chan->slave = NULL;
> +			dma_release_channel(chan);
> +			chan = ERR_PTR(ret);
> +		}
> +	}
> +	return chan;
>  }
>  EXPORT_SYMBOL_GPL(dma_request_chan);
>  
> @@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
>  	mutex_lock(&dma_list_mutex);
>  	WARN_ONCE(chan->client_count != 1,
>  		  "chan reference count %d != 1\n", chan->client_count);
> +	if (chan->slave) {
> +		if (chan->device->device_release_slave)
> +			chan->device->device_release_slave(chan);
> +		chan->slave = NULL;
> +	}
>  	dma_chan_put(chan);
>  	/* drop PRIVATE cap enabled by __dma_request_channel() */
>  	if (--chan->device->privatecnt == 0)
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 533680860865..d22299e37e69 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -277,6 +277,9 @@ struct dma_chan {
>  	struct dma_router *router;
>  	void *route_data;
>  
> +	/* Only for SLAVE channels */
> +	struct device *slave;

So, assuming you refer to the consumer (aka client) here, why do we need a
set callback if we store it here?

> +
>  	void *private;
>  };
>  
> @@ -686,6 +689,10 @@ struct dma_filter {
>   * @device_alloc_chan_resources: allocate resources and return the
>   *	number of allocated descriptors
>   * @device_free_chan_resources: release DMA channel's resources
> + * @device_set_slave: provide access to the slave device, which requested
> + *	given DMA channel, called after @device_alloc_chan_resources
> + * @device_release_slave: finishes access to the slave device, called
> + *	before @device_free_chan_resources
>   * @device_prep_dma_memcpy: prepares a memcpy operation
>   * @device_prep_dma_xor: prepares a xor operation
>   * @device_prep_dma_xor_val: prepares a xor validation operation
> @@ -746,6 +753,9 @@ struct dma_device {
>  	int (*device_alloc_chan_resources)(struct dma_chan *chan);
>  	void (*device_free_chan_resources)(struct dma_chan *chan);
>  
> +	int (*device_set_slave)(struct dma_chan *chan, struct device *slave);
> +	void (*device_release_slave)(struct dma_chan *chan);
> +
>  	struct dma_async_tx_descriptor *(*device_prep_dma_memcpy)(
>  		struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
>  		size_t len, unsigned long flags);
> -- 
> 1.9.1
> 

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-09 14:22       ` Marek Szyprowski
@ 2017-02-10  4:50         ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-10  4:50 UTC (permalink / raw)
  To: Marek Szyprowski, Rafael J. Wysocki
  Cc: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm,
	linux-kernel, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Ulf Hansson, Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
  
> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
> +{
> +	struct dma_pl330_chan *pch = to_pchan(chan);
> +	struct pl330_dmac *pl330 = pch->dmac;
> +	int i;
> +
> +	mutex_lock(&pl330->rpm_lock);
> +
> +	for (i = 0; i < pl330->num_peripherals; i++) {
> +		if (pl330->peripherals[i].chan.slave == slave &&
> +		    pl330->peripherals[i].slave_link) {
> +			pch->slave_link = pl330->peripherals[i].slave_link;
> +			goto done;
> +		}
> +	}
> +
> +	pch->slave_link = device_link_add(slave, pl330->ddma.dev,
> +				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);

So you are going to add the link on channel allocation and tear it down on
free. I am not sure I really like the idea here.

First, these things shouldn't be handled in the drivers. They should be set
in the core; each driver setting up the links doesn't sound great to me.

Second, should the link always be there, so that we only manage its state?
Here it seems that the link is being created and destroyed, so why not mark
it ACTIVE and DORMANT instead...

Lastly, looking at the description of the issue here, I am perceiving (maybe
my understanding is not quite right) that you have an IP block in the SoC
which contains multiple things that share common stuff, and doing PM right
is a challenge for you, right?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-10  4:50         ` Vinod Koul
@ 2017-02-10 11:51           ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-10 11:51 UTC (permalink / raw)
  To: Vinod Koul, Rafael J. Wysocki, Ulf Hansson
  Cc: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm,
	linux-kernel, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

Hi Vinod,

On 2017-02-10 05:50, Vinod Koul wrote:
> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
>
>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
>> +{
>> +	struct dma_pl330_chan *pch = to_pchan(chan);
>> +	struct pl330_dmac *pl330 = pch->dmac;
>> +	int i;
>> +
>> +	mutex_lock(&pl330->rpm_lock);
>> +
>> +	for (i = 0; i < pl330->num_peripherals; i++) {
>> +		if (pl330->peripherals[i].chan.slave == slave &&
>> +		    pl330->peripherals[i].slave_link) {
>> +			pch->slave_link = pl330->peripherals[i].slave_link;
>> +			goto done;
>> +		}
>> +	}
>> +
>> +	pch->slave_link = device_link_add(slave, pl330->ddma.dev,
>> +				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
> So you are going to add the link on channel allocation and tear down on the
> freeup.

Right. Channel allocation is typically done once per driver operation and it
won't hurt system performance.

>   I am not sure I really like the idea here.

Could you point out what's wrong with it?

> First, these things shouldn't be handled in the drivers. These things should
> be set in core and each driver setting the links doesn't sound great to me.

Which core? And what's wrong with the device links? They have been introduced
to model relations between devices that go beyond the usual parent/child/bus
topology.

> Second, should the link be always there and we only manage the state? Here it
> seems that we have the link being created and destroyed, so why not mark it
> ACTIVE and DORMANT instead...

Link state is managed by the device core and should not be touched by the
drivers. It is related to both the provider and consumer driver states
(probed/not probed/etc).

Second, we would need to create those links first. The question is where to
create them then.

> Lastly, looking at the description of the issue here, I am perceiving (maybe my
> understanding is not quite right here) that you have an IP block in SoC
> which has multiple things that share common stuff, and doing PM right is a
> challenge for you, right?

Nope. Doing PM right in my SoC is not that complex and I would say it is
rather typical for any embedded stuff. It works fine (in terms of power
consumption reduction) when all drivers simply manage their runtime PM state
properly: if a device is not in use, its state is set to suspended and,
finally, the power domain gets turned off.

I've used device links for PM only because the current DMA engine API is
simply insufficient to implement it any other way.

I want to let a power domain, which contains a few devices, among them a
PL330 device, get turned off when there is no activity. Handling power
domain power on / off requires non-atomic context, which is typical for
runtime PM calls. For that I need to have non-irq-safe runtime PM
implemented for all devices that belong to that domain.

The problem with the PL330 driver is that it uses irq-safe runtime PM,
which, as stated in the patch description, doesn't bring much benefit. To
switch to standard (non-irq-safe) runtime PM, the pm_runtime calls have to
be done from a context which permits sleeping. The problem with the DMA
engine driver API is that most of its callbacks have to be IRQ-safe;
frankly, the only ones that may sleep are
device_{alloc,free}_chan_resources(), which more or less map to
dma_request_chan()/dma_release_channel() and friends. There are DMA engine
drivers which do runtime PM calls there (tegra20-apb-dma, sirf-dma, cppi41,
rcar-dmac), but this is not really efficient. DMA engine clients usually
allocate DMA channels during their probe() and keep them for the whole
driver life. In turn this is very similar to calling pm_runtime_get() in the
DMA engine driver probe(). The result of both approaches is that the DMA
engine device keeps its power domain enabled almost all the time. This
problem is also mentioned in the DMA engine TODO list that you pointed me to
yesterday.

To avoid the situation where the DMA engine driver blocks turning off the
power domain, and to avoid changing the DMA engine client API, I came up
with the approach based on device links PM. I don't want to duplicate the
description here, the details were in the patch description; however, if
you have any particular question about the details, let me know and I will
try to clarify it more.
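
As a rough illustration of the behaviour described above, the effect of a
runtime PM link can be modelled in plain userspace C. This is a sketch only:
model_dev, model_rpm_get(), model_rpm_put() and the other names are made up
for this example and are not the kernel's runtime PM or device links API.
The client (consumer) holds a link to the DMA controller (supplier);
resuming the client resumes the supplier first, and once the client suspends
again the supplier's usage count drops back to zero, so nothing keeps the
power domain on even though the channel stays allocated the whole time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names are kernel APIs. */
struct model_dev {
	int usage;                  /* models the runtime PM usage counter */
	struct model_dev *supplier; /* non-NULL when a PM link exists */
};

/* Resuming a consumer first resumes its supplier (if linked). */
static void model_rpm_get(struct model_dev *dev)
{
	if (dev->usage == 0 && dev->supplier)
		model_rpm_get(dev->supplier);
	dev->usage++;
}

/* Suspending a consumer drops the reference held on its supplier. */
static void model_rpm_put(struct model_dev *dev)
{
	assert(dev->usage > 0);
	dev->usage--;
	if (dev->usage == 0 && dev->supplier)
		model_rpm_put(dev->supplier);
}

static bool model_is_active(const struct model_dev *dev)
{
	return dev->usage > 0;
}

/*
 * Walk through one playback cycle. Returns true if the DMA controller
 * was active only while its client was active.
 */
static bool model_playback_cycle(void)
{
	struct model_dev dma = { 0, NULL };
	struct model_dev i2s = { 0, &dma }; /* link set up at channel request */
	bool ok = !model_is_active(&dma);    /* idle: domain may power off */

	model_rpm_get(&i2s);                 /* client starts using DMA */
	ok = ok && model_is_active(&dma);

	model_rpm_put(&i2s);                 /* client goes idle again */
	ok = ok && !model_is_active(&dma);

	return ok;
}
```

In this model the DMA controller itself never calls get/put from its
IRQ-safe callbacks; its activity simply follows that of the linked clients,
which is the property the link-based approach relies on.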

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
  2017-02-10  4:34         ` Vinod Koul
@ 2017-02-10 12:07           ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-10 12:07 UTC (permalink / raw)
  To: Vinod Koul, Ulf Hansson
  Cc: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm,
	linux-kernel, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Rafael J. Wysocki, Lars-Peter Clausen, Arnd Bergmann, Inki Dae

Hi Vinod,

On 2017-02-10 05:34, Vinod Koul wrote:
> On Thu, Feb 09, 2017 at 03:22:49PM +0100, Marek Szyprowski wrote:
>> Add two new callbacks to DMA engine device. They will be used to provide
>> access to slave device (the device which requested given DMA channel)
> You mean access to client devices?

Yes. It looks like I was confused by the code, where the term 'slave'
appears a few times. 'Client' is a bit more appropriate then.

>> for DMA engine driver. Access to slave device might be useful for example
>> for implementing advanced runtime power management.
>>
>> DMA slave channels are exclusive, so only one slave device can be set
>> for a given DMA slave channel.
> That is not the right assumption, and it is my worry here. With virt-dma we
> don't really assume a hardware channel or exclusivity. Certain implementations
> may do that, but from the framework we cannot assume that.

Okay, I came to that conclusion based on the dma engine code, but maybe
I missed something. However, in that case the callback will be called for
each client device and it will be up to the driver to handle that.

>> device_set_slave() will be called after the device_alloc_chan_resources()
>> and device_release_slave() before the device_free_chan_resources().
> Okay, I had to relook at the series to get around this part. Sorry but we
> can't call it set_slave, it is actually set_client/consumer

That's okay, the name of the callbacks should be changed.

> In our context slaves means dmaengine slave devices aka provider.
> Client would be the consumer and not slave.

I'm new to the DMA engine framework; I'm sorry for using the wrong terms.

>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> ---
>>   drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
>>   include/linux/dmaengine.h | 10 ++++++++++
>>   2 files changed, 34 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>> index 24e0221fd66d..5b7089d8be4d 100644
>> --- a/drivers/dma/dmaengine.c
>> +++ b/drivers/dma/dmaengine.c
>> @@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>   {
>>   	struct dma_device *d, *_d;
>>   	struct dma_chan *chan = NULL;
>> +	int ret;
>>   
>>   	/* If device-tree is present get slave info from here */
>>   	if (dev->of_node)
>> @@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>   		chan = acpi_dma_request_slave_chan_by_name(dev, name);
>>   
>>   	if (chan) {
>> -		/* Valid channel found or requester need to be deferred */
>> -		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
>> +		if (!IS_ERR(chan))
>> +			goto found;
>> +		if (PTR_ERR(chan) == -EPROBE_DEFER)
>>   			return chan;
>>   	}
>>   
>> @@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>   	}
>>   	mutex_unlock(&dma_list_mutex);
>>   
>> -	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
>> +	if (!chan)
>> +		return ERR_PTR(-EPROBE_DEFER);
>> +	if (IS_ERR(chan))
>> +		return chan;
>> +found:
>> +	if (chan->device->device_set_slave) {
>> +		chan->slave = dev;
>> +		ret = chan->device->device_set_slave(chan, dev);
>> +		if (ret) {
>> +			chan->slave = NULL;
>> +			dma_release_channel(chan);
>> +			chan = ERR_PTR(ret);
>> +		}
>> +	}
>> +	return chan;
>>   }
>>   EXPORT_SYMBOL_GPL(dma_request_chan);
>>   
>> @@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
>>   	mutex_lock(&dma_list_mutex);
>>   	WARN_ONCE(chan->client_count != 1,
>>   		  "chan reference count %d != 1\n", chan->client_count);
>> +	if (chan->slave) {
>> +		if (chan->device->device_release_slave)
>> +			chan->device->device_release_slave(chan);
>> +		chan->slave = NULL;
>> +	}
>>   	dma_chan_put(chan);
>>   	/* drop PRIVATE cap enabled by __dma_request_channel() */
>>   	if (--chan->device->privatecnt == 0)
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 533680860865..d22299e37e69 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -277,6 +277,9 @@ struct dma_chan {
>>   	struct dma_router *router;
>>   	void *route_data;
>>   
>> +	/* Only for SLAVE channels */
>> +	struct device *slave;
> so assuming you refer to consumer aka client here, why do we need set if we
> store it here.

The DMA engine driver might need to do something with it (like setting up a
PM link, for example) before starting any operations. It would be great if
the pointer to the client device were available in
device_alloc_chan_resources(), but propagating it there is not possible
without significant changes. That's why I came up with this separate callback.
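
For illustration, the ordering used by the request and release paths in the
quoted diff can be modelled in plain userspace C. This is a sketch only:
model_chan, model_ops and all helpers below are invented for this example
and are not the dmaengine API. The client callback runs after the channel
has been allocated; if it fails, the client pointer is cleared first, so the
release path does not call the release callback for a client that was never
attached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_chan;

/* Hypothetical per-driver hooks; both are optional. */
struct model_ops {
	int (*set_client)(struct model_chan *chan, void *client);
	void (*release_client)(struct model_chan *chan);
};

struct model_chan {
	const struct model_ops *ops;
	void *client;   /* set only while a client is attached */
	int allocated;  /* models the channel resources being allocated */
};

static int model_release_calls; /* counts release_client() invocations */

/* Release path: detach the client (if any), then free the channel. */
static void model_release(struct model_chan *chan)
{
	assert(chan && chan->ops);
	if (chan->client) {
		if (chan->ops->release_client)
			chan->ops->release_client(chan);
		chan->client = NULL;
	}
	chan->allocated = 0;
}

/* Request path: allocate first, then attach the client. */
static int model_request(struct model_chan *chan, void *client)
{
	int ret;

	assert(chan && chan->ops);
	chan->allocated = 1;
	if (!chan->ops->set_client)
		return 0;

	chan->client = client;
	ret = chan->ops->set_client(chan, client);
	if (ret) {
		chan->client = NULL; /* nothing to detach on rollback */
		model_release(chan);
	}
	return ret;
}

/* Two sample drivers: one accepts every client, one always refuses. */
static int model_set_ok(struct model_chan *chan, void *client)
{
	(void)chan;
	(void)client;
	return 0;
}

static int model_set_fail(struct model_chan *chan, void *client)
{
	(void)chan;
	(void)client;
	return -EINVAL;
}

static void model_count_release(struct model_chan *chan)
{
	(void)chan;
	model_release_calls++;
}

static const struct model_ops model_ok_ops = {
	model_set_ok, model_count_release
};
static const struct model_ops model_fail_ops = {
	model_set_fail, model_count_release
};
```

With model_ok_ops a request followed by a release invokes the release hook
exactly once; with model_fail_ops the request returns -EINVAL, the channel
ends up unallocated, and the release hook is never invoked.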

Maybe the client device shouldn't be stored in the dma_chan structure at all,
and it should be left to the drivers to use or manage it if really needed.
This would also solve the virt-dma issue you mentioned.

In the previous version I managed to pass the client device pointer to
device_alloc_chan_resources() via the of_xlate callback (please take a look
at v7), but that approach was rejected by Lars-Peter Clausen.

 > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-10 11:51           ` Marek Szyprowski
  (?)
@ 2017-02-10 13:57             ` Ulf Hansson
  -1 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-10 13:57 UTC (permalink / raw)
  To: Marek Szyprowski, Vinod Koul
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> Hi Vinod,
>
> On 2017-02-10 05:50, Vinod Koul wrote:
>>
>> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
>>
>>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
>>> +{
>>> +       struct dma_pl330_chan *pch = to_pchan(chan);
>>> +       struct pl330_dmac *pl330 = pch->dmac;
>>> +       int i;
>>> +
>>> +       mutex_lock(&pl330->rpm_lock);
>>> +
>>> +       for (i = 0; i < pl330->num_peripherals; i++) {
>>> +               if (pl330->peripherals[i].chan.slave == slave &&
>>> +                   pl330->peripherals[i].slave_link) {
>>> +                       pch->slave_link = pl330->peripherals[i].slave_link;
>>> +                       goto done;
>>> +               }
>>> +       }
>>> +
>>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
>>> +                                      DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
>>
>> So you are going to add the link on channel allocation and tear down on the
>> freeup.
>
>
> Right. Channel allocation is typically done once per driver operation and it
> won't hurt system performance.
>
>>   I am not sure I really like the idea here.
>
>
> Could you point out what's wrong with it?
>
>> First, these things shouldn't be handled in the drivers. These things should
>> be set in core and each driver setting the links doesn't sound great to me.
>
>
> Which core? And what's wrong with the device links? They have been introduced
> to model relations between devices that go beyond the usual parent/child/bus
> topology.

I think Vinod means the dmaengine core, which would also make perfect
sense to me as it would benefit all dma drivers.

The only related PM thing that should be the driver's decision is
whether it wants to enable runtime PM or not during ->probe().

>
>> Second, should the link be always there and we only manage the state? Here it
>> seems that we have the link being created and destroyed, so why not mark it
>> ACTIVE and DORMANT instead...
>
>
> Link state is managed by the device core and should not be touched by the
> drivers. It is related to both the provider and consumer driver states
> (probed/not probed/etc).
>
> Second, we would need to create those links first. The question is where to
> create them then.

Just to fill in, to me this is really also the key question.

If we could set up the device link already at device initialization,
it should also be possible to avoid getting -EPROBE_DEFER for dma
client drivers when requesting their dma channels.

>
>> Lastly, looking at the description of the issue here, I am perceiving (maybe my
>> understanding is not quite right here) that you have an IP block in SoC
>> which has multiple things that share common stuff, and doing PM right is a
>> challenge for you, right?
>
>
> Nope. Doing PM right in my SoC is not that complex and I would say it is
> rather typical for any embedded stuff. It works fine (in terms of power
> consumption reduction) when all drivers simply manage their runtime PM state
> properly: if a device is not in use, its state is set to suspended and,
> finally, the power domain gets turned off.
>
> I've used device links for PM only because the current DMA engine API is
> simply insufficient to implement it any other way.
>
> I want to let a power domain, which contains a few devices, among them a
> PL330 device, get turned off when there is no activity. Handling power
> domain power on / off requires non-atomic context, which is typical for
> runtime PM calls. For that I need to have non-irq-safe runtime PM
> implemented for all devices that belong to that domain.

Again, allow me to fill in. This issue exists for all ARM SoCs which
have a dma controller residing in a PM domain. I think that is quite
many.

Currently there is only one solution I have seen for this problem, and I
really dislike it. That is, each dma client driver releases/requests
its dma channel from its respective ->runtime_suspend|resume()
callbacks; then the dma driver can use the dma request/release hooks
to do pm_runtime_get|put(), which can then be non-irq-safe.

>
> The problem with the PL330 driver is that it uses irq-safe runtime PM,
> which, as stated in the patch description, doesn't bring much benefit. To
> switch to standard (non-irq-safe) runtime PM, the pm_runtime calls have to
> be done from a context which permits sleeping. The problem with the DMA
> engine driver API is that most of its callbacks have to be IRQ-safe;
> frankly, the only ones that may sleep are
> device_{alloc,free}_chan_resources(), which more or less map to
> dma_request_chan()/dma_release_channel() and friends. There are DMA engine
> drivers which do runtime PM calls there (tegra20-apb-dma, sirf-dma, cppi41,
> rcar-dmac), but this is not really efficient. DMA engine clients usually
> allocate DMA channels during their probe() and keep them for the whole
> driver life. In turn this is very similar to calling pm_runtime_get() in the
> DMA engine driver probe(). The result of both approaches is that the DMA
> engine device keeps its power domain enabled almost all the time. This
> problem is also mentioned in the DMA engine TODO list that you pointed me to
> yesterday.
>
> To avoid the situation where the DMA engine driver blocks turning off the
> power domain, and to avoid changing the DMA engine client API, I came up
> with the approach based on device links PM. I don't want to duplicate the
> description here, the details were in the patch description; however, if
> you have any particular question about the details, let me know and I will
> try to clarify it more.

So besides solving the irq-safe issue for the dma driver, using
device links has two additional advantages. I already mentioned the
-EPROBE_DEFER issue above.

The second is the runtime/system PM relations we get for free
by using the links. In other words, the dma driver/core doesn't need to
care about dealing with pm_runtime_get|put(), as that would be managed
by the dma client driver.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
@ 2017-02-10 13:57             ` Ulf Hansson
  0 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-10 13:57 UTC (permalink / raw)
  To: Marek Szyprowski, Vinod Koul
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> Hi Vinod,
>
> On 2017-02-10 05:50, Vinod Koul wrote:
>>
>> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
>>
>>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
>>> +{
>>> +       struct dma_pl330_chan *pch = to_pchan(chan);
>>> +       struct pl330_dmac *pl330 = pch->dmac;
>>> +       int i;
>>> +
>>> +       mutex_lock(&pl330->rpm_lock);
>>> +
>>> +       for (i = 0; i < pl330->num_peripherals; i++) {
>>> +               if (pl330->peripherals[i].chan.slave == slave &&
>>> +                   pl330->peripherals[i].slave_link) {
>>> +                       pch->slave_link =
>>> pl330->peripherals[i].slave_link;
>>> +                       goto done;
>>> +               }
>>> +       }
>>> +
>>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
>>> +                                      DL_FLAG_PM_RUNTIME |
>>> DL_FLAG_RPM_ACTIVE);
>>
>> So you are going to add the link on channel allocation and tear down on
>> the
>> freeup.
>
>
> Right. Channel allocation is typically done once per driver operation and it
> won't hurt system performance.
>
>>   I am not sure I really like the idea here.
>
>
> Could you point what's wrong with it?
>
>> First, these thing shouldn't be handled in the drivers. These things
>> should
>> be set in core and each driver setting the links doesn't sound great to
>> me.
>
>
> Which core? And what's wrong with the device links? They have been
> introduced to
> model relations between devices that are behind the usual parent/child/bus
> topology.

I think Vinod means the dmaengine core, which would also make perfect
sense to me as it would benefit all dma drivers.

The only related PM thing that should be the driver's decision is
whether it wants to enable runtime PM or not during ->probe().

>
>> Second, should the link be always there and we only mange the state? Here
>> it
>> seems that we have link being created and destroyed, so why not mark it
>> ACTIVE and DORMANT instead...
>
>
> Link state is managed by device core and should not be touched by the
> drivers.
> It is related to both provider and consumer drivers states (probed/not
> probed/etc).
>
> Second we would need to create those links first. The question is where to
> create them then.

Just to fill in, to me this is really also the key question.

If we could set up the device link already at device initialization,
it should also be possible to avoid getting -EPROBE_DEFER for dma
client drivers when requesting their dma channels.

>
>> Lastly, looking at the description of the issue here, I am perceiving
>> (maybe my understanding is not quite right here) that you have an IP block
>> in the SoC which has multiple things sharing common stuff, and doing PM
>> right is a challenge for you, right?
>
>
> Nope. Doing PM right in my SoC is not that complex and I would say it is
> rather typical for any embedded stuff. It works fine (in terms of power
> consumption reduction) when all drivers simply manage their runtime PM
> state properly: if a device is not in use, its state is set to suspended
> and, finally, the power domain gets turned off.
>
> I've used device links for PM only because the current DMA engine API is
> simply insufficient to implement it any other way.
>
> I want to let a power domain which contains a few devices, among them a
> PL330 device, get turned off when there is no activity. Handling power
> domain power on/off requires non-atomic context, which is typical for
> runtime PM calls. For that I need to have non-irq-safe runtime PM
> implemented for all devices that belong to that domain.

Again, allow me to fill in. This issue exists for all ARM SoCs which
have a DMA controller residing in a PM domain. I think that is quite
many.

The only solution I have seen for this problem so far is one I really
dislike: each DMA client driver requests/releases its DMA channel from
its ->runtime_suspend|resume() callbacks, and the DMA driver then uses
the DMA request/release hooks to do pm_runtime_get|put(), which can then
be non-irq-safe.

>
> The problem with the PL330 driver is that it uses irq-safe runtime PM,
> which, as stated in the patch description, doesn't bring much benefit. To
> switch to standard (non-irq-safe) runtime PM, the pm_runtime calls have to
> be made from a context which permits sleeping. The problem with the DMA
> engine driver API is that most of its callbacks have to be IRQ-safe;
> frankly, the only exceptions are device_{alloc,free}_chan_resources(),
> which more or less map to dma_request_chan()/dma_release_channel() and
> friends. There are DMA engine drivers which do runtime PM calls there
> (tegra20-apb-dma, sirf-dma, cppi41, rcar-dmac), but this is not really
> efficient. DMA engine clients usually allocate DMA channels during their
> probe() and keep them for the whole driver lifetime. In turn, this is very
> similar to calling pm_runtime_get() in the DMA engine driver's probe().
> The result of both approaches is that the DMA engine device keeps its
> power domain enabled almost all the time. This problem is also mentioned
> in the DMA engine TODO list that you pointed me to yesterday.
>
> To avoid the situation where the DMA engine driver blocks turning off the
> power domain, and to avoid changing the DMA engine client API, I came up
> with the approach based on device links PM. I don't want to duplicate the
> description here; the details were in the patch description. However, if
> you have any particular question about the details, let me know and I will
> try to clarify.

So besides solving the irq-safe issue for the DMA driver, using device
links has two additional advantages. I already mentioned the
-EPROBE_DEFER issue above.

The second is the runtime/system PM relations we get for free by using
the links. In other words, the DMA driver/core doesn't need to care
about dealing with pm_runtime_get|put(), as that would be managed by the
DMA client driver.
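
To illustrate the difference discussed above, here is a toy userspace
model. It is a sketch only, not kernel code: every toy_* name is made up
for this example, and the usage counter merely stands in for the runtime
PM usage count. It assumes that a PM_RUNTIME-style link makes the
supplier (the DMA controller) resume and suspend together with its
consumer (the DMA client), and compares that with taking a reference at
channel allocation and holding it until release.

```c
/* Toy userspace model, not kernel code. All toy_* names are made up. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
	int usage;                /* stands in for the runtime PM usage count */
	struct toy_dev *supplier; /* non-NULL models a PM_RUNTIME-style link */
};

bool toy_active(const struct toy_dev *d)
{
	return d->usage > 0;
}

/* Resume: the first reference on a consumer also resumes its supplier. */
void toy_get(struct toy_dev *d)
{
	if (d->usage == 0 && d->supplier)
		toy_get(d->supplier);
	d->usage++;
}

/* Suspend: dropping the last reference releases the supplier again. */
void toy_put(struct toy_dev *d)
{
	assert(d->usage > 0);
	d->usage--;
	if (d->usage == 0 && d->supplier)
		toy_put(d->supplier);
}

/* Link-based scheme: the DMA controller follows the client's activity. */
bool toy_dmac_idle_with_link(void)
{
	struct toy_dev dmac = { .usage = 0, .supplier = NULL };
	struct toy_dev client = { .usage = 0, .supplier = &dmac };

	toy_get(&client);          /* client starts a transfer */
	assert(toy_active(&dmac)); /* supplier was resumed with it */
	toy_put(&client);          /* client goes idle */
	return !toy_active(&dmac);
}

/* Reference taken at channel allocation and held until release. */
bool toy_dmac_idle_with_alloc_ref(void)
{
	struct toy_dev dmac = { .usage = 0, .supplier = NULL };
	struct toy_dev client = { .usage = 0, .supplier = NULL };

	toy_get(&dmac);   /* taken when the channel is allocated */
	toy_get(&client);
	toy_put(&client); /* client idle, channel still allocated */
	return !toy_active(&dmac);
}
```

In this model the controller can go idle only in the link-based case; in
the other case it stays active for as long as the channel is allocated,
which is the behaviour Marek describes for drivers doing runtime PM calls
in the channel allocation callbacks.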

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
  2017-02-10 12:07           ` Marek Szyprowski
@ 2017-02-13  1:42             ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13  1:42 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Ulf Hansson, linux-samsung-soc, dmaengine, linux-arm-kernel,
	linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

On Fri, Feb 10, 2017 at 01:07:41PM +0100, Marek Szyprowski wrote:
> Hi Vinod,
> 
> On 2017-02-10 05:34, Vinod Koul wrote:
> >On Thu, Feb 09, 2017 at 03:22:49PM +0100, Marek Szyprowski wrote:
> >>Add two new callbacks to DMA engine device. They will be used to provide
> >>access to slave device (the device which requested given DMA channel)
> >You mean access to client devices?
> 
> Yes. It looks that I was confused by the code, where the term 'slave'
> appears a few times. 'Client' is a bit more appropriate then.
> 
> >>for DMA engine driver. Access to slave device might be useful for example
> >>for implementing advanced runtime power management.
> >>
> >>DMA slave channels are exclusive, so only one slave device can be set
> >>for a given DMA slave channel.
> >That is not a right assumption and my worry here. With virt-dma we don't
> >really assume a hardware channel and exclusive. Certain implementation may
> >do that but from framework we cannot assume that.
> 
> Okay, I came to that conclusion based on the DMA engine code, but maybe
> I missed something. However, in such a case the callback will be called for
> each client device and it will be up to the driver to handle that.

That's right, but the assumption that we will have one physical channel
may not be true.

> >>device_set_slave() will be called after the device_alloc_chan_resources()
> >>and device_release_slave() before the device_free_chan_resources().
> >Okay, I had to relook at the series to get around this part. Sorry but we
> >can't call it set_slave, it is actually set_client/consumer
> 
> That's okay, the name of the callbacks should be changed.
> 
> >In our context slaves means dmaengine slave devices aka provider.
> >Client would be the consumer and not slave.
> 
> I'm new to the DMA engine framework; I'm sorry for using the wrong terms.

That's fine :-) we all learn incrementally.

> 
> >>Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>---
> >>  drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
> >>  include/linux/dmaengine.h | 10 ++++++++++
> >>  2 files changed, 34 insertions(+), 3 deletions(-)
> >>
> >>diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> >>index 24e0221fd66d..5b7089d8be4d 100644
> >>--- a/drivers/dma/dmaengine.c
> >>+++ b/drivers/dma/dmaengine.c
> >>@@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
> >>  {
> >>  	struct dma_device *d, *_d;
> >>  	struct dma_chan *chan = NULL;
> >>+	int ret;
> >>  	/* If device-tree is present get slave info from here */
> >>  	if (dev->of_node)
> >>@@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
> >>  		chan = acpi_dma_request_slave_chan_by_name(dev, name);
> >>  	if (chan) {
> >>-		/* Valid channel found or requester need to be deferred */
> >>-		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
> >>+		if (!IS_ERR(chan))
> >>+			goto found;
> >>+		if (PTR_ERR(chan) == -EPROBE_DEFER)
> >>  			return chan;
> >>  	}
> >>@@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
> >>  	}
> >>  	mutex_unlock(&dma_list_mutex);
> >>-	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
> >>+	if (!chan)
> >>+		return ERR_PTR(-EPROBE_DEFER);
> >>+	if (IS_ERR(chan))
> >>+		return chan;
> >>+found:
> >>+	if (chan->device->device_set_slave) {
> >>+		chan->slave = dev;
> >>+		ret = chan->device->device_set_slave(chan, dev);
> >>+		if (ret) {
> >>+			chan->slave = NULL;
> >>+			dma_release_channel(chan);
> >>+			chan = ERR_PTR(ret);
> >>+		}
> >>+	}
> >>+	return chan;
> >>  }
> >>  EXPORT_SYMBOL_GPL(dma_request_chan);
> >>@@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
> >>  	mutex_lock(&dma_list_mutex);
> >>  	WARN_ONCE(chan->client_count != 1,
> >>  		  "chan reference count %d != 1\n", chan->client_count);
> >>+	if (chan->slave) {
> >>+		if (chan->device->device_release_slave)
> >>+			chan->device->device_release_slave(chan);
> >>+		chan->slave = NULL;
> >>+	}
> >>  	dma_chan_put(chan);
> >>  	/* drop PRIVATE cap enabled by __dma_request_channel() */
> >>  	if (--chan->device->privatecnt == 0)
> >>diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> >>index 533680860865..d22299e37e69 100644
> >>--- a/include/linux/dmaengine.h
> >>+++ b/include/linux/dmaengine.h
> >>@@ -277,6 +277,9 @@ struct dma_chan {
> >>  	struct dma_router *router;
> >>  	void *route_data;
> >>+	/* Only for SLAVE channels */
> >>+	struct device *slave;
> >so assuming you refer to consumer aka client here, why do we need set if we
> >store it here.
> 
> DMA engine driver might need to do something with it (like setting up a pm
> link for example) before starting any operations. It would be great if the
> pointer to client device were available in device_alloc_chan_resources(), but
> propagating it there is not possible without significant changes. That's why
> I came up with this separate callback.

But then it gets the client device using the callback as well. So if we
retain that, this should go away.

> Maybe the client device shouldn't be stored in the dma_chan structure at all
> and left to the drivers to use or manage it if really needed. This will also
> solve the issue with virt-dma you have mentioned.
> 
> In the previous version I managed to pass client device pointer to
> device_alloc_chan_resources() via of_xlate callback (please take a look into
> v7), but that approach was rejected by Lars-Peter Clausen.

I feel this is the better approach; perhaps we don't need the client
pointer here.
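
For reference, the request/rollback flow of the quoted dma_request_chan()
change can be sketched in plain userspace C. This is only a toy model
under stated assumptions, not the dmaengine API: all toy_* names are
hypothetical, the optional set_client/release_client callbacks stand in
for the proposed device_{set,release}_slave, and a plain int error code
replaces ERR_PTR().

```c
/* Toy userspace model, not kernel code. All toy_* names are made up. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_chan;

struct toy_ops {
	int (*set_client)(struct toy_chan *chan, const char *client); /* optional */
	void (*release_client)(struct toy_chan *chan);                /* optional */
};

struct toy_chan {
	const struct toy_ops *ops;
	const char *client; /* client recorded for the channel, NULL if none */
	int in_use;
};

int toy_release_calls; /* counts release_client invocations */

void toy_release_chan(struct toy_chan *chan)
{
	if (chan->client) {
		if (chan->ops->release_client)
			chan->ops->release_client(chan);
		chan->client = NULL;
	}
	chan->in_use = 0;
}

int toy_request_chan(struct toy_chan *chan, const char *client)
{
	int ret;

	if (chan->in_use)
		return -EBUSY;
	chan->in_use = 1;

	if (chan->ops->set_client) {
		chan->client = client;
		ret = chan->ops->set_client(chan, client);
		if (ret) {
			/* Clear first so the release path skips release_client. */
			chan->client = NULL;
			toy_release_chan(chan);
			return ret;
		}
	}
	return 0;
}

int toy_set_client_ok(struct toy_chan *chan, const char *client)
{
	(void)chan;
	(void)client;
	return 0;
}

int toy_set_client_fail(struct toy_chan *chan, const char *client)
{
	(void)chan;
	(void)client;
	return -ENODEV;
}

void toy_count_release(struct toy_chan *chan)
{
	(void)chan;
	toy_release_calls++;
}

/* Returns the error from a request whose set_client callback fails. */
int toy_demo_failed_request(void)
{
	const struct toy_ops ops = { toy_set_client_fail, toy_count_release };
	struct toy_chan chan = { &ops, NULL, 0 };
	int before = toy_release_calls;
	int ret = toy_request_chan(&chan, "i2s");

	/* The channel was rolled back without calling release_client. */
	assert(!chan.in_use && !chan.client && toy_release_calls == before);
	return ret;
}

/* Returns how many times release_client ran for one request/release pair. */
int toy_demo_ok_request(void)
{
	const struct toy_ops ops = { toy_set_client_ok, toy_count_release };
	struct toy_chan chan = { &ops, NULL, 0 };
	int before = toy_release_calls;

	assert(toy_request_chan(&chan, "i2s") == 0);
	toy_release_chan(&chan);
	return toy_release_calls - before;
}
```

The point of the sketch is the ordering in the error path: the client
pointer is cleared before the channel is released, so the release
callback only ever runs for a client that was successfully set.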

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-10 13:57             ` Ulf Hansson
  (?)
@ 2017-02-13  2:03               ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13  2:03 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On Fri, Feb 10, 2017 at 02:57:09PM +0100, Ulf Hansson wrote:
> On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> > Hi Vinod,
> >
> > On 2017-02-10 05:50, Vinod Koul wrote:
> >>
> >> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
> >>
> >>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
> >>> +{
> >>> +       struct dma_pl330_chan *pch = to_pchan(chan);
> >>> +       struct pl330_dmac *pl330 = pch->dmac;
> >>> +       int i;
> >>> +
> >>> +       mutex_lock(&pl330->rpm_lock);
> >>> +
> >>> +       for (i = 0; i < pl330->num_peripherals; i++) {
> >>> +               if (pl330->peripherals[i].chan.slave == slave &&
> >>> +                   pl330->peripherals[i].slave_link) {
> >>> +                       pch->slave_link =
> >>> +                               pl330->peripherals[i].slave_link;
> >>> +                       goto done;
> >>> +               }
> >>> +       }
> >>> +
> >>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
> >>> +                                      DL_FLAG_PM_RUNTIME |
> >>> +                                      DL_FLAG_RPM_ACTIVE);
> >>
> >> So you are going to add the link on channel allocation and tear down on
> >> the
> >> freeup.
> >
> >
> > Right. Channel allocation is typically done once per driver operation and it
> > won't hurt system performance.
> >
> >>   I am not sure I really like the idea here.
> >
> >
> > Could you point what's wrong with it?
> >
> >> First, these thing shouldn't be handled in the drivers. These things
> >> should
> >> be set in core and each driver setting the links doesn't sound great to
> >> me.
> >
> >
> > Which core? And what's wrong with the device links? They have been
> > introduced to
> > model relations between devices that are behind the usual parent/child/bus
> > topology.
> 
> I think Vinod mean the dmaengine core. Which also would make perfect
> sense to me as it would benefit all dma drivers.

Right.

> The only related PM thing, that shall be the decision of the driver,
> is whether it wants to enable runtime PM or not, during ->probe().

We can call pm_runtime_enabled() to check that and act only when it is
enabled.

> >> Second, should the link be always there and we only mange the state? Here
> >> it
> >> seems that we have link being created and destroyed, so why not mark it
> >> ACTIVE and DORMANT instead...
> >
> >
> > Link state is managed by device core and should not be touched by the
> > drivers.
> > It is related to both provider and consumer drivers states (probed/not
> > probed/etc).
> >
> > Second we would need to create those links first. The question is where to
> > create them then.
> 
> Just to fill in, to me this is really also the key question.
> 
> If we could set up the device link already at device initialization,
> it should also be possible to avoid getting -EPROBE_DEFER for dma
> client drivers when requesting their dma channels.

Well, if we defer, then the driver will register with dmaengine after it is
probed, so a client will either get a channel or not. IOW we won't get
-EPROBE_DEFER.

> 
> >
> >> Lastly, looking at th description of the issue here, am perceiving (maybe
> >> my
> >> understanding is not quite right here) that you have an IP block in SoC
> >> which has multiple things and share common stuff and doing right PM is a
> >> challenge for you, right?
> >
> >
> > Nope. Doing right PM in my SoC is not that complex and I would say it is
> > rather
> > typical for any embedded stuff. It works fine (in terms of the power
> > consumption reduction) when all drivers simply properly manage their runtime
> > PM state, thus if device is not in use, the state is set to suspended and
> > finally, the power domain gets turned off.
> >
> > I've used device links for PM only because the current DMA engine API is
> > simply insufficient to implement it in the other way.
> >
> > I want to let a power domain, which contains a few devices, among those a
> > PL330
> > device, to get turned off when there is no activity. Handling power domain
> > power
> > on / off requires non-atomic context, what is typical for runtime pm calls.
> > For
> > that I need to have non-irq-safe runtime pm implemented for all devices that
> > belongs to that domains.
> 
> Again, allow me to fill in. This issue exists for all ARM SoC which
> has a dma controller residing in a PM domain. I think that is quite
> many.
> 
> Currently the only solution I have seen for this problem, but which I
> really dislike. That is, each dma client driver requests/releases
> their dma channel from their respective ->runtime_suspend|resume()
> callbacks - then the dma driver can use the dma request/release hooks,
> to do pm_runtime_get|put() which then becomes non-irq-safe.

Yeah, that is not the best way to do it. But looking at it, the current
approach doesn't seem like the best fit either.

So on seeing device_link_add() I was thinking that this is some
SoC-dependent problem being solved, whereas the problem statement is
non-atomic channel prepare.

As I said earlier, if we want to solve that problem, a better idea is to
actually split the prepare as we discussed in [1].

This way we can get a non-atomic descriptor allocate/prepare and release.
Yes, we need to redesign the APIs to solve this, but if you guys are up for
it, I think we can do it and avoid any further roundabouts :)

> > The problem with PL330 driver is that it use irq-safe runtime pm, which like
> > it
> > was stated in the patch description doesn't bring much benefits. To switch
> > to
> > standard (non-irq-safe) runtime pm, the pm_runtime calls have to be done
> > from
> > a context which permits sleeping. The problem with DMA engine driver API is
> > that
> > most of its callbacks have to be IRQ-safe and frankly only
> > device_{alloc,release}_chan_resources() what more or less maps to
> > dma_request_chan()/dma_release_channel() and friends. There are DMA engine
> > drivers which do runtime PM calls there (tegra20-apb-dma, sirf-dma, cppi41,
> > rcar-dmac), but this is not really efficient. DMA engine clients usually
> > allocate
> > dma channel during their probe() and keep them for the whole driver life. In
> > turn
> > this very similar to calling pm_runtime_get() in the DMA engine driver
> > probe().
> > The result of both approaches is that DMA engine device keeps its power
> > domain
> > enabled almost all the time. This problem is also mentioned in the DMA
> > engine
> > TODO list, you have pointed me yesterday.
> >
> > To avoid such situation that DMA engine driver blocks turning off the power
> > domain and avoid changing DMA engine client API I came up with the device
> > links
> > pm based approach. I don't want to duplicate the description here, the
> > details
> > were in the patch description, however if you have any particular question
> > about
> > the details, let me know and I will try to clarify it more.
> 
> So besides solving the irq-safe issue for dma driver, using the
> device-links has additionally two advantages. I already mentioned the
> -EPROBE_DEFER issue above.
> 
> The second thing, is the runtime/system PM relations we get for free
> by using the links. In other words, the dma driver/core don't need to
> care about dealing with pm_runtime_get|put() as that would be managed
> by the dma client driver.

Yeah, sorry, it took me a while to figure that out :). If we do a different
API, then the dmaengine core can call pm_runtime_get|put() from non-atomic
context.

[1]: http://www.spinics.net/lists/dmaengine/msg11570.html

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
@ 2017-02-13  2:03               ` Vinod Koul
  0 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13  2:03 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On Fri, Feb 10, 2017 at 02:57:09PM +0100, Ulf Hansson wrote:
> On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> > Hi Vinod,
> >
> > On 2017-02-10 05:50, Vinod Koul wrote:
> >>
> >> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
> >>
> >>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
> >>> +{
> >>> +       struct dma_pl330_chan *pch = to_pchan(chan);
> >>> +       struct pl330_dmac *pl330 = pch->dmac;
> >>> +       int i;
> >>> +
> >>> +       mutex_lock(&pl330->rpm_lock);
> >>> +
> >>> +       for (i = 0; i < pl330->num_peripherals; i++) {
> >>> +               if (pl330->peripherals[i].chan.slave == slave &&
> >>> +                   pl330->peripherals[i].slave_link) {
> >>> +                       pch->slave_link =
> >>> pl330->peripherals[i].slave_link;
> >>> +                       goto done;
> >>> +               }
> >>> +       }
> >>> +
> >>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
> >>> +                                      DL_FLAG_PM_RUNTIME |
> >>> DL_FLAG_RPM_ACTIVE);
> >>
> >> So you are going to add the link on channel allocation and tear down on
> >> the
> >> freeup.
> >
> >
> > Right. Channel allocation is typically done once per driver operation and it
> > won't hurt system performance.
> >
> >>   I am not sure I really like the idea here.
> >
> >
> > Could you point what's wrong with it?
> >
> >> First, these thing shouldn't be handled in the drivers. These things
> >> should
> >> be set in core and each driver setting the links doesn't sound great to
> >> me.
> >
> >
> > Which core? And what's wrong with the device links? They have been
> > introduced to model relations between devices that are beyond the usual
> > parent/child/bus topology.
> 
> I think Vinod means the dmaengine core. Which also would make perfect
> sense to me as it would benefit all dma drivers.

Right.

> The only related PM thing, that shall be the decision of the driver,
> is whether it wants to enable runtime PM or not, during ->probe().

We can call pm_runtime_enabled() to check that, and act only when it is
enabled.

> >> Second, should the link be always there and we only manage the state?
> >> Here it seems that we have link being created and destroyed, so why not
> >> mark it ACTIVE and DORMANT instead...
> >
> >
> > Link state is managed by device core and should not be touched by the
> > drivers. It is related to both provider and consumer drivers states
> > (probed/not probed/etc).
> >
> > Second we would need to create those links first. The question is where to
> > create them then.
> 
> Just to fill in, to me this is really also the key question.
> 
> If we could set up the device link already at device initialization,
> it should also be possible to avoid getting -EPROBE_DEFER for dma
> client drivers when requesting their dma channels.

Well, if we defer then the driver will register with dmaengine after it is
probed, so a client will either get a channel or not. IOW we won't get
-EPROBE_DEFER.

> 
> >
> >> Lastly, looking at the description of the issue here, I am perceiving
> >> (maybe my understanding is not quite right here) that you have an IP
> >> block in SoC which has multiple things and share common stuff and doing
> >> right PM is a challenge for you, right?
> >
> >
> > Nope. Doing right PM in my SoC is not that complex and I would say it is
> > rather typical for any embedded stuff. It works fine (in terms of the
> > power consumption reduction) when all drivers simply properly manage
> > their runtime PM state, thus if device is not in use, the state is set to
> > suspended and finally, the power domain gets turned off.
> >
> > I've used device links for PM only because the current DMA engine API is
> > simply insufficient to implement it in the other way.
> >
> > I want to let a power domain, which contains a few devices, among those a
> > PL330 device, get turned off when there is no activity. Handling power
> > domain power on / off requires non-atomic context, which is typical for
> > runtime pm calls. For that I need to have non-irq-safe runtime pm
> > implemented for all devices that belong to that domain.
> 
> Again, allow me to fill in. This issue exists for all ARM SoC which
> has a dma controller residing in a PM domain. I think that is quite
> many.
> 
> Currently the only solution I have seen for this problem, but which I
> really dislike. That is, each dma client driver requests/releases
> their dma channel from their respective ->runtime_suspend|resume()
> callbacks - then the dma driver can use the dma request/release hooks,
> to do pm_runtime_get|put() which then becomes non-irq-safe.

Yeah, that is not the best way to do it. But looking at it, the current one
doesn't seem the best fit either.

So on seeing the device_link_add() I was thinking that this is some SoC
dependent problem being solved, whereas the problem statement is non-atomic
channel prepare.

As I said earlier, if we want to solve that problem a better idea is to
actually split the prepare as we discussed in [1].

This way we can get a non-atomic descriptor allocate/prepare and release.
Yes, we need to redesign the APIs to solve this, but if you guys are up for
it, I think we can do it and avoid any further roundabouts :)

> > The problem with the PL330 driver is that it uses irq-safe runtime pm,
> > which, as stated in the patch description, doesn't bring much benefit. To
> > switch to standard (non-irq-safe) runtime pm, the pm_runtime calls have to
> > be done from a context which permits sleeping. The problem with the DMA
> > engine driver API is that most of its callbacks have to be IRQ-safe; the
> > only exceptions are device_{alloc,release}_chan_resources(), which more or
> > less map to dma_request_chan()/dma_release_channel() and friends. There
> > are DMA engine drivers which do runtime PM calls there (tegra20-apb-dma,
> > sirf-dma, cppi41, rcar-dmac), but this is not really efficient. DMA engine
> > clients usually allocate dma channels during their probe() and keep them
> > for the whole driver life. In turn this is very similar to calling
> > pm_runtime_get() in the DMA engine driver probe(). The result of both
> > approaches is that the DMA engine device keeps its power domain enabled
> > almost all the time. This problem is also mentioned in the DMA engine TODO
> > list, which you pointed me to yesterday.
> >
> > To avoid the situation where the DMA engine driver blocks turning off the
> > power domain, and to avoid changing the DMA engine client API, I came up
> > with the device links pm based approach. I don't want to duplicate the
> > description here, the details were in the patch description; however, if
> > you have any particular question about the details, let me know and I will
> > try to clarify it more.
> 
> So besides solving the irq-safe issue for dma driver, using the
> device-links has additionally two advantages. I already mentioned the
> -EPROBE_DEFER issue above.
> 
> The second thing, is the runtime/system PM relations we get for free
> by using the links. In other words, the dma driver/core don't need to
> care about dealing with pm_runtime_get|put() as that would be managed
> by the dma client driver.

Yeah, sorry, it took me a while to figure that out :). If we do a different
API then dmaengine core can call pm_runtime_get|put() from non-atomic context.

[1]: http://www.spinics.net/lists/dmaengine/msg11570.html

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13  2:03               ` Vinod Koul
  (?)
@ 2017-02-13 11:11                 ` Ulf Hansson
  -1 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-13 11:11 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

>>
>> If we could set up the device link already at device initialization,
>> it should also be possible to avoid getting -EPROBE_DEFER for dma
>> client drivers when requesting their dma channels.
>
> Well if we defer then driver will register with dmaengine after it is
> probed, so a client will either get a channel or not. IOW we won't get
> -EPROBE_DEFER.

I didn't quite get this. What do you mean by "if we defer..."?

Defer into *what* and defer of *what*?  Could you please elaborate.

[...]

>>
>> Again, allow me to fill in. This issue exists for all ARM SoC which
>> has a dma controller residing in a PM domain. I think that is quite
>> many.
>>
>> Currently the only solution I have seen for this problem, but which I
>> really dislike. That is, each dma client driver requests/releases
>> their dma channel from their respective ->runtime_suspend|resume()
>> callbacks - then the dma driver can use the dma request/release hooks,
>> to do pm_runtime_get|put() which then becomes non-irq-safe.
>
> Yeah that is not the best way to do. But looking at it current one doesn't
> seem best fit either.
>
> So on seeing the device_link_add() I was thinking that this is some SoC
> dependent problem being solved whereas the problem statement is non-atomic
> channel prepare.

You may be right.

Although, I don't know of other examples, besides the runtime PM use
case, where non-atomic channel prepare/unprepare would make sense. Do
you?

>
> As I said earlier, if we want to solve that problem a better idea is to
> actually split the prepare as we discussed in [1]
>
> This way we can get a non atomic descriptor allocate/prepare and release.
> Yes we need to redesign the APIs to solve this, but if you guys are up for
> it, I think we can do it and avoid any further round abouts :)

Adding/re-designing dma APIs is a viable option to solve the runtime PM case.

Changes would be needed for all related dma client drivers as well,
although if that's what we need to do - let's do it.

[...]

>>
>> So besides solving the irq-safe issue for dma driver, using the
>> device-links has additionally two advantages. I already mentioned the
>> -EPROBE_DEFER issue above.
>>
>> The second thing, is the runtime/system PM relations we get for free
>> by using the links. In other words, the dma driver/core don't need to
>> care about dealing with pm_runtime_get|put() as that would be managed
>> by the dma client driver.
>
> Yeah sorry took me a while to figure that out :), If we do a different API
> then dmaengine core can call pm_runtime_get|put() from non-atomic context.

Yes, it can and this works from runtime PM point of view. But the
following issues would remain unsolved.

1)
Dependencies between dma drivers and dma client drivers during system
PM. For example, a dma client driver needs the dma controller to be
operational (remain system resumed), until the dma client driver
itself becomes system suspended.

The *only* currently available solution for this is to try to system
suspend the dma controller later than the dma client, by using the
*late or the *noirq system PM callbacks. This works for most cases,
but it becomes a problem when the dma client also needs to be system
suspended at the *late or the *noirq phase. Clearly this solution
doesn't scale.

Using device links explicitly solves this problem, as they allow this
dependency between devices to be specified.

2)
We won't prevent dma clients from getting -EPROBE_DEFER when requesting
their dma channels in their ->probe() routines. Avoiding that would be
possible if we could set up the device links at device initialization.
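
The ordering dependency in point 1) can be sketched with a small userspace
model. This is not kernel code: struct model_dev, model_has_active_consumer()
and model_suspend_all() are hypothetical names made up for illustration, and
the model assumes each device has at most one supplier. It only shows the
rule a consumer/supplier link expresses for system suspend, namely that a
supplier (the dma controller) is suspended only after every consumer (the dma
client) has been suspended, regardless of the order the devices are listed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model type, not the kernel's struct device. */
struct model_dev {
	const char *name;
	struct model_dev *supplier;	/* NULL if the device has no supplier */
	bool suspended;
};

/* Return true if a device that is still running depends on dev. */
static bool model_has_active_consumer(const struct model_dev *dev,
				      struct model_dev *const *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i]->supplier == dev && !devs[i]->suspended)
			return true;
	return false;
}

/*
 * Suspend all devices, consumers before their suppliers, and record the
 * order in which they were suspended. Returns the number of devices
 * suspended; this is less than n only if the links form a cycle.
 */
static size_t model_suspend_all(struct model_dev *const *devs, size_t n,
				const struct model_dev **order)
{
	size_t done = 0;
	bool progress = true;

	assert(devs && order);

	while (done < n && progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			struct model_dev *dev = devs[i];

			if (dev->suspended ||
			    model_has_active_consumer(dev, devs, n))
				continue;
			dev->suspended = true;
			order[done++] = dev;
			progress = true;
		}
	}
	return done;
}
```

With a dma controller listed first and an audio client naming it as supplier,
the model suspends the client first and the controller second, which is the
ordering the *late/*noirq workaround tries to obtain by hand.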

>
> [1]: http://www.spinics.net/lists/dmaengine/msg11570.html
>
> --
> ~Vinod

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-10 13:57             ` Ulf Hansson
  (?)
@ 2017-02-13 11:45               ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-13 11:45 UTC (permalink / raw)
  To: Ulf Hansson, Vinod Koul
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

Hi Ulf,

On 2017-02-10 14:57, Ulf Hansson wrote:
> On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>> On 2017-02-10 05:50, Vinod Koul wrote:
>>> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
>>>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
>>>> +{
>>>> +       struct dma_pl330_chan *pch = to_pchan(chan);
>>>> +       struct pl330_dmac *pl330 = pch->dmac;
>>>> +       int i;
>>>> +
>>>> +       mutex_lock(&pl330->rpm_lock);
>>>> +
>>>> +       for (i = 0; i < pl330->num_peripherals; i++) {
>>>> +               if (pl330->peripherals[i].chan.slave == slave &&
>>>> +                   pl330->peripherals[i].slave_link) {
>>>> +                       pch->slave_link = pl330->peripherals[i].slave_link;
>>>> +                       goto done;
>>>> +               }
>>>> +       }
>>>> +
>>>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
>>>> +                                      DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
>>> So you are going to add the link on channel allocation and tear down on
>>> the freeup.
>>
>> Right. Channel allocation is typically done once per driver operation and it
>> won't hurt system performance.
>>
>>>    I am not sure I really like the idea here.
>>
>> Could you point out what's wrong with it?
>>
>>> First, these things shouldn't be handled in the drivers. These things
>>> should be set in core and each driver setting the links doesn't sound
>>> great to me.
>>
>> Which core? And what's wrong with the device links? They have been
>> introduced to model relations between devices that are beyond the usual
>> parent/child/bus topology.
> I think Vinod means the dmaengine core. Which also would make perfect
> sense to me as it would benefit all dma drivers.
>
> The only related PM thing, that shall be the decision of the driver,
> is whether it wants to enable runtime PM or not, during ->probe().

So do you want to create the links during the DMAengine driver probe? How do
you plan to find all the client devices? Please note that you really want to
create links only to devices which will really use the DMA engine calls. Some
client drivers might decide at runtime whether to use the DMA engine or not,
depending on other data.

>>> Second, should the link be always there and we only manage the state?
>>> Here it seems that we have link being created and destroyed, so why not
>>> mark it ACTIVE and DORMANT instead...
>>
>> Link state is managed by device core and should not be touched by the
>> drivers. It is related to both provider and consumer drivers states
>> (probed/not probed/etc).
>>
>> Second we would need to create those links first. The question is where to
>> create them then.
> Just to fill in, to me this is really also the key question.
>
> If we could set up the device link already at device initialization,
> it should also be possible to avoid getting -EPROBE_DEFER for dma
> client drivers when requesting their dma channels.

At first glance this sounds like an ultimate solution for all problems,
but I don't think that device links can be used this way. If I get it right,
you would like to create links on client device initialization, preferably
somewhere in the kernel driver core. This will be handled somehow by
completely generic code, which will create a link for each pair of devices
which are connected by a phandle. Is this what you meant? Please note that
at that time neither the client nor the provider driver has been probed.
IMHO that doesn't look like the right generic approach.

How will that code get the following information:
1. is it really needed to create a link for given device pair?
2. what link flags should it use?
3. what about circular dependencies?
4. what about runtime optional dependencies?
5. what about non-DT platforms? ACPI?

This looks like another never-ending story of "how can we avoid deferred
probe in a generic way". IMHO we should first solve the problem of irq-safe
runtime PM in DMA engine drivers. I proposed how it can be done with device
links, with no changes in the client API. Later, if one decides to extend the
client API in a way that allows another runtime PM implementation, I see no
problem with converting the pl330 driver to the new approach, but for the
time being this would be the easiest way to get it really functional.
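
For readers following along, the runtime PM effect of such a link can be
sketched as a small userspace model. This is not the kernel API: struct
rpm_model_dev, rpm_model_get() and rpm_model_put() are hypothetical names, and
the model collapses the real runtime PM state machine into a usage counter.
It only illustrates the behaviour relied on here: once a client (consumer) is
linked to the DMA controller (supplier), resuming the client from its own
sleepable context resumes the controller first, and suspending the client
lets the controller suspend again, with no PM calls in the controller's
IRQ-safe callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model type, not the kernel's struct device. */
struct rpm_model_dev {
	const char *name;
	int usage;			/* models the runtime PM usage counter */
	bool active;			/* true while the device is powered */
	struct rpm_model_dev *supplier;	/* set when a link to a supplier exists */
};

/* Take a reference; on the first one, resume the supplier and then dev. */
static void rpm_model_get(struct rpm_model_dev *dev)
{
	if (dev->usage++ == 0) {
		if (dev->supplier)
			rpm_model_get(dev->supplier);
		dev->active = true;
	}
}

/* Drop a reference; on the last one, suspend dev and release the supplier. */
static void rpm_model_put(struct rpm_model_dev *dev)
{
	assert(dev->usage > 0);
	if (--dev->usage == 0) {
		dev->active = false;
		if (dev->supplier)
			rpm_model_put(dev->supplier);
	}
}
```

In this model an I2S client linked to a PL330 controller keeps the controller
active only between its own get and put, so the controller no longer holds
the power domain on just because a channel is allocated.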

>>> Lastly, looking at the description of the issue here, I am perceiving
>>> (maybe my understanding is not quite right here) that you have an IP
>>> block in SoC which has multiple things and share common stuff and doing
>>> right PM is a challenge for you, right?
>>
>> Nope. Doing right PM in my SoC is not that complex and I would say it is
>> rather typical for any embedded stuff. It works fine (in terms of the
>> power consumption reduction) when all drivers simply properly manage
>> their runtime PM state, thus if device is not in use, the state is set to
>> suspended and finally, the power domain gets turned off.
>>
>> I've used device links for PM only because the current DMA engine API is
>> simply insufficient to implement it in the other way.
>>
>> I want to let a power domain, which contains a few devices, among those a
>> PL330 device, get turned off when there is no activity. Handling power
>> domain power on / off requires non-atomic context, which is typical for
>> runtime pm calls. For that I need to have non-irq-safe runtime pm
>> implemented for all devices that belong to that domain.
> Again, allow me to fill in. This issue exists for all ARM SoCs which
> have a dma controller residing in a PM domain. I think that is quite
> many.
>
> Currently there is only one solution I have seen for this problem, and I
> really dislike it. That is, each dma client driver requests/releases
> its dma channel from its respective ->runtime_suspend|resume()
> callbacks - then the dma driver can use the dma request/release hooks
> to do pm_runtime_get|put(), which then becomes non-irq-safe.
>
>> The problem with the PL330 driver is that it uses irq-safe runtime PM, which,
>> as stated in the patch description, doesn't bring much benefit. To switch to
>> standard (non-irq-safe) runtime PM, the pm_runtime calls have to be made from
>> a context which permits sleeping. The problem with the DMA engine driver API
>> is that most of its callbacks have to be IRQ-safe; frankly, the only
>> exceptions are device_{alloc,free}_chan_resources(), which more or less map
>> to dma_request_chan()/dma_release_channel() and friends. There are DMA engine
>> drivers which do runtime PM calls there (tegra20-apb-dma, sirf-dma, cppi41,
>> rcar-dmac), but this is not really efficient. DMA engine clients usually
>> allocate a dma channel during their probe() and keep it for the whole driver
>> life. In turn, this is very similar to calling pm_runtime_get() in the DMA
>> engine driver probe(). The result of both approaches is that the DMA engine
>> device keeps its power domain enabled almost all the time. This problem is
>> also mentioned in the DMA engine TODO list, which you pointed me to yesterday.
>>
>> To avoid the situation where the DMA engine driver blocks turning off the
>> power domain, and to avoid changing the DMA engine client API, I came up with
>> the device-links-based PM approach. I don't want to duplicate the description
>> here; the details were in the patch description. However, if you have any
>> particular question about the details, let me know and I will try to clarify.
> So besides solving the irq-safe issue for the dma driver, using device
> links has two additional advantages. I already mentioned the
> -EPROBE_DEFER issue above.

Not really. IMHO device links can be properly established once both drivers
are probed...

>
> The second thing, is the runtime/system PM relations we get for free
> by using the links. In other words, the dma driver/core don't need to
> care about dealing with pm_runtime_get|put() as that would be managed
> by the dma client driver.

IMHO there might be drivers which don't want to use device-links-based runtime
PM in favor of irq-safe PM or something else. This should really be left to
the drivers.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
  2017-02-13  1:42             ` Vinod Koul
@ 2017-02-13 11:48               ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-13 11:48 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Ulf Hansson, linux-samsung-soc, dmaengine, linux-arm-kernel,
	linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

Hi Vinod,

On 2017-02-13 02:42, Vinod Koul wrote:
> On Fri, Feb 10, 2017 at 01:07:41PM +0100, Marek Szyprowski wrote:
>> Hi Vinod,
>>
>> On 2017-02-10 05:34, Vinod Koul wrote:
>>> On Thu, Feb 09, 2017 at 03:22:49PM +0100, Marek Szyprowski wrote:
>>>> Add two new callbacks to DMA engine device. They will be used to provide
>>>> access to slave device (the device which requested given DMA channel)
>>> You mean access to client devices?
>> Yes. It looks like I was confused by the code, where the term 'slave'
>> appears a few times. 'Client' is a bit more appropriate then.
>>
>>>> for DMA engine driver. Access to slave device might be useful for example
>>>> for implementing advanced runtime power management.
>>>>
>>>> DMA slave channels are exclusive, so only one slave device can be set
>>>> for a given DMA slave channel.
>>> That is not a right assumption and my worry here. With virt-dma we don't
>>> really assume a hardware channel, nor exclusivity. Certain implementations
>>> may do that, but in the framework we cannot assume that.
>> Okay, I came to that conclusion based on the dma engine code, but maybe
>> I missed something. However, in such a case the callback will be called for
>> each client device and it will be up to the driver to handle that.
> That's right, but the assumption that we will have one physical channel
> may not be true.
>
>>>> device_set_slave() will be called after the device_alloc_chan_resources()
>>>> and device_release_slave() before the device_free_chan_resources().
>>> Okay, I had to relook at the series to get around this part. Sorry but we
>>> can't call it set_slave, it is actually set_client/consumer
>> That's okay, the name of the callbacks should be changed.
>>
>>> In our context slaves means dmaengine slave devices aka provider.
>>> Client would be the consumer and not slave.
>> I'm new to the DMA engine framework, I'm sorry for using the wrong terms.
> That's fine :-) we all learn incrementally.
>
>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> ---
>>>>   drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
>>>>   include/linux/dmaengine.h | 10 ++++++++++
>>>>   2 files changed, 34 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>>>> index 24e0221fd66d..5b7089d8be4d 100644
>>>> --- a/drivers/dma/dmaengine.c
>>>> +++ b/drivers/dma/dmaengine.c
>>>> @@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   {
>>>>   	struct dma_device *d, *_d;
>>>>   	struct dma_chan *chan = NULL;
>>>> +	int ret;
>>>>   	/* If device-tree is present get slave info from here */
>>>>   	if (dev->of_node)
>>>> @@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   		chan = acpi_dma_request_slave_chan_by_name(dev, name);
>>>>   	if (chan) {
>>>> -		/* Valid channel found or requester need to be deferred */
>>>> -		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
>>>> +		if (!IS_ERR(chan))
>>>> +			goto found;
>>>> +		if (PTR_ERR(chan) == -EPROBE_DEFER)
>>>>   			return chan;
>>>>   	}
>>>> @@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   	}
>>>>   	mutex_unlock(&dma_list_mutex);
>>>> -	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
>>>> +	if (!chan)
>>>> +		return ERR_PTR(-EPROBE_DEFER);
>>>> +	if (IS_ERR(chan))
>>>> +		return chan;
>>>> +found:
>>>> +	if (chan->device->device_set_slave) {
>>>> +		chan->slave = dev;
>>>> +		ret = chan->device->device_set_slave(chan, dev);
>>>> +		if (ret) {
>>>> +			chan->slave = NULL;
>>>> +			dma_release_channel(chan);
>>>> +			chan = ERR_PTR(ret);
>>>> +		}
>>>> +	}
>>>> +	return chan;
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(dma_request_chan);
>>>> @@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
>>>>   	mutex_lock(&dma_list_mutex);
>>>>   	WARN_ONCE(chan->client_count != 1,
>>>>   		  "chan reference count %d != 1\n", chan->client_count);
>>>> +	if (chan->slave) {
>>>> +		if (chan->device->device_release_slave)
>>>> +			chan->device->device_release_slave(chan);
>>>> +		chan->slave = NULL;
>>>> +	}
>>>>   	dma_chan_put(chan);
>>>>   	/* drop PRIVATE cap enabled by __dma_request_channel() */
>>>>   	if (--chan->device->privatecnt == 0)
>>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>>>> index 533680860865..d22299e37e69 100644
>>>> --- a/include/linux/dmaengine.h
>>>> +++ b/include/linux/dmaengine.h
>>>> @@ -277,6 +277,9 @@ struct dma_chan {
>>>>   	struct dma_router *router;
>>>>   	void *route_data;
>>>> +	/* Only for SLAVE channels */
>>>> +	struct device *slave;
>>> so assuming you refer to consumer aka client here, why do we need set if we
>>> store it here.
>> The DMA engine driver might need to do something with it (like setting up a
>> PM link, for example) before starting any operations. It would be great if the
>> pointer to the client device were available in device_alloc_chan_resources(),
>> but propagating it there is not possible without significant changes. That's
>> why I came up with this separate callback.
> But then it gets the client device using the callback as well. So if we
> retain that, this should go away.

Yes, that would be an alternative solution to set/clear_client().
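
For readers following the quoted dmaengine.c hunk, its request path (store
the client, call the optional callback, roll back on failure) can be modelled
in user space. This is a hedged sketch, not the kernel code: all toy_* names
are hypothetical, the error-pointer helpers only imitate the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() convention, and TOY_EPROBE_DEFER is defined
locally because EPROBE_DEFER is not part of the user-space errno.h.

```c
/* Toy user-space model of the quoted request path; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO    4095
#define TOY_EPROBE_DEFER 517  /* local stand-in for the kernel-internal value */

/* Imitations of the kernel's error-pointer helpers. */
static void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_chan {
	void *client;   /* plays the role of chan->slave in the patch */
	int released;   /* set once the channel has been given back */
	int (*set_client)(struct toy_chan *chan, void *client); /* optional */
};

static void toy_release(struct toy_chan *chan)
{
	chan->client = NULL;
	chan->released = 1;
}

/* 'found' is the result of the lookup: NULL, an error pointer or a channel. */
static struct toy_chan *toy_request_chan(struct toy_chan *found, void *client)
{
	int ret;

	if (!found)
		return toy_err_ptr(-TOY_EPROBE_DEFER);
	if (toy_is_err(found))
		return found;

	if (found->set_client) {
		found->client = client;
		ret = found->set_client(found, client);
		if (ret) {
			/* Clear the client first so release sees no client. */
			found->client = NULL;
			toy_release(found);
			return toy_err_ptr(ret);
		}
	}
	return found;
}

/* Two sample callbacks: one succeeds, one refuses the client. */
static int toy_set_ok(struct toy_chan *chan, void *client)
{
	(void)chan; (void)client;
	return 0;
}

static int toy_set_fail(struct toy_chan *chan, void *client)
{
	(void)chan; (void)client;
	return -EBUSY;
}
```

A caller would set .set_client to toy_set_ok or toy_set_fail and check the
result with toy_is_err(), mirroring how dma_request_chan() callers use
IS_ERR() on the returned channel.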

>> Maybe the client device shouldn't be stored in the dma_chan structure at all
>> and left to the drivers to use or manage it if really needed. This will also
>> solve the issue with virt-dma you have mentioned.
>>
>> In the previous version I managed to pass client device pointer to
>> device_alloc_chan_resources() via of_xlate callback (please take a look into
>> v7), but that approach was rejected by Lars-Peter Clausen.
> I feel this is better approach, perhaps we don't need the client pointer
> here..

Then this is exactly what was implemented in v7 of this patchset. Could you
then take a look at it? Or do you want me to resend it as v9?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v8 1/3] dmaengine: Add new device_{set,release}_slave callbacks
@ 2017-02-13 11:48               ` Marek Szyprowski
  0 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-13 11:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Vinod,

On 2017-02-13 02:42, Vinod Koul wrote:
> On Fri, Feb 10, 2017 at 01:07:41PM +0100, Marek Szyprowski wrote:
>> Hi Vinod,
>>
>> On 2017-02-10 05:34, Vinod Koul wrote:
>>> On Thu, Feb 09, 2017 at 03:22:49PM +0100, Marek Szyprowski wrote:
>>>> Add two new callbacks to DMA engine device. They will used to provide
>>>> access to slave device (the device which requested given DMA channel)
>>> You mean access to client devices?
>> Yes. It looks that I was confused by the code, where the term 'slave'
>> appears a few times. 'Client' is a bit more appropriate then.
>>
>>>> for DMA engine driver. Access to slave device might be useful for example
>>>> for implementing advanced runtime power management.
>>>>
>>>> DMA slave channels are exclusive, so only one slave device can be set
>>>> for a given DMA slave channel.
>>> That is not a right assumption and my worry here. With virt-dma we don't
>>> really assume a hardware channel and exclusive. Certain implementation may
>>> do that but from framework we cannot assume that.
>> Okay, I came to such conclusion basing one the dma engine code, but maybe
>> I missed something. However in such case such callback will be called for
>> each client device and it will be up to the driver to handle that.
> Thats right, but the assumption that we will have once physical channel
> maynot be true.
>
>>>> device_set_slave() will be called after the device_alloc_chan_resources()
>>>> and device_release_slave() before the device_free_chan_resources().
>>> Okay, I had to relook at the series to get around this part. Sorry but we
>>> can't call it set_slave, it is actually set_client/consumer
>> That's okay, the name of the callbacks should be changed.
>>
>>> In our context slaves means dmaengine slave devices aka provider.
>>> Client would be the consumer and not slave.
>> I'm a new to the DMA engine framework, I'm sorry for using wrong terms.
> That's fine :-) we all learn incrementally.
>
>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> ---
>>>>   drivers/dma/dmaengine.c   | 27 ++++++++++++++++++++++++---
>>>>   include/linux/dmaengine.h | 10 ++++++++++
>>>>   2 files changed, 34 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>>>> index 24e0221fd66d..5b7089d8be4d 100644
>>>> --- a/drivers/dma/dmaengine.c
>>>> +++ b/drivers/dma/dmaengine.c
>>>> @@ -705,6 +705,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   {
>>>>   	struct dma_device *d, *_d;
>>>>   	struct dma_chan *chan = NULL;
>>>> +	int ret;
>>>>   	/* If device-tree is present get slave info from here */
>>>>   	if (dev->of_node)
>>>> @@ -715,8 +716,9 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   		chan = acpi_dma_request_slave_chan_by_name(dev, name);
>>>>   	if (chan) {
>>>> -		/* Valid channel found or requester need to be deferred */
>>>> -		if (!IS_ERR(chan) || PTR_ERR(chan) == -EPROBE_DEFER)
>>>> +		if (!IS_ERR(chan))
>>>> +			goto found;
>>>> +		if (PTR_ERR(chan) == -EPROBE_DEFER)
>>>>   			return chan;
>>>>   	}
>>>> @@ -738,7 +740,21 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
>>>>   	}
>>>>   	mutex_unlock(&dma_list_mutex);
>>>> -	return chan ? chan : ERR_PTR(-EPROBE_DEFER);
>>>> +	if (!chan)
>>>> +		return ERR_PTR(-EPROBE_DEFER);
>>>> +	if (IS_ERR(chan))
>>>> +		return chan;
>>>> +found:
>>>> +	if (chan->device->device_set_slave) {
>>>> +		chan->slave = dev;
>>>> +		ret = chan->device->device_set_slave(chan, dev);
>>>> +		if (ret) {
>>>> +			chan->slave = NULL;
>>>> +			dma_release_channel(chan);
>>>> +			chan = ERR_PTR(ret);
>>>> +		}
>>>> +	}
>>>> +	return chan;
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(dma_request_chan);
>>>> @@ -786,6 +802,11 @@ void dma_release_channel(struct dma_chan *chan)
>>>>   	mutex_lock(&dma_list_mutex);
>>>>   	WARN_ONCE(chan->client_count != 1,
>>>>   		  "chan reference count %d != 1\n", chan->client_count);
>>>> +	if (chan->slave) {
>>>> +		if (chan->device->device_release_slave)
>>>> +			chan->device->device_release_slave(chan);
>>>> +		chan->slave = NULL;
>>>> +	}
>>>>   	dma_chan_put(chan);
>>>>   	/* drop PRIVATE cap enabled by __dma_request_channel() */
>>>>   	if (--chan->device->privatecnt == 0)
>>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>>>> index 533680860865..d22299e37e69 100644
>>>> --- a/include/linux/dmaengine.h
>>>> +++ b/include/linux/dmaengine.h
>>>> @@ -277,6 +277,9 @@ struct dma_chan {
>>>>   	struct dma_router *router;
>>>>   	void *route_data;
>>>> +	/* Only for SLAVE channels */
>>>> +	struct device *slave;
>>> so assuming you refer to consumer aka client here, why do we need set if we
>>> store it here.
>> DMA engine driver might need to do something with it (like setting up a pm
>> link for example) before starting any operations. It would be great if the
>> pointer to client device is available in device_alloc_chan_resources(), but
>> propagating it there is not possible without significant changes. That's why
>> I came with this a separate callback.
> But then it gets the client device using the callback as well. So if we
> retain that, this should go away.

Yes, that would be an alternative solution to set/clear_client().

>> Maybe the client device shouldn't be stored in the dma_chan structure at all
>> and left to the drivers to use or manage it if really needed. This will also
>> solve the issue with virt-dma you have mentioned.
>>
>> In the previous version I managed to pass client device pointer to
>> device_alloc_chan_resources() via of_xlate callback (please take a look into
>> v7), but that approach was rejected by Lars-Peter Clausen.
> I feel this is better approach, perhaps we don't need the client pointer
> here..

Then this is exactly what was implemented in v7 of this patchset. Could you
then take a look at it? Or do you want me to resend it as v9?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13  2:03               ` Vinod Koul
  (?)
@ 2017-02-13 12:01                 ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-13 12:01 UTC (permalink / raw)
  To: Vinod Koul, Ulf Hansson
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

Hi Vinod,

On 2017-02-13 03:03, Vinod Koul wrote:
> On Fri, Feb 10, 2017 at 02:57:09PM +0100, Ulf Hansson wrote:
>> On 10 February 2017 at 12:51, Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>>> On 2017-02-10 05:50, Vinod Koul wrote:
>>>> On Thu, Feb 09, 2017 at 03:22:51PM +0100, Marek Szyprowski wrote:
>>>>> +static int pl330_set_slave(struct dma_chan *chan, struct device *slave)
>>>>> +{
>>>>> +       struct dma_pl330_chan *pch = to_pchan(chan);
>>>>> +       struct pl330_dmac *pl330 = pch->dmac;
>>>>> +       int i;
>>>>> +
>>>>> +       mutex_lock(&pl330->rpm_lock);
>>>>> +
>>>>> +       for (i = 0; i < pl330->num_peripherals; i++) {
>>>>> +               if (pl330->peripherals[i].chan.slave == slave &&
>>>>> +                   pl330->peripherals[i].slave_link) {
>>>>> +                       pch->slave_link = pl330->peripherals[i].slave_link;
>>>>> +                       goto done;
>>>>> +               }
>>>>> +       }
>>>>> +
>>>>> +       pch->slave_link = device_link_add(slave, pl330->ddma.dev,
>>>>> +                                      DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
>>>> So you are going to add the link on channel allocation and tear down on
>>>> the freeup.
>>>
>>> Right. Channel allocation is typically done once per driver operation and it
>>> won't hurt system performance.
>>>
>>>>    I am not sure I really like the idea here.
>>>
>>> Could you point what's wrong with it?
>>>
>>>> First, these thing shouldn't be handled in the drivers. These things
>>>> should
>>>> be set in core and each driver setting the links doesn't sound great to
>>>> me.
>>>
>>> Which core? And what's wrong with the device links? They have been
>>> introduced to
>>> model relations between devices that are behind the usual parent/child/bus
>>> topology.
>> I think Vinod mean the dmaengine core. Which also would make perfect
>> sense to me as it would benefit all dma drivers.
> Right.
>
>> The only related PM thing, that shall be the decision of the driver,
>> is whether it wants to enable runtime PM or not, during ->probe().
> We can do pm_runtime_enabled() to check and that and do when enabled..

Another subtle issue is that there can be only one link between two devices,
while it is common to request more than one channel per client device (for
example "tx" and "rx"). This can be handled by internal reference counting.

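To illustrate the reference counting mentioned above, here is a minimal
userspace sketch. It is only a model under stated assumptions: all model_*
names are hypothetical and not part of the dmaengine or driver core API, and
the comments mark where a real driver would call device_link_add() and
device_link_del().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of the model_* names are kernel API. */
struct model_link {
	const void *client;	/* consumer device the link belongs to */
	int refcount;		/* channels of that client sharing the link */
};

#define MODEL_MAX_LINKS 8
static struct model_link model_links[MODEL_MAX_LINKS];

/* Called once per requested channel: reuse the client's link or create it. */
static struct model_link *model_link_get(const void *client)
{
	struct model_link *free_slot = NULL;
	int i;

	for (i = 0; i < MODEL_MAX_LINKS; i++) {
		if (model_links[i].refcount > 0 &&
		    model_links[i].client == client) {
			model_links[i].refcount++;	/* e.g. "rx" after "tx" */
			return &model_links[i];
		}
		if (!free_slot && model_links[i].refcount == 0)
			free_slot = &model_links[i];
	}
	if (!free_slot)
		return NULL;

	/* A real driver would call device_link_add() at this point. */
	free_slot->client = client;
	free_slot->refcount = 1;
	return free_slot;
}

/* Called once per released channel; returns 1 when the last user is gone. */
static int model_link_put(struct model_link *link)
{
	assert(link && link->refcount > 0);
	if (--link->refcount > 0)
		return 0;

	/* A real driver would call device_link_del() at this point. */
	link->client = NULL;
	return 1;
}
```

With this model, requesting "tx" and "rx" for the same client returns the same
link with a count of two, and the link goes away only when the second channel
is released.
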
>>>> Second, should the link be always there and we only manage the state?
>>>> Here it seems that we have link being created and destroyed, so why not
>>>> mark it ACTIVE and DORMANT instead...
>>>
>>> Link state is managed by device core and should not be touched by the
>>> drivers.
>>> It is related to both provider and consumer drivers states (probed/not
>>> probed/etc).
>>>
>>> Second we would need to create those links first. The question is where to
>>> create them then.
>> Just to fill in, to me this is really also the key question.
>>
>> If we could set up the device link already at device initialization,
>> it should also be possible to avoid getting -EPROBE_DEFER for dma
>> client drivers when requesting their dma channels.
> Well if we defer then driver will register with dmaengine after it is
> probed, so a client will either get a channel or not. IOW we won't get
> -EPROBE_DEFER.

I don't see how this would work. IMHO the link should be created when the
client driver requests the channel, because otherwise we end up with links
that might never be used (for example, when DMA usage is optional, the link
would still force the DMA controller into the active state even if the client
device doesn't use DMA at all). So if a client requests a channel for the
first time and the DMA engine has not been probed yet, there is no way to
avoid -EPROBE_DEFER.
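
The ordering argument can be sketched with a small userspace model. This is
only an illustration under stated assumptions: the model_* types and functions
are hypothetical and not kernel API, and MODEL_EPROBE_DEFER merely mirrors the
value of the kernel's EPROBE_DEFER. The link is created only on a successful
channel request, so a client that never uses DMA never pins the controller,
and a request made before the provider is probed can only be deferred.

```c
#include <stdbool.h>

/* Same numeric value as the kernel's EPROBE_DEFER; the name is hypothetical. */
#define MODEL_EPROBE_DEFER 517

struct model_provider {
	bool probed;	/* has the DMA controller driver been probed? */
	int links;	/* clients currently linked to the controller */
};

struct model_client {
	bool wants_dma;	/* DMA usage is optional for this client */
	bool has_link;	/* link to the provider already exists */
};

/* Channel request: defer if there is no provider yet, else link on demand. */
static int model_request_chan(struct model_client *c, struct model_provider *p)
{
	if (!p->probed)
		return -MODEL_EPROBE_DEFER;

	if (!c->has_link) {
		/* A real implementation would create the device link here. */
		c->has_link = true;
		p->links++;
	}
	return 0;
}

/* Client probe: a client that does not use DMA never requests a channel. */
static int model_client_probe(struct model_client *c, struct model_provider *p)
{
	if (!c->wants_dma)
		return 0;	/* no request, so no link and no PM dependency */

	return model_request_chan(c, p);
}
```

In this model a client without DMA probes fine and leaves the provider with no
links, while a DMA client gets the deferral until the provider shows up.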

>>>> Lastly, looking at the description of the issue here, I am perceiving
>>>> (maybe my understanding is not quite right here) that you have an IP
>>>> block in SoC which has multiple things and share common stuff and doing
>>>> right PM is a challenge for you, right?
>>>
>>> Nope. Doing PM right on my SoC is not that complex and I would say it is
>>> rather typical for any embedded stuff. It works fine (in terms of reducing
>>> power consumption) when all drivers simply manage their runtime PM state
>>> properly: if a device is not in use, its state is set to suspended and
>>> finally the power domain gets turned off.
>>>
>>> I've used device links for PM only because the current DMA engine API is
>>> simply insufficient to implement it in the other way.
>>>
>>> I want to let a power domain, which contains a few devices, among them a
>>> PL330 device, get turned off when there is no activity. Powering a domain
>>> on / off requires non-atomic context, which is typical for runtime PM
>>> calls. For that I need non-irq-safe runtime PM implemented for all devices
>>> that belong to that domain.
>> Again, allow me to fill in. This issue exists for all ARM SoC which
>> has a dma controller residing in a PM domain. I think that is quite
>> many.
>>
>> Currently the only solution I have seen for this problem, but which I
>> really dislike. That is, each dma client driver requests/releases
>> their dma channel from their respective ->runtime_suspend|resume()
>> callbacks - then the dma driver can use the dma request/release hooks,
>> to do pm_runtime_get|put() which then becomes non-irq-safe.
> Yeah that is not the best way to do. But looking at it current one doesn't
> seem best fit either.
>
> So on seeing the device_link_add() I was thinking that this is some SoC
> dependent problem being solved whereas the problem statement is non-atomic
> channel prepare.
>
> As I said earlier, if we want to solve that problem a better idea is to
> actually split the prepare as we discussed in [1]
>
> This way we can get a non atomic descriptor allocate/prepare and release.
> Yes we need to redesign the APIs to solve this, but if you guys are up for
> it, I think we can do it and avoid any further round abouts :)

I also agree that the main problem here is the lack of a non-atomic call for
preparing the channel. However, I don't feel I'm the right person to rewrite
all the existing DMA engine drivers and clients for the new API. :/

>>> The problem with the PL330 driver is that it uses irq-safe runtime PM,
>>> which, as stated in the patch description, doesn't bring much benefit. To
>>> switch to standard (non-irq-safe) runtime PM, the pm_runtime calls have to
>>> be done from a context which permits sleeping. The problem with the DMA
>>> engine driver API is that most of its callbacks have to be IRQ-safe; the
>>> only exceptions are device_{alloc,release}_chan_resources(), which more or
>>> less map to dma_request_chan()/dma_release_channel() and friends. There are
>>> DMA engine drivers which do runtime PM calls there (tegra20-apb-dma,
>>> sirf-dma, cppi41, rcar-dmac), but this is not really efficient. DMA engine
>>> clients usually allocate DMA channels during their probe() and keep them
>>> for the whole driver life. In turn, this is very similar to calling
>>> pm_runtime_get() in the DMA engine driver probe(). The result of both
>>> approaches is that the DMA engine device keeps its power domain enabled
>>> almost all the time. This problem is also mentioned in the DMA engine TODO
>>> list you pointed me to yesterday.
>>>
>>> To avoid the situation where the DMA engine driver blocks turning off the
>>> power domain, and to avoid changing the DMA engine client API, I came up
>>> with the approach based on device links PM. I don't want to duplicate the
>>> description here; the details are in the patch description. However, if
>>> you have any particular question about the details, let me know and I will
>>> try to clarify.
>> So besides solving the irq-safe issue for dma driver, using the
>> device-links has additionally two advantages. I already mentioned the
>> -EPROBE_DEFER issue above.
>>
>> The second thing, is the runtime/system PM relations we get for free
>> by using the links. In other words, the dma driver/core don't need to
>> care about dealing with pm_runtime_get|put() as that would be managed
>> by the dma client driver.
> Yeah sorry took me a while to figure that out :), If we do a different API
> then dmaengine core can call pm_runtime_get|put() from non-atomic context.
>
> [1]: http://www.spinics.net/lists/dmaengine/msg11570.html
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 11:11                 ` Ulf Hansson
  (?)
@ 2017-02-13 12:15                   ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-13 12:15 UTC (permalink / raw)
  To: Ulf Hansson, Vinod Koul
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

Hi Ulf,

On 2017-02-13 12:11, Ulf Hansson wrote:
>>> If we could set up the device link already at device initialization,
>>> it should also be possible to avoid getting -EPROBE_DEFER for dma
>>> client drivers when requesting their dma channels.
>> Well if we defer then driver will register with dmaengine after it is
>> probed, so a client will either get a channel or not. IOW we won't get
>> -EPROBE_DEFER.
> I didn't quite get this. What do you mean by "if we defer..."?
>
> Defer into *what* and defer of *what*?  Could you please elaborate.
>
> [...]
>
>>> Again, allow me to fill in. This issue exists for all ARM SoC which
>>> has a dma controller residing in a PM domain. I think that is quite
>>> many.
>>>
>>> Currently the only solution I have seen for this problem, but which I
>>> really dislike. That is, each dma client driver requests/releases
>>> their dma channel from their respective ->runtime_suspend|resume()
>>> callbacks - then the dma driver can use the dma request/release hooks,
>>> to do pm_runtime_get|put() which then becomes non-irq-safe.
>> Yeah that is not the best way to do. But looking at it current one doesn't
>> seem best fit either.
>>
>> So on seeing the device_link_add() I was thinking that this is some SoC
>> dependent problem being solved whereas the problem statement is non-atomic
>> channel prepare.
> You may be right.
>
> Although, I don't know of other examples, besides the runtime PM use
> case, where non-atomic channel prepare/unprepare would make sense. Do
> you?

Changing GFP_ATOMIC to GFP_KERNEL in some calls in the DMA engine drivers
would also be a nice present for the memory management subsystem, since
there is no real reason to drain the atomic pools there.

>> As I said earlier, if we want to solve that problem a better idea is to
>> actually split the prepare as we discussed in [1]
>>
>> This way we can get a non atomic descriptor allocate/prepare and release.
>> Yes we need to redesign the APIs to solve this, but if you guys are up for
>> it, I think we can do it and avoid any further round abouts :)
> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
>
> Changes would be needed for all related dma client drivers as well,
> although if that's what we need to do - let's do it.
>
> [...]
>
>>> So besides solving the irq-safe issue for dma driver, using the
>>> device-links has additionally two advantages. I already mentioned the
>>> -EPROBE_DEFER issue above.
>>>
>>> The second thing, is the runtime/system PM relations we get for free
>>> by using the links. In other words, the dma driver/core don't need to
>>> care about dealing with pm_runtime_get|put() as that would be managed
>>> by the dma client driver.
>> Yeah sorry took me a while to figure that out :), If we do a different API
>> then dmaengine core can call pm_runtime_get|put() from non-atomic context.
> Yes, it can and this works from runtime PM point of view. But the
> following issues would remain unsolved.
>
> 1)
> Dependencies between dma drivers and dma client drivers during system
> PM. For example, a dma client driver needs the dma controller to be
> operational (remain system resumed), until the dma client driver
> itself becomes system suspended.
>
> The *only* currently available solution for this, is to try to system
> suspend the dma controller later than the dma client, via using the
> *late or the *noirq system PM callbacks. This works for most cases,
> but it becomes a problem when the dma client also needs to be system
> suspended at the *late or the *noirq phase. Clearly this solution
> doesn't scale.
>
> Using device links explicitly solves this problem as it allows to
> specify this dependency between devices.

Frankly, creating device links would then have to be added to EVERY
subsystem that involves getting access to resources provided by another
device. More or less, this applies to all kernel frameworks that provide
some kind of ABC_get_XYZ(dev, ...) function (like clk_get, phy_get,
dma_chan_get, ...). Sounds like a topic for another loooong discussion.
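
To illustrate what "adding links to every getter" would mean, here is a
minimal sketch as a toy userspace C model. It is not kernel code and every
name in it (link_dev, link_res_get, link_rpm_get, link_rpm_put) is made up
for illustration. It assumes each consumer has at most one supplier and that
a consumer's first runtime PM reference also takes one on the supplier.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code: every name here is hypothetical. */
struct link_dev {
	const char *name;
	int usage;                  /* models the runtime PM usage count */
	struct link_dev *supplier;  /* at most one link, for brevity */
};

/* Stand-in for an ABC_get_XYZ(dev, ...) helper that also records the link. */
struct link_dev *link_res_get(struct link_dev *consumer,
			      struct link_dev *supplier)
{
	assert(consumer != supplier);
	consumer->supplier = supplier;
	return supplier;
}

/* The first reference on a consumer also takes one on its supplier. */
void link_rpm_get(struct link_dev *dev)
{
	if (dev->usage++ == 0 && dev->supplier)
		link_rpm_get(dev->supplier);
}

/* Dropping the last reference releases the supplier as well. */
void link_rpm_put(struct link_dev *dev)
{
	assert(dev->usage > 0);
	if (--dev->usage == 0 && dev->supplier)
		link_rpm_put(dev->supplier);
}
```

In this model the supplier never has to call get/put on its own behalf, but
every getter in every framework would need the equivalent of link_res_get().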

> 2)
> We won't avoid dma clients from getting -EPROBE_DEFER when requesting
> their dma channels in their ->probe() routines. This would be
> possible, if we can set up the device links at device initialization.

The question is which core (DMA engine? kernel device subsystem?) should
set up those links, and how to find all clients before they call
dma_chan_get().

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 11:11                 ` Ulf Hansson
  (?)
@ 2017-02-13 12:27                   ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13 12:27 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On Mon, Feb 13, 2017 at 12:11:54PM +0100, Ulf Hansson wrote:
> >>
> >> If we could set up the device link already at device initialization,
> >> it should also be possible to avoid getting -EPROBE_DEFER for dma
> >> client drivers when requesting their dma channels.
> >
> > Well if we defer then driver will register with dmaengine after it is
> > probed, so a client will either get a channel or not. IOW we won't get
> > -EPROBE_DEFER.
> 
> I didn't quite get this. What do you mean by "if we defer..."?
> 
> Defer into *what* and defer of *what*?  Could you please elaborate.

Never mind, I think the part below is much more interesting now.

> >> Again, allow me to fill in. This issue exists for all ARM SoC which
> >> has a dma controller residing in a PM domain. I think that is quite
> >> many.
> >>
> >> Currently the only solution I have seen for this problem, but which I
> >> really dislike. That is, each dma client driver requests/releases
> >> their dma channel from their respective ->runtime_suspend|resume()
> >> callbacks - then the dma driver can use the dma request/release hooks,
> >> to do pm_runtime_get|put() which then becomes non-irq-safe.
> >
> > Yeah that is not the best way to do it. But looking at it, the current
> > one doesn't seem the best fit either.
> >
> > So on seeing the device_link_add() I was thinking that this is some SoC
> > dependent problem being solved whereas the problem statement is non-atomic
> > channel prepare.
> 
> You may be right.
> 
> Although, I don't know of other examples, besides the runtime PM use
> case, where non-atomic channel prepare/unprepare would make sense. Do
> you?

The primary ask for that has been to enable runtime_pm for drivers. It's not
a new ask, but we somehow haven't gotten around to doing it.

> > As I said earlier, if we want to solve that problem a better idea is to
> > actually split the prepare as we discussed in [1]
> >
> > This way we can get a non atomic descriptor allocate/prepare and release.
> > Yes we need to redesign the APIs to solve this, but if you guys are up for
> > it, I think we can do it and avoid any further round abouts :)
> 
> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
> 
> Changes would be needed for all related dma client drivers as well,
> although if that's what we need to do - let's do it.

Yes, but do bear in mind that some cases do need atomic prepare. The primary
use cases for DMA were designed with that in mind, as was submitting the next
transaction from the callback (tasklet) context, so that won't go away.

It would help in other cases where clients know that they will not be in
atomic context. For those we would provide an additional non-atomic
"allocation" step followed by prepare, so that drivers can split the work
between the two and people can do runtime_pm and other things.
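
A rough sketch of such a split, written as a toy userspace C model rather
than dmaengine code. All names (toy_chan, toy_desc, toy_chan_alloc_descs,
toy_chan_prep, toy_chan_complete, toy_chan_release_descs) are hypothetical,
and power_refs only stands in for a runtime PM reference. The assumption is
that the client reserves its descriptors once from sleepable context and
that the prepare step then only takes from that reserve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model; all names are hypothetical, not dmaengine API. */
struct toy_desc {
	struct toy_desc *next;
	size_t len;
};

struct toy_chan {
	struct toy_desc *free_list;  /* descriptors reserved in advance */
	int power_refs;              /* stands in for a runtime PM reference */
};

/*
 * Non-atomic step: may sleep, so it can power up and call malloc().
 * On failure the caller still calls toy_chan_release_descs().
 */
int toy_chan_alloc_descs(struct toy_chan *c, unsigned int count)
{
	c->power_refs++;
	while (count--) {
		struct toy_desc *d = malloc(sizeof(*d));

		if (!d)
			return -1;
		d->len = 0;
		d->next = c->free_list;
		c->free_list = d;
	}
	return 0;
}

/* Atomic step: only pops a reserved descriptor, never allocates or sleeps. */
struct toy_desc *toy_chan_prep(struct toy_chan *c, size_t len)
{
	struct toy_desc *d = c->free_list;

	if (!d)
		return NULL;
	c->free_list = d->next;
	d->next = NULL;
	d->len = len;
	return d;
}

/* Completion path (atomic): hand the descriptor back to the reserve. */
void toy_chan_complete(struct toy_chan *c, struct toy_desc *d)
{
	d->next = c->free_list;
	c->free_list = d;
}

/* Non-atomic step: free the reserve and drop the power reference. */
void toy_chan_release_descs(struct toy_chan *c)
{
	while (c->free_list) {
		struct toy_desc *d = c->free_list;

		c->free_list = d->next;
		free(d);
	}
	assert(c->power_refs > 0);
	c->power_refs--;
}
```

The atomic prepare path stays available for callers in tasklet context; it
just fails when the reserve is empty instead of allocating.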

> >> So besides solving the irq-safe issue for dma driver, using the
> >> device-links has additionally two advantages. I already mentioned the
> >> -EPROBE_DEFER issue above.
> >>
> >> The second thing, is the runtime/system PM relations we get for free by
> >> using the links. In other words, the dma driver/core don't need to care
> >> about dealing with pm_runtime_get|put() as that would be managed by the
> >> dma client driver.
> >
> > Yeah sorry took me a while to figure that out :), If we do a different
> > API then dmaengine core can call pm_runtime_get|put() from non-atomic
> > context.
> 
> Yes, it can and this works from runtime PM point of view. But the
> following issues would remain unsolved.
> 
> 1) Dependencies between dma drivers and dma client drivers during system
> PM. For example, a dma client driver needs the dma controller to be
> operational (remain system resumed), until the dma client driver itself
> becomes system suspended.
> 
> The *only* currently available solution for this, is to try to system
> suspend the dma controller later than the dma client, via using the *late
> or the *noirq system PM callbacks. This works for most cases, but it
> becomes a problem when the dma client also needs to be system suspended at
> the *late or the *noirq phase. Clearly this solution doesn't scale.
> 
> Using device links explicitly solves this problem as it allows to specify
> this dependency between devices.

Yes, this is an interesting point. Until now people have been doing the above
to work around this problem, but hey, this is not unique to dmaengine. Any
subsystem which provides services to others has this issue, so the solution
must be in the driver or PM framework and not specific to dmaengine.
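
As a sketch of what a generic solution has to guarantee, here is a toy
userspace C model of dependency-ordered suspend. It is not driver core code
and the names (pm_node, pm_suspend_one, pm_suspend_all) are made up. It
assumes one supplier per device and no dependency cycles.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; names are hypothetical, not the driver core. */
struct pm_node {
	const char *name;
	int supplier;     /* index of the device this one depends on, or -1 */
	bool suspended;
};

/* Suspend node i only after every node that uses it as a supplier. */
void pm_suspend_one(struct pm_node *devs, int n, int i, int *order, int *pos)
{
	assert(i >= 0 && i < n);
	if (devs[i].suspended)
		return;
	for (int j = 0; j < n; j++)
		if (devs[j].supplier == i)
			pm_suspend_one(devs, n, j, order, pos);
	devs[i].suspended = true;
	order[(*pos)++] = i;
}

/* Walk all nodes; assumes links form no cycle. Returns the count suspended. */
int pm_suspend_all(struct pm_node *devs, int n, int *order)
{
	int pos = 0;

	for (int i = 0; i < n; i++)
		pm_suspend_one(devs, n, i, order, &pos);
	return pos;
}
```

With a DMA controller registered first and two clients pointing at it, the
controller ends up last in order[] regardless of registration order, without
either side resorting to the *late or *noirq callbacks.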

> 2) We won't avoid dma clients from getting -EPROBE_DEFER when requesting
> their dma channels in their ->probe() routines. This would be possible, if
> we can set up the device links at device initialization.

Well, setting those links is not practical at initialization time. Most
modern dma controllers feature a SW mux, with multiple clients connecting
and requesting. Would we link all of them? Most of the time the dmaengine
driver won't know about those.

Thanks
-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 12:15                   ` Marek Szyprowski
  (?)
@ 2017-02-13 12:32                     ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13 12:32 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Ulf Hansson, Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

On Mon, Feb 13, 2017 at 01:15:27PM +0100, Marek Szyprowski wrote:

> >Although, I don't know of other examples, besides the runtime PM use
> >case, where non-atomic channel prepare/unprepare would make sense. Do
> >you?
> 
> Changing GFP_ATOMIC to GFP_KERNEL in some calls in the DMA engine drivers
> would be also a nice present for the memory management subsystem if there
> is no real reason to drain atomic pools.

The reason for the calls being atomic is that they can be invoked from
atomic context: all prepare callbacks, submit and issue_pending may run there.
You have to be mindful that a client can prepare and issue the next txn from
the dmaengine callback, which runs in a tasklet.
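
The constraint can be sketched with a toy userspace C model. This is not the
kernel allocator; gfp_mode, alloc_model, model_alloc, model_prep and
model_callback are hypothetical names. The model assumes a small reserve for
atomic requests and treats a sleeping allocation in atomic context as a bug.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; names are hypothetical, not the kernel allocator. */
enum gfp_mode { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

struct alloc_model {
	bool in_atomic;      /* true while a tasklet-like callback runs */
	int atomic_reserve;  /* limited pool available to atomic requests */
	int sleeps;          /* how often an allocation was allowed to block */
};

/* Returns true when the request could be satisfied. */
bool model_alloc(struct alloc_model *m, enum gfp_mode gfp)
{
	if (gfp == MODEL_GFP_KERNEL) {
		assert(!m->in_atomic);   /* sleeping in atomic context is a bug */
		m->sleeps++;             /* may block until memory is reclaimed */
		return true;
	}
	if (m->atomic_reserve == 0)
		return false;            /* atomic requests cannot wait */
	m->atomic_reserve--;
	return true;
}

/* Prepare step: picks the allocation mode from the recorded context. */
bool model_prep(struct alloc_model *m)
{
	return model_alloc(m, m->in_atomic ? MODEL_GFP_ATOMIC
					   : MODEL_GFP_KERNEL);
}

/* Completion callback: runs atomically and queues the next transfer. */
bool model_callback(struct alloc_model *m)
{
	bool ok;

	m->in_atomic = true;
	ok = model_prep(m);
	m->in_atomic = false;
	return ok;
}
```

A prepare issued from the callback can only draw on the reserve and fails
once it is empty, while the same prepare from process context never touches
the reserve.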

> >>As I said earlier, if we want to solve that problem a better idea is to
> >>actually split the prepare as we discussed in [1]
> >>
> >>This way we can get a non atomic descriptor allocate/prepare and release.
> >>Yes we need to redesign the APIs to solve this, but if you guys are up for
> >>it, I think we can do it and avoid any further round abouts :)
> >Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
> >
> >Changes would be needed for all related dma client drivers as well,
> >although if that's what we need to do - let's do it.
> >
> >[...]
> >
> >>>So besides solving the irq-safe issue for dma driver, using the
> >>>device-links has additionally two advantages. I already mentioned the
> >>>-EPROBE_DEFER issue above.
> >>>
> >>>The second thing, is the runtime/system PM relations we get for free
> >>>by using the links. In other words, the dma driver/core don't need to
> >>>care about dealing with pm_runtime_get|put() as that would be managed
> >>>by the dma client driver.
> >>Yeah sorry took me a while to figure that out :), If we do a different API
> >>then dmaengine core can call pm_runtime_get|put() from non-atomic context.
> >Yes, it can and this works from runtime PM point of view. But the
> >following issues would remain unsolved.
> >
> >1)
> >Dependencies between dma drivers and dma client drivers during system
> >PM. For example, a dma client driver needs the dma controller to be
> >operational (remain system resumed), until the dma client driver
> >itself becomes system suspended.
> >
> >The *only* currently available solution for this, is to try to system
> >suspend the dma controller later than the dma client, via using the
> >*late or the *noirq system PM callbacks. This works for most cases,
> >but it becomes a problem when the dma client also needs to be system
> >suspended at the *late or the *noirq phase. Clearly this solution
> >doesn't scale.
> >
> >Using device links explicitly solves this problem as it allows to
> >specify this dependency between devices.
> 
> Frankly, then creating device links has to be added to EVERY subsystem,
> which involves getting access to the resources provided by the other
> device. More or less this will apply to all kernel frameworks, which
> provide kind of ABC_get_XYZ(dev, ...) functions (like clk_get, phy_get,
> dma_chan_get, ...). Sounds like a topic for another loooong discussion.

Yeah, that was my view too :-)

> >2)
> >We won't avoid dma clients from getting -EPROBE_DEFER when requesting
> >their dma channels in their ->probe() routines. This would be
> >possible, if we can set up the device links at device initialization.
> 
> The question is which core (DMA engine?, kernel device subsystem?) and
> how to find all clients before they call dma_chan_get().

Thanks
-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 11:45               ` Marek Szyprowski
  (?)
@ 2017-02-13 15:09                 ` Ulf Hansson
  -1 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-13 15:09 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Vinod Koul, Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

[...]

>> The only related PM thing, that shall be the decision of the driver,
>> is whether it wants to enable runtime PM or not, during ->probe().
>
>
> So do you want to create the links during the DMAengine driver probe? How
> do you plan to find all the client devices? Please note that you really
> want to create links to devices which will really use the DMA engine
> calls. Some client drivers might decide at runtime whether to use the DMA
> engine or not, depending on other data.

I don't have a great plan, just wanted to share my thoughts around the
problems we want to solve.

[...]

>>
>> If we could set up the device link already at device initialization,
>> it should also be possible to avoid getting -EPROBE_DEFER for dma
>> client drivers when requesting their dma channels.
>
>
> At first glance this sounds like an ultimate solution for all problems,
> but I don't think that device links can be used this way. If I get it right,
> you would like to create links on client device initialization, preferably
> somewhere in the kernel driver core. This will be handled somehow by
> completely generic code, which will create a link for each pair of devices
> that are connected by a phandle. Is this what you meant? Please note that
> at that time neither the client nor the provider driver has been probed.
> IMHO that doesn't look like the right generic approach.
>
> How will that code get the following information:
> 1. is it really needed to create a link for given device pair?
> 2. what link flags should it use?
> 3. what about circular dependencies?
> 4. what about runtime optional dependencies?
> 5. what about non-dt platforms? acpi?
>

To give a good answer to these questions, I need to spend more time
investigating.

However, from a top-level point of view, I think device links
seem like the perfect match for solving the runtime/system PM
problems.

No matter whether we can set up the links at device initialization
time, driver probe or whatever time.
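
To make the runtime PM side of this concrete, here is a minimal userspace
sketch of the behaviour a link gives us: resuming a consumer first resumes
its supplier, and the supplier may suspend again once its last consumer is
gone. This is only a model under stated assumptions. The model_* names are
hypothetical and are not kernel APIs; in the kernel the closest equivalent
would be a link created with device_link_add() and DL_FLAG_PM_RUNTIME.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>

struct model_dev {
	const char *name;
	struct model_dev *supplier;	/* models one device link, or NULL */
	int usage_count;		/* models the runtime PM usage counter */
	int active;			/* 1 = resumed, 0 = runtime suspended */
};

/* On the first reference, resume the supplier chain, then the device. */
static void model_rpm_get(struct model_dev *dev)
{
	if (dev->usage_count++ == 0) {
		if (dev->supplier)
			model_rpm_get(dev->supplier);
		dev->active = 1;
	}
}

/* On the last put, suspend the device, then drop the supplier reference. */
static void model_rpm_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0) {
		dev->active = 0;
		if (dev->supplier)
			model_rpm_put(dev->supplier);
	}
}
```

With a "pl330" device as supplier of an "i2s" device, a single
model_rpm_get() on the i2s device leaves both active, and the matching
model_rpm_put() lets both go idle, without the supplier side ever calling
get/put on its own.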

> This looks like another never-ending story of "how can we avoid deferred
> probe in a generic way". IMHO we should first solve the problem of irq-safe
> runtime PM in DMA engine drivers. I proposed how it can be done with device
> links, with no changes in the client API. Later, if one decides to extend
> the client API in a way that allows another runtime PM implementation, I
> see no problem converting the pl330 driver to the new approach, but for the
> time being this would be the easiest way to get it really functional.

Agreed, let's drop the deferred probe topic from the discussions - it's
just going to be overwhelming. :-)

[...]

>> So besides solving the irq-safe issue for dma driver, using the
>> device-links has additionally two advantages. I already mentioned the
>> -EPROBE_DEFER issue above.
>
>
> Not really. IMHO device links can be properly established once both drivers
> are probed...

Okay.

>
>>
>> The second thing, is the runtime/system PM relations we get for free
>> by using the links. In other words, the dma driver/core don't need to
>> care about dealing with pm_runtime_get|put() as that would be managed
>> by the dma client driver.
>
>
> IMHO there might be drivers which don't want to use device links based
> runtime PM in favor of irq-safe PM or something else. This should be really
> left to drivers.

Okay.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 12:27                   ` Vinod Koul
  (?)
@ 2017-02-13 15:32                     ` Ulf Hansson
  -1 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-13 15:32 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

[...]

>> Although, I don't know of other examples, besides the runtime PM use
>> case, where non-atomic channel prepare/unprepare would make sense. Do
>> you?
>
> The primary ask for that has been to enable runtime_pm for drivers. It's not
> a new ask, but we somehow haven't gotten around to doing it.

Okay, I see.

>
>> > As I said earlier, if we want to solve that problem a better idea is to
>> > actually split the prepare as we discussed in [1]
>> >
>> > This way we can get a non atomic descriptor allocate/prepare and release.
>> > Yes we need to redesign the APIs to solve this, but if you guys are up for
>> > it, I think we can do it and avoid any further round abouts :)
>>
>> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
>>
>> Changes would be needed for all related dma client drivers as well,
>> although if that's what we need to do - let's do it.
>
> Yes, but do bear in mind that some cases do need atomic prepare. The primary
> cases for DMA had that in mind and also submitting next transaction from the
> callback (tasklet) context, so that won't go away.
>
> It would help in other cases where clients know that they will not be in
> atomic context so we provide additional non-atomic "allocation" followed by
> prepare, so that drivers can split the work among these and people can do
> runtime_pm and other things..

Thanks for sharing the details.

It seems like some dma expert really needs to be heavily involved if we
ever are going to complete this work. :-)
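
For reference while that work is pending, the split described above could
be modelled roughly as follows. This is a userspace sketch only, not a
proposal for the real dmaengine API: every model_* name is hypothetical,
malloc() stands in for a sleeping GFP_KERNEL allocation, and the "powered"
flag stands in for a runtime PM reference taken in non-atomic context.

```c
/* Userspace model only: every name below is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_desc {
	struct model_desc *next;
	size_t len;
};

struct model_chan {
	struct model_desc *free_list;	/* descriptors ready for atomic use */
	int powered;			/* models a runtime PM reference held */
};

/*
 * Non-atomic step: may sleep, so it can allocate with the equivalent of
 * GFP_KERNEL and take the runtime PM reference. Returns 0 on success.
 */
static int model_desc_alloc(struct model_chan *chan, unsigned int count)
{
	while (count--) {
		struct model_desc *d = malloc(sizeof(*d));

		if (!d)
			return -1;
		d->len = 0;
		d->next = chan->free_list;
		chan->free_list = d;
	}
	chan->powered = 1;
	return 0;
}

/*
 * Atomic step: only pops a preallocated descriptor, never allocates or
 * sleeps, so it stays usable from a completion callback.
 */
static struct model_desc *model_prep(struct model_chan *chan, size_t len)
{
	struct model_desc *d = chan->free_list;

	if (!d)
		return NULL;
	chan->free_list = d->next;
	d->next = NULL;
	d->len = len;
	return d;
}

/* Atomic-safe: hand a completed descriptor back to the pool. */
static void model_desc_put(struct model_chan *chan, struct model_desc *d)
{
	d->next = chan->free_list;
	chan->free_list = d;
}

/* Non-atomic step: free the pool and drop the runtime PM reference. */
static void model_desc_release(struct model_chan *chan)
{
	while (chan->free_list) {
		struct model_desc *d = chan->free_list;

		chan->free_list = d->next;
		free(d);
	}
	chan->powered = 0;
}
```

The point of the split is that only model_desc_alloc() and
model_desc_release() would ever touch runtime PM or a sleeping allocator,
while model_prep() keeps the atomic-context guarantee that existing clients
rely on when they queue the next transaction from the callback.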

[...]

>>
>> 1) Dependencies between dma drivers and dma client drivers during system
>> PM. For example, a dma client driver needs the dma controller to be
>> operational (remain system resumed), until the dma client driver itself
>> becomes system suspended.
>>
>> The *only* currently available solution for this, is to try to system
>> suspend the dma controller later than the dma client, via using the *late
>> or the *noirq system PM callbacks. This works for most cases, but it
>> becomes a problem when the dma client also needs to be system suspended at
>> the *late or the *noirq phase. Clearly this solution doesn't scale.
>>
>> Using device links explicitly solves this problem as it allows to specify
>> this dependency between devices.
>
> Yes, this is an interesting point. Until now people have been doing the
> above to work around this problem, but hey, this is not unique to dmaengine.
> Any subsystem which provides services to others has this issue, so the
> solution must be in the driver or pm framework and not unique to dmaengine.

I definitely agree, these problems aren't unique to the dmaengine
subsystem. Exactly how/where to manage them, that I guess, is the key
question.

However, I can't help finding the device links useful, as those
really do address and solve our issues from a runtime/system PM point
of view.
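
As an illustration of the system PM part, the ordering that links express
can be sketched in userspace like this: every consumer is suspended before
its supplier, regardless of which suspend phase each driver would otherwise
have picked. This is a model under stated assumptions, not the PM core
implementation. All names are hypothetical, each device has at most one
supplier, and a dependency loop is reported as an error.

```c
/* Userspace model only: all names here are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 8

struct model_node {
	const char *name;
	int supplier;	/* index of the supplier device, or -1 for none */
};

/*
 * Fill order[] with device indices so that each consumer comes before its
 * supplier. Returns 0 on success, -1 on a dependency cycle or bad input.
 */
static int model_suspend_order(const struct model_node *devs, size_t n,
			       size_t *order)
{
	int active_consumers[MODEL_MAX_DEVS] = { 0 };
	int done[MODEL_MAX_DEVS] = { 0 };
	size_t i, count = 0;

	if (n > MODEL_MAX_DEVS)
		return -1;

	for (i = 0; i < n; i++) {
		if (devs[i].supplier >= (int)n)
			return -1;
		if (devs[i].supplier >= 0)
			active_consumers[devs[i].supplier]++;
	}

	while (count < n) {
		int progress = 0;

		for (i = 0; i < n; i++) {
			if (done[i] || active_consumers[i])
				continue;
			/* No consumer still needs this device: suspend it. */
			done[i] = 1;
			order[count++] = i;
			if (devs[i].supplier >= 0)
				active_consumers[devs[i].supplier]--;
			progress = 1;
		}
		if (!progress)
			return -1;	/* circular dependency */
	}
	return 0;
}
```

For a chain where "i2s" depends on "pl330" and "pl330" depends on "clk",
the computed order suspends i2s first and clk last, which is the guarantee
the *late/*noirq trick only approximates.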

>
>> 2) We won't avoid dma clients from getting -EPROBE_DEFER when requesting
>> their dma channels in their ->probe() routines. This would be possible, if
>> we can set up the device links at device initialization.
>
> Well, setting those links is not practical at initialization time. Most
> modern dma controllers feature a SW mux, with multiple clients connecting
> and requesting; would we link all of them? Most of the time the dmaengine
> driver won't know about those.

Okay, I see!

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 15:32                     ` Ulf Hansson
  (?)
@ 2017-02-13 15:47                       ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-02-13 15:47 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On Mon, Feb 13, 2017 at 04:32:32PM +0100, Ulf Hansson wrote:
> [...]
> 
> >> Although, I don't know of other examples, besides the runtime PM use
> >> case, where non-atomic channel prepare/unprepare would make sense. Do
> >> you?
> >
> > The primary ask for that has been to enable runtime_pm for drivers. It's not
> > a new ask, but we somehow haven't gotten around to doing it.
> 
> Okay, I see.
> 
> >
> >> > As I said earlier, if we want to solve that problem a better idea is to
> >> > actually split the prepare as we discussed in [1]
> >> >
> >> > This way we can get a non atomic descriptor allocate/prepare and release.
> >> > Yes we need to redesign the APIs to solve this, but if you guys are up for
> >> > it, I think we can do it and avoid any further round abouts :)
> >>
> >> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
> >>
> >> Changes would be needed for all related dma client drivers as well,
> >> although if that's what we need to do - let's do it.
> >
> > Yes, but do bear in mind that some cases do need atomic prepare. The primary
> > cases for DMA had that in mind, as well as submitting the next transaction
> > from the callback (tasklet) context, so that won't go away.
> >
> > It would help in other cases where clients know that they will not be in
> > atomic context, so we provide an additional non-atomic "allocation" followed
> > by prepare, so that drivers can split the work between these and people can
> > do runtime_pm and other things.
> 
> Thanks for sharing the details.
> 
> It seems like some dma expert really needs to be heavily involved if we
> ever are going to complete this work. :-)

Sure, I will help out :)

If any of you are in Portland next week, we can discuss this f2f. I
will try taking a stab at the new API design next week.

> 
> [...]
> 
> >>
> >> 1) Dependencies between dma drivers and dma client drivers during system
> >> PM. For example, a dma client driver needs the dma controller to be
> >> operational (remain system resumed), until the dma client driver itself
> >> becomes system suspended.
> >>
> >> The *only* currently available solution for this is to try to system
> >> suspend the dma controller later than the dma client, by using the *late
> >> or the *noirq system PM callbacks. This works for most cases, but it
> >> becomes a problem when the dma client also needs to be system suspended at
> >> the *late or the *noirq phase. Clearly this solution doesn't scale.
> >>
> >> Using device links explicitly solves this problem, as it allows specifying
> >> this dependency between devices.
> >
> > Yes, this is an interesting point. Until now people have been doing the above
> > to work around this problem, but this is not unique to dmaengine. Any
> > subsystem which provides services to others has this issue, so the solution
> > must be in the driver or pm framework and not unique to dmaengine.
> 
> I definitely agree, these problems aren't unique to the dmaengine
> subsystem. Exactly how/where to manage them, that I guess, is the key
> question.
> 
> However, I can't help finding the device links useful, as those
> really do address and solve our issues from a runtime/system PM point
> of view.
> 
> >
> >> 2) We won't prevent dma clients from getting -EPROBE_DEFER when requesting
> >> their dma channels in their ->probe() routines. This would be possible if
> >> we could set up the device links at device initialization.
> >
> > Well, setting those links is not practical at initialization time. Most
> > modern dma controllers feature a SW mux, with multiple clients connecting
> > and requesting; would we link all of them? Most of the time the dmaengine
> > driver won't know about those.
> 
> Okay, I see!
> 
> Kind regards
> Uffe

-- 
~Vinod

^ permalink raw reply	[flat|nested] 68+ messages in thread
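
Editorial sketch, not part of the thread: the split proposed above (a
non-atomic "allocation" step followed by an atomic prepare) can be modelled in
user space as below. The idea is that anything that may sleep, such as
resuming a controller that uses non-irq-safe runtime PM, happens in the first
step, while prepare stays safe to call from atomic (tasklet) context. This is
a toy model under that assumption, not the API that was eventually designed.
All names (toy_chan, toy_chan_alloc, toy_chan_prepare, toy_chan_release) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical channel state; not the kernel's struct dma_chan. */
struct toy_chan {
	int pm_usage;	/* models a runtime PM usage count */
	bool powered;	/* models the controller being resumed */
	int prepared;	/* number of descriptors prepared so far */
};

/*
 * Non-atomic step: may sleep, so this is where the controller is
 * powered up. 'in_atomic' models the caller's context.
 */
static int toy_chan_alloc(struct toy_chan *c, bool in_atomic)
{
	if (in_atomic)
		return -1;	/* sleeping work is not allowed here */
	c->pm_usage++;
	c->powered = true;
	return 0;
}

/* Atomic-safe step: only works on an already powered controller. */
static int toy_chan_prepare(struct toy_chan *c)
{
	if (!c->powered)
		return -1;
	c->prepared++;
	return 0;
}

/* Non-atomic release: drops the reference taken by toy_chan_alloc(). */
static int toy_chan_release(struct toy_chan *c, bool in_atomic)
{
	if (in_atomic || c->pm_usage == 0)
		return -1;
	if (--c->pm_usage == 0)
		c->powered = false;
	return 0;
}
```

A client that knows it runs in process context would call the alloc step once,
prepare and submit as often as needed (including from its completion
callback), and release when the transfer sequence is over.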

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 15:47                       ` Vinod Koul
  (?)
@ 2017-02-14  7:50                         ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-02-14  7:50 UTC (permalink / raw)
  To: Vinod Koul, Ulf Hansson
  Cc: Rafael J. Wysocki, linux-samsung-soc, dmaengine,
	linux-arm-kernel, linux-pm, linux-kernel, Krzysztof Kozlowski,
	Bartlomiej Zolnierkiewicz, Lars-Peter Clausen, Arnd Bergmann,
	Inki Dae

Hi Vinod,


On 2017-02-13 16:47, Vinod Koul wrote:
> On Mon, Feb 13, 2017 at 04:32:32PM +0100, Ulf Hansson wrote:
>> [...]
>>
>>>> Although, I don't know of other examples, besides the runtime PM use
>>>> case, where non-atomic channel prepare/unprepare would make sense. Do
>>>> you?
>>> The primary ask for that has been to enable runtime_pm for drivers. It's not
>>> a new ask, but we somehow haven't gotten around to doing it.
>> Okay, I see.
>>
>>>>> As I said earlier, if we want to solve that problem a better idea is to
>>>>> actually split the prepare as we discussed in [1]
>>>>>
>>>>> This way we can get a non atomic descriptor allocate/prepare and release.
>>>>> Yes we need to redesign the APIs to solve this, but if you guys are up for
>>>>> it, I think we can do it and avoid any further round abouts :)
>>>> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
>>>>
>>>> Changes would be needed for all related dma client drivers as well,
>>>> although if that's what we need to do - let's do it.
>>> Yes, but do bear in mind that some cases do need atomic prepare. The primary
>>> cases for DMA had that in mind, as well as submitting the next transaction
>>> from the callback (tasklet) context, so that won't go away.
>>>
>>> It would help in other cases where clients know that they will not be in
>>> atomic context, so we provide an additional non-atomic "allocation" followed
>>> by prepare, so that drivers can split the work between these and people can
>>> do runtime_pm and other things.
>> Thanks for sharing the details.
>>
>> It seems like some dma expert really needs to be heavily involved if we
>> ever are going to complete this work. :-)
> Sure, I will help out :)
>
> If any of you are in Portland next week, we can discuss this f2f. I
> will try taking a stab at the new API design next week.

I'm not going to Portland, but I hope that you will have a fruitful
discussion there.

[...]

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 3/3] dmaengine: pl330: Don't require irq-safe runtime PM
  2017-02-13 15:47                       ` Vinod Koul
  (?)
@ 2017-02-14  8:24                         ` Ulf Hansson
  -1 siblings, 0 replies; 68+ messages in thread
From: Ulf Hansson @ 2017-02-14  8:24 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Marek Szyprowski, Rafael J. Wysocki, linux-samsung-soc,
	dmaengine, linux-arm-kernel, linux-pm, linux-kernel,
	Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Lars-Peter Clausen, Arnd Bergmann, Inki Dae

On 13 February 2017 at 16:47, Vinod Koul <vinod.koul@intel.com> wrote:
> On Mon, Feb 13, 2017 at 04:32:32PM +0100, Ulf Hansson wrote:
>> [...]
>>
>> >> Although, I don't know of other examples, besides the runtime PM use
>> >> case, where non-atomic channel prepare/unprepare would make sense. Do
>> >> you?
>> >
>> > The primary ask for that has been to enable runtime_pm for drivers. It's not
>> > a new ask, but we somehow haven't gotten around to doing it.
>>
>> Okay, I see.
>>
>> >
>> >> > As I said earlier, if we want to solve that problem a better idea is to
>> >> > actually split the prepare as we discussed in [1]
>> >> >
>> >> > This way we can get a non atomic descriptor allocate/prepare and release.
>> >> > Yes we need to redesign the APIs to solve this, but if you guys are up for
>> >> > it, I think we can do it and avoid any further round abouts :)
>> >>
>> >> Adding/re-designing dma APIs is a viable option to solve the runtime PM case.
>> >>
>> >> Changes would be needed for all related dma client drivers as well,
>> >> although if that's what we need to do - let's do it.
>> >
>> > Yes, but do bear in mind that some cases do need atomic prepare. The primary
>> > cases for DMA had that in mind, as well as submitting the next transaction
>> > from the callback (tasklet) context, so that won't go away.
>> >
>> > It would help in other cases where clients know that they will not be in
>> > atomic context, so we provide an additional non-atomic "allocation" followed
>> > by prepare, so that drivers can split the work between these and people can
>> > do runtime_pm and other things.
>>
>> Thanks for sharing the details.
>>
>> It seems like some dma expert really needs to be heavily involved if we
>> ever are going to complete this work. :-)
>
> Sure, I will help out :)

That sounds great! :-)

>
> If any of you are in Portland next week, we can discuss this f2f. I
> will try taking a stab at the new API design next week.
>

Unfortunately not. We will have to meet some other time. Anyway, please
keep me posted on any related topics.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v8 2/3] dmaengine: pl330: remove pdata based initialization
  2017-02-09 14:22       ` Marek Szyprowski
  (?)
@ 2017-03-22  8:22         ` Marek Szyprowski
  -1 siblings, 0 replies; 68+ messages in thread
From: Marek Szyprowski @ 2017-03-22  8:22 UTC (permalink / raw)
  To: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm, linux-kernel
  Cc: Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz, Vinod Koul,
	Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

Hi Vinod


On 2017-02-09 15:22, Marek Szyprowski wrote:
> This driver is now used only on platforms which support device tree, so
> it is safe to remove legacy platform data based initialization code.
>
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> For plat-samsung:
> Acked-by: Krzysztof Kozlowski <krzk@kernel.org>

Vinod: This patch is completely independent of the rest of the changes
in this patchset. Could you apply it, or do you want me to resend it
separately? The runtime PM related changes will wait until a new DMA engine
API is ready.

> ---
>   arch/arm/plat-samsung/devs.c |  1 -
>   drivers/dma/pl330.c          | 42 ++++++++----------------------------------
>   include/linux/amba/pl330.h   | 35 -----------------------------------
>   3 files changed, 8 insertions(+), 70 deletions(-)
>   delete mode 100644 include/linux/amba/pl330.h
>
> diff --git a/arch/arm/plat-samsung/devs.c b/arch/arm/plat-samsung/devs.c
> index 03fac123676d..dc269d9143bc 100644
> --- a/arch/arm/plat-samsung/devs.c
> +++ b/arch/arm/plat-samsung/devs.c
> @@ -10,7 +10,6 @@
>    * published by the Free Software Foundation.
>   */
>   
> -#include <linux/amba/pl330.h>
>   #include <linux/kernel.h>
>   #include <linux/types.h>
>   #include <linux/interrupt.h>
> diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
> index f37f4978dabb..8b0da7fa520d 100644
> --- a/drivers/dma/pl330.c
> +++ b/drivers/dma/pl330.c
> @@ -22,7 +22,6 @@
>   #include <linux/dma-mapping.h>
>   #include <linux/dmaengine.h>
>   #include <linux/amba/bus.h>
> -#include <linux/amba/pl330.h>
>   #include <linux/scatterlist.h>
>   #include <linux/of.h>
>   #include <linux/of_dma.h>
> @@ -2077,18 +2076,6 @@ static void pl330_tasklet(unsigned long data)
>   	}
>   }
>   
> -bool pl330_filter(struct dma_chan *chan, void *param)
> -{
> -	u8 *peri_id;
> -
> -	if (chan->device->dev->driver != &pl330_driver.drv)
> -		return false;
> -
> -	peri_id = chan->private;
> -	return *peri_id == (unsigned long)param;
> -}
> -EXPORT_SYMBOL(pl330_filter);
> -
>   static struct dma_chan *of_dma_pl330_xlate(struct of_phandle_args *dma_spec,
>   						struct of_dma *ofdma)
>   {
> @@ -2833,7 +2820,6 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   static int
>   pl330_probe(struct amba_device *adev, const struct amba_id *id)
>   {
> -	struct dma_pl330_platdata *pdat;
>   	struct pl330_config *pcfg;
>   	struct pl330_dmac *pl330;
>   	struct dma_pl330_chan *pch, *_p;
> @@ -2843,8 +2829,6 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   	int num_chan;
>   	struct device_node *np = adev->dev.of_node;
>   
> -	pdat = dev_get_platdata(&adev->dev);
> -
>   	ret = dma_set_mask_and_coherent(&adev->dev, DMA_BIT_MASK(32));
>   	if (ret)
>   		return ret;
> @@ -2857,7 +2841,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   	pd = &pl330->ddma;
>   	pd->dev = &adev->dev;
>   
> -	pl330->mcbufsz = pdat ? pdat->mcbuf_sz : 0;
> +	pl330->mcbufsz = 0;
>   
>   	/* get quirk */
>   	for (i = 0; i < ARRAY_SIZE(of_quirks); i++)
> @@ -2901,10 +2885,7 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   	INIT_LIST_HEAD(&pd->channels);
>   
>   	/* Initialize channel parameters */
> -	if (pdat)
> -		num_chan = max_t(int, pdat->nr_valid_peri, pcfg->num_chan);
> -	else
> -		num_chan = max_t(int, pcfg->num_peri, pcfg->num_chan);
> +	num_chan = max_t(int, pcfg->num_peri, pcfg->num_chan);
>   
>   	pl330->num_peripherals = num_chan;
>   
> @@ -2916,11 +2897,8 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   
>   	for (i = 0; i < num_chan; i++) {
>   		pch = &pl330->peripherals[i];
> -		if (!adev->dev.of_node)
> -			pch->chan.private = pdat ? &pdat->peri_id[i] : NULL;
> -		else
> -			pch->chan.private = adev->dev.of_node;
>   
> +		pch->chan.private = adev->dev.of_node;
>   		INIT_LIST_HEAD(&pch->submitted_list);
>   		INIT_LIST_HEAD(&pch->work_list);
>   		INIT_LIST_HEAD(&pch->completed_list);
> @@ -2933,15 +2911,11 @@ static int __maybe_unused pl330_resume(struct device *dev)
>   		list_add_tail(&pch->chan.device_node, &pd->channels);
>   	}
>   
> -	if (pdat) {
> -		pd->cap_mask = pdat->cap_mask;
> -	} else {
> -		dma_cap_set(DMA_MEMCPY, pd->cap_mask);
> -		if (pcfg->num_peri) {
> -			dma_cap_set(DMA_SLAVE, pd->cap_mask);
> -			dma_cap_set(DMA_CYCLIC, pd->cap_mask);
> -			dma_cap_set(DMA_PRIVATE, pd->cap_mask);
> -		}
> +	dma_cap_set(DMA_MEMCPY, pd->cap_mask);
> +	if (pcfg->num_peri) {
> +		dma_cap_set(DMA_SLAVE, pd->cap_mask);
> +		dma_cap_set(DMA_CYCLIC, pd->cap_mask);
> +		dma_cap_set(DMA_PRIVATE, pd->cap_mask);
>   	}
>   
>   	pd->device_alloc_chan_resources = pl330_alloc_chan_resources;
> diff --git a/include/linux/amba/pl330.h b/include/linux/amba/pl330.h
> deleted file mode 100644
> index fe93758e8403..000000000000
> --- a/include/linux/amba/pl330.h
> +++ /dev/null
> @@ -1,35 +0,0 @@
> -/* linux/include/linux/amba/pl330.h
> - *
> - * Copyright (C) 2010 Samsung Electronics Co. Ltd.
> - *	Jaswinder Singh <jassi.brar@samsung.com>
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2 of the License, or
> - * (at your option) any later version.
> - */
> -
> -#ifndef	__AMBA_PL330_H_
> -#define	__AMBA_PL330_H_
> -
> -#include <linux/dmaengine.h>
> -
> -struct dma_pl330_platdata {
> -	/*
> -	 * Number of valid peripherals connected to DMAC.
> -	 * This may be different from the value read from
> -	 * CR0, as the PL330 implementation might have 'holes'
> -	 * in the peri list or the peri could also be reached
> -	 * from another DMAC which the platform prefers.
> -	 */
> -	u8 nr_valid_peri;
> -	/* Array of valid peripherals */
> -	u8 *peri_id;
> -	/* Operational capabilities */
> -	dma_cap_mask_t cap_mask;
> -	/* Bytes to allocate for MC buffer */
> -	unsigned mcbuf_sz;
> -};
> -
> -extern bool pl330_filter(struct dma_chan *chan, void *param);
> -#endif	/* __AMBA_PL330_H_ */

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


* Re: [PATCH v8 2/3] dmaengine: pl330: remove pdata based initialization
  2017-03-22  8:22         ` Marek Szyprowski
  (?)
@ 2017-03-27  4:34           ` Vinod Koul
  -1 siblings, 0 replies; 68+ messages in thread
From: Vinod Koul @ 2017-03-27  4:34 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-samsung-soc, dmaengine, linux-arm-kernel, linux-pm,
	linux-kernel, Krzysztof Kozlowski, Bartlomiej Zolnierkiewicz,
	Ulf Hansson, Rafael J. Wysocki, Lars-Peter Clausen,
	Arnd Bergmann, Inki Dae

On Wed, Mar 22, 2017 at 09:22:23AM +0100, Marek Szyprowski wrote:
> Hi Vinod
> 
> 
> On 2017-02-09 15:22, Marek Szyprowski wrote:
> >This driver is now used only on platforms which support device tree, so
> >it is safe to remove legacy platform data based initialization code.
> >
> >Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
> >Acked-by: Arnd Bergmann <arnd@arndb.de>
> >For plat-samsung:
> >Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
> 
> Vinod: This patch is completely independent from the rest of the changes
> from that patchset. Could you apply it, or do you want me to resend it
> separately? Runtime pm related changes will wait until a new DMA engine API
> is ready.

Sure, please resend it :)

-- 
~Vinod

