* [PATCH v2 1/4] remoteproc: Introduce auto-boot flag @ 2016-08-11 21:52 Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 2/4] remoteproc: Calculate max_notifyid during load Bjorn Andersson ` (3 more replies) 0 siblings, 4 replies; 12+ messages in thread From: Bjorn Andersson @ 2016-08-11 21:52 UTC (permalink / raw) To: linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Suman Anna, Lee Jones, Loic Pallardy Introduce an "auto-boot" flag on rprocs to make it possible to flag remote processors without vdevs to automatically boot once the firmware is found. Preserve previous behavior of the wkup_m3 processor being explicitly booted by a consumer. Cc: Lee Jones <lee.jones@linaro.org> Cc: Loic Pallardy <loic.pallardy@st.com> Cc: Suman Anna <s-anna@ti.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- Changes since v1: - s/always_on/auto_boot/ - Fixed double increment of "power" in recover path - Marked wkup_m3 to not auto_boot drivers/remoteproc/remoteproc_core.c | 28 +++++++++++++++++++++++++++- drivers/remoteproc/remoteproc_virtio.c | 13 ------------- drivers/remoteproc/wkup_m3_rproc.c | 2 ++ include/linux/remoteproc.h | 1 + 4 files changed, 30 insertions(+), 14 deletions(-) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index fe0539ed9cb5..5ca98039e494 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -933,6 +933,10 @@ static void rproc_fw_config_virtio(const struct firmware *fw, void *context) /* look for virtio devices and register them */ ret = rproc_handle_resources(rproc, tablesz, rproc_vdev_handler); + /* if rproc is marked always-on, request it to boot */ + if (rproc->auto_boot) + rproc_boot_nowait(rproc); + out: release_firmware(fw); /* allow rproc_del() contexts, if any, to proceed */ @@ -978,11 +982,16 @@ static int rproc_add_virtio_devices(struct rproc *rproc) int rproc_trigger_recovery(struct rproc *rproc) { struct rproc_vdev *rvdev, *rvtmp; + int ret; dev_err(&rproc->dev, "recovering %s\n", rproc->name); init_completion(&rproc->crash_comp); + /* shut down the remote */ + /* TODO: make sure this works with rproc->power > 1 */ + rproc_shutdown(rproc); + /* clean up remote vdev entries */ list_for_each_entry_safe(rvdev, rvtmp, &rproc->rvdevs, node) rproc_remove_virtio_dev(rvdev); @@ -993,7 +1002,18 @@ int rproc_trigger_recovery(struct rproc *rproc) /* Free the copy of the resource table */ kfree(rproc->cached_table); - return rproc_add_virtio_devices(rproc); + ret = rproc_add_virtio_devices(rproc); + if (ret) + return ret; + + /* + * boot the remote processor up again, if the async firmware loader + * didn't do so already, waiting for the async fw load to finish + */ + if (!rproc->auto_boot) + rproc_boot(rproc); + + return 0; } /** @@ -1374,6 +1394,7 @@ struct rproc *rproc_alloc(struct device *dev, const char *name, rproc->name = name; rproc->ops = ops; rproc->priv = &rproc[1]; + rproc->auto_boot = true; device_initialize(&rproc->dev); rproc->dev.parent = dev; @@ -1452,6 +1473,11 @@ int rproc_del(struct rproc *rproc) /* if rproc is just being registered, wait */ wait_for_completion(&rproc->firmware_loading_complete); + /* if rproc is marked always-on, rproc_add() booted it */ + /* TODO: make sure this works with rproc->power > 1 */ + if (rproc->auto_boot) + rproc_shutdown(rproc); + /* clean up remote vdev entries */ list_for_each_entry_safe(rvdev, tmp, &rproc->rvdevs, node) rproc_remove_virtio_dev(rvdev); diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c index cc91556313e1..574c90ce07f7 100644 --- a/drivers/remoteproc/remoteproc_virtio.c +++ b/drivers/remoteproc/remoteproc_virtio.c @@ -136,11 +136,6 @@ static void __rproc_virtio_del_vqs(struct virtio_device *vdev) static void rproc_virtio_del_vqs(struct virtio_device *vdev) { - struct rproc *rproc = vdev_to_rproc(vdev); - - /* power down the remote processor before deleting vqs */ - rproc_shutdown(rproc); - __rproc_virtio_del_vqs(vdev); } @@ -149,7 +144,6 @@ static int rproc_virtio_find_vqs(struct virtio_device *vdev, unsigned nvqs, vq_callback_t *callbacks[], const char * const names[]) { - struct rproc *rproc = vdev_to_rproc(vdev); int i, ret; for (i = 0; i < nvqs; ++i) { @@ -160,13 +154,6 @@ static int rproc_virtio_find_vqs(struct virtio_device *vdev, unsigned nvqs, } } - /* now that the vqs are all set, boot the remote processor */ - ret = rproc_boot_nowait(rproc); - if (ret) { - dev_err(&rproc->dev, "rproc_boot() failed %d\n", ret); - goto error; - } - return 0; error: diff --git a/drivers/remoteproc/wkup_m3_rproc.c b/drivers/remoteproc/wkup_m3_rproc.c index 02d271d101b4..3811cb522af3 100644 --- a/drivers/remoteproc/wkup_m3_rproc.c +++ b/drivers/remoteproc/wkup_m3_rproc.c @@ -167,6 +167,8 @@ static int wkup_m3_rproc_probe(struct platform_device *pdev) goto err; } + rproc->auto_boot = false; + wkupm3 = rproc->priv; wkupm3->rproc = rproc; wkupm3->pdev = pdev; diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h index 1c457a8dd5a6..76d936cabbf8 100644 --- a/include/linux/remoteproc.h +++ b/include/linux/remoteproc.h @@ -443,6 +443,7 @@ struct rproc { struct resource_table *cached_table; u32 table_csum; bool has_iommu; + bool auto_boot; }; /* we currently support only two vrings per rvdev */ -- 2.5.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/4] remoteproc: Calculate max_notifyid during load 2016-08-11 21:52 [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Bjorn Andersson @ 2016-08-11 21:52 ` Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 3/4] remoteproc: Move vdev handling to boot/shutdown Bjorn Andersson ` (2 subsequent siblings) 3 siblings, 0 replies; 12+ messages in thread From: Bjorn Andersson @ 2016-08-11 21:52 UTC (permalink / raw) To: linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Suman Anna, Lee Jones, Loic Pallardy The calculation of max_notifyid must only be done before we call start() on the remoteproc drivers, so move the calculation to be part of the loading steps. Cc: Lee Jones <lee.jones@linaro.org> Cc: Loic Pallardy <loic.pallardy@st.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- Changes since v1: - None drivers/remoteproc/remoteproc_core.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index 5ca98039e494..f484e04810f3 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -697,17 +697,13 @@ static rproc_handle_resource_t rproc_loading_handlers[RSC_LAST] = { [RSC_CARVEOUT] = (rproc_handle_resource_t)rproc_handle_carveout, [RSC_DEVMEM] = (rproc_handle_resource_t)rproc_handle_devmem, [RSC_TRACE] = (rproc_handle_resource_t)rproc_handle_trace, - [RSC_VDEV] = NULL, /* VDEVs were handled upon registrarion */ + [RSC_VDEV] = (rproc_handle_resource_t)rproc_count_vrings, }; static rproc_handle_resource_t rproc_vdev_handler[RSC_LAST] = { [RSC_VDEV] = (rproc_handle_resource_t)rproc_handle_vdev, }; -static rproc_handle_resource_t rproc_count_vrings_handler[RSC_LAST] = { - [RSC_VDEV] = (rproc_handle_resource_t)rproc_count_vrings, -}; - /* handle firmware resource entries before booting the remote processor */ static int rproc_handle_resources(struct rproc *rproc, int len, rproc_handle_resource_t handlers[RSC_LAST]) @@ -836,6 +832,9 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) goto clean_up; } + /* reset max_notifyid */ + rproc->max_notifyid = -1; + /* handle fw resources which are required to boot rproc */ ret = rproc_handle_resources(rproc, tablesz, rproc_loading_handlers); if (ret) { @@ -923,13 +922,6 @@ static void rproc_fw_config_virtio(const struct firmware *fw, void *context) rproc->table_ptr = rproc->cached_table; - /* count the number of notify-ids */ - rproc->max_notifyid = -1; - ret = rproc_handle_resources(rproc, tablesz, - rproc_count_vrings_handler); - if (ret) - goto out; - /* look for virtio devices and register them */ ret = rproc_handle_resources(rproc, tablesz, rproc_vdev_handler); -- 2.5.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 3/4] remoteproc: Move vdev handling to boot/shutdown 2016-08-11 21:52 [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 2/4] remoteproc: Calculate max_notifyid during load Bjorn Andersson @ 2016-08-11 21:52 ` Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 4/4] remoteproc: Move handling of cached table " Bjorn Andersson 2016-08-31 18:27 ` Suman Anna 3 siblings, 0 replies; 12+ messages in thread From: Bjorn Andersson @ 2016-08-11 21:52 UTC (permalink / raw) To: linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Suman Anna, Lee Jones, Loic Pallardy The newly introduced "always-on" flag allows us to stop giving the vdevs special treatment. The ordering of resource allocation and life cycle of the remote processor is kept intact. This allows us to mark a remote processor with vdevs to not boot unless explicitly requested to do so by a client driver. Cc: Lee Jones <lee.jones@linaro.org> Cc: Loic Pallardy <loic.pallardy@st.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- Changes since v1: - None drivers/remoteproc/remoteproc_core.c | 37 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 22 deletions(-) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index f484e04810f3..e86200f978e2 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -753,6 +753,7 @@ static int rproc_handle_resources(struct rproc *rproc, int len, static void rproc_resource_cleanup(struct rproc *rproc) { struct rproc_mem_entry *entry, *tmp; + struct rproc_vdev *rvdev, *rvtmp; struct device *dev = &rproc->dev; /* clean up debugfs trace entries */ @@ -785,6 +786,10 @@ static void rproc_resource_cleanup(struct rproc *rproc) list_del(&entry->node); kfree(entry); } + + /* clean up remote vdev entries */ + list_for_each_entry_safe(rvdev, rvtmp, &rproc->rvdevs, node) + rproc_remove_virtio_dev(rvdev); } /* @@ -835,6 +840,13 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) /* reset max_notifyid */ rproc->max_notifyid = -1; + /* look for virtio devices and register them */ + ret = rproc_handle_resources(rproc, tablesz, rproc_vdev_handler); + if (ret) { + dev_err(dev, "Failed to handle vdev resources: %d\n", ret); + goto clean_up; + } + /* handle fw resources which are required to boot rproc */ ret = rproc_handle_resources(rproc, tablesz, rproc_loading_handlers); if (ret) { @@ -898,7 +910,7 @@ static void rproc_fw_config_virtio(const struct firmware *fw, void *context) { struct rproc *rproc = context; struct resource_table *table; - int ret, tablesz; + int tablesz; if (rproc_fw_sanity_check(rproc, fw) < 0) goto out; @@ -922,9 +934,6 @@ static void rproc_fw_config_virtio(const struct firmware *fw, void *context) rproc->table_ptr = rproc->cached_table; - /* look for virtio devices and register them */ - ret = rproc_handle_resources(rproc, tablesz, rproc_vdev_handler); - /* if rproc is marked always-on, request it to boot */ if (rproc->auto_boot) rproc_boot_nowait(rproc); @@ -973,9 +982,6 @@ static int rproc_add_virtio_devices(struct rproc *rproc) */ int rproc_trigger_recovery(struct rproc *rproc) { - struct rproc_vdev *rvdev, *rvtmp; - int ret; - dev_err(&rproc->dev, "recovering %s\n", rproc->name); init_completion(&rproc->crash_comp); @@ -984,26 +990,13 @@ int rproc_trigger_recovery(struct rproc *rproc) /* TODO: make sure this works with rproc->power > 1 */ rproc_shutdown(rproc); - /* clean up remote vdev entries */ - list_for_each_entry_safe(rvdev, rvtmp, &rproc->rvdevs, node) - rproc_remove_virtio_dev(rvdev); - /* wait until there is no more rproc users */ wait_for_completion(&rproc->crash_comp); - /* Free the copy of the resource table */ - kfree(rproc->cached_table); - - ret = rproc_add_virtio_devices(rproc); - if (ret) - return ret; - /* - * boot the remote processor up again, if the async firmware loader - * didn't do so already, waiting for the async fw load to finish + * boot the remote processor up again */ - if (!rproc->auto_boot) - rproc_boot(rproc); + rproc_boot(rproc); return 0; } -- 2.5.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 4/4] remoteproc: Move handling of cached table to boot/shutdown 2016-08-11 21:52 [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 2/4] remoteproc: Calculate max_notifyid during load Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 3/4] remoteproc: Move vdev handling to boot/shutdown Bjorn Andersson @ 2016-08-11 21:52 ` Bjorn Andersson 2016-08-31 18:27 ` Suman Anna 3 siblings, 0 replies; 12+ messages in thread From: Bjorn Andersson @ 2016-08-11 21:52 UTC (permalink / raw) To: linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Suman Anna, Lee Jones, Loic Pallardy As we moved the vdev handling to the main boot/shutdown code path we can further simplify the resource table handling by moving the parsing spet to boot as well. The lifespan of the resource table is changed to live from rproc_boot() to rproc_shutdown(). Cc: Lee Jones <lee.jones@linaro.org> Cc: Loic Pallardy <loic.pallardy@st.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- Changes since v1: - None drivers/remoteproc/remoteproc_core.c | 55 ++++++++++++------------------------ include/linux/remoteproc.h | 2 -- 2 files changed, 18 insertions(+), 39 deletions(-) diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index e86200f978e2..36c019148f7e 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -802,9 +802,6 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) struct resource_table *table, *loaded_table; int ret, tablesz; - if (!rproc->table_ptr) - return -ENOMEM; - ret = rproc_fw_sanity_check(rproc, fw); if (ret) return ret; @@ -831,11 +828,17 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) goto clean_up; } - /* Verify that resource table in loaded fw is unchanged */ - if (rproc->table_csum != crc32(0, table, tablesz)) { - dev_err(dev, "resource checksum failed, fw changed?\n"); + /* + * Create a copy of the resource table. When a virtio device starts + * and calls vring_new_virtqueue() the address of the allocated vring + * will be stored in the cached_table. Before the device is started, + * cached_table will be copied into devic memory. + */ + rproc->cached_table = kmemdup(table, tablesz, GFP_KERNEL); + if (!rproc->cached_table) goto clean_up; - } + + rproc->table_ptr = rproc->cached_table; /* reset max_notifyid */ rproc->max_notifyid = -1; @@ -893,6 +896,10 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) return 0; clean_up: + kfree(rproc->cached_table); + rproc->cached_table = NULL; + rproc->table_ptr = NULL; + rproc_resource_cleanup(rproc); rproc_disable_iommu(rproc); return ret; @@ -909,36 +916,11 @@ clean_up: static void rproc_fw_config_virtio(const struct firmware *fw, void *context) { struct rproc *rproc = context; - struct resource_table *table; - int tablesz; - - if (rproc_fw_sanity_check(rproc, fw) < 0) - goto out; - - /* look for the resource table */ - table = rproc_find_rsc_table(rproc, fw, &tablesz); - if (!table) - goto out; - - rproc->table_csum = crc32(0, table, tablesz); - - /* - * Create a copy of the resource table. When a virtio device starts - * and calls vring_new_virtqueue() the address of the allocated vring - * will be stored in the cached_table. Before the device is started, - * cached_table will be copied into devic memory. - */ - rproc->cached_table = kmemdup(table, tablesz, GFP_KERNEL); - if (!rproc->cached_table) - goto out; - - rproc->table_ptr = rproc->cached_table; /* if rproc is marked always-on, request it to boot */ if (rproc->auto_boot) rproc_boot_nowait(rproc); -out: release_firmware(fw); /* allow rproc_del() contexts, if any, to proceed */ complete_all(&rproc->firmware_loading_complete); @@ -1178,8 +1160,10 @@ void rproc_shutdown(struct rproc *rproc) rproc_disable_iommu(rproc); - /* Give the next start a clean resource table */ - rproc->table_ptr = rproc->cached_table; + /* Free the copy of the resource table */ + kfree(rproc->cached_table); + rproc->cached_table = NULL; + rproc->table_ptr = NULL; /* if in crash state, unlock crash handler */ if (rproc->state == RPROC_CRASHED) @@ -1467,9 +1451,6 @@ int rproc_del(struct rproc *rproc) list_for_each_entry_safe(rvdev, tmp, &rproc->rvdevs, node) rproc_remove_virtio_dev(rvdev); - /* Free the copy of the resource table */ - kfree(rproc->cached_table); - /* the rproc is downref'ed as soon as it's removed from the klist */ mutex_lock(&rproc_list_mutex); list_del(&rproc->node); diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h index 76d936cabbf8..7cd0279bfe6a 100644 --- a/include/linux/remoteproc.h +++ b/include/linux/remoteproc.h @@ -409,7 +409,6 @@ enum rproc_crash_type { * @max_notifyid: largest allocated notify id. * @table_ptr: pointer to the resource table in effect * @cached_table: copy of the resource table - * @table_csum: checksum of the resource table * @has_iommu: flag to indicate if remote processor is behind an MMU */ struct rproc { @@ -441,7 +440,6 @@ struct rproc { int max_notifyid; struct resource_table *table_ptr; struct resource_table *cached_table; - u32 table_csum; bool has_iommu; bool auto_boot; }; -- 2.5.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag 2016-08-11 21:52 [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Bjorn Andersson @ 2016-08-31 18:27 ` Suman Anna 2016-08-11 21:52 ` [PATCH v2 3/4] remoteproc: Move vdev handling to boot/shutdown Bjorn Andersson ` (2 subsequent siblings) 3 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-08-31 18:27 UTC (permalink / raw) To: Bjorn Andersson, linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy Hi Bjorn, On 08/11/2016 04:52 PM, Bjorn Andersson wrote: > Introduce an "auto-boot" flag on rprocs to make it possible to flag > remote processors without vdevs to automatically boot once the firmware > is found. > > Preserve previous behavior of the wkup_m3 processor being explicitly > booted by a consumer. > > Cc: Lee Jones <lee.jones@linaro.org> > Cc: Loic Pallardy <loic.pallardy@st.com> > Cc: Suman Anna <s-anna@ti.com> > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > Changes since v1: > - s/always_on/auto_boot/ > - Fixed double increment of "power" in recover path > - Marked wkup_m3 to not auto_boot > I am seeing various issues with this series as I am testing this series more thoroughly with various TI remoteproc drivers and IPC stacks based on virtio devices. I use very simple firmware images that publishes the rpmsg-client-sample devices, so that I can use the kernel rpmsg_client_sample to test the communication. Here's a summary of the main issues: 1. The rproc_boot holds a module reference count to the remoteproc platform driver so that it cannot be removed when a remote processor is booted. The previous architecture allowed virtio_rpmsg_bus or the platform remoteproc driver to be installed independent of each other with the boot actually getting triggered when the virtio_rpmsg_bus gets probed in find_vqs. The boot now happens when just the platform remoteproc driver is installed independent of virtio_rpmsg_bus and results in holding self-reference counts. This makes it impossible to remove the remoteproc platform module cleanly (we shouldn't be imposing force remove), which means we can't stop the remoteprocs properly. 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core also meant that the virtio devices and therefore the memory for vrings are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The remoteproc can be booted without the virtio_rpmsg_bus module installed. We do use the allocated dma addresses of the vrings in the published resource table, but now the remote processor is up even before these values are filled in. I had to actually move up the rproc_alloc_vring alongside rproc_parse_vring to have the communication up. 3. The remoteproc platform driver cannot be removed previously when the corresponding virtio devices are probed/configured properly and all the communication flow w.r.t rpmsg channel publishing followed from the remoteproc boot. These channels are child devices of the parent virtio devices, and since the remoteproc boot/shutdown followed the virtio device probe/removal lifecycle, the rpmsg channels life-cycle followed that of the parent virtio device. My communication paths are now failing if I remove the virtio_rpmsg_bus and insmod it again as the vrings and vring buffers are configured again while the remoteproc is still running. Also, since the remoteproc is not rebooted, the previously published rpmsg channels are stale and they won't get recreated. In summary, the current patches worked nicely in a error recovery scenario but are not working properly with the various combinations of module insertion/removal process. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag @ 2016-08-31 18:27 ` Suman Anna 0 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-08-31 18:27 UTC (permalink / raw) To: Bjorn Andersson, linux-remoteproc Cc: Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy Hi Bjorn, On 08/11/2016 04:52 PM, Bjorn Andersson wrote: > Introduce an "auto-boot" flag on rprocs to make it possible to flag > remote processors without vdevs to automatically boot once the firmware > is found. > > Preserve previous behavior of the wkup_m3 processor being explicitly > booted by a consumer. > > Cc: Lee Jones <lee.jones@linaro.org> > Cc: Loic Pallardy <loic.pallardy@st.com> > Cc: Suman Anna <s-anna@ti.com> > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > Changes since v1: > - s/always_on/auto_boot/ > - Fixed double increment of "power" in recover path > - Marked wkup_m3 to not auto_boot > I am seeing various issues with this series as I am testing this series more thoroughly with various TI remoteproc drivers and IPC stacks based on virtio devices. I use very simple firmware images that publishes the rpmsg-client-sample devices, so that I can use the kernel rpmsg_client_sample to test the communication. Here's a summary of the main issues: 1. The rproc_boot holds a module reference count to the remoteproc platform driver so that it cannot be removed when a remote processor is booted. The previous architecture allowed virtio_rpmsg_bus or the platform remoteproc driver to be installed independent of each other with the boot actually getting triggered when the virtio_rpmsg_bus gets probed in find_vqs. The boot now happens when just the platform remoteproc driver is installed independent of virtio_rpmsg_bus and results in holding self-reference counts. This makes it impossible to remove the remoteproc platform module cleanly (we shouldn't be imposing force remove), which means we can't stop the remoteprocs properly. 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core also meant that the virtio devices and therefore the memory for vrings are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The remoteproc can be booted without the virtio_rpmsg_bus module installed. We do use the allocated dma addresses of the vrings in the published resource table, but now the remote processor is up even before these values are filled in. I had to actually move up the rproc_alloc_vring alongside rproc_parse_vring to have the communication up. 3. The remoteproc platform driver cannot be removed previously when the corresponding virtio devices are probed/configured properly and all the communication flow w.r.t rpmsg channel publishing followed from the remoteproc boot. These channels are child devices of the parent virtio devices, and since the remoteproc boot/shutdown followed the virtio device probe/removal lifecycle, the rpmsg channels life-cycle followed that of the parent virtio device. My communication paths are now failing if I remove the virtio_rpmsg_bus and insmod it again as the vrings and vring buffers are configured again while the remoteproc is still running. Also, since the remoteproc is not rebooted, the previously published rpmsg channels are stale and they won't get recreated. In summary, the current patches worked nicely in a error recovery scenario but are not working properly with the various combinations of module insertion/removal process. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag 2016-08-31 18:27 ` Suman Anna (?) @ 2016-09-08 22:27 ` Bjorn Andersson 2016-09-16 23:58 ` Suman Anna -1 siblings, 1 reply; 12+ messages in thread From: Bjorn Andersson @ 2016-09-08 22:27 UTC (permalink / raw) To: Suman Anna Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy On Wed 31 Aug 11:27 PDT 2016, Suman Anna wrote: > Hi Bjorn, > > On 08/11/2016 04:52 PM, Bjorn Andersson wrote: > > Introduce an "auto-boot" flag on rprocs to make it possible to flag > > remote processors without vdevs to automatically boot once the firmware > > is found. > > > > Preserve previous behavior of the wkup_m3 processor being explicitly > > booted by a consumer. > > > > Cc: Lee Jones <lee.jones@linaro.org> > > Cc: Loic Pallardy <loic.pallardy@st.com> > > Cc: Suman Anna <s-anna@ti.com> > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > > --- > > > > Changes since v1: > > - s/always_on/auto_boot/ > > - Fixed double increment of "power" in recover path > > - Marked wkup_m3 to not auto_boot > > > > I am seeing various issues with this series as I am testing this series > more thoroughly with various TI remoteproc drivers and IPC stacks based > on virtio devices. I use very simple firmware images that publishes the > rpmsg-client-sample devices, so that I can use the kernel > rpmsg_client_sample to test the communication. > Thanks for bringing these up! > Here's a summary of the main issues: > > 1. The rproc_boot holds a module reference count to the remoteproc > platform driver so that it cannot be removed when a remote processor is > booted. The previous architecture allowed virtio_rpmsg_bus or the > platform remoteproc driver to be installed independent of each other > with the boot actually getting triggered when the virtio_rpmsg_bus gets > probed in find_vqs. The boot now happens when just the platform > remoteproc driver is installed independent of virtio_rpmsg_bus and > results in holding self-reference counts. This makes it impossible to > remove the remoteproc platform module cleanly (we shouldn't be imposing > force remove), which means we can't stop the remoteprocs properly. > I've always found the dependency on rpmsg awkward and don't like this behavior to depend on the firmware content, but rather on some sort of system configuration. That said, I did not intend to break the ability of shutting down and unloading the module. Calling rmmod on your rproc module is a rather explicit operation, do you know why there's a need to hold the module reference? Wouldn't it be cleaner to just shutdown the remoteproc as the module is being removed - which would include tearing down all children (rpmsg and others) I expect to be able to compile rpmsg into my kernel and still be able to unload my remoteproc module. > 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core > also meant that the virtio devices and therefore the memory for vrings > are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The > remoteproc can be booted without the virtio_rpmsg_bus module installed. > We do use the allocated dma addresses of the vrings in the published > resource table, but now the remote processor is up even before these > values are filled in. I had to actually move up the rproc_alloc_vring > alongside rproc_parse_vring to have the communication up. > This also means that for a piece of firmware that exposes more than one virtio device we would have the same race. As you say it makes more sense to allocate the vrings as we set up the other resources. > 3. The remoteproc platform driver cannot be removed previously when the > corresponding virtio devices are probed/configured properly and all the > communication flow w.r.t rpmsg channel publishing followed from the > remoteproc boot. These channels are child devices of the parent virtio > devices, and since the remoteproc boot/shutdown followed the virtio > device probe/removal lifecycle, the rpmsg channels life-cycle followed > that of the parent virtio device. My communication paths are now failing > if I remove the virtio_rpmsg_bus and insmod it again as the vrings and > vring buffers are configured again while the remoteproc is still > running. Also, since the remoteproc is not rebooted, the previously > published rpmsg channels are stale and they won't get recreated. > So in essence this only worked because rmmod'ing the virtio_rpmsg_bus would shut down the remoteproc(?) What if the firmware exposed a VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? The vrings are in use by the remote side, so allocating those with the other remoteproc resources makes more sense, as above. But virtio_rpmsg_bus needs to detach all buffers from the rings before going away, so we don't get any writes to those after releasing the dma allocation. We will be sending out RPMSG_NS_DESTROY for some of the channels, but as far as I can tell from the rpmsg implementation there is no support in the protocol for your use case of dropping one end of the rpmsg channels without restarting the firmware and(/or?) vdevs. > In summary, the current patches worked nicely in a error recovery > scenario but are not working properly with the various combinations of > module insertion/removal process. > Thanks for your feedback Suman. I think we should definitely fix #1 and #2, but I'm uncertain to what extent #3 can be fixed. Regards, Bjorn ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag 2016-09-08 22:27 ` Bjorn Andersson @ 2016-09-16 23:58 ` Suman Anna 0 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-09-16 23:58 UTC (permalink / raw) To: Bjorn Andersson Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy Hi Bjorn, On 09/08/2016 05:27 PM, Bjorn Andersson wrote: > On Wed 31 Aug 11:27 PDT 2016, Suman Anna wrote: > >> Hi Bjorn, >> >> On 08/11/2016 04:52 PM, Bjorn Andersson wrote: >>> Introduce an "auto-boot" flag on rprocs to make it possible to flag >>> remote processors without vdevs to automatically boot once the firmware >>> is found. >>> >>> Preserve previous behavior of the wkup_m3 processor being explicitly >>> booted by a consumer. >>> >>> Cc: Lee Jones <lee.jones@linaro.org> >>> Cc: Loic Pallardy <loic.pallardy@st.com> >>> Cc: Suman Anna <s-anna@ti.com> >>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> >>> --- >>> >>> Changes since v1: >>> - s/always_on/auto_boot/ >>> - Fixed double increment of "power" in recover path >>> - Marked wkup_m3 to not auto_boot >>> >> >> I am seeing various issues with this series as I am testing this series >> more thoroughly with various TI remoteproc drivers and IPC stacks based >> on virtio devices. I use very simple firmware images that publishes the >> rpmsg-client-sample devices, so that I can use the kernel >> rpmsg_client_sample to test the communication. >> > > Thanks for bringing these up! > >> Here's a summary of the main issues: >> >> 1. The rproc_boot holds a module reference count to the remoteproc >> platform driver so that it cannot be removed when a remote processor is >> booted. The previous architecture allowed virtio_rpmsg_bus or the >> platform remoteproc driver to be installed independent of each other >> with the boot actually getting triggered when the virtio_rpmsg_bus gets >> probed in find_vqs. The boot now happens when just the platform >> remoteproc driver is installed independent of virtio_rpmsg_bus and >> results in holding self-reference counts. This makes it impossible to >> remove the remoteproc platform module cleanly (we shouldn't be imposing >> force remove), which means we can't stop the remoteprocs properly. >> > > I've always found the dependency on rpmsg awkward and don't like this > behavior to depend on the firmware content, but rather on some sort of > system configuration. > > That said, I did not intend to break the ability of shutting down and > unloading the module. The remoteproc infrastructure always allowed the boot to be done by an external module (with vdevs, it looks intrinsic because of the invocation in remoteproc_virtio.c, but the boot was really triggered during virtio_rpmsg probe) and provided protection against removing the remoteproc driver removal when you had a consumer. I tried the auto-boot when I was upstreaming the wkup_m3_rproc driver and it was rejected for exactly this reason. > Calling rmmod on your rproc module is a rather explicit operation, do > you know why there's a need to hold the module reference? Wouldn't it be > cleaner to just shutdown the remoteproc as the module is being removed - > which would include tearing down all children (rpmsg and others) Right, the introduction of the auto-boot ended up in a self-holding reference count which should be fixed to allow you to remove the module. The external module boot is still a valid usage (when auto_boot is false) though. > > I expect to be able to compile rpmsg into my kernel and still be able to > unload my remoteproc module. > >> 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core >> also meant that the virtio devices and therefore the memory for vrings >> are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The >> remoteproc can be booted without the virtio_rpmsg_bus module installed. >> We do use the allocated dma addresses of the vrings in the published >> resource table, but now the remote processor is up even before these >> values are filled in. I had to actually move up the rproc_alloc_vring >> alongside rproc_parse_vring to have the communication up. >> > > This also means that for a piece of firmware that exposes more than one > virtio device we would have the same race. Yes, if you had more virtio devices technically. The remoteproc infrastructure so far hadn't supported more than one vdev. We did support that in our downstream kernel but that was for a non-SMP usage on a dual-core remoteproc subsystem and the virtio devices were still virtio_rpmsg devices, so it scaled for us when we removed the virtio_rpmsg_bus module as long as the reference counts were managed in rproc_boot and rproc_shutdown() > > As you say it makes more sense to allocate the vrings as we set up the > other resources. > >> 3. The remoteproc platform driver cannot be removed previously when the >> corresponding virtio devices are probed/configured properly and all the >> communication flow w.r.t rpmsg channel publishing followed from the >> remoteproc boot. These channels are child devices of the parent virtio >> devices, and since the remoteproc boot/shutdown followed the virtio >> device probe/removal lifecycle, the rpmsg channels life-cycle followed >> that of the parent virtio device. My communication paths are now failing >> if I remove the virtio_rpmsg_bus and insmod it again as the vrings and >> vring buffers are configured again while the remoteproc is still >> running. Also, since the remoteproc is not rebooted, the previously >> published rpmsg channels are stale and they won't get recreated. >> > > So in essence this only worked because rmmod'ing the virtio_rpmsg_bus > would shut down the remoteproc(?) What if the firmware exposed a > VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? Yes, it's a problem if firmware exposed different virtio devices, but like I said above, it was not supported so far. It is definitely something we should consider going forward. Also, I would think the VIRTIO_ID_RPMSG usage is different from VIRTIO_ID_NET or VIRTIO_CONSOLE. The rpmsg usage does provide a bus infrastructure allowing the remote side to publish different rpmsg devices. > > > The vrings are in use by the remote side, so allocating those with the > other remoteproc resources makes more sense, as above. > > But virtio_rpmsg_bus needs to detach all buffers from the rings before > going away, so we don't get any writes to those after releasing the dma > allocation. They do get detached alright from Linux-side but obviously you are now creating an additional synchronization to the remoteproc side that they cannot use these as you have freed up the memory. > > We will be sending out RPMSG_NS_DESTROY for some of the channels, but as > far as I can tell from the rpmsg implementation there is no support in > the protocol for your use case of dropping one end of the rpmsg channels > without restarting the firmware and(/or?) vdevs. Right, this also causes new synchronization problems when you reprobe and the remoteproc side has to republish the channels. The existing remoteproc infrastructure was designed around this fact - rpmsg channels can come and go (if needed) from the remoteproc-side, and the flow is no different from regular boot vs error recovery, and since they are tied to their parent virtio_rpmsg device, removing the communication path ensured that. > >> In summary, the current patches worked nicely in a error recovery >> scenario but are not working properly with the various combinations of >> module insertion/removal process. >> > > Thanks for your feedback Suman. I think we should definitely fix #1 and > #2, but I'm uncertain to what extent #3 can be fixed. The only solution I can think of is to reverse the module dependency as well to follow the reversal of the boot dependency, so that virtio_rpmsg_bus cannot be removed if you have a remoteproc supporting virtio devices is booted and running once the virtio_rpmsg has probed. I don't see any reasons to introduce new synchronization issues and force the firmwares to change (having to republish channels) because of this boot triggering reversal. I am concerned on the impact as 4.9 is going to an LTS, with most of the Si vendors using the LTS for their new software releases. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag @ 2016-09-16 23:58 ` Suman Anna 0 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-09-16 23:58 UTC (permalink / raw) To: Bjorn Andersson Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy Hi Bjorn, On 09/08/2016 05:27 PM, Bjorn Andersson wrote: > On Wed 31 Aug 11:27 PDT 2016, Suman Anna wrote: > >> Hi Bjorn, >> >> On 08/11/2016 04:52 PM, Bjorn Andersson wrote: >>> Introduce an "auto-boot" flag on rprocs to make it possible to flag >>> remote processors without vdevs to automatically boot once the firmware >>> is found. >>> >>> Preserve previous behavior of the wkup_m3 processor being explicitly >>> booted by a consumer. >>> >>> Cc: Lee Jones <lee.jones@linaro.org> >>> Cc: Loic Pallardy <loic.pallardy@st.com> >>> Cc: Suman Anna <s-anna@ti.com> >>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> >>> --- >>> >>> Changes since v1: >>> - s/always_on/auto_boot/ >>> - Fixed double increment of "power" in recover path >>> - Marked wkup_m3 to not auto_boot >>> >> >> I am seeing various issues with this series as I am testing this series >> more thoroughly with various TI remoteproc drivers and IPC stacks based >> on virtio devices. I use very simple firmware images that publishes the >> rpmsg-client-sample devices, so that I can use the kernel >> rpmsg_client_sample to test the communication. >> > > Thanks for bringing these up! > >> Here's a summary of the main issues: >> >> 1. The rproc_boot holds a module reference count to the remoteproc >> platform driver so that it cannot be removed when a remote processor is >> booted. The previous architecture allowed virtio_rpmsg_bus or the >> platform remoteproc driver to be installed independent of each other >> with the boot actually getting triggered when the virtio_rpmsg_bus gets >> probed in find_vqs. The boot now happens when just the platform >> remoteproc driver is installed independent of virtio_rpmsg_bus and >> results in holding self-reference counts. This makes it impossible to >> remove the remoteproc platform module cleanly (we shouldn't be imposing >> force remove), which means we can't stop the remoteprocs properly. >> > > I've always found the dependency on rpmsg awkward and don't like this > behavior to depend on the firmware content, but rather on some sort of > system configuration. > > That said, I did not intend to break the ability of shutting down and > unloading the module. The remoteproc infrastructure always allowed the boot to be done by an external module (with vdevs, it looks intrinsic because of the invocation in remoteproc_virtio.c, but the boot was really triggered during virtio_rpmsg probe) and provided protection against removing the remoteproc driver removal when you had a consumer. I tried the auto-boot when I was upstreaming the wkup_m3_rproc driver and it was rejected for exactly this reason. > Calling rmmod on your rproc module is a rather explicit operation, do > you know why there's a need to hold the module reference? Wouldn't it be > cleaner to just shutdown the remoteproc as the module is being removed - > which would include tearing down all children (rpmsg and others) Right, the introduction of the auto-boot ended up in a self-holding reference count which should be fixed to allow you to remove the module. The external module boot is still a valid usage (when auto_boot is false) though. > > I expect to be able to compile rpmsg into my kernel and still be able to > unload my remoteproc module. > >> 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core >> also meant that the virtio devices and therefore the memory for vrings >> are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The >> remoteproc can be booted without the virtio_rpmsg_bus module installed. >> We do use the allocated dma addresses of the vrings in the published >> resource table, but now the remote processor is up even before these >> values are filled in. I had to actually move up the rproc_alloc_vring >> alongside rproc_parse_vring to have the communication up. >> > > This also means that for a piece of firmware that exposes more than one > virtio device we would have the same race. Yes, if you had more virtio devices technically. The remoteproc infrastructure so far hadn't supported more than one vdev. We did support that in our downstream kernel but that was for a non-SMP usage on a dual-core remoteproc subsystem and the virtio devices were still virtio_rpmsg devices, so it scaled for us when we removed the virtio_rpmsg_bus module as long as the reference counts were managed in rproc_boot and rproc_shutdown() > > As you say it makes more sense to allocate the vrings as we set up the > other resources. > >> 3. The remoteproc platform driver cannot be removed previously when the >> corresponding virtio devices are probed/configured properly and all the >> communication flow w.r.t rpmsg channel publishing followed from the >> remoteproc boot. These channels are child devices of the parent virtio >> devices, and since the remoteproc boot/shutdown followed the virtio >> device probe/removal lifecycle, the rpmsg channels life-cycle followed >> that of the parent virtio device. My communication paths are now failing >> if I remove the virtio_rpmsg_bus and insmod it again as the vrings and >> vring buffers are configured again while the remoteproc is still >> running. Also, since the remoteproc is not rebooted, the previously >> published rpmsg channels are stale and they won't get recreated. >> > > So in essence this only worked because rmmod'ing the virtio_rpmsg_bus > would shut down the remoteproc(?) What if the firmware exposed a > VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? Yes, it's a problem if firmware exposed different virtio devices, but like I said above, it was not supported so far. It is definitely something we should consider going forward. Also, I would think the VIRTIO_ID_RPMSG usage is different from VIRTIO_ID_NET or VIRTIO_CONSOLE. The rpmsg usage does provide a bus infrastructure allowing the remote side to publish different rpmsg devices. > > > The vrings are in use by the remote side, so allocating those with the > other remoteproc resources makes more sense, as above. > > But virtio_rpmsg_bus needs to detach all buffers from the rings before > going away, so we don't get any writes to those after releasing the dma > allocation. They do get detached alright from Linux-side but obviously you are now creating an additional synchronization to the remoteproc side that they cannot use these as you have freed up the memory. > > We will be sending out RPMSG_NS_DESTROY for some of the channels, but as > far as I can tell from the rpmsg implementation there is no support in > the protocol for your use case of dropping one end of the rpmsg channels > without restarting the firmware and(/or?) vdevs. Right, this also causes new synchronization problems when you reprobe and the remoteproc side has to republish the channels. The existing remoteproc infrastructure was designed around this fact - rpmsg channels can come and go (if needed) from the remoteproc-side, and the flow is no different from regular boot vs error recovery, and since they are tied to their parent virtio_rpmsg device, removing the communication path ensured that. > >> In summary, the current patches worked nicely in a error recovery >> scenario but are not working properly with the various combinations of >> module insertion/removal process. >> > > Thanks for your feedback Suman. I think we should definitely fix #1 and > #2, but I'm uncertain to what extent #3 can be fixed. The only solution I can think of is to reverse the module dependency as well to follow the reversal of the boot dependency, so that virtio_rpmsg_bus cannot be removed if you have a remoteproc supporting virtio devices is booted and running once the virtio_rpmsg has probed. I don't see any reasons to introduce new synchronization issues and force the firmwares to change (having to republish channels) because of this boot triggering reversal. I am concerned on the impact as 4.9 is going to an LTS, with most of the Si vendors using the LTS for their new software releases. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag 2016-09-16 23:58 ` Suman Anna (?) @ 2016-09-19 23:24 ` Bjorn Andersson 2016-09-20 21:29 ` Suman Anna -1 siblings, 1 reply; 12+ messages in thread From: Bjorn Andersson @ 2016-09-19 23:24 UTC (permalink / raw) To: Suman Anna Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy On Fri 16 Sep 16:58 PDT 2016, Suman Anna wrote: > Hi Bjorn, > > On 09/08/2016 05:27 PM, Bjorn Andersson wrote: > > On Wed 31 Aug 11:27 PDT 2016, Suman Anna wrote: > > > >> Hi Bjorn, > >> > >> On 08/11/2016 04:52 PM, Bjorn Andersson wrote: > >>> Introduce an "auto-boot" flag on rprocs to make it possible to flag > >>> remote processors without vdevs to automatically boot once the firmware > >>> is found. > >>> > >>> Preserve previous behavior of the wkup_m3 processor being explicitly > >>> booted by a consumer. > >>> > >>> Cc: Lee Jones <lee.jones@linaro.org> > >>> Cc: Loic Pallardy <loic.pallardy@st.com> > >>> Cc: Suman Anna <s-anna@ti.com> > >>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > >>> --- > >>> > >>> Changes since v1: > >>> - s/always_on/auto_boot/ > >>> - Fixed double increment of "power" in recover path > >>> - Marked wkup_m3 to not auto_boot > >>> > >> > >> I am seeing various issues with this series as I am testing this series > >> more thoroughly with various TI remoteproc drivers and IPC stacks based > >> on virtio devices. I use very simple firmware images that publishes the > >> rpmsg-client-sample devices, so that I can use the kernel > >> rpmsg_client_sample to test the communication. > >> > > > > Thanks for bringing these up! > > > >> Here's a summary of the main issues: > >> > >> 1. The rproc_boot holds a module reference count to the remoteproc > >> platform driver so that it cannot be removed when a remote processor is > >> booted. The previous architecture allowed virtio_rpmsg_bus or the > >> platform remoteproc driver to be installed independent of each other > >> with the boot actually getting triggered when the virtio_rpmsg_bus gets > >> probed in find_vqs. The boot now happens when just the platform > >> remoteproc driver is installed independent of virtio_rpmsg_bus and > >> results in holding self-reference counts. This makes it impossible to > >> remove the remoteproc platform module cleanly (we shouldn't be imposing > >> force remove), which means we can't stop the remoteprocs properly. > >> > > > > I've always found the dependency on rpmsg awkward and don't like this > > behavior to depend on the firmware content, but rather on some sort of > > system configuration. > > > > That said, I did not intend to break the ability of shutting down and > > unloading the module. > > The remoteproc infrastructure always allowed the boot to be done by an > external module The only caller of rproc_boot() in the mainline kernel is the wkup_m3_ipc, which spawns a thread and waits for the remoteproc-internal completion that indicates when the firmware is available and then call rproc_boot(). I'm afraid I would much prefer the wkup_m3_rproc to just set auto-boot and not work around the design like this. That said, we still provide the support for this. > (with vdevs, it looks intrinsic because of the > invocation in remoteproc_virtio.c, but the boot was really triggered > during virtio_rpmsg probe) Except for the fact that the first vdev does this as a direct result of the call from rproc_fw_config_virtio(). As stated above, the only other caller of rproc_boot() is the wkup_m3 driver and although it's not executing in the context of rproc_fw_config_virtio() it's directly tied into its execution. > and provided protection against removing the remoteproc driver removal > when you had a consumer. I tried the auto-boot when I was upstreaming > the wkup_m3_rproc driver and it was rejected for exactly this reason. One of the problems I have with the design chosen in the wkup_m3 case is if you had to deal with crash recovery, how would you sync the wkup_m3_ipc driver operations to the async recovery? Flipping this the other way around and accepting that the "function device" follows the RPROC_RUNNING state of the rproc implicitly would allow this. > > > Calling rmmod on your rproc module is a rather explicit operation, do > > you know why there's a need to hold the module reference? Wouldn't it be > > cleaner to just shutdown the remoteproc as the module is being removed - > > which would include tearing down all children (rpmsg and others) > > Right, the introduction of the auto-boot ended up in a self-holding > reference count which should be fixed to allow you to remove the module. > The external module boot is still a valid usage (when auto_boot is > false) though. The way this is dealt with in other subsystems is that when a client acquire a handle to something exposed by the implementation it will reference the implementation. With remoteproc I can acquire a reference to a remoteproc, then unload the implementation and when the client calls rproc_boot() the core will try to dereference dev->parent->driver->owner to try_module_get() it; if we're lucky that will fail as the driver is gone, but I would assume we would panic here instead, as driver is no longer a valid pointer. > > > > > I expect to be able to compile rpmsg into my kernel and still be able to > > unload my remoteproc module. > > > >> 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core > >> also meant that the virtio devices and therefore the memory for vrings > >> are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The > >> remoteproc can be booted without the virtio_rpmsg_bus module installed. > >> We do use the allocated dma addresses of the vrings in the published > >> resource table, but now the remote processor is up even before these > >> values are filled in. I had to actually move up the rproc_alloc_vring > >> alongside rproc_parse_vring to have the communication up. > >> > > > > This also means that for a piece of firmware that exposes more than one > > virtio device we would have the same race. > > Yes, if you had more virtio devices technically. The remoteproc > infrastructure so far hadn't supported more than one vdev. I don't see anything preventing you from putting more than one vdev in your resource table. There will however be a race on getting the vrings allocated before the firmware needs them. > We did > support that in our downstream kernel but that was for a non-SMP usage > on a dual-core remoteproc subsystem and the virtio devices were still > virtio_rpmsg devices, so it scaled for us when we removed the > virtio_rpmsg_bus module as long as the reference counts were managed in > rproc_boot and rproc_shutdown() > I think the shutdown case would work in either way, but the behavior of rmmod rpmsg && insmod rpmsg still depends on other things. > > > > As you say it makes more sense to allocate the vrings as we set up the > > other resources. > > > >> 3. The remoteproc platform driver cannot be removed previously when the > >> corresponding virtio devices are probed/configured properly and all the > >> communication flow w.r.t rpmsg channel publishing followed from the > >> remoteproc boot. These channels are child devices of the parent virtio > >> devices, and since the remoteproc boot/shutdown followed the virtio > >> device probe/removal lifecycle, the rpmsg channels life-cycle followed > >> that of the parent virtio device. My communication paths are now failing > >> if I remove the virtio_rpmsg_bus and insmod it again as the vrings and > >> vring buffers are configured again while the remoteproc is still > >> running. Also, since the remoteproc is not rebooted, the previously > >> published rpmsg channels are stale and they won't get recreated. > >> > > > > So in essence this only worked because rmmod'ing the virtio_rpmsg_bus > > would shut down the remoteproc(?) What if the firmware exposed a > > VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? > > Yes, it's a problem if firmware exposed different virtio devices, but > like I said above, it was not supported so far. It's not prevented, the only reason I can find for it not working is what I'm arguing about - that the resource handling will be racy. > It is definitely > something we should consider going forward. We had a question related to supporting multiple VIRTIO_ID_NET devices per remoteproc on LKML last week. > Also, I would think the > VIRTIO_ID_RPMSG usage is different from VIRTIO_ID_NET or VIRTIO_CONSOLE. > The rpmsg usage does provide a bus infrastructure allowing the remote > side to publish different rpmsg devices. > Maybe I'm missing something here, but if I put any other VIRTIO_ID in the "id" member of fw_rsc_vdev the virtio core will probe that driver and it will call the find_vqs callback and there's no limit on the number of fw_rsc_vdev entries in the resource table. So the use case of having a remoteproc firmware with 8 VIRTIO_ID_NET entries should work just fine - except for only the first one being initialized as the firmware boots (unless we finish this restructure). > > > > > > The vrings are in use by the remote side, so allocating those with the > > other remoteproc resources makes more sense, as above. > > > > But virtio_rpmsg_bus needs to detach all buffers from the rings before > > going away, so we don't get any writes to those after releasing the dma > > allocation. > > They do get detached alright from Linux-side but obviously you are now > creating an additional synchronization to the remoteproc side that they > cannot use these as you have freed up the memory. > Yeah, that's not acceptable, the backing memory must at least follow the running state of the rproc, so that it is available once we boot the core. > > > > We will be sending out RPMSG_NS_DESTROY for some of the channels, but as > > far as I can tell from the rpmsg implementation there is no support in > > the protocol for your use case of dropping one end of the rpmsg channels > > without restarting the firmware and(/or?) vdevs. > > Right, this also causes new synchronization problems when you reprobe > and the remoteproc side has to republish the channels. >From the code it seems like the other virtio devices would handle this. > The existing > remoteproc infrastructure was designed around this fact - rpmsg channels > can come and go (if needed) from the remoteproc-side, and the flow is no > different from regular boot vs error recovery, and since they are tied > to their parent virtio_rpmsg device, removing the communication path > ensured that. There are two levels here, the first is the virtio level which in the case of rpmsg seems to be required to follow the remoteproc RPROC_RUNNING state; the other being rpmsg channels have nothing to do with remoteproc, but can as you say come and go dynamically as either side like. But the only problem I can see is if the firmware does not re-announce channels after we reset the virtio device. > > > > >> In summary, the current patches worked nicely in a error recovery > >> scenario but are not working properly with the various combinations of > >> module insertion/removal process. > >> > > > > Thanks for your feedback Suman. I think we should definitely fix #1 and > > #2, but I'm uncertain to what extent #3 can be fixed. > > The only solution I can think of is to reverse the module dependency as > well to follow the reversal of the boot dependency, so that > virtio_rpmsg_bus cannot be removed if you have a remoteproc supporting > virtio devices is booted and running once the virtio_rpmsg has probed. Can you please help me understand why it's important to protect either module from being unloaded? > I > don't see any reasons to introduce new synchronization issues and force > the firmwares to change (having to republish channels) because of this > boot triggering reversal. > The problem is not the boot triggering reversal - as there's no such thing, the issue shows because I decoupled the rproc life cycle from one of its child's. > I am concerned on the impact as 4.9 is going to an LTS, with most of the > Si vendors using the LTS for their new software releases. > I'm sorry, but could you help me understand how rpmsg is used downstream for this to be of concern, how is module unloading used beyond development? Per my own arguments I will provide some patches to correct the vring buffer allocation and I'll look into the module locking. PS. I really would like to see >0 users of this framework in mainline, so we can reason about things on the basis of what's actually there! Regards, Bjorn ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag 2016-09-19 23:24 ` Bjorn Andersson @ 2016-09-20 21:29 ` Suman Anna 0 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-09-20 21:29 UTC (permalink / raw) To: Bjorn Andersson Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy, Dave Gerlach Hi Bjorn, >>>> >>>> On 08/11/2016 04:52 PM, Bjorn Andersson wrote: >>>>> Introduce an "auto-boot" flag on rprocs to make it possible to flag >>>>> remote processors without vdevs to automatically boot once the firmware >>>>> is found. >>>>> >>>>> Preserve previous behavior of the wkup_m3 processor being explicitly >>>>> booted by a consumer. >>>>> >>>>> Cc: Lee Jones <lee.jones@linaro.org> >>>>> Cc: Loic Pallardy <loic.pallardy@st.com> >>>>> Cc: Suman Anna <s-anna@ti.com> >>>>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> >>>>> --- >>>>> >>>>> Changes since v1: >>>>> - s/always_on/auto_boot/ >>>>> - Fixed double increment of "power" in recover path >>>>> - Marked wkup_m3 to not auto_boot >>>>> >>>> >>>> I am seeing various issues with this series as I am testing this series >>>> more thoroughly with various TI remoteproc drivers and IPC stacks based >>>> on virtio devices. I use very simple firmware images that publishes the >>>> rpmsg-client-sample devices, so that I can use the kernel >>>> rpmsg_client_sample to test the communication. >>>> >>> >>> Thanks for bringing these up! >>> >>>> Here's a summary of the main issues: >>>> >>>> 1. The rproc_boot holds a module reference count to the remoteproc >>>> platform driver so that it cannot be removed when a remote processor is >>>> booted. The previous architecture allowed virtio_rpmsg_bus or the >>>> platform remoteproc driver to be installed independent of each other >>>> with the boot actually getting triggered when the virtio_rpmsg_bus gets >>>> probed in find_vqs. The boot now happens when just the platform >>>> remoteproc driver is installed independent of virtio_rpmsg_bus and >>>> results in holding self-reference counts. This makes it impossible to >>>> remove the remoteproc platform module cleanly (we shouldn't be imposing >>>> force remove), which means we can't stop the remoteprocs properly. >>>> >>> >>> I've always found the dependency on rpmsg awkward and don't like this >>> behavior to depend on the firmware content, but rather on some sort of >>> system configuration. >>> >>> That said, I did not intend to break the ability of shutting down and >>> unloading the module. >> >> The remoteproc infrastructure always allowed the boot to be done by an >> external module > > The only caller of rproc_boot() in the mainline kernel is the > wkup_m3_ipc, which spawns a thread and waits for the remoteproc-internal > completion that indicates when the firmware is available and then call > rproc_boot(). > > I'm afraid I would much prefer the wkup_m3_rproc to just set auto-boot > and not work around the design like this. > > > That said, we still provide the support for this. It is quite a valid usecase, and we will have more usecases like this definitely showing up. This is exactly where I would need the rproc_set_firmware() API as the client user dictates the firmware to boot depending on the soft ip/functionality it wants from the remote processor. > >> (with vdevs, it looks intrinsic because of the >> invocation in remoteproc_virtio.c, but the boot was really triggered >> during virtio_rpmsg probe) > > Except for the fact that the first vdev does this as a direct result of > the call from rproc_fw_config_virtio(). Yeah, and I did have a downstream patch [1] to actually count the vdevs and perform the boot on the last vdev instead of first. That is always a limitation of the current upstream code, so let's not mix that with the current change in behavior. Anyway, with your current changes, the patch [1] becomes moot. [1] http://git.omapzoom.org/?p=kernel/omap.git;a=commitdiff;h=bdce403858a5c5d2abccd81a91bbe0528b97bef6 > > As stated above, the only other caller of rproc_boot() is the wkup_m3 > driver and although it's not executing in the context of > rproc_fw_config_virtio() it's directly tied into its execution. > >> and provided protection against removing the remoteproc driver removal >> when you had a consumer. I tried the auto-boot when I was upstreaming >> the wkup_m3_rproc driver and it was rejected for exactly this reason. > > One of the problems I have with the design chosen in the wkup_m3 case is > if you had to deal with crash recovery, how would you sync the > wkup_m3_ipc driver operations to the async recovery? The wkup_m3 is akin to the System Controller Processor stuff w.r.t PM, so first thing is we cannot afford to have this firmware crash in general. In anycase, the communication stuff is dealt by the wkup_m3_ipc driver itself as part of the PM suspend/resume cycle, so any error notification detection will be it's job and it will be responsible for managing the remoteproc life cycle appropriately. > Flipping this the other way around and accepting that the "function > device" follows the RPROC_RUNNING state of the rproc implicitly would > allow this. > >> >>> Calling rmmod on your rproc module is a rather explicit operation, do >>> you know why there's a need to hold the module reference? Wouldn't it be >>> cleaner to just shutdown the remoteproc as the module is being removed - >>> which would include tearing down all children (rpmsg and others) >> >> Right, the introduction of the auto-boot ended up in a self-holding >> reference count which should be fixed to allow you to remove the module. >> The external module boot is still a valid usage (when auto_boot is >> false) though. > > The way this is dealt with in other subsystems is that when a client > acquire a handle to something exposed by the implementation it will > reference the implementation. > > With remoteproc I can acquire a reference to a remoteproc, then unload > the implementation and when the client calls rproc_boot() the core will > try to dereference dev->parent->driver->owner to try_module_get() it; if > we're lucky that will fail as the driver is gone, but I would assume we > would panic here instead, as driver is no longer a valid pointer. I am talking about the rproc_shutdown() case - removing the underlying implementation after the client has successfully booted the remoteproc device and using it. > >> >>> >>> I expect to be able to compile rpmsg into my kernel and still be able to >>> unload my remoteproc module. >>> >>>> 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core >>>> also meant that the virtio devices and therefore the memory for vrings >>>> are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The >>>> remoteproc can be booted without the virtio_rpmsg_bus module installed. >>>> We do use the allocated dma addresses of the vrings in the published >>>> resource table, but now the remote processor is up even before these >>>> values are filled in. I had to actually move up the rproc_alloc_vring >>>> alongside rproc_parse_vring to have the communication up. >>>> >>> >>> This also means that for a piece of firmware that exposes more than one >>> virtio device we would have the same race. >> >> Yes, if you had more virtio devices technically. The remoteproc >> infrastructure so far hadn't supported more than one vdev. > > I don't see anything preventing you from putting more than one vdev in > your resource table. There will however be a race on getting the vrings > allocated before the firmware needs them. Right, moving the vring allocation from find_vqs into resource parsing time would fix that issue. My above downstream patch resolved this issue with existing architecture, but as I said we were using virtio devices of the same type. > >> We did >> support that in our downstream kernel but that was for a non-SMP usage >> on a dual-core remoteproc subsystem and the virtio devices were still >> virtio_rpmsg devices, so it scaled for us when we removed the >> virtio_rpmsg_bus module as long as the reference counts were managed in >> rproc_boot and rproc_shutdown() >> > > I think the shutdown case would work in either way, but the behavior of > rmmod rpmsg && insmod rpmsg still depends on other things. Right, the issues are with rmmod and insmod of virtio_rpmsg_bus. Previously, this ensured the shutdown of remoteproc, and upon insmod, the remoteproc is rebooted and rpmsg channels are republished (same flow in error recovery too). > >>> >>> As you say it makes more sense to allocate the vrings as we set up the >>> other resources. >>> >>>> 3. The remoteproc platform driver cannot be removed previously when the >>>> corresponding virtio devices are probed/configured properly and all the >>>> communication flow w.r.t rpmsg channel publishing followed from the >>>> remoteproc boot. These channels are child devices of the parent virtio >>>> devices, and since the remoteproc boot/shutdown followed the virtio >>>> device probe/removal lifecycle, the rpmsg channels life-cycle followed >>>> that of the parent virtio device. My communication paths are now failing >>>> if I remove the virtio_rpmsg_bus and insmod it again as the vrings and >>>> vring buffers are configured again while the remoteproc is still >>>> running. Also, since the remoteproc is not rebooted, the previously >>>> published rpmsg channels are stale and they won't get recreated. >>>> >>> >>> So in essence this only worked because rmmod'ing the virtio_rpmsg_bus >>> would shut down the remoteproc(?) What if the firmware exposed a >>> VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? >> >> Yes, it's a problem if firmware exposed different virtio devices, but >> like I said above, it was not supported so far. > > It's not prevented, the only reason I can find for it not working is > what I'm arguing about - that the resource handling will be racy. Right, this is definitely a scenario to conside, but this is separate (but similar) from the current problem I have mentioned. > >> It is definitely >> something we should consider going forward. > > We had a question related to supporting multiple VIRTIO_ID_NET devices > per remoteproc on LKML last week. > >> Also, I would think the >> VIRTIO_ID_RPMSG usage is different from VIRTIO_ID_NET or VIRTIO_CONSOLE. >> The rpmsg usage does provide a bus infrastructure allowing the remote >> side to publish different rpmsg devices. >> > > Maybe I'm missing something here, but if I put any other VIRTIO_ID in > the "id" member of fw_rsc_vdev the virtio core will probe that driver > and it will call the find_vqs callback and there's no limit on the > number of fw_rsc_vdev entries in the resource table. > > So the use case of having a remoteproc firmware with 8 VIRTIO_ID_NET > entries should work just fine - except for only the first one being > initialized as the firmware boots (unless we finish this restructure). Yeah, this is racy whether with previous architecture or with the current changes. Moving the vring allocation as I suggested should fix this particular multiple vdev issue with the current changes. > >>> >>> >>> The vrings are in use by the remote side, so allocating those with the >>> other remoteproc resources makes more sense, as above. >>> >>> But virtio_rpmsg_bus needs to detach all buffers from the rings before >>> going away, so we don't get any writes to those after releasing the dma >>> allocation. >> >> They do get detached alright from Linux-side but obviously you are now >> creating an additional synchronization to the remoteproc side that they >> cannot use these as you have freed up the memory. >> > > Yeah, that's not acceptable, the backing memory must at least follow the > running state of the rproc, so that it is available once we boot the > core. > >>> >>> We will be sending out RPMSG_NS_DESTROY for some of the channels, but as >>> far as I can tell from the rpmsg implementation there is no support in >>> the protocol for your use case of dropping one end of the rpmsg channels >>> without restarting the firmware and(/or?) vdevs. >> >> Right, this also causes new synchronization problems when you reprobe >> and the remoteproc side has to republish the channels. > > From the code it seems like the other virtio devices would handle this. Do they implement a bus functionality like virtio_rpmsg? > >> The existing >> remoteproc infrastructure was designed around this fact - rpmsg channels >> can come and go (if needed) from the remoteproc-side, and the flow is no >> different from regular boot vs error recovery, and since they are tied >> to their parent virtio_rpmsg device, removing the communication path >> ensured that. > > There are two levels here, the first is the virtio level which in the > case of rpmsg seems to be required to follow the remoteproc > RPROC_RUNNING state; the other being rpmsg channels have nothing to do > with remoteproc, but can as you say come and go dynamically as either > side like. > > But the only problem I can see is if the firmware does not re-announce > channels after we reset the virtio device. Correct, this is my exact problem, the announcement is done once at boot-time from our firmwares as they are publishing the services supported by each. We only have a single physical transport (virtio_rpmsg) between the host and the remote processor. And clients cannot avail of these services once you do a rmmod/insmod cycle. > >> >>> >>>> In summary, the current patches worked nicely in a error recovery >>>> scenario but are not working properly with the various combinations of >>>> module insertion/removal process. >>>> >>> >>> Thanks for your feedback Suman. I think we should definitely fix #1 and >>> #2, but I'm uncertain to what extent #3 can be fixed. >> >> The only solution I can think of is to reverse the module dependency as >> well to follow the reversal of the boot dependency, so that >> virtio_rpmsg_bus cannot be removed if you have a remoteproc supporting >> virtio devices is booted and running once the virtio_rpmsg has probed. > > Can you please help me understand why it's important to protect either > module from being unloaded? virtio_rpmsg_bus is quite simple, the firmwares publish their channels one time during the boot, and the removal of virtio_rpmsg_bus also removes these channels with them not getting republished now once you insmod the virtio_rpmsg_bus. I take it the other one you are talking about is the remoteproc platform driver, that one you need to allow the removal in the case of auto-boot, and restrict it in the case of a client booting it successfully. > >> I >> don't see any reasons to introduce new synchronization issues and force >> the firmwares to change (having to republish channels) because of this >> boot triggering reversal. >> > > The problem is not the boot triggering reversal - as there's no such > thing, the issue shows because I decoupled the rproc life cycle from one > of its child's. I don't quite agree, the boot flow did change with the current changes, and is also now affecting the logic in firmwares as well. > >> I am concerned on the impact as 4.9 is going to an LTS, with most of the >> Si vendors using the LTS for their new software releases. >> > > I'm sorry, but could you help me understand how rpmsg is used downstream > for this to be of concern, how is module unloading used beyond > development? This is part of robustness testing with various modules especially when they are built as modules. This definitely is a regression as communication won't be up now. > > Per my own arguments I will provide some patches to correct the vring > buffer allocation and I'll look into the module locking. > > PS. I really would like to see >0 users of this framework in mainline, > so we can reason about things on the basis of what's actually there! Yeah, we already have two remoteproc drivers and I have two more in the pipeline. And looks like a whole bunch of them from other SoC vendors are on their way too. Obviously, each has their own firmware designs/IPC architectectures and restrictions. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] remoteproc: Introduce auto-boot flag @ 2016-09-20 21:29 ` Suman Anna 0 siblings, 0 replies; 12+ messages in thread From: Suman Anna @ 2016-09-20 21:29 UTC (permalink / raw) To: Bjorn Andersson Cc: linux-remoteproc, Ohad Ben-Cohen, linux-kernel, Lee Jones, Loic Pallardy, Dave Gerlach Hi Bjorn, >>>> >>>> On 08/11/2016 04:52 PM, Bjorn Andersson wrote: >>>>> Introduce an "auto-boot" flag on rprocs to make it possible to flag >>>>> remote processors without vdevs to automatically boot once the firmware >>>>> is found. >>>>> >>>>> Preserve previous behavior of the wkup_m3 processor being explicitly >>>>> booted by a consumer. >>>>> >>>>> Cc: Lee Jones <lee.jones@linaro.org> >>>>> Cc: Loic Pallardy <loic.pallardy@st.com> >>>>> Cc: Suman Anna <s-anna@ti.com> >>>>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> >>>>> --- >>>>> >>>>> Changes since v1: >>>>> - s/always_on/auto_boot/ >>>>> - Fixed double increment of "power" in recover path >>>>> - Marked wkup_m3 to not auto_boot >>>>> >>>> >>>> I am seeing various issues with this series as I am testing this series >>>> more thoroughly with various TI remoteproc drivers and IPC stacks based >>>> on virtio devices. I use very simple firmware images that publishes the >>>> rpmsg-client-sample devices, so that I can use the kernel >>>> rpmsg_client_sample to test the communication. >>>> >>> >>> Thanks for bringing these up! >>> >>>> Here's a summary of the main issues: >>>> >>>> 1. The rproc_boot holds a module reference count to the remoteproc >>>> platform driver so that it cannot be removed when a remote processor is >>>> booted. The previous architecture allowed virtio_rpmsg_bus or the >>>> platform remoteproc driver to be installed independent of each other >>>> with the boot actually getting triggered when the virtio_rpmsg_bus gets >>>> probed in find_vqs. The boot now happens when just the platform >>>> remoteproc driver is installed independent of virtio_rpmsg_bus and >>>> results in holding self-reference counts. This makes it impossible to >>>> remove the remoteproc platform module cleanly (we shouldn't be imposing >>>> force remove), which means we can't stop the remoteprocs properly. >>>> >>> >>> I've always found the dependency on rpmsg awkward and don't like this >>> behavior to depend on the firmware content, but rather on some sort of >>> system configuration. >>> >>> That said, I did not intend to break the ability of shutting down and >>> unloading the module. >> >> The remoteproc infrastructure always allowed the boot to be done by an >> external module > > The only caller of rproc_boot() in the mainline kernel is the > wkup_m3_ipc, which spawns a thread and waits for the remoteproc-internal > completion that indicates when the firmware is available and then call > rproc_boot(). > > I'm afraid I would much prefer the wkup_m3_rproc to just set auto-boot > and not work around the design like this. > > > That said, we still provide the support for this. It is quite a valid usecase, and we will have more usecases like this definitely showing up. This is exactly where I would need the rproc_set_firmware() API as the client user dictates the firmware to boot depending on the soft ip/functionality it wants from the remote processor. > >> (with vdevs, it looks intrinsic because of the >> invocation in remoteproc_virtio.c, but the boot was really triggered >> during virtio_rpmsg probe) > > Except for the fact that the first vdev does this as a direct result of > the call from rproc_fw_config_virtio(). Yeah, and I did have a downstream patch [1] to actually count the vdevs and perform the boot on the last vdev instead of first. That is always a limitation of the current upstream code, so let's not mix that with the current change in behavior. Anyway, with your current changes, the patch [1] becomes moot. [1] http://git.omapzoom.org/?p=kernel/omap.git;a=commitdiff;h=bdce403858a5c5d2abccd81a91bbe0528b97bef6 > > As stated above, the only other caller of rproc_boot() is the wkup_m3 > driver and although it's not executing in the context of > rproc_fw_config_virtio() it's directly tied into its execution. > >> and provided protection against removing the remoteproc driver removal >> when you had a consumer. I tried the auto-boot when I was upstreaming >> the wkup_m3_rproc driver and it was rejected for exactly this reason. > > One of the problems I have with the design chosen in the wkup_m3 case is > if you had to deal with crash recovery, how would you sync the > wkup_m3_ipc driver operations to the async recovery? The wkup_m3 is akin to the System Controller Processor stuff w.r.t PM, so first thing is we cannot afford to have this firmware crash in general. In anycase, the communication stuff is dealt by the wkup_m3_ipc driver itself as part of the PM suspend/resume cycle, so any error notification detection will be it's job and it will be responsible for managing the remoteproc life cycle appropriately. > Flipping this the other way around and accepting that the "function > device" follows the RPROC_RUNNING state of the rproc implicitly would > allow this. > >> >>> Calling rmmod on your rproc module is a rather explicit operation, do >>> you know why there's a need to hold the module reference? Wouldn't it be >>> cleaner to just shutdown the remoteproc as the module is being removed - >>> which would include tearing down all children (rpmsg and others) >> >> Right, the introduction of the auto-boot ended up in a self-holding >> reference count which should be fixed to allow you to remove the module. >> The external module boot is still a valid usage (when auto_boot is >> false) though. > > The way this is dealt with in other subsystems is that when a client > acquire a handle to something exposed by the implementation it will > reference the implementation. > > With remoteproc I can acquire a reference to a remoteproc, then unload > the implementation and when the client calls rproc_boot() the core will > try to dereference dev->parent->driver->owner to try_module_get() it; if > we're lucky that will fail as the driver is gone, but I would assume we > would panic here instead, as driver is no longer a valid pointer. I am talking about the rproc_shutdown() case - removing the underlying implementation after the client has successfully booted the remoteproc device and using it. > >> >>> >>> I expect to be able to compile rpmsg into my kernel and still be able to >>> unload my remoteproc module. >>> >>>> 2. The reversal of boot between virtio_rpmsg_bus and remoteproc core >>>> also meant that the virtio devices and therefore the memory for vrings >>>> are allocated at the time virtio_rpmsg_bus is probed in find_vqs(). The >>>> remoteproc can be booted without the virtio_rpmsg_bus module installed. >>>> We do use the allocated dma addresses of the vrings in the published >>>> resource table, but now the remote processor is up even before these >>>> values are filled in. I had to actually move up the rproc_alloc_vring >>>> alongside rproc_parse_vring to have the communication up. >>>> >>> >>> This also means that for a piece of firmware that exposes more than one >>> virtio device we would have the same race. >> >> Yes, if you had more virtio devices technically. The remoteproc >> infrastructure so far hadn't supported more than one vdev. > > I don't see anything preventing you from putting more than one vdev in > your resource table. There will however be a race on getting the vrings > allocated before the firmware needs them. Right, moving the vring allocation from find_vqs into resource parsing time would fix that issue. My above downstream patch resolved this issue with existing architecture, but as I said we were using virtio devices of the same type. > >> We did >> support that in our downstream kernel but that was for a non-SMP usage >> on a dual-core remoteproc subsystem and the virtio devices were still >> virtio_rpmsg devices, so it scaled for us when we removed the >> virtio_rpmsg_bus module as long as the reference counts were managed in >> rproc_boot and rproc_shutdown() >> > > I think the shutdown case would work in either way, but the behavior of > rmmod rpmsg && insmod rpmsg still depends on other things. Right, the issues are with rmmod and insmod of virtio_rpmsg_bus. Previously, this ensured the shutdown of remoteproc, and upon insmod, the remoteproc is rebooted and rpmsg channels are republished (same flow in error recovery too). > >>> >>> As you say it makes more sense to allocate the vrings as we set up the >>> other resources. >>> >>>> 3. The remoteproc platform driver cannot be removed previously when the >>>> corresponding virtio devices are probed/configured properly and all the >>>> communication flow w.r.t rpmsg channel publishing followed from the >>>> remoteproc boot. These channels are child devices of the parent virtio >>>> devices, and since the remoteproc boot/shutdown followed the virtio >>>> device probe/removal lifecycle, the rpmsg channels life-cycle followed >>>> that of the parent virtio device. My communication paths are now failing >>>> if I remove the virtio_rpmsg_bus and insmod it again as the vrings and >>>> vring buffers are configured again while the remoteproc is still >>>> running. Also, since the remoteproc is not rebooted, the previously >>>> published rpmsg channels are stale and they won't get recreated. >>>> >>> >>> So in essence this only worked because rmmod'ing the virtio_rpmsg_bus >>> would shut down the remoteproc(?) What if the firmware exposed a >>> VIRTIO_ID_NET and a VIRTIO_ID_RPMSG and you rmmod one of them? >> >> Yes, it's a problem if firmware exposed different virtio devices, but >> like I said above, it was not supported so far. > > It's not prevented, the only reason I can find for it not working is > what I'm arguing about - that the resource handling will be racy. Right, this is definitely a scenario to conside, but this is separate (but similar) from the current problem I have mentioned. > >> It is definitely >> something we should consider going forward. > > We had a question related to supporting multiple VIRTIO_ID_NET devices > per remoteproc on LKML last week. > >> Also, I would think the >> VIRTIO_ID_RPMSG usage is different from VIRTIO_ID_NET or VIRTIO_CONSOLE. >> The rpmsg usage does provide a bus infrastructure allowing the remote >> side to publish different rpmsg devices. >> > > Maybe I'm missing something here, but if I put any other VIRTIO_ID in > the "id" member of fw_rsc_vdev the virtio core will probe that driver > and it will call the find_vqs callback and there's no limit on the > number of fw_rsc_vdev entries in the resource table. > > So the use case of having a remoteproc firmware with 8 VIRTIO_ID_NET > entries should work just fine - except for only the first one being > initialized as the firmware boots (unless we finish this restructure). Yeah, this is racy whether with previous architecture or with the current changes. Moving the vring allocation as I suggested should fix this particular multiple vdev issue with the current changes. > >>> >>> >>> The vrings are in use by the remote side, so allocating those with the >>> other remoteproc resources makes more sense, as above. >>> >>> But virtio_rpmsg_bus needs to detach all buffers from the rings before >>> going away, so we don't get any writes to those after releasing the dma >>> allocation. >> >> They do get detached alright from Linux-side but obviously you are now >> creating an additional synchronization to the remoteproc side that they >> cannot use these as you have freed up the memory. >> > > Yeah, that's not acceptable, the backing memory must at least follow the > running state of the rproc, so that it is available once we boot the > core. > >>> >>> We will be sending out RPMSG_NS_DESTROY for some of the channels, but as >>> far as I can tell from the rpmsg implementation there is no support in >>> the protocol for your use case of dropping one end of the rpmsg channels >>> without restarting the firmware and(/or?) vdevs. >> >> Right, this also causes new synchronization problems when you reprobe >> and the remoteproc side has to republish the channels. > > From the code it seems like the other virtio devices would handle this. Do they implement a bus functionality like virtio_rpmsg? > >> The existing >> remoteproc infrastructure was designed around this fact - rpmsg channels >> can come and go (if needed) from the remoteproc-side, and the flow is no >> different from regular boot vs error recovery, and since they are tied >> to their parent virtio_rpmsg device, removing the communication path >> ensured that. > > There are two levels here, the first is the virtio level which in the > case of rpmsg seems to be required to follow the remoteproc > RPROC_RUNNING state; the other being rpmsg channels have nothing to do > with remoteproc, but can as you say come and go dynamically as either > side like. > > But the only problem I can see is if the firmware does not re-announce > channels after we reset the virtio device. Correct, this is my exact problem, the announcement is done once at boot-time from our firmwares as they are publishing the services supported by each. We only have a single physical transport (virtio_rpmsg) between the host and the remote processor. And clients cannot avail of these services once you do a rmmod/insmod cycle. > >> >>> >>>> In summary, the current patches worked nicely in a error recovery >>>> scenario but are not working properly with the various combinations of >>>> module insertion/removal process. >>>> >>> >>> Thanks for your feedback Suman. I think we should definitely fix #1 and >>> #2, but I'm uncertain to what extent #3 can be fixed. >> >> The only solution I can think of is to reverse the module dependency as >> well to follow the reversal of the boot dependency, so that >> virtio_rpmsg_bus cannot be removed if you have a remoteproc supporting >> virtio devices is booted and running once the virtio_rpmsg has probed. > > Can you please help me understand why it's important to protect either > module from being unloaded? virtio_rpmsg_bus is quite simple, the firmwares publish their channels one time during the boot, and the removal of virtio_rpmsg_bus also removes these channels with them not getting republished now once you insmod the virtio_rpmsg_bus. I take it the other one you are talking about is the remoteproc platform driver, that one you need to allow the removal in the case of auto-boot, and restrict it in the case of a client booting it successfully. > >> I >> don't see any reasons to introduce new synchronization issues and force >> the firmwares to change (having to republish channels) because of this >> boot triggering reversal. >> > > The problem is not the boot triggering reversal - as there's no such > thing, the issue shows because I decoupled the rproc life cycle from one > of its child's. I don't quite agree, the boot flow did change with the current changes, and is also now affecting the logic in firmwares as well. > >> I am concerned on the impact as 4.9 is going to an LTS, with most of the >> Si vendors using the LTS for their new software releases. >> > > I'm sorry, but could you help me understand how rpmsg is used downstream > for this to be of concern, how is module unloading used beyond > development? This is part of robustness testing with various modules especially when they are built as modules. This definitely is a regression as communication won't be up now. > > Per my own arguments I will provide some patches to correct the vring > buffer allocation and I'll look into the module locking. > > PS. I really would like to see >0 users of this framework in mainline, > so we can reason about things on the basis of what's actually there! Yeah, we already have two remoteproc drivers and I have two more in the pipeline. And looks like a whole bunch of them from other SoC vendors are on their way too. Obviously, each has their own firmware designs/IPC architectectures and restrictions. regards Suman ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-09-20 21:29 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-08-11 21:52 [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 2/4] remoteproc: Calculate max_notifyid during load Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 3/4] remoteproc: Move vdev handling to boot/shutdown Bjorn Andersson 2016-08-11 21:52 ` [PATCH v2 4/4] remoteproc: Move handling of cached table " Bjorn Andersson 2016-08-31 18:27 ` [PATCH v2 1/4] remoteproc: Introduce auto-boot flag Suman Anna 2016-08-31 18:27 ` Suman Anna 2016-09-08 22:27 ` Bjorn Andersson 2016-09-16 23:58 ` Suman Anna 2016-09-16 23:58 ` Suman Anna 2016-09-19 23:24 ` Bjorn Andersson 2016-09-20 21:29 ` Suman Anna 2016-09-20 21:29 ` Suman Anna
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.