From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH v2 16/17] remoteproc: Correctly deal with MCU synchronisation when changing state References: <20200324214603.14979-1-mathieu.poirier@linaro.org> <20200324214603.14979-17-mathieu.poirier@linaro.org> <20200330234934.GH31331@xps15> <5223cca7-5409-a113-8a7f-9f6923231353@ti.com> <20200402204231.GC9160@xps15> From: Suman Anna Message-ID: <3240a2b3-094d-fb05-e8e9-500f9fe8466d@ti.com> Date: Thu, 9 Apr 2020 15:40:20 -0500 MIME-Version: 1.0 In-Reply-To: <20200402204231.GC9160@xps15> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit To: Mathieu Poirier Cc: Loic PALLARDY , "bjorn.andersson@linaro.org" , "ohad@wizery.com" , "peng.fan@nxp.com" , Arnaud POULIQUEN , Fabien DESSENNE , "linux-remoteproc@vger.kernel.org" List-ID: On 4/2/20 3:42 PM, Mathieu Poirier wrote: > Hi Suman, > > On Tue, Mar 31, 2020 at 05:35:58PM -0500, Suman Anna wrote: >> Hi Mathieu, >> >> On 3/30/20 6:49 PM, Mathieu Poirier wrote: >>> On Fri, Mar 27, 2020 at 02:04:36PM +0000, Loic PALLARDY wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Mathieu Poirier >>>>> Sent: mardi 24 mars 2020 22:46 >>>>> To: bjorn.andersson@linaro.org >>>>> Cc: ohad@wizery.com; Loic PALLARDY ; s- >>>>> anna@ti.com; peng.fan@nxp.com; Arnaud POULIQUEN >>>>> ; Fabien DESSENNE >>>>> ; linux-remoteproc@vger.kernel.org >>>>> Subject: [PATCH v2 16/17] remoteproc: Correctly deal with MCU >>>>> synchronisation when changing state >>>>> >>>>> This patch deals with state changes when synchronising with an MCU. More >>>>> specifically it prevents the MCU from being started if it already has been >>>>> started by another entity. Similarly it prevents the AP from stopping the >>>>> MCU if it hasn't been given the capability by platform firmware. >>>>> >>>>> Signed-off-by: Mathieu Poirier >>>>> --- >>>>> drivers/remoteproc/remoteproc_sysfs.c | 32 >>>>> ++++++++++++++++++++++++++- >>>>> 1 file changed, 31 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/remoteproc/remoteproc_sysfs.c >>>>> b/drivers/remoteproc/remoteproc_sysfs.c >>>>> index 4956577ad4b4..741a3c152b82 100644 >>>>> --- a/drivers/remoteproc/remoteproc_sysfs.c >>>>> +++ b/drivers/remoteproc/remoteproc_sysfs.c >>>>> @@ -108,6 +108,29 @@ static ssize_t state_show(struct device *dev, struct >>>>> device_attribute *attr, >>>>> return sprintf(buf, "%s\n", rproc_state_string[state]); >>>>> } >>>>> >>>>> +static int rproc_can_shutdown(struct rproc *rproc) >>>>> +{ >>>>> + /* The MCU is not running, obviously an invalid operation. */ >>>>> + if (rproc->state != RPROC_RUNNING) >>>>> + return false; >>>>> + >>>>> + /* >>>>> + * The MCU is not running (see above) and the remoteproc core is >>>>> the >>>>> + * lifecycle manager, no problem calling for a shutdown. >>>>> + */ >>>>> + if (!rproc_sync_with_mcu(rproc)) >>>>> + return true; >>>>> + >>>>> + /* >>>>> + * The MCU has been loaded by another entity (see above) and the >>>>> + * platform code has _not_ given us the capability of stopping it. >>>>> + */ >>>>> + if (!rproc->sync_ops->stop) >>>>> + return false; >>>> >>>> Test could be simplified >>>> if (rproc_sync_with_mcu(rproc)) && !rproc->sync_ops->stop) >>>> return false; >>> >>> I laid out the test individually on purpose. That way there is no coupling >>> between conditions, it is plane to see what is going on and remains maintainable >>> as we add new tests. >>> >>>> >>>>> + >>>>> + return true; >>>>> +} >>>>> + >>>>> /* Change remote processor state via sysfs */ >>>>> static ssize_t state_store(struct device *dev, >>>>> struct device_attribute *attr, >>>>> @@ -120,11 +143,18 @@ static ssize_t state_store(struct device *dev, >>>>> if (rproc->state == RPROC_RUNNING) >>>>> return -EBUSY; >>>>> >>>>> + /* >>>>> + * In synchronisation mode, booting the MCU is the >>>>> + * responsibility of an external entity. >>>>> + */ >>>>> + if (rproc_sync_with_mcu(rproc)) >>>>> + return -EINVAL; >>>>> + >>>> I don't understand this restriction, simply because it is preventing to resynchronize with a >>>> coprocessor after a "stop". >> >> There's actually one more scenario even without "stop". If auto_boot is >> set to false, then rproc_actuate will never get called. >> >>>> In the following configuration which can be configuration for coprocessor with romed/flashed >>>> firmware (no reload needed): >>>> on_init = true >>>> after_stop = true >>>> after_crash = true >>>> Once you stop it via sysfs interface, you can't anymore restart/resync to it. >>> >>> Very true. The MCU will get restarted by another entity but the AP won't >>> synchronise with it. I need more time to think about the best way to deal with >>> this and may have to get back to you for further discussions. >>> >>>> >>>> I think it will be better to modify rproc_boot() to take into account rproc_sync_with_mcu() >>>> as below: >>>> >>>> int rproc_boot(struct rproc *rproc) >>>> { >>>> - const struct firmware *firmware_p; >>>> + const struct firmware *firmware_p = NULL; >>>> struct device *dev = &rproc->dev; >>>> int ret; >>>> >>>> if (!rproc) { >>>> pr_err("invalid rproc handle\n"); >>>> return -EINVAL; >>>> } >>>> >>>> /* load firmware */ >>>> - ret = request_firmware(&firmware_p, rproc->firmware, dev); >>>> - if (ret < 0) { >>>> - dev_err(dev, "request_firmware failed: %d\n", ret); >>>> - return ret; >>>> + if (!rproc_sync_with_mcu(rproc)) { >> >> I guess this is what the original skip_fw_load was doing. And with the >> current series, the userspace loading support usecase I have cannot be >> achieved. If this is added back, I can try if that works for my usecases. > > I didn't notice this comment upon first read... Can you give me more details on > what your usecase is order to see how best to deal with it? We have a userspace loader usecase, where we use a daemon/server that does the loading, and talks to the keystone remoteproc driver over a character driver to publish the resource table and boot vectors, and to start/stop the processor. You can find more details on the downstream commit [1] we have. We exercise the regular kernel rproc state-machine and suppress the firmware segment loading portion of it. regards Suman [1] https://git.ti.com/gitweb?p=rpmsg/remoteproc.git;a=commit;h=c41f591ccaaeef66bd4ccbb883ae72f6b0d173d7 > > Thanks, > Mathieu > >> >>>> + ret = request_firmware(&firmware_p, rproc->firmware, dev); >>>> + if (ret < 0) { >>>> + dev_err(dev, "request_firmware failed: %d\n", ret); >>>> + return ret; >>>> + } >>>> } >>>> >>>> ret = rproc_actuate(rproc, firmware_p); >>>> >>>> - release_firmware(firmware_p); >>>> + if (firmware_p) >>>> + release_firmware(firmware_p); >>>> >>>> return ret; >>>> } >>>> >>>> Thanks to these modifications, I'm able to resync after a stop with coprocessor without reloading firmware. >>>> >>>>> ret = rproc_boot(rproc); >>>>> if (ret) >>>>> dev_err(&rproc->dev, "Boot failed: %d\n", ret); >>>>> } else if (sysfs_streq(buf, "stop")) { >>>>> - if (rproc->state != RPROC_RUNNING) >>>>> + if (!rproc_can_shutdown(rproc)) >>>>> return -EINVAL; >>>>> >>>>> rproc_shutdown(rproc); >>>> As rproc shutdown is also accessible as kernel API, I propose to move >>>> rproc_can_shutdown() check inside rproc_shutdown() and to test >>>> returned error >>> >>> Ah yes, it is public... As you point out, I think we'll need to move >>> rproc_can_shutdown() in rproc_shutdown(). >> >> I am assuming only the new conditions, right? >> >> regards >> Suman >> >>> >>> Thank you for taking the time to review this set, >>> Mathieu >>>