All of lore.kernel.org
 help / color / mirror / Atom feed
From: Siddharth Gupta <sidgup@codeaurora.org>
To: Mathieu Poirier <mathieu.poirier@linaro.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: agross@kernel.org, ohad@wizery.com, corbet@lwn.net,
	linux-remoteproc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-msm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, tsoni@codeaurora.org,
	psodagud@codeaurora.org, rishabhb@codeaurora.org,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v4 1/2] remoteproc: Add remoteproc character device interface
Date: Wed, 22 Jul 2020 10:33:01 -0700	[thread overview]
Message-ID: <415eb089-4a45-96e9-1816-88ba9c9be9aa@codeaurora.org> (raw)
In-Reply-To: <20200722171841.GA1268891@xps15>


On 7/22/2020 10:18 AM, Mathieu Poirier wrote:
> On Tue, Jul 21, 2020 at 01:56:35PM -0700, Bjorn Andersson wrote:
>> On Tue 21 Jul 12:16 PDT 2020, Siddharth Gupta wrote:
>>> On 7/15/2020 2:51 PM, Mathieu Poirier wrote:
>>>> On Wed, Jul 15, 2020 at 02:18:39PM -0600, Mathieu Poirier wrote:
>>>>> On Tue, Jul 07, 2020 at 12:07:49PM -0700, Siddharth Gupta wrote:
>> [..]
>>>>>> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
>> [..]
>>>>>> +int rproc_char_device_add(struct rproc *rproc)
>>>>>> +{
>>>>>> +	int ret;
>>>>>> +	dev_t cdevt;
>>>>>> +
>>>>>> +	cdev_init(&rproc->char_dev, &rproc_fops);
>>>>>> +	rproc->char_dev.owner = THIS_MODULE;
>>>>>> +
>>>>>> +	cdevt = MKDEV(rproc_major, rproc->index);
>>>>>> +	ret = cdev_add(&rproc->char_dev, cdevt, 1);
>>>> Trying this patchset on my side gave me the following splat[1].  After finding
>>>> the root case I can't understand how you haven't see it on your side when you
>>>> tested the feature.
>>>>
>>>> [1]. https://pastebin.com/aYTUUCdQ
>> Mathieu, I've looked at this back and forth. Afaict this implies that
>> rproc_major is still 0. Could it be that either alloc_chrdev_region()
>> failed or somehow has yet to be called when you hit this point?
> That is exacly what I thought when I first stumbled on this but instrumenting
> the code showed otherwise.
>
> After function rproc_init_cdev() has been called @rproc_major contains the
> dynamically allocated major number in the upper 12 bits and the base minor
> number in the lower 20 bits.
>
> In rproc_char_device_add() we find this line:
>
>          cdevt = MKDEV(rproc_major, rproc->index);
>
> Macro MKDEV() builds a device number by shifting @rproc_major by 20 bits to the
> left and OR'ing that with @rproc->index.  But the device's major number is
> already occupying the upper 12bits, so shifthing another 20 bits to the left
> makes the major portion of the device number '0'.  That is causing cdev_add() to
> complain bitterly.
>
> The right way to do this is:
>
>          cdevt = MKDEV(MAJOR(rproc_major), rproc->index);
>
> Once I found the problem I thought about 32/64 bit issues.  Since Siddharth is
> using a 64bit application processor shifting another 20 bits would still have
> yielded a non-zero value.  But that can't be since dev_t is a u32 in
> linux/types.h.
>
> As such I can't see how it is possible to not hit that problem on a 64bit
> platform.
Hey Mathieu,

I just checked my testing tree for our devices and realized that I have 
an older version
of the patch. Hence I was unable to reproduce the error. I will fix this 
problem, and
send out a new patchset today.

Sorry about this error!

Thanks,
Sid
>>> Hey Mathieu,
>>>
>>> We aren't able to reproduce the error that you are seeing, the splat is
>>> coming
>>> from the check for whiteout device[1] - which shouldn't happen because of
>>> the
>>> find_dynamic_major call[2], right?
>>>
>>> We are successfully seeing all our character device files and able to
>>> successfully boot remoteprocs. From what I read and understood about
>>> whiteout
>>> devices they will be hidden in the fs.
>>>
>>> Could you provide more details about your configuration and testing?
>>>
>>> [1]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486
>>> <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123>
>>> [2]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123
>>>
>>> <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486>
>>>>>> +	if (ret < 0)
>>>>>> +		goto out;
>>>>>> +
>>>>>> +	rproc->dev.devt = cdevt;
>>>>>> +out:
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +void rproc_char_device_remove(struct rproc *rproc)
>>>>>> +{
>>>>>> +	__unregister_chrdev(rproc_major, rproc->index, 1, "remoteproc");
>>>>>> +}
>>>>>> +
>>>>>> +void __init rproc_init_cdev(void)
>>>>>> +{
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ret = alloc_chrdev_region(&rproc_major, 0, NUM_RPROC_DEVICES, "remoteproc");
>>>>>> +	if (ret < 0)
>>>>>> +		pr_err("Failed to alloc rproc_cdev region, err %d\n", ret);
>>>>>> +}
>>>>>> +
>>>>>> +void __exit rproc_exit_cdev(void)
>>>>>> +{
>>>>>> +	unregister_chrdev_region(MKDEV(rproc_major, 0), NUM_RPROC_DEVICES);
>>>>> Please go back to the comment I made on this during my last review and respin.
>>>> After digging in the code while debugging the above problem, I don't see how
>>>> unregistering the chrdev region the way it is done here would have worked.
>>> Since this is compiled statically and not built as a module, we will never
>>> exercise the code path, so I will remove it in the next patchset.
>>>
>> You're right Siddharth, since we changed CONFIG_REMOTEPROC to bool it's no longer
>> possible to hit remoteproc_exit(), so you can omit this function
>> entirely. (And we should clean up the rest of that as well)
>>
>> [..]
>>>>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>> [..]
>>>>>> @@ -488,6 +489,8 @@ struct rproc_dump_segment {
>>>>>>     * @auto_boot: flag to indicate if remote processor should be auto-started
>>>>>>     * @dump_segments: list of segments in the firmware
>>>>>>     * @nb_vdev: number of vdev currently handled by rproc
>>>>>> + * @char_dev: character device of the rproc
>>>>>> + * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
>>>>>>     */
>>>>>>    struct rproc {
>>>>>>    	struct list_head node;
>>>>>> @@ -523,6 +526,8 @@ struct rproc {
>>>>>>    	int nb_vdev;
>>>>>>    	u8 elf_class;
>>>>>>    	u16 elf_machine;
>>>>>> +	struct cdev char_dev;
>> As stated privately, I assumed based on this name that this is a struct
>> device related to that character device. So please rename this cdev to
>> save me from doing this mistake again.
>>
>> Thanks,
>> Bjorn

WARNING: multiple messages have this Message-ID (diff)
From: Siddharth Gupta <sidgup@codeaurora.org>
To: Mathieu Poirier <mathieu.poirier@linaro.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: ohad@wizery.com, tsoni@codeaurora.org, corbet@lwn.net,
	linux-arm-msm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-remoteproc@vger.kernel.org, linux-kernel@vger.kernel.org,
	agross@kernel.org, rishabhb@codeaurora.org,
	psodagud@codeaurora.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v4 1/2] remoteproc: Add remoteproc character device interface
Date: Wed, 22 Jul 2020 10:33:01 -0700	[thread overview]
Message-ID: <415eb089-4a45-96e9-1816-88ba9c9be9aa@codeaurora.org> (raw)
In-Reply-To: <20200722171841.GA1268891@xps15>


On 7/22/2020 10:18 AM, Mathieu Poirier wrote:
> On Tue, Jul 21, 2020 at 01:56:35PM -0700, Bjorn Andersson wrote:
>> On Tue 21 Jul 12:16 PDT 2020, Siddharth Gupta wrote:
>>> On 7/15/2020 2:51 PM, Mathieu Poirier wrote:
>>>> On Wed, Jul 15, 2020 at 02:18:39PM -0600, Mathieu Poirier wrote:
>>>>> On Tue, Jul 07, 2020 at 12:07:49PM -0700, Siddharth Gupta wrote:
>> [..]
>>>>>> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
>> [..]
>>>>>> +int rproc_char_device_add(struct rproc *rproc)
>>>>>> +{
>>>>>> +	int ret;
>>>>>> +	dev_t cdevt;
>>>>>> +
>>>>>> +	cdev_init(&rproc->char_dev, &rproc_fops);
>>>>>> +	rproc->char_dev.owner = THIS_MODULE;
>>>>>> +
>>>>>> +	cdevt = MKDEV(rproc_major, rproc->index);
>>>>>> +	ret = cdev_add(&rproc->char_dev, cdevt, 1);
>>>> Trying this patchset on my side gave me the following splat[1].  After finding
>>>> the root case I can't understand how you haven't see it on your side when you
>>>> tested the feature.
>>>>
>>>> [1]. https://pastebin.com/aYTUUCdQ
>> Mathieu, I've looked at this back and forth. Afaict this implies that
>> rproc_major is still 0. Could it be that either alloc_chrdev_region()
>> failed or somehow has yet to be called when you hit this point?
> That is exacly what I thought when I first stumbled on this but instrumenting
> the code showed otherwise.
>
> After function rproc_init_cdev() has been called @rproc_major contains the
> dynamically allocated major number in the upper 12 bits and the base minor
> number in the lower 20 bits.
>
> In rproc_char_device_add() we find this line:
>
>          cdevt = MKDEV(rproc_major, rproc->index);
>
> Macro MKDEV() builds a device number by shifting @rproc_major by 20 bits to the
> left and OR'ing that with @rproc->index.  But the device's major number is
> already occupying the upper 12bits, so shifthing another 20 bits to the left
> makes the major portion of the device number '0'.  That is causing cdev_add() to
> complain bitterly.
>
> The right way to do this is:
>
>          cdevt = MKDEV(MAJOR(rproc_major), rproc->index);
>
> Once I found the problem I thought about 32/64 bit issues.  Since Siddharth is
> using a 64bit application processor shifting another 20 bits would still have
> yielded a non-zero value.  But that can't be since dev_t is a u32 in
> linux/types.h.
>
> As such I can't see how it is possible to not hit that problem on a 64bit
> platform.
Hey Mathieu,

I just checked my testing tree for our devices and realized that I have 
an older version
of the patch. Hence I was unable to reproduce the error. I will fix this 
problem, and
send out a new patchset today.

Sorry about this error!

Thanks,
Sid
>>> Hey Mathieu,
>>>
>>> We aren't able to reproduce the error that you are seeing, the splat is
>>> coming
>>> from the check for whiteout device[1] - which shouldn't happen because of
>>> the
>>> find_dynamic_major call[2], right?
>>>
>>> We are successfully seeing all our character device files and able to
>>> successfully boot remoteprocs. From what I read and understood about
>>> whiteout
>>> devices they will be hidden in the fs.
>>>
>>> Could you provide more details about your configuration and testing?
>>>
>>> [1]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486
>>> <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123>
>>> [2]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123
>>>
>>> <https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486>
>>>>>> +	if (ret < 0)
>>>>>> +		goto out;
>>>>>> +
>>>>>> +	rproc->dev.devt = cdevt;
>>>>>> +out:
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +void rproc_char_device_remove(struct rproc *rproc)
>>>>>> +{
>>>>>> +	__unregister_chrdev(rproc_major, rproc->index, 1, "remoteproc");
>>>>>> +}
>>>>>> +
>>>>>> +void __init rproc_init_cdev(void)
>>>>>> +{
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ret = alloc_chrdev_region(&rproc_major, 0, NUM_RPROC_DEVICES, "remoteproc");
>>>>>> +	if (ret < 0)
>>>>>> +		pr_err("Failed to alloc rproc_cdev region, err %d\n", ret);
>>>>>> +}
>>>>>> +
>>>>>> +void __exit rproc_exit_cdev(void)
>>>>>> +{
>>>>>> +	unregister_chrdev_region(MKDEV(rproc_major, 0), NUM_RPROC_DEVICES);
>>>>> Please go back to the comment I made on this during my last review and respin.
>>>> After digging in the code while debugging the above problem, I don't see how
>>>> unregistering the chrdev region the way it is done here would have worked.
>>> Since this is compiled statically and not built as a module, we will never
>>> exercise the code path, so I will remove it in the next patchset.
>>>
>> You're right Siddharth, since we changed CONFIG_REMOTEPROC to bool it's no longer
>> possible to hit remoteproc_exit(), so you can omit this function
>> entirely. (And we should clean up the rest of that as well)
>>
>> [..]
>>>>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>> [..]
>>>>>> @@ -488,6 +489,8 @@ struct rproc_dump_segment {
>>>>>>     * @auto_boot: flag to indicate if remote processor should be auto-started
>>>>>>     * @dump_segments: list of segments in the firmware
>>>>>>     * @nb_vdev: number of vdev currently handled by rproc
>>>>>> + * @char_dev: character device of the rproc
>>>>>> + * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
>>>>>>     */
>>>>>>    struct rproc {
>>>>>>    	struct list_head node;
>>>>>> @@ -523,6 +526,8 @@ struct rproc {
>>>>>>    	int nb_vdev;
>>>>>>    	u8 elf_class;
>>>>>>    	u16 elf_machine;
>>>>>> +	struct cdev char_dev;
>> As stated privately, I assumed based on this name that this is a struct
>> device related to that character device. So please rename this cdev to
>> save me from doing this mistake again.
>>
>> Thanks,
>> Bjorn

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-07-22 17:33 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07 19:07 [PATCH v4 0/2] Add character device interface to remoteproc Siddharth Gupta
2020-07-07 19:07 ` Siddharth Gupta
2020-07-07 19:07 ` [PATCH v4 1/2] remoteproc: Add remoteproc character device interface Siddharth Gupta
2020-07-07 19:07   ` Siddharth Gupta
2020-07-15 20:18   ` Mathieu Poirier
2020-07-15 20:18     ` Mathieu Poirier
2020-07-15 21:51     ` Mathieu Poirier
2020-07-15 21:51       ` Mathieu Poirier
     [not found]       ` <81d7514c-727e-b4dc-e4ac-74a25966ccaf@codeaurora.org>
2020-07-21 20:56         ` Bjorn Andersson
2020-07-21 20:56           ` Bjorn Andersson
2020-07-22 17:18           ` Mathieu Poirier
2020-07-22 17:18             ` Mathieu Poirier
2020-07-22 17:33             ` Siddharth Gupta [this message]
2020-07-22 17:33               ` Siddharth Gupta
2020-07-22 18:01             ` Bjorn Andersson
2020-07-22 18:01               ` Bjorn Andersson
2020-07-17  5:46     ` Bjorn Andersson
2020-07-17  5:46       ` Bjorn Andersson
2020-07-20 16:44       ` Mathieu Poirier
2020-07-20 16:44         ` Mathieu Poirier
2020-07-17  5:32   ` Bjorn Andersson
2020-07-17  5:32     ` Bjorn Andersson
2020-07-07 19:07 ` [PATCH v4 2/2] remoteproc: core: Register the " Siddharth Gupta
2020-07-07 19:07   ` Siddharth Gupta
2020-07-17  5:49   ` Bjorn Andersson
2020-07-17  5:49     ` Bjorn Andersson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=415eb089-4a45-96e9-1816-88ba9c9be9aa@codeaurora.org \
    --to=sidgup@codeaurora.org \
    --cc=agross@kernel.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=corbet@lwn.net \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=mathieu.poirier@linaro.org \
    --cc=ohad@wizery.com \
    --cc=psodagud@codeaurora.org \
    --cc=rishabhb@codeaurora.org \
    --cc=tsoni@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.