From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.1 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4CAE3C433E3 for ; Wed, 22 Jul 2020 17:18:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 24F3020717 for ; Wed, 22 Jul 2020 17:18:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="QlMT0U46" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727041AbgGVRSq (ORCPT ); Wed, 22 Jul 2020 13:18:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726907AbgGVRSp (ORCPT ); Wed, 22 Jul 2020 13:18:45 -0400 Received: from mail-pj1-x1042.google.com (mail-pj1-x1042.google.com [IPv6:2607:f8b0:4864:20::1042]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5ACAC0619E2 for ; Wed, 22 Jul 2020 10:18:44 -0700 (PDT) Received: by mail-pj1-x1042.google.com with SMTP id o22so1745542pjw.2 for ; Wed, 22 Jul 2020 10:18:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=XkCVT67Q+SPzBsEk4h3ILnIPkGXz4DwqhPvgqB9tq+0=; b=QlMT0U4665J5+P8GNwpg3KIlHcY3aKSWsfXs1djreAHq7v8WkU27R5JOKOaTJA/dHv n2sqw4sIE3TxVYqB9re7o9zBcY2YOnei3I8urszedpJWEGzmB5Crv4vrPwu9H3wS89tq 6gDXGXGF2Q4rRJoLJ/o/Py3dzLvEgV2wmLMZx03K2DfL34iv3GKUnKcdCyZSm2VrIaLz ekcDt/qvuYpPASfGezJY7hpVbfuv8IUqwt9rGUcAQtQeoIH7t8YGcI92tdxHC296+N+w GtoYTU31bmCrJJUUpA38og8S8LegSWSbtEZ31O+c9g7h2Pdu005MaslgYqotcqh0+8Qb 00Lw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=XkCVT67Q+SPzBsEk4h3ILnIPkGXz4DwqhPvgqB9tq+0=; b=J9WjnsnhhqDtq8Rw9qEXnlPNyE6WrKGTJbdkLNqriR+6DbFLOqvRmYk0Fs6M6SxuhZ JjkYK69YIFGFMvT8aX09AxtYukUmhA99mMG5xwmMldGH19tHM1pZO4UIwzcPxSV4ShJo GlGAD5Z/yeIu1mlK2DPvzLOh7dyXlDqB5uuDzvUWVh3CPQkxNo7hMjt6WASOlkIUMxuL JXSlE5M9axl52rAodJbJT6CGQfERyXwuKhWOuSn2jc8zzi4uSeuPQELw4DQbauMkqYzv 8TvDKailroQib67TUiqIL4YawdIIDd/8SPkjfI+IJmr7lmzE9B1LENFRRXmC4YiboBHe VTRg== X-Gm-Message-State: AOAM531nC3cwnh9VsKP/f/eE3bUQ21VqWhaUqSnkQ8/+m6RL6m0jCme1 GbJVTz+iaNQmMyLyT5+zS9GMfg== X-Google-Smtp-Source: ABdhPJzy5kzWpviUI52hjekKooZxZURC3zAmgafHaB2btpHJCFhc1IbT/lu2yZIjl7rPNAp1/Z1PzQ== X-Received: by 2002:a17:90a:2069:: with SMTP id n96mr397740pjc.213.1595438324114; Wed, 22 Jul 2020 10:18:44 -0700 (PDT) Received: from xps15 (S0106002369de4dac.cg.shawcable.net. [68.147.8.254]) by smtp.gmail.com with ESMTPSA id 4sm170769pgk.68.2020.07.22.10.18.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 Jul 2020 10:18:43 -0700 (PDT) Date: Wed, 22 Jul 2020 11:18:41 -0600 From: Mathieu Poirier To: Bjorn Andersson Cc: Siddharth Gupta , agross@kernel.org, ohad@wizery.com, corbet@lwn.net, linux-remoteproc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, tsoni@codeaurora.org, psodagud@codeaurora.org, rishabhb@codeaurora.org, linux-doc@vger.kernel.org Subject: Re: [PATCH v4 1/2] remoteproc: Add remoteproc character device interface Message-ID: <20200722171841.GA1268891@xps15> References: <1594148870-27276-1-git-send-email-sidgup@codeaurora.org> <1594148870-27276-2-git-send-email-sidgup@codeaurora.org> <20200715201839.GA3204081@xps15> <20200715215149.GA3267350@xps15> <81d7514c-727e-b4dc-e4ac-74a25966ccaf@codeaurora.org> <20200721205635.GM2922385@builder.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200721205635.GM2922385@builder.lan> Sender: linux-doc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-doc@vger.kernel.org On Tue, Jul 21, 2020 at 01:56:35PM -0700, Bjorn Andersson wrote: > On Tue 21 Jul 12:16 PDT 2020, Siddharth Gupta wrote: > > On 7/15/2020 2:51 PM, Mathieu Poirier wrote: > > > On Wed, Jul 15, 2020 at 02:18:39PM -0600, Mathieu Poirier wrote: > > > > On Tue, Jul 07, 2020 at 12:07:49PM -0700, Siddharth Gupta wrote: > [..] > > > > > diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c > [..] > > > > > +int rproc_char_device_add(struct rproc *rproc) > > > > > +{ > > > > > + int ret; > > > > > + dev_t cdevt; > > > > > + > > > > > + cdev_init(&rproc->char_dev, &rproc_fops); > > > > > + rproc->char_dev.owner = THIS_MODULE; > > > > > + > > > > > + cdevt = MKDEV(rproc_major, rproc->index); > > > > > + ret = cdev_add(&rproc->char_dev, cdevt, 1); > > > Trying this patchset on my side gave me the following splat[1]. After finding > > > the root case I can't understand how you haven't see it on your side when you > > > tested the feature. > > > > > > [1]. https://pastebin.com/aYTUUCdQ > > Mathieu, I've looked at this back and forth. Afaict this implies that > rproc_major is still 0. Could it be that either alloc_chrdev_region() > failed or somehow has yet to be called when you hit this point? That is exacly what I thought when I first stumbled on this but instrumenting the code showed otherwise. After function rproc_init_cdev() has been called @rproc_major contains the dynamically allocated major number in the upper 12 bits and the base minor number in the lower 20 bits. In rproc_char_device_add() we find this line: cdevt = MKDEV(rproc_major, rproc->index); Macro MKDEV() builds a device number by shifting @rproc_major by 20 bits to the left and OR'ing that with @rproc->index. But the device's major number is already occupying the upper 12bits, so shifthing another 20 bits to the left makes the major portion of the device number '0'. That is causing cdev_add() to complain bitterly. The right way to do this is: cdevt = MKDEV(MAJOR(rproc_major), rproc->index); Once I found the problem I thought about 32/64 bit issues. Since Siddharth is using a 64bit application processor shifting another 20 bits would still have yielded a non-zero value. But that can't be since dev_t is a u32 in linux/types.h. As such I can't see how it is possible to not hit that problem on a 64bit platform. > > > Hey Mathieu, > > > > We aren't able to reproduce the error that you are seeing, the splat is > > coming > > from the check for whiteout device[1] - which shouldn't happen because of > > the > > find_dynamic_major call[2], right? > > > > We are successfully seeing all our character device files and able to > > successfully boot remoteprocs. From what I read and understood about > > whiteout > > devices they will be hidden in the fs. > > > > Could you provide more details about your configuration and testing? > > > > [1]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L486 > > > > [2]: https://github.com/torvalds/linux/blob/master/fs/char_dev.c#L123 > > > > > > > > > + if (ret < 0) > > > > > + goto out; > > > > > + > > > > > + rproc->dev.devt = cdevt; > > > > > +out: > > > > > + return ret; > > > > > +} > > > > > + > > > > > +void rproc_char_device_remove(struct rproc *rproc) > > > > > +{ > > > > > + __unregister_chrdev(rproc_major, rproc->index, 1, "remoteproc"); > > > > > +} > > > > > + > > > > > +void __init rproc_init_cdev(void) > > > > > +{ > > > > > + int ret; > > > > > + > > > > > + ret = alloc_chrdev_region(&rproc_major, 0, NUM_RPROC_DEVICES, "remoteproc"); > > > > > + if (ret < 0) > > > > > + pr_err("Failed to alloc rproc_cdev region, err %d\n", ret); > > > > > +} > > > > > + > > > > > +void __exit rproc_exit_cdev(void) > > > > > +{ > > > > > + unregister_chrdev_region(MKDEV(rproc_major, 0), NUM_RPROC_DEVICES); > > > > Please go back to the comment I made on this during my last review and respin. > > > After digging in the code while debugging the above problem, I don't see how > > > unregistering the chrdev region the way it is done here would have worked. > > Since this is compiled statically and not built as a module, we will never > > exercise the code path, so I will remove it in the next patchset. > > > > You're right Siddharth, since we changed CONFIG_REMOTEPROC to bool it's no longer > possible to hit remoteproc_exit(), so you can omit this function > entirely. (And we should clean up the rest of that as well) > > [..] > > > > > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h > [..] > > > > > @@ -488,6 +489,8 @@ struct rproc_dump_segment { > > > > > * @auto_boot: flag to indicate if remote processor should be auto-started > > > > > * @dump_segments: list of segments in the firmware > > > > > * @nb_vdev: number of vdev currently handled by rproc > > > > > + * @char_dev: character device of the rproc > > > > > + * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release > > > > > */ > > > > > struct rproc { > > > > > struct list_head node; > > > > > @@ -523,6 +526,8 @@ struct rproc { > > > > > int nb_vdev; > > > > > u8 elf_class; > > > > > u16 elf_machine; > > > > > + struct cdev char_dev; > > As stated privately, I assumed based on this name that this is a struct > device related to that character device. So please rename this cdev to > save me from doing this mistake again. > > Thanks, > Bjorn