linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Jeffrey Hugo <jhugo@codeaurora.org>
To: Stephen Boyd <swboyd@chromium.org>,
	Lee Jones <lee.jones@linaro.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Mark Brown <broonie@kernel.org>, Timur Tabi <timur@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>,
	Lina Iyer <ilina@codeaurora.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Kernel Build Reports Mailman List
	<kernel-build-reports@lists.linaro.org>
Subject: Re: next/master boot: 265 boots: 17 failed, 184 passed with 64 offline (next-20190730)
Date: Wed, 31 Jul 2019 11:58:21 -0600	[thread overview]
Message-ID: <cbbe381e-a154-ced1-fbcb-9db2135e4e5b@codeaurora.org> (raw)
In-Reply-To: <5d41b01a.1c69fb81.84578.a0bc@mx.google.com>

On 7/31/2019 8:13 AM, Stephen Boyd wrote:
> Quoting Linus Walleij (2019-07-31 01:48:38)
>> On Tue, Jul 30, 2019 at 3:41 PM Mark Brown <broonie@kernel.org> wrote:
>>
>>> Today's -next fails to boot on QDF2400 systems:
>>
>> Is this a devicetree or ACPI system? Which devicetree in that case?
>> If it is ACPI I assume one had to look into DSDTs?
>>
>> But I looked into this!
>>
>>> 08:56:36.026587  [    4.339966] pc : __memset+0x80/0x188
>>> 08:56:36.026867  [    4.343524] lr : msm_gpio_init_valid_mask+0x134/0x1a4
>>
>> Aha. I think this only worked out of chance before.
>>
>> What the Qualcomm driver does is exploit that .init_valid_mask() gets called
>> even if .need_valid_mask in gpio_chip is not set to true,
>> and this is why the bug appears in
>> msm_gpio_init_valid_mask(), I'm pretty much sure it is the
>> bitmap_zero(chip->valid_mask, max_gpios);
>> call towards the end of the function that gets turned
>> into:
>> 08:56:36.114798  [    4.433713]  __memset+0x80/0x188
>>
>> And this causes the crash.
>>
>> This is then because chip->valid_mask has not been allocated, and that
>> is because .need_valid_mask is not set. This is set like so:
>>
>> static bool msm_gpio_needs_valid_mask(struct msm_pinctrl *pctrl)
>> {
>>      if (pctrl->soc->reserved_gpios)
>>          return true;
>>
>>      return device_property_read_u16_array(pctrl->dev, "gpios", NULL, 0) > 0;
>> }
> 
> Some of the code here is new. I guess the arm64 laptop stuff is making
> changes.
> 
>>
>> First comes from the driver, second is for ACPI I think. It looks
>> like a bit dangerous way to do it for ACPI, what if an OF pin controller
>> has some "gpios" property? Oh well.
>>
>> Apparently this only happens on ACPI systems because I tested it
>> myself on a DT system.
>>
>> Another cause may be this from the call site inside gpiolib:
>>
>> static int gpiochip_alloc_valid_mask(struct gpio_chip *gc)
>> {
>>      if (IS_ENABLED(CONFIG_OF_GPIO))
>>          gc->need_valid_mask = of_gpio_need_valid_mask(gc);
>>      if (!gc->need_valid_mask)
>>          return 0;
>>
>>      gc->valid_mask = gpiochip_allocate_mask(gc);
>>      if (!gc->valid_mask)
>>          return -ENOMEM;
>>      return 0;
>> }
>>
>> So if OF and ACPI is activated at the same time (can that happen?)
> 
> Yes, OF and ACPI can be compiled into the same kernel. Also, ACPI does
> some interesting things when CONFIG_OF is enabled by looking for some
> ACPI magic that basically says "match the DT compatible string embedded
> in this ACPI thing". Quite scary!
> 
>> then the OF code will bail out I suppose and this happens when the
>> ACPI side of things try to use the mask it didn't allocate. The ACPI
>> codepath in msm_gpio_init_valid_mask() seems to try to set the
>> mask even if there is zero GPIOs as well... dereferencing the NULL
>> pointer in chip->valid_mask.
>>
>> Could it be that this is a ACPI system with zero protected GPIOs?

QDF2400 is an ACPI only system, but it has protected GPIOs

>> It doesn't seem like the code will survive that. It seems written
>> under the assumption that
>>
>> It's a bit of a mess.
>>
>> Can some qcom ACPI people take linux-next for a ride and tell me
>> what I should do?
>>
>> Lee: do you know if linux-next boots fine on your ACPI machine?
>>
>> Timur/Stephen: any ideas?

Timur's CAF account is no longer valid, I added his @kernel one.

>>
> 
> I think the code changed in commit f626d6dfb709 ("gpio: of: Break out
> OF-only code"). Now it unconditionally overwrites the chip's
> need_valid_mask member when CONFIG_OF is enabled. How about only
> overwriting it to 'true' when it really needs it? That way, the gpio
> provider can have a say. I originally wrote this by having
> of_gpio_need_valid_mask() modify the chip directly, but I like this
> approach instead where we mark it as const in that function and then
> only set it to true if it's actually needed.
> 
> -----8<----
> diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
> index b10d04dd9296..e39b4290b80c 100644
> --- a/drivers/gpio/gpiolib-of.c
> +++ b/drivers/gpio/gpiolib-of.c
> @@ -87,7 +87,7 @@ static struct gpio_desc *of_xlate_and_get_gpiod_flags(struct gpio_chip *chip,
>    * @dev: the device for the GPIO provider
>    * @return: true if the valid mask needs to be set
>    */
> -bool of_gpio_need_valid_mask(struct gpio_chip *gc)
> +bool of_gpio_need_valid_mask(const struct gpio_chip *gc)
>   {
>   	int size;
>   	struct device_node *np = gc->of_node;
> diff --git a/drivers/gpio/gpiolib-of.h b/drivers/gpio/gpiolib-of.h
> index 34954921d96e..454d1658ee2d 100644
> --- a/drivers/gpio/gpiolib-of.h
> +++ b/drivers/gpio/gpiolib-of.h
> @@ -16,7 +16,7 @@ struct gpio_desc *of_get_named_gpiod_flags(struct device_node *np,
>   int of_gpiochip_add(struct gpio_chip *gc);
>   void of_gpiochip_remove(struct gpio_chip *gc);
>   int of_gpio_get_count(struct device *dev, const char *con_id);
> -bool of_gpio_need_valid_mask(struct gpio_chip *gc);
> +bool of_gpio_need_valid_mask(const struct gpio_chip *gc);
>   #else
>   static inline struct gpio_desc *of_find_gpio(struct device *dev,
>   					     const char *con_id,
> @@ -36,7 +36,7 @@ static inline int of_gpio_get_count(struct device *dev, const char *con_id)
>   {
>   	return 0;
>   }
> -static inline bool of_gpio_need_valid_mask(struct gpio_chip *gc)
> +static inline bool of_gpio_need_valid_mask(const struct gpio_chip *gc)
>   {
>   	return false;
>   }
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index d45c9a2285f0..e7153c81fdaa 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -362,8 +362,8 @@ static unsigned long *gpiochip_allocate_mask(struct gpio_chip *chip)
>   
>   static int gpiochip_alloc_valid_mask(struct gpio_chip *gc)
>   {
> -	if (IS_ENABLED(CONFIG_OF_GPIO))
> -		gc->need_valid_mask = of_gpio_need_valid_mask(gc);
> +	if (of_gpio_need_valid_mask(gc))
> +		gc->need_valid_mask = true;
>   	if (!gc->need_valid_mask)
>   		return 0;
>   
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


-- 
Jeffrey Hugo
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-07-31 17:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5d403574.1c69fb81.14163.65d3@mx.google.com>
2019-07-30 12:34 ` next/master boot: 265 boots: 17 failed, 184 passed with 64 offline (next-20190730) Mark Brown
2019-08-02 13:03   ` Neil Armstrong
2019-07-30 13:00 ` Mark Brown
2019-07-30 13:28 ` Mark Brown
2019-07-30 13:41 ` Mark Brown
2019-07-31  8:48   ` Linus Walleij
2019-07-31 10:39     ` Mark Brown
2019-08-05  9:23       ` Lee Jones
2019-07-31 15:13     ` Stephen Boyd
2019-07-31 17:58       ` Jeffrey Hugo [this message]
2019-08-01  3:49         ` Timur Tabi
2019-08-01  8:09           ` Linus Walleij
2019-08-01 16:17             ` Jeffrey Hugo
2019-08-02  2:51         ` Timur Tabi
2019-08-03  9:42           ` Linus Walleij
2019-08-12 14:07             ` Jeffrey Hugo
2019-08-14  9:05               ` Linus Walleij
2019-08-05  9:20           ` Lee Jones
2019-07-31 22:40       ` Linus Walleij

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cbbe381e-a154-ced1-fbcb-9db2135e4e5b@codeaurora.org \
    --to=jhugo@codeaurora.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=broonie@kernel.org \
    --cc=ilina@codeaurora.org \
    --cc=kernel-build-reports@lists.linaro.org \
    --cc=lee.jones@linaro.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=swboyd@chromium.org \
    --cc=timur@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).