Subject: Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
To: Coiby Xu
Cc: Linus Walleij, "open list:GPIO SUBSYSTEM", wang jun, Nehal Shah,
 Shyam Sundar S K, linux-kernel-mentees@lists.linuxfoundation.org
References: <20201002224502.vn3ooodrxrblwauu@Rk>
 <34cecd8e-ffa7-c2bc-8ce3-575db47ff455@redhat.com>
 <20201003230340.42mtl35n4ka4d5qw@Rk>
 <20201004051644.f3fg2oavbobrwhf6@Rk>
 <20201006044941.fdjsp346kc5thyzy@Rk>
 <20201006083157.3pg6zvju5buxspns@Rk>
 <69853d2b-239c-79d5-bf6f-7dc0eec65602@redhat.com>
 <4f02cbdf-e1dd-b138-4975-118dd4f86089@redhat.com>
 <20201014042420.fkkyabmrkiekpmfw@Rk>
From: Hans de Goede
Date: Wed, 14 Oct 2020 13:46:14 +0200
In-Reply-To: <20201014042420.fkkyabmrkiekpmfw@Rk>
X-Mailing-List: linux-gpio@vger.kernel.org

Hi,

On 10/14/20 6:24 AM, Coiby Xu wrote:
> On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>
>>
>> On 10/6/20 11:28 AM, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>> Hi Hans and Linus,
>>>>>>>
>>>>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>> malfunctioning.
>>>>>>>
>>>>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>> when playing with the GPIO sysfs interface,
>>>>>>>
>>>>>>>  - export pin130 which is used by the touchpad
>>>>>>>  - set the direction to be "out"
>>>>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>    "echo 1 > value" will make it stop firing
>>>>>>>
>>>>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>> manually trigger an interrupt now.)
>>>>>>>
>>>>>>> I wrote a C program to let the GPIO controller quickly generate some
>>>>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>> value with a specified time interval, i.e., set the value to 0 first
>>>>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>>>>> firing unless the time interval is > 120ms (~7Hz). This explains why we
>>>>>>> can only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>
>>>>>> That is a great find, well done.
>>>>>>
>>>>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>> in an idle state or its clock frequency is too low by default thus not
>>>>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>> code to configure the chip and I need a hardware reference manual of this
>>>>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>> since I couldn't find a copy of the reference manual online? What would
>>>>>>> you suggest?
>>>>>>
>>>>>> This sounds like it might have something to do with the glitch filter.
>>>>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>> the glitch filter, you could try changing that code to disable the
>>>>>> glitch-filter. The defines for setting the glitch-filter bits to
>>>>>> disabled are already there.
>>>>>>
>>>>>
>>>>> Disabling the glitch filter works like a charm! Other enthusiastic
>>>>> Linux users who have been troubled by this issue for months would
>>>>> also feel great to know this small tweak could bring their
>>>>> touchpad back to life:) Thank you!
>>>>
>>>> That is good to hear, I'm glad that we have finally found a solution.
>>>>
>>>>> $ git diff
>>>>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>> index 9a760f5cd7ed..e786d779d6c8 100644
>>>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>>>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>                  irq_set_handler_locked(d, handle_level_irq);
>>>>>                  break;
>>>>>
>>>>> I will learn more about the glitch filter and the implementation of
>>>>> pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>
>>>> The glitch filter likely also has settings for how long a glitch
>>>> lasts, which apparently goes all the way up to 120ms. If it would
>>>> only delay reporting by say 0.1ms and consider any pulse longer
>>>> than 0.1s not a glitch, then having it enabled would be fine.
>>>>
>>>> I don't think we want some sort of quirk here to only disable the
>>>> glitch filter for some touchpads. One approach might be to simply
>>>> disable it completely for level type irqs.
>>>>
>>>> What we really need here is some input from AMD engineers on how
>>>> this is all supposed to work.
>>>>
>>>> E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>> touch it at all?
>>>>
>>>> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>> should use DB_TYPE_PRESERVE_LOW_GLITCH?  Some docs for the hw
>>>> would really help here ...
>>>
>>> So I've been digging through the history of the pinctrl-amd.c driver
>>> and once upon a time it used to set a default debounce time of
>>> 2.75 ms.
>>>
>>> See the patch generated by doing:
>>>
>>> git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>
>>> In a linux kernel checkout.
>>>
>>> So it would be interesting to add a debugging printk to see
>>> what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>> GPIO.
>>>
>>> I guess that it might be all 1s (0xfffffffff) or some such which
>>> might be a way to check that we should disable the glitch-filter
>>> for this pin?
>>
>> p.s.
>>
>> Or maybe we should simply stop touching all the glitch-filter
>> related bits, in the same way as that old commit has already
>> removed the code setting the timing of the filter?
>>
>> At least it seems that forcing the filter to be on without
>> sanitizing the de-bounce time is not a good idea.
>>
> Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
> so there must be a bug. As far as I can understand pinctrl-amd,
> "pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
> feature,
>
> static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>         unsigned debounce)
> {
>     ...
>     if (debounce) {
>         ...
>         if (debounce < 61) {
>             pin_reg |= 1;
>             pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>             pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>         ...
>         } else if (debounce < 1000000) {
>             time = debounce / 62500;
>             pin_reg |= time & DB_TMR_OUT_MASK;
>             pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>             pin_reg |= BIT(DB_TMR_LARGE_OFF);
>         } else {
>             pin_reg &= ~DB_CNTRl_MASK;
>             ret = -EINVAL;
>         }
>
>     } else {
>         ...
>         pin_reg &= ~DB_CNTRl_MASK;
>     }
>     ...
> }
>
> However in amd_gpio_irq_set_type, "pin_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
> is used,
>
> static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
> {
>
>     ...
>     case IRQ_TYPE_LEVEL_LOW:
>         pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>         pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>         pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>         pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>         pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>         irq_set_handler_locked(d, handle_level_irq);
>         break;
>     ...
> }
>
> If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
> flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
> to mask out the debouncing filter and the bug lies in
> amd_gpio_irq_set_type.

I'm afraid that is not the case. The current code is correct: it
clears bits 5 and 6 of the register, which are the bits that control
the debounce type.

You mentioned in an earlier mail that the value of the register
is 0x500e8 before this function runs. If you drop the
"<< DB_CNTRL_OFF" part, you are instead masking out bits 0 and 1,
which are already 0, so the mask becomes a no-op.

> Btw, can you explain what's the difference between glitch filter and
> debouncing filter?

There is no difference. The driver mixes the terms, but they both
refer to the same thing. This is most clear in the defines for the
DB_CNTRL bits (bits 5 and 6 of the register):

#define DB_TYPE_NO_DEBOUNCE               0x0UL
#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
#define DB_TYPE_REMOVE_GLITCH             0x3UL

Which is interesting, because bits 5 and 6 are both 1 as set by the
BIOS, so with your little hack to drop the "<< DB_CNTRL_OFF" you are
in essence keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.

So it seems that the problem is that the irq_set_type code changes
the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apparently
breaks things.

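To make the mask arithmetic concrete, here is a minimal sketch in plain
userspace C (not driver code). It assumes DB_CNTRL_OFF is 5 and
DB_CNTRl_MASK is 0x3, which matches the "bits 5 and 6" above; the helper
names set_type_current(), set_type_unshifted(), db_type() and
check_worked_example() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: the mail only states that the type lives in bits 5 and 6. */
#define DB_CNTRL_OFF                 5
#define DB_CNTRl_MASK                0x3UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH 0x2UL
#define DB_TYPE_REMOVE_GLITCH        0x3UL

/* Mirrors the IRQ_TYPE_LEVEL_LOW path: clear bits 5-6, then set the type. */
static uint32_t set_type_current(uint32_t pin_reg)
{
	pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
	pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
	return pin_reg;
}

/* The experiment: without the shift only bits 0-1 get cleared. */
static uint32_t set_type_unshifted(uint32_t pin_reg)
{
	pin_reg &= ~DB_CNTRl_MASK;
	pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
	return pin_reg;
}

/* Extract the 2-bit debounce type field from a register value. */
static uint32_t db_type(uint32_t pin_reg)
{
	return (pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK;
}

/* Worked example for the register value reported earlier in the thread. */
static void check_worked_example(void)
{
	const uint32_t bios = 0x500e8;

	/* As left by the BIOS, bits 5 and 6 are both set. */
	assert(db_type(bios) == DB_TYPE_REMOVE_GLITCH);
	/* Current code: 0x500e8 & ~0x60 = 0x50088, | 0x40 = 0x500c8. */
	assert(set_type_current(bios) == 0x500c8);
	/* Unshifted mask: bits 0-1 were already 0 and bit 6 was already 1. */
	assert(set_type_unshifted(bios) == 0x500e8);
}
```

Calling check_worked_example() passes silently: starting from 0x500e8,
the shifted mask ends up with type 0x2 (DB_TYPE_PRESERVE_HIGH_GLITCH),
while the unshifted mask leaves the register untouched with type 0x3
(DB_TYPE_REMOVE_GLITCH).
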
To test this you could replace the:

DB_TYPE_PRESERVE_HIGH_GLITCH

bit in the case IRQ_TYPE_LEVEL_LOW path with:

DB_TYPE_REMOVE_GLITCH

Which I would expect to also fix your touchpad. If that is the case,
an interesting experiment would be to replace
DB_TYPE_PRESERVE_HIGH_GLITCH with DB_TYPE_PRESERVE_LOW_GLITCH
instead.

I've never before seen this kind of glitch/debounce filter, where
you can filter out only one type of level, so I wonder if the code
maybe simply got it wrong. Also, for a level type irq I really see
no objection to just using DB_TYPE_REMOVE_GLITCH instead of the
weird "half" filters.

So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
has been there from the very first commit of this driver. I wonder
if it has been wrong all this time and should be inverted
(so DB_TYPE_PRESERVE_LOW_GLITCH instead). I think we may want to
just play it safe though and simply switch to DB_TYPE_REMOVE_GLITCH,
as we already do for all edge types and when amd_gpio_set_config()
gets called!

Linus, what do you think about just switching to DB_TYPE_REMOVE_GLITCH
for level type irqs (unifying them with all the other modes) and
not mucking with these weird, undocumented "half" filter modes?

> Or can you point to some references? I've gained some
> experience about how to configure the GPIO controller by studying the
> code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
> reference manual for baytrail either). I also tweaked the configuration
> in pinctrl-amd, for example, setting the debounce timeout to 976 usec
> and 3.9 msec without disabling the glitch filter could also save the
> touchpad. But I need some knowledge to understand why this touchpad [1]
> which also uses the buggy pinctrl-amd isn't affected.
>
> [1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095

My guess would be that it uses edge type interrupts instead?
I have seen that quite a few times, even though it is weird
to do that for i2c devices.

Regards,

Hans